UN Comtrade web scraping with Python [closed]


I am just getting started with web scraping in Python. I need to extract data from the following URL: https://comtradeplus.un.org/TradeFlow?Frequency=A&Flows=X&CommodityCodes=TOTAL&Partners=0&Reporters=all&period=2023&AggregateBy=none&BreakdownMode=plus

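At the bottom of this page there is a "Download" button which, by default, creates a CSV file and saves it to the Windows Downloads folder. I don't need the whole database; for example, I only need the product with commodity code 070110 for the country with reporter code 688.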

Can anyone help me?

I have tried various approaches without success, since I am new to web scraping.

python web-scraping
1 Answer

Here is an example of how to query the REST API and get the data as JSON (loaded into a pandas DataFrame):

import pandas as pd
import requests

api_url = "https://comtradeapi.un.org/public/v1/preview/C/A/HS"

params = {
    "period": "2023",
    "reporterCode": "688",
    "partnerCode": "0",
    "flowCode": "x",
    "cmdCode": "total,070110",
    "customsCode": "c00",
    "motCode": "0",
    "partner2Code": "0",
    "undefinednone": "",
    "breakdownMode": "plus",
    "includeDesc": "True",
    "countOnlyFalse": "",
}

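# Send the request and fail early on an HTTP error instead of parsing an error body
response = requests.get(api_url, params=params, timeout=30)
response.raise_for_status()
data = response.json()

# The records are stored under the "data" key of the JSON response
df = pd.DataFrame(data["data"])
print(df.head())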

Prints:

  typeCode freqCode  refPeriodId  refYear  refMonth period  reporterCode reporterISO reporterDesc flowCode flowDesc  partnerCode partnerISO partnerDesc  partner2Code partner2ISO partner2Desc classificationCode classificationSearchCode  isOriginalClassification cmdCode                                      cmdDesc  aggrLevel  isLeaf customsCode customsDesc mosCode  motCode    motDesc  qtyUnitCode qtyUnitAbbr       qty  isQtyEstimated  altQtyUnitCode altQtyUnitAbbr    altQty  isAltQtyEstimated    netWgt  isNetWgtEstimated  grossWgt  isGrossWgtEstimated cifvalue      fobvalue  primaryValue  legacyEstimationFlag  isReported  isAggregate
0        C        A     20230101     2023        52   2023           688         SRB       Serbia        X   Export            0        W00       World             0         W00        World                 H6                       HS                      True   TOTAL                              All Commodities          0   False         C00   TOTAL CPC       0        0  TOTAL MOT           -1         N/A       0.0           False              -1            N/A       0.0              False       NaN              False       0.0                False     None  3.093460e+10  3.093460e+10                     0       False         True
1        C        A     20230101     2023        52   2023           688         SRB       Serbia        X   Export            0        W00       World             0         W00        World                 H6                       HS                      True  070110  Vegetables; seed potatoes, fresh or chilled          6    True         C00   TOTAL CPC       0        0  TOTAL MOT            8          kg  114750.0           False               8             kg  114750.0              False  114750.0              False       0.0                False     None  9.332200e+04  9.332200e+04                     0       False         True
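
Since the question asks for a CSV containing a single commodity, the step from the JSON records to such a file can be sketched as below. This is only an illustration under stated assumptions: `records_to_csv`, `FIELDS`, and `sample_records` are hypothetical names that are not part of the answer above, and the sample values are copied from the printed output (which is rounded for display), not fetched from the API. In real use, `data["data"]` from the response would take the place of `sample_records`.

```python
import csv
import io

# Hypothetical subset of columns to keep; the names match the printed output above.
FIELDS = ["period", "reporterCode", "reporterDesc", "flowDesc",
          "cmdCode", "cmdDesc", "netWgt", "primaryValue"]

# Assumed stand-in for data["data"]: two records with values taken from the
# printed output above (display-rounded), reduced to a few fields.
sample_records = [
    {"period": "2023", "reporterCode": 688, "reporterDesc": "Serbia",
     "flowDesc": "Export", "cmdCode": "TOTAL", "cmdDesc": "All Commodities",
     "netWgt": None, "primaryValue": 30934600000.0},
    {"period": "2023", "reporterCode": 688, "reporterDesc": "Serbia",
     "flowDesc": "Export", "cmdCode": "070110",
     "cmdDesc": "Vegetables; seed potatoes, fresh or chilled",
     "netWgt": 114750.0, "primaryValue": 93322.0},
]


def records_to_csv(records, cmd_code, fields=FIELDS):
    """Return CSV text for the records whose cmdCode equals cmd_code."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops any response fields not listed in `fields`.
    writer = csv.DictWriter(buffer, fieldnames=fields,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Compare as strings so the leading zero in "070110" is preserved.
        if str(record.get("cmdCode")) == cmd_code:
            writer.writerow(record)
    return buffer.getvalue()


csv_text = records_to_csv(sample_records, "070110")
print(csv_text)
```

To save the result, `csv_text` can be written to a file opened with `open(path, "w", newline="")`. With the DataFrame from the answer, filtering with `df[df["cmdCode"] == "070110"]` and then calling `to_csv(path, index=False)` should give a comparable file.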