How can we get attribute values and the response JSON using Python?


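I'm trying to read attribute values from the search results. If an element has the attribute

data-has_lei="true"
then I want to get its other attribute values. How can we get attribute values such as
data-name, data-country_name, data-reg_code, data-address, data-lei_code, data-lei_registration_authority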

import requests
from bs4 import BeautifulSoup

api_url = "https://lei-registrations.in/wp/wp-admin/admin-ajax.php"

params = {
    "term": "ditech process solutions",  # <-- search term
    "country": "IN",
    "action": "get_search_companies",
}

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:122.0) Gecko/20100101 Firefox/122.0"
}

data = requests.get(api_url, params=params, headers=headers).json()

if data["success"]:
    soup = BeautifulSoup(data["data"], "html.parser")
    for r in soup.select(".searchResults_title"):
        name = r.select_one(".searchResults_name").text
        number = r.select_one(".searchResults_number").text

        print(f"{name:<50} {number}")

Tags: python, python-requests

1 Answer

The following example shows how to build a DataFrame from the search results and then filter it by your condition:

import pandas as pd
import requests
from bs4 import BeautifulSoup

api_url = "https://lei-registrations.in/wp/wp-admin/admin-ajax.php"

params = {
    "term": "ditech process solutions",  # <-- search term
    "country": "IN",
    "action": "get_search_companies",
}

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:122.0) Gecko/20100101 Firefox/122.0"
}

data = requests.get(api_url, params=params, headers=headers).json()

if data["success"]:
    soup = BeautifulSoup(data["data"], "html.parser")

    all_data = []
    for a in soup.select("a.searchResults_item"):
        data_attrs = {
            k.split("-", maxsplit=1)[-1]: v
            for k, v in a.attrs.items()
            if k.startswith("data-")
        }
        all_data.append(data_attrs)

    df = pd.DataFrame(all_data)

    # print only records where has_lei == "true"
    print(df[df.has_lei == "true"])

Prints:

         id                                      name country_name country_code               reg_code                                                                                                        address              lei_code lei_next_renewal_date lei_initial_reg_date lei_registration_authority has_lei days_left our_client      lou_name lei_status agent_name    agent_website        agent_price
0  11040218  DITECH PROCESS SOLUTIONS PRIVATE LIMITED        India           IN  U72300MH2008PTC179923  Unit No.401, 402, 403 and 414 at 4th Floor ATL Corporate Park, Saki Vihar Road, Andheri E, Mumbai, 400072, IN  8945004FTGFORRFK3W63            2024-07-26           2023-07-26         Companies Register    true       177          1  EQS Group AG     ISSUED  India LEI  www.indialei.in  from 3860₹ / year
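
If you don't need a DataFrame, the same filter can probably be expressed directly as a CSS attribute selector. The following is a minimal sketch, not the site's documented API: `SAMPLE_HTML`, `WANTED`, and `extract_lei_records` are hypothetical names, and the sample markup is invented to mimic the `a.searchResults_item` elements seen above. With the real response you would pass `data["data"]` instead of the sample string.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical markup standing in for data["data"]; the values are made up.
SAMPLE_HTML = """
<a class="searchResults_item" data-has_lei="true" data-name="ACME LTD"
   data-country_name="India" data-lei_code="EXAMPLELEI0000000001"></a>
<a class="searchResults_item" data-has_lei="false" data-name="OTHER LTD"
   data-country_name="India"></a>
"""

# Attribute names from the question, without the "data-" prefix.
WANTED = (
    "name",
    "country_name",
    "reg_code",
    "address",
    "lei_code",
    "lei_registration_authority",
)


def extract_lei_records(html, wanted=WANTED):
    """Return one dict per result whose data-has_lei attribute is "true"."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    # The attribute selector keeps only anchors with data-has_lei="true".
    for a in soup.select('a.searchResults_item[data-has_lei="true"]'):
        # Tag.get returns None when an attribute is missing.
        records.append({key: a.get(f"data-{key}") for key in wanted})
    return records


records = extract_lei_records(SAMPLE_HTML)

# Serialize the filtered records as JSON.
print(json.dumps(records, indent=2))
```

Attributes that are absent from an element come back as `None`, which `json.dumps` writes as `null`, so every record has the same keys.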