Need to parse and create a table with BeautifulSoup

Question (votes: -1, answers: 1)

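I am trying to parse and retrieve text strings and values from a table on a website, but instead of using conventionally named classes in the HTML, the site assigns each element a randomly named string.

Here is the link to the page and the table containing all the values I want to get: https://www.financeattitude.com/market-data-forex-historical-sentiment

When I inspect the table, each entry is assigned a class such as 'L-M-eb-ib'. Can anyone help here or see what I am doing wrong? My code is below; right now it returns nothing.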

import requests
from bs4 import BeautifulSoup

page = requests.get('https://www.financeattitude.com/market-data-forex-historical-sentiment')

soup = BeautifulSoup(page.content, 'html.parser')

tag = soup.find_all('L-M-eb-ib')

def hastagbutnoid(tag):
    return tag.has_attr('class') and not tag.has_attr('href')

print(tag)

Here is the HTML I want to get (at least I believe it is):

AUD / CAD + 48.26%+ 42.82%+ 47.30%+ 46.90%

python web beautifulsoup screen-scraping
1 Answer
0 votes
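One likely reason the snippet in the question prints an empty list: the first positional argument of `find_all` is a tag name, so `soup.find_all('L-M-eb-ib')` looks for `<L-M-eb-ib>` elements rather than elements with that class. The sketch below illustrates the difference on a small piece of made-up markup. Only the class name comes from the question; the table structure is an assumption for illustration, not the real page's HTML.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; only the class name
# 'L-M-eb-ib' is taken from the question.
html = """
<table>
  <tr><td class="L-M-eb-ib">AUD / CAD</td><td class="L-M-eb-ib">+ 48.26%</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# The first positional argument is a tag name, so this searches for
# <L-M-eb-ib> elements and finds none.
by_tag_name = soup.find_all("L-M-eb-ib")

# class_ matches against the class attribute instead.
by_class = soup.find_all(class_="L-M-eb-ib")
cells = [cell.get_text(strip=True) for cell in by_class]

print(by_tag_name)
print(cells)
```

Even with the class filter corrected, the live page may still yield nothing if the table is filled in by a separate request, which the Referer header in the code that follows suggests. That code queries the freeserv.dukascopy.com data endpoint directly instead of parsing the page.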
import requests
import re
import json
import pandas as pd

params = {
    'path': 'historical_sentiment_index/data',
    'type': 'swfx',
    'jsonp': '_callbacks____2k8ym6sn2'
}

headers = {
    'Referer': 'https://freeserv.dukascopy.com/2.0/?path=historical_sentiment_index/index&showHeader=false&tableBorderColor=%234bafe9&liquidity=consumers&availableInstruments=EUR/JPY%2CUSD/RUB%2CUSD/HKD%2CAUD/CAD%2CAUD/CHF%2CAUD/JPY%2CAUD/NZD%2CAUD/SGD%2CCAD/CHF%2CCAD/HKD%2CCAD/JPY%2CCHF/JPY%2CCHF/PLN%2CCHF/SGD%2CEUR/AUD%2CEUR/CAD%2CEUR/CHF%2CEUR/DKK%2CEUR/GBP%2CEUR/HKD%2CEUR/NOK%2CEUR/NZD%2CEUR/PLN%2CEUR/RUB%2CEUR/SEK%2CEUR/SGD%2CEUR/TRY%2CGBP/AUD%2CGBP/CAD%2CGBP/CHF%2CGBP/JPY%2CGBP/NZD%2CHKD/JPY%2CNZD/CAD%2CNZD/CHF%2CNZD/JPY%2CSGD/JPY%2CTRY/JPY%2CUSD/CNH%2CUSD/DKK%2CUSD/MXN%2CUSD/NOK%2CUSD/PLN%2CUSD/SEK%2CUSD/SGD%2CUSD/TRY%2CUSD/ZAR%2CZAR/JPY&availableCurrencies=AUD%2CCAD%2CCHF%2CGBP%2CHKD%2CJPY%2CMXN%2CNOK%2CNZD%2CPLN%2CRUB%2CSEK%2CSGD%2CTRY%2CUSD%2CZAR%2CEUR%2CXAG%2CXAU&sort=volume&order=asc&last=true&sixhours=true&oneday=true&fivedays=true&width=100%25&height=1385&adv=popup&lang=en',
}


def main(url):
    with requests.Session() as req:
        r = req.get(url, params=params, headers=headers)
        r.raise_for_status()
        # The response is JSONP: drop the callback wrapper and keep the JSON array.
        match = re.search(r"2k8ym6sn2\((\[.*?])", r.text).group(1)
        data = json.loads(match)
        df = pd.DataFrame(data).set_index("name")
        print(df)  # For full DataFrame view.
        df.to_csv("data.csv")  # save it to a CSV file
        print(df.loc['AUD/CAD'])  # a row can be converted with to_list() or to_dict()
        return df  # return the frame so it can be used outside main()


df = main("https://freeserv.dukascopy.com/2.0/index.php")
print(df.loc['AUD/CAD'].to_list())

Output:

['-49.119998931884766', '49.12', '-47.13999938964844', '47.14', '-48.2599983215332', '48.26', '-45.720001220703125', '45.72']
print(df.loc['AUD/CAD'].to_dict())

Output:

{'last_long': '-49.119998931884766', 'last_short': '49.12', 'sixhours_long': '-47.13999938964844', 'sixhours_short': '47.14', 'oneday_long': '-48.2599983215332', 'oneday_short': '48.26', 'fivedays_long': '-45.720001220703125', 'fivedays_short': '45.72'}
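Two details of the approach above may be worth generalizing: the regular expression is tied to one specific callback name, and the values come back as strings. The following is a minimal sketch that uses only the standard library. It assumes the response body is a single JSONP call wrapping one JSON value; `unwrap_jsonp` and the `sample` body are hypothetical, with field names and values copied from the output above.

```python
import json
import re


def unwrap_jsonp(text):
    """Return the JSON value wrapped in a JSONP call such as cb([...]);"""
    match = re.fullmatch(r"\s*[\w$.]+\s*\((.*)\)\s*;?\s*", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("response does not look like JSONP")
    return json.loads(match.group(1))


# Hypothetical response body; field names and values mirror the output above.
sample = (
    '_callbacks____2k8ym6sn2([{"name": "AUD/CAD", '
    '"last_long": "-49.119998931884766", "last_short": "49.12"}]);'
)

rows = unwrap_jsonp(sample)

# The values arrive as strings, so convert everything except the name to float.
parsed = {
    row["name"]: {key: float(value) for key, value in row.items() if key != "name"}
    for row in rows
}

print(parsed["AUD/CAD"])
```

Matching any callback name means the code keeps working if the `jsonp` parameter is changed, at the cost of assuming nothing else surrounds the call.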