Extracting links from two tables combined using Beautiful Soup

I want to scrape the first two tables from the following page:

https://fbref.com/en/comps/22/Major-League-Soccer-Stats

The tables I need are the first two, titled "Eastern Conference" and "Western Conference".

I need to extract the squad links, and I can do this successfully for the first table with the following code:

import requests
from bs4 import BeautifulSoup

standings_url = "https://fbref.com/en/comps/22/Major-League-Soccer-Stats"
data = requests.get(standings_url)
soup = BeautifulSoup(data.text, "lxml")
# select_one returns only the first matching table
standings_table = soup.select_one("table.stats_table")
# href=True skips anchors that have no href attribute
links = [a["href"] for a in standings_table.find_all("a", href=True)]
links = [l for l in links if "/squads/" in l]
team_urls = [f"https://fbref.com{l}" for l in links]
print(team_urls)

I don't know how to scrape the second table at the same time, so that I can extract all the links in one go for further processing.

python-3.x web-scraping beautifulsoup
1 Answer

Try:

import requests
from bs4 import BeautifulSoup

url = "https://fbref.com/en/comps/22/Major-League-Soccer-Stats"

soup = BeautifulSoup(requests.get(url).content, "html.parser")

# Match each conference table by a substring of its id attribute
table1 = soup.select_one('table[id*="Eastern-Conference_overall"]')
table2 = soup.select_one('table[id*="Western-Conference_overall"]')

for t in [table1, table2]:
    # Keep only anchors whose href points at a squad page
    for a in t.select('a[href*="/squads/"]'):
        print(a.text, a["href"], t.caption.text, sep="\t")

Prints:

Inter Miami     /en/squads/cb8b86a2/Inter-Miami-Stats   Eastern Conference Table
FC Cincinnati   /en/squads/e9ea41b2/FC-Cincinnati-Stats Eastern Conference Table
NYCFC   /en/squads/64e81410/New-York-City-FC-Stats      Eastern Conference Table
NY Red Bulls    /en/squads/69a0fb10/New-York-Red-Bulls-Stats    Eastern Conference Table
Toronto FC      /en/squads/130f43fa/Toronto-FC-Stats    Eastern Conference Table
Charlotte       /en/squads/eb57545a/Charlotte-FC-Stats  Eastern Conference Table
Crew    /en/squads/529ba333/Columbus-Crew-Stats Eastern Conference Table
Philadelphia    /en/squads/46024eeb/Philadelphia-Union-Stats    Eastern Conference Table
D.C. United     /en/squads/44117292/DC-United-Stats     Eastern Conference Table
Orlando City    /en/squads/46ef01d0/Orlando-City-Stats  Eastern Conference Table
Nashville       /en/squads/35f1b818/Nashville-SC-Stats  Eastern Conference Table
Atlanta Utd     /en/squads/1ebc1a5b/Atlanta-United-Stats        Eastern Conference Table
CF Montréal     /en/squads/fc22273c/CF-Montreal-Stats   Eastern Conference Table
Fire    /en/squads/f9940243/Chicago-Fire-Stats  Eastern Conference Table
NE Revolution   /en/squads/3c079def/New-England-Revolution-Stats        Eastern Conference Table
RSL     /en/squads/f7d86a43/Real-Salt-Lake-Stats        Western Conference Table
Minnesota Utd   /en/squads/99ea75a6/Minnesota-United-Stats      Western Conference Table
Austin  /en/squads/b918956d/Austin-FC-Stats     Western Conference Table
LA Galaxy       /en/squads/d8b46897/LA-Galaxy-Stats     Western Conference Table
LAFC    /en/squads/81d817a3/Los-Angeles-FC-Stats        Western Conference Table
Rapids  /en/squads/415b4465/Colorado-Rapids-Stats       Western Conference Table
Vancouver W'caps        /en/squads/ab41cb90/Vancouver-Whitecaps-FC-Stats        Western Conference Table
Dynamo FC       /en/squads/0d885416/Houston-Dynamo-Stats        Western Conference Table
St. Louis       /en/squads/bd97ac1f/St-Louis-City-Stats Western Conference Table
Seattle /en/squads/6218ebd4/Seattle-Sounders-FC-Stats   Western Conference Table
Portland Timbers        /en/squads/d076914e/Portland-Timbers-Stats      Western Conference Table
FC Dallas       /en/squads/15cf8f40/FC-Dallas-Stats     Western Conference Table
Sporting KC     /en/squads/4acb0537/Sporting-Kansas-City-Stats  Western Conference Table
SJ Earthquakes  /en/squads/ca460650/San-Jose-Earthquakes-Stats  Western Conference Table
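
Since the question asks for all the links in one go for further processing, here is a minimal sketch of collecting them into a single list of absolute URLs instead of printing them. It assumes the two table ids still contain the substrings used above. The helper name squad_urls and the SAMPLE_HTML snippet are hypothetical: the snippet is a heavily trimmed stand-in for the real page (including a made-up non-squad link and made-up id prefixes) so that the example runs without a network request.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://fbref.com"

# Hypothetical, trimmed stand-in for the standings page. The id prefixes and
# the "/en/players/..." link are invented for illustration only.
SAMPLE_HTML = """
<table id="example_Eastern-Conference_overall">
  <caption>Eastern Conference Table</caption>
  <tr><td><a href="/en/squads/cb8b86a2/Inter-Miami-Stats">Inter Miami</a></td>
      <td><a href="/en/players/00000000/Example-Player">Example Player</a></td></tr>
  <tr><td><a href="/en/squads/e9ea41b2/FC-Cincinnati-Stats">FC Cincinnati</a></td></tr>
</table>
<table id="example_Western-Conference_overall">
  <caption>Western Conference Table</caption>
  <tr><td><a href="/en/squads/f7d86a43/Real-Salt-Lake-Stats">RSL</a></td></tr>
</table>
"""


def squad_urls(html, base_url=BASE_URL):
    """Return absolute squad URLs from both conference tables, in page order."""
    soup = BeautifulSoup(html, "html.parser")
    # A comma-separated selector group matches anchors in either table.
    selector = (
        'table[id*="Eastern-Conference_overall"] a[href*="/squads/"], '
        'table[id*="Western-Conference_overall"] a[href*="/squads/"]'
    )
    urls = []
    for a in soup.select(selector):
        url = urljoin(base_url, a["href"])
        if url not in urls:  # skip repeats while keeping the original order
            urls.append(url)
    return urls


team_urls = squad_urls(SAMPLE_HTML)
print(team_urls)
```

For the live page, something like squad_urls(requests.get(url).text) should work the same way, provided the site's markup has not changed.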