How can I pair teams together in the same row and column?
import pandas as pd
from bs4 import BeautifulSoup
import requests
url = "https://www.rotowire.com/basketball/nba-lineups.php"
soup = BeautifulSoup(requests.get(url).text, "html.parser")
lineups = soup.find_all(class_='is-pct-play-100')
positions = [x.find('div').text for x in lineups]
names = [x.find('a')['title'] for x in lineups]
teams = sum([[x.text] * 5 for x in soup.find_all(class_='lineup__abbr')], [])
df = pd.DataFrame(zip(names, teams, positions))
df.to_csv('team_lineups.csv', index=False)
print(df)
Output:
0 Damian Lillard MIL PG
1 Malik Beasley MIL SG
2 Jae Crowder MIL SF
3 Brook Lopez MIL C
4 Tyrese Maxey MIL PG
.. ... ... ..
92 James Harden CHA PG
93 Terance Mann CHA SG
94 Paul George CHA SF
95 Kawhi Leonard POR PF
96 Ivica Zubac POR C
[97 rows x 3 columns]
Try this:
import pandas as pd
from bs4 import BeautifulSoup
import requests
url = "https://www.rotowire.com/basketball/nba-lineups.php"
soup = BeautifulSoup(requests.get(url).text, "html.parser")
lineups = soup.find_all(class_='is-pct-play-100')
positions = [x.find('div').text for x in lineups]
names = [x.find('a')['title'] for x in lineups]
teams = sum([[x.text] * 5 for x in soup.find_all(class_='lineup__abbr')], [])
# Split the data into two separate DataFrames for each team
team1_df = pd.DataFrame(zip(names[:5], positions[:5]), columns=['Player', 'Position'])
team2_df = pd.DataFrame(zip(names[5:], positions[5:]), columns=['Player', 'Position'])
# Add team names as a column in each DataFrame
team1_df['Team'] = teams[0]
team2_df['Team'] = teams[5]
# Merge the two DataFrames to create the final matchup DataFrame
matchup_df = pd.merge(team1_df, team2_df, left_index=True, right_index=True, suffixes=('_Team1', '_Team2'))
# Save the DataFrame to a CSV file
matchup_df.to_csv('team_matchups.csv', index=False)
print(matchup_df)
Output:
Player_Team1 Position_Team1 Team_Team1 Player_Team2 Position_Team2 Team_Team2
0 Damian Lillard PG MIL Buddy Hield SG PHI
1 Malik Beasley SG MIL Tobias Harris SF PHI
2 Jae Crowder SF MIL Nicolas Batum PF PHI
3 Brook Lopez C MIL Paul Reed C PHI
4 Tyrese Maxey PG MIL D'Angelo Russell PG PHI
Here is an updated version that adds the columns Match (the game the player appears in) and Team (the team the player belongs to):
import pandas as pd
import requests
from bs4 import BeautifulSoup
url = "https://www.rotowire.com/basketball/nba-lineups.php"
soup = BeautifulSoup(requests.get(url).text, "html.parser")
data = []
# Each lineup box is one game; keep only boxes that contain is-pct-play-100 players
for lineup in soup.select(".lineup__box:has(.is-pct-play-100)"):
    # Abbreviations of the visiting and home team for this game
    v, h = (
        lineup.select_one(".lineup__team.is-visit").text.strip(),
        lineup.select_one(".lineup__team.is-home").text.strip(),
    )
    for l in lineup.select("ul.lineup__list .is-pct-play-100"):
        pos = l.div.text
        name = l.a.text
        # The last class of the enclosing <ul> tells which side the player is on
        team = l.find_previous("ul")["class"][-1]
        if team == "is-home":
            team = h
        else:
            team = v
        data.append({"Match": f"{v} - {h}", "Name": name, "Pos": pos, "Team": team})
df = pd.DataFrame(data)
print(df.head(15))
Prints:
Match Name Pos Team
0 MIL - PHI D. Lillard PG MIL
1 MIL - PHI Malik Beasley SG MIL
2 MIL - PHI Jae Crowder SF MIL
3 MIL - PHI Brook Lopez C MIL
4 MIL - PHI Tyrese Maxey PG PHI
5 MIL - PHI Buddy Hield SG PHI
6 MIL - PHI Tobias Harris SF PHI
7 MIL - PHI Nicolas Batum PF PHI
8 MIL - PHI Paul Reed C PHI
9 LAL - PHX D. Russell PG LAL
10 LAL - PHX Austin Reaves SG LAL
11 LAL - PHX Rui Hachimura PF LAL
12 LAL - PHX Devin Booker PG PHX
13 LAL - PHX Grayson Allen SF PHX
14 LAL - PHX Kevin Durant PF PHX
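If the goal is to see both teams of a game side by side in one row, the long table above could be reshaped afterwards. The following is only a sketch under assumptions: it uses a few hand-copied sample rows instead of the live page, the helper name side_by_side and the Side and Slot columns are made up for illustration, and it assumes Match always has the form "VISIT - HOME" as built above. Players are paired purely by their order within each team, so a team with fewer listed players gets NaN in the remaining slots.

```python
import pandas as pd

# Hypothetical sample rows in the same shape as the scraped frame above
data = [
    {"Match": "MIL - PHI", "Name": "D. Lillard", "Pos": "PG", "Team": "MIL"},
    {"Match": "MIL - PHI", "Name": "Malik Beasley", "Pos": "SG", "Team": "MIL"},
    {"Match": "MIL - PHI", "Name": "Tyrese Maxey", "Pos": "PG", "Team": "PHI"},
    {"Match": "MIL - PHI", "Name": "Buddy Hield", "Pos": "SG", "Team": "PHI"},
    {"Match": "MIL - PHI", "Name": "Tobias Harris", "Pos": "SF", "Team": "PHI"},
]
df = pd.DataFrame(data)


def side_by_side(df):
    """Reshape the long frame so visiting and home players share a row."""
    out = df.copy()
    # Assumption: the visiting team is the part of Match before " - "
    visit = out["Match"].str.split(" - ").str[0]
    out["Side"] = (out["Team"] == visit).map({True: "Visit", False: "Home"})
    # Slot is the running position of a player within their own team and game
    out["Slot"] = out.groupby(["Match", "Team"]).cumcount()
    # Move the Side level into the columns; missing slots become NaN
    wide = out.set_index(["Match", "Slot", "Side"])[["Name", "Pos", "Team"]].unstack("Side")
    # Flatten the two-level column index, e.g. ("Name", "Visit") -> "Name_Visit"
    wide.columns = [f"{col}_{side}" for col, side in wide.columns]
    return wide.reset_index()


wide = side_by_side(df)
print(wide)
```

Calling side_by_side on the full scraped df should give one row per slot per game, with columns such as Name_Visit and Name_Home. Pairing by slot is a design choice for display only; it does not imply that the two players in a row guard each other.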