Scraping an artist's Billboard Hot 100 singles history with BeautifulSoup


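I'm trying to scrape all of the information on an artist's Billboard page that relates to their singles and how they performed. I'm adapting a solution I saw elsewhere. It works up to a point, but once I get past "peak position" I don't know how to capture the "peak date" and "wks" columns. I'm basically trying to capture everything shown in the table on the site and eventually put it into a DataFrame, but I can't get the last two columns. Any pointers would be greatly appreciated. Thanks!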

import requests
from bs4 import BeautifulSoup

url = requests.get('https://www.billboard.com/artist/john-lennon/chart-history/hsi/')
soup = BeautifulSoup(url.content, 'html.parser')
result = soup.find_all('div','o-chart-results-list-row')

for res in result:
    song = res.find('h3').text.strip()
    artist = res.find('h3').find_next('span').text.strip()
    debute = res.find('span').find_next('span').text.strip()
    peak = res.find('a').find_next('span').text.strip()
    #peak_date = ?
    #wks = ?

    print("song: "+str(song))
    print("artist: "+ str(artist))
    print("debute: "+ str(debute))
    print("peak: "+ str(peak))
    print("___________________________________________________")

song: (Just Like) Starting Over
artist: John Lennon
debute: 11.01.80
peak: 1
peak date:
wks:

python html web-scraping beautifulsoup python-requests
1 Answer

Try:

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = "https://www.billboard.com/artist/john-lennon/chart-history/hsi/"

soup = BeautifulSoup(requests.get(url).content, "html.parser")

data = []
# Each chart entry is one ".o-chart-results-list-row" element.
for row in soup.select(".o-chart-results-list-row"):
    title = row.h3.get_text(strip=True)
    artist = row.span.get_text(strip=True)
    # The remaining columns carry their own CSS classes, so select them directly.
    debut_date = row.select_one(".artist-chart-row-debut-date").get_text(strip=True)
    peak_pos = row.select_one(".artist-chart-row-peak-pos").get_text(strip=True)
    peak_week = row.select_one(".artist-chart-row-peak-week").get_text(strip=True)
    # Note: peak_date is extracted but not stored below; add a "Peak Date" key to keep it.
    peak_date = row.select_one(".artist-chart-row-peak-date").get_text(strip=True)
    wks_on_chart = row.select_one(".artist-chart-row-week-on-chart").get_text(
        strip=True
    )
    data.append(
        {
            "Title": title,
            "Artist": artist,
            "Debut Date": debut_date,
            "Peak Pos": peak_pos,
            "Peak Week": peak_week,
            "Weeks on Chart": wks_on_chart,
        }
    )


df = pd.DataFrame(data)
print(df)

Prints:

                               Title                                                            Artist Debut Date Peak Pos Peak Week Weeks on Chart
0          (Just Like) Starting Over                                                       John Lennon   11.01.80        1     5 WKS             22
1                              Woman                                                       John Lennon   01.17.81        2    12 Wks             20
2                Watching The Wheels                                                       John Lennon   03.28.81       10    12 Wks             17
3   Whatever Gets You Thru The Night                     John Lennon With The Plastic Ono Nuclear Band   09.28.74        1     1 WKS             15
4                     Nobody Told Me                                                       John Lennon   01.21.84        5    12 Wks             14
5    Instant Karma (We All Shine On)                                                   John Ono Lennon   02.28.70        3    12 Wks             13
6                         MIND GAMES                                                       John Lennon   11.10.73       18    12 Wks             13
7                           #9 Dream                                                       John Lennon   12.21.74        9    12 Wks             12
8                        Cold Turkey                                                  Plastic Ono Band   11.15.69       30    12 Wks             12
9                            Imagine                                      John Lennon/Plastic Ono Band   10.23.71        3    12 Wks              9
10               Give Peace A Chance                                                  Plastic Ono Band   07.26.69       14    12 Wks              9
11               Power To The People            John Lennon/Plastic Ono Band Yoko Ono/Plastic Ono Band   04.03.71       11    12 Wks              9
12                       Stand By Me                                                       John Lennon   03.15.75       20    12 Wks              9
13                            Mother            John Lennon/Plastic Ono Band Yoko Ono/Plastic Ono Band   01.09.71       43    12 Wks              6
14          Happy Xmas (War Is Over)  John & Yoko/The Plastic Ono Band With The Harlem Community Choir   12.29.18       38    12 Wks              6
15                  I'm Steppin' Out                                                       John Lennon   03.31.84       55    12 Wks              6
16  Woman Is The Nigger Of The World               John Lennon/Plastic Ono Band With Elephant's Memory   05.20.72       57    12 Wks              5
17                       Jealous Guy                                John Lennon & The Plastic Ono Band   10.15.88       80    12 Wks              4
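
Every value in the table above is still a string. If typed columns are wanted, a minimal sketch of the conversion could look like the following. The helper names `parse_chart_date` and `normalize_row` are hypothetical, and the `MM.DD.YY` date layout is an assumption inferred from values such as `01.17.81`. Python's `%y` maps 69-99 to 1969-1999 and 00-68 to 2000-2068, which fits the rows shown here but would misread a debut from before 1969.

```python
from datetime import date, datetime


def parse_chart_date(text: str) -> date:
    """Parse a 'MM.DD.YY' string such as '11.01.80' into a date (assumed layout)."""
    return datetime.strptime(text, "%m.%d.%y").date()


def normalize_row(row: dict) -> dict:
    """Return a copy of a scraped row with a typed date and integer fields."""
    typed = dict(row)
    typed["Debut Date"] = parse_chart_date(row["Debut Date"])
    typed["Peak Pos"] = int(row["Peak Pos"])
    typed["Weeks on Chart"] = int(row["Weeks on Chart"])
    # "Peak Week" (e.g. "5 WKS") is left untouched as text.
    return typed


# One row copied from the printed table, in the same shape as the dicts in `data`.
sample = {
    "Title": "(Just Like) Starting Over",
    "Artist": "John Lennon",
    "Debut Date": "11.01.80",
    "Peak Pos": "1",
    "Peak Week": "5 WKS",
    "Weeks on Chart": "22",
}
print(normalize_row(sample))
```

With the scraped list from the answer, this could be applied as `pd.DataFrame([normalize_row(r) for r in data])`, so that sorting by debut date or weeks on chart works numerically rather than alphabetically.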