I have about 200 URLs in an Excel worksheet, and I want to extract the article data from each of these URLs.


import requests
import pandas as pd
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36',
}

df = pd.read_excel("C:\\Users\\CSEv L031\\Downloads\\Arab.xlsx")
data = df.URL       # column of URLs to scrape
name = df.URL_ID    # column of IDs used as output file names

for url in data:
    page = requests.get(url, headers=headers).text
    soup = BeautifulSoup(page, "html.parser")
    article_title = soup.find('h1', class_='entry-title')
    article = soup.find('div', class_='td-post-content')
    print(article)
    for i in name:
        file = open('%i.txt' % i, 'w')
        for article_body in soup.find_all('p'):
            title = article_title.text
            body = article.text
            file.write(title)
            file.write(body)
        file.close()

When I print the article, the result shows None.
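A likely cause is that soup.find() returns None when a page does not contain an element matching the given class (or when the request is blocked), so printing article shows None and calling article.text raises an error. A separate issue is that the inner loop iterates over every URL_ID for every page, rewriting all output files with the same content. Below is a minimal sketch, assuming the same column names (URL, URL_ID), Excel path, and CSS classes from the question; it pairs each URL with its own ID via zip, guards against missing elements, and writes one file per page.

import requests
import pandas as pd
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36',
}

df = pd.read_excel("C:\\Users\\CSEv L031\\Downloads\\Arab.xlsx")

# Pair each URL with its own URL_ID instead of looping over every ID per page.
for url_id, url in zip(df.URL_ID, df.URL):
    try:
        response = requests.get(url, headers=headers, timeout=30)
        response.raise_for_status()
    except requests.RequestException as exc:
        print(f"Skipping {url}: {exc}")
        continue

    soup = BeautifulSoup(response.text, "html.parser")
    article_title = soup.find('h1', class_='entry-title')
    article = soup.find('div', class_='td-post-content')

    # find() returns None when the element is absent, so check before using .text
    if article_title is None or article is None:
        print(f"Selectors not found on {url}")
        continue

    with open(f"{url_id}.txt", "w", encoding="utf-8") as f:
        f.write(article_title.get_text(strip=True) + "\n")
        f.write(article.get_text())

If the selectors still come back as None for pages that clearly contain the content in a browser, the site may render the article with JavaScript or block non-browser clients, in which case a different approach (e.g. inspecting the raw HTML that requests actually received) would be needed.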

excel web-scraping scrapy screen-scraping scrape