I am able to scrape data with Beautiful Soup, and now I want to generate a file that contains everything Beautiful Soup scraped from the page.
file = open("copy.txt", "w")
data = soup.get_text()
data
file.write(soup.get_text())
file.close()
The text file does not contain all of the tags and the full content. Any ideas on how to achieve this?
You can use:
with open("copy.txt", "w") as file:
    file.write(str(soup))
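To see why the original attempt lost the tags: get_text() returns only the text nodes, while str(soup) serializes the whole parse tree, tags included. A minimal sketch, assuming a small made-up HTML snippet (not the page from the question) and the built-in "html.parser":

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
html = "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
soup = BeautifulSoup(html, "html.parser")

text_only = soup.get_text()  # tags are stripped; only the text nodes remain
full_markup = str(soup)      # tags and text are serialized together

print(text_only)
print(full_markup)
```

Writing full_markup to the file instead of text_only is what keeps the tags.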
Quick solution:
You just need to convert the soup to a string. In case anyone else wants to follow along, this uses a test site:
from bs4 import BeautifulSoup as BS
import requests
r = requests.get("https://webscraper.io/test-sites/e-commerce/allinone")
soup = BS(r.content, "html.parser")  # name the parser explicitly to avoid bs4's "no parser specified" warning
file = open("copy.txt", "w")
file.write(str(soup))
file.close()
A slightly better solution:
It is better practice to use a context manager for file I/O (the with statement):
from bs4 import BeautifulSoup as BS
import requests
r = requests.get("https://webscraper.io/test-sites/e-commerce/allinone")
soup = BS(r.content, "html.parser")  # name the parser explicitly to avoid bs4's "no parser specified" warning
with open("copy.txt", "w") as file:
    file.write(str(soup))
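One more thing that may be worth considering: open() uses a platform-dependent default encoding, so a page with non-ASCII characters can fail to write on some systems. A hedged sketch, assuming a made-up snippet and a temporary output path (both hypothetical), that passes encoding="utf-8" explicitly and uses prettify() for indented output:

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Hypothetical sample markup containing a non-ASCII character.
html = "<html><body><p>Price: €10</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

# Hypothetical output location; in the answer above this would be "copy.txt".
out_path = os.path.join(tempfile.mkdtemp(), "copy.txt")

# An explicit encoding avoids depending on the platform default.
with open(out_path, "w", encoding="utf-8") as file:
    file.write(soup.prettify())  # prettify() returns the markup as an indented string

# Read the file back to confirm that tags and text were both saved.
with open(out_path, encoding="utf-8") as file:
    saved = file.read()

print(saved)
```

str(soup) works equally well here if the original, unindented markup is preferred.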