I am trying to write data that I "scraped" from a website to a JSON output file using the following code:
from bs4 import BeautifulSoup
import requests
import json
path = ["https://www.test.be?page=", "https://www.test2.be?page="]
adresArr = []
for i in path:
    pagina = 0
    for x in range(0, 4):
        url = i + str(pagina)
        response = requests.get(url, timeout=5)
        content = BeautifulSoup(response.content, "html.parser")
        for adres in content.findAll('tr', attrs={"class": "odd clickable-row"}):
            adresObject = {
                "postcode": adres.find('td', attrs={"class": "views-field views-field-field-locatie-postal-code"}).text.encode('utf-8'),
                "naam": adres.find('td', attrs={"class": "views-field views-field-field-locatie-thoroughfare"}).text.encode('utf-8'),
                "plaats": adres.find('td', attrs={"class": "views-field views-field-field-locatie-locality"}).text.encode('utf-8')
            }
            adresArr.append(adresObject)
        for adres in content.findAll('tr', attrs={"class": "odd clickable-row active"}):
            adresObject = {
                "postcode": adres.find('td', attrs={"class": "views-field views-field-field-locatie-postal-code"}).text.encode('utf-8'),
                "naam": adres.find('td', attrs={"class": "views-field views-field-field-locatie-thoroughfare"}).text.encode('utf-8'),
                "plaats": adres.find('td', attrs={"class": "views-field views-field-field-locatie-locality"}).text.encode('utf-8')
            }
            adresArr.append(adresObject)
        pagina = x
with open('adresData.json', 'w') as outfile:
    json.dump(adresArr, outfile)
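I get the following error: Object of type bytes is not JSON serializable
If I print the array itself, it looks fine. But I am stuck on writing it to a JSON file. What am I doing wrong?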
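The error seems to be reproducible without any scraping. Below is a minimal sketch, assuming the dictionary values behave like the encoded cell text in the code above. The sample value "1000" and the helper name try_dump are made up for illustration and are not part of my real script.

```python
import json

# Hypothetical sample values standing in for the scraped table cells.
as_bytes = {"postcode": "1000".encode("utf-8")}  # bytes, as produced by .encode('utf-8')
as_text = {"postcode": "1000"}                   # plain str, without .encode


def try_dump(record):
    """Return the JSON string, or the error message if serialization fails."""
    try:
        return json.dumps(record)
    except TypeError as exc:
        return "TypeError: " + str(exc)


bytes_result = try_dump(as_bytes)
text_result = try_dump(as_text)

print(bytes_result)  # the bytes value is rejected by the json module
print(text_result)   # the str value is serialized normally
```

In this sketch only the dictionary holding a bytes value fails, which suggests the problem is related to the .encode('utf-8') calls rather than to the scraping itself.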
This is my first time coding in Python (and I don't have much coding experience), so please explain your answer clearly :)
Thanks in advance