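The code below is supposed to return the status code and model number of a few products from the website https://www.selexion.be/. When I put all the URLs directly in the urls array in the code it works fine, but when I read the URLs from a CSV file I get the error below.
I also want to store each URL, status code, and model number in an array and, once the status code and model number have been extracted for all links, flush that array (.flush() and os.fsync()) to a CSV file. I currently get the output in the terminal, but I want it in a CSV file as well.
Error: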
PS C:\Users\Zandrio> & C:/Users/Zandrio/AppData/Local/Programs/Python/Python38/python.exe "c:/Users/Zandrio/Documents/Advanced Project/Selexion.py"
Traceback (most recent call last):
  File "c:/Users/Zandrio/Documents/Advanced Project/Selexion.py", line 49, in <module>
    asyncio.run(main())
  File "C:\Users\Zandrio\AppData\Local\Programs\Python\Python38\lib\asyncio\runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "C:\Users\Zandrio\AppData\Local\Programs\Python\Python38\lib\asyncio\base_events.py", line 612, in run_until_complete
    return future.result()
  File "c:/Users/Zandrio/Documents/Advanced Project/Selexion.py", line 41, in main
    await asyncio.gather(*(worker(f'w{index}', url, session)
  File "c:/Users/Zandrio/Documents/Advanced Project/Selexion.py", line 32, in worker
    response = await session.get(url, headers=header)
  File "C:\Users\Zandrio\AppData\Local\Programs\Python\Python38\lib\site-packages\aiohttp\client.py", line 380, in _request
    url = URL(str_or_url)
  File "C:\Users\Zandrio\AppData\Local\Programs\Python\Python38\lib\site-packages\yarl\__init__.py", line 149, in __new__
    raise TypeError("Constructor parameter should be str")
TypeError: Constructor parameter should be str
Code:
import asyncio
import csv
import aiohttp
import time
from bs4 import BeautifulSoup

urls = []
try:
    with open('C:\\Users\\Zandrio\\Documents\\Advanced Project\\input_links.csv','r', newline='') as csvIO:
        urls = list(csv.reader(csvIO))
except FileNotFoundError:
    pass

header = {
    'Host': 'www.selexion.be',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'Cache-Control': 'max-age=0',
    'TE': 'Trailers'
}

async def worker(name, url, session):
    response = await session.get(url, headers=header)
    html = await response.read()
    soup = BeautifulSoup(html, features='lxml').select_one('.title-options span:first-of-type').text
    print(f'URL: {url} - {response.status} - {soup}')

async def main():
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*(worker(f'w{index}', url, session)
                               for index, url in enumerate(urls)))

if __name__ == '__main__':
    start = time.perf_counter()
    asyncio.run(main())
    elapsed = time.perf_counter() - start
    print(f'Executed in {elapsed:0.2f} seconds')
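The error message says this is a TypeError, which means a function expected an argument of type A but was passed type B. Here the URL constructor expected a str:
TypeError: Constructor parameter should be str
on the line
await asyncio.gather(*(worker(f'w{index}', url, session)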
What is the type of url? You can find out with type(url) or by running a debugger.
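As a minimal sketch of that check, using an in-memory CSV as a stand-in for input_links.csv (the sample URLs are hypothetical, and the file is assumed to hold one URL per row): csv.reader yields each row as a list of strings, so every element of urls is a list rather than a str. Taking the first column of each row gives plain strings.

```python
import csv
import io

# Hypothetical stand-in for input_links.csv: one URL per row.
sample = io.StringIO(
    "https://www.selexion.be/product-a\n"
    "https://www.selexion.be/product-b\n"
)

rows = list(csv.reader(sample))
print(type(rows[0]))  # <class 'list'>: each row is a list, not a str

# Take the first column of every non-empty row to get plain strings.
urls = [row[0] for row in rows if row]
print(type(urls[0]))  # <class 'str'>
```

If the real file has more than one column, the index in row[0] would need to match the column that holds the URL.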
What happens when you flip these two lines?
await asyncio.gather(*(worker(f'w{index}', url, session)
                       for index, url in enumerate(urls)))
I can't tell where url comes from.
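For the second part of the question (collecting the results and flushing them to a CSV file), one possible approach is to have the worker return a tuple instead of printing, let asyncio.gather collect the return values, and write them out in one go. The sketch below makes no network request: fake_worker, save_results, out_path, and the "MODEL-X" value are hypothetical stand-ins, not part of the original script.

```python
import asyncio
import csv
import os
import tempfile

async def fake_worker(url):
    # Stand-in for the real worker: no request is made here.
    # The real one would return (url, response.status, model) instead of printing.
    await asyncio.sleep(0)
    return (url, 200, "MODEL-X")

def save_results(path, results):
    # Write all rows, then push them through Python's buffer (flush)
    # and ask the OS to commit them to disk (fsync).
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["url", "status", "model"])
        writer.writerows(results)
        f.flush()
        os.fsync(f.fileno())

async def main(urls):
    # gather returns the workers' return values in the same order as urls.
    return await asyncio.gather(*(fake_worker(u) for u in urls))

urls = ["https://www.selexion.be/product-a", "https://www.selexion.be/product-b"]
results = asyncio.run(main(urls))

# Hypothetical output location in a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "output.csv")
save_results(out_path, results)
```

Closing the file at the end of the with block already flushes Python's buffer; the explicit flush() and os.fsync() calls only matter if the data must be on disk before the block exits.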