I am learning Python web scraping. When I run scrapy crawl on my spider, it shows an AttributeError

Problem description

I am learning Python scraping with Scrapy. I did exactly what the tutorial taught, but I got an error. Please help!

My Python code:

import scrapy


class BookSpider(scrapy.Spider):
    name = "books"
    allowed_domains = ["books.toscrape.com"]
    start_urls = ["https://books.toscrape.com"]

    def parse(self, response):
        books = response.css("article.product_pod")

        for book in books:
            yield {
                "name": book.css("h3 a::text").get(),
                "price": book.css(".product_price .price_color::text").get(),
                "url": book.css("h3 a").attrib["href"],
            }

The terminal shows:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\Administrator\python\venv\bookscraper\Scripts\scrapy.exe\__main__.py", line 7, in <module>
  File "C:\Users\Administrator\python\venv\bookscraper\Lib\site-packages\scrapy\cmdline.py", line 161, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "C:\Users\Administrator\python\venv\bookscraper\Lib\site-packages\scrapy\cmdline.py", line 114, in _run_print_help
    func(*a, **kw)
  File "C:\Users\Administrator\python\venv\bookscraper\Lib\site-packages\scrapy\cmdline.py", line 169, in _run_command
    cmd.run(args, opts)
  File "C:\Users\Administrator\python\venv\bookscraper\Lib\site-packages\scrapy\commands\crawl.py", line 30, in run
    self.crawler_process.start()
  File "C:\Users\Administrator\python\venv\bookscraper\Lib\site-packages\scrapy\crawler.py", line 390, in start
    install_shutdown_handlers(self._signal_shutdown)
  File "C:\Users\Administrator\python\venv\bookscraper\Lib\site-packages\scrapy\utils\ossignal.py", line 19, in install_shutdown_handlers    reactor._handleSignals()
    ^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'AsyncioSelectorReactor' object has no attribute '_handleSignals'

The ossignal.py file:

import signal

signal_names = {}
for signame in dir(signal):
    if signame.startswith("SIG") and not signame.startswith("SIG_"):
        signum = getattr(signal, signame)
        if isinstance(signum, int):
            signal_names[signum] = signame


def install_shutdown_handlers(function, override_sigint=True):
    """Install the given function as a signal handler for all common shutdown
    signals (such as SIGINT, SIGTERM, etc). If override_sigint is ``False`` the
    SIGINT handler won't be install if there is already a handler in place
    (e.g.  Pdb)
    """
    from twisted.internet import reactor

    reactor._handleSignals()
    signal.signal(signal.SIGTERM, function)
    if signal.getsignal(signal.SIGINT) == signal.default_int_handler or override_sigint:
        signal.signal(signal.SIGINT, function)
    # Catch Ctrl-Break in windows
    if hasattr(signal, "SIGBREAK"):
        signal.signal(signal.SIGBREAK, function)

Tags: python, scrapy, python-asyncio, twisted
1 Answer

As I pointed out in the comments, the problem you are describing has already been reported to the Scrapy project here, and it is related to one of its dependencies, Twisted: newer Twisted releases no longer provide the private reactor._handleSignals() method that this version of Scrapy's install_shutdown_handlers still calls, which is exactly the line your traceback points at.

Another user fixed the problem by installing a previous version of Twisted (see here).

Basically, he installed the following version of Twisted, which solved the problem for him:

pip install Twisted==22.10.0

Until the incompatibility is fixed and a new release is published, I recommend staying on that previous version.
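
If it helps, here is a minimal sketch for checking which Twisted version your virtualenv is actually using before and after the downgrade. twisted.__version__ and the AsyncioSelectorReactor class are standard Twisted names; the hasattr() probe simply mirrors the call that fails in scrapy/utils/ossignal.py.

# Run inside the bookscraper virtualenv.
import twisted
from twisted.internet.asyncioreactor import AsyncioSelectorReactor

print("Twisted version:", twisted.__version__)  # expect 22.10.0 after the pinned install
# Mirrors the failing call in install_shutdown_handlers(); should report True
# on Twisted 22.10.0, so scrapy crawl can start normally again.
print("has _handleSignals:", hasattr(AsyncioSelectorReactor, "_handleSignals"))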
