Programmatically creating subclasses

Question

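I am using Scrapy to crawl a set of similar pages (webcomics). Because these pages are so similar, I wrote a class called ComicSpider that contains all the spider logic and a few class variables (start_url, next_selector, etc.). I then override these class variables in a concrete class for each spider.

Creating a class by hand for each comic is tedious. I now want to specify the attributes in a JSON file and create the classes at runtime (i.e. apply the factory pattern (?)). What is the best way to do this?

Or: is there a way to run a spider without creating a class for it? Edit: the core problem seems to be that Scrapy works with spider classes, not spider instances. Otherwise I would just turn the class variables into instance variables and be done with it.

Example: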

class ComicSpider(Spider):
  name = None
  start_url = None
  next_selector = None
  # ...

  # this class contains much more logic than shown here

  def start_requests(self):
    # something including / along the lines of...
    yield Request(self.start_url, self.parse)

  def parse(self, response):
    # something including / along the lines of...
    yield Request(response.css(self.next_selector).get(), self.parse)

In another file:

class SupernormalStep(ComicSpider):
  name = "SupernormalStep"
  start_url = "https://supernormalstep.com/archives/8"
  next_selector = "a.cc-next"

What I want:

myComics = {
  "SupernormalStep": {
    "start_url": "https://supernormalstep.com/archives/8",
    "next_selector": "a.cc-next"
  }, # ...
}

process = CrawlerProcess(get_project_settings())
for name, attributes in myComics.items():
  process.crawl(build_process(name, attributes))

PS: I crawl responsibly.

Answer 1

The class statement is a declarative wrapper around type; just use type directly. Assuming process.crawl takes a class as its argument,

process = CrawlerProcess(get_project_settings())
for name, attributes in myComics.items():
    process.crawl(type(name, (ComicSpider,), attributes))

type(name, (ComicSpider,), attributes) creates a class named name that inherits from ComicSpider and has the attributes defined in the attributes dict. There is an example in the Python docs.
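To make the three-argument form of type concrete, here is a minimal sketch that builds the subclasses from JSON. It makes several assumptions: ComicSpider is replaced by a plain stand-in class so the snippet runs without Scrapy, the JSON is an inline string rather than a file, and build_spider_class is a hypothetical helper name. It also sets name on each generated class, since the JSON entries in the question do not include it.

```python
import json


# Stand-in for the question's ComicSpider so the sketch runs without Scrapy;
# in the real project this would be the Spider subclass from the question.
class ComicSpider:
    name = None
    start_url = None
    next_selector = None


# Hypothetical JSON content; in practice this might be read from a file.
COMICS_JSON = """
{
  "SupernormalStep": {
    "start_url": "https://supernormalstep.com/archives/8",
    "next_selector": "a.cc-next"
  }
}
"""


def build_spider_class(name, attributes):
    """Create a ComicSpider subclass called `name` with the given class attributes."""
    # Build a fresh namespace so the caller's dict is not mutated, and set
    # `name` explicitly because the JSON entries do not carry it.
    namespace = dict(attributes, name=name)
    return type(name, (ComicSpider,), namespace)


spider_classes = {
    name: build_spider_class(name, attributes)
    for name, attributes in json.loads(COMICS_JSON).items()
}

cls = spider_classes["SupernormalStep"]
print(cls.__name__, issubclass(cls, ComicSpider), cls.next_selector)
```

Each generated class could then be passed to process.crawl in the loop above, under the same assumption that process.crawl accepts a class.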

Answer 2

Look into metaclasses. That is how Python creates new classes dynamically. What are metaclasses in Python?

For this simpler case there is a simpler way, described in chepner's answer.
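As a rough illustration of what a metaclass is, the sketch below first shows that type is the default metaclass of ordinary classes, then defines a custom one. All names here (Base, RequireGreeting, greeting) are hypothetical and unrelated to Scrapy. The custom metaclass hooks into class creation to reject classes that leave a required attribute unset.

```python
class Base:
    greeting = "hello"


# A class built with a class statement...
class ViaStatement(Base):
    greeting = "hi"


# ...and an equivalent class built by calling type() directly.
ViaCall = type("ViaCall", (Base,), {"greeting": "hi"})

# Both are instances of the default metaclass, type.
print(type(ViaStatement) is type, type(ViaCall) is type)


# A custom metaclass customizes class creation itself. This hypothetical one
# refuses to create a class that does not define `greeting`.
class RequireGreeting(type):
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        if getattr(cls, "greeting", None) is None:
            raise TypeError(f"{name} must define 'greeting'")
        return cls


# A metaclass is called the same way as type: name, bases, namespace.
Checked = RequireGreeting("Checked", (), {"greeting": "hey"})
```

For the question's use case a plain call to type is enough; a metaclass only becomes worthwhile when class creation itself needs validation or extra behavior.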
