module.__init__() takes at most 2 arguments (3 given) (Scrapy tutorial w/ XPath)

Time: 2021-12-24 04:45:02

I'm not sure how I'm passing 3 arguments here, and if I am, how do I call DmozItem from items.py? This seems like a simple inheritance issue that I'm missing. This code is copied directly from the Scrapy tutorial website.

-- Shell error --

SyntaxError: invalid syntax
PS C:\Users\Steve\tutorial> scrapy crawl dmoz
Traceback (most recent call last):
  File "c:\python27\scripts\scrapy-script.py", line 9, in <module>
    load_entry_point('scrapy==1.0.3', 'console_scripts', 'scrapy')()
  File "C:\Python27\lib\site-packages\scrapy-1.0.3-py2.7.egg\scrapy\cmdline.py", line 142, in execute
    cmd.crawler_process = CrawlerProcess(settings)
  File "C:\Python27\lib\site-packages\scrapy-1.0.3-py2.7.egg\scrapy\crawler.py", line 209, in __init__
    super(CrawlerProcess, self).__init__(settings)
  File "C:\Python27\lib\site-packages\scrapy-1.0.3-py2.7.egg\scrapy\crawler.py", line 115, in __init__
    self.spider_loader = _get_spider_loader(settings)
  File "C:\Python27\lib\site-packages\scrapy-1.0.3-py2.7.egg\scrapy\crawler.py", line 296, in _get_spider_loader
    return loader_cls.from_settings(settings.frozencopy())
  File "C:\Python27\lib\site-packages\scrapy-1.0.3-py2.7.egg\scrapy\spiderloader.py", line 30, in from_settings
    return cls(settings)
  File "C:\Python27\lib\site-packages\scrapy-1.0.3-py2.7.egg\scrapy\spiderloader.py", line 21, in __init__
    for module in walk_modules(name):
  File "C:\Python27\lib\site-packages\scrapy-1.0.3-py2.7.egg\scrapy\utils\misc.py", line 71, in walk_modules
    submod = import_module(fullpath)
  File "C:\Python27\lib\importlib\__init__.py", line 37, in import_module
    __import__(name)
  File "C:\Users\Steve\tutorial\tutorial\spiders\dmoz_spider.py", line 3, in <module>
    from tutorial.items import DmozItem
  File "C:\Users\Steve\tutorial\tutorial\items.py", line 11, in <module>
    class DmozItem(scrapy.item):
TypeError: Error when calling the metaclass bases
    module.__init__() takes at most 2 arguments (3 given)

-- items.py -- my items list for parsing

import scrapy


class DmozItem(scrapy.item):

    title = scrapy.Field()
    link = scrapy.Field()
    desc = scrapy.Field()

-- dmoz_spider.py -- this is the spider

import scrapy

from tutorial.items import DmozItem

class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "https://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "https://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        for sel in response.xpath('//ul/li'):
            item = DmozItem()
            item['title'] = sel.xpath('a/text()').extract()
            item['link'] = sel.xpath('a/@href').extract()
            item['desc'] = sel.xpath('text()').extract()
            yield item

1 Solution

#1


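You have mistyped the scrapy.Item class name. The lowercase scrapy.item is a module, not a class, which is why the error message mentions module.__init__().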

In items.py, change:

scrapy.item

to

scrapy.Item

It should look like this:

import scrapy

class DmozItem(scrapy.Item):
    title = scrapy.Field()
    link = scrapy.Field()
    desc = scrapy.Field()
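
As for where the "3 arguments" come from: when a class statement runs, Python calls the type of the first base with three arguments (the class name, the bases tuple, and the class namespace). If that base is a module, the call goes to the module type, which accepts at most two. Here is a minimal sketch that reproduces this without Scrapy installed. The names fake_item and subclass_error are made up for illustration, and the exact wording of the message varies between Python versions.

```python
import types

# Hypothetical stand-in for the scrapy.item submodule (any module object behaves the same).
fake_item = types.ModuleType("item")


def subclass_error(base):
    """Try to define a class inheriting from `base`.

    Returns the TypeError message if class creation fails, or None if it succeeds.
    """
    try:
        class Demo(base):  # Python calls type(base)(name, bases, namespace) here
            pass
    except TypeError as exc:
        return str(exc)
    return None


# Inheriting from a module fails, just like `class DmozItem(scrapy.item)`.
module_error = subclass_error(fake_item)

# Inheriting from a real class works, just like `class DmozItem(scrapy.Item)`.
class_error = subclass_error(object)

print("module as base:", module_error)
print("class as base: ", class_error)
```

If you hit this error with another library, checking isinstance(base, types.ModuleType) on the name you are inheriting from is a quick way to confirm you picked up a module instead of a class.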
