Question:
I'm trying to reproduce the code of this talk:
https://www.youtube.com/watch?v=eD8XVXLlUTE
When I try to run the spider:
scrapy crawl talkspider_basic
I got this error:
raise KeyError("Spider not found: {}".format(spider_name))
KeyError: 'Spider not found: talkspider_basic'
The code of the spider is:
from scrapy.spiders import BaseSpider
from scrapy.selector import HtmlXPathSelector
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.contrib.loader import XPathItemLoader
from pytexas.items import PytexasItem

class TalkspiderBasicSpider(BaseSpider):
    name = "talkspider_basic"
    allowed_domains = ["www.pytexas.org"]
    start_urls = ['http://www.pytexas.org/2013/schedule']

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        dls = hxs.select('//dl')
        for dl in dls:
            times = dl.select('dt/text()').extract()
            titles = dl.select('dd/a/text()').extract()
            for time, title in zip(times, titles):
                title = title.strip()
                yield PytexasItem(title=title, time=time)
The code of the Items is:
from scrapy.item import Item, Field

class PytexasItem(Item):
    title = Field()
    time = Field()
    speaker = Field()
    description = Field()
The names of the project and of the spider's file are
pytexas
and
talk_spider_basic.py
respectively, so I don't think there is any conflict between the names.
Edit:
It has the default structure:
pytexas/
    scrapy.cfg
    pytexas/
        items.py
        pipelines.py
        settings.py
        spiders/
            __init__.py
            talk_spider_basic.py
Answer 1:
According to GitHub issue #2254, some modules such as scrapy.contrib are deprecated,
so you need to update the imports.
From:
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.contrib.loader import XPathItemLoader
To:
from scrapy.linkextractors import LinkExtractor
from scrapy.loader import XPathItemLoader
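For reference, here is a minimal sketch of how the whole spider could look on a recent Scrapy release, assuming the other deprecated classes (BaseSpider, HtmlXPathSelector) are also replaced with their current equivalents, scrapy.Spider and response.xpath(). The link-extractor and loader imports are dropped because this spider never uses them:

import scrapy
from pytexas.items import PytexasItem

class TalkspiderBasicSpider(scrapy.Spider):
    # Same spider, rewritten without the removed contrib/selector classes.
    name = "talkspider_basic"
    allowed_domains = ["www.pytexas.org"]
    start_urls = ['http://www.pytexas.org/2013/schedule']

    def parse(self, response):
        # response.xpath() replaces HtmlXPathSelector(response).select()
        for dl in response.xpath('//dl'):
            times = dl.xpath('dt/text()').extract()
            titles = dl.xpath('dd/a/text()').extract()
            for time, title in zip(times, titles):
                yield PytexasItem(title=title.strip(), time=time)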
Answer 2:
One solution, which works in some situations, is to downgrade your Scrapy installation (if it is >= 1.3). To do this you can run the following command:
pip install scrapy==1.2
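Before downgrading, it may help to confirm which version is actually installed; a quick check from the project's Python environment is:

import scrapy
print(scrapy.__version__)  # e.g. '1.3.0' -> the old contrib-style imports no longer work here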
Answer 3:
I know this post may be old, but I found another problem that can produce the "Spider not found" error. I have my spiders organized in folders, e.g. <crawler-project>/spiders/full
and <crawler-project>/spiders/clean
. I then created a new directory - <crawler-project>/spiders/aaa
- in which I placed a new spider. This new spider was not found by Scrapy until I created an __init__.py
file in that directory. So if you want to organize spiders in folders, each folder must be a valid Python package.
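For example, a layout along the lines of the full/clean folders described above would need an empty __init__.py in every subfolder before Scrapy's spider discovery will walk into it (the spider file names below are made up for illustration):

<crawler-project>/
    spiders/
        __init__.py
        full/
            __init__.py
            full_spider.py
        clean/
            __init__.py
            clean_spider.py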
Source: https://stackoverflow.com/questions/38627000/scrapy-spider-not-found