Scrapy 0.22: An error occurred while connecting: <class 'twisted.internet.error.ConnectionLost'>

Submitted by 醉酒当歌 on 2019-12-10 12:21:13

Question


Good morning,

I get a connection error while executing one of my spiders:

2014-02-28 10:21:00+0400 [butik] DEBUG: Retrying <GET http://www.butik.ru/> (failed 1 times): An error occurred while connecting: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion: Connection lost.].

Afterwards the spider shuts down.

All my other spiders with a similar structure run smoothly, but this one does not:

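from scrapy.spider import Spider      # import paths as of Scrapy 0.22
from scrapy.selector import Selector
from scrapy.http import Request

class butik(Spider):
    name = "butik"
    allowed_domains = ['butik.ru']
    start_urls = ['http://www.butik.ru/']

    def parse(self, response):
        sel = Selector(response)
        print response.url
        # Collect the main-menu category links and follow each one
        maincats = sel.xpath('//div[@id="main_menu"]//a/@href').extract()
        for maincat in maincats:
            maincat = 'http://www.butik.ru' + maincat
            # self.categories is another callback of this spider (not shown)
            request = Request(maincat, callback=self.categories)
            yield request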

I have no idea which steps to take to fix this issue and would be glad for any hints or answers. If additional information is needed, I am happy to provide the necessary code.

Thanks in advance

J


Answer 1:


You can try urllib2 instead. I ran into a similar problem when using Scrapy to crawl a page, and I worked around it by calling urllib2 inside the parse callback:

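import urllib2

def parse(self, response):
    # ...
    url = 'http://www.example.com'  # urllib2 needs the scheme in the URL
    data = None  # None sends a GET; a urlencoded string would send a POST
    req = urllib2.Request(url, data)
    page = urllib2.urlopen(req)  # blocking call that bypasses Scrapy's downloader
    the_page = page.read()
    # ...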
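
For anyone reading this on Python 3, where urllib2 no longer exists (it was merged into urllib.request), the same workaround might look roughly like the sketch below. The helper names build_request and fetch_page are made up for illustration, and the opener parameter is only there so the function can be tried without a network connection. Keep in mind that this is a blocking call made outside Scrapy's downloader, so Scrapy's retry and throttling settings do not apply to it.

```python
import urllib.request


def build_request(url, data=None):
    """Build a urllib request: GET when data is None, POST otherwise.

    Hypothetical helper, not part of Scrapy or of the original answer.
    """
    if "://" not in url:
        # urllib rejects scheme-less URLs such as 'www.example.com'
        url = "http://" + url
    return urllib.request.Request(url, data=data)


def fetch_page(url, data=None, timeout=10, opener=urllib.request.urlopen):
    """Fetch the page body as bytes with a plain blocking request."""
    req = build_request(url, data)
    # The response object is a context manager, so it is closed on exit
    with opener(req, timeout=timeout) as resp:
        return resp.read()
```

Inside a spider callback this would be called as the_page = fetch_page('http://www.example.com'), in place of the urllib2 lines above.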


Source: https://stackoverflow.com/questions/22088778/scrapy0-22-an-error-occured-while-connecting-class-twisted-internet-error-co
