Following hyperlink and “Filtered offsite request”


I know that there are several related threads out there, and they have helped me a lot, but I still can't get all the way. I am at the point where running the code doesn't


2 Answers

灰色年华

2020-12-16 13:44

Try setting dont_filter=True on the request:

yield Request(url=url2, meta={'address': hxs.select("id('searchresult')/tr/td[1]/a[@href]/text()").extract()}, callback=self.parse2, dont_filter=True)

死守一世寂寞

2020-12-16 13:56
You need to modify your yielded Request in parse to use parse2 as its callback.

EDIT: allowed_domains shouldn't include the http:// prefix, e.g.:
```
allowed_domains = ["boliga.dk"]
```
Try that and see whether your spider still runs correctly, instead of leaving allowed_domains blank.
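
To see why the http prefix matters, here is a rough sketch of the kind of host check that produces "Filtered offsite request". It is a simplified stand-in written with the standard library only, not Scrapy's actual middleware code; the helper names build_host_regex and is_offsite and the example URL path are made up for illustration.

```python
import re
from urllib.parse import urlparse


def build_host_regex(allowed_domains):
    """Build a pattern that matches each allowed domain and its subdomains."""
    if not allowed_domains:
        # Assumption: an empty list means no host is filtered.
        return re.compile("")
    escaped = "|".join(re.escape(d) for d in allowed_domains)
    return re.compile(rf"^(.*\.)?({escaped})$")


def is_offsite(url, allowed_domains):
    """Return True if the URL's host is not covered by allowed_domains."""
    # Only the hostname is compared, never the scheme or path.
    host = urlparse(url).hostname or ""
    return not build_host_regex(allowed_domains).search(host)


# Hypothetical URL on the site from the question.
url = "http://www.boliga.dk/salg/info"

print(is_offsite(url, ["boliga.dk"]))             # bare domain
print(is_offsite(url, ["http://www.boliga.dk"]))  # entry with the http prefix
print(is_offsite(url, []))                        # blank allowed_domains
```

Because only the hostname (www.boliga.dk) is compared, an entry that still carries http:// can never match, so every follow-up request looks offsite. Passing dont_filter=True gets around the filter for a single request, but fixing allowed_domains addresses the cause.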