I know that there are several related threads out there, and they have helped me a lot, but I still can't get all the way. I am at the point where running the code doesn't
Try setting dont_filter=True on the request:
yield Request(url=url2, meta={'address': hxs.select("id('searchresult')/tr/td[1]/a[@href]/text()").extract()}, callback=self.parse2, dont_filter=True)
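To show why dont_filter can matter, here is a toy sketch of a duplicate filter. It is not Scrapy's actual implementation; the ToyScheduler class and the URL path are made up for illustration. The idea is only that a request for an already-seen URL is dropped silently unless dont_filter is set.

```python
class ToyScheduler:
    """Toy stand-in for a crawler's duplicate filter (not Scrapy's real code)."""

    def __init__(self):
        self.seen = set()   # URLs that have already been scheduled
        self.queue = []     # requests accepted for download

    def enqueue(self, url, dont_filter=False):
        # A repeated URL is dropped unless the caller opts out of filtering.
        if not dont_filter and url in self.seen:
            return False
        self.seen.add(url)
        self.queue.append(url)
        return True


scheduler = ToyScheduler()
url = "http://www.boliga.dk/some-page"  # hypothetical URL for illustration

first = scheduler.enqueue(url)                     # new URL, accepted
second = scheduler.enqueue(url)                    # duplicate, dropped
third = scheduler.enqueue(url, dont_filter=True)   # duplicate, but forced through
print(first, second, third)
```

If parse2 is never called and there is no error, a silently dropped duplicate request like the second one above is one possible cause worth ruling out.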
You need to modify the Request you yield in parse so that it uses parse2 as its callback.
EDIT: allowed_domains shouldn't include the http:// prefix, e.g.:
allowed_domains = ["boliga.dk"]
Try that and see whether your spider still runs correctly, instead of leaving allowed_domains blank.
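A rough sketch of why the prefix matters, assuming the offsite check compares each request's hostname against the entries in allowed_domains. The is_allowed helper is hypothetical and only approximates what Scrapy does internally; it uses just the standard library.

```python
import re
from urllib.parse import urlparse


def is_allowed(url, allowed_domains):
    """Return True if the URL's host is an allowed domain or a subdomain of one."""
    if not allowed_domains:
        # Assumption: an empty list means no offsite filtering at all.
        return True
    host = urlparse(url).hostname or ""
    # Match the bare domain or any subdomain of it.
    pattern = r"^(.*\.)?(%s)$" % "|".join(re.escape(d) for d in allowed_domains)
    return re.match(pattern, host) is not None


url = "http://www.boliga.dk/"
with_plain_domain = is_allowed(url, ["boliga.dk"])         # host matches
with_http_prefix = is_allowed(url, ["http://boliga.dk"])   # host never contains "http://"
print(with_plain_domain, with_http_prefix)
```

Under that assumption, an entry like "http://boliga.dk" can never equal a hostname, so every follow-up request would be treated as offsite and dropped, which would also explain why leaving the list blank made the spider work.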