I use Tor to crawl web pages. I started the tor and polipo services and added

    class ProxyMiddleware(object):
        # overwrite process request
        def process_reques
You can yield a first request that checks your public IP, then compare the result with the IP you see when you visit http://checkip.dyndns.org/ in a browser without Tor or a VPN. If the two addresses differ, Scrapy is sending its requests through a different IP than your direct connection.
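    from scrapy import Request

    # The two methods below belong inside your spider class.
    def start_requests(self):
        # First request: ask the IP-echo service which address it sees
        yield Request('http://checkip.dyndns.org/', callback=self.check_ip)
        # yield other requests from start_urls here if needed

    def check_ip(self, response):
        # Pull the first dotted-quad IPv4 address out of the page body
        pub_ip = response.xpath('//body/text()').re(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')[0]
        print("My public IP is: " + pub_ip)
        # yield other requests here if needed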
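
If you would rather have the comparison done in code than by eye, something along these lines might work. It is only a sketch: the helper names `extract_ip` and `is_proxied` are made up for illustration, the sample response body is an assumption about what the page returns, and the addresses are reserved documentation IPs, not real ones. You would supply your own direct IP (noted down once without Tor) and pass in `response.text` from the `check_ip` callback.

```python
import re

# Same dotted-quad pattern as the one used in check_ip.
IP_PATTERN = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')


def extract_ip(body):
    """Return the first IPv4-looking string in body, or None if there is none."""
    match = IP_PATTERN.search(body)
    return match.group(0) if match else None


def is_proxied(direct_ip, proxied_body):
    """Return True if the IP seen through the proxy differs from direct_ip."""
    seen_ip = extract_ip(proxied_body)
    if seen_ip is None:
        raise ValueError("no IP address found in response body")
    return seen_ip != direct_ip


if __name__ == "__main__":
    # Hypothetical values: replace with your real direct IP and response.text.
    direct_ip = "203.0.113.7"
    sample_body = "Current IP Address: 198.51.100.23"
    print("Proxy in use:", is_proxied(direct_ip, sample_body))
```

Raising on a missing address rather than returning False keeps a blocked or empty response from being mistaken for "not proxied".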