I am writing a crawler for a website using Scrapy with CrawlSpider.
Scrapy provides a built-in duplicate-request filter (the default is scrapy.dupefilters.RFPDupeFilter) which filters duplicate requests based on a fingerprint of the request URL.
In recent Scrapy versions, you can either use this default filter or replace it with a custom one.
To swap the filter, define the setting below in your spider settings. Note that scrapy.dupefilters.BaseDupeFilter is a no-op base class (its request_seen() always returns False), so pointing DUPEFILTER_CLASS at it as shown effectively disables duplicate filtering; for custom behaviour, subclass it and point the setting at your subclass instead.
DUPEFILTER_CLASS = 'scrapy.dupefilters.BaseDupeFilter'
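As a rough sketch of what a custom filter looks like: Scrapy's scheduler calls request_seen() on the configured filter and drops any request for which it returns True. The class below keys on the exact URL instead of the default fingerprint. It is shown standalone so it runs without Scrapy installed; in a real project it would subclass scrapy.dupefilters.BaseDupeFilter, and the module path and class name here are illustrative.

```python
# Sketch of a custom duplicate filter keyed on the exact request URL.
# In a real project this would subclass scrapy.dupefilters.BaseDupeFilter
# and be registered via, e.g.:
#   DUPEFILTER_CLASS = 'myproject.dupefilters.URLDupeFilter'
# (module path and class name are hypothetical).

class URLDupeFilter:
    """Implements the request_seen() contract the scheduler relies on."""

    def __init__(self):
        self.seen_urls = set()

    def request_seen(self, request):
        # Return True to drop the request as a duplicate,
        # False to let it through and remember its URL.
        if request.url in self.seen_urls:
            return True
        self.seen_urls.add(request.url)
        return False
```

Keying on request.url alone (rather than the default fingerprint, which also canonicalizes the URL and accounts for method and body) is stricter in some ways and looser in others, so choose the key to match how your target site distinguishes pages.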