I coded a simple crawler. In the settings.py file, following the Scrapy documentation, I used

DUPEFILTER_CLASS = 'scrapy.dupefilter.RFPDupeFilter'
You can replace the default scheduler with a Redis-backed one, such as the one provided by scrapy-redis. Because the scheduler and duplicate filter then keep their state in Redis rather than in memory, already-seen URLs are remembered across runs, so you avoid crawling duplicate URLs when rerunning your project.
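As a rough sketch, assuming the scrapy-redis package is installed (`pip install scrapy-redis`) and a Redis server is reachable at the URL shown (the host/port here are placeholders), the relevant settings.py entries look something like this:

```python
# settings.py -- swap Scrapy's in-memory scheduler/dupefilter for
# the Redis-backed ones from scrapy-redis.

# Use the scrapy-redis scheduler instead of Scrapy's default.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"

# Use the Redis-backed duplicate filter so seen request fingerprints
# are stored in Redis rather than in process memory.
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

# Keep the request queue and seen-set in Redis after the spider
# closes, so a rerun resumes instead of starting from scratch.
SCHEDULER_PERSIST = True

# Connection string for your Redis instance (adjust host/port as needed).
REDIS_URL = "redis://localhost:6379"
```

With `SCHEDULER_PERSIST = True`, the fingerprint set survives between runs, which is what prevents re-crawling the same URLs after a restart.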