I have created a spider to scrape problems from projecteuler.net. Here I have concluded my answer to a related question with
I launch this with the comma
If I needed my output file to be sorted (I will assume you have a valid reason to want this), I'd probably write a custom exporter.
This is how Scrapy's built-in JsonItemExporter is implemented.
With a few simple changes, you can modify it to add the items to a list in export_item(), and then sort the items and write out the file in finish_exporting().
Since you're only scraping a few hundred items, the downsides of holding them all in a list and not writing the file until the crawl is done shouldn't be a problem for you.
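
As a rough illustration of that idea, here is a minimal sketch of such an exporter. It deliberately does not import Scrapy, so it can be run on its own; it only mirrors the start_exporting() / export_item() / finish_exporting() methods that Scrapy's exporters expose. The class name SortedJsonItemExporter, the sort_key parameter, the "id" field and the export_sorted() helper are all assumptions made for the example, not part of Scrapy or of the original spider.

```python
import io
import json


class SortedJsonItemExporter:
    """Buffers items in memory and writes one sorted JSON array at the end.

    Hypothetical sketch: it mirrors the method names of Scrapy's item
    exporters but is a plain class, not a subclass of JsonItemExporter.
    """

    def __init__(self, file, sort_key="id", **kwargs):
        self.file = file          # binary file-like object to write to
        self.sort_key = sort_key  # assumed name of the field to sort on
        self.items = []

    def start_exporting(self):
        # Nothing is written yet; just start with an empty buffer.
        self.items = []

    def export_item(self, item):
        # Collect the item instead of writing it out immediately.
        self.items.append(dict(item))

    def finish_exporting(self):
        # Sort everything collected so far, then write the file in one go.
        ordered = sorted(self.items, key=lambda it: it[self.sort_key])
        self.file.write(json.dumps(ordered, indent=2).encode("utf-8"))


def export_sorted(items, sort_key="id"):
    """Helper for trying the exporter out without running a crawl."""
    buffer = io.BytesIO()
    exporter = SortedJsonItemExporter(buffer, sort_key=sort_key)
    exporter.start_exporting()
    for item in items:
        exporter.export_item(item)
    exporter.finish_exporting()
    return json.loads(buffer.getvalue().decode("utf-8"))
```

In a real project you would more likely subclass scrapy.exporters.JsonItemExporter so that item serialization and encoding options keep working, and register the class under a format name in the FEED_EXPORTERS setting. The sketch above only shows the buffer-then-sort part.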