Scrapy doesn't seem to be doing DFO

臣服心动 2021-02-20 03:21

I have a website for which my crawler needs to follow a sequence. For example, it needs to visit a1, b1, c1 before it starts on a2, and so on. Each of a, b and c is handled by diff

3 answers
  •  梦谈多话
    2021-02-20 03:57

    I believe that you are noticing the difference between depth-first and breadth-first searching algorithms (see Wikipedia for info on both.)

    Scrapy has the ability to change which algorithm is used:

    "By default, Scrapy uses a LIFO queue for storing pending requests, which basically means that it crawls in DFO order. This order is more convenient in most cases. If you do want to crawl in true BFO order, you can do it by setting the following settings:"
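To make the LIFO versus FIFO point concrete, here is a toy model in plain Python. It is only a sketch: the `LINKS` graph and the `crawl_order` helper are made up for illustration, and it ignores request priorities, concurrency, and everything else Scrapy's real scheduler does.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "start": ["a1", "a2"],
    "a1": ["b1"],
    "a2": ["b2"],
    "b1": ["c1"],
    "b2": ["c2"],
    "c1": [],
    "c2": [],
}


def crawl_order(start, lifo):
    """Return the visit order when pending pages sit in a LIFO or FIFO queue."""
    pending = deque([start])
    visited = []
    while pending:
        # LIFO takes the most recently queued page, FIFO the oldest one.
        page = pending.pop() if lifo else pending.popleft()
        visited.append(page)
        pending.extend(LINKS[page])
    return visited


print(crawl_order("start", lifo=True))   # depth-first: one chain is finished before the next
print(crawl_order("start", lifo=False))  # breadth-first: all a pages, then all b pages, ...
```

With the LIFO queue, the model follows one chain to its end before starting the other, although it picks the most recently queued link first (a2 before a1). With the FIFO queue, it visits level by level.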

    See http://doc.scrapy.org/en/0.14/faq.html for more information.
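The quoted FAQ passage ends before listing the settings themselves. In recent Scrapy releases the FAQ gives the three values below for a project's `settings.py`. Treat the module paths as an assumption for your installed version: older releases, such as the 0.14 docs linked above, may use different paths, so check the FAQ that matches your version.

```python
# settings.py
# Switch the scheduler from LIFO (depth-first) to FIFO (breadth-first) queues.
# Paths as listed in recent Scrapy FAQs; older versions may differ.
DEPTH_PRIORITY = 1
SCHEDULER_DISK_QUEUE = "scrapy.squeues.PickleFifoDiskQueue"
SCHEDULER_MEMORY_QUEUE = "scrapy.squeues.FifoMemoryQueue"
```

Leaving these settings at their defaults keeps the LIFO behaviour described in the quote.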
