How to get all article pages under a Wikipedia Category and its sub-categories?

Submitted by 不羁岁月 on 2019-11-28 08:16:37

The following resource will help you to download all pages from the category and all its subcategories:

http://en.wikipedia.org/wiki/Wikipedia:CatScan

There is also an API available here:

https://www.mediawiki.org/wiki/API:Categorymembers

Adexe Rivera

You can do this through the following two API methods:

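For the article pages in the category (cmtype=page restricts the result to pages; without it, subcategories and files are returned as well):

YOUR_URL/api.php?action=query&format=json&list=categorymembers&cmtype=page&cmtitle=Category:Music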

For the subcategories:

YOUR_URL/api.php?action=query&format=json&list=categorymembers&cmtype=subcat&cmtitle=Category:Music

More information is available in the MediaWiki API documentation.
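A single request returns only one batch of members, so larger categories need follow-up requests that pass back the continuation values from the previous reply. The following is a minimal sketch of that loop, not a definitive client. It assumes the English Wikipedia endpoint stands in for YOUR_URL/api.php, and the names `http_get_json`, `category_members` and the User-Agent string are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: this endpoint stands in for YOUR_URL/api.php.
API_URL = "https://en.wikipedia.org/w/api.php"


def http_get_json(params, api_url=API_URL):
    """Send one GET request to the API and decode the JSON reply."""
    url = api_url + "?" + urllib.parse.urlencode(params)
    # The User-Agent value is a placeholder; use one that identifies your tool.
    request = urllib.request.Request(url, headers={"User-Agent": "category-sketch/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def category_members(category, member_type, get_json=http_get_json):
    """Yield the titles of all members of `category` with the given cmtype."""
    params = {
        "action": "query",
        "format": "json",
        "list": "categorymembers",
        "cmtitle": category,
        "cmtype": member_type,  # "page", "subcat" or "file"
        "cmlimit": "max",
    }
    while True:
        data = get_json(params)
        for member in data["query"]["categorymembers"]:
            yield member["title"]
        if "continue" not in data:
            break
        # Merge the continuation values (e.g. cmcontinue) into the next request.
        params = {**params, **data["continue"]}
```

Calling `list(category_members("Category:Music", "subcat"))` would then collect the subcategory titles, and passing `"page"` instead would collect the article titles. The `get_json` parameter exists only so the loop can be exercised without network access.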

Note that Wikipedia's categorization system is not a tree, or even an acyclic graph. It is quite possible that by continually following subcategory links you will eventually wind up back where you started.
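Because of such cycles, a recursive walk needs to remember which categories it has already visited. Here is a rough sketch of a breadth-first traversal with a visited set and an optional depth limit. The function name `collect_pages`, its two lookup callbacks and the toy data are hypothetical; in practice the callbacks would wrap list=categorymembers queries.

```python
from collections import deque


def collect_pages(root, get_pages, get_subcats, max_depth=None):
    """Breadth-first walk over a category graph that may contain cycles."""
    visited = {root}            # categories already queued, so loops end
    queue = deque([(root, 0)])
    pages = set()               # a page can sit in several categories
    while queue:
        category, depth = queue.popleft()
        pages.update(get_pages(category))
        if max_depth is not None and depth >= max_depth:
            continue            # do not descend below the depth limit
        for sub in get_subcats(category):
            if sub not in visited:
                visited.add(sub)
                queue.append((sub, depth + 1))
    return pages


# Toy data with a cycle: A contains B, and B contains A again.
toy_subcats = {"Category:A": ["Category:B"], "Category:B": ["Category:A"]}
toy_pages = {"Category:A": ["P1"], "Category:B": ["P2", "P1"]}

result = collect_pages(
    "Category:A",
    lambda c: toy_pages.get(c, []),
    lambda c: toy_subcats.get(c, []),
)
print(sorted(result))  # ['P1', 'P2']
```

The depth limit is worth considering on Wikipedia, since following subcategories far enough tends to drift into loosely related topics.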

If you are going to make many such queries, you would be best served by downloading a database dump. If you only need this occasionally and only for small categories, you can probably get away with repeated queries to list=categorymembers.

The search operator incategory:"music" does not appear to search subcategories.
