Parsing a Wikipedia dump

生来不讨喜 2020-12-03 05:33

For example, using this Wikipedia dump:

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=lebron%20james&rvprop=content&redirects=t

9 answers
  •  庸人自扰
    2020-12-03 05:52

    I know this is an old question, but here is a great script that reads the wiki dump XML and outputs a clean CSV:

    PyPI: https://pypi.org/project/wiki-dump-parser/

    GitHub: https://github.com/Grasia/wiki-scripts/tree/master/wiki_dump_parser
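To give a rough idea of what an XML-to-CSV conversion like this involves, here is a minimal sketch using only the Python standard library. It is not the linked package's code or API. The function name `dump_to_csv`, the helper names, the chosen CSV columns, and the sample XML are all assumptions made for illustration, and the sketch assumes the input follows the MediaWiki XML export layout (`<page>` elements containing `<revision>` elements).

```python
import csv
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def child_text(elem, name):
    """Return the text of the first direct child with the given local name."""
    for child in elem:
        if local_name(child.tag) == name:
            return child.text or ""
    return ""


def dump_to_csv(xml_source, csv_out):
    """Stream pages from a MediaWiki XML export and write one CSV row per revision.

    xml_source: a filename or a binary file object (hypothetical input).
    csv_out: a writable text file object.
    Returns the number of revision rows written.
    """
    writer = csv.writer(csv_out)
    writer.writerow(["page_id", "title", "revision_id", "timestamp", "text_length"])
    rows = 0
    # iterparse reads incrementally, so large dumps need not fit in memory.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        page_id = child_text(elem, "id")
        title = child_text(elem, "title")
        for rev in elem:
            if local_name(rev.tag) != "revision":
                continue
            writer.writerow([
                page_id,
                title,
                child_text(rev, "id"),
                child_text(rev, "timestamp"),
                len(child_text(rev, "text")),
            ])
            rows += 1
        elem.clear()  # drop this page's children once its rows are written
    return rows


# Made-up sample in the export layout; not real dump data.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>LeBron James</title>
    <id>1</id>
    <revision>
      <id>100</id>
      <timestamp>2020-12-03T05:33:00Z</timestamp>
      <text>Example wikitext</text>
    </revision>
  </page>
</mediawiki>"""

out = io.StringIO()
count = dump_to_csv(io.BytesIO(SAMPLE), out)
print(out.getvalue())
```

For a real dump file, pass its path instead of the `io.BytesIO` object and an open text file (created with `newline=""`) instead of the `io.StringIO` buffer.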
