Generating plain text from a Wikipedia database dump
问题 I found a Python script (here: Wikipedia Extractor) that can generate plain text from (English) Wikipedia database dump. When I use this command (as it's stated on the script's page): $ python enwiki-latest-pages-articles.xml WikiExtractor.py -b 500K -o extracted I get this error: File "enwiki-latest-pages-articles.xml", line 1 < mediawiki xmlns="http://www.mediawiki.org/xml/export-0.8/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.mediawiki.org/xml