Extract the first paragraph from a Wikipedia article (Python)

闹比i 2020-11-28 01:36

How can I extract the first paragraph from a Wikipedia article, using Python?

For example, for Albert Einstein, that would be:

10 answers
  •  余生分开走
    2020-11-28 01:59

    Wikipedia's relatively new REST API has a summary endpoint that is well suited to this, and it already does many of the things mentioned in the other answers here (e.g. removing wikicode). It even includes an image and geocoordinates if applicable.

    Using the lovely requests module and Python 3:

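    import requests

    # The article title goes at the end of the summary URL.
    r = requests.get("https://en.wikipedia.org/api/rest_v1/page/summary/Amsterdam")
    r.raise_for_status()  # fail loudly on a missing page or server error
    page = r.json()
    print(page["extract"])  # prints 'Amsterdam is the capital and...'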
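
    For the Albert Einstein example from the question, the same endpoint can be wrapped in a small helper. This is only a sketch under a few assumptions: the function names `summary_url` and `fetch_extract` are made up for illustration, the User-Agent string is a placeholder, and it uses the standard library's urllib instead of requests so it runs without extra installs. Titles with spaces are converted to underscores and percent-encoded before being appended to the URL.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://en.wikipedia.org/api/rest_v1/page/summary/"


def summary_url(title):
    """Build the summary endpoint URL for an article title (hypothetical helper)."""
    # Wikipedia titles use underscores for spaces; percent-encode the rest
    # so characters such as "/" or "?" cannot change the URL path.
    return API_BASE + urllib.parse.quote(title.replace(" ", "_"), safe="")


def fetch_extract(title, timeout=10):
    """Return the plain-text "extract" field for the given title (hypothetical helper)."""
    request = urllib.request.Request(
        summary_url(title),
        # Placeholder User-Agent: replace it with one that identifies your script.
        headers={"User-Agent": "first-paragraph-example/0.1"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # The response body is JSON; "extract" holds the summary text.
        return json.load(response)["extract"]
```

    Calling `fetch_extract("Albert Einstein")` should then return the summary text for that article. A missing page raises `urllib.error.HTTPError`, so wrap the call in a try/except if the titles come from user input.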
    
