wikipedia-api

Wikipedia revision history using pywikibot

会有一股神秘感。 posted on 2021-02-19 11:58:05
Question: I want to collect all the revision history data at once. Pywikibot's page.revisions() does not have a parameter to fetch the number of bytes changed. It gives me all the data I need except the number of bytes changed. How do I get the number of bytes changed? For example, for the article Main Page the revision history is here: [history screenshot]. My current code:

    import pywikibot
    site = pywikibot.Site("en", "wikipedia")
    page = pywikibot.Page(site, "Main_Page")
    revs = page.revisions()

Showing
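
One possible way to approach the question above, sketched under assumptions: the byte change shown on a history page is the difference between a revision's size and its parent's size. The MediaWiki API returns the size of each revision with `rvprop=size`; whether a given Pywikibot version exposes that size on its revision objects is an assumption to check. The helper name `byte_deltas` and the sizes in the example are made up for illustration.

```python
from typing import List, Optional


def byte_deltas(sizes_newest_first: List[int]) -> List[Optional[int]]:
    """Return the byte change of each revision relative to its parent.

    Assumes the sizes are ordered newest first, so the parent of the
    revision at index i sits at index i + 1.
    """
    deltas: List[Optional[int]] = []
    for i, size in enumerate(sizes_newest_first):
        if i + 1 < len(sizes_newest_first):
            deltas.append(size - sizes_newest_first[i + 1])
        else:
            # The parent of the oldest listed revision is not in the list.
            deltas.append(None)
    return deltas


# Illustrative sizes only, not real data from any article.
print(byte_deltas([1200, 1150, 1300]))  # [50, -150, None]
```

With Pywikibot, the input list might be built with something like `[rev.size for rev in page.revisions()]`, assuming the revision objects carry a `size` field.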

Retrieve linked wikidata entities having a wikipedia page

泄露秘密 posted on 2021-02-10 20:37:50
Question: I want to query Wikidata by free text or by category, to return entities that have a corresponding Wikipedia page. For each page (or for a selected page) I want to fetch all the linked Wikidata entities that have a corresponding Wikipedia article. Note that: (1) for each Wikipedia page and its linked pages, I want to fetch the corresponding Wikidata ID; (2) a linked Wikidata entity may exist on other Wikipedias, not necessarily in the queried language (e.g. a page in French History is available in multiple
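
A minimal sketch of one part of this, under assumptions: the MediaWiki action API can list the pages linked from a title (`generator=links`) and return each linked page's Wikidata ID through `prop=pageprops` with `ppprop=wikibase_item`. The helper names and the sample response below are hypothetical. The sketch only covers pages on the queried wiki, not articles that exist solely in other languages.

```python
from typing import Dict


def linked_pages_params(title: str) -> Dict[str, str]:
    """Query parameters for the pages linked from `title`, with Wikidata IDs."""
    return {
        "action": "query",
        "generator": "links",
        "titles": title,
        "gpllimit": "max",
        "prop": "pageprops",
        "ppprop": "wikibase_item",
        "format": "json",
    }


def wikidata_ids(response: dict) -> Dict[str, str]:
    """Map page title to Wikidata ID, skipping pages without one."""
    pages = response.get("query", {}).get("pages", {})
    result: Dict[str, str] = {}
    for page in pages.values():
        item = page.get("pageprops", {}).get("wikibase_item")
        if item:
            result[page["title"]] = item
    return result


# Hand-written sample in the shape of an API response; titles and IDs are placeholders.
sample = {
    "query": {
        "pages": {
            "1": {"title": "Page A", "pageprops": {"wikibase_item": "Q1"}},
            "2": {"title": "Page B"},  # linked page with no Wikidata item
        }
    }
}
print(wikidata_ids(sample))
```

The parameters would be sent to the wiki's `api.php` endpoint with any HTTP client. Long link lists are paginated, so a real client would also need to follow the continuation values in the response.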

How to get Wikipedia content as text by API?

徘徊边缘 posted on 2021-02-08 08:52:11
Question: I want to get Wikipedia pages as text. I looked at the Wikipedia API at https://en.wikipedia.org/w/api.php, which says that in order to get pages as text I need to append this to a page address: api.php?action=query&meta=siteinfo&siprop=namespaces&format=txt However, when I try appending this suffix to a normal page's address, the page is not found: https://en.wikipedia.org/wiki/George_Washington/api.php?action=query&meta=siteinfo&siprop=namespaces&format=txt Following the instructions
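
A hedged sketch of how such a request is usually formed: the API lives at a single endpoint (`/w/api.php`), and the page is named in the `titles` parameter rather than in the URL path. The example assumes the TextExtracts parameters `prop=extracts` and `explaintext` for plain text; the function name `plain_text_url` is made up for illustration.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def plain_text_url(title: str) -> str:
    """Build an API URL asking for the plain-text extract of one page."""
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": "1",  # request plain text instead of HTML
        "titles": title,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


print(plain_text_url("George_Washington"))
```

Fetching that URL with any HTTP client should return JSON in which the extract sits under the page entry. The exact response layout is not shown here.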

How to reliably get the image used in the Wikipedia Infobox?

大憨熊 posted on 2021-02-07 22:42:20
Question: How do I (reliably) get the main image(s) used in the Wikipedia Infobox from the API? This question has been asked before, and the accepted answer admits that it is just a guess. Subsequent answers seem like a hack at best and don't return the correct image. For instance, the Jimi Hendrix Wikipedia entry uses "File:Jimi Hendrix 1967.png" as the main image in the Infobox. The updated answers suggest using this url, but for Jimi Hendrix (and other topics) it often returns the wrong image. If I

API to retrieve info about famous people [closed]

谁说胖子不能爱 posted on 2021-02-07 12:28:22
Question: I'm looking for some callable way to get information about famous people and celebrities. Given a string, I'd like to determine if it

Transform Pandas string column containing unicodes to ascii to load urls

孤街浪徒 posted on 2021-02-05 12:12:31
Question: I have a pandas DataFrame containing a column of Wikipedia URLs that I want to load. However, some strings won't load because they contain percent-encoded Unicode characters. For example, 'Kruskal %E2%80%93 Wallis_one-way_analysis_of_variance' raises the following PageError: Page id "Cauchy%E2%80%93Schwarz_inequality" does not match any pages. Try another id! Is there a way to convert all Unicode characters to ASCII? So in this case, I need a function that can create a new column:

old column               new column
Cauchy%E2%80%93Schwarz
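
A minimal sketch of one way to handle such strings, assuming the aim is to recover the real page title rather than to force ASCII: sequences like `%E2%80%93` are percent-encoded UTF-8 bytes, and the standard library's `urllib.parse.unquote` decodes them. The function name `decode_title` is hypothetical.

```python
from urllib.parse import unquote


def decode_title(encoded: str) -> str:
    """Decode percent-encoded UTF-8 sequences in a page title."""
    return unquote(encoded)


decoded = decode_title("Cauchy%E2%80%93Schwarz_inequality")
# %E2%80%93 is the UTF-8 encoding of U+2013 (en dash).
print(decoded == "Cauchy\u2013Schwarz_inequality")  # True
```

On a DataFrame, the same function could be applied with something like `df["new column"] = df["old column"].map(decode_title)`, where the column names are taken from the question's example.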