wikipedia-api

Wikipedia API: how to get the number of revisions of a page?

馋奶兔 submitted on 2019-12-04 03:12:48

Does anyone know how to get the number of revisions of a Wikipedia page using the MediaWiki API? I have read the Revision API documentation, but can't find a related call.

The only possibility is to retrieve all revisions and count them. You might need to continue the query for that. Bug 17993 is about including a count, but is still unresolved. Here is code to get the number of revisions of a page (in this case, the JSON wiki page):

    import requests

    BASE_URL = "http://en.wikipedia.org/w/api.php"
    TITLE = 'JSON'

    parameters = {
        'action': 'query',
        'format': 'json',
        'continue': '',
        'titles': TITLE,
        'prop':
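The snippet above is cut off. A complete version of the count-with-continuation approach it describes might look like the following sketch; the `rvprop`/`rvlimit` values are my own choices, not from the original answer:

```python
import requests

BASE_URL = "https://en.wikipedia.org/w/api.php"

def revision_params(title):
    # Build the query parameters; 'rvprop': 'ids' keeps responses small,
    # since we only need to count revisions, not inspect them.
    return {
        'action': 'query',
        'format': 'json',
        'continue': '',
        'titles': title,
        'prop': 'revisions',
        'rvprop': 'ids',
        'rvlimit': 'max',
    }

def count_revisions(title):
    params = revision_params(title)
    total = 0
    while True:
        data = requests.get(BASE_URL, params=params).json()
        for page in data['query']['pages'].values():
            total += len(page.get('revisions', []))
        if 'continue' not in data:
            return total
        # Carry the continuation token into the next request.
        params.update(data['continue'])
```

Calling `count_revisions('JSON')` would then loop through every batch of revisions and return the total.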

Pass session cookies in HTTP header with Python urllib2?

谁都会走 submitted on 2019-12-04 01:06:36

Question: I'm trying to write a simple script to log into Wikipedia and perform some actions on my user page, using the MediaWiki API. However, I never seem to get past the first login request (from this page: https://en.wikipedia.org/wiki/Wikipedia:Creating_a_bot#Logging_in). I don't think the session cookie that I set is being sent. This is my code so far:

    import Cookie, urllib, urllib2, xml.etree.ElementTree

    url = 'https://en.wikipedia.org/w/api.php?action=login&format=xml'
    username = 'user'
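The usual cause of this symptom is that the second login request does not carry the cookies set by the first. A sketch using the requests library (instead of the question's urllib2), where a Session object forwards cookies automatically, might look like this:

```python
import requests

API_URL = 'https://en.wikipedia.org/w/api.php'

def extract_login_token(data):
    # Pure helper so token extraction can be tested without a network call.
    return data['query']['tokens']['logintoken']

def login(session, username, password):
    # Step 1: fetch a login token; the Session stores the cookies that
    # must accompany the second request.
    r = session.get(API_URL, params={
        'action': 'query', 'meta': 'tokens', 'type': 'login', 'format': 'json',
    })
    token = extract_login_token(r.json())
    # Step 2: POST the credentials together with the token. Because both
    # requests share one Session, the session cookie is sent automatically.
    r = session.post(API_URL, data={
        'action': 'login', 'lgname': username, 'lgpassword': password,
        'lgtoken': token, 'format': 'json',
    })
    return r.json()['login']['result']
```

With urllib2 the equivalent fix is a `urllib2.HTTPCookieProcessor`-based opener reused across both requests; `requests.Session` simply does that bookkeeping for you.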

Retrieving image license and author information in wiki commons

☆樱花仙子☆ submitted on 2019-12-04 00:42:47

I am trying to use the Wikimedia API for Wikimedia Commons at http://commons.wikimedia.org/w/api.php. The Commons API seems very immature, and the section of its documentation that mentions the possibility of retrieving license and author information is empty. Is there any way I can retrieve the paragraph that contains the licensing information using the API? (For example, the paragraph under the title "Licensing" on this page.) Of course I could download the whole page and try to parse it, but what are APIs for?

Philippe Green: Late answer, but you can request the "extmetadata" data with
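The answer is cut off; a sketch of the extmetadata request it points at might look like this (the specific fields read out, LicenseShortName and Artist, are my assumptions about what the asker needs):

```python
import requests

API_URL = 'https://commons.wikimedia.org/w/api.php'

def metadata_params(file_title):
    # prop=imageinfo with iiprop=extmetadata returns machine-readable
    # license fields instead of the rendered "Licensing" paragraph.
    return {
        'action': 'query', 'format': 'json',
        'titles': file_title,
        'prop': 'imageinfo',
        'iiprop': 'extmetadata',
    }

def license_and_author(file_title):
    data = requests.get(API_URL, params=metadata_params(file_title)).json()
    page = next(iter(data['query']['pages'].values()))
    meta = page['imageinfo'][0]['extmetadata']
    return (meta.get('LicenseShortName', {}).get('value'),
            meta.get('Artist', {}).get('value'))
```

`license_and_author('File:Example.jpg')` would then return a (license, author) pair without parsing the file description page's HTML.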

Getting Wikipedia infobox content with JQuery

∥☆過路亽.° submitted on 2019-12-03 14:48:36

I'm looking to use jQuery to pull back the contents of the Wikipedia infobox that contains company details. I think I'm almost there, but I just can't get the last step of the way:

    var searchTerm = "toyota";
    var url = "http://en.wikipedia.org/w/api.php?action=parse&format=json&page=" + searchTerm + "&redirects&prop=text&callback=?";
    $.getJSON(url, function(data){
        wikiHTML = data.parse.text["*"];
        $wikiDOM = $(wikiHTML);
        $("#result").append($wikiDOM.find('.infobox').html());
    });

The first part works: wikiHTML contains the content of the page, parsed by the Wikipedia API into HTML format. This contains the
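For comparison, the same request can be issued server-side in Python, where the `callback=?` JSONP workaround is unnecessary. This sketch assumes BeautifulSoup is available for the HTML-extraction step:

```python
import requests
from bs4 import BeautifulSoup  # assumption: bs4 is installed for HTML parsing

API_URL = 'https://en.wikipedia.org/w/api.php'

def parse_params(page):
    # Same request as the jQuery version, minus the JSONP callback,
    # which is only needed for cross-domain requests from a browser.
    return {
        'action': 'parse', 'format': 'json',
        'page': page, 'redirects': '', 'prop': 'text',
    }

def infobox_html(page):
    data = requests.get(API_URL, params=parse_params(page)).json()
    html = data['parse']['text']['*']
    box = BeautifulSoup(html, 'html.parser').find(class_='infobox')
    return str(box) if box else None
```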

Retrieving the Interlanguage links from an exported Wikipedia article?

懵懂的女人 submitted on 2019-12-03 14:12:48

I used to retrieve the interlanguage links from an exported Wikipedia article by parsing the export with some regular expressions. In phase 1 of the Wikidata project these links were moved to a separate page on Wikidata. For example, the article Ore Mountains no longer has any language links in the export; they are now on Q4198. How can I export the language links?

You are now encouraged to use the Wikidata API: http://wikidata.org/w/api.php For your case, use props=labels. This URL is self-explanatory: http://www.wikidata.org/w/api.php?action=wbgetentities&sites=enwiki
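The answer suggests props=labels, which returns the item's names per language; if what is wanted are the linked article titles per wiki (the old interlanguage links), props=sitelinks is the relevant field. A sketch of the latter (my substitution, hedged accordingly):

```python
import requests

API_URL = 'https://www.wikidata.org/w/api.php'

def sitelink_params(title, site='enwiki'):
    # props=sitelinks returns the per-wiki article titles that used to
    # live as interlanguage links in the article wikitext.
    return {
        'action': 'wbgetentities', 'format': 'json',
        'sites': site, 'titles': title,
        'props': 'sitelinks',
    }

def language_links(title):
    data = requests.get(API_URL, params=sitelink_params(title)).json()
    entity = next(iter(data['entities'].values()))
    return {site: link['title']
            for site, link in entity.get('sitelinks', {}).items()}
```

`language_links('Ore Mountains')` would then map wiki codes (dewiki, frwiki, ...) to the corresponding article titles on Q4198.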

How to download images programmatically from Wikimedia Commons without registering for a Bot account?

那年仲夏 submitted on 2019-12-03 12:27:55

Question: It seems like the only way to get approval for a bot account is if it adds to or edits information already on Wikimedia. If you try to download any images without a bot account, using some of the API libraries out there, you get error messages instead of the images. It seems like they block anyone not coming in from a browser? Does anyone else have experience with this? Am I missing something here?

Answer 1: Try explaining exactly what you want to do, and what you've tried. What error message did you
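In practice, read-only downloads do not require a bot account; what Wikimedia blocks is the default User-Agent of many HTTP libraries. A sketch that fetches an image's URL via the API and downloads it with a descriptive User-Agent (the header value here is a hypothetical placeholder):

```python
import requests

API_URL = 'https://commons.wikimedia.org/w/api.php'
# Wikimedia rejects generic library User-Agents; a descriptive one is
# usually all that is needed -- no bot account required for downloads.
HEADERS = {'User-Agent': 'MyImageFetcher/0.1 (contact@example.org)'}  # hypothetical

def image_url_params(file_title):
    return {
        'action': 'query', 'format': 'json',
        'titles': file_title,
        'prop': 'imageinfo', 'iiprop': 'url',
    }

def download_image(file_title, dest):
    data = requests.get(API_URL, params=image_url_params(file_title),
                        headers=HEADERS).json()
    page = next(iter(data['query']['pages'].values()))
    url = page['imageinfo'][0]['url']
    with open(dest, 'wb') as f:
        f.write(requests.get(url, headers=HEADERS).content)
```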

How to group wikipedia categories in python?

北城以北 submitted on 2019-12-03 12:15:50

Question: For each concept in my dataset I have stored the corresponding Wikipedia categories. For example, consider the following five concepts and their corresponding Wikipedia categories:

    hypertriglyceridemia: ['Category:Lipid metabolism disorders', 'Category:Medical conditions related to obesity']
    enzyme inhibitor: ['Category:Enzyme inhibitors', 'Category:Medicinal chemistry', 'Category:Metabolism']
    bypass surgery: ['Category:Surgery stubs', 'Category:Surgical procedures and techniques']
    perth: [
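One simple starting point for grouping data shaped like the above is to invert the concept-to-categories mapping, so that concepts sharing a category land in the same bucket. This is only a sketch of that first step, not a full answer to the question:

```python
def shared_category_groups(concept_categories):
    """Invert a {concept: [categories]} mapping into {category: {concepts}}."""
    groups = {}
    for concept, cats in concept_categories.items():
        for cat in cats:
            groups.setdefault(cat, set()).add(concept)
    return groups

data = {
    'hypertriglyceridemia': ['Category:Lipid metabolism disorders',
                             'Category:Medical conditions related to obesity'],
    'enzyme inhibitor': ['Category:Enzyme inhibitors',
                         'Category:Medicinal chemistry',
                         'Category:Metabolism'],
}
# Concepts that never share a category end up in singleton groups;
# a real grouping would also need to climb the category hierarchy.
```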

Wikipedia Category Hierarchy from dumps

谁说胖子不能爱 submitted on 2019-12-03 10:20:01

Question: Using Wikipedia's dumps, I want to build a hierarchy of its categories. I have downloaded the main dump (enwiki-latest-pages-articles) and the category SQL dump (enwiki-latest-category), but I can't find the hierarchy information. For example, the category SQL dump has an entry for each category, but nothing about how categories relate to each other. The other dump (latest-pages-articles) states the parent categories for each page, but in an unordered way. It just states all the parents
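The parent-child edges the question is missing live in the separate categorylinks SQL dump (enwiki-latest-categorylinks), not in the category table itself. If dumps are not a hard requirement, the same edges can also be walked through the API; a sketch:

```python
import requests

API_URL = 'https://en.wikipedia.org/w/api.php'

def subcategory_params(category):
    # cmtype=subcat restricts the listing to child categories,
    # which is exactly the hierarchy edge the question is after.
    return {
        'action': 'query', 'format': 'json',
        'list': 'categorymembers',
        'cmtitle': category,
        'cmtype': 'subcat', 'cmlimit': 'max',
    }

def subcategories(category):
    data = requests.get(API_URL, params=subcategory_params(category)).json()
    return [m['title'] for m in data['query']['categorymembers']]
```

Note that Wikipedia's category graph contains cycles, so any recursive walk over `subcategories()` needs a visited set.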

How to extract data from a Wikipedia article?

雨燕双飞 submitted on 2019-12-03 07:52:42

Question: I have a question regarding parsing data from Wikipedia for my Android app. I have a script that can download the XML by reading the source of http://en.wikipedia.org/w/api.php?action=parse&prop=text&format=xml&page=ARTICLE_NAME (and also the JSON, by replacing format=xml with format=json). What I can't figure out is how to access only certain sections from the table of contents. What I want is that when the page is loaded, the user can press a button that makes a pop-up appear that displays
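The action=parse endpoint the question already uses supports this in two steps: prop=sections lists the table of contents with an index per section, and passing that index back as the section parameter parses only that section. A sketch:

```python
import requests

API_URL = 'https://en.wikipedia.org/w/api.php'

def section_list_params(page):
    # prop=sections returns the table of contents, including each
    # section's heading text ('line') and its index number.
    return {'action': 'parse', 'format': 'json',
            'page': page, 'prop': 'sections'}

def section_text_params(page, index):
    # Passing that index as the 'section' parameter limits the parse
    # to a single section instead of the whole article.
    return {'action': 'parse', 'format': 'json',
            'page': page, 'prop': 'text', 'section': index}

def get_section(page, heading):
    toc = requests.get(API_URL, params=section_list_params(page)).json()
    for s in toc['parse']['sections']:
        if s['line'] == heading:
            data = requests.get(API_URL,
                                params=section_text_params(page, s['index'])).json()
            return data['parse']['text']['*']
    return None
```

For the app's pop-up, `get_section(article, heading)` returns just that section's HTML rather than the whole page.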

Wikipedia list=search REST API: how to also retrieve the URLs of matching articles

為{幸葍}努か submitted on 2019-12-03 06:17:18

I'm studying the Wikipedia REST API, but I'm not able to find the right option to also get URLs for a search query. This is the URL of the request:

    http://it.wikipedia.org/w/api.php?action=query&list=search&srsearch=calvino&format=xml&srprop=snippet

This request outputs only the title and the snippet, but no URLs for the articles. I've checked the Wikipedia API documentation for the list=search query, but there seems to be no option to also get URLs. Best regards, Fabio Buda

You can easily form the URL of the article yourself from the title. For the Italian Wikipedia, it's http://it.wikipedia.org/wiki/
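Forming the URL from the title, as the answer suggests, takes one line: replace spaces with underscores and percent-encode the rest. A sketch:

```python
from urllib.parse import quote

def article_url(title, lang='it'):
    # MediaWiki article URLs replace spaces with underscores; quote()
    # percent-encodes the remaining characters while leaving safe ones
    # (letters, digits, '_') readable.
    return 'https://%s.wikipedia.org/wiki/%s' % (
        lang, quote(title.replace(' ', '_')))
```

Applied to each 'title' field in the list=search results, `article_url('Italo Calvino')` yields https://it.wikipedia.org/wiki/Italo_Calvino.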