wikipedia-api

Retrieve a list of all Wikipedia languages programmatically

徘徊边缘 submitted on 2019-12-05 06:59:30
I need to retrieve a list of all existing languages for a certain wiki project. For example, all Wikivoyage or all Wikipedia languages, just like on their landing pages. I would prefer to do this via the MediaWiki API, if possible. Thanks for your time. Approach 3: Using an API in the Wikimedia wiki farm and Extension:Sitematrix https://commons.wikimedia.org/w/api.php?action=sitematrix&smtype=language While this returns all wikis the matrix knows about, it is easily filtered client-side by code [as of now, one of: wiki (Wikipedia), wiktionary, wikibooks, wikinews, wikiquote, wikisource,
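The client-side filtering mentioned above could look roughly like the following sketch. The response layout (numeric keys holding language groups, each with a `site` list whose entries carry a `code`) is an assumption about the JSON form of the sitematrix output, and `format=json`, `sitematrix_url`, and `languages_for_project` are illustrative additions, not part of the original answer.

```python
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"


def sitematrix_url():
    """Build the sitematrix request URL (format=json is an added assumption)."""
    params = {"action": "sitematrix", "smtype": "language", "format": "json"}
    return API + "?" + urlencode(params)


def languages_for_project(response, project_code):
    """Return the language codes that have a site with the given project code.

    Assumes language groups sit under numeric string keys of "sitematrix".
    """
    matrix = response.get("sitematrix", {})
    codes = []
    for key, group in matrix.items():
        if not key.isdigit():
            continue  # skip non-language entries such as "count"
        if any(site.get("code") == project_code for site in group.get("site", [])):
            codes.append(group["code"])
    return sorted(codes)


# Hand-written sample in the assumed shape, not real API output.
sample = {
    "sitematrix": {
        "count": 2,
        "0": {"code": "de", "site": [{"code": "wiki"}, {"code": "wikivoyage"}]},
        "1": {"code": "la", "site": [{"code": "wiki"}]},
    }
}
print(languages_for_project(sample, "wikivoyage"))
```

To use it against the live API, fetch `sitematrix_url()` with any HTTP client, decode the JSON, and pass the result to `languages_for_project`.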

Setting “an informative User-Agent string” in getURL

╄→гoц情女王★ submitted on 2019-12-05 05:15:14
I am trying to access a Wikipedia page to get a list of pages, and I get the following error: library(RCurl) u <- "http://en.wikipedia.org/w/index.php?title=Special%3APrefixIndex&prefix=tal&namespace=4" getURL(u) [1] "Scripts should use an informative User-Agent string with contact information, or they may be IP-blocked without notice.\n" I hope to get to that page through the Wikipedia API, but I am not sure it would work. The odd thing is that other pages are read without a problem, for example: u <- "http://en.wikipedia.org/wiki/Wikipedia:Talk" getURL(u) Any suggestions? Side note: In
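The error message asks for an informative User-Agent header. The question uses R and RCurl; the sketch below shows the same idea with Python's standard library instead. The agent string, contact address, and the `build_request` helper are placeholders invented for illustration.

```python
from urllib.request import Request

# Placeholder identity: replace with your own tool name and contact details.
USER_AGENT = "ExamplePrefixLister/0.1 (https://example.org/contact; ops@example.org)"


def build_request(url, user_agent):
    """Create a request object that carries an explicit User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


req = build_request(
    "http://en.wikipedia.org/w/index.php?title=Special%3APrefixIndex&prefix=tal&namespace=4",
    USER_AGENT,
)
# Sending it would be: urllib.request.urlopen(req).read()
```

The request is only constructed here, not sent, so the snippet runs without network access.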

Retrieving the Interlanguage links from an exported Wikipedia article?

别来无恙 submitted on 2019-12-05 00:25:08
Question: I used to retrieve the interlanguage links from an exported Wikipedia article by parsing the export with some regular expressions. In phase 1 of the Wikidata project these links were moved to a separate page on Wikidata. For example, the article Ore Mountains no longer has language links in the export. The language links are now on Q4198. How can I export the language links? Answer 1: You are now encouraged to use the Wikidata API: http://wikidata.org/w/api.php For your case, use props
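The answer is cut off, so the exact parameters it recommended are unknown. One plausible reading, offered here only as an assumption, is the `wbgetentities` action with `props=sitelinks`. The helper names and the sample response below are illustrative.

```python
from urllib.parse import urlencode


def sitelinks_url(entity_id):
    """Build a wbgetentities URL (action and props are assumptions, see above)."""
    params = {
        "action": "wbgetentities",
        "ids": entity_id,
        "props": "sitelinks",
        "format": "json",
    }
    return "https://www.wikidata.org/w/api.php?" + urlencode(params)


def language_links(response, entity_id):
    """Map each site key (e.g. 'enwiki') to the linked article title."""
    sitelinks = response["entities"][entity_id].get("sitelinks", {})
    return {site: link["title"] for site, link in sitelinks.items()}


# Hand-written sample in the assumed response shape, not real API output.
sample = {
    "entities": {
        "Q4198": {
            "sitelinks": {
                "enwiki": {"site": "enwiki", "title": "Ore Mountains"},
                "dewiki": {"site": "dewiki", "title": "Erzgebirge"},
            }
        }
    }
}
print(language_links(sample, "Q4198"))
```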

WebRequest to connect to the Wikipedia API

廉价感情. submitted on 2019-12-04 19:31:41
Question: This may be a pathetically simple problem, but I cannot seem to format the POST web request/response to get data from the Wikipedia API. I have posted my code below in case anyone can help me see my problem. string pgTitle = txtPageTitle.Text; Uri address = new Uri("http://en.wikipedia.org/w/api.php"); HttpWebRequest request = WebRequest.Create(address) as HttpWebRequest; request.Method = "POST"; request.ContentType = "application/x-www-form-urlencoded"; string action = "query"; string query =
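For comparison, here is a minimal sketch of the same form-encoded POST in Python rather than C#. The question's code is truncated before its query string, so the `titles` and `format` parameters below are assumptions, and `build_post` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post(page_title):
    """Build a form-encoded POST request for the API endpoint in the question."""
    # Parameters other than action=query are assumed; the original is cut off.
    fields = {"action": "query", "titles": page_title, "format": "json"}
    body = urlencode(fields).encode("ascii")  # urlencode output is plain ASCII
    return Request(
        "http://en.wikipedia.org/w/api.php",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


req = build_post("Ore Mountains")
# Sending it would be: urllib.request.urlopen(req).read()
```

The key point is that the parameters travel in the request body, URL-encoded, rather than in the query string.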

wikipedia api: get parsed introduction only

♀尐吖头ヾ submitted on 2019-12-04 16:50:56
Using PHP, is there a nice way to get only the (parsed) introduction from a Wikipedia page? I have two current methods. The first is to call the API page and return, then call the wiki parser on the introduction I have pulled from the first request (two requests, and extracting the intro from the text isn't pretty either). The second is to call the entire page parser and use XPath to retrieve every <p> tag before the contents table. With both methods I then have to re-parse the HTML to ensure the relevant links inside the introduction link off to Wikipedia. Neither is ideal really, there must be a
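The link-fixing step described above can be sketched as follows, in Python rather than PHP. This is a regex-based illustration that assumes double-quoted `href` attributes; `absolutize_links` is a hypothetical helper, and a real HTML parser would be more robust.

```python
import re
from urllib.parse import urljoin


def absolutize_links(html, page_url):
    """Rewrite every double-quoted href so it is absolute relative to page_url."""

    def replace(match):
        return 'href="' + urljoin(page_url, match.group(1)) + '"'

    return re.sub(r'href="([^"]*)"', replace, html)


intro = '<p>The <a href="/wiki/Dog">dog</a> is a domesticated animal.</p>'
print(absolutize_links(intro, "https://en.wikipedia.org/wiki/Dog"))
```

Links that are already absolute pass through `urljoin` unchanged.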

Find main category for article using Wikipedia API

╄→尐↘猪︶ㄣ submitted on 2019-12-04 13:33:07
I have a list of articles and I want to find the main category of each article. Wikipedia lists its main categories here - http://en.wikipedia.org/wiki/Portal:Contents/Categories. I am able to find the subcategories of each article using: http://en.wikipedia.org/w/api.php?action=query&prop=categories&titles=%s&format=xml I am also able to check whether a subcategory is within a category: http://en.wikipedia.org/w/api.php?action=query&titles=Dog&prop=categories&clcategories=Domesticated animals&format=xml This will tell me whether "domesticated animals" is a subcategory of Dog, but this is not
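The membership check in the second URL can be sketched like this. Two things are assumptions not taken from the question: the response is requested as JSON instead of XML, and the category name is sent with a `Category:` prefix. The helper names are illustrative.

```python
from urllib.parse import urlencode


def category_check_url(title, category):
    """Build a prop=categories query restricted to one category."""
    params = {
        "action": "query",
        "prop": "categories",
        "titles": title,
        "clcategories": "Category:" + category,  # prefix is an assumption
        "format": "json",
    }
    return "http://en.wikipedia.org/w/api.php?" + urlencode(params)


def is_in_category(response, category):
    """Return True if any page in the response lists the given category."""
    wanted = "Category:" + category
    for page in response.get("query", {}).get("pages", {}).values():
        if any(c.get("title") == wanted for c in page.get("categories", [])):
            return True
    return False


# Hand-written sample in the assumed JSON shape, not real API output.
sample = {
    "query": {
        "pages": {
            "1": {
                "title": "Dog",
                "categories": [{"ns": 14, "title": "Category:Domesticated animals"}],
            }
        }
    }
}
print(is_in_category(sample, "Domesticated animals"))
```

Note that this only reports direct membership of the page in the category; it does not walk up the category tree.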

Wikipedia list=search REST API: how to retrieve also Url of matching articles

半世苍凉 submitted on 2019-12-04 11:15:03
Question: I'm studying the Wikipedia REST API but I can't find the right option to also get URLs for a search query. This is the URL of the request: http://it.wikipedia.org/w/api.php?action=query&list=search&srsearch=calvino&format=xml&srprop=snippet This request outputs only the title and the snippet, but no URLs for the articles. I've checked the Wikipedia API documentation for the list=search query, but there seems to be no option to also get URLs. Best Regards, Fabio Buda Answer 1: You can form the URL of the
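The answer is truncated, but one common way to form an article URL from a returned title, assumed here rather than confirmed by the source, is to replace spaces with underscores and percent-encode the result under `/wiki/`. `article_url` is a hypothetical helper.

```python
from urllib.parse import quote


def article_url(title, lang="it"):
    """Form an article URL from a page title (assumed /wiki/ path convention)."""
    return "https://" + lang + ".wikipedia.org/wiki/" + quote(title.replace(" ", "_"))


print(article_url("Italo Calvino"))
```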

wikipedia api search titles generator

∥☆過路亽.° submitted on 2019-12-04 10:59:55
I'm trying to search titles through the API using a generator. I notice that there are two possible generators, and I have problems with both: prefix search - doesn't work well if I have multiple words and the order is reversed in the query (for example "brian adams" would return an answer, however "adams brian" does not); search - seems not to allow searching by titles, only by text, which returns low-quality results. Does anyone know of a way around this? "srwhat=title" is disabled, so you should use "intitle:" in your search query: https://en.wikipedia.org/w/api.php?action=query&list=search&srnamespace=0
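A minimal sketch of building such a query follows. The URL in the answer is cut off after `srnamespace=0`, so prefixing each word with `intitle:` and adding `format=json` are assumptions, and `title_search_url` is an illustrative helper.

```python
from urllib.parse import urlencode


def title_search_url(words):
    """Build a list=search URL that requires every word to appear in the title."""
    # One intitle: term per word, so word order in the title does not matter.
    query = " ".join("intitle:" + word for word in words)
    params = {
        "action": "query",
        "list": "search",
        "srnamespace": 0,
        "srsearch": query,
        "format": "json",
    }
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)


print(title_search_url(["adams", "brian"]))
```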

How can I get the principal image from MediaWiki API?

我与影子孤独终老i submitted on 2019-12-04 07:16:59
Hello, I'm using cURL to get information from Wikipedia, and I want to receive only information about the principal image; I don't want to receive all the images of an article. For example, if I want to get info about all the images of the English language article ( http://en.wikipedia.org/wiki/English_language ) I should go to this URL: http://en.wikipedia.org/w/api.php?action=query&titles=English_Language&prop=images but I receive flags of countries where people speak English, in XML: <?xml version="1.0"?> <api> <query> <normalized> <n from="English_language" to="English language" /> </normalized> <pages>
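The excerpt contains no answer. One possible approach, stated here as an assumption, is the `pageimages` property from the PageImages extension with `piprop=original`, which returns a single lead image per page on wikis that have the extension installed. The helper names and sample data are illustrative.

```python
from urllib.parse import urlencode


def lead_image_query(title):
    """Build a prop=pageimages query (assumes the PageImages extension)."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "pageimages",
        "piprop": "original",
        "format": "json",
    }
    return "http://en.wikipedia.org/w/api.php?" + urlencode(params)


def lead_image_source(response):
    """Return the first page's original image URL, or None if absent."""
    for page in response.get("query", {}).get("pages", {}).values():
        original = page.get("original")
        if original:
            return original.get("source")
    return None


# Hand-written sample in the assumed JSON shape; the image URL is made up.
sample = {
    "query": {
        "pages": {
            "1": {
                "title": "English language",
                "original": {"source": "https://example.org/lead.png", "width": 10, "height": 10},
            }
        }
    }
}
print(lead_image_source(sample))
```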

Getting content using wikipedia API

不问归期 submitted on 2019-12-04 06:24:31
Question: How can I get the entire first section/paragraph of a Wikipedia article, including the first image, with a single request? What I've tried so far (the following URL) returns only a snippet: http://en.wikipedia.org/w/api.php?format=xml&action=query&list=search&srsearch=camera&srlimit=1 Answer 1: If you want the wikitext, use prop=revisions with rvsection=0. If you want HTML, you can add rvparse=1 to that query or you can use action=parse. Answer 2: How can I get the entire first section
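The two options from Answer 1 can be sketched as URL builders. `rvprop=content`, `section=0`, `prop=text`, and `format=json` are assumptions added to make the requests concrete; only `prop=revisions`, `rvsection=0`, and `action=parse` come from the answer. The function names are illustrative.

```python
from urllib.parse import urlencode

API = "http://en.wikipedia.org/w/api.php"


def intro_wikitext_url(title):
    """Wikitext of section 0 via prop=revisions with rvsection=0."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",  # assumed: ask for the revision text itself
        "rvsection": 0,
        "titles": title,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def intro_html_url(title):
    """Rendered HTML of section 0 via action=parse."""
    params = {
        "action": "parse",
        "page": title,
        "section": 0,  # assumed: limit parsing to the lead section
        "prop": "text",
        "format": "json",
    }
    return API + "?" + urlencode(params)


print(intro_wikitext_url("Camera"))
print(intro_html_url("Camera"))
```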