wikipedia-api

PHP + Wikipedia: Get content from the first paragraph in a Wikipedia article?

只谈情不闲聊 submitted on 2019-12-07 19:36:49
Question: I’m trying to use Wikipedia’s API (api.php) to get the content of a Wikipedia article provided by a link (like: http://en.wikipedia.org/wiki/Stackoverflow). What I want is the first paragraph (which in the example of the Stackoverflow wiki article is: Stack Overflow is a website part of the Stack Exchange network[2][3] featuring questions and answers on a wide range of topics in computer programming.[4][5][6]). I’m going to do some data manipulation with it. I’ve tried with the

Extract related articles in different languages using Wikidata Toolkit

放肆的年华 submitted on 2019-12-07 13:05:40
I'm trying to extract interlanguage-related articles from a Wikidata dump. After searching the internet, I found a tool named Wikidata Toolkit that helps to work with this type of data, but there is no information about how to find related articles in different languages. For example, the article "Dresden" in English is related to the article "Dresda" in Italian; the second one is the translated version of the first. I tried to use the toolkit, but I couldn't find any solution. Please write an example of how to find this related article. you
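As a rough illustration of what "related article in another language" means in the data: each Wikidata entity record carries a "sitelinks" map keyed by wiki (for example "enwiki", "itwiki"). The sketch below is plain Python, not Wikidata Toolkit, and assumes the dump is the JSON format with one entity object per record. The sample record is trimmed and made up for illustration, and the function name is hypothetical.

```python
import json

def sitelink_titles(entity, wikis=("enwiki", "itwiki")):
    """Return {wiki: article title} for the requested wikis of one entity."""
    sitelinks = entity.get("sitelinks", {})
    return {wiki: sitelinks[wiki]["title"] for wiki in wikis if wiki in sitelinks}

# Trimmed, illustrative record; a real dump record has many more fields.
sample_record = (
    '{"sitelinks": {'
    '"enwiki": {"site": "enwiki", "title": "Dresden"}, '
    '"itwiki": {"site": "itwiki", "title": "Dresda"}}}'
)

titles = sitelink_titles(json.loads(sample_record))
print(titles)
```

Entities without a sitelink for a requested wiki are simply left out of the result, so a missing translation shows up as a missing key.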

Setting “an informative User-Agent string” in getURL

不羁岁月 submitted on 2019-12-07 00:02:31
Question: I am trying to access a Wikipedia page to get a list of pages, and I get the following error:
    library(RCurl)
    u <- "http://en.wikipedia.org/w/index.php?title=Special%3APrefixIndex&prefix=tal&namespace=4"
    getURL(u)
    [1] "Scripts should use an informative User-Agent string with contact information, or they may be IP-blocked without notice.\n"
I hope to get to that page through the Wikipedia API, but I am not sure it would work. The odd thing is that other pages are read without problem, for
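The error message asks for a User-Agent header that identifies the script and gives contact information. The question uses R; the sketch below shows the same idea in Python's standard library instead. The tool name and contact address are hypothetical placeholders, and the network call is left commented out.

```python
import urllib.request

# Hypothetical tool name and contact address; replace with your own.
USER_AGENT = "ExamplePrefixLister/0.1 (mailto:you@example.org)"

def build_request(url, user_agent=USER_AGENT):
    """Create a request that carries an informative User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

url = ("http://en.wikipedia.org/w/index.php"
       "?title=Special%3APrefixIndex&prefix=tal&namespace=4")
request = build_request(url)

# To actually fetch the page (needs network access):
# with urllib.request.urlopen(request) as response:
#     html = response.read().decode("utf-8")
```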

PHP + Wikipedia: Get content from the first paragraph in a Wikipedia article?

て烟熏妆下的殇ゞ submitted on 2019-12-06 09:25:50
I’m trying to use Wikipedia’s API (api.php) to get the content of a Wikipedia article provided by a link (like: http://en.wikipedia.org/wiki/Stackoverflow). What I want is the first paragraph (which in the example of the Stackoverflow wiki article is: Stack Overflow is a website part of the Stack Exchange network[2][3] featuring questions and answers on a wide range of topics in computer programming.[4][5][6]). I’m going to do some data manipulation with it. I’ve tried with the API url: http://en.wikipedia.org/w/api.php?action=parse&page=Stackoverflow&format=xml but it gives me
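One possible approach, which is not necessarily the one given in the original answers, is to ask for a plain-text intro via prop=extracts (the TextExtracts extension) instead of action=parse. The sketch below only builds the URL and picks the first paragraph out of a response; the function names are hypothetical and the sample response is made up for illustration.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"

def intro_extract_url(title):
    """Build a query URL asking for the plain-text intro of one article."""
    params = {
        "action": "query",
        "prop": "extracts",
        "exintro": 1,        # only the section before the first heading
        "explaintext": 1,    # plain text instead of HTML
        "redirects": 1,      # follow redirects such as Stackoverflow
        "format": "json",
        "titles": title,
    }
    return API_URL + "?" + urlencode(params)

def first_paragraph(response):
    """Return the first paragraph of the first page's extract, or None."""
    pages = response.get("query", {}).get("pages", {})
    page = next(iter(pages.values()), {})
    extract = page.get("extract")
    return extract.split("\n")[0] if extract else None

url = intro_extract_url("Stackoverflow")

# Made-up response with the shape the API returns for format=json.
sample_response = {"query": {"pages": {"1": {
    "title": "Example", "extract": "First paragraph.\nSecond paragraph."}}}}
paragraph = first_paragraph(sample_response)
```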

Find main category for article using Wikipedia API

时光总嘲笑我的痴心妄想 submitted on 2019-12-06 06:44:05
Question: I have a list of articles and I want to find the main category of each article. Wikipedia lists its main categories here: http://en.wikipedia.org/wiki/Portal:Contents/Categories. I am able to find the subcategories of each article using: http://en.wikipedia.org/w/api.php?action=query&prop=categories&titles=%s&format=xml I am also able to check whether a subcategory is within a category: http://en.wikipedia.org/w/api.php?action=query&titles=Dog&prop=categories&clcategories=Domesticated

wikipedia api search titles generator

≯℡__Kan透↙ submitted on 2019-12-06 06:11:24
Question: I'm trying to search titles through the API using a generator. I notice that there are two possible generators, and I have problems with both:
- prefix search: doesn't work well if I have multiple words and the order is reversed in the query (for example, "brian adams" returns an answer, but "adams brian" does not)
- search: seems not to allow searching by titles, only by text, which returns low-quality results
Does anyone know of a way around this?
Answer 1: "srwhat=title" is disabled, so you should use

Wikipedia API: how to get the number of revisions of a page?

我的未来我决定 submitted on 2019-12-05 21:12:00
Question: Does anyone know how to get the number of revisions of a Wikipedia page using the MediaWiki API? I have read the API documentation, but can't find the related API: Revision API
Answer 1: The only possibility is to retrieve all revisions and count them. You might need to continue the query for that. Bug 17993 is about including a count, but is still unsolved.
Answer 2: Here is code to get the number of revisions of a page (in this case, the JSON wiki page): import requests BASE_URL = "http://en.wikipedia.org/w/api
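The "retrieve all revisions and count them" approach from Answer 1 can be sketched as a loop that follows the API's continuation values. This is not the truncated code from Answer 2; it is a minimal sketch in which the HTTP call is passed in as a `fetch` function (a hypothetical parameter) so the counting logic can be run offline. The fake responses below are made up to exercise the loop.

```python
def count_revisions(title, fetch):
    """Count revisions of one page by paging through prop=revisions."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids",      # ids only, to keep responses small
        "rvlimit": "max",
        "format": "json",
    }
    total = 0
    while True:
        data = fetch(params)
        for page in data.get("query", {}).get("pages", {}).values():
            total += len(page.get("revisions", []))
        if "continue" not in data:
            return total
        # Merge the continuation values into the next request.
        params = {**params, **data["continue"]}

def fake_fetch(params):
    """Offline stand-in for the API: two batches, three revisions in total."""
    if "rvcontinue" not in params:
        return {"continue": {"rvcontinue": "x|1", "continue": "||"},
                "query": {"pages": {"1": {"revisions": [{}, {}]}}}}
    return {"query": {"pages": {"1": {"revisions": [{}]}}}}

revision_count = count_revisions("JSON", fake_fetch)
```

With the requests library, a real `fetch` could be something like `lambda p: requests.get(api_url, params=p).json()`, where `api_url` points at the wiki's api.php.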

How to get coordinates from a Wikipedia page through API?

北战南征 submitted on 2019-12-05 17:51:45
I want to get the coordinates of a Wikipedia page through their API, passing the page title as the 'titles' parameter. I have searched SO for a solution, but it seems the answers scrape the page and then extract the coordinates. Is it possible through the API?
You need to use the Wikipedia API. For your example with Kinkaku-ji the query will be: https://en.wikipedia.org/w/api.php?action=query&prop=coordinates&titles=Kinkaku-ji For more than one title, use a pipe to separate them: titles=Kinkaku-ji|Paris|...
Source: https://stackoverflow.com/questions/40098656/how-to-get-coordinates-from-a-wikipedia-page-through-api
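To read the result of the prop=coordinates query in a script, a minimal sketch follows. It assumes format=json is added to the query, and the helper name is hypothetical. The sample response has the general shape the API returns, but its page id and coordinate values are illustrative, not fetched.

```python
def extract_coordinates(response):
    """Map page title -> (lat, lon) for pages that have coordinates."""
    result = {}
    for page in response.get("query", {}).get("pages", {}).values():
        coords = page.get("coordinates")
        if coords:
            # The first entry is used; pages can list several coordinates.
            result[page["title"]] = (coords[0]["lat"], coords[0]["lon"])
    return result

# Illustrative response; values are placeholders, not real API output.
sample_response = {"query": {"pages": {
    "100": {"pageid": 100, "title": "Kinkaku-ji",
            "coordinates": [{"lat": 35.04, "lon": 135.73,
                             "primary": "", "globe": "earth"}]},
    "200": {"pageid": 200, "title": "No coordinates here"},
}}}

coordinates = extract_coordinates(sample_response)
```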

Wikipedia API: search for famous people

可紊 submitted on 2019-12-05 16:03:59
Question: I have the following Wikipedia API search query: http://en.wikipedia.org/w/api.php?&action=query&generator=search&gsrnamespace=0&gsrlimit=20&prop=pageimages|extracts&pilimit=max&exintro&exsentences=1&exlimit=max&continue&pithumbsize=100&gsrsearch=Albert%20Einstein I just want to list famous people. Is there a way to do that?
Answer 1: There isn't an exact way to limit your search results to only famous people. However, you can use a few different filters with Wikipedia's CirrusSearch to

How to get list of statements for a given Wikidata ID?

拜拜、爱过 submitted on 2019-12-05 09:29:42
The only thing I managed to do is this link: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q568&format=jsonfm But this produces lots of useless data. What I need is all the statements for the given item, and I can't see any of the statements in the query above. For this item it would be:
    {
      "instance of" : "chemical element",
      "element symbol" : "Li",
      "atomic number" : 3,
      "oxidation state" : 1,
      "subclass of" : ["chemical element", "alkali metal"]
      // etc...
    }
Is there an API for this, or must I scrape the web page?
The information you want is in your query, except it's hard to decode.
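A rough sketch of decoding the statements from a wbgetentities response: each entity has a "claims" map from property ID to a list of statements, and the value sits under mainsnak.datavalue. The helper name is hypothetical and the sample entity is trimmed and illustrative. Note that the result is keyed by property IDs and item IDs rather than labels such as "instance of"; resolving labels would need a further lookup.

```python
def simplify_claims(entity):
    """Flatten an entity's claims into {property id: [values]}."""
    simplified = {}
    for prop, statements in entity.get("claims", {}).items():
        values = []
        for statement in statements:
            snak = statement["mainsnak"]
            if snak.get("snaktype") != "value":
                continue  # skip "somevalue" / "novalue" snaks
            datavalue = snak["datavalue"]
            value = datavalue["value"]
            if datavalue["type"] == "wikibase-entityid":
                value = value.get("id", value)  # keep just the item id
            values.append(value)
        simplified[prop] = values
    return simplified

# Trimmed, illustrative entity; a real response holds many more claims.
sample_entity = {"claims": {
    "P31": [{"mainsnak": {"snaktype": "value", "property": "P31",
             "datavalue": {"type": "wikibase-entityid",
                           "value": {"entity-type": "item", "id": "Q11344"}}}}],
    "P246": [{"mainsnak": {"snaktype": "value", "property": "P246",
              "datavalue": {"type": "string", "value": "Li"}}}],
}}

statements = simplify_claims(sample_entity)
```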