wikipedia-api

Parse birth and death dates from Wikipedia?

丶灬走出姿态 submitted on 2020-06-09 11:29:12

Question: I'm trying to write a Python program that can search Wikipedia for the birth and death dates of people. For example, Albert Einstein was born 14 March 1879 and died 18 April 1955. I started with "Fetch a Wikipedia article with Python":

import urllib2
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
infile = opener.open('http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&rvsection=0&titles=Albert_Einstein&format=xml')
page2 = infile
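A minimal sketch of one way to go from here, using the stdlib urllib.request (urllib2's Python 3 successor) to fetch the lead-section wikitext and a regex to pull dates out of the common {{Birth date ...}} / {{Death date ...}} infobox templates. The regex only covers the simple |YYYY|M|D template form; real articles often add extra parameters (e.g. df=yes) or use other formats, in which case this returns None.

```python
import json
import re
import urllib.parse
import urllib.request

def parse_life_dates(wikitext):
    """Return (birth, death) as 'YYYY-MM-DD' strings, or None when no
    matching {{Birth date ...}} / {{Death date ...}} template is found."""
    def find(name):
        m = re.search(
            r"\{\{%s date(?: and age)?\|(\d{4})\|(\d{1,2})\|(\d{1,2})" % name,
            wikitext, re.IGNORECASE)
        return "%s-%02d-%02d" % (m.group(1), int(m.group(2)), int(m.group(3))) if m else None
    return find("birth"), find("death")

def fetch_lead_wikitext(title):
    """Fetch section 0 of an article as wikitext via the MediaWiki API."""
    params = urllib.parse.urlencode({
        "action": "query", "prop": "revisions", "rvprop": "content",
        "rvslots": "main", "rvsection": 0, "titles": title,
        "format": "json", "formatversion": 2})
    req = urllib.request.Request(
        "https://en.wikipedia.org/w/api.php?" + params,
        headers={"User-Agent": "date-parser-demo/0.1"})  # hypothetical UA string
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["query"]["pages"][0]["revisions"][0]["slots"]["main"]["content"]
```

Usage would be parse_life_dates(fetch_lead_wikitext("Albert_Einstein")); a more robust route for structured facts like dates is to query Wikidata instead of parsing wikitext.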

How to get all URLs in a Wikipedia page

混江龙づ霸主 submitted on 2020-05-26 05:02:34

Question: It seems like the Wikipedia API's definition of a link is different from a URL? I'm trying to use the API to return all the URLs in a specific wiki page. I have been playing around with a query that I found on this page, under generators and redirects.

Answer 1: I'm not sure why exactly you are confused (it would help if you explained that), but I'm quite sure that query is not what you want. It lists links (prop=links) on pages that are linked (generator=links) from the page "Title" (titles

How to obtain a list of titles of all Wikipedia articles

元气小坏坏 submitted on 2020-05-09 19:31:56

Question: I'd like to obtain a list of the titles of all Wikipedia articles. I know there are two possible ways to get content from a Wikimedia-powered wiki: one is the API, and the other is a database dump. I'd prefer not to download the wiki dump. First, it's huge, and second, I'm not really experienced with querying databases. The problem with the API, on the other hand, is that I couldn't figure out a way to retrieve only a list of the article titles, and even if it would need > 4
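The API route the asker is looking for is the list=allpages module, paged with the apcontinue token. A sketch, assuming the English Wikipedia endpoint and the article namespace (0); note the full walk covers many millions of titles, so the request count is capped here:

```python
import json
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"

def allpages_params(apcontinue=None):
    """Build the query for one batch of titles (article namespace only)."""
    params = {"action": "query", "list": "allpages", "aplimit": "max",
              "apnamespace": 0, "format": "json"}
    if apcontinue is not None:
        params["apcontinue"] = apcontinue  # resume where the last batch ended
    return params

def iter_titles(max_requests=2):
    """Yield article titles, stopping after max_requests API calls."""
    cont = None
    for _ in range(max_requests):
        qs = urllib.parse.urlencode(allpages_params(cont))
        with urllib.request.urlopen(API + "?" + qs) as resp:
            data = json.load(resp)
        for page in data["query"]["allpages"]:
            yield page["title"]
        cont = data.get("continue", {}).get("apcontinue")
        if cont is None:
            break
```

For a one-off complete list, the all-titles dump file is still far cheaper than paging the API; this sketch is for the no-dump constraint stated in the question.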

Wikipedia disambiguation error

对着背影说爱祢 submitted on 2020-02-24 11:04:45

Question: I have recently been using the wikipedia module to pick a random Wikipedia page. I have been doing this with a very large list of words and the random.choice() function, like so:

words = open("words.txt", "r")
words = words.read()
words = words.split()
text = random.choice(words)
string = random.choice(wikipedia.search(text))
p = wikipedia.page(string)

The system works most of the time, but will occasionally choke with the error:

Traceback (most recent call last):
  File "/home/will/google4.py", line
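The truncated traceback is almost certainly wikipedia.DisambiguationError, raised when the chosen search result is a disambiguation page. A sketch of the catch-and-retry logic, with the library calls passed in as plain functions so the shape is testable; in real use, search/load would be wikipedia.search / wikipedia.page and the exception would be the real wikipedia.DisambiguationError (which likewise carries an .options list of the concrete titles).

```python
import random

class DisambiguationError(Exception):
    """Stand-in for wikipedia.DisambiguationError, which carries .options."""
    def __init__(self, options):
        super().__init__("ambiguous title")
        self.options = options

def pick_page(words, search, load, rng=random, tries=10):
    """Keep drawing words until `load` succeeds; on a disambiguation,
    fall back to one of the concrete options the error lists."""
    for _ in range(tries):
        results = search(rng.choice(words))
        if not results:
            continue  # no search hits for this word; draw again
        try:
            return load(rng.choice(results))
        except DisambiguationError as err:
            try:
                return load(rng.choice(err.options))
            except DisambiguationError:
                continue  # option was itself ambiguous; draw a new word
    raise RuntimeError("no unambiguous page found")
```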

Extract related articles in different languages using Wikidata Toolkit

随声附和 submitted on 2020-01-23 17:57:07

Question: I'm trying to extract interlanguage-related articles from a Wikidata dump. After searching on the internet, I found a tool named Wikidata Toolkit that helps with working with this type of data. But there is no information about how to find related articles in different languages. For example, the article "Dresden" in English is related to the article "Dresda" in Italian; I mean the second one is the translated version of the first. I tried to use the toolkit,
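The question uses the Java Wikidata Toolkit, where this data lives in each item's sitelinks. As a point of comparison, the same mapping is one wbgetentities call away over Wikidata's web API; a Python sketch (the pure transformation is split out so it can be checked without a network call):

```python
import json
import urllib.parse
import urllib.request

def titles_by_site(sitelinks):
    """Map wbgetentities sitelinks, e.g. {'enwiki': {..., 'title': 'Dresden'}},
    to a flat {site: title} dict such as {'enwiki': 'Dresden'}."""
    return {site: link["title"] for site, link in sitelinks.items()}

def interlanguage_titles(qid):
    """All language-wiki titles for one Wikidata item id (e.g. 'Q1731')."""
    params = urllib.parse.urlencode({
        "action": "wbgetentities", "ids": qid,
        "props": "sitelinks", "format": "json"})
    url = "https://www.wikidata.org/w/api.php?" + params
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return titles_by_site(data["entities"][qid]["sitelinks"])
```

In the toolkit itself the equivalent data is on the ItemDocument's sitelinks; the web-API sketch above is just the quickest way to see what that structure holds.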

Cannot fetch data from Wikipedia API

早过忘川 submitted on 2020-01-21 10:11:45

Question:

let dataObj = [];
const query = 'marvel';
fetch(`https://en.wikipedia.org/w/api.php?action=query&titles=${query}&prop=revisions&rvprop=content&format=json&formatversion=2`)
  .then(data => data.json())
  .then(data => dataObj.push(data))
  .catch(err => console.log(err));

This is the error that I receive:

No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'https://s.codepen.io' is therefore not allowed access. If an opaque response serves your needs, set the
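The failure is on the browser side: MediaWiki only sends the CORS header when the request asks for it, which for anonymous requests means adding origin=* to the query string. In the fetch() above the fix is that one extra parameter on the URL; a Python sketch of the corrected query, split so the parameter set can be checked directly:

```python
from urllib.parse import urlencode

def api_params(title):
    """Same query as in the question, plus the origin=* parameter that
    makes MediaWiki emit Access-Control-Allow-Origin for anonymous use."""
    return {"action": "query", "titles": title, "prop": "revisions",
            "rvprop": "content", "format": "json", "formatversion": 2,
            "origin": "*"}

def api_url(title):
    return "https://en.wikipedia.org/w/api.php?" + urlencode(api_params(title))
```

The format=json&callback=... JSONP route also works around CORS on older setups, but origin=* is the straightforward fix for a plain fetch().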

How to get Wikipedia page from Wikidata Id?

大憨熊 submitted on 2020-01-14 08:00:29

Question: How do I get a Wikipedia page (in a particular language, say French) from a Wikidata id (e.g. Q19675)? The question seems obvious, but strangely, I find nothing on the web. I'm looking for a URL pattern that I could use with the requests Python module, something like:

url = "https://www.wikidata.org/w/api.php?action=some_method&ids=Q19675"
r = requests.post(url, headers={"User-Agent": "Magic Browser"})

Can someone help me?

Answer 1: You have to use the MediaWiki API with action=wbgetentities: https://www

Access JSON item when parent key name is unknown

天大地大妈咪最大 submitted on 2020-01-14 00:35:11

Question: I'm requesting JSON from Wikipedia's API at http://en.wikipedia.org/w/api.php?action=query&prop=description&titles=WTO&prop=extracts&exsentences&explaintext&format=json

The response looks like this:

{
  "query": {
    "pages": {
      "random_number_here": {
        "pageid": random_number_here,
        "ns": 0,
        "title": "Hello",
        "extract": "Hello world! Enchanté to meet you"
      }
    }
  }
}

Given that random_number_here changes with each request (so we don't know it), how can extract's or title's data be accessed?

Answer 1: Use Object