wikipedia-api

How do I grab just the parsed Infobox of a Wikipedia article?

寵の児 submitted on 2019-11-29 02:06:42

I'm still stuck on my problem of trying to parse articles from Wikipedia. Specifically, I want to parse the infobox section of articles: my application has references to countries, and on each country's page I would like to show the infobox from the corresponding Wikipedia article for that country. I'm using PHP here; I would greatly appreciate any code snippets or advice on what I should be doing. Thanks again. EDIT: I have a DB table with the names of countries, and a script that takes a country and shows its details. I would like to grab
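A minimal sketch of the usual approach: fetch the article's lead-section wikitext (e.g. via action=parse with prop=wikitext and section=0) and pull the top-level "| key = value" pairs out of the {{Infobox ...}} template. The example below is Python rather than PHP (the logic ports directly), uses a hard-coded sample instead of a live API call, and ignores nested templates, so treat it as a starting point only:

```python
import re

def parse_infobox(wikitext):
    """Pull top-level "| key = value" pairs out of an {{Infobox ...}} template.

    Rough sketch only: real infoboxes can nest templates and spread values
    over several lines, so a production parser should track brace depth.
    """
    m = re.search(r"\{\{Infobox", wikitext)
    if not m:
        return {}
    fields = {}
    for line in wikitext[m.end():].splitlines():
        if line.strip() == "}}":  # naive end-of-template detection
            break
        kv = re.match(r"\s*\|\s*([\w ]+?)\s*=\s*(.*)", line)
        if kv:
            fields[kv.group(1).strip()] = kv.group(2).strip()
    return fields

# Hard-coded sample standing in for the wikitext a country page's
# section 0 would return from the API.
sample = """{{Infobox country
| conventional_long_name = French Republic
| capital = Paris
}}"""
print(parse_infobox(sample)["capital"])  # Paris
```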

How to get plain text out of Wikipedia

て烟熏妆下的殇ゞ submitted on 2019-11-28 18:02:25

Question: I've been searching for about two months now for a script that gets only the description section of a Wikipedia article. (It's for a bot I'm building, not for IRC.) That is, when I say /wiki bla bla bla, it should go to the Wikipedia page for bla bla bla, get the following, and return it to the chatroom: "Bla Bla Bla" is the name of a song made by Gigi D'Agostino. He described this song as "a piece I wrote thinking of all the people who talk and talk without saying anything". The prominent but nonsensical
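One route that avoids scraping entirely is the TextExtracts extension (prop=extracts): exintro restricts the result to the lead section and explaintext strips the markup. A sketch of building such a request in Python (your bot's language may differ; the parameters are what matter):

```python
from urllib.parse import urlencode

def extract_url(title):
    """Build a MediaWiki API request for the plain-text intro of a page.

    prop=extracts comes from the TextExtracts extension, which is enabled
    on Wikipedia; exintro limits the result to the lead section and
    explaintext strips the HTML.
    """
    params = {
        "action": "query",
        "prop": "extracts",
        "exintro": 1,
        "explaintext": 1,
        "format": "json",
        "titles": title,
    }
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)

# Page title here is just an illustration.
url = extract_url("Bla Bla Bla (song)")
print(url)
```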

How would you handle different formats of dates?

生来就可爱ヽ(ⅴ<●) submitted on 2019-11-28 14:38:50

I have dates in many different formats, for example:

27 - 28 August 663 CE
22 August 1945
19 May
May 4 1945 – August 22 1945
5/4/1945
2-7-1232
03-4-1020
1/3/1 (year 1)
09/08/0 (year 0)

Note they are all different formats, in different orders; some have two months, some only one. I tried moment.js with no results, and I also tried date.js, with no luck. I tried to do some splitting:

dates.push({ Time : [] });
function doSelect(text) {
  return $wikiDOM.find(".infobox th").filter(function() {
    return $(this).text() === text;
  });
}
dateText = doSelect("Date").siblings('td').text().split(/\s+/g);
for
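No single format string covers all of these, so one workable pattern is to normalize the text first (strip era suffixes, split ranges on the dash) and then try a list of candidate formats in order. A rough Python sketch of the idea; the format list and the day-first assumption for the numeric dates are guesses you would tune to your data:

```python
from datetime import datetime

# Candidate formats, tried in order. Day-first is assumed for the
# slashed/dashed numeric dates; flip these if your data is month-first.
FORMATS = ["%d %B %Y", "%B %d %Y", "%d/%m/%Y", "%d-%m-%Y"]

def parse_date(text):
    """Return a datetime for the first format that fits, else None.

    Handles only single dates: ranges ("May 4 1945 – August 22 1945")
    should be split on the dash first, and era suffixes stripped.
    """
    text = text.strip().removesuffix(" CE").strip()
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None

print(parse_date("22 August 1945").date())  # 1945-08-22
print(parse_date("May 4 1945").date())      # 1945-05-04
```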

How to parse Wikipedia XML with PHP?

眉间皱痕 submitted on 2019-11-28 10:34:18

How do I parse Wikipedia XML with PHP? I tried it with SimplePie, but I got nothing. Here is a link whose data I want to get: http://en.wikipedia.org/w/api.php?action=query&generator=allpages&gaplimit=2&gapfilterredir=nonredirects&gapfrom=Re&prop=revisions&rvprop=content&format=xml

Edit, code:

<?php
define("EMAIL_ADDRESS", "youlichika@hotmail.com");
$ch = curl_init();
$cv = curl_version();
$user_agent = "curl {$cv['version']} ({$cv['host']}) libcurl/{$cv['version']} {$cv['ssl_version']} zlib/{$cv['libz_version']} <" . EMAIL_ADDRESS . ">";
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl
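For the parsing step itself, SimplePie will not help (it is a feed parser, not a general XML parser); PHP's SimpleXML or DOMDocument is the usual tool. The response of that query has roughly the layout api > query > pages > page > revisions > rev. A sketch of walking that shape, shown here in Python against a trimmed, hard-coded sample response; in PHP, simplexml_load_string plus the same traversal does the job:

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative sample of the XML shape that
# action=query&prop=revisions&rvprop=content&format=xml returns.
sample = """<api><query><pages>
  <page pageid="1" title="Re">
    <revisions><rev>Example wikitext</rev></revisions>
  </page>
</pages></query></api>"""

root = ET.fromstring(sample)
for page in root.iter("page"):
    title = page.get("title")
    content = page.find("./revisions/rev").text
    print(title, "->", content)  # Re -> Example wikitext
```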

How to get all article pages under a Wikipedia Category and its sub-categories?

不羁岁月 submitted on 2019-11-28 08:16:37

I want to get all the article names under a category and its sub-categories. Options I'm aware of: using the Wikipedia API (does it have such an option?), or downloading the dump (which format would be better for my usage?). There is also the option to search Wikipedia with something like incategory:"music", but I didn't see a way to get that as XML. Please share your thoughts. The following resource will help you download all pages from the category and all its subcategories: http://en.wikipedia.org/wiki/Wikipedia:CatScan There is also an API available here: https://www.mediawiki.org/wiki/API
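The API does have such an option: list=categorymembers returns a category's direct members, and subcategories can be followed recursively (the API itself does not recurse for you). A Python sketch of the traversal with the network call stubbed out; a real fetch would request action=query&list=categorymembers&cmtitle=Category:<name> and follow the continuation parameter for paging:

```python
def collect_articles(category, fetch, seen=None):
    """Walk a category tree depth-first, collecting article titles.

    `fetch(category)` must return a list of (title, is_subcategory)
    pairs; in a real client it would call the MediaWiki API with
    list=categorymembers. `seen` guards against cycles, which do
    occur in Wikipedia's category graph.
    """
    if seen is None:
        seen = set()
    if category in seen:
        return []
    seen.add(category)
    articles = []
    for title, is_subcat in fetch(category):
        if is_subcat:
            articles.extend(collect_articles(title, fetch, seen))
        else:
            articles.append(title)
    return articles

# Stub standing in for real API calls:
tree = {
    "Music": [("Song", False), ("Music genres", True)],
    "Music genres": [("Jazz", False), ("Music", True)],  # note the cycle
}
print(collect_articles("Music", lambda c: tree.get(c, [])))  # ['Song', 'Jazz']
```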

Android Wikipedia API game

大城市里の小女人 submitted on 2019-11-28 07:05:34

Question: Hi, I have to make an app with the following requirements: When the user opens the app, it displays the text of a random Wikipedia page. (You're free to use any logic for grabbing text from a random wiki page, preferably using REST APIs.) The game requires a minimum of 10 lines of text on the screen. However, we want to show complete paragraphs of text to make it easier to understand the content displayed. Use the least number of paragraphs required to cross the 10-sentence limit. I am able to
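The "complete paragraphs until the sentence minimum" rule is straightforward to express once the page text is split into paragraphs. A Python sketch of just that selection step (the sentence counting via punctuation is deliberately naive and any real tokenizer can be swapped in; the page text itself would come from whichever random-page endpoint you choose):

```python
import re

def pick_paragraphs(paragraphs, min_sentences=10):
    """Take whole paragraphs, in order, until at least `min_sentences`
    sentences have been collected.

    Sentence counting is naive: it counts runs of .!? followed by
    whitespace or end-of-text.
    """
    chosen, count = [], 0
    for p in paragraphs:
        if count >= min_sentences:
            break
        chosen.append(p)
        count += len(re.findall(r"[.!?]+(?:\s|$)", p))
    return chosen

paras = [
    "One. Two. Three. Four.",
    "Five. Six. Seven.",
    "Eight. Nine. Ten. Eleven.",
    "Twelve.",
]
print(len(pick_paragraphs(paras)))  # 3
```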

Fetch excerpt from Wikipedia article?

孤人 submitted on 2019-11-28 07:01:14

I've been up and down the Wikipedia API, but I can't figure out if there's a nice way to fetch the excerpt of an article (usually the first paragraph). It would be nice to get the HTML formatting of that paragraph, too. The only way I currently see of getting something that resembles a snippet is by performing a fulltext search (example), but that's not really what I want (too short). Is there any other way to fetch the first paragraph of a Wikipedia article than barbarically parsing HTML/WikiText? Use this link to get the unparsed intro in XML form: "http://en.wikipedia.org/w/api.php?format
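prop=extracts with exintro returns exactly this: the lead section, with its HTML formatting kept unless explaintext is added. A Python sketch of pulling the excerpt out of the response; the sample JSON below is a trimmed, illustrative version of the real shape, where pages are keyed by pageid:

```python
import json

# Trimmed, illustrative shape of a prop=extracts&exintro&format=json
# response: pages are keyed by pageid, "extract" holds the lead HTML.
sample = json.dumps({
    "query": {"pages": {"123": {
        "pageid": 123,
        "title": "Example",
        "extract": "<p><b>Example</b> is the first paragraph.</p>",
    }}}
})

def first_extract(payload):
    """Return the extract of the first (only) page in the response."""
    pages = json.loads(payload)["query"]["pages"]
    return next(iter(pages.values()))["extract"]

print(first_extract(sample))  # <p><b>Example</b> is the first paragraph.</p>
```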

Is there any API in Java to access Wikipedia data?

风流意气都作罢 submitted on 2019-11-28 06:51:16

I want to know: is there any API or query interface through which I can access Wikipedia data? MediaWiki, the wiki platform that Wikipedia uses, does have an HTTP-based API. See MediaWiki API. For example, to get pages with the title Stackoverflow, you call http://en.wikipedia.org/w/api.php?action=query&titles=Stackoverflow There are some (incomplete) Java wrappers around the API; see the Client Code - Java section of the API page for more detail. For use with Java, try http://code.google.com/p/wiki-java . It is only one class, but a great one! I had the same question and the closest I

How to get results from the Wikipedia API with PHP?

走远了吗. submitted on 2019-11-27 15:24:11

I'm probably not supposed to use file_get_contents(). What should I use? I'd like to keep it simple.

Warning: file_get_contents(http://en.wikipedia.org/w/api.php?action=query&titles=Your_Highness&prop=revisions&rvprop=content&rvsection=0): failed to open stream: HTTP request failed! HTTP/1.0 403 Forbidden

The problem you are running into here is related to the MediaWiki API's User-Agent policy: you must supply a User-Agent header, and that header must provide some means of contacting you. You can do this with file_get_contents() and a stream context (put your own contact details in place of the placeholder):

$opts = array('http' => array(
    'user_agent' => 'MyBot/1.0 (contact: you@example.com)'
));
$result = file_get_contents($url, false, stream_context_create($opts));
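For comparison, the same fix looks like this in Python; the bot name and contact address below are placeholders to replace with your own:

```python
from urllib.request import Request

URL = ("https://en.wikipedia.org/w/api.php?action=query"
       "&titles=Your_Highness&prop=revisions&rvprop=content"
       "&rvsection=0&format=json")

# Supply a User-Agent that identifies the client and gives a contact
# address, per the API's User-Agent policy. Note that urllib stores
# header names capitalized as "User-agent".
req = Request(URL, headers={"User-Agent": "MyWikiBot/1.0 (contact: me@example.com)"})
print(req.get_header("User-agent"))  # MyWikiBot/1.0 (contact: me@example.com)
```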