wikipedia

Filter data from the Wikimedia API on iOS

久未见 submitted on 2019-12-12 04:39:05
Question: I got the following JSON response from the Wikipedia API while searching for Anil_Ambani. I used this API, and I got the following response:

    $2 = 0x071882f0 {{BLP sources|date=June 2012}} {{Infobox person | name = Anil Ambani | image = AnilAmbani.jpg | image_size = | caption = Ambani in 2009 | birth_date = {{Birth date and age|1959|6|4|df=y}} | birth_place = [[Mumbai]], [[Maharashtra]], [[India]] | nationality = Indian | occupation = Chairman of [[Anil Dhirubhai Ambani Group]] | networth = {{loss}} [
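What comes back there is raw wikitext, template markup included. A minimal sketch of one common way around it (shown in JavaScript for brevity; on iOS the same URL can be requested with NSURLSession): ask the TextExtracts module (prop=extracts) for plain text, so templates like {{Infobox person}} never reach the client.

    // Request a plain-text intro instead of raw wikitext.
    // explaintext strips markup; exintro limits output to the lead section.
    const url = 'https://en.wikipedia.org/w/api.php'
      + '?action=query&prop=extracts&explaintext=1&exintro=1'
      + '&format=json&origin=*&titles=' + encodeURIComponent('Anil Ambani');

    fetch(url)
      .then(res => res.json())
      .then(data => {
        // query.pages is keyed by page id; take the first (only) page object.
        const page = Object.values(data.query.pages)[0];
        console.log(page.extract); // clean prose, no {{...}} markup
      });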

Extract Wikipedia articles belonging to a category from offline dumps

為{幸葍}努か submitted on 2019-12-12 04:25:11
Question: I have Wikipedia article dumps in different languages. I want to filter them down to the articles that belong to a category (specifically Category:WikiProject_Biography). I could find a lot of similar questions, for example "Wikipedia API to get articles belonging to a category" and "How do I get all articles about people from Wikipedia?". However, I would like to do it all offline, that is, using the dumps, and also for different languages. Other things which I explored are the category table and the categorylinks table.
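A minimal offline sketch (Node.js; the dump filename is an assumption, and the dump is assumed already decompressed): stream the pages-articles XML and print the titles of pages whose wikitext carries the target category link. Two caveats: for a non-English dump the localized category namespace prefix has to be swapped in, and WikiProject banners such as this one are usually placed on talk pages, so the talk-page dump may be the one that actually contains the category.

    const fs = require('fs');
    const readline = require('readline');

    // Matches [[Category:WikiProject Biography]] with or without a sort key.
    const TARGET = /\[\[Category:WikiProject Biography(\||\]\])/;

    const rl = readline.createInterface({
      input: fs.createReadStream('enwiki-pages-articles.xml'), // assumed path
    });

    let title = null;
    rl.on('line', (line) => {
      const t = line.match(/<title>(.*?)<\/title>/);
      if (t) title = t[1];                       // remember which page we are inside
      if (TARGET.test(line)) console.log(title); // page is tagged with the category
    });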

How to get the first parsed paragraph of a Wikipedia article?

六眼飞鱼酱① submitted on 2019-12-12 03:35:26
Question: This is the JS I am using:

    function onSuccess(data) {
        var markupt = data.parse.text["*"];
        $('#usp-custom-4').val(markupt);
        console.log(markupt);
        var blurbt = $('<div></div>').html(markupt);
        blurbt.find(".mw-editsection, #toc, .noprint, .thumb, img, table").remove();
        // remove links as they will not work
        blurbt.find('a').each(function() {
            $(this).replaceWith($(this).html());
        });
        // remove any references
        blurbt.find('sup').remove();
        // remove cite error
        blurbt.find('.mw-ext-cite-error').remove()
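A possible way to finish the job above (a sketch, assuming the scrubbed blurbt wrapper from the question is in scope): after cleanup, keep only the first paragraph that still contains text.

    // Take the first non-empty <p> left after scrubbing and use its plain text.
    var firstPara = blurbt.find('p')
        .filter(function () { return $(this).text().trim().length > 0; })
        .first()
        .text()
        .trim();
    $('#usp-custom-4').val(firstPara);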

Getting Wikipedia page view statistics

守給你的承諾、 submitted on 2019-12-12 03:05:53
Question: I'm trying to collect five years of time-series data on Wikipedia page-view statistics for a particular page ("Bitcoin"). I found this site useful for getting the data: http://stats.grok.se. Two issues: (1) the website triggers an "internal server error" whenever 2016 is selected as the year for which to obtain data; (2) is there an existing tool that can put this output in a more usable form, such as a .csv?

Answer 1: I don't know about stats.grok.se as it doesn't appear to live on
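A sketch of an alternative that may help here: the Wikimedia REST pageviews API serves per-article daily counts (note its data only goes back to 2015-07-01), and a few lines of JavaScript turn the JSON into CSV.

    const url = 'https://wikimedia.org/api/rest_v1/metrics/pageviews'
      + '/per-article/en.wikipedia/all-access/all-agents/Bitcoin/daily'
      + '/20160101/20161231';

    fetch(url)
      .then(res => res.json())
      .then(data => {
        console.log('date,views');                 // CSV header
        data.items.forEach(item => {
          // item.timestamp looks like "2016010100"; keep the YYYYMMDD part.
          console.log(item.timestamp.slice(0, 8) + ',' + item.views);
        });
      });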

Disambiguation using Wikipedia

主宰稳场 submitted on 2019-12-11 23:41:38
Question: I'm trying to do some disambiguation using Wikipedia's disambiguation pages. I get the list of links from the disambiguation page using the query http://en.wikipedia.org/w/api.php?action=query&prop=links&format=json&titles=stack%20overflow_(disambiguation) and I get the links alright, but what's the speediest way to get the text that appears next to each link? The API doesn't make it readily available. Other parts of my code are doing unavoidably time-consuming work, so I was wondering if anybody could
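One sketch of a fast approach (assumes the current action API; formatversion=2 and rvslots are newer parameters): fetch the page's raw wikitext in a single request and pull the description after each list-item link locally, since a disambiguation page lists its entries as "* [[Link]] description" lines.

    const url = 'https://en.wikipedia.org/w/api.php'
      + '?action=query&prop=revisions&rvprop=content&rvslots=main'
      + '&format=json&formatversion=2&origin=*'
      + '&titles=' + encodeURIComponent('Stack Overflow (disambiguation)');

    fetch(url)
      .then(res => res.json())
      .then(data => {
        const wikitext = data.query.pages[0].revisions[0].slots.main.content;
        wikitext.split('\n').forEach(line => {
          // "* [[Target|label]], description" -> target + trailing description
          const m = line.match(/^\*\s*\[\[([^\]|]+)(?:\|[^\]]*)?\]\](.*)/);
          if (m) console.log(m[1] + ' -> ' + m[2].trim());
        });
      });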

Wikipedia API to get articles belonging to a category

萝らか妹 submitted on 2019-12-11 20:21:26
Question: I would like to get a number of pages belonging to a specific category, say sports or politics. I would like to extract various sections from the pages, such as the abstract, title, etc. Is there an API to do that? If not, are there any Wikipedia dumps organized by category? Thanks

Answer 1: You're looking for the categorymembers API. Notice that you will only get pages directly in that single category, not in its subcategories, and there are no intersection operators. You probably will want to use that
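A minimal sketch of the categorymembers call the answer points to (the category name is just an example): each response carries up to 500 titles, a cmcontinue token feeds the next request, and subcategories need their own recursive calls.

    const url = 'https://en.wikipedia.org/w/api.php'
      + '?action=query&list=categorymembers&cmlimit=500'
      + '&cmtitle=' + encodeURIComponent('Category:Politics')
      + '&format=json&origin=*';

    fetch(url)
      .then(res => res.json())
      .then(data => {
        data.query.categorymembers.forEach(page => console.log(page.title));
        // If data.continue exists, repeat the request with
        // '&cmcontinue=' + data.continue.cmcontinue to page through the rest.
      });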

From Wikipedia title to Freebase MID

99封情书 submitted on 2019-12-11 19:16:29
Question: Is there a way to map Wikipedia title links to Freebase MIDs? For example, the Wikipedia titles are From_Russia_with_Love_(film), John_Barry_(composer), Lionel_Bart, and Matt_Monro:

    [{ "mid": null, "id": "/en/matt_monro" }]

It works for titles like "Matt_Monro" and "Lionel_Bart", but not "From_Russia_with_Love_(film)" or "John_Barry_(composer)". Any suggestions, please?

Answer 1: The correct key to use is /wikipedia/en_title/Matt_Monro. There's no guarantee that /en/matt_monro points to the same place (and newer
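For the record, a sketch of the MQL lookup the answer implies (historical illustration only: the Freebase API has since been retired). Freebase keys escape non-alphanumeric characters as $XXXX hex codes, which is why titles containing parentheses fail when pasted in verbatim.

    // Query by English Wikipedia title key; "(" is $0028 and ")" is $0029.
    const query = [{
      mid: null,
      key: [{
        namespace: '/wikipedia/en_title',
        value: 'John_Barry_$0028composer$0029',
      }],
    }];
    // This would have been sent to the mqlread service, e.g.
    // https://www.googleapis.com/freebase/v1/mqlread?query=<JSON-encoded query>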

MediaWiki Special:MyPage/common.js not working

我的梦境 submitted on 2019-12-11 18:38:22
Question: MediaWiki version and LocalSettings.php:

    MediaWiki 1.22.4
    $wgAllowUserJs = true;
    $wgUseSiteJs = true;

Browser version: Firefox 28.0

JavaScript code ($ gvim common.js):

    function myFunction() {
        alert("Hello World!");
    }
    var onClickAttribute = document.createAttribute("onclick");
    onClickAttribute.value = "myFunction()";
    var button = document.createElement("button");
    button.setAttributeNode(onClickAttribute);
    button.innerHTML = "Say hello";
    if (document.URL === 'http://mywiki.com/w/index.php/User:Pjc
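A sketch of the same button wired up the way user scripts usually do it (the page name here is a hypothetical stand-in, since the original URL is cut off above): wait for the DOM to be ready, compare against wgPageName rather than the full URL, and bind the handler directly instead of through an onclick attribute, which dodges most of the timing and scoping problems common.js runs into.

    $(function () {
      // Hypothetical page name; replace with your own user page.
      if (mw.config.get('wgPageName') !== 'User:Example') return;
      var button = document.createElement('button');
      button.innerHTML = 'Say hello';
      button.addEventListener('click', function () {
        alert('Hello World!');
      });
      document.getElementById('bodyContent').appendChild(button);
    });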

How to index Wikipedia files in .xml format into Solr

心不动则不痛 submitted on 2019-12-11 18:32:07
Question: I want to index Wikipedia's XML files into Solr, but I am getting an error and Solr is unable to index them. Solr expects a specific format for XML files. I changed the schema.xml and data-config.xml files to suit the tags of the Wikipedia files, but it is still unable to index them. My actual intention is to index all of Wikipedia, which is an XML file of 30 GB. How would I go about indexing all the Wikipedia files into Solr?

Answer 1: There's an example section in the DataImportHandler documentation for exactly this:
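A trimmed sketch along the lines of that documented example (field list shortened; the dump path is an assumption): a data-config.xml that feeds the dump through the XPathEntityProcessor, with stream="true" keeping memory flat even for a 30 GB file.

    <dataConfig>
      <dataSource type="FileDataSource" encoding="UTF-8" />
      <document>
        <entity name="page"
                processor="XPathEntityProcessor"
                stream="true"
                forEach="/mediawiki/page/"
                url="/data/enwiki-pages-articles.xml">
          <field column="id"    xpath="/mediawiki/page/id" />
          <field column="title" xpath="/mediawiki/page/title" />
          <field column="text"  xpath="/mediawiki/page/revision/text" />
        </entity>
      </document>
    </dataConfig>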

Setting a variable to the result of a function acting very strange

穿精又带淫゛_ submitted on 2019-12-11 17:46:32
Question: I have a function in JavaScript that is supposed to return an array of all the articles linked to by a Wikipedia page, given its title. Here it is:

    function getLinksFrom(title, returnArray, plcontinue) {
        var url = 'http://en.wikipedia.org/w/api.php?action=query&prop=links&titles=' + title + '&format=json&pllimit=500&plnamespace=0&callback=?';
        if (!returnArray) {
            returnArray = [];
        }
        if (!plcontinue) {
            plcontinue = '';
        }
        if (returnArray.length === 0 || plcontinue !== '') {
            if (plcontinue !== ''
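The "very strange" behaviour almost certainly comes from the callback=? in the URL: jQuery turns that into an asynchronous JSONP request, so the function returns before any links have arrived. A sketch of a callback-based rewrite (the continuation field name differs between API generations, so both shapes are checked):

    function getLinksFrom(title, done, links, plcontinue) {
      links = links || [];
      var url = 'https://en.wikipedia.org/w/api.php?action=query&prop=links'
        + '&titles=' + encodeURIComponent(title)
        + '&format=json&pllimit=500&plnamespace=0&callback=?'
        + (plcontinue ? '&plcontinue=' + encodeURIComponent(plcontinue) : '');

      $.getJSON(url, function (data) {
        var pages = data.query.pages;
        Object.keys(pages).forEach(function (id) {
          (pages[id].links || []).forEach(function (link) {
            links.push(link.title);
          });
        });
        // Older responses continue via data['query-continue'].links.plcontinue,
        // newer ones via data.continue.plcontinue.
        var cont = (data['query-continue'] && data['query-continue'].links)
          ? data['query-continue'].links.plcontinue
          : (data.continue && data.continue.plcontinue);
        if (cont) {
          getLinksFrom(title, done, links, cont);  // fetch the next batch
        } else {
          done(links); // the array is only complete here, inside the callback
        }
      });
    }

    // Usage:
    // getLinksFrom('Bitcoin', function (links) { console.log(links.length); });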