wikipedia

Get all Wikipedia Infobox Templates and all Pages using them

纵然是瞬间 submitted on 2019-11-29 18:19:00
Question: Given a Wikipedia page like Wikipedia: Stack Overflow, there is often an infobox (usually at the top right of the page). DBpedia lists all these attributes as RDF triples. You can see the example at DBpedia: Stack Overflow. There you see the property dbpprop:wikiPageUsesTemplate with the value dbpedia:Template:Infobox_website, which is interesting. I want to know which Wikipedia pages use this template. How can I do that and list all pages which use the Infobox
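One way to approach the question above is to ask DBpedia directly for every resource that carries the property named in the question. The following is a minimal sketch, not a confirmed answer from the thread: it assumes the `dbpprop:` and `dbpedia:` prefixes expand to `http://dbpedia.org/property/` and `http://dbpedia.org/resource/`, that the public endpoint lives at `https://dbpedia.org/sparql`, and that it accepts `query` and `format` parameters. The function names are hypothetical. The sketch only builds the query and the request URL; it does not contact the endpoint.

```python
from urllib.parse import urlencode

# Assumed public endpoint; substitute your own mirror if you run one.
SPARQL_ENDPOINT = "https://dbpedia.org/sparql"


def pages_using_template_query(template_name, limit=100):
    """Build a SPARQL query for resources whose page uses the given template.

    Full IRIs are used instead of prefixed names because the template's
    local name contains a colon ("Template:Infobox_website").
    """
    template_iri = "http://dbpedia.org/resource/Template:" + template_name
    return (
        "SELECT ?page WHERE { "
        "?page <http://dbpedia.org/property/wikiPageUsesTemplate> "
        f"<{template_iri}> . "
        "} "
        f"LIMIT {int(limit)}"
    )


def sparql_request_url(query):
    """Build a GET URL that asks the endpoint for JSON results (assumed parameters)."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return SPARQL_ENDPOINT + "?" + urlencode(params)


query = pages_using_template_query("Infobox_website")
print(query)
print(sparql_request_url(query))
```

Fetching the printed URL with any HTTP client should return one binding per matching page, provided the endpoint still exposes this property under that name.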

Indexing wikipedia with solr

断了今生、忘了曾经 submitted on 2019-11-29 17:11:17
I've installed Solr 4.6.0 and followed the tutorial available at Solr's home page. Everything was fine until I got to the real job I need to do. I need fast access to Wikipedia content and was advised to use Solr. I tried to follow the example at http://wiki.apache.org/solr/DataImportHandler#Example:_Indexing_wikipedia , but I couldn't make sense of it. I am a newbie, and I don't know what data_config.xml means! <dataConfig> <dataSource type="FileDataSource" encoding="UTF-8" /> <document> <entity name="page" processor="XPathEntityProcessor" stream="true"

Getting hyperlinks of a Wikipedia page using DBpedia

安稳与你 submitted on 2019-11-29 12:51:19
I have two resources in DBpedia: dbr:Diabetes_mellitus and dbr:Hyperglycemia . In Wikipedia, the corresponding pages are wikipedia-en:Diabetes_mellitus and wikipedia-en:Hyperglycemia . In Wikipedia there is a hyperlink from the Diabetes_mellitus page to the Hyperglycemia page. But when I try to find the link between the two resources in DBpedia, I cannot find it. I tried to find the link using the following SPARQL query. SELECT ?prop WHERE { { dbr:Diabetes_mellitus ?prop dbr:Hyperglycemia } UNION { dbr:Hyperglycemia ?prop dbr:Diabetes_mellitus } } But the answer is null. I get nothing as an answer. Is
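DBpedia's ontology has a property for plain article-to-article hyperlinks, `dbo:wikiPageWikiLink`. The sketch below builds an ASK query that checks for such a link in either direction between the two resources from the question. It is an assumption that the endpoint being queried has the page-link triples loaded; if it does not, the query returns false even though the Wikipedia pages are linked. The function name is hypothetical.

```python
def wiki_link_query(source, target):
    """Build an ASK query testing for a wiki page link in either direction.

    `source` and `target` are DBpedia resource local names, for example
    "Diabetes_mellitus". Names containing characters that are not valid in
    a prefixed name would need full IRIs instead.
    """
    return (
        "PREFIX dbr: <http://dbpedia.org/resource/>\n"
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "ASK {\n"
        f"  {{ dbr:{source} dbo:wikiPageWikiLink dbr:{target} }}\n"
        "  UNION\n"
        f"  {{ dbr:{target} dbo:wikiPageWikiLink dbr:{source} }}\n"
        "}"
    )


link_query = wiki_link_query("Diabetes_mellitus", "Hyperglycemia")
print(link_query)
```

The printed query can be pasted into a SPARQL endpoint's query form. It keeps the UNION shape of the original query and only fixes the predicate.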

Does the Wikipedia API support searches for a specific template?

五迷三道 submitted on 2019-11-29 12:17:02
Question: Is it possible to query the Wikipedia API for articles that contain a specific template? The documentation does not describe any action that would filter search results to pages that contain a template. Specifically, I am after pages that contain Template:Persondata . After that, I am hoping to be able to retrieve just that specific template in order to populate genealogy data for the openancestry.org project. The query below shows that the Albert Einstein page contains the Persondata
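The MediaWiki API has a `list=embeddedin` module that returns the pages transcluding a given page, which covers templates. The sketch below builds such a request URL for the template named in the question. It assumes the English Wikipedia endpoint `https://en.wikipedia.org/w/api.php` and restricts results to the article namespace. The function name and default limit are hypothetical, and the sketch does not check whether the template is still in use on the wiki.

```python
from urllib.parse import urlencode

# Assumed endpoint: the English Wikipedia API.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def embeddedin_url(template, limit=50, continue_token=None):
    """Build an API URL listing pages that transclude `template`.

    `continue_token` is the `eicontinue` value returned by a previous
    response; pass it to request the next batch of results.
    """
    params = {
        "action": "query",
        "list": "embeddedin",
        "eititle": template,   # full title, including the "Template:" prefix
        "einamespace": 0,      # articles only
        "eilimit": limit,
        "format": "json",
    }
    if continue_token is not None:
        params["eicontinue"] = continue_token
    return API_ENDPOINT + "?" + urlencode(params)


url = embeddedin_url("Template:Persondata")
print(url)
```

Each response carries at most `eilimit` titles, so a full listing means repeating the request with the returned continuation value until none is given.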

Wikipedia: Java library to remove Wikipedia text markup

安稳与你 submitted on 2019-11-29 10:20:47
I downloaded a Wikipedia dump and now want to remove the Wikipedia markup from the contents of each page. I tried writing regular expressions, but there are too many cases to handle. I found a Python library, but I need a Java library because I want to integrate it into my code. Thank you. Do it in two steps: let some existing tool convert the MediaWiki mark-up into plain HTML; convert the plain HTML into text. The following demo: import net.java.textilej.parser.MarkupParser; import net.java.textilej.parser.builder.HtmlDocumentBuilder; import net.java.textilej.parser.markup.mediawiki.MediaWikiDialect; import

How to add a link in MediaWiki VisualEditor Toolbar?

僤鯓⒐⒋嵵緔 submitted on 2019-11-29 03:25:24
I'm trying to insert a custom link to a special page in the VisualEditor toolbar. I googled a lot but without success. Could someone please point me in the right direction? My answer is based on the following resources: the MediaWiki core JS doc (ooui-js) and the VisualEditor JS doc (plus reading the code of both repositories used for VE, mediawiki/extension/VisualEditor and VisualEditor ). Also, as far as I know, there is no documented way of adding a tool to the toolbar in VE. It is possible, though, to add a tool to a group that has already been added, mostly used for the "Insert" tool

How to access Wikipedia from R?

♀尐吖头ヾ submitted on 2019-11-29 03:16:29
Question: Is there any package for R that allows querying Wikipedia (most probably using the MediaWiki API) to get a list of available articles relevant to such a query, as well as to import selected articles for text mining? Answer 1: Use the RCurl package for retrieving info, and the XML or RJSONIO packages for parsing the response. If you are behind a proxy, set your options. opts <- list( proxy = "136.233.91.120", proxyusername = "mydomain\\myusername", proxypassword = 'whatever', proxyport = 8080 ) Use the

Wikipedia API: how to search for a term in a specific category

て烟熏妆下的殇ゞ submitted on 2019-11-29 03:06:32
Question: I'm having a hard time figuring out a basic task: how to search for a term restricted to a specific category. I find the Wiki API documentation rather confusing. I'd just like to receive as output a JSON file with all the suggestions related to that term. For example, if I search for Matrix in the category movies, I should get The Matrix 1, The Matrix 2, etc., excluding math results and so on. Thanks. Answer 1: I feel your pain bro, try something like: http://en.wikipedia.org/w/api.php?action=query&list=search&format=jsonfm
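The answer's URL is cut off above, so the following is an independent sketch rather than a reconstruction of it. On Wikimedia wikis the search backend understands an `incategory:` keyword inside the search string, which can be combined with `list=search`. The category name used here, "Science fiction films", is purely illustrative, and the function name is hypothetical. Note that `incategory:` matches only pages placed directly in that category, not in its subcategories.

```python
from urllib.parse import urlencode

# Assumed endpoint: the English Wikipedia API.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def category_search_url(term, category, limit=10):
    """Build a full-text search URL restricted to one category."""
    params = {
        "action": "query",
        "list": "search",
        # Quote the category so names containing spaces stay one token.
        "srsearch": f'{term} incategory:"{category}"',
        "srlimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


search_url = category_search_url("Matrix", "Science fiction films")
print(search_url)
```

Using `format=json` instead of the `jsonfm` shown in the answer gives machine-readable output; `jsonfm` is the pretty-printed HTML variant meant for reading in a browser.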

How do I grab just the parsed Infobox of a wikipedia article?

寵の児 submitted on 2019-11-29 02:06:42
I'm still stuck on my problem of trying to parse articles from Wikipedia. Specifically, I wish to parse the infobox section of Wikipedia articles: my application has references to countries, and on each country page I would like to show the infobox from the corresponding Wikipedia article for that country. I'm using PHP here, and I would greatly appreciate it if anyone has any code snippets or advice on what I should be doing. Thanks again. EDIT Well, I have a db table with names of countries. And I have a script that takes a country and shows its details. I would like to grab
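The question uses PHP; the sketch below is in Python and shows only the extraction step, under the assumption that the article's raw wikitext has already been fetched (for instance with the API's `action=parse` module and `prop=wikitext`). It finds the first `{{Infobox ...}}` call and returns it whole by counting nested `{{` and `}}` pairs, since infoboxes usually contain other templates. The function name and the sample text are hypothetical, and the scan does not account for braces inside `<nowiki>` sections or comments.

```python
import re


def extract_infobox(wikitext):
    """Return the first {{Infobox ...}} template call, or None if absent."""
    match = re.search(r"\{\{\s*infobox", wikitext, re.IGNORECASE)
    if match is None:
        return None
    start = match.start()
    depth = 0
    i = start
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1          # entering a (possibly nested) template
            i += 2
        elif pair == "}}":
            depth -= 1          # leaving a template
            i += 2
            if depth == 0:
                return wikitext[start:i]
        else:
            i += 1
    return None                 # unbalanced braces: no complete infobox


sample = "Intro text {{Infobox country|name=X|flag={{flagicon|X}}}} more text"
print(extract_infobox(sample))
```

The returned wikitext can be split on top-level `|` characters to read individual fields, or sent back to the API's parse module as `text` to get rendered HTML for display.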

Use freebase data on local server?

浪子不回头ぞ submitted on 2019-11-28 23:16:25
Question: Are there any existing ways of using the Freebase data dumps to create a database similar to what Freebase offers, but on your own server? Pretty much Freebase, but locally and not through the API? I guess it would be possible to create one, but are there any existing solutions for this already? Or any alternative solutions for similar data without using an API? I didn't find this for DBpedia either :| Answer 1: Take a look at the freebase-quad-rdfize project on Google Code. It should allow you to