wikipedia

How to get plain text out of wikipedia

て烟熏妆下的殇ゞ submitted on 2019-11-28 18:02:25
Question: I've been searching for about 2 months now for a script that gets only the description section of a Wikipedia article. (It's for a bot I'm building, not for IRC.) That is, when I say /wiki bla bla bla, it should go to the Wikipedia page for bla bla bla, get the following text, and return it to the chatroom: "Bla Bla Bla" is the name of a song made by Gigi D'Agostino. He described this song as "a piece I wrote thinking of all the people who talk and talk without saying anything". The prominent but nonsensical

How would you handle different formats of dates?

生来就可爱ヽ(ⅴ<●) submitted on 2019-11-28 14:38:50
I have dates in many different formats, for example: 27 - 28 August 663 CE; 22 August 1945; 19 May; May 4 1945 – August 22 1945; 5/4/1945; 2-7-1232; 03-4-1020; 1/3/1 (year 1); 09/08/0 (year 0). Note that they all differ in format and ordering, and some contain two months while others contain only one. I tried Moment.js with no results, and I also tried Date.js, still with no luck. I tried to do some splitting: dates.push({ Time : [] }); function doSelect(text) { return $wikiDOM.find(".infobox th").filter(function() { return $(this).text() === text; }); } dateText = doSelect("Date").siblings('td').text().split(/\s+/g); for
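One possible way to approach the formats listed above, sketched in Python rather than the question's JavaScript: try a small set of regular expressions in turn and return plain (year, month, day) tuples, because ordinary date types tend to reject year 0 and missing years. The function names are hypothetical, and reading numeric dates as month/day/year is an assumption (it makes 5/4/1945 agree with "May 4 1945" in the list).

```python
import re

MONTH_NAMES = ["january", "february", "march", "april", "may", "june", "july",
               "august", "september", "october", "november", "december"]
MONTHS = {name: number for number, name in enumerate(MONTH_NAMES, start=1)}

# Numeric form such as 5/4/1945 or 03-4-1020. Month-first order is an assumption.
NUMERIC = re.compile(r"^(\d{1,2})[/-](\d{1,2})[/-](\d{1,4})$")
# Day-first textual form such as "22 August 1945" or "19 May" (year optional).
DAY_MONTH = re.compile(r"^(\d{1,2})\s+([A-Za-z]+)(?:\s+(\d{1,4}))?$")
# Month-first textual form such as "May 4 1945" (year optional).
MONTH_DAY = re.compile(r"^([A-Za-z]+)\s+(\d{1,2}),?(?:\s+(\d{1,4}))?$")


def parse_single_date(text):
    """Return (year, month, day), with None for a missing year, or None if unrecognized."""
    text = re.sub(r"\s*\(.*?\)\s*$", "", text.strip())  # drop notes like "(year 1)"
    text = re.sub(r"\s+(CE|AD)$", "", text)              # drop a trailing era marker
    match = NUMERIC.match(text)
    if match:
        month, day, year = (int(group) for group in match.groups())
        return (year, month, day)
    match = DAY_MONTH.match(text)
    if match and match.group(2).lower() in MONTHS:
        year = int(match.group(3)) if match.group(3) else None
        return (year, MONTHS[match.group(2).lower()], int(match.group(1)))
    match = MONTH_DAY.match(text)
    if match and match.group(1).lower() in MONTHS:
        year = int(match.group(3)) if match.group(3) else None
        return (year, MONTHS[match.group(1).lower()], int(match.group(2)))
    return None


def parse_date_range(text):
    """Parse 'A - B' or 'A – B' into (start, end); a single date is returned twice."""
    parts = re.split(r"\s+[-–]\s+", text.strip(), maxsplit=1)
    if len(parts) == 1:
        single = parse_single_date(parts[0])
        return (single, single)
    end = parse_single_date(parts[1])
    if parts[0].isdigit() and end is not None:
        # A bare leading day ("27 - 28 August 663 CE") borrows month and year from the end.
        start = (end[0], end[1], int(parts[0]))
    else:
        start = parse_single_date(parts[0])
    return (start, end)
```

Ranges are split only on a dash surrounded by whitespace, so 2-7-1232 is still treated as one numeric date. Anything the patterns do not cover comes back as None, which makes unhandled formats easy to log and add later.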

How to get all article pages under a Wikipedia Category and its sub-categories?

不羁岁月 submitted on 2019-11-28 08:16:37
I want to get the names of all articles under a category and its sub-categories. Options I'm aware of: (1) using the Wikipedia API (does it have such an option?); (2) downloading the dump (which format would be better for my usage?). There is also an option to search Wikipedia with something like incategory:"music", but I didn't see a way to view that in XML. Please share your thoughts. The following resource will help you download all pages from the category and all its subcategories: http://en.wikipedia.org/wiki/Wikipedia:CatScan There is also an API available here: https://www.mediawiki.org/wiki/API
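As a rough illustration of the API route mentioned above, the MediaWiki Action API offers a `list=categorymembers` query that returns both pages and sub-categories, so a category tree can be walked breadth-first. The sketch below is not taken from the answer: the function names, the `max_depth` limit and the User-Agent string are all assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_json(params):
    """Send one GET request to the MediaWiki API and decode the JSON body."""
    request = Request(API_URL + "?" + urlencode(params),
                      headers={"User-Agent": "category-walk-sketch/0.1 (placeholder)"})
    with urlopen(request) as response:
        return json.load(response)


def category_members(category, fetch=fetch_json):
    """Yield every member of one category, following 'continue' tokens."""
    params = {"action": "query", "list": "categorymembers", "cmtitle": category,
              "cmtype": "page|subcat", "cmlimit": "500", "format": "json"}
    while True:
        data = fetch(params)
        yield from data["query"]["categorymembers"]
        if "continue" not in data:
            break
        # Merge the continuation tokens into the next request.
        params = {**params, **data["continue"]}


def articles_in_tree(root, fetch=fetch_json, max_depth=3):
    """Collect article titles under root and its sub-categories, breadth-first."""
    seen, articles = {root}, set()
    frontier = [(root, 0)]
    while frontier:
        category, depth = frontier.pop(0)
        for member in category_members(category, fetch):
            if member["ns"] == 14:  # namespace 14 holds categories
                if member["title"] not in seen and depth < max_depth:
                    seen.add(member["title"])
                    frontier.append((member["title"], depth + 1))
            elif member["ns"] == 0:  # namespace 0 holds main articles
                articles.add(member["title"])
    return articles
```

A call such as `articles_in_tree("Category:Music", max_depth=1)` would issue live requests. The `seen` set guards against category loops, and the depth limit matters because large Wikipedia categories can fan out into a very large number of sub-categories.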

Fetch excerpt from Wikipedia article?

孤人 submitted on 2019-11-28 07:01:14
I've been up and down the Wikipedia API, but I can't figure out if there's a nice way to fetch the excerpt of an article (usually the first paragraph). It would be nice to get the HTML formatting of that paragraph, too. The only way I currently see of getting something that resembles a snippet is by performing a full-text search, but that's not really what I want (too short). Is there any way to fetch the first paragraph of a Wikipedia article other than barbarically parsing HTML/WikiText? Use this link to get the unparsed intro in XML form: "http://en.wikipedia.org/w/api.php?format
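The link in the answer is cut off, so the following is a separate sketch and not a reconstruction of it. It assumes the TextExtracts module (`prop=extracts`), which Wikipedia exposes through the same api.php endpoint; `exintro` limits the result to the lead section and `explaintext` switches from HTML to plain text. The helper names are hypothetical.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def intro_extract_url(title, plain_text=False):
    """Build a query URL for the lead section of one article."""
    params = {"action": "query", "prop": "extracts", "exintro": "1",
              "titles": title, "redirects": "1", "format": "json"}
    if plain_text:
        params["explaintext"] = "1"  # plain text instead of limited HTML
    return API_URL + "?" + urlencode(params)


def extract_from_response(data):
    """Return the extract of the first page in a decoded JSON response, or None."""
    pages = data.get("query", {}).get("pages", {})
    for page in pages.values():
        return page.get("extract")
    return None
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the decoded JSON to `extract_from_response` yields the intro. Leaving `plain_text` off keeps the HTML formatting the question asks about.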

Wikipedia: Java library to remove Wikipedia text markup

女生的网名这么多〃 submitted on 2019-11-28 03:42:25
Question: I downloaded a Wikipedia dump and now want to remove the Wikipedia markup from the contents of each page. I tried writing regular expressions, but there are too many cases to handle. I found a Python library, but I need a Java library because I want to integrate it into my code. Thank you. Answer 1: Do it in two steps: let some existing tool convert the MediaWiki mark-up into plain HTML; convert the plain HTML into text. The following demo: import net.java.textilej.parser.MarkupParser; import net.java.textilej
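The answer's Java demo is truncated, so the sketch below only illustrates its second step (plain HTML to text), and does so in Python with the standard library rather than with the answer's Java tooling. The class and function names are hypothetical, and the set of block-level tags is a simplifying assumption.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, skipping script/style and breaking lines at block ends."""

    SKIPPED = {"script", "style"}
    BLOCK = {"p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            if self.skip_depth:
                self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Strip tags from an HTML fragment and normalize whitespace line by line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = [" ".join(line.split()) for line in "".join(parser.parts).splitlines()]
    return "\n".join(line for line in lines if line)
```

The first step, converting MediaWiki markup to HTML, is deliberately left to an existing converter, as the answer suggests. Hand-written regular expressions for that step are the approach the question already found unmanageable.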

Fetch random excerpt from Wikipedia (Javascript, client-only)

南楼画角 submitted on 2019-11-28 01:22:29
I have a web page that asks the user for a paragraph of text, then performs some operation on it. To demo it to lazy users, I'd like to add an "I feel lucky" button that will grab some random text from Wikipedia and populate the inputs. How can I use JavaScript to fetch a sequence of text from a random Wikipedia article? I found some examples of fetching and parsing articles using the Wikipedia API, but they tend to be server-side. I'm looking for a solution that runs entirely on the client and doesn't get scuppered by the same-origin policy. Note random gibberish is not sufficient; I need

How to add a link in MediaWiki VisualEditor Toolbar?

删除回忆录丶 submitted on 2019-11-27 17:32:53
Question: I'm trying to insert a custom link to a special page in the VisualEditor toolbar. See the image below. I googled a lot but without success. Could someone please point me in the right direction? Answer 1: My answer is based on the following resources: MediaWiki core JS doc (ooui-js); VisualEditor JS doc (plus reading the code of both repositories used for VE, mediawiki/extension/VisualEditor and VisualEditor). Also, as far as I know, there is no documented way of adding a tool to the toolbar in VE.

How do I grab just the parsed Infobox of a wikipedia article?

随声附和 submitted on 2019-11-27 16:40:25
Question: I'm still stuck on my problem of trying to parse articles from Wikipedia. Specifically, I want to parse the infobox section of articles: my application has references to countries, and on each country page I would like to show the infobox from the corresponding Wikipedia article for that country. I'm using PHP here. I would greatly appreciate it if anyone has any code snippets or advice on what I should be doing. Thanks again. EDIT Well I have a db table with

How to use wikipedia api if it exists? [closed]

爷,独闯天下 submitted on 2019-11-27 02:36:51
I'm trying to find out if there's a Wikipedia API (I think it is related to MediaWiki?). If so, I would like to know how I would tell Wikipedia to give me an article about the New York Yankees, for example. What would the REST URL be for this example? All the docs on this subject seem fairly complicated. You really need to spend some time reading the documentation, as it took me only a moment to look and click on the link to figure it out. :/ But out of sympathy I'll provide you a link that maybe you can learn from. http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=New
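The link above is truncated after `titles=New`, so the sketch below does not try to complete it. It only shows, under stated assumptions, how a `prop=revisions` query for the raw wikitext of one article might be assembled and read. The parameters `rvprop=content`, `rvslots=main` and `formatversion=2` are my choices and are not confirmed by the answer, and the function names are hypothetical.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def latest_revision_url(title):
    """Build a query URL for the wikitext of the latest revision of one page."""
    params = {"action": "query", "prop": "revisions", "rvprop": "content",
              "rvslots": "main", "titles": title,
              "format": "json", "formatversion": "2"}
    return API_URL + "?" + urlencode(params)


def wikitext_from_response(data):
    """Return the wikitext from a decoded formatversion=2 response, or None."""
    pages = data.get("query", {}).get("pages", [])
    if not pages or "revisions" not in pages[0]:
        return None  # missing page or no revision data
    return pages[0]["revisions"][0]["slots"]["main"]["content"]
```

`latest_revision_url("New York Yankees")` produces a URL that can be opened in a browser or fetched with any HTTP client. The response contains wiki markup, not rendered HTML, so it usually needs a further conversion step before display.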

How do I get all articles about people from Wikipedia?

自闭症网瘾萝莉.ら submitted on 2019-11-27 02:12:02
Question: What would be the easiest way to get all articles about people from Wikipedia? I know I can download a dump of all the pages, but then how do I filter those and get only the ones about people? I need as many as I can get (preferably more than a million), so using any sort of API is probably not an option. Answer 1: Since articles about people usually contain the Persondata template, you can just search for all articles that contain Persondata. You can find a sample API query for doing just that