wikipedia | 易学教程

Is there a clean Wikipedia API just for retrieving a content summary?

Question: I just need to retrieve the first paragraph of a Wikipedia page. The content must be HTML-formatted and ready to be displayed on my websites (so no BBCode and no Wikipedia-specific markup). Answer 1: There's a way to get the entire "intro section" without any HTML parsing. Similar to AnthonyS's answer, but with an additional explaintext parameter, you can get the intro section as plain text. Query for getting Stack Overflow's intro in plain text: https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts
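
The query in the answer above is cut off in this excerpt, so the following is only a minimal sketch of how such a request might be assembled and its response read. It assumes the `exintro`, `explaintext`, `redirects`, and `titles` parameters of the TextExtracts `prop=extracts` module; only `explaintext` is named in the excerpt, the rest are assumptions. The function names and the User-Agent string are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def build_intro_query(title, plain_text=True):
    """Build a query URL for the intro section of one page (hypothetical helper)."""
    params = {
        "format": "json",
        "action": "query",
        "prop": "extracts",
        "exintro": 1,      # assumption: restrict the extract to the intro section
        "redirects": 1,    # assumption: follow redirects to the target page
        "titles": title,
    }
    if plain_text:
        # Named in the answer: return plain text instead of limited HTML.
        params["explaintext"] = 1
    return API_URL + "?" + urlencode(params)


def extract_intro(response):
    """Return the first extract found in a decoded API response, or None."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        if "extract" in page:
            return page["extract"]
    return None


def fetch_intro(title, plain_text=True):
    """Perform the request over the network and return the intro text."""
    request = Request(
        build_intro_query(title, plain_text),
        headers={"User-Agent": "intro-sketch/0.1 (example)"},  # hypothetical value
    )
    with urlopen(request) as reply:
        return extract_intro(json.load(reply))
```

Calling `fetch_intro("Stack Overflow")` would request the plain-text intro. Since the question asks for HTML, passing `plain_text=False` leaves out `explaintext`, in which case the module is expected to return limited HTML instead.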

How to extract information from a Wikipedia infobox?

Question: There is this fancy infobox in <some Wikipedia article>. How do I get the value of <this field and that>? Answer 1 (Tgr): The wrong way: trying to parse HTML. Use (cURL/jQuery/file_get_contents/requests/wget/more jQuery) to fetch the HTML code of the article, then use a DOM parser to extract table.infobox tr[3] td, or use a regex. This is actually a really bad idea most of the time. Wikipedia's HTML code is not particularly parsing-friendly (especially infoboxes, which are a system of hand-written templates), the exact structure changes from infobox to infobox, and the structure of an infobox might

Is there a Wikipedia API?

Question: On my Wikipedia user page, I run a Wikipedia script that displays my statistics (number of pages edited, number of new pages, monthly activity, etc.). I'd like to put this information on my blog. Is there an API that would allow me to do something like this? Answer 1: MediaWiki's API is running on Wikipedia (docs). You can also use the Special:Export feature to dump data and parse it yourself. More information. Answer 2: Wikipedia is built on MediaWiki, and here's the MediaWiki API. Answer 3: If you want
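
As a rough illustration of what the answers above point to, here is a small sketch that asks the MediaWiki Action API for a user's total edit count. It assumes the `list=users` module with its `ususers` and `usprop` parameters; neither appears in the excerpt, and the function names are hypothetical. It covers only the overall edit count, not the new-page or monthly figures the question mentions.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_user_stats_query(username):
    """Build a query URL for basic account statistics (hypothetical helper)."""
    params = {
        "format": "json",
        "action": "query",
        "list": "users",                     # assumption: the users list module
        "ususers": username,                 # assumption: account to look up
        "usprop": "editcount|registration",  # assumption: fields to return
    }
    return API_URL + "?" + urlencode(params)


def parse_user_stats(response):
    """Pick the edit count and registration date out of a decoded response."""
    users = response.get("query", {}).get("users", [])
    if not users or "missing" in users[0] or "invalid" in users[0]:
        return None  # unknown or malformed user name
    user = users[0]
    return {
        "name": user.get("name"),
        "edit_count": user.get("editcount"),
        "registered": user.get("registration"),
    }
```

A blog could fetch the URL from `build_user_stats_query` on the server side, decode the JSON, and render the dictionary returned by `parse_user_stats`. Per-month activity would probably need a different query, which is outside this sketch.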
