wikipedia | 易学教程

Wikipedia with Python

Question: I have this very simple Python code to read XML from the Wikipedia API:

    import urllib
    from xml.dom import minidom

    usock = urllib.urlopen("http://en.wikipedia.org/w/api.php?action=query&titles=Fractal&prop=links&pllimit=500")
    xmldoc = minidom.parse(usock)
    usock.close()
    print xmldoc.toxml()

But this code fails with these errors:

    Traceback (most recent call last):
      File "/home/user/workspace/wikipediafoundations/src/list.py", line 5, in <module>
        xmldoc=minidom.parse(usock)
      File "/usr/lib
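The snippet above relies on Python 2's `urllib.urlopen`. As a rough sketch only, and not a diagnosis of the truncated traceback, the same request might look like this in Python 3. The explicit `format=xml` parameter, the `pl` element name for link entries, the helper names, and the User-Agent string are assumptions added for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen
from xml.dom import minidom

API_URL = "https://en.wikipedia.org/w/api.php"


def build_links_url(title, limit=500):
    """Build the query URL for the links of one page (hypothetical helper)."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "links",
        "pllimit": limit,
        "format": "xml",  # assumption: ask for XML explicitly
    }
    return API_URL + "?" + urlencode(params)


def extract_link_titles(xml_text):
    """Return the title attribute of every <pl> element in the response.

    Assumption: each linked page appears as <pl ns="..." title="..."/>.
    """
    doc = minidom.parseString(xml_text)
    return [node.getAttribute("title") for node in doc.getElementsByTagName("pl")]


def fetch_link_titles(title):
    """Fetch and parse the links of a page; performs a network request."""
    request = Request(
        build_links_url(title),
        # Placeholder User-Agent; replace with something identifying your script.
        headers={"User-Agent": "example-script/0.1 (contact: you@example.org)"},
    )
    with urlopen(request) as response:
        return extract_link_titles(response.read())
```

Calling `fetch_link_titles("Fractal")` would perform the request. Keeping URL building and parsing in separate functions lets the parsing step be checked against a saved response without network access.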

wikidata query missing out countries in Europe

Question: I am using the following query against Wikidata:

    SELECT ?country ?countryLabel
    WHERE {
      ?country wdt:P30 wd:Q46;
               wdt:P31 wd:Q6256.
      SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
    }

where P30 is continent, Q46 is Europe, P31 is "instance of", and Q6256 is country: https://query.wikidata.org/#SELECT%20%3Fcountry%20%3FcountryLabel%0A%20%20%20%20%20%20WHERE%0A%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%3Fcountry%20%20%20wdt%3AP30%20wd%3AQ46%3B%0A%20%20%20%20%20

Wikipedia API and SPARQL in a single query

Question: I need to search for Wikipedia pages that contain some specific words in their full text. To improve the results, I want to limit them to pages describing entities that are instances of a specific entity. For searching the full text I can use the Wikipedia API, using the query action and the search generator. For filtering instances of a given entity I can use the Wikidata API and a SPARQL query. Is there a way to execute both operations in a single query that applies both filters?

Link terms on page to Wikipedia articles in pure JavaScript

Question: While browsing I came across this blog post about using the Wikipedia API from JavaScript to link a single search term to its definition. At the end of the blog post the author mentions possible extensions, including: "A plugin which auto links terms to Wikipedia articles." This fits the bill perfectly for a project requirement I'm working on, but sadly I lack the programming skills to extend the original source code. What I'd like is to have a pure JavaScript snippet I can add to a webpage,

Find which direct property applied in a SPARQL query

Question: I have a list of properties I want to apply to a specific entity, mathematics (wd:Q395). In this case:

    instanceOf: 'wdt:P31'
    subclassOf: 'wdt:P279'

The results are: mathematics is an instance of academic discipline, and mathematics is a subclass of exact science and formal science. Instead of making two different queries, I would like to make them all at once:

    SELECT ?field ?fieldLabel ?propertyApplied
    WHERE {
      wd:Q395 wdt:P31 | wdt:P279 ?field.
      SERVICE wikibase:label { bd:serviceParam wikibase
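One possible approach, offered as a sketch rather than as the accepted answer: a `|` property path does not expose which alternative matched, so the predicate can instead be bound to a variable through a `VALUES` clause. The Python helper below only assembles such a query string. The name `build_property_query` is hypothetical, and fixing the label language to `"en"` is an assumption.

```python
def build_property_query(entity, properties):
    """Assemble a SPARQL query that reports which property produced each value.

    `entity` is a prefixed item such as "wd:Q395"; `properties` is a list of
    prefixed direct properties such as ["wdt:P31", "wdt:P279"].
    """
    values = " ".join(properties)
    return (
        "SELECT ?field ?fieldLabel ?propertyApplied WHERE {\n"
        # VALUES binds ?propertyApplied to each candidate property in turn.
        f"  VALUES ?propertyApplied {{ {values} }}\n"
        f"  {entity} ?propertyApplied ?field.\n"
        # Assumption: English labels only.
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


query = build_property_query("wd:Q395", ["wdt:P31", "wdt:P279"])
print(query)
```

With this shape, `?propertyApplied` comes back as the property URI for each row, so one query covers both properties and the rows remain distinguishable.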

How to get the result of a complex Wikipedia template?

Question: This question is a bit hard to follow, but I will do my best to explain it. First, let me present an example page: http://en.wikipedia.org/wiki/African_bush_elephant That's a Wikipedia page, a species page in particular, since it has the 'taxobox' on the right. I'm trying to parse the attributes in that taxobox using PHP. There are two ways in Wikipedia to create such a taxobox: manually, or by using the special "auto taxobox" template. I can parse the manual one. I use Wikipedia's API

How to build wikipedia category hierarchy?

Question: I'm trying to build the tree graph of Wikipedia articles and their categories. What do I need to do that? From this site (http://dumps.wikimedia.org/enwiki/latest/), I've downloaded: enwiki-latest-page.sql.gz, enwiki-latest-categorylinks.sql.gz, and enwiki-20141106-category.sql.gz. I tried following the answer here (Wikipedia Category Hierarchy from dumps), but it doesn't seem that categorylinks has the same schema (no pageId column). What's the right way to build the hierarchy? Bonus question: How

How to access wikipedia

Question: I want to access HTML content from Wikipedia, but it is showing "access denied". How can I access the wiki? Please give some suggestions.

Answer 1: Use HttpWebRequest. Try the following:

    string Text = "http://www.wikipedia.org/";
    HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(Text);
    request.UserAgent = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)";
    HttpWebResponse respons;
    respons = (HttpWebResponse)request.GetResponse();
    Encoding enc = Encoding.GetEncoding(respons.CharacterSet);

Pythonic beautifulSoup4 : How to get remaining titles from the next page link of a wikipedia category

Question: I successfully wrote the following code to get the titles of a Wikipedia category. The category contains more than 404 titles, but my output file gives only 200 titles/pages. How can I extend my code to get all the titles from the category's next-page link, and so on?

Command: python3 getCATpages.py

Code of getCATpages.py:

    from bs4 import BeautifulSoup
    import requests
    import csv

    # getting all the contents of a url
    url = 'https://en.wikipedia.org/wiki/Category:Free software'
    content = requests
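As an alternative to following the "next page" link in the scraped HTML, the MediaWiki API's `list=categorymembers` query can be paged through its `continue` block. The sketch below swaps in that technique and uses only the standard library instead of BeautifulSoup and requests. The function names and the User-Agent string are hypothetical, and the `fetch` parameter exists only so the loop can be exercised without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_json(params):
    """Send one API request and decode the JSON body (network call)."""
    request = Request(
        API_URL + "?" + urlencode(params),
        # Placeholder User-Agent; replace with something identifying your script.
        headers={"User-Agent": "example-script/0.1 (contact: you@example.org)"},
    )
    with urlopen(request) as response:
        return json.load(response)


def category_titles(category, fetch=fetch_json):
    """Collect every member title of a category, following continuation."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": "500",
        "format": "json",
    }
    titles = []
    while True:
        data = fetch(params)
        titles.extend(member["title"] for member in data["query"]["categorymembers"])
        if "continue" not in data:
            # No continuation block means this was the last batch.
            return titles
        # Merge the continuation values (e.g. cmcontinue) into the next request.
        params = {**params, **data["continue"]}
```

Calling `category_titles("Category:Free software")` would issue as many requests as needed. The results include subcategories and files as well as articles, so further filtering may be wanted.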

Getting Wikipedia infobox content with JQuery

Question: I'm looking to use jQuery to pull back the contents of the Wikipedia infobox that contains company details. I think that I'm almost there, but I just can't get the last step of the way:

    var searchTerm = "toyota";
    var url = "http://en.wikipedia.org/w/api.php?action=parse&format=json&page=" + searchTerm + "&redirects&prop=text&callback=?";
    $.getJSON(url, function(data) {
      wikiHTML = data.parse.text["*"];
      $wikiDOM = $(wikiHTML);
      $("#result").append($wikiDOM.find('.infobox').html());
    });

The first part works -