wikipedia

Wikipedia module for Python: skipping “wikipedia.exceptions.PageError”

Submitted by 允我心安 on 2020-01-07 03:13:09
Question: I'm trying to associate to each species name listed in a CSV file its Wikipedia summary and main image. I wrote this code:

```python
import csv
import wikipedia

wikipedia.set_lang('it')
with open('D:\\GIS\\Dati\\Vinca\\specie_vinca.csv', 'rt', encoding="utf8") as f:
    reader = csv.reader(f)
    for row in reader:
        wikipage = wikipedia.page(row)
        print(wikipage.title)
        print(wikipage.summary)
        print("Page URL: %s" % wikipage.url)
        print("Nr. of images on page: %d" % len(wikipage.images))
        print(" - Main Image:
```
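A minimal sketch of the skip-on-error pattern the title asks about. The `collect_pages` helper and the dict-backed demo lookup are stand-ins of mine, not part of the `wikipedia` library: in real use `lookup` would be `wikipedia.page` (called with `row[0]`, since `csv.reader` yields lists, not strings) and `errors` would include `wikipedia.exceptions.PageError`.

```python
def collect_pages(names, lookup, errors=(KeyError,)):
    """Try lookup(name) for each name; skip names whose lookup raises.

    In real use, lookup would be wikipedia.page and errors would be
    (wikipedia.exceptions.PageError, wikipedia.exceptions.DisambiguationError).
    """
    found, skipped = [], []
    for name in names:
        try:
            found.append(lookup(name))
        except errors:
            skipped.append(name)
    return found, skipped

# Demo with a stand-in lookup: a dict raises KeyError for unknown names,
# mimicking PageError for species that have no Wikipedia page.
pages = {"Vinca minor": "summary A", "Vinca major": "summary B"}
found, skipped = collect_pages(
    ["Vinca minor", "No such species", "Vinca major"],
    pages.__getitem__,
)
```

The loop keeps going past missing pages instead of aborting, and the `skipped` list lets you report which CSV rows found no article.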

Wikipedia page parsing program caught in endless graph cycle

Submitted by 不打扰是莪最后的温柔 on 2020-01-05 09:35:34
Question: My program is caught in a cycle that never ends, and I can't see how it got into this trap or how to avoid it. It's parsing Wikipedia data, and I think it's just following a connected component around and around. Maybe I can store the pages I've already visited in a set, and skip any page that is already in that set? This is my project; it's quite small, only three short classes. This is a link to the data it generates; I stopped it short, otherwise it would have gone on and on. This is
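The visited-set idea the asker suggests is indeed the standard fix. A self-contained sketch on a toy link graph (the graph and function names are mine, not from the asker's project):

```python
from collections import deque

def crawl(start, neighbors):
    """Breadth-first traversal that records visited nodes, so a cycle
    (e.g. page A links to B, B links to C, C links back to A) is
    walked exactly once instead of forever."""
    visited = set()
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        if node in visited:
            continue          # already expanded: this is what breaks the cycle
        visited.add(node)
        order.append(node)
        for nxt in neighbors.get(node, []):
            if nxt not in visited:
                queue.append(nxt)
    return order

# Toy link graph with a cycle: A -> B -> C -> A
links = {"A": ["B"], "B": ["C"], "C": ["A"]}
order = crawl("A", links)
```

Without the `visited` check, the queue would grow with "A" again after "C" and the traversal would loop indefinitely.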

How can I load content from another site onto mine with JavaScript/jQuery?

Submitted by 荒凉一梦 on 2020-01-05 06:37:14
Question: I'm trying to get a Wikipedia article to load onto my site, following the instructions here: http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Transwiki but I'm at a loss. I've tried:

```javascript
var xyz = document.getElementById(url("http://en.wikipedia.org/w/index.php?title=Special:Export&history=1&action=submit&pages=Albert_einstein")
var xyz = $('#xyz').load('http://en.wikipedia.org/w/index.php?title=Special:Export&history=1&action=submit&pages=Albert_einstein');
document.write(xyz);
```

Answer 1:
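For context (this is my sketch, not the truncated answer above): browsers block cross-origin `.load()` of wikipedia.org pages, and Special:Export returns raw XML with no CORS headers. The MediaWiki API does support cross-origin reads via its `origin=*` parameter with `action=parse`, which returns a page's rendered HTML as JSON. A small helper that builds such a request URL:

```python
from urllib.parse import urlencode

def wikipedia_parse_url(title, lang="en"):
    """Build a MediaWiki API URL returning a page's rendered HTML as JSON.

    origin=* enables anonymous cross-origin requests, so the resulting URL
    can also be fetched from browser JavaScript (e.g. with fetch or
    $.getJSON), unlike the Special:Export endpoint in the question."""
    params = {
        "action": "parse",
        "page": title,
        "format": "json",
        "origin": "*",
    }
    return "https://%s.wikipedia.org/w/api.php?%s" % (lang, urlencode(params))

url = wikipedia_parse_url("Albert Einstein")
```

The HTML of the article then sits under `parse.text` in the JSON response.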

no content from Wikipedia API search

Submitted by 家住魔仙堡 on 2020-01-05 03:49:27
Question: Good morning. I am using the following API search, which used to return the title, content and link of a Wikipedia entry: https://it.wikipedia.org/w/api.php?action=opensearch&search=alessandro%20leogrande&format=json&utf8=1 Just recently I noticed that it always returns an empty content part (`[""]`):

```json
["alessandro leogrande",["Alessandro Leogrande"],[""],["https://it.wikipedia.org/wiki/Alessandro_Leogrande"]]
```

Can you please give me any insight?

Answer 1: It seems there is a problem with the
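One possible workaround (my sketch, not the truncated answer): instead of `opensearch`, use `action=query` with a search generator and the `extracts` property, which still returns intro text alongside title and URL. The helper below only builds the request URL; fetching it is left to the caller.

```python
from urllib.parse import urlencode

def extracts_url(search_term, lang="it", limit=5):
    """Build an action=query URL returning title, plain-text intro extract
    and canonical URL for pages matching a search, replacing the empty
    description slot of the opensearch endpoint."""
    params = {
        "action": "query",
        "generator": "search",
        "gsrsearch": search_term,
        "gsrlimit": limit,
        "prop": "extracts|info",
        "exintro": 1,        # only the lead section
        "explaintext": 1,    # plain text, not HTML
        "inprop": "url",     # include fullurl for each page
        "format": "json",
    }
    return "https://%s.wikipedia.org/w/api.php?%s" % (lang, urlencode(params))

url = extracts_url("alessandro leogrande")
```

Each page object in the JSON response then carries `title`, `extract` and `fullurl` fields.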

Equals signs in Wikipedia template parameters won't display properly

Submitted by |▌冷眼眸甩不掉的悲伤 on 2020-01-03 16:38:29
Question: I've noticed that links containing equals signs don't display properly when placed inside the {{missing information}} template. Is there any way to work around this limitation so that links with equals signs can be included inside MediaWiki templates?

```wikitext
{{missing information|[https://www.google.com/search?q=google+search+test This link has an equals sign in it, and the template is not displaying properly.]}}
{{missing information|[https://www.google.com/ This link
```
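For context, a sketch of the two standard MediaWiki workarounds. An unnamed parameter containing `=` is parsed as `name=value`, so everything before the first `=` is taken as a parameter name; either name the positional parameter explicitly as `1=`, or replace each literal equals sign with the `{{=}}` template/magic word (which expands back to `=`). Exact behavior can vary with the wiki's installed templates:

```wikitext
{{missing information|1=[https://www.google.com/search?q=google+search+test Naming the parameter explicitly makes the equals sign literal.]}}

{{missing information|[https://www.google.com/search?q{{=}}google+search+test Or escape each equals sign with {{=}}.]}}
```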

Is it possible to read Wikipedia using Python requests library?

Submitted by 大兔子大兔子 on 2020-01-03 02:56:10
Question: To read content from a given URL I do the following:

```python
import requests
proxies = {'http': 'http://user:pswd@foo-webproxy.foo.com:7777'}
url = 'http://example.com/foo/bar'
r = requests.get(url, proxies=proxies)
print r.text.encode('utf-8')
```

And it works fine! I get the content. However, if I use another URL:

```python
url = 'https://en.wikipedia.org/wiki/Mestisko'
```

it does not work. I get an error message that starts with:

```
requests.exceptions.ConnectionError: ('Connection aborted.', error(10060
```

Is
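A likely culprit worth checking (my reading of the excerpt, not a confirmed answer): the `proxies` dict only has an `'http'` entry, and `requests` selects the proxy by URL scheme, so the `https://` request to Wikipedia goes out directly and a corporate firewall then times it out (Windows error 10060). Routing both schemes through the same proxy:

```python
def proxy_map(proxy_url):
    """requests picks a proxy by the target URL's scheme; supplying only
    an 'http' entry (as in the question) leaves https:// requests
    unproxied. Map both schemes to the same forward proxy."""
    return {"http": proxy_url, "https": proxy_url}

# The credentials and host below are the question's placeholders, not real.
proxies = proxy_map("http://user:pswd@foo-webproxy.foo.com:7777")

# Usage, assuming the proxy is reachable and allows CONNECT tunnelling:
# import requests
# r = requests.get("https://en.wikipedia.org/wiki/Mestisko",
#                  proxies=proxies, timeout=10)
```

An `http://`-scheme proxy URL is normal for HTTPS targets: the client opens a CONNECT tunnel through the proxy rather than speaking TLS to the proxy itself.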

Blacklist IP database

Submitted by 主宰稳场 on 2019-12-29 18:36:20
Question: Is there an open database of blacklisted IPs for the web? One covering the many public web proxies out there, such as the blacklist used by Wikipedia's global blocking.

Answer 1: Project Honey Pot provides a service called Http:BL. As an active member of Project Honey Pot you can query their database of IPs that are known email-address harvesters or comment spammers.

Answer 2: You can use the Blacklist IP Addresses Live Database from myip.ms - http://myip.ms/browse/blacklist/Blacklist_IP_Blacklist_IP
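To make Answer 1 concrete: Http:BL is queried over DNS, by resolving a name built from your access key, the suspect IP with its octets reversed, and the list zone; a `127.<days>.<threat>.<type>` answer means the IP is listed, while NXDOMAIN means it is not. A sketch of the query-name construction and answer decoding (the access key below is made up, and the field interpretation should be checked against Project Honey Pot's docs):

```python
def httpbl_query_name(ip, access_key):
    """Hostname to resolve against Project Honey Pot's Http:BL:
    <access-key>.<reversed-ip-octets>.dnsbl.httpbl.org"""
    reversed_ip = ".".join(reversed(ip.split(".")))
    return "%s.%s.dnsbl.httpbl.org" % (access_key, reversed_ip)

def parse_httpbl_answer(answer_ip):
    """Decode a 127.<days>.<threat>.<type> DNS answer; None = not listed."""
    octets = [int(o) for o in answer_ip.split(".")]
    if octets[0] != 127:
        return None
    days_since_seen, threat_score, visitor_type = octets[1:]
    return {"days": days_since_seen, "threat": threat_score,
            "type": visitor_type}

# Hypothetical key and documentation-range IP, for illustration only:
name = httpbl_query_name("203.0.113.7", "abcdefghijkl")
# In real use: socket.gethostbyname(name), catching the NXDOMAIN error
# for unlisted IPs.
```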

Indexing wikipedia with solr

Submitted by 半城伤御伤魂 on 2019-12-29 09:16:06
Question: I've installed Solr 4.6.0 and followed the tutorial available on Solr's home page. Everything was fine until I had to do the real job: I need fast access to Wikipedia content and was advised to use Solr. I was trying to follow the example at http://wiki.apache.org/solr/DataImportHandler#Example:_Indexing_wikipedia, but I couldn't get it to work. I am a newbie, and I don't know what data_config.xml means!

```xml
<dataConfig> <dataSource type=
```
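For orientation: `data_config.xml` is the DataImportHandler's own configuration file, referenced from a `requestHandler` entry in `solrconfig.xml`; it tells DIH where the Wikipedia XML dump lives and which XPath expressions map dump elements to Solr fields. A sketch along the lines of the wiki example (the dump path is a placeholder, and field names must match your schema):

```xml
<dataConfig>
  <!-- Read the dump from the local filesystem -->
  <dataSource type="FileDataSource" encoding="UTF-8" />
  <document>
    <!-- Stream the dump and emit one Solr document per <page> element -->
    <entity name="page"
            processor="XPathEntityProcessor"
            stream="true"
            forEach="/mediawiki/page/"
            url="/path/to/enwiki-pages-articles.xml">
      <field column="id"    xpath="/mediawiki/page/id" />
      <field column="title" xpath="/mediawiki/page/title" />
      <field column="text"  xpath="/mediawiki/page/revision/text" />
    </entity>
  </document>
</dataConfig>
```

Once this file is in place and wired into `solrconfig.xml`, the import is started by hitting the handler with `command=full-import`.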