urllib2

SOCKS5 proxy using urllib2 and PySocks

Posted by 女生的网名这么多〃 on 2019-12-08 12:30:49
Question: I'm trying to connect to a SOCKS5 proxy using urllib2 and PySocks. My proxy has a username and password, and I use the code below, but I always get a socks.SOCKS5Error: 0x02: Connection not allowed by ruleset message when I try to connect. Would anyone know what I'm doing wrong?

import socket
import socks
import urllib2

socks.set_default_proxy(socks.SOCKS5, "xx.xx.xx", 8080, 'username', 'pass')
socket.socket = socks.socksocket
hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64
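Error 0x02 means the proxy itself rejected the request after the handshake, often because the ruleset expects the proxy (not the client) to resolve hostnames. With the requests library this is the difference between the socks5 and socks5h schemes. A minimal sketch of building such a proxy URL, assuming requests with the SOCKS extra is available (host and credentials are placeholders from the question):

```python
def socks_proxy_url(host, port, user=None, password=None, remote_dns=True):
    """Build a requests-style SOCKS proxy URL. 'socks5h' asks the proxy
    to resolve hostnames remotely, which some rulesets require."""
    scheme = "socks5h" if remote_dns else "socks5"
    auth = "{}:{}@".format(user, password) if user else ""
    return "{}://{}{}:{}".format(scheme, auth, host, port)

proxy = socks_proxy_url("xx.xx.xx", 8080, "username", "pass")
proxies = {"http": proxy, "https": proxy}
# requests.get(url, proxies=proxies)  # requires the requests[socks] extra
```

If urllib2 must be kept, the equivalent knob in PySocks is the rdns argument to set_default_proxy.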

Big requests issue: GET doesn't release/reset TCP connections, loop crashes

Posted by 与世无争的帅哥 on 2019-12-08 09:18:33
I'm using Python 3.3 and the requests module to scrape links from an arbitrary webpage. My program works as follows: I have a list of URLs which, at the beginning, contains just the starting URL. The program loops over that list and passes the URLs to a procedure GetLinks, where I use requests.get and BeautifulSoup to extract all links. Before that procedure appends links to my URL list, it passes them to another procedure, testLinks, to see whether each one is an internal, external, or broken link. In testLinks I use requests.get as well, to be able to handle redirects etc. The program worked really well
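Part of the load on the connection pool can be avoided outright: the internal/external decision in testLinks needs no network round-trip at all, only URL parsing, so a requests.get (ideally through a shared requests.Session, closed with a with block so sockets are released) is only needed for the broken-link check. A sketch of that host comparison (function names are illustrative, not from the original program):

```python
from urllib.parse import urljoin, urlparse

def classify_link(base_url, link):
    """'internal' if the link resolves to the same host as base_url,
    'external' otherwise; relative links resolve against base_url."""
    resolved = urljoin(base_url, link)
    if urlparse(resolved).netloc == urlparse(base_url).netloc:
        return "internal"
    return "external"
```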

503 error when trying to access Google Patents using python

Posted by 柔情痞子 on 2019-12-08 08:17:56
Question: Earlier today I was able to pull data from Google Patents using the code below:

import urllib2

url = 'http://www.google.com/search?tbo=p&q=ininventor:"John-Mudd"&hl=en&tbm=pts&source=lnt&tbs=ptso:us'
req = urllib2.Request(url, headers={'User-Agent' : "foobar"})
response = urllib2.urlopen(req)

Now when I run it I get the following 503 error. I had only looped through this code maybe 30 times (I'm trying to get all the patents owned by a list of 30 people). HTTPError Traceback (most
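A 503 after roughly 30 rapid requests is almost certainly Google rate-limiting the client rather than a bug in the code, so the usual remedy is to slow down and retry with increasing delays. A sketch of an exponential backoff schedule (the base and cap values are arbitrary choices, not anything Google documents):

```python
def backoff_delays(retries, base=1.0, cap=60.0):
    """Delays in seconds for successive retries: base, 2*base, 4*base, ...
    capped so a long run never sleeps more than `cap` seconds."""
    return [min(cap, base * (2 ** i)) for i in range(retries)]

# for delay in backoff_delays(5):
#     attempt the request; on HTTPError 503, time.sleep(delay) and retry
```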

Website form login using Python urllib2

Posted by 纵然是瞬间 on 2019-12-08 08:03:39
Question: I've been trying to learn to use the urllib2 package in Python. I tried to log in as a student (the left form) on a signup page for maths students: http://reg.maths.lth.se/. I have inspected the code (using Firebug), and the left form should obviously be called using POST with a key called pnr whose value should be a string 10 characters long (the last part perhaps cannot be seen from the HTML code, but it is basically my social security number, so I know how long it should be). Note that
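In urllib2 (urllib.request in Python 3) a form POST is just a Request whose data is the urlencoded field set; supplying data is what switches the method from GET to POST. A sketch against the form described above, shown with the Python 3 names (the pnr value is a placeholder, not a real number):

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_login_request(url, pnr):
    """Submit the 10-character pnr field the way a browser posts the form."""
    body = urlencode({"pnr": pnr}).encode("ascii")
    return Request(url, data=body)  # non-None data makes this a POST

req = build_login_request("http://reg.maths.lth.se/", "0123456789")
# urllib.request.urlopen(req) would actually submit the form
```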

Trying to split the file download buffer into separate threads

Posted by 大城市里の小女人 on 2019-12-08 07:42:45
Question: I am trying to download the buffer of a file in 5 threads, but it seems like it's getting garbled.

from numpy import arange
import requests
from threading import Thread
import urllib2

url = 'http://pymotw.com/2/urllib/index.html'
sizeInBytes = r = requests.head(url, headers={'Accept-Encoding': 'identity'}).headers['content-length']
splitBy = 5
splits = arange(splitBy + 1) * (float(sizeInBytes)/splitBy)
dataLst = []

def bufferSplit(url, idx, splits):
    req = urllib2.Request(url, headers={'Range':
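Garbled output from ranged downloads usually comes from one of two things: non-integer or overlapping byte boundaries (arange produces floats here), or chunks appended in thread-completion order instead of index order. Integer, non-overlapping Range headers can be computed directly; a sketch (sizes are illustrative):

```python
def byte_ranges(size, parts):
    """Non-overlapping 'bytes=start-end' ranges covering exactly `size` bytes."""
    step = size // parts
    ranges = []
    for i in range(parts):
        start = i * step
        # the last part absorbs any remainder so no byte is dropped
        end = size - 1 if i == parts - 1 else (i + 1) * step - 1
        ranges.append("bytes={}-{}".format(start, end))
    return ranges
```

Each thread should then write its chunk into a slot keyed by its index, and the slots be joined in index order only after every thread has finished.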

Create new TCP Connections for every HTTP request in python

Posted by 本秂侑毒 on 2019-12-08 06:01:20
Question: For my college project I am trying to develop a Python-based traffic generator. I have created 2 CentOS machines on VMware, and I am using one as my client and one as my server machine. I have used the IP aliasing technique to increase the number of clients and servers using just a single client/server machine. Up to now I have created 50 IP aliases on my client machine and 10 IP aliases on my server machine. I am also using the multiprocessing module to generate traffic concurrently from all 50 clients to all 10
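For a traffic generator the goal is usually one TCP handshake per request, which means defeating keep-alive. In the stdlib, each http.client.HTTPConnection object opens its own fresh connection, and its source_address parameter lets every request bind to one of the aliased client IPs. A sketch, demonstrated against a throwaway local server so it is self-contained (the source_ip argument is where one of the 50 client aliases would go):

```python
import http.client
import http.server
import threading

def fetch_once(host, port, path="/", source_ip=None):
    """One GET over a brand-new TCP connection, closed afterwards."""
    source = (source_ip, 0) if source_ip else None
    conn = http.client.HTTPConnection(host, port, source_address=source)
    try:
        conn.request("GET", path, headers={"Connection": "close"})
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()  # guarantees the socket is never reused

# Demo against a local throwaway server (port 0 = pick a free port).
server = http.server.HTTPServer(("127.0.0.1", 0),
                                http.server.SimpleHTTPRequestHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
status, _ = fetch_once("127.0.0.1", server.server_address[1])
server.shutdown()
```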

Download an internet resource in Python and save it on my desired location

Posted by 只愿长相守 on 2019-12-08 05:59:32
Question: I am new to Python and I am using urllib2 to download files over the internet. I am using this code:

import urllib2
response = urllib2.urlopen('http://www.example.com/myfile.zip')
...

This code actually saves the zip file in my temp folder. I don't want it to be like that; I want to save it to my desired location. Is that possible?

Answer 1: You can use the urllib.urlretrieve function to download the remote file to your local filesystem.

>>> import urllib
>>> urllib.urlretrieve('http://www.example

Extracting source code from html file using python3.1 urllib.request

Posted by 落爺英雄遲暮 on 2019-12-08 05:41:57
Question: I'm trying to obtain data using regular expressions from an HTML file by implementing the following code:

import re
import urllib.request

def extract_words(wdict, urlname):
    uf = urllib.request.urlopen(urlname)
    text = uf.read()
    print(text)
    match = re.findall("<tr>\s*<td>([\w\s.;'(),-/]+)</td>\s+<td>([\w\s.,;'()-/]+)</td>\s*</tr>", text)

which returns an error:

File "extract.py", line 33, in extract_words
    match = re.findall("<tr>\s*<td>([\w\s.;'(),-/]+)</td>\s+<td>([\w\s.,;'()-/]+)</td>\s*</tr>", text)
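The error comes from Python 3's bytes/str split: urlopen().read() returns bytes, and re.findall with a str pattern refuses to search a bytes object. Decoding first (or using a bytes pattern) fixes it. A sketch with a deliberately simplified pattern, standing in for the original character classes:

```python
import re

def extract_rows(html_bytes):
    """Decode fetched bytes, then search with an ordinary str pattern."""
    text = html_bytes.decode("utf-8", errors="replace")
    return re.findall(r"<tr>\s*<td>(.*?)</td>\s*<td>(.*?)</td>\s*</tr>", text)
```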

urllib2.HTTPError Python

Posted by 坚强是说给别人听的谎言 on 2019-12-08 05:30:05
Question: I have a file with GI numbers and would like to get FASTA sequences from NCBI.

from Bio import Entrez
import time

Entrez.email = "eigtw59tyjrt403@gmail.com"
f = open("C:\\bioinformatics\\gilist.txt")
for line in iter(f):
    handle = Entrez.efetch(db="nucleotide", id=line, retmode="xml")
    records = Entrez.read(handle)
    print ">GI "+line.rstrip()+" "+records[0]["GBSeq_primary-accession"]+" "+records[0]["GBSeq_definition"]+"\n"+records[0]["GBSeq_sequence"]
    time.sleep(1) # to make sure not many
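Entrez.efetch can raise HTTPError transiently, since NCBI throttles clients and occasionally returns server errors, so the usual fix alongside time.sleep is to retry each fetch a few times before giving up. A generic sketch with the Python 3 exception name (retry count and delay are arbitrary; the flaky callable below only simulates a failing endpoint):

```python
import time
from urllib.error import HTTPError  # urllib2.HTTPError in Python 2

def fetch_with_retry(fetch, retries=3, delay=0.0):
    """Call fetch(); on HTTPError, retry up to `retries` attempts."""
    for attempt in range(retries):
        try:
            return fetch()
        except HTTPError:
            if attempt == retries - 1:
                raise  # out of attempts: re-raise the last error
            time.sleep(delay)

# Simulated flaky endpoint: fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise HTTPError("http://example.org", 502, "Bad Gateway", None, None)
    return "ok"

result = fetch_with_retry(flaky, retries=5)
```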

How do I translate Python urllib.request code to Java code

Posted by 三世轮回 on 2019-12-08 05:17:30
Question: This is the Python code:

import urllib.request as urllib2
import json

data = {
    "Inputs": {
        "input1": {
            "ColumnNames": ["id", "regex"],
            "Values": [["0", "the regex value"],]
        },
    },
    "GlobalParameters": {
        "Database query": "select * from expone",
    }
}

body = str.encode(json.dumps(data))
url = 'https://ussouthcentral.services.azureml.net/workspaces/4729545551a741e1a2e606d37' \
      'ae61ce0/services/ac7c34ad134d43ca9fdc65e292ce35d3/execute?api-version=2.0&details=true'
api_key = '8ku5P6fR3F8ykgMHK5Y8
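The Java translation maps almost line for line onto HttpURLConnection: json.dumps becomes a JSON serializer (e.g. Gson), each header becomes a setRequestProperty call, and setDoOutput(true) plus writing the body gives the POST that urllib2 performs when data is supplied. The full Azure ML sample typically also sends an Authorization: Bearer header built from api_key; that is an assumption here, since the snippet is cut off before the headers. The pieces the Java side needs, collected in one place:

```python
import json

def build_request_parts(api_key, data):
    """Method, headers, and UTF-8 body bytes for an HttpURLConnection port.
    (The Bearer header is assumed from the usual Azure ML sample.)"""
    body = json.dumps(data).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    return "POST", headers, body
```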