How to send a urllib2 request with added white spaces

橙三吉。 Posted on 2019-12-11 02:34:40

Question


I am trying to send a request to a web page URL that contains white spaces so that I can download a file from the page. In a normal browser, e.g. Chrome, when you enter the URL into the address bar the file is generated automatically and you are asked to download it.

Instead of having to load a web browser every time I want a set of logs, I am trying to create a Python script that I can run to do all the hard work for me.

Example:

url = http (ip-address)/supportlog.xml/getlogs&name=0335008 04-05-2013 12.46.47.zip 

I'm using the command:

xml_page = opener.open((url))

I have been able to download other zip files from the web server I am connecting to without problems, using the command above and a few other lines of code.

But when I try the same command with a URL that contains white spaces, urllib2 strips out all of the white spaces, so I get a syntax error back. Ideally you would change the URL so that it does not contain white spaces, but this isn't possible.

I have tried replacing the white spaces in the URL with %20, but this doesn't work and causes the server to fail.

I understand you can use the urllib.quote function, but I'm not sure how to use it, or even whether this is the correct path to go down.

Any help is welcome... I'm still learning Python so please be kind.


Answer 1:


To percent-encode the whitespace in your URL, use urllib.quote like this:

import urllib
# safe=":/" stops quote from also escaping the ":" after the scheme
url = urllib.quote("http://www.example.com/a url with whitespaces", safe=":/")

urllib2.urlopen does not save a file for you; it only returns a response object that you have to read and write out yourself. If you want to download a file straight to disk using the urllib modules, use urllib.urlretrieve. However, requests is easier to grasp in the beginning.

import requests
response = requests.get(url)

The response provides several useful attributes:

  • response.text: The body decoded as text, e.g. the source code of a web page.
  • response.content: The body as raw bytes; use this for binary files such as a zip archive.
  • response.status_code: Status code of your request. 200 is ok.
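To illustrate why the raw bytes matter for a zip file, here is a small sketch that needs no network access. It assumes Python 3, and the byte string is made-up sample data standing in for a downloaded body; decoding it with replacement is only a rough stand-in for reading a binary body as text.

```python
# Made-up sample: the zip signature "PK\x03\x04" followed by a byte
# that is not valid UTF-8 on its own.
raw = b"PK\x03\x04\xff"

# Decoding as text substitutes a replacement character for the bad byte.
as_text = raw.decode("utf-8", errors="replace")

# Encoding the text again does not restore the original bytes,
# so a file written from the text version would be corrupted.
round_tripped = as_text.encode("utf-8")
print(round_tripped == raw)  # prints False
```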

You probably want to save the downloaded file somewhere. To do so, open a file in binary mode with open and write the content of your response to it. Do not forget to close the file.

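# 'wb' opens the file for writing in binary mode
your_file_connection = open('your_file', 'wb')
# file objects have write(), not save(); response.content holds the raw bytes
your_file_connection.write(response.content)
your_file_connection.flush()
your_file_connection.close()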
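As an alternative to closing the file by hand, a with block closes it automatically, even if the write fails. The following is a minimal sketch assuming Python 3; save_bytes is a hypothetical helper name, and the payload is made-up data standing in for response.content.

```python
import os
import tempfile


def save_bytes(path, data):
    """Write raw bytes to path; the with block closes the file for us."""
    with open(path, "wb") as out_file:
        out_file.write(data)


# Made-up payload standing in for the bytes of a downloaded file.
target = os.path.join(tempfile.mkdtemp(), "your_file")
save_bytes(target, b"PK\x03\x04 example payload")
```

With a real download you would pass response.content as the data argument.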

Summary

import urllib
import requests

# safe=":/" stops quote from also escaping the ":" after the scheme
url = urllib.quote("http://www.example.com/a url with whitespaces", safe=":/")
response = requests.get(url)

your_file_connection = open('your_file', 'wb')
your_file_connection.write(response.content)
your_file_connection.flush()
your_file_connection.close()

requests Documentation: http://docs.python-requests.org/en/latest/




Answer 2:


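While Jon's answer was the correct way at the time, note that in Python 3.x quote lives in urllib.parse, so you have to change it to:

import urllib.parse
# safe=":/" stops quote from also escaping the ":" after the scheme
url = urllib.parse.quote("http://www.example.com/a url with whitespaces", safe=":/")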
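A slightly more careful variant, sketched here under the assumption of Python 3, splits the URL first and percent-encodes only its path, so the scheme and host are never touched. quote_url_path is a hypothetical helper name, and the query string (if any) is deliberately left unchanged.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def quote_url_path(url):
    """Percent-encode only the path of url, leaving scheme and host alone."""
    parts = urlsplit(url)
    # SplitResult is a named tuple, so _replace returns a modified copy.
    return urlunsplit(parts._replace(path=quote(parts.path)))


print(quote_url_path("http://www.example.com/a url with whitespaces"))
# prints http://www.example.com/a%20url%20with%20whitespaces
```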



Answer 3:


After trying this, I found that file objects have no save method. The line that writes the data needs to be: your_file_connection.write(response.content)

at least on Python 2.*



Source: https://stackoverflow.com/questions/21735801/how-to-send-a-urllib2-request-with-added-white-spaces
