urllib2 file name

前端未结

关注

 14  1796

If I open a file using urllib2, like so:

remotefile = urllib2.urlopen(\'http://example.com/somefile.zip\')

Is there an easy way to get the

相关标签:

14条回答

谎友^

2020-12-01 04:13

import os,urllib2
resp = urllib2.urlopen('http://www.example.com/index.html')
my_url = resp.geturl()

os.path.split(my_url)[1]

# 'index.html'

This is not openfile, but maybe still helps :)

0 讨论(0)

一生所求

2020-12-01 04:17
The os.path.basename function works not only for file paths, but also for urls, so you don't have to manually parse the URL yourself. Also, it's important to note that you should use result.url instead of the original url in order to follow redirect responses:
```
import os
import urllib2
result = urllib2.urlopen(url)
real_url = urllib2.urlparse.urlparse(result.url)
filename = os.path.basename(real_url.path)
```
0 讨论(0)
发布评论:

提交评论
- 加载中...
猫巷女王i

2020-12-01 04:19
not that I know of.

but you can parse it easy enough like this:
```
url = 'http://example.com/somefile.zip'
print url.split('/')[-1]
```
0 讨论(0)
发布评论:

提交评论
- 加载中...
刺人心

2020-12-01 04:22
Did you mean urllib2.urlopen?

You could potentially lift the intended filename if the server was sending a Content-Disposition header by checking remotefile.info()['Content-Disposition'], but as it is I think you'll just have to parse the url.

You could use urlparse.urlsplit, but if you have any URLs like at the second example, you'll end up having to pull the file name out yourself anyway:
```
>>> urlparse.urlsplit('http://example.com/somefile.zip')
('http', 'example.com', '/somefile.zip', '', '')
>>> urlparse.urlsplit('http://example.com/somedir/somefile.zip')
('http', 'example.com', '/somedir/somefile.zip', '', '')
```
Might as well just do this:
```
>>> 'http://example.com/somefile.zip'.split('/')[-1]
'somefile.zip'
>>> 'http://example.com/somedir/somefile.zip'.split('/')[-1]
'somefile.zip'
```
0 讨论(0)
发布评论:

提交评论
- 加载中...
臣服心动

2020-12-01 04:22
If you only want the file name itself, assuming that there's no query variables at the end like http://example.com/somedir/somefile.zip?foo=bar then you can use os.path.basename for this:
```
[user@host]$ python
Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04) 
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.path.basename("http://example.com/somefile.zip")
'somefile.zip'
>>> os.path.basename("http://example.com/somedir/somefile.zip")
'somefile.zip'
>>> os.path.basename("http://example.com/somedir/somefile.zip?foo=bar")
'somefile.zip?foo=bar'
```
Some other posters mentioned using urlparse, which will work, but you'd still need to strip the leading directory from the file name. If you use os.path.basename() then you don't have to worry about that, since it returns only the final part of the URL or file path.
0 讨论(0)
发布评论:

提交评论
- 加载中...
你的背包

2020-12-01 04:29
You could also combine both of the two best-rated answers : Using urllib2.urlparse.urlsplit() to get the path part of the URL, and then os.path.basename for the actual file name.

Full code would be :
```
>>> remotefile=urllib2.urlopen(url)
>>> try:
>>>   filename=remotefile.info()['Content-Disposition']
>>> except KeyError:
>>>   filename=os.path.basename(urllib2.urlparse.urlsplit(url).path)
```
0 讨论(0)
发布评论:

提交评论
- 加载中...