urllib2

How to check whether urllib2 followed a redirect?

心不动则不痛 submitted on 2019-11-29 10:45:11
I've written this function:

    def download_mp3(url, name):
        opener1 = urllib2.build_opener()
        page1 = opener1.open(url)
        mp3 = page1.read()
        filename = name + '.mp3'
        fout = open(filename, 'wb')
        fout.write(mp3)
        fout.close()

This function takes a URL and a name, both as strings, then downloads the MP3 at that URL and saves it under the given name. The URL has the form http://site/download.php?id=xxxx, where xxxx is the id of an MP3. If this id does not exist, the site redirects me to another page. So the question is: how can I check whether this id exists? I've tried to check if the url exist with a
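
One common way to detect a redirect with urllib2 is to compare the URL the response actually came from (response.geturl()) with the URL that was requested. A minimal sketch follows; the helper name check_id_exists is made up for illustration, and it assumes any redirect means the id is missing:

    import urllib2

    def check_id_exists(url):
        # urllib2 follows redirects automatically when opening the URL.
        response = urllib2.urlopen(url)
        # If the final URL differs from the one we asked for,
        # a redirect happened, so the id presumably does not exist.
        return response.geturl() == url

    # Usage: only download when no redirect occurred.
    url = 'http://site/download.php?id=1234'
    if check_id_exists(url):
        print 'id exists, safe to download'
    else:
        print 'redirected, id does not exist'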

Python and urllib

廉价感情. submitted on 2019-11-29 10:35:36
I'm trying to download a zip file ("tl_2008_01001_edges.zip") from an FTP census site using urllib. What form is the zip file in when I get it, and how do I save it? I'm fairly new to Python and don't understand how urllib works. This is my attempt:

    import urllib, sys
    zip_file = urllib.urlretrieve("ftp://ftp2.census.gov/geo/tiger/TIGER2008/01_ALABAMA/Autauga_County/", "tl_2008_01001_edges.zip")

If I know the list of FTP folders (or counties in this case), can I run through the FTP site list using the glob function? Thanks.

gimel: Use urllib2.urlopen() for the zip file data and the directory listing.
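
A minimal sketch of the fix implied by the question, assuming the file actually sits in that Autauga_County directory (the path is taken from the question and may have changed since): point urllib.urlretrieve at the full file URL rather than the directory, and the raw zip bytes are written straight to a local file.

    import urllib

    base = "ftp://ftp2.census.gov/geo/tiger/TIGER2008/01_ALABAMA/Autauga_County/"
    filename = "tl_2008_01001_edges.zip"

    # urlretrieve needs the URL of the file itself, not its parent directory;
    # it saves the zip bytes to the local path given as the second argument.
    local_path, headers = urllib.urlretrieve(base + filename, filename)
    print "saved to", local_path

Note that glob only works on the local filesystem; to cover several counties you would loop over a list of directory names instead.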

Sending SMS with Python via the China Mobile Fetion module

孤者浪人 submitted on 2019-11-29 10:15:15
Author: miaoo

1. Use case
A system I built needs to send SMS messages to my own phone, so I searched around and found an open-source library that sends SMS over the China Mobile Fetion channel: PyFetion. PyFetion implements the Fetion communication protocol, so it supports a lot of features: sending and receiving SMS, managing contacts, changing status, and so on. However, I only need to send SMS, so the other features are redundant; on top of that, logging in to Fetion with PyFetion may require entering a CAPTCHA, which makes it a poor fit for an automated system. Searching further, I found that Fetion provides a WAP site for mobile users: http://f.10086.cn (note: since this is a WAP site, you may need to install the wmlbrowser extension in Firefox to browse it normally). It lets you send and receive messages online. Because the WAP site's markup is fairly simple, it is well suited to simulating the whole login-and-send flow in a program in order to send SMS.

2. Code analysis
The code mainly uses the following libraries:

    cj = cookielib.LWPCookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    urllib2.install_opener(opener)

When logging in, the cookie information has to be handled first: cj = cookielib.LWPCookieJar() opener = urllib2
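
The snippet above is the cookie plumbing the post relies on; below is a rough sketch of how a login-then-send flow could look on top of it. The login and send URLs and the form field names are hypothetical placeholders, not the real f.10086.cn endpoints, which this excerpt does not show.

    import cookielib
    import urllib
    import urllib2

    # Keep cookies across requests so the session survives the login.
    cj = cookielib.LWPCookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    urllib2.install_opener(opener)

    # Hypothetical login: POST the phone number and password to the WAP site.
    login_data = urllib.urlencode({'mobile': '138xxxxxxxx', 'password': 'secret'})
    urllib2.urlopen('http://f.10086.cn/im/login/submit', login_data)

    # Hypothetical send: POST the message text to the "send to self" page.
    msg_data = urllib.urlencode({'msg': 'alert: disk almost full'})
    urllib2.urlopen('http://f.10086.cn/im/user/send', msg_data)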

Common Python interview questions

让人想犯罪 __ submitted on 2019-11-29 10:14:19
1. Reading very large files
Use a generator/iterator to traverse the file lazily: for line in file.

2. The difference between iterators and generators
An iterator is the more abstract concept: any object whose class has a next() method and an __iter__() method that returns itself. Container objects such as string, list, dict and tuple are convenient to traverse with a for loop. Behind the scenes, the for statement calls iter() on the container; iter() is a Python built-in that returns an iterator object defining next(), which visits the container's elements one by one. next() is also a built-in; when there are no more elements, it raises a StopIteration exception. A generator is a simple and powerful tool for creating iterators. Generators are written like regular functions, except that they use the yield statement whenever they return data. Each time next() is called, the generator resumes where it left off (it remembers where the last statement executed and all of its data values). The difference: a generator can do everything an iterator can, and because __iter__() and next() are created automatically, generators are especially concise. They are also efficient; using a generator expression instead of a list comprehension saves memory. Besides automatically creating and preserving program state, a generator also raises StopIteration automatically when it terminates.

3. The purpose and uses of decorators
Adding logging; timing function execution; running preparatory steps before a function executes
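
A short sketch illustrating two of the points above (the function and file names are made up for illustration): a generator that reads a large file one line at a time, and a decorator that reports a function's execution time.

    import time

    def read_lines(path):
        # Generator: yields one line at a time instead of loading the whole file.
        with open(path) as f:
            for line in f:
                yield line

    def timed(func):
        # Decorator: measure and print how long the wrapped function takes.
        def wrapper(*args, **kwargs):
            start = time.time()
            result = func(*args, **kwargs)
            print '%s took %.3f seconds' % (func.__name__, time.time() - start)
            return result
        return wrapper

    @timed
    def count_lines(path):
        return sum(1 for _ in read_lines(path))

    # count_lines('big.log') would print the elapsed time and return the line count.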

Why does this url raise BadStatusLine with httplib2 and urllib2?

懵懂的女人 submitted on 2019-11-29 10:12:34
Using httplib2 and urllib2, I'm trying to fetch pages from this URL, but none of the attempts worked; they all ended with this exception.

    content = conn.request(uri="http://www.zdnet.co.kr/news/news_print.asp?artice_id=20110727092902")

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 1129, in request
        (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
      File "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 901, in _request
        (response,
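
BadStatusLine generally means the server returned a response the HTTP client could not parse. One workaround worth trying, sketched below with urllib2, is to send browser-like headers, since some servers respond badly to clients that look like scripts; whether this particular server accepts it is an assumption.

    import urllib2

    url = "http://www.zdnet.co.kr/news/news_print.asp?artice_id=20110727092902"

    # Send a browser-like User-Agent; some servers return malformed or empty
    # status lines to requests without one.
    request = urllib2.Request(url, headers={'User-Agent': 'Mozilla/5.0'})

    try:
        html = urllib2.urlopen(request).read()
        print len(html), 'bytes fetched'
    except Exception as e:
        print 'request failed:', e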

urllib2 HTTP error 429

廉价感情. submitted on 2019-11-29 08:48:05
Question: So I have a list of subreddits and I'm using urllib to open them. As I go through them, urllib eventually fails with:

    urllib2.HTTPError: HTTP Error 429: Unknown

Doing some research, I found that reddit limits the amount of requests to their servers by IP: "Make no more than one request every two seconds. There's some allowance for bursts of requests, but keep it sane. In general, keep it to no more than 30 requests in a minute." So I figured I'd use time.sleep() to limit my requests to one page
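
A minimal sketch of the rate-limited loop described above, with a hypothetical list of subreddits: it sleeps two seconds between requests, sets a descriptive User-Agent, and backs off when a 429 still slips through.

    import time
    import urllib2

    subreddits = ['python', 'learnpython', 'programming']  # hypothetical list

    for name in subreddits:
        url = 'http://www.reddit.com/r/%s/.json' % name
        # A descriptive User-Agent plus a 2-second pause keeps the loop
        # well inside the stated limit of ~30 requests per minute.
        request = urllib2.Request(url, headers={'User-Agent': 'my-script/0.1'})
        try:
            data = urllib2.urlopen(request).read()
            print name, len(data), 'bytes'
        except urllib2.HTTPError as e:
            if e.code == 429:
                print 'rate limited, backing off'
                time.sleep(10)
            else:
                raise
        time.sleep(2)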

Python: the difference between urllib and urllib2

我们两清 submitted on 2019-11-29 08:02:42
As a Python newbie, I had long been fuzzy about urllib and urllib2, assuming 2 was simply an upgraded version of 1. Today I read a post by a foreign author, "Python: difference between urllib and urllib2", and finally understood the distinction. You might be intrigued by the existence of two separate URL modules in Python: urllib and urllib2. Even more intriguing: they are not alternatives for each other. So what is the difference between urllib and urllib2, and do we need them both? urllib and urllib2 are both Python modules that do URL request related stuff but offer different functionalities. Their two most significant differences
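
A short sketch of the two differences the excerpt is leading up to, as they are commonly summarized (the example URLs are placeholders): urllib2 can accept a Request object so you can set headers, while urllib provides urlencode(), which urllib2 lacks, so the two modules are often used together.

    import urllib
    import urllib2

    # urllib2 can take a Request object, which lets you set headers.
    req = urllib2.Request('http://example.com/', headers={'User-Agent': 'Mozilla/5.0'})
    page = urllib2.urlopen(req).read()

    # urllib provides urlencode() for building query strings and POST bodies;
    # urllib2 has no equivalent, so it is commonly paired with urllib.
    params = urllib.urlencode({'q': 'python', 'page': 1})
    result = urllib2.urlopen('http://example.com/search?' + params).read()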

Zabbix explained (5): configuring WeChat alerts

徘徊边缘 submitted on 2019-11-29 07:42:28
New era, new tools. A couple of years ago everyone was still working out how to do SMS alerting sensibly; now WeChat alerting has quietly come into view and email alerting looks dated. Some leading companies even turn WeChat alerts into graphical alerts, sending the monitoring graphs along with the message, which is an impressive setup.

Configuring WeChat alerts: the idea is to use the WeChat enterprise account API to publish the alert message through the enterprise account; WeChat users who follow that account then receive the message, which gives us WeChat alerting. So what we need to do is: first, create a WeChat enterprise account; second, write a script that calls the enterprise account API; third, configure the relevant settings in zabbix_web. It looks much like email alerting; let's go through it step by step.

Step one: apply for an account at https://qy.weixin.qq.com/. I won't go through every field in the application; the key point is to choose the enterprise account type. Reportedly one ID card can register two enterprise accounts, so judge for yourself. Then choose "team", which means what it says. After applying, log in (login requires scanning a QR code). One thing to remember after logging in: go into settings and upload a new logo, because errors reportedly occur otherwise. I haven't verified that myself, but changing it does no harm and adds a personal touch. The QR code shown there is the enterprise account's QR code; have the people who should receive alerts scan it to register, although you can also add them manually. Creating the enterprise account is not the whole job, though; you still need the API-related information, which we'll look at next. Creating the CorpID and Secret used to call the API: first, add a member
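
The alert script this post is building toward typically does two things with the enterprise account API: exchange the CorpID and Secret for an access token, then post the alert text as a message. A rough sketch of that flow with urllib2 is below; the qyapi.weixin.qq.com endpoints, agentid, and credentials are assumptions to adapt to your own account, not values given in the post.

    import json
    import urllib2

    CORP_ID = 'your_corpid'   # from the enterprise account admin console
    SECRET = 'your_secret'    # secret of the application that sends alerts
    AGENT_ID = 1              # id of that application

    def get_token():
        # Assumed token endpoint of the enterprise account API.
        url = ('https://qyapi.weixin.qq.com/cgi-bin/gettoken'
               '?corpid=%s&corpsecret=%s' % (CORP_ID, SECRET))
        return json.load(urllib2.urlopen(url))['access_token']

    def send_alert(user, text):
        # Assumed message-sending endpoint; POSTing a JSON body sends the text.
        url = ('https://qyapi.weixin.qq.com/cgi-bin/message/send'
               '?access_token=%s' % get_token())
        payload = json.dumps({'touser': user,
                              'msgtype': 'text',
                              'agentid': AGENT_ID,
                              'text': {'content': text}})
        return urllib2.urlopen(url, payload).read()

    if __name__ == '__main__':
        import sys
        # Zabbix usually invokes the script as: script.py <user> <subject> <message>
        send_alert(sys.argv[1], '%s\n%s' % (sys.argv[2], sys.argv[3]))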

Python: saving large web page to file

ⅰ亾dé卋堺 submitted on 2019-11-29 07:38:52
Let me start off by saying I'm not new to programming, but I am very new to Python. I've written a program using urllib2 that requests a web page that I would then like to save to a file. The web page is about 300KB, which doesn't strike me as particularly large but seems to be enough to give me trouble, so I'm calling it 'large'. I'm using a simple call to copy directly from the object returned by urlopen into the file:

    file.write(webpage.read())

but it will just sit for minutes trying to write to the file, and I eventually receive the following: Traceback (most recent call last): File
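
One common way to avoid holding the whole response in memory, and to avoid one long blocking read(), is to copy the response in fixed-size chunks. A minimal sketch, with a placeholder URL and output filename:

    import urllib2

    url = 'http://example.com/big-page.html'   # placeholder
    webpage = urllib2.urlopen(url)

    with open('page.html', 'wb') as out:
        while True:
            chunk = webpage.read(64 * 1024)    # read 64 KB at a time
            if not chunk:
                break                          # empty string means EOF
            out.write(chunk)

shutil.copyfileobj(webpage, out) performs the same chunked copy in a single call.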