Scrape a web page that requires a session cookie first

故里飘歌 2020-12-13 22:35

I'm trying to scrape an Excel file from a government "muster roll" database. However, the URL I have to access this Excel file:

http://nrega.ap.gov.in/Nregs/Front

2 answers

一个人的身影 (original poster)

2020-12-13 23:29
Using cookies and urllib2 (both `cookielib` and `urllib2` are Python 2 modules):
```
import cookielib
import urllib2

cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
# use opener to open different urls
```
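The snippet above targets Python 2. A minimal Python 3 sketch of the same setup, assuming only the standard-library renames (`cookielib` became `http.cookiejar`, `urllib2` became `urllib.request`); the helper name `build_cookie_opener` is illustrative and not part of the original answer:

```python
# Python 3 sketch of the cookie-aware opener shown above.
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return (opener, jar). The opener stores cookies it receives in jar
    and sends them back on later HTTP(S) requests made through it."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


opener, cj = build_cookie_opener()
# Use opener.open(url) for every request that should share the session.
```

Returning the jar alongside the opener makes it easy to inspect which cookies the server actually set, for example with `list(cj)`.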
You can use the same opener for several connections:
```
data = [opener.open(url).read() for url in urls]
```
Or install it globally:
```
urllib2.install_opener(opener)
```
In the latter case, the rest of the code looks the same with or without cookie support:
```
data = [urllib2.urlopen(url).read() for url in urls]
```
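For the case in the question, where the server hands out a session cookie on one page and expects it on the download, the two requests just need to go through the same opener. A rough Python 3 sketch, assuming the cookie is set by a plain GET of some landing page; `fetch_with_session` and both URL parameters are hypothetical names, and the real site may need extra steps (form posts, headers) that are not covered here:

```python
import http.cookiejar
import urllib.request


def fetch_with_session(landing_url, file_url, timeout=30):
    """Open landing_url first so the server can set its session cookie,
    then request file_url through the same opener so the cookie is resent.
    Returns the downloaded bytes and the cookie jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    with opener.open(landing_url, timeout=timeout) as resp:
        resp.read()  # body is not needed; any cookies are now stored in jar
    with opener.open(file_url, timeout=timeout) as resp:
        return resp.read(), jar


# Hypothetical usage (placeholder URLs, not taken from the question):
# data, jar = fetch_with_session("https://example.org/start",
#                                "https://example.org/report.xls")
# with open("report.xls", "wb") as f:
#     f.write(data)
```

If the download still fails, printing `list(jar)` after the first request shows whether the server set a cookie at all.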