Python urllib2.open Connection reset by peer error

独厮守ぢ 2021-01-20 15:30

I'm trying to scrape a page using Python.

The problem is that I keep getting "[Errno 54] Connection reset by peer".

The error comes when I run this code -

2 Answers
  •  清歌不尽
    2021-01-20 15:43

    $> telnet www.bkstr.com 80
    Trying 64.37.224.85...
    Connected to www.bkstr.com.
    Escape character is '^]'.
    GET /webapp/wcs/stores/servlet/CourseMaterialsResultsView?catalogId=10001&categoryId=9604&storeId=10161&langId=-1&programId=562&termId=100020629&divisionDisplayName=Stanford&departmentDisplayName=ILAC&courseDisplayName=126&sectionDisplayName=01&demoKey=d&purpose=browse HTTP/1.0
    
    Connection closed by foreign host.
    

    You're not going to have any joy fetching that URL with a bare request from Python, or from anywhere else: as the telnet session above shows, the server closes the connection without sending a response. If it works in your browser, then something else must be going on, such as cookies or authentication. Or, possibly, the server is broken or its configuration has changed.

    To check, try opening the URL in a browser that has never visited that site. Then log in and try it again.
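
    If you want the script itself to tell a reset apart from other failures, a Python 3 sketch along these lines might help. Errno 54 is ECONNRESET on macOS and BSD (Linux uses 104), so the check compares against errno.ECONNRESET instead of a literal number. The helper name is_connection_reset is made up for illustration.

```python
import errno
import urllib.error


def is_connection_reset(exc):
    """Return True if exc is, or wraps, a 'connection reset by peer' error."""
    # urllib may wrap socket-level failures in URLError; the original
    # exception is then available as .reason (which can also be a string).
    if isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, OSError):
        exc = exc.reason
    if isinstance(exc, ConnectionResetError):
        return True
    # Fall back to the portable errno constant instead of hard-coding 54.
    return isinstance(exc, OSError) and exc.errno == errno.ECONNRESET
```

    You would call it from an except OSError block around the request, and treat a True result as a hint that the server is rejecting the request, not that the network is down.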

    Edit: It was cookies after all:

    import cookielib
    import urllib2

    # Keep cookies across requests; this server appears to drop requests that lack them.
    cj = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

    # First request: load the home page so the server can set its cookies.
    opener.open("http://www.bkstr.com/")

    # Second request: fetch the page we actually want, sending those cookies back.
    data = opener.open("http://www.bkstr.com/webapp/wcs/stores/servlet/CourseMaterialsResultsView?catalogId=10001&categoryId=9604&storeId=10161&langId=-1&programId=562&termId=100020629&divisionDisplayName=Stanford&departmentDisplayName=ILAC&courseDisplayName=126&sectionDisplayName=01&demoKey=d&purpose=browse").read()
    

    The output looks ok, but you'll have to check that it does what you want :)
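
    On Python 3, cookielib became http.cookiejar and urllib2 was folded into urllib.request, so the same approach might look like the sketch below. The helper names build_cookie_opener and fetch_with_cookies and the 30-second timeout are made up for illustration, and whether the site still behaves this way is an assumption.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return (opener, jar); the opener stores and resends cookies via the jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch_with_cookies(opener, home_url, target_url, timeout=30):
    """Visit home_url first so cookies get set, then return target_url's body as bytes."""
    with opener.open(home_url, timeout=timeout):
        pass  # only the cookies set by this response matter
    with opener.open(target_url, timeout=timeout) as response:
        return response.read()
```

    Usage would be roughly: opener, jar = build_cookie_opener(), then data = fetch_with_cookies(opener, "http://www.bkstr.com/", page_url), where page_url stands for the long CourseMaterialsResultsView URL above.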
