python-requests | 易学教程

502 error using Requests to search website in Python

Question: I am using a very basic program to look up a query on a website and print the search results. Why do I get a 502 error?

    import requests
    from bs4 import BeautifulSoup
    import re

    def main():
        url = "https://www.last10k.com/Search"
        dat = {'q':'goog'}
        resp = requests.get(url, params=dat)
        print(resp.content)

Answer 1: Define a User-Agent header, like this:

    import requests

    def main():
        url = "https://www.last10k.com/Search"
        dat = {'q':'goog'}
        resp = requests.get(url, params=dat, headers={'User-Agent':
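
The answer's suggestion can be checked without touching the network by preparing the request and inspecting it. This is a minimal sketch, not the answer's exact code: the helper name `build_search_request` and the User-Agent string are hypothetical, and whether the site accepts this particular value is an assumption.

```python
import requests

# Hypothetical User-Agent value; the string the site actually expects is
# not known from the excerpt.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"


def build_search_request(query, user_agent=USER_AGENT):
    """Prepare (but do not send) the search request with a User-Agent header."""
    request = requests.Request(
        "GET",
        "https://www.last10k.com/Search",
        params={"q": query},
        headers={"User-Agent": user_agent},
    )
    return request.prepare()


prepared = build_search_request("goog")
print(prepared.url)                    # final URL including the query string
print(prepared.headers["User-Agent"])  # header that would be sent
```

To actually send it, something like `requests.Session().send(prepared)` followed by `raise_for_status()` on the response would surface a 502 as an exception instead of printing an error page.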

How do I decode text from a pdf online with Requests?

Question: I am trying to create a PDF puller for the Australian Stock Exchange website that will let me search through all the 'Announcements' made by companies and look for keywords in the PDFs of those announcements. So far I have used the requests library. Below is my code so far:

    import requests

    url = 'http://www.asx.com.au/asxpdf/20171103/pdf/43nyyw9r820c6r.pdf'
    response = requests.get(url)
    print(response.content)

However what prints is the following string (I will cut this

How to get first child table row from a table in BeautifulSoup (Python)

Question: Here is the code and a sample of the results. I just want the first column of the table, ignoring the rest. There are similar questions on Stack Overflow, but they did not help.

    <tr>
      <td>JOHNSON</td>
      <td> 2,014,470 </td>
      <td>0.81</td>
      <td>2</td>
    </tr>

I want JOHNSON only, as it is the first child. My Python code is:

    import requests
    from bs4 import BeautifulSoup

    def find_raw():
        url = 'http://names.mongabay.com/most_common_surnames.htm'
        r = requests.get(url)
        html = r.content
        soup = BeautifulSoup(html)
        for

Python urllib3 error - ImportError: cannot import name UnrewindableBodyError

Question: I set my cron job to call my script at a particular time (e.g. 2 4 5 10 * python3 mayank/exp/test.py). When test.py is called, I activate the virtualenv from within the script as follows:

    activate = "/home/myserver/schedule_py3/bin/activate_this.py"
    exec(open(activate).read())

After activating the virtual environment (which has python3 and the packages needed to run the script), I try to import requests and it shows this error:

    File "schedule_module/Schedule/notification

asyncio, wrapping a normal function as asynchronous

Question: Is a function like:

    async def f(x):
        time.sleep(x)

    await f(5)

properly asynchronous/non-blocking? Is the sleep function provided by asyncio any different? And finally, is aiorequests a viable asynchronous replacement for requests? (To my mind it basically wraps the main components as asynchronous.) https://github.com/pohmelie/aiorequests/blob/master/aiorequests.py

Answer 1: The provided function is not a correctly written async function because it invokes a blocking call, which is forbidden in asyncio.
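
The difference the answer describes can be measured directly. The sketch below runs three copies of each variant concurrently and times them; the function names are illustrative and not taken from the question or from aiorequests. Handing the blocking call to the default thread pool via `run_in_executor` is one common way to wrap a normal function, shown here as an assumption about what the asker wants rather than as the answer's recommendation.

```python
import asyncio
import time


async def blocking_sleep(seconds):
    # time.sleep never yields to the event loop, so concurrent tasks
    # end up running one after another.
    time.sleep(seconds)


async def cooperative_sleep(seconds):
    # asyncio.sleep suspends this coroutine and lets other tasks run.
    await asyncio.sleep(seconds)


async def threaded_sleep(seconds):
    # Run the blocking call in the default thread pool so the loop stays free.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, time.sleep, seconds)


async def timed(coro_func, seconds, count=3):
    """Run `count` copies concurrently and return the elapsed wall time."""
    start = time.perf_counter()
    await asyncio.gather(*(coro_func(seconds) for _ in range(count)))
    return time.perf_counter() - start


async def main():
    results = {}
    for func in (blocking_sleep, cooperative_sleep, threaded_sleep):
        results[func.__name__] = await timed(func, 0.2)
    return results


results = asyncio.run(main())
for name, elapsed in results.items():
    print(f"{name}: {elapsed:.2f}s")
```

The blocking variant should take roughly the sum of the three sleeps, while the other two should take roughly the length of a single sleep.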

python requests lib is not working in amazon aws

Question: I am trying the following code:

    import requests

    headers = {
        'authority': 'www.nseindia.com',
        'upgrade-insecure-requests': '1',
        'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.183 Safari/537.36 OPR/72.0.3815.320',
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'sec-fetch-site': 'none',
        'sec-fetch-mode': 'navigate',
        'sec-fetch-user': '?1',
        'sec

python requests upload large file with additional data

Question: I've been looking for a way to upload a large file together with additional data, but there doesn't seem to be a solution. To upload a file I've been using this code, and it works fine with small files:

    with open("my_file.csv", "rb") as f:
        files = {"documents": ("my_file.csv", f, "application/octet-stream")}
        data = {"composite": "NONE"}
        headers = {"Prefer": "respond-async"}
        resp = session.post("my/url", headers=headers, data=data, files=files)

The problem is that the code loads the whole

Unable to determine SOCKS version from socks

Question: I am using a proxy connection (HTTP proxy: 10.3.100.207, port 8080). Calling the get function of Python's requests module gives the following error: "Unable to determine SOCKS version from socks://10.3.100.207:8080/"

Answer 1: Try

    export all_proxy="socks5://10.3.100.207:8080"

if you want to use a SOCKS proxy. Otherwise use

    export all_proxy=""

for no proxy. Hope this works.

Answer 2: I resolved this problem by removing "socks:" in all_proxy.

Source: https://stackoverflow.com/questions/39906836/unable-to-determine-socks-version
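
Both answers point at the `all_proxy` environment variable. The sketch below looks for proxy variables whose scheme is a bare `socks`, which is the form quoted in the error message. The helper name and the list of variables it checks are hypothetical; the set of accepted schemes reflects the SOCKS schemes urllib3 recognises (`socks4`, `socks4a`, `socks5`, `socks5h`).

```python
import os
from urllib.parse import urlsplit

# SOCKS schemes that carry an explicit version; a bare "socks://" does not.
SUPPORTED_SOCKS_SCHEMES = {"socks4", "socks4a", "socks5", "socks5h"}

# Hypothetical selection of variables to inspect (lower and upper case).
PROXY_VARIABLES = ("all_proxy", "http_proxy", "https_proxy")


def find_ambiguous_socks_proxies(environ=os.environ):
    """Return proxy variables whose SOCKS scheme has no explicit version."""
    problems = {}
    for name in PROXY_VARIABLES:
        for key in (name, name.upper()):
            value = environ.get(key, "")
            scheme = urlsplit(value).scheme.lower()
            if scheme.startswith("socks") and scheme not in SUPPORTED_SOCKS_SCHEMES:
                problems[key] = value
    return problems


# Example environment mirroring the values in the question.
example_env = {
    "all_proxy": "socks://10.3.100.207:8080/",
    "http_proxy": "http://10.3.100.207:8080",
}
print(find_ambiguous_socks_proxies(example_env))
```

Any variable reported here would need either an explicit version such as `socks5://` (Answer 1) or, if the proxy is really an HTTP proxy as the question states, a different value or no value at all.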