python-requests

502 error using Requests to search website in Python

梦想的初衷 submitted on 2020-12-06 04:33:40
Question: Using a very basic program to search for a query on a website and print the search results, why do I get a 502 error?

    import requests
    from bs4 import BeautifulSoup
    import re

    def main():
        url = "https://www.last10k.com/Search"
        dat = {'q':'goog'}
        resp = requests.get(url, params=dat)
        print(resp.content)

Answer 1: Define a User-Agent header, like this:

    import requests

    def main():
        url = "https://www.last10k.com/Search"
        dat = {'q':'goog'}
        resp = requests.get(url, params=dat, headers={'User-Agent':
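Because the answer excerpt is cut off, here is a minimal sketch of the idea it describes: send a User-Agent header along with the query. The header value and the helper name `build_search_request` are placeholders chosen for illustration, not taken from the original answer, and there is no guarantee that this alone resolves the 502 for this particular site. The request is prepared separately from sending, so the header can be inspected first.

```python
import requests

SEARCH_URL = "https://www.last10k.com/Search"

# Assumption: any browser-like string; the original answer's value is truncated.
USER_AGENT = "Mozilla/5.0 (compatible; example-script)"


def build_search_request(query):
    """Prepare (but do not send) a GET request carrying a User-Agent header."""
    request = requests.Request(
        "GET",
        SEARCH_URL,
        params={"q": query},
        headers={"User-Agent": USER_AGENT},
    )
    return request.prepare()


def main():
    prepared = build_search_request("goog")
    # Sending is kept separate so the prepared request can be checked offline.
    with requests.Session() as session:
        resp = session.send(prepared, timeout=10)
    print(resp.status_code)


if __name__ == "__main__":
    main()
```

Passing `headers=` directly to `requests.get` has the same effect; the prepared-request form is used here only to make the outgoing header easy to check.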

How do I decode text from a pdf online with Requests?

只谈情不闲聊 submitted on 2020-12-06 04:17:20
Question: I am trying to create a PDF puller for the Australian Stock Exchange website that will let me search through all the 'Announcements' made by companies and look for keywords in the PDFs of those announcements. So far I have used the requests library. Below is my code so far:

    import requests

    url = 'http://www.asx.com.au/asxpdf/20171103/pdf/43nyyw9r820c6r.pdf'
    response = requests.get(url)
    print(response.content)

However what prints is the following string (I will cut this

How to get first child table row from a table in BeautifulSoup ( Python )

橙三吉。 submitted on 2020-12-06 02:57:22
Question: Here is the code and sample results. I just want the first column of the table, ignoring the rest. There are similar questions on Stack Overflow, but they did not help.

    <tr>
      <td>JOHNSON</td>
      <td> 2,014,470 </td>
      <td>0.81</td>
      <td>2</td>
    </tr>

I want JOHNSON only, as it is the first child. My Python code is:

    import requests
    from bs4 import BeautifulSoup

    def find_raw():
        url = 'http://names.mongabay.com/most_common_surnames.htm'
        r = requests.get(url)
        html = r.content
        soup = BeautifulSoup(html)
        for
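The excerpt ends before any answer, so the following is only a sketch of one common approach, not the accepted solution: for each `<tr>`, take the first `<td>` with `find`. It parses the sample row from the question as an inline string rather than fetching the page; the function name `first_column` and the wrapping `<table>` element are assumptions added for illustration.

```python
from bs4 import BeautifulSoup

# Sample row from the question, wrapped in a <table> for parsing (assumption).
SAMPLE_HTML = """
<table>
  <tr>
    <td>JOHNSON</td>
    <td> 2,014,470 </td>
    <td>0.81</td>
    <td>2</td>
  </tr>
</table>
"""


def first_column(html):
    """Return the text of the first <td> in every table row that has one."""
    soup = BeautifulSoup(html, "html.parser")
    values = []
    for row in soup.find_all("tr"):
        cell = row.find("td")  # first matching descendant, i.e. the first cell
        if cell is not None:   # header rows with only <th> are skipped
            values.append(cell.get_text(strip=True))
    return values


if __name__ == "__main__":
    print(first_column(SAMPLE_HTML))
```

With a live page, the same function could be given `requests.get(url).content` in place of the sample string.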

Python urllib3 error - ImportError: cannot import name UnrewindableBodyError

谁说胖子不能爱 submitted on 2020-12-05 07:14:47
Question: I set my cron job to call my script at a particular time (e.g. 2 4 5 10 * python3 mayank/exp/test.py). When test.py is called, I activate the virtualenv inside the script as follows:

    activate = "/home/myserver/schedule_py3/bin/activate_this.py"
    exec(open(activate).read())

After activating the virtual environment (which has Python 3 and the packages needed to run the script), I try to import requests and it shows this error:

    File "schedule_module/Schedule/notification

asyncio, wrapping a normal function as asynchronous

心不动则不痛 submitted on 2020-12-05 04:55:13
Question: Is a function like:

    async def f(x):
        time.sleep(x)

    await f(5)

properly asynchronous/non-blocking? Is the sleep function provided by asyncio any different? And finally, is aiorequests a viable asynchronous replacement for requests? (To my mind it basically wraps the main components as asynchronous.) https://github.com/pohmelie/aiorequests/blob/master/aiorequests.py

Answer 1: The provided function is not a correctly written async function because it invokes a blocking call, which is forbidden in asyncio.
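To make the difference concrete, here is a small sketch that times two concurrent copies of three variants: the question's `time.sleep` inside a coroutine, `asyncio.sleep`, and `time.sleep` offloaded to a thread with `run_in_executor`. The function names and the 0.2 second duration are illustrative choices, not part of the original answer.

```python
import asyncio
import time


async def blocking_sleep(seconds):
    # time.sleep blocks the event loop: no other task runs in the meantime.
    time.sleep(seconds)


async def cooperative_sleep(seconds):
    # asyncio.sleep suspends this task and lets the loop run others.
    await asyncio.sleep(seconds)


async def offloaded_sleep(seconds):
    # Wrap the blocking call by running it in the default thread pool.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, time.sleep, seconds)


async def time_two(coro_func, seconds):
    """Run two copies concurrently and return the elapsed wall-clock time."""
    start = time.perf_counter()
    await asyncio.gather(coro_func(seconds), coro_func(seconds))
    return time.perf_counter() - start


def measure(seconds=0.2):
    """Return the elapsed time for each variant, keyed by name."""
    return {
        "blocking": asyncio.run(time_two(blocking_sleep, seconds)),
        "cooperative": asyncio.run(time_two(cooperative_sleep, seconds)),
        "offloaded": asyncio.run(time_two(offloaded_sleep, seconds)),
    }


if __name__ == "__main__":
    for name, elapsed in measure().items():
        print(f"{name}: {elapsed:.2f}s")
```

The blocking variant should take roughly twice the sleep duration because the two coroutines run one after the other, while the other two should finish in about one sleep duration. The executor approach is the usual way to wrap an ordinary blocking function for use from async code.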

python requests lib is not working in amazon aws

*爱你&永不变心* submitted on 2020-12-04 05:11:52
Question: I am trying the following code:

    import requests

    headers = {
        'authority': 'www.nseindia.com',
        'upgrade-insecure-requests': '1',
        'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.183 Safari/537.36 OPR/72.0.3815.320',
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'sec-fetch-site': 'none',
        'sec-fetch-mode': 'navigate',
        'sec-fetch-user': '?1',
        'sec

python requests lib is not working in amazon aws

。_饼干妹妹 submitted on 2020-12-04 05:09:59
Question: I am trying the following code:

    import requests

    headers = {
        'authority': 'www.nseindia.com',
        'upgrade-insecure-requests': '1',
        'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.183 Safari/537.36 OPR/72.0.3815.320',
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'sec-fetch-site': 'none',
        'sec-fetch-mode': 'navigate',
        'sec-fetch-user': '?1',
        'sec

python requests upload large file with additional data

落爺英雄遲暮 submitted on 2020-12-01 02:49:31
Question: I've been looking for ways to upload a large file with additional data, but there doesn't seem to be any solution. To upload a file, I've been using this code, and it has been working fine with small files:

    with open("my_file.csv", "rb") as f:
        files = {"documents": ("my_file.csv", f, "application/octet-stream")}
        data = {"composite": "NONE"}
        headers = {"Prefer": "respond-async"}
        resp = session.post("my/url", headers=headers, data=data, files=files)

The problem is that the code loads the whole

python requests upload large file with additional data

蓝咒 submitted on 2020-12-01 02:47:35
Question: I've been looking for ways to upload a large file with additional data, but there doesn't seem to be any solution. To upload a file, I've been using this code, and it has been working fine with small files:

    with open("my_file.csv", "rb") as f:
        files = {"documents": ("my_file.csv", f, "application/octet-stream")}
        data = {"composite": "NONE"}
        headers = {"Prefer": "respond-async"}
        resp = session.post("my/url", headers=headers, data=data, files=files)

The problem is that the code loads the whole

Unable to determine SOCKS version from socks

夙愿已清 submitted on 2020-11-28 02:47:20
Question: I am using a proxy connection (HTTP proxy: 10.3.100.207, port 8080). Calling the get function of Python's requests module gives the following error:

    "Unable to determine SOCKS version from socks://10.3.100.207:8080/"

Answer 1: Try export all_proxy="socks5://10.3.100.207:8080" if you want to use a SOCKS proxy. Otherwise, use export all_proxy="" for no proxy. Hope this works. :D

Answer 2: I resolved this problem by removing "socks:" in all_proxy.

Source: https://stackoverflow.com/questions/39906836/unable-to-determine-socks-version
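An alternative to editing the environment, sketched under assumptions: configure the proxy explicitly on a session and tell requests to ignore proxy environment variables such as all_proxy. The question describes the proxy as an HTTP proxy, so the sketch uses an http:// scheme; `make_session` is a hypothetical helper name, and the target URL in `main` is only a placeholder.

```python
import requests

# Proxy address from the question; the http:// scheme is an assumption
# based on the question calling it an HTTP proxy.
PROXY_URL = "http://10.3.100.207:8080"


def make_session(proxy_url=PROXY_URL):
    """Return a session that uses proxy_url and ignores proxy env variables."""
    session = requests.Session()
    session.trust_env = False  # do not read all_proxy / http_proxy / https_proxy
    session.proxies = {"http": proxy_url, "https": proxy_url}
    return session


def main():
    session = make_session()
    # Placeholder URL; replace with the site you actually need to reach.
    resp = session.get("https://example.com", timeout=10)
    print(resp.status_code)


if __name__ == "__main__":
    main()
```

If the proxy really is a SOCKS proxy, the URL would need an explicit version such as socks5://, as Answer 1 suggests, and requests would need its optional SOCKS support installed.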