python-requests

Downloading multiple files with requests in Python

天涯浪子 submitted on 2021-02-10 23:38:38
Question: Currently I'm facing the following problem: I have 3 download links in a list. Only the last file in the list is downloaded completely; the others have a file size of one kilobyte. Code:

    from requests import get

    def download(url, filename):
        with open(filename, "wb") as file:
            response = get(url, stream=True)
            file.write(response.content)

    for link in f:
        url = link
        split_url = url.split("/")
        filename = split_url[-1]
        filename = filename.replace("\n", "")
        download(url, filename)

The result looks like …
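A likely culprit, judging from the symptom alone (the asker's output is cut off): each line read from the file keeps its trailing newline, so every URL except the last is malformed and the server answers with a small error page. A minimal sketch that strips the line before using it, streams the body in chunks, and fails loudly on HTTP errors; the "links.txt" filename is a hypothetical stand-in for the asker's file object f:

    import requests

    def download(url, filename):
        response = requests.get(url, stream=True)
        response.raise_for_status()  # fail loudly instead of saving an error page
        with open(filename, "wb") as file:
            for chunk in response.iter_content(chunk_size=8192):
                file.write(chunk)

    with open("links.txt") as f:  # hypothetical source of the three links
        for line in f:
            url = line.strip()  # strip the newline from the URL itself, not just the filename
            if url:
                download(url, url.split("/")[-1])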

Error 401 sending recaptcha token with requests

谁都会走 submitted on 2021-02-10 22:18:02
Question: I tried to build a cart for this site: https://www.off---white.com/en/IT . One of the params I have to send for the cart is a reCAPTCHA token. I get the token manually using this project: https://github.com/Cosmo3904/Recaptcha-Harvester-V2 . When I make the request I pass all the params:

    token = 'recaptcha_token'  # (I get it manually and it expires every 110 s)
    payload = {"variant_id": "111380", "quantity": "1", 'g-recaptcha-response': token}
    s = requests
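Since the token dies after roughly 110 seconds, a common pattern is to mint a fresh token immediately before each POST and send it with the rest of the form data inside one session. A minimal sketch only; the cart endpoint URL and the get_token() helper are hypothetical placeholders, and a 401 usually means the token was stale or was minted for a different site key or domain than the one being posted to:

    import requests

    def get_token():
        # hypothetical helper: return a fresh token from the reCAPTCHA harvester
        raise NotImplementedError

    s = requests.Session()
    payload = {
        "variant_id": "111380",
        "quantity": "1",
        "g-recaptcha-response": get_token(),  # fetch right before posting; it expires in ~110 s
    }
    # hypothetical endpoint: the real cart URL must be taken from the browser's network tab
    r = s.post("https://www.off---white.com/en/IT/cart", data=payload)
    print(r.status_code)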

Downloading files in chunks in python?

て烟熏妆下的殇ゞ submitted on 2021-02-10 21:42:43
Question: I am writing a simple synchronous download manager which downloads a video file in 10 sections. I am using requests to get the Content-Length from the headers. Using this I am splitting the file into 10 byte ranges, downloading them, and then merging them to form the complete video. The code below is supposed to work this way, but the merged file only plays for a few seconds and is corrupted after that. What is wrong in my code?

    import requests
    import os

    def intervals(parts, duration):
        part_duration = duration /
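A frequent cause of exactly this symptom (an assumption, since the interval code above is cut off): HTTP Range requests are inclusive on both ends, so ranges like 0-1000 and 1000-2000 duplicate one byte at every boundary and shift everything after the first section. A sketch of non-overlapping ranges merged strictly in order; the URL is a placeholder:

    import requests

    url = "https://example.com/video.mp4"  # hypothetical video URL
    size = int(requests.head(url).headers["Content-Length"])
    parts = 10
    step = size // parts

    # bytes=a-b is inclusive, so each part must end one byte before the next starts
    ranges = [(i * step, (i + 1) * step - 1) for i in range(parts)]
    ranges[-1] = (ranges[-1][0], size - 1)  # last part absorbs the remainder

    with open("merged.mp4", "wb") as out:
        for start, end in ranges:  # write strictly in order
            r = requests.get(url, headers={"Range": f"bytes={start}-{end}"})
            out.write(r.content)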

Decode a web page using the requests and BeautifulSoup packages

不问归期 submitted on 2021-02-10 20:20:55
Question: I am trying a Python practice question. The question is: "Use the BeautifulSoup and requests Python packages to print out a list of all the article titles on the New York Times homepage." Below is my solution, but it doesn't give any output. I am using Jupyter Notebook, and when I run the code below it does nothing. My kernel is working properly, which means the problem is in my code.

    import requests
    from bs4 import BeautifulSoup
    from urllib.request import urlopen

    base_url = 'https:/
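For reference, a minimal working sketch of the exercise. The choice of h3 as the headline tag is an assumption about the current NYT markup, which changes over time, so the selector may need adjusting after inspecting the page:

    import requests
    from bs4 import BeautifulSoup

    base_url = "https://www.nytimes.com"
    response = requests.get(base_url)
    soup = BeautifulSoup(response.text, "html.parser")

    # assumed headline tag; inspect the homepage and adjust if nothing prints
    for tag in soup.find_all("h3"):
        title = tag.get_text(strip=True)
        if title:
            print(title)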

python requests.post data without value doesn't work

拟墨画扇 submitted on 2021-02-10 20:08:58
Question: How can I POST only the key of the form data to a server, the way the site itself does?

    In [1]: from requests import Request
    In [2]: req = Request('POST', 'http://www.google.com', data={'json':''}).prepare()
    In [3]: req.headers, req.body
    Out[3]: ({'Content-Length': '5', 'Content-Type': 'application/x-www-form-urlencoded'}, 'json=')
    In [4]: req = Request('POST', 'http://www.google.com', data={'json':None}).prepare()
    In [5]: req.headers, req.body
    Out[5]: ({'Content-Type': 'application/x-www-form-urlencoded'}, '
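When requests receives a dict it always URL-encodes key=value pairs (and drops keys whose value is None), so the straightforward way to send a bare key with no equals sign is to pass the body as a raw string. A minimal sketch, assuming the server only inspects the literal body bytes:

    from requests import Request

    # a string body is sent verbatim, so the body is exactly "json"
    req = Request(
        "POST",
        "http://www.google.com",
        data="json",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    ).prepare()

    print(req.body)  # -> json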

HTTP headers - Requests - Python

不羁岁月 submitted on 2021-02-10 19:51:55
Question: I am trying to scrape a website whose request headers contain some attributes that are new to me, such as :authority, :method, :path, and :scheme:

    {':authority':'xxxx',':method':'GET',':path':'/xxxx',':scheme':'https',
     'accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
     'accept-encoding':'gzip, deflate, br',
     'accept-language':'en-US,en;q=0.9',
     'cache-control':'max-age=0',
     GOOGLE_ABUSE_EXEMPTION=ID=0d5af55f1ada3f1e:TM=1533116294:C=r:IP=182.71.238.62-:S
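Those colon-prefixed names are HTTP/2 pseudo-headers: the protocol-level encoding of the request's host, method, path, and scheme. requests speaks HTTP/1.1 and derives all four from the URL and the HTTP method, so the equivalent request simply omits them and sends only the regular headers. A sketch with the asker's placeholder host kept as-is:

    import requests

    # :authority, :method, :path, :scheme must NOT be sent as literal headers;
    # requests builds them from the URL and the HTTP method.
    headers = {
        "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,"
                  "image/webp,image/apng,*/*;q=0.8",
        "accept-encoding": "gzip, deflate, br",
        "accept-language": "en-US,en;q=0.9",
        "cache-control": "max-age=0",
    }
    response = requests.get("https://xxxx/xxxx", headers=headers)  # placeholder URL
    print(response.status_code)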

Paginate with network requests scraper

跟風遠走 submitted on 2021-02-10 19:05:03
Question: I am trying to scrape Naukri job postings. Scraping the rendered pages was too time-consuming, so I switched to network requests. I believe I found the pagination pattern by changing the URL directly (rather than clicking the next tab). Example URLs:

    https://www.naukri.com/maintenance-jobs?xt=catsrch&qf%5B%5D=19
    https://www.naukri.com/maintenance-jobs-2?xt=catsrch&qf%5B%5D=19
    https://www.naukri.com/maintenance-jobs-3?xt=catsrch&qf%5B%5D=19
    https://www.naukri.com/maintenance-jobs-4?xt=catsrch&qf%5B%5D=19

The …
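Given that pattern (page 1 has no numeric suffix, later pages append -2, -3, …), a minimal pagination loop looks like the sketch below. The page limit and the User-Agent header are placeholder assumptions, and the parsing step is omitted:

    import requests

    BASE = "https://www.naukri.com/maintenance-jobs"
    QUERY = "?xt=catsrch&qf%5B%5D=19"

    for page in range(1, 5):  # arbitrary placeholder limit
        suffix = "" if page == 1 else f"-{page}"  # page 1 has no suffix
        url = f"{BASE}{suffix}{QUERY}"
        response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"})
        print(url, response.status_code)
        # parse response.text here (parsing omitted)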
