urllib3

Scraping Doutula (斗图啦) with Python (multithreaded)

徘徊边缘 submitted on 2020-08-05 04:54:57
import concurrent
import requests
from concurrent.futures import ThreadPoolExecutor
import os
import parsel


def send_request(url):
    header = {
        "user-agent": 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36'
    }
    requests.packages.urllib3.disable_warnings()
    response = requests.get(url, headers=header)
    return response


def pare_data(data):
    selector = parsel.Selector(data)
    result_list = selector.xpath('//a[@class="col-xs-6 col-sm-3"]')
    for result in result_list:
        title = result.xpath('./img/@alt').get()
        src_url = result.xpath('.
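The snippet above is cut off before the threading part, so the following is only a minimal sketch of how a ThreadPoolExecutor is commonly used to fan out downloads. The names download_all and fake_fetch are hypothetical and do not come from the original post; fake_fetch stands in for a real network call such as send_request.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(urls, fetch, max_workers=8):
    """Run fetch(url) for every URL on a thread pool and keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as `urls`,
        # even though the individual calls overlap in time.
        return list(executor.map(fetch, urls))


if __name__ == "__main__":
    # Hypothetical stand-in for a real network call (no request is made here).
    def fake_fetch(url):
        return len(url)

    print(download_all(["https://example.com/a.gif", "https://example.com/bb.gif"], fake_fetch))
```

Passing the fetch function in as an argument keeps the pool logic separate from the HTTP code, which makes it easy to try different worker counts.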

Connection pool is full, discarding connection with ThreadPoolExecutor and multiple headless browsers through Selenium and Python

时光毁灭记忆、已成空白 submitted on 2020-07-30 07:28:06
Question: I'm writing some automation software using selenium==3.141.0, python 3.6.7, and chromedriver 2.44. Most of the logic can be executed by a single browser instance, but for some parts I have to launch 10-20 instances to get a decent execution speed. Once execution reaches the part run by ThreadPoolExecutor, browser interactions start throwing this error:

WARNING|05/Dec/2018 17:33:11|connectionpool|_put_conn|274|Connection pool is full, discarding connection: 127.0.0.1 WARNING
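For context, urllib3 logs this warning when a connection is handed back to a pool that already holds its maximum number of idle connections; the extra connection is closed rather than reused. The sketch below shows the two urllib3 settings involved, maxsize and block, on a plain PoolManager. The worker count of 20 and the helper name make_manager are assumptions for illustration, and whether or how Selenium 3.141.0 exposes these settings is not covered here.

```python
import urllib3

WORKERS = 20  # assumed number of threads sharing one pool


def make_manager(workers=WORKERS):
    # maxsize: how many idle connections each per-host pool may keep.
    # block=True: extra threads wait for a free connection instead of
    # opening a temporary one that is discarded afterwards.
    return urllib3.PoolManager(maxsize=workers, block=True)


manager = make_manager()
# The keyword arguments are stored and applied to every per-host pool.
print(manager.connection_pool_kw)
```

Sizing the pool to the number of worker threads means every thread can return its connection without overflowing the pool.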

MaxRetryError: HTTPConnectionPool: Max retries exceeded (Caused by ProtocolError('Connection aborted.', error(111, 'Connection refused')))

我们两清 submitted on 2020-07-23 06:48:07
Question: I want to test "select" and "input". Can I write it like the code below?

Original code:

class Sinaselecttest(unittest.TestCase):

    def setUp(self):
        binary = FirefoxBinary('/usr/local/firefox/firefox')
        self.driver = webdriver.Firefox(firefox_binary=binary)

    def test_select_in_sina(self):
        driver = self.driver
        driver.get("https://www.sina.com.cn/")
        try:
            WebDriverWait(driver,30).until(
                ec.visibility_of_element_located((By.XPATH,"/html/body/div[9

Python urllib3: close idle connection after some time

怎甘沉沦 submitted on 2020-07-08 11:00:57
Question: Is there a way to tell Python urllib3 not to reuse idle connections after some period of time, and to close them instead? Looking through https://urllib3.readthedocs.io/en/latest/reference/index.html#module-urllib3.connectionpool doesn't turn up anything relevant.

Answer 1: Remember: a connection pool is a cache of database connections maintained so that the connections can be "reused" when future requests to the database are required. You can do this in many ways (I guess): Set retries to one.
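One possible approach, sketched here under assumptions rather than taken from the answer: keep track of when the pool was last used and call PoolManager.clear(), which closes the pooled connections, once the idle period is exceeded. In urllib3 the pooled connections are HTTP connections. The class name IdleExpiringPool, the 60-second default, and the injectable clock are all hypothetical; the sketch is not thread-safe as written.

```python
import time

import urllib3


class IdleExpiringPool:
    """Wrap a PoolManager and drop its connections after an idle period."""

    def __init__(self, max_idle_seconds=60.0, clock=time.monotonic, **pool_kwargs):
        self._manager = urllib3.PoolManager(**pool_kwargs)
        self._max_idle = max_idle_seconds
        self._clock = clock
        self._last_used = clock()
        self.clears = 0  # how many times the pool was emptied

    def _expire_if_idle(self):
        now = self._clock()
        if now - self._last_used > self._max_idle:
            # clear() empties the store of pools and closes their connections,
            # so the next request opens a fresh connection.
            self._manager.clear()
            self.clears += 1
        self._last_used = now

    def request(self, method, url, **kwargs):
        self._expire_if_idle()
        return self._manager.request(method, url, **kwargs)
```

The check only runs when a request is made, so idle sockets stay open until the next call; closing them on a timer would need a background thread and a lock.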

Python requests is slow and takes very long to complete HTTP or HTTPS request

纵饮孤独 submitted on 2020-06-30 23:52:53
Question: When requesting a web resource, website, or web service with the requests library, the request takes a long time to complete. The code looks similar to the following:

import requests
requests.get("https://www.example.com/")

This request takes over 2 minutes (exactly 2 minutes 10 seconds) to complete! Why is it so slow and how can I fix it?

Answer 1: There are multiple possible solutions to this problem. There are a multitude of answers on StackOverflow for any of these, so I will try to
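Whatever the underlying cause turns out to be, a useful first step is to measure the request with explicit timeouts so that a stalled connection fails quickly instead of hanging. The sketch below is an illustration only: the helper name timed_get and the timeout values are assumptions, not part of the original answer. requests accepts a (connect, read) tuple for its timeout parameter.

```python
import time

import requests


def timed_get(url, connect_timeout=3.05, read_timeout=10.0, session=None):
    """Issue a GET with explicit timeouts and report the outcome and duration."""
    session = session or requests.Session()
    start = time.perf_counter()
    try:
        # timeout=(connect, read): fail fast if the TCP connection stalls,
        # and separately bound the wait for response data.
        response = session.get(url, timeout=(connect_timeout, read_timeout))
        outcome = f"HTTP {response.status_code}"
    except requests.exceptions.RequestException as exc:
        outcome = type(exc).__name__
    return outcome, time.perf_counter() - start
```

Calling timed_get("https://www.example.com/") returns a short outcome string together with the elapsed seconds, which helps tell a slow connect phase apart from a slow response.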

Multipart form encoding and posting with urllib3

大憨熊 submitted on 2020-06-28 07:16:59
Question: I'm attempting to upload a CSV file to this site. However, I've encountered a few issues, and I think they stem from an incorrect mimetype (maybe). I'm attempting to manually post the file via urllib2, so my code looks as follows:

import urllib
import urllib2
import mimetools, mimetypes
import os, stat
from cStringIO import StringIO

#============================
# Note: I found this recipe online. I can't remember where exactly though..
#=============================
class Callable:
    def _
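As the post title suggests, urllib3 can build the multipart body itself, which avoids the hand-written urllib2 recipe. The sketch below uses urllib3.filepost.encode_multipart_formdata with an explicit text/csv mimetype. The helper name build_csv_upload, the field name "file", and the file name data.csv are hypothetical, since the target site and its form fields are not given in the post.

```python
from urllib3.filepost import encode_multipart_formdata


def build_csv_upload(field_name, filename, csv_bytes):
    """Encode one CSV file as a multipart/form-data body."""
    # A (filename, data, mimetype) tuple sets the part's Content-Type explicitly.
    fields = {field_name: (filename, csv_bytes, "text/csv")}
    body, content_type = encode_multipart_formdata(fields)
    return body, content_type


# Hypothetical field name and file contents, for illustration only.
body, content_type = build_csv_upload("file", "data.csv", b"a,b\n1,2\n")
print(content_type)
```

The returned content_type carries the generated boundary and belongs in the request's Content-Type header. Passing the same fields mapping to a PoolManager's request("POST", url, fields=...) call performs this encoding automatically.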
