I'm scraping Bet365, probably one of the trickiest websites I've encountered, with Selenium and Chrome. The issue with this page is that, even though my scraper takes s
I don't see any significant issue in either your approach or your code block. However, another approach would be to make use of all the proxies in the Free Proxy List, whose Last Checked column shows how recently each entry was verified.
As a solution, you can write a script that grabs all the available proxies and builds the list dynamically every time your program starts. The following program tries the proxies from that list one by one until a proxied connection is established and verified by the page title of https://www.bet365.es containing the text bet365. An exception may arise when the free proxy your program picked is overloaded with users trying to route their traffic through it.
Code Block:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# executable_path is the Selenium 3 form; Selenium 4 expects service=Service(path) instead
chromedriver = r'C:\WebDrivers\chromedriver.exe'

# Step 1: read the IP Address and Port columns of the proxy table
driver = webdriver.Chrome(executable_path=chromedriver)
driver.get("https://sslproxies.org/")
driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table[@class='table table-striped table-bordered dataTable']//th[contains(., 'IP Address')]"))))
ips = [my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@class='table table-striped table-bordered dataTable']//tbody//tr[@role='row']/td[position() = 1]")))]
ports = [my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@class='table table-striped table-bordered dataTable']//tbody//tr[@role='row']/td[position() = 2]")))]
driver.quit()
proxies = [ip + ':' + port for ip, port in zip(ips, ports)]
print(proxies)

# Step 2: try each proxy until the bet365 page loads through it
for proxy in proxies:
    try:
        print("Proxy selected: {}".format(proxy))
        options = webdriver.ChromeOptions()
        options.add_argument('--proxy-server={}'.format(proxy))
        driver = webdriver.Chrome(options=options, executable_path=chromedriver)
        driver.get("https://www.bet365.es")
        # until() returns True once the title contains "bet365", otherwise it raises TimeoutException
        if WebDriverWait(driver, 20).until(EC.title_contains("bet365")):
            # Do your scraping here
            break
    except Exception:
        # Dead or overloaded proxy: close this browser and move on to the next one
        driver.quit()
print("Proxy was Invoked")
Console Output:
['190.7.158.58:39871', '175.139.179.65:54980', '186.225.45.146:45672', '185.41.99.100:41258', '43.230.157.153:52986', '182.23.32.66:30898', '36.37.160.253:31450', '93.170.15.214:56305', '36.67.223.67:43628', '78.26.172.44:52490', '36.83.135.183:3128', '34.74.180.144:3128', '206.189.122.177:3128', '103.194.192.42:55546', '70.102.86.204:8080', '117.254.216.97:23500', '171.100.221.137:8080', '125.166.176.153:8080', '185.146.112.24:8080', '35.237.104.97:3128']
Proxy selected: 190.7.158.58:39871
Proxy selected: 175.139.179.65:54980
Proxy selected: 186.225.45.146:45672
Proxy selected: 185.41.99.100:41258
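
If you want to test the list-building and retry logic without opening a browser, the two steps can be separated from Selenium. The following is only a minimal sketch: build_proxy_list and first_working_proxy are hypothetical helper names, not part of Selenium, and check stands for whatever verification you plug in (for example, a function that starts Chrome with the proxy and waits for the bet365 title).

```python
from typing import Callable, Iterable, List, Optional


def build_proxy_list(ips: Iterable[str], ports: Iterable[str]) -> List[str]:
    """Pair scraped IP and port cells into 'ip:port' strings, skipping bad rows."""
    proxies = []
    for ip, port in zip(ips, ports):
        ip, port = ip.strip(), port.strip()
        # Keep only rows with a non-empty IP and a numeric port in the valid range.
        if ip and port.isdigit() and 0 < int(port) < 65536:
            proxies.append(f"{ip}:{port}")
    return proxies


def first_working_proxy(proxies: Iterable[str],
                        check: Callable[[str], bool]) -> Optional[str]:
    """Return the first proxy for which check() returns True, or None."""
    for proxy in proxies:
        try:
            if check(proxy):
                return proxy
        except Exception:
            # An overloaded or dead proxy raised an error: try the next one.
            continue
    return None
```

With this split, the Selenium part shrinks to a single check function, and a failed proxy can never leave the loop in a half-initialized state.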