Clicking on Get Data button for Monthly Settlement Statistics on nseindia.com doesn't fetch results using Selenium and Python

感情迁移 submitted on 2020-04-24 05:49:27

Question


I am trying to scrape data from here.

After selecting Capital Market and the year 2019-2020, I want to click Get Data.

I have used the following code:

driver = webdriver.Chrome(executable_path=chrome_path,options=chrome_options)

driver.get( nse_cash_keystats_page )


driver.find_element_by_xpath( "//select[@id='h_filetype']/option[text()='Capital Market ']" ).click()

driver.find_element_by_xpath( "//select[@id='yearField']/option[text()='2019-2020']" ).click()

downloadButton=WebDriverWait(driver,20).until(EC.element_to_be_clickable((By.XPATH,'//input[@type="image"][@src="/common/images/btn-get-data.gif"]')))

driver.execute_script("arguments[0].click();", downloadButton)

With the above code I am able to click Get Data, but no output is shown.

Please help me. Thanks in advance.


Answer 1:


I took your code, added a few tweaks, and ran the test as follows:

  • Code Block:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait, Select  # Select is used below for the dropdowns
    from selenium.webdriver.support import expected_conditions as EC
    
    options = webdriver.ChromeOptions() 
    options.add_argument("start-maximized")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
    driver.get('https://www1.nseindia.com/products/content/equities/equities/eq_monthly_statistics.htm')
    Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"select#h_filetype")))).select_by_visible_text("Capital Market ")
    Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"select#yearField")))).select_by_visible_text("2019-2020")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input.getdata-button#get[type='image'][src^='/common/images/btn-get-data.gif']"))).click()
    

Observation

Like you, I hit the same roadblock: the click goes through, but no results are displayed.


Deep Dive

It seems the click() on the element with the text Get Data does happen. But if you inspect the DOM tree of the webpage, you will find that some of the <script> tags refer to JavaScript files whose URLs contain the keyword akam. As an example:

  • <script type="text/javascript" src="https://www1.nseindia.com/akam/11/52349752" defer=""></script>
  • <noscript><img src="https://www1.nseindia.com/akam/11/pixel_52349752?a=dD01ZDZiMTA5OGQ0MDljYTYxN2RjMjc3MzBlN2YwMDQ0NjlkZDNiNTMzJmpzPW9mZg==" style="visibility: hidden; position: absolute; left: -999px; top: -999px;" /></noscript>

This is a clear indication that the website is protected by Bot Manager, an advanced bot detection service provided by Akamai, and that the response is being blocked.
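The manual DOM inspection described above can be approximated in code. The following is only a minimal sketch, assuming that an "/akam/" path segment in a <script> or <img> src is a usable hint of Akamai Bot Manager; the class AkamResourceFinder and the function find_akam_resources are hypothetical names introduced for illustration, and a match is a heuristic signal rather than proof.

```python
from html.parser import HTMLParser


class AkamResourceFinder(HTMLParser):
    """Collects the src of <script> and <img> tags whose URL contains '/akam/'.

    Hypothetical helper: the '/akam/' marker is taken from the example
    tags quoted in this answer and is only a heuristic.
    """

    def __init__(self):
        super().__init__()
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # Only script and image tags are of interest here.
        if tag not in ("script", "img"):
            return
        src = dict(attrs).get("src") or ""
        if "/akam/" in src:
            self.matches.append(src)


def find_akam_resources(html):
    """Return every matching src URL found in the given HTML string."""
    finder = AkamResourceFinder()
    finder.feed(html)
    return finder.matches


if __name__ == "__main__":
    sample = (
        '<script type="text/javascript" '
        'src="https://www1.nseindia.com/akam/11/52349752" defer=""></script>'
        '<script src="/js/app.js"></script>'
    )
    for url in find_akam_resources(sample):
        print("Possible Akamai resource:", url)
```

With a live Selenium session, the same function could be fed driver.page_source after the page has loaded. An empty result does not rule out bot detection; it only means this particular marker was not found.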


Bot Manager

As per the article Bot Manager - Foundations:


Conclusion

So it can be concluded that the request for the data is detected as coming from a Selenium-driven WebDriver instance, and the response is blocked.


References

Relevant documentation:

  • Bot Manager
  • Bot Manager : Foundations

tl;dr

A couple of relevant discussions:

  • Unable to use Selenium to automate Chase site login


Source: https://stackoverflow.com/questions/59872920/clicking-on-get-data-button-for-monthly-settlement-statistics-on-nseindia-com-do
