Wait until page is loaded with Selenium WebDriver for Python

借酒劲吻你 2020-11-22 00:26

I want to scrape all the data from a page that uses infinite scroll. The following Python code works.

for i in range(100):
    driver.execute_script(\"w         
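
The loop above is cut off in the source, so the following is not the asker's code. It is only a minimal sketch of one common way to drive an infinite-scroll page: scroll to the bottom, pause, and stop once the page height no longer grows. The function name `scroll_until_stable` and the parameters `pause` and `max_rounds` are assumptions made for illustration; `max_rounds=100` simply mirrors the `range(100)` in the question.

```python
import time


def scroll_until_stable(driver, pause=1.0, max_rounds=100):
    """Scroll to the bottom until document.body.scrollHeight stops growing.

    `driver` is assumed to be a Selenium WebDriver (anything that exposes
    execute_script works). Returns the number of scrolls performed.
    """
    get_height = "return document.body.scrollHeight"
    last_height = driver.execute_script(get_height)

    for round_no in range(1, max_rounds + 1):
        # Jump to the current bottom of the page to trigger the next load.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Fixed pause: an assumption, tune it to how slowly the site loads.
        time.sleep(pause)

        new_height = driver.execute_script(get_height)
        if new_height == last_height:
            # Height did not change after a scroll: assume nothing more loads.
            return round_no
        last_height = new_height

    # Gave up after max_rounds scrolls; the page may still be growing.
    return max_rounds
```

A fixed pause is still a guess about load time, which is why the answers discuss more robust ways to tell when new content has finished loading.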


        
12 Answers
  •  情书的邮戳
    2020-11-22 00:43

    A solution for AJAX pages that continuously load data. The previously stated methods do not work. What we can do instead is grab the page DOM, hash it, and compare the old and new hash values taken a fixed time interval apart.

    import time
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    
    def page_has_loaded(driver, sleep_time=2):
        '''
        Waits for the page to finish loading by comparing hashes of the DOM
        taken sleep_time seconds apart.
        '''
    
        def get_page_hash(driver):
            '''
            Returns a hash of the page's HTML DOM
            '''
            # can find the element by either the 'html' tag or by the html 'root' id
            # (the find_element_by_* helpers were removed in later Selenium 4 releases)
            dom = driver.find_element(By.TAG_NAME, 'html').get_attribute('innerHTML')
            # dom = driver.find_element(By.ID, 'root').get_attribute('innerHTML')
            dom_hash = hash(dom.encode('utf-8'))
            return dom_hash
    
        page_hash = get_page_hash(driver)
        
        # compare old and new DOM hashes until two consecutive snapshots match
        while True:
            time.sleep(sleep_time)
            page_hash_new = get_page_hash(driver)
            if page_hash == page_hash_new:
                break
            # only report "not loaded" when the DOM actually changed
            print(' - page not loaded')
            page_hash = page_hash_new
    
        print(' - page loaded: {}'.format(driver.current_url))
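
One gap in the hash-comparison approach is that it never gives up on a page whose DOM keeps changing (for example, a ticking clock or a rotating banner). The following is a minimal sketch of the same idea with a timeout added, written against a plain callable so it can be tried without a browser. The name `wait_until_stable` and the parameters `get_dom`, `interval`, and `timeout` are assumptions for illustration, not part of Selenium.

```python
import hashlib
import time


def wait_until_stable(get_dom, interval=2.0, timeout=30.0):
    """Poll get_dom() until two consecutive snapshots have the same hash.

    get_dom is any zero-argument callable returning the page markup as a str.
    Returns True if the content stabilised, False if `timeout` seconds
    elapsed first.
    """
    def digest(text):
        # sha256 gives a digest that is stable across runs, unlike built-in
        # hash(), which is salted per process for str and bytes.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    deadline = time.monotonic() + timeout
    previous = digest(get_dom())

    while time.monotonic() < deadline:
        time.sleep(interval)
        current = digest(get_dom())
        if current == previous:
            return True
        previous = current

    return False


# Hypothetical usage with a Selenium driver (page_source is a real
# WebDriver property; the rest is an assumption about your setup):
# loaded = wait_until_stable(lambda: driver.page_source, interval=2, timeout=60)
```

Returning a boolean instead of printing lets the caller decide what to do with a page that never settles.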
    
