Do not want the Images to load and CSS to render on Firefox in Selenium WebDriver - Python

长情又很酷 2020-11-29 00:37

I am using Selenium 2 with the Python bindings to fetch some data from our partner's site, but on average the operation takes around 13 seconds.

I

6 Answers
  • 2020-11-29 00:52

    Tossing in my 2¢.

    It is better to do this with a JavaScript snippet:

    driver.execute_script(
        'document.querySelectorAll("img").forEach(function(img){img.remove()});'
    )
    

    That removes the img elements from the DOM. If you run it immediately after the page loads, the images have little chance to download their data.

    Here is a similar solution I found elsewhere on Stack Overflow (I can't find the link anymore). It removes the document's head element, and with it any stylesheets declared there:

    driver.execute_script(
        "document.head.parentNode.removeChild(document.head)"
    )
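
The two snippets above can be folded into one helper. This is only a sketch: `strip_images_and_styles` and `STRIP_JS` are hypothetical names, and `driver` is assumed to be any object exposing Selenium's `execute_script` method. Because it runs after the page has loaded, it may not stop downloads that have already started.

```python
# JavaScript that removes <img> elements and stylesheet nodes from the DOM,
# then reports how many <img> elements are left.
STRIP_JS = """
document.querySelectorAll('img, link[rel="stylesheet"], style')
    .forEach(function (node) { node.remove(); });
return document.querySelectorAll('img').length;
"""


def strip_images_and_styles(driver):
    """Run STRIP_JS and return the number of <img> elements still present.

    `driver` is assumed to expose Selenium's execute_script(script) method.
    """
    return driver.execute_script(STRIP_JS)
```

Call it right after `driver.get(url)`, for example `remaining = strip_images_and_styles(driver)`.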
    
  • 2020-11-29 00:54

    You can disable images and CSS with the Web Developer toolbar add-on:

    https://addons.mozilla.org/en-US/firefox/addon/web-developer/

    Go to CSS -> Disable and Images -> Disable.

  • 2020-11-29 00:57

    I have figured out a way to prevent Firefox from loading CSS, images and Flash.

    from selenium import webdriver
    from selenium.webdriver.firefox.firefox_profile import FirefoxProfile

    def disableImages(self):
        ## Method of the class that owns the browser handle
        ## Get the Firefox profile object
        firefoxProfile = FirefoxProfile()
        ## Disable CSS
        firefoxProfile.set_preference('permissions.default.stylesheet', 2)
        ## Disable images
        firefoxProfile.set_preference('permissions.default.image', 2)
        ## Disable Flash (a boolean preference, so pass False, not the string 'false')
        firefoxProfile.set_preference('dom.ipc.plugins.enabled.libflashplayer.so',
                                      False)
        ## Set the modified profile while creating the browser object
        self.browserHandle = webdriver.Firefox(firefoxProfile)
    

    Thanks again @Simon and @ernie for your suggestions.
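
For readers on a newer Selenium release, the same preferences can be set on a Firefox `Options` object instead of a profile. The sketch below is an assumption-laden illustration, not the answer's own method: `LIGHTWEIGHT_PREFS`, `apply_prefs` and `make_driver` are hypothetical names, and whether a given Firefox release still honours these preferences is not guaranteed.

```python
# Preference names are taken from the answer above.
LIGHTWEIGHT_PREFS = {
    "permissions.default.stylesheet": 2,  # 2 = block CSS
    "permissions.default.image": 2,       # 2 = block images
    "dom.ipc.plugins.enabled.libflashplayer.so": False,  # disable Flash
}


def apply_prefs(target, prefs=LIGHTWEIGHT_PREFS):
    """Copy each preference onto `target` via set_preference(name, value).

    `target` may be a FirefoxProfile or a Firefox Options object; both
    expose set_preference.
    """
    for name, value in prefs.items():
        target.set_preference(name, value)
    return target


def make_driver():
    """Start Firefox with the preferences applied (needs Selenium installed)."""
    # Imported lazily so the helpers above work without Selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    return webdriver.Firefox(options=apply_prefs(Options()))
```

Keeping the preference table separate from the driver factory makes it easy to try a subset of preferences when one of them stops working.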

  • 2020-11-29 00:59

    Unfortunately, the option firefox_profile.set_preference('permissions.default.image', 2) no longer seems to disable images in the latest version of Firefox (for the reason, see Alecxe's answer to my question "Can't turn off images in Selenium / Firefox").

    The best solution I found was the Firefox extension QuickJava, which among other things can disable images: https://addons.mozilla.org/en-us/firefox/addon/quickjava/

    My Python code:

    import os
    from selenium import webdriver

    firefox_profile = webdriver.FirefoxProfile()

    ## folder_xpi_file_saved_in is the folder where the downloaded .xpi file was saved
    firefox_profile.add_extension(os.path.join(folder_xpi_file_saved_in, "quickjava-2.0.6-fx.xpi"))
    firefox_profile.set_preference("thatoneguydotnet.QuickJava.curVersion", "2.0.6.1")  ## Prevents loading the 'thank you for installing' screen
    firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Images", 2)  ## Turns images off
    firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.AnimatedImage", 2)  ## Turns animated images off

    driver = webdriver.Firefox(firefox_profile)
    driver.get(web_address_desired)  ## web_address_desired is the URL to open
    

    Disabling CSS (and, I think, Flash) still works with the Firefox preferences, but these and other features can also be switched off by adding the lines:

    firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.CSS", 2)  ## CSS
    firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Cookies", 2)  ## Cookies
    firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Flash", 2)  ## Flash
    firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Java", 2)  ## Java
    firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.JavaScript", 2)  ## JavaScript
    firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Silverlight", 2)  ## Silverlight
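
Since every QuickJava entry shares the same prefix, the repeated lines can be generated from a list of feature names. A minimal sketch, assuming the value 2 means "off" as in the answer above; `quickjava_prefs` and `disable_features` are hypothetical helper names, not part of QuickJava or Selenium.

```python
QUICKJAVA_PREFIX = "thatoneguydotnet.QuickJava.startupStatus."


def quickjava_prefs(features, status=2):
    """Map feature names (e.g. "Images", "CSS") to QuickJava preference entries.

    The default status 2 is the value the answer uses to switch a feature off.
    """
    return {QUICKJAVA_PREFIX + feature: status for feature in features}


def disable_features(profile, features):
    """Apply the generated entries to any object with set_preference()."""
    for name, value in quickjava_prefs(features).items():
        profile.set_preference(name, value)
    return profile
```

For example, `disable_features(firefox_profile, ["Images", "AnimatedImage", "CSS", "Flash"])` sets the same entries as the corresponding lines above.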
    
  • 2020-11-29 01:10

    New Edit

    It has been a long time since I wrote this, and the field of web automation (whether for testing or for crawling/scraping) has changed a lot. The major browsers now offer a --headless flag and even an interactive shell, so there is no need to change the good old DISPLAY variable on Linux anymore.

    Firefox has also changed, adopting components from the Servo engine, which is written in Rust. I tried the profile below with a more recent version (specifically, 62.0). Some preferences worked, some did not. Keep that in mind.


    I'm just extending kyrenia's answer to this question. However, disabling CSS might prevent jQuery from manipulating DOM elements. Use QuickJava together with the preferences below:

    profile.set_preference("network.http.pipelining", True)
    profile.set_preference("network.http.proxy.pipelining", True)
    profile.set_preference("network.http.pipelining.maxrequests", 8)
    profile.set_preference("content.notify.interval", 500000)
    profile.set_preference("content.notify.ontimer", True)
    profile.set_preference("content.switch.threshold", 250000)
    profile.set_preference("browser.cache.memory.capacity", 65536) # Increase the cache capacity.
    profile.set_preference("browser.startup.homepage", "about:blank")
    profile.set_preference("reader.parse-on-load.enabled", False) # Disable reader, we won't need that.
    profile.set_preference("browser.pocket.enabled", False) # Disable Pocket too.
    profile.set_preference("loop.enabled", False)
    profile.set_preference("browser.chrome.toolbar_style", 1) # Text on Toolbar instead of icons
    profile.set_preference("browser.display.show_image_placeholders", False) # Don't show placeholders for images that have not loaded.
    profile.set_preference("browser.display.use_document_colors", False) # Don't show document colors.
    profile.set_preference("browser.display.use_document_fonts", 0) # Don't load document fonts.
    profile.set_preference("browser.display.use_system_colors", True) # Use system colors.
    profile.set_preference("browser.formfill.enable", False) # Autofill on forms disabled.
    profile.set_preference("browser.helperApps.deleteTempFileOnExit", True) # Delete temporary files.
    profile.set_preference("browser.shell.checkDefaultBrowser", False)
    profile.set_preference("browser.startup.page", 0) # blank
    profile.set_preference("browser.tabs.forceHide", True) # Disable tabs, We won't need that.
    profile.set_preference("browser.urlbar.autoFill", False) # Disable autofill on URL bar.
    profile.set_preference("browser.urlbar.autocomplete.enabled", False) # Disable autocomplete on URL bar.
    profile.set_preference("browser.urlbar.showPopup", False) # Disable list of URLs when typing on URL bar.
    profile.set_preference("browser.urlbar.showSearch", False) # Disable search bar.
    profile.set_preference("extensions.checkCompatibility", False) # Skip the add-on compatibility check.
    profile.set_preference("extensions.checkUpdateSecurity", False)
    profile.set_preference("extensions.update.autoUpdateEnabled", False)
    profile.set_preference("extensions.update.enabled", False)
    profile.set_preference("general.startup.browser", False)
    profile.set_preference("plugin.default_plugin_disabled", False)
    profile.set_preference("permissions.default.image", 2) # Image load disabled again
    

    What does it do? The comments on each line explain most of it. I also found a couple of about:config entries that improve performance. For example, the code above does not apply the document's fonts or colors, but it still loads CSS, so jQuery (or any other library) can manipulate DOM elements without raising an error. (In more detail: the CSS is still downloaded, but the browser skips declarations that set a specific font-family or color. It downloads and loads the CSS, uses the system defaults for styling, and renders the page faster.)

    For more information, check out this article.


    Edit (Tests)

    I just ran a performance test. Don't take the results too seriously: I ran it only once, to give you a rough idea.

    I ran the test on an old machine with a 2.2 GHz Intel Pentium processor, 3 GB of RAM and 4 GB of swap, running Ubuntu 14.04 x64.

    The test has three steps:

    • Driver Loading Performance: the time in seconds to load the driver in the webdriver module.
    • Page Loading Performance: the time in seconds to load the page. This depends on the connection speed, but it also includes rendering.
    • DOM Inspecting Performance: the time taken to inspect the DOM on the page.

    I used this page as the subject and inspected .xxy a as the CSS selector. Then I ran each of the following setups one at a time.
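
The three measurements could be collected with a small harness like the one below. This is a sketch under stated assumptions, not the script used for the numbers in this answer: `timed` and `run_benchmark` are hypothetical names, `make_driver` is a caller-supplied factory that returns a Selenium driver, and the URL and selector are whatever you pass in.

```python
import time


def timed(step):
    """Call step() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = step()
    return result, time.perf_counter() - start


def run_benchmark(make_driver, url, selector):
    """Time driver start-up, page load and a CSS-selector lookup.

    make_driver: zero-argument callable returning a Selenium driver.
    """
    driver, driver_loading = timed(make_driver)
    try:
        _, page_loading = timed(lambda: driver.get(url))
        # "css selector" is the string value of Selenium's By.CSS_SELECTOR.
        elements, dom_inspecting = timed(
            lambda: driver.find_elements("css selector", selector)
        )
    finally:
        # Always close the browser, even if a step raises.
        driver.quit()
    return {
        "driver_loading": driver_loading,
        "page_loading": page_loading,
        "dom_inspecting": dom_inspecting,
        "matches": len(elements),
    }
```

Running it several times per setup and averaging the results would give more trustworthy figures than a single run.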

    Selenium, Firefox, No Profile

    Driver Loading Performance: 13.124099016189575
    Page Loading Performance: 3.2673521041870117
    DOM Inspecting Performance: 67.82778096199036
    

    Selenium, Firefox, Profile Above

    Driver Loading Performance: 7.535895824432373
    Page Loading Performance: 2.9704301357269287
    DOM Inspecting Performance: 64.25136017799377
    

    Edit (About Headlessness)

    I ran a test about a month ago but could not keep the results. Still, I want to mention that driver loading, page loading and DOM inspecting times all drop below ten seconds when Firefox runs headless. That was really cool.
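
For reference, headless mode can be requested by passing Firefox the `-headless` argument. The sketch below assumes a Selenium release that provides `selenium.webdriver.firefox.options.Options`; `enable_headless` and `make_headless_driver` are hypothetical helper names.

```python
def enable_headless(options):
    """Ask Firefox to start without a visible window.

    `options` is assumed to be a Firefox Options object, or any object
    with an add_argument(str) method.
    """
    options.add_argument("-headless")
    return options


def make_headless_driver():
    """Start a headless Firefox session (needs Selenium installed)."""
    # Imported lazily so enable_headless can be used without Selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    return webdriver.Firefox(options=enable_headless(Options()))
```

With this in place there is no need for a virtual display or a custom DISPLAY variable on Linux.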

  • 2020-11-29 01:14

    For everyone still interested in the original, straightforward approach suggested by Anupam:

    Just install Firefox version 20.0.1 (https://ftp.mozilla.org/pub/firefox/releases/20.0.1/); it works perfectly fine.

    Other versions may work as well (versions 32 and higher and versions 3.6.9 and lower do NOT work).
