I am using Selenium 2 with the Python bindings to fetch some data from our partner's site, but on average this operation takes around 13 seconds.
Tossing in my 2¢.
It is better to use a JavaScript snippet to accomplish this.
driver.execute_script(
    'document.querySelectorAll("img").forEach(function(ev){ev.remove()});'
)
That removes the img elements. If you run it right after the page loads, the images have little chance to download their data.
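As a minimal sketch of the same idea, the snippet can be wrapped in a small helper that takes the CSS selector as a parameter. The names `REMOVE_SCRIPT` and `remove_elements` are hypothetical and only for illustration; the one Selenium call used is `driver.execute_script(script, *args)`, whose extra arguments show up in JavaScript as `arguments[0]`, `arguments[1]`, and so on.

```python
# Hypothetical helper, not part of Selenium: removes matching elements
# through the driver's execute_script call.
REMOVE_SCRIPT = """
var nodes = document.querySelectorAll(arguments[0]);
nodes.forEach(function (node) { node.remove(); });
return nodes.length;
"""


def remove_elements(driver, selector="img"):
    """Remove every element matching `selector`; return how many were removed."""
    # The selector is passed as a script argument instead of being
    # formatted into the JavaScript source string.
    return driver.execute_script(REMOVE_SCRIPT, selector)
```

Calling `remove_elements(driver)` right after `driver.get(url)` mirrors the snippet above. With the default page load strategy, `driver.get` usually returns only after the page's load event, so how much download time this saves probably depends on the page.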
Here is a similar solution I found elsewhere on Stack Overflow (I can't find it anymore). It removes the document's head element:
driver.execute_script(
    "document.head.parentNode.removeChild(document.head)"
)
You can disable images and CSS using the Web Developer toolbar add-on:
https://addons.mozilla.org/en-US/firefox/addon/web-developer/
Go to CSS -> Disable and Images -> Disable.
I have figured out a way to prevent Firefox from loading CSS, images and Flash.
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
def disableImages(self):
    ## get the Firefox profile object
    firefoxProfile = FirefoxProfile()
    ## Disable CSS
    firefoxProfile.set_preference('permissions.default.stylesheet', 2)
    ## Disable images
    firefoxProfile.set_preference('permissions.default.image', 2)
    ## Disable Flash (a boolean preference, so pass False rather than the string 'false')
    firefoxProfile.set_preference('dom.ipc.plugins.enabled.libflashplayer.so', False)
    ## Set the modified profile while creating the browser object
    self.browserHandle = webdriver.Firefox(firefoxProfile)
Thanks again @Simon and @ernie for your suggestions.
Unfortunately, the option firefox_profile.set_preference('permissions.default.image', 2) no longer seems to disable images in the latest version of Firefox (for the reason, see Alecxe's answer to my question "Can't turn off images in Selenium / Firefox").
The best solution I found was to use the Firefox extension QuickJava, which among other things can disable images: https://addons.mozilla.org/en-us/firefox/addon/quickjava/
My Python code:
from selenium import webdriver
firefox_profile = webdriver.FirefoxProfile()
firefox_profile.add_extension(folder_xpi_file_saved_in + "\\quickjava-2.0.6-fx.xpi")
firefox_profile.set_preference("thatoneguydotnet.QuickJava.curVersion", "2.0.6.1") ## Prevents loading the 'thank you for installing screen'
firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Images", 2) ## Turns images off
firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.AnimatedImage", 2) ## Turns animated images off
driver = webdriver.Firefox(firefox_profile)
driver.get(web_address_desired)
Disabling CSS (and, I think, Flash) still works with Firefox preferences, but these and other features can also be switched off by adding the lines:
firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.CSS", 2) ## CSS
firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Cookies", 2) ## Cookies
firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Flash", 2) ## Flash
firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Java", 2) ## Java
firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.JavaScript", 2) ## JavaScript
firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Silverlight", 2)
It has been a long time since I wrote this, and the field of web automation (for testing or for crawling/scraping) has changed a lot. The major browsers now offer a --headless flag and even an interactive shell. There is no longer any need to change the good old DISPLAY variable on Linux.
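A minimal sketch of passing the --headless flag through Selenium, assuming a Selenium version whose options objects expose `add_argument` and whose `webdriver.Firefox` accepts an `options=` keyword. The helper names `enable_headless` and `make_headless_firefox` are hypothetical.

```python
def enable_headless(options, flag="--headless"):
    """Add the headless flag to a Selenium options object and return it."""
    options.add_argument(flag)
    return options


def make_headless_firefox():
    """Start Firefox without a visible window (requires Selenium and geckodriver)."""
    # Imported lazily so enable_headless can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    return webdriver.Firefox(options=enable_headless(Options()))
```

The same `enable_headless` helper should work with a Chrome options object, since it relies only on `add_argument`.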
Firefox has also changed, adopting components from the Servo engine, which is written in Rust. I tried the profile below with a contemporary version (specifically, 62.0). Some preferences worked, some did not. Keep that in mind.
I'm just extending kyrenia's answer to this question. However, disabling CSS might prevent jQuery from manipulating DOM elements. Use QuickJava along with the preferences below:
profile.set_preference("network.http.pipelining", True)
profile.set_preference("network.http.proxy.pipelining", True)
profile.set_preference("network.http.pipelining.maxrequests", 8)
profile.set_preference("content.notify.interval", 500000)
profile.set_preference("content.notify.ontimer", True)
profile.set_preference("content.switch.threshold", 250000)
profile.set_preference("browser.cache.memory.capacity", 65536) # Increase the cache capacity.
profile.set_preference("browser.startup.homepage", "about:blank")
profile.set_preference("reader.parse-on-load.enabled", False) # Disable reader, we won't need that.
profile.set_preference("browser.pocket.enabled", False) # Duck pocket too!
profile.set_preference("loop.enabled", False)
profile.set_preference("browser.chrome.toolbar_style", 1) # Text on Toolbar instead of icons
profile.set_preference("browser.display.show_image_placeholders", False) # Don't show thumbnails on not loaded images.
profile.set_preference("browser.display.use_document_colors", False) # Don't show document colors.
profile.set_preference("browser.display.use_document_fonts", 0) # Don't load document fonts.
profile.set_preference("browser.display.use_system_colors", True) # Use system colors.
profile.set_preference("browser.formfill.enable", False) # Autofill on forms disabled.
profile.set_preference("browser.helperApps.deleteTempFileOnExit", True) # Delete temporary files.
profile.set_preference("browser.shell.checkDefaultBrowser", False)
profile.set_preference("browser.startup.homepage", "about:blank")
profile.set_preference("browser.startup.page", 0) # blank
profile.set_preference("browser.tabs.forceHide", True) # Disable tabs, We won't need that.
profile.set_preference("browser.urlbar.autoFill", False) # Disable autofill on URL bar.
profile.set_preference("browser.urlbar.autocomplete.enabled", False) # Disable autocomplete on URL bar.
profile.set_preference("browser.urlbar.showPopup", False) # Disable list of URLs when typing on URL bar.
profile.set_preference("browser.urlbar.showSearch", False) # Disable search bar.
profile.set_preference("extensions.checkCompatibility", False) # Addon update disabled
profile.set_preference("extensions.checkUpdateSecurity", False)
profile.set_preference("extensions.update.autoUpdateEnabled", False)
profile.set_preference("extensions.update.enabled", False)
profile.set_preference("general.startup.browser", False)
profile.set_preference("plugin.default_plugin_disabled", False)
profile.set_preference("permissions.default.image", 2) # Image load disabled again
What does it do? You can see what each line does in the comments. I've also found a couple of about:config entries that increase performance. For example, the code above does not load the document's fonts or colors, but it still loads CSS, so jQuery (or any other library) can manipulate DOM elements without raising an error. (To elaborate: the browser still downloads and loads the CSS, but it skips the rules that define a specific font-family or color, so it uses system defaults for styling and renders the page faster.)
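Since some preferences work on a given Firefox version and some do not, it may help to keep them in a dictionary and apply them in a loop, so that individual entries are easy to toggle. This is only a sketch: `SPEED_PREFS` and `apply_preferences` are hypothetical names, the dictionary holds a small subset of the list above, and `target` is assumed to be any object with a `set_preference(name, value)` method, such as a `FirefoxProfile`.

```python
# A small subset of the preferences listed above, for illustration only.
SPEED_PREFS = {
    "permissions.default.image": 2,                # do not load images
    "browser.display.use_document_fonts": 0,       # do not load document fonts
    "browser.display.use_document_colors": False,  # ignore document colors
    "browser.startup.homepage": "about:blank",
}


def apply_preferences(target, prefs):
    """Call target.set_preference for every entry in prefs; return target."""
    for name, value in prefs.items():
        target.set_preference(name, value)
    return target
```

For example, `apply_preferences(profile, SPEED_PREFS)` would replace the corresponding individual `profile.set_preference(...)` lines.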
For more information, check out this article.
I just ran a performance test. Don't take the results too seriously, since I ran it only once; it is just to give you an idea.
I ran the test on an old machine with a 2.2 GHz Intel Pentium processor, 3 GB of RAM, and a 4 GB swap area, running Ubuntu 14.04 x64.
The test takes three steps: loading the driver from the webdriver module, loading the page, and inspecting the DOM.
I used this page as the subject and inspected .xxy a as the CSS selector. Then I used a special process for each, one by one.
Driver Loading Performance: 13.124099016189575
Page Loading Performance: 3.2673521041870117
DOM Inspecting Performance: 67.82778096199036
Driver Loading Performance: 7.535895824432373
Page Loading Performance: 2.9704301357269287
DOM Inspecting Performance: 64.25136017799377
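The original test script is not shown, so the following is only a sketch of how three such measurements might be collected. The function `time_steps` and its result keys are hypothetical; `make_driver` is assumed to be any callable that returns a Selenium driver, and the string "css selector" is the value of Selenium's `By.CSS_SELECTOR` constant.

```python
import time


def time_steps(make_driver, url, selector):
    """Time driver start-up, page load, and a CSS-selector lookup, in seconds."""
    timings = {}

    start = time.perf_counter()
    driver = make_driver()
    timings["driver_loading"] = time.perf_counter() - start

    try:
        start = time.perf_counter()
        driver.get(url)
        timings["page_loading"] = time.perf_counter() - start

        start = time.perf_counter()
        # "css selector" is the same string as selenium's By.CSS_SELECTOR.
        driver.find_elements("css selector", selector)
        timings["dom_inspecting"] = time.perf_counter() - start
    finally:
        # Always close the browser, even if one of the steps fails.
        driver.quit()

    return timings
```

A call might look like `time_steps(webdriver.Firefox, url, ".xxy a")`. A single lookup like this is likely much cheaper than whatever the original DOM inspection did, so the numbers would not be directly comparable.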
I ran a test maybe a month ago, but I could not record the results. However, I want to mention that the driver loading, page loading, and DOM inspecting times drop below ten seconds when Firefox runs headless. That was really cool.
For everyone still interested in using the original straightforward approach suggested by Anupam:
Just install Firefox version 20.0.1 (https://ftp.mozilla.org/pub/firefox/releases/20.0.1/); it works perfectly fine.
Other versions may work as well (versions 32 and higher, and versions 3.6.9 and lower, do NOT work).