How to get the html source of a specific element with selenium?

梦谈多话 2020-12-29 14:46

The page I'm looking at contains:

text 1

text 2

text 3

text 4

4 Answers
  • 2020-12-29 15:08

    What about using jQuery?

    Edit:

    First, include the jQuery .js file on the page; you can get it from www.jQuery.com.

    Then all you need to do is call a simple jQuery selector:

    alert($("div#1").html());
    
  • 2020-12-29 15:10

    The following code will give you the HTML in the div element:

    sel = selenium('localhost', 4444, browser, my_url)
    html = sel.get_eval("this.browserbot.getCurrentWindow().document.getElementById('1').innerHTML")
    

    Then you can use BeautifulSoup to parse the returned HTML and extract the parts you actually want.
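As a rough illustration of that parsing step, here is a minimal sketch that uses Python's standard-library `html.parser` in place of BeautifulSoup, so it runs without extra packages. The `sample` fragment, the helper names, and the choice to skip `h1` are assumptions made for the example, not something taken from the question's actual page.

```python
from html.parser import HTMLParser


class TextOutsideTag(HTMLParser):
    """Collect text nodes, ignoring anything nested inside a given tag."""

    def __init__(self, skip_tag):
        super().__init__()
        self.skip_tag = skip_tag
        self.depth = 0    # number of skip_tag elements currently open
        self.chunks = []  # text pieces found outside skip_tag

    def handle_starttag(self, tag, attrs):
        if tag == self.skip_tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.skip_tag and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.depth == 0:
            self.chunks.append(text)


def text_without(html, skip_tag="h1"):
    """Return the non-empty text pieces of html that are not inside skip_tag."""
    parser = TextOutsideTag(skip_tag)
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical fragment standing in for the innerHTML string returned by Selenium.
sample = "<h1>text 1</h1><p>text 2</p><span>text 3</span><span>text 4</span>"
print(text_without(sample))  # ['text 2', 'text 3', 'text 4']
```

With BeautifulSoup installed, the same idea would normally be a few lines shorter; the standard-library version is used here only to keep the sketch dependency-free.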

    I hope it helps

  • 2020-12-29 15:14

    The selected answer does not work in Python 3 at the time of writing. Instead use this:

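    from selenium import webdriver

    wd = webdriver.Firefox()
    wd.get(url)  # url: address of the page to inspect
    # Double quotes outside, so the single-quoted id inside the JavaScript stays valid
    html = wd.execute_script("return document.getElementById('1').innerHTML")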
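If the element id varies, one option is to pass it to the script as an argument instead of building the JavaScript string by hand, which avoids quoting problems altogether. The sketch below assumes a WebDriver-style object with an `execute_script(script, *args)` method; the helper name `inner_html` is made up for this example.

```python
def inner_html(driver, element_id):
    """Return the innerHTML of the element with this id, or None if it is missing.

    `driver` is assumed to be a Selenium WebDriver instance (or anything that
    offers a compatible execute_script method).
    """
    script = (
        "var el = document.getElementById(arguments[0]);"
        "return el ? el.innerHTML : null;"
    )
    # Extra positional arguments are exposed to the script as arguments[0], ...
    return driver.execute_script(script, element_id)


# Usage with a real browser (needs the selenium package and a browser driver):
#   from selenium import webdriver
#   wd = webdriver.Firefox()
#   wd.get(url)
#   print(inner_html(wd, "1"))
#   wd.quit()
```

Returning `null` from the script when the element is absent means the Python side gets `None` rather than a JavaScript error.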
    
  • 2020-12-29 15:23

    Use XPath. From the docstrings in selenium.py:

    Without an explicit locator prefix, Selenium uses the following default strategies:

    • dom, for locators starting with "document."
    • xpath, for locators starting with "//"
    • identifier, otherwise
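The three default rules quoted above can be restated as a small function. This is only an illustrative sketch of the rule as described in the docstring, not Selenium's own implementation; the function name is hypothetical, and explicit locator prefixes are deliberately not handled.

```python
def default_strategy(locator):
    """Guess which default strategy applies to a locator with no explicit prefix."""
    if locator.startswith("document."):
        return "dom"
    if locator.startswith("//"):
        return "xpath"
    return "identifier"


print(default_strategy("document.forms[0]"))  # dom
print(default_strategy("//div[@id='1']"))     # xpath
print(default_strategy("1"))                  # identifier
```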

    In your case, you could try

    selenium.get_text("//div[@id='1']/descendant::*[not(self::h1)]")
    

    You can learn more about xpath here.

    P.S. I haven't found any good HTML documentation for python-selenium. The docstrings in selenium.py, on the other hand, seem to be comprehensive, so I'd suggest reading the source to get a better understanding of how it works.
