Find specific comments in HTML code using Python

有刺的猬 2021-01-15 06:14

I can't find a specific comment in Python, for example the . My main goal is to find all the links inside 2 specific comments. Something like

1 answer
  •  暖寄归人
    2021-01-15 06:43

    If you want all the comments, you can use findAll with a callable:

    >>> from bs4 import BeautifulSoup, Comment
    >>> 
    >>> s = """
    ... <p>header</p>
    ... <!-- why -->
    ... www.test1.com
    ... www.test2.org
    ... <!-- why not -->
    ... <p>tail</p>
    ... """
    >>> 
    >>> soup = BeautifulSoup(s)
    >>> comments = soup.findAll(text = lambda text: isinstance(text, Comment))
    >>> 
    >>> comments
    [u' why ', u' why not ']

    And once you've got them, you can use the usual tricks to move around:

    >>> comments[0].next
    u'\nwww.test1.com\nwww.test2.org\n'
    >>> comments[0].next.split()
    [u'www.test1.com', u'www.test2.org']
    

    Depending on what the page actually looks like, you may have to tweak it a bit, and you'll have to choose which comments you want, but that should work to get you started.
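
As a starting point for the "links between two specific comments" goal, here is a minimal sketch rather than a definitive implementation. It assumes the links are plain text sitting between the two comments, as in the sample above. The helper name `text_between_comments`, the `<p>` wrappers, and the sample URLs are illustrative assumptions, and it uses the newer `string=` and `next_elements` spellings with an explicit `html.parser`.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical sample page; the <p> wrappers and URLs are illustrative only.
HTML = """
<p>header</p>
<!-- why -->
www.test1.com
www.test2.org
<!-- why not -->
<p>tail</p>
"""


def text_between_comments(soup, start, end):
    """Return whitespace-separated tokens found between two comments.

    `start` and `end` are compared against each comment's stripped text.
    """
    # Locate the opening comment by its stripped text.
    first = soup.find(
        string=lambda t: isinstance(t, Comment) and t.strip() == start
    )
    if first is None:
        return []
    tokens = []
    # Walk forward in document order until the closing comment is reached.
    for node in first.next_elements:
        if isinstance(node, Comment):
            if node.strip() == end:
                break
            continue  # skip unrelated comments in between
        if isinstance(node, str):  # text nodes subclass str; tags do not
            tokens.extend(node.split())
    return tokens


soup = BeautifulSoup(HTML, "html.parser")
links = text_between_comments(soup, "why", "why not")
print(links)
```

If the real page wraps the links in `<a>` tags, the same loop could collect `node.get("href")` from tag nodes instead of splitting text; that depends on markup the question does not show.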

    Edit:

    If you really want only the ones which look like some specific text, you can do something like

    >>> comments = soup.findAll(text = lambda text: isinstance(text, Comment) and text.strip() == 'why')
    >>> comments
    [u' why ']
    

    or you could filter them after the fact using a list comprehension:

    >>> [c for c in comments if c.strip().startswith("why")]
    [u' why ', u' why not ']
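
Both filters above can be folded into one small helper. This is only a sketch: `find_comments` and the sample markup are hypothetical names chosen for illustration, and the predicate receives each comment's stripped text.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical markup, for illustration only.
HTML = "<p>header</p><!-- why --><!-- why not --><!-- other --><p>tail</p>"


def find_comments(soup, predicate=lambda text: True):
    """Return comments whose stripped text satisfies `predicate`."""
    return [
        c
        for c in soup.find_all(string=lambda t: isinstance(t, Comment))
        if predicate(c.strip())
    ]


soup = BeautifulSoup(HTML, "html.parser")
# Exact match on the comment text.
exact = find_comments(soup, lambda text: text == "why")
# Prefix match, equivalent to the list comprehension above.
prefixed = find_comments(soup, lambda text: text.startswith("why"))
print([str(c) for c in exact])
print([str(c) for c in prefixed])
```

Passing the condition as a function keeps the tree search in one place, so switching from an exact match to a prefix or regex match only changes the predicate.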
    
