Unbalanced parenthesis error with Regex

六眼飞鱼酱① submitted on 2021-02-16 05:32:53

Question


I am using the following regex to extract all data from a website's JavaScript data source that is enclosed in the following character pattern:

[[]]);

The code I am using is this:

regex = r'\[\[.*?\]]);'
match2 = re.findall(regex, response.body, re.S)
print match2

This is throwing up an error message of:

    raise error, v # invalid expression
sre_constants.error: unbalanced parenthesis

I think I am fairly safe in assuming that this is being caused by the closing bracket within my regex. How can I define the regex that I want without getting this error?

Thanks


Answer 1:


You need to escape that last parenthesis as well; unescaped, it is read as the end of a group that was never opened. Closing square brackets outside a character class do not have to be escaped:

regex = r'\[\[.*?]]\);'
                   ^
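
As a minimal sketch of the difference (written for Python 3, where the exception is exposed as re.error; the helper name compiles is made up for this illustration), compiling both patterns shows that only the unescaped ) is rejected:

```python
import re

# Pattern from the question: the final ")" is unescaped, so the engine
# reads it as closing a group that was never opened.
broken = r'\[\[.*?\]]);'

# Pattern from this answer: "\)" matches a literal ")".
fixed = r'\[\[.*?]]\);'


def compiles(pattern):
    """Hypothetical helper: True if re accepts the pattern, False otherwise."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(compiles(broken))  # False
print(compiles(fixed))   # True
```

The unescaped ] characters are accepted in both patterns; only the parenthesis needs the backslash.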

If you are trying to obtain the content between the square brackets, use a capturing group here.

>>> import re
>>> s = 'foo [[bar]]); baz [[quz]]); not [[foobar]]'
>>> matches = re.findall(r'\[\[(.*?)]]\);', s, re.S)
>>> matches
['bar', 'quz']



Answer 2:


Escape the last ) and ]: r'\[\[.*?\]\]\);'




Answer 3:


Your regex should be:

regex = r'\[\[.*?\]\]\);'

It matches the literal [[ symbols and the characters that follow, up to the next ]]); symbols.

Explanation:

  • \[\[ Matches the literal [[ symbols.
  • .*? Matches any character zero or more times. The ? after * forces the regex engine to do the shortest (non-greedy) match.
  • \]\]\); Matches the literal ]]); symbols.
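
To illustrate the non-greedy behaviour described above, here is a small sketch that reuses the sample string from Answer 1 and compares .*? with the greedy .* (the variable names lazy and greedy are made up for this example):

```python
import re

# Sample string borrowed from Answer 1.
s = 'foo [[bar]]); baz [[quz]]); not [[foobar]]'

# Non-greedy: each match stops at the first "]]);" after its "[[".
lazy = re.findall(r'\[\[.*?\]\]\);', s, re.S)

# Greedy: the match runs on to the last "]]);" in the string.
greedy = re.findall(r'\[\[.*\]\]\);', s, re.S)

print(lazy)    # ['[[bar]]);', '[[quz]]);']
print(greedy)  # ['[[bar]]); baz [[quz]]);']
```

The trailing [[foobar]] is skipped in both cases because it is not followed by );.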


Source: https://stackoverflow.com/questions/25108542/unbalanced-parenthesis-error-with-regex
