Question
So yeah, the title is pretty weird, but I have no better idea how to describe my problem. Anyway, let's get to it.
Job to get done
My boss wants a function that reads all functions of a Python file and returns a DataTable containing the functions it found. The function should be written in IronPython (a Python implementation that runs on .NET, so it can use C# libraries).
The Problem
I am relatively new to Python and I have no idea what this language is capable of, so I started to write my function and it works pretty well, except for one weird problem. I wrote a regular expression to find the functions, and to test it I downloaded a regex tester. The tester showed the results I wanted: Group 1 - the function name, Group 2 - the function's parameters, and Group 3 - the content of the function.
For some magical reason, it doesn't work in live testing. By "doesn't work" I mean that Group 3 has practically no output. After testing the expression with another (online) regex tester, I saw that Group 3 does not contain the whole content of the function; it contains only a small part of it, starting with a newline/return character.
In my test cases, the results of Group 3 were all the same: they started with a newline/return character and ended with the function's return statement (e.g. return objDic).
Question: What the hell is going wrong there? I have no idea what is wrong with my regex.
The Regex
objRegex = Regex(r"(?i)def[\s]+([\w]+)\(([\, [\w]+)\)(?:[\:{1}]\s*)([\n].*(?!\ndef[\s]+))+")
The Data
def test_function(some_parameter):
    try:
        some_cool_code_goes_here()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)
def another_cool_function(another_parameter):
    try:
        what_you_want()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)
The Result
Match: def test_function(some_parameter):...
Position: ..
Length: ..
Group 1: test_function
Group 2: some_parameter
Group 3: (newline/return character)
return obj
But Group 3 should be:
    try:
        some_cool_code_goes_here()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)
I hope you can help me :3 Thank you guys!
Answer 1:
Although @HamZa said in his comment that your regex has several problems, I think they are mostly unneeded complexity. The reason the body is not matched is probably that you haven't let the . meta-character match newlines, so it stops at the first newline character after the first try: statement.
To fix this you need to let . match newline characters (for example with the single-line option, written inline as (?s)). Here is a stripped-down version of your regex that works with that option enabled:
(?i)def\s+(\w+)\s*\(([\, \w]+)\)(?:\s*:\s*)(.+?)(?=def|$)
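The effect of the dot not matching newlines can be checked with a small sketch. It uses Python's standard re module instead of the .NET Regex class from the question (an assumption made so the snippet runs on plain Python), and the SOURCE string is the sample data from the question with its indentation assumed. Without re.DOTALL the stripped-down pattern finds nothing in the multi-line sample; with it, group 3 holds the body of the first function.

```python
import re

# Sample data from the question; the indentation is an assumption.
SOURCE = (
    "def test_function(some_parameter):\n"
    "    try:\n"
    "        some_cool_code_goes_here()\n"
    "        return obj\n"
    "    except Exception as ex:\n"
    "        DetailsBox.Show(ex)\n"
    "def another_cool_function(another_parameter):\n"
    "    try:\n"
    "        what_you_want()\n"
    "        return obj\n"
    "    except Exception as ex:\n"
    "        DetailsBox.Show(ex)\n"
)

# The stripped-down pattern from the answer, unchanged.
PATTERN = r"(?i)def\s+(\w+)\s*\(([\, \w]+)\)(?:\s*:\s*)(.+?)(?=def|$)"

# Default mode: "." stops at every newline, so the body can never reach
# the next "def" or the end of the input.
without_dotall = re.search(PATTERN, SOURCE)

# re.DOTALL (the counterpart of the inline (?s) flag) lets "." cross newlines.
with_dotall = re.search(PATTERN, SOURCE, re.DOTALL)

if with_dotall is not None:
    print(with_dotall.group(1))
    print(with_dotall.group(3))
```

Note that the lookahead (?=def|$) stops at any occurrence of the letters "def", so a body that contains a word such as "undefined" would be cut short with this pattern.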
Answer 2:
Thanks to HamZa for the quick help (and of course thanks to all the other helpers as well); he actually solved the problem. Only a few adjustments were necessary (to make it work for C# :-)), but the main point comes from him. Thanks a lot.
Solution for my problem:
Regex(r"(?is)def\s*(?<name>\w+)\s*\((?<parameter>[^)]+)\)\s*:\s*(?:\r?\n)+(?<body>.*?)(?=\r?\ndef|$)")
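For readers without a .NET runtime, the same idea can be sketched with Python's standard re module. This is an adaptation, not the code from the answer: the named groups are written as (?P<name>...) because that is the syntax re expects, the helper find_functions is a hypothetical name, and it returns a plain list of tuples where the original task would fill a DataTable.

```python
import re

# Adaptation of the pattern above for Python's re module:
# (?<name>...) becomes (?P<name>...); everything else is unchanged.
FUNCTION_RE = re.compile(
    r"(?is)def\s*(?P<name>\w+)\s*\((?P<parameter>[^)]+)\)\s*:\s*"
    r"(?:\r?\n)+(?P<body>.*?)(?=\r?\ndef|$)"
)


def find_functions(source):
    """Return one (name, parameter, body) tuple per matched function.

    Hypothetical helper; a list of tuples stands in for the DataTable.
    """
    return [
        (m.group("name"), m.group("parameter"), m.group("body"))
        for m in FUNCTION_RE.finditer(source)
    ]


# Sample data from the question; the indentation is an assumption.
SAMPLE = (
    "def test_function(some_parameter):\n"
    "    try:\n"
    "        some_cool_code_goes_here()\n"
    "        return obj\n"
    "    except Exception as ex:\n"
    "        DetailsBox.Show(ex)\n"
    "def another_cool_function(another_parameter):\n"
    "    try:\n"
    "        what_you_want()\n"
    "        return obj\n"
    "    except Exception as ex:\n"
    "        DetailsBox.Show(ex)\n"
)

rows = find_functions(SAMPLE)
for name, parameter, body in rows:
    print(name, parameter)
    print(body)
```

Two limits of the pattern are worth keeping in mind: [^)]+ requires at least one character between the parentheses, so functions without parameters are not matched, and the lookahead only recognises a following def that starts at the beginning of a line, so nested or indented definitions end up inside the enclosing body.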
Source: https://stackoverflow.com/questions/18232343/unable-to-get-a-block-of-code-into-my-regex-match-groups