Unable to get a block of code into my regex match groups

Posted by 故事扮演 on 2019-12-10 19:08:59

Question


So yeah, the title is pretty weird, but I have no better idea how to describe my problem properly. Anyway, let's get to the problem.

Job to get done

My boss wants a function that reads all functions of a Python file and returns a DataTable containing the functions it found. This function should be written in IronPython (Python that can use the .NET/C# libraries).

The Problem

I am relatively new to Python and I have no idea what this language is capable of, so I started to write my function, and it works pretty well except for one weird problem. I wrote a regular expression to find the functions, and to test it I downloaded a regex tester. The regex tester showed the results I wanted: Group 1 is the function name, Group 2 is the function's parameters, and Group 3 is the content of the function.

For some reason, it doesn't work in live testing. By "doesn't work" I mean that Group 3 appears to have no output. After testing the expression with another (online) regex tester, I saw that Group 3 does not hold the whole content of the function; it only holds a small part of it, starting with a newline/return character.

In my test cases, the results of Group 3 were all the same: they started with a newline/return character and ended with the function's return statement (e.g. return objDic).

Question: What the hell is going wrong there? I have no idea what is wrong with my regex.

The Regex

objRegex = Regex(r"(?i)def[\s]+([\w]+)\(([\, [\w]+)\)(?:[\:{1}]\s*)([\n].*(?!\ndef[\s]+))+")

The Data

def test_function(some_parameter):
    try:
        some_cool_code_goes_here()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)

def another_cool_function(another_parameter):
    try:
        what_you_want()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)

The Result

Match: def test_function(some_parameter):...
Position: ..
Length: ..
Group 1: test_function
Group 2: some_parameter
Group 3: (newline/return character) return obj

But Group 3 should be:

    try:
        some_cool_code_goes_here()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)

I hope you can help me :3 Thank you guys!


Answer 1:


Although @HamZa said in his comment that your regex has several problems, I think they are mostly unneeded complexity. The reason the body is not matched is probably that you haven't let the . metacharacter match newlines, so it stops at the first newline character after the first try: statement.

To fix this, you need to let . match newline characters by enabling single-line mode (RegexOptions.Singleline in .NET, or the inline (?s) flag). Here is a stripped-down version of your regex that works with that option set:

(?i)def\s+(\w+)\s*\(([\, \w]+)\)(?:\s*:\s*)(.+?)(?=def|$)
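
To make the effect of that option visible, here is a minimal sketch that runs the pattern above against the sample data from the question, once without and once with "dot matches newline". It assumes Python's standard re module instead of the .NET Regex class used in the question (re.DOTALL plays the role of RegexOptions.Singleline); the names SOURCE, PATTERN, without_dotall and with_dotall are made up for this illustration.

```python
import re

# Sample data from the question, as one string.
SOURCE = (
    "def test_function(some_parameter):\n"
    "    try:\n"
    "        some_cool_code_goes_here()\n"
    "        return obj\n"
    "    except Exception as ex:\n"
    "        DetailsBox.Show(ex)\n"
    "\n"
    "def another_cool_function(another_parameter):\n"
    "    try:\n"
    "        what_you_want()\n"
    "        return obj\n"
    "    except Exception as ex:\n"
    "        DetailsBox.Show(ex)\n"
)

# The stripped-down pattern, with the case-insensitive flag passed separately.
PATTERN = r"def\s+(\w+)\s*\(([, \w]+)\)(?:\s*:\s*)(.+?)(?=def|$)"

# Without DOTALL, "." cannot cross a line break, so the body group
# never reaches the next "def" or the end of the text.
without_dotall = re.search(PATTERN, SOURCE, re.IGNORECASE)

# With DOTALL, "." also matches "\n" and the body spans several lines.
with_dotall = re.search(PATTERN, SOURCE, re.IGNORECASE | re.DOTALL)

print(without_dotall)
print(with_dotall.group(1), with_dotall.group(2))
print(with_dotall.group(3))
```

One caveat worth keeping in mind: the lookahead (?=def|$) stops at the first occurrence of the letters "def" anywhere in the text, so a body that contains a word such as "default" would be cut short. Anchoring the lookahead to a line start, as the second answer does, avoids that.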



Answer 2:


Thanks to HamZa for the quick help (and of course thanks to all the other helpers as well); he actually solved the problem. Only a few adjustments were necessary (to make it work for C# :-)), but the main point comes from him. Thanks a lot.

Solution for my problem:

Regex(r"(?is)def\s*(?<name>\w+)\s*\((?<parameter>[^)]+)\)\s*:\s*(?:\r?\n)+(?<body>.*?)(?=\r?\ndef|$)")
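
For readers who want to try the idea outside IronPython, the following is a rough sketch of the same pattern in plain Python. It is an adaptation, not the code from the answer: Python's re module spells named groups (?P<name>...) rather than the .NET form (?<name>...), the flags are passed as arguments instead of inline, and a list of dictionaries stands in for the DataTable. The helper find_functions and the variable names are hypothetical.

```python
import re

# Same structure as the accepted pattern, with Python-style named groups.
FUNCTION_PATTERN = re.compile(
    r"def\s*(?P<name>\w+)\s*\((?P<parameter>[^)]+)\)\s*:\s*(?:\r?\n)+"
    r"(?P<body>.*?)(?=\r?\ndef|$)",
    re.IGNORECASE | re.DOTALL,  # corresponds to the inline (?is)
)


def find_functions(source):
    """Return one dict per function found in source (stand-in for a DataTable)."""
    return [
        {
            "name": match.group("name"),
            "parameter": match.group("parameter"),
            "body": match.group("body"),
        }
        for match in FUNCTION_PATTERN.finditer(source)
    ]


# Sample data from the question.
SAMPLE = (
    "def test_function(some_parameter):\n"
    "    try:\n"
    "        some_cool_code_goes_here()\n"
    "        return obj\n"
    "    except Exception as ex:\n"
    "        DetailsBox.Show(ex)\n"
    "\n"
    "def another_cool_function(another_parameter):\n"
    "    try:\n"
    "        what_you_want()\n"
    "        return obj\n"
    "    except Exception as ex:\n"
    "        DetailsBox.Show(ex)\n"
)

functions = find_functions(SAMPLE)
for function in functions:
    print(function["name"], function["parameter"])
    print(function["body"])
```

Two limitations of the pattern carry over to this sketch: [^)]+ requires at least one character between the parentheses, so functions without parameters are not found, and the lookahead only recognises a following def that starts at the beginning of a line, so indented definitions such as methods end up inside the body of the preceding function.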


Source: https://stackoverflow.com/questions/18232343/unable-to-get-a-block-of-code-into-my-regex-match-groups
