Testing regexes in Python using py.test

Question

Regexes are still something of a dark art to me, but I think that's one of those things that just takes practice. As such, I'm more concerned with being able to produce py.test functions that show me where my regexes are failing. My current code is something like this:

import re

my_regex = re.compile("<this is where the magic (doesn't)? happen(s)?>")

def test_my_regex():
    tests = ["an easy test that I'm sure will pass",
             "a few things that may trip me up",
             "a really pathological, contrived example",
             "something from the real world?"]

    test_matches = [my_regex.match(test) for test in tests]

    for i in range(len(tests)):
        print("{}: {!r}".format(i, tests[i]))
        assert test_matches[i] is not None

for which the output when I run py.test myfile.py is something like

0: "an easy..."
1: "a few things..."
2: "a really pathological..."

where the last line printed is the first (only?) input that failed the test.

I suppose I could do something like

assertSequenceEqual(test_matches, [not None]*len(test_matches))

but that seems gross, and I was under the impression that <object> is not None is the preferred way of checking that an object isn't None rather than <object> != None.

Answer 1:

Another approach is to use parametrize.

import re

import pytest

my_regex = re.compile("<this is where the magic (doesn't)? happen(s)?>")

@pytest.mark.parametrize('test_str', [
    "an easy test that I'm sure will pass",
    "a few things that may trip me up",
    "a really pathological, contrived example",
    "something from the real world?",
])
def test_my_regex(test_str):
    assert my_regex.match(test_str) is not None

This will produce an independent test case for each test string. In my opinion this is cleaner, makes it easier to add new cases, and has the advantage that each test_str can fail individually without affecting the others.
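The same idea can probably be stretched to cover strings that should not match as well, by passing a second parameter with the expected outcome. The sketch below assumes a made-up pattern (hex_regex) and made-up inputs; none of these names come from the question.

```python
import re

import pytest

# Hypothetical pattern for illustration only: a lowercase hex literal such as 0xff.
hex_regex = re.compile(r"0x[0-9a-f]+$")


@pytest.mark.parametrize("text, should_match", [
    ("0xff", True),
    ("0x1a2b", True),
    ("ff", False),      # missing the 0x prefix
    ("0xZZ", False),    # not hex digits
])
def test_hex_regex(text, should_match):
    # bool(...) turns the match object (or None) into True/False.
    assert bool(hex_regex.match(text)) == should_match
```

Each tuple becomes its own test case, so a wrongly accepted string is reported separately from a wrongly rejected one.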

Answer 2:

You could use all:

assert all([my_regex.match(test) for test in goodinputs])

You might also want to test inputs that should NOT match, and test those with a negated any.

assert not any([my_regex.match(test) for test in badinputs])
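For a concrete picture of the two checks above, here is a minimal sketch with a hypothetical pattern (an ISO-style date) and invented good and bad inputs. It uses generator expressions rather than lists, which lets all and any stop at the first decisive item.

```python
import re

# Hypothetical pattern for illustration: a date such as 2014-04-02.
date_regex = re.compile(r"\d{4}-\d{2}-\d{2}$")

# Invented sample inputs, not taken from the question.
good_inputs = ["2014-04-02", "1999-12-31"]
bad_inputs = ["2014-4-2", "not a date", "2014-04-02T10:00"]


def test_good_inputs_match():
    # Every string that should match does match.
    assert all(date_regex.match(s) for s in good_inputs)


def test_bad_inputs_do_not_match():
    # No string that should be rejected matches.
    assert not any(date_regex.match(s) for s in bad_inputs)
```

The drawback is that a failing all or any only says that something went wrong, not which input caused it.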

If you want to see which matches fail, you could reorganise your existing code slightly, something like:

for test in tests:
    assert my_regex.match(test), test

which should print out the value of test if the assertion fails.

However, this will only print out the details of the first failure.

If you want to see all failures, you could do:

failures = [test for test in tests if not my_regex.match(test)]
assert len(failures) == 0, failures
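One way to package that, sketched under the assumption of a made-up pattern and a hypothetical helper named find_failures, is to collect the unmatched inputs in a function and put them in the assertion message:

```python
import re

# Hypothetical pattern and inputs, chosen only to illustrate the idea.
word_regex = re.compile(r"[a-z]+$")


def find_failures(regex, inputs):
    """Return the inputs that the regex fails to match, in order."""
    return [s for s in inputs if regex.match(s) is None]


def test_all_inputs_match():
    inputs = ["hello", "world"]
    failures = find_failures(word_regex, inputs)
    # An empty list is falsy, so this passes only when nothing failed;
    # otherwise the message lists every unmatched input.
    assert not failures, "unmatched inputs: {!r}".format(failures)
```

Because the helper returns a plain list, it can also be called from a debugger or an interactive session to inspect which strings a pattern rejects.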

Source: https://stackoverflow.com/questions/22818948/testing-regexes-in-python-using-py-test

Tags

python

unit-testing

python-3.x

pytest