Exception handling and testing with pytest and hypothesis


Question


I'm writing tests for a statistical analysis with Hypothesis. Hypothesis led me to a ZeroDivisionError in my code when it is passed very sparse data. So I adapted my code to handle the exception; in my case, that means logging the reason and re-raising the exception:

try:
    val = calc(data)
except ZeroDivisionError:
    # log why the input was bad, then let the exception propagate
    logger.error(f"check data: {data}, too sparse")
    raise

I need to pass the exception up through the call stack, because the top-level caller needs to know there was an error so that it can return an error code to the external caller (a REST API request).
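
For context, here is a sketch of what that top-level handler might look like (run_analysis and the response shape are hypothetical, not from the original code):

def handle_request(data):
    try:
        result = run_analysis(data)  # eventually reaches calc(data) above
    except ZeroDivisionError:
        # the re-raised exception surfaces here and becomes an error code
        return {"status": 422, "error": "data too sparse"}
    return {"status": 200, "result": result}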

Edit: I also can't assign a reasonable fallback value to val: essentially I need a histogram, and the error occurs while calculating a reasonable bin width from the data, which obviously fails when the data is too sparse. Without the histogram, the algorithm cannot proceed any further.
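
To make the failure mode concrete: the question doesn't show the actual bin-width rule, but any spread-based rule has this problem. A minimal illustration using a Freedman-Diaconis-style width (pure Python floats, so the division genuinely raises ZeroDivisionError rather than returning inf):

import math
import statistics

def bin_count(values):
    # width is proportional to the inter-quartile range, which is 0
    # when the data piles up on a single value
    q1, _, q3 = statistics.quantiles(values, n=4)
    width = 2 * (q3 - q1) / len(values) ** (1 / 3)
    return math.ceil((max(values) - min(values)) / width)

bin_count([1.0, 1.0, 1.0, 1.0])  # raises ZeroDivisionError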

Now my issue is, in my test when I do something like this:

@given(dataframe())
def test_my_calc(df):
    # code that executes the code path above

Hypothesis keeps generating failing examples that trigger the ZeroDivisionError, and I don't know how to tell it to ignore them. Normally I would mark such a test with pytest.mark.xfail(raises=ZeroDivisionError), but I can't do that here, because the same test passes for well-behaved inputs.
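
For reference, that marker would look like the snippet below; but a single Hypothesis test invocation runs many generated examples, so the mark would apply to the whole run rather than to the individual failing examples:

import pytest

@pytest.mark.xfail(raises=ZeroDivisionError)
@given(dataframe())
def test_my_calc(df):
    ...  # same body as above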

Something like this would be ideal:

  1. continue with the test as usual for most inputs, however
  2. when a ZeroDivisionError is raised, skip that example as an expected failure.

How could I achieve that? Do I need to put a try: ... except: ... in the test body as well? What would I need to do in the except block to mark it as an expected failure?

Edit: to address the comment by @hoefling, separating out the failing cases would be the ideal solution. But unfortunately, Hypothesis doesn't give me enough handles to control that. At most I can control the total count and the limits (min, max) of the generated data, whereas the failing cases have a very narrow spread, which I have no way to control. I guess that's the point of Hypothesis, and maybe I shouldn't be using it at all for this.

Here's how I generate my data (slightly simplified):

from itertools import product

import numpy as np
import pandas as pd
from hypothesis import assume, given, strategies as st
from hypothesis.extra.pandas import column, data_frames, range_indexes

cities = [f"city{i}" for i in range(4)]
cats = [f"cat{i}" for i in range(4)]


@st.composite
def dataframe(draw):
    data_st = st.floats(min_value=0.01, max_value=50)
    df = []
    for city, cat in product(cities, cats):
        cols = [
            column("city", elements=st.just(city)),
            column("category", elements=st.just(cat)),
            column("metric", elements=data_st, fill=st.nothing()),
        ]
        _df = draw(data_frames(cols, index=range_indexes(min_size=2)))
        # my attempt to control the spread
        assume(np.var(_df["metric"]) >= 0.01)
        df += [_df]
    df = pd.concat(df, axis=0).set_index(["city", "category"])
    return df

Answer 1:


from hypothesis import assume, given, strategies as st

@given(...)
def test_stuff(inputs):
    try:
        ...
    except ZeroDivisionError:
        assume(False)

The assume(False) call tells Hypothesis that this example is "bad" and that it should try another, without failing the test. It's equivalent to calling .filter(will_not_cause_zero_division) on your strategy, if you had such a function. See the docs for details.
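
Concretely, that .filter() formulation would look something like the sketch below, using the dataframe() strategy from the question; the predicate is hypothetical and has to encode whatever property makes the calculation safe:

def will_not_cause_zero_division(df):
    # e.g. require some spread of "metric" within each (city, category) group
    return (df.groupby(level=["city", "category"])["metric"].var() >= 0.01).all()

@given(dataframe().filter(will_not_cause_zero_division))
def test_my_calc(df):
    ...  # examples that would divide by zero are discarded before the test runs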



Source: https://stackoverflow.com/questions/57208801/exception-handling-and-testing-with-pytest-and-hypothesis
