Numpy: Check if float array contains whole numbers

后悔当初 2020-12-16 00:55

In Python, it is possible to check if a float contains an integer value using n.is_integer(), based on this QA: How to check if a float value is a

3 Answers
  •  无人及你
    2020-12-16 01:51

    I needed an answer to this question for a slightly different reason: checking when I can convert an entire array of floating point numbers to integers without losing data.

    Hunse's answer almost works for me, except that I obviously can't use the in-place trick, since I need to be able to undo the operation:

    if np.all(np.mod(x, 1) == 0):
        x = x.astype(int)
    

    From there, I thought of the following option, which is probably faster in many situations:

    x_int = x.astype(int)
    if np.all((x - x_int) == 0):
        x = x_int
    

    The reason is that the modulo operation is slower than subtraction. However, the cast to integers now happens up front, and I don't know how fast that operation is, relatively speaking. But if most of your arrays contain only whole numbers (they do in my case), the latter version is almost certainly faster.
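
To compare the two checks on your own data, a minimal sketch along these lines may help. The function names `is_whole_mod` and `is_whole_sub` and the test array are illustrative assumptions, not part of the original answer, and the timings will vary by machine and NumPy version.

```python
import timeit

import numpy as np


def is_whole_mod(x):
    """Check via the remainder of division by 1."""
    return bool(np.all(np.mod(x, 1) == 0))


def is_whole_sub(x):
    """Check by casting to int and comparing with the original."""
    return bool(np.all((x - x.astype(int)) == 0))


# Hypothetical test data: a mostly-zero float array, as in a sparse use case.
x = np.zeros(100_000)
x[::100] = 7.0

print(is_whole_mod(x), is_whole_sub(x))
print("mod:", timeit.timeit(lambda: is_whole_mod(x), number=100))
print("sub:", timeit.timeit(lambda: is_whole_sub(x), number=100))
```

Both functions should agree on finite values that fit in the integer type; measure before assuming one is faster for your array sizes.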

    Another benefit is that you could replace the subtraction with something like np.isclose to check within a certain tolerance (of course, you should be careful here, since truncation is not proper rounding!).

    x_int = x.astype(int)
    # the third positional argument of np.isclose is rtol, the relative tolerance
    if np.all(np.isclose(x, x_int, rtol=0.0001)):
        x = x_int
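
The truncation caveat can be made concrete. The sketch below swaps in `np.rint` (round to nearest) for the truncating cast and uses an absolute tolerance. The helper name `to_int_if_close` and the default `atol` are assumptions chosen for illustration, not something the answer prescribes.

```python
import numpy as np


def to_int_if_close(x, atol=1e-4):
    """Return an int array if every element is within atol of a whole number."""
    nearest = np.rint(x)  # round to the nearest whole number instead of truncating
    if np.all(np.isclose(x, nearest, rtol=0.0, atol=atol)):
        return nearest.astype(int)
    return x  # leave the array untouched if any element is too far off


x = np.array([1.0, 2.99999, 4.00001])
print(x.astype(int))       # truncation turns 2.99999 into 2
print(to_int_if_close(x))  # rounding to nearest gives 3 instead
```

With a plain `astype(int)`, a value just below a whole number is compared against the integer beneath it, so the tolerance check fails even though the value is within tolerance of the next integer up.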
    

    EDIT: Slower, but perhaps worth it depending on your use case, is to also convert individual elements to integers when only some of them are whole numbers.

    x_int = x.astype(int)
    safe_conversion = (x - x_int) == 0
    # if we can convert the whole array to integers, do that
    if np.all(safe_conversion):
        x = x_int.tolist()
    else:
        x = x.tolist()
        # if there are _some_ integers, convert them
        if np.any(safe_conversion):
            for i in range(len(x)):
                if safe_conversion[i]:
                    x[i] = int(x[i])
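
The same idea can be wrapped in a function so the original array is left untouched. This is a sketch assuming a 1-D float array. `mixed_tolist` is a hypothetical name, and the list comprehension replaces the explicit index loop.

```python
import numpy as np


def mixed_tolist(x):
    """Convert a 1-D float array to a list, using int for whole-number entries."""
    x_int = x.astype(int)
    safe = (x - x_int) == 0
    # if the whole array holds whole numbers, convert it in one go
    if np.all(safe):
        return x_int.tolist()
    # otherwise convert only the entries that are whole numbers
    return [int(v) if ok else v for v, ok in zip(x.tolist(), safe.tolist())]


result = mixed_tolist(np.array([0.0, 1.5, 2.0]))
print(result)  # [0, 1.5, 2]
```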
    

    As an example of where this matters: this works out for me because I have sparse data (which means mostly zeros) that I convert to JSON once and reuse later on a server. For floats, ujson produces [ ...,0.0,0.0,0.0,... ], and for ints it produces [...,0,0,0,...], saving up to half the number of characters in the string. This reduces overhead on both the server (shorter strings) and the client (shorter strings, presumably slightly faster JSON parsing).
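
The size difference is easy to check. The sketch below uses the standard-library `json` module rather than ujson, which is an assumption made to keep the example dependency-free. The compact `separators` argument is there to mimic the whitespace-free output quoted above.

```python
import json

# Hypothetical all-zero data, once as floats and once as ints.
floats = [0.0] * 1000
ints = [0] * 1000

# Compact separators drop the spaces that json.dumps adds by default.
as_floats = json.dumps(floats, separators=(",", ":"))
as_ints = json.dumps(ints, separators=(",", ":"))

print(len(as_floats), len(as_ints))  # 4001 2001
```

For all-zero data the integer version is roughly half the length, which matches the "up to half" estimate.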
