ISIN function does not work for dates

孤城傲影 2020-12-06 02:56
d = {'Dates':[pd.Timestamp('2013-01-02'),
              pd.Timestamp('2013-01-03'),
              pd.Timestamp('2013-01-04')],
     'Num1':[1,2,3],
     '


        
4 answers
  •  小蘑菇 (original poster)
    2020-12-06 03:21

    Yep, that looks like a bug to me. It comes down to this part of lib.ismember:

    for i in range(n):
        val = util.get_value_at(arr, i)
        if val in values:
            result[i] = 1
        else: 
            result[i] = 0
    

    val is a numpy.datetime64 object, and values is a set of Timestamp objects. Testing membership should work, but doesn't:

    >>> import pandas as pd, numpy as np
    >>> ts = pd.Timestamp('2013-01-04')
    >>> ts
    Timestamp('2013-01-04 00:00:00', tz=None)
    >>> dt64 = np.datetime64(ts)
    >>> dt64
    numpy.datetime64('2013-01-03T19:00:00.000000-0500')
    >>> dt64 == ts
    True
    >>> dt64 in [ts]
    True
    >>> dt64 in {ts}
    False
    

    I think usually that behaviour -- working in a list, not working in a set -- is due to something going wrong with __hash__:

    >>> hash(dt64)
    1357257600000000
    >>> hash(ts)
    -7276108168457487299
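
    The same symptom can be reproduced in plain Python. This is only a sketch: `Stamp` is a hypothetical stand-in invented for illustration, not the pandas class, and its deliberately mismatched `__hash__` is an assumption chosen to mimic the two different hash values shown above.

```python
class Stamp:
    """Hypothetical stand-in: compares equal to a raw integer, hashes differently."""

    def __init__(self, ns):
        self.ns = ns

    def __eq__(self, other):
        # Equal to another Stamp or to a bare integer with the same value.
        other_ns = other.ns if isinstance(other, Stamp) else other
        return self.ns == other_ns

    def __hash__(self):
        # Deliberately inconsistent with hash(self.ns) to mimic the bug.
        return hash(self.ns) + 1


raw = 1357257600000000
stamp = Stamp(raw)

in_list = raw in [stamp]   # list membership only uses ==
in_set = raw in {stamp}    # set membership compares hashes before ==

print(in_list, in_set)
```

    A list scans its elements and compares each with ==, so the lookup succeeds; a set skips any element whose stored hash differs from the probe's hash, so == is never reached.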
    

    A set looks elements up by hash first and only then compares with ==, so two objects that compare equal but hash differently will never be found in a set. I can think of a few ways to fix this, but choosing the best one would depend on design choices made when Timestamp was implemented, which I'm not qualified to comment on.
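
    One possible direction, sketched here in plain Python, is to map both sides to a common key before the lookup. Everything in this block is an assumption made for illustration: `ismember` is a simplified model of the loop quoted earlier, and `Stamp`, `to_ns` and the `key` parameter are hypothetical names. It is not the pandas implementation and not necessarily the fix the maintainers would choose.

```python
class Stamp:
    """Hypothetical stand-in: equal to a raw integer, but hashes differently."""

    def __init__(self, ns):
        self.ns = ns

    def __eq__(self, other):
        return self.ns == getattr(other, "ns", other)

    def __hash__(self):
        return hash(self.ns) + 1  # deliberately inconsistent with __eq__


def ismember(arr, values, key=None):
    """Simplified model of the membership loop; `key` is a hypothetical hook."""
    if key is not None:
        # Normalise both sides so equal values also share a hash.
        arr = [key(v) for v in arr]
        values = {key(v) for v in values}
    return [1 if val in values else 0 for val in arr]


def to_ns(v):
    """Reduce either a Stamp or a bare integer to a plain integer."""
    return getattr(v, "ns", v)


arr = [10, 20, 30]
values = {Stamp(20), Stamp(30)}

broken = ismember(arr, values)             # hashes never match
fixed = ismember(arr, values, key=to_ns)   # compares plain integers

print(broken, fixed)
```

    The trade-off is an extra pass over both inputs to build the normalised keys; the alternative would be to make the two types hash consistently in the first place.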
