Where can I find mad (mean absolute deviation) in scipy?

故里飘歌 2020-12-23 16:36

It seems scipy once provided a function mad to calculate the mean absolute deviation for a set of numbers:

http://projects.scipy.org/scipy/browser/trunk

10 answers
  • 2020-12-23 17:16

    [EDIT] Since this keeps on getting downvoted: I know that median absolute deviation is a more commonly-used statistic, but the questioner asked for mean absolute deviation, and here's how to do it:

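    from numpy import absolute, asarray, mean
    
    def mad(data, axis=None):
        # Mean absolute deviation around the mean.
        # keepdims=True keeps the reduced axis so the mean broadcasts for any axis.
        data = asarray(data)
        return mean(absolute(data - mean(data, axis=axis, keepdims=True)), axis=axis)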
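
A quick way to see what `axis` does is to run the calculation on a small 2-D array. This is only a sketch assuming NumPy; the helper name `mean_abs_dev` and the sample values are made up for illustration. It keeps the reduced axis with `keepdims=True` so that the mean lines up with the data whichever axis is chosen.

```python
import numpy as np

def mean_abs_dev(data, axis=None):
    """Mean absolute deviation around the mean (illustrative helper)."""
    data = np.asarray(data, dtype=float)
    # Keep the reduced axis so the subtraction broadcasts correctly.
    center = np.mean(data, axis=axis, keepdims=True)
    return np.mean(np.abs(data - center), axis=axis)

# Made-up sample data: 2 rows, 3 columns.
sample = np.array([[1, 2, 3],
                   [4, 6, 11]])

overall = mean_abs_dev(sample)             # all six values together
per_column = mean_abs_dev(sample, axis=0)  # one value per column
per_row = mean_abs_dev(sample, axis=1)     # one value per row
print(overall, per_column, per_row)
```

Without `keepdims=True`, the `axis=1` case would try to subtract a length-2 vector from a 2x3 array, which does not broadcast.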
    
  • 2020-12-23 17:17

    It looks like scipy.stats.models was removed in August 2008 due to insufficient baking. Development has migrated to statsmodels.

  • 2020-12-23 17:19

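    If you enjoy working in Pandas (like I do), older versions have a useful method for the mean absolute deviation, Series.mad(). It was deprecated in pandas 1.5 and removed in 2.0, so on newer versions compute it directly:

    import pandas as pd
    df = pd.DataFrame()
    df['a'] = [1, 1, 2, 2, 4, 6, 9]
    df['a'].mad()  # pandas < 2.0 only
    # Equivalent on any pandas version:
    (df['a'] - df['a'].mean()).abs().mean()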
    

    Output: 2.3673469387755106

  • 2020-12-23 17:20

    It's not the scipy version, but here's an implementation of the MAD using masked arrays to ignore bad values: http://code.google.com/p/agpy/source/browse/trunk/agpy/mad.py

    Edit: A more recent version is available here.

    Edit 2: There's also a version in astropy here.

  • 2020-12-23 17:20

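    To avoid any confusion: a MAD function is now in scipy.stats, but it computes the median absolute deviation, not the mean: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.median_absolute_deviation.html (newer SciPy releases name it scipy.stats.median_abs_deviation).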
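
Since the linked function computes the median absolute deviation rather than the mean absolute deviation the question asks about, it may help to see both side by side. The following is a minimal NumPy sketch, not SciPy's implementation; the helper names are made up, and the sample values are borrowed from the pandas answer in this thread.

```python
import numpy as np

def mean_abs_dev(x):
    """Mean of absolute deviations from the mean (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    return np.mean(np.abs(x - np.mean(x)))

def median_abs_dev(x):
    """Median of absolute deviations from the median, unscaled (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    return np.median(np.abs(x - np.median(x)))

values = [1, 1, 2, 2, 4, 6, 9]
mean_result = mean_abs_dev(values)      # about 2.367
median_result = median_abs_dev(values)  # 1.0
print(mean_result, median_result)
```

The two statistics differ by more than a factor of two on this data, so it matters which "MAD" a library function actually implements.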

  • 2020-12-23 17:23

    The current version of statsmodels has mad in statsmodels.robust:

    >>> import numpy as np
    >>> from statsmodels import robust
    >>> a = np.array( [
    ...     [ 80, 76, 77, 78, 79, 81, 76, 77, 79, 84, 75, 79, 76, 78 ],
    ...     [ 66, 69, 76, 72, 79, 77, 74, 77, 71, 79, 74, 66, 67, 73 ]
    ...  ], dtype=float )
    >>> robust.mad(a, axis=1)
    array([ 2.22390333,  5.18910776])
    

    Note that by default this computes a robust estimate of the standard deviation, assuming a normal distribution, by dividing the median absolute deviation by a scaling factor c; from help:

    Signature: robust.mad(a, 
                          c=0.67448975019608171, 
                          axis=0, 
                          center=<function median at 0x10ba6e5f0>)
    

    The version in R makes a similar normalization. If you don't want this, obviously just set c=1.
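
To see where the default `c` comes from and what `c=1` changes, here is a rough standard-library sketch. It assumes the result is the median absolute deviation around the median divided by `c`, and that `c` is the 75th percentile of the standard normal distribution; both assumptions are consistent with the output shown above, but the helper `scaled_median_abs_dev` is hypothetical and is not the statsmodels code.

```python
from statistics import NormalDist, median

# Assumed origin of the default c: the 75th percentile of the standard normal.
c = NormalDist().inv_cdf(0.75)

def scaled_median_abs_dev(values, c=c):
    """Median absolute deviation around the median, divided by c (hypothetical helper)."""
    center = median(values)
    return median(abs(v - center) for v in values) / c

# The two rows from the example above.
row0 = [80, 76, 77, 78, 79, 81, 76, 77, 79, 84, 75, 79, 76, 78]
row1 = [66, 69, 76, 72, 79, 77, 74, 77, 71, 79, 74, 66, 67, 73]

scaled = [scaled_median_abs_dev(r) for r in (row0, row1)]         # normal-consistent estimate
unscaled = [scaled_median_abs_dev(r, c=1) for r in (row0, row1)]  # plain median absolute deviation
print(c, scaled, unscaled)
```

With the default `c` the two values agree with the array printed by `robust.mad(a, axis=1)` above; with `c=1` they are the raw median absolute deviations, 1.5 and 3.5.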

    (An earlier comment mentioned this being in statsmodels.robust.scale. The implementation is in statsmodels/robust/scale.py (see github) but the robust package does not export scale, rather it exports the public functions in scale.py explicitly.)
