mean

Datetime objects with pandas mean function

倾然丶 夕夏残阳落幕 posted on 2019-11-29 11:24:09
I am new to programming, so I apologize in advance if this question does not make any sense. I noticed that when I try to calculate the mean of a pandas data frame containing datetime objects formatted like this: datetime.datetime(2014, 7, 10), pandas cannot calculate it; however, it calculates the minimum and maximum of that same data frame without a problem.
    d={'one' : Series([1, 2, 3], index=['a', 'b', 'c']), 'two' :Series([datetime.datetime(2014, 7, 9) , datetime.datetime(2014, 7, 10) , datetime.datetime(2014, 7, 11) ], index=['a', 'b', 'c'])}
    df=pd
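One way to get a mean date that does not depend on how a given pandas version handles datetime columns is to average the offsets from the earliest date and shift the result back. The sketch below assumes the truncated `df=pd` line builds a DataFrame from `d`; the name `mean_date` is illustrative and not from the original post.

```python
import datetime

import pandas as pd

# Rebuild the frame from the question (assumed: df = pd.DataFrame(d)).
d = {
    "one": pd.Series([1, 2, 3], index=["a", "b", "c"]),
    "two": pd.Series(
        [
            datetime.datetime(2014, 7, 9),
            datetime.datetime(2014, 7, 10),
            datetime.datetime(2014, 7, 11),
        ],
        index=["a", "b", "c"],
    ),
}
df = pd.DataFrame(d)

# Subtracting the earliest date yields timedeltas, which can be averaged;
# adding the earliest date back turns the average into a timestamp again.
earliest = df["two"].min()
mean_date = earliest + (df["two"] - earliest).mean()

print(mean_date)
```

Recent pandas versions may also accept `df["two"].mean()` directly; the offset approach is only a fallback that relies on timedelta arithmetic.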

Get mean of 2D slice of a 3D array in numpy

江枫思渺然 posted on 2019-11-29 09:26:08
I have a numpy array with a shape of (11L, 5L, 5L). I want to calculate the mean over the 25 elements of each 'slice' of the array, [0, :, :], [1, :, :], etc., returning 11 values. It seems silly, but I can't work out how to do this. I thought the mean(axis=x) function would do this, but I've tried all possible values of axis and none of them gives me the result I want. I can obviously do this using a for loop and slicing, but surely there is a better way?
Use a tuple for axis:
    >>> a = np.arange(11*5*5).reshape(11,5,5)
    >>> a.mean(axis=(1,2))
    array([ 12., 37., 62., 87., 112., 137., 162.,
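The tuple-axis answer can be cross-checked against an equivalent reshape, which is a possible fallback on NumPy versions that predate tuple axes (to my knowledge, tuple axes arrived in NumPy 1.7). The names `per_slice` and `per_slice_alt` are illustrative.

```python
import numpy as np

a = np.arange(11 * 5 * 5).reshape(11, 5, 5)

# Average over the last two axes in one call.
per_slice = a.mean(axis=(1, 2))

# Equivalent: flatten each 5x5 slice into 25 values, then average per row.
per_slice_alt = a.reshape(a.shape[0], -1).mean(axis=1)

print(per_slice.shape)  # one mean per slice
```

Both forms return one value per leading index, so the result has shape (11,).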

How to use numpy with 'None' value in Python?

久未见 posted on 2019-11-29 09:06:32
I'd like to calculate the mean of an array in Python in this form:
    Matrice = [1, 2, None]
I'd just like to have my None value ignored by the numpy.mean calculation, but I can't figure out how to do it.
tom10
You are looking for masked arrays. Here's an example.
    import MA
    a = MA.array([1, 2, None], mask = [0, 0, 1])
    print "average =", MA.average(a)
Unfortunately, masked arrays aren't thoroughly supported in numpy, so you've got to look around to see what can and can't be done with them.
You can use scipy for that:
    import scipy.stats.stats as st
    m=st.nanmean(vec)
haven't used numpy, but in
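With a current NumPy, the same idea can be sketched without the old `MA` module: cast the list to float so that None becomes NaN, then either use `np.nanmean` or mask the invalid entries. This is a sketch under the assumption that NumPy 1.8 or later is available (that is where `np.nanmean` appeared, as far as I know); the variable names are illustrative.

```python
import numpy as np

matrice = [1, 2, None]

# Casting to float converts None to NaN.
values = np.array(matrice, dtype=float)

# Option 1: a NaN-aware mean.
mean_nan = np.nanmean(values)

# Option 2: a masked array that hides the invalid entry.
mean_masked = np.ma.masked_invalid(values).mean()

print(mean_nan, mean_masked)
```

Both options ignore the missing entry and average only 1 and 2.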

How to get column mean for specific rows only?

二次信任 posted on 2019-11-29 08:20:33
Question: I need to get the mean of one column (here: score) for specific rows (here: years). Specifically, I would like to know the average score for three periods:
period 1: year <= 1983
period 2: year >= 1984 & year <= 1990
period 3: year >= 1991
This is the structure of my data:
    country year score
    Algeria 1980 -1.1201501
    Algeria 1981 -1.0526943
    Algeria 1982 -1.0561565
    Algeria 1983 -1.1274560
    Algeria 1984 -1.1353926
    Algeria 1985 -1.1734330
    Algeria 1986 -1.1327666
    Algeria 1987 -1.1263586
    Algeria 1988
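The question does not say which language it targets, so the following is only a pandas sketch of the three period means using boolean masks. It uses the eight complete rows visible in the excerpt, which means period 3 has no rows here and its mean comes out as NaN. The names `df` and `period_means` are assumptions.

```python
import pandas as pd

# Only the complete rows shown in the excerpt (1980 to 1987).
df = pd.DataFrame(
    {
        "country": ["Algeria"] * 8,
        "year": list(range(1980, 1988)),
        "score": [
            -1.1201501, -1.0526943, -1.0561565, -1.1274560,
            -1.1353926, -1.1734330, -1.1327666, -1.1263586,
        ],
    }
)

# Select the rows of each period with a boolean mask, then average score.
period_means = {
    "period 1": df.loc[df["year"] <= 1983, "score"].mean(),
    "period 2": df.loc[(df["year"] >= 1984) & (df["year"] <= 1990), "score"].mean(),
    "period 3": df.loc[df["year"] >= 1991, "score"].mean(),
}

print(period_means)
```

With several countries in the full data, grouping by country before averaging would be a natural extension.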

Confusion about (Mean) Average Precision

那年仲夏 posted on 2019-11-29 07:14:51
In this question I asked for clarification about the precision-recall curve. In particular, I asked whether we have to consider a fixed number of rankings to draw the curve or whether we can reasonably choose the number ourselves. According to the answer, the second option is correct. However, I now have a big doubt about the Average Precision (AP) value: AP is used to estimate numerically how good our algorithm is for a given query. Mean Average Precision (MAP) is the mean of AP over multiple queries. My doubt is: if AP changes according to how many objects we retrieve, then we can tune this parameter to our advantage
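The dependence of AP on the cutoff can be seen in a small sketch. It uses one common definition (the mean of precision at each rank where a relevant item appears, normalized by the number of relevant items retrieved within the cutoff); other normalizations exist, and the ranking below is made-up illustration data, not from the post.

```python
def average_precision(relevance, k):
    """AP over the top-k results; relevance holds 0/1 flags in rank order."""
    hits = 0
    precisions = []
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            hits += 1
            # Precision at this rank, recorded only where a hit occurs.
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Hypothetical ranking for a single query: 1 = relevant, 0 = not relevant.
ranking = [1, 0, 1, 0, 0, 1]

ap_at_3 = average_precision(ranking, 3)
ap_at_6 = average_precision(ranking, 6)

print(ap_at_3, ap_at_6)
```

Under this definition the two cutoffs give different values for the same ranking, which is the effect the question is worried about; MAP would then be the mean of such AP values over several queries at a fixed cutoff.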

MATLAB Accumarray weighted mean

拈花ヽ惹草 posted on 2019-11-29 04:37:41
So I am currently using 'accumarray' to find the averages of a range of numbers which correspond to matching IDs.
Example input:
    ID----Value
    1     215
    1     336
    1     123
    2     111
    2     246
    2     851
My current code finds the unweighted average of the above values, using the ID as the 'separator', so that I don't get one average for all of the values together but rather separate results for the values that share an ID.
Example output:
    ID----Value
    1     224.66
    2     402.66
To achieve this I am using this code:
    [ID, ~, Groups] = unique(StarData2(:,1),'stable');
    app = accumarray(Groups, StarData2(:,2), [],
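The grouped weighted mean itself is the per-group sum of weight times value divided by the per-group sum of weights. The sketch below shows that arithmetic in NumPy rather than MATLAB, with `np.bincount` playing the role of `accumarray`. The weights are made-up, since the excerpt does not show any, and `np.unique` sorts the IDs instead of keeping first-appearance order as `'stable'` does.

```python
import numpy as np

ids = np.array([1, 1, 1, 2, 2, 2])
values = np.array([215, 336, 123, 111, 246, 851], dtype=float)
weights = np.array([1.0, 2.0, 1.0, 1.0, 1.0, 2.0])  # hypothetical weights

# Map each row to a group index 0..n_groups-1.
unique_ids, groups = np.unique(ids, return_inverse=True)

# Per-group sum(w * v) divided by per-group sum(w).
weighted_mean = (
    np.bincount(groups, weights=weights * values)
    / np.bincount(groups, weights=weights)
)

# Unweighted mean per group, for comparison with the question's output.
unweighted_mean = np.bincount(groups, weights=values) / np.bincount(groups)

print(unique_ids, weighted_mean, unweighted_mean)
```

The same two-accumulation pattern (one call for the weighted sums, one for the weight totals) should carry over to `accumarray`.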

Find and replace missing values with row mean

落爺英雄遲暮 posted on 2019-11-29 04:10:11
I have a data frame with NAs and I want to replace the NAs with row means:
    c1 = c(1,2,3,NA)
    c2 = c(3,1,NA,3)
    c3 = c(2,1,3,1)
    df = data.frame(c1,c2,c3)
    > df
      c1 c2 c3
    1  1  3  2
    2  2  1  1
    3  3 NA  3
    4 NA  3  1
so that
    > df
      c1 c2 c3
    1  1  3  2
    2  2  1  1
    3  3  3  3
    4  2  3  1
Very similar to @baptiste's answer:
    > ind <- which(is.na(df), arr.ind=TRUE)
    > df[ind] <- rowMeans(df, na.rm = TRUE)[ind[,1]]
I think this works:
    df[which(is.na(df), arr.ind=TRUE)] <- rowMeans(df[!complete.cases(df), ], na.rm=TRUE)
Using apply (note the returned object is a matrix):
    t( apply( df , 1 , function(x) { x[ is.na(x) ] = mean( x , na.rm
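For readers working in Python rather than R, a rough pandas counterpart of the row-mean fill might look like this. It is a sketch, not a translation of any answer above; `row_means` and `filled` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "c1": [1, 2, 3, np.nan],
        "c2": [3, 1, np.nan, 3],
        "c3": [2, 1, 3, 1],
    }
)

# Row means, skipping missing values (skipna=True is the default).
row_means = df.mean(axis=1)

# fillna aligns on the row index, so each gap receives its own row's mean.
filled = df.apply(lambda col: col.fillna(row_means))

print(filled)
```

As in the R versions, the row mean is computed from the non-missing entries of that row only.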

Sort boxplot by mean (and not median) in R

戏子无情 posted on 2019-11-29 03:55:13
I have a simple boxplot, showing the distribution of a score for factor TYPE:
    myDataFrame = data.frame( TYPE=c("a","a","b","b","c","c"), SCORE=c(1,1,2,3,2,1) )
    boxplot( SCORE~TYPE, data=myDataFrame )
The various types are shown in the order they have in the data frame. I'd like to sort the boxplot by the mean of SCORE in each TYPE (in the example above, the order should be a, c, b). Any hint?
This is a job for reorder():
    myDataFrame$TYPE <- with(myDataFrame, reorder(TYPE, SCORE, mean))
    boxplot( SCORE~TYPE, data=myDataFrame )
Source: https://stackoverflow.com/questions/9741600/sort-boxplot-by-mean

pd.rolling_mean becoming deprecated - alternatives for ndarrays

最后都变了- posted on 2019-11-29 03:35:00
It looks like pd.rolling_mean is becoming deprecated for ndarrays:
    pd.rolling_mean(x, window=2, center=False)
    FutureWarning: pd.rolling_mean is deprecated for ndarrays and will be removed in a future version
but it seems to be the fastest way of doing this, according to this SO answer. Are there now new ways of doing this directly with SciPy or NumPy that are as fast as pd.rolling_mean?
EDIT -- Unfortunately, it looks like the new way is not nearly as fast:
New version of Pandas:
    In [1]: x = np.random.uniform(size=100)
    In [2]: %timeit pd.rolling_mean(x, window=2)
    1000 loops, best of 3: 240
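Two replacements that are commonly used are sketched below: the `.rolling()` method on a Series, and a plain NumPy convolution with a uniform kernel. No timing claim is made here, since the excerpt's own benchmark is cut off; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

x = np.random.uniform(size=100)
window = 2

# pandas: wrap the ndarray in a Series and use the .rolling() API.
# The first window-1 entries are NaN, as with the old pd.rolling_mean.
rolled = pd.Series(x).rolling(window=window, center=False).mean().values

# NumPy only: convolve with a uniform kernel. mode="valid" drops the
# incomplete leading windows, so the result is window-1 elements shorter.
conv = np.convolve(x, np.ones(window) / window, mode="valid")

print(rolled[:3], conv[:2])
```

The two results agree once the leading NaN entries of the pandas output are skipped, so either can be timed with `%timeit` on the data at hand.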

Mean Squared Error in Numpy?

爷,独闯天下 posted on 2019-11-28 20:01:42
Is there a method in numpy for calculating the Mean Squared Error between two matrices? I've tried searching but found none. Is it under a different name? If there isn't, how do you overcome this? Do you write it yourself or use a different lib?
Saullo G. P. Castro
You can use:
    mse = ((A - B)**2).mean(axis=ax)
Or
    mse = (np.square(A - B)).mean(axis=ax)
with ax=0 the average is performed along the row, for each column, returning an array
with ax=1 the average is performed along the column, for each row, returning an array
with ax=None the average is performed element-wise along the array,
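A tiny worked case makes the three axis choices concrete. The matrices `A` and `B` below are made-up values chosen so the squared differences are easy to follow by hand.

```python
import numpy as np

# Hypothetical 2x2 inputs; squared differences are [[0, 4], [9, 0]].
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[1.0, 4.0], [0.0, 4.0]])

squared_error = (A - B) ** 2

mse_all = squared_error.mean(axis=None)  # one number over all elements
mse_cols = squared_error.mean(axis=0)    # one value per column
mse_rows = squared_error.mean(axis=1)    # one value per row

print(mse_all, mse_cols, mse_rows)
```

Here `axis=None` gives a single scalar, while `axis=0` and `axis=1` each return an array with one entry per column or per row.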