nan

python pandas dataframe : fill nans with a conditional mean

无人久伴 submitted on 2020-04-12 10:49:30
Question: I have the following dataframe:

    import numpy as np
    import pandas as pd
    df = pd.DataFrame(data={'Cat' : ['A', 'A', 'A','B', 'B', 'A', 'B'], 'Vals' : [1, 2, 3, 4, 5, np.nan, np.nan]})

      Cat  Vals
    0   A     1
    1   A     2
    2   A     3
    3   B     4
    4   B     5
    5   A   NaN
    6   B   NaN

I want indexes 5 and 6 to be filled with the conditional mean of 'Vals' based on the 'Cat' column, namely 2 and 4.5. The following code works fine:

    means = df.groupby('Cat').Vals.mean()
    for i in df[df.Vals.isnull()].index:
        df.loc[i, 'Vals'] = means[df.loc[i
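The loop in the question can usually be replaced by a vectorized fill. The following is a minimal sketch, assuming the same `df` as in the question; the names `group_means` and `filled` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Cat': ['A', 'A', 'A', 'B', 'B', 'A', 'B'],
    'Vals': [1, 2, 3, 4, 5, np.nan, np.nan],
})

# transform('mean') returns a Series aligned with df, where every row
# holds the mean of its own 'Cat' group (NaNs are skipped by default).
group_means = df.groupby('Cat')['Vals'].transform('mean')

# fillna with an aligned Series replaces only the missing entries.
filled = df.assign(Vals=df['Vals'].fillna(group_means))

print(filled)
```

Because `fillna` aligns on the index, only rows 5 and 6 change; the existing values are left untouched.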

Removing NAN's from numpy 2-D arrays

雨燕双飞 submitted on 2020-04-11 05:18:26
Question: Similar to this question, I would like to remove some NaNs from a 2-D numpy array. However, instead of removing an entire row that has NaNs, I want to remove the corresponding element from each row of the array. For example (using list format for simplicity)

    x=[ [1,2,3,4], [2,4,nan,8], [3,6,9,0] ]

would become

    x=[ [1,2,4], [2,4,8], [3,6,0] ]

I can imagine using a numpy.where to figure out where in each row the NaNs appear and then use some loops and logic statements to make a new array from
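One way to read the example is "drop every column that contains at least one NaN", which keeps the result rectangular. A minimal sketch under that assumption, using a boolean column mask instead of loops (the names `nan_columns` and `cleaned` are illustrative):

```python
import numpy as np

x = np.array([
    [1, 2, 3, 4],
    [2, 4, np.nan, 8],
    [3, 6, 9, 0],
], dtype=float)

# True for each column that has a NaN in any row.
nan_columns = np.isnan(x).any(axis=0)

# Keep only the columns that are NaN-free in every row.
cleaned = x[:, ~nan_columns]

print(cleaned)
```

If NaNs appear in different columns in different rows, each of those columns is dropped, so the result can shrink by more than one column.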

Checking if particular value (in cell) is NaN in pandas DataFrame not working using ix or iloc

无人久伴 submitted on 2020-04-05 15:44:47
Question: Let's say I have the following pandas DataFrame:

    import pandas as pd
    df = pd.DataFrame({"A":[1,pd.np.nan,2], "B":[5,6,0]})

Which would look like:

    >>> df
         A  B
    0  1.0  5
    1  NaN  6
    2  2.0  0

First option: I know one way to check if a particular value is NaN, which is as follows:

    >>> df.isnull().ix[1,0]
    True

Second option (not working): I thought the option below, using ix, would work as well, but it does not:

    >>> df.ix[1,0]==pd.np.nan
    False

I also tried iloc with the same results:

    >>> df.iloc[1,0]==pd.np.nan
    False
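The comparison returns False because NaN never compares equal to anything, including itself, so `==` cannot be used to detect it. A minimal sketch of the usual check follows; it uses `np.nan` and `iloc` because `pd.np` and `.ix` have been removed from recent pandas releases, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, np.nan, 2], "B": [5, 6, 0]})

value = df.iloc[1, 0]

# Equality against NaN is always False under IEEE 754 rules.
equals_nan = value == np.nan

# pd.isna (alias pd.isnull) is the intended way to test a single cell.
is_missing = pd.isna(value)

# NaN is the only float that is not equal to itself.
not_equal_to_itself = value != value

print(equals_nan, is_missing, not_equal_to_itself)
```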

sklearn Logistic Regression ValueError: X has 42 features per sample; expecting 1423

谁说我不能喝 submitted on 2020-03-03 10:03:39
Question: I'm stuck trying to fix an issue. Here is what I'm trying to do: I'd like to predict the missing values (NaN) of a categorical feature using logistic regression. Here is my code (df_1 is my dataset, with missing values only in the "Metier" feature, which are the values I'm trying to predict):

    X_train = pd.get_dummies(df_1[df_1['Metier'].notnull()].drop(columns='Metier'),drop_first = True)
    X_test = pd.get_dummies(df_1[df_1['Metier'].isnull()].drop(columns='Metier'),drop_first = True,dummy_na = True)
    Y_train =
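A likely cause of a feature-count mismatch like the one in the title is that `pd.get_dummies` is called separately on the two subsets, so each call creates columns only for the categories it happens to see (and `dummy_na=True` on one side adds more). The sketch below encodes once and splits afterwards. It is only an illustration: the question's excerpt is truncated, and the toy `df_1` with its `Age` and `Ville` columns is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's df_1: only 'Metier' has NaNs.
df_1 = pd.DataFrame({
    'Age': [25, 32, 47, 51],
    'Ville': ['Paris', 'Lyon', 'Paris', 'Nice'],
    'Metier': ['A', 'B', np.nan, np.nan],
})

# Encode all rows in a single call so both subsets share one column set.
features = pd.get_dummies(df_1.drop(columns='Metier'), drop_first=True)

# Split on whether the target is known.
known = df_1['Metier'].notnull()
X_train = features[known]
X_test = features[~known]
Y_train = df_1.loc[known, 'Metier']

print(X_train.shape, X_test.shape)
```

With matching columns, a model fitted on `X_train` sees the same number of features when predicting on `X_test`.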

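NaN and INFINITY in Java

你。 submitted on 2020-02-28 22:08:41
Java floating-point arithmetic has two special cases: NaN and INFINITY.

1. INFINITY: In floating-point arithmetic we sometimes run into a divisor of 0. How does Java handle this? We know that in integer arithmetic the divisor cannot be 0, otherwise the program fails at run time. Floating-point arithmetic, however, introduces the concept of infinity. Let's look at the definitions in Double and Float.

Double:

    public static final double POSITIVE_INFINITY = 1.0 / 0.0;
    public static final double NEGATIVE_INFINITY = -1.0 / 0.0;

Float:

    public static final float POSITIVE_INFINITY = 1.0f / 0.0f;
    public static final float NEGATIVE_INFINITY = -1.0f / 0.0f;

So what effect do these values have on arithmetic? Let's first think about the following questions: What is the difference between infinity in Float and in Double? What is infinity multiplied by 0? What is the result of 0 divided by 0? Now look at the following example:

    public static void main(String[] args) {
        float fPos=Float.POSITIVE_INFINITY;
        float fNeg=Float.NEGATIVE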
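The questions above come down to IEEE 754 rules, which are not specific to Java. The sketch below illustrates them in Python rather than Java, to match the other examples on this page. One difference to keep in mind: plain Python floats raise ZeroDivisionError on division by zero, so NumPy scalars are used for the division cases. All variable names are illustrative.

```python
import math
import numpy as np

pos_inf = math.inf

# Infinity times zero and infinity minus infinity are undefined: NaN.
inf_times_zero = pos_inf * 0.0
inf_minus_inf = pos_inf - pos_inf

# Single- and double-precision infinities compare equal to each other.
same_infinity = np.float32(np.inf) == np.float64(np.inf)

# NumPy scalars follow IEEE 754 for division; silence the warnings here.
with np.errstate(divide="ignore", invalid="ignore"):
    one_over_zero = np.float64(1.0) / np.float64(0.0)    # +infinity
    zero_over_zero = np.float64(0.0) / np.float64(0.0)   # NaN

print(inf_times_zero, inf_minus_inf, same_infinity, one_over_zero, zero_over_zero)
```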

JavaScript: what is NaN, Object or primitive?

空扰寡人 submitted on 2020-02-27 23:11:51
Question: What is NaN, an Object or a primitive? NaN - Not a Number

Answer 1: NaN is a primitive Number value. Just like 1, 2, etc.

Answer 2: It's a primitive. You can check in a number of ways:
typeof NaN gives "number", not "object".
Add a property, it disappears.

    NaN.foo = "hi";
    console.log(NaN.foo) // undefined

NaN instanceof Number gives false (but we know it's a number, so it must be a primitive).
It wouldn't really make sense for NaN to be an object, because expressions like 0 / 0 need to result in NaN, and

When to use NaN or +/-Infinity?

守給你的承諾、 submitted on 2020-02-23 09:32:13
Question: What are the benefits of NaN, PositiveInfinity or NegativeInfinity for float and double? When should we use or avoid them? If there are constants like these, why does float.Parse("a") throw an error rather than returning float.NaN? How is NaN different from null? Why is division by zero even possible for floating types?

Answer 1: Infinities are used because they are part of the arithmetic system supported by floating point. There are various operations, such as dividing by zero, in which

Negation in np.select() condition

别等时光非礼了梦想. submitted on 2020-02-23 07:13:43
Question: Here is my code:

    import pandas as pd
    import numpy as np
    df = pd.DataFrame({
        'var1': ['a', 'b', 'c',np.nan, np.nan],
        'var2': [1, 2, np.nan , 4, np.nan]
    })
    conditions = [
        (not(pd.isna(df["var1"]))) & (not(pd.isna(df["var2"]))),
        (pd.isna(df["var1"])) & (pd.isna(df["var2"]))]
    choices = ["No missing", "Both missing"]
    df['Result'] = np.select(conditions, choices, default=np.nan)

Output:

    File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py", line 1478, in __nonzero__ f"The truth
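The error arises because Python's `not` asks for a single truth value of a whole Series, which pandas refuses to give. Element-wise negation uses `~` (or the `notna()` method) instead. A minimal sketch of that fix follows. The string default "Partially missing" is an assumption made here, since mixing string choices with a float `np.nan` default is rejected by some NumPy versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'var1': ['a', 'b', 'c', np.nan, np.nan],
    'var2': [1, 2, np.nan, 4, np.nan],
})

conditions = [
    # notna() is the element-wise negation of isna(); ~pd.isna(...) also works.
    df['var1'].notna() & df['var2'].notna(),
    df['var1'].isna() & df['var2'].isna(),
]
choices = ["No missing", "Both missing"]

# Assumed string default so that all outputs share one dtype.
df['Result'] = np.select(conditions, choices, default="Partially missing")

print(df)
```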