
How to avoid NaN when using np.where function in python?

青春壹個敷衍的年華 · Submitted 2020-01-05 08:34:08

Question: I have a dataframe like this:

    col1  col2   col3
    1     apple  a,b
    2     car    c
    3     dog    a,c
    4     dog    NaN

I tried to create three new columns, a, b and c, which give '1' if col3 contains a specific string, and '0' otherwise:

    df['a'] = np.where(df['col3'].str.contains('a'), 1, 0)
    df['b'] = np.where(df['col3'].str.contains('b'), 1, 0)
    df['c'] = np.where(df['col3'].str.contains('c'), 1, 0)

But it seems the NaN values were not handled correctly. It gives me a result like:

    col1  col2   col3  a  b  c
    1     apple  a,b   1  1  0
    2     car    c     0  0  1
    3     dog    a…
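The excerpt above is truncated, but the cause is worth noting: Series.str.contains returns NaN for missing values, and NaN is truthy to np.where, so the NaN row would get 1s. Passing na=False makes missing values count as non-matches. A minimal sketch using the question's data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": [1, 2, 3, 4],
    "col2": ["apple", "car", "dog", "dog"],
    "col3": ["a,b", "c", "a,c", np.nan],
})

# na=False makes str.contains return False (rather than NaN) for missing
# values, so np.where yields 0 instead of treating NaN as truthy.
for letter in ["a", "b", "c"]:
    df[letter] = np.where(df["col3"].str.contains(letter, na=False), 1, 0)

print(df)
```

With na=False, row 4 (col3 = NaN) gets 0 in all three indicator columns.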


Memory efficient way to store bool and NaN values in pandas

徘徊边缘 · Submitted 2020-01-04 07:37:07

Question: I am working with quite a large dataset (over 4 GB), which I imported into pandas. Quite a few columns in this dataset are simple True/False indicators, and naturally the most memory-efficient way to store them would be with a bool dtype. However, the columns also contain some NaN values that I want to preserve. Right now this leads to the column having dtype float (with values 1.0, 0.0 and np.nan) or object, but both use far too much memory. As an example:

    df = pd…
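The example is cut off, but the approach this question usually lands on is pandas' nullable boolean extension dtype (available since pandas 1.0), which stores one byte per value plus a one-byte validity mask instead of an 8-byte float. A sketch under that assumption:

```python
import numpy as np
import pandas as pd

# A float column holding 1.0 / 0.0 / NaN costs 8 bytes per value.
s_float = pd.Series([1.0, 0.0, np.nan] * 1000)

# The nullable "boolean" dtype preserves the missing values (as pd.NA)
# at roughly 2 bytes per value: 1 byte of data plus 1 byte of mask.
s_bool = s_float.astype("boolean")

print(s_float.memory_usage(deep=True))
print(s_bool.memory_usage(deep=True))
```

The missing entries survive the conversion, so no information is lost relative to the float representation.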


Find out minimum value from a multidimensional array with NaNs

…衆ロ難τιáo~ · Submitted 2020-01-04 04:04:21

Question: I have a two-dimensional array (double[,]) and I want to find its minimum. I tried Linq.Select.Min, but since my arrays typically contain NaN values, the minimum is always NaN. So I need some way to find the minimum value that skips the NaNs. Any help is much appreciated!

Answer 1: Today is the day for extension methods! Use this to have a generic Min() function on all your double[,]! Here are some generic [,] extensions. Please note these will only be available for types that…

How to divide by zero without error

随声附和 · Submitted 2020-01-03 09:46:17

Question: I need to obtain float NaN and infinity, but I can't use constructions such as

    0. / 0.
    1. / 0.

because they cause compile-time error C2124: divide or mod by zero.

EDIT: It is nice to have answers for where I can get these numbers (+1 for each), but is it actually possible to divide by zero?

Answer 1: You can simply return a NaN or an infinity, for example:

    return std::numeric_limits<float>::quiet_NaN();

or

    return std::numeric_limits<float>::infinity();

See std::numeric_limits, from header <limits>.

Answer 2: Use std…

Pandas Changing the format of NaN values when saving to CSV

左心房为你撑大大i · Submitted 2020-01-03 08:41:39

Question: I am working with a df and using numpy to transform data, including setting blanks (or '') to NaN. But when I write the df to CSV, the output contains the string 'nan' as opposed to being NULL. I have looked around but can't find a workable solution. Here's the basic issue:

df:

    index  x    y    z
    0      1    NaN  2
    1      NaN  3    4

CSV output:

    index  x    y    z
    0      1    nan  2
    1      nan  3    4

I have tried a few things to set 'nan' to NULL, but the CSV output results in a blank rather than NULL:

    dfDemographics = dfDemographics…
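The excerpt ends mid-attempt, but the usual answer is that DataFrame.to_csv takes an na_rep parameter that controls exactly what missing values become in the output file, so no pre-processing of the frame is needed. A sketch with the question's layout:

```python
import io
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1, np.nan], "y": [np.nan, 3], "z": [2, 4]})

buf = io.StringIO()
# na_rep sets the text written for every missing value;
# "" produces blank fields, "NULL" an explicit NULL token.
df.to_csv(buf, index=False, na_rep="NULL")
out = buf.getvalue()
print(out)
```

Writing to an actual file path instead of a StringIO buffer works the same way.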


DataFrame correlation produces NaN although its values are all integers

ぐ巨炮叔叔 · Submitted 2020-01-02 03:53:30

Question: I have a dataframe df:

    df = pandas.DataFrame(pd.read_csv(loggerfile, header=2))
    values = df.as_matrix()
    df2 = pd.DataFrame.from_records(values, index=datetimeIdx, columns=Columns)

EDIT: Now reading the data this way as suggested:

    df2 = pd.read_csv(loggerfile, header=None, skiprows=[0, 1, 2])

Sample:

       0                         1              2       3  4  5  6   7  8  \
    0  2014-03-19T12:44:32.695Z  1395233072695  703425  0  2  1  13  5  21
    1  2014-03-19T12:44:32.727Z  1395233072727  703425  0  2  1  13  5  21

       9   10  11  12   13  14
    0  25  0   25  209  0   145…
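The question is cut off before the corr() call, but a frequent cause of an all-NaN correlation with data like this is that DataFrame.from_records on a mixed matrix leaves every column with dtype object, so nothing is treated as numeric. A hedged sketch of the usual remedy, coercing columns to numeric first (the column values here are invented):

```python
import pandas as pd

# Columns that look numeric but carry dtype object,
# as happens after from_records on a mixed-type matrix.
df = pd.DataFrame({"a": ["1", "2", "3", "4"],
                   "b": ["2", "4", "6", "8"]}, dtype=object)

# Coerce every column to a real numeric dtype;
# unparseable cells become NaN instead of raising.
numeric = df.apply(pd.to_numeric, errors="coerce")
corr = numeric.corr()
print(corr)
```

After coercion, corr() sees genuine numeric columns and produces real coefficients.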

Scipy selects nan as inputs while minimizing

荒凉一梦 · Submitted 2020-01-01 19:53:13

Question: I have this objective function (in Python):

    actions = [...]  # some array
    Na = len(actions)

    # maximize p0 * qr(s,a0,b0) + ... + pn * qr(s,an,bn)
    def objective(x):
        p = x[:Na]        # p is a probability distribution
        b = x[Na:2 * Na]  # b is an array of positive, unbounded scalars
        q = np.array([qr(s, actions[a], b[a]) for a in range(0, Na)])  # s is an array
        rez = -np.dot(p, q)  # np stands for the numpy library
        return rez

qr and qc are regression trees; these are functions mapping arrays to scalars. I have…
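The post is truncated, but the symptom it describes (the optimizer evaluating the objective at NaN or invalid inputs) typically happens because the solver takes trial steps outside the valid domain, e.g. negative b. Passing bounds (and a simplex constraint for p) to scipy.optimize.minimize keeps every trial point valid. A sketch under stated assumptions: qr_stub below is an invented stand-in for the question's regression tree, undefined outside its domain.

```python
import numpy as np
from scipy.optimize import minimize

Na = 3

def qr_stub(b):
    # Invented stand-in for qr(s, a, b): NaN for b < 0,
    # mimicking the domain problem in the question.
    return np.sqrt(b) - b

def objective(x):
    p = x[:Na]        # probability distribution over actions
    b = x[Na:2 * Na]  # positive scalars
    q = np.array([qr_stub(b[a]) for a in range(Na)])
    return -np.dot(p, q)

x0 = np.concatenate([np.full(Na, 1.0 / Na), np.ones(Na)])
bounds = [(0.0, 1.0)] * Na + [(0.0, None)] * Na          # p in [0, 1], b >= 0
constraints = [{"type": "eq", "fun": lambda x: np.sum(x[:Na]) - 1.0}]

res = minimize(objective, x0, method="SLSQP",
               bounds=bounds, constraints=constraints)
print(res.x, res.fun)
```

An alternative, if bounds cannot express the valid region, is to wrap the objective so it returns a large penalty value whenever the inputs are invalid, keeping the solver away from that region.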