nan

Negation in np.select() condition

半腔热情 submitted on 2020-02-23 07:13:39
Question: Here is my code:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'var1': ['a', 'b', 'c', np.nan, np.nan],
        'var2': [1, 2, np.nan, 4, np.nan]
    })
    conditions = [
        (not(pd.isna(df["var1"]))) & (not(pd.isna(df["var2"]))),
        (pd.isna(df["var1"])) & (pd.isna(df["var2"]))]
    choices = ["No missing", "Both missing"]
    df['Result'] = np.select(conditions, choices, default=np.nan)

Output:

    File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py", line 1478, in __nonzero__ f"The truth
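Python's `not` asks a whole Series for a single truth value, which pandas refuses to provide. A minimal sketch of the element-wise alternative, using `~` on the boolean masks. The `"Some missing"` default is a hypothetical label introduced here so that every choice is a string; mixing string choices with a `np.nan` default can fail on some NumPy versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "var1": ["a", "b", "c", np.nan, np.nan],
    "var2": [1, 2, np.nan, 4, np.nan],
})

# Element-wise masks: True where the value is missing.
var1_missing = pd.isna(df["var1"])
var2_missing = pd.isna(df["var2"])

conditions = [
    ~var1_missing & ~var2_missing,  # ~ negates each element, unlike `not`
    var1_missing & var2_missing,
]
choices = ["No missing", "Both missing"]

# "Some missing" is an assumed label, not part of the original question.
df["Result"] = np.select(conditions, choices, default="Some missing")
print(df)
```

`df["var1"].notna()` is an equivalent way to write `~pd.isna(df["var1"])`.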

Sort rows of a dataframe in descending order of NaN counts

断了今生、忘了曾经 submitted on 2020-02-02 11:40:10
Question: I'm trying to sort the following Pandas DataFrame:

             RHS  age  height  shoe_size  weight
    0     weight  NaN     0.0        0.0     1.0
    1  shoe_size  NaN     0.0        1.0     NaN
    2  shoe_size  3.0     0.0        0.0     NaN
    3     weight  3.0     0.0        0.0     1.0
    4        age  3.0     0.0        0.0     1.0

in such a way that the rows with a greater number of NaN columns are positioned first. More precisely, in the above df, the row with index 1 (2 NaNs) should come before the row with index 0 (1 NaN). What I do now is:

    df.sort_values(by=['age', 'height', 'shoe_size', 'weight'], na
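One way to get the requested ordering, sketched under the assumption that ties may stay in their original order: count the NaNs per row, sort those counts in descending order, and reindex the frame. The names `nan_counts`, `order`, and `sorted_df` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "RHS": ["weight", "shoe_size", "shoe_size", "weight", "age"],
    "age": [np.nan, np.nan, 3.0, 3.0, 3.0],
    "height": [0.0, 0.0, 0.0, 0.0, 0.0],
    "shoe_size": [0.0, 1.0, 0.0, 0.0, 0.0],
    "weight": [1.0, np.nan, np.nan, 1.0, 1.0],
})

# Number of NaN cells in each row.
nan_counts = df.isna().sum(axis=1)

# Row labels ordered from most NaNs to fewest; mergesort is a stable sort.
order = nan_counts.sort_values(ascending=False, kind="mergesort").index

sorted_df = df.loc[order]
print(sorted_df)
```

The row with index 1 (two NaNs) ends up first, followed by the rows with one NaN and then the complete rows.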

How to delete a column in pandas dataframe based on a condition?

戏子无情 submitted on 2020-01-31 08:17:00
Question: I have a pandas DataFrame with many NaN values in it. How can I drop the columns for which number_of_na_values > 2000? I tried to do it like this:

    toRemove = set()
    naNumbersPerColumn = df.isnull().sum()
    for i in naNumbersPerColumn.index:
        if(naNumbersPerColumn[i]>2000):
            toRemove.add(i)
    for i in toRemove:
        df.drop(i, axis=1, inplace=True)

Is there a more elegant way to do it?

Answer 1: Here's another alternative to keep the columns that have less than or equal to the specified number of nans in each
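A minimal sketch of a loop-free version, keeping only the columns whose NaN count is at or below a limit. This is not necessarily the approach the truncated answer goes on to give. The toy frame and `max_nans = 2` are stand-ins for the asker's data and the 2000 threshold.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "mostly_missing": [np.nan, np.nan, np.nan, 1.0],
    "some_missing": [1.0, np.nan, 3.0, 4.0],
    "complete": [1.0, 2.0, 3.0, 4.0],
})

max_nans = 2  # stand-in for the 2000 in the question

# NaN count per column, then a boolean mask over the columns.
nan_counts = df.isna().sum()
kept = df.loc[:, nan_counts <= max_nans]
print(kept.columns.tolist())
```

Because `kept` is a new frame, the original `df` is left untouched, which avoids the repeated `inplace=True` drops.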

Is static_cast<double>(std::nanf("")) well defined?

≡放荡痞女 submitted on 2020-01-24 13:11:14
Question: The title pretty much asks it all, but to provide an MCVE:

    #include <cmath>

    int main() {
        float f = std::nanf("");
        double d = static_cast<double>(f);
        return 0;
    }

Under MSVC 2017, both f and d report as nan, but that proves nothing, since it may be that the static_cast is undefined behavior. In a similar vein, 0.0f / 0.0f produces -nan(ind), which I am going to assume is a signalling NaN; does that follow the same defined / undefined rule? Ditto inf.

Answer 1: This looks guaranteed by the standard, we

Deserialization of non-finite floating-point numbers fails even with appropriate facets

余生颓废 submitted on 2020-01-23 14:08:23
Question: I need to use Boost.Serialization to serialize floating-point numbers. Since NaN and infinities cannot natively be read from an input stream, I am trying to use the facets in boost/math/special_functions. I have tested them on my platform using code similar to the examples found here: http://www.boost.org/doc/libs/1_50_0/libs/math/doc/sf_and_dist/html/math_toolkit/utils/fp_facets/intro.html However, the following code still fails to properly deserialize non-finite floating-point values

Python Drop all instances of Feature from DF if NaN thresh is met

与世无争的帅哥 submitted on 2020-01-23 13:00:11
Question: Using df.dropna(thresh = x, inplace=True), I can successfully drop the rows lacking at least x non-NaN values. But my df looks like this:

              2001  2002  2003  2004
    bob  A     123    31     4    12
    bob  B      41     1    56    13
    bob  C     nan   nan     4   nan
    bill A     451     8   nan    24
    bill B      32     5    52     6
    bill C     623    12    41    14
    #Repeating features (A,B,C) for each index/name

This drops the one row/instance where the thresh= condition is met, but leaves the other instances of that feature. What I want is something that drops the entire feature, if
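A hedged sketch of one reading of the request: if any row of a feature has fewer than `thresh` non-NaN values, drop every row of that feature. The level names `name` and `feature` and the value `thresh = 2` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("bob", "A"), ("bob", "B"), ("bob", "C"),
     ("bill", "A"), ("bill", "B"), ("bill", "C")],
    names=["name", "feature"],  # assumed level names
)
df = pd.DataFrame(
    [[123, 31, 4, 12],
     [41, 1, 56, 13],
     [np.nan, np.nan, 4, np.nan],
     [451, 8, np.nan, 24],
     [32, 5, 52, 6],
     [623, 12, 41, 14]],
    index=index,
    columns=[2001, 2002, 2003, 2004],
)

thresh = 2  # assumed: minimum number of non-NaN values a row must have

# Rows that dropna(thresh=thresh) would remove.
fails = df.notna().sum(axis=1) < thresh

# Features that have at least one failing row.
bad_features = fails[fails].index.get_level_values("feature").unique()

# Keep only the rows whose feature never failed.
result = df[~df.index.get_level_values("feature").isin(bad_features)]
print(result)
```

Here only the row (bob, C) fails the threshold, so both the bob and bill rows for feature C are removed.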

Alternatives to nullable types in C#

坚强是说给别人听的谎言 submitted on 2020-01-22 10:56:05
Question: I am writing algorithms that work on series of numeric data, where sometimes a value in the series needs to be null. However, because this application is performance critical, I have avoided the use of nullable types. I have perf tested the algorithms to specifically compare the performance of nullable types vs non-nullable types, and in the best-case scenario nullable types are 2x slower, but often far worse. The data type most often used is double, and currently the chosen

Drop Series from Entire DF if Row has at least 2 NaN values

十年热恋 submitted on 2020-01-16 03:50:09
Question: I'm having an issue with dropping all instances of a given series from the whole DF given a .dropna(thresh= x), which I thought had been previously resolved. Dataframe (note that it is multi-indexed):

              2001  2002  2003  2004
    bob  A     123    31     4    12
    bob  B      41     1    56    13
    bob  C     nan   nan     4   nan
    bill A     451     8   nan    24
    bill B      32     5    52     6
    bill C     623    12    41    14
    #Repeating features (A,B,C) for each index/name

This drops the one row/instance where the thresh= condition is met, but leaves the other instances of that feature. drop
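Going by the title's criterion (a row with at least 2 NaN values disqualifies its whole series), a group-wise filter is one possible sketch. It assumes the second index level holds the series label and that it is addressed by position (`level=1`); the names `max_nans_per_row` and `result` are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("bob", "A"), ("bob", "B"), ("bob", "C"),
     ("bill", "A"), ("bill", "B"), ("bill", "C")]
)
df = pd.DataFrame(
    [[123, 31, 4, 12],
     [41, 1, 56, 13],
     [np.nan, np.nan, 4, np.nan],
     [451, 8, np.nan, 24],
     [32, 5, 52, 6],
     [623, 12, 41, 14]],
    index=index,
    columns=[2001, 2002, 2003, 2004],
)

max_nans_per_row = 1  # a row with 2 or more NaNs disqualifies its group

# Group by the second index level (A, B, C) and keep a group only if
# every one of its rows stays within the NaN limit.
result = df.groupby(level=1).filter(
    lambda g: bool((g.isna().sum(axis=1) <= max_nans_per_row).all())
)
print(result)
```

The row (bill, A) has a single NaN and is kept, while (bob, C) has three, so every row labelled C is dropped.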

pandas cut a series with nan values

你说的曾经没有我的故事 submitted on 2020-01-15 06:21:32
Question: I would like to apply the pandas cut function to a series that includes NaNs. The desired behavior is that it buckets the non-NaN elements and returns NaN for the NaN elements.

    import pandas as pd
    numbers_with_nan = pd.Series([3,1,2,pd.NaT,3])
    numbers_without_nan = numbers_with_nan.dropna()

The cutting works fine for the series without NaNs:

    pd.cut(numbers_without_nan, bins=[1,2,3], include_lowest=True)

    0      (2.0, 3.0]
    1    (0.999, 2.0]
    2    (0.999, 2.0]
    4      (2.0, 3.0]

When I cut the series that
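A small sketch of the desired behavior, assuming the missing entry can be written as `np.nan` rather than `pd.NaT` (which is the missing-value marker for datetime-like data). With a plain float series, `pd.cut` leaves the NaN position as NaN and buckets the rest.

```python
import numpy as np
import pandas as pd

# Same values as in the question, but with np.nan as the missing marker
# so the series has a numeric (float64) dtype.
numbers_with_nan = pd.Series([3, 1, 2, np.nan, 3])

binned = pd.cut(numbers_with_nan, bins=[1, 2, 3], include_lowest=True)
print(binned)
```

The result has the same index as the input, so it can be assigned back as a new column without realigning after a `dropna()`.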