
Cannot replace special characters in a Python pandas dataframe

女生的网名这么多〃 · Submitted on 2021-02-07 09:55:41

Question: I'm working with Python 3.5 on Windows. I have a dataframe in which a str-typed 'titles' column contains headline titles, some of which include special characters such as â, €, and ˜. I am trying to replace these with an empty string '' using pandas.replace. I have tried various iterations and nothing works. I am able to replace regular characters, but these special characters just don't seem to work. The code runs without error, but the replacement simply does not occur, and instead the original title…
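Characters like â, €, and ˜ appearing together are typically mojibake (UTF-8 bytes decoded as cp1252). One common workaround, sketched below with a hypothetical sample title, is to strip everything outside the printable ASCII range with a regex; note that `Series.str.replace` needs `regex=True` for a pattern like this to be treated as a regex rather than a literal string:

```python
import pandas as pd

# Hypothetical data reproducing the symptom: mojibake characters in 'titles'
df = pd.DataFrame({"titles": ["headlineâ€˜ one", "plain title"]})

# Remove every non-ASCII character; plain .replace("â", "") on the DataFrame
# only matches whole cell values, which is a common reason "nothing happens".
df["titles"] = df["titles"].str.replace(r"[^\x00-\x7F]", "", regex=True)
```

If the characters should become spaces instead, replace with `" "` and collapse runs with a second `str.replace(r"\s+", " ", regex=True)`.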

How to replace dataframe column values with NaN based on a condition?

坚强是说给别人听的谎言 · Submitted on 2021-02-07 09:53:45

Question: For example, the dataframe looks like:

DF = pd.DataFrame([[1, 1], [2, 120], [3, 25], [4, np.NaN], [5, 45]], columns=["ID", "Age"])

In the Age column, values below 5 or greater than 100 have to be converted to NaN. Any help is appreciated!

Answer 1: Use where together with between:

df.Age = df.Age.where(df.Age.between(5, 100))
df
   ID   Age
0   1   NaN
1   2   NaN
2   3  25.0
3   4   NaN
4   5  45.0

Source: https://stackoverflow.com/questions/53983563/how-to-replace-a-dataframe-column-values-with-nan-based-on-a-condition
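A runnable version of the accepted approach, using the DataFrame from the question:

```python
import numpy as np
import pandas as pd

DF = pd.DataFrame([[1, 1], [2, 120], [3, 25], [4, np.nan], [5, 45]],
                  columns=["ID", "Age"])

# where() keeps values for which the condition is True and sets the rest to
# NaN; between(5, 100) is inclusive on both bounds, and NaN itself fails the
# condition, so existing NaNs stay NaN.
DF["Age"] = DF["Age"].where(DF["Age"].between(5, 100))
```

`DF["Age"].mask(~DF["Age"].between(5, 100))` is the equivalent inverted form.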

Cannot convert non-finite values (NA or inf) to integer [duplicate]

不问归期 · Submitted on 2021-02-07 09:45:59

Question: This question already has answers here: Convert Pandas column containing NaNs to dtype `int` (17 answers). Closed 2 years ago.

I have a dataframe that looks like this:

   survived  pclass     sex      age  sibsp  parch      fare  embarked
0         1       1  female  29.0000      0      0  211.3375         S
1         1       1    male   0.9167      1      2  151.5500         S
2         0       1  female   2.0000      1      2  151.5500         S
3         0       1    male  30.0000      1      2  151.5500         S
4         0       1  female  25.0000      1      2  151.5500         S

I want to convert 'sex' to 0/1 coding, and I used isnull to check that there is no NA in the column. However, on…
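The error in the title means some NaN survived into the column before the integer cast (for example, a value `map` didn't match). A minimal reproduction and two common fixes, sketched with hypothetical data:

```python
import numpy as np
import pandas as pd

# Hypothetical 'sex' column with one stray missing value
df = pd.DataFrame({"sex": ["female", "male", np.nan]})

# map() leaves unmatched/NaN entries as NaN, so a plain astype(int) would
# raise "Cannot convert non-finite values (NA or inf) to integer".
codes = df["sex"].map({"female": 0, "male": 1})

# Option 1: pandas' nullable integer dtype, which tolerates missing values
codes_nullable = codes.astype("Int64")

# Option 2: fill the NaNs with a sentinel first, then cast to plain int
codes_filled = codes.fillna(-1).astype(int)
```

Option 1 keeps the missing values as `pd.NA`; option 2 trades them for a sentinel but yields an ordinary int64 column.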

How to Replace All the “nan” Strings with Empty String in My DataFrame?

雨燕双飞 · Submitted on 2021-02-07 09:27:30

Question: I have "None" and "nan" strings scattered in my dataframe. Is there a way to replace all of them with an empty string "" or with NaN so they do not show up when I export the dataframe as an Excel sheet?

Simplified example (note: the nan values in col4 are not strings):

ID  col1   col2    col3    col4
1   Apple  nan     nan     nan
2   None   orange  None    nan
3   None   nan     banana  nan

The output should look like this after all the "None" and "nan" strings are replaced by empty strings "":

ID  col1   col2    col3    col4
1   Apple                  nan
2          orange          nan
3                  banana  nan
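Because the stray values here are the literal strings "None" and "nan", a whole-cell `DataFrame.replace` with a dict handles them while leaving the real NaN in col4 untouched. A sketch using the example data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID":   [1, 2, 3],
    "col1": ["Apple", "None", "None"],
    "col2": ["nan", "orange", "nan"],
    "col3": ["nan", "None", "banana"],
    "col4": [np.nan, np.nan, np.nan],   # real NaN values, not strings
})

# Dict form of replace matches whole cell values, so only the literal
# strings "None" and "nan" are replaced; float NaN in col4 is unaffected.
cleaned = df.replace({"None": "", "nan": ""})

# On export, na_rep controls how real NaN is written (it defaults to ""):
# cleaned.to_excel("out.xlsx", index=False, na_rep="")
```

If "nan" could also appear inside longer strings, a `regex=True` replace with anchored patterns (`r"^nan$"`) would express the same whole-cell intent explicitly.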

How to delete empty data.frame in a list after subsetting in R [duplicate]

风流意气都作罢 · Submitted on 2021-02-07 09:26:32

Question: This question already has answers here: Filtering list of dataframes based on the number of observations in each dataframe (4 answers). Closed 1 year ago.

Suppose I'm subsetting from a list of named data.frames with respect to a subsetting variable called long. After subsetting, some data.frames in the list may be empty because nothing in them matched the subset. I was wondering how I could delete all such empty data.frames from my final output. A simple example, and my unsuccessful…
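The question is about R (in R, `Filter(function(d) nrow(d) > 0, lst)` is the usual idiom), but the same filter-then-drop-empties pattern can be sketched in pandas terms; the column name `long` comes from the question, while the data and threshold below are hypothetical:

```python
import pandas as pd

# Hypothetical named collection of dataframes, standing in for R's named list
dfs = {
    "a": pd.DataFrame({"long": [1, 2, 3]}),
    "b": pd.DataFrame({"long": [0]}),
}

# Step 1: subset each dataframe on the 'long' variable
subset = {name: d[d["long"] > 1] for name, d in dfs.items()}

# Step 2: drop any dataframe that came out empty
non_empty = {name: d for name, d in subset.items() if not d.empty}
```

The two comprehensions mirror the two R steps: `lapply` for the subsetting, then a filter on row count for the cleanup.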

easy multidimensional numpy ndarray to pandas dataframe method?

二次信任 · Submitted on 2021-02-07 09:17:12

Question: I have a 4-D numpy.ndarray, e.g.

myarr = np.random.rand(10, 4, 3, 2)
dims = {'time': range(1, 11), 'sub': range(1, 5), 'cond': ['A', 'B', 'C'], 'measure': ['meas1', 'meas2']}

but possibly with higher dimensions. How can I create a pandas.DataFrame with a MultiIndex, just passing the dimensions as indexes, without further manual adjustments (reshaping the ndarray into 2-D shape)? I can't wrap my head around the reshaping, not even really in 3 dimensions quite yet, so I'm searching for an 'automatic' method if possible. What…
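One standard approach, shown as a sketch for the 4-D case: build the full cross-product of dimension labels with `pd.MultiIndex.from_product` and pair it with `ravel()`. Both iterate the last axis fastest (C order), so labels and values line up for any number of dimensions:

```python
import numpy as np
import pandas as pd

myarr = np.random.rand(10, 4, 3, 2)
dims = {
    "time": range(1, 11),
    "sub": range(1, 5),
    "cond": ["A", "B", "C"],
    "measure": ["meas1", "meas2"],
}

# One index tuple per array element, in the same (row-major) order as ravel()
index = pd.MultiIndex.from_product(list(dims.values()), names=list(dims.keys()))
df = pd.DataFrame({"value": myarr.ravel()}, index=index)
```

The only requirement is that `dims` lists the axes in the same order and with the same lengths as `myarr.shape`; the "reshaping" is then just the flatten.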

Iterate each row in a dataframe, store it in val and pass as parameter to Spark SQL query

浪尽此生 · Submitted on 2021-02-07 08:44:35

Question: I am trying to fetch rows from a lookup table (3 rows and 3 columns), iterate row by row, and pass the values from each row to a Spark SQL query as parameters.

DB | TBL   | COL
---|-------|----
db | txn   | ID
db | sales | ID
db | fee   | ID

I tried this in the spark shell for one row and it worked, but I am finding it difficult to iterate over the rows:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val db_name: String = "db"
val tbl_name: String = "transaction"
val unique_col: String = "transaction_number"
val …
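The question uses Scala/Spark; as a language-neutral sketch of the row-by-row idea, here is a pandas stand-in that builds one query string per lookup row. The table contents come from the question; the query template is a hypothetical example:

```python
import pandas as pd

# The lookup table from the question, reproduced as a pandas DataFrame
lookup = pd.DataFrame({
    "DB":  ["db", "db", "db"],
    "TBL": ["txn", "sales", "fee"],
    "COL": ["ID", "ID", "ID"],
})

# Iterate row by row and substitute each row's values into a query template
queries = [
    f"SELECT COUNT(DISTINCT {row.COL}) FROM {row.DB}.{row.TBL}"
    for row in lookup.itertuples(index=False)
]
```

In Spark itself the analogous move is to bring the small lookup DataFrame to the driver with `collect()` and loop over the resulting rows, interpolating each row's fields into the SQL string before calling `spark.sql(...)`.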