na | 易学教程

R - convert nan to 0 results in all 0's

Question: I have a data frame containing NaNs that I'd like to convert to 0s. I wrote a function that I think should work:

    fix_nan <- function(x){
      return(x[is.nan(x)] <- 0)
    }

And then I apply it to the data frame:

    train_e <- structure(list(pack_id = structure(1:10, .Label = c("1", "2", "4", "5", "7", "8", "9", "10", "11", "14"), class = "factor"), item_1 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), item_2 = c(NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN), item_3 = c(1.45225232891169, 0.613104472886409, NaN

R (arules) Convert dataframe into transactions and remove NA

Question: I have a data frame. My goal is to convert it into transaction data in order to do market basket analysis using the arules package in R. I did some research online on converting a data frame to transaction data, e.g. (How to prep transaction data into basket for arules) and (Transform csv into transactions for arules), but the result I got was different.

    dput(df)
    structure(list(Transaction_ID = c("A001", "A002", "A003", "A004", "A005", "A006"), Fruits = c(NA, "Apple",

Variable in CSV File Contains Numbers But Imported as Character

Question: I have a variable in a dataset (CSV format) that consists only of numbers, but when the dataset is imported into R, it becomes a character variable. Any reasons why? When I tried to coerce it to numeric, a lot of NAs were introduced.

    > df1$Postal_Code<-as.numeric(df1$Postal_Code)
    Warning message: NAs introduced by coercion
    sum(is.na(df1$Postal_Code))
    ## [1] 2822
    sum(is.na(as.numeric(df1$Postal_Code)))
    ## [1] 2837

Source: https://stackoverflow.com/questions/38449245/variable-in-csv-file-contains

Sort dataframe rows independently by values in another dataframe

Question: Suppose two dataframes:

    import pandas as pd
    import numpy as np
    d1 = {}
    d2 = {}
    np.random.seed(5)
    for col in list("ABCDEF"):
        d1[col] = np.random.randn(12)
        d2[col+'2'] = np.random.random_integers(0,100, 12)
    t_index = pd.date_range(start = '2015-01-31', periods = 12, freq = "M")
    dat1 = pd.DataFrame(d1, index = t_index)
    dat2 = pd.DataFrame(d2, index = t_index)

I want to sort dat1's rows by the rows in dat2 and extract a subset of the ordered data from dat1. Below is an example where the top 5
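One possible reading of "sort dat1's rows by the rows in dat2" can be sketched with NumPy: for each row, take the column order that sorts that row of the key frame and apply it to the same row of the value frame. The helper name `sort_rows_by`, its `top` parameter, the `rank_N` column labels and the descending order are all assumptions made for illustration; the question's example is cut off, so the asker may have intended something different.

```python
import numpy as np
import pandas as pd


def sort_rows_by(values: pd.DataFrame, keys: pd.DataFrame, top: int) -> pd.DataFrame:
    """Reorder each row of `values` by the matching row of `keys`, largest key first.

    Hypothetical helper: keeps only the first `top` reordered values per row.
    """
    # Per row, column positions ordered from the largest key to the smallest
    order = np.argsort(-keys.to_numpy(), axis=1, kind="stable")
    # Apply each row's own ordering to the same row of `values`
    ranked = np.take_along_axis(values.to_numpy(), order, axis=1)
    columns = [f"rank_{i + 1}" for i in range(top)]
    return pd.DataFrame(ranked[:, :top], index=values.index, columns=columns)


# Small made-up frames (not the question's random data)
vals = pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0], "C": [5.0, 6.0]})
keys = pd.DataFrame({"A2": [10, 30], "B2": [30, 10], "C2": [20, 20]})
result = sort_rows_by(vals, keys, top=2)
```

Because every row gets its own ordering, the original column labels no longer apply to the result, which is why the sketch relabels the columns by rank.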

Pandas Dataframe with NA values throwing ValueError

Question: I have a dataframe in pandas that looks like this:

    df.head(2)
    Out[25]:
                        CompanyName Region MachineType
    recvd_dttm
    2014-07-13 12:40:40    Company1     NA    Machine1
    2014-07-13 15:31:39    Company2     NA    Machine2

I am first taking data in a certain date range, then trying to get the data whose Region is NA and whose MachineType is Machine1. However, I keep getting this error:

    ValueError: Length mismatch: Expected axis has 4 elements, new values have 3 elements

This code worked until I added the region column and

How to fill NA in R for quasi-same row?

Question: I'm looking for a way to fill NA in duplicated() rows. Some rows are completely identical except that one of them contains an NA, so I decided to fill that NA with the value from the complete row, but I don't see how to do it. Using the duplicated() function, I could have a data frame like this:

    df <- data.frame(
      Year = rnorm(5), hour = rnorm(5), LOT = rnorm(5),
      S123_AA = c('ABF4576','ABF4576','ABF4576','ABF4576','ABF4576'),
      S135_AA = c('ABF5403',NA,'ABF5403','ABF5403','ABF5403'),
      S13_BB = c('BF50343','BF50343',

Weighted average value in the presence of NA values

Question: Here's a very simple example of what I'm dealing with:

    data_stack <- data.table(CompA_value = c(10,20,30,40), CompB_value = c(60,70,80,80), CompC_value = c(NA, NA, NA, 100), CompA_weight = c(0.2, 0.3,0.4,0.4), CompB_weight = c(0.8,0.7,0.6,0.4), CompC_weight = c(NA, NA, NA,0.2))

       CompA_value CompB_value CompC_value CompA_weight CompB_weight CompC_weight
    1:          10          60          NA          0.2          0.8           NA
    2:          20          70          NA          0.3          0.7           NA
    3:          30          80          NA          0.4          0.6           NA
    4:          40          80         100          0.4          0.4          0.2

What I want to do is calculate the weighted
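A minimal sketch of a weighted average that skips missing components, written in Python with NumPy rather than the question's data.table. It assumes, since the sentence above is cut off, that the goal is a per-row weighted average of the component values, and that weights should be renormalised over whichever components are present. Both points are assumptions, not something the question confirms.

```python
import numpy as np

# Same numbers as data_stack above; np.nan stands in for R's NA
values = np.array([
    [10.0, 60.0, np.nan],
    [20.0, 70.0, np.nan],
    [30.0, 80.0, np.nan],
    [40.0, 80.0, 100.0],
])
weights = np.array([
    [0.2, 0.8, np.nan],
    [0.3, 0.7, np.nan],
    [0.4, 0.6, np.nan],
    [0.4, 0.4, 0.2],
])

# A component counts only if both its value and its weight are present
present = ~np.isnan(values) & ~np.isnan(weights)

# Missing components contribute 0 to both the numerator and the denominator
weighted_sum = np.where(present, values * weights, 0.0).sum(axis=1)
weight_total = np.where(present, weights, 0.0).sum(axis=1)
weighted_avg = weighted_sum / weight_total
```

For these four rows the result is approximately 50, 55, 60 and 68. Dividing by the sum of the present weights matters only when the remaining weights do not already add up to 1.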

Excel, Array Formulas, N/A outside of range, and ROW()

Question: I have a problem with ROW() in an array formula in Excel 2013. Example: I make a named range called 'input', say 4 cells wide and 10 high. Then I enter an array formula =ROW(input) one cell wide and 15 cells high. I get 10 numbers: the first is the first row of input, and the rest count up from that, followed by 5 #N/A values. This is as it should be. If instead of =ROW(input) I try one of the following: =IFERROR(ROW(input),"x") or =IF(ISNA(ROW(input)),"x",ROW(input)) to catch the #N/As then what

error glm, NA/NaN/Inf in 'y'

Question: I am trying to fit a GLM to my data. The data (rope_complete) looks like this:

      rope.X...Sound rope.directional.change rope.Time.of.the.shark.in.the.video
    1    5_min_blank                       5                                  23
    2     Snorkeling                      11                                  37
    3          Fish1                       1                                  17
    4          Fish1                       6                                  46
    5         Diving                       6                                  37

Then I wanted to check whether I have NA values: table(is.na(rope_complete)) and saw that I have none: FALSE : 3225. Then I fitted my GLM:

    directional_turn_fit<-glm(rope_complete$rope.directional.change~ rope_complete$rope.X...Sound +offset( log(rope_complete$rope

as.date creates some NAs in dataset

Question: I have a simple little dataset:

    > str(SFdischg)
    'data.frame': 11932 obs. of 4 variables:
     $ date: Factor w/ 11932 levels "1/01/1985","1/01/1986",..: 97 4409 8697 9677 10069 10461 10853 11245 11637 489 ...
     $ ddmm: Factor w/ 366 levels "01-Apr","01-Aug",..: 1 13 25 37 49 61 73 85 97 109 ...
     $ year: int 1984 1984 1984 1984 1984 1984 1984 1984 1984 1984 ...
     $ cfs : int 1500 1430 1500 1850 1810 1830 1850 1880 1970 1980 ...

I would like to have a column of dates so that I can plot temporal data: