I keep getting repeated "unknown column" warnings for all kinds of commands (from str(x) to installing package updates), and I'm not sure how to debug or fix this.
This is an issue with the Diagnostics tool in RStudio (the tool that shows warnings and possible mistakes in your code). It was partially fixed by @kevin-ushey at this commit, in RStudio v1.1.103 or later. The fix is only partial because the warnings still appear, albeit less frequently. The issue has been reported with a reproducible example at https://github.com/rstudio/rstudio/issues/7372 and has been fixed in a pull request for RStudio v1.4 (to be released).
There are several workarounds available; choose the one you prefer:
Disable the code diagnostics for all files in Preferences/Code/Diagnostics.
Disable all diagnostics for a specific file:
Add at the beginning of the opened file(s):
# !diagnostics off
Then save the files and the warnings should stop appearing.
Disable the diagnostics for the variables that cause the warning:
Add at the beginning of the opened file(s):
# !diagnostics suppress=<comma-separated list of variables>
Then save the files and the warnings should stop appearing.
The warnings appear because the RStudio diagnostics tool parses the source code to detect errors, and while performing its checks it accesses columns in your tibble that are not initialized, which triggers the warning. The warnings are not caused by the unrelated commands you run; they appear whenever the RStudio diagnostics are executed (when a file is saved, then modified, when you run something...).
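The same mechanism can be reproduced outside the diagnostics tool. This is a minimal sketch, assuming the tibble package is installed; the column name missing_col is made up for illustration.

```r
library(tibble)

tb <- tibble(id = 1:3)

# Reading a column that does not exist returns NULL and raises a warning.
# tryCatch() captures the warning text instead of printing it.
msg <- tryCatch(
  {
    tb$missing_col
    NA_character_  # reached only if no warning was raised
  },
  warning = function(w) conditionMessage(w)
)

print(msg)
```

A plain data.frame returns NULL silently in the same situation, which is probably why the diagnostics only produce this noise for tibbles.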
I get these warnings when I rename a column using dplyr::rename after reading the data with the readr package.
The old name of the column is not renamed in the spec attribute, so removing the spec attribute makes the warnings go away. Removing the "spec_tbl_df" class also seems like a good idea.
attr(dat, "spec") <- NULL
class(dat) <- setdiff(class(dat), "spec_tbl_df")
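If this cleanup is needed in several places, the two lines can be wrapped in a small function. The sketch below uses only base R; strip_spec is a hypothetical helper name, and the object it is applied to is a stand-in with a placeholder "spec" attribute, not a real readr result.

```r
# Hypothetical helper wrapping the two cleanup lines.
strip_spec <- function(dat) {
  attr(dat, "spec") <- NULL
  class(dat) <- setdiff(class(dat), "spec_tbl_df")
  dat
}

# Stand-in for a readr result: a data frame carrying a placeholder "spec"
# attribute and the extra class (not a real readr column specification).
dat <- data.frame(old_name = 1:3)
attr(dat, "spec") <- "placeholder spec"
class(dat) <- c("spec_tbl_df", class(dat))

clean <- strip_spec(dat)
print(class(clean))
print(names(attributes(clean)))
```

The helper returns a modified copy, so the result has to be assigned back (dat <- strip_spec(dat)) for the change to stick.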
I had this problem when using tibbles together with lapply. The tibble seemed to store the results as a list inside the data frame.
I solved it by calling unlist on the results of lapply before adding them to the tibble.
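A minimal sketch of the difference, assuming the tibble package is installed; the column names as_list and as_vector are made up for illustration.

```r
library(tibble)

tb <- tibble(id = 1:3)

# lapply() always returns a list, one element per input value.
squares <- lapply(tb$id, function(x) x^2)

tb$as_list   <- squares          # stored as a list-column
tb$as_vector <- unlist(squares)  # stored as an ordinary numeric column

str(tb)
```

When every result is a single value, vapply(tb$id, function(x) x^2, numeric(1)) returns an atomic vector directly, so no unlist step is needed.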
Let's say I wanted to select the following column(s)
best.columns = 'id'
For me, the following gave the warning:
df %>% select_(one_of(best.columns))
This, however, worked as expected, although as far as I know dplyr the two should be identical:
df %>% select_(.dots = best.columns)
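In current dplyr versions select_() is deprecated, so it may be worth trying the non-underscore verb instead. This is a sketch assuming dplyr 1.0 or later, where all_of() is available; the data frame is borrowed from the other answers in this thread, and whether this avoids the warning in your session is not something I have verified.

```r
library(dplyr)

df <- data.frame(id = c(1, 1:3), name = c("mary", "jo", "jill", "steve"))
best.columns <- "id"

# all_of() selects the columns named in a character vector and
# raises an error if any of them is missing.
selected <- df %>% select(all_of(best.columns))

print(names(selected))
```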
Converting the class to data.frame solved the problem for me:
library(dplyr)
df <- data.frame(id = c(1,1:3), name = c("mary", "jo", "jill","steve"))
dfTbl <- df %>%
  group_by(id) %>%
  summarize(n = n())
class(dfTbl) # [1] "tbl_df" "tbl" "data.frame"
dfTbl <- as.data.frame(dfTbl)
class(dfTbl) # [1] "data.frame"
Borrowed the partial script from @adts
I ran into this problem too, except with a tibble created by a dplyr block. Here's a slight modification of sabre's code to show how I came to the same error.
library(dplyr)
df <- data.frame(id = c(1,1:3), name = c("mary", "jo", "jill","steve"))
t <- df %>%
  group_by(id) %>%
  summarize(n = n())
t
str(t)
# newvar does not exist yet, so reading t$newvar triggers the warning
t$newvar[t$id == 1] <- 0
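One way to avoid the warning in this case is to create the column before assigning into part of it. This is a minimal sketch, assuming the tibble package is installed; the tibble is written out by hand to stand in for the summarized result above, and it is named tb here rather than t.

```r
library(tibble)

# Stand-in for the summarized tibble from the dplyr block.
tb <- tibble(id = c(1, 2, 3), n = c(2L, 1L, 1L))

# Initialise the column first; NA_real_ is recycled to every row.
tb$newvar <- NA_real_

# The column now exists, so partial assignment no longer reads an unknown column.
tb$newvar[tb$id == 1] <- 0

print(tb)
```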