tidyverse

readr::read_csv issue: Chinese characters become garbled

僤鯓⒐⒋嵵緔 submitted on 2019-12-19 09:25:23
Question: I'm trying to import a dataset into RStudio, but I am stuck on the Chinese characters, which come out garbled. Here is the code:

    library(tidyverse)
    df <- read_csv("中文,英文\n英文,德文")
    df
    # A tibble: 1 x 2
      `\xd6\xd0\xce\xc4` `Ӣ\xce\xc4`
      <chr>              <chr>
    1 "<U+04E2>\xce\xc4" "<U+00B5>\xc2\xce\xc4"

When I use the base function read.csv, it works well. I guess I am doing something wrong with the encoding, but I see no encoding option in read_csv. How can I do this? Answer 1: This is because the
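The escaped bytes in the output above (`\xd6\xd0\xce\xc4`) match the GBK/GB18030 encoding of 中文, which suggests the text is being read as if it were UTF-8. A minimal sketch of one way to declare the encoding, assuming the data sits in a GB18030-encoded file (the temporary file here is a hypothetical stand-in for the real dataset), is to pass a `locale()` to `read_csv`:

```r
library(readr)

# Hypothetical stand-in for the real file: write the sample data as
# GB18030 bytes, mimicking a CSV saved under a Chinese Windows code page.
path <- tempfile(fileext = ".csv")
txt <- enc2utf8("中文,英文\n英文,德文\n")
bytes <- iconv(txt, from = "UTF-8", to = "GB18030", toRaw = TRUE)[[1]]
writeBin(bytes, path)

# read_csv() has no direct encoding argument; the encoding is declared
# through the locale argument instead.
df <- read_csv(
  path,
  locale = locale(encoding = "GB18030"),
  col_types = "cc"  # both columns as character, which also silences the type message
)
df
```

If the file's encoding is unknown, `readr::guess_encoding(path)` can offer candidates, though its guesses are not always reliable for short inputs.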

Fit a different model for each row of a list-columns data frame

落花浮王杯 submitted on 2019-12-19 08:01:25
Question: What is the best way to fit model formulae that vary by row of a data frame, using the list-column data structure in the tidyverse? In R for Data Science, Hadley presents a terrific example of how to use the list-column data structure to fit many models easily (http://r4ds.had.co.nz/many-models.html#gapminder). I am trying to find a way to fit many models with slightly different formulae. In the example below, adapted from his original example, what is the best way to fit a
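The asker's own example is cut off, so the following is only a sketch of the general idea using the built-in mtcars data rather than gapminder. The three formulae are hypothetical; the point is that a formula can live in a list-column next to the nested data, and `purrr::map2()` can pair each formula with its own data frame:

```r
library(dplyr)
library(tidyr)
library(purrr)

# Illustrative data: mtcars nested by cylinder count, one row per group.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  ungroup() %>%
  arrange(cyl)

# Hypothetical formulae, one per row, stored as a list-column,
# then fitted row by row against the matching nested data frame.
models <- by_cyl %>%
  mutate(
    formula = list(mpg ~ wt, mpg ~ wt + hp, mpg ~ hp),
    model = map2(formula, data, function(f, d) lm(f, data = d))
  )

models
```

Each element of `models$model` is an ordinary `lm` object, so helpers such as `coef()` or `summary()` can be applied to it with `map()` in the same way.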

How to install Tidyverse on Ubuntu 16.04 and 17.04

送分小仙女□ submitted on 2019-12-18 15:17:09
Question: I'm running Ubuntu 16.04 [now 17.04: see note in bold below] and R 3.4.1. I installed the latter this morning, so I presume it's the latest version. I want to install Tidyverse, which I've spent many happy hours with under Windows. But when I run install.packages("tidyverse"), I get errors about unrecognized command line options to gcc. These start when the install reaches the colorspace and munsell packages. I'll show an example at the end of this post, just for munsell. I've not found anyone

What is the difference between as.tibble(), as_data_frame(), and tbl_df()?

和自甴很熟 submitted on 2019-12-18 13:07:16
Question: I remember reading somewhere that as.tibble() is an alias for as_data_frame(), but I don't know exactly what an alias is in programming terminology. Is it similar to a wrapper? So I guess my question comes down to the difference in possible usages between tbl_df() and as_data_frame(): what are the differences between them, if any? More specifically, given a (non-tibble) data frame df, I often turn it into a tibble by using: df <- tbl_df(df) Wouldn't df <- as_data_frame(df) do the
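For readers on newer package versions: in recent releases of tibble and dplyr, all three functions named in the question are deprecated in favor of `as_tibble()`. A minimal sketch of the conversion, with a made-up data frame standing in for `df`:

```r
library(tibble)

# Made-up example data frame standing in for the asker's df.
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

# as_tibble() is the currently recommended conversion function.
tb <- as_tibble(df)
class(tb)
```

The result carries the classes `tbl_df`, `tbl`, and `data.frame`, so it still works anywhere a plain data frame is accepted.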

Importing multiple .csv files with variable column types into R

≡放荡痞女 submitted on 2019-12-17 20:42:52
Question: How can I properly build an lapply call that reads all the .csv files in one directory, loads every column as a string, and then binds the results into one data frame? Per this, I have a way to get all the .csv files loaded and bound into a data frame. Unfortunately, it gets hung up on the variability in how the columns are type cast, which gives me this error: Error: Can not automatically convert from character to integer in column I have tried supplementing the code with the
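One way to sidestep the type-conversion error is to tell readr to read every column as character before binding. The sketch below assumes readr and dplyr are acceptable here; the directory and the two small files are hypothetical and only exist to make the example self-contained:

```r
library(readr)
library(dplyr)

# Hypothetical directory with two files whose "code" column would be
# guessed as different types (character in one, numeric in the other).
csv_dir <- tempfile("csvs")
dir.create(csv_dir)
write_csv(data.frame(id = 1:2, code = c("a", "b")), file.path(csv_dir, "one.csv"))
write_csv(data.frame(id = 3:4, code = c(10, 20)), file.path(csv_dir, "two.csv"))

files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# cols(.default = col_character()) forces every column to character,
# so bind_rows() never has to reconcile mismatched column types.
combined <- bind_rows(
  lapply(files, read_csv, col_types = cols(.default = col_character()))
)
combined
```

Columns can be converted back afterwards with `readr::type_convert(combined)` once all files are in one data frame.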

R googlesheets: Unable to use `gs_auth()` in googlesheets package - Sign In With Google Temporarily Disabled App Not Verified Issue

人走茶凉 submitted on 2019-12-17 16:40:25
Question: I am unable to authenticate my googlesheets package. Every time I run the gs_auth() command, I am taken to Chrome, where I would usually log in to allow the package to access my Google Sheets: However, lately every time I do this I get the following error from Google: Here is my session information: sessionInfo() R version 3.6.1 (2019-07-05) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Debian GNU/Linux 9 (stretch) Matrix products: default BLAS/LAPACK: /usr/lib/libopenblasp-r0.2.19.so

Multiply rows (with row names) in one data frame with matching column names in another

天大地大妈咪最大 submitted on 2019-12-14 03:56:38
Question: I have two data frames:

    df1 <- data.frame(Values=c(0.01,0.05), row.names=c("X", "Y"))
    df1
      Values
    X   0.01
    Y   0.05
    df2 <- data.frame(c(0,1,1), c(1,0,0), c(1,1,1))
    colnames(df2) <- c("X","Y","Z")
    df2
      X Y Z
    1 0 1 1
    2 1 0 1
    3 1 0 1

I wish to perform a rowwise operation on df2, where I multiply every column in df2 by its corresponding row in df1 and then sum the results. For example, for row 1 of df2, I wish to calculate: df2 %>% rowwise %>% mutate(newVAL=(df1["X",]*df2[1,"X"])+(df1["Y",]*df2[1
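The multiply-then-sum described in the question is a matrix-vector product, so one possible approach in base R avoids rowwise() entirely. This is a sketch under the assumption that columns of df2 with no matching row in df1 (here Z) should simply be left out of the sum; the question is cut off before it says so:

```r
df1 <- data.frame(Values = c(0.01, 0.05), row.names = c("X", "Y"))
df2 <- data.frame(X = c(0, 1, 1), Y = c(1, 0, 0), Z = c(1, 1, 1))

# Names present both as df2 columns and df1 rows (assumption: others are ignored).
shared <- intersect(colnames(df2), rownames(df1))

# Each row of df2[, shared] times the weight vector, summed across columns.
df2$newVAL <- as.vector(as.matrix(df2[, shared]) %*% df1[shared, "Values"])
df2
```

For row 1 this computes 0.01 * 0 + 0.05 * 1 = 0.05, matching the hand calculation the question starts to write out.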

Group-specific calculations involving both row-specific and whole-group elements

怎甘沉沦 submitted on 2019-12-14 03:48:17
Question: I am having a little trouble matching the logic of this problem to that of dplyr. Usually, if you want to reduce a group to a single number per group, you use summarise, while if you want to calculate a separate number for each line, you use mutate. But what if you want to make a calculation on the group for each row? In the example below, mloc contains a pointer to pnum, and the goal is to add a new column nm_child which, for each row, counts the number of mloc values within the group

counting values after and before change in value, within groups, generating new variables for each unique shift

假装没事ソ submitted on 2019-12-14 03:39:27
Question: I am looking for a way to count, within id groups, unique occurrences of value shifts in TF in the data tbl. I want to count both forwards and backwards from the point where TF changes between 1 and 0 or 0 and 1. The counts are to be stored in new variables PM##, so that the PM## variables hold each unique shift in TF, in both plus and minus. The MWE below leads to an outcome with 7 PM variables, but my production data can have 15 or more shifts. If a TF value does not change between NAs, I want to mark it 0

creating reproducible example using reprex package in r where a local file is being read

喜夏-厌秋 submitted on 2019-12-14 03:08:06
Question: I often use reprex::reprex to create reproducible examples of R code so that others can help me get rid of errors in my code. Usually, I create minimal examples using datasets like iris or mtcars, and that works well. But I always fail to use reprex when I need to use my own data, because the problem is so specific that I can't rely on datasets from the datasets package. In that case, I get the following error: # loading needed libraries library(ggplot2) library(cowplot) library(devtools) # reading