r-factor | 易学教程

Plotting with ggplot2: “Error: Discrete value supplied to continuous scale” on categorical y-axis

Question: The plotting code below gives Error: Discrete value supplied to continuous scale. What's wrong with this code? It works fine until I try to change the scale, at which point the error appears. I tried to work out a solution from similar questions but couldn't. This is the head of my data: > dput(head(df)) structure(list(`10` = c(0, 0, 0, 0, 0, 0), `33.95` = c(0, 0, 0, 0, 0, 0), `58.66` = c(0, 0, 0, 0, 0, 0), `84.42` = c(0, 0, 0, 0, 0, 0), `110.21` = c(0, 0, 0, 0, 0, 0), `134.16` = c(0, 0, 0, 0, 0, 0),
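The excerpt is cut off before the plotting code, so the exact cause is not visible here. A common trigger for this error is applying a continuous scale to an axis that is mapped to a factor or character column. The sketch below reproduces that situation with hypothetical data (the `time` and `category` columns are assumptions, not the asker's data) and assumes the ggplot2 package is installed.

```r
library(ggplot2)  # assumed to be installed

# Hypothetical data: a numeric x and a categorical (factor) y.
df <- data.frame(
  time = c(1, 2, 3, 4),
  category = factor(c("low", "mid", "high", "mid"),
                    levels = c("low", "mid", "high"))
)

p <- ggplot(df, aes(x = time, y = category)) + geom_point()

# A continuous scale on the factor-valued axis fails when the plot is built.
bad <- tryCatch(
  ggplot_build(p + scale_y_continuous(limits = c(0, 10))),
  error = function(e) e
)
inherits(bad, "error")

# A discrete scale matches the factor, so the plot builds without error.
good <- ggplot_build(p + scale_y_discrete(limits = c("low", "mid", "high")))
```

If the axis is meant to be numeric instead, converting the column to numeric before plotting is the other way out.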

Imported a CSV dataset to R but the values become factors

Question: I am very new to R and I am having trouble accessing a dataset I've imported. I'm using RStudio; I used the Import Dataset function to import my CSV file and pasted the line from the console window into the source window. The code looks as follows: setwd("c:/kalle/R") stuckey <- read.csv("C:/kalle/R/stuckey.csv") point <- stuckey$PTS time <- stuckey$MP However, the data isn't integer or numeric as I am used to but factors, so when I try to plot the variables I only get histograms,
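A numeric column usually ends up as a factor when at least one entry is not a number and strings are converted to factors on import (the default before R 4.0.0). The sketch below uses a hypothetical stand-in for stuckey.csv in which a single "-" entry is assumed to be the culprit; the real file may differ.

```r
# Hypothetical stand-in for stuckey.csv: one "-" entry makes PTS non-numeric.
csv_text <- "PTS,MP\n12,30\n-,25\n20,35"

# stringsAsFactors = TRUE reproduces the pre-4.0.0 default behaviour.
stuckey <- read.csv(text = csv_text, stringsAsFactors = TRUE)
is.factor(stuckey$PTS)

# as.numeric() on a factor returns the internal level codes, not the values,
# so convert via character first. The "-" entry becomes NA (with a warning,
# suppressed here).
pts <- suppressWarnings(as.numeric(as.character(stuckey$PTS)))

# Alternatively, declare the placeholder as missing at import time so the
# column is read as a number from the start.
clean <- read.csv(text = csv_text, na.strings = "-")
is.numeric(clean$PTS)
```

Running str() on the imported data frame is a quick way to see which columns were not read as numbers.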

Factors in R: more than an annoyance?

Question: One of the basic data types in R is the factor. In my experience factors are basically a pain and I never use them. I always convert to characters. I feel oddly like I'm missing something. Are there some important examples of functions that use factors as grouping variables where the factor data type becomes necessary? Are there specific circumstances when I should be using factors? Answer 1: You should use factors. Yes they can be a pain, but my theory is that 90% of why they're a pain is because
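One concrete case where a factor carries information that a character vector cannot is a fixed set of categories, some of which may be absent from the data. A minimal sketch with hypothetical values:

```r
# Hypothetical survey answers; "medium" is a valid category that nobody chose.
sizes <- factor(c("small", "large", "small"),
                levels = c("small", "medium", "large"))

# The factor remembers every level and their order, so the empty category
# is still counted (as zero) and the table follows the declared order.
counts_factor <- table(sizes)

# The character version only knows the values that actually occur.
counts_character <- table(as.character(sizes))
```

The same level information is what modelling and plotting functions use to decide category order and reference groups.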

Create frequency tables for multiple factor columns in R

Question: I am a novice in R. I am compiling a separate manual on the syntax of the common functions and features I use for my work. My sample data frame is as follows: x.sample <- structure(list(Q9_A = structure(c(5L, 3L, 5L, 3L, 5L, 3L, 1L, 5L, 5L, 5L), .Label = c("Impt", "Neutral", "Not Impt at all", "Somewhat Impt", "Very Impt"), class = "factor"), Q9_B = structure(c(5L, 5L, 5L, 3L, 5L, 5L, 3L, 5L, 3L, 3L), .Label = c("Impt", "Neutral", "Not Impt at all", "Somewhat Impt", "Very Impt"),
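One possible approach is to apply table() to every column. The sketch below rebuilds only the two columns visible in the excerpt (the rest of the data frame is cut off) and is an illustration, not the answer from the original thread.

```r
# The five levels and the integer codes are taken from the excerpt above.
lv <- c("Impt", "Neutral", "Not Impt at all", "Somewhat Impt", "Very Impt")
x.sample <- data.frame(
  Q9_A = factor(lv[c(5, 3, 5, 3, 5, 3, 1, 5, 5, 5)], levels = lv),
  Q9_B = factor(lv[c(5, 5, 5, 3, 5, 5, 3, 5, 3, 3)], levels = lv)
)

# One frequency table per column, returned as a named list.
freq_list <- lapply(x.sample, table)

# Because both columns share the same levels, sapply() can simplify the
# result to a matrix with one row per level and one column per question.
freq_matrix <- sapply(x.sample, table)
freq_matrix
```

The matrix form only works when every column has the same set of levels; otherwise the list form is the safer choice.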

How to concatenate factors, without them being converted to integer level?

Question: I was surprised to see that R will coerce factors into numbers when concatenating vectors. This happens even when the levels are the same. For example: > facs <- as.factor(c("i", "want", "to", "be", "a", "factor", "not", "an", "integer")) > facs [1] i want to be a factor not an integer Levels: a an be factor i integer not to want > c(facs[1 : 3], facs[4 : 5]) [1] 5 9 8 3 1 What is the idiomatic way to do this in R (in my case these vectors can be pretty large)? Thank you.
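The integer result shown in the question is the behaviour of older R versions; to the best of my knowledge, c() on factors returns a factor from R 4.1.0 onward. Two idioms that avoid the integer codes regardless of version are sketched below.

```r
facs <- as.factor(c("i", "want", "to", "be", "a",
                    "factor", "not", "an", "integer"))

# Option 1: go through character and rebuild the factor, keeping the
# original level set.
combined <- factor(
  c(as.character(facs[1:3]), as.character(facs[4:5])),
  levels = levels(facs)
)

# Option 2: unlist() a list of factors; when every element is a factor,
# the result is a factor whose levels are the union of the inputs' levels.
combined2 <- unlist(list(facs[1:3], facs[4:5]))
```

For very large vectors the character round trip in option 1 allocates an extra copy, so option 2 may be the lighter choice.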

Confusion between factor levels and factor labels

Question: There seems to be a difference between the levels and the labels of a factor in R. Up to now, I always thought that levels were the 'real' names of factor levels, and labels were the names used for output (such as tables and plots). Obviously, this is not the case, as the following example shows: df <- data.frame(v=c(1,2,3),f=c('a','b','c')) str(df) 'data.frame': 3 obs. of 2 variables: $ v: num 1 2 3 $ f: Factor w/ 3 levels "a","b","c": 1 2 3 df$f <- factor(df$f, levels=c('a','b',
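In factor(), the `levels` argument lists the values to look for in the input and `labels` gives the names those values receive in the result. Once the factor is built, the labels are its levels; the original values are not stored. A small sketch with hypothetical labels:

```r
raw <- c("a", "b", "c", "a")

# levels: which input values to match; labels: what to call them afterwards.
f <- factor(raw,
            levels = c("a", "b", "c"),
            labels = c("Treatment A", "Treatment B", "Treatment C"))

# The labels have become the levels of the new factor.
levels(f)

# Re-running factor() with the old input values no longer matches anything,
# so every element turns into NA.
g <- factor(f, levels = c("a", "b", "c"))
```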

Replacing numbers within a range with a factor

Question: Given a data frame column which is a series of integers (age), I want to convert ranges of integers into ordinal variables. My current code doesn't work. How do I do this? df <- read.table("http://dl.dropbox.com/u/822467/df.csv", header = TRUE, sep = ",") df[(df >= 0) & (df <= 14)] <- "Age1" df[(df >= 15) & (df <= 44)] <- "Age2" df[(df >= 45) & (df <= 64)] <- "Age3" df[(df > 64)] <- "Age4" table(df) Answer 1: Use cut to do this in one step: dfc <- cut(df$x, breaks=c(0, 15, 45, 56, Inf)
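A self-contained sketch of the cut() approach, using a hypothetical age vector instead of the downloaded file. The breaks here are chosen to match the ranges stated in the question (0-14, 15-44, 45-64, 65 and over), and `right = FALSE` makes each interval include its lower bound; both choices are my reading of the question rather than the answer's exact call.

```r
# Hypothetical ages covering each boundary of the ranges in the question.
age <- c(0, 14, 15, 44, 45, 64, 65, 90)

age_group <- cut(
  age,
  breaks = c(0, 15, 45, 65, Inf),   # [0,15) [15,45) [45,65) [65,Inf)
  labels = c("Age1", "Age2", "Age3", "Age4"),
  right = FALSE,                     # intervals are closed on the left
  ordered_result = TRUE              # return an ordered factor
)

table(age_group)
```

With the default `right = TRUE` the intervals would be closed on the right instead, so an age of 0 would fall outside the first bin and become NA.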

R error “sum not meaningful for factors”

Question: I have a file called rRna_RDP_taxonomy_phylum with the following data: 364 "Firmicutes" 39.31 244 "Proteobacteria" 26.35 218 "Actinobacteria" 23.54 65 "Bacteroidetes" 7.02 22 "Fusobacteria" 2.38 6 "Thermotogae" 0.65 3 unclassified_Bacteria 0.32 2 "Spirochaetes" 0.22 1 "Tenericutes" 0.11 1 Cyanobacteria 0.11 And I'm using this code to create a pie chart in R: if(file.exists("rRna_RDP_taxonomy_phylum")){ family <- read.table ("rRna_RDP_taxonomy_phylum", sep="\t")
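The error itself means sum() was handed a factor column. The rest of the asker's code is cut off, so which column was a factor is not visible; the sketch below only shows the error in isolation and one way to make the column types explicit at read time. It uses three rows from the excerpt, and the column names are hypothetical.

```r
# sum() refuses to work on a factor, even if its labels look like numbers.
err <- tryCatch(sum(factor(c("364", "244"))), error = function(e) e)
inherits(err, "error")

# Three tab-separated rows copied from the excerpt.
raw <- paste(
  '364\t"Firmicutes"\t39.31',
  '244\t"Proteobacteria"\t26.35',
  '3\tunclassified_Bacteria\t0.32',
  sep = "\n"
)

# Declaring colClasses guarantees numeric columns (or a read error if a
# value cannot be parsed), so sum() gets numbers rather than a factor.
phylum <- read.table(
  text = raw, sep = "\t",
  col.names = c("count", "phylum", "percent"),   # hypothetical names
  colClasses = c("integer", "character", "numeric")
)

total <- sum(phylum$count)
```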

Idiom for ifelse-style recoding for multiple categories

Question: I run across this often enough that I figure there has to be a good idiom for it. Suppose I have a data.frame with a bunch of attributes, including "product". I also have a key which translates products to brand + size. Product codes 1-3 are Tylenol, 4-6 are Advil, 7-9 are Bayer, 10-12 are Generic. What's the fastest (in terms of human time) way to code this up? I tend to use nested ifelse calls if there are 3 or fewer categories, and type out the data table and merge it in if there are more
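The answers are not included in the excerpt, so the following is only one possible idiom: index a lookup vector by product code, or bin the codes with cut() when they form contiguous ranges. The data frame below is hypothetical, and the size part of the key is left out because the excerpt does not say how sizes map to codes.

```r
# Hypothetical data frame with product codes in 1..12.
df <- data.frame(product = c(1, 4, 9, 12, 2))

brands <- c("Tylenol", "Advil", "Bayer", "Generic")

# Option 1: a lookup vector whose position is the product code.
# rep(..., each = 3) gives codes 1-3 -> Tylenol, 4-6 -> Advil, and so on.
brand_by_code <- rep(brands, each = 3)
df$brand <- brand_by_code[df$product]

# Option 2: cut() on the code ranges (0,3], (3,6], (6,9], (9,12],
# which returns a factor instead of a character vector.
df$brand_factor <- cut(df$product, breaks = c(0, 3, 6, 9, 12), labels = brands)

df
```

The lookup vector generalises to codes that do not form ranges, while cut() is shorter when they do.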

Drop factor levels in a subsetted data frame

Question: I have a data frame containing a factor. When I create a subset of this data frame using subset or another indexing function, a new data frame is created. However, the factor variable retains all of its original levels, even if they do not exist in the new data frame. This causes problems when doing faceted plotting or using functions that rely on factor levels. What is the most succinct way to remove levels from a factor in the new data frame? Here's an example: df <- data.frame(letters
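The asker's example is cut off, so the sketch below uses a hypothetical data frame of its own. It shows the retained levels after subsetting and how base R's droplevels() removes the unused ones.

```r
# Hypothetical data frame with a five-level factor.
df <- data.frame(letter = factor(c("a", "b", "c", "d", "e")),
                 number = 1:5)

# Subsetting keeps all five levels even though only three remain in use.
sub <- subset(df, number <= 3)
nlevels(sub$letter)

# droplevels() on a data frame drops unused levels from every factor column.
sub_clean <- droplevels(sub)
levels(sub_clean$letter)
```

droplevels() also accepts a single factor, and calling factor() on an existing factor has the same effect for one column.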