r-factor

Plotting with ggplot2: “Error: Discrete value supplied to continuous scale” on categorical y-axis

家住魔仙堡 submitted on 2019-11-26 09:41:01
Question: The plotting code below gives Error: Discrete value supplied to continuous scale. What's wrong with this code? It works fine until I try to change the scale, which is where the error appears... I tried to work out a solution from similar problems but couldn't. This is the head of my data:
    > dput(head(df))
    structure(list(`10` = c(0, 0, 0, 0, 0, 0), `33.95` = c(0, 0, 0, 0, 0, 0), `58.66` = c(0, 0, 0, 0, 0, 0), `84.42` = c(0, 0, 0, 0, 0, 0), `110.21` = c(0, 0, 0, 0, 0, 0), `134.16` = c(0, 0, 0, 0, 0, 0),

Imported a csv-dataset to R but the values become factors

痞子三分冷 submitted on 2019-11-26 09:29:05
Question: I am very new to R and I am having trouble accessing a dataset I've imported. I'm using RStudio; I used the Import Dataset function to import my csv file and pasted the line from the console window into the source window. The code looks as follows:
    setwd("c:/kalle/R")
    stuckey <- read.csv("C:/kalle/R/stuckey.csv")
    point <- stuckey$PTS
    time <- stuckey$MP
However, the data isn't integer or numeric as I am used to, but factors, so when I try to plot the variables I only get histograms,
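
The excerpt stops before any answer, so the following is only a minimal sketch of one common cause and fix. It assumes, hypothetically, that a non-numeric entry such as "DNP" in the file made read.csv treat the columns as text; the inline CSV below is a made-up stand-in for stuckey.csv, which the question does not show.

```r
# Hypothetical stand-in for stuckey.csv (the real file is not shown).
csv_text <- "PTS,MP\n12,30\n8,25\nDNP,DNP\n20,35"

# stringsAsFactors = TRUE reproduces the default of R versions before 4.0.0.
stuckey <- read.csv(text = csv_text, stringsAsFactors = TRUE)
class(stuckey$PTS)

# as.numeric() on a factor returns the internal level codes, not the values,
# so go through character first. Non-numeric entries become NA.
point <- suppressWarnings(as.numeric(as.character(stuckey$PTS)))

# Alternative: declare the non-numeric marker as missing at import time,
# so the columns are read as numbers directly.
stuckey2 <- read.csv(text = csv_text, na.strings = "DNP")
class(stuckey2$PTS)
```

Converting with as.numeric(as.character(x)) repairs a column after the fact, while na.strings avoids the factor in the first place; which marker to pass depends on what the real file contains.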

Factors in R: more than an annoyance?

混江龙づ霸主 submitted on 2019-11-26 08:47:05
Question: One of the basic data types in R is factors. In my experience factors are basically a pain and I never use them. I always convert to characters. I feel oddly like I'm missing something. Are there some important examples of functions that use factors as grouping variables where the factor data type becomes necessary? Are there specific circumstances when I should be using factors?
Answer 1: You should use factors. Yes they can be a pain, but my theory is that 90% of why they're a pain is because

Create frequency tables for multiple factor columns in R

一曲冷凌霜 submitted on 2019-11-26 08:33:40
Question: I am a novice in R. I am compiling a separate manual on the syntax of the common functions/features for my work. My sample dataframe is as follows:
    x.sample <- structure(list(Q9_A = structure(c(5L, 3L, 5L, 3L, 5L, 3L, 1L, 5L, 5L, 5L), .Label = c("Impt", "Neutral", "Not Impt at all", "Somewhat Impt", "Very Impt"), class = "factor"), Q9_B = structure(c(5L, 5L, 5L, 3L, 5L, 5L, 3L, 5L, 3L, 3L), .Label = c("Impt", "Neutral", "Not Impt at all", "Somewhat Impt", "Very Impt"),
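
A minimal sketch of one way to tabulate several factor columns at once, using only the two columns visible in the excerpt (the rest of x.sample is cut off, so the data frame is rebuilt here from the codes and labels shown). This is one possible approach, not necessarily the one given in the original answers.

```r
# Rebuild the two visible columns from the level codes and labels in the excerpt.
lv <- c("Impt", "Neutral", "Not Impt at all", "Somewhat Impt", "Very Impt")
x.sample <- data.frame(
  Q9_A = factor(lv[c(5, 3, 5, 3, 5, 3, 1, 5, 5, 5)], levels = lv),
  Q9_B = factor(lv[c(5, 5, 5, 3, 5, 5, 3, 5, 3, 3)], levels = lv)
)

# One frequency table per column, returned as a named list.
freq <- lapply(x.sample, table)

# Because every column shares the same levels, sapply() simplifies the
# result to a matrix with levels in rows and columns in columns.
counts <- sapply(x.sample, table)
counts
```

The matrix form is convenient for reporting because unused levels still appear as rows with a count of zero.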

How to concatenate factors, without them being converted to integer level?

谁都会走 submitted on 2019-11-26 08:21:02
Question: I was surprised to see that R will coerce factors into numbers when concatenating vectors. This happens even when the levels are the same. For example:
    > facs <- as.factor(c("i", "want", "to", "be", "a", "factor", "not", "an", "integer"))
    > facs
    [1] i want to be a factor not an integer
    Levels: a an be factor i integer not to want
    > c(facs[1 : 3], facs[4 : 5])
    [1] 5 9 8 3 1
What is the idiomatic way to do this in R (in my case these vectors can be pretty large)? Thank you.
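
A minimal sketch of two ways to keep the result a factor; neither is claimed to be the answer given on the original page. Note that since R 4.1.0, c() applied to factors returns a factor itself, so the integer output above reflects older R versions.

```r
facs <- as.factor(c("i", "want", "to", "be", "a", "factor", "not", "an", "integer"))
a <- facs[1:3]
b <- facs[4:5]

# Option 1: concatenate the labels, then rebuild the factor with the
# original level set so no information about unused levels is lost.
combined <- factor(c(as.character(a), as.character(b)), levels = levels(facs))

# Option 2: unlist() treats a list of factors specially and returns a
# factor whose levels are the union of the inputs' levels.
combined2 <- unlist(list(a, b))

combined
```

Option 1 makes the resulting level order explicit, which matters if the inputs do not share identical levels.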

Confusion between factor levels and factor labels

懵懂的女人 submitted on 2019-11-26 06:55:36
Question: There seems to be a difference between levels and labels of a factor in R. Up to now, I always thought that levels were the 'real' names of factor levels, and labels were the names used for output (such as tables and plots). Obviously, this is not the case, as the following example shows:
    df <- data.frame(v=c(1,2,3),f=c('a','b','c'))
    str(df)
    'data.frame': 3 obs. of 2 variables:
    $ v: num 1 2 3
    $ f: Factor w/ 3 levels "a","b","c": 1 2 3
    df$f <- factor(df$f, levels=c('a','b',
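
The question's own example is cut off, so here is a small independent sketch of how the two arguments of factor() behave. The label values "x", "y", "z" are made up for illustration.

```r
# `levels` names the values expected in the INPUT;
# `labels` names the levels of the RESULT.
f <- factor(c("a", "b", "c"),
            levels = c("a", "b", "c"),
            labels = c("x", "y", "z"))

levels(f)        # the labels have become the levels of f
as.character(f)  # the original values "a", "b", "c" are gone

# Re-factoring with the old input values no longer matches anything,
# because f now contains "x", "y", "z".
g <- factor(f, levels = c("a", "b", "c"))
g
```

In other words, labels are not a display-only layer: once the factor is built, only one set of names remains, and levels() returns it.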

Replacing numbers within a range with a factor

大城市里の小女人 submitted on 2019-11-26 06:49:32
Question: Given a dataframe column which is a series of integers (age), I want to convert ranges of integers into ordinal variables. My current code doesn't work. How do I do this?
    df <- read.table("http://dl.dropbox.com/u/822467/df.csv", header = TRUE, sep = ",")
    df[(df >= 0) & (df <= 14)] <- "Age1"
    df[(df >= 15) & (df <= 44)] <- "Age2"
    df[(df >= 45) & (df <= 64)] <- "Age3"
    df[(df > 64)] <- "Age4"
    table(df)
Answer 1: Use cut to do this in one step:
    dfc <- cut(df$x, breaks=c(0, 15, 45, 56, Inf)
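
The answer above is truncated, so the following is a self-contained sketch of the cut() approach rather than a completion of it. The ages are made-up sample values, and the breaks are chosen to match the ranges stated in the question (0-14, 15-44, 45-64, 65 and over), which is an assumption about the intent.

```r
# Made-up ages covering every boundary named in the question.
age <- c(0, 14, 15, 44, 45, 64, 65, 90)

# right = FALSE makes each interval closed on the left: [0,15), [15,45), ...
# so 14 falls in Age1 and 15 in Age2, as the question's ranges require.
age_group <- cut(age,
                 breaks = c(0, 15, 45, 65, Inf),
                 labels = c("Age1", "Age2", "Age3", "Age4"),
                 right = FALSE,
                 ordered_result = TRUE)

table(age_group)
```

With the default right = TRUE the intervals would be closed on the right instead, and an age of exactly 0 would fall outside the first interval and become NA.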

R error “sum not meaningful for factors”

痞子三分冷 submitted on 2019-11-26 05:37:49
Question: I have a file called rRna_RDP_taxonomy_phylum with the following data:
    364 "Firmicutes" 39.31
    244 "Proteobacteria" 26.35
    218 "Actinobacteria" 23.54
    65 "Bacteroidetes" 7.02
    22 "Fusobacteria" 2.38
    6 "Thermotogae" 0.65
    3 unclassified_Bacteria 0.32
    2 "Spirochaetes" 0.22
    1 "Tenericutes" 0.11
    1 Cyanobacteria 0.11
And I'm using this code for creating a pie chart in R:
    if(file.exists("rRna_RDP_taxonomy_phylum")){
    family <- read.table ("rRna_RDP_taxonomy_phylum", sep="\t")
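
The excerpt ends before the line that triggers the error, so the actual cause is not visible. As a hedged sketch, one way to rule out factor columns is to state the column types when reading. The inline text below reuses the first three rows from the question and assumes they are tab-separated, as the sep argument in the question suggests.

```r
# First three rows from the question, assumed tab-separated.
txt <- paste('364\t"Firmicutes"\t39.31',
             '244\t"Proteobacteria"\t26.35',
             '218\t"Actinobacteria"\t23.54',
             sep = "\n")

# colClasses pins each column's type, so nothing is turned into a factor.
family <- read.table(text = txt, sep = "\t",
                     colClasses = c("numeric", "character", "numeric"))

sapply(family, class)  # inspect the types before calling sum()
sum(family$V1)
```

If read.table stops with an error under these colClasses, that in itself is informative: it means some entry in a supposedly numeric column is not a number.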

Idiom for ifelse-style recoding for multiple categories

谁说我不能喝 submitted on 2019-11-26 05:29:27
Question: I run across this often enough that I figure there has to be a good idiom for it. Suppose I have a data.frame with a bunch of attributes, including "product". I also have a key which translates products to brand + size. Product codes 1-3 are Tylenol, 4-6 are Advil, 7-9 are Bayer, 10-12 are Generic. What's the fastest (in terms of human time) way to code this up? I tend to use nested ifelse's if there are 3 or fewer categories, and type out the data table and merge it in if there are more
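
A minimal sketch of one idiom for this kind of recoding: a lookup vector indexed by the product code. It assumes the codes are the integers 1 to 12 exactly as described in the question; the product values are made up.

```r
# Made-up product codes in the range described in the question.
product <- c(1, 5, 9, 12, 3, 7)

# Position i of the key holds the brand for product code i:
# codes 1-3 Tylenol, 4-6 Advil, 7-9 Bayer, 10-12 Generic.
key <- rep(c("Tylenol", "Advil", "Bayer", "Generic"), each = 3)

# Indexing the key by the codes does the whole recode in one step.
brand <- key[product]
brand
```

For codes that are not consecutive integers, a named vector indexed by as.character(product) works the same way, and a merge against a small key table remains the more general option.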

Drop factor levels in a subsetted data frame

风格不统一 submitted on 2019-11-25 23:57:11
Question: I have a data frame containing a factor. When I create a subset of this dataframe using subset or another indexing function, a new data frame is created. However, the factor variable retains all of its original levels, even if they do not exist in the new dataframe. This causes problems when doing faceted plotting or using functions that rely on factor levels. What is the most succinct way to remove levels from a factor in the new dataframe? Here's an example:
    df <- data.frame(letters
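
The question's example is cut off, so the data frame below is a hypothetical stand-in. As a sketch, base R's droplevels() removes unused levels from every factor column of a data frame in one call.

```r
# Hypothetical data; the question's own example is truncated.
df <- data.frame(letters = factor(c("a", "b", "c", "a", "b")),
                 numbers = 1:5)

sub <- subset(df, numbers <= 2)
levels(sub$letters)   # still lists "c", which no longer occurs

# droplevels() works on a single factor or on a whole data frame.
sub <- droplevels(sub)
levels(sub$letters)
```

Calling factor() again on a single column has the same effect for that column, since rebuilding a factor keeps only the values that are present.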