factors

Linear model with categorical variables in R

主宰稳场 submitted on 2019-12-06 05:17:02
Question: I am trying to fit a linear model with some categorical variables:

    model <- lm(price ~ carat+cut+color+clarity)
    summary(model)

The output is:

    Call:
    lm(formula = price ~ carat + cut + color + clarity)

    Residuals:
         Min       1Q   Median       3Q      Max
    -11495.7   -688.5   -204.1    458.2   9305.3

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) -3696.818     47.948 -77.100  < 2e-16 ***
    carat        8843.877     40.885 216.311  < 2e-16 ***
    cut.L         755.474     68.378  11.049  < 2e-16 ***
    cut.Q        -349.587     60.432  -5.785 7.74e-09 ***
    cut.C

Aggregate with max and factors

邮差的信 submitted on 2019-12-05 12:22:57
I have a data.frame with columns of factors, on which I want to compute a max (or min, or quantiles). I can't use these functions on factors, but I want to. Here's an example:

    set.seed(3)
    df1 <- data.frame(id = rep(1:5, each = 2),
                      height = sample(c("low", "medium", "high"), size = 10, replace = TRUE))
    df1$height <- factor(df1$height, c("low", "medium", "high"))
    df1$height_num <- as.numeric(df1$height)
    # > df1
    #    id height height_num
    # 1   1    low          1
    # 2   1   high          3
    # 3   2 medium          2
    # 4   2    low          1
    # 5   3 medium          2
    # 6   3 medium          2
    # 7   4    low          1
    # 8   4    low          1
    # 9   5 medium          2
    # 10  5 medium          2

I can easily do this: aggregate(height

Optimization of Algorithm [closed]

╄→гoц情女王★ submitted on 2019-12-04 16:48:18
Here is the link to the problem. The problem asks for the number of solutions to the Diophantine equation 1/x + 1/y = 1/z (where z = n!). Rearranging the given equation shows that the answer is the number of factors of z^2. So the problem boils down to finding the number of factors of (n!)^2 (n factorial squared). My algorithm is as follows. Make a Boolean lookup table for all primes <= n using the Sieve of Eratosthenes. Iterate over all primes P <= n and find the exponent of each in n!. I did this using the step function formula. Let the exponent be K, then the exponent of P in n
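The steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, "step function formula" is read as Legendre's formula (the exponent of P in n! is the sum of floor(n / P^k) over k >= 1), and the final count uses the standard fact that a prime with exponent K in n! has exponent 2K in (n!)^2 and so contributes a factor of 2K + 1 to the number of divisors.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: return all primes <= n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Cross out every multiple of p starting at p*p.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]


def exponent_in_factorial(n, p):
    """Legendre's formula: exponent of the prime p in n!."""
    exponent = 0
    power = p
    while power <= n:
        exponent += n // power
        power *= p
    return exponent


def count_divisors_of_factorial_squared(n):
    """Number of divisors of (n!)^2: product of (2K + 1) over primes <= n."""
    count = 1
    for p in primes_up_to(n):
        k = exponent_in_factorial(n, p)
        count *= 2 * k + 1
    return count


print(count_divisors_of_factorial_squared(3))  # (3!)^2 = 36 has 9 divisors
```

For large n the product grows quickly, so a contest version would typically reduce it modulo whatever modulus the problem statement specifies; that detail is not given in the text above.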

How to Make a Grouped Barplot for a Factor with Many Levels

这一生的挚爱 submitted on 2019-12-04 06:00:20
Question: The dataframe named 'temp' (below) has three columns: (1) Canopy_index; (2) Under_tree; and (3) Open_Canopy. The columns Under_tree and Open_Canopy are factors with 5 levels each.

    data(temp)
      Canopy_index  Under_tree Open_Canopy
    1           75 Undergrowth       Grass
    2           85      Litter       Grass
    3           75      Litter       Grass
    4           35      Litter       Grass
    5           85 Undergrowth       Grass

The dataframe 'temp' was reformatted into long format as df.melt (below) to produce a barplot where the y-axis is denoted as Canopy_index and the x-axis represents

Decompose a number into 2 prime co-factors

房东的猫 submitted on 2019-12-04 04:46:31
Question: One of the requirements for Telegram Authentication is decomposing a given number into 2 prime co-factors. In particular, P*Q = N, where N < 2^63. How can we find the smaller prime co-factor, such that P < square_root(N)?

My suggestions:
1) Pre-compute primes from 3 to 2^31.5, then test if N mod P = 0
2) Find an algorithm to test for primes (but we still have to test N mod P = 0)

Is there an algorithm for primes that is well suited to this case?

Answer 1: Ugh! I just put this program in and then
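One common alternative to pre-computing primes is Pollard's rho algorithm, which finds a factor of N in roughly N^(1/4) steps. The sketch below is not the program from the truncated answer above; it is a minimal Python illustration, with a hypothetical function name, that assumes N really is a product of exactly two primes (it would loop forever on a prime input).

```python
from math import gcd, isqrt


def smaller_prime_factor(n):
    """Return the smaller prime P of n = P * Q using Pollard's rho.

    Assumes n is a product of exactly two primes; a prime n never terminates.
    """
    if n % 2 == 0:
        return 2
    root = isqrt(n)
    if root * root == n:
        # P == Q, which rho handles poorly, so check it directly.
        return root
    c = 1
    while True:
        # Floyd cycle detection on the sequence x -> x^2 + c (mod n).
        x = y = 2
        d = 1
        while d == 1:
            x = (x * x + c) % n
            y = (y * y + c) % n
            y = (y * y + c) % n
            d = gcd(abs(x - y), n)
        if d != n:
            # d is a nontrivial factor, hence P or Q.
            return min(d, n // d)
        # The cycle closed without revealing a factor; retry with another c.
        c += 1


n = 998244353 * 1000000007  # both primes, product below 2^63
print(smaller_prime_factor(n))
```

Python integers are arbitrary precision, so the squaring step cannot overflow here; in a language with 64-bit integers the product x * x would need a 128-bit intermediate or a modular multiplication routine.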

Setting levels when creating a factor vs. `levels()<-`

ぃ、小莉子 submitted on 2019-12-03 14:21:15
Let's create some factors first:

    F1 <- factor(c(1,2,20,10,25,3))
    F2 <- factor(paste0(F1, " years"))
    F3 <- F2
    levels(F3) <- paste0(sort(F1), " years")
    F4 <- factor(paste0(F1, " years"), levels=paste0(sort(F1), " years"))

Then take a look at them:

    > F1
    [1] 1 2 20 10 25 3
    Levels: 1 2 3 10 20 25
    > F2
    [1] 1 years 2 years 20 years 10 years 25 years 3 years
    Levels: 1 years 10 years 2 years 20 years 25 years 3 years
    > F3
    [1] 1 years 3 years 10 years 2 years 20 years 25 years
    Levels: 1 years 2 years 3 years 10 years 20 years 25 years
    > F4
    [1] 1 years 2 years 20 years 10 years 25 years 3 years
    Levels: 1

Why is my Swift loop failing with error “Can't form range with end < start”?

依然范特西╮ submitted on 2019-12-03 04:25:56
I have a for loop that checks if a number is a factor of a number, then checks if that factor is prime, and then adds it to an array. Depending on the original number, I get an error saying

    fatal error: Can't form range with end < start

This happens almost every time, but for some numbers it works fine. The only numbers I have found to work with it are 9, 15, and 25. Here is the code:

    let num = 16 // or any Int
    var primes = [Int]()
    for i in 2...(num/2) {
        if ((num % i) == 0) {
            var isPrimeFactor = true
            for l in 2...i-1 {
                if ((i%l) == 0) {
                    isPrimeFactor = false;
                }//end if
            }//end for
            if
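The error message points at the inner range: when i is 2, the closed range 2...i-1 becomes 2...1, whose end is below its start. That is consistent with 9, 15, and 25 working, since 2 is not a factor of any of them and the inner loop is never entered with i equal to 2. The outer range 2...(num/2) would fail the same way for num below 4. The same logic is sketched here in Python, purely as an illustration with a hypothetical function name: Python's half-open range(2, i) is simply empty when i is 2, so no special case is needed. In Swift, a half-open range such as 2..<i plays the same role.

```python
def prime_factors_up_to_half(num):
    """Collect the prime factors i of num with 2 <= i <= num // 2."""
    primes = []
    # range(2, num // 2 + 1) is empty for num < 4, so nothing can trap.
    for i in range(2, num // 2 + 1):
        if num % i == 0:
            # range(2, i) is empty when i == 2, so 2 counts as prime.
            is_prime_factor = all(i % l != 0 for l in range(2, i))
            if is_prime_factor:
                primes.append(i)
    return primes


print(prime_factors_up_to_half(16))  # the value that crashed in the question
```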

Replace values in column by factor level

谁说我不能喝 submitted on 2019-12-02 09:01:11
I have a survey data.frame with 100 columns, and each column has 2 factor levels: Yes or No. However, some surveys have answers like Yes!, Nope, Yay, or Nah, which really mean yes or no. My question is: how can I convert all values in the other columns based on their factor level? E.g., if the factor level is 1, replace the text with Yes, else No. My second question: sometimes I am left with a 3rd level that isn't used; how can I remove all unused factor levels in ALL columns of the data frame? I have more than 100 columns.

We can loop over the columns and replace the levels using %in%: df1[] <-

Convert Factors in 2 Data Frames of a List into Numeric

梦想的初衷 submitted on 2019-12-02 08:13:44
I am having trouble converting the columns of 2 data frames in a list to numeric. Right now both data frames have 2 columns consisting of factors. I want to convert them to numeric so that I can do mathematical operations on them. Below is sample code:

    library(XML)
    bal <- "http://www.baseball-reference.com/teams/BAL/2014-schedule-scores.shtml"
    bos <- "http://www.baseball-reference.com/teams/BOS/2014-schedule-scores.shtml"
    mylist <- list(bal, bos)
    a <- lapply(mylist, readHTMLTable)
    b <- lapply(a, function(x) x[["team_schedule"]][, c("R", "RA")])
    c <- as.numeric(as.character(b))

When I run this

How to Make a Grouped Barplot for a Factor with Many Levels

纵然是瞬间 submitted on 2019-12-02 08:09:42
The dataframe named 'temp' (below) has three columns: (1) Canopy_index; (2) Under_tree; and (3) Open_Canopy. The columns Under_tree and Open_Canopy are factors with 5 levels each.

    data(temp)
      Canopy_index  Under_tree Open_Canopy
    1           75 Undergrowth       Grass
    2           85      Litter       Grass
    3           75      Litter       Grass
    4           35      Litter       Grass
    5           85 Undergrowth       Grass

The dataframe 'temp' was reformatted into long format as df.melt (below) to produce a barplot where the y-axis is denoted as Canopy_index and the x-axis represents the factor Topography (3rd column) which has five levels grouped by two conditions (2nd column - Under