factors

dummy variables to single categorical variable (factor) in R

﹥>﹥吖頭↗ submitted on 2019-11-28 04:16:43
Question: I have a set of variables coded as binary (0/1) indicators.

      Pre VALUE_1 VALUE_2 VALUE_3 VALUE_4 VALUE_5 VALUE_6 VALUE_7 VALUE_8
    1   1       0       0       0       0       0       1       0       0
    2   1       0       0       0       0       1       0       0       0
    3   1       0       0       0       0       1       0       0       0
    4   1       0       0       0       0       1       0       0       0

I would like to merge the variables (VALUE_1, VALUE_2 ... VALUE_8) into one single ordered factor, while keeping the column Pre as is, such that the data would look like this:

      Pre   VALUE
    1   1 VALUE_6
    2   1 VALUE_5
    3   1 VALUE_5

Or even better:

      Pre VALUE
    1   1     6
    2   1     5
    3   1     5

I am aware that this exists:

Finding factors of a given integer

a 夏天 submitted on 2019-11-28 00:54:40
Question: I have something like this down:

    int f = 120;
    for(int ff = 1; ff <= f; ff++){
        while (f % ff != 0){
        }

Is there anything wrong with my loop to find factors? I'm really confused as to the workings of for and while statements, so chances are they are completely wrong. After this, how would I go about assigning variables to said factors?

Answer 1:

    public class Solution {
        public ArrayList<Integer> allFactors(int a) {
            int upperlimit = (int)(Math.sqrt(a));
            ArrayList<Integer> factors = new ArrayList
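As a rough illustration of the loop the question is reaching for, here is a minimal sketch in Python rather than Java. The name `all_factors` is hypothetical, and this is plain trial division, not the square-root approach the truncated answer starts to set up. It also touches the second part of the question: the factors are collected in a list instead of being assigned to separate variables.

```python
def all_factors(n):
    """Return every positive factor of n by plain trial division."""
    factors = []
    for candidate in range(1, n + 1):
        # An `if` tests each candidate exactly once. A `while` on the same
        # condition would spin forever, because nothing in its body changes.
        if n % candidate == 0:
            factors.append(candidate)
    return factors


print(all_factors(120))
```

Individual factors can then be read back by index, for example `all_factors(120)[3]`.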

Recode categorical factor with N categories into N binary columns

可紊 submitted on 2019-11-27 20:11:12
Original data frame:

    v1 = sample(letters[1:3], 10, replace=TRUE)
    v2 = sample(letters[1:3], 10, replace=TRUE)
    df = data.frame(v1,v2)
    df

       v1 v2
    1   b  c
    2   a  a
    3   c  c
    4   b  a
    5   c  c
    6   c  b
    7   a  a
    8   a  b
    9   a  c
    10  a  b

New data frame:

    new_df = data.frame(row.names=rownames(df))
    for (i in colnames(df)) {
      for (x in letters[1:3]) {
        #new_df[x] = as.numeric(df[i] == x)
        new_df[paste0(i, "_", x)] = as.numeric(df[i] == x)
      }
    }

       v1_a v1_b v1_c v2_a v2_b v2_c
    1     0    1    0    0    0    1
    2     1    0    0    1    0    0
    3     0    0    1    0    0    1
    4     0    1    0    1    0    0
    5     0    0    1    0    0    1
    6     0    0    1    0    1    0
    7     1    0    0    1    0    0
    8     1    0    0    0    1    0
    9     1    0    0    0    0    1
    10    1    0    0    0    1    0

For small datasets this

Basic - T-Test -> Grouping Factor Must have Exactly 2 Levels

烂漫一生 submitted on 2019-11-27 16:48:30
Question: I am relatively new to R. For my assignment I have to start by conducting a t-test, looking at the effect of a politician's (Conservative or Labour) wealth on their real gross wealth and real net wealth. I have to attempt to estimate the effect of serving in office on wealth using a simple t-test. The dataset is called takehome.dta. Labour and Tory are binary, where 1 indicates that they serve for that party and 0 otherwise. The variables for wealth are lnrealgross and lnrealnet. I have imported

Remove unused factor levels from a ggplot bar plot

霸气de小男生 submitted on 2019-11-27 13:59:44
I want to do the opposite of this question, and sort of the opposite of this question, though that's about legends, not the plot itself. The other SO questions seem to be asking about how to keep unused factor levels. I'd actually like mine removed. I have several name variables and several columns (wide format) of variable attributes that I'm using to create numerous bar plots. Here's a reproducible example:

    library(ggplot2)
    df <- data.frame(name=c("A","B","C"), var1=c(1,NA,2),var2=c(3,4,5))
    # stat="identity" plots the var1 values themselves instead of counting rows
    ggplot(df, aes(x=name,y=var1)) + geom_bar(stat="identity")

I get this:

I'd like only the names that have

Why is the terminology of labels and levels in factors so weird?

老子叫甜甜 submitted on 2019-11-27 11:28:22
An example of a non-settable function would be labels. You can only set factor labels when they are created with the factor function. There is no labels<- function. Not that 'labels' and 'levels' in factors make any sense...

    > fac <- factor(1:3, labels=c("one", "two", "three"))
    > fac
    [1] one   two   three
    Levels: one two three
    > labels(fac)
    [1] "1" "2" "3"

OK, I asked for labels, which one might assume were as set by the factor call, but I get something quite ... what's the word, unintuitive?

    > levels(fac)
    [1] "one"   "two"   "three"

So it appears that setting labels is really setting levels.

    > fac

Algorithm to find all the exact divisors of a given integer

十年热恋 submitted on 2019-11-27 03:25:28
I want to find all the exact divisors of a number. Currently I have this:

    {
        int n;
        int i=2;
        scanf("%d",&n);
        while(i<=n/2)
        {
            if(n%i==0)
                printf("%d,",i);
            i++;
        }
        getch();
    }

Is there any way to improve it?

Rndm: First, your code should have the condition i <= n/2, otherwise it can miss one of the factors; for example, 6 will not be printed if n=12. Run the loop to the square root of the number (i.e. i <= sqrt(n)) and print both i and n/i (both will be divisors of n).

    {
        int n;
        int i=2;
        scanf("%d",&n);
        while(i <= sqrt(n))
        {
            if(n%i==0) {
                printf("%d,",i);
                if (i != (n / i)) {
                    printf("%d,",n/i);
                }
            }
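The square-root idea from the answer can be sketched in Python as follows. This is an illustration under a few assumptions, not the answerer's code: the function name `divisors` is hypothetical, the loop starts at 1 so that 1 and n are included (the C snippet starts at 2 and omits them), and `math.isqrt` requires Python 3.8 or later.

```python
import math


def divisors(n):
    """Return all divisors of n in ascending order, testing only up to sqrt(n)."""
    small, large = [], []
    for i in range(1, math.isqrt(n) + 1):
        if n % i == 0:
            small.append(i)
            # n // i is the paired divisor; skip it when i is the exact root.
            if i != n // i:
                large.append(n // i)
    # `large` was collected in descending order, so reverse it before joining.
    return small + large[::-1]


print(divisors(12))
```

Keeping the small and large halves in separate lists gives ascending output without a sort, whereas printing `i` and `n/i` as they are found interleaves them.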

R * not meaningful for factors ERROR

北战南征 submitted on 2019-11-27 02:05:29
I have the following data.frame and I want to perform some calculations on the 2nd column.

    > test
      code age
    1  101  15
    2  102  25
    3  103  16
    4  104  u1
    5  105  u1
    6  106  u2
    7  107  27
    8  108  27

As you can see, the 2nd column does not include only numbers. I omitted these cases:

    > new<-subset(test,code<104 | code>106)
    > new
      code age
    1  101  15
    2  102  25
    3  103  16
    7  107  27
    8  108  27

But when I try to do a calculation in a new column this is what I get:

    > new["MY_NEW_COLUMN"] <- NA
    > new
      code age MY_NEW_COLUMN
    1  101  15            NA
    2  102  25            NA
    3  103  16            NA
    7  107  27            NA
    8  108  27            NA
    > new$MY_NEW_COLUMN <-new[,2] * 5
    Warning

Enumerate factors of a number directly in ascending order without sorting?

允我心安 submitted on 2019-11-27 01:58:59
Is there an efficient algorithm to enumerate the factors of a number n, in ascending order, without sorting? By “efficient” I mean:

- The algorithm avoids a brute-force search for divisors by starting with the prime-power factorization of n.
- The runtime complexity of the algorithm is O(d log₂ d) or better, where d is the divisor count of n.
- The space complexity of the algorithm is O(d).
- The algorithm avoids a sort operation. That is, the factors are produced in order rather than produced out of order and sorted afterward.

Although enumerating using a simple recursive approach and then
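One way to approach the requirements above is to build the divisor list one prime at a time and merge already-sorted layers instead of sorting. The sketch below is a possible reading of the problem, not an answer from the original thread. The function name and the dict input format (prime mapped to exponent) are assumptions. It materialises the full list rather than streaming divisors, and it relies on `heapq.merge`, which is a k-way merge of sorted inputs rather than a general sort.

```python
from heapq import merge


def divisors_ascending(factorization):
    """List the divisors of n in ascending order.

    `factorization` maps each prime of n to its exponent,
    e.g. {2: 3, 3: 2, 5: 1} for 360 (a hypothetical input format).
    """
    divisors = [1]
    for prime, exponent in factorization.items():
        # Each layer is the current sorted list scaled by prime**k, so every
        # layer is itself sorted and no value appears in two layers.
        layers = [divisors]
        for _ in range(exponent):
            layers.append([d * prime for d in layers[-1]])
        # Merge the sorted layers into one sorted list; no sort is called.
        divisors = list(merge(*layers))
    return divisors


print(divisors_ascending({2: 2, 3: 1}))  # [1, 2, 3, 4, 6, 12]
```

Because the list at least doubles with every prime processed, the total work appears to stay within O(d log d) and the memory within O(d), though that estimate is an informal argument rather than a proof.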

nth ugly number

我们两清 submitted on 2019-11-27 00:05:30
Numbers whose only prime factors are 2, 3 or 5 are called ugly numbers. Example: 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, ... 1 can be considered as 2^0. I am working on finding the nth ugly number. Note that these numbers become extremely sparsely distributed as n gets large. I wrote a trivial program that checks whether a given number is ugly or not. For n > 500 it became super slow. I tried using memoization, based on the observation that ugly_number * 2, ugly_number * 3, ugly_number * 5 are all ugly. Even with that it is slow. I tried using some properties of log - since that will reduce this problem from
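The observation that every ugly number times 2, 3, or 5 is again ugly can be turned into a direct generator, so no candidate ever has to be tested. The following is a minimal sketch of the well-known three-pointer approach, which is not necessarily what the original thread proposed. The function name `nth_ugly` is hypothetical, and 1 is counted as the first ugly number, as in the example list.

```python
def nth_ugly(n):
    """Return the n-th ugly number (1-indexed), with 1 counted as the first."""
    ugly = [1]
    i2 = i3 = i5 = 0  # positions of the next value to multiply by 2, 3 and 5
    while len(ugly) < n:
        next2, next3, next5 = ugly[i2] * 2, ugly[i3] * 3, ugly[i5] * 5
        nxt = min(next2, next3, next5)
        ugly.append(nxt)
        # Advance every pointer that produced nxt, which skips duplicates
        # such as 6 = 2 * 3 = 3 * 2.
        if nxt == next2:
            i2 += 1
        if nxt == next3:
            i3 += 1
        if nxt == next5:
            i5 += 1
    return ugly[n - 1]


print([nth_ugly(k) for k in range(1, 12)])  # [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15]
```

Each step does a constant amount of work, so the n-th value is reached in O(n) time and O(n) memory, regardless of how sparse the ugly numbers are among the integers.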