factors | 易学教程

Recode categorical factor with N categories into N binary columns

Question: Original data frame:

    v1 = sample(letters[1:3], 10, replace=TRUE)
    v2 = sample(letters[1:3], 10, replace=TRUE)
    df = data.frame(v1,v2)
    df
       v1 v2
    1   b  c
    2   a  a
    3   c  c
    4   b  a
    5   c  c
    6   c  b
    7   a  a
    8   a  b
    9   a  c
    10  a  b

New data frame:

    new_df = data.frame(row.names=rownames(df))
    for (i in colnames(df)) {
      for (x in letters[1:3]) {
        #new_df[x] = as.numeric(df[i] == x)
        new_df[paste0(i, "_", x)] = as.numeric(df[i] == x)
      }
    }
      v1_a v1_b v1_c v2_a v2_b v2_c
    1    0    1    0    0    0    1
    2    1    0    0    1    0    0
    3    0    0    1    0    0    1
    4    0    1    0    1    0    0
    5    0    0

Remove unused factor levels from a ggplot bar plot

Question: I want to do the opposite of this question, and sort of the opposite of this question, though that's about legends, not the plot itself. The other SO questions seem to be asking about how to keep unused factor levels. I'd actually like mine removed. I have several name variables and several columns (wide format) of variable attributes that I'm using to create numerous bar plots. Here's a reproducible example:

    library(ggplot2)
    df <- data.frame(name=c("A","B","C"), var1=c(1,NA,2),var2=c(3,4,5))

Finding largest prime number out of 600851475143?

I'm trying to solve problem 3 from http://projecteuler.net . However, when I run this program nothing prints out. What am I doing wrong? Problem: What is the largest prime factor of the number 600851475143 ?

    public class project_3 {
        public boolean prime(long x) // if x is prime return true
        {
            boolean bool = false;
            for(long count=1L; count<x; count++) {
                if( x%count==0 ) {
                    bool = false;
                    break;
                } else {
                    bool = true;
                }
            }
            return bool;
        }

        public static void main(String[] args) {
            long ultprime = 0L; // largest prime value
            project_3 object = new project_3();
            for(long x=1L; x <= 600851475143L; x++) {
                if
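For comparison, a common way to approach this kind of problem is trial division that divides each factor out of the number as soon as it is found, so the loop only has to run up to the square root of what remains. The following is a minimal Python sketch of that idea, not a repair of the Java program above; the function name `largest_prime_factor` is hypothetical.

```python
def largest_prime_factor(n: int) -> int:
    """Return the largest prime factor of n (n >= 2) by trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    largest = 1
    d = 2
    # Only candidates up to sqrt(remaining n) need to be tried.
    while d * d <= n:
        while n % d == 0:
            largest = d      # d is prime here: smaller factors were already removed
            n //= d
        d += 1 if d == 2 else 2   # after 2, test odd candidates only
    if n > 1:
        largest = n          # whatever remains is itself a prime factor
    return largest


print(largest_prime_factor(13195))  # 13195 = 5 * 7 * 13 * 29, prints 29
```

Because every factor is divided out as it is found, the remaining value shrinks quickly, which is why this finishes almost instantly even for a 12-digit input.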

Algorithm to find all the exact divisors of a given integer

Question: I want to find all the exact divisors of a number. Currently I have this:

    {
        int n;
        int i=2;
        scanf("%d",&n);
        while(i<=n/2)
        {
            if(n%i==0)
                printf("%d,",i);
            i++;
        }
        getch();
    }

Is there any way to improve it?

Answer 1: First, your code should have the condition of i <= n/2 , otherwise it can miss one of the factors, for example 6 will not be printed if n=12. Run the loop to the square root of the number (i.e. i <= sqrt(n) ) and print both i and n/i (both will be divisors of n).

    {
        int n;
        int i=2;
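The square-root idea from the answer can be sketched as follows. This is an illustrative Python version under a few assumptions, not the answer's own C code: the function name `divisors` is hypothetical, and unlike the question's loop it also reports 1 and n themselves.

```python
import math


def divisors(n: int) -> list[int]:
    """Return all positive divisors of n (n >= 1) in ascending order."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    small, large = [], []
    for i in range(1, math.isqrt(n) + 1):
        if n % i == 0:
            small.append(i)            # divisor at or below sqrt(n)
            if i != n // i:            # avoid listing sqrt(n) twice for perfect squares
                large.append(n // i)   # paired divisor at or above sqrt(n)
    # 'large' was collected in descending order, so reverse it before joining.
    return small + large[::-1]


print(divisors(12))  # [1, 2, 3, 4, 6, 12]
```

The loop runs about sqrt(n) times instead of n/2 times, and each hit yields two divisors at once.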

Why is the terminology of labels and levels in factors so weird?

Question: An example of a non-settable function would be labels . You can only set factor labels when they are created with the factor function. There is no labels<- function. Not that 'labels' and 'levels' in factors make any sense....

    > fac <- factor(1:3, labels=c("one", "two", "three"))
    > fac
    [1] one   two   three
    Levels: one two three
    > labels(fac)
    [1] "1" "2" "3"

OK, I asked for labels, which one might assume were as set by the factor call, but I get something quite ... what's the

Enumerate factors of a number directly in ascending order without sorting?

Question: Is there an efficient algorithm to enumerate the factors of a number n , in ascending order, without sorting? By “efficient” I mean: The algorithm avoids a brute-force search for divisors by starting with the prime-power factorization of n . The runtime complexity of the algorithm is O( d log₂ d ) or better, where d is the divisor count of n . The space complexity of the algorithm is O( d ). The algorithm avoids a sort operation. That is, the factors are produced in order rather than
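One way to picture the problem is to start from the prime-power factorization and build the divisor list one prime at a time, combining already-ordered lists with a linear two-way merge instead of a general sort. The Python sketch below illustrates that idea only; the names `merge` and `divisors_ascending` are hypothetical, it assumes the factorization is supplied as (prime, exponent) pairs with distinct primes, and it makes no claim to meet the O( d log₂ d ) bound stated in the question (whether a merge counts as "avoiding a sort" is also a judgment call).

```python
def merge(a: list[int], b: list[int]) -> list[int]:
    """Merge two ascending lists into one ascending list in linear time."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def divisors_ascending(factorization: list[tuple[int, int]]) -> list[int]:
    """Divisors of n in ascending order, given n as [(prime, exponent), ...]."""
    result = [1]
    for p, e in factorization:
        layer = result      # divisors that do not yet involve p
        merged = result
        for _ in range(e):
            # Multiplying an ascending list by p keeps it ascending.
            layer = [x * p for x in layer]
            merged = merge(merged, layer)
        result = merged
    return result


print(divisors_ascending([(2, 2), (3, 1)]))  # n = 12: [1, 2, 3, 4, 6, 12]
```

Each layer holds the earlier divisors multiplied by a different power of p, so the lists being merged never share a value and no deduplication is needed.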

nth ugly number

Question: Numbers whose only prime factors are 2, 3 or 5 are called ugly numbers. Example: 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, ... 1 can be considered as 2^0. I am working on finding the nth ugly number. Note that these numbers are extremely sparsely distributed as n gets large. I wrote a trivial program that computes if a given number is ugly or not. For n > 500 - it became super slow. I tried using memoization - observation: ugly_number * 2, ugly_number * 3, ugly_number * 5 are all ugly. Even with that
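The observation that every ugly number times 2, 3 or 5 is again ugly leads to a well-known approach: generate the sequence in order by keeping three indices into the list built so far, one per multiplier, and always appending the smallest pending product. The Python sketch below shows that technique; it is not taken from the question, and the function name `nth_ugly` is hypothetical.

```python
def nth_ugly(n: int) -> int:
    """Return the n-th ugly number (1-indexed), counting 1 as the first."""
    if n < 1:
        raise ValueError("n must be at least 1")
    ugly = [1]
    i2 = i3 = i5 = 0   # positions whose multiples by 2, 3, 5 are still pending
    while len(ugly) < n:
        c2, c3, c5 = ugly[i2] * 2, ugly[i3] * 3, ugly[i5] * 5
        nxt = min(c2, c3, c5)
        ugly.append(nxt)
        # Advance every index that produced nxt, so duplicates such as
        # 6 = 2 * 3 = 3 * 2 are added only once.
        if nxt == c2:
            i2 += 1
        if nxt == c3:
            i3 += 1
        if nxt == c5:
            i5 += 1
    return ugly[-1]


print(nth_ugly(11))  # 15, matching the sequence 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15
```

This never tests non-ugly numbers at all, so the work is proportional to n rather than to the size of the n-th ugly number.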

Cleaning up factor levels (collapsing multiple levels/labels)

Question: What is the most effective (i.e. efficient / appropriate) way to clean up a factor containing multiple levels that need to be collapsed? That is, how to combine two or more factor levels into one. Here's an example where the two levels "Yes" and "Y" should be collapsed to "Yes", and "No" and "N" collapsed to "No":

    ## Given:
    x <- c("Y", "Y", "Yes", "N", "No", "H") # The 'H' should be treated as NA
    ## expectedOutput
    [1] Yes Yes Yes No No <NA>
    Levels: Yes No # <~~ NOTICE

Segmented Sieve of Eratosthenes?

Question: It's easy enough to make a simple sieve:

    for (int i=2; i<=N; i++){
        if (sieve[i]==0){
            cout << i << " is prime" << endl;
            for (int j = i; j<=N; j+=i){
                sieve[j]=1;
            }
        }
        cout << i << " has " << sieve[i] << " distinct prime factors\n";
    }

But what about when N is very large and I can't hold that kind of array in memory? I've looked up segmented sieve approaches and they seem to involve finding primes up until sqrt(N) but I don't understand how it works. What if N is very large (say 10^18)?
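The segmented approach mentioned in the question can be sketched roughly as follows: sieve the primes up to sqrt(hi) once with an ordinary sieve, then walk through the target range in fixed-size blocks, crossing off multiples of those base primes inside each block. The Python code below is a minimal illustration under stated assumptions, not a tuned implementation; the names `simple_sieve`, `segmented_primes` and `segment_size` are hypothetical. Note that the base primes up to sqrt(hi) still have to fit in memory, which for a bound of 10^18 means sieving up to 10^9.

```python
import math


def simple_sieve(limit: int) -> list[int]:
    """Ordinary sieve of Eratosthenes: all primes <= limit."""
    if limit < 2:
        return []
    flags = bytearray([1]) * (limit + 1)
    flags[0] = flags[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if flags[i]:
            for j in range(i * i, limit + 1, i):
                flags[j] = 0
    return [i for i, f in enumerate(flags) if f]


def segmented_primes(lo: int, hi: int, segment_size: int = 1 << 16) -> list[int]:
    """Primes in [lo, hi], holding only one block of flags in memory at a time."""
    lo = max(lo, 2)
    if hi < lo:
        return []
    base = simple_sieve(math.isqrt(hi))   # primes needed to cross off composites
    primes = []
    for start in range(lo, hi + 1, segment_size):
        end = min(start + segment_size - 1, hi)
        flags = bytearray([1]) * (end - start + 1)   # flags[k] describes start + k
        for p in base:
            if p * p > end:
                break
            # First multiple of p inside the block, but never p itself.
            first = max(p * p, ((start + p - 1) // p) * p)
            for m in range(first, end + 1, p):
                flags[m - start] = 0
        primes.extend(start + k for k, f in enumerate(flags) if f)
    return primes


print(segmented_primes(100, 120))  # [101, 103, 107, 109, 113]
```

Every composite in a block has a prime factor no larger than the square root of the block's upper end, which is why the base primes alone are enough to mark it.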