factors | 易学教程

How do I convert certain columns of a data frame to become factors? [duplicate]

Possible Duplicate: identifying or coding unique factors using R

I'm having some trouble with R. I have a data set similar to the following, but much longer.

    A B Pulse
    1 2 23
    2 2 24
    2 2 12
    2 3 25
    1 1 65
    1 3 45

Basically, the first 2 columns are coded. A has 1, 2, which represent 2 different weights. B has 1, 2, 3, which represent 3 different times. As they are coded numerical values, R will treat them as numerical variables. I need to use the factor function to convert these variables into factors. Help?

Here's an example:

    # Create a data frame
    > d <- data.frame(a=1:3, b=2:4)
    > d
      a b
    1 1 2
    2 2 3
    3

In aggregate: sum not meaningful for factors

I am trying something that should be simple; any hint about what is going on is very welcome. I have a large data frame with country imports from some municipalities. For some countries I have 2 entries. I want to sum the imports from each municipality and have a single row for each country. I am using the aggregate function. For example (I include a small part of the data frame):

    municipalities <- c("country", 1100056, 1100106, 1100205, 1100304, 1200104, 1200252)
    c1 <- c("Afghanistan", 2, 34, 23.4, 5, 0, 0)
    c2 <- c("Afghanistan", 0, 20, 11.1, 5.4, 2, 0)
    c3 <- c("Albania", 12, 120, 11.4, 5.1, 12, 10)
    c4<-c("Albania",0,40,61

Python Pandas: how to turn a DataFrame with “factors” into a design matrix for linear regression?

Question: If memory serves me, in R there is a data type called factor which, when used within a DataFrame, can be automatically unpacked into the necessary columns of a regression design matrix. For example, a factor containing True/False/Maybe values would be transformed into 1 0 0, 0 1 0, or 0 0 1 for the purpose of using lower-level regression code. Is there a way to achieve something similar using the pandas library? I see that there is some regression support within Pandas, but since I have my own
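To make the transformation described in the question concrete, here is a minimal pure-Python sketch of indicator (one-hot) coding. The helper name `one_hot` is hypothetical, and using sorted order for the levels is an assumption made only so the column order is reproducible. In pandas itself, `pandas.get_dummies` performs a comparable expansion on a column.

```python
def one_hot(values):
    """Return (levels, rows) with one 0/1 indicator column per distinct level."""
    levels = sorted(set(values))  # assumption: sorted order fixes the column order
    index = {level: i for i, level in enumerate(levels)}
    rows = []
    for value in values:
        row = [0] * len(levels)
        row[index[value]] = 1  # mark the column that matches this observation
        rows.append(row)
    return levels, rows


# Example with the three levels mentioned in the question.
levels, rows = one_hot(["True", "False", "Maybe", "True"])
```

When the regression also includes an intercept, one indicator column is usually dropped so that the columns are not perfectly collinear.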

Getting Factors of a Number

Question: I'm trying to refactor this algorithm to make it faster. What would be the first refactoring here for speed?

    public int GetHowManyFactors(int numberToCheck)
    {
        // we know 1 is a factor and the numberToCheck
        int factorCount = 2;
        // start from 2 as we know 1 is a factor, and less than as numberToCheck is a factor
        for (int i = 2; i < numberToCheck; i++)
        {
            if (numberToCheck % i == 0)
                factorCount++;
        }
        return factorCount;
    }

Answer 1: The first optimization you could make is that you only need to check up
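The answer excerpt is cut off, so the following is only a sketch of one common speed-up, under the assumption that the loop can stop at the square root of the number because divisors come in pairs (i, n / i). It is written in Python rather than the question's C#, and the name `count_factors` is hypothetical.

```python
import math


def count_factors(n):
    """Count the positive divisors of n (n >= 1) by testing only up to sqrt(n)."""
    count = 0
    for i in range(1, math.isqrt(n) + 1):
        if n % i == 0:
            # i and n // i form a divisor pair; count it once if both are equal
            count += 1 if i == n // i else 2
    return count
```

For a perfect square such as 36, the root 6 pairs with itself and is counted only once. Unlike the original loop, this version also reports a single divisor for 1.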

Finding factors of a given integer

I have something like this down:

    int f = 120;
    for(int ff = 1; ff <= f; ff++){
        while (f % ff != 0){
        }

Is there anything wrong with my loop to find factors? I'm really confused as to the workings of for and while statements, so chances are they are completely wrong. After this, how would I go about assigning variables to said factors?

Sharad Dargan

    public class Solution {
        public ArrayList<Integer> allFactors(int a) {
            int upperlimit = (int)(Math.sqrt(a));
            ArrayList<Integer> factors = new ArrayList<Integer>();
            for(int i=1;i <= upperlimit; i+= 1){
                if(a%i == 0){
                    factors.add(i);
                    if(i != a/i){
                        factors
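On the question of how to keep the factors once they are found, one option is to collect them in a list instead of separate variables. The sketch below is written in Python and is not a completion of the truncated Java answer above; it only assumes the same idea of testing candidates up to the square root and recording both members of each divisor pair. The name `all_factors` is hypothetical.

```python
import math


def all_factors(a):
    """Return every positive factor of a (a >= 1) in ascending order."""
    factors = []
    for i in range(1, math.isqrt(a) + 1):
        if a % i == 0:
            factors.append(i)
            if i != a // i:  # avoid adding the square root twice
                factors.append(a // i)
    return sorted(factors)


# The question's example value.
factors_of_120 = all_factors(120)
```

Each factor is then available by position, for example `factors_of_120[0]` for the smallest one.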

Basic - T-Test -> Grouping Factor Must have Exactly 2 Levels

I am relatively new to R. For my assignment I have to start by conducting a t-test, looking at the effect of a politician's party (Conservative or Labour) on their real gross wealth and real net wealth. I have to attempt to estimate the effect of serving in office on wealth using a simple t-test. The dataset is called takehome.dta. Labour and Tory are binary, where 1 indicates that they serve for that party and 0 otherwise. The variables for wealth are lnrealgross and lnrealnet. I have imported and attached the dataset, but when I attempt to conduct a simple t-test, I get the following message

Identifying or coding unique factors

I would like to create a new variable, litter, to indicate each sow or litter in different farrowing dates (fdate). Each litter is to be numbered from 1 to N with an increment of 1, as shown in the last column.

    sow   season piglet fdate      litter
    1M521 1      5702   14/09/2009 1
    1M521 1      5703   14/09/2009 1
    1M521 2      22920  17/02/2010 2
    1M521 2      22920  17/02/2010 2
    1M521 2      22920  17/02/2010 2
    1M584 1      8516   28/09/2009 3
    1M584 1      8516   28/09/2009 3
    1M584 1      8516   28/09/2009 3
    1N312 1      6192   16/09/2009 4
    1N312 1      6193   16/09/2009 4
    1N312 1      6194   16/09/2009 4
    1N312 2      21818  11/02/2010 5
    1N312 2      21819  11/02/2010 5
    1N312 2      21820  11
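The numbering rule described above can be illustrated independently of R. The following Python sketch assumes that a litter is identified by the pair (sow, fdate) and that numbers are handed out in order of first appearance; the function name `number_litters` is hypothetical, and the sample records are a subset of the complete rows shown in the question.

```python
def number_litters(records):
    """Assign 1..N to each distinct (sow, fdate) pair, in order of first appearance."""
    ids = {}
    litters = []
    for sow, fdate in records:
        key = (sow, fdate)
        if key not in ids:
            ids[key] = len(ids) + 1  # next unused litter number
        litters.append(ids[key])
    return litters


records = [
    ("1M521", "14/09/2009"),
    ("1M521", "14/09/2009"),
    ("1M521", "17/02/2010"),
    ("1M584", "28/09/2009"),
    ("1N312", "16/09/2009"),
    ("1N312", "11/02/2010"),
]
litters = number_litters(records)
```

If the data are not already ordered by sow and date, sorting first would be needed to reproduce the numbering in the question.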

How can I ensure that a partition has representative observations from each level of a factor?

I wrote a small function to partition my dataset into training and testing sets. However, I am running into trouble when dealing with factor variables. In the model validation phase of my code, I get an error if the model was built on a dataset that doesn't have representation from each level of a factor. How can I fix this partition() function to include at least one observation from every level of a factor variable?

    test.df <- data.frame(a = sample(c(0,1), 100, rep = T),
                          b = factor(sample(letters, 100, rep = T)),
                          c = factor(sample(c("apple", "orange"), 100, rep = T)))
    set.seed(123)
    partition
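The asker's partition() function is cut off, so the following is not a fix for it. It is a Python sketch of one possible approach, under the assumption that it is acceptable to reserve one randomly chosen row per factor level for the training set and then fill the rest at random. The signature, the `factor_keys` parameter, and the 70% default are all hypothetical.

```python
import random


def partition(rows, factor_keys, train_fraction=0.7, seed=123):
    """Split rows so that every level of each listed factor appears in the training set."""
    rng = random.Random(seed)
    train = set()
    for key in factor_keys:
        by_level = {}
        for i, row in enumerate(rows):
            by_level.setdefault(row[key], []).append(i)
        for indices in by_level.values():
            # reserve one row for this level unless one is already in the training set
            if not train.intersection(indices):
                train.add(rng.choice(indices))
    remaining = [i for i in range(len(rows)) if i not in train]
    rng.shuffle(remaining)
    target = max(len(train), round(train_fraction * len(rows)))
    train.update(remaining[: target - len(train)])
    train_rows = [rows[i] for i in sorted(train)]
    test_rows = [rows[i] for i in range(len(rows)) if i not in train]
    return train_rows, test_rows


# Deterministic stand-in for the question's test.df (100 rows, two factor columns).
letters = "abcdefghijklmnopqrstuvwxyz"
rows = [
    {"a": i % 2, "b": letters[i % 26], "c": ["apple", "orange"][i % 2]}
    for i in range(100)
]
train_rows, test_rows = partition(rows, ["b", "c"])
```

The reserved rows slightly bias the training set toward rare levels, which may or may not matter for the validation being done.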
