regression

Block bootstrap from subject list

这一生的挚爱 posted on 2019-11-28 07:13:47
Question: I'm trying to efficiently implement a block bootstrap technique to get the distribution of regression coefficients. The main outline is as follows. I have a panel data set, with firm and year as the indices. For each iteration of the bootstrap, I wish to sample n subjects with replacement. From this sample, I need to construct a new data frame that is an rbind() stack of all the observations for each sampled subject, run the regression, and pull out the coefficients. Repeat for a bunch of
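The resampling scheme outlined in the question could be sketched in base R roughly as follows. This is only an illustration under stated assumptions, not the asker's code: the simulated data frame `panel`, its columns `firm`, `year`, `x` and `y`, and the helper `block_boot_coefs` are all hypothetical. Splitting the row indices by subject once up front avoids repeated rbind() calls inside the loop.

```r
# Hypothetical simulated panel: 20 firms observed over 5 years.
set.seed(1)
panel <- expand.grid(firm = 1:20, year = 2001:2005)
panel$x <- rnorm(nrow(panel))
panel$y <- 1 + 2 * panel$x + rnorm(nrow(panel))

# Split the row indices by firm once, so each bootstrap draw is an index lookup.
rows_by_firm <- split(seq_len(nrow(panel)), panel$firm)

# Hypothetical helper: resample whole firms with replacement, refit the
# regression on the stacked rows, and keep the coefficients of each refit.
block_boot_coefs <- function(data, rows_by_subject, n_boot = 200) {
  n_subjects <- length(rows_by_subject)
  t(replicate(n_boot, {
    drawn <- sample.int(n_subjects, n_subjects, replace = TRUE)
    idx <- unlist(rows_by_subject[drawn], use.names = FALSE)
    coef(lm(y ~ x, data = data[idx, ]))
  }))
}

boot_coefs <- block_boot_coefs(panel, rows_by_firm)

# Bootstrap standard errors: column-wise standard deviation of the draws.
boot_se <- apply(boot_coefs, 2, sd)
```

Each row of `boot_coefs` holds the coefficients from one bootstrap replicate, so quantiles of a column give a percentile interval for that coefficient.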

chaid regression tree to table conversion in r

大兔子大兔子 posted on 2019-11-28 07:04:09
Question: I used the CHAID package from this link. It gives me a chaid object which can be plotted. I want a decision table with each decision rule in a column instead of a decision tree, but I don't understand how to access nodes and paths in this chaid object. Kindly help me. I followed the procedure given in this link. I can't post my data here since it is too long, so I am posting code that uses the sample dataset provided with CHAID to perform the task. Copied from the help manual of CHAID: library

Rolling regression xts object in R

微笑、不失礼 posted on 2019-11-28 06:21:18
Question: I am attempting to perform a rolling 100-day regression on an xts object and return the t-statistic of the slope coefficient for all dates. I have an xts object, prices:
> tail(prices)
             DBC   EEM   EFA    GLD   HYG    IEF   IWM   IYR    MDY    TLT
2012-11-02 27.14 41.60 53.69 162.60 92.41 107.62 81.19 64.50 179.99 122.26
2012-11-05 27.37 41.80 53.56 163.23 92.26 107.88 81.73 64.02 181.10 122.95
2012-11-06 27.86 42.13 54.07 166.30 92.40 107.39 82.34 64.16 182.69 121.79
2012-11-07 27.34 41.44 53.26 166.49 91.85 108

model.matrix(): why do I lose control of contrast in this case

你。 posted on 2019-11-28 05:59:07
Question: Suppose we have a toy data frame:
x <- data.frame(x1 = gl(3, 2, labels = letters[1:3]), x2 = gl(3, 2, labels = LETTERS[1:3]))
I would like to construct a model matrix
#   x1b x1c x2B x2C
# 1   0   0   0   0
# 2   0   0   0   0
# 3   1   0   1   0
# 4   1   0   1   0
# 5   0   1   0   1
# 6   0   1   0   1
by:
model.matrix(~ x1 + x2 - 1, data = x, contrasts.arg = list(x1 = contr.treatment(letters[1:3]), x2 = contr.treatment(LETTERS[1:3])))
but actually I get:
#   x1a x1b x1c x2B x2C
# 1   1   0   0   0   0
# 2   1   0   0   0   0
# 3   0   1   0   1   0
# 4   0   1   0   1   0
# 5 0
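The answer is not part of this excerpt, so what follows is only one possible workaround, offered as a sketch. When the intercept is removed with `- 1`, model.matrix() codes the first factor in the formula with an indicator column for every level, so the contrast supplied for `x1` has no visible effect. Keeping the intercept and dropping its column afterwards yields the four columns the question asks for. The names `mm_full` and `mm` are hypothetical.

```r
x <- data.frame(x1 = gl(3, 2, labels = letters[1:3]),
                x2 = gl(3, 2, labels = LETTERS[1:3]))

# Keep the intercept so the default treatment contrasts apply to both
# factors, then drop the "(Intercept)" column from the result.
mm_full <- model.matrix(~ x1 + x2, data = x)
mm <- mm_full[, colnames(mm_full) != "(Intercept)", drop = FALSE]

# Inspect which columns remain.
print(colnames(mm))
```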

large-scale regression in R with a sparse feature matrix

你说的曾经没有我的故事 posted on 2019-11-28 05:07:10
I'd like to do large-scale regression (linear/logistic) in R with many (e.g. 100k) features, where each example is relatively sparse in the feature space, e.g. ~1k non-zero features per example. It seems like the slm function in the SparseM package should do this, but I'm having difficulty converting from the sparseMatrix format to an slm-friendly format. I have a numeric vector of labels y and a sparseMatrix of features X \in {0,1}. When I try
model <- slm(y ~ X)
I get the following error:
Error in model.frame.default(formula = y ~ X) : invalid type (S4) for variable 'X'
presumably because slm wants a

Ordering of points in R lines plot

不想你离开。 posted on 2019-11-28 04:00:14
Question: I want to add a fitted line of a quadratic fit to a scatterplot, but the ordering of the points is somehow messed up.
attach(mtcars)
plot(hp, mpg)
fit <- lm(mpg ~ hp + I(hp^2))
summary(fit)
res <- data.frame(cbind(mpg, fitted(fit), hp))
with(res, plot(hp, mpg))
with(res, lines(hp, V2))
This draws lines all over the place, as opposed to the smooth fit through the scatterplot. I'm sure this is pretty straightforward, but I'm a little stumped.
Answer 1: When you plot a line, all the points are
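The answer is cut off in this excerpt, but the underlying behaviour is that lines() joins points in the order they are supplied. A minimal sketch of one way to handle this, assuming the same mtcars fit as in the question, is to sort by hp before drawing. The variable names `ord`, `hp_sorted` and `fitted_sorted` are illustrative only.

```r
fit <- lm(mpg ~ hp + I(hp^2), data = mtcars)

# lines() connects points in the order given, so order them by hp first.
ord <- order(mtcars$hp)
hp_sorted <- mtcars$hp[ord]
fitted_sorted <- fitted(fit)[ord]

# When run as a script, discard graphics instead of writing Rplots.pdf.
if (!interactive()) pdf(NULL)

plot(mtcars$hp, mtcars$mpg, xlab = "hp", ylab = "mpg")
lines(hp_sorted, fitted_sorted)
```

Passing `data = mtcars` to lm() also avoids the attach() call used in the question.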

Linear regression analysis with string/categorical features (variables)?

ⅰ亾dé卋堺 posted on 2019-11-28 03:23:58
Regression algorithms seem to work on features represented as numbers. For example: This dataset doesn't contain categorical features/variables. It's quite clear how to do regression on this data and predict price. But now I want to do regression analysis on data that contain categorical features: There are 5 features: District, Condition, Material, Security, Type. How can I do regression on this data? Do I have to transform all this string/categorical data to numbers manually? I mean if I have to create some encoding rules and according to those rules transform all data to numeric
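The question does not say which tool is being used. As a hedged illustration in R, lm() expands factor columns into indicator (dummy) columns automatically, so no manual encoding rules are needed. The data frame `houses` below is entirely hypothetical: the values are invented, only two of the five features named in the question are included, and the numeric `Area` column is an added assumption.

```r
# Hypothetical toy data; every value here is invented for illustration.
set.seed(42)
houses <- data.frame(
  District  = factor(sample(c("North", "South", "East"), 60, replace = TRUE)),
  Condition = factor(sample(c("Good", "Poor"), 60, replace = TRUE)),
  Area      = runif(60, 40, 120)
)
houses$Price <- 1000 * houses$Area +
  ifelse(houses$Condition == "Good", 20000, 0) +
  rnorm(60, sd = 5000)

# Factor columns are dummy-encoded by lm() using treatment contrasts,
# with the first level of each factor as the baseline.
fit <- lm(Price ~ District + Condition + Area, data = houses)

# The design matrix shows the indicator columns that were generated.
dummy_cols <- colnames(model.matrix(fit))
print(dummy_cols)
```

Each indicator coefficient is read as the difference from the baseline level of that factor, holding the other predictors fixed.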

multiple ggplot linear regression lines

可紊 posted on 2019-11-28 01:30:36
Question: I am plotting the occurrence of a species according to numerous variables on the same plot. There are many other variables but I've only kept the important ones for the sake of this post:
> str(GH)
'data.frame': 288 obs. of 21 variables:
 $ Ee      : int 2 2 1 7 6 3 0 9 3 7 ...
 $ height  : num 14 25.5 25 21.5 18.5 36 18 31.5 28.5 19 ...
 $ legumes : num 0 0 55 30 0 0 55 10 30 0 ...
 $ grass   : num 60 50 30 35 40 35 40 40 35 30 ...
 $ forbs   : num 40 70 40 50 65 70 40 65 70 70 ...
I've managed to plot

Exponential regression in R

試著忘記壹切 posted on 2019-11-28 00:35:20
I have some points that look like a logarithmic curve. The curve that I'm trying to obtain looks like: y = a * exp(-b*x) + c
My code:
x <- c(1.564379666,1.924250092,2.041559879,2.198696382,2.541267447,2.666400433,2.922534874,2.965726615,3.009969443,3.248480245,3.32927682,3.371404563,3.423759668,3.713001284,3.841419166,3.847632349,3.947993339,4.024541136,4.030779671,4.118849343,4.154008445,4.284232251,4.491359108,4.585182188,4.643299476,4.643299476,4.643299476,4.684369939,4.84424144,4.867973977,5.144490521,5.324298915,5.324298915,5.988637637,6.146599422,6.674937463)
y <- c(25600,23800,11990
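Because the asker's y vector is truncated here, the sketch below uses simulated data rather than the original values. It shows one common way to fit a curve of the form y = a * exp(-b*x) + c with nls(): guess c slightly below the smallest observation, take starting values for a and b from a log-linear fit, then refine all three parameters. The simulated parameters and the names `c0`, `lin`, `start` and `est` are assumptions made for illustration.

```r
# Simulated data from y = a * exp(-b * x) + c plus noise (hypothetical values).
set.seed(123)
x <- seq(1.5, 6.5, length.out = 40)
y <- 30000 * exp(-0.8 * x) + 2000 + rnorm(length(x), sd = 100)

# Rough starting values: place c a little below the smallest y so that
# log(y - c0) is defined, then linearise to get initial guesses for a and b.
c0 <- 0.9 * min(y)
lin <- lm(log(y - c0) ~ x)
start <- list(a = unname(exp(coef(lin)[1])),
              b = unname(-coef(lin)[2]),
              c = c0)

# Nonlinear least squares refines all three parameters together.
fit <- nls(y ~ a * exp(-b * x) + c, start = start)
est <- coef(fit)
print(est)
```

nls() can fail to converge from poor starting values, which is why the linearised fit is used to seed it.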

Java 8 change in UTF-8 decoding

邮差的信 posted on 2019-11-28 00:13:30
Question: We recently migrated our application from JDK 7 to JDK 8. After the change, we ran into a problem with the following snippet of code.
String output = new String(byteArray, "UTF-8");
The byte array may contain invalid UTF-8 byte sequences. The same byte array, upon UTF-8 decoding, results in two different strings on Java 7 and Java 8. According to the answer to this SO post, Java 8 "fixes" an error in Java 7 and replaces invalid UTF-8 byte sequences with a replacement string, which is in