stata

Using memisc to import stata .dta file into R

℡╲_俬逩灬. submitted on 2019-12-10 13:49:10
Question: I have a 700 MB Stata .dta file with 28 million observations and 14 variables. When I attempt to import it into R using foreign's read.dta() function, I run out of RAM on my 8 GB machine (page-outs shoot into the GBs very quickly): staph <- read.dta("Staph_1999_2010.dta") I hunted around, and it sounds like a more efficient alternative would be the Stata.file() function from the memisc package. When I call staph <- Stata.file("Staph_1999_2010.dta") I get a segfault: *** caught segfault ***

first-difference linear panel model variance in R and Stata

雨燕双飞 submitted on 2019-12-09 05:47:49
Question: I would like a colleague to replicate, with the plm package in R (or some other package), a first-difference linear panel data model that I am estimating in Stata. In Stata, xtreg does not have a first-difference option, so instead I run: reg D.(y x), nocons cluster(ID) In R, I am doing: plm(formula = y ~ -1 + x, data = data, model = "fd", index = c("ID","Period")) The coefficients match, but the standard errors in R are larger than in Stata. I looked in the plm help and pdf documentation
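To make explicit what both commands in the question estimate, here is a minimal Python sketch of the point estimate only: difference y and x within each ID, then fit a slope through the origin. The function name and the toy numbers are assumptions made for illustration. It says nothing about the clustered standard errors, which are what the question is actually about.

```python
from collections import defaultdict

def first_difference_slope(rows):
    """rows: iterable of (unit_id, period, y, x).

    Returns the no-intercept OLS slope of delta-y on delta-x.
    """
    by_id = defaultdict(list)
    for unit_id, period, y, x in rows:
        by_id[unit_id].append((period, y, x))

    sxy = sxx = 0.0
    for obs in by_id.values():
        obs.sort()  # chronological order within each unit
        for (p0, y0, x0), (p1, y1, x1) in zip(obs, obs[1:]):
            if p1 - p0 != 1:
                # Stata's D. operator is missing when the previous period is absent
                continue
            dy, dx = y1 - y0, x1 - x0
            sxy += dx * dy
            sxx += dx * dx
    return sxy / sxx

# Toy panel (made-up numbers): y = 2*x plus a unit-specific constant.
panel = [
    ("A", 1, 12, 1), ("A", 2, 14, 2), ("A", 3, 18, 4),
    ("B", 1, -5, 0), ("B", 2, 1, 3), ("B", 3, 5, 5),
]
slope = first_difference_slope(panel)
```

Differencing removes the unit-specific constants, so the toy panel recovers the slope of 2 regardless of the two different intercepts.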

Any Python Library Produces Publication Style Regression Tables

丶灬走出姿态 submitted on 2019-12-09 04:04:02
Question: I've been using Python for regression analysis. After getting the regression results, I need to summarize all of them in a single table and convert it to LaTeX (for publication). Is there any package that does this in Python? Something like estout in Stata that gives the following table: Answer 1: Well, there is summary_col in statsmodels; it doesn't have all the bells and whistles of estout, but it does have the basic functionality you are looking for (including export to LaTeX):

Reading in only part of a Stata .DTA file in R

允我心安 submitted on 2019-12-08 17:08:07
Question: I apologize in advance if this has a simple answer somewhere. It seems like the kind of thing that would, but I can't locate it in the help files, by searching SO, or by Googling. I'm currently working with some datasets that are several GB each. That is small enough to fit in memory on one of the cluster nodes I have access to, but the data take quite a while to load. For many debugging/programming activities with this data, I don't need the entire file loaded, just the first few thousand

Stata - Moving Finite Product

自作多情 submitted on 2019-12-08 12:35:43
Question: For each observation in my dataset, I want to compute the product of the values in the 15 observations before it. My dataset has dates (in chronological order), but there are gaps. Here is an example:
date        wage_thousands  moving_15_day_product
1/1/2000    3               .
1/3/2000    2               .
1/7/2000    3               .
1/10/2000   6               .
1/12/2000   6               .
1/14/2000   2               .
1/15/2000   1               .
1/16/2000   1               .
1/19/2000   2               .
1/21/2000   1               .
1/22/2000   2               .
1/24/2000   3               .
1/26/2000   1               .
1/28/2000   1               .
1/29/2000   2               .
2/1/2000    1               31,104
2/10/2000   5               51,850
2/12
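As a rough illustration of the calculation being asked for, the Python sketch below multiplies the 15 observations preceding each row, excluding the row itself. The function name is made up for this example. This reading reproduces the 31,104 shown for 2/1/2000; the window rule behind the next figure in the example is not clear from the excerpt, so the sketch does not try to match it.

```python
from math import prod

def lagged_product(values, window=15):
    """Product of the `window` observations before each position.

    Positions with fewer than `window` predecessors get None,
    playing the role of Stata's missing value.
    """
    out = []
    for i in range(len(values)):
        if i < window:
            out.append(None)
        else:
            out.append(prod(values[i - window:i]))
    return out

# wage_thousands column from the example, in date order
wages = [3, 2, 3, 6, 6, 2, 1, 1, 2, 1, 2, 3, 1, 1, 2, 1, 5]
products = lagged_product(wages)
print(products[15])  # row for 2/1/2000
```

Because the window counts observations rather than calendar days, the gaps between dates play no role in this version.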

Parsing through all folders in a directory

时光毁灭记忆、已成空白 submitted on 2019-12-08 08:59:19
Question: I am a beginner working with Stata, and I have a question about grabbing folder names. I have a directory, \Test\abc, that contains folders like this: Q100 Q101 Q102 .... I would like to go into each folder Q* (where * denotes anything after the Q), find a file named "filenameQ*", do something, and then send the output back to \Test\abc. The following code shows the idea of what I want to do, where varlist Q* denotes the array of all the folders in the directory that begin
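The folder walk described above can be sketched in Python (not Stata) as follows. The helper name, the .txt extension, and the temporary directory tree are assumptions made for the demonstration; the question only specifies folders named Q* and files named "filenameQ*".

```python
import tempfile
from pathlib import Path

def find_q_files(root):
    """Return (folder_name, file_path) pairs for every sub-folder Q* of
    `root` that contains a file whose name starts with filename<folder>."""
    root = Path(root)
    matches = []
    for folder in sorted(root.glob("Q*")):
        if not folder.is_dir():
            continue
        for path in sorted(folder.glob(f"filename{folder.name}*")):
            if path.is_file():
                matches.append((folder.name, path))
    return matches

# Demo on a throwaway directory tree (all names here are made up).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("Q100", "Q101", "Q102", "other"):
        (Path(tmp) / name).mkdir()
    (Path(tmp) / "Q100" / "filenameQ100.txt").write_text("a")
    (Path(tmp) / "Q101" / "filenameQ101.txt").write_text("b")
    found = [(folder, path.name) for folder, path in find_q_files(tmp)]
```

Each returned path can then be processed, and any output written back to the parent directory that was passed in as `root`.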

How to fill in observations using other observations R or Stata

北战南征 submitted on 2019-12-08 08:14:49
Question: I have a dataset like this:
ID  dum1  dum2  dum3  var1
1   0     1     .     hi
1   0     .     0     hi
2   1     .     .     bye
2   0     0     1     .
Where an observation has a missing value, I want to fill it in using the other observations with the same ID. So my end product would be something like:
ID  dum1  dum2  dum3  var1
1   0     1     0     hi
1   0     1     0     hi
2   1     0     1     bye
2   0     0     1     bye
Is there any way I can do this in R or Stata? Answer 1: This continues the discussion of Stata solutions. The solution by @Pearly Spencer looks backward and forward from observations with
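For comparison with the R and Stata approaches, here is a small Python sketch of the fill rule implied by the example: within each ID, a missing value takes the first non-missing value seen for that column. The function name and the use of None for missing are assumptions, and this is not the method from the answer quoted above.

```python
def fill_within_id(rows, id_key="ID"):
    """Fill None entries with the first non-missing value recorded
    for the same ID and column; non-missing entries are left alone."""
    first_seen = {}
    for row in rows:
        for col, val in row.items():
            if val is not None:
                first_seen.setdefault((row[id_key], col), val)
    return [
        {
            col: val if val is not None else first_seen.get((row[id_key], col))
            for col, val in row.items()
        }
        for row in rows
    ]

# The dataset from the question, with None standing in for "."
data = [
    {"ID": 1, "dum1": 0, "dum2": 1, "dum3": None, "var1": "hi"},
    {"ID": 1, "dum1": 0, "dum2": None, "dum3": 0, "var1": "hi"},
    {"ID": 2, "dum1": 1, "dum2": None, "dum3": None, "var1": "bye"},
    {"ID": 2, "dum1": 0, "dum2": 0, "dum3": 1, "var1": None},
]
filled = fill_within_id(data)
```

Values that are already present are never overwritten, which is why dum1 stays 1 and 0 for the two rows of ID 2.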

Parse variable names in R vs Stata [closed]

我只是一个虾纸丫 submitted on 2019-12-08 07:06:57
Question: I have a family of variable names that differ only in the last four characters (the year), and I would like to create all the variables in this family at once. In Stata I would simply do this:
forvalues n = 1991(1)1995 {
    gen comp`n' = (year_begin < `n' & (year_end > `n' | year_end == .))
}
Here's what I'm
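The logic of the Stata loop can be written out in Python as a sketch: one 0/1 indicator per year, stored under a name built from the year. The function name, the dictionary layout, and the sample firms are hypothetical; None stands in for Stata's missing value, and year_begin is assumed never to be missing.

```python
def add_comp_indicators(firms, years=range(1991, 1996)):
    """Add comp1991..comp1995 to each record: 1 if the spell began before
    year n and either ended after n or has no recorded end, else 0."""
    for firm in firms:
        for n in years:
            still_open = firm["year_end"] is None or firm["year_end"] > n
            firm[f"comp{n}"] = int(firm["year_begin"] < n and still_open)
    return firms

# Hypothetical records for illustration
firms = add_comp_indicators([
    {"year_begin": 1990, "year_end": 1993},
    {"year_begin": 1992, "year_end": None},
])
```

Building the key with an f-string plays the same role as the `n' macro substitution in the variable name.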

Stata: Extracting values and save them as scalars (and more)

两盒软妹~` submitted on 2019-12-08 06:24:13
Question: This question is a follow-up to "Stata: replace, if, forvalues". Consider this data:
set seed 123456
set obs 5000
g firmid = "firm" + string(_n) /* Observation (firm) id */
g nw = floor(100*runiform()) /* Number of workers in a firm */
g double lat = 39+runiform() /* Latitude in decimal degree of a firm */
g double lon = -76+runiform() /* Longitude in decimal degree of a firm */
The first 10 observations are:
+--------------------------------------+
| firmid nw lat lon |
|-----------

In-script command to copy all do files in the current directory to another folder in Stata

点点圈 submitted on 2019-12-08 04:13:19
The following is the copy command:
copy "main.do" "main2.do", replace
However, I want to copy all do-files in the current directory and paste them into another directory, without having to specify the name of each do-file. How is this possible? Following https://www.stata.com/statalist/archive/2006-08/msg00620.html, I tried
cd "C:\Users\Owner\Google Drive\test"
local files : dir "`c(pwd)'" files "*.do"
foreach file in `files' {
    copy `file' "`file'_copied"
}
so that I could at least copy within the current folder. But I still don't know how to copy all the files to another folder.
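The missing step is to build each destination path from the target folder plus the file name. The Python sketch below (not Stata) shows that logic with the standard library; the function name, folder names, and file names are placeholders invented for the demonstration.

```python
import shutil
import tempfile
from pathlib import Path

def copy_do_files(src_dir, dest_dir):
    """Copy every *.do file in src_dir into dest_dir, keeping file names.
    Returns the names of the files copied."""
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.glob("*.do")):
        if path.is_file():
            # destination = target folder + original file name
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied

# Demo in a throwaway directory (file names are made up).
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "test"
    src.mkdir()
    for name in ("main.do", "analysis.do", "notes.txt"):
        (src / name).write_text("* placeholder")
    dest = Path(tmp) / "backup"
    copied = copy_do_files(src, dest)
    present = sorted(p.name for p in dest.iterdir())
```

The same idea should carry over to the Stata loop in the question: keep the `dir` macro listing as it is and prefix the second argument of `copy` with the destination folder.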