stata

Stata: combining coefficients/standard errors from several regressions in a single dataset (number of variables may differ)

走远了吗. submitted on 2019-12-20 04:32:59
Question: I have already asked a question about storing the coefficients and standard errors of several regressions in a single dataset. To reiterate the objective of my initial question: I would like to run several regressions and store their results in a DTA file that I can later use for analysis. My constraints are: I cannot install modules (I am writing code for other people and am not sure what modules they have installed). Some of the regressors are factor variables. Each regression differ

stata odbc sqlfile

笑着哭i submitted on 2019-12-20 04:11:30
Question: I am trying to load data from a database (either MS Access or SQL Server) using odbc sqlfile. The code seems to run without any error, but I am not getting any data. I am using the following code: odbc sqlfile("sqlcode.sql"),dsn("mysqlodbcdata") . Note that sqlcode.sql contains just an SQL statement with SELECT . The same SQL code does return data with odbc load,exec(sqlstmt) dsn("mysqlodbcdata") . Can anyone suggest how I can use odbc sqlfile to import data? This would be a

Tag all duplicate rows in R as in Stata

三世轮回 submitted on 2019-12-20 01:39:09
Question: Following up from my question here, I am trying to replicate in R the functionality of the Stata command duplicates tag , which allows me to tag all the rows of a dataset that are duplicates in terms of a given set of variables:

    clear *
    set obs 16
    g f1 = _n
    expand 104
    bys f1: g f2 = _n
    expand 2
    bys f1 f2: g f3 = _n
    expand 41
    bys f1 f2 f3: g f4 = _n
    des // describe the dataset in memory
    preserve
    sample 10 // draw a 10% random sample
    tempfile sampledata
    save `sampledata', replace
    restore //
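For illustration only, the tagging logic that duplicates tag performs can be sketched in plain Python (not R, which the question asks about). The function name `duplicates_tag` and the small sample rows are hypothetical; the sketch assumes the behaviour described in the question, namely that each row receives the number of other rows sharing its values on the chosen variables, so unique rows get 0.

```python
from collections import Counter


def duplicates_tag(rows, keys):
    """For each row, count how many OTHER rows share its values on `keys`."""
    # Build one hashable signature per row from the chosen variables.
    signatures = [tuple(row[k] for k in keys) for row in rows]
    counts = Counter(signatures)
    # A unique row has count 1, hence tag 0.
    return [counts[sig] - 1 for sig in signatures]


# Hypothetical sample data, much smaller than the dataset in the question.
rows = [
    {"f1": 1, "f2": 1},
    {"f1": 1, "f2": 1},
    {"f1": 1, "f2": 2},
    {"f1": 2, "f2": 1},
    {"f1": 1, "f2": 1},
]

tags = duplicates_tag(rows, ["f1", "f2"])
print(tags)
```

Unlike a "first occurrence is not a duplicate" flag, every member of a duplicated group is tagged here, which is the distinction the question is about.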

Tag all duplicate rows in R as in Stata

时光毁灭记忆、已成空白 submitted on 2019-12-20 01:39:07
Question: Following up from my question here, I am trying to replicate in R the functionality of the Stata command duplicates tag , which allows me to tag all the rows of a dataset that are duplicates in terms of a given set of variables:

    clear *
    set obs 16
    g f1 = _n
    expand 104
    bys f1: g f2 = _n
    expand 2
    bys f1 f2: g f3 = _n
    expand 41
    bys f1 f2 f3: g f4 = _n
    des // describe the dataset in memory
    preserve
    sample 10 // draw a 10% random sample
    tempfile sampledata
    save `sampledata', replace
    restore //

pandas and Stata 13 files

若如初见. submitted on 2019-12-19 10:25:19
Question: I have pandas 0.13.1 installed, but pandas.read_stata() fails to read .dta files created in Stata 13 format, with the error: TypeError: cannot concatenate 'str' and 'NoneType' objects. There is no problem at all with the same dataset saved in Stata 12 format. I thought that the latest release of pandas (0.13.1) handled Stata 13 dataset files. Am I doing something wrong? Answer 1: My guess is you're not doing anything inherently wrong, but that your version of pandas can't handle Stata 13 dataset files. As
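When diagnosing a problem like this, it can help to check which .dta format a file actually uses before blaming the reader. The following is a minimal sketch, not part of pandas: the function name `dta_format_version` is hypothetical, and it assumes (to the best of my knowledge of the file layout) that Stata 13 files use format 117, which starts with the text `<stata_dta><header><release>`, while older formats such as 115 (Stata 12) store the format number in the first byte.

```python
def dta_format_version(head: bytes) -> int:
    """Guess the .dta format number from the first bytes of a file.

    Assumption: formats 117 and later begin with an XML-like header that
    holds the release as three ASCII digits; older formats keep the format
    number in the very first byte.
    """
    if not head:
        raise ValueError("no bytes supplied")
    prefix = b"<stata_dta><header><release>"
    if head.startswith(prefix):
        digits = head[len(prefix):len(prefix) + 3]
        return int(digits.decode("ascii"))
    return head[0]


def read_head(path, size=64):
    """Read the first few bytes of a file (hypothetical helper)."""
    with open(path, "rb") as fh:
        return fh.read(size)


# Synthetic headers standing in for real files:
new_style = b"<stata_dta><header><release>117</release>"
old_style = bytes([115, 2, 1, 0])
print(dta_format_version(new_style), dta_format_version(old_style))
```

With a real file, `dta_format_version(read_head("mydata.dta"))` would report the format, where `mydata.dta` is a placeholder name.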

Is there a command line editor that highlights the Stata syntax?

谁都会走 submitted on 2019-12-19 06:16:23
Question: My internet connection is extremely slow, so I execute batch files on the server without a GUI, i.e. directly from the terminal. However, I often need to make a few changes in the code, and a text editor that highlights Stata syntax would help. Is there one? Answer 1: The Sublime Text editor has a package for Stata. If you're using a Mac, you can find installation instructions here. Answer 2: There is a whole list of text editors that Stata users have found useful here: http://fmwww.bc.edu/repec

Use value label in if command

梦想与她 submitted on 2019-12-19 02:18:09
Question: I am working with a set of dta files representing surveys from different years. Conveniently, each year uses different values for the country variable, so I am trying to set the country value labels for each year to match. I am having trouble comparing value labels, though. So far, I have come up with the following code:

    replace country=1 if countryO=="Japan"
    replace country=2 if countryO=="South Korea" | countryO=="Korea"
    replace country=3 if countryO=="China"
    replace country=4 if countryO==

Between/within standard deviations

你。 submitted on 2019-12-18 16:54:30
Question: When working on a hierarchical/multilevel/panel dataset, it can be very useful to have a package that returns the within- and between-group standard deviations of the available variables. In Stata, with the data below, this is easily done with the command xtsum, i(momid) . I have searched, but I cannot find any R package that does this. Edit: Just to fix ideas, an example of a hierarchical dataset could be this: son_id mom_id hispanic mom_smoke son_birthweigth 1
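To make the three quantities concrete, here is a minimal plain-Python sketch of an overall/between/within decomposition (Python rather than R, purely for illustration). It follows one common convention, which I believe matches what xtsum reports but which should be treated as an assumption: the between SD is the standard deviation of the group means, and the within SD is the standard deviation of each value minus its group mean plus the overall mean. The function name `xtsum_like` and the birthweight numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def xtsum_like(values, groups):
    """Return overall, between-group and within-group standard deviations."""
    overall_mean = mean(values)

    # Collect the values belonging to each group (e.g. each mom_id).
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    group_means = {g: mean(vs) for g, vs in by_group.items()}

    # Within transformation: remove the group mean, add back the overall mean.
    within_values = [
        v - group_means[g] + overall_mean for v, g in zip(values, groups)
    ]

    return {
        "overall": stdev(values),
        "between": stdev(list(group_means.values())),
        "within": stdev(within_values),
    }


# Hypothetical data: two mothers with two sons each.
mom_id = ["m1", "m1", "m2", "m2"]
son_birthweight = [1.0, 3.0, 5.0, 7.0]

result = xtsum_like(son_birthweight, mom_id)
print(result)
```

All three figures use the sample (n - 1) denominator from `statistics.stdev`; other software may use different denominators for unbalanced panels, so small discrepancies are possible.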

Searching for a straightforward way to do Stata's bysort tasks in R

余生长醉 submitted on 2019-12-18 11:32:01
Question: I'm very new to R, and have been struggling for a couple of days to do something that Stata makes quite straightforward. A friend has given me a relatively complicated answer to this question, but I was wondering if there is a simple way to do the following. Suppose I have a two-variable data frame, organized as below:

    category  var1
    a         1
    a         2
    a         3
    b         4
    b         6
    b         8
    b         10
    c         11
    c         14
    c         17

I want to generate five additional variables, each of which should be inserted into this same data frame: var2 , var3 ,
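The excerpt is cut off before it defines the five new variables, so the following plain-Python sketch does not attempt to reproduce them. It only illustrates, on the data from the question, three typical by-group quantities that bysort makes easy in Stata: the running index within a group (like _n), the group size (like _N), and a group total. The function names are hypothetical, and choosing these three quantities is my assumption.

```python
from collections import Counter, defaultdict

category = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c"]
var1 = [1, 2, 3, 4, 6, 8, 10, 11, 14, 17]


def group_index_and_size(groups):
    """Return the 1-based position within each group and each group's size."""
    sizes = Counter(groups)
    seen = defaultdict(int)
    index = []
    for g in groups:
        seen[g] += 1          # rows are numbered in their existing order
        index.append(seen[g])
    return index, [sizes[g] for g in groups]


def group_total(groups, values):
    """Return, for every row, the sum of `values` over that row's group."""
    totals = defaultdict(int)
    for g, v in zip(groups, values):
        totals[g] += v
    return [totals[g] for g in groups]


index, size = group_index_and_size(category)
total = group_total(category, var1)
for row in zip(category, var1, index, size, total):
    print(row)
```

Each result is a list of the same length as the input, so it can be attached as a new column next to category and var1.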

Saving in a file an array or DataFrame together with other information

£可爱£侵袭症+ submitted on 2019-12-18 10:04:07
Question: The statistical software Stata allows short text snippets to be saved within a dataset. This is accomplished using notes and/or characteristics. This feature is of great value to me, as it allows me to save a variety of information, ranging from reminders and to-do lists to information about how I generated the data, or even what the estimation method for a particular variable was. I am now trying to come up with similar functionality in Python 3.6. So far, I have looked online and
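One simple way to approximate this with only the standard library is to store the data and the notes side by side in a single JSON file. This is a minimal sketch, not the approach the excerpt goes on to describe (the text is cut off): the function names, the file name, and the layout of the `notes` dictionary (a `_dta` key for dataset-level notes, mirroring Stata's naming, plus one key per variable) are all assumptions made for illustration.

```python
import json
import os
import tempfile


def save_with_notes(path, columns, rows, notes):
    """Write tabular data and free-text notes into one JSON file."""
    payload = {"notes": notes, "columns": columns, "rows": rows}
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=2)


def load_with_notes(path):
    """Read back the columns, rows and notes written by save_with_notes."""
    with open(path, encoding="utf-8") as fh:
        payload = json.load(fh)
    return payload["columns"], payload["rows"], payload["notes"]


# Hypothetical example data and notes.
path = os.path.join(tempfile.mkdtemp(), "data_with_notes.json")
save_with_notes(
    path,
    columns=["x", "y"],
    rows=[[1, 2.5], [2, 3.5]],
    notes={
        "_dta": ["TODO: check outliers"],
        "y": ["estimated by OLS"],
    },
)

columns, rows, notes = load_with_notes(path)
print(columns, rows, notes)
```

JSON keeps the file human-readable, at the cost of supporting only basic types; arrays with special dtypes would need a different container.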