stata

Finding the second smallest value

微笑、不失礼 submitted on 2019-12-25 02:24:12
Question: For every observation, I want to get the second smallest value of the last five observations of a variable. Do you know which command I have to use?

    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str5 var1 str26 var2
    "Value" "2nd smallest of previous 5"
    "8" ""
    "0" ""
    "4" ""
    "5" ""
    "0" ""
    "6" "0"
    "8" "0"
    "10" "4"
    "8" "5"
    "8" "6"
    end

Answer 1: Original problem: 2nd of last 5. Another way to do it: the 2nd lowest out of 5 will be returned as the lower quartile: .
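For readers who want the direct route rather than the quartile trick, here is a minimal sketch: for each observation from the sixth on, sort the previous five values in Mata and take the second. It assumes the values sit in a numeric variable, here called value (the question's example stores them as strings), and the result variable second is a made-up name.

```stata
clear
input value
8
0
4
5
0
6
8
10
8
8
end

* Result variable (hypothetical name); stays missing for the first five rows
gen second = .

mata:
v = st_data(., "value")
for (i = 6; i <= rows(v); i++) {
    // sort the five preceding values and keep the second smallest
    w = sort(v[(i-5)::(i-1)], 1)
    st_store(i, "second", w[2])
}
end

list
```

On the example data this gives 0, 0, 4, 5, 6 for observations 6 to 10, which matches the column the asker wanted.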

Stata Nested foreach loop substring comparison

无人久伴 submitted on 2019-12-25 01:41:41
Question: I have just started learning Stata and I'm having a hard time. My problem is this: I have two different variables, ATC and A, where A is potentially a substring of ATC. Now I want to mark all the observations in which A is a substring of ATC with OK = 1. I tried this using a simple nested loop:

    foreach x in ATC {
        foreach j in A {
            replace OK = 1 if strpos(`x',`j')!=0
        }
    }

However, whenever I run this loop no changes are being made even though there should be plenty. I feel like I should
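One thing worth noting: foreach x in ATC loops over the single word ATC, not over the values stored in the variable, and Stata already evaluates an expression once per observation. Assuming the goal is a row-by-row check, a sketch without any loop might look like this (the sample codes are invented, and trimming A is an assumption about stray blanks):

```stata
clear
input str7 ATC str7 A
"A10BA02" "A10"
"C09AA05" "A10"
"N02BE01" "N02BE"
end

* 1 if A occurs anywhere inside ATC on the same row, 0 otherwise
gen byte OK = strpos(ATC, trim(A)) > 0

list
```

Here rows 1 and 3 get OK = 1 and row 2 gets OK = 0. If the real task is to compare every A against every ATC across rows, this sketch does not cover it.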

How can I import specific files?

允我心安 submitted on 2019-12-25 01:17:20
Question: I am trying to import hundreds of U.S. county xls files together to form a complete dataset in Stata. The problem is that for every county, I have several files for different years, so that my list of file names looks like this:

    county1-year1970.xls
    county1-year1975.xls
    county2-year1960.xls
    county2-year1990.xls

For each county, I only want the file from the most recent year (which varies across counties). So far, I have written code to loop through each possible file name, and if the file
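One possible way to pick the files first and import afterwards: put the file names into a dataset, parse county and year out of each name, and keep the latest year per county. This is only a sketch; the names are typed in by hand here, and the pattern assumes every file follows the countyN-yearYYYY.xls scheme from the question.

```stata
clear
input str30 fname
"county1-year1970.xls"
"county1-year1975.xls"
"county2-year1960.xls"
"county2-year1990.xls"
end

* Pull the county number and the year out of each file name
gen county = real(regexs(1)) if regexm(fname, "^county([0-9]+)-year([0-9]+)[.]xls")
gen year   = real(regexs(2)) if regexm(fname, "^county([0-9]+)-year([0-9]+)[.]xls")

* Keep only the most recent file for each county
bysort county (year): keep if _n == _N

* Walk over the surviving names; a real run would import each file here
forvalues i = 1/`=_N' {
    local f = fname[`i']
    display "would import: `f'"
}
```

With the four names above, county1-year1975.xls and county2-year1990.xls remain.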

Populating new variable using vlookup with multiple criteria in another variable

牧云@^-^@ submitted on 2019-12-24 23:57:31
Question:
1) A new variable should be created for each unique observation listed in variable sku, which contains repeated values.
2) These newly created variables should be assigned the value of own product's price at the store/week level, as long as the observation's sku value is in the same subcategory (subc) as the variable itself. For example, in eta2,3, observations in line 3, 4, and 5 have the same value because they all belong to the same subcategory as sku #3. [eta2,3 indicates sku 3, subc 2.]

Obtain values separated by hyphens

纵然是瞬间 submitted on 2019-12-24 08:48:54
Question: I have a bunch of values in a dataset that are formatted like 2000-3222 and 10/1-10. I would like to split these so that it lists 2000, 2001 etc. and 10/1, 10/2 etc., all in their own rows. Is there any command to do this in Stata or R? EDIT: Example data:

    input int SRNo str200 SchemeName str30 CTSNo1 str4 CTSNo2
    69 "Khimji Nagar SRA Co-op.Housing Society Ltd." "467" ""
    70 "Jai Bhavani CHS Ltd. (Proposed)" "7 (Pt.)" ""
    71 "Shivshakti SRA CHS Ltd." "364 ‘A’" ""
    72 "Shree Ram CHS Ltd. (Prop
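For the plain numeric case (2000-3222 style), one approach in Stata is to split on the hyphen, work out how many values the range covers, and expand the row that many times. The sketch below uses invented data and variable names (raw, part, id, n, value) and deliberately ignores the 10/1-10 form and entries such as "7 (Pt.)", which would need extra parsing.

```stata
clear
input str20 raw
"2000-2003"
"467"
end

gen long id = _n

* part1 = start of range, part2 = end of range (missing for single values)
split raw, parse("-") generate(part) destring
replace part2 = part1 if missing(part2)

* One row per value in the range
gen long n = part2 - part1 + 1
expand n
bysort id: gen long value = part1 + _n - 1

list id raw value, sepby(id)
```

The first entry becomes four rows (2000 to 2003) and the single value 467 stays as one row.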

Matching variable strings

假装没事ソ submitted on 2019-12-24 07:52:48
Question: I have a variable that looks like this:

    045672, 19274
    061483, 21124
    068346, 32948

And another that looks as follows:

    [045672, 19274; 056843, 20483] AAA8793307546; [061483, 21124] AZS69482148
    [045672, 19274; 056843, 20483] AAA8793307546; [061483, 21124] AZS69482148
    [068346, 32948] BGJ569788313

How can I keep the part of the second variable that matches the first?

Answer 1: The following works for me:

    clear
    input str15 v1 str75 v2
    "045672, 19274" "[045672, 19274; 056843, 20483] AAA8793307546;

How to bound time using Stata?

情到浓时终转凉″ submitted on 2019-12-24 07:37:41
Question: I want to restrict the TRD_EVENT_TM variable of my dataset, which is a time value, to between 9:30 and 11:00.

    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str8 TRD_EVENT_TM str6 TRD_STCK_CD double TRD_PR long TRD_EVENT_DT
    "09:53:17" "BANK1" 909 18293
    "10:25:40" "HSHM1" 1706 19205
    "11:32:03" "SIPA1" 2231 18866
    "11:01:55" "AZAB1" 2283 18916
    "12:19:56" "SIPA1" 2063 17683
    "10:48:01" "CHML1" 6048 18672
    "10:59:49" "DADE1" 3044 18847
    "11:40:34" "CHML1" 6406 18798
    "10:54:45"
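Since TRD_EVENT_TM is stored as an HH:MM:SS string, one option is to convert it to a numeric time with clock() and filter with inrange(). A rough sketch on a few of the rows above; the variable name tm is made up, and both bounds are treated as inclusive, which is an assumption:

```stata
clear
input str8 TRD_EVENT_TM
"09:53:17"
"10:25:40"
"11:32:03"
"11:01:55"
end

* Milliseconds since midnight, stored as double to avoid rounding
gen double tm = clock(TRD_EVENT_TM, "hms")
format tm %tcHH:MM:SS

* Keep trades between 09:30:00 and 11:00:00, bounds included
keep if inrange(tm, clock("09:30:00", "hms"), clock("11:00:00", "hms"))

list
```

Only the 09:53:17 and 10:25:40 rows survive in this small example.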

Stata: replace, if, forvalues

送分小仙女□ submitted on 2019-12-24 07:13:33
Question:

    use "locationdata.dta", clear
    gen ring=.
    * Philly City Hall
    gen lat_center = 39.9525468
    gen lon_center = -75.1638855
    destring(INTPTLAT10), replace
    destring(INTPTLON10), replace
    vincenty INTPTLAT10 INTPTLON10 lat_center lon_center , hav(distance_km) inkm
    quietly su distance_km
    local min = r(min)
    replace ring=0 if (`min' <= distance_km < 1)
    local max = ceil(r(max))
    * forval loop does not work
    forval i=1/`max'{
        local j = `i'+1
        replace ring=`i' if (`i' <= distance_km < `j')
    }

I am drawing rings
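A likely culprit in the code above is the chained comparison: Stata reads `i' <= distance_km < `j' as (`i' <= distance_km) < `j', so a 0/1 result is compared with `j' instead of testing the interval. A minimal sketch of the same ring logic with explicit bounds, using invented distances in place of the vincenty output:

```stata
clear
input distance_km
0.4
1.2
2.9
5.0
end

gen ring = .
quietly summarize distance_km
local max = ceil(r(max))

* Ring i covers distances in [i, i+1)
forvalues i = 0/`max' {
    replace ring = `i' if distance_km >= `i' & distance_km < `i' + 1
}

* Equivalent one-liner for unit-width rings
gen ring2 = floor(distance_km)

list
```

Both ring and ring2 come out as 0, 1, 2, 5 for these four distances. inrange() is another option, but it includes both endpoints, so the half-open form is spelled out here.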

How do I find and replace a part of a string variable in Stata?

﹥>﹥吖頭↗ submitted on 2019-12-24 04:16:08
Question: I am working with a variable which is basically URLs. So observations include values such as:

    www.google.com
    https://www.google.com
    https://yahoo.movies.com

I am trying to create a do file to import a bunch of these files into Stata and need a reliable method to remove the www. and the https:// parts from these variables over a wide range of URLs. In Excel I can do this simply by finding https:// or www. and replacing it with nothing. How do I achieve the same in Stata? I am working
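The closest Stata counterpart to Excel's find-and-replace is probably subinstr(). A short sketch on the three example URLs; the variable names url and clean are placeholders:

```stata
clear
input str40 url
"www.google.com"
"https://www.google.com"
"https://yahoo.movies.com"
end

* Remove the first occurrence of each prefix, replacing it with nothing
gen clean = subinstr(url, "https://", "", 1)
replace clean = subinstr(clean, "www.", "", 1)

list
```

This leaves google.com, google.com and yahoo.movies.com. Note that subinstr() matches anywhere in the string, not only at the start, so a URL containing www. in the middle would also be altered; a regular expression anchored at the beginning would be stricter.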

How to generate a dataset of correlated variables with different distributions?

╄→гoц情女王★ submitted on 2019-12-24 04:12:38
Question: For teaching purposes, I need to generate random datasets of correlated random variables with different distributions. I have tried corr2data in Stata but it will not allow me to specify max and min values of the variables to be generated, just means, sd's and the covariance matrix. Therefore, I need to do messy adjustments after generation of the data. Various other details annoy me with corr2data. Is there a simpler way of doing this with MATLAB? I am not as familiar with this software as
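Staying within Stata, one technique that sidesteps corr2data is a Gaussian copula: draw correlated normals with drawnorm, map them to uniforms with normal(), then push the uniforms through whatever inverse CDF is wanted. This is a swapped-in approach, not something from the question, and the seed, sample size, correlation of 0.6 and target distributions below are all arbitrary choices for illustration.

```stata
clear
set seed 12345
set obs 1000

* Correlation matrix for the underlying normal draws
matrix C = (1, 0.6 \ 0.6, 1)
drawnorm z1 z2, corr(C)

* Transform each normal draw through its CDF, then to the target distribution
gen u = 10 + 40*normal(z1)      // uniform on (10, 50), so min and max are bounded
gen e = -ln(1 - normal(z2))     // exponential with mean 1

correlate u e
summarize u e
```

The correlation between u and e will be close to, but not exactly, 0.6, because the nonlinear transforms change the Pearson correlation somewhat.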