问题
I have two data sets. First one is:
countyGroup income other_data_
1 20990 …
2 25622 …
3 24289 …
4 27391 …
5 23326 …
6 19261 …
7 15197 …
8 11132 …
The second one is:
countyGroup state county other_data
1 IL 123 …
1 IL 123 …
2 MI 365 …
1 IL 123 …
3 AK 65 …
4 IL 546 …
5 MI 689 …
6 AK 32 …
Variable countyGroup
uniquely identifies both state
and county
. The second data set contains countyGroup
, state
and county
. The first data set contains only countyGroup
. I need to generate two variables (state
and county
) in the first data set.
I tried to match the values using merge
:
use dataset1
merge 1:m countyGroup using dataset2, keepusing(state state cntyfp) update replace
The result that I get is the duplicate rows...I tried 1:1
it did not work...I can always drop
those duplicate rows, but I wanted to merge them correctly from the beginning.
回答1:
We need to know about the duplicate rows (Stata's term is "observations"). If they are identical on all variables, there is no extra information, and it's cleanest and simplest to use duplicates drop
to get them out of the way before the merge
.
duplicates
is a general tool for managing duplicates.
If there is different information on the other variables, tell us more about what that is and your goals here.
(LATER) With your datasets you posted
merge 1:m countyGroup using dataset2
seems to work fine.
来源:https://stackoverflow.com/questions/16996481/stata-how-to-match-values-in-1m-relationship