Question
I have a Stata dataset that represents connections between users that looks like this:
src_user linked_user
1 2
2 3
3 5
1 4
6 7
I would like to get something like this:
user cluster
1 1
2 1
3 1
4 1
5 1
6 2
7 2
where isid user runs without error (each user appears exactly once) and all users are grouped into disjoint clusters. I have tried thinking of this as a reshape problem, but without much success. As far as I can tell, none of the user-written SNA commands accomplish this.
What is the most efficient way of doing this in Stata, other than looping, which I am eager to avoid?
Answer 1:
If you reshape the data to long form, you can use group_id (from SSC) to get what you want.
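* group_id is user-written; install it once with: ssc install group_id
clear
input user1 user2
1 2
2 3
3 5
1 4
6 7
end
* one id per edge, then one row per (edge, endpoint) pair
gen id = _n
reshape long user, i(id) j(n)
* start with every edge as its own cluster
clonevar cluster = id
list, sepby(cluster)
* merge clusters that share a user
group_id cluster, match(user)
* keep a single row per user within each cluster
bysort cluster user (id): keep if _n == 1
list, sepby(cluster)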
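If installing group_id is not an option, the same merging idea can be sketched with built-in commands only. This is a minimal sketch and not necessarily how group_id works internally: it repeatedly copies the smallest cluster label across rows that share a user or an edge until nothing changes. It does rely on a while loop, which the question hoped to avoid, and the names previous, lowest, and cluster_id are illustrative only.

```stata
clear
input user1 user2
1 2
2 3
3 5
1 4
6 7
end

* one id per edge, then one row per (edge, endpoint) pair
gen long id = _n
reshape long user, i(id) j(n)

* every edge starts as its own cluster
gen long cluster = id

* propagate the smallest label until a full pass changes nothing
local changed = 1
while `changed' {
    clonevar previous = cluster
    foreach key in user id {
        * rows sharing a user (or an edge) take the lowest label among them
        bysort `key': egen long lowest = min(cluster)
        replace cluster = lowest
        drop lowest
    }
    count if cluster != previous
    local changed = r(N)
    drop previous
}

* one row per user, with clusters renumbered consecutively from 1
bysort user: keep if _n == 1
keep user cluster
egen long cluster_id = group(cluster)
isid user
list user cluster_id, sepby(cluster_id)
```

Each pass rescans the whole dataset, so a long chain of users can take many passes to converge; on large edge lists a dedicated command is likely to be faster.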
Source: https://stackoverflow.com/questions/26643899/from-edge-or-arc-list-to-clusters-in-stata