Question
I have just started learning Stata and I'm having a hard time.
My problem is this: I have two different variables, ATC and A, where A is potentially a substring of ATC. Now I want to mark all the observations in which A is a substring of ATC with OK = 1.
I tried this using a simple nested loop:
foreach x in ATC {
foreach j in A {
replace OK = 1 if strpos(`x',`j')!=0
}
}
However, whenever I run this loop no changes are being made even though there should be plenty.
I feel like I should probably give an index specifying which OK is being changed (the one belonging to the ATC/x), but I have no idea how to do this. This is probably really simple but I've been struggling with it for some time.
I should have clarified: my A list is separate from the main list (simply appended to it) and only contains unique keys which I use to identify the ATCs which I want. So I have ~120 A keys and a couple million ATC keys. What I wanted to do was iterate over every ATC key for every single A key and mark those ATC keys that contain an A key.
That means I don't have complete tuples of (ATC, A, OK) but instead separate lists of different sizes.
For example: I have
ATC OK A
ABCD 0 .
EFGH 0 .
... ... ...
. . AB
. . ET
and want the result that "ABCD" is marked with OK = 1 while "EFGH" remains at 0.
Answer 1:
We can separate your question into two parts. Your title implies a problem with loops, but your loops are just equivalent to
replace OK = 1 if strpos(ATC, A)!=0
so the use of looping appears irrelevant. That leaves the substring comparison.
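To see why the loops add nothing, here is a minimal sketch that can be pasted into the Command window or a do-file. With foreach ... in, Stata iterates over the literal tokens after in, so a list consisting of the single word ATC gives exactly one iteration, and the macro holds the variable name rather than any of its values. The display text is only for illustration.

```stata
* foreach ... in loops over the words typed after "in", not over observations
foreach x in ATC {
    * `x' expands to the literal text ATC (the variable name)
    display "x is: `x'"
}
```

This shows x is: ATC once. Inside the original replace command the macros therefore expand to the variable names, which gives the single observation-by-observation replace statement.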
Let's supply an example:
. set obs 3
obs was 0, now 3
. gen OK = 0
. gen A = cond(_n == 1, "42", "something else")
. gen ATC = "answer is 42"
. replace OK = 1 if strpos(ATC, A) != 0
(1 real change made)
. list
+------------------------------------+
| OK A ATC |
|------------------------------------|
1. | 1 42 answer is 42 |
2. | 0 something else answer is 42 |
3. | 0 something else answer is 42 |
+------------------------------------+
So it works fine; and you really need to give a reproducible example if you think you have something different.
As for specifying where the variable should be changed: your code does precisely that, as again the example above shows.
The update makes the problem clear. With the syntax you gave, Stata only looks within the same observation for a matching substring. A variable in Stata is a field in a dataset. To cycle over a set of values, something like this should suffice:
gen byte OK = 0
levelsof A, local(Avals)
quietly foreach A of local Avals {
replace OK = 1 if strpos(ATC, `"`A'"') > 0
}
Notes:
- Specifying byte cuts down storage.
- You may need an if or in restriction on levelsof.
- quietly cuts out messages about changed values. When debugging, it is often better left out.
- > 0 could be omitted, as a positive result from strpos() is automatically treated as true in logical comparisons. See this FAQ.
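As a check on the approach, here is a small sketch using the toy data from the question. The layout is an assumption about how the appended key list looks: the A keys sit in extra observations where ATC is empty and OK is missing. Because OK already exists here, the gen step is skipped, and quietly is left off so the change messages are visible.

```stata
clear
* Toy data from the question: two ATC rows, then two appended A-key rows
input str4 ATC byte OK str2 A
"ABCD" 0 ""
"EFGH" 0 ""
""     . "AB"
""     . "ET"
end

* Collect the distinct non-empty A keys into a local macro
levelsof A, local(Avals)

* Compare every ATC value against each key in turn
foreach a of local Avals {
    replace OK = 1 if strpos(ATC, `"`a'"') > 0
}

list
```

With these four observations, only "ABCD" should end up with OK = 1, since "AB" occurs in it and "ET" occurs in neither ATC value. The loop runs once per key, so with about 120 keys it makes about 120 passes over the data, however many ATC observations there are.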
Source: https://stackoverflow.com/questions/27337523/stata-nested-foreach-loop-substring-comparison