Question
Consider the following toy string:
my first name is Pearly, and my surname is Spencer
Is there an out-of-the-box way in Stata (Mata included) to get the number of tokens based on a user-specified parsing character? In this particular example, there are two tokens separated by a comma.
Solutions like the extended macro function word count split on spaces, and I would like to avoid writing a program for this.
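Answer 1:
The number of tokens is the number of occurrences of the parsing character plus 1. The number of occurrences is in turn the length of the string minus its length after every parsing character has been removed.
That being so, using a comma as the example parsing character,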
gen ntokens = 1 + strlen(strvar) - strlen(subinstr(strvar, ",", "", .))
See https://www.stata-journal.com/sjpdf.html?articlenum=dm0056 for a write-up of this simple trick.
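To check the formula on the toy string from the question, here is a minimal sketch. The variable name strvar follows the answer; the two extra observations ("a,b,c" and "no commas here") are made-up test data added only for illustration.

```stata
clear
* strvar holds the toy string plus two hypothetical extra cases
input str60 strvar
"my first name is Pearly, and my surname is Spencer"
"a,b,c"
"no commas here"
end

* tokens = commas + 1, where commas = length lost by deleting every comma
gen ntokens = 1 + strlen(strvar) - strlen(subinstr(strvar, ",", "", .))

* note: an empty string would also be counted as 1 token by this formula
list strvar ntokens
```

This should give 2, 3 and 1 tokens for the three observations. For a parsing string longer than one character, the length difference would presumably need to be divided by the length of the parsing string before adding 1.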
Source: https://stackoverflow.com/questions/51062401/get-the-number-of-tokens-using-a-specific-parsing-character