How do I use strtok with every single non-alpha character as a delimiter? (C)

我寻月下人不归 2020-12-12 05:41

So I have a string:

**BOB**123(*&**blah**02938*(*&91820**FOO**

I want to be able to use strtok to split out each word.

4 answers
  •  慢半拍i (original poster)
    2020-12-12 05:59

    You could probably use strtok for this, but it's probably easier to roll your own. Below is an example that uses a custom struct to hold the state and the results of the tokeniser. The state is just a pointer into the string, which must be initialised with the string to tokenise.

    In the custom tokeniser, the result represents a substring of the input string as a combination of a start pointer and a length. That result is not zero-terminated, so print or copy it using the length rather than treating it as a C string. This approach has the benefit that the solution doesn't allocate extra memory and doesn't overwrite the original string, so unlike strtok it works on read-only strings.

    The tokeniser itself is invoked through a function that returns 1 or 0, depending on whether a new token has been found, so it can drive a while loop directly.

    Here goes:

    #include <stddef.h>     /* for NULL */
    #include <stdio.h>      /* for printf */
    #include <ctype.h>      /* for isalpha(c) */
    
    struct alpha_t {
        const char *p;      /* Pointer into string; must be initialised */
        const char *str;    /* start of current token */
        int len;            /* length of token */
    };
    
    /*
     *      Get next alpha token from string; alpha->p must be initialised
     *      to the (possibly read-only) string to work on.
     */
    int next_alpha(struct alpha_t *alpha)
    {
        if (alpha->p == NULL) return 0;
    
        /* Skip non-alpha and check for end of string */
        while (*alpha->p && !isalpha((unsigned char)*alpha->p)) alpha->p++;
        if (*alpha->p == 0) return 0;
    
        /* Read token of alpha characters; cast avoids undefined behaviour for negative char values */
        alpha->str = alpha->p;
        while (isalpha((unsigned char)*alpha->p)) alpha->p++;
        alpha->len = (int)(alpha->p - alpha->str);
    
        return 1;
    }
    
    /*
     *      Example client code
     */
    int main(void)
    {
        const char *str = "BOB123(&blah02938(*&91820FOO";
        struct alpha_t token = {str};
    
        while (next_alpha(&token)) {
            printf("'%.*s'\n", token.len, token.str);
        }
    
        return 0;   
    }
    

    This solution uses isalpha, as you already suggested. It is easily extended to other functions: you could even pass a delimiter or non-delimiter function as an argument, or make it part of the struct, for a customisable tokeniser.
