Parsing command line statements as a list of tokens

China☆狼群 submitted on 2019-12-13 01:27:35

Question


#include <stdio.h>
#include <string.h> /* needed for strtok */
#include <unistd.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    char text[10000];
    fgets(text, sizeof(text), stdin);
    char *t;
    int i;

    t = strtok(text, "\"\'| ");
    for (i=0; t != NULL; i++) {
        printf("token %d is \"%s\"\n", i, t);
        t = strtok(NULL, "\"\'| ");
    }
}

This is part of the code I'm trying to write. It is supposed to split the input into tokens.

Let's say the input is 'abc' "de f'g" hij| k "lm | no"

The output should be

token 1: "abc"
token 2: "de f'g"
token 3: "hij"
token 4: "|"
token 5: "k"
token 6: "lm | no"

I get something different but close. Is there any way I can change it to produce this format?


Answer 1:


What you're trying to do is essentially a parser. strtok isn't a very good tool for this, and you may have better luck writing your own. strtok works on the presumption that whatever delimits your tokens is unimportant and so can be overwritten with '\0'. But you DO care what the delimiter is.
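To see why the delimiter is lost, here is a minimal sketch. The helper name count_tokens is hypothetical and only serves the illustration; the only library call used is the standard strtok.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split buf with strtok and return how many tokens
   were found. strtok writes '\0' over the delimiter that ends each token,
   so afterwards the buffer no longer records which delimiter it was. */
size_t count_tokens(char *buf, const char *delims) {
    size_t n = 0;
    for (char *t = strtok(buf, delims); t != NULL; t = strtok(NULL, delims))
        n++;
    return n;
}
```

Called on a writable copy of "hij| k" with the delimiter set "| ", this reports two tokens, and the byte that used to hold | is now '\0'. Nothing in the result says a bar was ever there, which is why it cannot be reported as a token of its own.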

The only real complication is the | syntax. Because you want it to act as both a token delimiter and a token, your code gets more complicated (but not by much). Here, hij is followed immediately by |. If you terminate hij in place to get the token, you have to overwrite the |. You either have to store the overwritten character and restore it, or copy the token out somewhere else.
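The "copy it out" option could look something like the sketch below. The name copy_token and its parameters are assumptions made for illustration; the point is only that the source line is read, never written, so the | survives.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the len bytes starting at start into out and
   NUL-terminate the copy. The copy is truncated to fit cap bytes.
   Returns the number of bytes copied, not counting the terminator. */
size_t copy_token(char *out, size_t cap, const char *start, size_t len) {
    if (cap == 0)
        return 0;
    if (len >= cap)
        len = cap - 1;          /* leave room for the terminator */
    memcpy(out, start, len);
    out[len] = '\0';
    return len;
}
```

With the line "hij| k", taking len from strcspn(line, "| \t\n") copies "hij" into out and leaves line[len] still equal to '|', ready to be handled as the next token.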

You basically have three cases:

  • The bar | is a special delimiter that is also a token;
  • Quoted delimiters " and ' match everything until the next quote of the same kind;
  • Otherwise, tokens are delimited by whitespace.
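These three cases can be sketched as a small function that copies each token out and leaves the input untouched. This is one possible reading, not a definitive implementation: the name next_token, its signature, and the handling of an unterminated quote (take the rest of the line) are all assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: read one token from *sp into out (cap must be > 0,
   longer tokens are truncated) and advance *sp past it.
   Returns 1 if a token was produced, 0 at end of input. */
int next_token(const char **sp, char *out, size_t cap) {
    const char *s = *sp + strspn(*sp, " \t\n");   /* skip whitespace */
    size_t len;

    if (*s == '\0') {
        *sp = s;
        return 0;
    }
    if (*s == '|') {                        /* case 1: the bar is a token */
        len = 1;
        *sp = s + 1;
    } else if (*s == '"' || *s == '\'') {   /* case 2: quoted token */
        char q = *s++;
        const char *end = strchr(s, q);
        if (end == NULL) {                  /* assumption: no closing quote, */
            len = strcspn(s, "\n");         /* so take the rest of the line  */
            *sp = s + len;
        } else {
            len = (size_t)(end - s);
            *sp = end + 1;                  /* step over the closing quote */
        }
    } else {                                /* case 3: plain word */
        len = strcspn(s, "| \t\n");
        *sp = s + len;                      /* a bar here is left for the next call */
    }
    if (len >= cap)
        len = cap - 1;
    memcpy(out, s, len);
    out[len] = '\0';
    return 1;
}
```

Calling it in a loop on the question's input, 'abc' "de f'g" hij| k "lm | no", produces the six tokens abc, de f'g, hij, |, k and lm | no, then returns 0.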



Answer 2:


#include <stdio.h>
#include <string.h>

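/* Return the next token in *sp, or NULL at end of input.
   The buffer is modified in place; *sp is advanced past the token. */
char *getToken(char **sp){
    static const char *sep = " \t\n";
    static char vb[] = "|", vbf;   /* vbf: a "|" token is pending */
    char *p, *s;
    if(vbf){                       /* emit the bar that ended the previous token */
        vbf = 0;
        return vb;
    }
    if (sp == NULL || *sp == NULL) return(NULL);
    s = *sp + strspn(*sp, sep);    /* skip leading whitespace */
    if(*s == '\0') return(NULL);
    if(*s == '|'){                 /* a bar on its own is a token */
        *sp = s + 1;
        return vb;
    }
    if(*s == '"' || *s == '\''){   /* quoted: run to the matching quote */
        char q = *s++;
        p = strchr(s, q);
        if(p == NULL)              /* unterminated quote: take the rest of the line */
            p = s + strcspn(s, "\n");
    }
    else
        p = s + strcspn(s, "| \t\n");
    if(*p != '\0'){
        if(*p == '|')
            vbf = 1;               /* remember the bar before overwriting it */
        *p++ = '\0';
    }
    *sp = p;
    return s;
}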

int main(int argc, char **argv) {
    char text[10000];
    if (fgets(text, sizeof(text), stdin) == NULL) return 1; /* no input */
    char *t, *p = text;
    int i;

    t = getToken(&p);
    for (i=1; t != NULL; i++) {
        printf("token %d is \"%s\"\n", i, t);
        t = getToken(&p);
    }
    return 0;
}


Source: https://stackoverflow.com/questions/21896644/parsing-command-line-statements-as-a-list-of-tokens
