C dictionary/map

Posted by 蹲街弑〆低调 on 2020-01-23 03:57:12

Question


I want to map struct members so I can eliminate if branches in a loop. What is the best way or convention to implement this in C? I suppose it could be a 2 dimensional array instead...then I could map integers to the char keys?

    char chunk[32];
    int n;
    int i;
    char *ptr = config;
    while (*ptr != '\0') {
        int items_read = sscanf(ptr, "%31[^;]%n", chunk, &n);

        if(chunk[0] == 'S' && chunk[1] == 'P') {
            for(i=0;i<GLOBAL_MEAS_CUTOFF; i++) {
                theMeas[i].signal_path = atoi(&chunk[2]);
            }
        }    
        if(chunk[0] == 'T' && chunk[1] == 'L') {
            for(i=0;i<GLOBAL_MEAS_CUTOFF; i++) {
                theMeas[i].trace_length = atoi(&chunk[2]);
            }
        }    
        if(chunk[0] == 'S' && chunk[1] == 'R') {
            for(i=0;i<GLOBAL_MEAS_CUTOFF; i++) {
                theMeas[i].sample_rate = atoi(&chunk[2]);
            }
        }   

        chunk[0]='\0';
        if (items_read == 1)
            ptr += n;
        if ( *ptr != ';' ) {
            break;
        }
        ++ptr;
    }

Answer 1:


I suspect what you (ideally) want is a dictionary:

theMeas[i]["signal_path"] = atoi(&chunk[2]);

Of course, the above syntax will never happen in C, but that's not really important here. The problem is that you would have to write all the code implementing a dictionary data type, and I suspect that's overkill.

So I suspect what you (really) want is a way to have names that can be used in a loop:

foreach(signal_path, trace_length, sample_rate)

And I'm here to tell you that you can do this (kind of)! The simplest way is with an enum:

enum fields {
  signal_path,
  trace_length,
  sample_rate,
  END_fields,
  UNKNOWN_fields,
  BEGIN_fields = 0,
};

Instead of struct members, you use an array:

int theMeas[size][END_fields];

To index a "member", use this:

theMeas[i][signal_path];

To loop through all the "members," you can use this:

for(enum fields j = BEGIN_fields; j != END_fields; j++)
    theMeas[i][j];
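
Putting the enum and the array together, a minimal compilable sketch might look like the following. The array length `N_MEAS` and the helpers `set_all` and `sum_fields` are names invented here for illustration; nothing in the question defines them.

```c
#include <stddef.h>

enum fields {
  signal_path,
  trace_length,
  sample_rate,
  END_fields,
  UNKNOWN_fields,
  BEGIN_fields = 0
};

#define N_MEAS 4  /* hypothetical array length, not from the question */

static int theMeas[N_MEAS][END_fields];

/* Write one "member" of every measurement. */
static void set_all(enum fields f, int value)
{
    for (size_t i = 0; i < N_MEAS; i++)
        theMeas[i][f] = value;
}

/* Visit every "member" of one measurement; here they are simply summed. */
static long sum_fields(size_t i)
{
    long total = 0;
    /* a plain int counter is used so the loop does not rely on enum++ */
    for (int j = BEGIN_fields; j != END_fields; j++)
        total += theMeas[i][j];
    return total;
}
```

With this layout, `set_all(sample_rate, 44100)` does the job of one of the original `if` blocks, and `theMeas[i][sample_rate]` still reads much like a struct member access.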

This does break down a little when you want to get character-based comparisons, but we can do a little bit:

const char *to_str(enum fields f)
{
#define FIELD(x) case x: return #x
    switch(f)
      {
        FIELD(signal_path);
        FIELD(trace_length);
        FIELD(sample_rate);
        default: return "<unknown>";
      }
#undef FIELD
}

#include <string.h>  /* strcmp */

enum fields from_str(const char *c)
{
#define FIELD(x) if(!strcmp(c, #x)) return x
        FIELD(signal_path);
        FIELD(trace_length);
        FIELD(sample_rate);
        return UNKNOWN_fields;  /* no name matched */
#undef FIELD
}

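#include <ctype.h>   /* tolower */
#include <string.h>  /* strchr */

enum fields from_abv(const char *c)
{
    for(enum fields i = BEGIN_fields; i < END_fields; i++)
      {
        /* match c[0] against the first letter and c[1] against the letter after '_' */
        const char *field = to_str(i);
        const char *us = strchr(field, '_');
        if(us && tolower((unsigned char)c[0]) == field[0] && tolower((unsigned char)c[1]) == us[1])
            return i;
      }
    return UNKNOWN_fields;
}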

Your if statements could be replaced with:

theMeas[i][from_abv(chunk)] = atoi(&chunk[2]);

Or, more safely:

enum fields j = from_abv(chunk);
if(j != UNKNOWN_fields) theMeas[i][j] = atoi(&chunk[2]);
else /* erroneous user input */;

Which is about as close as I can get.
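
For reference, here is how the pieces so far might fit into the question's parsing loop. This is only a sketch: the array length `N_MEAS` and the wrapper `apply_config` are assumptions made for the example, and the config format ("SP3;TL100;SR44100") is inferred from the question's `sscanf` call.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum fields { signal_path, trace_length, sample_rate,
              END_fields, UNKNOWN_fields, BEGIN_fields = 0 };

#define N_MEAS 4  /* hypothetical array length */
static int theMeas[N_MEAS][END_fields];

static const char *to_str(enum fields f)
{
    switch (f) {
    case signal_path:  return "signal_path";
    case trace_length: return "trace_length";
    case sample_rate:  return "sample_rate";
    default:           return "<unknown>";
    }
}

/* "SP" matches "signal_path": first letter, then the letter after '_'. */
static enum fields from_abv(const char *c)
{
    for (int i = BEGIN_fields; i < END_fields; i++) {
        const char *field = to_str((enum fields)i);
        const char *us = strchr(field, '_');
        if (us != NULL
            && tolower((unsigned char)c[0]) == field[0]
            && tolower((unsigned char)c[1]) == us[1])
            return (enum fields)i;
    }
    return UNKNOWN_fields;
}

/* Apply every ';'-separated chunk to all measurements.
   Returns the number of chunks that named a known field. */
static int apply_config(const char *config)
{
    char chunk[32];
    int applied = 0;
    const char *ptr = config;

    while (*ptr != '\0') {
        int n = 0;
        if (sscanf(ptr, "%31[^;]%n", chunk, &n) == 1) {
            enum fields j = from_abv(chunk);
            if (j != UNKNOWN_fields) {
                int value = atoi(&chunk[2]);
                for (size_t i = 0; i < N_MEAS; i++)
                    theMeas[i][j] = value;
                applied++;
            }
            ptr += n;  /* skip the chunk just read */
        }
        if (*ptr != ';')
            break;
        ++ptr;         /* skip the separator */
    }
    return applied;
}
```

Unknown chunks are skipped rather than written, which is the "more safely" variant above.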

Note that I've deliberately used a naming scheme to facilitate the creation of macros that will automate much of this. Let's try:

#define member(name, ...) \
  enum name { __VA_ARGS__, \
              M_END_##name, \
              M_UNKNOWN_##name, \
              M_BEGIN_##name = 0 }

#define miter(name, var) \
        enum name var = M_BEGIN_##name; var != M_END_##name; var++

#define msize(name) M_END_##name

Usage:

// define our fields
member(fields, signal_path, trace_length, sample_rate);

// declare object with fields
int theMeas[N][msize(fields)];

for(size_t i = 0; i < N; i++)
    // iterate over fields
    for(miter(fields, j))
        // match against fields
        if(j == from_abv(chunk))
            theMeas[i][j] = atoi(&chunk[2]);

That last bit doesn't seem so bad. It still allows you something close to struct-like access via theMeas[i][signal_path], but allows you to iterate over the "members," and hides most of the heavy lifting behind macros.

The to_str and from_str functions take a little more macro trickery to automate. You'll probably need to look into P99 for that. The from_abv function isn't something I'd recommend for the general case, as we have no way of guaranteeing that the next time you make iterable fields you'll use names with underscores. (Of course, you could drop the from_abv function and give your members inscrutable names like SP, TL, and SR, allowing you to directly compare them to your string data, but you'd need to change the strcmp to a memcmp with a size argument of (sizeof(#x) - 1). Then all the places you have from_abv you'd just use from_str, which can be automatically generated for you.)

However, from_abv isn't hard to define, and you could honestly just copy and paste your if blocks from above into it - it'd be slightly more efficient, though if you added a "member" you'd have to update the function (as written, it'll update itself if you add a member.)
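
If pulling in P99 feels like too much, the same kind of automation can be approximated with the X-macro idiom, which is a different technique from the one suggested above. In this sketch the list macro `FIELD_LIST` and the helper `from_prefix` are invented names, and the members are given the two-letter names SP, TL, and SR so they can be compared with the input directly. `strncmp` is used in place of the `memcmp` mentioned above so that a chunk shorter than the name is handled safely.

```c
#include <string.h>

/* Single source of truth: add a member here and everything below follows. */
#define FIELD_LIST(X) \
    X(SP)             \
    X(TL)             \
    X(SR)

enum fields {
#define X(name) name,
    FIELD_LIST(X)
#undef X
    END_fields,
    UNKNOWN_fields,
    BEGIN_fields = 0
};

/* Generated: one case per member, returning its name as a string. */
static const char *to_str(enum fields f)
{
    switch (f) {
#define X(name) case name: return #name;
    FIELD_LIST(X)
#undef X
    default: return "<unknown>";
    }
}

/* Generated: compare only the leading characters, so "SP12" matches SP. */
static enum fields from_prefix(const char *c)
{
#define X(name) if (strncmp(c, #name, sizeof(#name) - 1) == 0) return name;
    FIELD_LIST(X)
#undef X
    return UNKNOWN_fields;
}
```

The parsing loop would then call `from_prefix(chunk)` wherever it called `from_abv(chunk)`.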




Answer 2:


C supports pointers to functions, so you could create an array of pointers to functions and address the array according to your input. This would require you to implement additional functions with same signatures.

Another way might be to encapsulate the if-clauses in a separate function and call it with the arguments.

However, I don't think either approach will gain you much speedup, if any.
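
As a rough illustration of the function-pointer idea, the sketch below pairs each two-letter key with a setter. The struct layout `struct meas`, the setter functions, and `find_setter` / `apply_chunk` are all assumptions made for the example; the question does not show the real struct definition.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct; the real definition is not shown in the question. */
struct meas {
    int signal_path;
    int trace_length;
    int sample_rate;
};

/* Every setter shares this signature so they can live in one table. */
typedef void (*setter_fn)(struct meas *m, int value);

static void set_signal_path(struct meas *m, int v)  { m->signal_path = v; }
static void set_trace_length(struct meas *m, int v) { m->trace_length = v; }
static void set_sample_rate(struct meas *m, int v)  { m->sample_rate = v; }

static const struct {
    const char *key;
    setter_fn   set;
} setters[] = {
    { "SP", set_signal_path },
    { "TL", set_trace_length },
    { "SR", set_sample_rate },
};

/* Return the setter whose key prefixes chunk, or NULL if none does. */
static setter_fn find_setter(const char *chunk)
{
    for (size_t k = 0; k < sizeof setters / sizeof setters[0]; k++)
        if (strncmp(chunk, setters[k].key, 2) == 0)
            return setters[k].set;
    return NULL;
}

/* Apply one chunk such as "TL100" to n measurements.
   Returns 1 on success, 0 if the key is unknown. */
static int apply_chunk(struct meas *arr, size_t n, const char *chunk)
{
    setter_fn set = find_setter(chunk);
    if (set == NULL)
        return 0;
    int value = atoi(&chunk[2]);
    for (size_t i = 0; i < n; i++)
        set(&arr[i], value);
    return 1;
}
```

The branches have not disappeared; they have moved into the table search, which is consistent with the point above that little or no speedup should be expected.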




Answer 3:


You might rewrite your logic something like this, using a pointer to an integer:

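while (*ptr != '\0') {
    int items_read = sscanf(ptr, "%31[^;]%n", chunk, &n);

    /* p selects the field in theMeas[0]; the loop below steps it to theMeas[i] */
    int *p = NULL;
    if(chunk[0] == 'S' && chunk[1] == 'P') {
        p = &theMeas[0].signal_path;
    }    
    if(chunk[0] == 'T' && chunk[1] == 'L') {
        p = &theMeas[0].trace_length;
    }    
    if(chunk[0] == 'S' && chunk[1] == 'R') {
        p = &theMeas[0].sample_rate;
    }

    for(i=0; p != NULL && i<GLOBAL_MEAS_CUTOFF; i++) {
        /* same field, i structs further along the array */
        *(int *)((char *)p + i * sizeof theMeas[0]) = atoi(&chunk[2]);
    }   

    /* advance past the chunk and the ';' exactly as in the question */
    chunk[0]='\0';
    if (items_read == 1)
        ptr += n;
    if ( *ptr != ';' ) {
        break;
    }
    ++ptr;
}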

This approach separates the decision about what variable to change (the if statements) from the code that actually does the work that is common to each case (the for loop).

You will of course want to check whether p == NULL in case chunk[0] and chunk[1] didn't match anything you were expecting.




Answer 4:


Unfortunately, plain C99 cannot index an array by a string, since array indices must be integers. But perhaps the strncmp() function would suit you better?

#define EQUALN(a,b,n) (strncmp(a, b, n) == 0)

...

if(EQUALN(chunk, "SP", 2)) {
    for(i=0;i<GLOBAL_MEAS_CUTOFF; i++) {
        theMeas[i].signal_path = atoi(&chunk[2]);
    }
}    
else if(EQUALN(chunk, "TL", 2)) {
    for(i=0;i<GLOBAL_MEAS_CUTOFF; i++) {
        theMeas[i].trace_length = atoi(&chunk[2]);
    }
}    
else if(EQUALN(chunk, "SR", 2)) {
    for(i=0;i<GLOBAL_MEAS_CUTOFF; i++) {
        theMeas[i].sample_rate = atoi(&chunk[2]);
    }
}



Answer 5:


If (and it's a fairly big if) you can rely on the data always being one of those three options, then we could construct a "minimal perfect hash" over the three cases. Assuming the charset is ASCII (or consistent with ASCII):

L = 76, 0 mod 4
P = 80, 0 mod 4
R = 82, 2 mod 4
S = 83, 3 mod 4
T = 84, 0 mod 4

So, S+P is 3 mod 4, T+L is 0 mod 4, and S+R is 1 mod 4. Not minimal, but close enough:

size_t lookup[4] = {
    offsetof(Mea, trace_length),  /* T+L: 0 mod 4 */
    offsetof(Mea, sample_rate),   /* S+R: 1 mod 4 */
    0,                            /* 2 mod 4: unused */
    offsetof(Mea, signal_path)    /* S+P: 3 mod 4 */
};

size_t offset = lookup[((unsigned)chunk[0] + chunk[1]) % 4];

for(i=0;i<GLOBAL_MEAS_CUTOFF; i++) {
    int *fieldptr = (int*)(((char*)(theMeas+i)) + offset);
    *fieldptr = atoi(&chunk[2]);
}

You might prefer to put some lipstick on this pig with macros or inline functions, or instead of int *fieldptr have a char *fieldptr, start it at ((char*)theMeas) + offset, and increment it by sizeof(Mea) each time.

If you can't rely on friendly data, then you need at least one branch of some kind (a conditional or a call through a function pointer), just to avoid writing anything in the case where the data is bad. Even to keep it to 1 you probably need a 64k-entry lookup table for 3 cases, which is kind of sparse, so you're probably better off with the conditionals.
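
A middle ground, sketched below, keeps the four-slot hash but stores the expected key next to each offset, so a single comparison rejects bad input before anything is written. The layout of `Mea` is an assumption (the type is named above but never shown), and `apply_chunk` is a name invented for this example. ASCII is assumed, as before.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical layout; the type name Mea is used above but never defined. */
typedef struct {
    int signal_path;
    int trace_length;
    int sample_rate;
} Mea;

/* Slot index = (first char + second char) % 4, assuming ASCII. */
static const struct {
    char   key[3];
    size_t offset;
} lookup[4] = {
    { "TL", offsetof(Mea, trace_length) },  /* 84 + 76 = 160, 160 % 4 == 0 */
    { "SR", offsetof(Mea, sample_rate)  },  /* 83 + 82 = 165, 165 % 4 == 1 */
    { "",   0                           },  /* slot 2 is unused */
    { "SP", offsetof(Mea, signal_path)  },  /* 83 + 80 = 163, 163 % 4 == 3 */
};

/* Write the field named by chunk in all n entries.
   Returns 1 on success, 0 if the key is unknown. */
static int apply_chunk(Mea *arr, size_t n, const char *chunk)
{
    if (chunk[0] == '\0' || chunk[1] == '\0')
        return 0;
    unsigned slot = ((unsigned char)chunk[0] + (unsigned char)chunk[1]) % 4u;
    /* The hash alone cannot reject bad input, so confirm the stored key. */
    if (lookup[slot].key[0] != chunk[0] || lookup[slot].key[1] != chunk[1])
        return 0;
    int value = atoi(&chunk[2]);
    for (size_t i = 0; i < n; i++) {
        int *fieldptr = (int *)((char *)&arr[i] + lookup[slot].offset);
        *fieldptr = value;
    }
    return 1;
}
```

This still branches, so it does not meet the branch-free goal; it only shows that the validity check can be one comparison instead of a chain of them.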



Source: https://stackoverflow.com/questions/5970656/c-dictionary-map
