Inserting strings into another string in C

為{幸葍}努か submitted on 2019-12-13 03:48:55

Question


I'm implementing a function which, given a string, a character and another string (from now on, the "substring"), puts the substring everywhere the character occurs in the string. To explain it better, given these parameters this is what the function should return (pseudocode):

func ("aeiou", 'i', "hello")  ->  aehelloou

I'm using some functions from string.h. I have tested it with pretty good results:

char *somestring= "this$ is a tes$t wawawa$wa";
printf("%s", strcinsert(somestring, '$', "WHAT?!") );

Outputs:    thisWHAT?! is a tesWHAT?!t wawawaWHAT?!wa

So far everything is all right. The problem appears when I try the same thing with, for example, this string:

char *somestring= "this \"is a test\" wawawawa";
printf("%s", strcinsert(somestring, '"', "\\\"") );

since I want to change every " into \" . When I do this, the PC collapses. I don't know why, but it stops working and then shuts down. I've heard something about the bad behavior of some string.h functions, but I couldn't find any information about this. I'd really appreciate any help.

My code:

#define salloc(size) (str)malloc(size+1) //i'm lazy
typedef char* str;

str strcinsert (str string, char flag, str substring)
{
    int nflag= 0; //this is the number of times the character appears
    for (int i= 0; i<strlen(string); i++)
        if (string[i]==flag)
            nflag++;
    str new=string;
    int pos;
    while (strchr(string, flag)) //since when its not found returns NULL
    {
        new= salloc(strlen(string)+nflag*strlen(substring)-nflag);
        pos= strlen(string)-strlen(strchr(string, flag));
        strncpy(new, string, pos);
        strcat(new, substring);
        strcat(new, string+pos+1);
        string= new;      
    }
    return new;
}

Thanks for any help!


Answer 1:


Some advice:

  • refrain from typedef char* str;. The char * type is common in C, and masking it just makes your code harder to review
  • refrain from #define salloc(size) (str)malloc(size+1) for the exact same reason. In addition, don't cast the result of malloc in C
  • each time you write a malloc (or calloc or realloc) there should be a corresponding free: C has no garbage collection
  • dynamic allocation is expensive, so use it only when needed. Said differently, a malloc inside a loop should be looked at twice (especially if there is no corresponding free)
  • always test the result of allocation functions (unrelated: and of I/O functions). malloc simply returns NULL when memory is exhausted, and a clear error message is easier to understand than a crash
  • learn to use a debugger: if you had run your code under a debugger, the error would have been evident
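
The allocation points in the list above can be sketched as follows. This is only a minimal illustration: dup_string is a hypothetical helper name that does not appear in the question's code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s, or NULL if malloc fails.
   The caller owns the result and must free() it. */
char *dup_string(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    char *copy = malloc(len + 1);   /* no cast needed in C; +1 for the '\0' */
    if (copy == NULL)
        return NULL;                /* report the failure to the caller */
    memcpy(copy, s, len + 1);       /* copies the terminator as well */
    return copy;
}
```

A caller would test the returned pointer against NULL, print an error message if the allocation failed, and call free() on the result exactly once when it is done with it.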

Next the cause: if the replacement string contains the flag character, strchr finds it again in the freshly built string, and you end up in an endless loop.

A possible workaround: allocate the result string once before the loop and advance through both the original string and the result. This saves unnecessary allocations and deallocations, and is immune to the flag character being present in the replacement string.

Possible code:

// the result is an allocated string that must be freed by the caller
str strcinsert(str string, char flag, str substring)
{
    int nflag = 0; // number of times the character appears
    for (size_t i = 0; i < strlen(string); i++)
        if (string[i] == flag)
            nflag++;
    // salloc adds 1 for the terminating null
    str new_ = salloc(strlen(string) + nflag*strlen(substring) - nflag);
    if (new_ == NULL)
        return NULL;                 // allocation failed
    char *cur = new_;
    char *old = string;
    while (NULL != (string = strchr(string, flag))) // strchr returns NULL when not found
    {
        size_t pos = string - old;
        strncpy(cur, old, pos);      // copy the text before the flag
        cur[pos] = '\0';             // strncpy does not null terminate the dest. string
        strcat(cur, substring);
        cur += strlen(substring) + pos; // advance the result
        old = ++string;                 // and the input string
    }
    strcpy(cur, old);                // copy the tail (the whole input if flag never occurs)
    return new_;
}

Note: I have not reverted the str and salloc, but you really should.




Answer 2:


In your second loop, you always look for the first flag character in the string. In this case, that’ll be the one you just inserted from substring. The strchr function will always find that quote and never return NULL, so your loop will never terminate and just keep allocating memory (and not enough of it, since your string grows arbitrarily large).

Speaking of allocating memory, you need to be more careful with that. Unlike in Python, C doesn’t automatically notice when you’re no longer using memory; anything you malloc must be freed. You also allocate far more memory than you need: even in your working "this$ is a tes$t wawawa$wa" example, you allocate enough space for the full string on each iteration of the loop, and never free any of it. You should just run the allocation once, before the second loop.

This isn’t as important as the above stuff, but you should also pay attention to performance. Each call to strcat and strlen iterates over the entire string, meaning you look at it far more often than you need to. You should instead save the result of strlen, and copy the new string directly to where you know the NUL terminator is. The same goes for strchr: you already replaced the beginning of the string and don’t want to waste time looking at it again, quite apart from the fact that re-scanning it is what causes your current bug.
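
A minimal sketch of that idea, under stated assumptions: the function name replace_char_cached is hypothetical, the lengths are computed once, and memchr/memcpy are swapped in for strchr/strcat so that no part of either string is scanned more than needed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name. Replaces every `flag` in `src` with `sub`.
   Returns a malloc'd string the caller must free(), or NULL on failure. */
char *replace_char_cached(const char *src, char flag, const char *sub)
{
    assert(src != NULL && sub != NULL);
    size_t src_len = strlen(src);          /* computed once and reused */
    size_t sub_len = strlen(sub);

    size_t count = 0;                      /* occurrences of flag in src */
    for (size_t i = 0; i < src_len; i++)
        if (src[i] == flag)
            count++;

    size_t out_len = src_len - count + count * sub_len;
    char *out = malloc(out_len + 1);       /* single allocation, +1 for '\0' */
    if (out == NULL)
        return NULL;

    const char *in = src;                  /* next unread input character */
    const char *end = src + src_len;       /* points at the terminator */
    char *dst = out;                       /* next write position */
    const char *hit;
    /* memchr only looks at the part of the input not yet consumed */
    while ((hit = memchr(in, flag, (size_t)(end - in))) != NULL) {
        size_t chunk = (size_t)(hit - in);
        memcpy(dst, in, chunk);            /* text before the flag */
        dst += chunk;
        memcpy(dst, sub, sub_len);         /* the replacement */
        dst += sub_len;
        in = hit + 1;                      /* skip the flag itself */
    }
    memcpy(dst, in, (size_t)(end - in) + 1); /* tail plus the terminator */
    return out;
}
```

Because the search always resumes in the input rather than in the output, a replacement that contains the flag character (such as \" for ") cannot be found again.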

In comparison to these issues, the style issues mentioned in the comments with your typedef and macro are relatively minor, but they are still worth mentioning. A char* in C is different from a str in Python; trying to typedef it to the same name just makes it more likely you’ll try to treat them as the same and run into these issues.




Answer 3:


I don't know why but it stops working

strchr(string, flag) searches the whole string for flag. The search needs to be limited to the portion of the string not yet examined/updated. By re-searching the partially replaced string, the code finds the flag over and over.


The whole string management approach needs rework. As OP reported a Python background, I've posted a very C approach, since mimicking Python is not a good approach here. C is different, especially in its management of memory.


Untested code

#include <stdlib.h>
#include <string.h>

// Look for needles in a haystack and replace them
// Note that replacement may be "" and result in a shorter string than haystack
// Returns an allocated string (caller must free it) or NULL on allocation failure
char *strcinsert_alloc(const char *haystack, char needle, const char *replacement) {
  size_t n = 0;
  const char *s = haystack;
  while (*s) {
    if (*s == needle) n++;  // Find needle count
    s++;
  }
  size_t replacement_len = strlen(replacement);
  //                string length        - needles + replacements       + \0
  size_t new_size = (size_t)(s - haystack) - n*1   + n*replacement_len + 1;
  char *dest = malloc(new_size);
  if (dest) {
    char *d = dest;
    s = haystack;
    while (*s) {
      if (*s == needle) {
        memcpy(d, replacement, replacement_len);  // copy the replacement, not the haystack
        d += replacement_len;
      } else {
        *d = *s;
        d++;
      }
      s++;
    }
    *d = '\0';
  }
  return dest;
}



Answer 4:


In your program, the problem occurs for this input:

char *somestring= "this \"is a test\" wawawawa";

as you want to replace each " with \".

The first problem is that whenever you replace " with \" in string, in the next iteration strchr(string, flag) finds the " of the \" that was just inserted. So in subsequent iterations your string looks like this:

this \"is a test" wawawawa
this \\"is a test" wawawawa
this \\\"is a test" wawawawa

So, for the input string "this \"is a test\" wawawawa", your while loop runs forever, because every time strchr(string, flag) finds the " of the last inserted \".

The second problem is the memory allocation you perform in every iteration of the while loop. There is no free() for the memory allocated to new. So when the while loop runs forever, it eats up all the memory, which is what makes the PC collapse.

To resolve this, in every iteration you should search for flag only in the part of the string that starts after the last inserted substring. Also, make sure to free() the dynamically allocated memory.
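
One way this fix could look, as a sketch only: it keeps the question's replace-one-occurrence-per-iteration structure, but remembers where to resume the search and frees each intermediate buffer. The name strcinsert_resume and the variable names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name. Replaces one occurrence of `flag` per iteration,
   resuming the search after the substring that was just inserted.
   Returns a malloc'd string the caller must free(), or NULL on failure. */
char *strcinsert_resume(const char *string, char flag, const char *substring)
{
    assert(string != NULL && substring != NULL);
    size_t sub_len = strlen(substring);
    size_t cur_len = strlen(string);

    char *cur = malloc(cur_len + 1);       /* working copy of the input */
    if (cur == NULL)
        return NULL;
    memcpy(cur, string, cur_len + 1);

    size_t start = 0;                      /* first index not yet examined */
    char *hit;
    while (flag != '\0' && (hit = strchr(cur + start, flag)) != NULL) {
        size_t pos = (size_t)(hit - cur);
        size_t next_len = cur_len - 1 + sub_len;
        char *next = malloc(next_len + 1);
        if (next == NULL) {
            free(cur);                     /* do not leak on failure */
            return NULL;
        }
        memcpy(next, cur, pos);                          /* text before flag */
        memcpy(next + pos, substring, sub_len);          /* the substring */
        memcpy(next + pos + sub_len, cur + pos + 1,
               cur_len - pos);                           /* tail plus '\0' */
        free(cur);                         /* release the previous buffer */
        cur = next;
        cur_len = next_len;
        start = pos + sub_len;             /* skip what was just inserted */
    }
    return cur;
}
```

This still allocates once per occurrence, so it is slower than allocating the result once up front, but the loop terminates and only the final buffer remains for the caller to free.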



Source: https://stackoverflow.com/questions/46958109/inserting-strings-into-another-string-in-c
