Some people seem to think that C\'s strcpy() function is bad or evil. While I admit that it\'s usually better to use strncpy() in order to avoid bu
I think strncpy is evil too.
To truly protect yourself from programming errors of this kind, you need to make it impossible to write code that (a) looks OK, and (b) overruns a buffer.
This means you need a real string abstraction, which stores the buffer and capacity opaquely, binds them together, forever, and checks bounds. Otherwise, you end up passing strings and their capacities all over the shop. Once you get to real string ops, like modifying the middle of a string, it's almost as easy to pass the wrong length into strncpy (and especially strncat), as it is to call strcpy with a too-small destination.
Of course you might still ask whether to use strncpy or strcpy in implementing that abstraction: strncpy is safer there provided you fully grok what it does. But in string-handling application code, relying on strncpy to prevent buffer overflows is like wearing half a condom.
So, your strdup-replacement might look something like this (order of definitions changed to keep you in suspense):
string *string_dup(const string *s1) {
string *s2 = string_alloc(string_len(s1));
if (s2 != NULL) {
string_set(s2,s1);
}
return s2;
}
static inline size_t string_len(const string *s) {
return strlen(s->data);
}
static inline void string_set(string *dest, const string *src) {
// potential (but unlikely) performance issue: strncpy 0-fills dest,
// even if the src is very short. We may wish to optimise
// by switching to memcpy later. But strncpy is better here than
// strcpy, because it means we can use string_set even when
// the length of src is unknown.
strncpy(dest->data, src->data, dest->capacity);
}
string *string_alloc(size_t maxlen) {
if (maxlen > SIZE_MAX - sizeof(string) - 1) return NULL;
string *self = malloc(sizeof(string) + maxlen + 1);
if (self != NULL) {
// empty string
self->data[0] = '\0';
// strncpy doesn't NUL-terminate if it prevents overflow,
// so exclude the NUL-terminator from the capacity, set it now,
// and it can never be overwritten.
self->capacity = maxlen;
self->data[maxlen] = '\0';
}
return self;
}
typedef struct string {
size_t capacity;
char data[0];
} string;
The problem with these string abstractions is that nobody can ever agree on one (for instance whether strncpy's idiosyncrasies mentioned in comments above are good or bad, whether you need immutable and/or copy-on-write strings that share buffers when you create a substring, etc). So although in theory you should just take one off the shelf, you can end up with one per project.