Could someone explain to me the differences between strtok() and strsep()?
What are the advantages and disadvantages of each?
And why would
The first difference between strtok() and strsep() is the way they handle contiguous delimiter characters in the input string.
Handling of contiguous delimiter characters by strtok():
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    const char* teststr = "aaa-bbb --ccc-ffffd"; // contiguous delimiters between the "bbb" and "ccc" substrings
    const char* delims = " -"; // delimiters: space and hyphen
    char* token;
    char* ptr = strdup(teststr); // strtok() writes to its input, so work on a copy

    if (ptr == NULL) {
        fprintf(stderr, "strdup failed\n");
        exit(EXIT_FAILURE);
    }

    printf("Original String: %s\n", ptr);

    token = strtok(ptr, delims);
    while (token != NULL) {
        printf("%s\n", token);
        token = strtok(NULL, delims); // NULL means "continue with the same string"
    }

    printf("Original String: %s\n", ptr);
    free(ptr);
    return 0;
}
Output:
# ./example1_strtok
Original String: aaa-bbb --ccc-ffffd
aaa
bbb
ccc
ffffd
Original String: aaa
In the output, the tokens "bbb" and "ccc" appear one after another: strtok() gives no indication that contiguous delimiter characters occurred. strtok() also modifies the input string. The final "Original String" line shows only "aaa" because the first delimiter was overwritten with '\0'.
Handling of contiguous delimiter characters by strsep():
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    const char* teststr = "aaa-bbb --ccc-ffffd"; // contiguous delimiters between the "bbb" and "ccc" substrings
    const char* delims = " -"; // delimiters: space and hyphen
    char* token;
    char* ptr1;
    char* ptr = strdup(teststr); // strsep() writes to its input, so work on a copy

    if (ptr == NULL) {
        fprintf(stderr, "strdup failed\n");
        exit(EXIT_FAILURE);
    }

    ptr1 = ptr; // strsep() advances ptr1; ptr is kept for free()

    printf("Original String: %s\n", ptr);
    while ((token = strsep(&ptr1, delims)) != NULL) {
        if (*token == '\0') {
            token = "<empty>"; // make empty tokens visible in the output
        }
        printf("%s\n", token);
    }

    if (ptr1 == NULL) // This is just to show that strsep() modifies the pointer passed to it
        printf("ptr1 is NULL\n");
    printf("Original String: %s\n", ptr);
    free(ptr);
    return 0;
}
Output:
# ./example1_strsep
Original String: aaa-bbb --ccc-ffffd
aaa
bbb
<empty>             <==============
<empty>             <==============
ccc
ffffd
ptr1 is NULL
Original String: aaa
In the output, you can see the two empty strings (shown as <empty>) between "bbb" and "ccc". They come from the run of delimiters " --" between "bbb" and "ccc". When strsep() found the delimiter character ' ' after "bbb", it replaced it with '\0' and returned "bbb". It then found the next delimiter character, '-', immediately afterwards, replaced it with '\0', and returned an empty string. The same happened for the following '-'.
strsep() signals contiguous delimiter characters by returning a pointer to a null character (that is, a character with the value '\0').
strsep() modifies the input string as well as the pointer whose address is passed as its first argument.
The second difference is that strtok() relies on a static variable to keep track of the current parse location within a string. As a result, one string must be parsed completely before parsing of a second string begins. strsep() has no such restriction, because the caller holds the parse position in the pointer it passes in.
Calling strtok() when another strtok() is not finished:
#include <stdio.h>
#include <string.h>

void another_function_callng_strtok(void)
{
    char str[] = "ttt -vvvv";
    const char* delims = " -";
    char* token;

    printf("Original String: %s\n", str);
    token = strtok(str, delims); // overwrites the parse position saved by the caller
    while (token != NULL) {
        printf("%s\n", token);
        token = strtok(NULL, delims);
    }
    printf("another_function_callng_strtok: I am done.\n");
}

void function_callng_strtok(void)
{
    char str[] = "aaa --bbb-ccc";
    const char* delims = " -";
    char* token;

    printf("Original String: %s\n", str);
    token = strtok(str, delims);
    while (token != NULL)
    {
        printf("%s\n", token);
        another_function_callng_strtok();
        token = strtok(NULL, delims); // continues from the other function's state, not ours
    }
}

int main(void) {
    function_callng_strtok();
    return 0;
}
Output:
# ./example2_strtok
Original String: aaa --bbb-ccc
aaa
Original String: ttt -vvvv
ttt
vvvv
another_function_callng_strtok: I am done.
function_callng_strtok() prints only the token "aaa" and none of the remaining tokens of its input string. It calls another_function_callng_strtok(), which itself calls strtok(); once that function has extracted all of its tokens, strtok()'s static pointer no longer refers to the first string and indicates that no tokens are left (conceptually, it is NULL). When control returns to the while loop in function_callng_strtok(), strtok(NULL, delims) therefore returns NULL, the loop condition becomes false, and the loop exits.
Calling strsep() when another strsep() is not finished:
#include <stdio.h>
#include <string.h>

void another_function_callng_strsep(void)
{
    char str[] = "ttt -vvvv";
    const char* delims = " -";
    char* token;
    char* ptr = str; // this function's own parse position

    printf("Original String: %s\n", str);
    while ((token = strsep(&ptr, delims)) != NULL) {
        if (*token == '\0') {
            token = "<empty>"; // make empty tokens visible in the output
        }
        printf("%s\n", token);
    }
    printf("another_function_callng_strsep: I am done.\n");
}

void function_callng_strsep(void)
{
    char str[] = "aaa --bbb-ccc";
    const char* delims = " -";
    char* token;
    char* ptr = str; // independent of the pointer used in the other function

    printf("Original String: %s\n", str);
    while ((token = strsep(&ptr, delims)) != NULL) {
        if (*token == '\0') {
            token = "<empty>";
        }
        printf("%s\n", token);
        another_function_callng_strsep();
    }
}

int main(void) {
    function_callng_strsep();
    return 0;
}
Output:
# ./example2_strsep
Original String: aaa --bbb-ccc
aaa
Original String: ttt -vvvv
ttt
<empty>
vvvv
another_function_callng_strsep: I am done.
<empty>
Original String: ttt -vvvv
ttt
<empty>
vvvv
another_function_callng_strsep: I am done.
<empty>
Original String: ttt -vvvv
ttt
<empty>
vvvv
another_function_callng_strsep: I am done.
bbb
Original String: ttt -vvvv
ttt
<empty>
vvvv
another_function_callng_strsep: I am done.
ccc
Original String: ttt -vvvv
ttt
<empty>
vvvv
another_function_callng_strsep: I am done.
As you can see, calling strsep() on a second string before the first has been completely parsed makes no difference.
So the disadvantage shared by strtok() and strsep() is that both modify the input string, but strsep() has a couple of advantages over strtok(), as illustrated above.
From the strsep() manual page:
The strsep() function is intended as a replacement for the strtok() function. While the strtok() function should be preferred for portability reasons (it conforms to ISO/IEC 9899:1990 (``ISO C90'')) it is unable to handle empty fields, i.e., detect fields delimited by two adjacent delimiter characters, or to be used for more than a single string at a time. The strsep() function first appeared in 4.4BSD.