I am fetching some song lyrics from an API and converting the lyrics string into an array of words. I am seeing some unusual behavior from the preg_replace function.
The likeliest answer is that the string contains non-printable characters beyond "you". To figure out what exactly it contains, you'll have to look at the raw bytes. Do this with echo bin2hex($word). This outputs a string like 666f6f..., where every 2 characters are one byte in hexadecimal notation. You may make that more readable with something like:
echo join(' ', str_split(bin2hex($word), 2));
// 66 6f 6f ...
Now use your favourite ASCII/Unicode table (depending on the encoding of the string) to figure out what individual characters those represent and where you got them from.
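If looking up every byte by hand gets tedious, a small helper can flag the suspicious ones for you. This is only a sketch: describe_bytes is a hypothetical name, the sample input is made up, and it treats anything outside printable ASCII (0x20 to 0x7E) as "non-printable", which also flags the bytes of legitimate multi-byte UTF-8 characters.

```php
<?php
// Hypothetical helper: list each byte of a string with its offset and hex
// value, and mark the bytes that fall outside printable ASCII (0x20-0x7E).
function describe_bytes(string $s): array
{
    $out = [];
    foreach (str_split($s) as $i => $byte) {
        $code = ord($byte);
        $printable = $code >= 0x20 && $code <= 0x7E;
        $out[] = sprintf('%d: %02x %s', $i, $code, $printable ? $byte : '(non-printable)');
    }
    return $out;
}

// Made-up example input: "you" followed by a carriage return.
echo implode("\n", describe_bytes("you\r")), "\n";
// 0: 79 y
// 1: 6f o
// 2: 75 u
// 3: 0d (non-printable)
```

Any line marked (non-printable) is a candidate for whatever is confusing your regular expression.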
Perhaps your string is encoded in UTF-16, in which case you should see a telltale 00 byte next to every ASCII character, i.e. every other byte in the dump is 00.
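If that turns out to be the case, converting to UTF-8 before doing any regex work should get rid of the stray bytes. A rough sketch, assuming the mbstring extension is available and that the data is little-endian UTF-16 without a byte order mark (the sample string here is made up; check your own hex dump to see which byte order you actually have):

```php
<?php
// Made-up sample: "you" as UTF-16LE, each ASCII letter followed by a 00 byte.
$word = "y\x00o\x00u\x00";

echo join(' ', str_split(bin2hex($word), 2)), "\n";
// 79 00 6f 00 75 00

// Assumption: little-endian UTF-16, no byte order mark.
$utf8 = mb_convert_encoding($word, 'UTF-8', 'UTF-16LE');

echo bin2hex($utf8), "\n";
// 796f75

var_dump($utf8 === 'you');
// bool(true)
```

If the 00 byte comes before each letter instead of after it, the string is big-endian and 'UTF-16BE' would be the source encoding to try.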