I'm reading lots of text from various RSS feeds and inserting it into my database.
Of course, there are several different character encodings used in the feeds.
I have been looking for solutions to encoding problems for ages, and this page is probably the conclusion of years of searching! I tested some of the suggestions mentioned here, and these are my notes:
This is my test string:
this is a "wròng wrìtten" string bùt I nèed to pù 'sòme' special chàrs to see thèm, convertèd by fùnctìon!! & that's it!
I do an INSERT to save this string in a database field whose collation is set to utf8_general_ci.
The character set of my page is UTF-8.
If I do the INSERT just like that, I end up with some characters in my database that probably come from Mars...
So I need to convert them into some "sane" UTF-8. I tried utf8_encode(), but alien characters were still invading my database...
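One plausible explanation for the utf8_encode() failure (my assumption, it is not confirmed above): the page is already UTF-8, and utf8_encode() always treats its input as ISO-8859-1, so every accented character gets encoded a second time. A minimal sketch of that effect, using mb_convert_encoding() with an explicit ISO-8859-1 source to stand in for utf8_encode():

```php
<?php
// "ò" as UTF-8 bytes (0xC3 0xB2), i.e. what a UTF-8 page already sends.
$utf8 = "\xC3\xB2";

// Same conversion utf8_encode() performs: read the input as ISO-8859-1
// and write it out as UTF-8. Each input byte becomes its own character.
$doubleEncoded = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";          // c3b2
echo bin2hex($doubleEncoded), "\n"; // c383c2b2, which displays as "Ã²"
```

The two original bytes become four, which is the kind of "alien" output you get when text that is already UTF-8 is encoded again.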
So I tried the forceUTF8 function posted in answer number 8, but the string saved in the database looks like this:
this is a "wrÃ²ng wrÃ¬tten" string bÃ¹t I nÃ¨ed to pÃ¹ 'sÃ²me' special chÃ rs to see thÃ¨m, convertÃ¨d by fÃ¹nctÃ¬on!! & that's it!
So, by collecting some more information from this page and merging it with information from other pages, I solved my problem with this solution:
$finallyIDidIt = mb_convert_encoding(
$string,
mysql_client_encoding($resourceID),
mb_detect_encoding($string)
);
Now the string is stored in my database with the correct encoding.
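For anyone who wants to try the same idea without a database connection, here is a rough sketch, not the exact code above. It assumes that anything which is not valid UTF-8 is ISO-8859-1, and it swaps mb_detect_encoding() for an explicit mb_check_encoding() test so the result does not depend on detection heuristics. The function name toTargetCharset and its parameters are hypothetical; in practice the target charset would be whatever your connection reports.

```php
<?php
// Hypothetical helper (not from the original post): convert $text to
// $target, guessing the source encoding in a deliberately simple way.
function toTargetCharset(
    string $text,
    string $target = 'UTF-8',
    string $fallbackSource = 'ISO-8859-1' // assumption: non-UTF-8 input is Latin-1
): string {
    // Valid UTF-8 is taken as UTF-8; everything else is assumed to be
    // in the fallback encoding.
    $source = mb_check_encoding($text, 'UTF-8') ? 'UTF-8' : $fallbackSource;

    return mb_convert_encoding($text, $target, $source);
}

// "ò" as a single ISO-8859-1 byte is converted to two UTF-8 bytes.
echo bin2hex(toTargetCharset("\xF2")), "\n";     // c3b2

// "ò" that is already UTF-8 is left unchanged instead of double-encoded.
echo bin2hex(toTargetCharset("\xC3\xB2")), "\n"; // c3b2
```

The point of checking for valid UTF-8 first is to avoid re-encoding text that is already fine, which is what went wrong with the earlier attempts.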
NOTE:
The only thing to watch out for is the mysql_client_encoding function!
You need to be connected to the database, because this function expects a resource ID as its parameter.
But since I do the re-encoding right before my INSERT anyway, that is not a problem for me.