Encoding issues are among the topics that have bitten me most often during development. Every platform insists on its own encoding, and most likely some non-UTF-8 defaults are in
In PHP we use the mb_ functions such as mb_detect_encoding() and mb_convert_encoding(). They aren't perfect, but they get us 99.9% of the way there. Then we have a few regular expressions to strip out funky characters that somehow make their way in at times.
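As a rough illustration of that detect, convert, then strip pipeline, here is a minimal sketch. The helper name to_utf8(), the candidate encoding list, and the control-character regex are all assumptions made for the example, not our actual production code; you would tune the list and the pattern to whatever your data sources really send.

```php
<?php
// Hypothetical helper: best-effort normalization of a string to UTF-8.
// The default candidate list is an assumption; order matters, and
// ISO-8859-1 accepts any byte sequence, so it acts as a catch-all.
function to_utf8(string $input, array $candidates = ['UTF-8', 'ISO-8859-1']): string
{
    // Strict mode (third argument true) rejects encodings in which the
    // input is not a valid byte sequence.
    $detected = mb_detect_encoding($input, $candidates, true);
    if ($detected === false) {
        // Assumed fallback when nothing in the list matches.
        $detected = 'ISO-8859-1';
    }

    $utf8 = mb_convert_encoding($input, 'UTF-8', $detected);

    // Strip ASCII control characters except tab, newline, and carriage
    // return. This is one example of a "funky character" filter.
    $clean = preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/u', '', $utf8);

    // preg_replace() returns null on error (e.g. invalid UTF-8 with /u),
    // so fall back to the converted string in that case.
    return $clean ?? $utf8;
}
```

Calling to_utf8("caf\xE9") on a Latin-1 string should give back the UTF-8 bytes for "café", while a string that is already valid UTF-8 passes through unchanged. Detection is a guess by nature, so this will still misjudge some inputs, which is where the remaining 0.1% comes from.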
If you are going international, you definitely want to use UTF-8. We have yet to find the perfect solution for getting all of our data into UTF-8, and I'm not sure one exists. You just have to keep tinkering with it.