I'm building an XML file from scratch and need to know: does htmlentities() convert every character that could potentially break an XML file (and possibly UTF-8 data)?
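So your question is "is htmlentities()'s result guaranteed to be XML-compliant and UTF-8-compliant?" The answer is no, it's not. With its default flags, htmlentities() emits HTML named entities such as &eacute;, which XML does not define unless a DTD declares them, and it does nothing about bytes that are not valid UTF-8.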
htmlspecialchars() should be enough to escape XML's special characters, but you'll have to sanitize your UTF-8 strings either way. Even if you build your XML with, say, SimpleXML, you'll have to sanitize the strings. I don't know about other libraries such as XMLWriter or DOM, but I think it's the same.
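For what it's worth, here is a minimal sketch of how the escaping and the sanitizing could be combined in one call. The helper name xml_escape() is made up for illustration, and the sketch assumes PHP 5.4 or later (for the ENT_XML1, ENT_SUBSTITUTE and ENT_DISALLOWED flags) and that your input is supposed to be UTF-8. It replaces bad bytes with U+FFFD rather than rejecting the string, which may or may not be what you want.

```php
<?php
// Hypothetical helper (the name is an assumption, not a built-in).
// Escapes XML special characters and neutralizes bytes/code points
// that would make the document ill-formed.
function xml_escape($s)
{
    $flags = ENT_XML1          // use XML 1.0 rules instead of HTML 4.01
           | ENT_QUOTES        // escape both double and single quotes
           | ENT_SUBSTITUTE    // invalid UTF-8 sequences become U+FFFD
           | ENT_DISALLOWED;   // code points XML 1.0 forbids become U+FFFD

    return htmlspecialchars($s, $flags, 'UTF-8');
}

// Usage: escape a value before concatenating it into the document.
$title = 'Tom & Jerry <uncut>';
$xml   = '<title>' . xml_escape($title) . '</title>';
```

If you would rather refuse bad input than silently patch it, you could check the string first (mb_check_encoding($s, 'UTF-8') is one option, assuming the mbstring extension is available) and bail out before building the XML.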