When receiving user input on forms, I want to check that fields like "username" or "address" do not contain markup that has a special meaning in XML (RSS feeds) or
The correct way to detect whether string inputs contain HTML tags, or any other markup that has a special meaning in XML or (X)HTML when displayed (other than being an entity), is simply:
if (mb_strpos($data, '<') === FALSE AND mb_strpos($data, '>') === FALSE)
You are correct! All XSS and CSRF attacks require < or > around the values to get the browser to execute the code (at least in IE6 and later).
Considering the output context given, this is sufficient to display the string safely as element content in a format like HTML.
Of course, if the input contains an entity such as &aacute;, a browser will not display it as &aacute; but as á, unless we pass the output through a function like htmlspecialchars. In that case, even < and > would be safe.
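As a minimal sketch of that point (the sample string in `$data` is made up for illustration), escaping at output time turns both the angle brackets and the entity into literal text:

```php
<?php
// Hypothetical user input containing tags and an entity.
$data = 'Tom <b>&aacute;</b>';

// Escape &, <, >, " and ' for HTML output; the charset is stated explicitly.
$safe = htmlspecialchars($data, ENT_QUOTES, 'UTF-8');

// $safe is now: Tom &lt;b&gt;&amp;aacute;&lt;/b&gt;
// A browser shows exactly what the user typed, tags and entity included.
echo '<p>' . $safe . '</p>', "\n";
```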
In the case of using the string input as the value of an attribute, the safety depends on the attribute.
If the attribute is an input value, we must quote it and use a function like htmlspecialchars in order to have the same content back for editing.
Again, even the < and > characters would be safe here.
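A short sketch of the attribute case, assuming a text input named `username` (the field name and sample value are illustrative only). The attribute is quoted, and `ENT_QUOTES` makes sure neither kind of quote in the data can end it early:

```php
<?php
// Hypothetical input that tries to break out of a quoted attribute.
$data = 'O"Brien <admin> onfocus="alert(1)';

// ENT_QUOTES escapes both double and single quotes, plus &, < and >.
$value = htmlspecialchars($data, ENT_QUOTES, 'UTF-8');

// The value stays inside the quoted attribute; the browser decodes it
// back to the original text when the user edits the field.
echo '<input type="text" name="username" value="' . $value . '">', "\n";
```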
We may conclude that no detection or rejection of the input is needed, provided we always use htmlspecialchars to output it and our context always fits the cases above (or equally safe ones).
[And we also have a number of ways to safely store it in the database, preventing SQL exploits.]
What if the user wants his "username" to be &amp; is not an &? It contains neither < nor >... will we detect and reject it? Will we accept it? How will we display it? (This input gives interesting results in the new bounty!)
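To make the ampersand case concrete, here is a sketch that assumes the username is the literal text `&amp; is not an &`. It uses `strpos` in place of the question's `mb_strpos` (for `<` and `>` in UTF-8 input the two give the same answer), and `html_entity_decode` as a rough stand-in for what a browser would render:

```php
<?php
// Assumed literal username, typed exactly like this by the user.
$username = '&amp; is not an &';

// The angle-bracket check accepts it: there is no < or > anywhere.
$passesCheck = strpos($username, '<') === false
    && strpos($username, '>') === false;

// Escaped for output, every & becomes &amp;.
$escaped = htmlspecialchars($username, ENT_QUOTES, 'UTF-8');

// Approximation of what a browser would display in each case.
$shownRaw     = html_entity_decode($username, ENT_QUOTES, 'UTF-8');
$shownEscaped = html_entity_decode($escaped, ENT_QUOTES, 'UTF-8');

var_dump($passesCheck, $escaped, $shownRaw, $shownEscaped);
```

Output without escaping would display `& is not an &`, which is not what the user typed; only the escaped version displays the original text.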
Finally, if our context expands and we use the string input as an anchor href, our whole approach changes dramatically. But this scenario is not included in the question.
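Although the href case is outside the question, a brief sketch shows why the approach changes: `htmlspecialchars` leaves a `javascript:` URL untouched, so the scheme has to be checked separately. The helper name `isSafeHref` and the http/https allow-list are assumptions made for illustration, not a complete URL validator:

```php
<?php
// Hypothetical helper: accept only absolute http(s) URLs for use in an href.
function isSafeHref(string $url): bool
{
    $scheme = parse_url(trim($url), PHP_URL_SCHEME);

    // parse_url returns null (no scheme) or false (malformed URL) otherwise.
    return is_string($scheme)
        && in_array(strtolower($scheme), ['http', 'https'], true);
}

$input = 'javascript:alert(1)';

// Escaping alone does not help: nothing in this string needs escaping.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

var_dump($escaped === $input);                    // bool(true)
var_dump(isSafeHref($input));                     // bool(false)
var_dump(isSafeHref('https://example.com/user')); // bool(true)
```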
(It is worth mentioning that even with htmlspecialchars, the output for a string input may differ if the character encodings differ between steps.)
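One way the encoding mismatch can show up, sketched under the assumption that the input arrives as ISO-8859-1 while the output step treats it as UTF-8:

```php
<?php
// "á" as a single ISO-8859-1 byte; on its own this byte is not valid UTF-8.
$latin1 = "\xE1";

// Declared as UTF-8: the invalid sequence makes the whole result empty.
$asUtf8 = htmlspecialchars($latin1, ENT_QUOTES, 'UTF-8');

// Declared with the encoding the data really has: the byte passes through.
$asLatin1 = htmlspecialchars($latin1, ENT_QUOTES, 'ISO-8859-1');

// ENT_SUBSTITUTE replaces the invalid sequence with U+FFFD instead.
$substituted = htmlspecialchars($latin1, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

var_dump($asUtf8 === '', $asLatin1 === $latin1, bin2hex($substituted));
```

Passing the charset explicitly at every step, and keeping it the same from input to storage to output, avoids this kind of silent difference.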