Question
I'm trying to tag my text with a delimiter at specific places, to be used later for parsing. I want to use the delimiter character that is least likely to appear in the text itself. I'm currently looking at "\2", i.e. the U+0002 character. Is that safe enough to use? What other suggestions are there? The text is Unicode and will contain both English and non-English characters.
I want to use a character that can still be split on with PHP's explode().
Edit:
I also want to be able to display this text on screen (in the browser) with the delimiter "invisible" to the user. I can certainly use str_replace() to strip visible delimiters, but if there is a good invisible delimiter, no such processing is needed.
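For reference, a rough sketch of the setup described above, assuming UTF-8 text and U+0002 written as "\x02" (the same byte as "\2" in a double-quoted PHP string). The sample text and variable names are made up for illustration.

```php
<?php
// Assumption: U+0002 (START OF TEXT) is the delimiter; "\x02" is the same byte as "\2".
$delimiter = "\x02";

// Hypothetical sample text mixing English and non-English characters.
$tagged = "Hello" . $delimiter . "世界" . $delimiter . "Привет";

// explode() works on raw bytes. That is safe for UTF-8 input because
// bytes below 0x80 never occur inside a multi-byte sequence.
$parts = explode($delimiter, $tagged);

// Strip the delimiter (and escape HTML) before sending the text to the browser.
$display = htmlspecialchars(str_replace($delimiter, '', $tagged), ENT_QUOTES, 'UTF-8');
```

Here $parts holds the three segments and $display is the same text with the delimiter removed.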
Answer 1:
If this is only for an internal representation (i.e. not for interchange or storage), then you can use a noncharacter code point such as U+FFFF. Java, for example, uses that value (CharacterIterator.DONE) to signal that a CharacterIterator has reached the end of its text.
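A minimal sketch of how that could look in PHP, assuming UTF-8 strings and PHP 7.0 or later (for the "\u{FFFF}" escape). The names DELIM, join_fields() and split_fields() are hypothetical helpers, not part of any library. Since the delimiter is meant to stay internal, the join step rejects input that already contains it.

```php
<?php
// Assumption: U+FFFF is used only internally; "\u{FFFF}" requires PHP 7.0+.
const DELIM = "\u{FFFF}";

// Hypothetical helper: join fields, rejecting any that already contain the delimiter.
function join_fields(array $fields): string
{
    foreach ($fields as $field) {
        if (strpos($field, DELIM) !== false) {
            throw new InvalidArgumentException('Field contains the reserved delimiter.');
        }
    }
    return implode(DELIM, $fields);
}

// Hypothetical helper: split a joined string back into its fields.
function split_fields(string $joined): array
{
    // explode() accepts a multi-byte delimiter; U+FFFF is three bytes in UTF-8.
    return explode(DELIM, $joined);
}

$joined = join_fields(['naïve', '日本語', 'plain']);
$fields = split_fields($joined);
```

Before the text leaves the application (for display, storage or interchange), the delimiter would still need to be removed or replaced, for example with str_replace(DELIM, '', $joined).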
Source: https://stackoverflow.com/questions/6493956/least-used-unicode-delimiter