When should I use StringComparison.InvariantCulture instead of StringComparison.CurrentCulture to test string equality?

Submitted by 元气小坏坏 on 2020-05-29 06:55:09

Question


Based on my understanding (see my other question), deciding whether to test string equality with ordinal or cultural rules depends on the semantics of the comparison being performed.

If the two compared strings must be treated as raw sequences of characters (in other words, as two symbols), then an ordinal string comparison must be performed. This is the case for most string comparisons performed in server-side code.

Example: performing a user lookup by username. In this case the usernames of the available users and the searched username are just symbols; they are not words in a specific language, so there is no need to take linguistic elements into account when comparing them. In this context two symbols composed of different characters must be considered different, regardless of any linguistic rule.
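A minimal sketch of the username case, assuming an in-memory user store. `UserLookup`, `SameUsername`, `Exists` and the sample usernames are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class UserLookup
{
    // Hypothetical user store. StringComparer.Ordinal makes the set compare
    // usernames code unit by code unit, with no linguistic rules applied.
    private static readonly HashSet<string> Usernames =
        new HashSet<string>(StringComparer.Ordinal) { "alice", "bob" };

    // Two usernames are the same symbol only if they are made of the same characters.
    public static bool SameUsername(string a, string b)
    {
        return string.Equals(a, b, StringComparison.Ordinal);
    }

    public static bool Exists(string username)
    {
        return Usernames.Contains(username);
    }

    public static void Main()
    {
        Console.WriteLine(SameUsername("alice", "alice")); // True
        Console.WriteLine(SameUsername("alice", "Alice")); // False: different characters
        Console.WriteLine(Exists("bob"));                  // True
    }
}
```

If usernames should be case-insensitive, `StringComparison.OrdinalIgnoreCase` keeps the comparison non-linguistic while ignoring case.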

If the two compared strings must be considered as words in a specific language, then cultural rules must be taken into account during the comparison. It is entirely possible that two strings composed of different characters are considered the same word in a certain language, based on the grammatical rules of that language.

Example: the two words strasse and straße both mean street in the German language. So, when comparing strings that represent German words, this grammatical rule must be taken into account and the two strings must be considered equal (think of an application for the German market where the user enters the name of a street and that street must be looked up in a database in order to get the city where it is located).
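A minimal sketch of such a culture-aware comparison, pinned to the German culture instead of the thread's current culture. `StreetNameComparison`, `EqualInCulture` and `EqualOrdinal` are hypothetical names. Whether the two spellings actually compare equal may depend on the runtime's globalization backend (NLS on .NET Framework, ICU on .NET 5 and later), so the culture-sensitive result is deliberately not stated here; it is worth checking on the target platform.

```csharp
using System;
using System.Globalization;

public static class StreetNameComparison
{
    // Culture-sensitive equality under an explicitly named culture,
    // so the result does not depend on the machine's OS settings.
    public static bool EqualInCulture(string a, string b, string cultureName)
    {
        CompareInfo compareInfo = CultureInfo.GetCultureInfo(cultureName).CompareInfo;
        return compareInfo.Compare(a, b, CompareOptions.None) == 0;
    }

    // Ordinal equality, shown for contrast.
    public static bool EqualOrdinal(string a, string b)
    {
        return string.Equals(a, b, StringComparison.Ordinal);
    }

    public static void Main()
    {
        // Ordinal: the strings contain different characters, so this prints False.
        Console.WriteLine(EqualOrdinal("strasse", "straße"));

        // Culture-sensitive: the outcome is decided by the collation data of the runtime in use.
        Console.WriteLine(EqualInCulture("strasse", "straße", "de-DE"));
    }
}
```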

So far, so good.

Given all of this, in which cases does using the .NET invariant culture for string equality make sense?

The point is that the invariant culture (as opposed to the German culture mentioned in the example above) is a fake culture based on American English linguistic rules. Put another way, there is no human language whose rules are based on the .NET invariant culture, so why should I compare two strings by using this fictitious culture?

I know that the invariant culture is typically used to format and parse strings in machine-to-machine communication scenarios (such as the contracts exposed by a web API).

I would like to understand when calling string.Equals with StringComparison.InvariantCulture, as opposed to StringComparison.CurrentCulture (for some manually set thread culture, in order not to depend on the machine's OS configuration), really makes sense.


Answer 1:


Combining diacritics / non-normalised strings is one example. See this answer for a decent treatment with code: https://stackoverflow.com/a/31361980/2701753

In summary, for (many) 'alphabets' there are several potential Unicode (and UCS-2) representations for the same glyph (letter).

For example:

Unicode Character “á” (U+00E1) [one Unicode code point]
Unicode Character “a” (U+0061) [followed by] Unicode Character “◌́” (U+0301) [two Unicode code points]

so:
á
á

Same linguistic string (for all cultures, they are supposed to represent the same character) but different ordinal string (different bytes).

So an invariant-culture equality comparison is [in this case] like normalising the strings before comparing them.
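The difference can be sketched as follows, using escape sequences so that the two representations stay distinct in source code. `DiacriticDemo` and its members are hypothetical names. The invariant-culture result assumes the runtime has its normal globalization data available; in globalization-invariant mode, culture-sensitive comparisons fall back to ordinal behaviour.

```csharp
using System;
using System.Text;

public static class DiacriticDemo
{
    public const string Precomposed = "\u00E1"; // á as a single code point
    public const string Decomposed = "a\u0301"; // a followed by a combining acute accent

    // Ordinal: compares the raw UTF-16 code units, which differ.
    public static bool OrdinalEqual()
    {
        return string.Equals(Precomposed, Decomposed, StringComparison.Ordinal);
    }

    // Invariant culture: linguistic comparison that is not tied to any user's language.
    public static bool InvariantEqual()
    {
        return string.Equals(Precomposed, Decomposed, StringComparison.InvariantCulture);
    }

    // Explicit alternative: normalise both sides to form C, then compare ordinally.
    public static bool NormalizedOrdinalEqual()
    {
        string left = Precomposed.Normalize(NormalizationForm.FormC);
        string right = Decomposed.Normalize(NormalizationForm.FormC);
        return string.Equals(left, right, StringComparison.Ordinal);
    }

    public static void Main()
    {
        Console.WriteLine(OrdinalEqual());           // False: different code units
        Console.WriteLine(InvariantEqual());         // expected True with full globalization data
        Console.WriteLine(NormalizedOrdinalEqual()); // True: both sides become U+00E1
    }
}
```

Normalising and then comparing ordinally only handles canonical equivalence, whereas a culture-sensitive comparison applies further linguistic rules, so the two approaches are similar for this case but not interchangeable in general.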

Look up Unicode normalisation / decomposition for more info.

There are other interesting cases: ligatures, for example, and left-to-right and right-to-left marks, and so on.

So, in summary, once you have 'interesting' alphabets in play (pretty much anything outside pure ASCII) and you are interested in any sort of comparison of the strings as linguistic items / streams of glyphs, you probably do want to go beyond ordinal comparison.

To directly answer the question: If you have a multicultural user-base, but still need the above linguistic sensitivity, what culture would you choose for:

StringComparison.CurrentCulture (for some manually set thread culture, in order not to depend on the machine's OS configuration)

other than InvariantCulture?



Source: https://stackoverflow.com/questions/61740030/when-should-i-use-stringcomparison-invariantculture-instead-of-stringcomparison
