I'm working on an application which supports several languages and has functionality in place which tries to use the language requested by the browser and also allows manu
You can use the languages from
$_SERVER['HTTP_ACCEPT_LANGUAGE']
It contains something like
de-de,de;q=0.8,en-us;q=0.5,en;q=0.3
so you need to parse this string. Then you can pass the preferred language to the setlocale() function, after mapping it to a locale name your system knows (for example de_DE.UTF-8).
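As a rough sketch of the parsing step: the helper names parseAcceptLanguage() and pickLanguage(), and the list of supported languages, are made up for illustration and are not part of PHP. It only handles the common "tag;q=value" form and does not treat the "*" wildcard specially.

```php
<?php
// Hypothetical helper: split an Accept-Language header into language tags,
// ordered by descending q-value. Tags without a q parameter count as q=1.0.
function parseAcceptLanguage(string $header): array
{
    $entries = [];
    foreach (explode(',', $header) as $index => $part) {
        $pieces = explode(';', $part);
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '') {
            continue; // skip empty segments, e.g. an empty header
        }
        $q = 1.0;
        foreach (array_slice($pieces, 1) as $param) {
            if (preg_match('/^\s*q\s*=\s*([0-9]*\.?[0-9]+)\s*$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        $entries[] = ['tag' => $tag, 'q' => $q, 'index' => $index];
    }
    usort($entries, function (array $a, array $b): int {
        if ($a['q'] !== $b['q']) {
            return $b['q'] <=> $a['q']; // higher q first
        }
        return $a['index'] <=> $b['index']; // ties keep header order
    });
    return array_column($entries, 'tag');
}

// Hypothetical helper: first requested language whose primary subtag
// (the part before the first "-") is in the supported list.
function pickLanguage(array $requested, array $supported, string $default): string
{
    foreach ($requested as $tag) {
        $primary = explode('-', $tag)[0];
        if (in_array($primary, $supported, true)) {
            return $primary;
        }
    }
    return $default;
}

// The supported list and the default are assumptions for this example.
$header = $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '';
$language = pickLanguage(parseAcceptLanguage($header), ['en', 'de'], 'en');
```

For the header shown above, parseAcceptLanguage() gives de-de, de, en-us, en in that order, so pickLanguage() would settle on "de" here.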
If you only want to check for valid Unicode letters, regardless of the language used, I'd suggest a regular expression (this requires your PCRE extension to be built with Unicode support):
// adjust pattern to your needs
// $input needs to be UTF-8 encoded
if (preg_match('/^\p{L}+$/u', $input)) {
    // OK
} else {
    // not OK
}
\p{L} matches Unicode characters with the L (letter) property, which comprises Ll (lowercase letter), Lm (modifier letter), Lo (other letter), Lt (title case letter) and Lu (uppercase letter). (From: Regular Expression Details.)
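To get a feel for what the pattern accepts, here is a small sketch. The function name isLettersOnly() is made up for this example, and it uses \z instead of $ as one possible adjustment, because a plain $ in PCRE also matches just before a single trailing newline.

```php
<?php
// Hypothetical wrapper around the \p{L} pattern from above.
function isLettersOnly(string $input): bool
{
    // \z anchors at the very end of the string. preg_match() returns
    // false (not 0) when $input is not valid UTF-8, so compare with 1.
    return preg_match('/^\p{L}+\z/u', $input) === 1;
}

var_dump(isLettersOnly("M\u{00FC}ller")); // letters only (ü is a letter)
var_dump(isLettersOnly('abc123'));        // contains digits
var_dump(isLettersOnly("abc\n"));         // trailing newline
var_dump(isLettersOnly("e\u{0301}"));     // "e" + combining acute accent
```

The last case is worth knowing about: a combining mark has the M (mark) property, not L, so decomposed input fails the check. If that matters for your data, a character class such as [\p{L}\p{M}] is one way to adjust the pattern.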
I wouldn't use an array of characters. That would become impossible to manage.
What I'd suggest is working out a default language from the IP address and using that as the locale for the request. In some cases you could also get it from the browser's user-agent string. Give the user a way to override it, so that they aren't stuck with a strange site if your default is wrong (e.g. show on the form: 'Language set to English. If this isn't correct, please change: '). This isn't the nicest thing to provide, but you won't get any working validation otherwise, as you NEED a language/locale set in order to have sensible alpha validation (an A isn't a letter in Chinese).
This is an encoding issue rather than a language detection issue, because UTF-8 can encode any Unicode character.
The best approach is to use UTF-8 throughout your project: in your database, in your output and as the expected encoding for input.
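For the input side, one possible check is sketched below. It assumes the mbstring extension is available; the function name isValidUtf8() is made up for illustration.

```php
<?php
// Hypothetical helper: true if $input is a well-formed UTF-8 byte sequence.
// Assumes the mbstring extension is loaded.
function isValidUtf8(string $input): bool
{
    return mb_check_encoding($input, 'UTF-8');
}

var_dump(isValidUtf8("M\xC3\xBCller")); // "Müller" encoded as UTF-8
var_dump(isValidUtf8("M\xFCller"));     // "Müller" encoded as ISO-8859-1
```

Input that fails this check could be rejected or converted before it reaches any letter validation, so that the /u regular expression always sees valid UTF-8.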