We're getting ready to translate our PHP website into various languages, and the gettext support in PHP looks like the way to go.
All the tutorials I see recommend
I use meaningful IDs such as "welcome_back_1", which would be "welcome back, %1", etc. I always have English as my "base" language, so in the worst-case scenario, when a specific language doesn't have a message ID, I fall back on English.
I don't like to use actual English phrases as message IDs because if the English changes, so does the ID. This might not affect you much if you use some automated tools, but it bothers me. I don't like to use simple codes (like msg3975) because they don't mean anything, so reading the code is more difficult unless you litter comments everywhere.
The reason for the IDs being English is so that the ID is returned if the translation fails for whatever reason - the translation for the current language and token not being available, or other errors. That of course assumes the developer is writing the original English text, not some documentation person.
Also if the English text changes then probably the other translations need to be updated?
In practice we also use pure IDs rather than the English text, but it does mean we have to do lots of extra work to default to English.
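To give an idea of what that extra work can look like, here is a minimal sketch of an ID lookup that falls back to English and finally to the raw ID. It uses plain PHP arrays instead of gettext catalogs so it runs on its own; the `$catalogs` array and the `translate_id()` helper are made up for illustration and are not part of gettext.

```php
<?php
// Hypothetical catalogs keyed by mnemonic ID. In a real project these
// would be loaded from translation files rather than hard-coded.
$catalogs = [
    'en' => [
        'welcome_back_1' => 'Welcome back, %1$s',
        'about_me'       => 'About Me',
    ],
    'de' => [
        'welcome_back_1' => 'Willkommen zurück, %1$s',
        // 'about_me' is deliberately missing to show the fallback.
    ],
];

/**
 * Look up $id in the requested locale, then in English,
 * and return the raw ID if neither catalog has it.
 */
function translate_id(array $catalogs, string $locale, string $id, ...$args): string
{
    $text = $catalogs[$locale][$id] ?? $catalogs['en'][$id] ?? $id;

    // Only run the formatter when placeholder values were supplied.
    return $args ? vsprintf($text, $args) : $text;
}

echo translate_id($catalogs, 'de', 'welcome_back_1', 'Ann'), "\n"; // German text
echo translate_id($catalogs, 'de', 'about_me'), "\n";              // English fallback
```

With real gettext the same idea usually means a second lookup against the English catalog whenever the first lookup returns the ID unchanged.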
In addition to the considerations above, there are many cases where you'd want the "key" (msgid) to be different from the source text (English). For example, in the HTML view, I might want to say [yyyy] where the destination and label of that anchor tag depend on the locale of the user. E.g. it might be a link to a social network, and in US it would be Facebook but in China it would be Weibo. So the MsgIds might be something like socialSiteUrl and socialSiteLabel.
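A rough sketch of that case, assuming per-locale catalogs stored as plain arrays: the locale codes, URLs and the `social_link()` helper are illustrative assumptions, and only the two keys come from the description above.

```php
<?php
// Hypothetical per-locale catalogs. The values are not translations of
// each other; they are different content chosen for each locale.
$messages = [
    'en_US' => [
        'socialSiteUrl'   => 'https://www.facebook.com/',
        'socialSiteLabel' => 'Facebook',
    ],
    'zh_CN' => [
        'socialSiteUrl'   => 'https://weibo.com/',
        'socialSiteLabel' => 'Weibo',
    ],
];

/**
 * Build the anchor tag for the user's locale, falling back to en_US
 * when the locale has no catalog.
 */
function social_link(array $messages, string $locale): string
{
    $m = $messages[$locale] ?? $messages['en_US'];

    return sprintf(
        '<a href="%s">%s</a>',
        htmlspecialchars($m['socialSiteUrl'], ENT_QUOTES),
        htmlspecialchars($m['socialSiteLabel'], ENT_QUOTES)
    );
}

echo social_link($messages, 'zh_CN'), "\n";
```

Because neither value is a translation of an English phrase, an English sentence would make a misleading key here, which is the argument for keys like socialSiteUrl.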
I use a mix.
For basic strings that I don't think will have conflicts/changes/weird meanings, I'll make the key be the same as the English.
In a word: don't do this.
The same word/phrase in English can often enough have more than one meaning, and each meaning a different translation.
Define mnemonic IDs for your strings, and treat English as just another language.
Agree with other posters that id numbers in code are a nightmare for code readability.
Ex localisation engineer
At the end of the day, a translator should be able to sit down and change the texts for every language (so they match in meaning) without having to involve the programmer that already did his/her job.
This makes me feel like the proper answer is to use a modified version of gettext where you put strings like this
_(id, backup_text, context)
_('ABOUT_ME', 'About Me', 'HOMEPAGE')
context being optional
Why like this? Because you need to identify text in the system using unique IDs, not English text that could get repeated elsewhere.
You should also keep the backup, id and context in the same place in your code to reduce discrepancies.
The IDs also have to be readable, which brings in the problem of synonyms and duplicate use (even as IDs). We could prefix the IDs like this: "HOMEPAGE_ABOUT_ME" or "MAIL_LETTER", but
which is why I also added the context variable at the end
the backup text can be pretty much anything, could even be "[ABOUT_ME@HOMEPAGE text failed to load, please contact example@example.com]"
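Here is one way the idea could look in PHP. It is only a sketch: the `Translator` class, its array-backed catalog and the "ID@CONTEXT" key format are my own assumptions to keep the example runnable, not something gettext provides.

```php
<?php
// Sketch of the t(id, backup_text, context) idea with an array catalog.
// A real version would load the catalog from translation files.
final class Translator
{
    /** @var array<string, string> */
    private array $catalog;

    public function __construct(array $catalog)
    {
        $this->catalog = $catalog;
    }

    /**
     * Try the context-specific entry first, then the bare ID,
     * and return the backup text if neither exists.
     */
    public function t(string $id, string $backup, ?string $context = null): string
    {
        if ($context !== null && isset($this->catalog[$id . '@' . $context])) {
            return $this->catalog[$id . '@' . $context];
        }

        return $this->catalog[$id] ?? $backup;
    }
}

$tr = new Translator([
    'ABOUT_ME@HOMEPAGE' => 'Über mich',
    'MAIL_LETTER'       => 'Brief',
]);

echo $tr->t('ABOUT_ME', 'About Me', 'HOMEPAGE'), "\n"; // context-specific entry
echo $tr->t('CONTACT', 'Contact'), "\n";               // no entry, so the backup text
```

Keeping the context last and optional means most calls stay short, and the backup text sits right next to the ID in the source.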
It won't work with the current gettext editing programs like "poedit", but I think you can define custom variable names for translations like just "t()" without the underscore at the start.
I know that gettext also has support for contexts, but it's not very well documented or widely used.
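For what it's worth, as far as I know PHP's gettext extension does not provide a pgettext() function itself, so contexts are usually handled with a small shim. GNU gettext stores a contextual entry under the key "context", then the EOT character (\004), then the msgid, and the sketch below relies on that. The function_exists() guards are there so it still runs where the extension is not loaded, in which case it simply returns the msgid.

```php
<?php
// Userland pgettext() shim: look up "context\004msgid" and fall back
// to the plain msgid when no translation is found.
if (!function_exists('pgettext')) {
    function pgettext(string $context, string $msgid): string
    {
        $key = $context . "\004" . $msgid;

        // Without the gettext extension there is nothing to look up.
        $translation = function_exists('gettext') ? gettext($key) : $key;

        // gettext() returns its argument unchanged when no entry exists.
        return $translation === $key ? $msgid : $translation;
    }
}

echo pgettext('HOMEPAGE', 'About Me'), "\n";
```

xgettext can be told about such a function with a keyword spec like --keyword=pgettext:1c,2, where the "c" marks the context argument.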
P.S. I'm not sure about the best variable order to enforce good and extendable code so suggestions are welcome.