boost to_upper function of string_algo doesn't take into account the locale

Submitted by 倾然丶 夕夏残阳落幕 on 2019-12-23 19:31:34

Question


I have a problem with the functions in the string_algo package.

Consider this piece of code:

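#include <boost/algorithm/string.hpp>
#include <iostream>
#include <locale>
#include <stdexcept>
#include <string>

using namespace std;
using boost::algorithm::to_upper;

int main() {
   try {
      string s = "meißen";
      locale l("de_DE.UTF-8");   // throws std::runtime_error if the locale is unavailable
      to_upper(s, l);
      cout << s << endl;
   } catch (const std::runtime_error& e) {
      cerr << e.what() << endl;
   }

   try {
      string s = "composición";
      locale l("es_CO.UTF-8");
      to_upper(s, l);
      cout << s << endl;
   } catch (const std::runtime_error& e) {
      cerr << e.what() << endl;
   }
}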

The expected output for this code would be:

MEISSEN
COMPOSICIÓN

However, the only thing I get is:

MEIßEN
COMPOSICIóN

So, clearly, the locale is not being taken into account. I even tried setting the global locale, with no success. What can I do?


Answer 1:


std::toupper assumes a 1:1 conversion, so there is no hope for the ß to SS case, Boost.StringAlgo or not.
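To make the 1:1 point concrete, here is a minimal sketch of a per-character upper-casing loop built on std::toupper. The helper name upper_1to1 is hypothetical and this is not Boost.StringAlgo's actual implementation; it only illustrates that any algorithm of this shape has to return a string of the same length as its input.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <string>

// Hypothetical helper: upper-cases a string one char at a time, the way
// any algorithm built on std::toupper(char, locale) has to.
std::string upper_1to1(std::string s, const std::locale& loc) {
    const std::size_t before = s.size();
    std::transform(s.begin(), s.end(), s.begin(),
                   [&loc](char c) { return std::toupper(c, loc); });
    // std::toupper returns exactly one char per input char, so the length
    // cannot change; turning "ß" into "SS" would need the string to grow.
    assert(s.size() == before);
    return s;
}
```

Whatever locale is passed in, the result has as many chars as the input, so a one-to-many mapping such as ß to SS is out of reach for this approach.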

Looking at StringAlgo's code, we see that it does use the locale (except on Borland, it seems). So, for the other case, I'm curious: what is the result of toupper('ó', std::locale("es_CO.UTF-8")) on your platform?

Writing the above makes me think about something else: what is the encoding of the strings in your sources? UTF-8? In that case, std::toupper will see two separate code units for 'ó' rather than one character, so there is no hope. Latin-1? In that case, using a locale whose name ends in ".UTF-8" is inconsistent.
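The encoding point can be checked with a small sketch. It assumes the source string is UTF-8, where 'ó' (U+00F3) is the two bytes 0xC3 0xB3; the helper changed_bytes and the constant o_acute_utf8 are hypothetical names introduced for illustration. The sketch uses the classic "C" locale so that it does not depend on which named locales are installed.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: counts how many bytes a per-char std::toupper call
// actually changes in the given string.
std::size_t changed_bytes(const std::string& s, const std::locale& loc) {
    std::size_t changed = 0;
    for (char c : s) {
        if (std::toupper(c, loc) != c) ++changed;
    }
    return changed;
}

// "ó" (U+00F3) spelled out as its UTF-8 encoding: two code units, not one.
const std::string o_acute_utf8 = "\xC3\xB3";
```

Here o_acute_utf8.size() is 2, and changed_bytes(o_acute_utf8, std::locale::classic()) is 0: neither byte is a lowercase letter on its own, so a byte-wise conversion leaves the character as it was. Passing a named locale instead of the classic one is the experiment suggested above.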




Answer 2:


In addition to Éric Malenfant's answer: std::locale facets work on a single character at a time. To get a better result you can use std::wstring, so that more characters are converted, but as you can see it is still not perfect (for example, ß).
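A minimal sketch of the std::wstring approach, assuming the text has already been decoded into wide characters. The helper name upper_wide is hypothetical; it simply forwards to the locale's std::ctype<wchar_t> facet.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: upper-cases a wide string through the locale's
// std::ctype<wchar_t> facet, one wchar_t at a time.
std::wstring upper_wide(std::wstring s, const std::locale& loc) {
    const auto& facet = std::use_facet<std::ctype<wchar_t>>(loc);
    if (!s.empty()) {
        // In-place range overload: converts [low, high) element by element.
        facet.toupper(&s[0], &s[0] + s.size());
    }
    return s;
}
```

It could be called as upper_wide(L"composici\u00F3n", std::locale("es_CO.UTF-8")), bearing in mind that constructing a named locale throws std::runtime_error when that locale is not installed. Because the facet still maps one wchar_t to one wchar_t, the result always has the same length as the input, so ß cannot become SS this way either.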

I would suggest giving Boost.Locale a try (a new library for Boost, not yet in Boost), which handles this kind of conversion:

http://cppcms.sourceforge.net/boost_locale/docs/

In particular, see http://cppcms.sourceforge.net/boost_locale/docs/index.html#conversions, which deals with the problem you are describing.




Answer 3:


In the standard library there is std::toupper (which boost::to_upper uses) that operates on one character at a time.

This explains why the ß doesn't work. You didn't say which standard library and codepage you are using so I don't know why the ó didn't work.

What happens if you use wstring instead?




Answer 4:


You can use boost::locale. Here is an example.



Source: https://stackoverflow.com/questions/1770763/boost-to-upper-function-of-string-algo-doesnt-take-into-account-the-locale
