Compare std::wstring and std::string

青春惊慌失措 2020-11-29 09:10

How can I compare a wstring, such as L"Hello", to a string? If I need to have the same type, how can I convert them into the same type?

3 answers
  •  一整个雨季
    2020-11-29 09:43

    Since you asked, here are my standard functions for converting between narrow and wide strings, implemented with the C++ std::string and std::wstring classes.

    First off, make sure to start your program with a call to std::setlocale:

    #include <clocale>
    
    int main()
    {
      std::setlocale(LC_CTYPE, "");  // before any string operations
    }
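
    To illustrate why that call matters, here is a minimal sketch that assumes only the standard <clocale> facilities. The helper names current_ctype_locale and use_environment_locale are made up for this example. A program always starts in the "C" locale, so the multibyte conversion functions only follow the user's encoding once setlocale has switched to the environment locale.

    ```cpp
    #include <clocale>
    #include <string>

    // Hypothetical helper: name of the LC_CTYPE locale currently in effect.
    std::string current_ctype_locale()
    {
      // A null second argument queries the locale without changing it.
      const char * name = std::setlocale(LC_CTYPE, nullptr);
      return name ? name : "";
    }

    // Hypothetical helper: switch LC_CTYPE to the user's environment locale.
    // Returns false if the environment names a locale the system cannot provide.
    bool use_environment_locale()
    {
      return std::setlocale(LC_CTYPE, "") != nullptr;
    }
    ```

    Checking the return value is worthwhile, because a failed setlocale leaves the previous locale in place and later conversions may then reject non-ASCII input.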
    

    Now for the functions. First off, getting a wide string from a narrow string:

    #include <string>
    #include <vector>
    #include <iostream>
    #include <cassert>
    #include <cstdlib>
    #include <cwchar>
    #include <cerrno>
    
    // Dummy overload
    std::wstring get_wstring(const std::wstring & s)
    {
      return s;
    }
    
    // Real worker
    std::wstring get_wstring(const std::string & s)
    {
      const char * cs = s.c_str();
      const size_t wn = std::mbsrtowcs(NULL, &cs, 0, NULL);
    
      if (wn == size_t(-1))
      {
        std::cout << "Error in mbsrtowcs(): " << errno << std::endl;
        return L"";
      }
    
      std::vector<wchar_t> buf(wn + 1);
      const size_t wn_again = std::mbsrtowcs(buf.data(), &cs, wn + 1, NULL);
    
      if (wn_again == size_t(-1))
      {
        std::cout << "Error in mbsrtowcs(): " << errno << std::endl;
        return L"";
      }
    
      assert(cs == NULL); // successful conversion
    
      return std::wstring(buf.data(), wn);
    }
    

    And going back, making a narrow string from a wide string. I call the narrow string "locale string", because it is in a platform-dependent encoding depending on the current locale:

    // Dummy
    std::string get_locale_string(const std::string & s)
    {
      return s;
    }
    
    // Real worker
    std::string get_locale_string(const std::wstring & s)
    {
      const wchar_t * cs = s.c_str();
      const size_t wn = std::wcsrtombs(NULL, &cs, 0, NULL);
    
      if (wn == size_t(-1))
      {
        std::cout << "Error in wcsrtombs(): " << errno << std::endl;
        return "";
      }
    
      std::vector<char> buf(wn + 1);
      const size_t wn_again = std::wcsrtombs(buf.data(), &cs, wn + 1, NULL);
    
      if (wn_again == size_t(-1))
      {
        std::cout << "Error in wcsrtombs(): " << errno << std::endl;
        return "";
      }
    
      assert(cs == NULL); // successful conversion
    
      return std::string(buf.data(), wn);
    }
    

    Some notes:

    • If you don't have std::vector::data(), you can say &buf[0] instead.
    • I've found that the restartable ("r"-style) conversion functions mbsrtowcs and wcsrtombs don't work properly on Windows. There, you can use mbstowcs and wcstombs instead: mbstowcs(buf.data(), cs, wn + 1); and wcstombs(buf.data(), cs, wn + 1);
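
    As a rough sketch of the non-restartable alternative from the last note, the following helpers use std::mbstowcs and std::wcstombs. The names widen_with_mbstowcs and narrow_with_wcstombs are made up for illustration and are not part of the functions above. Instead of a separate size query, this version sizes the buffer from an upper bound: a conversion never produces more wide characters than input bytes, and never more than MB_CUR_MAX bytes per wide character.

    ```cpp
    #include <cstdlib>
    #include <string>
    #include <vector>

    // Hypothetical helper: narrow -> wide using the current LC_CTYPE locale.
    // Returns an empty string if the input is not valid in that locale.
    std::wstring widen_with_mbstowcs(const std::string & s)
    {
      // At most one wide character per input byte, plus the terminator.
      std::vector<wchar_t> buf(s.size() + 1);
      const std::size_t n = std::mbstowcs(buf.data(), s.c_str(), buf.size());
      if (n == static_cast<std::size_t>(-1))
        return L"";
      return std::wstring(buf.data(), n);
    }

    // Hypothetical helper: wide -> narrow using the current LC_CTYPE locale.
    // Returns an empty string if a character cannot be represented.
    std::string narrow_with_wcstombs(const std::wstring & s)
    {
      // At most MB_CUR_MAX bytes per wide character, plus the terminator.
      std::vector<char> buf(s.size() * MB_CUR_MAX + 1);
      const std::size_t n = std::wcstombs(buf.data(), s.c_str(), buf.size());
      if (n == static_cast<std::size_t>(-1))
        return "";
      return std::string(buf.data(), n);
    }
    ```

    Both helpers stop at the first embedded null character, and an empty result is ambiguous between an empty input and a failed conversion, so real code may prefer a separate error signal.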


    In response to your question: to compare the two strings, convert both to wide strings and compare those. If you are reading a file from disk that has a known encoding, use iconv() to convert its contents from that encoding to WCHAR_T, and then compare the result with the wide string.
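
    A minimal sketch of the comparison itself could look like this. It is self-contained, so it carries its own small converter rather than reusing the functions above, and the names widen and equal_as_wide are made up for illustration. It assumes the narrow string is encoded in the current LC_CTYPE locale.

    ```cpp
    #include <cwchar>
    #include <string>
    #include <vector>

    // Hypothetical helper: widen a narrow string using the current locale.
    // Returns false if the bytes are not valid in that locale.
    bool widen(const std::string & narrow, std::wstring & out)
    {
      std::mbstate_t state = std::mbstate_t();   // initial conversion state
      const char * src = narrow.c_str();
      // Never more wide characters than input bytes, plus the terminator.
      std::vector<wchar_t> buf(narrow.size() + 1);
      const std::size_t n = std::mbsrtowcs(buf.data(), &src, buf.size(), &state);
      if (n == static_cast<std::size_t>(-1))
        return false;
      out.assign(buf.data(), n);
      return true;
    }

    // Hypothetical helper: compare a wide and a narrow string as wide strings.
    // An unconvertible narrow string compares unequal.
    bool equal_as_wide(const std::wstring & wide, const std::string & narrow)
    {
      std::wstring converted;
      return widen(narrow, converted) && wide == converted;
    }
    ```

    With this, equal_as_wide(L"Hello", "Hello") is true, while a differently cased or shorter narrow string compares unequal.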

    Beware, though, that complex Unicode text may have multiple different representations as code point sequences which you may want to consider equal. If that is a possibility, you need to use a higher-level Unicode processing library (such as ICU) and normalize your strings to some common, comparable form.
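
    To make that caveat concrete, here is a small illustration that uses only the standard library. It shows the problem, not the ICU-based fix, and the function names are made up for this example. The letter "é" can be written as the single code point U+00E9 or as "e" followed by the combining acute accent U+0301, and a plain wstring comparison treats the two as different.

    ```cpp
    #include <string>

    // "é" as one precomposed code point (U+00E9).
    std::wstring precomposed_e_acute()
    {
      return L"\u00E9";
    }

    // "é" as the letter "e" followed by a combining acute accent (U+0301).
    std::wstring decomposed_e_acute()
    {
      return L"e\u0301";
    }

    // Plain operator== compares code units, not what the reader sees.
    bool same_code_units(const std::wstring & a, const std::wstring & b)
    {
      return a == b;
    }
    ```

    The two strings render identically but have lengths 1 and 2, so same_code_units returns false for them. Normalizing both to a common form before comparing is what removes this difference.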
