unicode-string | 易学教程

Clarifying Java's evolutionary support of Unicode [closed]

Question: I find Java's distinction between char and code point strange and out of place. For example, a string is an array of characters, or "letters which appear in an alphabet", whereas a code point MAY be a single letter or possibly a composite or surrogate pair.
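For context, a Java char is a 16-bit UTF-16 code unit rather than a full Unicode character, so a code point above U+FFFF occupies two chars (a surrogate pair). The sketch below is in Python, not Java, and only illustrates that counting difference; the helper names utf16_code_units and code_points are made up for this example.

```python
def utf16_code_units(text: str) -> int:
    """Count UTF-16 code units, which is what Java's String.length() reports."""
    return len(text.encode("utf-16-le")) // 2


def code_points(text: str) -> int:
    """Count Unicode code points, comparable to Java's String.codePointCount()."""
    return len(text)


bmp = "A"              # U+0041, inside the Basic Multilingual Plane
astral = "\U0001D11E"  # U+1D11E MUSICAL SYMBOL G CLEF, outside the BMP

print(utf16_code_units(bmp), code_points(bmp))
print(utf16_code_units(astral), code_points(astral))
```

The second string is one code point but two UTF-16 code units, which is the gap between char and code point that the question describes.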

Arabic characters don't show in excel VBA code

Question: I can't write Arabic strings in VBA code in Excel; they show up as weird characters. I tried it on many machines with Excel 2013 or 2010, on Windows 8 or Windows 7, with or without the Arabic proofing tools installed. The Arabic language is already installed on all machines, and the system locale is Arabic. There's no problem typing Arabic characters in Excel worksheets or even MS Word, only in VBA code. Please help.

Answer 1: In the VB Editor: 1. Click Tools. 2. Select Options... 3. Click Editor Format. 4. Change the font

Python .split() without 'u

In Python, if I have a string like a = " Hello - to - everybody" and I call a.split('-'), then I get [u'Hello', u'to', u'everybody']. This is just an example. How can I get a simple list without that annoying u'?

The u means that it's a Unicode string, so your original string must also have been a Unicode string. Generally it's a good idea to keep strings as Unicode, because converting them to normal (byte) strings can fail for characters with no equivalent. The u appears only in the representation, to let you know it's a Unicode string; it does not affect the string itself. In general, ...
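A small sketch of the point the answer makes: the u prefix belongs to the printed representation of the list, not to the strings themselves. It is written for Python 3, where every str is already Unicode and the prefix no longer appears in repr; the strip() call is an addition here, assuming the spaces around each hyphen are unwanted.

```python
a = u" Hello - to - everybody"  # the u prefix is still accepted in Python 3

# Split on the hyphen and trim the surrounding spaces from each piece
parts = [piece.strip() for piece in a.split("-")]

# Printing the list shows the repr of each element
print(parts)

# Printing the elements themselves shows only the text
for piece in parts:
    print(piece)

# The prefix does not change the value of the string
print(u"Hello" == "Hello")
```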

What is std::wifstream::getline doing to my wchar_t array? It's treated like a byte array after getline returns

I want to read lines of Unicode text (UTF-16 LE, line-feed delimited) from a file. I'm using Visual Studio 2012 and targeting a 32-bit console application. I could not find a ReadLine function in the WinAPI, so I turned to Google. Clearly I am not the first to look for such a function. The most commonly recommended solution involves using std::wifstream. I wrote code similar to the following:

    wchar_t buffer[1024];
    std::wifstream input(L"input.txt");
    while (input.good()) {
        input.getline(buffer, 1024);
        // ... do stuff...
    }
    input.close();

For the sake of explanation, assume that input.txt ...
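To illustrate the symptom in the title: UTF-16 LE stores each code unit as two bytes, so text that is consumed one byte per character looks like a byte array with interleaved NULs. The sketch below is Python rather than C++ and uses an invented in-memory sample instead of input.txt; it contrasts the byte-per-character view with decoding the stream as UTF-16 LE before splitting into lines.

```python
import io

# Invented sample: two lines encoded as UTF-16 LE with line-feed delimiters
raw = "first line\nsecond line\n".encode("utf-16-le")

# Treating each byte as one character yields text interleaved with NULs
byte_per_char = raw.decode("latin-1")
print(repr(byte_per_char[:6]))

# Decoding as UTF-16 LE first recovers the intended lines
with io.TextIOWrapper(io.BytesIO(raw), encoding="utf-16-le", newline="\n") as reader:
    lines = [line.rstrip("\n") for line in reader]
print(lines)
```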

Automatically change between std::string and std::wstring according to unicode setting in MSVC++?

I'm writing a DLL and want to be able to switch between the Unicode and multibyte settings in MSVC++ 2010. For example, I use _T("string"), LPCTSTR, and WIN32_FIND_DATA instead of the -W and -A versions, and so on. Now I want strings that switch between std::string and std::wstring according to the Unicode setting. Is that possible? Otherwise, this will probably end up getting extremely complicated.

Why not do what the Win32 API does: use wide characters internally, and provide a character-converting facade of DoSomethingA functions which simply convert their input to Unicode.

Perl: printing Unicode strings to the Windows console

I am encountering a strange problem in printing Unicode strings to the Windows console*. Consider this text: אני רוצה לישון Intermediary היא רוצה לישון אתם, הם Bye Hello, world! test

Assume it's in a file called "file.txt". When I run* "type file.txt", it prints out fine. But when it's printed from a Perl program, like this:

    use strict;
    use warnings;
    use Encode;
    use 5.014;
    use utf8;
    use autodie;
    use warnings qw< FATAL utf8 >;
    use open qw< :std :utf8 >;
    use feature qw< unicode_strings >;
    use warnings 'all';
    binmode STDOUT, ':utf8'; # output should be in UTF-8
    my $word;
    my @array = ( 'אני רוצה

Python 3 - TypeError: a bytes-like object is required, not 'str'

I'm working on a lesson from Udacity and am having trouble finding out whether the result from this site is true or false. I get the TypeError with the code below.

    from urllib.request import urlopen

    # check text for curse words
    def check_profanity():
        f = urlopen("http://www.wdylike.appspot.com/?q=shit")
        output = f.read()
        f.close()
        print(output)
        if "b'true'" in output:
            print("There is a profane word in the document")

    check_profanity()

The output prints b'true' and I'm not really sure where that 'b' is coming from.

In Python 3, strings are Unicode by default. The b in b'true' means that ...
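The TypeError comes from testing a str against bytes: in Python 3, read() on the response returns bytes, and print() shows them with the b prefix. Below is a minimal sketch of two ways around it, using a hard-coded bytes value in place of the network call. The body b"true" is taken from the printed output in the question; the helper name contains_profanity and the assumption that the service replies in UTF-8 are mine.

```python
def contains_profanity(raw: bytes) -> bool:
    """Hypothetical helper: interpret the raw response body from the service."""
    # Option 1: decode the bytes to str, then compare text with text
    text = raw.decode("utf-8")
    return "true" in text


output = b"true"  # stand-in for f.read()

print(contains_profanity(output))

# Option 2: keep the bytes and compare against a bytes literal
print(b"true" in output)

# Mixing the two types is what raises the error from the question
try:
    "true" in output
except TypeError as err:
    print("TypeError:", err)
```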

PHP - length of string containing emojis/special chars

I'm building an API for a mobile application and I seem to have a problem with counting the length of a string containing emojis. My code:

    $str = "👍🏿✌🏿️ @mention";
    printf("strlen: %d" . PHP_EOL, strlen($str));
    printf("mb_strlen UTF-8: %d" . PHP_EOL, mb_strlen($str, "UTF-8"));
    printf("mb_strlen UTF-16: %d" . PHP_EOL, mb_strlen($str, "UTF-16"));
    printf("iconv UTF-16: %d" . PHP_EOL, iconv_strlen(iconv("UTF-8", "UTF-16", $str)));
    printf("iconv UTF-16: %d" . PHP_EOL, iconv_strlen(iconv("ISO-8859-1", "UTF-16", $str)));

The response of this is:

    strlen: 27
    mb_strlen UTF-8: 14
    mb_strlen UTF-16: 13
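The differing numbers come from counting different units. The Python sketch below counts the same string three ways: UTF-8 bytes, code points, and UTF-16 code units. The string is written with escapes so that the invisible variation selector (U+FE0F) after the victory-hand emoji is explicit, and the function name length_report is made up for this example.

```python
def length_report(text: str) -> dict:
    """Count one string in three different units."""
    return {
        "utf8_bytes": len(text.encode("utf-8")),                  # like PHP strlen
        "code_points": len(text),                                 # like mb_strlen(..., "UTF-8")
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,   # like JavaScript's length
    }


# thumbs up + dark skin tone, victory hand + dark skin tone + variation selector
s = "\U0001F44D\U0001F3FF\u270C\U0001F3FF\uFE0F @mention"
report = length_report(s)
print(report)
```

The first two counts match the 27 and 14 reported by strlen and mb_strlen with UTF-8 in the question; none of the three is the number of visible symbols, because each emoji here is built from several code points.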

Changing the first letter of every line in a file to uppercase

I need to change the first letter of every line in a file to uppercase, e.g.

    the bear ate the fish.
    the river was too fast.

Would become:

    The bear ate the fish.
    The river was too fast.

The document contains some special letters: a, a, á, à, ǎ, ā, b, c, d, e, e, é, è, ě, ē, f, g, h, i, i, í, ì, ǐ, ī, j, k, l, m, n, o, o, ó, ò, ǒ, ō, p, q, r, s, t, u, u, ú, ù, ǔ, ü, ǘ, ǜ, ǚ, ǖ, ū, v, w, x, y, and z. The uppercase forms of these letters are: A, A, Á, À, Ǎ, Ā, B, C, D, E, E, É, È, Ě, Ē, F, G, H, I, I, Í, Ì, Ǐ, Ī, J, K, L, M, N, O, O, Ó, Ò, Ǒ, Ō, P, Q, R, S, T, U, U, Ú, Ù, Ǔ, Ü, Ǘ, Ǜ, Ǚ, Ǖ, Ū, V, W ...
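One possible approach, sketched in Python 3 under the assumption that the text is already decoded to str and that the accented letters are stored as single precomposed code points: uppercase only the first character of each line with str.upper(), which is Unicode-aware. str.capitalize() is avoided because it would also lowercase the rest of the line. The function names and the third sample line are invented for this example.

```python
def capitalize_first_letter(line: str) -> str:
    """Uppercase only the first character, leaving the rest untouched."""
    return line[:1].upper() + line[1:]


def capitalize_lines(text: str) -> str:
    # keepends=True preserves the original line terminators
    return "".join(
        capitalize_first_letter(line) for line in text.splitlines(keepends=True)
    )


# The last line starts with U+01DA (u with diaeresis and caron), an invented test case
sample = "the bear ate the fish.\nthe river was too fast.\n\u01da is a special letter.\n"
print(capitalize_lines(sample), end="")
```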