unicode-string

Clarifying Java's evolutionary support of Unicode [closed]

↘锁芯ラ submitted on 2019-12-01 07:45:46
Question: I'm finding Java's differentiation of char and code point to be strange and out of place. For example, a string is an array of characters, or "letters which appear in an alphabet", in contrast to a code point, which MAY be a single letter or possibly a composite or surrogate pair.
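The distinction the question raises can be made concrete by counting the same text two ways. The sketch below is in Python rather than Java, so it is only an illustration: it assumes that counting 16-bit units of the UTF-16 encoding mirrors what Java's String.length() reports, while Python's len() counts code points. The helper names and sample strings are hypothetical.

```python
def utf16_code_units(text: str) -> int:
    """Count 16-bit UTF-16 code units (the unit a Java char holds)."""
    return len(text.encode("utf-16-le")) // 2


def code_points(text: str) -> int:
    """Count Unicode code points; Python's len() does this directly."""
    return len(text)


# Hypothetical samples chosen to show where the two counts diverge.
samples = {
    "plain letter": "A",                 # U+0041, inside the BMP
    "precomposed e-acute": "\u00e9",     # one code point, one code unit
    "e + combining acute": "e\u0301",    # composite: two code points
    "musical G clef": "\U0001D11E",      # outside the BMP: surrogate pair
}

for name, text in samples.items():
    print(f"{name}: {code_points(text)} code point(s), "
          f"{utf16_code_units(text)} UTF-16 code unit(s)")
```

Only the last sample needs a surrogate pair, which is the case where a count of code units and a count of code points disagree.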

Arabic characters don't show in excel VBA code

谁说我不能喝 submitted on 2019-12-01 07:27:49
Question: I can't write Arabic strings in VBA code in Excel; they show up as weird characters. I tried it on many machines with Excel 2013 or 2010, on Windows 8 or Windows 7, with or without the Arabic proofing tools installed. The Arabic language is already installed on all machines, and the system locale is Arabic. There's no problem typing Arabic characters in Excel worksheets or even MS Word, but not in VBA code. Please help. Answer 1: In the VB Editor: 1. Click Tools. 2. Select Options... 3. Click Editor Format. 4. Change font

Python .split() without 'u

爱⌒轻易说出口 submitted on 2019-12-01 03:45:21
In Python, if I have a string like: a = " Hello - to - everybody" and I do a.split('-'), then I get [u'Hello', u'to', u'everybody']. This is just an example. How can I get a simple list without that annoying u'? The u means that it's a Unicode string; your original string must also have been a Unicode string. Generally it's a good idea to keep strings Unicode, as trying to convert to normal strings could potentially fail due to characters with no equivalent. The u is used purely to let you know it's a Unicode string in the representation; it will not affect the string itself. In general,
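The point that the prefix belongs to the representation and not to the string can be checked with a minimal sketch. It assumes Python 3, where every str is Unicode and the repr of a list no longer shows a u prefix; the variable names are illustrative, and the strip() call is an addition to drop the spaces around each hyphen.

```python
a = " Hello - to - everybody"

# split() returns Unicode strings; strip() removes the padding spaces.
parts = [piece.strip() for piece in a.split("-")]

print(parts)             # the list's repr, quotes included
print(", ".join(parts))  # the strings themselves, no quotes or prefix
```

Printing an individual element, or joining the elements, shows the text without any quoting or prefix, in Python 2 as well.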

What is std::wifstream::getline doing to my wchar_t array? It's treated like a byte array after getline returns

谁说胖子不能爱 submitted on 2019-12-01 01:31:12
I want to read lines of Unicode text (UTF-16 LE, line feed delimited) from a file. I'm using Visual Studio 2012 and targeting a 32-bit console application. I was not able to find a ReadLine function within WinAPI, so I turned to Google. It is clear I am not the first to seek such a function. The most commonly recommended solution involves using std::wifstream. I wrote code similar to the following: wchar_t buffer[1024]; std::wifstream input(L"input.txt"); while (input.good()) { input.getline(buffer, 1024); // ... do stuff... } input.close(); For the sake of explanation, assume that input.txt
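For comparison, here is a minimal sketch of the same task in Python, not a fix for the C++ code above. It assumes the file has no byte order mark and uses a hypothetical helper name and sample content; the point it illustrates is that the UTF-16 LE decoding has to be requested explicitly when the file is opened.

```python
import os
import tempfile


def read_utf16le_lines(path):
    """Read line-feed-delimited UTF-16 LE text and return the lines."""
    # newline="\n" makes the line feed the only line delimiter.
    with open(path, "r", encoding="utf-16-le", newline="\n") as handle:
        return [line.rstrip("\n") for line in handle]


# Build a small sample file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "input.txt")
    with open(sample_path, "wb") as handle:
        handle.write("first line\nsecond \u05d0\u05e0\u05d9\n".encode("utf-16-le"))
    lines = read_utf16le_lines(sample_path)

print(lines)
```

If the file starts with a byte order mark, opening it with encoding="utf-16" instead lets the codec consume the mark and detect the byte order.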

What is std::wifstream::getline doing to my wchar_t array? It's treated like a byte array after getline returns

天涯浪子 submitted on 2019-11-30 21:45:20
Question: I want to read lines of Unicode text (UTF-16 LE, line feed delimited) from a file. I'm using Visual Studio 2012 and targeting a 32-bit console application. I was not able to find a ReadLine function within WinAPI, so I turned to Google. It is clear I am not the first to seek such a function. The most commonly recommended solution involves using std::wifstream. I wrote code similar to the following: wchar_t buffer[1024]; std::wifstream input(L"input.txt"); while (input.good()) { input.getline

Automatically change between std::string and std::wstring according to unicode setting in MSVC++?

狂风中的少年 submitted on 2019-11-30 20:43:21
I'm writing a DLL and want to be able to switch between the Unicode and multibyte settings in MSVC++ 2010. For example, I use _T("string"), LPCTSTR, and WIN32_FIND_DATA instead of the -W and -A versions, and so on. Now I want to have strings that change between std::string and std::wstring according to the Unicode setting. Is that possible? Otherwise, this will probably end up getting extremely complicated. Why not do as the Win32 API does: use wide characters internally, and provide a character-converting facade of DoSomethingA functions which simply convert their input to Unicode.

Perl: printing Unicode strings to the Windows console

丶灬走出姿态 submitted on 2019-11-30 20:20:27
I am encountering a strange problem in printing Unicode strings to the Windows console*. Consider this text: אני רוצה לישון Intermediary היא רוצה לישון אתם, הם Bye Hello, world! test Assume it's in a file called "file.txt". When I go*: "type file.txt", it prints out fine. But when it's printed from a Perl program, like this: use strict; use warnings; use Encode; use 5.014; use utf8; use autodie; use warnings qw< FATAL utf8 >; use open qw< :std :utf8 >; use feature qw< unicode_strings >; use warnings 'all'; binmode STDOUT, ':utf8'; # output should be in UTF-8 my $word; my @array = ( 'אני רוצה

Python 3 - TypeError: a bytes-like object is required, not 'str'

北城以北 submitted on 2019-11-30 14:11:50
I'm working on a lesson from Udacity and am having some trouble finding out whether the result from this site returns true or false. I get the TypeError with the code below. from urllib.request import urlopen #check text for curse words def check_profanity(): f = urlopen("http://www.wdylike.appspot.com/?q=shit") output = f.read() f.close() print(output) if "b'true'" in output: print("There is a profane word in the document") check_profanity() The output prints b'true' and I'm not really sure where that 'b' is coming from. In Python 3, strings are Unicode by default. The b in b'true' means that
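The error can be reproduced without any network access. The sketch below assumes the response body is UTF-8 text and uses a literal bytes value as a stand-in for f.read(); the contains_true helper is hypothetical. Decoding the bytes to str first makes the membership test compare text with text.

```python
def contains_true(raw: bytes) -> bool:
    """Decode the raw response body (assumed UTF-8) and look for 'true'."""
    text = raw.decode("utf-8")
    return "true" in text


output = b"true"  # stand-in for what f.read() returns: bytes, not str

# Searching bytes for a str is what raises the TypeError in the question.
try:
    "b'true'" in output
except TypeError as error:
    print("TypeError:", error)

print(contains_true(output))
```

An alternative that avoids decoding is to compare bytes with bytes, for example b"true" in output.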

PHP - length of string containing emojis/special chars

江枫思渺然 submitted on 2019-11-30 13:57:16
I'm building an API for a mobile application and I seem to have a problem with counting the length of a string containing emojis. My code: $str = "👍🏿✌🏿️ @mention"; printf("strlen: %d" . PHP_EOL, strlen($str)); printf("mb_strlen UTF-8: %d" . PHP_EOL, mb_strlen($str, "UTF-8")); printf("mb_strlen UTF-16: %d" . PHP_EOL, mb_strlen($str, "UTF-16")); printf("iconv UTF-16: %d" . PHP_EOL, iconv_strlen(iconv("UTF-8", "UTF-16", $str))); printf("iconv UTF-16: %d" . PHP_EOL, iconv_strlen(iconv("ISO-8859-1", "UTF-16", $str))); the response of this is: strlen: 27 mb_strlen UTF-8: 14 mb_strlen UTF-16: 13
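The first two figures in that output can be reproduced with a short Python sketch. The escape sequence below is an assumption: it is a reconstruction of the emoji string as thumbs up, skin tone modifier, victory hand, skin tone modifier, and a variation selector, followed by " @mention". It yields the same byte and code point counts that the question reports.

```python
# Assumed code point sequence for the string in the question.
text = "\U0001F44D\U0001F3FF\u270C\U0001F3FF\uFE0F @mention"

utf8_bytes = len(text.encode("utf-8"))            # what PHP's strlen counts
code_point_count = len(text)                      # what mb_strlen(..., "UTF-8") counts
utf16_units = len(text.encode("utf-16-le")) // 2  # 16-bit units after a real conversion

print("UTF-8 bytes:", utf8_bytes)
print("code points:", code_point_count)
print("UTF-16 code units:", utf16_units)
```

None of these numbers is the count a user would perceive (two emoji plus nine other characters); that requires grapheme cluster segmentation, which Python's standard library does not provide.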

Changing the first letter of every line in a file to uppercase

前提是你 submitted on 2019-11-30 09:40:24
I need to change the first letter of every line in a file to uppercase, e.g. the bear ate the fish. the river was too fast. Would become: The bear ate the fish. The river was too fast. The document contains some special letters: a, a, á, à, ǎ, ā, b, c, d, e, e, é, è, ě, ē, f, g, h, i, i, í, ì, ǐ, ī, j, k, l, m, n, o, o, ó, ò, ǒ, ō, p, q, r, s, t, u, u, ú, ù, ǔ, ü, ǘ, ǜ, ǚ, ǖ, ū, v, w, x, y, and z. The uppercase forms of these letters are: A, A, Á, À, Ǎ, Ā, B, C, D, E, E, É, È, Ě, Ē, F, G, H, I, I, Í, Ì, Ǐ, Ī, J, K, L, M, N, O, O, Ó, Ò, Ǒ, Ō, P, Q, R, S, T, U, U, Ú, Ù, Ǔ, Ü, Ǘ, Ǜ, Ǚ, Ǖ, Ū, V, W
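One way to approach this is sketched below in Python. It assumes the text is already decoded to a Unicode string and that lines are separated by line feeds; the function name is hypothetical. It relies on str.upper(), which uses Unicode case mappings and so covers accented letters such as the ones listed.

```python
def capitalize_line_starts(text: str) -> str:
    """Uppercase the first character of every line, leaving the rest as is."""
    lines = text.split("\n")
    # line[:1] is an empty string for blank lines, so they pass through.
    return "\n".join(line[:1].upper() + line[1:] for line in lines)


sample = "the bear ate the fish.\nthe river was too fast.\n\u01d8 is a special letter."
print(capitalize_line_starts(sample))
```

Slicing is used instead of str.capitalize() because capitalize() would also lowercase the remainder of each line.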