unicode-literals

Unicode literals that work in python 3 and 2

空扰寡人 submitted on 2019-11-28 22:11:04
Question: So I have a Python script that I'd prefer worked on Python 3.2 and 2.7 just for convenience. Is there a way to have unicode literals that work in both? E.g. #coding: utf-8 whatever = 'שלום' The above code would require a unicode string in Python 2.x ( u'' ), and in Python 3.x that little u causes a syntax error.
Answer 1: Edit - Since Python 3.3, the u'' literal works again, so the u() function isn't needed. The best option is to make a method that creates unicode objects from string objects in
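A minimal sketch of the point made in the answer above, assuming Python 2.7 or Python 3.3 and later (on 3.0 to 3.2 the u'' prefix is rejected, as the question describes). The variable names and the text_type alias are illustrative and do not come from the original answer.

```python
# -*- coding: utf-8 -*-
# Assumes Python 2.7 or Python 3.3+; on 3.0-3.2 the u'' prefix is a syntax error.
import sys

# The same four-character Hebrew word, written directly and with \u escapes.
shalom_literal = u'שלום'
shalom_escaped = u'\u05e9\u05dc\u05d5\u05dd'

# The text type is called str on Python 3 and unicode on Python 2.
# The name `unicode` is only evaluated on Python 2, so this line is safe on 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821

print(shalom_literal == shalom_escaped)        # True
print(isinstance(shalom_literal, text_type))   # True
print(len(shalom_literal))                     # 4
```

The escaped form avoids any dependence on the source file's encoding declaration, which can be convenient when a file has to pass through tools that are not UTF-8 clean.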

How to write 3 bytes unicode literal in Java?

旧街凉风 submitted on 2019-11-28 14:17:51
I'd like to write unicode literal U+10428 in Java. http://www.marathon-studios.com/unicode/U10428/Deseret_Small_Letter_Long_I I tried with '\u10428' and it doesn't compile.
Because Java went full-out unicode when people thought 64K are enough for everyone (Where did one hear such before?), they started out with UCS-2 and later upgraded to UTF-16. But they never bothered to add an escape sequence for unicode characters outside the BMP. Thus, your only recourse is manually recoding to a UTF-16 surrogate pair and using two UTF-16 escapes. Your example codepoint U+10428 is "\uD801\uDC28". I used
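The manual recoding mentioned above follows the standard UTF-16 rule: subtract 0x10000, then split the remaining 20 bits into two 10-bit halves. A rough sketch of that arithmetic, written in Python only to show the calculation; the function names to_surrogate_pair and java_escape are hypothetical helpers, not part of any Java or Python API.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is inside the BMP or out of range")
    offset = code_point - 0x10000        # 20 significant bits remain
    high = 0xD800 + (offset >> 10)       # top 10 bits -> high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits -> low (trail) surrogate
    return high, low


def java_escape(code_point):
    """Format the pair as two \\uXXXX escapes usable in a Java string literal."""
    high, low = to_surrogate_pair(code_point)
    return "\\u{:04X}\\u{:04X}".format(high, low)


print(java_escape(0x10428))  # \uD801\uDC28
```

For U+10428 the offset is 0x428, which gives 0xD801 and 0xDC28, matching the escape pair quoted in the answer.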

Unicode code point escapes in regex literals - Javascript

纵然是瞬间 submitted on 2019-11-28 08:36:16
Question: Can this regex literal, which uses Unicode escape sequence syntax, var regpat= /^[\u0041-\u005A\u0061-\u007A\.\' \-]{2,15}/; be written using Unicode code point escape syntax (as shown below)? var regpat= /^[\u{41}-\u{5A}\u{61}-\u{7A}\u{1F4A9}\.\' \-]{2,15}/; Note: Unicode code point escapes are used to simplify the ES5-compatible surrogate pair syntax for representing code point values greater than FFFF.
Answer 1: Yes, according to the spec this is now a valid escape sequence, however in order to enable

How do I encode Unicode character codes in a PowerShell string literal?

不打扰是莪最后的温柔 submitted on 2019-11-27 19:36:57
How can I encode the Unicode character U+0048 (H), say, in a PowerShell string? In C# I would just do this: "\u0048", but that doesn't appear to work in PowerShell.
Shay Levy: Replace '\u' with '0x' and cast it to System.Char: PS > [char]0x0048 H You can also use the "$()" syntax to embed a Unicode character into a string: PS > "Acme$([char]0x2122) Company" AcmeT Company Where T is PowerShell's representation of the character for non-registered trademarks.
According to the documentation, PowerShell Core 6.0 adds support with this escape sequence: PS> "`u{0048}" H see https://docs.microsoft.com

How to write 3 bytes unicode literal in Java?

左心房为你撑大大i submitted on 2019-11-27 08:06:59
Question: I'd like to write unicode literal U+10428 in Java. http://www.marathon-studios.com/unicode/U10428/Deseret_Small_Letter_Long_I I tried with '\u10428' and it doesn't compile.
Answer 1: Because Java went full-out unicode when people thought 64K are enough for everyone (Where did one hear such before?), they started out with UCS-2 and later upgraded to UTF-16. But they never bothered to add an escape sequence for unicode characters outside the BMP. Thus, your only recourse is manually recoding to a

How do I encode Unicode character codes in a PowerShell string literal?

十年热恋 submitted on 2019-11-26 19:56:26
Question: How can I encode the Unicode character U+0048 (H), say, in a PowerShell string? In C# I would just do this: "\u0048", but that doesn't appear to work in PowerShell.
Answer 1: Replace '\u' with '0x' and cast it to System.Char: PS > [char]0x0048 H You can also use the "$()" syntax to embed a Unicode character into a string: PS > "Acme$([char]0x2122) Company" AcmeT Company Where T is PowerShell's representation of the character for non-registered trademarks.
Answer 2: According to the documentation,

Any gotchas using unicode_literals in Python 2.6?

谁都会走 submitted on 2019-11-26 19:20:52
We've already gotten our code base running under Python 2.6. In order to prepare for Python 3.0, we've started adding: from __future__ import unicode_literals into our .py files (as we modify them). I'm wondering if anyone else has been doing this and has run into any non-obvious gotchas (perhaps after spending a lot of time debugging).
Koba: The main source of problems I've had working with unicode strings is when you mix utf-8 encoded strings with unicode ones. For example, consider the following scripts. two.py # encoding: utf-8 name = 'helló wörld from two' one.py # encoding: utf-8 from _
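The answer's scripts are cut off above, so the following is not a reconstruction of them. It is only a hedged sketch of the general problem it names, mixing UTF-8 encoded byte strings with unicode text, and of one common way to avoid it: decode bytes once at the boundary. The helper to_text and the variable names are assumptions made for illustration.

```python
# -*- coding: utf-8 -*-
def to_text(value, encoding="utf-8"):
    """Return text, decoding byte strings with the given encoding (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Stand-in for a value arriving as UTF-8 encoded bytes, e.g. from a module
# that does not use unicode_literals on Python 2.
name_bytes = u"helló wörld from two".encode("utf-8")

# Decoding explicitly avoids the implicit ASCII decode that Python 2 attempts
# (and the TypeError that Python 3 raises) when bytes and text are concatenated.
greeting = to_text(name_bytes) + u" and one"

print(len(name_bytes), len(to_text(name_bytes)))  # 22 20
```

The two accented characters each take two bytes in UTF-8, which is why the byte string is two units longer than the decoded text.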

Any gotchas using unicode_literals in Python 2.6?

最后都变了- submitted on 2019-11-26 06:56:37
Question: We've already gotten our code base running under Python 2.6. In order to prepare for Python 3.0, we've started adding: from __future__ import unicode_literals into our .py files (as we modify them). I'm wondering if anyone else has been doing this and has run into any non-obvious gotchas (perhaps after spending a lot of time debugging).
Answer 1: The main source of problems I've had working with unicode strings is when you mix utf-8 encoded strings with unicode ones. For example, consider the