unicode

Why is /[\w-+]/ a valid regex but /[\w-+]/u invalid?

北战南征 submitted on 2021-01-29 11:20:57
Question: If I type /[\w-+]/ in the Chrome console, it accepts it, and I get a regex object I can use to test strings as usual. But if I type /[\w-+]/u , it says VM112:1 Uncaught SyntaxError: Invalid regular expression: /[\w-+]/: Invalid character class . In Firefox, /[\w-+]/ works fine, but if I type /[\w-+]/u in the console, it just goes to the next line as if I had typed an incomplete statement. If I try to force it to create the regex by running eval('/[\w-+]/u') , it tells me SyntaxError: invalid range in
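The behaviour described in this question comes down to how a character class treats a hyphen that sits between a class escape such as \w and another character. Without the u flag, JavaScript falls back to a legacy rule and reads the hyphen as a literal; with the u flag, that fallback is disabled and the pattern is rejected. The sketch below is not JavaScript: it uses Python's `re` module, a different engine that happens to apply the strict reading, to show the ambiguity and two unambiguous spellings. The helper name `compiles` is made up for this illustration.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# \w is a class, not a single character, so "\w-+" cannot be a range.
ambiguous = compiles(r"[\w-+]")

# Two unambiguous ways to include a literal hyphen in the class.
hyphen_last = compiles(r"[\w+-]")
hyphen_escaped = compiles(r"[\w\-+]")

# The rewritten class still matches a hyphen.
matches_hyphen = re.fullmatch(r"[\w+-]", "-") is not None

print(ambiguous, hyphen_last, hyphen_escaped, matches_hyphen)
```

Placing the hyphen last or escaping it removes the ambiguity, and the same two spellings should be accepted by JavaScript with or without the u flag.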

Copying emojis in text from MySQL to SQL Server

為{幸葍}努か submitted on 2021-01-29 10:46:02
Question: I am copying data from MySQL to SQL Server using a linked server. SELECT comment FROM openquery(my_linked_server, 'SELECT comment FROM search_data'); The text in the MySQL table column is xxx 🤘 xxx . By the time I receive it in SQL Server it is xxx 🤘 xxx . The MySQL table is utf8mb4 , and I have set up the ODBC config for the linked server to use this. I am using MySQL ODBC 5.3.13. Any advice would be appreciated. The SQL Server version is 2016. I have seen examples that use select N'🤘' etc,

What is an example for non unicode character set for -Dfile.encoding=?

試著忘記壹切 submitted on 2021-01-29 09:29:54
Question: I have a JVM where the character set is configured with "-Dfile.encoding=UTF-8" . I want to set it to a non-Unicode character set. Is there an example value for a non-Unicode character set that I can pass to -Dfile.encoding= ? Answer 1: [ TL;DR: Application encoding is a confusing issue, but this document from Oracle should help . ] First, a few important general points about specifying the encoding by setting the system property file.encoding at run time: Its use is not formally

Retaining special character while reading from html java?

↘锁芯ラ submitted on 2021-01-29 07:22:07
Question: I am trying to read an HTML source file that contains German characters like ä ö ü ß € . I read it with JSOUP using citAttr.nextElementSibling().text() and encode the string with unicodeEscaper.translate(citAttr.nextElementSibling().text()) from org.apache.commons.lang3.text.translate.UnicodeEscaper . The issue is that after reading, the characters turn into � , whereas reading a CSV with encoding type UTF-8 and saving and retrieving the characters with the same unicodeEscaper works fine. unicodeEscaper.translate(record.get

How to convert String index to character index in Dart

[亡魂溺海] submitted on 2021-01-29 06:55:04
Question: If I have an arbitrary String like this: final family = '\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}'; // 👨‍👩‍👧 final myString = 'Let me introduce my $family to you.'; And I know the String index of the character after the family emoji (the space) is 28 , how do I find the String index of the first code unit of the family emoji? In other words, how do I find the length in UTF-16 code units of the family emoji? I've asked a similar question before, but that was before the characters package
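The arithmetic behind this question can be checked outside Dart. The following is a minimal sketch in Python rather than Dart, so it does not show the `characters` package the question mentions. It counts UTF-16 code units by encoding the text as UTF-16 and halving the byte count; the helper name `utf16_len` is hypothetical.

```python
# The family emoji from the question: three emoji joined by zero-width joiners.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
my_string = f"Let me introduce my {family} to you."


def utf16_len(text: str) -> int:
    """Number of UTF-16 code units in text (two bytes per code unit)."""
    return len(text.encode("utf-16-le")) // 2


# Each emoji is outside the BMP and needs a surrogate pair (2 code units);
# each zero-width joiner is 1 code unit.
family_units = utf16_len(family)

# The question gives 28 as the UTF-16 index of the space after the emoji.
end_index = 28
start_index = end_index - family_units

# Cross-check against the text that precedes the emoji.
prefix_units = utf16_len(my_string.split(family)[0])

print(family_units, start_index, prefix_units)
```

Under these assumptions the emoji occupies 8 code units, so its first code unit sits at index 20, which agrees with the length of the prefix "Let me introduce my ".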

Windows console app stops printing when I switch to Edge

落爺英雄遲暮 submitted on 2021-01-29 06:45:12
Question: I have to write a console app to log the active window's PID, text length and text. It works except when I switch to Edge. The execution doesn't stop, but only the PID and text length get printed to the screen. Please help, I don't know what else to try. #include <iostream> #include <Windows.h> #include <WinUser.h> int main() { // Use environment's default locale for char type setlocale(LC_CTYPE, ""); std::cout << "Hello é World!\n"; while (1) { // Get foreground window HWND hwnd =

Matching Unicode letter characters in PCRE/PHP

不打扰是莪最后的温柔 submitted on 2021-01-29 06:36:11
Question: I'm trying to write a reasonably permissive validator for names in PHP, and my first attempt consists of the following pattern: // unicode letters, apostrophe, hyphen, space $namePattern = "/^([\\p{L}'\\- ])+$/"; This is eventually passed to a call to preg_match() . As far as I can tell, this works with your vanilla ASCII alphabet, but seems to trip up on spicier characters like Ă or 张. Is there something wrong with the pattern itself? Perhaps I'm expecting \p{L} to do more work than I think

output utf8 in console with Visual Studio (wide stream)

戏子无情 submitted on 2021-01-29 04:11:45
Question: This piece of code works if I compile it with mingw32 on Windows 10, and it emits the right result, as you can see below: C:\prj\cd>bin\main.exe 1°à€3§4ç5@の,は,でした,象形字 ; However, when I compile the same code with Visual Studio 17, it emits the wrong characters: /out:prova.exe prova.obj C:\prj\cd>prova.exe 1°à €3§4ç5@ã®,ã¯,ã§ã—ãŸ,象形字 ; C:\prj\cd> Here is the source code: #include <windows.h> #include <io.h> #include <fcntl.h> #include <stdio.h> #include <string> #include <iostream> int main ( void )

Unicode String in urllib.request [duplicate]

佐手、 submitted on 2021-01-29 03:52:09
Question: This question already has answers here : UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' - -when using urlib.request python3 (2 answers) Closed 1 year ago . The short version: I have a variable s = 'bär' . I need to convert s to ASCII so that s = 'b%C3%A4r' . Long version: I'm using urllib.request.urlopen() to read an mp3 pronunciation file from a URL. This has worked very well, except I ran into a problem because the URLs often contain Unicode characters. For example, the
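The conversion asked for in the short version is percent-encoding of the UTF-8 bytes, which the standard library's `urllib.parse.quote` performs. The sketch below shows it on the string from the question and then on a whole URL. The URL is a made-up placeholder, since the one in the question is cut off, and quoting only the path component is one possible approach rather than the accepted answer.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# The short version from the question: 'bär' becomes 'b%C3%A4r'.
s = "bär"
encoded = quote(s)  # UTF-8 is the default encoding for quote()

# Hypothetical URL for illustration only.
url = "https://example.com/audio/bär.mp3"

# Quote only the path, so the scheme and host separators stay untouched.
parts = urlsplit(url)
safe_url = urlunsplit(parts._replace(path=quote(parts.path)))

print(encoded)
print(safe_url)
```

The resulting ASCII-only URL can then be passed to urllib.request.urlopen(); quoting the entire URL string in one call would also escape the colon after the scheme, which is why the path is handled separately here.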