utf-16

How to convert a utf-8 string to a utf-16 string in PHP

随声附和 submitted on 2019-12-18 05:46:08
Question: How do I convert a UTF-8 string to a UTF-16 string in PHP?
Answer 1: mbstring supports UTF-16, so you can use mb_convert_encoding.
Answer 2: You could also use iconv. It's native in PHP, but it requires that all your text is in one charset; otherwise it could discard characters.
Source: https://stackoverflow.com/questions/155514/how-to-convert-a-utf-8-string-to-a-utf-16-string-in-php
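
The conversion both answers point to amounts to decoding the UTF-8 bytes and re-encoding them as UTF-16. The following is a minimal Python sketch of that transcoding step, not PHP code; the helper name `utf8_to_utf16` and its parameters are hypothetical. It also shows the one choice the caller has to make: byte order, and whether to prepend a byte-order mark (BOM).

```python
def utf8_to_utf16(data: bytes, byte_order: str = "le", with_bom: bool = False) -> bytes:
    """Transcode UTF-8 bytes to UTF-16 (hypothetical helper, for illustration only)."""
    # Decoding raises UnicodeDecodeError if the input is not valid UTF-8,
    # rather than silently discarding characters.
    text = data.decode("utf-8")
    codec = "utf-16-le" if byte_order == "le" else "utf-16-be"
    encoded = text.encode(codec)
    if with_bom:
        bom = b"\xff\xfe" if byte_order == "le" else b"\xfe\xff"
        encoded = bom + encoded
    return encoded


utf8_bytes = "h\u00e9llo".encode("utf-8")  # "héllo" as UTF-8
utf16_le = utf8_to_utf16(utf8_bytes)
utf16_be_bom = utf8_to_utf16(utf8_bytes, byte_order="be", with_bom=True)
```

Whichever library performs the conversion, the receiving side needs to know which byte order was chosen, which is what the BOM is for.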

Converting xml from UTF-16 to UTF-8 using PowerShell

好久不见. submitted on 2019-12-18 02:14:55
Question: What's the easiest way to convert XML from UTF-16 to a UTF-8 encoded file?
Answer 1: This may not be optimal, but it works. Simply load the XML and push it back out to a file. The XML heading is lost, though, so it has to be re-added.
    $files = get-ChildItem "*.xml"
    foreach ( $file in $files ) {
        [System.Xml.XmlDocument]$doc = new-object System.Xml.XmlDocument;
        $doc.set_PreserveWhiteSpace( $true );
        $doc.Load( $file );
        $root = $doc.get_DocumentElement();
        $xml = $root.get_outerXml();
        $xml = '<
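
The same load-and-rewrite idea can be sketched in Python with the standard library's `xml.etree.ElementTree`. This is a swapped-in illustration, not the PowerShell answer: the function name `xml_utf16_to_utf8` is hypothetical, and it assumes the input carries a BOM or an encoding declaration so the parser can detect UTF-16. Comments and processing instructions outside the root element are not preserved by this approach.

```python
import io
import xml.etree.ElementTree as ET


def xml_utf16_to_utf8(utf16_bytes: bytes) -> bytes:
    """Parse UTF-16 XML bytes and serialize them again as UTF-8 (illustrative sketch)."""
    # The parser detects the input encoding from the BOM / XML declaration.
    root = ET.fromstring(utf16_bytes)
    out = io.BytesIO()
    # xml_declaration=True writes a fresh declaration naming the new encoding,
    # so the heading does not have to be re-added by hand.
    ET.ElementTree(root).write(out, encoding="utf-8", xml_declaration=True)
    return out.getvalue()


source = '<?xml version="1.0" encoding="utf-16"?><root>h\u00e9llo</root>'.encode("utf-16")
converted = xml_utf16_to_utf8(source)
```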

How to convert string to unicode(UTF-8) string in Swift?

北城余情 submitted on 2019-12-17 19:48:43
Question: How do I convert a string to a Unicode (UTF-8) string in Swift? In Objective-C I could write something like this:
    NSString *str = [[NSString alloc] initWithUTF8String:[strToDecode cStringUsingEncoding:NSUTF8StringEncoding]];
How can I do something similar in Swift?
Answer 1: Use this code:
    let str = String(UTF8String: strToDecode.cStringUsingEncoding(NSUTF8StringEncoding))
Hope it's helpful.
Answer 2: Swift 4. I have created a String extension:
    func utf8DecodedString()-> String {
        let data = self.data(using: .utf8)
        if let

Emoji value range

旧城冷巷雨未停 submitted on 2019-12-17 16:21:45
Question: I was trying to strip all emoji characters out of a string (like a sanitizer), but I cannot find a complete set of emoji values. What is the complete set of emoji characters' UTF-16 values?
Answer 1: The Unicode standard's Unicode® Technical Report #51 includes a list of emoji (emoji-data.txt):
    ...
    21A9 ; text ; L1 ; none ; j # V1.1 (↩) LEFTWARDS ARROW WITH HOOK
    21AA ; text ; L1 ; none ; j # V1.1 (↪) RIGHTWARDS ARROW WITH HOOK
    231A ; emoji ; L1 ; none ; j # V1.1 (⌚) WATCH
    231B ; emoji ; L1 ; none ; j # V1
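
A list in this shape can be turned into a sanitizer fairly directly. The sketch below is in Python and assumes only what the excerpt shows: each data line starts with a single hexadecimal code point followed by `;`, and `#` starts a comment. The function names are hypothetical, the sample holds just the three complete lines from the excerpt, and lines listing multi-code-point sequences are not handled.

```python
SAMPLE = """\
21A9 ; text ; L1 ; none ; j # V1.1 (\u21a9) LEFTWARDS ARROW WITH HOOK
21AA ; text ; L1 ; none ; j # V1.1 (\u21aa) RIGHTWARDS ARROW WITH HOOK
231A ; emoji ; L1 ; none ; j # V1.1 (\u231a) WATCH
"""


def parse_code_points(data: str) -> set:
    """Collect the code point in the first field of each data line."""
    code_points = set()
    for line in data.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        first_field = line.split(";", 1)[0].strip()
        code_points.add(int(first_field, 16))
    return code_points


def strip_listed(text: str, code_points: set) -> str:
    """Remove every character whose code point is in the given set."""
    return "".join(ch for ch in text if ord(ch) not in code_points)


listed = parse_code_points(SAMPLE)
cleaned = strip_listed("back\u21a9 time\u231a", listed)
```

Working on code points rather than UTF-16 values sidesteps the question of surrogate pairs; the set can be converted to UTF-16 code units afterwards if the target language needs them.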

Why does .NET use UTF-16 encoding for string, but UTF-8 as the default for saving files?

断了今生、忘了曾经 submitted on 2019-12-17 15:32:06
Question: From here: "Essentially, string uses the UTF-16 character encoding form." But when saving with StreamWriter: "This constructor creates a StreamWriter with UTF-8 encoding without a Byte-Order Mark (BOM)." I've seen this sample (broken link removed), and it looks like UTF-8 is smaller for some strings while UTF-16 is smaller for others. So why does .NET use UTF-16 as the default encoding for string but UTF-8 for saving files? Thank you. P.S. I've already read the famous article.
Answer 1: If you're happy
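
The size observation in the question is easy to check. This Python sketch (not .NET code; the helper name `encoded_sizes` and the sample strings are assumptions chosen for illustration) compares the byte length of the same text in both encodings, without a BOM.

```python
def encoded_sizes(text: str) -> tuple:
    """Return (UTF-8 byte count, UTF-16 byte count) for the given text, without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


ascii_sizes = encoded_sizes("hello")                # 1 byte per character in UTF-8, 2 in UTF-16
cjk_sizes = encoded_sizes("\u65e5\u672c\u8a9e")     # "日本語": 3 bytes per character in UTF-8, 2 in UTF-16
```

ASCII-heavy text is smaller in UTF-8, while text made mostly of characters in the U+0800 to U+FFFF range is smaller in UTF-16.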

Any good solutions for C++ string code point and code unit?

别来无恙 submitted on 2019-12-17 12:41:03
Question: In Java, a String has the methods length()/charAt() and codePointCount()/codePointAt(). C++11 has
    std::string a = u8"很烫烫的一锅汤";
but a.size() is the length of the char array, so it cannot index Unicode characters. Are there any solutions for Unicode in C++ strings?
Answer 1: I generally convert the UTF-8 string to a wide UTF-32/UCS-2 string before doing character operations. C++ does actually give us functions to do that, but they are not very user friendly, so I have written some nicer conversion functions here:
    // This
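
The approach in the answer, converting to a fixed-width form before indexing, can be illustrated in Python. This is a sketch of the idea only, not the answer's C++ functions: it takes the UTF-8 bytes a `std::string` would hold, transcodes them to UTF-32, and unpacks the result into one integer per code point.

```python
import struct

# The UTF-8 bytes that the std::string in the question would contain.
utf8_bytes = "\u5f88\u70eb\u70eb\u7684\u4e00\u9505\u6c64".encode("utf-8")  # "很烫烫的一锅汤"

# Transcode to UTF-32 (little-endian, no BOM): exactly 4 bytes per code point.
utf32_bytes = utf8_bytes.decode("utf-8").encode("utf-32-le")

# Unpack into a tuple of integers, one per code point, which can be indexed directly.
code_points = struct.unpack("<%dI" % (len(utf32_bytes) // 4), utf32_bytes)

byte_length = len(utf8_bytes)          # what a.size() reports: UTF-8 code units
code_point_count = len(code_points)    # number of Unicode code points
third_character = chr(code_points[2])  # random access by code point index
```

The trade-off is memory: the fixed-width copy uses four bytes per code point regardless of the character.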

JavaScript strings - UTF-16 vs UCS-2?

做~自己de王妃 submitted on 2019-12-17 09:21:36
Question: I've read in some places that JavaScript strings are UTF-16, and in other places that they're UCS-2. I did some searching to figure out the difference and found this:
    Q: What is the difference between UCS-2 and UTF-16?
    A: UCS-2 is obsolete terminology which refers to a Unicode implementation up to Unicode 1.1, before surrogate code points and UTF-16 were added to Version 2.0 of the standard. This term should now be avoided. UCS-2 does not define a distinct data format, because UTF-16
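
The surrogate code points mentioned in the quoted FAQ are how UTF-16 represents code points above U+FFFF, which do not fit in a single 16-bit unit. A minimal Python sketch of that mapping follows; the function name `to_utf16_code_units` is hypothetical, and the arithmetic is the standard UTF-16 encoding rule.

```python
def to_utf16_code_units(code_point: int) -> list:
    """Return the UTF-16 code unit(s) that encode a single Unicode code point."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode scalar value")
    if code_point < 0x10000:
        # Basic Multilingual Plane: one 16-bit unit, numerically equal to the code point.
        return [code_point]
    # Supplementary planes: split the 20-bit offset into a high and a low surrogate.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return [high, low]


bmp_units = to_utf16_code_units(0x231A)      # WATCH, inside the BMP
astral_units = to_utf16_code_units(0x1F600)  # a code point above U+FFFF
```

Code that treats each 16-bit unit as a whole character sees the second case as two separate values.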

How to solve “unable to switch the encoding” error when inserting XML into SQL Server

萝らか妹 submitted on 2019-12-17 05:52:25
Question: I'm trying to insert into an XML column (SQL Server 2008 R2), but the server complains:
    System.Data.SqlClient.SqlException (0x80131904): XML parsing: line 1, character 39, unable to switch the encoding
I found out that the XML column has to be UTF-16 for the insert to succeed. The code I'm using is:
    XmlSerializer serializer = new XmlSerializer(typeof(MyMessage));
    StringWriter str = new StringWriter();
    serializer.Serialize(str, message);
    string messageToLog = str.ToString();
How can

Python UTF-16 CSV reader

元气小坏坏 submitted on 2019-12-17 05:12:54
Question: I have a UTF-16 CSV file which I have to read. The Python csv module does not seem to support UTF-16. I am using Python 2.7.2. The CSV files I need to parse are huge, running into several GB of data.
Answers to John Machin's questions below:
    print repr(open('test.csv', 'rb').read(100))
Output with test.csv having just abc as content:
    '\xff\xfea\x00b\x00c\x00'
I think the CSV file was created on a Windows machine in the USA. I am using Mac OS X Lion. If I use the code provided by phihag and test.csv containing one
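
For comparison, here is how the same task looks on Python 3, where the csv module reads from text-mode files. This is a sketch under that assumption and does not apply to the Python 2.7.2 setup in the question; the function name `iter_utf16_csv` is hypothetical. Rows are yielded one at a time, so a multi-gigabyte file is never held in memory at once.

```python
import csv
import os
import tempfile


def iter_utf16_csv(path):
    """Yield rows from a UTF-16 CSV file one at a time (Python 3 only)."""
    # The "utf-16" codec reads the BOM to pick the byte order;
    # newline="" lets the csv module handle line endings itself.
    with open(path, encoding="utf-16", newline="") as handle:
        yield from csv.reader(handle)


# Small demonstration file written with a BOM, like the sample bytes in the question.
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write("a,b\r\n1,2\r\n".encode("utf-16"))

rows = list(iter_utf16_csv(tmp.name))
os.remove(tmp.name)
```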

Writing utf16 to file in binary mode

馋奶兔 submitted on 2019-12-17 04:29:05
Question: I'm trying to write a wstring to a file with ofstream in binary mode, but I think I'm doing something wrong. This is what I've tried:
    ofstream outFile("test.txt", std::ios::out | std::ios::binary);
    wstring hello = L"hello";
    outFile.write((char *) hello.c_str(), hello.length() * sizeof(wchar_t));
    outFile.close();
Opening test.txt in, for example, Firefox with the encoding set to UTF-16 shows:
    h�e�l�l�o�
Could anyone tell me why this happens?
EDIT: Opening the file in a hex editor I get:
    FF FE
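
One plausible explanation, offered here as an assumption since the answer is not part of this excerpt, is that `wchar_t` is 4 bytes wide on the asker's platform, so the raw bytes written are UTF-32 rather than UTF-16. The Python sketch below simulates that on a little-endian layout; the helper name `raw_wchar_bytes` is hypothetical.

```python
def raw_wchar_bytes(text: str, wchar_size: int) -> bytes:
    """Simulate dumping a wstring's memory on a little-endian platform.

    Assumption: 2-byte wchar_t holds UTF-16 units, 4-byte wchar_t holds UTF-32 units.
    """
    codec = {2: "utf-16-le", 4: "utf-32-le"}[wchar_size]
    return text.encode(codec)


four_byte_dump = raw_wchar_bytes("hello", 4)
# Reading UTF-32 bytes as UTF-16 turns the two padding bytes after each
# letter into an extra U+0000 character.
misread = four_byte_dump.decode("utf-16-le")

two_byte_dump = raw_wchar_bytes("hello", 2)
read_back = two_byte_dump.decode("utf-16-le")
```

Under this assumption every letter is followed by a NUL character when the file is interpreted as UTF-16, which matches the alternating pattern in the browser output.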