utf-8 | 易学教程

How to change the CP_ACP(0) of windows ANSI apis in an application?

Question: I am trying to draw text using a DLL that exposes only ANSI interfaces, which wrap the Windows ANSI APIs, but I need to store my string data as UTF-8. I don't want to convert strings with the MultiByte/WideChar functions, so I am looking for a way to change CP_ACP within my application so that I can pass my string data directly to the ANSI APIs. Thanks. PS: I don't want to change the system default code page.

Answer 1: CP_ACP represents the system ANSI code page. You cannot change that on a per-process or per-thread

How to save pdf in proper encoding via nodejs

Question: I'm trying to download a PDF file from a website with my script, but the file gets corrupted in the process, and I'm fairly sure the cause is the wrong encoding being used. I'm using the request library to download the file, and I've set the Content-type header to application/pdf. My code is pretty simple:
var fs = require('fs');
var request = require("request");
request({uri: 'xxxxxxxxxxxxxx.pdf', headers: { 'Content-type' : 'applcation/pdf' }} , function (error, response, body) { if (

Decode string with hex characters in python 2

Question: I have a string containing hex escapes and I want to convert it to UTF-8 so that I can insert it into MySQL (my database is utf8).
hex_string = 'kitap ara\xfet\xfdrmas\xfd'
...
result = 'kitap araştırması'
How can I do that?

Answer 1: Assuming Python 2.6:
>>> print('kitap ara\xfet\xfdrmas\xfd'.decode('iso-8859-9'))
kitap araştırması
>>> 'kitap ara\xfet\xfdrmas\xfd'.decode('iso-8859-9').encode('utf-8')
'kitap ara\xc5\x9ft\xc4\xb1rmas\xc4\xb1'

Answer 2: Try (Python 3.x):
import codecs
codecs.decode("707974686f6e2d666f72756d2e696f", "hex").decode(
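For readers on Python 3, the conversion in Answer 1 could be sketched roughly as follows. It assumes the input arrives as a byte string encoded in ISO-8859-9 (Turkish), as Answer 1 does; the function name `latin5_to_utf8` is hypothetical and used only for illustration.

```python
def latin5_to_utf8(raw: bytes) -> bytes:
    """Re-encode ISO-8859-9 bytes as UTF-8 (assumes the source encoding is known)."""
    text = raw.decode("iso-8859-9")   # bytes -> str, interpreting 0xFE as 'ş' and 0xFD as 'ı'
    return text.encode("utf-8")       # str -> UTF-8 bytes, ready for a utf8 database column

raw = b"kitap ara\xfet\xfdrmas\xfd"
utf8_bytes = latin5_to_utf8(raw)
print(utf8_bytes)
```

In Python 3 the `str` type has no `decode` method, so the escapes have to live in a `bytes` literal (note the `b` prefix) before they can be decoded.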

Case insensitive utf8 select

Question: In SQLite I want a case-insensitive "SELECT LIKE name". It works fine for ordinary Latin names, but when the name is in UTF-8 with non-Latin characters the select becomes case-sensitive. How can I make it case-insensitive for those characters as well? P.S. My SQLite is v3 and I connect with PHP PDO.

Answer 1: For SQLite you have 2 options:
1. compile it with ICU support: How to compile, Compilation options
2. override the LIKE function, here is a complete solution (from http://blog.amartynov.ru/?p=675)
$pdo =
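The second option, overriding the LIKE function, can be illustrated with a minimal sketch. This is not the PHP solution the answer links to (that code is cut off in this excerpt); it is an analogous example using Python's standard `sqlite3` module. The function name `unicode_like`, the table `people`, and the sample data are all hypothetical, and the sketch does not handle the three-argument form used by `LIKE ... ESCAPE`.

```python
import re
import sqlite3

def unicode_like(pattern, value):
    """Unicode-aware, case-insensitive LIKE.

    SQLite evaluates `value LIKE pattern` by calling like(pattern, value).
    """
    if pattern is None or value is None:
        return None  # LIKE with NULL yields NULL
    # Translate SQL wildcards to a regular expression, escaping everything else.
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in str(pattern)
    )
    matched = re.fullmatch(regex, str(value), re.IGNORECASE | re.DOTALL)
    return 1 if matched else 0

conn = sqlite3.connect(":memory:")
conn.create_function("like", 2, unicode_like)  # replaces the built-in two-argument like()
conn.execute("CREATE TABLE people (name TEXT)")
conn.execute("INSERT INTO people VALUES (?)", ("Москва",))
rows = conn.execute(
    "SELECT name FROM people WHERE name LIKE ?", ("мос%",)
).fetchall()
print(len(rows))
```

A function registered this way applies only to the connection it was created on, so it has to be registered again after every reconnect.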

Removing control characters from a UTF-8 string

Question: I found this question, but its solution also removes all valid UTF-8 characters (it returns a blank string, even though the input contains valid UTF-8 characters as well as control characters). From what I have read about UTF-8, there is no specific range for control characters, and each character set has its own control characters. How can I modify the above solution so that it removes only control characters?

Answer 1: I think the following code will work for you:
public static string RemoveControlCharacters(string inString) { if (inString == null
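The answer's C# code is cut off in this excerpt, so the following is only an analogous sketch in Python, not a reconstruction of it. It assumes "control character" means the Unicode general category `Cc`, which is an assumption about the asker's intent; the function name mirrors the one in the answer.

```python
import unicodedata

def remove_control_characters(text: str) -> str:
    """Drop characters in Unicode category Cc; keep all other characters."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cc")

cleaned = remove_control_characters("a\x00b\tc\u0085\u00e9")
print(ascii(cleaned))
```

Category `Cc` also covers tab, carriage return and line feed, so a caller who wants to keep line structure would need to exempt those explicitly.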

ï»¿ encoding issue

Question: I'm developing a website using PHP, and the strange characters "ï»¿" appear at the very top of my page. My code is this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><?php echo '';?> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
But when I view the source code in the browser, it shows this:
ï»¿<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1
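The three characters "ï»¿" are what the UTF-8 byte order mark (bytes EF BB BF) looks like when decoded as ISO-8859-1, the charset declared in the meta tag. Whether a BOM at the start of the PHP file is really the cause here is an assumption, since the excerpt ends before any answer. The sketch below, in Python rather than PHP, shows the effect and one way to strip the mark; `strip_utf8_bom` is a hypothetical helper name.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# The BOM bytes, misread as ISO-8859-1, produce the three stray characters.
mojibake = codecs.BOM_UTF8.decode("iso-8859-1")
print(ascii(mojibake))

page = codecs.BOM_UTF8 + b"<!DOCTYPE html>"
print(strip_utf8_bom(page))
```

In practice the usual remedy is to save the source file as UTF-8 without a BOM in the editor, so the mark is never sent before the DOCTYPE.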

Conversion in .net: Native Utf-8 <-> Managed String

Question: I created these two methods to convert native UTF-8 strings (char*) into managed strings and vice versa. The following code does the job:
public IntPtr NativeUtf8FromString(string managedString) {
    byte[] buffer = Encoding.UTF8.GetBytes(managedString); // not null terminated
    Array.Resize(ref buffer, buffer.Length + 1);
    buffer[buffer.Length - 1] = 0; // terminating 0
    IntPtr nativeUtf8 = Marshal.AllocHGlobal(buffer.Length);
    Marshal.Copy(buffer, 0, nativeUtf8, buffer.Length);
    return nativeUtf8;
}

Swift 3 method to create utf8 encoded Data from String

Question: I know there are a bunch of pre-Swift 3 questions regarding NSData. I'm curious how to go from a Swift 3 String to a UTF-8 encoded (with or without null termination) Swift 3 Data object. The best I've come up with so far is:
let input = "Hello World"
let terminatedData = Data(bytes: Array(input.nulTerminatedUTF8))
let unterminatedData = Data(bytes: Array(input.utf8))
Having to do the intermediate Array() construction seems wrong.

Answer 1: It's simple:
let input = "Hello World"
let data =

C++ iterate or split UTF-8 string into array of symbols?

Question: I am searching for a platform-independent, third-party-library-independent way of iterating over a UTF-8 string or splitting it into an array of UTF-8 symbols. Please post a code snippet. Solved: C++ iterate or split UTF-8 string into array of symbols?

Answer 1: If I understand correctly, it sounds like you want to find the start of each UTF-8 character. If so, then it would be fairly straightforward to parse them (interpreting them is a different matter). But the definition of how many octets are involved is well
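The idea in Answer 1, finding the start of each character from its lead byte, can be sketched as below. The question asks for C++; this sketch is in Python and works on raw bytes, so the same logic carries over to a C++ loop over `unsigned char`. The names `utf8_sequence_length` and `split_utf8` are hypothetical, and the sketch checks only lead and continuation byte patterns, not overlong or otherwise invalid encodings.

```python
def utf8_sequence_length(lead: int) -> int:
    """Number of bytes in a UTF-8 sequence, derived from its lead byte."""
    if lead < 0x80:            # 0xxxxxxx: single-byte (ASCII)
        return 1
    if lead >> 5 == 0b110:     # 110xxxxx: two-byte sequence
        return 2
    if lead >> 4 == 0b1110:    # 1110xxxx: three-byte sequence
        return 3
    if lead >> 3 == 0b11110:   # 11110xxx: four-byte sequence
        return 4
    raise ValueError(f"invalid UTF-8 lead byte: {lead:#04x}")

def split_utf8(data: bytes) -> list:
    """Split UTF-8 encoded bytes into a list of per-character byte strings."""
    symbols = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        chunk = data[i:i + n]
        # Every byte after the lead byte must look like 10xxxxxx.
        if len(chunk) != n or any(b >> 6 != 0b10 for b in chunk[1:]):
            raise ValueError(f"truncated or malformed sequence at offset {i}")
        symbols.append(chunk)
        i += n
    return symbols

symbols = split_utf8("a\u00e9\u20ac\U0001F600".encode("utf-8"))
print(symbols)
```

Each element is one encoded code point, which is not always one user-perceived character: combining marks and emoji sequences span several code points.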