utf | 易学教程

Draw string with normalized scientific notation (superscripted)

Question: I want to draw the following string in my game: "To compare, there are 10^75 particles in the universe.", where 10^75 is formatted in normalized scientific notation, with the exponent 75 superscripted (as we did in school). I use the SpriteBatch.DrawString method, but I cannot figure out a nice solution. There are a few trivial ones: draw two strings, where the second string has a smaller font or is scaled; or draw an image. I've been looking at UTF tables, but it seems it's not possible. Do I have to have special font for this task
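
One approach that stays within a single string is to map the exponent's digits to the Unicode superscript characters. The sketch below is only an illustration in Python, not the XNA/C# code the question is about; the helper name `with_superscript_exponent` is hypothetical, and it assumes the font used for drawing actually contains the superscript glyphs, which the question does not confirm.

```python
# Unicode superscript forms of the digits 0-9 and the signs + and -.
# Note that 1, 2 and 3 live in Latin-1 (U+00B9, U+00B2, U+00B3), the rest in U+2070..U+207B.
SUPERSCRIPTS = str.maketrans(
    "0123456789+-",
    "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079\u207a\u207b",
)


def with_superscript_exponent(base: int, exponent: int) -> str:
    """Return base followed by the exponent written in superscript characters."""
    return f"{base}{str(exponent).translate(SUPERSCRIPTS)}"


# Builds the sentence from the question with 75 as superscript digits.
message = (
    f"To compare, there are {with_superscript_exponent(10, 75)} "
    "particles in the universe."
)
print(message)
```

If the font lacks those glyphs, the renderer will typically show a fallback character or fail, so the two-string or image options from the question would still be needed.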

On Unicode Encoding: A Brief Explanation of UCS, UTF, BMP, BOM, and Related Terms

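Two questions motivated this article. Question 1: Notepad's "Save As" on Windows can convert between GBK, Unicode, Unicode big endian, and UTF-8. These are all txt files, so how does Windows identify the encoding? I noticed long ago that txt files encoded as Unicode, Unicode big endian, and UTF-8 have a few extra bytes at the start: FF FE (Unicode), FE FF (Unicode big endian), and EF BB BF (UTF-8). But what standard are these markers based on? Question 2: I recently came across ConvertUTF.c online, which implements conversion among UTF-32, UTF-16, and UTF-8. I was already familiar with encodings such as Unicode (UCS2), GBK, and UTF-8, but this program confused me: I could not recall how UTF-16 and UCS2 are related. After looking up the relevant material I finally sorted these questions out, and learned some details of Unicode along the way. I have written them up as an article for anyone who has had similar doubts. I have tried to keep it easy to follow, but readers are expected to know what a byte is and what hexadecimal is. ###0. big endian and little endian Big endian and little endian are different ways in which a CPU handles multi-byte numbers. For example, the Unicode code of the character "汉" is 6C49. When it is written to a file, should 6C be written first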
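
The byte-order question and the three leading-byte markers mentioned above can be checked with a small sketch. It uses Python's standard `codecs` constants; the function name `sniff_bom` and the labels it returns are illustrative choices, not anything Notepad or Windows exposes, and it only checks the three markers listed in the excerpt.

```python
import codecs
from typing import Optional

# "汉" is U+6C49: big endian writes 6C first, little endian writes 49 first.
han = "\u6c49"
big_endian_bytes = han.encode("utf-16-be")      # bytes 6C 49
little_endian_bytes = han.encode("utf-16-le")   # bytes 49 6C

# The three markers from the text, paired with the label Notepad uses for them.
BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),                     # EF BB BF
    (codecs.BOM_UTF16_LE, "Unicode"),               # FF FE
    (codecs.BOM_UTF16_BE, "Unicode big endian"),    # FE FF
]


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the label for a leading byte order mark, or None if there is none."""
    for bom, label in BOMS:
        if data.startswith(bom):
            return label
    return None


print(big_endian_bytes.hex(), little_endian_bytes.hex())
print(sniff_bom(codecs.BOM_UTF16_LE + little_endian_bytes))
```

A file without any marker returns `None` here, in which case the encoding has to be determined some other way.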

How do I generate keyboard events that don't have key code in Java?

Question: I'm using the Robot class and KeyEvent key codes to generate all the other key events and they work fine, but I also need the Hangul key (which toggles the Korean keyboard). Apparently KeyEvent does not have a key code for this key, so I'm stuck :( Is there a way to generate this Hangul key event? Is there a way to use the Windows key code like VK_HANGUL (0x15) instead of the KeyEvent key codes? If that's possible, changing all the key codes wouldn't be a problem... Or somehow take the key event once and store

Content is not allowed in prolog

Question: I'm trying to convert XML to HTML using XSLT. I'm using javax.xml.transform to do this in Java. It was working fine until I bumped into some XML that produced the following error: [Fatal Error] :1:1: Content is not allowed in prolog. javax.xml.transform.TransformerConfigurationException: javax.xml.transform.TransformerConfigurationException: javax.xml.transform.TransformerException: org.xml.sax.SAXParseException: Content is not allowed in prolog. So I made sure there is no character before the xml

Persist UTF-8 as Default Encoding

I tried to persist UTF-8 as the default encoding in Python. I tried: >>> import sys >>> sys.getdefaultencoding() 'ascii' And I also tried: >>> import sys >>> reload(sys) <module 'sys' (built-in)> >>> sys.setdefaultencoding('UTF8') >>> sys.getdefaultencoding() 'UTF8' >>> But after closing the session and opening a new session, the following was the result: >>> import sys >>> sys.getdefaultencoding() 'ascii' How can I persist my changes? (I know that it's not always a good idea to change to UTF-8. It's in a Docker container of Python). I know it's possible. I saw someone who has UTF-8 as his

Git can't diff or merge .cs file in UTF-16 encoding

Question: A friend and I were working on the same .cs file at the same time, and when there's a merge conflict Git points out there's a conflict, but the file isn't loaded with the usual "HEAD" ">>>" stuff because the .cs files were binary files. So we added numerous things (*.cs text and so on) to our .gitattributes file to make Git treat it as a text file, which didn't work. That's when we realized that Git could diff other .cs files and just not this one. The reason for that is because it's in unicode
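
One possible workaround, not necessarily the one the original answer recommends, is to re-encode the offending file from UTF-16 to UTF-8 so that it looks like the other .cs files. The sketch below assumes the file starts with a UTF-16 byte order mark, as Windows tools usually write; the function name `reencode_utf16_to_utf8` is hypothetical.

```python
from pathlib import Path


def reencode_utf16_to_utf8(path: Path) -> None:
    """Rewrite a UTF-16 file in place as UTF-8 without a byte order mark.

    Working on bytes (rather than text mode) leaves the line endings untouched.
    The "utf-16" codec reads the byte order from the leading BOM and drops it.
    """
    text = path.read_bytes().decode("utf-16")
    path.write_bytes(text.encode("utf-8"))
```

After converting, the file would need to be committed once so that later diffs and merges compare UTF-8 against UTF-8.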

HTML Unicode Issue: How to display special characters

Currently, I have my webpage set to Unicode/UTF-8. When trying to display a special character (for example, em dash, double arrow, etc.), it shows up as a question mark symbol. I cannot change these characters to the HTML entity equivalent. How can I circumvent this issue? A question mark in a lozenge, �, indicates a character-level error: the data contains bytes that do not represent any character, according to the character encoding being applied. This typically happens when the document is declared as UTF-8 encoded but is really in iso-8859-1, windows-1252, or some similar encoding. Windows
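
The mismatch described above can be reproduced in a few lines. This is a minimal sketch that models a browser applying UTF-8 to bytes that were really saved as windows-1252; the helper name `decode_as_utf8` is illustrative, and real browsers may differ in details of error recovery.

```python
def decode_as_utf8(data: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for bytes that are not valid UTF-8."""
    return data.decode("utf-8", errors="replace")


em_dash = "\u2014"

# The same character stored under two different encodings.
saved_as_cp1252 = em_dash.encode("windows-1252")   # the single byte 0x97
saved_as_utf8 = em_dash.encode("utf-8")            # the bytes E2 80 94

# A lone 0x97 is not valid UTF-8, so it becomes the replacement character.
broken = decode_as_utf8(saved_as_cp1252)
# Bytes that really are UTF-8 decode back to the em dash.
intact = decode_as_utf8(saved_as_utf8)

print(repr(broken), repr(intact))
```

The fix follows from the same picture: either save the file as UTF-8 so the bytes match the declaration, or declare the encoding the file is actually in.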

Character Sets and Encodings (4): Unicode

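Note: because keeping two copies in sync is a hassle, further changes and adjustments can be found in the character set encoding and mojibake series on my website, xiaogd.net; the character set encoding series and the series tracing the origins of mojibake have been merged there, and updates and errata will no longer be posted here. Earlier articles mentioned Unicode quite often but never covered its various aspects systematically, so this article is devoted to Unicode. Unicode is of course a huge topic, and only some of the important aspects are picked out here, so omissions are inevitable. What is Unicode? According to the official Unicode site, Unicode is short for the Unicode Standard, so Unicode means the Unicode Standard. According to the wiki, it is a computing industry standard. The figure below is a screenshot from http://www.unicode.org/standard/WhatIsUnicode.html, in which I have combined the Chinese and English versions. This so-called unique number is what Unicode calls a code point. What is a code point in Unicode? A character set is often also called a "coded character set" (coded charset); the "coded" here is different from the "encoding" in "character set encoding" (charset encoding). One is "code" and the other is "encode"; both can be rendered in Chinese as "编码", but translating coded charset as "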

Content is not allowed in prolog

I'm trying to convert XML to HTML using XSLT. I'm using javax.xml.transform to do this in Java. It was working fine until I bumped into some XML that produced the following error: [Fatal Error] :1:1: Content is not allowed in prolog. javax.xml.transform.TransformerConfigurationException: javax.xml.transform.TransformerConfigurationException: javax.xml.transform.TransformerException: org.xml.sax.SAXParseException: Content is not allowed in prolog. So I made sure there is no character before the XML declaration. I even took care of the BOM using the solution http://forums.sun.com/thread.jspa?messageID

Powershell and UTF-8

I have an html file test.html created with atom which contains: Testé encoding utf-8 When I read it with Powershell console (I'm using French Windows) Get-Content -Raw test.html I get back this: TestÃ© encoding utf-8 Why is the accent character not printing correctly? The Atom editor creates UTF-8 files without a pseudo-BOM by default (which is the right thing to do, from a cross-platform perspective). Other popular cross-platform editors, such as Visual Studio Code and Sublime Text , behave the same way. Windows PowerShell [1] only recognizes UTF-8 files with a pseudo-BOM . In the absence of
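
The garbled output in the question is what you get when UTF-8 bytes are read with a single-byte legacy code page. The sketch below models that in Python rather than PowerShell, and it assumes the legacy code page on the French Windows machine is Windows-1252; the variable names are illustrative.

```python
import codecs

original = "Test\u00e9 encoding utf-8"

# What Atom writes by default: UTF-8 bytes with no leading marker.
utf8_bytes = original.encode("utf-8")

# Reading those bytes as Windows-1252 turns the two-byte sequence C3 A9
# into two separate characters, U+00C3 and U+00A9.
misread = utf8_bytes.decode("cp1252")

# With the UTF-8 marker (EF BB BF) in front, a BOM-aware reader can tell
# the file is UTF-8; "utf-8-sig" strips the marker while decoding.
with_marker = codecs.BOM_UTF8 + utf8_bytes
read_with_marker = with_marker.decode("utf-8-sig")

print(misread)
print(read_with_marker)
```

The same reasoning suggests two ways out: tell the reader explicitly that the file is UTF-8, or save the file with the marker.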