codepages

Is there any mapping table between windows code and java charset?

為{幸葍}努か submitted on 2021-02-10 14:53:27
Question: Is there any mapping table between Windows code pages and Java charsets? (Java Charset list, Windows Code page list.) Why I want that: I have to read and analyse a binary file which contains parts like this: 0x00 word: codepage, such as 1252, 936, 65001, which I know, and also 50222, which I do not know. 0x02 word: string length. 0x04 bytes: null-terminated string. ...more strings... I wrote the code: int codepage = read16(inputStream); int length = read16(inputStream); byte[] bytes = new byte[length]; inputStream.read(bytes, 0, length
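The record layout described in the question can be sketched as follows, in Python rather than Java, purely for illustration. The names `CODEPAGE_TO_CODEC` and `read_record` are hypothetical. Little-endian words, and a length that counts the bytes including the terminating NUL, are assumptions, since the question states neither. Only the three code pages the asker recognised are mapped. Microsoft's Code Page Identifiers table lists 50222 as an ISO 2022 Japanese variant, but it is left out here because the closest matching codec is a judgment call.

```python
import io
import struct

# Hypothetical lookup table: Windows code page number -> Python codec name.
CODEPAGE_TO_CODEC = {
    1252: "cp1252",   # Western European
    936: "gbk",       # Simplified Chinese
    65001: "utf-8",   # UTF-8
}


def read_record(stream):
    """Read one (codepage, text) record from a binary stream."""
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("truncated record header")
    # Assumption: both 16-bit words are little-endian.
    codepage, length = struct.unpack("<HH", header)
    raw = stream.read(length)
    if len(raw) < length:
        raise EOFError("truncated string data")
    # Drop the terminating NUL and anything after it.
    raw = raw.split(b"\x00", 1)[0]
    codec = CODEPAGE_TO_CODEC.get(codepage)
    if codec is None:
        raise LookupError(f"no codec known for code page {codepage}")
    return codepage, raw.decode(codec)


# One record: code page 1252, 5 bytes, "café" plus the terminating NUL.
sample = struct.pack("<HH", 1252, 5) + b"caf\xe9\x00"
print(read_record(io.BytesIO(sample)))
```

Raising on an unknown code page, instead of silently falling back to a default charset, makes unmapped values such as 50222 visible early.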

conversion of a string from some codepage to Unicode

二次信任 submitted on 2021-02-08 07:53:52
Question: I would like to convert a CP-1253 string to Unicode and also perform the opposite conversion. Suppose I have two variables holding the strings, MySource1253 and MyUnicodeTarget. I presume AnsiString is the appropriate type for MySource1253, while String should be suitable for MyUnicodeTarget; please correct me if I am wrong. Is there some function in Delphi XE to make these conversions from one to the other and vice versa? Answer 1: Declare: type GreekString = type Ansistring
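For comparison, the same two-way conversion can be sketched in Python, where the byte string plays the role of the CP-1253 source and `str` plays the role of the Unicode target. This is not the Delphi answer, which is cut off above. The function names are hypothetical.

```python
def cp1253_to_unicode(data: bytes) -> str:
    """Decode Windows-1253 (Greek) bytes into a Unicode string."""
    return data.decode("cp1253")


def unicode_to_cp1253(text: str) -> bytes:
    """Encode a Unicode string as Windows-1253 bytes.

    Raises UnicodeEncodeError for characters outside the code page.
    """
    return text.encode("cp1253")


greek = "Ελλάδα"
encoded = unicode_to_cp1253(greek)
print(len(encoded), cp1253_to_unicode(encoded))
```

CP-1253 is a single-byte code page, so the encoded form has one byte per character. The reverse direction fails for text that contains characters the code page does not define.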

What exactly is Unicode codepage 1200?

白昼怎懂夜的黑 submitted on 2021-01-02 05:56:45
Question: While investigating some localization options, I stumbled across this as a save option in Visual Studio. What exactly is Unicode code page 1200? The Microsoft documentation page Code Page Identifiers describes it as: "Unicode UTF-16, little endian byte order (BMP of ISO 10646); available only to managed applications". So is Unicode code page 1200 really UTF-16, and does it therefore have a BOM? Is it advisable to use this for JavaScript, and if we have to use it, is a charset declaration necessary in the
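A small Python sketch of what "UTF-16, little endian byte order" means at the byte level. It assumes Python's `utf-16-le` codec is a fair stand-in for code page 1200. It also shows that the byte order mark is a separate two-byte prefix that a writer may or may not emit, not something the encoding itself forces.

```python
import codecs

text = "Aé"

# UTF-16LE: two bytes per BMP character, low byte first, no BOM added.
payload = text.encode("utf-16-le")
print(payload.hex(" "))

# A writer that wants a BOM prepends it explicitly.
with_bom = codecs.BOM_UTF16_LE + payload
print(with_bom.hex(" "))

# The generic "utf-16" decoder reads the BOM to pick the byte order
# and removes it from the decoded text.
print(with_bom.decode("utf-16"))
```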

UTF-8 to ANSI Conversion using C#

烂漫一生 submitted on 2020-08-27 05:45:11
Question: I'm a .NET developer and was asked to write an application in C# that converts HTML files to ANSI. ANSI is necessary because the converted files will be used by a Visual FoxPro application. The basic logic is ready; the problem is with the conversion itself. I've tried this code: http://social.msdn.microsoft.com/Forums/pt-BR/026ddda3-9bd1-4502-b445-e2a1cc88345d/convert-file-from-utf8-to-ansi?forum=csharplanguage but when I checked it in EditPlus the file is still not converted to ANSI and even worse
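The conversion itself can be sketched in Python (not C#) under one loud assumption: "ANSI" here means Windows-1252. On a real system it means whatever the machine's active code page is, so the codec is a parameter. The function name `utf8_to_ansi` is hypothetical.

```python
from pathlib import Path


def utf8_to_ansi(src, dst, ansi_codec="cp1252"):
    """Re-encode a UTF-8 file into a single-byte "ANSI" code page."""
    # Work on bytes so line endings pass through untouched;
    # "utf-8-sig" also drops a leading UTF-8 BOM if one is present.
    text = Path(src).read_bytes().decode("utf-8-sig")
    # Characters the target code page lacks become "?" instead of raising.
    Path(dst).write_bytes(text.encode(ansi_codec, errors="replace"))
```

A call such as `utf8_to_ansi("page.html", "page_ansi.html")` writes the converted copy next to the original. Dropping the BOM matters because an editor that finds a UTF-8 BOM at the start of a file will usually keep reporting the file as UTF-8.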

Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) does not add extra encoding providers

北城余情 submitted on 2020-03-19 05:18:17
Question: I am developing a netcoreapp2.0 console application and I need access to the whole encoding package from .NET. I have already added the System.Text.Encoding.CodePages Version=4.4.0 NuGet package from this page to my project and cleaned/restored the project several times. However, I can't get the extra encodings I need. The following code: Console.WriteLine(Encoding.GetEncodings().Length); Encoding.RegisterProvider(CodePagesEncodingProvider.Instance); Console.WriteLine(Encoding.GetEncodings()

Can a file be read and written right back with small changes without knowing its encoding in C#?

▼魔方 西西 submitted on 2020-01-24 03:21:05
Question: I need to download over 5000 .html and .php files from FTP. I need to read each file, remove some content that was put there by a virus, and save it back to FTP. I'm using the following code: string content; using (StreamReader sr = new StreamReader(fileName, System.Text.Encoding.UTF8, true)) { content = sr.ReadToEnd(); sr.Close(); } using (StreamWriter sw = new StreamWriter(fileName + "1" + file.Extension, false, System.Text.Encoding.UTF8)) { sw.WriteLine(content); sw.Close(); } I
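One way to avoid guessing the encoding at all is to never decode the file and to edit it as raw bytes. This is sketched here in Python rather than C#. It assumes the injected content is a known byte sequence and that the files use an ASCII-compatible encoding, so it would not suit UTF-16. The function names and the `marker` parameter are hypothetical.

```python
def strip_injected(data: bytes, marker: bytes) -> bytes:
    """Remove every occurrence of a known byte sequence."""
    return data.replace(marker, b"")


def clean_file(src, dst, marker: bytes) -> None:
    """Copy src to dst with the marker removed, byte for byte otherwise."""
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(strip_injected(data, marker))
```

Because nothing is decoded or re-encoded, every byte outside the removed sequence is written back unchanged, including any BOM and the original line endings.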

how to convert unicode text to utf8 text readable?

白昼怎懂夜的黑 submitted on 2020-01-15 04:45:39
Question: I have a serious problem regarding Unicode and UTF-8. I saved a paragraph of Arabic/Persian text in Notepad, and now my text looks like this: Êæ Çíä ÓæÑÓ ÈÑäÇãå ÚÏÏ ÏáÎæÇåí Ñæ ÇÒ æÑæÏí ãííÑå æ Èå Øæá åãæä ÚÏÏ ãËáËí Ñæ ÑÓã ãí ˜äå My question is how to get my data back; it is important for me to recover it. Thanks in advance. Answer 1: The paragraph was scrambled by saving as code page 1256 (Arabic/Persian), then interpreted as code page 1252 (Western Europe), and finally saved
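The first two steps named in the answer can be undone by running them backwards: turn the garbled characters back into bytes using code page 1252, then read those bytes as code page 1256. A minimal Python sketch with a hypothetical function name follows. The answer is cut off after "finally saved", so the last step is not covered here.

```python
def undo_mojibake(garbled: str) -> str:
    """Reverse a 1256-bytes-read-as-1252 mix-up."""
    # Step back to the original bytes, then decode them correctly.
    return garbled.encode("cp1252").decode("cp1256")


# The first three words of the garbled paragraph from the question.
sample = "Êæ Çíä ÓæÑÓ"
print(undo_mojibake(sample))
```

One caveat: Python's `cp1252` codec rejects the five byte values that Windows-1252 leaves unassigned. Text whose original bytes include one of them would therefore need extra handling.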

Java 1.6 Windows-1252 encoding fails on 3 characters

三世轮回 submitted on 2020-01-14 10:35:46
Question: EDIT: I've been convinced that this question is somewhat nonsensical. Thanks to those who responded. I may post a follow-up question that is more specific. Today I was investigating some encoding problems and wrote this unit test to isolate a base repro case: int badCount = 0; for (int i = 1; i < 255; i++) { String str = "Hi " + new String(new char[] { (char) i }); String toLatin1 = new String(str.getBytes("UTF-8"), "latin1"); assertEquals(str, new String(toLatin1.getBytes("latin1"), "UTF-8"));
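The loop in the question can be ported to Python to show why the choice of single-byte charset matters for this kind of round trip. This is a sketch of Python's codec behaviour only; Java's charset tables and its replacement-character handling may differ, so the counts need not match the asker's result. The function name is hypothetical.

```python
def count_roundtrip_failures(single_byte_codec: str) -> int:
    """Count code points 1..254 that do not survive
    UTF-8 bytes -> single-byte decode -> single-byte encode -> UTF-8 decode."""
    bad = 0
    for i in range(1, 255):
        original = "Hi " + chr(i)
        try:
            misread = original.encode("utf-8").decode(single_byte_codec)
            restored = misread.encode(single_byte_codec).decode("utf-8")
        except UnicodeError:
            # The codec has no mapping for one of the bytes involved.
            bad += 1
            continue
        if restored != original:
            bad += 1
    return bad


print(count_roundtrip_failures("latin-1"))
print(count_roundtrip_failures("cp1252"))
```

Latin-1 assigns a character to every byte value, so any byte sequence passes through it unchanged. Windows-1252 leaves a few byte values unassigned, and any UTF-8 sequence that contains one of them cannot make the trip.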