The encoding that Notepad++ just calls “ANSI”, does anyone know what to call it for Ruby?

Submitted by 你离开我真会死。 on 2019-11-29 01:09:58

What they mean is probably ISO/IEC 8859-1 (aka Latin-1), ISO-8859-1, ISO/IEC 8859-15 (aka Latin-9) or Windows-1252 (aka CP 1252). All 4 of them have the ä at position 0xE4.
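A quick way to check the 0xE4 claim from Ruby is to encode ä into each candidate and look at the resulting byte. This is only a sketch; the variable names are made up for illustration, and the encoding names are the ones Ruby uses for these character sets.

```ruby
# "\u00E4" is ä, written as an escape so the source file's own encoding does not matter.
a_umlaut = "\u00E4"

CANDIDATES = %w[ISO-8859-1 ISO-8859-15 Windows-1252]

# Encode ä into each candidate and collect the bytes it produces.
bytes_by_encoding = CANDIDATES.to_h do |name|
  [name, a_umlaut.encode(name).bytes]
end

bytes_by_encoding.each do |name, bytes|
  hex = bytes.map { |b| format("0x%02X", b) }.join(" ")
  puts "#{name}: #{hex}"
end
```

Each line should show a single byte, 0xE4.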

I found the answer to this question on the Notepad++ Forum, answered in 2010 by CChris who seems to be authoritative.

Question: Encoding ANSI?

Answer:

That will be the system code page for your computer (code page 0).

More Info:

Show your current code page.

>help chcp
Displays or sets the active code page number.

CHCP [nnn]

  nnn   Specifies a code page number.

Type CHCP without a parameter to display the active code page number.

>chcp
Active code page: 437

Code Page Identifiers

Identifier  .NET Name  Additional information
437         IBM437     OEM United States

I think it's 'cp1252', alias 'windows-1252'.
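If that guess is right, Ruby accepts either name. A minimal sketch, assuming the "ANSI" bytes really are Windows-1252 (the sample byte 0xE4 is just an illustration):

```ruby
# Encoding.find is case-insensitive and resolves aliases to the same object.
enc = Encoding.find("cp1252")
puts enc.name           # canonical name of the encoding
puts enc.names.inspect  # every name Ruby knows it by

# Treat a raw byte as Windows-1252 and transcode it to UTF-8.
ansi_bytes = "\xE4".b
text = ansi_bytes.force_encoding("cp1252").encode("UTF-8")
puts text
```

force_encoding only relabels the bytes; encode is what actually converts them to UTF-8.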

After reading Jörg's answer, I went back through the Encoding page on ruby-doc.org trying to find references to the specific encodings he mentioned, and that's when I spotted the Encoding.aliases method.

So I kludged up the method at the end of this answer.

Then I looked at the output in notepad++, viewing it as both 'ANSI' and utf-8, and compared that to the output in irb...

I could only find two places in the irb output where the utf-8 file was garbled in the exact same way it appeared in notepad++ when viewing it as 'ANSI', and those places were for cp1252 and cp1254.
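The garbling can be reproduced without any files. The sketch below takes the UTF-8 bytes of ä and misreads them as each of the two code pages; the sample character is an assumption, chosen only because it is the one discussed above.

```ruby
# UTF-8 encodes ä as the two bytes C3 A4.
utf8_bytes = "\u00E4".b

garbled = %w[Windows-1252 Windows-1254].to_h do |name|
  # Pretend the bytes are in the code page, then convert to UTF-8 for display.
  [name, utf8_bytes.dup.force_encoding(name).encode("UTF-8")]
end

garbled.each { |name, text| puts "#{name}: #{text}" }
```

Both code pages map C3 and A4 to the same two characters, so for this input they produce identical mojibake. The two code pages only differ in a handful of positions, which would explain why both matched.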

cp1252 is apparently my 'filesystem' encoding, so I'm going with that.
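For reference, Ruby exposes these settings through a few special names that Encoding.find understands. The values depend on the machine, so this sketch prints whatever the current system reports rather than assuming a result:

```ruby
# "filesystem", "locale" and "external" are special names, not real encodings.
special = %w[filesystem locale external].to_h do |name|
  [name, Encoding.find(name)]
end

special.each { |name, enc| puts "#{name}: #{enc}" }

# "external" is the same thing as Encoding.default_external.
puts Encoding.default_external
```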

I wrote a script to make UTF-8 copies of all the files, trying the conversion from both 1252 and 1254.
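That script isn't shown here, but a conversion like it could be sketched as follows. The method name copy_as_utf8 and its parameters are hypothetical, not taken from the original script.

```ruby
# Hypothetical helper: copy src to dest, converting from from_encoding to UTF-8.
def copy_as_utf8(src, dest, from_encoding)
  # Read raw bytes so nothing is reinterpreted or newline-translated on the way in.
  raw = File.binread(src)
  # Label the bytes with the assumed source encoding, then transcode.
  text = raw.force_encoding(from_encoding).encode("UTF-8")
  File.binwrite(dest, text)
end
```

Note that encode raises an exception if the file contains a byte that the chosen code page leaves undefined, which is one way to tell that the guess was wrong.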

utf-8 regexes seem to work with both sets of files so far.

Now I have to try to remember what I was actually trying to accomplish before I ran into all these encoding headaches. xD

# Read both files under every encoding alias Ruby knows about and print the
# results side by side, recording the encodings that raise an exception.
def compare_encodings(file1, file2)
    file1_probs = []
    file2_probs = []

    # The block form closes the output file even if something raises.
    File.open('encoding_test_output.txt', 'w') do |txt|
        Encoding.aliases.sort.each do |k, v|
            # By default File.read tags what it reads with the default external encoding.
            Encoding.default_external = k
            ename = [k.downcase, v.downcase].join "  ---  "
            s = ""
            begin
                s << File.read(file1)
            rescue
                s << "nope nope nope"
                file1_probs << ename
            end
            s << "\t| #{ename} |\t"
            begin
                s << File.read(file2)
            rescue
                s << "nope nope nope"
                file2_probs << ename
            end
            # Switch back to UTF-8 before printing the combined line.
            Encoding.default_external = 'utf-8'
            txt.puts s.center(58)
            puts s.center(58)
        end
    end
    puts
    puts "file1, \"#{file1}\" exceptions from trying to convert to:\n\n"
    puts file1_probs
    puts
    puts "file2, \"#{file2}\" exceptions from trying to convert to:\n\n"
    puts file2_probs
end

compare_encodings "utf-8.txt", "np++'ANSI'.txt"