The encoding that Notepad++ just calls “ANSI”, does anyone know what to call it for Ruby?

栀梦 2020-12-15 16:41

I have a bunch of .txt files that Notepad++ says (in its drop-down "Encoding" menu) are "ANSI".

They have German characters in them, [äöüß], which display fine in N

3 Answers
  •  温柔的废话
    2020-12-15 17:26

    I think it's 'cp1252', alias 'windows-1252'.
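
    As a quick check of that claim, the sketch below looks up both names and reads a small file as cp1252 while asking Ruby for UTF-8 back. The file name `german.txt` and its bytes are made up for illustration; they are not from the question.

```ruby
# 'cp1252' is an alias, so both names should resolve to the same Encoding object.
enc = Encoding.find('cp1252')
puts enc.name
puts enc == Encoding.find('windows-1252')

# Hypothetical sample file: the four German letters as single-byte 'ANSI' bytes.
path = 'german.txt'
File.binwrite(path, "\xE4\xF6\xFC\xDF".b)

# "external:internal" asks Ruby to decode the file as cp1252 and return UTF-8.
text = File.read(path, encoding: 'cp1252:utf-8')
puts text
puts text.encoding
```

    Passing the encoding per call avoids touching `Encoding.default_external` for the whole process.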

    After reading Jörg's answer, I went back through the Encoding page on ruby-doc.org looking for references to the specific encodings he mentioned, and that's when I spotted the Encoding.aliases method.
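
    For reference, a minimal sketch of what that method returns: a Hash from alias name to canonical name, both as Strings. The exact set of aliases depends on the Ruby build and platform, so the second line of output may differ from machine to machine.

```ruby
# Encoding.aliases maps alias name => canonical encoding name.
aliases = Encoding.aliases
puts aliases['CP1252']

# Collect every alias that points at Windows-1252 on this machine.
names = aliases.select { |_alias, canonical| canonical == 'Windows-1252' }.keys
puts names.inspect
```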

    So I kludged up the method at the end of this answer.

    Then I looked at the output in notepad++, viewing it as both 'ANSI' and utf-8, and compared that to the output in irb...

    I could only find two places in the irb output where the utf-8 file was garbled in the exact same way it appeared in notepad++ when viewing it as 'ANSI', and those places were for cp1252 and cp1254.
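
    A small sketch of why those two candidates can look identical, assuming the test text contained only the German letters from the question: the UTF-8 bytes of those letters decode to the same characters in both code pages. The helper name `misread` is made up for this example.

```ruby
utf8 = "\u00E4\u00F6\u00FC\u00DF" # the four German letters as a UTF-8 string

# Relabel the UTF-8 bytes as a single-byte encoding, then transcode back to
# UTF-8 so the misreading can be printed. This is roughly what an editor does
# when it displays a UTF-8 file as 'ANSI'.
def misread(str, as)
  str.dup.force_encoding(as).encode('utf-8')
end

as_1252 = misread(utf8, 'cp1252')
as_1254 = misread(utf8, 'cp1254')
puts as_1252
puts as_1254
puts as_1252 == as_1254
```

    With other text (Turkish letters, for example) the two readings would diverge, so this comparison alone does not settle which code page a file was written in.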

    cp1252 is apparently my 'filesystem' encoding, so I'm going with that.

    I wrote a script that makes UTF-8 copies of all the files, converting once from cp1252 and once from cp1254.
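
    That script isn't shown here, so the following is only a sketch of one way to do it. The method name `convert_copies`, the directory layout, and the choice to replace undefined bytes instead of raising are all assumptions.

```ruby
require 'fileutils'
require 'tmpdir'

# Copy every .txt in src_dir into out_dir, transcoded from source_encoding to
# UTF-8. Returns the paths that were written.
def convert_copies(src_dir, out_dir, source_encoding)
  FileUtils.mkdir_p(out_dir)
  Dir.glob(File.join(src_dir, '*.txt')).map do |path|
    raw = File.binread(path) # raw bytes, no newline or encoding conversion
    text = raw.force_encoding(source_encoding)
              .encode('utf-8', invalid: :replace, undef: :replace)
    target = File.join(out_dir, File.basename(path))
    File.binwrite(target, text) # binary write keeps the original line endings
    target
  end
end

# Demo in a throwaway directory with one made-up 'ANSI' file.
Dir.mktmpdir do |dir|
  src = File.join(dir, 'ansi')
  FileUtils.mkdir_p(src)
  File.binwrite(File.join(src, 'sample.txt'), "Gr\xFC\xDFe\r\n".b)

  %w[cp1252 cp1254].each do |enc|
    written = convert_copies(src, File.join(dir, "utf8_from_#{enc}"), enc)
    written.each { |f| puts "#{enc}: #{File.read(f, encoding: 'utf-8').inspect}" }
  end
end
```

    Writing to a separate directory per source encoding keeps the originals untouched, which matters while the right code page is still a guess.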

    utf-8 regexes seem to work with both sets of files so far.
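
    A sketch of what is probably going on, using a made-up sample string: the German letters sit at the same byte values in cp1252 and cp1254, so both conversions give the same UTF-8 text, and a UTF-8 regex matches either copy. The unconverted string, by contrast, makes Ruby raise instead of silently failing to match.

```ruby
raw = "Gr\xFC\xDFe".b # made-up sample: a German word as single-byte 'ANSI' bytes

from_1252 = raw.dup.force_encoding('cp1252').encode('utf-8')
from_1254 = raw.dup.force_encoding('cp1254').encode('utf-8')

# Both code pages agree on these letters, so the conversions are identical.
puts from_1252 == from_1254

# \u escapes make this a UTF-8 regex; it is the same class as /[äöüß]/.
german = /[\u00E4\u00F6\u00FC\u00DF]/
puts from_1252.scan(german).inspect

# Matching the same regex against the unconverted, cp1252-tagged string fails.
begin
  raw.dup.force_encoding('cp1252') =~ german
rescue Encoding::CompatibilityError => e
  puts "unconverted: #{e.class}"
end
```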

    Now I have to try to remember what I was actually trying to accomplish before I ran into all these encoding headaches. xD

    # Read both files once per entry in Encoding.aliases, using that alias as the
    # default external encoding, and print the two readings side by side.
    def compare_encodings(file1, file2)
      file1_probs = []
      file2_probs = []

      # Block form closes the output file even if an exception escapes.
      File.open('encoding_test_output.txt', 'w') do |txt|
        Encoding.aliases.sort.each do |k, v|
          Encoding.default_external = k
          ename = [k.downcase, v.downcase].join('  ---  ')
          s = ""
          begin
            s << File.read(file1)
          rescue StandardError
            # Reading or appending failed under this encoding; note it and move on.
            s << 'nope nope nope'
            file1_probs << ename
          end
          s << "\t| #{ename} |\t"
          begin
            s << File.read(file2)
          rescue StandardError
            s << 'nope nope nope'
            file2_probs << ename
          end
          # Switch back to UTF-8 before writing the line out.
          Encoding.default_external = 'utf-8'
          txt.puts s.center(58)
          puts s.center(58)
        end
      end

      puts
      puts "file1, \"#{file1}\" exceptions from trying to convert to:\n\n"
      puts file1_probs
      puts
      puts "file2, \"#{file2}\" exceptions from trying to convert to:\n\n"
      puts file2_probs
    end

    compare_encodings "utf-8.txt", "np++'ANSI'.txt"
    
