How do I determine file encoding in OS X?

死守一世寂寞 2020-11-29 15:12

I'm trying to enter some UTF-8 characters into a LaTeX file in TextMate (which says its default encoding is UTF-8), but LaTeX doesn't seem to understand them.

Runn

15 answers
  • 2020-11-29 15:36
    vim -c 'execute "silent !echo " . &fileencoding | q' {filename}
    

    aliased somewhere in my bash configuration as

    alias vic="vim -c 'execute \"silent \!echo \" . &fileencoding | q'"
    

    so I just type

    vic {filename}
    

    On my vanilla OS X Yosemite, it reports a more specific encoding than "file -I":

    $ file -I pdfs/udocument0.pdf
    pdfs/udocument0.pdf: application/pdf; charset=binary
    $ vic pdfs/udocument0.pdf
    latin1
    $
    $ file -I pdfs/t0.pdf
    pdfs/t0.pdf: application/pdf; charset=us-ascii
    $ vic pdfs/t0.pdf
    utf-8
    
  • 2020-11-29 15:36

    Using the file command with the --mime-encoding option (e.g. file --mime-encoding some_file.txt) instead of the -I option works on OS X and has the added benefit of omitting the MIME type (e.g. "text/plain"), which you probably don't care about.
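
    As a rough sketch of the command above, the following creates a throwaway file and asks file for its encoding guess. The temporary file and its contents are made up for illustration, and the -b (brief) flag is an addition that only drops the file name from the output.

    ```shell
    # Create a temporary sample file (hypothetical content for this sketch).
    # The octal escapes \303\251 are the UTF-8 bytes for "é".
    sample=$(mktemp)
    printf 'caf\303\251\n' > "$sample"

    # --mime-encoding prints only the charset guess;
    # -b (brief) suppresses the leading "<filename>:" prefix.
    enc=$(file -b --mime-encoding "$sample")
    echo "$enc"

    # Clean up the temporary file.
    rm -f "$sample"
    ```

    Keep in mind that the result is a guess based on the bytes file inspects, not a guarantee.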

  • 2020-11-29 15:44

    Just use:

    file -I <filename>
    

    That's it.

  • 2020-11-29 15:44

    You can try loading the file into a Firefox window and then going to View > Character Encoding. There should be a check mark next to the file's encoding.

  • Classic 8-bit LaTeX is very restricted in which UTF-8 characters it can use; it's highly dependent on the encoding of the font you're using and which glyphs that font has available.

    Since you don't give a specific example, it's hard to know exactly where the problem is — whether you're attempting to use a glyph that your font doesn't have or whether you're not using the correct font encoding in the first place.

    Here's a minimal example showing how a few UTF-8 characters can be used in a LaTeX document:

    \documentclass{article}
    \usepackage[T1]{fontenc}
    \usepackage{lmodern}
    \usepackage[utf8]{inputenc}
    \begin{document}
    ‘Héllø—thêrè.’
    \end{document}
    

    You may have more luck with the [utf8x] option, but be warned that it's no longer supported and has some idiosyncrasies compared with [utf8] (as far as I recall; it's been a while since I looked at it). But if it does the trick, that's all that matters for you.

  • 2020-11-29 15:50

    The @ (shown after the permission bits in ls -l output) means that the file has extended file attributes associated with it. You can query them using the getxattr() function.

    There's no definite way to detect the encoding of a file. Read this answer; it explains why.

    There's a command line tool, enca, that attempts to guess the encoding. You might want to check it out.
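
    To illustrate why detection can only ever be a guess, here is a minimal sketch, assuming iconv and od are installed. The sample bytes are chosen purely for illustration: the same two bytes decode cleanly under two different encodings, so a tool cannot tell from the bytes alone which one the author meant. Only clearly invalid input can rule an encoding out.

    ```shell
    # Two bytes that are valid text in more than one encoding:
    # \303\251 is "é" in UTF-8, but "Ã©" in ISO-8859-1.
    sample=$(mktemp)
    printf '\303\251' > "$sample"

    # Reading the bytes as ISO-8859-1 succeeds; re-encoded as UTF-8
    # they become four bytes, shown here as hex.
    as_latin1=$(iconv -f ISO-8859-1 -t UTF-8 "$sample" | od -An -tx1 | tr -d ' \n')
    echo "$as_latin1"

    # Reading the same bytes as UTF-8 also succeeds.
    if iconv -f UTF-8 -t UTF-8 "$sample" >/dev/null 2>&1; then
        utf8_ok=yes
    else
        utf8_ok=no
    fi

    # A lone \351 byte ("é" in ISO-8859-1) is not valid UTF-8,
    # so for that input the UTF-8 guess can be ruled out.
    if printf '\351' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
        lone_ok=yes
    else
        lone_ok=no
    fi
    echo "$utf8_ok $lone_ok"

    rm -f "$sample"
    ```

    A successful conversion therefore only shows that an encoding is possible, not that it is the right one.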
