How do I determine file encoding in OS X?


I'm trying to enter some UTF-8 characters into a LaTeX file in TextMate (which says its default encoding is UTF-8), but LaTeX doesn't seem to understand them.

Runn


15 Answers

予麋鹿

2020-11-29 15:36

vim -c 'execute "silent !echo " . &fileencoding | q' {filename}

aliased somewhere in my bash configuration as

alias vic="vim -c 'execute \"silent \!echo \" . &fileencoding | q'"

so I just type

vic {filename}

On my vanilla OS X Yosemite install, it yields more precise results than "file -I":

$ file -I pdfs/udocument0.pdf
pdfs/udocument0.pdf: application/pdf; charset=binary
$ vic pdfs/udocument0.pdf
latin1
$
$ file -I pdfs/t0.pdf
pdfs/t0.pdf: application/pdf; charset=us-ascii
$ vic pdfs/t0.pdf
utf-8


旧巷少年郎

2020-11-29 15:36

Using the file command with the --mime-encoding option (e.g. file --mime-encoding some_file.txt) instead of the -I option works on OS X and has the added benefit of omitting the MIME type, "text/plain", which you probably don't care about.
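
If you want to call this from a script, here is a minimal Python sketch. The helper names `parse_encoding` and `guess_encoding` are made up for illustration, and it assumes the `file` utility is on your PATH and accepts `-b` (brief output) together with `--mime-encoding`. Treat it as a rough wrapper rather than a definitive implementation.

```python
import subprocess


def parse_encoding(file_output):
    """Extract the charset from `file --mime-encoding` output.

    Handles both the brief form ("utf-8") and the default
    "name: utf-8" form by splitting on the last ": " separator.
    """
    text = file_output.strip()
    return text.rsplit(": ", 1)[-1]


def guess_encoding(path):
    """Ask the `file` utility for its best guess (hypothetical helper).

    Assumes `file` is installed; raises CalledProcessError if it fails.
    """
    result = subprocess.run(
        ["file", "-b", "--mime-encoding", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_encoding(result.stdout)
```

Calling `guess_encoding("some_file.txt")` returns whatever charset `file` reports, which is still a guess rather than a guarantee.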

一生所求

2020-11-29 15:44
Just use:
```
file -I <filename>
```
That's it.

日久生厌

2020-11-29 15:44

You can try loading the file into a Firefox window and then going to View > Character Encoding. There should be a check mark next to the file's encoding.

不要未来只要你来

2020-11-29 15:49
Classic 8-bit LaTeX is very restricted in which UTF-8 characters it can use; it's highly dependent on the encoding of the font you're using and which glyphs that font has available.

Since you don't give a specific example, it's hard to know exactly where the problem is — whether you're attempting to use a glyph that your font doesn't have or whether you're not using the correct font encoding in the first place.

Here's a minimal example showing how a few UTF-8 characters can be used in a LaTeX document:
```
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\usepackage[utf8]{inputenc}
\begin{document}
‘Héllø—thêrè.’
\end{document}
```
You may have more luck with the [utf8x] option, but be warned that it's no longer supported and has some idiosyncrasies compared with [utf8] (as far as I recall; it's been a while since I've looked at it). If it does the trick, though, that's all that matters for you.

甜味超标

2020-11-29 15:50

The @ means that the file has extended file attributes associated with it. You can query them using the getxattr() function.

There's no definitive way to detect the encoding of a file. Read this answer; it explains why.

There's a command line tool, enca, that attempts to guess the encoding. You might want to check it out.
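
To see why detection is only ever a guess, here is a small Python sketch that tries a few candidate encodings and reports every one that decodes without error. The candidate list and the function names (`plausible_encodings`, `has_utf8_bom`) are my own assumptions for illustration, and this is not how enca works internally.

```python
import codecs

# Candidate encodings to try, in order. This short list is an assumption
# chosen for illustration, not an exhaustive or authoritative set.
CANDIDATES = ("ascii", "utf-8", "latin-1")


def plausible_encodings(data, candidates=CANDIDATES):
    """Return every candidate encoding that decodes `data` without error."""
    matches = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        matches.append(name)
    return matches


def has_utf8_bom(data):
    """A UTF-8 byte order mark is one of the few strong hints available."""
    return data.startswith(codecs.BOM_UTF8)
```

Because latin-1 assigns a character to every possible byte, it "succeeds" on any input, including binary files. A successful decode therefore only tells you an encoding is possible, not that it is the one the author used.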
