I'm trying to enter some UTF-8 characters into a LaTeX file in TextMate (which says its default encoding is UTF-8), but LaTeX doesn't seem to understand them.
Running
vim -c 'execute "silent !echo " . &fileencoding | q' {filename}
which I have aliased in my bash configuration as
alias vic="vim -c 'execute \"silent \!echo \" . &fileencoding | q'"
so I just type
vic {filename}
On my vanilla OSX Yosemite, it yields more precise results than "file -I":
$ file -I pdfs/udocument0.pdf
pdfs/udocument0.pdf: application/pdf; charset=binary
$ vic pdfs/udocument0.pdf
latin1
$
$ file -I pdfs/t0.pdf
pdfs/t0.pdf: application/pdf; charset=us-ascii
$ vic pdfs/t0.pdf
utf-8
Using the file command with the --mime-encoding option (e.g. file --mime-encoding some_file.txt) instead of the -I option works on OS X and has the added benefit of omitting the mime type, "text/plain", which you probably don't care about.
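For example (illustrative output; the charset reported depends on the file's contents):

$ file --mime-encoding some_file.txt
some_file.txt: utf-8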
Just use:
file -I <filename>
That's it.
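On OS X this prints both the mime type and the charset, e.g. (illustrative output; the charset reported depends on the file):

$ file -I some_file.txt
some_file.txt: text/plain; charset=utf-8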
You can try loading the file in a Firefox window and then going to View - Character Encoding. There should be a check mark next to the file's encoding type.
Classic 8-bit LaTeX is very restricted in which UTF-8 characters it can use; it's highly dependent on the encoding of the font you're using and which glyphs that font has available.
Since you don't give a specific example, it's hard to know exactly where the problem is — whether you're attempting to use a glyph that your font doesn't have or whether you're not using the correct font encoding in the first place.
Here's a minimal example showing how a few UTF-8 characters can be used in a LaTeX document:
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\usepackage[utf8]{inputenc}
\begin{document}
‘Héllø—thêrè.’
\end{document}
You may have more luck with the [utf8x] encoding, but be slightly warned that it's no longer supported and has some idiosyncrasies compared with [utf8] (as far as I recall; it's been a while since I've looked at it). But if it does the trick, that's all that matters for you.
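If you do want to try it, the only change to the minimal example above is the inputenc option (a sketch; utf8x is provided by the unmaintained ucs package, which some setups also load explicitly beforehand):

\usepackage[utf8x]{inputenc}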
The @ means that the file has extended file attributes associated with it. You can query them using the getxattr() function.
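From the shell on OS X you can also list those attributes without writing any code, using the xattr utility (a sketch; the attribute names listed depend on the file, and com.apple.TextEncoding is just a common example):

$ xattr some_file.txt
com.apple.TextEncoding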
There's no definitive way to detect the encoding of a file. Read this answer; it explains why.
There's a command line tool, enca, that attempts to guess the encoding. You might want to check it out.
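A typical invocation looks like this (a sketch; enca may need a language hint via -L, and the exact output wording varies by version):

$ enca -L none some_file.txt
Universal transformation format 8 bits; UTF-8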