Show Unicode characters in PostScript

Question

How do I get my PostScript program to show the G clef character from the Bravura font? According to this SMuFL document, the Unicode code point for a G (treble) clef in Bravura is U+E050 (see page 48, Clefs (U+E050–U+E07F)). The PostScript glyph name might be gClef (not sure).

Here is my best attempt so far to get the Unicode characters on the page. I am using Ghostscript 9.25 to produce a PDF. This is the output from Ghostscript:

GPL Ghostscript 9.25 (2018-09-13)
Copyright (C) 2018 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Scanning C:/Windows/Fonts for fonts... 550 files, 358 scanned, 337 new fonts.
Can't find (or can't open) font file %rom%Resource/Font/Calibri.
Can't find (or can't open) font file Calibri.
Loading Calibri font from C:/Windows/Fonts/calibri.ttf... 8525920 7081126 4118548 2767358 1 done.
Can't find (or can't open) font file %rom%Resource/Font/BravuraText.
Can't find (or can't open) font file BravuraText.
Loading BravuraText font from C:/Windows/Fonts/BravuraText.otf... 9545496 7985907 8185868 6762307 1 done.
GPL Ghostscript 9.25: Can't embed the complete font BravuraText as it is too large, embedding a subset.
Main

And this is the minimal PostScript program:

%!PS-Adobe-3.0
%%Title: unicode.ps
%%LanguageLevel: 3
%%EndComments


%%BeginProlog

userdict begin

%%EndProlog


%%BeginSetup

/mm { 25.4 div 72 mul } bind def
/A4Landscape [297 mm 210 mm] def
/PageSize //A4Landscape def
<< /PageSize PageSize >> setpagedevice


% ‘‘ReEncodeSmall’’ generates a new re-encoded font. It
% takes 3 arguments: the name of the font to be
% re-encoded, a new name, and an array of new character
% encoding and character name pairs (see the definition of
% the ‘‘scandvec’’ array below for the format of this
% array). This method has the advantage that it allows the
% user to make changes to an existing encoding vector
% without having to specify an entire new encoding
% vector. It also saves space when the character encoding
% and name pairs array is smaller than an entire encoding
% vector.

% Usage: /Times-Roman /Times-Roman-Scand scandvec new-font-encoding

/new-font-encoding { <<>> begin
    /newcodesandnames exch def
    /newfontname exch def
    /basefontname exch def

    /basefontdict basefontname findfont def     % Get the font dictionary on which to base the re-encoded version.
    /newfont basefontdict maxlength dict def    % Create a dictionary to hold the description for the re-encoded font.

    basefontdict 
        { exch dup /FID ne      % Copy all the entries in the base font dictionary to the new dictionary except for the FID field.
            { dup /Encoding eq
                { exch dup length array copy    % Make a copy of the Encoding field.
                    newfont 3 1 roll put }
                { exch newfont 3 1 roll put }
                ifelse
            }
            { pop pop }         % Ignore the FID pair.
            ifelse
        } forall

    newfont /FontName newfontname put   % Install the new name.
    newcodesandnames aload pop      % Modify the encoding vector. First load the new encoding and name pairs onto the operand stack.
    newcodesandnames length 2 idiv
        { newfont /Encoding get 3 1 roll put}
        repeat  % For each pair on the stack, put the new name into the designated position in the encoding vector. 
    newfontname newfont definefont pop      % Now make the re-encoded font description into a POSTSCRIPT font. Ignore the modified dictionary returned on the operand stack by the definefont operator.
end} def


/Calibri /TextFont [
    16#41   /Scaron     % A (/Scaron Š U+0160)
    16#42   /quarternote                % B U+2669
    16#43   /musicalnote                % C
    16#44   /eighthnotebeamed           % D
    16#45   /musicalnotedbl             % E
    16#46   /beamedsixteenthnotes       % F
    16#47   /musicflatsign              % G
    16#47   /musicsharpsign             % H U+266F
] new-font-encoding

% https://github.com/steinbergmedia/bravura
% The Unicode code point for a G (treble) clef in Bravura Text is U+E050
% http://www.smufl.org/files/smufl-0.9.pdf
% p48 Clefs (U+E050–U+E07F)
% U+E050 (and U+1D11E) gClef G clef 
% http://www.jdawiseman.com/papers/trivia/character-entities.html
/Bravura /MusicFont [
    16#41   /gClef                      % A
    16#42   /quarternote                % B U+2669
    16#43   /musicalnote                % C
    16#44   /eighthnotebeamed           % D
    16#45   /musicalnotedbl             % E
    16#46   /beamedsixteenthnotes       % F
    16#47   /musicflatsign              % G
    16#47   /musicsharpsign             % H U+266F
] new-font-encoding

/MusicFont findfont 48 scalefont setfont

%%EndSetup


%%BeginScript

%% Main
(Main\n) print
<<>>begin
    /TextFont findfont 48 scalefont setfont
    0 setgray
    72 72 moveto
    (@ABCDEFGHIJKL) show

    0 72 translate

    /MusicFont findfont 48 scalefont setfont
    0 setgray
    72 72 moveto
    (@ABCDEFGHIJKL) show
end
showpage

%%EndScript

%%Trailer
%%EOF

Answer 1:

The first question is how you are defining Bravura and Calibri. These fonts are not part of the standard Ghostscript installation, so they must be added in some fashion, possibly via fontconfig (on Linux), but I see from the path name that you are using Windows. How have you added the fonts?

Now you are (again from the back channel messages) loading TrueType fonts and using them as substitutes for missing PostScript fonts. That's a non-standard feature, so Ghostscript has to do a lot of guessing in order to try and create a Type 42 font (PostScript font with TrueType outlines) from a TrueType font. There's no guarantee it'll get it right, though it is pretty good these days.

By the way, this is nothing to do with Unicode :-)

In PostScript you use a character code for each character you want to display. In your case you have used 0x40 (@) to 0x4C (L) consecutively. When rendering the glyph, the interpreter takes the character code and looks up the Encoding at that position. Note that your re-encoding only replaces the entries from 0x41 to 0x47, so codes 0x40 and 0x48 to 0x4C keep whatever names the base font's Encoding already had at those positions.

Let's think about your 'TextFont', which is Calibri. At position 0x41 in the Encoding you have the glyph name 'Scaron'. So the interpreter then consults the CharStrings dictionary of the font. The CharStrings dictionary contains key/value pairs: the key (in this case) is a name, and the value is an executable program which defines how to render the glyph.

So the interpreter looks for a key called /Scaron in the CharStrings dictionary, and then executes the program associated with it. If it can't find the key /Scaron, then it looks up the key /.notdef (all fonts are required to have a .notdef) and executes that instead.
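
The lookup chain described above (character code, then Encoding, then CharStrings, with .notdef as the fallback) can be traced by hand. The following is a minimal sketch, not the interpreter's actual implementation: trace-code is a hypothetical helper name, and /Helvetica is only an assumed stand-in for any font that has a CharStrings dictionary. Note that for a TrueType font loaded by Ghostscript a name can be present in CharStrings and still map to the .notdef glyph, so "is a key" here does not guarantee a visible glyph.

```postscript
%!
% Sketch only: trace-code is a hypothetical helper, not part of the original program.
/trace-code {                        % fontdict code  trace-code  -
    1 index /Encoding get            % fontdict code encoding
    1 index get                      % fontdict code glyphname
    3 -1 roll /CharStrings get       % code glyphname charstrings
    1 index known                    % code glyphname bool
    3 1 roll exch                    % bool glyphname code
    (code 16#) print 16 8 string cvrs print
    ( -> /) print 128 string cvs print
    { ( is a key in CharStrings\n) }
    { ( is not in CharStrings, so /.notdef is executed\n) }
    ifelse print
} bind def

% Assumption: /Helvetica resolves to a font with a CharStrings dictionary.
% Substitute /TextFont or /MusicFont after running the question's setup code.
16#40 1 16#4C { /Helvetica findfont exch trace-code } for
flush
```

Running the same loop against /TextFont after the question's setup code should show which of the re-encoded names the font actually knows.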

You haven't actually said what you're getting out. I'm assuming there's a problem, because you've posted a question (which doesn't seem to contain any actual questions), but you haven't said what it is. If you are getting hollow rectangles instead of the expected glyph, then that's because the interpreter is executing the /.notdef, which for TrueType fonts is often a rectangle (PostScript fonts often have a completely blank .notdef, but both font types can have anything they want).

In which case the problem is that you are using a glyph name (e.g. /musicalnote) which doesn't exist in the CharStrings dictionary. Unless the TrueType font had a POST table (most do not) then that's not surprising, because /musicalnote is a very non-standard name for a glyph.

If I add Calibri to fontmap.GS and then do:

%!
/Calibri findfont /CharStrings get {== ==} forall

Then I see many entries of the form:

0 /_6756 0 /_6689

These map the names (e.g. /_6756) to the TrueType GID. When using a TrueType font, Ghostscript needs the GID so that it can find the glyph program in the font's GLYF table. When defining a TrueType font for use as a Type 42, this dictionary is something Ghostscript has to try to create for itself (a real Type 42 font is defined with this dictionary as part of the font). How it achieves this is heuristic, i.e. it guesses a lot.

In this case the GID is 0, which is the TrueType reserved GID for the .notdef glyph, so these names will all map to the .notdef.
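
To get a feel for how many names fall into this category, one can tally the CharStrings values. This is a rough sketch that assumes, as in the output above, that Ghostscript stores an integer GID as the value for each name; non-integer values (for example Type 1 charstrings) are simply ignored. /Calibri is the font name used above and is assumed to be resolvable on your system.

```postscript
%!
% Sketch only: count CharStrings names that map to GID 0 versus a real glyph.
/Calibri findfont /CharStrings get    % charstrings
0 0                                   % charstrings zerocount nonzerocount
3 -1 roll                             % zerocount nonzerocount charstrings
{                                     % zerocount nonzerocount key value
    exch pop                          % zerocount nonzerocount value
    dup type /integertype eq
      { 0 eq
          { exch 1 add exch }         % GID 0: bump zerocount
          { 1 add }                   % any other GID: bump nonzerocount
        ifelse }
      { pop }                         % assumption: non-integer values are skipped
    ifelse
} forall
(names mapped to a real glyph: ) print =
(names mapped to GID 0 (.notdef): ) print =
flush
```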

I also see a number of entries like:

4 /A

These (obviously) are the glyphs that you can use; in this case the name /A maps to GID 4. Checking the output, there is no name 'quarternote', 'musicalnote', etc. There is an Scaron, so I expect that your 'A' character (code 0x41) will render as a capital S with a caron accent. The remaining re-encoded glyphs will render as empty squares, or nothing at all. Testing here shows (interestingly) a rectangle with a question mark in it.

Now it may be that the Calibri font contains the glyphs you want. If it does, then I'm afraid the only way to access them (from PostScript) is to identify the name that Ghostscript associated with the glyph. The same is true of the Bravura font.

A little PostScript programming (seems like you're more than competent to write this) would allow you to retrieve the CharStrings dictionary from the font, iterate through it, and build an array of all the names which have a non-zero value. You could then print a page (probably many pages) where you print a named glyph from the font, and under it print the name associated with that glyph. There's your map, now you can build an Encoding which maps the glyph name to the character code you want to use in your PostScript program to draw that glyph.
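
A minimal sketch of that idea follows. It is one possible way to do it, not a tested recipe for Bravura: usable-glyph-names and contact-sheet are hypothetical helper names, the sizes and grid spacing are arbitrary, and the code assumes that integer CharStrings values are GIDs (as in the Calibri output above) and that any non-integer value is a usable glyph program. It draws each glyph by name with glyphshow, so no Encoding is needed to produce the map.

```postscript
%!
% Sketch only: both procedure names below are hypothetical helpers.

% Collect every CharStrings key whose value is not GID 0.
/usable-glyph-names {                % fontdict  usable-glyph-names  array
    /CharStrings get
    [ exch
      {                              % key value
        dup type /integertype eq
          { 0 eq { pop } if }        % integer GID: drop the name when it is 0
          { pop }                    % assumption: other values are usable
        ifelse
      } forall
    ]
} bind def

% Draw each usable glyph with its name underneath, in rows across the page.
/contact-sheet {                     % fontname  contact-sheet  -
    10 dict begin
    findfont dup
    usable-glyph-names /names exch def
    24 scalefont /glyphfont exch def
    /Helvetica findfont 6 scalefont /labelfont exch def
    /x 36 def  /y 740 def
    names {
        /n exch def
        glyphfont setfont  x y moveto  n glyphshow
        labelfont setfont  x y 12 sub moveto  n 128 string cvs show
        /x x 72 add def
        x 540 gt {                   % row full: start a new row
            /x 36 def  /y y 60 sub def
            y 60 lt { showpage /y 740 def } if   % page full: start a new page
        } if
    } forall
    showpage                         % may emit a blank last page if one was just completed
    end
} bind def

% Assumption: /Calibri is resolvable, e.g. through Fontmap.GS as above.
/Calibri contact-sheet
```

Once the sheet shows which names correspond to the glyphs you want, those names can go into the pairs array passed to new-font-encoding in place of guesses such as /quarternote.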

FWIW when I try to use Bravura (which is an OpenType font, not a TrueType font) I get a syntax error while loading the font. The same happens with BravuraText.

Source: https://stackoverflow.com/questions/54840594/show-unicode-characters-in-postscript

Tags

unicode

postscript
