PdfBox encode symbol currency euro

假如想象 submitted on 2019-11-26 19:09:15

Unfortunately, PDFBox's string encoding is still far from perfect (as of version 1.8.x). It uses the same routines for encoding strings in generic PDF objects as for encoding strings in content streams, which is fundamentally wrong. Thus, instead of using PDPageContentStream.drawString (which uses that wrong encoding), you have to translate to the correct encoding yourself.

E.g. instead of using

    contentStream.beginText();
    contentStream.setTextMatrix(100, 0, 0, 100, 50, 100);
    contentStream.setFont(PDType1Font.HELVETICA, 2);
    contentStream.drawString("€");
    contentStream.endText();
    contentStream.close();

which does not produce the intended '€' (the screenshot is not preserved here),

you could use something like

    contentStream.beginText();
    contentStream.setTextMatrix(100, 0, 0, 100, 50, 100);
    contentStream.setFont(PDType1Font.HELVETICA, 8);
    byte[] commands = "(x) Tj ".getBytes();
    commands[1] = (byte) 128;
    contentStream.appendRawCommands(commands);
    contentStream.endText();
    contentStream.close();

resulting in a correctly drawn '€' (the screenshot is not preserved here).

If you wonder how I got to use 128 as byte code for the €, have a look at the PDF specification ISO 32000-1, annex D.2, Latin Character Set and Encodings which indicates an octal value 200 (decimal 128) for the € symbol in WinAnsiEncoding.
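To avoid patching the byte by hand for every string, the same idea can be wrapped in a small helper. This is only a sketch under assumptions: the class and method names are made up for illustration, and it relies on the JRE's windows-1252 charset, which agrees with WinAnsiEncoding for the € (code 128) but is not guaranteed to be identical for every code.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

// Hypothetical helper, not part of PDFBox.
final class WinAnsiShowText {
    // Assumption: the JRE ships the windows-1252 charset.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    /** Builds the bytes of "(text) Tj " with the text encoded one byte per character. */
    static byte[] showTextCommand(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write('(');
        // Note: getBytes silently replaces characters the charset cannot encode with '?'.
        for (byte b : text.getBytes(CP1252)) {
            // '(', ')' and '\' have to be escaped inside a PDF literal string.
            if (b == '(' || b == ')' || b == '\\') {
                out.write('\\');
            }
            out.write(b);
        }
        byte[] tail = {')', ' ', 'T', 'j', ' '};
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] commands = showTextCommand("\u20AC"); // the euro sign
        System.out.println(commands[1] & 0xFF);                        // 128
        System.out.println(Integer.toOctalString(commands[1] & 0xFF)); // 200
    }
}
```

The result would then be passed to contentStream.appendRawCommands(...) in place of the hand-built array above.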


PS: An alternative approach has meanwhile been presented by other answers which in case of the € symbol amounts to something like:

    contentStream.beginText();
    contentStream.setTextMatrix(100, 0, 0, 100, 50, 100);
    contentStream.setFont(PDType1Font.HELVETICA, 8);
    contentStream.drawString(String.valueOf(Character.toChars(EncodingManager.INSTANCE.getEncoding(COSName.WIN_ANSI_ENCODING).getCode("Euro"))));
    contentStream.endText();
    contentStream.close();

This indeed also draws the '€' symbol. But even though this approach looks cleaner (it does not use byte arrays, it does not construct an actual PDF stream operation manually), it is dirty in its own way:

To use a broken method, it actually breaks its string argument in just the right way to counteract the bug in the method.

Thus, if the PDFBox people decided to fix the broken PDFBox method, this seemingly clean work-around code here would start to fail as it would then feed the fixed method broken input data.

Admittedly, I doubt they will fix this bug before 2.0.0 (and in 2.0.0 the fixed method has a different name), but one never knows...
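To see what "broken input data" means here, one can look at the string this work-around builds, without PDFBox at all. In this sketch the value 128 stands in for the result of getCode("Euro"), and the class name is made up. The last step assumes that the string ends up in the stream one byte per character, as ISO-8859-1 would encode it; that is my guess at why the work-around happens to work, not something confirmed above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of PDFBox.
final class BrokenInputDemo {
    /** Mirrors String.valueOf(Character.toChars(code)) from the work-around. */
    static String workaroundArgument(int winAnsiCode) {
        return String.valueOf(Character.toChars(winAnsiCode));
    }

    public static void main(String[] args) {
        String s = workaroundArgument(128); // 128 = WinAnsiEncoding code of the euro sign

        // The string does not contain the euro sign (U+20AC) at all ...
        System.out.println(s.equals("\u20AC")); // false
        // ... but the control character U+0080.
        System.out.println((int) s.charAt(0));  // 128

        // Assumption: written one byte per character, this becomes the byte 128.
        byte[] b = s.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(b.length + " " + (b[0] & 0xFF)); // 1 128
    }
}
```

A method that handled Unicode input correctly would receive U+0080 here, not the €.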

alf

This worked for me:

    char symbol = '€';
    Encoding e = EncodingManager.INSTANCE.getEncoding(COSName.WIN_ANSI_ENCODING);
    // character -> glyph name -> WinAnsi code -> string containing that code
    String toPDF = String.valueOf(Character.toChars(e.getCode(e.getNameFromCharacter(symbol))));

I put together a solution from the several above:

        String text = "Lorem ipsum dolor sit amet, consectetur adipisici € 1.234,56 " +
                "elit, sed eiusmod tempor incidunt ut labore et dolore magna aliqua.";

        contentStream.beginText();
        contentStream.setFont(font, 12);
        contentStream.moveTextPositionByAmount(10, 500);

        char[] tc = text.toCharArray();
        StringBuilder te = new StringBuilder();
        Encoding e =
                EncodingManager.INSTANCE.getEncoding(COSName.WIN_ANSI_ENCODING);           
        for (int i = 0; i < tc.length; i++) {
            Character c = tc[i];
            int code = 0;
            if(Character.isWhitespace(c)){
                code = e.getCode("space");
            }else{
                code = e.getCode(e.getNameFromCharacter(c));
            }               
            te.appendCodePoint(code);
        }
        contentStream.drawString( te.toString() );
        contentStream.endText();
        contentStream.close();

For the space character the code is unknown, because getNameFromCharacter returns the name "spacehackarabic", which is not defined in WinAnsiEncoding. I do not know why it returns this name. That is why I check for whitespace explicitly, but it is also possible to map this name to the code of the space character:

    e.addCharacterEncoding( 040, "spacehackarabic" );

Thanks...
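For comparison, the same per-character translation can be sketched with the JDK alone, which sidesteps the glyph-name lookup and therefore the "spacehackarabic" problem. The assumptions are that the JRE's windows-1252 charset is an acceptable stand-in for WinAnsiEncoding and that the result is still fed to the 1.8.x drawString as above; the class and method names are made up.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox.
final class WinAnsiCodeString {
    // Assumption: the JRE ships the windows-1252 charset.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    /** Replaces every character by the character whose value is its windows-1252 code. */
    static String toCodeString(String text) {
        try {
            // A fresh encoder reports unmappable characters instead of replacing them.
            ByteBuffer bytes = CP1252.newEncoder().encode(CharBuffer.wrap(text));
            // ISO-8859-1 maps each byte 0..255 to the char with the same value.
            return StandardCharsets.ISO_8859_1.decode(bytes).toString();
        } catch (CharacterCodingException ex) {
            throw new IllegalArgumentException("Not encodable in windows-1252: " + text, ex);
        }
    }

    public static void main(String[] args) {
        String te = toCodeString("\u20AC 1.234,56");
        System.out.println((int) te.charAt(0)); // 128 (euro sign)
        System.out.println((int) te.charAt(1)); // 32  (space)
    }
}
```

Unlike the loop above, this throws for characters that have no windows-1252 code.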

Maybe it is too late, but I did it using:

    String toPDF = String.valueOf(Character.toChars(e.getCode("Euro")));

Make sure you use an uppercase "E"; with "euro" it throws an error. Please take a look at this link, it helped me a lot: http://partners.adobe.com/public/developer/en/opentype/glyphlist.txt
