Question
I am trying to write a string to a PDF file created with Apache PDFBox. I tried re-encoding the text (ISO-8859-1 bytes decoded as UTF-8), but the rupee symbol is still printed as a question mark. I have tried a lot and searched the internet (Stack Overflow) for solutions. Could someone please help? Thanks in advance.
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

public class TestClass {
    public static void main(String[] args) throws IOException {
        PDDocument doc = new PDDocument();
        PDPage page = new PDPage();
        doc.addPage(page);
        PDPageContentStream cos = new PDPageContentStream(doc, page);
        cos.beginText();
        String text = "Deposited Cash of ₹10,00,000/- or more in a Saving Bank Account";
        cos.newLineAtOffset(25, 700);
        // Attempted workaround: re-encode the text before showing it
        byte[] ptext = text.getBytes("ISO-8859-1");
        String value = new String(ptext, "UTF-8");
        cos.setFont(PDType1Font.TIMES_ROMAN, 12);
        cos.showText(value);
        cos.endText();
        cos.close();
        doc.save("C:\\Users\\xyz\\Desktop\\Sample.pdf");
        doc.close();
    }
}
In the PDF, a question mark appears instead of the rupee symbol.
Answer 1:
You use the font PDType1Font.TIMES_ROMAN. This is a standard 14 font, i.e. a font that every PDF-1.x viewer must have available, but only for a limited character set to which the Rupee symbol does not belong (cf. Annex D of the PDF specification ISO 32000-1).
PDFBox in particular uses WinAnsiEncoding for standard 14 fonts, and the Rupee symbol is definitely not part of that encoding.
Thus, use a local font which you know includes the Rupee symbol (e.g. ARIALUNI for test purposes) with an encoding that can represent the Rupee symbol (e.g. Identity-H).
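To illustrate the point about WinAnsiEncoding without involving PDFBox at all, here is a minimal sketch that uses the JDK charset windows-1252 as a stand-in. This is an assumption for illustration only: WinAnsiEncoding is based on that code page but is not identical to it, and PDFBox performs its own check internally. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper: uses windows-1252 as an approximation of WinAnsiEncoding.
final class WinAnsiApproxCheck {

    // Returns true if every character of the text has a code in windows-1252.
    static boolean encodable(String text) {
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        // U+20B9 is the Indian Rupee Sign, U+20AC is the Euro Sign.
        System.out.println("U+20B9 encodable: " + encodable("\u20B9")); // false
        System.out.println("U+20AC encodable: " + encodable("\u20AC")); // true
    }
}
```

The Euro sign has a slot in the code page, the Rupee sign does not, so no re-encoding trick on the Java side can make it representable in that encoding.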
And don't do
byte[] ptext = text.getBytes("ISO-8859-1");
String value = new String(ptext, "UTF-8");
This encodes the text as bytes according to one encoding and decodes those bytes according to a different one. Such code usually only damages the text, often beyond repair. (There are rare occasions in which such code makes sense, in particular if the original string was already damaged by having been decoded with the wrong encoding. But that is not the case here.)
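A small, self-contained sketch of what that round trip does to the string from the question (shortened here). The class and method names are made up for the illustration; only standard JDK calls are used.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class reproducing the encode/decode mismatch from the question.
final class RoundTripDemo {

    // Encode as ISO-8859-1, then decode the same bytes as UTF-8.
    static String roundTrip(String text) {
        // ISO-8859-1 has no code for U+20B9, so getBytes substitutes '?' (0x3F).
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Cash of \u20B910,00,000/-";
        String damaged = roundTrip(original);
        System.out.println(damaged);                  // Cash of ?10,00,000/-
        System.out.println(damaged.equals(original)); // false
    }
}
```

The question mark is therefore already in the Java string before PDFBox ever sees it; the Rupee sign is lost in the first step and cannot be recovered by the second.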
As the OP asked, this is the code that worked for me:
PDDocument doc = new PDDocument();
PDPage page = new PDPage();
doc.addPage(page);
PDPageContentStream cos= new PDPageContentStream(doc, page);
cos.beginText();
String text = "Deposited Cash of ₹10,00,000/- or more in a Saving Bank Account";
cos.newLineAtOffset(25, 700);
cos.setFont(PDType0Font.load(doc, new File("c:/windows/fonts/arial.ttf")), 12);
cos.showText(text);
cos.endText();
cos.close();
doc.save("IndianRupee.pdf");
doc.close();
(ShowSpecialGlyph test testIndianRupeeForVandanaSharma)
The result: the Rupee sign is rendered in the PDF (screenshot not reproduced here).
As @Tilman already stressed, one needs a sufficiently recent font file to make this work: the Indian Rupee Sign ₹ (U+20B9) was introduced to Unicode in version 6.0.0 (October 2010), and it might have taken font developers some time to implement that glyph. E.g. I use ArialMT (arial.ttf) version 6.90 with "(c) 2015 The Monotype Corporation."
And of course, if your font file is not located in "c:/windows/fonts/", use the path it has on your system.
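If you are unsure whether a given font file is recent enough, a rough pre-check is possible with the JDK alone. The sketch below uses java.awt.Font rather than PDFBox, which is an assumption of convenience: the two libraries parse fonts independently, so a positive answer here is a strong hint but not a guarantee for PDFBox. The class name, method name and default path are placeholders.

```java
import java.awt.Font;
import java.awt.FontFormatException;
import java.io.File;
import java.io.IOException;

// Hypothetical helper: asks AWT whether a TrueType file has a glyph for U+20B9.
final class RupeeGlyphCheck {

    static final int RUPEE_SIGN = 0x20B9; // Indian Rupee Sign, added in Unicode 6.0.0

    static boolean hasRupeeGlyph(File ttf) throws IOException, FontFormatException {
        Font font = Font.createFont(Font.TRUETYPE_FONT, ttf);
        return font.canDisplay(RUPEE_SIGN);
    }

    public static void main(String[] args) throws Exception {
        // Placeholder path; pass the location of your own font file as the first argument.
        File ttf = new File(args.length > 0 ? args[0] : "c:/windows/fonts/arial.ttf");
        System.out.println(ttf + " has U+20B9: " + hasRupeeGlyph(ttf));
    }
}
```

An old copy of a font that predates the Rupee sign would be expected to report false, which matches the advice to check the font version.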
Answer 2:
Solution to the above question:
Purpose: Write the Indian rupee symbol (₹) to a PDF using the Apache PDFBox library.
Error: The symbol was not written to the PDF correctly (refer to the question for the exact details).
Approach: I was looking for a font that supports reading/writing Unicode characters in a PDF file. I downloaded many .ttf files for various fonts from the internet, placed them somewhere on my system, and used those .ttf files to read/write (encode/decode) the Unicode character so that I could write it to my PDF file.
Mistake: Whatever font you want to use to read/write a character, the font file for that particular font must be installed on the system. However, I was simply downloading the file and trying to read it in my code.
Solution: As pointed out by @Tilman and @mkl, some default font files are already installed on the system (C:\Windows\Fonts....) (I am using Windows). You can use these pre-installed files for this purpose. Check the version of the font file installed on your system; it should be recent enough to support the latest features. If the installed fonts are not recent enough, you can download the respective font file and install it on your system.
Source: https://stackoverflow.com/questions/49426018/%e2%82%b9-indian-rupee-symbol-symbol-is-printing-as-question-mark-in-pdf-using-apa