Question
I am trying to extract some data from a PDF file in Java using Apache PDFBox (1.8.9). I have added the jar to my build path and classpath (in Eclipse Mars).
I am getting a NullPointerException while creating a PDFTextStripper object.
import java.io.File;
import org.apache.pdfbox.util.PDFTextStripper;
import org.apache.pdfbox.pdmodel.PDDocument;

public class MainClass {
    public static void main(String[] args) {
        PDDocument pd;
        try {
            StringBuilder sb = new StringBuilder();
            File input = new File("C:\\Result.pdf");
            pd = PDDocument.load(input);
            PDFTextStripper s = new PDFTextStripper();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The error I am getting is:
java.lang.NullPointerException
at org.apache.pdfbox.util.TextNormalize.findICU4J(TextNormalize.java:54)
at org.apache.pdfbox.util.TextNormalize.<init>(TextNormalize.java:45)
at org.apache.pdfbox.util.PDFTextStripper.<init>(PDFTextStripper.java:229)
at MainClass.main(MainClass.java:17)
(Line 17 is where I am trying to create a PDFTextStripper object)
Answer 1:
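Looking at the source of the TextNormalize class (the top frame in your stack trace), it appears that when the ICU4J classes cannot be found, the ClassNotFoundException is caught and the icu4j field is set to null.
You need the ICU4J jar as a dependency. These classes are loaded at run time.
From TextNormalize: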
// see if we can load the icu4j classes from the classpath
try
{
    this.getClass().getClassLoader().loadClass("com.ibm.icu.text.Bidi");
    this.getClass().getClassLoader().loadClass("com.ibm.icu.text.Normalizer");
    icu4j = new ICU4JImpl();
}
catch (ClassNotFoundException e)
{
    icu4j = null;
}
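To see what that lookup would find on your own classpath, a small standalone check along these lines may help. This is only a sketch: ClasspathCheck and its two methods are hypothetical names, not part of PDFBox, and the only PDFBox-related detail used is the pair of class names from the snippet above. It also reports which class loader owns a class, since Class.getClassLoader() is documented to return null for classes loaded by the bootstrap loader, which might be worth ruling out when a chained getClassLoader().loadClass(...) call throws a NullPointerException.

```java
// Hypothetical helper, not part of PDFBox: reports whether a class can be
// found and which class loader owns an already loaded class.
class ClasspathCheck {

    // Returns true if the named class is visible to this class's loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Class.getClassLoader() returns null for classes loaded by the bootstrap loader.
    static String describeLoader(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        return loader == null ? "bootstrap" : loader.toString();
    }

    public static void main(String[] args) {
        // The two class names are the ones probed in the snippet above.
        String[] names = {"com.ibm.icu.text.Bidi", "com.ibm.icu.text.Normalizer"};
        for (String name : names) {
            System.out.println(name + " on classpath: " + isOnClasspath(name));
        }
        System.out.println("Loader of this class: " + describeLoader(ClasspathCheck.class));
    }
}
```

Run it with the same classpath as MainClass. If both names report false, the ICU4J jar is not visible to the application.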
Answer 2:
You are missing a dependency. Please make sure the following three jars are present on your classpath:
I ran the code from your question with those three jars and did not get any NPE.
Also check your pdfbox-1.8.9.jar and make sure that it is not corrupted.
The PDFTextStripper class is in pdfbox-1.8.9.jar, so it looks to me as if this jar is corrupted.
Download the jar again and try.
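One rough way to test the jar before downloading it again is to open it with the JDK's own java.util.jar.JarFile and look for the class entry. The sketch below assumes the entry name org/apache/pdfbox/util/PDFTextStripper.class, derived from the import in the question. JarCheck and containsEntry are hypothetical names, and the jar path is whatever you pass on the command line.

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper, not part of PDFBox: checks that a jar can be opened
// and that it contains a given entry.
class JarCheck {

    // Returns false if the file is missing, unreadable, not a valid jar,
    // or does not contain the entry.
    static boolean containsEntry(File jar, String entryName) {
        try (JarFile jarFile = new JarFile(jar)) {
            return jarFile.getEntry(entryName) != null;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.out.println("usage: java JarCheck <path-to-jar>");
            return;
        }
        // Entry name assumed from the import org.apache.pdfbox.util.PDFTextStripper
        String entry = "org/apache/pdfbox/util/PDFTextStripper.class";
        System.out.println(entry + " found: " + containsEntry(new File(args[0]), entry));
    }
}
```

A result of false means the jar could not be read or lacks the class. A result of true only shows that the entry is listed, not that every byte of the jar is intact.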
Source: https://stackoverflow.com/questions/32093905/pdftextstripper-nullpointerexception