Question
I used the following code to extract text from a particular region of a PDF page. I want to get only the bold text present in that region.
Rectangle rect = new Rectangle(0,0,250,250);
RenderFilter filter = new RegionTextRenderFilter(rect);
fontBasedTextExtractionStrategy strategy = new fontBasedTextExtractionStrategy();
strategy = new FilteredTextRenderListener(new LocationTextExtractionStrategy(), filter); //Throws Error.
To start with, would creating a new class called fontBasedTextExtractionStrategy, instead of using a simple TextExtractionStrategy, help? Something like the code below:
public class fontBasedTextExtractionStrategy implements TextExtractionStrategy {
    private String text;

    @Override
    public void beginTextBlock() {
    }

    @Override
    public void renderText(TextRenderInfo renderInfo) {
        text = renderInfo.getText();
        System.out.println(renderInfo.getFont().getFontType());
        System.out.print(text);
    }

    @Override
    public void endTextBlock() {
    }

    @Override
    public void renderImage(ImageRenderInfo renderInfo) {
    }

    @Override
    public String getResultantText() {
        return text;
    }
}
But then again, how do I call it properly?
Answer 1:
Please take a look at the ParseCustom example. In this example, we create a custom RenderFilter (not a TextExtractionStrategy):
class FontRenderFilter extends RenderFilter {
    public boolean allowText(TextRenderInfo renderInfo) {
        String font = renderInfo.getFont().getPostscriptFontName();
        return font.endsWith("Bold") || font.endsWith("Oblique");
    }
}
This filter lets through only text whose PostScript font name ends with Bold or Oblique; all other text is discarded.
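The suffix test can be tried in isolation, without a PDF, to see which font names it accepts. The following is only a minimal sketch: the class name FontNameCheck, the helper isBoldOrOblique, and the sample font names are assumptions made for illustration and are not part of iText or of the ParseCustom example.

```java
import java.util.Arrays;
import java.util.List;

// Stand-alone illustration of the name test used in FontRenderFilter.allowText.
// The class, the helper, and the sample names are hypothetical.
class FontNameCheck {

    // Same rule as the filter: accept names ending in "Bold" or "Oblique".
    static boolean isBoldOrOblique(String postscriptFontName) {
        return postscriptFontName.endsWith("Bold")
                || postscriptFontName.endsWith("Oblique");
    }

    public static void main(String[] args) {
        List<String> samples = Arrays.asList(
                "Helvetica", "Helvetica-Bold", "Helvetica-Oblique",
                "Helvetica-BoldOblique", "ABCDEF+Arial-BoldMT");
        for (String name : samples) {
            System.out.println(name + " -> " + isBoldOrOblique(name));
        }
    }
}
```

Note that a name such as Arial-BoldMT does not end with Bold, so this rule would skip it. If the fonts in your document follow a different naming pattern, a looser test (for example, checking whether the name contains Bold) may be more suitable.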
This is how you use this filter:
public void parse(String filename) throws IOException {
    PdfReader reader = new PdfReader(filename);
    Rectangle rect = new Rectangle(36, 750, 559, 806);
    RenderFilter regionFilter = new RegionTextRenderFilter(rect);
    FontRenderFilter fontFilter = new FontRenderFilter();
    TextExtractionStrategy strategy = new FilteredTextRenderListener(
            new LocationTextExtractionStrategy(), regionFilter, fontFilter);
    System.out.println(PdfTextExtractor.getTextFromPage(reader, 1, strategy));
    reader.close();
}
As you can see, we create a FilteredTextRenderListener that takes two filters: a RegionTextRenderFilter and our self-made filter based on the font.
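To illustrate how two filters combine, here is a simplified model in plain Java. It assumes that a text chunk reaches the extraction strategy only when every filter accepts it, which is how I understand FilteredTextRenderListener to behave. The names FilterChainSketch, Chunk, IN_REGION, BOLD_OR_OBLIQUE, and allowedByAll are hypothetical stand-ins, not iText classes, and the region check is reduced to a single point rather than a text segment.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Simplified model of filter chaining; none of these types come from iText.
class FilterChainSketch {

    // Hypothetical text chunk: its text, font name, and the start of its baseline.
    static final class Chunk {
        final String text;
        final String fontName;
        final float x;
        final float y;

        Chunk(String text, String fontName, float x, float y) {
            this.text = text;
            this.fontName = fontName;
            this.x = x;
            this.y = y;
        }
    }

    // Stand-in for the region filter, using the rectangle (36, 750, 559, 806).
    static final Predicate<Chunk> IN_REGION =
            c -> c.x >= 36 && c.x <= 559 && c.y >= 750 && c.y <= 806;

    // Stand-in for FontRenderFilter: same suffix rule on the font name.
    static final Predicate<Chunk> BOLD_OR_OBLIQUE =
            c -> c.fontName.endsWith("Bold") || c.fontName.endsWith("Oblique");

    // A chunk is kept only when every filter accepts it (logical AND).
    static boolean allowedByAll(Chunk chunk, List<Predicate<Chunk>> filters) {
        for (Predicate<Chunk> filter : filters) {
            if (!filter.test(chunk)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Predicate<Chunk>> filters = Arrays.asList(IN_REGION, BOLD_OR_OBLIQUE);
        Chunk[] chunks = {
            new Chunk("Title", "Helvetica-Bold", 40f, 780f),  // in region, bold
            new Chunk("body", "Helvetica", 40f, 780f),        // in region, not bold
            new Chunk("Footer", "Helvetica-Bold", 40f, 30f)   // bold, outside region
        };
        for (Chunk c : chunks) {
            System.out.println(c.text + " -> " + allowedByAll(c, filters));
        }
    }
}
```

In this model only the first chunk passes, because it is the only one that satisfies both the region filter and the font filter.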
Source: https://stackoverflow.com/questions/24506830/can-we-use-text-extraction-strategy-after-applying-location-extraction-strategy