When I extract text from a PDF file using iText I am getting values from previous pages

Submitted by 孤人 on 2019-12-12 14:42:47

Question


I am trying to extract a block of text from a specific location from each page in a multiple page PDF file.

I have the location of the text, and I can extract it correctly on the first page. On every later page, however, the extracted text appears to accumulate.

For example, if the text value on page 1 is "A", on page 2 is "B", and on page 3 is "C", then I receive the following values in my output string on each iteration of my for loop:

Loop1 : output = A

Loop2 : output = B A

Loop3 : output = C B A

I am using iTextSharp in my project, written in C#.

Any help would be appreciated.

var reader = new PdfReader(foregroundFile);

RectangleJ customerIdRectangle = new RectangleJ(0, 495, 108, 27);
RenderFilter[] filters = new RenderFilter[1];
LocationTextExtractionStrategy regionFilter = new LocationTextExtractionStrategy();
filters[0] = new RegionTextRenderFilter(customerIdRectangle);
FilteredTextRenderListener strategy = new FilteredTextRenderListener(regionFilter, filters);

for (int i = 1; i <= reader.NumberOfPages; i++)
{
    string output = "";
    output = PdfTextExtractor.GetTextFromPage(reader, i, strategy);
    Console.WriteLine(output);
}

Answer 1:


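A LocationTextExtractionStrategy keeps every text chunk it receives, so reusing a single instance for all pages makes the text from earlier pages show up again in later results. Create a new strategy for each page by moving its construction inside the loop: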

var reader = new PdfReader(foregroundFile);

// The rectangle only describes the region, so it can be shared by all pages.
RectangleJ customerIdRectangle = new RectangleJ(0, 495, 108, 27);

for (int i = 1; i <= reader.NumberOfPages; i++)
{
    // Build the filter and the strategy inside the loop so that
    // each page starts with a strategy that holds no earlier text.
    RenderFilter[] filters = { new RegionTextRenderFilter(customerIdRectangle) };
    LocationTextExtractionStrategy regionFilter = new LocationTextExtractionStrategy();
    FilteredTextRenderListener strategy = new FilteredTextRenderListener(regionFilter, filters);
    string output = PdfTextExtractor.GetTextFromPage(reader, i, strategy);
    Console.WriteLine(output);
}
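
To see why the position of the `new` matters, here is a minimal library-free sketch of the same mechanism. It does not use iTextSharp at all: `AccumulatingStrategy`, `Demo`, and the `Pages` array are hypothetical stand-ins invented for illustration, assuming only that a strategy stores each chunk it is handed and returns all stored chunks when asked for its text.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a stateful extraction strategy:
// it remembers every chunk of text it has been given.
class AccumulatingStrategy
{
    private readonly List<string> chunks = new List<string>();

    public void RenderText(string text) { chunks.Add(text); }

    public string GetResultantText() { return string.Join(" ", chunks); }
}

static class Demo
{
    // Hypothetical page contents, matching the example in the question.
    static readonly string[] Pages = { "A", "B", "C" };

    // One strategy for all pages: stored chunks carry over to the next page.
    public static List<string> ExtractWithSharedStrategy()
    {
        var results = new List<string>();
        var strategy = new AccumulatingStrategy();
        foreach (var page in Pages)
        {
            strategy.RenderText(page);
            results.Add(strategy.GetResultantText());
        }
        return results;
    }

    // A new strategy per page: each page starts from an empty chunk list.
    public static List<string> ExtractWithFreshStrategy()
    {
        var results = new List<string>();
        foreach (var page in Pages)
        {
            var strategy = new AccumulatingStrategy();
            strategy.RenderText(page);
            results.Add(strategy.GetResultantText());
        }
        return results;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" | ", ExtractWithSharedStrategy()));
        Console.WriteLine(string.Join(" | ", ExtractWithFreshStrategy()));
    }
}
```

The shared version prints `A | A B | A B C` and the per-page version prints `A | B | C`. The stand-in appends chunks in arrival order, so the accumulated text comes out in a different order than the "B A" reported in the question; the real strategy decides the order of its chunks itself, but the accumulation is the same effect.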


Source: https://stackoverflow.com/questions/20959292/when-i-extract-text-from-a-pdf-file-using-itext-i-am-getting-values-from-previou
