Question
In my Windows 8 application, I would like to read a PDF line by line and store the lines in a string array. How can I do it?
public StringBuilder addd = new StringBuilder();
string[] array;

private async void btndosyasec_Click(object sender, RoutedEventArgs e)
{
    FileOpenPicker openPicker = new FileOpenPicker();
    openPicker.ViewMode = PickerViewMode.List;
    openPicker.SuggestedStartLocation = PickerLocationId.PicturesLibrary;
    openPicker.FileTypeFilter.Add(".pdf");
    StorageFile file = await openPicker.PickSingleFileAsync();
    if (file != null)
    {
        PdfReader reader = new PdfReader((await file.OpenReadAsync()).AsStream());
        // Allocate one slot per page before filling the array.
        array = new string[reader.NumberOfPages];
        for (int page = 1; page <= reader.NumberOfPages; page++)
        {
            // Extract each page once; PDF pages are 1-based, array indexes are 0-based.
            string tmp = PdfTextExtractor.GetTextFromPage(reader, page);
            addd.Append(tmp);
            array[page - 1] = tmp;
        }
        // Close the reader only after all pages have been read.
        reader.Close();
    }
}
Answer 1:
Hi, I had this problem too. I used the code below and it worked.
You will need a reference to the iTextSharp library.
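using System.Collections.Generic;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

PdfReader reader = new PdfReader(@"D:\test pdf\Blood Journal.pdf");
int intPageNum = reader.NumberOfPages;
List<string> lines = new List<string>();
string[] words;
string text;
for (int i = 1; i <= intPageNum; i++)
{
    text = PdfTextExtractor.GetTextFromPage(reader, i, new LocationTextExtractionStrategy());
    // The text of a page comes back as one string; split it on newlines.
    words = text.Split('\n');
    for (int j = 0, len = words.Length; j < len; j++)
    {
        lines.Add(Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(words[j])));
    }
}
reader.Close();
string[] array = lines.ToArray();
Inside the loop, the words array contains the lines of the current page; once the loop has finished, array contains the lines of the whole PDF file.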
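The splitting step can also be pulled out into a small helper that does not depend on iTextSharp, which makes it easier to test. The following is only a sketch under assumptions: PdfLineHelper and SplitIntoLines are hypothetical names that do not come from the library, and the sketch assumes each page's text has already been extracted into a string (for example with PdfTextExtractor.GetTextFromPage). It also assumes that empty lines should be dropped, which may not be what you want.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of iTextSharp.
public static class PdfLineHelper
{
    // Flattens the extracted text of each page into one array of lines.
    public static string[] SplitIntoLines(IEnumerable<string> pageTexts)
    {
        var lines = new List<string>();
        foreach (string pageText in pageTexts)
        {
            // Skip pages for which no text was extracted.
            if (string.IsNullOrEmpty(pageText))
            {
                continue;
            }

            // Accept both "\r\n" and "\n" line endings and drop empty lines
            // (an assumption of this sketch).
            string[] parts = pageText.Split(
                new[] { "\r\n", "\n" },
                StringSplitOptions.RemoveEmptyEntries);
            lines.AddRange(parts);
        }
        return lines.ToArray();
    }
}
```

To use it, collect the string returned for each page into a list and pass that list to PdfLineHelper.SplitIntoLines; the result is a single string array with one entry per non-empty line.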
Source: https://stackoverflow.com/questions/25424816/how-to-read-a-pdf-file-line-by-line-in-c