How to convert a PDF file to Excel in C#

南方客 2021-01-24 09:36

I want to extract some data, such as email addresses, from tables in a PDF file, and then use the extracted addresses to send emails to those people.

3 answers
  •  误落风尘
    2021-01-24 09:57

    With the ByteScout PDF Extractor SDK, you can extract each page to a CSV file as shown below.

    using Bytescout.PDFExtractor;

    // Extracts page content as CSV
    CSVExtractor extractor = new CSVExtractor();
    extractor.RegistrationName = "demo";
    extractor.RegistrationKey = "demo";

    // Locates tables in the document
    TableDetector tdetector = new TableDetector();
    tdetector.RegistrationName = "demo";
    tdetector.RegistrationKey = "demo";

    // Load the same document into both objects
    extractor.LoadDocumentFromFile("C:\\sample.pdf");
    tdetector.LoadDocumentFromFile("C:\\sample.pdf");

    int pageCount = tdetector.GetPageCount();

    // Note: this loop assumes 1-based page numbers. If the SDK uses
    // 0-based page indices, iterate from 0 to pageCount - 1 instead.
    for (int i = 1; i <= pageCount; i++)
    {
        int j = 1;

        do
        {
            // Restrict extraction to the rectangle reported for page i
            extractor.SetExtractionArea(
                tdetector.GetPageRect_Left(i),
                tdetector.GetPageRect_Top(i),
                tdetector.GetPageRect_Width(i),
                tdetector.GetPageRect_Height(i)
            );

            // Save the extracted area to a CSV file
            extractor.SavePageCSVToFile(i, "C:\\page-" + i + "-table-" + j + ".csv");
            j++;
        } while (tdetector.FindNextTable()); // look for the next table
    }
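
Since the question asks for the email addresses rather than the CSV files themselves, the exported files could then be scanned for addresses. The following is a minimal sketch using only the .NET base class library. The `EmailCollector` class and its method names are hypothetical, the regular expression is a deliberately loose pattern rather than a full address validator, and the file pattern assumes the `page-N-table-M.csv` naming used above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the ByteScout SDK.
public static class EmailCollector
{
    // Loose pattern: good enough to pick addresses out of CSV cells,
    // but not a complete RFC 5322 validator.
    private static readonly Regex EmailPattern =
        new Regex(@"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}");

    // Returns the distinct addresses found in the text (case-insensitive),
    // in order of first appearance.
    public static List<string> FromText(string text)
    {
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var result = new List<string>();
        foreach (Match m in EmailPattern.Matches(text))
        {
            if (seen.Add(m.Value))
                result.Add(m.Value);
        }
        return result;
    }

    // Reads every CSV file in the folder that matches the assumed
    // naming scheme and collects the addresses from all of them.
    public static List<string> FromCsvFolder(string folder)
    {
        var combined = new StringBuilder();
        foreach (string path in Directory.GetFiles(folder, "page-*-table-*.csv"))
            combined.AppendLine(File.ReadAllText(path));
        return FromText(combined.ToString());
    }
}
```

Calling `EmailCollector.FromCsvFolder("C:\\")` after the export loop would return the collected addresses, which could then be passed to whatever mail-sending code you use, for example `System.Net.Mail.SmtpClient`.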
    
