I've written a library to do just that. It will also OCR the pdf's using Tesseract or Cuneiform and create searchable, compressed PDF files. It's a library that uses several open source projects (iTextsharp, jbig2 encoder, Aforge, muPDF#) to complete the task. You can check it out here http://hocrtopdf.codeplex.com/