Can anyone point me to a tutorial.
My main experience with Solr is indexing CSV files. But I cannot find any simple instructions/tutorial to tell me what I need to d
The hardest part of this is getting the metadata from the PDFs, using a tool like Aperture simplifies this. There must be tonnes of these tools
Aperture is a Java framework for extracting and querying full-text content and metadata from PDF files
Apeture grabbed the metadata from the PDFs and stored it in xml files.
I parsed the xml files using lxml and posted them to solr