Indexing PDF with Solr
Can anyone point me to a tutorial. My main experience with Solr is indexing CSV files. But I cannot find any simple instructions/tutorial to tell me what I need to do to index pdfs. I have seen this: http://wiki.apache.org/solr/ExtractingRequestHandler But it makes very little sense to me. Do I need to install Tika? Im lost - please help With solr-4.9 (the latest version as of now), extracting data from rich documents like pdfs, spreadsheets(xls, xlxs family), presentations(ppt, ppts), documentation(doc, txt etc) has become fairly simple. The sample code examples provided in the downloaded