I have thousands of PDF files on my computer, named a0001.pdf through a3621.pdf, and each one contains a title, e.g. "aluminum carbona
What you need is a library that can actually read PDF files. For example, pdfrw:
In [8]: from pdfrw import PdfReader
In [9]: reader = PdfReader('example.pdf')
In [10]: reader.Info.Title
Out[10]: 'Example PDF document'
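To apply the same lookup to the whole a0001.pdf to a3621.pdf range, something like the following might work. It is only a sketch: `read_title` and `collect_titles` are hypothetical helper names, the files are assumed to sit in a single directory, and it assumes the title you want is stored in each PDF's Info dictionary (if it only appears in the page text, this will return None for that file).

```python
from pathlib import Path


def read_title(pdf_path):
    """Return the Title entry of a PDF's Info dictionary, or None if absent."""
    # Third-party dependency (pip install pdfrw), imported here so the
    # rest of the sketch can be used without it.
    from pdfrw import PdfReader

    info = PdfReader(str(pdf_path)).Info
    title = info.Title if info is not None else None
    if title is None:
        return None
    # pdfrw hands back a PdfString; use its decode() when available,
    # otherwise fall back to the plain string form.
    return title.decode() if hasattr(title, "decode") else str(title)


def collect_titles(directory, first=1, last=3621, reader=read_title):
    """Map each existing aNNNN.pdf file name in `directory` to its title."""
    titles = {}
    for n in range(first, last + 1):
        path = Path(directory) / f"a{n:04d}.pdf"  # a0001.pdf, a0002.pdf, ...
        if path.exists():  # silently skip gaps in the numbering
            titles[path.name] = reader(path)
    return titles
```

Calling `collect_titles(".")` in the folder holding the files should give a dictionary of file name to title, which you can then inspect before doing anything further with it. The `reader` parameter is there only so a different title-extraction function can be swapped in.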