How to extract text from a PDF in Python 3.7

后悔当初 2020-12-29 10:19

I am trying to extract text from a PDF file using Python. My main goal is to create a program that reads a bank statement and extracts its text to update an exce

10 answers

南笙 (original poster)

2020-12-29 10:59
Using tika worked for me!
```
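from tika import parser

# from_file returns a dict; the extracted text is stored under 'content'
rawText = parser.from_file('January2019.pdf')

# Split the text into a list with one entry per line
rawList = rawText['content'].splitlines()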
```
This made it really easy to separate each line of the bank statement into a list.
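In case it helps anyone, here is a rough sketch of how the list could be tidied up before further processing. Extracted text often contains blank lines and stray whitespace, and as far as I know `content` can be `None` when no text is found (for example, a scanned PDF). The helper names `clean_lines` and `extract_lines` are my own, not part of tika, and tika is imported inside the function because it needs the `tika` package and a Java runtime.

```python
from typing import List, Optional


def clean_lines(content: Optional[str]) -> List[str]:
    """Split extracted text into stripped, non-empty lines (hypothetical helper)."""
    if not content:
        # Assumption: 'content' may be None or empty when no text was extracted
        return []
    return [line.strip() for line in content.splitlines() if line.strip()]


def extract_lines(pdf_path: str) -> List[str]:
    """Parse a PDF with tika and return its cleaned lines (hypothetical helper)."""
    # Imported lazily so clean_lines can be used without tika installed
    from tika import parser

    parsed = parser.from_file(pdf_path)
    return clean_lines(parsed.get("content"))
```

With this, something like `lines = extract_lines('January2019.pdf')` should give the same list as above, minus the empty entries.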