+ 4
Can Python do that?
Hi guys. I've heard that Python can do basically anything, so I was wondering: suppose I have PDF files (say, quarterly publications from some firms) and I need to copy some of the information in them into an Excel file. I'd like to write a program that does this automatically. Can Python do that? If yes, can it do it alone, or does it need other programs (e.g. a PDF editor) installed on my device? Thanks in advance!
2 Answers
+ 4
Of course it can.
Look at this code:
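import PyPDF2  # PyPDF2 3.x API; older releases used PdfFileReader/getPage/extractText

# Open the PDF in binary mode and read the text of its first page
with open('sample.pdf', 'rb') as pdf_file:
    read_pdf = PyPDF2.PdfReader(pdf_file)
    number_of_pages = len(read_pdf.pages)
    page = read_pdf.pages[0]
    page_content = page.extract_text()
print(page_content)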
You can also do it with textract:
from textract import process
text = process('/path/to/file.pdf')  # returns the extracted text as bytes
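For the Excel half of the question, here is a rough sketch of one way to go from extracted text to a spreadsheet. It is only an illustration under a few assumptions: the field labels ("Revenue", "Net income"), the "Label: value" line layout, and the helper names extract_fields and write_rows are all made up for the example, and real reports will need their own patterns. It writes CSV using only the standard library, which Excel opens directly; a native .xlsx file would need an additional library.

```python
import csv
import io
import re

# Hypothetical field labels; adjust to whatever the real reports contain.
FIELDS = ["Revenue", "Net income"]

def extract_fields(text, fields=FIELDS):
    """Return {label: value} for each 'Label: value' line found in text."""
    row = {}
    for label in fields:
        pattern = rf"^{re.escape(label)}\s*:\s*(.+)$"
        match = re.search(pattern, text, re.MULTILINE)
        # Leave the cell empty when the label is missing from the text.
        row[label] = match.group(1).strip() if match else ""
    return row

def write_rows(rows, out_file, fields=FIELDS):
    """Write a header plus one CSV row per report to an open text file."""
    writer = csv.DictWriter(out_file, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)

# Stand-in for the text returned by a PDF extraction step.
sample_text = "Quarterly report Q1\nRevenue: 1200\nNet income: 300\n"

buffer = io.StringIO()
write_rows([extract_fields(sample_text)], buffer)
csv_output = buffer.getvalue()
```

To save to disk instead of a buffer, pass a file opened with open('report.csv', 'w', newline='') to write_rows, and call extract_fields once per PDF.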
+ 2
@Minovsky, thanks man! I'll try it )))