+ 1

PDF to CSV or Excel with Python

Is it possible to download a PDF from Python and convert the response to a CSV or an Excel file? I can parse the response and save it as a PDF, but I would like to turn it into CSV or Excel without the intermediate step of saving it as a PDF (although going through the PDF file is OK too). When I ask Python for the type of the response, it says "bytes". Thanks ☺️

15th Aug 2019, 8:09 PM
Zalo203
6 Answers
+ 3
Zalo203 If possible, making your PDF available via a download link could be useful for people willing to take a closer look.
15th Aug 2019, 8:54 PM
David Carroll
+ 1
A PDF can contain many different types of content: images, blocks of text, and so on. Converting it to a spreadsheet format is not straightforward, if it is possible at all. What format is your original data in? If it can be captured in a pandas DataFrame, then it is easy to export to Excel or CSV with pandas' built-in functions.
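A minimal sketch of the pandas export step, assuming the values have already been pulled out of the PDF somehow. The rows and column names here are made up for illustration:

```python
import pandas as pd

# Hypothetical rows standing in for values already extracted from a PDF.
rows = [
    {"item": "Widget", "quantity": 2, "price": 9.5},
    {"item": "Gadget", "quantity": 1, "price": 20.0},
]

df = pd.DataFrame(rows)

# With no path given, to_csv returns the CSV as a string.
# Pass a file name instead (e.g. "output.csv") to save it to disk.
csv_text = df.to_csv(index=False)
print(csv_text)

# Excel export works the same way, but needs an engine such as openpyxl installed:
# df.to_excel("output.xlsx", index=False)
```

The hard part is getting the PDF content into rows in the first place; once it is in a DataFrame, the export itself is one call.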
15th Aug 2019, 8:29 PM
Tibor Santa
+ 1
I am trying to do that: get the data inside the PDF into a DataFrame so I can manipulate it with pandas, but I can't find a way to do it. The PDF is mostly text. I tried an online converter and it worked quite well, but I'm trying to do it without having to upload all the PDFs and then download the converted files. Thanks for answering ☺️
15th Aug 2019, 8:41 PM
Zalo203
+ 1
OK, so does the PDF contain some sort of table? If you have to do this repeatedly, does the PDF consistently have the same structure? Even the same number of table rows and columns? In any case, you will need to find the right library to process the PDF. I would start looking here: https://realpython.com/pdf-python/
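As one possible starting point, here is a rough sketch that reads the text straight from the downloaded bytes, so no PDF file has to be saved first. It assumes the third-party pypdf library (pip install pypdf) and a text-based PDF rather than scanned images; the function names are my own, not from any library:

```python
import io


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the bytes start with the PDF header marker."""
    return data.startswith(b"%PDF-")


def pdf_bytes_to_lines(data: bytes) -> list[str]:
    """Extract non-empty text lines from PDF bytes held in memory."""
    if not looks_like_pdf(data):
        raise ValueError("the response does not look like a PDF")

    # Imported here so the header check above works even without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(io.BytesIO(data))  # wrap the bytes in a file-like object
    lines = []
    for page in reader.pages:
        text = page.extract_text() or ""  # pages without text yield nothing
        lines.extend(line.strip() for line in text.splitlines() if line.strip())
    return lines
```

You would call it as pdf_bytes_to_lines(response_bytes), where response_bytes is the "bytes" object you already get back. How clean the extracted lines are depends a lot on how the PDF was produced, so expect to inspect the output before relying on it.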
15th Aug 2019, 8:45 PM
Tibor Santa
+ 1
It doesn't contain tables exactly, but the content can be interpreted as a kind of table, so I can filter it somehow once the data is in Excel or a similar format. I will take a look at the link. Many thanks ☺️
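A small sketch of that "kind of table" idea, using only the standard library. It assumes the extracted text lines separate their columns with two or more spaces or a tab, which may not hold for every PDF; the sample lines and function names are invented for illustration:

```python
import csv
import io
import re


def lines_to_rows(lines):
    """Split each non-empty line into cells at tabs or runs of 2+ whitespace."""
    return [re.split(r"\s{2,}|\t", line.strip()) for line in lines if line.strip()]


def rows_to_csv_text(rows):
    """Serialize rows to CSV text; the csv module handles quoting of commas."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical lines, as they might come out of a text-only PDF.
sample_lines = [
    "Name        Amount   Date",
    "Alice       10.50    2019-08-01",
    "Bob Smith   7.00     2019-08-02",
]

rows = lines_to_rows(sample_lines)
print(rows_to_csv_text(rows))
```

Single spaces are kept inside a cell (so "Bob Smith" stays together), and the resulting CSV can be opened in Excel or loaded with pandas for filtering.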
15th Aug 2019, 8:49 PM
Zalo203
+ 1
Yes, but unfortunately I can't upload it, sorry. A method for standard PDFs without tables or images would be useful. Thanks for the help.
15th Aug 2019, 9:04 PM
Zalo203