How to scrape data from online PDFs?
There's a lot of data I would like to collect from tables in a number of PDFs hosted on a particular website. How do I scrape the data so that I don't need to open each PDF file and search for the specific data I need? Or where should I get started? (I'm not experienced in web scraping, but I know Python, HTML, CSS and a bit of JavaScript.)
1 Answer
R has two useful libraries that can help you: "rvest" and "pdftools". With the two of them you can get information from web pages and from PDFs, respectively.
You can look them up in their documentation or take this edX course, which has many examples:
https://www.edx.org/course/data-science-wrangling-3
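Since you mention knowing Python, here is a rough sketch of the first step (the part rvest would handle in R): finding every PDF link on a page. It uses only the Python standard library instead of rvest, and the names `PdfLinkParser` and `find_pdf_links`, the sample HTML and the example.com URL are all made up for illustration. Treat it as a starting point under those assumptions, not a finished scraper.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkParser(HTMLParser):
    """Collects the target of every <a> tag whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Look only at the path, so query strings and fragments are ignored.
        if href and urlparse(href).path.lower().endswith(".pdf"):
            # Resolve relative links against the page URL.
            self.pdf_links.append(urljoin(self.base_url, href))


def find_pdf_links(html, base_url):
    """Return absolute URLs of all PDF links found in the HTML text."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_links


# Hypothetical page content and URL, for illustration only.
sample_html = '<a href="/reports/2020.pdf">2020</a> <a href="about.html">About</a>'
print(find_pdf_links(sample_html, "https://example.com/data/"))
```

For a real site you would fetch the page first (for example with `urllib.request.urlopen`) and pass its decoded HTML to `find_pdf_links`, then download each link. Pulling the tables out of the downloaded PDFs needs a third-party library; pdfplumber and tabula-py are two Python options that play the role pdftools plays in R.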