+ 2

My first WebCrawler

I decided to try my hand at a web crawler today. The first one takes all the links from a page and puts them in a file; when it is done with that page, it picks a link from that file and continues. The second one takes all .jpg files and saves them to a folder. The third one lets you pick a word and counts how many times it appears on a website. Because we are lame, when we found words with only 1 instance, me and the little women went to seek them out and find them. Since nobody else cares about my programming adventures, I figured I'd share.
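For anyone curious how the three ideas above might look in Python 3, here is a minimal sketch using only the standard library. It is not the code from this post: the names `PageParser`, `parse_page`, and `count_word` are made up for illustration, and it works on an HTML string you have already downloaded, so fetching pages, saving the link file, and saving images to a folder are left out.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Collects links, .jpg image URLs, and visible text from one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []       # crawler 1: every <a href="...">
        self.images = []      # crawler 2: every <img src="....jpg">
        self.text_parts = []  # crawler 3: text used for word counting
        self._skip = 0        # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "img" and attrs.get("src"):
            src = urljoin(self.base_url, attrs["src"])
            if src.lower().endswith(".jpg"):
                self.images.append(src)
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Ignore JavaScript and CSS so they do not inflate word counts.
        if not self._skip:
            self.text_parts.append(data)


def parse_page(html, base_url):
    """Return (links, jpg_urls, visible_text) for one HTML document."""
    parser = PageParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links, parser.images, " ".join(parser.text_parts)


def count_word(text, word):
    """Count whole-word, case-insensitive occurrences of word in text."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))
```

A crawler along these lines would append `links` to its to-visit file, download each URL in `images`, and call `count_word` on the text of every page it visits.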

python web crawler

16th Apr 2017, 11:56 PM

LordHill

9 answers

+ 3

That's very interesting. You might motivate me to make a web crawler.

17th Apr 2017, 12:03 AM

Edward

+ 23

Nice job. I didn't catch the 2nd sentence of the 3rd paragraph.

17th Apr 2017, 12:02 AM

Illusive Man

+ 10

Very Nice. ^.^

17th Apr 2017, 12:50 AM

Style Jr.

+ 3

The only problem is I don't know much about reading from websites and stuff like that, so do you think you can show me what tutorials to start looking into?

17th Apr 2017, 12:07 AM

Edward

+ 3

A web crawler is a nice project if you have some experience with web search engines. There are a lot of exciting applications, and a lot of complexities too.

17th Apr 2017, 12:44 AM

⏩▶Clau◀⏪

+ 2

OK, thanks, I will start there.

17th Apr 2017, 12:12 AM

Edward

+ 1

Very nice, well done. :D

17th Apr 2017, 12:05 AM

DaemonThread

Do it! I very much enjoyed myself

17th Apr 2017, 12:06 AM

LordHill

https://m.youtube.com/watch?v=qfGthiqwaZo This is the tutorial I watched, but his method didn't work. html2text didn't work for me at all; I think it's a Python 2 vs 3 issue. urllib2 had the same problem, but I built on the same principles, and after some tinkering it is working excellently.
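On the urllib2 problem: in Python 3 the `urllib2` module no longer exists, and its functions live in `urllib.request`. A rough sketch of a page fetcher for Python 3 is below. The name `fetch_html`, the User-Agent string, and the 10-second timeout are my own assumptions for illustration, not something taken from the tutorial.

```python
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10):
    """Download a URL and return its body as text (Python 3 only)."""
    # Some sites reject the default Python user agent, so send a custom one.
    # The exact string here is an arbitrary placeholder.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0 (learning crawler)"})
    with urlopen(request, timeout=timeout) as response:
        # Use the charset the server declares, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Usage would be something like `html = fetch_html("https://example.com/")`, after which the HTML string can be parsed for links or searched for words.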

17th Apr 2017, 12:10 AM

LordHill