+ 2

My first WebCrawler

Decided to try my hand at a web crawler today. The first one takes all the links from a page and puts them in a file. When it's done with that page, it picks a link from that file and continues. The second one takes all .jpg files and saves them to a folder. The third one lets you pick a word and counts how many times it appears on a website. Because we are lame, when we found words with only 1 instance, me and the little women went to seek and find them. Since nobody else cares about my programming adventures, figured I'd share.
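
For anyone wondering what the core of those three crawlers might look like, here is a minimal sketch using only the Python 3 standard library. It is an illustration, not the code from this post: the names `PageScanner`, `scan_page`, `jpg_only` and `count_word` are made up, and it works on an HTML string you have already downloaded. Writing links to a file and saving images to a folder are left out.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageScanner(HTMLParser):
    """Hypothetical helper: gathers link targets, image sources and text."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []    # absolute URLs from <a href="...">
        self.images = []   # absolute URLs from <img src="...">
        self.text = []     # raw text chunks (includes script/style text)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "img" and attrs.get("src"):
            self.images.append(urljoin(self.base_url, attrs["src"]))

    def handle_data(self, data):
        self.text.append(data)


def scan_page(html, base_url):
    """Parse one page and return (links, images, text)."""
    scanner = PageScanner(base_url)
    scanner.feed(html)
    scanner.close()
    return scanner.links, scanner.images, " ".join(scanner.text)


def jpg_only(urls):
    """Keep only URLs that end in .jpg, ignoring case."""
    return [u for u in urls if u.lower().endswith(".jpg")]


def count_word(text, word):
    """Count whole-word, case-insensitive occurrences of word in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)[word.lower()]


if __name__ == "__main__":
    page = '<a href="/about">About us</a> <img src="cat.JPG"> <p>Cats and cats</p>'
    links, images, text = scan_page(page, "http://example.com/dir/")
    print(links)
    print(jpg_only(images))
    print(count_word(text, "cats"))
```

A crawler loop would then keep a queue of the collected links, pop one, download it, and call `scan_page` again, while remembering which URLs it has already visited so it does not go in circles.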

16th Apr 2017, 11:56 PM
LordHill
9 Answers
+ 3
That's very interesting. You might motivate me to make a web crawler.
17th Apr 2017, 12:03 AM
Edward
+ 23
Nice job. I didn't catch the 2nd sentence of the 3rd part, though.
17th Apr 2017, 12:02 AM
Illusive Man
+ 10
Very Nice. ^.^
17th Apr 2017, 12:50 AM
Style Jr.
+ 3
The only problem is I don't know much about reading from websites and stuff like that, so do you think you can show me which tutorials to start looking into?
17th Apr 2017, 12:07 AM
Edward
+ 3
A web crawler is a nice project if you have some experience with web search engines. There are a lot of exciting applications, and a lot of complexities too.
17th Apr 2017, 12:44 AM
⏩▶Clau◀⏪
+ 2
OK, thanks, I will start there.
17th Apr 2017, 12:12 AM
Edward
+ 1
Very nice, well done. :D
17th Apr 2017, 12:05 AM
DaemonThread
0
Do it! I very much enjoyed myself
17th Apr 2017, 12:06 AM
LordHill
0
https://m.youtube.com/watch?v=qfGthiqwaZo This is the tutorial I watched, but his method didn't work for me. html2text didn't work at all, I think it's a Python 2 vs 3 issue. urllib2 had the same problem, but I built on the same principles and after some tinkering it is working excellently.
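
On the Python 2 vs 3 point: in Python 3 the `urllib2` module no longer exists, and its download functions live in `urllib.request`. Below is a rough sketch of one way to fetch a page and pull out its text with only the standard library. It is not the tutorial's method and it does not reproduce what html2text does; `fetch_html`, `TextExtractor` and `html_to_text` are hypothetical names, and falling back to UTF-8 when the server declares no charset is an assumption.

```python
import urllib.request
from html.parser import HTMLParser


def fetch_html(url, timeout=10):
    """Download a page and decode it to str using urllib.request (Python 3)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Assumption: fall back to UTF-8 if no charset is declared.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class TextExtractor(HTMLParser):
    """Hypothetical helper: keeps text, skips <script> and <style> bodies."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Store non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the page text, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


if __name__ == "__main__":
    sample = "<body><p>Hello</p><script>var x = 1;</script><p>world</p></body>"
    print(html_to_text(sample))
    # For a live page: print(html_to_text(fetch_html("http://example.com/")))
```

Some sites reject requests without a browser-like User-Agent header, so a real crawler may need to build a `urllib.request.Request` with custom headers instead of passing a bare URL.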
17th Apr 2017, 12:10 AM
LordHill