+ 1

Why does my web crawler only work for selected websites?

I built a web crawler based on thenewboston's Python tutorials and had the program output all the links on Wikipedia's main page, which it did without any problems. I also tried having the code output all the section captions and a few other things. It all worked perfectly. But when I simply wanted to output the links on Amazon's main page, it did nothing. Is Amazon not allowing me to do that, or is it the modules I used? (I used BeautifulSoup from the bs4 module in PyCharm.)

23rd Mar 2019, 4:57 PM
SohndesZeus
2 Answers
+ 4
Try adding a header to your request, as suggested in this Stack Overflow topic: https://stackoverflow.com/questions/23555283/why-cant-i-scrape-amazon-by-beautifulsoup
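As a rough sketch of what adding a header could look like: Python's HTTP clients announce themselves with a default User-Agent such as "Python-urllib/3.x", and some sites seem to answer such clients with an empty or error page. The example below sends browser-like headers instead. It uses only the standard library so it runs without extra installs, rather than BeautifulSoup. The names `BROWSER_HEADERS`, `build_request`, `extract_links`, and `fetch_links` are made up for illustration, and the header values are assumptions, not values any site is known to require.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Illustrative browser-like headers. The exact strings are an assumption;
# any realistic browser User-Agent could be used instead.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def build_request(url):
    """Return a Request that carries the browser-like headers."""
    return Request(url, headers=BROWSER_HEADERS)


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def fetch_links(url):
    """Download a page with the custom headers and return its links."""
    with urlopen(build_request(url), timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_links(html)
```

Calling `fetch_links("https://www.amazon.com/")` would then perform the actual download. With the requests library, the same dictionary can be passed as `requests.get(url, headers=BROWSER_HEADERS)` and the response text handed to BeautifulSoup as before. A site may still block the request or return a CAPTCHA page, so an empty result does not necessarily mean the parsing code is wrong.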
27th Mar 2019, 2:13 PM
Tibor Santa
+ 2
Thank you, I'm going to test that.
27th Mar 2019, 6:45 PM
SohndesZeus