+ 1
Need help with code
I want to create a folder and scrape the images from a website into that folder. The code creates the folder but doesn't download any images. A little help would be great. It needs to run in an IDE because of the imported modules: https://code.sololearn.com/c9Oy9Gmw7WNK/?ref=app
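The general shape of what I'm trying to do is something like this (a rough sketch with requests and BeautifulSoup, not my actual code; the page URL and folder name are placeholders):

    import os
    import requests
    from bs4 import BeautifulSoup
    from urllib.parse import urljoin

    PAGE_URL = "https://example.com/gallery"   # placeholder page to scrape
    FOLDER = "images"                          # folder to save the images into

    os.makedirs(FOLDER, exist_ok=True)         # create the folder if it doesn't exist

    resp = requests.get(PAGE_URL)
    soup = BeautifulSoup(resp.text, "html.parser")

    for img in soup.find_all("img"):
        src = img.get("src")
        if not src:
            continue
        img_url = urljoin(PAGE_URL, src)       # resolve relative paths
        name = os.path.basename(img_url).split("?")[0] or "image.jpg"
        with open(os.path.join(FOLDER, name), "wb") as f:
            f.write(requests.get(img_url).content)   # download and save the image bytes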
14 replies
+ 1
Yes, I know about the requests library, but the code doesn't work when I try it in PyCharm.
0
The requests library isn't installed in SoloLearn, and the same goes for urllib2, so you can't use them there.
0
Oh, sorry, my mistake.
I tried printing soup to see the page's content, and it looks like Cloudflare asks for a captcha to prove that a human is sending the request, so the website doesn't give you the real content, including the model images you're looking for.
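To see it yourself, something along these lines is enough (assuming requests and BeautifulSoup; the URL is just whichever page you're requesting):

    import requests
    from bs4 import BeautifulSoup

    resp = requests.get("https://www.pexels.com/")
    print(resp.status_code)            # a challenge page is often served with 403/503
    soup = BeautifulSoup(resp.text, "html.parser")
    print(soup.title)                  # a challenge page usually has a telltale <title>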
0
You mean the pexels.com website asks for a captcha?
0
Yes, at least when using the requests library.
0
OK, thanks, I'll try something else.
0
OK, I've tried it with other websites and it's working. My problem now is that it only downloads the images from the first page.
I'm guessing I'll probably need to set some condition using Selenium WebDriver. Are you familiar with Selenium?
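From what I've read, the pagination loop would look roughly like this sketch (the start URL and the "a.next" selector are guesses, not a working example for any particular site):

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.common.exceptions import NoSuchElementException

    driver = webdriver.Chrome()                 # assumes chromedriver is available
    driver.get("https://example.com/gallery")   # placeholder start page

    while True:
        # collect the image URLs on the current page
        for img in driver.find_elements(By.TAG_NAME, "img"):
            print(img.get_attribute("src"))     # download here instead of printing
        try:
            # "a.next" is a guess; use the site's real "next page" selector
            driver.find_element(By.CSS_SELECTOR, "a.next").click()
        except NoSuchElementException:
            break                               # no more pages

    driver.quit()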
0
Yes, but just a bit
Are you sure you need to use Selenium?
How many of the website's images do you want to download? (I mean, from which other pages do you want to download images?)
0
I'm just doing it as a learning exercise. What I'm looking to do is automate my browser to navigate to all the pages of a website and scrape/download the images.
0
As an idea, you can keep a list of the URLs you have already visited. Collect the URLs linked from the home page, and for each one that isn't in the list yet, download its images and then add that URL to the list.
I suppose the script will end up downloading all of the website's images.
0
Sorry, I'm having a little trouble understanding what you mean.
0
I mean, get the URL tags (links) from the home page.
Then fetch each link's content and download its images, just as you did for the home page.
But some URLs may be downloaded more than once. To prevent this, you can create a list: after downloading a URL's images, add that URL to the list, and before downloading a URL, check that it isn't already in the list. Roughly like the sketch below.
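In code, the idea is roughly this (a sketch only, assuming requests and BeautifulSoup; the start URL, folder name, and same-domain check are assumptions):

    import os
    import requests
    from bs4 import BeautifulSoup
    from urllib.parse import urljoin, urlparse

    START_URL = "https://example.com/"     # placeholder home page
    FOLDER = "images"
    os.makedirs(FOLDER, exist_ok=True)

    visited = set()                        # pages whose images were already downloaded
    to_visit = [START_URL]

    while to_visit:
        url = to_visit.pop()
        if url in visited:                 # skip pages handled before
            continue
        visited.add(url)

        soup = BeautifulSoup(requests.get(url).text, "html.parser")

        # download every image on this page
        for img in soup.find_all("img"):
            src = img.get("src")
            if not src:
                continue
            img_url = urljoin(url, src)
            name = os.path.basename(urlparse(img_url).path) or "image.jpg"
            with open(os.path.join(FOLDER, name), "wb") as f:
                f.write(requests.get(img_url).content)

        # queue links that stay on the same site
        for a in soup.find_all("a", href=True):
            link = urljoin(url, a["href"])
            if urlparse(link).netloc == urlparse(START_URL).netloc:
                to_visit.append(link)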
0
OK, now I understand. Thanks!
0
You can use Termux on Android.