+ 2
Making A Python Search Engine
Hello everyone, I'm trying to learn web scraping and I was wondering, is it possible to:
1. Have Python search for a term with Google Search
2. Create a list of the resulting URLs
3. Go to the next search page
4. Repeat steps 2 and 3 for x result pages
I appreciate any help that can be offered.
Sincerely, Adrian
2 Answers
+ 6
You can do it "properly" by using a web scraping module such as 'beautifulsoup4', or go even further and build a web crawler with 'scrapy'. Combining both would perhaps be best.
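To give a rough idea of what the parser-based approach looks like, here is a minimal sketch. Note that it uses the standard library's html.parser rather than 'beautifulsoup4', so nothing has to be installed. The names LinkCollector and extract_links and the sample HTML are made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample page, not real search output
sample = '<p><a href="https://example.com/a">A</a> <a href="/local">B</a></p>'
print(extract_links(sample))
```

A parser like this only looks at actual link tags, so it should pick up less noise than a pattern run over the raw page text, at the cost of a few more lines.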
However, if you just want to retrieve the links from a website, here's a little code that might help you:
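import re
import requests

q = "Warszawa"  # the search query
URL = "https://www.google.pl/search"
# Capture every http(s) URL that is followed by a double quote
pattern = re.compile(r'(https?://[^\s"]+)"')

for pg in range(10):  # 10 result pages, up to 100 results each
    # Let requests URL-encode the query; "start" is the result offset
    params = {"q": q, "num": 100, "start": pg * 100}
    response = requests.get(URL, params=params)
    tekst = response.text
    result = pattern.findall(tekst)
    for i in result:
        print(i)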
Unfortunately, Sololearn does not support the 'requests' module, and my attempts to replace it with urllib's request method returned an "Access denied" error.
It works properly on a normal Python installation, though.
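One possible cause of the "Access denied" error with urllib is that some sites refuse the default Python-urllib User-Agent. That is an assumption on my part and was not confirmed in this thread. Below is a minimal sketch that builds the request for each result page with a browser-like header. The helper names build_search_request and fetch and the User-Agent string are made up for the example, and the site may still refuse or block automated requests.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://www.google.pl/search"  # same host as in the snippet above
# Assumption: a browser-like User-Agent; the exact string is arbitrary
HEADERS = {"User-Agent": "Mozilla/5.0"}


def build_search_request(query, page, per_page=100):
    """Build (but do not send) the request for one result page (hypothetical helper)."""
    params = urllib.parse.urlencode(
        {"q": query, "num": per_page, "start": page * per_page}
    )
    return urllib.request.Request(BASE_URL + "?" + params, headers=HEADERS)


def fetch(request):
    """Send the request and return the decoded body; the site may still refuse it."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


# Build the requests for the first 3 result pages without sending anything
requests_to_send = [build_search_request("Warszawa", pg) for pg in range(3)]
for req in requests_to_send:
    print(req.full_url)
```

Calling fetch(req) on each request returns the page text, which can then go through the same link pattern as before.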
0
This may be relevant for you:
https://www.udacity.com/course/intro-to-computer-science--cs101