Web Scraping

I have a website link from which I need to capture the timings shown on a particular entity's webpage. Please give me code to collect that data and return it in Excel format.

python

21st Jul 2024, 6:04 PM

MOHAMMAD FAIYAZ

3 Answers

+ 4

First off, that’s not necessarily legal. Second off, what does “capture timings of a particular entity webpage” even mean? Third off, this is not a free code buffet and should not be treated as such. Chances are somebody will hand you the code you so rudely demanded, but this is primarily a help forum, and “questions” like this, which just ask someone else to do your assigned task without you learning anything, are far from what I understand to be the spirit of the SoloLEARN platform.

22nd Jul 2024, 2:42 AM

Wilbur Jaywright

+ 3

Hi, MOHAMMAD FAIYAZ! As always, we have a responsibility to follow the rules and laws that apply, including for web scraping. Aside from this, a good starting point is to explore the possibilities with Python and libraries like Beautiful Soup and Requests.
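
One practical way to act on this is to read a site's robots.txt before fetching anything. The snippet below is only a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed, the sample rules, the "my-bot" user agent, and the example.com URLs are all made up for illustration. Passing this check is just one signal, since a site's terms of service and local law still apply.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the robots.txt content as an iterable of lines
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, used here instead of a real download
SAMPLE_ROBOTS = """User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_ROBOTS, "my-bot", "https://example.com/schedule"))
print(is_allowed(SAMPLE_ROBOTS, "my-bot", "https://example.com/private/data"))
```

For a live site, the same parser can load the file itself with set_url(...) followed by read().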

22nd Jul 2024, 5:16 AM

Per Bratthammar

+ 2

It would be best to learn libraries such as BeautifulSoup and Selenium. But let me share a simple example. I didn't understand what you meant by "timing", so in the example I will parse only the headings.

import requests
from bs4 import BeautifulSoup
import pandas as pd

# URL of the webpage
url = "your_website_link"

# Fetching the webpage
response = requests.get(url)

# Parsing the HTML source using BeautifulSoup
soup = BeautifulSoup(response.content, 'html.parser')

# For instance, collecting the text of every heading on the page
headings = [i.text.strip() for i in soup.find_all(['h1', 'h2', 'h3', 'h4', 'h5', 'h6'])]

# Creating a pandas DataFrame (the single url value is repeated on every row)
data = {
    'url': url,
    'Headings': headings
}
df = pd.DataFrame(data)

# Saving the df to an Excel file (writing .xlsx requires the openpyxl package)
df.to_excel('webpage_info.xlsx', index=False)
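
If "timings" turns out to mean clock times printed on the page (an assumption, since the question never says), one possible approach is to pull time-like strings out of the page text with a regular expression. This is a rough sketch using only the standard library. The function name extract_times, the pattern, and the sample text are illustrative, and real pages may format times differently.

```python
import re

# Matches 24-hour or 12-hour clock times such as "09:30", "17:45" or "4:15 PM".
# This pattern is an assumption about how the page formats its times.
TIME_PATTERN = re.compile(r"\b(?:[01]?\d|2[0-3]):[0-5]\d(?:\s?[AaPp][Mm])?\b")


def extract_times(text: str) -> list[str]:
    """Return every time-like substring found in text, in order of appearance."""
    return TIME_PATTERN.findall(text)


# Made-up sample standing in for the visible text of a webpage
sample_text = "Open 09:30 - 17:45, closed Sunday. Last entry 4:15 PM."
print(extract_times(sample_text))
```

With the example above, the input could be soup.get_text(), and the resulting list could go into the DataFrame in place of the headings.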

23rd Jul 2024, 11:04 AM

Tentra