0
Web Scraping
I have a website link from which I need to capture the timings shown on a particular entity's webpage. Please give me code to collect that data and return it in Excel format.
3 Answers
+ 4
First off, that’s not necessarily legal. Second off, what does “capture timings of a particular entity webpage” even mean? Third off, this is not a free code buffet and should not be treated as such. Chances are somebody will hand you the code you so rudely demanded, but this is primarily a help forum, and “questions” like this that just want someone else to do your assigned task for you without you learning anything are far from what I understand to be the spirit of the SoloLEARN platform.
+ 3
Hi, MOHAMMAD FAIYAZ!
As always, we have a responsibility to follow the rules and laws that apply, including for web scraping. Aside from this, a good starting point is to explore the possibilities with Python and libraries like Beautiful Soup and Requests.
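One concrete way to act on that responsibility is to check a site's robots.txt before fetching anything. The snippet below is only a minimal sketch using Python's standard `urllib.robotparser`; the helper name `is_allowed`, the sample rules, and the example.com URLs are made up for illustration, and a robots.txt check does not replace reading the site's terms of use or the applicable law.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules; a real site serves its own at <site>/robots.txt
sample_rules = """User-agent: *
Disallow: /private/
"""

print(is_allowed(sample_rules, "https://example.com/timings"))       # True
print(is_allowed(sample_rules, "https://example.com/private/data"))  # False
```

For a live site, the same parser can load the rules itself with `set_url(...)` followed by `read()` instead of `parse(...)`.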
+ 2
It would be best to learn libraries such as BeautifulSoup and Selenium. I didn't understand what you meant by "timings", so the simple example below parses only the page headings:
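import pandas as pd
import requests
from bs4 import BeautifulSoup

# URL of the webpage (placeholder: replace with the real link)
url = "your_website_link"
# Fetch the webpage; fail early on a timeout or an HTTP error status
response = requests.get(url, timeout=10)
response.raise_for_status()
# Parse the HTML source using BeautifulSoup
soup = BeautifulSoup(response.content, 'html.parser')
# For instance, collect the text of every heading on the page
headings = [h.get_text(strip=True) for h in soup.find_all(['h1', 'h2', 'h3', 'h4', 'h5', 'h6'])]
# Create a pandas DataFrame: one row per heading, with the URL repeated on each row
data = {
    'url': url,
    'Headings': headings
}
df = pd.DataFrame(data)
# Save the DataFrame to an Excel file (writing .xlsx requires the openpyxl package)
df.to_excel('webpage_info.xlsx', index=False)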
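If the "timings" turn out to be something like opening hours listed in an HTML table, the same idea applies: pull out the cells row by row, then export them. That is purely a guess about the page, so treat the following as a sketch under that assumption. It deliberately swaps in the standard library (`html.parser` and `csv`) instead of BeautifulSoup and pandas, and writes CSV text, which Excel can open. The class name `TableRowParser` and the sample HTML are invented for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up markup standing in for the real page (assumption: timings are in a table)
sample_html = """
<table>
  <tr><th>Day</th><th>Opens</th><th>Closes</th></tr>
  <tr><td>Monday</td><td>09:00</td><td>17:00</td></tr>
  <tr><td>Tuesday</td><td>10:00</td><td>16:00</td></tr>
</table>
"""

parser = TableRowParser()
parser.feed(sample_html)
rows = parser.rows

# Write the rows as CSV text; save it to a .csv file to open it in Excel
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

On a real page, `sample_html` would be replaced by the downloaded `response.text`, and the parser would likely need adjusting to the page's actual structure.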