Do It Yourself – Tutorials – Let's Build a Python Web Scraping Project from Scratch | Hands-On Tutorial

Apr 18, 2021




Join our upcoming 20-week data science boot camp: https://www.jovian.ai/data-analyst-bootcamp

💻 Web scraping is the process of extracting and parsing data from websites in an automated fashion using a computer program. It’s a useful technique for creating datasets for research and learning.

🔗 Resources used in the workshop:
– Rough notebook: https://jovian.ai/aakashns-6l3/scraping-github-topics-repositories-rough
– Final notebook: https://jovian.ai/aakashns-6l3/scraping-github-topics-repositories
– Web scraping project guide: https://jovian.ai/aakashns/python-web-scraping-project-guide
– Web scraping tutorial: https://jovian.ai/aakashns/python-web-scraping-and-rest-api

In this workshop, we’ll use Python and its ecosystem of libraries to scrape information from a website and save it as a dataset of one or more CSV files.

Here are the steps we’ll follow to build a web scraping project from scratch:
✅ Pick a website and identify the information to be scraped into a CSV file
💾 Use the requests library to download web pages from the site programmatically
💬 Use Beautiful Soup to parse and extract information from web pages
📝 Create well-formatted CSV file(s) with the extracted information
✍ Document and share your work online in the form of a Jupyter notebook or blog post
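
The download, parse, and save steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the workshop notebooks: the sample HTML, the `topic` class name, the CSV column names, and the helper functions `download_page`, `parse_topics`, and `write_csv` are all assumptions made up for this example. A real page will have its own structure, so the tags and classes need to be found by inspecting that page first.

```python
import csv

import requests
from bs4 import BeautifulSoup


def download_page(url):
    """Download a page with requests and return its HTML as text."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def parse_topics(html):
    """Extract one dict per item from HTML shaped like SAMPLE_HTML.

    The <div class="topic"> layout is hypothetical, chosen only for this sketch.
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.find_all("div", class_="topic"):
        link = card.find("a")
        desc = card.find("p")
        rows.append({
            "title": link.get_text(strip=True),
            "url": link["href"],
            # Some cards may have no description paragraph
            "description": desc.get_text(strip=True) if desc else "",
        })
    return rows


def write_csv(rows, path):
    """Write the extracted rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["title", "url", "description"])
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical markup standing in for a downloaded page
SAMPLE_HTML = """
<div class="topic"><a href="/topics/3d">3D</a><p>3D modeling.</p></div>
<div class="topic"><a href="/topics/ajax">Ajax</a></div>
"""

if __name__ == "__main__":
    # With a real site: html = download_page("https://example.com/some-page")
    rows = parse_topics(SAMPLE_HTML)
    write_csv(rows, "topics.csv")
```

Keeping the download, parsing, and CSV-writing steps in separate functions makes it easier to test the parsing on saved HTML without hitting the site on every run.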

Timestamps:
Introduction 00:00
Problem Statement 7:03
Setting up Jupyter 21:05
Fetching pages with requests 31:23
Parsing pages with Beautiful Soup 39:23
Saving data to CSV files 1:07:36
Scraping another page 1:09:04
Defining functions 1:17:29
Putting it together 1:36:34
Documentation 2:03:28
Publishing your notebook 2:32:09
Q&A 2:34:17

🎤 About the speaker
Aakash N S is the co-founder and CEO of Jovian, a community learning platform for data science & ML. Previously, Aakash worked as a software engineer (APIs & Data Platforms) at Twitter in Ireland & San Francisco. He graduated from the Indian Institute of Technology, Bombay, and is also an avid blogger, open-source contributor, and online educator.


Learn Data Science the right way at https://www.jovian.ai
Interact with a global community of like-minded learners https://jovian.ai/forum/
Get the latest news and updates on Machine Learning at https://twitter.com/jovianml
Connect with us professionally on https://linkedin.com/company/jovianml
Subscribe for new videos on Artificial Intelligence https://youtube.com/jovianml
