Web scraping is a powerful skill for any data professional: it lets you treat pages across the web as a data source. In this tutorial, we show you how to parse a web page into a CSV data file using the Python package BeautifulSoup.
In this example, we scrape graphics card listings from NewEgg.com.
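As a rough illustration of the parsing step, here is a minimal sketch that runs BeautifulSoup's findAll over a small inline HTML snippet. The markup, the class names (item-container, item-title, price-ship), and the helper name parse_listings are assumptions made up for this sketch; they are not taken from NewEgg's actual pages or from the video's code.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a product listing page. The class
# names below are assumptions for illustration, not NewEgg's real HTML.
SAMPLE_HTML = """
<div class="item-container">
  <a class="item-title">Example GPU A</a>
  <li class="price-ship">Free Shipping</li>
</div>
<div class="item-container">
  <a class="item-title">Example GPU B</a>
  <li class="price-ship">$4.99 Shipping</li>
</div>
"""


def parse_listings(html):
    """Return one dict per listing found in the given HTML text."""
    soup = BeautifulSoup(html, "html.parser")
    # findAll returns every <div> whose class matches the container class.
    containers = soup.findAll("div", {"class": "item-container"})
    rows = []
    for container in containers:
        # Look up each field inside the current container only.
        title = container.find("a", {"class": "item-title"}).text.strip()
        shipping = container.find("li", {"class": "price-ship"}).text.strip()
        rows.append({"product_name": title, "shipping": shipping})
    return rows


listings = parse_listings(SAMPLE_HTML)
for listing in listings:
    print(listing["product_name"], "|", listing["shipping"])
```

On a real page, the tag names and classes would come from inspecting the page's HTML, which is what the "Reading Raw HTML for Items to Scrape" part of the video covers.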
Tools used in the video: Sublime, Anaconda, and a JavaScript beautifier.
—
Table of Contents:
0:00 – Introduction
1:28 – Setting up Anaconda
3:00 – Installing Beautiful Soup
3:43 – Setting up urllib
6:07 – Retrieving the Web Page
10:47 – Evaluating Web Page
11:27 – Converting Listings into Line Items
16:13 – Using the JavaScript Beautifier
16:31 – Reading Raw HTML for Items to Scrape
18:34 – Building the Scraper
22:11 – Using the “findAll” Function
27:26 – Testing the Scraper
29:07 – Creating the .csv File
32:18 – End Result
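The retrieval and .csv steps listed above might look roughly like the sketch below, which uses only Python's standard library. The function names fetch_html and write_csv, the column headers, and the sample row are assumptions for illustration and are not the video's exact code. The download function is defined but not called here, so the example runs without network access.

```python
import csv
import io
from urllib.request import urlopen


def fetch_html(url):
    """Download a page and return its HTML as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def write_csv(rows, fileobj):
    """Write a header line plus one CSV line per scraped row."""
    writer = csv.writer(fileobj)
    # Hypothetical column names for the scraped fields.
    writer.writerow(["brand", "product_name", "shipping"])
    for row in rows:
        # csv.writer quotes any field that contains a comma, so commas
        # inside product names do not break the columns.
        writer.writerow(row)


# Made-up sample data standing in for scraped listings.
sample_rows = [("ExampleBrand", "Example GPU, 8GB", "Free Shipping")]

buffer = io.StringIO()
write_csv(sample_rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead of an in-memory buffer, open a file with open("products.csv", "w", newline="") and pass it to write_csv. Using the csv module rather than joining strings by hand avoids having to strip or replace commas in the scraped text.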