Web Scraping with Python: Introduction and Tutorial

The World Wide Web is made up of billions of interlinked documents, widely known as web pages. The source text of a web page is written in the Hypertext Markup Language (HTML). HTML source code is a mixture of human-readable information and machine-readable codes, the so-called tags. A web browser, such as Chrome, Firefox, Safari, or Edge, processes the source text, interprets the tags, and presents the information they contain to the user.

Special software is used to extract only the information of interest from the source text. These programs, known as web scrapers, crawlers, spiders, or simply bots, search the source text of websites for given patterns and extract the matching information. The information obtained through web scraping is then summarized, combined, evaluated, or saved for further use.
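To make the idea of searching source text for a pattern concrete, here is a minimal sketch that uses only Python's standard library. It assumes the HTML has already been downloaded; the sample page, the `LinkExtractor` class, and the `extract_links` function are hypothetical names chosen for illustration. The pattern in this case is "every `href` attribute of an `<a>` tag".

```python
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download this text first.
SAMPLE_HTML = """
<html>
  <body>
    <h1>Example page</h1>
    <p>Read the <a href="/docs">documentation</a> or visit the
       <a href="https://example.com/blog">blog</a>.</p>
  </body>
</html>
"""


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return all link targets found in the given HTML source text."""
    parser = LinkExtractor()
    parser.feed(html_text)
    return parser.links


print(extract_links(SAMPLE_HTML))
# → ['/docs', 'https://example.com/blog']
```

The parser ignores the human-readable text and reacts only to the tags it was told to look for, which is the basic principle behind most scrapers. Dedicated scraping libraries offer more convenient ways to express such patterns.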

Below, we explain why Python is particularly well suited to writing web scrapers and walk you through an introductory tutorial.