The script
Here’s a proof-of-concept for fetching a product’s price from a specific online store’s product page using Python:
# How to automatically fetch product prices
# from the website of a well-known European "T" store
# Created by hartsa.fi
import requests
import re
from bs4 import BeautifulSoup
from decimal import Decimal

try:
    th_url_record = "https://www.thomann.de/fi/boss_rc_500_loop_station.htm"
    page = requests.get(th_url_record)
    # Raise requests.exceptions.HTTPError on 4xx/5xx responses
    page.raise_for_status()
    soup = BeautifulSoup(page.content, "html.parser")
    # Find the <meta> tag that carries the price (itemprop="price")
    price = soup.find("meta", itemprop="price")
    # Take the first quoted value from the tag's second space-separated token
    price_string = re.findall('"([^"]*)"', str(price).split(" ")[1])[0]
    # Normalize the price to two decimal places
    result = Decimal(price_string).quantize(Decimal('0.00'))
    print(f"The 'T' Price: {result}")
except requests.exceptions.HTTPError as e:
    if e.response.status_code == 404:
        print(f"HTTP 404 Error: URL not found - {th_url_record}")
    else:
        print(f"HTTP Error making request to URL {th_url_record}: {e}")
In this example, we use an arbitrary product to demonstrate how to fetch the price from the Finnish pages of the store. All you need is the URL of the product page, which appears to stay fairly stable for this particular store. In our example the URL is https://www.thomann.de/fi/boss_rc_500_loop_station.htm.
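The script pulls the price out of the string form of the meta tag, which depends on the order of the tag's attributes. A slightly more defensive variant could read the attribute directly. The sketch below is not part of the original script: the function name extract_price and the sample markup are hypothetical, and it assumes the price sits in the tag's content attribute, which should be verified against the real page.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional

from bs4 import BeautifulSoup


def extract_price(html: str) -> Optional[Decimal]:
    """Return the price from a <meta itemprop="price"> tag, or None if not found."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("meta", itemprop="price")
    if tag is None:
        # The page structure changed or the tag is missing
        return None
    # Assumption: the price is stored in the tag's "content" attribute
    raw = tag.get("content")
    if raw is None:
        return None
    try:
        # Normalize to two decimal places, as the original script does
        return Decimal(raw).quantize(Decimal("0.00"))
    except InvalidOperation:
        # The attribute did not contain a valid number
        return None


# Hypothetical markup, for illustration only
sample = '<html><head><meta itemprop="price" content="311"></head></html>'
print(extract_price(sample))
```

Returning None instead of raising lets the calling code decide how to report a missing price, for example when the site's markup changes.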
This flow shows how to install the prerequisites and run the script:
mkdir scraper-test
cd scraper-test
python3 -m venv myenv
source myenv/bin/activate
nano scraper.py
pip install requests
pip install beautifulsoup4
python scraper.py
The 'T' Price: 311.00
deactivate
In the “nano scraper.py” step, paste and save the code. The fetched price is shown on the line “The 'T' Price: 311.00”. As the commands above show, the script runs inside a Python virtual environment.
You can easily set up an input text file or a database to store the products and their URLs for fetching. Similarly, the fetched prices can be saved into a file or database for further processing. From there, you can build a Web GUI to view your price list, integrate the price data into an application, ERP system, or any other tool, or even automate sending price information via email.
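As a rough illustration of the input file and database idea, the sketch below reads "name,url" rows and stores one price per product in SQLite. Everything here is an assumption made for the example: the function names read_products and store_prices, the table layout, and the stand-in fetch function that returns a fixed price so the sketch runs without network access. In practice, fetch_price would wrap the scraping logic from the script.

```python
import csv
import io
import sqlite3
from datetime import datetime, timezone
from decimal import Decimal
from typing import Callable, Iterable, List, Optional, Tuple


def read_products(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse 'name,url' rows, skipping blank lines and malformed rows."""
    return [(row[0].strip(), row[1].strip())
            for row in csv.reader(lines) if len(row) == 2]


def store_prices(conn: sqlite3.Connection,
                 products: List[Tuple[str, str]],
                 fetch_price: Callable[[str], Optional[Decimal]]) -> int:
    """Fetch a price for each product and insert it; return the number of rows stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices "
        "(name TEXT, url TEXT, price TEXT, fetched_at TEXT)"
    )
    stored = 0
    for name, url in products:
        price = fetch_price(url)
        if price is None:
            # Skip products whose price could not be fetched
            continue
        # Store the price as text so the exact Decimal value is preserved
        conn.execute(
            "INSERT INTO prices VALUES (?, ?, ?, ?)",
            (name, url, str(price), datetime.now(timezone.utc).isoformat()),
        )
        stored += 1
    conn.commit()
    return stored


# In-memory stand-ins for a products file and a database file
products_file = io.StringIO(
    "Boss RC-500,https://www.thomann.de/fi/boss_rc_500_loop_station.htm\n"
)
conn = sqlite3.connect(":memory:")
# Hypothetical fetch function with a fixed price, so the sketch runs offline
count = store_prices(conn, read_products(products_file),
                     lambda url: Decimal("311.00"))
print(count)
```

Passing the fetch function as a parameter keeps storage separate from scraping, so either side can be swapped out or tested on its own.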
Automated fetching can be conveniently set up using crontab (on Linux or macOS), launchd (on macOS), or the Windows Task Scheduler, depending on your platform.
The setup is flexible and can be adapted to your needs!
This script has been tested on both macOS and Linux, and a more advanced version has proven to be quite reliable over an extended period. However, this reliability depends on the website’s structure remaining consistent, which could change at any time.
Legality
Data scraping may or may not be legal, depending on the use case. Ensure you fully understand the legal implications before implementing such solutions.
