Take and Compare Website Screenshots with Python

I maintain a few WordPress websites, and every time I install multiple updates, I have to click through the sites to check that everything still works. Small design issues can still be overlooked this way. To simplify and improve this process, I created a Python script that takes multiple screenshots of the website. I run this script before and after the upgrade. Then I run a second script that compares the screenshots and shows whether there are any differences. This blog post is about those two scripts: one that takes screenshots and one that compares images.

The screenshot script uses Selenium, so we first need to download a Selenium WebDriver. The second script uses the Pillow library to compare images.

Prerequisites

  1. Download the WebDriver for your browser (the script below uses Firefox, so that is geckodriver)
  2. Extract the executable into a folder and add that folder to your PATH
  3. Install selenium and pillow:
pip install selenium
pip install pillow

# or 
# python -m pip install selenium
# python -m pip install pillow
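
Before running the scripts, it can help to verify that the three prerequisites are actually in place. The following is a minimal sketch using only the standard library. The function name `check_prerequisites` is made up for this example, and the driver name `geckodriver` assumes Firefox, as used in the script below.

```python
import importlib.util
import shutil


def check_prerequisites(driver_name="geckodriver", modules=("selenium", "PIL")):
    """Return a list of problems; an empty list means everything was found."""
    problems = []
    # shutil.which searches the folders listed in PATH for the executable
    if shutil.which(driver_name) is None:
        problems.append(driver_name + " not found on PATH")
    # find_spec returns None when a top-level module is not installed
    # (Pillow is imported under the name "PIL")
    for module in modules:
        if importlib.util.find_spec(module) is None:
            problems.append("Python module '" + module + "' is not installed")
    return problems


if __name__ == "__main__":
    found_problems = check_prerequisites()
    for problem in found_problems:
        print("MISSING: " + problem)
    if not found_problems:
        print("All prerequisites found")
```

If you use a different browser, pass its driver name instead, for example `check_prerequisites(driver_name="chromedriver")`.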

Python script to take screenshots of a website

The following is a simplified version of the “screenshot taker” script. It contains a hardcoded URL and a hardcoded list of subpages. The improved version at the end of this post accepts arguments, crawls the website for URLs, and supports a few more parameters.

import os
from datetime import datetime
from time import sleep
from selenium import webdriver
from selenium.webdriver.firefox.options import Options

siteurl = "https://arminreiter.com"
screen_width = 2560
screen_height = 1440
output_directory = 'output_' + datetime.now().strftime('%Y%m%d_%H%M%S')

sites = [ "/", "/about", "/resources", "/privacy-policy"]

options = Options()
options.add_argument("--headless")

driver = webdriver.Firefox(options=options)
driver.set_window_size(screen_width, screen_height)

os.makedirs(output_directory , exist_ok=True)

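try:
    for url in sites:
        print("get " + url + "...")
        # "/" becomes "_.png", "/about" becomes "_about.png"
        filename = url.replace('/','_') + ".png"

        driver.get(siteurl + url)
        sleep(3)  # give the page time to finish rendering
        outfile = os.path.join(output_directory, filename)
        driver.get_screenshot_as_file(outfile)
finally:
    # always close the browser, even if a page fails to load
    driver.quit()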
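
One detail in the loop above is the file name, which is built by replacing every slash with an underscore. That works for the four hardcoded paths, but paths with query strings or other special characters may not be valid file names. The following is a rough sketch of a more defensive helper. The name `screenshot_filename` and the fallback name `index` are my own choices for this example, and the resulting names differ slightly from the ones the script above produces.

```python
import re
from urllib.parse import urlsplit


def screenshot_filename(url_path):
    """Turn a URL path such as "/blog/post/" into a file name like "blog_post.png"."""
    parts = urlsplit(url_path)
    raw = parts.path.strip("/")
    # keep the query string so that "/search?q=a" and "/search?q=b" stay distinct
    if parts.query:
        raw += "_" + parts.query
    # replace every run of characters outside a conservative safe set
    slug = re.sub(r"[^A-Za-z0-9._-]+", "_", raw).strip("_")
    # the root path "/" leaves an empty slug, so fall back to a fixed name
    return (slug or "index") + ".png"


for path in ["/", "/about", "/blog/2021/post/"]:
    print(path, "->", screenshot_filename(path))
```

Whichever naming scheme you choose, use the same one for the run before and the run after the upgrade, because the compare script matches screenshots by file name.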

Even though the script above uses hardcoded URLs and only takes screenshots, it should already be sufficient for many use cases. Once the script has run, we have screenshots of our website. If we run it before and after the upgrade, we have two sets of screenshots to compare. So, let's write the script that compares the screenshots from two folders, image by image:

Python script to compare two images

import argparse
import os
from datetime import datetime
from PIL import Image, ImageChops

parser = argparse.ArgumentParser()
parser.add_argument("--first",  "-f", help="path to the first folder for image comparison", required=True)
parser.add_argument("--second", "-s", help="path to the second folder for image comparison", required=True)
args = parser.parse_args()

dir1 = args.first
dir2 = args.second

outputdir = 'result_' + datetime.now().strftime('%Y%m%d_%H%M%S')

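for filename in os.listdir(dir1):

    file1 = os.path.join(dir1, filename)
    file2 = os.path.join(dir2, filename)

    # skip screenshots that have no counterpart in the second folder
    if not os.path.isfile(file2):
        print('Skip ' + filename + ' (missing in ' + dir2 + ')')
        continue

    im1 = Image.open(file1)
    im2 = Image.open(file2)

    print('Compare ' + filename + ' (' + file1 + ' AND ' + file2 + ')')

    # per-pixel absolute difference; identical images give an all-black result
    diff_img = ImageChops.difference(im1, im2).convert('RGB')
    # getbbox() returns None if the image is completely black
    if diff_img.getbbox():
        outpath = os.path.join(outputdir, filename + "-s.png")
        print("Images are different, store difference in " + outpath)
        os.makedirs(outputdir, exist_ok=True)
        diff_img.save(outpath)
    else:
        print("Images are equal")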
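
The comparison above reports a difference as soon as a single pixel changes, which can be noisy when a page has minor rendering variations. As a possible extension, and not part of the original script, the following sketch measures how much of the image changed. The function name `changed_ratio` and the `threshold` parameter are assumptions made for this example. It uses only Pillow calls that also appear above, plus `split`, `ImageChops.lighter`, `point`, and `histogram`.

```python
from PIL import Image, ImageChops


def changed_ratio(im1, im2, threshold=0):
    """Return the fraction of pixels whose difference exceeds threshold (0-255)."""
    # convert both images to RGB so that a mode mismatch does not raise an error
    a = im1.convert("RGB")
    b = im2.convert("RGB")
    if a.size != b.size:
        raise ValueError("images have different sizes")
    diff = ImageChops.difference(a, b)
    # take the largest difference over the three channels for each pixel
    r, g, bl = diff.split()
    max_diff = ImageChops.lighter(ImageChops.lighter(r, g), bl)
    # mark pixels above the threshold as 255 and all others as 0
    mask = max_diff.point(lambda v: 255 if v > threshold else 0)
    changed = mask.histogram()[255]
    return changed / (a.size[0] * a.size[1])


if __name__ == "__main__":
    before = Image.new("RGB", (10, 10), "white")
    after = before.copy()
    after.putpixel((0, 0), (0, 0, 0))  # change exactly one of 100 pixels
    print(changed_ratio(before, after))
```

In the compare loop, you could then report a page only if, say, `changed_ratio(im1, im2, threshold=10)` is above a limit that suits your site. Both numbers are placeholders that need tuning.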

Improved Python Scripts

The following scripts are improved versions of the two above. The improved screenshot script takes the URL of the website and the screen resolution as parameters. It uses a web service to get all subpages of the website and takes a screenshot of each one. The number of subpages can be limited with the -l parameter.
The image compare script is only extended by a description and one or two small adaptations.
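
To give an idea of what such a command-line interface could look like, here is a minimal sketch with argparse. Only the -l parameter is taken from the description above. The other flag names (`--url`, `--width`, `--height`), the defaults, and the helper `limit_sites` are assumptions for illustration and may differ from the actual improved script.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical screenshot command-line interface."""
    parser = argparse.ArgumentParser(description="Take screenshots of a website")
    parser.add_argument("--url", "-u", required=True,
                        help="base URL of the website")
    parser.add_argument("--width", type=int, default=2560,
                        help="browser window width in pixels")
    parser.add_argument("--height", type=int, default=1440,
                        help="browser window height in pixels")
    parser.add_argument("--limit", "-l", type=int, default=None,
                        help="maximum number of subpages to capture")
    return parser


def limit_sites(sites, limit):
    """Return at most `limit` entries, or all of them if no limit is set."""
    return sites if limit is None else sites[:limit]


if __name__ == "__main__":
    # parse an example argument list instead of the real command line
    args = build_parser().parse_args(["-u", "https://arminreiter.com", "-l", "2"])
    sites = ["/", "/about", "/resources", "/privacy-policy"]
    print(args.url, args.width, args.height, limit_sites(sites, args.limit))
```

The parsed values would then replace the hardcoded `siteurl`, `screen_width`, `screen_height`, and `sites` from the simplified script.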

Improved Website Screenshot Python Script

Improved Image Compare Python Script

About
Armin Reiter
Blockchain/Web3, IT-Security & Azure
Vienna, Austria
