Don't be lazy and verify the ETag of a downloaded file

20 Dec 2016

Don't be lazy and please verify the etag of a downloaded file.

For example, https://chromedriver.storage.googleapis.com/index.html?path=2.26/ lists the ETag of each file. Alternatively, the ETag can be read from the ETag header of the HTTP response.
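Reading the ETag from a response header could look roughly like this. It is a minimal sketch in Python 3 (the example later in this post targets Python 2.7). The helper names `normalize_etag` and `fetch_etag` are made up for illustration, and it assumes the server answers HEAD requests and sends an ETag header at all.

```python
import urllib.request


def normalize_etag(header_value):
    """Strip the weak-validator prefix (W/) and the surrounding quotes."""
    value = header_value.strip()
    if value.startswith('W/'):
        value = value[2:]
    return value.strip('"')


def fetch_etag(url):
    """Send a HEAD request; return the normalized ETag, or None if absent.

    Assumption: the server supports HEAD. Not every server does.
    """
    request = urllib.request.Request(url, method='HEAD')
    with urllib.request.urlopen(request) as response:
        header_value = response.headers.get('ETag')
    if header_value is None:
        return None
    return normalize_etag(header_value)
```

Servers send the ETag wrapped in double quotes, so the value needs to be unquoted before it is compared with a hardcoded checksum.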

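Hardcode the expected ETag and verify that the downloaded file matches it. For these files the ETag is the hex MD5 checksum of the file contents, so it can be recomputed locally after the download. This is not guaranteed for every server: HTTP itself treats the ETag as an opaque value, so check how your server generates it before relying on this.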

Here's how we can download a file in Python 2.7 and verify its etag:

    import hashlib
    import tempfile
    import urllib

    URL = 'https://chromedriver.storage.googleapis.com/2.26/chromedriver_linux64.zip'
    ETAG = '3cdae483af1e54c6732abc9af875b9c1'

    tmp = tempfile.NamedTemporaryFile(delete=False)
    tmp.close()
    urllib.urlretrieve(URL, tmp.name)
    with open(tmp.name, 'rb') as f:
        actual_etag = hashlib.md5(f.read()).hexdigest()
    if actual_etag != ETAG:
        raise Exception(
            'The checksum (md5) of %s is not correct (expected=%s, actual=%s).'
            ' It is unsafe to use the downloaded zip.' % (URL, ETAG, actual_etag))
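For Python 3, the same check might be sketched as below. This is an adaptation under stated assumptions, not the code this post was written with: `md5_of_file` and `download_verified` are hypothetical helper names, the file is hashed in chunks so a large download does not have to fit in memory, and the temporary file is deleted when the download or the check fails.

```python
import hashlib
import os
import tempfile
import urllib.request


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def download_verified(url, expected_md5):
    """Download url to a temporary file and return its path.

    Raises ValueError and removes the file if the MD5 does not match.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        urllib.request.urlretrieve(url, path)
        actual_md5 = md5_of_file(path)
        if actual_md5 != expected_md5.lower():
            raise ValueError(
                'The checksum (md5) of %s is not correct '
                '(expected=%s, actual=%s).' % (url, expected_md5, actual_md5))
    except Exception:
        # Never leave an unverified file behind.
        os.remove(path)
        raise
    return path
```

With the constants from the example above, a call would be `path = download_verified(URL, ETAG)`. The caller is responsible for deleting the returned file when it is no longer needed.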
