Hashing is a fundamental technique in programming that converts data into a fixed-size string of characters. Unlike encryption, hashing is a one-way process: you can’t reverse it to get the original data back.
This makes it perfect for storing passwords, verifying file integrity, and creating unique identifiers. In this tutorial, you will learn how to use Python built-in hashlib Module To implement secure hashing in your applications.
By the end of this tutorial, you will understand:
How to create a basic hash with different algorithms
Why simple hashing is not enough for passwords
How to Add Salt to Prevent Rainbow Table Attacks
How to use key derivation functions for password storage
You can find the code on GitHub.
Conditions
To follow this tutorial, you should:
Basic Python: Variables, data types, functions, and control structures
Understanding strings and bytes: How to encode strings and work with byte data
No external libraries are required, e.g Hashlib And OS Both are part of Python’s standard library.
Table of Contents
Basic Hashing with Python’s hashlib
Let’s start with the basics. The hashlib module provides access to several hashing algorithms such as MD5for , for , for , . SHA-1for , for , for , . SHA-256and more.
Here’s how to create a simple SHA-256 hash:
import hashlib
message = "Hello, World!"
hash_object = hashlib.sha256(message.encode())
hex_digest = hash_object.hexdigest()
print(f"Original: {message}")
print(f"SHA-256 Hash: {hex_digest}")
Output:
Original: Hello, World!
SHA-256 Hash: dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f
Here, we import the hashlib module, encoding our string into bytes .encode() As hashlib requires bytes, not strings.
Then we create a hash object using hashlib.sha256() and get the hexadecimal representation .hexdigest().
The resulting hash is always 64 characters long regardless of the input size. Meaning you have an output string that is 256 bits long. Since each hexadecimal character requires 4 bits, The output has 256/4 = 64 hexadecimal characters. Changing even one character produces a completely different hash.
Let’s verify this:
import hashlib
message1 = "Hello, World!"
message2 = "Hello, World?"
hash1 = hashlib.sha256(message1.encode()).hexdigest()
hash2 = hashlib.sha256(message2.encode()).hexdigest()
print(f"Message 1: {message1}")
print(f"Hash 1: {hash1}")
print(f"\nMessage 2: {message2}")
print(f"Hash 2: {hash2}")
print(f"\nAre they the same? {hash1 == hash2}")
Output:
Message 1: Hello, World!
Hash 1: dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f
Message 2: Hello, World?
Hash 2: f16c3bb0532537acd5b2e418f2b1235b29181e35cffee7cc29d84de4a1d62e4d
Are they the same? False
This property is called The avalanche effect Where a small change produces a completely different output.
Why simple hashing is not enough for passwords
You might think you can just hash passwords and store them in your database. But there’s a problem: attackers use Rainbow tableswhich are pre-existing databases of hashes for common passwords.
Here’s what happens:
import hashlib
password = "password123"
hashed = hashlib.sha256(password.encode()).hexdigest()
print(f"Password: {password}")
print(f"Hash: {hashed}")
Output:
Password: password123
Hash: ef92b778bafe771e89245b89ecbc08a44a4e166c06659911881f383d4473e94f
If two users have the same password, they will have the same hash. An attacker who cracks a hash knows the passwords for all users with that hash.
So how do we handle it? Let’s learn in the next section.
Adding salt to your hashes
There is a solution salty: Adding random data to each password before hashing. Thus, even identical passwords generate different hashes.
Here’s how to implement salted hashing:
import hashlib
import os
def hash_password_with_salt(password):
salt = os.urandom(16)
hash_object = hashlib.sha256(salt + password.encode())
password_hash = hash_object.hexdigest()
return salt.hex(), password_hash
password = "password123"
salt1, hash1 = hash_password_with_salt(password)
salt2, hash2 = hash_password_with_salt(password)
print(f"Password: {password}\n")
print(f"First attempt:")
print(f" Salt: {salt1}")
print(f" Hash: {hash1}\n")
print(f"Second attempt:")
print(f" Salt: {salt2}")
print(f" Hash: {hash2}\n")
print(f"Same password, different hashes? {hash1 != hash2}")
Output:
Password: password123
First attempt:
Salt: fc24b2d2245ff65b80c5bced38744171
Hash: 5ce634c05941d25871e7ee334b5c24c75f64c4f6d557db66909fcaa793d869f9
Second attempt:
Salt: bc8a1f79b07e56b51285557211f88bb0
Hash: 043599d90b2aa0556265869cead35724c7d9d9d37129d897c6b68bade9e737e6
Same password, different hashes? True
How it works:
os.urandom(16)Generates 16 random bytes, which is our saltWe combine the salt and password bytes before hashing
We both return the salt (eg
hex) and the hashYou must store both the salt and the hash in your database
When a user logs in, you retrieve their salt, hash the entered password with that salt, and compare the result to the stored hash.
Verifying the salted password
Now create a function to validate the password against the salted hashes:
import hashlib
import os
def hash_password(password, salt=None):
"""Hash a password with a salt. Generate new salt if not provided."""
if salt is None:
salt = os.urandom(16)
else:
if isinstance(salt, str):
salt = bytes.fromhex(salt)
password_hash = hashlib.sha256(salt + password.encode()).hexdigest()
return salt.hex(), password_hash
def verify_password(password, stored_salt, stored_hash):
"""Verify a password against a stored salt and hash."""
_, new_hash = hash_password(password, stored_salt)
return new_hash == stored_hash
Here’s how you can use the above:
print("=== User Registration ===")
user_password = "mySecurePassword!"
salt, password_hash = hash_password(user_password)
print(f"Password: {user_password}")
print(f"Salt: {salt}")
print(f"Hash: {password_hash}")
print("\n=== Login Attempts ===")
correct_attempt = "mySecurePassword!"
wrong_attempt = "wrongPassword"
print(f"Attempt 1: '{correct_attempt}'")
print(f" Valid? {verify_password(correct_attempt, salt, password_hash)}")
print(f"\nAttempt 2: '{wrong_attempt}'")
print(f" Valid? {verify_password(wrong_attempt, salt, password_hash)}")
Output:
=== User Registration ===
Password: mySecurePassword!
Salt: 381779b5262deea84183e4b9454b98b1
Hash: 9756e1f0bc4c1aa4a72f35b0be8d3c8f430d31613371cf7de3c615bc475de98f
=== Login Attempts ===
Attempt 1: 'mySecurePassword!'
Valid? True
Attempt 2: 'wrongPassword'
Valid? False
This implementation demonstrates a complete registration and login flow.
Using key derivative functions
Although salted SHA-256 is better than simple hashing, modern applications should use key derivative functions (KDFs) designed specifically for password hashing. They include PBKDF2 (password based key derivation function 2), bcryptfor , for , for , . The scriptand Argon 2. You can check the links for more information about these key derivative functions.
These algorithms are deliberately slow and require more computational resources, making brute force attacks very difficult. Let’s implement PBKDF2, built in Python:
import hashlib
import os
def hash_password_pbkdf2(password, salt=None, iterations=600000):
"""Hash password using PBKDF2 with SHA-256."""
if salt is None:
salt = os.urandom(32)
elif isinstance(salt, str):
salt = bytes.fromhex(salt)
password_hash = hashlib.pbkdf2_hmac(
'sha256',
password.encode(),
salt,
iterations,
dklen=32
)
return salt.hex(), password_hash.hex(), iterations
def verify_password_pbkdf2(password, stored_salt, stored_hash, iterations):
"""Verify password against PBKDF2 hash."""
_, new_hash, _ = hash_password_pbkdf2(password, stored_salt, iterations)
return new_hash == stored_hash
print("=== PBKDF2 Password Hashing ===")
password = "SuperSecure123!"
salt, hash_value, iterations = hash_password_pbkdf2(password)
print(f"Password: {password}")
print(f"Salt: {salt}")
print(f"Hash: {hash_value}")
print(f"Iterations: {iterations:,}")
These results:
=== PBKDF2 Password Hashing ===
Password: SuperSecure123!
Salt: b388aecd774f6a7ddd95405091548bb50102c99beb1a10326a4c54070da4a3a5
Hash: c681450f41d0cec9ea2aad1108efe2a430b9c3d9fc3af621071be10ac9b3615a
Iterations: 600,000
Now verify the password and also compare the speed of SHA-256 vs PBKDF2:
print("\n=== Verification ===")
is_valid = verify_password_pbkdf2(password, salt, hash_value, iterations)
print(f"Password valid? {is_valid}")
import time
print("\n=== Speed Comparison ===")
test_password = "test123"
start = time.time()
for _ in range(100):
hashlib.sha256(test_password.encode()).hexdigest()
sha256_time = time.time() - start
start = time.time()
for _ in range(100):
hash_password_pbkdf2(test_password)
pbkdf2_time = time.time() - start
print(f"1000 SHA-256 hashes: {sha256_time:.3f} seconds")
print(f"1000 PBKDF2 hashes: {pbkdf2_time:.3f} seconds")
print(f"PBKDF2 is {pbkdf2_time/sha256_time:.1f}x slower")
Output:
=== Verification ===
Password valid? True
=== Speed Comparison ===
100 SHA-256 hashes: 0.000 seconds
100 PBKDF2 hashes: 53.631 seconds
PBKDF2 is 240068.1x slower
How PBKDF2 works:
Takes your password and salt
This example uses a 600,000-repeat hash function (SHA-256).
Each iteration makes the calculation slower and harder to solve
You store the salt, hash and iteration count (so you can verify later)
The iteration count can be increased over time as computers get faster. Modern recommendations (2024) recommend 600,000 iterations for PBKDF2-SHA256.
The result
You have learned how to implement secure password hashing in Python hashlib Here are the key paths to the module:
Basic hashing with SHA-256 is useful for data integrity, not passwords
Salty prevents rainbow table attacks by making each hash unique
PBKDF increases the computational cost by 2 iterations, slowing down attackers
Always store salt, hash and iteration count together
Use key derivation functions (PBKDF2, BCRYPT, ARGON2) for passwords
The code examples in this tutorial provide a solid foundation for implementing validation in your projects. But remember, security is an ongoing process. Stay updated on best practices and regularly review your security implementation.
Happy (safe) coding!