AES: how to detect that a bad password has been entered?

Question

A text s has been encrypted with:

s2 = iv + Crypto.Cipher.AES.new(Crypto.Hash.SHA256.new(pwd).digest(), 
                                    Crypto.Cipher.AES.MODE_CFB, 
                                    iv).encrypt(s.encode())

Then, later, a user inputs the password pwd2 and we decrypt it with:

iv, cipher = s2[:Crypto.Cipher.AES.block_size], s2[Crypto.Cipher.AES.block_size:]

s3 = Crypto.Cipher.AES.new(Crypto.Hash.SHA256.new(pwd2).digest(),
                           Crypto.Cipher.AES.MODE_CFB, 
                           iv).decrypt(cipher)

Problem: the last line works even if the entered password pwd2 is wrong. Of course the decrypted text will be random characters, but no error is raised.

Question: how to make Crypto.Cipher.AES.new(...).decrypt(cipher) fail if the password pwd2 is incorrect? Or at least, how to detect a wrong password?

Here is a linked question: Making AES decryption fail if invalid password, and here is a discussion of the cryptographic (less programming-oriented) part of the question: AES, is this method to say “The password you entered is wrong” secure?

Answer 1:

AES provides confidentiality but not integrity out of the box - to get integrity too, you have a few options. The easiest and arguably least prone to "shooting yourself in the foot" is to just use AES-GCM - see this Python example or this one.

You could also use an HMAC, but this generally requires managing two distinct keys and has a few more moving parts. I would recommend the first option if it is available to you.
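
A minimal sketch of the HMAC route, using only the standard library. It treats the output of the existing CFB encryption as opaque bytes (the `blob` below is a placeholder), derives two separate keys from the password, and appends an HMAC-SHA256 tag that is checked before any decryption is attempted. The function names, the iteration count, and the way the two keys are derived are illustrative assumptions, not something taken from the question's code.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # HMAC-SHA256 output size in bytes

def derive_keys(password, salt):
    # Hypothetical scheme: one slow PBKDF2 call, then two labelled subkeys.
    master = hashlib.pbkdf2_hmac("sha256", password, salt, 200_000)
    enc_key = hmac.new(master, b"encryption", hashlib.sha256).digest()
    mac_key = hmac.new(master, b"authentication", hashlib.sha256).digest()
    return enc_key, mac_key

def add_tag(mac_key, iv_and_ciphertext):
    # Encrypt-then-MAC: the tag covers the IV and the ciphertext.
    tag = hmac.new(mac_key, iv_and_ciphertext, hashlib.sha256).digest()
    return iv_and_ciphertext + tag

def check_tag(mac_key, blob):
    # Return iv + ciphertext if the tag matches, raise ValueError otherwise.
    if len(blob) < TAG_LEN:
        raise ValueError("data too short")
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong password or altered data")
    return body

salt = os.urandom(16)  # stored next to the ciphertext
_, mac_key = derive_keys(b"pwd", salt)
blob = add_tag(mac_key, b"<iv + AES-CFB ciphertext goes here>")

_, wrong_mac_key = derive_keys(b"pwd2", salt)
try:
    check_tag(wrong_mac_key, blob)
except ValueError as exc:
    print("rejected:", exc)
```

The encryption key returned by `derive_keys` would replace the bare SHA-256 digest in the original code; the MAC key is only ever used for the tag.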

As a side note, SHA-256 isn't a very good KDF for converting a user-created password into an encryption key. Popular password hashing algorithms are better at this: have a look at Argon2, bcrypt or PBKDF2.

Edit: The reason SHA-256 is a bad KDF is the same reason it makes a bad password hash function: it's just too fast. A user-created password of, say, 128 bits will usually contain far less entropy than a random sequence of 128 bits, because people like to pick words, meaningful sequences, etc. Hashing it once with SHA-256 doesn't really alleviate this issue, but hashing it with a construct like Argon2 that is designed to be slow makes a brute-force attack far less viable.
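
To make the speed difference concrete, here is a small sketch that times a single SHA-256 call against PBKDF2 from the standard library. The password, the salt handling, and the iteration count of 600000 are illustrative assumptions; the absolute timings depend on the machine, so none are quoted here.

```python
import hashlib
import os
import time

password = b"correct horse"
salt = os.urandom(16)  # random salt, stored alongside the ciphertext

# One plain SHA-256 call: what the question's code does.
start = time.perf_counter()
fast_key = hashlib.sha256(password).digest()
fast_seconds = time.perf_counter() - start

# A deliberately slow, salted derivation of a 32-byte key.
start = time.perf_counter()
slow_key = hashlib.pbkdf2_hmac("sha256", password, salt, 600_000, dklen=32)
slow_seconds = time.perf_counter() - start

print(f"SHA-256 once:              {fast_seconds:.6f} s")
print(f"PBKDF2, 600000 iterations: {slow_seconds:.6f} s")
```

Every password guess costs the attacker the slower figure instead of the faster one, which is the whole point of a password-based KDF.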

Answer 2:

Doesn't use the Crypto package, but this should suit your needs:

import base64
import os

from cryptography.fernet import Fernet
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.kdf.scrypt import Scrypt


def derive_password(password: bytes, salt: bytes):
    """
    Adjust the N parameter depending on how long you want the derivation to take.
    The scrypt paper suggests a minimum value of n=2**14 for interactive logins (t < 100ms),
    or n=2**20 for more sensitive files (t < 5s).
    """
    kdf = Scrypt(salt=salt, length=32, n=2**16, r=8, p=1, backend=default_backend())
    key = kdf.derive(password)
    return base64.urlsafe_b64encode(key)


salt = os.urandom(16)
password = b'legorooj'
bad_password = b'legorooj2'

# Derive the password
key = derive_password(password, salt)
key2 = derive_password(bad_password, salt)  # Shouldn't re-use salt but this is only for example purposes

# Create the Fernet Object
f = Fernet(key)

msg = b'This is a test message'

ciphertext = f.encrypt(msg)

print(msg, flush=True)  # Flushing pushes output straight to stdout, so it appears before the error raised below
print(ciphertext, flush=True)

# A Fernet object can be reused; it is re-created here only to mimic a separate decryption step
f = Fernet(key)

plaintext = f.decrypt(ciphertext)

print(plaintext, flush=True)

# Bad Key
f = Fernet(key2)
f.decrypt(ciphertext)
"""
This raises cryptography.fernet.InvalidToken (with InvalidSignature as the underlying cause),
which means the key is wrong or the token was altered.
"""

See my comment for links to the documentation.

Answer 3:

For future reference, here is a working solution following the AES GCM mode (recommended by @LukeJoshuaPark in his answer):

from Crypto.Cipher import AES
from Crypto.Random import get_random_bytes

# Encryption
data = b"secret"
key = get_random_bytes(16)
cipher = AES.new(key, AES.MODE_GCM)
ciphertext, tag = cipher.encrypt_and_digest(data)
nonce = cipher.nonce

# Decryption
key2 = get_random_bytes(16)  # wrong key
#key2 = key  # correct key
try:
    cipher = AES.new(key2, AES.MODE_GCM, nonce=nonce)
    plaintext = cipher.decrypt_and_verify(ciphertext, tag)
    print("The message was: " + plaintext.decode())
except ValueError:
    print("Wrong key")

It does indeed fail with an exception when the key is wrong, as desired.

The following code uses a real password-based key derivation function (PBKDF2), with the nonce doubling as the salt:

import Crypto.Random, Crypto.Protocol.KDF, Crypto.Cipher.AES

def cipherAES(pwd, nonce):
    return Crypto.Cipher.AES.new(Crypto.Protocol.KDF.PBKDF2(pwd, nonce, count=100000), Crypto.Cipher.AES.MODE_GCM, nonce=nonce)

# encryption
nonce = Crypto.Random.new().read(16)
cipher = cipherAES(b'pwd1', nonce)
ciphertext, tag = cipher.encrypt_and_digest(b'bonjour')

# decryption
try:
    cipher = cipherAES(b'pwd1', nonce=nonce)
    plaintext = cipher.decrypt_and_verify(ciphertext, tag)
    print("The message was: " + plaintext.decode())
except ValueError:
    print("Wrong password")

@fgrieu's answer is probably better because it uses scrypt as KDF.

Answer 4:

The best way is to use authenticated encryption, and a modern memory-hard entropy-stretching key derivation function such as scrypt to turn the password into a key. The cipher's nonce can be used as salt for the key derivation. With PyCryptodome that could be:

from Crypto.Random       import get_random_bytes
from Crypto.Cipher       import AES
from Crypto.Protocol.KDF import scrypt

# initialize an AES-128-GCM cipher from password (derived using scrypt) and nonce
def cipherAES(pwd, nonce):
    # note: the p parameter should allow use of several processors, but did not for me
    # note: changing 16 to 24 or 32 should select AES-192 or AES-256 (not tested)
    return AES.new(scrypt(pwd, nonce, 16, N=2**21, r=8, p=1), AES.MODE_GCM, nonce=nonce)

# encryption
nonce = get_random_bytes(16)
print("deriving key from password and nonce, then encrypting..")
ciphertext, tag = cipherAES(b'pwdHklot2',nonce).encrypt_and_digest(b'bonjour')
print("done")

# decryption of nonce, ciphertext, tag
print("deriving key from password and nonce, then decrypting..")
try:
    plaintext = cipherAES(b'pwdHklot2', nonce).decrypt_and_verify(ciphertext, tag)
    print("The message was: " + plaintext.decode())
except ValueError:
    print("Wrong password or altered nonce, ciphertext, tag")
print("done")

Note: The code is here to illustrate the principle. In particular, the scrypt parameters should not be fixed, but rather be included in a header before the nonce, ciphertext, and tag; all of that must be grouped together for sending, and parsed again for decryption.
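
One possible way to group these fields, sketched with the standard `struct` module. The layout (one byte for log2(N), four bytes each for r and p, then a 16-byte nonce and a 16-byte tag, followed by the ciphertext) and the function names are hypothetical choices made for illustration, not a standard format.

```python
import struct

# Hypothetical fixed-size header: log2(N), r, p, nonce, tag (big-endian, no padding).
HEADER = struct.Struct(">BII16s16s")

def pack_message(log2_n, r, p, nonce, tag, ciphertext):
    # Header first, then the variable-length ciphertext.
    return HEADER.pack(log2_n, r, p, nonce, tag) + ciphertext

def unpack_message(blob):
    # Split a received message back into its parts.
    if len(blob) < HEADER.size:
        raise ValueError("message too short")
    log2_n, r, p, nonce, tag = HEADER.unpack_from(blob)
    return log2_n, r, p, nonce, tag, blob[HEADER.size:]

# Placeholder values stand in for the real nonce, tag and ciphertext.
message = pack_message(21, 8, 1, b"N" * 16, b"T" * 16, b"ciphertext")
log2_n, r, p, nonce, tag, ciphertext = unpack_message(message)
print(2**log2_n, r, p, len(nonce), len(tag), ciphertext)
```

On decryption, the parsed parameters would be passed to scrypt before calling decrypt_and_verify; it is prudent to reject unreasonably large values first, since they come from untrusted input.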

Caveat: nothing in this post should be construed as an endorsement of PyCryptodome's security.

Addition (per request):

We need scrypt or some other form of entropy stretching only because we use a password. We could use a random 128-bit key directly.

PBKDF2-HMAC-SHAn with 100000 iterations (as in the OP's second code fragment) is only barely passable to resist Hashcat with a few GPUs. It would be almost negligible compared to other hurdles for an ASIC-assisted attack: a state-of-the-art Bitcoin mining ASIC does more than 2*10¹⁰ SHA-256 per Joule, and 1 kWh of electricity, costing less than $0.15, is 36*10⁵ J. Crunching these numbers, testing the (62⁽⁸⁺¹⁾-1)/(62-1) = 221919451578091 passwords of up to 8 characters restricted to letters and digits costs less than $47 for the energy dedicated to the hashing part.
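
The arithmetic behind that estimate can be reproduced as follows. The sketch assumes that one PBKDF2 iteration costs about as much as one of the ASIC's hash operations, which is the simplification that matches the quoted figure; the variable names are illustrative.

```python
# Passwords of length 0 to 8 over 62 symbols (letters and digits).
candidates = (62**9 - 1) // (62 - 1)

iterations = 100_000        # PBKDF2 iteration count from the code above
hashes_per_joule = 2e10     # ASIC efficiency quoted in the text
joules_per_kwh = 3.6e6      # 1 kWh = 36*10^5 J
usd_per_kwh = 0.15          # electricity price quoted in the text

# Assumption: one hash operation per PBKDF2 iteration.
total_hashes = candidates * iterations
energy_kwh = total_hashes / hashes_per_joule / joules_per_kwh
cost_usd = energy_kwh * usd_per_kwh

print(candidates)
print(f"{energy_kwh:.0f} kWh, about ${cost_usd:.2f}")
```

The result lands just above $46, consistent with the "less than $47" bound.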

scrypt is much more secure for equal time spent by legitimate users because it requires a lot of memory and accesses thereof, slowing down the attacker, and most importantly making the investment cost for massively parallel attack skyrocket.
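
To give a rough sense of "a lot of memory": scrypt's working set is approximately 128 * N * r bytes, plus a small buffer that grows with p. The helper below is a hypothetical illustration of that approximation, not part of any library, and ignores implementation overhead.

```python
def scrypt_memory_bytes(n, r, p=1):
    # Large array of N blocks of 128*r bytes, plus p input blocks of 128*r bytes.
    return 128 * r * n + 128 * r * p

# N values mentioned in the answers above, all with r=8.
for log2_n in (14, 16, 21):
    mib = scrypt_memory_bytes(2**log2_n, 8) / 2**20
    print(f"N=2**{log2_n}, r=8: about {mib:.0f} MiB")
```

With N=2**21 and r=8, as in the code above, each password guess needs on the order of 2 GiB, which is what makes massively parallel hardware expensive.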

Source: https://stackoverflow.com/questions/59145627/aes-how-to-detect-that-a-bad-password-has-been-entered

Tags

python

encryption

cryptography

aes