How to Decrypt OpenSSL AES-Encrypted Files in Python

How to decrypt OpenSSL AES-encrypted files in Python?

Given the popularity of Python, at first I was disappointed that there was no complete answer to this question to be found. It took me a fair amount of reading different answers on this board, as well as other resources, to get it right. I thought I might share the result for future reference and perhaps review; I'm by no means a cryptography expert! However, the code below appears to work seamlessly:

from hashlib import md5
from Crypto.Cipher import AES
from Crypto import Random

def derive_key_and_iv(password, salt, key_length, iv_length):
    d = d_i = ''
    while len(d) < key_length + iv_length:
        d_i = md5(d_i + password + salt).digest()
        d += d_i
    return d[:key_length], d[key_length:key_length+iv_length]

def decrypt(in_file, out_file, password, key_length=32):
    bs = AES.block_size
    salt = in_file.read(bs)[len('Salted__'):]
    key, iv = derive_key_and_iv(password, salt, key_length, bs)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    next_chunk = ''
    finished = False
    while not finished:
        chunk, next_chunk = next_chunk, cipher.decrypt(in_file.read(1024 * bs))
        if len(next_chunk) == 0:
            padding_length = ord(chunk[-1])
            chunk = chunk[:-padding_length]
            finished = True
        out_file.write(chunk)

Usage:

with open(in_filename, 'rb') as in_file, open(out_filename, 'wb') as out_file:
    decrypt(in_file, out_file, password)

If you see a chance to improve on this or extend it to be more flexible (e.g. make it work without salt, or provide Python 3 compatibility), please feel free to do so.
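
As a rough sketch of the Python 3 compatibility mentioned above: the two places where the code depends on Python 2 are the str-based key derivation and the ord() call on the last padding byte. The versions below work on bytes instead. The names derive_key_and_iv_py3 and strip_padding are illustrative, not part of the original answer, and the password is assumed to be passed as bytes (e.g. b'secret').

```python
from hashlib import md5

def derive_key_and_iv_py3(password, salt, key_length, iv_length):
    """Same MD5-based loop as above, but operating on bytes (Python 3)."""
    d = d_i = b''
    while len(d) < key_length + iv_length:
        # Each round hashes the previous digest plus password and salt.
        d_i = md5(d_i + password + salt).digest()
        d += d_i
    return d[:key_length], d[key_length:key_length + iv_length]

def strip_padding(last_chunk):
    """Drop the padding from the final decrypted chunk.

    In Python 3, indexing bytes already yields an int, so ord() is not
    needed. Like the original, this does not validate the padding.
    """
    padding_length = last_chunk[-1]
    return last_chunk[:-padding_length]
```

With these two changes, the empty-string initial value of next_chunk in decrypt() would also need to become b''.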

Notice

This answer used to also concern encryption in Python using the same scheme. I have since removed that part to discourage anyone from using it. Do NOT encrypt any more data in this way, because it is NOT secure by today's standards. You should ONLY use decryption, for no other reasons than BACKWARD COMPATIBILITY, i.e. when you have no other choice. Want to encrypt? Use NaCl/libsodium if you possibly can.

Decrypting AES CBC in Python from OpenSSL AES

The OpenSSL statement uses PBKDF2 to create a 32-byte key and a 16-byte IV. For this, a random 8-byte salt is implicitly generated, and the specified password, iteration count and digest (default: SHA-256) are applied. The key/IV pair is used to encrypt the plaintext with AES-256 in CBC mode with PKCS7 padding. The result is returned in OpenSSL format, which starts with the 8-byte ASCII encoding of Salted__, followed by the 8-byte salt and the actual ciphertext, all Base64 encoded. The salt is needed for decryption so that the key and IV can be reconstructed.

Note that the password in the OpenSSL statement is actually passed without quotation marks, i.e. in the posted OpenSSL statement, the quotation marks are part of the password.

For the decryption in Python the salt and the actual ciphertext must first be determined from the encrypted data. With the salt the key/IV pair can be reconstructed. Finally, the key/IV pair can be used for decryption.

Example: With the posted OpenSSL statement, the plaintext

The quick brown fox jumps over the lazy dog

was encrypted into the ciphertext

U2FsdGVkX18A+AhjLZpfOq2HilY+8MyrXcz3lHMdUII2cud0DnnIcAtomToclwWOtUUnoyTY2qCQQXQfwDYotw== 

Decryption with Python is possible as follows (using PyCryptodome):

from Crypto.Protocol.KDF import PBKDF2
from Crypto.Hash import SHA256
from Crypto.Util.Padding import unpad
from Crypto.Cipher import AES
import base64

# Determine salt and ciphertext
encryptedDataB64 = 'U2FsdGVkX18A+AhjLZpfOq2HilY+8MyrXcz3lHMdUII2cud0DnnIcAtomToclwWOtUUnoyTY2qCQQXQfwDYotw=='
encryptedData = base64.b64decode(encryptedDataB64)
salt = encryptedData[8:16]
ciphertext = encryptedData[16:]

# Reconstruct Key/IV-pair
pbkdf2Hash = PBKDF2(b'"mypassword"', salt, 32 + 16, count=100000, hmac_hash_module=SHA256)
key = pbkdf2Hash[0:32]
iv = pbkdf2Hash[32:32 + 16]

# Decrypt with AES-256 / CBC / PKCS7 Padding
cipher = AES.new(key, AES.MODE_CBC, iv)
decrypted = unpad(cipher.decrypt(ciphertext), 16)

print(decrypted)

Edit - Regarding your comment: 16 MB should be possible, but for larger data the ciphertext would generally be read from a file and the decrypted data would be written to a file, in contrast to the example posted above.

Whether the data can be decrypted in one step ultimately depends on the available memory. If the memory is not sufficient, the data must be processed in chunks.

When processing in chunks, it makes more sense to store the encrypted data directly in binary format rather than Base64 encoded. This is possible by omitting the -a option in the OpenSSL statement. Otherwise, you must ensure that each chunk decodes to an integer multiple of the block size, keeping in mind that 3 bytes of raw ciphertext correspond to 4 characters of Base64-encoded ciphertext.
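
For the Base64 case, one way to satisfy this constraint is to read groups of 64 Base64 characters: 64 characters decode to 48 bytes, which is exactly three AES blocks. The following is only a sketch under that assumption. The helper name iter_decoded_chunks and its groups_per_chunk parameter are illustrative; the input is assumed to be a text-mode file object, and line breaks in the encoded data are tolerated. The first 16 bytes of the first chunk are still the Salted__ header.

```python
import base64

def iter_decoded_chunks(b64_file, groups_per_chunk=256):
    """Yield decoded chunks whose length is a multiple of 48 bytes
    (three 16-byte AES blocks), except possibly the final chunk."""
    want = 64 * groups_per_chunk          # Base64 characters per chunk
    buffer = ''
    while True:
        text = b64_file.read(want)
        if not text:
            break
        buffer += ''.join(text.split())   # drop line breaks and spaces
        while len(buffer) >= want:
            yield base64.b64decode(buffer[:want])
            buffer = buffer[want:]
    if buffer:
        # Remainder, including any trailing '=' padding characters.
        yield base64.b64decode(buffer)
```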

In the case of ciphertext stored in binary form: in the first step, only the first block (16 bytes) should be read (in binary mode). The salt can be determined from it (bytes 8 through 15, i.e. the slice [8:16]), followed by the key and IV (analogous to the code posted above).

The rest of the ciphertext can be read (in binary mode) in chunks of suitable size (a multiple of the block size, e.g. 1024 bytes). Each chunk is decrypted with a separate decrypt call on the same cipher object, and the padding is removed only from the last chunk.
Further details are best answered within the scope of a separate question.
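
The chunked approach described above can be sketched as follows. This is not a definitive implementation: the function name decrypt_in_chunks is illustrative, and the cipher argument is assumed to be any object with a decrypt(bytes) method that keeps its CBC state between calls (for example a PyCryptodome AES cipher created with AES.MODE_CBC). The input file is assumed to be opened in binary mode and already positioned after the 16-byte header.

```python
BLOCK_SIZE = 16  # AES block size in bytes

def decrypt_in_chunks(in_file, out_file, cipher, chunk_size=1024):
    """Decrypt in_file to out_file chunk by chunk, unpadding the last chunk."""
    if chunk_size % BLOCK_SIZE != 0:
        raise ValueError('chunk_size must be a multiple of the block size')
    previous = b''
    while True:
        data = in_file.read(chunk_size)
        if not data:
            break
        # Write the chunk decrypted in the previous round and hold back
        # the newest one, because it may turn out to be the padded last chunk.
        out_file.write(previous)
        previous = cipher.decrypt(data)
    if not previous:
        raise ValueError('no ciphertext found')
    pad_len = previous[-1]
    if not 1 <= pad_len <= BLOCK_SIZE:
        raise ValueError('invalid padding')
    out_file.write(previous[:-pad_len])
```

Holding back one chunk avoids having to know the file size in advance: the padding is only stripped once the read returns no more data.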

How to decrypt a file in Python which was encrypted using OpenSSL

The password is not the key. OpenSSL uses EVP_BytesToKey to create an appropriate key (and IV, if necessary) from the password and the salt.

As James K Polk mentions in a comment, you can use the -P (or -p) option to tell OpenSSL to print the key (in hex), which you can then pass to Crypto.Cipher. Alternatively, you can implement EVP_BytesToKey in Python, as shown below. This is a simplified version of EVP_BytesToKey that uses no salt, and the default value of 1 for the count arg.

As the EVP_BytesToKey docs state, this is a rather weak password derivation function. As the hashlib docs mention, modern password derivation normally performs hundreds of thousands of hashes to make password hashing attacks very slow.

We also need a function to remove the PKCS7 padding from the decrypted data bytes. The unpad function below simply assumes that the padding data is valid. In real software the unpad function must verify that the padded data is valid to prevent padding-based attacks. My unpad function also assumes the data has been encoded as UTF-8 bytes and decodes the unpadded data to text.

from __future__ import print_function
from Crypto.Cipher import AES
from base64 import b64decode
from hashlib import md5

def evp_simple(data):
    out = ''
    while len(out) < 32:
        out += md5(out + data).digest()
    return out[:32]

def unpad(s):
    offset = ord(s[-1])
    return s[:-offset].decode('utf-8')

iv = 'a2a8a78be66075c94ca5be53c8865251'.decode('hex')
passwd = '00112233445566778899aabbccddeeff'
key = evp_simple(passwd)
print('key', key.encode('hex'))

aes = AES.new(key, AES.MODE_CBC, IV=iv)

data = b64decode('pt7DqtAwtTjPbTlzVApucQ==')

raw = aes.decrypt(data)
print(repr(raw), len(raw))
plain = unpad(raw)
print(repr(plain), len(plain))

output

key b4377f7babf2991b7d6983c4d3e19cd4dd37e31af1c9c689ca22e90e365be18b
'hello world\n\x04\x04\x04\x04' 16
u'hello world\n' 12

That code will not run on Python 3, so here's a Python 3 version.

from Crypto.Cipher import AES
from base64 import b64decode
from hashlib import md5

def evp_simple(data):
    out = b''
    while len(out) < 32:
        out += md5(out + data).digest()
    return out[:32]

def unpad(s):
    offset = s[-1]
    return s[:-offset].decode('utf-8')

iv = bytes.fromhex('a2a8a78be66075c94ca5be53c8865251')
passwd = b'00112233445566778899aabbccddeeff'
key = evp_simple(passwd)

aes = AES.new(key, AES.MODE_CBC, IV=iv)

data = b64decode('pt7DqtAwtTjPbTlzVApucQ==')

raw = aes.decrypt(data)
print(repr(raw), len(raw))
plain = unpad(raw)
print(repr(plain), len(plain))

output

b'hello world\n\x04\x04\x04\x04' 16
'hello world\n' 12
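
The unpad functions above trust the last byte blindly. As a minimal sketch of the validity check mentioned earlier, a stricter variant for Python 3 could look like the following. The name unpad_pkcs7 is illustrative; it returns bytes and leaves UTF-8 decoding to the caller.

```python
def unpad_pkcs7(data, block_size=16):
    """Remove PKCS7 padding, raising ValueError if it is malformed."""
    if not data or len(data) % block_size != 0:
        raise ValueError('data length is not a positive multiple of the block size')
    pad_len = data[-1]
    if not 1 <= pad_len <= block_size:
        raise ValueError('invalid padding length')
    # Every padding byte must have the same value as the padding length.
    if data[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError('invalid padding bytes')
    return data[:-pad_len]
```

Applied to the raw value printed above, unpad_pkcs7(raw).decode('utf-8') should give the same text as the simpler unpad.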

How to encrypt a file so that OpenSSL can decrypt it without providing the IV manually

In order to decrypt an AES-encrypted file you need both the key and the IV. The IV is not a secret and is usually stored with the encrypted file.

OpenSSL uses a key derivation function to generate these two from the provided password and a random salt. After encryption, it stores the salt in the header of the file with the Salted__ prefix, so that on decryption it can use the salt along with the password to produce the same key and IV.

package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"io"
	"os"

	"golang.org/x/crypto/pbkdf2"
)

func main() {
	keySize := 32
	// Hard-coded password, for demonstration purposes only.
	password := []byte("TESTPASSWORD1234TESTPASSWORD1234")

	bReader, err := os.Open("doc.docx")
	if err != nil {
		panic(err)
	}
	defer bReader.Close()

	// Random 8-byte salt, stored in the file header after "Salted__".
	salt := make([]byte, 8)
	if _, err := io.ReadFull(rand.Reader, salt); err != nil {
		panic(err)
	}

	// Derive key and IV together from the password and salt.
	computed := pbkdf2.Key(password, salt, 10000, keySize+aes.BlockSize, sha256.New)
	key := computed[:keySize]
	iv := computed[keySize:]

	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	stream := cipher.NewOFB(block, iv)

	bWriter, err := os.Create("doc-encrypted.docx")
	if err != nil {
		panic(err)
	}
	defer bWriter.Close()

	// OpenSSL header: "Salted__" followed by the salt.
	header := append([]byte("Salted__"), salt...)
	if _, err := bWriter.Write(header); err != nil {
		panic(err)
	}

	sWriter := &cipher.StreamWriter{S: stream, W: bWriter}
	if _, err := io.Copy(sWriter, bReader); err != nil {
		panic(err)
	}
}

You can then decrypt it with openssl enc -in doc-encrypted.docx -out doc-decrypted.docx -d -aes-256-ofb -pbkdf2 -pass pass:TESTPASSWORD1234TESTPASSWORD1234
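
To read the same header from Python, the key and IV can be derived with the standard library alone. The sketch below assumes the parameters used in the Go code above (PBKDF2 with HMAC-SHA256, 10000 iterations, 32-byte key, 16-byte IV); the function name derive_key_iv_pbkdf2 is illustrative. The actual decryption would still need an AES implementation such as PyCryptodome.

```python
import hashlib

def derive_key_iv_pbkdf2(password, header, iterations=10000,
                         key_len=32, iv_len=16):
    """Derive (key, iv) from the 16-byte file header and the password."""
    if len(header) != 16 or header[:8] != b'Salted__':
        raise ValueError('expected a 16-byte header starting with Salted__')
    salt = header[8:]
    # One PBKDF2 call yields key and IV back to back, as in the Go code.
    material = hashlib.pbkdf2_hmac('sha256', password, salt,
                                   iterations, key_len + iv_len)
    return material[:key_len], material[key_len:]
```

A typical call would read the first 16 bytes of the encrypted file in binary mode and pass them as header, with the password given as bytes.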

How to use OpenSSL to encrypt/decrypt files?

Security Warning: AES-256-CBC does not provide authenticated encryption and is vulnerable to padding oracle attacks. You should use something like age instead.

Encrypt:

openssl aes-256-cbc -a -salt -pbkdf2 -in secrets.txt -out secrets.txt.enc

Decrypt:

openssl aes-256-cbc -d -a -pbkdf2 -in secrets.txt.enc -out secrets.txt.new

More details on the various flags

AES-128 CBC encryption in python

With Python3 you can use PyCryptodome, binascii and base64.

from base64 import b64encode, b64decode
from binascii import unhexlify

from Crypto.Cipher import AES
from Crypto.Util.Padding import pad, unpad

iv = "7bde5a0f3f39fd658efc45de143cbc94"
password = "3e83b13d99bf0de6c6bde5ac5ca4ae68"
msg = "this is a message"

print(f"IV: {iv}")
print(f"PWD: {password}")
print(f"MSG: {msg}")

# Convert Hex String to Binary
iv = unhexlify(iv)
password = unhexlify(password)

# Pad to AES Block Size
msg = pad(msg.encode(), AES.block_size)
# Encipher Text
cipher = AES.new(password, AES.MODE_CBC, iv)
cipher_text = cipher.encrypt(msg)

# Encode Cipher_text as Base 64 and decode to String
out = b64encode(cipher_text).decode('utf-8')
print(f"OUT: {out}")

# Decipher cipher text
decipher = AES.new(password, AES.MODE_CBC, iv)
# UnPad Based on AES Block Size
plaintext = unpad(decipher.decrypt(b64decode(out)), AES.block_size).decode('utf-8')
print(f'PT: {plaintext}')

You can see more:

  1. AES-128 CBC decryption in Python
  2. Python Encrypting with PyCrypto AES

