How to decrypt OpenSSL AES-encrypted files in Python?
Given the popularity of Python, at first I was disappointed that there was no complete answer to this question to be found. It took me a fair amount of reading different answers on this board, as well as other resources, to get it right. I thought I might share the result for future reference and perhaps review; I'm by no means a cryptography expert! However, the code below appears to work seamlessly:
from hashlib import md5
from Crypto.Cipher import AES
from Crypto import Random

def derive_key_and_iv(password, salt, key_length, iv_length):
    d = d_i = ''
    while len(d) < key_length + iv_length:
        d_i = md5(d_i + password + salt).digest()
        d += d_i
    return d[:key_length], d[key_length:key_length+iv_length]

def decrypt(in_file, out_file, password, key_length=32):
    bs = AES.block_size
    salt = in_file.read(bs)[len('Salted__'):]
    key, iv = derive_key_and_iv(password, salt, key_length, bs)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    next_chunk = ''
    finished = False
    while not finished:
        chunk, next_chunk = next_chunk, cipher.decrypt(in_file.read(1024 * bs))
        if len(next_chunk) == 0:
            padding_length = ord(chunk[-1])
            chunk = chunk[:-padding_length]
            finished = True
        out_file.write(chunk)
Usage:
with open(in_filename, 'rb') as in_file, open(out_filename, 'wb') as out_file:
    decrypt(in_file, out_file, password)
If you see a chance to improve on this or extend it to be more flexible (e.g. make it work without salt, or provide Python 3 compatibility), please feel free to do so.
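As a starting point for the two extensions mentioned, here is a minimal sketch of how the key derivation could look on Python 3, together with a check for the optional salt header. The helper name read_salt and the sample password are my own assumptions, not part of the original answer, and only the standard library is used.

```python
from hashlib import md5

MAGIC = b'Salted__'  # 8-byte marker OpenSSL writes before the salt

def derive_key_and_iv(password, salt, key_length, iv_length):
    """MD5-based derivation from the answer above, ported to bytes."""
    d = d_i = b''
    while len(d) < key_length + iv_length:
        d_i = md5(d_i + password + salt).digest()
        d += d_i
    return d[:key_length], d[key_length:key_length + iv_length]

def read_salt(first_block):
    """Return (salt, leftover) for the first 16 bytes of the file.

    With the magic present, bytes 8..16 are the salt and nothing is left over.
    Without it, the file is assumed to be unsalted, so the whole block is
    ciphertext that still has to be decrypted.
    """
    if first_block.startswith(MAGIC):
        return first_block[len(MAGIC):], b''
    return b'', first_block

password = 'correct horse'.encode('utf-8')  # hypothetical password, as bytes
salt, leftover = read_salt(MAGIC + bytes(range(8)))
key, iv = derive_key_and_iv(password, salt, 32, 16)
print(len(key), len(iv))
```

In the decrypt loop itself, the remaining Python 3 changes would be to start with next_chunk = b'' and to replace ord(chunk[-1]) with chunk[-1], since indexing bytes already yields an int there.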
Notice
This answer used to also concern encryption in Python using the same scheme. I have since removed that part to discourage anyone from using it. Do NOT encrypt any more data in this way, because it is NOT secure by today's standards. You should ONLY use decryption, for no other reasons than BACKWARD COMPATIBILITY, i.e. when you have no other choice. Want to encrypt? Use NaCl/libsodium if you possibly can.
Decrypting AES CBC in python from OpenSSL AES
The OpenSSL statement uses PBKDF2 to create a 32-byte key and a 16-byte IV. For this, a random 8-byte salt is implicitly generated, and the specified password, iteration count and digest (default: SHA-256) are applied. The key/IV pair is used to encrypt the plaintext with AES-256 in CBC mode with PKCS7 padding. The result is returned in OpenSSL format: the 8-byte ASCII encoding of Salted__, followed by the 8-byte salt and the actual ciphertext, all Base64 encoded. The salt is needed for decryption so that the key and IV can be reconstructed.
Note that the password in the OpenSSL statement is actually passed without quotation marks, i.e. in the posted OpenSSL statement, the quotation marks are part of the password.
For the decryption in Python the salt and the actual ciphertext must first be determined from the encrypted data. With the salt the key/IV pair can be reconstructed. Finally, the key/IV pair can be used for decryption.
Example: With the posted OpenSSL statement, the plaintext
The quick brown fox jumps over the lazy dog
was encrypted into the ciphertext
U2FsdGVkX18A+AhjLZpfOq2HilY+8MyrXcz3lHMdUII2cud0DnnIcAtomToclwWOtUUnoyTY2qCQQXQfwDYotw==
Decryption with Python is possible as follows (using PyCryptodome):
from Crypto.Protocol.KDF import PBKDF2
from Crypto.Hash import SHA256
from Crypto.Util.Padding import unpad
from Crypto.Cipher import AES
import base64
# Determine salt and ciphertext
encryptedDataB64 = 'U2FsdGVkX18A+AhjLZpfOq2HilY+8MyrXcz3lHMdUII2cud0DnnIcAtomToclwWOtUUnoyTY2qCQQXQfwDYotw=='
encryptedData = base64.b64decode(encryptedDataB64)
salt = encryptedData[8:16]
ciphertext = encryptedData[16:]
# Reconstruct Key/IV-pair
pbkdf2Hash = PBKDF2(b'"mypassword"', salt, 32 + 16, count=100000, hmac_hash_module=SHA256)
key = pbkdf2Hash[0:32]
iv = pbkdf2Hash[32:32 + 16]
# Decrypt with AES-256 / CBC / PKCS7 Padding
cipher = AES.new(key, AES.MODE_CBC, iv)
decrypted = unpad(cipher.decrypt(ciphertext), 16)
print(decrypted)
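The slicing in the code above assumes that the Salted__ header is present and that the ciphertext is complete. A slightly more defensive sketch of the "determine salt and ciphertext" step could look like this. The function name split_openssl_blob is an assumption of mine, and only the standard library is needed:

```python
import base64

MAGIC = b'Salted__'

def split_openssl_blob(blob_b64):
    """Split Base64-encoded OpenSSL output into (salt, ciphertext)."""
    raw = base64.b64decode(blob_b64)
    if raw[:8] != MAGIC:
        raise ValueError('missing Salted__ header')
    salt, ciphertext = raw[8:16], raw[16:]
    # CBC ciphertext with PKCS7 padding is always at least one full block
    if len(ciphertext) == 0 or len(ciphertext) % 16 != 0:
        raise ValueError('ciphertext is not a positive multiple of 16 bytes')
    return salt, ciphertext

salt, ciphertext = split_openssl_blob(
    'U2FsdGVkX18A+AhjLZpfOq2HilY+8MyrXcz3lHMdUII2cud0DnnIcAtomToclwWOtUUnoyTY2qCQQXQfwDYotw=='
)
print(salt.hex(), len(ciphertext))
```

The returned salt and ciphertext can then be fed into the PBKDF2 and AES steps exactly as in the posted code.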
Edit - Regarding your comment: 16 MB should be possible, but for larger data the ciphertext would generally be read from a file and the decrypted data would be written to a file, in contrast to the example posted above.
Whether the data can be decrypted in one step ultimately depends on the available memory. If the memory is not sufficient, the data must be processed in chunks.
When using chunks it would make more sense not to Base64 encode the encrypted data but to store them directly in binary format. This is possible by omitting the -a option in the OpenSSL statement. Otherwise it must be ensured that always integer multiples of the block size (relative to the undecoded ciphertext) are loaded, where 3 bytes of the undecoded ciphertext correspond to 4 bytes of the Base64 encoded ciphertext.
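To illustrate the alignment constraint for Base64 input: 64 Base64 characters decode to 48 bytes, which is exactly 3 AES blocks, so reading in multiples of 64 characters keeps every decoded chunk block-aligned. The following is only a sketch under that assumption; the function name iter_decoded_chunks and the parameter groups_per_chunk are made up for this example.

```python
import base64
import io

BLOCK = 16       # AES block size in bytes
B64_GROUP = 64   # 64 Base64 chars decode to 48 bytes = 3 AES blocks

def iter_decoded_chunks(text_file, groups_per_chunk=1024):
    """Yield binary chunks whose length is a multiple of the AES block size."""
    want = B64_GROUP * groups_per_chunk
    pending = ''
    while True:
        data = text_file.read(want)
        if not data:
            break
        # openssl -a wraps its output into lines, so drop the line breaks first
        pending += ''.join(data.split())
        usable = len(pending) - len(pending) % B64_GROUP
        if usable:
            yield base64.b64decode(pending[:usable])
            pending = pending[usable:]
    if pending:
        yield base64.b64decode(pending)  # final, possibly shorter, piece

# Round trip with dummy data standing in for header + ciphertext
payload = bytes(range(256)) * 3 + bytes(16)            # 784 bytes, a multiple of 16
encoded = base64.encodebytes(payload).decode('ascii')  # line-wrapped Base64
chunks = list(iter_decoded_chunks(io.StringIO(encoded), groups_per_chunk=4))
print(b''.join(chunks) == payload, all(len(c) % BLOCK == 0 for c in chunks))
```

Each yielded chunk could then be passed to the cipher's decrypt call, with the first 16 bytes of the first chunk treated as the header.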
In the case of the binary stored ciphertext: During decryption only the first block (16 bytes) should be (binary) read in the first step. From this, the salt can be determined (the bytes 8 to 16), then the key and IV (analogous to the posted code above).
The rest of the ciphertext can be read in binary mode in chunks of a suitable size (a multiple of the block size, e.g. 1024 bytes). Each chunk is decrypted with a separate decrypt call on the same cipher object, which carries the CBC state from one call to the next. The decrypted chunks are written to the output file in the same way.
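The chunked loop might be sketched as follows. This is an illustration rather than a finished implementation: the function name decrypt_stream is my own, and the cipher argument is deliberately generic (any object with a decrypt method that keeps its chaining state between calls). In practice it would be the object returned by AES.new(key, AES.MODE_CBC, iv), created after the 16-byte header has been read and the key and IV derived. The last decrypted chunk is held back so the PKCS7 padding can be removed before it is written.

```python
def decrypt_stream(cipher, in_file, out_file, chunk_size=64 * 1024, block_size=16):
    """Decrypt in_file to out_file chunk by chunk, stripping PKCS7 padding at the end."""
    if chunk_size % block_size:
        raise ValueError('chunk_size must be a multiple of block_size')
    held = b''  # most recent decrypted chunk, not yet written
    while True:
        chunk = in_file.read(chunk_size)
        if not chunk:
            break
        out_file.write(held)          # the previous chunk was not the last one
        held = cipher.decrypt(chunk)
    if not held:
        raise ValueError('no ciphertext after the header')
    pad_len = held[-1]
    if not 1 <= pad_len <= block_size or held[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError('bad padding (wrong password or corrupted file?)')
    out_file.write(held[:-pad_len])
```

Both files are expected to be opened in binary mode, with in_file already positioned just past the Salted__ header and salt.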
Further details are best answered within the scope of a separate question.
how to decrypt a file in python which was encrypted using openssl
The password is not the key. OpenSSL uses EVP_BytesToKey to create an appropriate key (and IV, if necessary) from the password and the salt.
As James K Polk mentions in a comment, you can use the -P (or -p) option to tell OpenSSL to print the key (in hex), which you can then pass to Crypto.Cipher. Alternatively, you can implement EVP_BytesToKey in Python, as shown below. This is a simplified version of EVP_BytesToKey that uses no salt, and the default value of 1 for the count arg.
As the EVP_BytesToKey docs state, this is a rather weak password derivation function. As the hashlib docs mention, modern password derivation normally performs hundreds of thousands of hashes to make password hashing attacks very slow.
We also need a function to remove the PKCS7 padding from the decrypted data bytes. The unpad function below simply assumes that the padding data is valid. In real software the unpad function must verify that the padded data is valid to prevent padding-based attacks. My unpad function also assumes the data has been encoded as UTF-8 bytes and decodes the unpadded data to text.
from __future__ import print_function
from Crypto.Cipher import AES
from base64 import b64decode
from hashlib import md5

def evp_simple(data):
    out = ''
    while len(out) < 32:
        out += md5(out + data).digest()
    return out[:32]

def unpad(s):
    offset = ord(s[-1])
    return s[:-offset].decode('utf-8')

iv = 'a2a8a78be66075c94ca5be53c8865251'.decode('hex')
passwd = '00112233445566778899aabbccddeeff'
key = evp_simple(passwd)
print('key', key.encode('hex'))
aes = AES.new(key, AES.MODE_CBC, IV=iv)
data = b64decode('pt7DqtAwtTjPbTlzVApucQ==')
raw = aes.decrypt(data)
print(repr(raw), len(raw))
plain = unpad(raw)
print(repr(plain), len(plain))
output
key b4377f7babf2991b7d6983c4d3e19cd4dd37e31af1c9c689ca22e90e365be18b
'hello world\n\x04\x04\x04\x04' 16
u'hello world\n' 12
That code will not run on Python 3, so here's a Python 3 version.
from Crypto.Cipher import AES
from base64 import b64decode
from hashlib import md5

def evp_simple(data):
    out = b''
    while len(out) < 32:
        out += md5(out + data).digest()
    return out[:32]

def unpad(s):
    offset = s[-1]
    return s[:-offset].decode('utf-8')

iv = bytes.fromhex('a2a8a78be66075c94ca5be53c8865251')
passwd = b'00112233445566778899aabbccddeeff'
key = evp_simple(passwd)
aes = AES.new(key, AES.MODE_CBC, IV=iv)
data = b64decode('pt7DqtAwtTjPbTlzVApucQ==')
raw = aes.decrypt(data)
print(repr(raw), len(raw))
plain = unpad(raw)
print(repr(plain), len(plain))
output
b'hello world\n\x04\x04\x04\x04' 16
'hello world\n' 12
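Since the unpad function above trusts the last byte blindly, here is a sketch of a stricter variant that rejects malformed PKCS7 padding instead of silently slicing. The name unpad_pkcs7 is hypothetical, and the function returns bytes rather than decoded text so that the caller decides on the encoding.

```python
def unpad_pkcs7(data, block_size=16):
    """Strip PKCS7 padding, raising ValueError if it is malformed."""
    if not data or len(data) % block_size:
        raise ValueError('data length is not a positive multiple of the block size')
    pad_len = data[-1]
    if not 1 <= pad_len <= block_size:
        raise ValueError('invalid padding length byte')
    if data[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError('padding bytes are inconsistent')
    return data[:-pad_len]

raw = b'hello world\n\x04\x04\x04\x04'  # the decrypted block from the output above
print(unpad_pkcs7(raw).decode('utf-8') == 'hello world\n')
```

Note that checking the padding does not authenticate the data. Without a MAC or an authenticated mode, distinguishable padding errors can themselves leak information to an attacker, so this check is mainly useful for detecting a wrong key or corrupted input.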
How to encrypt a file so that OpenSSL can decrypt it without providing the IV manually
To decrypt an AES-encrypted file you need both the key and the IV. The IV is not a secret and is usually stored alongside the ciphertext in the encrypted file.
OpenSSL uses a key derivation function to generate both from the provided password and a random salt. After encryption it stores the salt in the file header, following the Salted__ prefix, so that decryption can combine it with the password to reproduce the same key and IV.
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"io"
	"os"

	"golang.org/x/crypto/pbkdf2"
)

func main() {
	keySize := 32
	// Hard-coded password for demonstration purposes only
	password := []byte("TESTPASSWORD1234TESTPASSWORD1234")

	bReader, err := os.Open("doc.docx")
	if err != nil {
		panic(err)
	}
	defer bReader.Close()

	// Random 8-byte salt, as OpenSSL uses
	salt := make([]byte, 8)
	if _, err := io.ReadFull(rand.Reader, salt); err != nil {
		panic(err)
	}

	// Derive key and IV in a single PBKDF2 call (key first, then IV)
	computed := pbkdf2.Key(password, salt, 10000, keySize+aes.BlockSize, sha256.New)
	key := computed[:keySize]
	iv := computed[keySize:]

	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	stream := cipher.NewOFB(block, iv)

	bWriter, err := os.Create("doc-encrypted.docx")
	if err != nil {
		panic(err)
	}
	defer bWriter.Close()

	// OpenSSL header: "Salted__" followed by the salt
	header := append([]byte("Salted__"), salt...)
	if _, err := bWriter.Write(header); err != nil {
		panic(err)
	}

	sWriter := &cipher.StreamWriter{S: stream, W: bWriter}
	if _, err := io.Copy(sWriter, bReader); err != nil {
		panic(err)
	}
}
You can then decrypt it with OpenSSL:
openssl enc -in doc-encrypted.docx -out doc-decrypted.docx -d -aes-256-ofb -pbkdf2 -pass pass:TESTPASSWORD1234TESTPASSWORD1234
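For readers following along in Python, the key and IV derivation used by the Go program can be sketched with the standard library alone. The function name derive_key_iv is an assumption for this example. The iteration count and digest mirror the Go code, which passes 10000 and SHA-256 to PBKDF2.

```python
import hashlib
import os

KEY_SIZE = 32        # AES-256 key length in bytes
IV_SIZE = 16         # AES block size in bytes
ITERATIONS = 10000   # same count the Go program passes to pbkdf2.Key

def derive_key_iv(password, salt, iterations=ITERATIONS):
    """PBKDF2-HMAC-SHA256: one 48-byte output, split into key and IV."""
    material = hashlib.pbkdf2_hmac('sha256', password, salt, iterations,
                                   KEY_SIZE + IV_SIZE)
    return material[:KEY_SIZE], material[KEY_SIZE:]

salt = os.urandom(8)            # random 8-byte salt
header = b'Salted__' + salt     # what gets written before the ciphertext
key, iv = derive_key_iv(b'TESTPASSWORD1234TESTPASSWORD1234', salt)
print(len(header), len(key), len(iv))
```

With the key and IV in hand, everything after the 16-byte header can be decrypted with AES in OFB mode by any AES library. No padding has to be removed, because OFB is a stream mode.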
How to use OpenSSL to encrypt/decrypt files?
Security Warning: AES-256-CBC does not provide authenticated encryption and is vulnerable to padding oracle attacks. You should use something like age instead.
Encrypt:
openssl aes-256-cbc -a -salt -pbkdf2 -in secrets.txt -out secrets.txt.enc
Decrypt:
openssl aes-256-cbc -d -a -pbkdf2 -in secrets.txt.enc -out secrets.txt.new
More details on the various flags
AES-128 CBC encryption in python
With Python3 you can use PyCryptodome, binascii and base64.
from base64 import b64encode, b64decode
from binascii import unhexlify
from Crypto.Cipher import AES
from Crypto.Util.Padding import pad, unpad
iv = "7bde5a0f3f39fd658efc45de143cbc94"
password = "3e83b13d99bf0de6c6bde5ac5ca4ae68"
msg = "this is a message"
print(f"IV: {iv}")
print(f"PWD: {password}")
print(f"MSG: {msg}")
# Convert Hex String to Binary
iv = unhexlify(iv)
password = unhexlify(password)
# Pad to AES Block Size
msg = pad(msg.encode(), AES.block_size)
# Encipher Text
cipher = AES.new(password, AES.MODE_CBC, iv)
cipher_text = cipher.encrypt(msg)
# Encode Cipher_text as Base 64 and decode to String
out = b64encode(cipher_text).decode('utf-8')
print(f"OUT: {out}")
# Decipher cipher text
decipher = AES.new(password, AES.MODE_CBC, iv)
# UnPad Based on AES Block Size
plaintext = unpad(decipher.decrypt(b64decode(out)), AES.block_size).decode('utf-8')
print(f'PT: {plaintext}')
You can see more:
- AES-128 CBC decryption in Python
- Python Encrypting with PyCrypto AES