Interpreting openssl speed output for rsa with multi option
The code for the speed test is in <openssl>/apps/speed.c. -multi is a switch for running multiple benchmarks in parallel, not multiplications (to remove all confusion). See the comments around line 1145:
#ifndef NO_FORK
BIO_printf(bio_err,"-multi n run n benchmarks in parallel.\n");
#endif
What does the column sign and verify mean?
Sign and verify do just what they say: they time a signing operation and a verify operation with different RSA moduli, and report the seconds taken per operation.
Sign/s and Verify/s are the reciprocals of Sign and Verify. I.e., 1/0.000008s => 125,000 signs per second.
Here's the code to print the report you are seeing. It starts around line 2450:
#ifndef OPENSSL_NO_RSA
    j=1;
    for (k=0; k<RSA_NUM; k++)
        {
        if (!rsa_doit[k]) continue;
        if (j && !mr)
            {
            printf("%18ssign verify sign/s verify/s\n"," ");
            j=0;
            }
        if(mr)
            fprintf(stdout,"+F2:%u:%u:%f:%f\n",
                k,rsa_bits[k],rsa_results[k][0],
                rsa_results[k][1]);
        else
            fprintf(stdout,"rsa %4u bits %8.6fs %8.6fs %8.1f %8.1f\n",
                rsa_bits[k],rsa_results[k][0],rsa_results[k][1],
                1.0/rsa_results[k][0],1.0/rsa_results[k][1]);
        }
#endif
Finding the code to perform the sign and verify is left as an exercise to the reader ;)
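To make the relationship between the columns concrete, here is a minimal Python sketch that parses one machine-readable line in the "+F2:%u:%u:%f:%f" format printed by the code above and derives the per-second rates the same way the human-readable branch does. The function name parse_rsa_line and the sample timings are made up for illustration; they are not real benchmark output.

```python
def parse_rsa_line(line):
    """Parse a '+F2:<index>:<bits>:<sign_s>:<verify_s>' result line."""
    tag, index, bits, sign_s, verify_s = line.strip().split(":")
    if tag != "+F2":
        raise ValueError("not an RSA result line: %r" % line)
    sign_s, verify_s = float(sign_s), float(verify_s)
    return {
        "index": int(index),
        "bits": int(bits),
        "sign_s": sign_s,              # seconds per signature
        "verify_s": verify_s,          # seconds per verification
        "sign_per_s": 1.0 / sign_s,    # reciprocal, as in speed.c
        "verify_per_s": 1.0 / verify_s,
    }


# Hypothetical sample line, not taken from a real run.
row = parse_rsa_line("+F2:2:2048:0.000500:0.000008")
print("rsa %4d bits %8.6fs %8.6fs %8.1f %8.1f" % (
    row["bits"], row["sign_s"], row["verify_s"],
    row["sign_per_s"], row["verify_per_s"]))
```

The last two columns carry no extra measurement; they are only the first two columns inverted.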
have an Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz
Just bike shedding, but be sure to configure with enable-ec_nistp_64_gcc_128 if you are using a modern GCC. Using ec_nistp_64_gcc_128 will speed up some operations, such as (EC)DH operations on the NIST curves, by 2x or 4x.
You need a modern GCC for the __uint128_t type. Configure cannot determine on its own whether the compiler supports __uint128_t, so it leaves ec_nistp_64_gcc_128 disabled.
openssl speed rsa less performant on (normally) better cpu
Edit:
As stated by @Alexei Khlebnikov, the openssl speed rsa command only measures the speed of the RSA sign/verify functions, and these don't use random numbers. Because of that, my original answer doesn't answer the question.
After a quick search, I found that the 1st server has the bmi2 and adx instructions, while the 2nd server doesn't. These instructions are used to speed up Montgomery integer multiplication/squaring, which is used in the RSA signing operations. It's hard to confirm that this is the reason for the performance difference, but it can be one of the reasons.
Original answer:
To generate RSA keys you need large random prime numbers. The process of finding a large random prime consists of:
- Generate a random number;
- Check if it's prime;
- If it's not, repeat.
As you can see, this consumes a lot of random numbers, and generating good random numbers is slow. So, having a faster RNG means faster RSA key generation.
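The generate-and-test loop described above can be sketched in Python as follows. This is only an illustration of the idea, not how OpenSSL implements key generation: the Miller-Rabin test, the round count of 40, and the function names are choices made for this sketch.

```python
import secrets


def is_probable_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    # Cheap trial division first; also handles every n up to 37.
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def random_prime(bits):
    """Draw random odd candidates of the given size until one passes."""
    attempts = 0
    while True:
        attempts += 1
        # Force the top bit (exact size) and the low bit (odd).
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate, attempts


prime, attempts = random_prime(256)
print(prime.bit_length(), "bit prime found after", attempts, "candidates")
```

The attempts counter shows how many random candidates were drawn, which is where the RNG cost in the loop comes from.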
How can I interpret openssl speed output?
While it could probably be worded better, it pretty much means what it says: run the md4 hash routine in a loop for 3 seconds with a 16 byte input. After 3 seconds, observe that we ran just a bit over 9 million iterations. That's about 144 million bytes processed, or 48 million bytes per second (where "million" means 10^6).
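The arithmetic can be written out as a short Python sketch. The iteration count is rounded down to 9 million as in the text above; the function name throughput is just for illustration.

```python
def throughput(iterations, block_bytes, seconds):
    """Return (total bytes processed, bytes per second)."""
    total_bytes = iterations * block_bytes
    return total_bytes, total_bytes / seconds


# 9 million iterations of a 16-byte input in 3 seconds.
total, per_second = throughput(9_000_000, 16, 3)
print(total)       # 144000000
print(per_second)  # 48000000.0
```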
Fully utilizing HW accelerator
According to "Interpreting openssl speed output for rsa with multi option", -multi doesn't "parallelize" work or something, it just runs multiple benchmarks in parallel.
So, your HW card's load will be essentially limited by how much work is available at the moment (note that in industry in general, 80% planned capacity load is traditionally considered optimal in case of load spikes). Of course, running multiple server threads/processes will give you the same effect as multiple benchmarks.
OpenSSL supports multiple threads provided that you give it callbacks to lock shared data. For multiple processes, it warns against reusing data state inherited from the parent.
That's it for scaling vertically. For scaling horizontally:
- openssl supports asynchronous I/O through asynchronous BIOs, but its elemental crypto operations and internal ENGINE calls are synchronous, and changing this would require a logic overhaul
- private efforts to make them provide asynchronous operation have met severe criticism due to major design flaws
Intel announced an "Asynchronous OpenSSL" project (08.2014) to use with its hardware, but the linked white paper gives few details about its implementation and development state. One developer published some related code (10.2015), noting that it's "stable enough to get an overview".
Signing 20-byte message with 256-bit RSA key working with openssl.exe but not in code
Guess I found the solution. openssl rsautl -sign uses RSA_private_encrypt instead of RSA_sign (what I would have expected). RSA_sign wraps the digest in a DigestInfo structure, which is longer than the 20-byte message I provided, so with a key this small it fails with the given error.
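The size difference can be checked with a little arithmetic. The sketch below assumes PKCS#1 v1.5 padding (at least 11 bytes of overhead) and assumes the 20-byte message was passed to RSA_sign as a SHA-1 digest, in which case the standard 15-byte SHA-1 DigestInfo prefix is prepended; the original post does not state the digest type, so treat that part as an assumption.

```python
# PKCS#1 v1.5 signature padding: 0x00 0x01, at least eight 0xFF bytes, 0x00.
PKCS1_V15_OVERHEAD = 11

# DER-encoded DigestInfo prefix for SHA-1 (assumed digest type).
SHA1_DIGESTINFO_PREFIX = bytes.fromhex("3021300906052b0e03021a05000414")


def fits_in_modulus(payload_len, key_bits):
    """True if a payload of this length fits after v1.5 padding."""
    return payload_len + PKCS1_V15_OVERHEAD <= key_bits // 8


message_len = 20
raw_ok = fits_in_modulus(message_len, 256)
wrapped_len = len(SHA1_DIGESTINFO_PREFIX) + message_len
wrapped_ok = fits_in_modulus(wrapped_len, 256)

print("raw payload:", message_len, "bytes, fits:", raw_ok)
print("DigestInfo payload:", wrapped_len, "bytes, fits:", wrapped_ok)
```

With a 256-bit (32-byte) modulus, the raw 20 bytes plus padding need 31 bytes, while the wrapped 35 bytes plus padding need 46.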
openssl smime: aes vs rsa - which one encrypts?
Using the openssl smime command to encrypt data encrypts it in such a way that it can be received and decrypted by a recipient using email. This means that it uses certificates representing various entities (sender, recipient) and containing keys uniquely associated with those entities to sign, encrypt, decrypt, and verify signatures on the contained data.
As Artjom pointed out in their comment, a hybrid cryptosystem is used. This means a combination of symmetric and asymmetric encryption algorithms is used, as each has benefits and drawbacks. Symmetric encryption (such as AES) is extremely fast in comparison to asymmetric encryption (such as RSA). However, AES/CBC provides only confidentiality, not integrity, authentication, or non-repudiation. Asymmetric cryptography (through digital signatures) can provide integrity, authentication, and non-repudiation. Asymmetric encryption also has fairly low limits for the amount of data it can encrypt (for example, a 4096-bit RSA key can encrypt at most 446 bytes at once when using OAEP padding with SHA-256). By combining both methods, we can exercise each for its strengths and mitigate the weaknesses of the other.
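The 446-byte figure follows from the padding overhead. As a small sketch, the maximum plaintext for an RSA modulus of k bytes is k - 11 with PKCS#1 v1.5 padding and k - 2*hLen - 2 with OAEP, where hLen is the hash output length; the function names here are illustrative only.

```python
# Hash output lengths in bytes.
HASH_LEN = {"sha1": 20, "sha256": 32}


def max_pkcs1_v15(key_bits):
    """Largest plaintext (bytes) for RSA encryption with PKCS#1 v1.5 padding."""
    return key_bits // 8 - 11


def max_oaep(key_bits, hash_name):
    """Largest plaintext (bytes) for RSA encryption with OAEP padding."""
    return key_bits // 8 - 2 * HASH_LEN[hash_name] - 2


print(max_pkcs1_v15(4096))       # 501
print(max_oaep(4096, "sha1"))    # 470
print(max_oaep(4096, "sha256"))  # 446
```

Any of these limits is far too small for a file, but comfortably large enough for a 32-byte AES key.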
In this case, let our message be M. An AES/CBC key Kaes of length 256 bits is generated and used to encrypt the raw data such that the cipher text C is C = E(Kaes, M). The recipient's public key Krpub is then used to encrypt Kaes as C' = E(Krpub, Kaes). The cost of encrypting this small amount of data is relatively low, even though we are using an expensive operation.
Note that prior to encrypting the ephemeral session key, it is likely signed using the sender's private key Kspriv unless digital signing is disabled. I am not 100% certain with S/MIME, but this is how PGP works: when sending to multiple recipients, it would be much more expensive to encrypt the session key with n recipients' public keys and then sign each encrypted key with the sender's private key, O(2n), than to sign once and encrypt n times, O(n+1) (not really Big-O notation, but effective for communicating the point). The signature of the session key is the same regardless of recipient because it depends only on the sender's private key.
Now all that is left is to concatenate the encrypted session key C' and the cipher text C and transmit them to the recipient. In a real S/MIME message, the identifying information about the sender's public key is transmitted as well in order to facilitate signature verification. The recipient parses the complete encrypted message, decrypts the session key using the recipient's private key Krpriv, and then uses the session key Kaes to decrypt the cipher text C.
However, it seems like all of this may be overkill for your situation. If you do not need to integrate with an email communication system and are simply trying to encrypt large files for storage, openssl also offers a simple enc command. You can use it as follows:
Password-based encryption:
$ openssl enc -aes-256-cbc -in large_plain_file.dmg -out large_encrypted_file.enc -k thisIsABadPassword
-k allows you to enter a password on the command line from which the key will be derived. OpenSSL uses a deterministic key derivation function (KDF) (effectively key, iv = MD5(password||salt)). You can run this command with -p on the end to get the key, salt, and IV printed to the console. The IV is deterministically derived from the password and salt. A KDF like PBKDF2 is recommended; OpenSSL releases before 1.1.1 do not expose it to the enc command-line tool, while 1.1.1 and later accept a -pbkdf2 option.
Warning: you can run this and specify -nosalt to skip the salt generation, but this is not recommended as it makes an already extremely weak KDF even weaker.
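To show why this derivation is considered weak, here is a rough Python sketch of the MD5-based scheme described above: a single, unstretched pass of MD5 chained until enough bytes exist for the key and IV. It is meant to mirror the behaviour of OpenSSL's EVP_BytesToKey with MD5 and an iteration count of 1; the function name, the fixed salt, and the default lengths are assumptions for this illustration, and newer OpenSSL versions use a different default digest.

```python
import hashlib


def evp_bytes_to_key(password, salt, key_len=32, iv_len=16):
    """Derive (key, iv) by chaining MD5 over password || salt.

    D_1 = MD5(password || salt), D_i = MD5(D_(i-1) || password || salt);
    the blocks are concatenated and split into key and IV.
    """
    derived = b""
    block = b""
    while len(derived) < key_len + iv_len:
        block = hashlib.md5(block + password + salt).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]


# Fixed 8-byte salt for a reproducible illustration; a real run uses a random one.
salt = bytes.fromhex("0102030405060708")
key, iv = evp_bytes_to_key(b"thisIsABadPassword", salt)
print("key =", key.hex().upper())
print("iv  =", iv.hex().upper())
```

Each password guess costs an attacker only three MD5 invocations here, which is why an iterated KDF such as PBKDF2 is preferred.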
Keyed encryption:
$ openssl enc -aes-256-cbc -in large_plain_file.dmg -out large_encrypted_file.enc -K 0123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA9876543210 -iv 0123456789ABCDEFFEDCBA9876543210
This runs with the actual key and IV provided in hex. The key is 32 bytes (256 bits) and the IV is 16 bytes (the size of one AES block).
To decrypt the data, run the command with the -d flag, supplying the same password or key and IV as for encryption:
$ openssl enc -aes-256-cbc -d -in large_encrypted_file.enc -out large_decrypted_file.dmg
This is going to be simpler for you (no certificates/key pairs needed) and faster (no RSA encryption).
Something to keep in mind for all of this is that the password/key will be kept in terminal history if you provide it in the command invocation. Running without -k or -K will prompt for the password to be entered in a secure prompt. Or use -pass to read from an environment variable, file, or file descriptor.
Update to address original questions explicitly:
- The AES key is responsible for encrypting the data. The RSA key is used to encrypt the AES key.
- Both impact the performance. The AES key is encrypting much more data, but AES is much faster than RSA encryption.
- Yes, changing the size of the RSA key impacts performance (larger keys are slower), but for a large file this is unlikely to be the limiting factor on performance.
- Yes, changing the AES mode of operation may have a substantial impact on performance. AES/CBC, AES/OFB, and AES/CFB are similar in that encryption is a serial operation (each block depends on the result of the previous block operation), although CBC and CFB decryption can be parallelized. AES/CTR operates as a stream cipher and can be parallelized for both encryption and decryption.
- See answer above.