How to Split a CA Certificate Bundle into Separate Files

How can I split a CA certificate bundle into separate files?

You can split the bundle with awk, like this, in an appropriate directory:

awk 'BEGIN {c=0;} /BEGIN CERT/{c++} { print > "cert." c ".pem"}' < ca-bundle.pem 

Then, create the links OpenSSL wants by running the c_rehash utility that comes with OpenSSL:

c_rehash .

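Note: on non-Linux platforms, use gawk, because the command above relies on GNU-specific behavior. Parenthesizing the output file name, as in print > ("cert." c ".pem"), is the more portable spelling.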
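If awk is not available, the same split can be sketched in Python. This is a minimal sketch rather than a drop-in replacement: split_bundle and the cert.N.pem naming are assumptions that mirror the awk one-liner, and unlike the awk version it skips any text outside the BEGIN/END markers, so no cert.0.pem is produced.

```python
from pathlib import Path


def split_bundle(bundle_path, out_dir="."):
    """Write each certificate in bundle_path to out_dir/cert.N.pem.

    Returns the list of paths written, in the order found.
    """
    out_dir = Path(out_dir)
    written = []
    current = None  # lines of the certificate being collected, or None

    with open(bundle_path, "r") as bundle:
        for line in bundle:
            if "-----BEGIN CERTIFICATE-----" in line:
                current = []  # start a new certificate
            if current is not None:
                current.append(line)
                if "-----END CERTIFICATE-----" in line:
                    # Certificate complete: number it and flush it to disk.
                    path = out_dir / f"cert.{len(written) + 1}.pem"
                    path.write_text("".join(current))
                    written.append(path)
                    current = None
    return written
```

Calling split_bundle("ca-bundle.pem") in the target directory and then running c_rehash . should give the same layout as the awk approach.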

Automate Splitting a PEM File into Multiple Certs


I was hoping OpenSSL would have a tool that might be able to do

I'm not aware of an OpenSSL function or OpenSSL tool to do it. Looking at the sources, PEM_bytes_read_bio may be the function to do it. But it's not documented, so I'm not certain. (The function name begins with capital letters - PEM_*. The various lowercase names - pem_* - are private and should not be used).

If you have the OpenSSL sources handy, then the source code for the parsing routine is in <openssl src>/crypto/pem/pem_lib.c. That's where PEM_bytes_read_bio is implemented.


However this seems a bit "hacky."

Well, it's not so much hacky - you have to roll up your sleeves and code it up. You might be able to use Bison and Flex to create a parser and lexer. How you call it from the shell is a different story. With a lexer, I think you can parse a PEM object in O(n).


I need to figure out a way to automate the process of splitting a PEM file into multiple PEM files... What would be the best way of doing this?

I wrote something similar for Crypto++ at PEM Pack. It added support for PEM encoded keys, including encrypted keys. Crypto++ is a C++ library, but the same general algorithm should work well with your language of choice.

The routine of interest in Crypto++ is called PEM_NextObject; it's located in the source file pem-rd.cpp. You can find the source files at the bottom of the page in a ZIP file. PEM_NextObject looked for four items:

  • The leading -----BEGIN
  • The following -----
  • The trailing -----END
  • The following -----

I used four indexes - one for each token. I would read 64+1 bytes at a time because OpenSSL breaks its output lines at 64 characters. I would read a line into a string and concatenate the string into an accumulator. I would then use find to locate the token in the accumulator (some hand waving, because they were secure strings). If I did not find a particular index, I would read another line.

When searching for the token, the search for the first token started at position 0. The next search started after the previous index was found. For example, the search for index two began at index one plus the size of the token; and the search for index three began at index two plus the size of the token. If a token was not found, I only searched the current line and the 10 characters preceding it, in case the token spanned the previous read and the current read.

I used indexes rather than iterators because an iterator is invalidated if the container's size is increased. The concatenation would have caused that. Fortunately, the index was always valid because it was simply an offset from the beginning of the string. You may not have this problem in bash (or whatever you choose).

If I read to the end of a stream without finding all four indexes, then I threw an error.

If I found all four indexes, then I had something that claimed to be PEM encoded. I discarded any leading characters, and trimmed trailing whitespace. So the PEM object was located at (Index1) to (Index4 + 5) (+5 for the trailing -----).
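The four-index search described above can be sketched in Python roughly as follows. This is not the Crypto++ code: iter_pem_objects, TOKENS and chunk_size are hypothetical names, it uses plain str rather than secure strings, and it reads fixed-size chunks instead of lines. It also keeps any text read past an object so that the next object can be found in it.

```python
import io

DASHES = "-----"
# The four tokens, in the order they must appear.
TOKENS = ["-----BEGIN", DASHES, "-----END", DASHES]


def iter_pem_objects(stream, chunk_size=65):
    """Yield each substring of a text stream that claims to be PEM encoded.

    Raises ValueError if the stream ends part-way through an object.
    """
    acc = ""  # accumulator for everything read but not yet consumed
    while True:
        found = []       # offsets into acc, one per token located so far
        search_from = 0  # the first search starts at position 0
        while len(found) < 4:
            token = TOKENS[len(found)]
            idx = acc.find(token, search_from)
            if idx != -1:
                found.append(idx)
                # The next search starts just past the token just found.
                search_from = idx + len(token)
                continue
            chunk = stream.read(chunk_size)
            if not chunk:
                if found:
                    raise ValueError("stream ended inside a PEM object")
                return  # clean end of stream: no further objects
            # Rescan only the new data plus a 10-character "rewind",
            # in case the token straddles the previous and current read.
            search_from = max(search_from, len(acc) - 10)
            acc += chunk
        end = found[3] + len(DASHES)  # +5 for the trailing -----
        yield acc[found[0]:end]       # leading characters are discarded
        acc = acc[end:]               # keep the remainder for the next object
```

For example, list(iter_pem_objects(io.StringIO(text))) collects every object in a string. The result only claims to be PEM; the BEGIN and END labels still have to be checked separately.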

Because I might have parsed an invalid PEM object (e.g., -----BEGIN FOO----- and -----END BAR-----), I needed another routine to classify the type of PEM object that was parsed. That function is called PEM_GetType.
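A classification step of this kind might look like the following in Python. This is only a sketch of the idea, not PEM_GetType itself: get_type is a hypothetical helper that returns the label text when the BEGIN and END lines agree, and None otherwise.

```python
import re

# Matches a boundary such as -----BEGIN RSA PRIVATE KEY-----
# and captures the keyword (BEGIN or END) and the label.
_BOUNDARY = re.compile(r"-----(BEGIN|END) ([A-Z0-9 ]+)-----")


def get_type(pem_object):
    """Return the PEM label if the object is consistently delimited, else None."""
    marks = _BOUNDARY.findall(pem_object)
    if len(marks) != 2:
        return None  # missing or extra boundary lines
    (first_kind, first_label), (second_kind, second_label) = marks
    if first_kind != "BEGIN" or second_kind != "END":
        return None  # boundaries in the wrong order
    if first_label != second_label:
        return None  # e.g. BEGIN FOO paired with END BAR
    return first_label
```

A caller can then reject anything that returns None, or dispatch on labels such as CERTIFICATE or RSA PRIVATE KEY.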

The algorithm should work well because it's not egregious from an algorithm analysis point of view and PEM objects are usually small (less than 2K or 4K). I think the analysis is O(n + m*10), where m is the number of lines in the file. The m*10 is based on scanning a 64 character line looking for a token with a 10 character "rewind", reading another line, and then scanning for the token again. Recall I "rewind" a bit in case the token spans lines.

This algorithm performs OK if there's no PEM object and the file is large. I'm pretty certain it runs in O(n + m*10) in the worst case, too. If n >>> m, then it's essentially an O(n) function because the m*10 term is dominated by n.
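A rough way to see why the bound collapses to O(n): every line occupies at least one byte, so m can never exceed n. If, as an additional assumption, each line is a full 65 bytes (64 characters plus the line break), m is about n/65 and the constant is small:

```latex
m \le n
\;\Longrightarrow\;
n + 10m \le 11n = O(n),
\qquad
m \approx \frac{n}{65}
\;\Longrightarrow\;
n + 10m \approx n + \frac{10n}{65} = \frac{15}{13}\,n \approx 1.15\,n .
```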

You might also be interested in How to split a PEM file on Server Fault and Where is the PEM file format specified? on Stack Overflow.


-----BEGIN CERTIFICATE----- to -----END CERTIFICATE-----

While you show a certificate, there are other types of objects. For example, public keys and encrypted private keys. If you need to decrypt an encrypted key, then you will need to lift/borrow/use OpenSSL's EVP_BytesToKey.

EVP_BytesToKey is kind of non-standard, so it becomes a copy/paste operation to ensure interoperability. I seem to recall EVP_BytesToKey is equivalent to PKCS#5 derivation if the number of bytes produced by EVP_BytesToKey is 16 or less. If 17 or more are produced, then OpenSSL uses a "non-standard" extension.
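For reference, the derivation can be sketched in Python with hashlib. This follows the chaining described in the OpenSSL documentation for EVP_BytesToKey (each block is the hash of the previous block, the password and the salt, re-hashed count - 1 times). The function name evp_bytes_to_key and the defaults of MD5 with a count of 1 are assumptions for illustration; check them against whatever produced your key before relying on this.

```python
import hashlib


def evp_bytes_to_key(password, salt, key_len, iv_len, hash_name="md5", count=1):
    """Derive (key, iv) from password and salt in the style of EVP_BytesToKey.

    password and salt are bytes; key_len and iv_len are byte counts.
    """
    derived = b""
    block = b""  # D_0 is empty
    while len(derived) < key_len + iv_len:
        # D_i = HASH^count(D_(i-1) || password || salt)
        block = hashlib.new(hash_name, block + password + salt).digest()
        for _ in range(count - 1):
            block = hashlib.new(hash_name, block).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]
```

With MD5, the first loop iteration yields 16 bytes, which is the case described above as matching PKCS#5. Asking for more output triggers further iterations, and that chaining is the non-standard extension.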


If you are interested in testing, then take a look at pem-create-keys.sh. It creates malformed PEM encoded keys (not certificates). For example, it will concatenate multiple keys without line breaks, it will delete one of the trailing dashes, and it will delete one of the trailing dashes and then concatenate another key.

How do I split a PEM file?

I did find a command to split the file into its two parts:

csplit "$pemfile" '/^-----BEGIN RSA PRIVATE KEY-----$/'

This leaves you with two files: xx00 containing the certificate and xx01 containing the key. Then all you need is a couple of mv commands to rename the files to something more appropriate.

How do I split a multi-valued p12 certificate into separate certificates

As you previously mentioned, you can use OpenSSL to convert the p12 file to PEM format. The PEM output would be accepted, but with -nodes the private key is written unencrypted (not protected by a password), so make sure you get what you need and handle the file with care.

openssl pkcs12 -in yourcertificates.p12 -out certificates.pem -nodes

This will put everything in one file, so you will have to open the PEM file in a text editor and copy out the blocks you need.

Certificates are separated by

-----BEGIN CERTIFICATE-----

Content

-----END CERTIFICATE-----

Keys would be separated by

-----BEGIN RSA PRIVATE KEY-----

Content

-----END RSA PRIVATE KEY-----
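If you would rather not pull the blocks out by hand, a regular expression can collect them by label. The following is a minimal sketch; group_pem_blocks is a hypothetical helper, and it assumes each block is well formed with matching BEGIN and END labels. Text between blocks, such as the "Bag Attributes" lines that openssl pkcs12 writes, is ignored. Depending on the OpenSSL version, the key label may be PRIVATE KEY rather than RSA PRIVATE KEY, so the pattern accepts any label.

```python
import re
from collections import defaultdict

# One PEM block: a BEGIN line, the shortest possible body,
# and an END line whose label matches the BEGIN label (\1).
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\r?\n.*?-----END \1-----",
    re.DOTALL,
)


def group_pem_blocks(text):
    """Return {label: [block, ...]} for every PEM block found in text."""
    groups = defaultdict(list)
    for match in PEM_BLOCK.finditer(text):
        groups[match.group(1)].append(match.group(0) + "\n")
    return dict(groups)
```

Each entry under "CERTIFICATE" can then be written to its own file. Any key blocks are unencrypted when -nodes was used, so write them with restrictive permissions.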

Please update your question if you need any more information.

Using recursion to split text file based on delimiter in python

I can't see that recursion would be useful here - instead you could make a list of the output file names and iterate through them using iter and next, to open a file when you encounter "BEGIN", then close the same file when you encounter "END".

def parse_file(input_file, output_files):
    filenames = iter(output_files)
    with open(input_file, 'r') as cert_file:
        for line in cert_file:
            if "BEGIN" in line:
                # next(filenames) works on Python 2.6+ and 3; .next() is Python 2 only
                output = open(next(filenames), 'w')
            output.write(line)
            if "END" in line:
                output.close()
    output.close()  # just in case not already closed

input_file = '/Users/arl/Downloads/bundle.pem'
output_files = ['root.pem', 'int1.pem', 'int2.pem', 'end.pem']
parse_file(input_file=input_file, output_files=output_files)

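This will raise an error if there is any blank line or other content outside a 'BEGIN' ... 'END' block (before the first 'BEGIN', or between an 'END' line and the next 'BEGIN'), because the write is then attempted on a missing or closed file. If that is a problem, you could add a check that the output file has been opened.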

def parse_file(input_file, output_files):
    filenames = iter(output_files)
    output = None
    with open(input_file, 'r') as cert_file:
        for line in cert_file:
            if "BEGIN" in line:
                output = open(next(filenames), 'w')
            # only write while a block is open; skip stray lines between blocks
            if output and not output.closed:
                output.write(line)
                if "END" in line:
                    output.close()
    if output:  # None if the input contained no "BEGIN" line
        output.close()

Or equivalently, use a nested loop:

def parse_file(input_file, output_files):
    output = None
    with open(input_file, 'r') as cert_file:
        for output_file in output_files:
            # the inner loop resumes where the previous pass stopped reading
            for line in cert_file:
                if "BEGIN" in line:
                    output = open(output_file, 'w')
                if output and not output.closed:
                    output.write(line)
                    if "END" in line:
                        output.close()
                        break  # breaks out of inner loop and gets next output_file
    if output:  # None if the input contained no "BEGIN" line
        output.close()

