Where is the PEM file format specified?

Submitted by 百般思念 on 2019-11-27 12:34:18

For quite a long time, there was no formal specification of the PEM format with regard to cryptographic exchange of information. PEM is the textual encoding, but what is actually being encoded depends on the context. In April 2015, the IETF approved RFC 7468, which finally documents how various implementations exchange data using PEM textual encoding. The following list, taken directly from the RFC, gives the scenarios it covers:

  1. Certificates, Certificate Revocation Lists (CRLs), and Subject Public Key Info structures in the Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile [RFC5280].
  2. PKCS #10: Certification Request Syntax [RFC2986].
  3. PKCS #7: Cryptographic Message Syntax [RFC2315].
  4. Cryptographic Message Syntax [RFC5652].
  5. PKCS #8: Private-Key Information Syntax [RFC5208], renamed to One Asymmetric Key in Asymmetric Key Package [RFC5958], and Encrypted Private-Key Information Syntax in the same documents.
  6. Attribute Certificates in An Internet Attribute Certificate Profile for Authorization [RFC5755].

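According to this RFC, for the above scenarios you can expect one of the following labels within the BEGIN header and END footer: CERTIFICATE, X509 CRL, CERTIFICATE REQUEST, PKCS7, CMS, PRIVATE KEY, ENCRYPTED PRIVATE KEY, ATTRIBUTE CERTIFICATE, and PUBLIC KEY. Figure 4 of the RFC has more detail, including corresponding ASN.1 types.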

That's not the full story, though. The RFC was written by looking at existing implementations and documenting what they did. The RFC wasn't written first, nor was it based on some existing authoritative documentation. So if you need to interoperate with a particular implementation, you may have to look into its source code to figure out what it supports.

For example, OpenSSL defines these BEGIN and END markers in crypto/pem/pem.h. Here is an excerpt from the header file with all the BEGIN and END labels that they support.

# define PEM_STRING_X509_OLD     "X509 CERTIFICATE"
# define PEM_STRING_X509         "CERTIFICATE"
# define PEM_STRING_X509_TRUSTED "TRUSTED CERTIFICATE"
# define PEM_STRING_X509_REQ_OLD "NEW CERTIFICATE REQUEST"
# define PEM_STRING_X509_REQ     "CERTIFICATE REQUEST"
# define PEM_STRING_X509_CRL     "X509 CRL"
# define PEM_STRING_EVP_PKEY     "ANY PRIVATE KEY"
# define PEM_STRING_PUBLIC       "PUBLIC KEY"
# define PEM_STRING_RSA          "RSA PRIVATE KEY"
# define PEM_STRING_RSA_PUBLIC   "RSA PUBLIC KEY"
# define PEM_STRING_DSA          "DSA PRIVATE KEY"
# define PEM_STRING_DSA_PUBLIC   "DSA PUBLIC KEY"
# define PEM_STRING_PKCS7        "PKCS7"
# define PEM_STRING_PKCS7_SIGNED "PKCS #7 SIGNED DATA"
# define PEM_STRING_PKCS8        "ENCRYPTED PRIVATE KEY"
# define PEM_STRING_PKCS8INF     "PRIVATE KEY"
# define PEM_STRING_DHPARAMS     "DH PARAMETERS"
# define PEM_STRING_DHXPARAMS    "X9.42 DH PARAMETERS"
# define PEM_STRING_SSL_SESSION  "SSL SESSION PARAMETERS"
# define PEM_STRING_DSAPARAMS    "DSA PARAMETERS"
# define PEM_STRING_ECDSA_PUBLIC "ECDSA PUBLIC KEY"
# define PEM_STRING_ECPARAMETERS "EC PARAMETERS"
# define PEM_STRING_ECPRIVATEKEY "EC PRIVATE KEY"
# define PEM_STRING_PARAMETERS   "PARAMETERS"
# define PEM_STRING_CMS          "CMS"

These labels are a start, but you still have to look into how the implementation encodes the data between the labels. There's not one correct answer for everything.
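As a rough illustration of working with these labels, here is a minimal Python sketch that lists the BEGIN labels found in a piece of text and reports whether each one is among the labels in Figure 4 of RFC 7468. The names `RFC7468_LABELS`, `BEGIN_LINE`, and `classify_labels` are made up for this example; anything that maps to `None` (for instance "RSA PRIVATE KEY") is an implementation-specific label that you would have to look up in that implementation's source.

```python
import re

# Labels from Figure 4 of RFC 7468, mapped to their ASN.1 types.
RFC7468_LABELS = {
    "CERTIFICATE": "Certificate",
    "X509 CRL": "CertificateList",
    "CERTIFICATE REQUEST": "CertificationRequest",
    "PKCS7": "ContentInfo",
    "CMS": "ContentInfo",
    "PRIVATE KEY": "PrivateKeyInfo (OneAsymmetricKey)",
    "ENCRYPTED PRIVATE KEY": "EncryptedPrivateKeyInfo",
    "ATTRIBUTE CERTIFICATE": "AttributeCertificate",
    "PUBLIC KEY": "SubjectPublicKeyInfo",
}

# Matches a whole "-----BEGIN <label>-----" line and captures the label.
BEGIN_LINE = re.compile(r"^-----BEGIN ([^\n]+?)-----[ \t\r]*$", re.MULTILINE)


def classify_labels(text):
    """Return (label, asn1_type_or_None) for every BEGIN line in text."""
    return [(label, RFC7468_LABELS.get(label)) for label in BEGIN_LINE.findall(text)]
```

This only looks at the label; it says nothing about whether the bytes between the markers really have the named type.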

Updated answer for 2015: As users have already answered twice, before moderator @royhowie deleted the answers: there is now RFC 7468 which defines the PEM headers. The following quote is only a small part, and you should read the actual spec, which will likely stay on the internet for far longer than StackOverflow will.

However @royhowie deletes every answer that points to the RFC as 'link only' unless it has some text. So here is some text:

  7. Textual Encoding of PKCS #10 Certification Request Syntax

    PKCS #10 Certification Requests are encoded using the "CERTIFICATE REQUEST" label. The encoded data MUST be a BER (DER strongly preferred; see Appendix B) encoded ASN.1 CertificationRequest structure as described in [RFC2986].

-----BEGIN CERTIFICATE REQUEST-----
MIIBWDCCAQcCAQAwTjELMAkGA1UEBhMCU0UxJzAlBgNVBAoTHlNpbW9uIEpvc2Vm
c3NvbiBEYXRha29uc3VsdCBBQjEWMBQGA1UEAxMNam9zZWZzc29uLm9yZzBOMBAG
ByqGSM49AgEGBSuBBAAhAzoABLLPSkuXY0l66MbxVJ3Mot5FCFuqQfn6dTs+9/CM
EOlSwVej77tj56kj9R/j9Q+LfysX8FO9I5p3oGIwYAYJKoZIhvcNAQkOMVMwUTAY
BgNVHREEETAPgg1qb3NlZnNzb24ub3JnMAwGA1UdEwEB/wQCMAAwDwYDVR0PAQH/
BAUDAwegADAWBgNVHSUBAf8EDDAKBggrBgEFBQcDATAKBggqhkjOPQQDAgM/ADA8
AhxBvfhxPFfbBbsE1NoFmCUczOFApEuQVUw3ZP69AhwWXk3dgSUsKnuwL5g/ftAY
dEQc8B8jAcnuOrfU
-----END CERTIFICATE REQUEST-----

Figure 9: PKCS #10 Example

The label "NEW CERTIFICATE REQUEST" is also in wide use. Generators conforming to this document MUST generate "CERTIFICATE REQUEST" labels. Parsers MAY treat "NEW CERTIFICATE REQUEST" as equivalent to "CERTIFICATE REQUEST".
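The generator/parser asymmetry in that last paragraph can be sketched in a few lines of Python. This is only an illustration: `LABEL_ALIASES`, `canonical_label`, and `boundary_lines` are hypothetical names, and the alias table holds just the one equivalence the quoted text allows.

```python
# Parser side: the only alias the quoted RFC text permits.
LABEL_ALIASES = {"NEW CERTIFICATE REQUEST": "CERTIFICATE REQUEST"}


def canonical_label(label):
    """Map a label read from a file to the label the RFC tells generators to use."""
    return LABEL_ALIASES.get(label, label)


def boundary_lines(label):
    """Generator side: always emit the canonical label, never the legacy alias."""
    canon = canonical_label(label)
    return f"-----BEGIN {canon}-----", f"-----END {canon}-----"
```

So a file carrying the older label is accepted on input, but anything written back out uses "CERTIFICATE REQUEST".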

To get you started: As far as I know, if there's a part that's human-readable (has words and stuff), that's meant for human operators to see what the certificate in question is, its expiry dates, etc., for a quick manual verification. So you can ignore that.

You'll want to parse what's between the BEGIN-END blocks.

Inside, you'll find a Base64 encoded entity that you need to Base64 decode into bytes. These bytes represent a DER encoded certificate/key/etc. I'm not sure what good libraries you could use for parsing the DER data.
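The steps above (skip the human-readable text, take what is between BEGIN and END, Base64 decode it) can be sketched in Python with only the standard library. `PEM_BLOCK` and `pem_to_der` are names invented for this example, and the sketch assumes plain blocks with no header lines inside them; older encrypted OpenSSL keys that start with "Proc-Type:" lines would be rejected by the strict Base64 check.

```python
import base64
import re

# One BEGIN/END pair with a matching label; the body is everything in between.
PEM_BLOCK = re.compile(
    r"^-----BEGIN (?P<label>[^\n]+?)-----[ \t\r]*\n"
    r"(?P<body>.*?)"
    r"^-----END (?P=label)-----[ \t\r]*$",
    re.MULTILINE | re.DOTALL,
)


def pem_to_der(text):
    """Return a list of (label, decoded_bytes); text outside the blocks is ignored."""
    blocks = []
    for match in PEM_BLOCK.finditer(text):
        # Remove line breaks and any other whitespace before decoding.
        b64 = "".join(match.group("body").split())
        blocks.append((match.group("label"), base64.b64decode(b64, validate=True)))
    return blocks
```

Calling `pem_to_der(open("some.pem").read())` on a certificate file (the file name is just a placeholder) should give you the label and the raw DER bytes, which you can then hand to whatever ASN.1 tooling you end up using.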

As a test to understand what data is inside each block, you can paste what's between the BEGIN-END blocks to this site which does ASN.1 decoding in JavaScript:

http://lapo.it/asn1js/

That said, I wouldn't paste any production private keys into any website (even though that one seems to be just JavaScript).
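If you would rather not paste anything into a web page, a few lines of Python are enough to peek at the outer structure of the decoded bytes locally. This is a minimal sketch, not an ASN.1 parser: `read_der_header` and `list_children` are made-up names, it only reads the tag and length of each element, and it does not handle multi-byte tag numbers.

```python
def read_der_header(data, offset=0):
    """Return (tag, length, header_size) for the DER element starting at offset."""
    tag = data[offset]
    if tag & 0x1F == 0x1F:
        raise ValueError("high-tag-number form is not handled in this sketch")
    first = data[offset + 1]
    if first < 0x80:  # short form: the byte itself is the length
        return tag, first, 2
    count = first & 0x7F  # long form: the next `count` bytes hold the length
    if count == 0:
        raise ValueError("indefinite length is not valid DER")
    length = int.from_bytes(data[offset + 2:offset + 2 + count], "big")
    return tag, length, 2 + count


def list_children(data):
    """List (tag, length) for the elements directly inside an outer SEQUENCE."""
    tag, length, header = read_der_header(data)
    if tag != 0x30:
        raise ValueError("expected an outer SEQUENCE (tag 0x30)")
    children, pos, end = [], header, header + length
    while pos < end:
        child_tag, child_len, child_header = read_der_header(data, pos)
        children.append((child_tag, child_len))
        pos += child_header + child_len
    return children
```

Certificates, keys, and requests all start with a SEQUENCE (tag 0x30), which is why Base64 bodies so often begin with "MII".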

Base64: http://en.wikipedia.org/wiki/Base64

DER: http://en.wikipedia.org/wiki/Distinguished_Encoding_Rules

ASN.1: http://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One

hopia

I found an old thread regarding this issue. It looks like there is no "official" standard format for the encapsulation boundaries and the best way to determine this is by guessing the contents based on well-known keywords you find in the BEGIN statement.

As answered by indiv, for the full list of the keywords, refer to the OpenSSL crypto/pem/pem.h header file.

jathanism

I am unsure if it's specific to OpenSSL, but the documentation for PEM Encryption Format may be what you're looking for.

Where is the PEM file format specified?

There is no one place. It depends on the standard. You can even make up your own encapsulation boundaries and use them in your own software.
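To show how little is involved in making up your own boundaries, here is a minimal Python sketch of a PEM writer. `to_pem` and the "MY APP SECRET" label are invented for this example; the 64-character line width follows what RFC 7468 asks of generators, and nothing here makes other software recognise your label.

```python
import base64


def to_pem(label, data):
    """Wrap raw bytes in BEGIN/END lines with Base64 text broken at 64 characters."""
    b64 = base64.b64encode(data).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "\n".join(
        [f"-----BEGIN {label}-----", *lines, f"-----END {label}-----"]
    ) + "\n"
```

For example, `to_pem("MY APP SECRET", some_bytes)` produces a block that your own software can read back, but OpenSSL and friends will not know what to do with it.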

As @indiv stated, OpenSSL has a fairly comprehensive list at <openssl dir>/crypto/pem/pem.h.

Someone asked the PKIX Working Group to provide a list like you are asking for back in 2006. The working group declined. See PEM file format rfc draft request.
