Extract Server Name Indication (SNI) from TLS client hello

Posted by China☆狼群 on 2019-11-28 06:33:31

I did this in sniproxy. Examining a TLS client hello packet in Wireshark while reading that RFC is a pretty good way to go. It's not too hard, just lots of variable-length fields you have to skip past while checking that you have the correct element type.

I'm working on my tests right now, and have this annotated sample packet that might help:

const unsigned char good_data_2[] = {
    // TLS record
    0x16, // Content Type: Handshake
    0x03, 0x01, // Version: TLS 1.0
    0x00, 0x6c, // Length (use for bounds checking)
        // Handshake
        0x01, // Handshake Type: Client Hello
        0x00, 0x00, 0x68, // Length (use for bounds checking)
        0x03, 0x03, // Version: TLS 1.2
        // Random (32 bytes fixed length)
        0xb6, 0xb2, 0x6a, 0xfb, 0x55, 0x5e, 0x03, 0xd5,
        0x65, 0xa3, 0x6a, 0xf0, 0x5e, 0xa5, 0x43, 0x02,
        0x93, 0xb9, 0x59, 0xa7, 0x54, 0xc3, 0xdd, 0x78,
        0x57, 0x58, 0x34, 0xc5, 0x82, 0xfd, 0x53, 0xd1,
        0x00, // Session ID Length (skip past this much)
        0x00, 0x04, // Cipher Suites Length (skip past this much)
            0x00, 0x01, // NULL-MD5
            0x00, 0xff, // RENEGOTIATION INFO SCSV
        0x01, // Compression Methods Length (skip past this much)
            0x00, // NULL
        0x00, 0x3b, // Extensions Length (use for bounds checking)
            // Extension
            0x00, 0x00, // Extension Type: Server Name (check extension type)
            0x00, 0x0e, // Length (use for bounds checking)
            0x00, 0x0c, // Server Name Indication Length
                0x00, // Server Name Type: host_name (check server name type)
                0x00, 0x09, // Length (length of your data)
                // "localhost" (data you're after)
                0x6c, 0x6f, 0x63, 0x61, 0x6c, 0x68, 0x6f, 0x73, 0x74,
            // Extension
            0x00, 0x0d, // Extension Type: Signature Algorithms (check extension type)
            0x00, 0x20, // Length (skip past since this is the wrong extension)
            // Data
            0x00, 0x1e, 0x06, 0x01, 0x06, 0x02, 0x06, 0x03,
            0x05, 0x01, 0x05, 0x02, 0x05, 0x03, 0x04, 0x01,
            0x04, 0x02, 0x04, 0x03, 0x03, 0x01, 0x03, 0x02,
            0x03, 0x03, 0x02, 0x01, 0x02, 0x02, 0x02, 0x03,
            // Extension
            0x00, 0x0f, // Extension Type: Heartbeat (check extension type)
            0x00, 0x01, // Length (skip past since this is the wrong extension)
            0x01 // Mode: Peer allowed to send requests
};
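The field-by-field walk annotated in the sample packet can be sketched in Node.js roughly as follows. This is only an illustration under some assumptions: `extractSNI`, `readUInt` and `skip` are hypothetical helper names (not sniproxy's code), the whole Client Hello is assumed to fit in a single TLS record, and `sample` is the packet above re-encoded as hex.

```javascript
// Sketch: walk a TLS Client Hello and return the SNI host name, or null.
// extractSNI is a hypothetical name, not taken from sniproxy or any library.
function extractSNI(buf) {
  let pos = 0;

  // Read an n-byte big-endian integer at pos, never reading past limit.
  function readUInt(n, limit) {
    if (pos + n > limit) throw new RangeError("truncated Client Hello");
    const value = buf.readUIntBE(pos, n);
    pos += n;
    return value;
  }

  // Skip n bytes, never moving past limit.
  function skip(n, limit) {
    if (pos + n > limit) throw new RangeError("truncated Client Hello");
    pos += n;
  }

  // TLS record header: content type (1), version (2), length (2)
  if (readUInt(1, buf.length) !== 0x16) throw new Error("not a handshake record");
  skip(2, buf.length);
  const recordLen = readUInt(2, buf.length);
  const recordEnd = pos + recordLen;
  if (recordEnd > buf.length) throw new RangeError("truncated record");

  // Handshake header: type (1), length (3)
  if (readUInt(1, recordEnd) !== 0x01) throw new Error("not a Client Hello");
  const helloLen = readUInt(3, recordEnd);
  const helloEnd = pos + helloLen;
  if (helloEnd > recordEnd) throw new RangeError("Client Hello exceeds record");

  skip(2 + 32, helloEnd);                 // client version + random
  skip(readUInt(1, helloEnd), helloEnd);  // session ID
  skip(readUInt(2, helloEnd), helloEnd);  // cipher suites
  skip(readUInt(1, helloEnd), helloEnd);  // compression methods

  if (pos === helloEnd) return null;      // no extensions block at all
  const extLen = readUInt(2, helloEnd);
  const extEnd = pos + extLen;
  if (extEnd > helloEnd) throw new RangeError("extensions exceed Client Hello");

  while (pos < extEnd) {
    const type = readUInt(2, extEnd);
    const len = readUInt(2, extEnd);
    const bodyEnd = pos + len;
    if (bodyEnd > extEnd) throw new RangeError("extension exceeds block");
    if (type === 0x0000) {                // server_name extension
      const listLen = readUInt(2, bodyEnd);
      const listEnd = pos + listLen;
      if (listEnd > bodyEnd) throw new RangeError("name list exceeds extension");
      while (pos < listEnd) {
        const nameType = readUInt(1, listEnd);
        const nameLen = readUInt(2, listEnd);
        if (pos + nameLen > listEnd) throw new RangeError("name exceeds list");
        if (nameType === 0) return buf.toString("ascii", pos, pos + nameLen);
        pos += nameLen;                   // unknown name type: skip it
      }
      return null;
    }
    pos = bodyEnd;                        // wrong extension: skip past it
  }
  return null;
}

// The annotated sample packet from above, as hex.
const sample = Buffer.from([
  "160301006c", "01000068", "0303",
  "b6b26afb555e03d565a36af05ea54302",
  "93b959a754c3dd78575834c582fd53d1",
  "00", "0004000100ff", "0100", "003b",
  "0000000e000c000009", "6c6f63616c686f7374",
  "000d0020",
  "001e0601060206030501050205030401",
  "04020403030103020303020102020203",
  "000f000101",
].join(""), "hex");

console.log(extractSNI(sample)); // prints "localhost"
```

Every length field is checked against the end of its enclosing structure before it is used, which is what the "use for bounds checking" comments in the sample packet are hinting at.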

Use Wireshark and capture only TLS (SSL) packets by adding the filter tcp port 443. Then find a "Client Hello" message. You can see its raw data below.

Expand Secure Socket Layer->TLSv1.2 Record Layer: Handshake Protocol: Client Hello->...
and you will see Extension: server_name->Server Name Indication extension. The server name in the handshake packet is not encrypted.

http://i.stack.imgur.com/qt0gu.png

I noticed that the domain is always preceded by two zero bytes and one length byte. Maybe it's an unsigned 24-bit integer, but I can't test it, as my DNS server won't allow domain names longer than 77 characters.

Upon that knowledge I came up with this (Node.js) code.

function getSNI(buf) {
  var sni = null
    , regex = /^(?:[a-z0-9-]+\.)+[a-z]+$/i;
  for(var b = 0, prev, start, end, str; b < buf.length; b++) {
    if(prev === 0 && buf[b] === 0) {
      start = b + 2;
      end   = start + buf[b + 1];
      if(start < end && end < buf.length) {
        str = buf.toString("utf8", start, end);
        if(regex.test(str)) {
          sni = str;
          continue;
        }
      }
    }
    prev = buf[b];
  }
  return sni;
}

This code looks for a sequence of two zero bytes. If it finds one, it assumes the following byte is a length parameter. It checks that the length stays within the bounds of the buffer and, if so, reads the byte sequence as UTF-8. The result is then tested against a regular expression, and the last string that looks like a domain is returned.

Works amazingly well! Still, I noticed something odd.

'�\n�\u0014\u0000�\u0000�\u00009\u00008�\u000f�\u0005\u0000�\u00005�\u0007�\t�\u0011�\u0013\u0000E\u0000D\u0000f\u00003\u00002�\f�\u000e�\u0002�\u0004\u0000�\u0000A\u0000\u0005\u0000\u0004\u0000/�\b�\u0012\u0000\u0016\u0000\u0013�\r�\u0003��\u0000\n'
'\u0000\u0015\u0000\u0000\u0012test.cubixcraft.de'
'test.cubixcraft.de'
'\u0000\b\u0000\u0006\u0000\u0017\u0000\u0018\u0000\u0019'
'\u0000\u0005\u0001\u0000\u0000'

No matter what subdomain I choose, the domain is always matched twice. It seems like the SNI field is nested inside another field.
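The nesting can be made visible by decoding the second string of the dump by hand. The sketch below assumes the standard server_name layout (a 2-byte list length, then a 1-byte name type and a 2-byte name length); `decodeServerNameList` is a hypothetical helper name, and `body` is rebuilt from the bytes printed in the dump.

```javascript
// Extension body as printed in the second line of the dump:
// 00 15 | 00 | 00 12 | "test.cubixcraft.de"
const body = Buffer.concat([
  Buffer.from([0x00, 0x15, 0x00, 0x00, 0x12]),
  Buffer.from("test.cubixcraft.de", "ascii"),
]);

// Hypothetical helper: decode the first entry of a server_name extension body.
function decodeServerNameList(body) {
  const listLength = body.readUInt16BE(0);  // length of the whole name list
  const nameType = body.readUInt8(2);       // 0 = host_name
  const nameLength = body.readUInt16BE(3);  // length of this one name
  const name = body.toString("ascii", 5, 5 + nameLength);
  return { listLength, nameType, nameLength, name };
}

console.log(decodeServerNameList(body));
// { listLength: 21, nameType: 0, nameLength: 18, name: 'test.cubixcraft.de' }
```

Under that layout the body is 23 bytes long, so the two-zero-byte scan presumably fires once on the extension header (taking the low byte of the extension length, 23, as the length) and once on the name type byte followed by the high byte of the name length (taking 0x12 = 18 as the length). That would explain why the domain shows up twice, and why the "two zero bytes and one length byte" are really a 1-byte type plus a 16-bit length rather than a 24-bit integer.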

I am open to suggestions and improvements! :)

I turned this into a Node module for anyone who cares: sni.

For anyone interested, this is a tentative version of the C/C++ code. It has worked so far. The function returns the position of the server name in the byte array containing the Client Hello, and the length of the name in the len parameter.

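#include <cstddef>
#include <stdexcept>

// Read a big-endian 16-bit value byte by byte (no alignment assumptions).
static unsigned short read_u16(const unsigned char *p)
{
    return (unsigned short)((p[0] << 8) | p[1]);
}

// Note: the caller must make sure bytes holds a complete Client Hello record;
// this function has no buffer length to check against.
char *get_TLS_SNI(unsigned char *bytes, int *len)
{
    // 5-byte record header + 4-byte handshake header + 2-byte version
    // + 32-byte random = 43 bytes before the session ID length.
    unsigned char sidlen = bytes[43];
    unsigned char *curr = bytes + 1 + 43 + sidlen;
    unsigned short cslen = read_u16(curr);              // cipher suites length
    curr += 2 + cslen;
    unsigned char cmplen = *curr;                       // compression methods length
    curr += 1 + cmplen;
    unsigned char *maxchar = curr + 2 + read_u16(curr); // end of extensions
    curr += 2;
    while (curr < maxchar)
    {
        unsigned short ext_type = read_u16(curr);
        curr += 2;
        unsigned short ext_len = read_u16(curr);
        curr += 2;
        if (ext_type == 0)                              // server_name extension
        {
            curr += 3;                                  // list length (2) + name type (1)
            unsigned short namelen = read_u16(curr);
            curr += 2;
            *len = namelen;
            return (char*)curr;
        }
        curr += ext_len;                                // wrong extension, skip it
    }
    if (curr != maxchar) throw std::runtime_error("incomplete SSL Client Hello");
    return NULL; // SNI was not present
}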