Sunday, October 16, 2011

A (somewhat) concise explanation of HTTPS / SSL / PKI / Digital Certificates / Blah blah blah

My work has recently forced me to understand HTTPS. It's an area I've shied away from ever since 2004, when I interned at an IT security department. I felt that the more detailed and accurate explanations on the web were a bit too much, so I thought this might be more useful to some people. (In particular, I'm hoping this is useful to one of my friends who agreed that the video explanations on YouTube sucked.)

I'm no security expert; I just know enough to get my job done. Here's what I have to share: a concise explanation of HTTPS. Topics include SSL (Secure Sockets Layer), PKI (public key infrastructure), digital certificates, digital signatures, hashes, public/private keys, and stuff along those lines.

Let's start with this spreadsheet.

| Term | Definition | Reason |
| --- | --- | --- |
| HTTP | An application layer protocol for surfing the web. | So browsers and servers have a standard way to request and deliver webpages. |
| SSL | A protocol that wraps other application layer protocols so that they're secure. | So we can communicate across the network securely. |
| HTTPS | HTTP wrapped with SSL. | So we can browse the web securely. |
| Encryption | Ways to scramble data. | So Mr. Nosy doesn't know what you and I are talking about. |
| Decryption | Ways to unscramble data. | So you know what I'm talking about. |
| Keys | Something that can encrypt and decrypt data. | So we can encrypt and decrypt data. Duh? |
| Symmetric Key | A key that can decrypt data that was encrypted by itself. | So you and I can use this single key to encrypt and decrypt data. |
| Asymmetric Key | A key that can only decrypt data that was encrypted by its pair. | So it's only half the damage if one key is stolen. |
| Public/Private Key | An asymmetric key pair such that if one encrypts, the other can decrypt. | So I can use one key to encrypt data, knowing that you can use the other key to decrypt it. |
| Public Key | The key in the Public/Private Key pair that the owner decides to make public. | So people can write to me securely (because only I have the private key to decrypt the data). |
| Private Key | The key in the Public/Private Key pair that the owner decides to keep secret. | So I can assure people that I was the one who sent the data. (Only I have the private key to encrypt the data. You can verify it by decrypting it with my public key.) |
| Hash | A small computed number that represents a bigger chunk of data, such that even the smallest change to the big chunk of data results in a different hash. | So if you know this data is supposed to have hash X, but you compute the hash from the actual data and get Y, it means somebody changed the data. |
| Digital Signature | A hash encrypted by the private key. | So you know that the hash itself has not been changed (only I can encrypt the hash with my private key). |
| Public Key Infrastructure (PKI) | A group of technologies to help verify public keys. | So you know that the public key is actually mine and not an impersonator's. |
| Certificate Authority (CA) | Somebody who you trust. | So you can believe anything that the CA digitally signed. |
| Digital Certificate | A document containing my public key, digitally signed by a CA. | So I can give it to you and you can verify if it's really me or not. |
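
To make the Hash and Digital Signature entries a bit more concrete, here's a minimal Python sketch. The hashing part uses the standard `hashlib` module. The signing part is a toy: the key pair is built from two fixed, well-known primes, and the names `sha256_hex`, `hash_as_int`, `sign`, and `verify` are my own for illustration. Real signatures use randomly generated keys and padding, so treat this as a picture of the idea and nothing more.

```python
import hashlib

# Toy RSA key pair built from two well-known Mersenne primes.
# Assumption for illustration only: real keys are generated randomly and
# real signatures add padding. Never use this for actual security.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                           # modulus, shared by both keys
E = 65537                           # public exponent (the "public key")
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def sha256_hex(data: bytes) -> str:
    """Hash: a short fingerprint of a bigger chunk of data."""
    return hashlib.sha256(data).hexdigest()


def hash_as_int(data: bytes) -> int:
    """The same hash as a number, reduced so the toy key can handle it."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Digital signature: the hash 'encrypted' with the private key."""
    return pow(hash_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """'Decrypt' the signature with the public key, then compare hashes."""
    return pow(signature, E, N) == hash_as_int(data)


if __name__ == "__main__":
    message = b"Meet me at noon"
    tampered = b"Meet me at nine"
    signature = sign(message)
    print(sha256_hex(message))
    print(sha256_hex(tampered))          # a completely different fingerprint
    print(verify(message, signature))    # signature matches the message
    print(verify(tampered, signature))   # signature does not match
```

Anyone holding only `E` and `N` can run `verify`, but producing a valid signature needs `D`, which is the whole point of keeping the private key private.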

Now that the terms are hopefully understood, here's basically how HTTPS works. I'll call this protocol Mini-HTTPS because it skips a lot of detail and is not exactly HTTPS. However, understanding Mini-HTTPS will definitely help you understand HTTPS if you want to.
  1. You download Chrome. Chrome comes with a lot of digital certificates from CAs you trust. You trust them so much that you allow them to be shipped with Chrome. Each of these digital certificates contains the CA's public key and is signed by the CA itself.
  2. You open Chrome and go to https://plus.google.com.
  3. Chrome detects that the URL starts with "https" so it needs some security checking.
  4. Chrome sends, "verify yourself" to the server.
  5. The server replies with, "here's my digital certificate, along with its hash encrypted by a well-known CA (that is, digitally signed)." It's like when you were a kid: if your Dad said you could trust somebody, you believed him.
  6. Chrome verifies the certificate.
    1. It tries to decrypt the digital signature (the hash encrypted by the CA) in the certificate. Chrome can do this because Chrome has the CA's public key from (1).
    2. If it can decrypt it, it means that the CA really signed this (because only the CA has the private key to sign this.)
    3. Since the CA really signed the certificate, it knows that the CA vouches for the server. Otherwise, the server wouldn't have been able to get a signature from the CA.
    4. The resulting decrypted hash is untampered for sure, because otherwise Chrome wouldn't have been able to decrypt it successfully.
    5. It computes the certificate's hash by itself and expects to get the same hash as the decrypted hash from (6.4). If they're different, it knows the certificate has been tampered with.
    6. It now trusts that the server is really https://plus.google.com because its certificate is "CA approved" and has not been tampered with.
  7. Since Chrome trusts the server, it can confidently use the server's public key shipped with the certificate.
  8. Chrome creates a symmetric key to be used for the entire session. It doesn't use asymmetric keys for this because they're more expensive. A one-time symmetric key is secure enough because if the key were stolen, only this session would be affected.
  9. It encrypts the symmetric key with the server's public key and sends it to the server. (Only the server can see this symmetric key because only the server has the private key.)
  10. This symmetric key is used to encrypt normal HTTP traffic throughout this session (a session is defined as "everything until the client says it's done").
  11. When the client is done with this session, it sends "close session" to the server.
  12. The session is over: the symmetric key is thrown away, and a new session would need a new key.
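
For anyone who prefers code to lists, the Mini-HTTPS steps above can be sketched in Python. Everything here is an assumption made for illustration: the key pairs are toy RSA keys built from fixed primes, the symmetric cipher is a simple XOR stream, and names like `ca_issue`, `browser_verify`, and `mini_https` are made up. I also added a check that the name on the certificate matches the site being visited, which the list only implies. This is not how a real browser or SSL library is implemented.

```python
import hashlib
import secrets
from dataclasses import dataclass

E = 65537  # public exponent used by both toy key pairs


def make_keypair(p: int, q: int):
    """Return (public modulus n, private exponent d). Toy keys only."""
    return p * q, pow(E, -1, (p - 1) * (q - 1))  # needs Python 3.8+


def hash_as_int(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


@dataclass
class Certificate:
    subject: str        # who the certificate claims to be
    public_n: int       # the subject's public key
    signature: int = 0  # hash of the body, encrypted with the CA's private key

    def body(self) -> bytes:
        return f"{self.subject}|{self.public_n}".encode()


# Step 1: the browser ships with the CA's public key (CA_N and E).
CA_N, CA_D = make_keypair(2**61 - 1, 2**89 - 1)
# The server's own key pair. SERVER_D never leaves the server.
SERVER_N, SERVER_D = make_keypair(2**107 - 1, 2**127 - 1)


def ca_issue(subject: str, public_n: int) -> Certificate:
    """The CA signs the certificate (done long before any browsing)."""
    cert = Certificate(subject, public_n)
    cert.signature = pow(hash_as_int(cert.body(), CA_N), CA_D, CA_N)
    return cert


def browser_verify(cert: Certificate, expected_host: str) -> bool:
    """Step 6: decrypt the signature with the CA's public key, compare hashes."""
    decrypted_hash = pow(cert.signature, E, CA_N)
    computed_hash = hash_as_int(cert.body(), CA_N)
    return decrypted_hash == computed_hash and cert.subject == expected_host


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: the same call encrypts and decrypts."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def mini_https(host: str, request: bytes) -> bytes:
    """Run steps 4 to 10 and return the request as the server sees it."""
    cert = ca_issue(host, SERVER_N)            # steps 4-5: server presents cert
    if not browser_verify(cert, host):         # step 6
        raise ValueError("certificate rejected")
    session_key = secrets.token_bytes(16)      # step 8: one-time symmetric key
    wrapped = pow(int.from_bytes(session_key, "big"), E, cert.public_n)  # step 9
    server_key = pow(wrapped, SERVER_D, SERVER_N).to_bytes(16, "big")
    ciphertext = xor_stream(session_key, request)   # step 10: browser encrypts
    return xor_stream(server_key, ciphertext)       # server decrypts


if __name__ == "__main__":
    print(mini_https("plus.google.com", b"GET / HTTP/1.1"))
```

Notice that the expensive public/private key math runs only once, to move the 16-byte session key. Everything after that uses the cheap symmetric cipher, which is the trade-off step 8 describes.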
Hmmm, maybe that wasn't too concise, but I think it was a good shot. Let me know if I can make anything clearer.
