Managing SSL certificates and HTTPS configuration at scale

Our multi-network multisite WordPress installation at WSU has 1022 sites spread across 342 unique domain names. We have 481 SSL certificates on the server to help secure the traffic to and from these domains. And we have 1039 unique server blocks in our nginx configuration to help route that traffic.

Configuring a site for HTTPS is often portrayed as a difficult process. That’s mostly true, depending on your general familiarity with server configuration and encryption.

The good thing about a process is that you only have to figure it out a few times before you can automate it, or at least define it in a way that makes it less difficult.

Pieces used during SSL certification

A key—get it—to understanding and defining the process of HTTPS configuration is to first understand the pieces you’re working with.

  • Private Key: This should be secret and unique. It is used by the server to sign encrypted traffic that it sends.
  • Public Key: This key can be distributed anywhere. It is used by clients to verify that encrypted traffic was signed by your private key.
  • CSR: A Certificate Signing Request. This contains your public key and other information about you and your server. Used to request digital certification from a certificate authority.
  • Certificate Authority: The issuer of SSL certificates. This authority is trusted by the server and clients to verify and sign public keys. Ideally, a certificate authority is trusted by the maximum number of clients. (i.e. all browsers)
  • SSL Certificate: Also known as a digital certificate or public key certificate. This contains your public key and is signed by a certificate authority. This signature applies a level of trust to your public key to help clients when deciding its validity.

Of the files and keys generated, the most important for the final configuration are the private key and the SSL certificate. The public key can be generated at any time from the private key and the CSR is only a vessel to send that public key to a certificate signing authority.
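
For example, OpenSSL can extract the public key from a private key at any time (using the file names from the example in the next section):

openssl rsa -in jeremyfelt.com.key -pubout -out jeremyfelt.com.pub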

Losing or deleting the SSL certificate only means downloading it again. Losing or deleting the private key means restarting the process entirely.

Obtaining an SSL certificate

The first step in the process is to generate the private key for a domain and a CSR containing the corresponding public key.

openssl req -new -newkey rsa:2048 -nodes -sha256 -keyout jeremyfelt.com.key -out jeremyfelt.com.csr

This command generates a 2048-bit RSA private key and a CSR signed with the SHA-256 hash algorithm. No separate public key file is generated, as the public key is embedded directly in the CSR.
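
If you want to double check what you’re about to submit, OpenSSL can print the CSR in a readable form:

openssl req -noout -text -in jeremyfelt.com.csr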

Next, submit the CSR to a certificate authority. The certificate authority will sign the public key and return a digital certificate that includes the signature, your public key, and other information.

The certificate authority is often the annoying part of the process, and the most difficult to automate.

If you’re purchasing a certificate through a certificate authority or reseller such as GoDaddy or Namecheap, the steps to purchase the initial request, submit the CSR, and download the correct certificate file can be confusing and very time-consuming.

Luckily, in WSU’s case, we have a university subscription to InCommon, a reseller of Comodo certificates. This allows us to request as many certificates as we need for one flat annual fee. It also provides a relatively straightforward web interface for requesting certificates. As with other resellers, we still need to wait while the request is approved by central IT and then generated by Comodo via InCommon.

Even better is the new certificate authority, Let’s Encrypt, which provides an API and a command-line tool for submitting and finishing a certificate signing request immediately and for free.
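
As a rough sketch, assuming the current certbot client and a site already served from the web root used later in this post, a request looks something like this:

# A sketch only; client names and flags have changed over time.
certbot certonly --webroot -w /var/www/wordpress -d jeremyfelt.com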

Configuring the SSL certificate

This is where the process starts becoming more straightforward again. It’s also where I’ll focus only on nginx, as my familiarity with Apache disappeared years ago.

A cool thing about nginx when you’re serving HTTP requests is the flexibility of server names: one server block in the configuration can serve thousands of sites.

server {
    listen 80;
    server_name *.wsu.edu wsu.io jeremyfelt.com foo.bar;
    root /var/www/wordpress;
}

However, when you serve HTTPS requests, you must specify which files to use for the private key and SSL certificate:

server {
    listen 443 ssl http2;
    server_name jeremyfelt.com;
    root /var/www/wordpress;

    ssl on;
    ssl_certificate /etc/nginx/ssl/jeremyfelt.com.cer;
    ssl_certificate_key /etc/nginx/ssl/jeremyfelt.com.key;
}

If you are creating private keys and requesting SSL certificates for individual sites as you configure them, this means having a server block for each server name.

There are three possibilities here:

  1. Use a wildcard certificate. This would allow for one server block for each set of subdomains. Anything at *.wsu.edu would be covered.
  2. Use a multi-domain certificate. This uses the SubjectAltName portion of a certificate to list multiple domains in a single certificate.
  3. Generate individual server blocks for each server name.

A wildcard certificate would be great if you control the domain and its subdomains. Unfortunately, at WSU, subdomains point to services all over the state. If everybody managing multiple subdomains also had a copy of the wildcard certificate and its private key to make HTTPS easier, the likelihood of that private key leaking out and becoming untrustworthy would increase.

Multi-domain certificates can be useful when you have some simple combinations like www.site.foo.bar and site.foo.bar. To redirect an HTTPS request from www to non-www, you need HTTPS configured for both. A minor issue is the size of the certificate. Every domain added to a SubjectAltName field increases the size of the certificate by the size of that domain text.

Not a big deal with a few small domains. A bigger deal with 100 large domains.

The convenience of multi-domain certificates also depends on how frequently domains are added. Any time a domain is added to a multi-domain certificate, it would need to be re-signed. If you know of several in advance, it may make sense.
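
If you ever need to check which domains an existing multi-domain certificate covers, OpenSSL will list the SubjectAltName entries (file name hypothetical):

openssl x509 -noout -text -in www.site.foo.bar.cer | grep -A 1 "Subject Alternative Name"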

If you hadn’t guessed yet, we use option 3 at WSU. Hence the 1039 unique server blocks! 🙂

From time to time we’ll request a small multi-domain certificate to handle the www to non-www redirects. But that too fits right into our process of putting the private key and certificate files in the proper place and generating a corresponding server block.

Using many server blocks in nginx for HTTPS

Private keys are generated, CSRs are submitted, SSL certificates are generated and downloaded.

Here’s what a generated server block at WSU looks like:

# BEGIN generated server block for fancy.wsu.edu
#
# Generated 2016-01-16 14:11:15 by jeremy.felt
server {
    listen 80;
    server_name fancy.wsu.edu;
    return 301 https://fancy.wsu.edu$request_uri;
}

server {
    server_name fancy.wsu.edu;

    include /etc/nginx/wsuwp-common-header.conf;

    ssl_certificate /etc/nginx/ssl/fancy.wsu.edu.cer;
    ssl_certificate_key /etc/nginx/ssl/fancy.wsu.edu.key;

    include /etc/nginx/wsuwp-ssl-common.conf;
    include /etc/nginx/wsuwp-common.conf;
}
# END generated server block for fancy.wsu.edu

We listen to requests on port 80 for fancy.wsu.edu and redirect those to HTTPS.

We listen to requests on port 443 for fancy.wsu.edu using a common header, provide directives for the SSL certificate and private key, and include the SSL configuration common to all server blocks.

wsuwp-common-header.conf

This is the smallest configuration file, so I’ll just include it here.

listen 443 ssl http2;
root /var/www/wordpress;

Listen on 443 for SSL and HTTP2 requests and use the directory where WordPress is installed as the web root.

These directives used to be part of the generated server blocks, until nginx added support for HTTP/2 and immediately deprecated support for SPDY. I had to replace spdy with http2 in all of our server blocks, so I decided instead to create a common config and include it.
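
If you ever need to make a similar mass replacement, something along these lines would do it (paths are hypothetical and depend on where your server blocks live; test against a copy first):

grep -rl "ssl spdy;" /etc/nginx/sites-enabled/ | xargs sed -i "s/ssl spdy;/ssl http2;/"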

WSU’s wsuwp-common-header.conf is open source if you’d like to use it.

wsuwp-ssl-common.conf

This is my favorite configuration file and one I often revisit. It contains all of the HTTPS specific nginx configuration.

# Enable HTTPS.
ssl on;

# Pick the allowed protocols
ssl_protocols TLSv1 TLSv1.1 TLSv1.2;

# And much, much more...

This is a case where so much of the hard stuff is figured out for you. I regularly visit things like Mozilla’s intermediate set of ciphers and this boilerplate nginx configuration and then make adjustments as they make sense.
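
One quick sanity check after adjusting protocols or ciphers is to ask OpenSSL to negotiate a specific protocol and see whether the handshake succeeds (flag availability depends on how your local OpenSSL was built):

# Should connect:
openssl s_client -connect jeremyfelt.com:443 -tls1_2 < /dev/null
# Should fail if SSLv3 is disabled on the server:
openssl s_client -connect jeremyfelt.com:443 -ssl3 < /dev/null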

WSU’s wsuwp-ssl-common.conf is open source if you’d like to use it.

wsuwp-common.conf

And then there’s the configuration file for WordPress and other things. It’s the least interesting to talk about in this context. But! It too is open source if you’d like to use it.

The process of maintaining all of this

At the beginning I mentioned defining and automating the process as a way of making it less difficult. We haven’t yet reached full automation at WSU, but our process is now well defined.

  1. Generate a private key and CSR using our WSUWP TLS plugin. This provides an interface in the main network admin to type in a domain name and generate the required files. The private key stays on the server and the CSR is available to copy so that it can be submitted to InCommon.
  2. Submit the CSR through the InCommon web interface. Wait.
  3. Upon receipt of the approval email, download the SSL certificate from the embedded link.
  4. Upload the SSL certificate through the WSUWP TLS interface. This verifies the certificate’s domain, places it on the server alongside the private key, and generates the server block for nginx.
  5. Deploy the private key, SSL certificate, and generated server block file. At the moment, this process involves the command line. (A rough sketch follows this list.)
  6. Run nginx -t to test the configuration and service nginx reload to pull it into production.
  7. In the WSUWP TLS plugin interface, verify the domain responds on HTTPS and remove it from the list.
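
Here’s that sketch of steps 5 and 6, with a hypothetical server name and config path (our actual deploy differs):

scp fancy.wsu.edu.key fancy.wsu.edu.cer web1:/etc/nginx/ssl/
scp fancy.wsu.edu.conf web1:/etc/nginx/sites-enabled/
ssh web1 "nginx -t && service nginx reload"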

Looking at the steps above, it’s not hard to imagine a completely automated process, especially if your certificate authority has a way of immediately approving requests and responding with a certificate. And even without automation, having this process well defined allows several members of our team to generate, request, and deploy certificates.

I’d love to know what other ways groups are approaching this. I’ve often hoped and spent plenty of time searching for easier ways. Share your thoughts, especially if you see any holes! 🙂

Previously:

Configure Nginx to allow for embedded WordPress posts

The ability to embed WordPress posts in WordPress posts is a pretty sweet feature from 4.4 and I’ve been looking forward to finding ways of using it throughout WSU. Today, when I tried it for the first time, I got an error because of our strict X-Frame-Options header that we had set to SAMEORIGIN for all page views.

To get around this, I added a block to our Nginx configuration that modifies this header whenever /embed/ is part of the requested URL. It’s a little sloppy, but it works.

Before our final location block, I added a new one to capture /embed/:

# We'll want to set a different X-Frame-Option header on posts which
# are embedded in other sites.
location ~ /embed/ {
    set $embed_request 1;
    try_files $uri $uri/ /index.php$is_args$args;
}

This sets the $embed_request variable to be used later in our final .php location block:

location ~ \.php$ {
    try_files $uri =404;

    # Set slightly different headers for oEmbed requests
    if ( $embed_request = 1 ) {
        add_header X-Frame-Options ALLOWALL;
        add_header X-Content-Type-Options nosniff;
        add_header X-XSS-Protection "1; mode=block";
    }

    # Include the fastcgi_params defaults provided by nginx
    include /etc/nginx/fastcgi_params;
    ...etc...

Now, all URLs except those specifically for embedding are prevented from being used in iframes on other domains.
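
A quick way to confirm the header behaves as expected is to compare the two request types with curl (URLs hypothetical):

curl -sI https://fancy.wsu.edu/a-post/ | grep -i x-frame
curl -sI https://fancy.wsu.edu/a-post/embed/ | grep -i x-frame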

And here we are!

[Embedded post demo: “Still searching for Amelia”]

Figuring out how to serve many SSL certificates, part 2.

I’ve been pretty happy over the last couple days with our A+ score at SSL Labs. I almost got discouraged this morning when we discovered that LinkedIn wasn’t able to pull in the data from our HTTPS links properly when sharing articles.

Their bot, `LinkedInBot/1.0 (compatible; Mozilla/5.0; Jakarta Commons-HttpClient/3.1 +http://www.linkedin.com)`, uses an end-of-life HTTP client that also happens to be Java-based. One of our warnings in the handshake simulation area was that clients on Java Runtime Environment 6u45 did not support 2048-bit DH params, which we were using. I’m not entirely sure if LinkedIn has their JRE updated to 6u45, but I’m guessing that anything below that has the same issue.

I generated a new 1024-bit dhparams file to solve the immediate issue and reloaded nginx without changing any other configs. LinkedIn can now ingest our HTTPS links and we still have an A+ score. 🙂
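
For reference, regenerating the file and pointing nginx at it looks like this (path hypothetical):

openssl dhparam -out /etc/nginx/ssl/dhparams.pem 1024
# Then, in the common SSL config:
# ssl_dhparam /etc/nginx/ssl/dhparams.pem;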

Figuring out how to serve many SSL certificates, part 1.

In the process of figuring out how to configure SSL certificates for hundreds (maybe thousands) of domains in a single nginx configuration without a wildcard certificate, I decided it would be cool to use `server_name` as a variable in the nginx configuration:

`ssl_certificate /etc/nginx/ssl/$server_name.crt;`

Unfortunately, per this aptly named request on Server Fault—nginx use $server_name on ssl_certificate path—that’s not allowed.

Nginx docs explain it more:

Variables are evaluated in the run-time during the processing of each request, so they are rather costly compared to plain static configuration.

So with that, I’m going to have to generate a bunch of `server {}` blocks that point to the correct certificate and key files before including a common config. I can’t find any examples of this yet, so I’m still wondering if there’s a better way.
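
A minimal sketch of that generation, assuming a text file of domain names and the certificate layout above (all paths hypothetical):

while read domain; do
    cat > "/etc/nginx/sites-enabled/$domain.conf" <<EOF
server {
    listen 443 ssl;
    server_name $domain;
    ssl_certificate /etc/nginx/ssl/$domain.crt;
    ssl_certificate_key /etc/nginx/ssl/$domain.key;
    include /etc/nginx/ssl-common.conf;
}
EOF
done < domains.txt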

Clear nginx Cache in Vagrant

Fooled you. You think that cache is the problem, but it’s not.

Scenario 1… You installed Vagrant with VirtualBox on your local machine and have a sweet nginx setup going as your development environment. You made a few changes to a CSS file and the new style is not reflecting on the page. You try saving the file again in your text editor, no go. You look at the file on the server, it’s cool. You restart the nginx service, still no change. You restart the services for php5-fpm and memcached, maybe even mysql… no go.

Something has captured this file in cache and is not letting go!

Scenario 2… Same setup. You made a few changes to a JS file and the script doesn’t seem to be working. Must be a caching issue. You try saving the file again, look at the file on the server, restart nginx, restart everything. Finally look at the console in your browser and see some kind of random error.

Sooner or later, with one of these files, you open it up and see these:

�����������������

What the what? It’s an encoding issue? Not a caching issue? Or it’s a… wait, what?

Hopefully you haven’t spent too much time trying to figure this out before stumbling on a site like this one that tells you the only change necessary is a simple line in your nginx config file.

sendfile off;

Find the spot in your assorted nginx config files that says ‘sendfile on’ and change it to ‘sendfile off’.
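
If you’re not sure where that directive lives, grep will find it:

grep -rn "sendfile" /etc/nginx/

Then reload nginx to pick up the change.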

Sendfile is used to ‘copy data between one file descriptor and another‘ and apparently has some real trouble when run in a virtual machine environment, or at least when run through VirtualBox. Turning this config off in nginx causes the static file to be served via a different method and your changes will be reflected immediately and without question – or black question mark diamond thing.

Hope that saves you a minute.

For further reading, consider those that have stumbled on the same problem before.

Or, even better – more detail about sendfile itself and other common nginx pitfalls:

Are nginx, Batcache and Memcached Working?

So you’ve done your due diligence and gone to great lengths to make sure that your WordPress server setup is top notch.

Nginx, PHP-FPM, and Memcached are all installed and their processes are running. Batcache and the Memcached PECL extension are installed for WordPress.

How do you know it’s working, besides that things seem quick?

The obvious answer is to open an incognito or private window in your browser, visit the site in question, and then view source on the page load. In the <head> area, there should be a comment left by Batcache providing statistics on when and for how long it was generated.

[Screenshot: Batcache comments in page source]

However, the easy answer is not always the most fun answer. And not every answer tells the full story. Checking the response headers of the HTTP request can also fill you in on some extra info.

Batcache by default is set to only start serving cached page views if a page has been accessed more than twice in a 120-second period. It also should only be serving these cached views to unauthenticated users. Before you’ll be able to see the entire package in action, you need to load whichever page you want to test a few times as an unauthenticated user, and then check the headers. If you have a command prompt with curl installed, your easiest option is this:

curl http://mydomain.com/
curl http://mydomain.com/
curl http://mydomain.com/
curl -I http://mydomain.com/
curl -I http://mydomain.com/

Do these in quick succession, ignoring the page content that you receive after the first three commands. Hint, hint… use the up arrow. Feel free to ignore me too, because right now I’m hitting it after one page view, so who knows what I did.

After the fourth command, issued with the -I flag that tells curl to make only a HEAD request, you should see output indicating that Batcache has caught the request and is attempting to work some magic.

HTTP/1.1 200 OK
Server: nginx
Date: Mon, 10 Dec 2012 06:19:30 GMT
Content-Type: text/html; charset=UTF-8
Connection: keep-alive
X-Powered-By: PHP/5.3.6-13ubuntu3.9
Vary: Cookie
X-Pingback: http://mydomain.com/wordpress/xmlrpc.php
Link: <http://little.shorty/2aYLe>; rel=shortlink
Last-Modified: Mon, 10 Dec 2012 06:19:30 GMT
Cache-Control: max-age=300, must-revalidate

The last two lines in this HEAD output are key. Cache-Control has been turned on, and the max-age of the request has been set to 300 seconds. Sweet! This means that at least one thing is doing something – Batcache.

Now, the second curl request made with the -I flag gives us almost the same information:

HTTP/1.1 200 OK
Server: nginx
Date: Mon, 10 Dec 2012 06:42:45 GMT
Content-Type: text/html; charset=UTF-8
Connection: keep-alive
Last-Modified: Mon, 10 Dec 2012 06:42:43 GMT
Cache-Control: max-age=298, must-revalidate
X-Powered-By: PHP/5.3.6-13ubuntu3.9
Vary: Cookie
X-Pingback: http://mydomain.com/wordpress/xmlrpc.php
Link: <http://little.shorty/2aYLe>; rel=shortlink

You’ll notice here that the Cache-Control line has moved up a bit and the max-age of the request has been set to 298 seconds. This means the timer on the cache expiration has started to count down. If you were to keep on making curl -I requests at this point, you could expect that timer to count all the way to 0 before the page cache was regenerated.

The ordering of the lines in these examples may not always match, but if you do this curl -I request multiple times and max-age stays at 300, this is probably an indication that while Batcache is doing what it needs to do, there is no actual caching going on in the background. At this point, check to make sure that you have an object cache plugin installed and that Memcached is in fact running.
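
A couple of quick checks for that last part, assuming Memcached on its default port:

service memcached status
echo stats | nc -w 1 localhost 11211 | grep -E "get_hits|get_misses"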

Of course, sometimes it isn’t as easy as just making sure everything is running. Batcache and Memcached may be running fine, but if there’s a PHP Session to be maintained, then the Cache-Control header will probably not be modified and you’ll see something similar to this:

HTTP/1.1 200 OK
Server: nginx
Date: Mon, 10 Dec 2012 06:06:33 GMT
Content-Type: text/html; charset=UTF-8
Connection: keep-alive
Last-Modified: Mon, 10 Dec 2012 06:03:29 GMT
X-Powered-By: PHP/5.3.6-13ubuntu3.9
Vary: Cookie
Set-Cookie: PHPSESSID=dpbnobn95jrgg5fhpfaa5sn707; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
X-Pingback: http://mydomain.com/wordpress/xmlrpc.php
Link: <http://little.shorty/12345>; rel=shortlink

At this point, you’ll need to go back to relying on the comment left by Batcache in the page source to determine if it is still generating things as intended. Though it may be worth a search for ‘session_start()’ through your code base to see if you really need that PHP session, as it could be negating some of what you are hoping to achieve through caching anyway.
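
Something like this will turn those up quickly (path assumes the WordPress root used elsewhere in these posts):

grep -rn "session_start" /var/www/wordpress/wp-content/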

Oh, and if nothing is turned on at all, expect something similar to this:

HTTP/1.1 200 OK
Server: nginx
Date: Mon, 10 Dec 2012 06:21:48 GMT
Content-Type: text/html; charset=UTF-8
Connection: keep-alive
X-Powered-By: PHP/5.3.6-13ubuntu3.9
X-Pingback: http://mydomain.com/wordpress/xmlrpc.php
Link: <http://little.shorty/2aYLe>; rel=shortlink

Notice no Cache-Control header at all when Batcache is out of the picture.

All of the above is intended to be taken with a grain of salt. I didn’t research too much of it thoroughly, and a lot is dependent on my specific environment. It may come in handy to a few people though, and it’ll probably come in handy to me again.

Server Setup: nginx 1.0.5 | Memcached PECL | WordPress 3.5 | Batcache | WordPress Memcached Backend