AWS Gotcha - IIS + ELB + HTTPS

So I've run into this twice now and have to write it down so I don't forget (or at least so future-me can google myself).

We were using a standard AWS Elastic Load Balancer (ELB) to serve traffic to our API.  The API is served from IIS webservers on ports 80 and 443.  The ELB does not verify back-end cert authenticity (the servers have a self-signed cert).

We're trying to move toward treating servers as cattle rather than pets, and so decided to spin up a batch of new servers and migrate traffic over to them using the very fabulous Route 53 traffic policies.  It worked great!

Until humans got involved.

Since this is a new thing, I built one, tested it, then built the rest.  For whatever reason a seemingly minor change I made didn't make it back into the repo, and so the other 3 servers were built using the non-fixed configuration.

We noticed in New Relic that of the four servers, only one was doing any heavy lifting - the other 3 were each receiving a much smaller, nearly identical number of requests.  So what's the deal, AWS!!  Come on!

Anyways...it turns out that fixing that bad config turned everything into rainbows and sunshine.

The bad config?  Well, you need to understand IIS gotcha #1:  You cannot programmatically add a website to an IIS instance that has zero websites. (you can, however, do it via GUI - this is a long-standing thing and a very dumb problem to have)

Ok, cool, so fine - leave the Default Web Site in place, but stop it.  Fine, no problem.

Still, when you try to go through an ELB to a host set up like this via HTTPS it fails to load (endless timeout is how it manifested itself).
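If you want to see the symptom without going through the ELB, here is a rough Python sketch that attempts a TLS handshake straight against a back-end host. It is not what the ELB actually does internally (I don't know that), just a quick way to tell "443 answers" from "443 hangs or refuses". The function name is made up for illustration, and certificate validation is turned off on purpose because the servers use a self-signed cert.

```python
import socket
import ssl


def https_handshake_ok(host, port=443, timeout=5.0):
    """Return (ok, detail) for one TLS handshake attempt against host:port."""
    # Assumption: skip certificate validation, since the back ends in this
    # post present a self-signed cert.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                # Handshake completed; report the negotiated protocol version.
                return True, tls.version()
    except OSError as exc:
        # OSError covers refused connections, timeouts, and ssl.SSLError.
        return False, f"{type(exc).__name__}: {exc}"
```

Call it once per back end, e.g. `https_handshake_ok("10.0.0.11")` (a placeholder address), and compare the server that works against the ones that don't.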

THE FIX
Add an HTTPS binding to the stopped Default Web Site and Robert's your mother's brother.

Srsly.  The site is stopped, and yet the missing HTTPS binding on it causes ELB security checks or something (if anyone knows, comment away) to say 'no HTTPS traffic for you!'.  ELB does continue to send HTTP traffic, however - which explains why I saw some good traffic coming through (logs showed all of the traffic was on port 80).
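For future me: the "all of the traffic was on port 80" check can be done by tallying the s-port column of the IIS logs. A minimal sketch, assuming the logs are in W3C extended format with s-port enabled in the #Fields line (the function name is mine, not anything IIS ships):

```python
from collections import Counter


def requests_per_port(lines):
    """Count requests by server port in IIS W3C extended log lines."""
    counts = Counter()
    port_index = None
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive names the columns for the rows that follow it.
            fields = line[len("#Fields:"):].split()
            port_index = fields.index("s-port") if "s-port" in fields else None
            continue
        # Skip blanks, other directives, and rows seen before any #Fields line.
        if not line or line.startswith("#") or port_index is None:
            continue
        columns = line.split()
        if len(columns) > port_index:
            counts[columns[port_index]] += 1
    return counts
```

Feed it an open log file (`with open(path) as f: print(requests_per_port(f))`). A server with the bad config should show a count for 80 and nothing for 443.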

Anyways.  There you go, future me.
