Amazon S3 Bucket Names For Websites

... or, 'Why can't I give my bucket a Route 53 DNS name?'

TL;DR: Because S3 buckets don't get an IP number just because their 'Use this bucket to host a website' toggle is checked. I knew that...

The Setup

I spent a good chunk of yesterday morning trying to figure out why I was unable to give a DNS entry in Route 53 to the S3 bucket that was hosting this blog. Resolving this problem gave me a much better understanding of how S3 implements HTTP access to S3 buckets and how DNS, HTTP, and TCP interact in non-obvious ways.

Now, I knew I wanted to host this blog as a static site in S3, as this would be a super-cheap hosting solution and it would give me a chance to experiment with adding more AWS functionality later. So, after a little fooling around with Pelican I had a skeleton website ready to go, loaded into an S3 bucket.

Just to review, an Amazon S3 bucket is a simple object store where you can put arbitrary data as opaque 'objects', and associate each object with a key value. There's an API for accessing S3 objects programmatically, but there is also an option to make the contents of an S3 bucket available via HTTP; this is what makes hosting a website in an S3 bucket so attractive.

When you create an S3 bucket you can enable this functionality by toggling the 'Use this bucket to host a website' button in the AWS console. If you do this, the console will reward you with a little note like

Endpoint : http://scratch-bucket.s3-website-us-east-1.amazonaws.com

This is great! All I would need to do would be to enable this feature and my website would be up and running. Sure enough, pasting http://scratch-bucket.s3-website-us-east-1.amazonaws.com into my browser's address bar brought up my scratch website. Success!

All that was left to do was give this endpoint a more convenient URL and I'd be ready to go. How hard could that be?....

Pain And Woe...

Now, I had read the Route 53 docs. I had watched the informative videos. I knew that I could use Route 53 (Amazon's DNS service) to map a DNS name to my bucket. I could just make an entry for blog.foobar.com pointing to my bucket and I'd be off to the races. Simple!

The way you're supposed to do this using Route 53 is by creating an 'alias' record for blog.foobar.com. However, trying to create such an alias in the Route 53 console resulted in the pulldown menu being populated with an unhelpful collection of S3 buckets:

— S3 website endpoints — No Targets Available

OK, well, perhaps there was something wrong with Route 53 or my bucket; I'd fix that later. We don't need this helpful pulldown menu: we know the name of the S3 bucket we want to use and we know bucket names are globally unique, so we'll just jam it into the edit box ourselves! But putting 'scratch-bucket' into the 'Alias Target' box just gave another error when trying to save the DNS resource record:

The record set could not be saved because: - Alias Target contains an invalid value.

Well, OK, trying to make an alias wasn't working. But I'm smart: all I wanted was for clients trying to access 'blog.foobar.com' to be routed to 'scratch-bucket.s3-website-us-east-1.amazonaws.com', and I knew how to do that: I just needed a DNS CNAME record for 'blog.foobar.com' that tells DNS clients to look up the S3 address instead, right?

But this didn't work, either: it didn't even resolve correctly! What was going on?

The Light Finally Dawns...

Hmmm... Alright, I knew that the scratch bucket was accessible: maybe I could just make a DNS A record that associated 'blog.foobar.com' with the IP number of the S3 bucket? First I had to fire up nslookup(1) to see what that IP number was:

> nslookup scratch-bucket.s3-website-us-east-1.amazonaws.com.
Server:     192.168.1.1
Address:    192.168.1.1#53

Non-authoritative answer:
scratch-bucket.s3-website-us-east-1.amazonaws.com   canonical name = s3-website-us-east-1.amazonaws.com.
Name:   s3-website-us-east-1.amazonaws.com
Address: 52.216.65.234

Oh. Of course.

Just because S3 buckets that have their 'Bucket Hosting' toggle turned on have DNS entries doesn't mean they have their own IP numbers. Indeed, nslookup shows us that all such S3 buckets in an Amazon region are handled by a shared pool of IP numbers. The HTTP handler that winds up handling an incoming request relies on the HTTP Host: header to determine which S3 bucket to look in.
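
To make that concrete, here is a toy model in Python of how a shared endpoint could pick a bucket from the Host header. This is only a sketch of the behaviour I observed, not Amazon's actual code: the function name bucket_for_host and the hard-coded us-east-1 suffix are my own inventions for illustration.

```python
# Toy model of host-based routing on a shared S3 website endpoint.
# Assumption: this imitates the observed behaviour; it is not Amazon's code.

WEBSITE_SUFFIX = ".s3-website-us-east-1.amazonaws.com"


def bucket_for_host(host_header: str) -> str:
    """Return the bucket name a shared endpoint would look in for this Host value."""
    host = host_header.strip().lower()
    # Drop an explicit port such as ":80", then any trailing root dot.
    host = host.split(":", 1)[0].rstrip(".")
    if host.endswith(WEBSITE_SUFFIX):
        # Native endpoint name: the bucket is whatever precedes the suffix.
        return host[: -len(WEBSITE_SUFFIX)]
    # Custom domain: the whole hostname is taken as the bucket name.
    return host


if __name__ == "__main__":
    for host in (
        "scratch-bucket.s3-website-us-east-1.amazonaws.com",
        "blog.foobar.com",
    ):
        print(f"Host: {host} -> bucket {bucket_for_host(host)!r}")
```

Under this model, a browser that asks for 'blog.foobar.com' sends that name as its Host value no matter which DNS record got it to Amazon's servers, so the lookup is for a bucket called 'blog.foobar.com' and never for 'scratch-bucket'.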

TL;DR: If you want to host a website in AWS using S3, you need to give your S3 bucket the same name as your DNS hostname.
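
Once the bucket carries the hostname as its name, the alias record from earlier finally has something to point at. As a rough sketch, the change batch you would hand to the Route 53 API looks something like the following. The helper name alias_change_batch is made up for illustration, and the endpoint's hosted zone ID is left as a parameter because it differs per region and has to be looked up in the AWS documentation; the value used below is a placeholder, not a real ID.

```python
import json


def alias_change_batch(hostname: str, endpoint: str, endpoint_zone_id: str) -> dict:
    """Build a Route 53 change batch aliasing hostname to an S3 website endpoint."""
    return {
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": hostname,
                    # Alias records for S3 website endpoints are A records.
                    "Type": "A",
                    "AliasTarget": {
                        # Zone ID of the regional S3 website endpoint,
                        # not of your own hosted zone.
                        "HostedZoneId": endpoint_zone_id,
                        "DNSName": endpoint,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ]
    }


if __name__ == "__main__":
    batch = alias_change_batch(
        hostname="blog.foobar.com.",
        endpoint="s3-website-us-east-1.amazonaws.com.",
        endpoint_zone_id="ZPLACEHOLDER",  # placeholder; look up the real ID
    )
    print(json.dumps(batch, indent=2))
```

Note that the alias target is the regional endpoint, not the bucket: the bucket only enters the picture through the record's own name, which is the whole point of the naming rule.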

This is actually a terrible feature of AWS. There's really nothing to stop a bad actor from essentially cybersquatting on S3 bucket names. I shouldn't be able to create an S3 bucket named 'sortinghat.google.com' and thereby prevent Google from hosting a website at that address using S3.
