Creating Subdomain Hosts On The Fly Using Apache
Often when I'm developing applications for the web or testing revisions to a web site's design, I like to set up a temporary staging area for testing on a subdomain that isn't used. This usually means creating a DNS record defining the subdomain, then adding the appropriate VirtualHost block in my Apache web server's configuration. Of course if I'm testing several versions at once, I may need several subdomains. And with each change to DNS comes the inevitable waiting-for-propagation before remote clients can see any of my new revisions. But by using Wildcard DNS in combination with a few sneaky mod_rewrite directives, we can create an Apache server that will automatically serve content on any subdomain, without touching the configuration or mucking about in the DNS records. This article will explain how to do just that, and examine some pros and cons of this approach.
Update!
Recently I discovered quite by accident that enabling MultiViews will cause mod_rewrite's file test operators (-f, -d and so on) to become confused. You can verify this for yourself by creating a file, say site.css, in a subdomain directory, and a file whose name matches it, say site.css.old, in your webserver root directory, and then request the subdomain's site.css file. Crazy! The solution is to simply add Options -MultiViews to your config block, as this article now shows. -G
How It Works
The set-up is actually pretty simple: we'll create a directory inside our webserver's document root that contains directories for each subdomain we wish to host. In our examples below, our Document Root is set to /var/www, and our "subdomain directory" is called _. So the server's filesystem layout looks like this:
/var/www
/var/www/index.html
/var/www/site.css
/var/www/_
/var/www/_/test1
/var/www/_/test1/index.html
/var/www/_/test2
/var/www/_/test2/index.html
/var/www/_/test3
/var/www/_/test3/index.html
# ...
In the /var/www directory we have the usual suspects: an index.html page that provides the content for the http://automagick.us site, and a cascading style sheet file. In the /var/www/_ directory, we have three subdirectories -- test1, test2 and test3. Each of these contains an index.html that is the index page for the website http://test1.automagick.us, http://test2.automagick.us, and http://test3.automagick.us, respectively. Every additional directory we create inside /var/www/_ will automatically create a new website on the subdomain of the same name. Handy, yes? Let's get started!
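As a rough illustration of the layout above, the following Python sketch creates one subdomain directory with a placeholder index page. The helper name add_subdomain and the placeholder page content are assumptions made for this example, not part of the original setup; the default root mirrors the /var/www/_ path used in this article.

```python
from pathlib import Path


def add_subdomain(name, subdomain_root="/var/www/_"):
    """Create <subdomain_root>/<name>/index.html and return the directory.

    Hypothetical helper: the set-up only requires that the directory
    exists; the placeholder page is an assumption for illustration.
    """
    site_dir = Path(subdomain_root) / name
    site_dir.mkdir(parents=True, exist_ok=True)
    index = site_dir / "index.html"
    if not index.exists():
        # Never overwrite content that is already being served.
        index.write_text(f"<h1>{name}</h1>\n")
    return site_dir
```

Calling add_subdomain("test4") would create /var/www/_/test4, which is all that is needed for a test4 site once the DNS and Apache pieces described in the following sections are in place.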
Wildcard DNS
Before we get to the Apache magic, we have to take care of DNS. Simply put, a wildcard DNS entry will treat any subdomain not otherwise defined in your DNS maps as an A record for a specific IP address. All you need to do is create a new record in your maps thusly (using automagick.us as an example):
*.automagick.us IN A 123.456.789.1
With this entry in place, any name that is asked for in the automagick.us domain that isn't defined elsewhere will resolve to the (fictitious) IP address 123.456.789.1. This includes the subdomains test1.automagick.us, test2.automagick.us, and test3.automagick.us, but also test4.automagick.us, fnord.automagick.us, this.really.long.silly.name.automagick.us, and so on. Note: this is cool, but has some implications that may not be obvious; see the Pros & Cons section for more. For now, let's configure Apache and make the magick happen:
Apache Configuration
In a traditional configuration, your Apache webserver would need VirtualHost directives for each subdomain you wish to host. That would mean creating a new config block every time you add a subdomain. Instead, we're going to configure our server with some rules for mod_rewrite inside our root <Directory> block and forgo the VirtualHost configuration entirely. Here's the whole block:
<Directory "/var/www">
# ... regular old directory configuration ...
Options -MultiViews
RewriteEngine On
RewriteBase /
# don't mess with redirected requests
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^ - [L]
# avoid serving pages on subdomains that don't exist as directories in the /_/ directory
RewriteCond %{HTTP_HOST} !^(www\.)?automagick\.us [NC]
RewriteCond %{HTTP_HOST} ^(.+)\.automagick\.us [NC]
RewriteCond /var/www/_/%1 !-d [NC]
RewriteRule .* http://automagick.us/ [L]
# internally redirect all subdomain requests to corresponding directory
RewriteCond %{HTTP_HOST} !^(www\.)?automagick\.us [NC]
RewriteCond %{HTTP_HOST} ^(.+)\.automagick\.us [NC]
RewriteCond /var/www/_/%1 -d [NC]
RewriteCond %{REQUEST_URI} !^/_/ [NC]
RewriteRule (.*) _/%1/$1 [L]
# redirect http://automagick.us/_/subdomain to http://subdomain.automagick.us
RewriteCond %{HTTP_HOST} ^(www\.)?automagick\.us [NC]
RewriteRule ^_/([^/]+)(.*) http://$1.automagick.us$2 [L,R=301]
</Directory>
So that's a lot of stuff. Let's break it down line by line and see what's going on. First, we enable mod_rewrite for the directory, and specify a base URL for per-directory rewrites using RewriteBase:
RewriteEngine On
RewriteBase /
The documentation for RewriteBase may seem confusing at first, but think of it as telling mod_rewrite which URL prefix to put in front of a relative rewrite result. Since we want the contents of the subdirectory to appear at the root URL of our subdomain, we set the RewriteBase to /.
Next, a hint that we don't want mod_rewrite to process requests that have already been processed:
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^ - [L]
Since the results of our rewrites will create internal redirects that will be handled by this same configuration block, we don't want URLs rewritten over and over. The RewriteCond condition instructs Apache to check the environment for a REDIRECT_STATUS variable with a value of 200. If the request has such a variable, the RewriteRule fires: the - substitution leaves the URL untouched, and the [L] flag is shorthand for "prevent any further mod_rewrite directives from being processed for this request."
We also don't want every single valid subdomain request to be handled by Apache; we only want to serve content for subdomains that have corresponding directories in our /var/www/_ tree. These next four lines take care of that:
# avoid serving pages on subdomains that don't exist as directories in the /_/ directory
RewriteCond %{HTTP_HOST} !^(www\.)?automagick\.us [NC]
RewriteCond %{HTTP_HOST} ^(.+)\.automagick\.us [NC]
RewriteCond /var/www/_/%1 !-d [NC]
RewriteRule .* http://automagick.us/ [L]
The RewriteRule is pretty straightforward: it says "redirect all requests to the root domain, and stop processing." Why would we want to do that? The answer is in the rewrite conditions. The first condition matches every request that is not www.automagick.us or just automagick.us (notice the leading !). The second condition matches every request that points to a subdomain. Taken together, these two conditions will only be satisfied for requests that don't point to the root domain or its alias, www.
The last condition is trickier. Let's look at it again:
RewriteCond /var/www/_/%1 !-d [NC]
This condition uses a technique known as backreferencing, which will be familiar to anyone who uses regular expressions. But all you need to know to understand this example is that the %1 will be replaced with the subdomain name that was matched in the previous condition, which in turn is the subdomain of the original request. So if the URL that was requested was http://foo.automagick.us, the %1 would be replaced with foo before the condition is evaluated.
But what does it do, this final condition? It checks the /var/www/_ directory to see if there's a directory that matches the requested subdomain. In our case, it checks for the existence of /var/www/_/foo. But notice that !, again -- this inverts the regular meaning of the test. So what we have is a condition that will be true if the proper directory is missing. So altogether, our three conditions tell mod_rewrite that it should redirect the request back to the root domain if:
- The request is for a subdomain;
- The request isn't for the www subdomain; and
- There is no directory in /var/www/_ with the same name as the subdomain.
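The three conditions can be modelled outside Apache to see which hosts get bounced back to the main site. The Python sketch below is only an approximation of mod_rewrite's behaviour under this article's assumptions (domain automagick.us, subdomain directories under /var/www/_); the function name should_redirect_to_root is made up for this example.

```python
import os
import re

# Host patterns from the rewrite conditions; re.IGNORECASE plays the role of [NC].
ROOT_HOST = re.compile(r"^(www\.)?automagick\.us", re.IGNORECASE)
SUBDOMAIN = re.compile(r"^(.+)\.automagick\.us", re.IGNORECASE)  # literal dot before the domain


def should_redirect_to_root(host, subdomain_root="/var/www/_"):
    """Return True when all three conditions hold for this Host header."""
    if ROOT_HOST.search(host):           # condition 1 is negated with "!"
        return False
    match = SUBDOMAIN.search(host)       # condition 2 captures what Apache calls %1
    if not match:
        return False
    candidate = os.path.join(subdomain_root, match.group(1))
    return not os.path.isdir(candidate)  # condition 3: the "!-d" test
```

With only test1, test2 and test3 present under /var/www/_, a host such as fnord.automagick.us would return True here, while test1.automagick.us and www.automagick.us would return False.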
Do you see it now? We've just denied requests for subdomains we haven't created content for, and handled those denied requests by gracefully sending the user back to the main website. But we still haven't actually served up the domains, have we? Well, let's! Here's the rewrite block that does just that:
# internally redirect all subdomain requests to corresponding directory
RewriteCond %{HTTP_HOST} !^(www\.)?automagick\.us [NC]
RewriteCond %{HTTP_HOST} ^(.+)\.automagick\.us [NC]
RewriteCond /var/www/_/%1 -d [NC]
RewriteCond %{REQUEST_URI} !^/_/ [NC]
RewriteRule (.*) _/%1/$1 [L]
The first three conditions may look familiar; in fact they're the very same conditions as in the previous section, but with one important distinction. Look at the third condition:
RewriteCond /var/www/_/%1 -d [NC]
Notice how there's no leading ! before the directory test? For our subdomains to work, we want to match all requests for subdomains that do have content, i.e., those with corresponding directories inside /var/www/_.
There is also one additional condition:
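RewriteCond %{REQUEST_URI} !^/_/ [NC]
This one says "do not match any request with a URI beginning with /_/." Note the leading slash: %{REQUEST_URI} always starts with one, so a pattern anchored at ^_/ would never match and the condition would always be true. This is more sanity-checking; we don't want our rewrite rules to recurse down into the /var/www/_/* tree, or Bad Things™ will happen.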
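So taken together, our four conditions will match a request if:
- The request is for a subdomain;
- The request isn't for the www subdomain;
- The requested subdomain has a corresponding directory in /var/www/_; and
- The request URI doesn't already point into the _ directory.
When all four are satisfied, the RewriteRule internally rewrites the request into the matching directory, so a request to http://test1.automagick.us is served from /var/www/_/test1 as if it was the document root of the subdomain that was requested. A final note: see the $1 at the end of the rule? It carries the originally requested path into the subdomain directory. The query string needs no special handling: mod_rewrite passes it through unchanged when the substitution doesn't contain one of its own, which is a must for form submissions using GET.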
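To make the mapping concrete, here is a small Python model of what the internal rewrite does to a request path. It is a sketch under this article's assumptions (document root /var/www, subdomain directory _), not a reimplementation of mod_rewrite; rewritten_path is a hypothetical name, and the directory test is replaced by a set passed in as a parameter so the example runs without a real /var/www tree.

```python
import re

ROOT_HOST = re.compile(r"^(www\.)?automagick\.us", re.IGNORECASE)
SUBDOMAIN = re.compile(r"^(.+)\.automagick\.us", re.IGNORECASE)


def rewritten_path(host, path, existing_subdomains):
    """Return the path Apache would serve internally, or None if the
    subdomain rule does not apply.

    `path` is the per-directory path the RewriteRule sees (no leading
    slash); `existing_subdomains` stands in for the "-d" directory test.
    """
    if ROOT_HOST.search(host):
        return None  # root domain or its www alias
    match = SUBDOMAIN.search(host)
    if not match or match.group(1) not in existing_subdomains:
        return None  # not a subdomain, or no directory for it
    if path.startswith("_/"):
        return None  # already inside the subdomain tree
    # RewriteRule (.*) _/%1/$1 with RewriteBase / gives /_/<subdomain>/<path>
    return "/_/{}/{}".format(match.group(1), path)
```

For example, rewritten_path("test1.automagick.us", "about/index.html", {"test1"}) yields /_/test1/about/index.html, which Apache would then map to a file under /var/www.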
We're almost done! We have just one more thing to take care of: In case a user ends up requesting one of our subdomain directories directly (they're in the document root, after all), we want to force them to use the appropriate subdomain address instead. This will ensure that scripts and other clever bits that think they're running on an actual virtual host see the proper URIs. The final two lines of our configuration are:
RewriteCond %{HTTP_HOST} ^(www\.)?automagick\.us [NC]
RewriteRule ^_/([^/]+)(.*) http://$1.automagick.us$2 [L,R=301]
The condition matches all requests on the root domain (and its www alias), and ensures that any direct requests for directories inside _ get redirected back out to the subdomain.
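The pattern and substitution in that last rule can be tried out with Python's re module. This is an illustrative sketch only: canonical_url is a made-up helper, and it assumes the per-directory path that Apache hands to the rule has no leading slash.

```python
import re

# Same pattern as the RewriteRule: first path segment after _/ is the
# subdomain name ($1), everything after it is the rest of the path ($2).
DIRECT_REQUEST = re.compile(r"^_/([^/]+)(.*)")


def canonical_url(path):
    """Map a direct request for _/<name>/<rest> to its subdomain URL,
    or return None when the path is not inside the _ directory."""
    match = DIRECT_REQUEST.match(path)
    if not match:
        return None
    name, rest = match.groups()
    return "http://{}.automagick.us{}".format(name, rest)
```

A request for _/test1/css/site.css on the root domain maps to http://test1.automagick.us/css/site.css, while a path outside _ (such as index.html) is left alone.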
That's it! With this configuration in place, all you need to do is create the _ directory and begin populating it with content for your dynamic subdomains. When you want to stop serving one, simply remove the subdirectory. Sweet!
Final Thoughts: The Good And The Bad
We already know what's good about this setup -- once in place, you won't ever have to muck with Apache virtual hosts to quickly deploy a new subdomain on your webserver. You can easily move content around between the subdomains, and you won't ever have to wait for DNS to propagate a change before showing a temporary URL to a client. This approach is also great for running several iterations of the same content, with only a few changes. But there are some drawbacks as well:
- It's slow. mod_rewrite is a very efficient little module, and is capable of doing an awful lot of work very quickly, but it will never be as fast as simply serving up direct requests. Worse still, the two directory tests in our rewrite conditions mean up to two additional stat() calls for every request, and filesystem access is very expensive. This is definitely not a configuration you'd want serving a production website with even moderate traffic.
- Directories are not virtual hosts. This may seem obvious, but it's worth pointing out: by serving all of your subdomain content out of the same <Directory> directive, you lose an awful lot of flexibility in configuring each host.
- It might confuse your designers, especially if they use fancy GUI-based web editors that like to suck up entire directory trees and assume they're all content for the same site.
- Wildcard DNS can be dangerous. This one has the most potential for disaster: by making your box answer to every conceivable name on your subdomain, your risk of exposure to spam, DoS, script kiddies and other internet gremlins multiplies by orders of magnitude. Any box that answers to wildcard DNS names had better have a big thick firewall in front of it.
Despite these gotchas, this approach can be very handy for rapidly prototyping webapps, testing new content, or serving low-bandwidth resources on their own URL just for the Hell of it.
License
All source code, tools and scripts on http://automagick.us is Copyright © 2007 - 2010 Greg Boyington, and licensed under a Creative Commons Attribution-Share Alike 3.0 United States License, except where otherwise noted.
