Moving Sentry from Heroku to Hardware

Update: Don’t decide against Heroku just because you’ve read my blog. It makes some things (especially
prototyping) very easy, and with certain kinds of applications it can work very well.

I’ve talked a lot about how I run, mostly with my experiences on Heroku and how I switched to leased servers.
Many people have consistently suggested that operations work is difficult, so they shouldn’t deal with it themselves.
I’m not going to tell you that my roommate, Mike Clarke, one of the few operations people we have at Disqus, has it
easy, but I’d like to give you a little bit of food for thought.

GetSentry started around Christmas of 2011. I had already built and open sourced Sentry at Disqus, and the idea was to
take that work and create a Heroku AddOn out of it. The pitch was that I could make a little bit of money on the side
simply by hosting Sentry for people. About three months later I had that prototype hosting service running on Heroku,
accepting payments both via the AddOn infrastructure and on my own using the amazing Stripe platform.

Let’s fast forward to today. I no longer run any servers on Heroku (or any cloud provider, other than S3 for backups),
and instead I lease servers. Now the company I lease from is what most people would call a “budget provider”. They’re
extremely cheap (they don’t add extreme margins to the cost of the machines you’re leasing), and they do absolutely
nothing for you. It’s not for the faint of heart. That said, it’s also how I can get away with very low costs.

I’m going to tell you a bit of a story of how I switched from Heroku to fully configured leased servers in less than
a week, in my free time. I’m also going to try to convince you that it’s really not that complicated.

The First Server

This part could be more appropriately titled “Learning Chef”. I’m fortunate to have some awesome coworkers, and even
more fortunate that when I was making this transition I had access to my roommate to prod with questions. I’m
also extremely fortunate that mediums like Google, IRC, and Twitter exist for any other questions I ever have.

The first task in getting my prototype web server online was to get it all configured. I could have taken the
old fashioned approach of creating a few config files locally (maybe under version control) and then sending them up
to the server, as well as manually installing whatever packages I needed (nginx, memcached, etc.), but with Puppet and
Chef becoming all the rage I figured it was as good a time as any to dig into one.

I decided to use the Chef hosted service, and after a few bumps with figuring out what all this Ruby stuff was about,
I had managed to get a basic understanding of roles and cookbooks. After quite a bit of fiddling I had created a
cookbook specific to getsentry (which holds things like setting up various paths), and a bunch of generic ones,
like apt, nginx, memcached, python, etc.

Creating a Recipe

The meat of this was handled via Chef’s awesome roles, and wiring up a few things in the ‘default’ recipe of getsentry:

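include_recipe "python"

# Shared parent directory for everything that gets deployed.
directory "/srv/www" do
  owner "root"
  group "root"
  mode "0755"
  action :create
end

# Application directory, owned by my own user.
directory "/srv/www/" do
  owner "dcramer"
  group "dcramer"
  mode "0755"
  action :create
end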

This formed the basis of any server that I would be running, and simply set up a couple of directories. I also simply
gave ownership to my user, as I’m the only one working on the project, and didn’t need the added complexities of build
or system users.

I then moved on to a second recipe, which formed the basis of a web node. This one has a lot more to it, as it needed
to configure nginx and memcache at the start:

include_recipe "getsentry"
include_recipe "supervisor"

# Render the nginx site config and reload nginx whenever it changes.
template "#{node[:nginx][:dir]}/sites-available/" do
  source "nginx/getsentry.erb"
  owner "root"
  group "root"
  mode "0644"
  notifies :reload, "service[nginx]"
end

nginx_site ""

# Two gunicorn processes, each managed by supervisor.
supervisor_service "web-1" do
  directory "/srv/www/"
  command "/srv/www/ run_gunicorn -b -w #{node[:getsentry][:web][:workers]}"
  environment "DJANGO_CONF" => node[:django_conf]
  user "dcramer"
end

supervisor_service "web-2" do
  directory "/srv/www/"
  command "/srv/www/ run_gunicorn -b -w #{node[:getsentry][:web][:workers]}"
  environment "DJANGO_CONF" => node[:django_conf]
  user "dcramer"
end

There is a bit more to it than what I’ve shown, but all in all it was pretty simple. It just took me a bit to understand
how Chef functioned. I’m now an engineer that has experience in Chef, even if it’s very little. From
my perspective (on the hiring end at Disqus), that’s an awesome addition to an engineer’s skillset.

Once the web server was online, all I had to do was configure a primary database server. I simply brought up another
node, gave it a new role (db), and didn’t even need to create a custom recipe (I simply reused the existing pgbouncer,
postgresql, and redis recipes available elsewhere on the internet).
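
For illustration only, a db role written in Chef’s Ruby role DSL might look roughly like the sketch below. This is not my actual role file: the file name and the exact recipe names are assumptions, and they depend on which community cookbooks you install and how each one names its recipes.

```ruby
# roles/db.rb: hypothetical role file, shown only as a sketch.
name "db"
description "Primary database node (PostgreSQL, pgbouncer, Redis)"

# Recipe names below are assumptions; check the cookbooks you actually use.
run_list(
  "recipe[getsentry]",   # base recipe shared by every server
  "recipe[postgresql]",
  "recipe[pgbouncer]",
  "recipe[redis]"
)
```

With a file like this, `knife role from file roles/db.rb` uploads the role, and adding `role[db]` to a node’s run list applies it on the next Chef run.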

Operational Complexity

I stated in the beginning that I completed this process in less than a week. From Heroku to hardware it took me about
three evenings of toying with Chef (mostly more complex components, like iptables and building a deploy script). What
I really want to point out is that I have never been in an operations position. I’ve definitely configured
servers (à la apt-get install nano), and know my way around, especially with a database, but most of this was fairly
new to me.

The continued argument of it being “too difficult” to run your own servers is quite the overstatement, but it’s not
something you should ignore. There are many things I have to be concerned about, most importantly data loss and the
ability to recover in the event of a disaster on my machines. These also aren’t overly complex challenges to handle.

Data redundancy is handled by a simple cron script that does nightly backups to S3. It’s literally just a script that
calls pg_dump and s3cmd to send the files upstream. Now that’s not enough for any real requirements, so step two is
simply setting up replication on your database node to a second server, even if that server is your application server.
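
The nightly backup described above could be sketched in Ruby along these lines. This is not the script I actually run: the database name, staging directory, bucket, and the `--run` switch are all hypothetical, and the only external calls it relies on are `pg_dump` (custom-format dump) and `s3cmd put`.

```ruby
#!/usr/bin/env ruby
# Nightly PostgreSQL backup to S3: a minimal sketch under stated assumptions.
require "fileutils"

DB_NAME    = "sentry"                    # hypothetical database name
BACKUP_DIR = "/var/backups/postgres"     # hypothetical local staging directory
S3_BUCKET  = "s3://example-backups/pg"   # hypothetical bucket and prefix

# Build a dated dump path, e.g. <dir>/<db>-YYYY-MM-DD.dump
def dump_path(db, dir, now = Time.now)
  File.join(dir, "#{db}-#{now.strftime('%Y-%m-%d')}.dump")
end

# pg_dump in custom format, which pg_restore can read back.
def pg_dump_command(db, path)
  ["pg_dump", "--format=custom", "--file=#{path}", db]
end

# s3cmd put uploads a local file to the given S3 URI.
def s3_upload_command(path, bucket)
  ["s3cmd", "put", path, "#{bucket}/#{File.basename(path)}"]
end

# Run a command and exit non-zero on failure so cron reports the error.
def run!(cmd)
  system(*cmd) or abort("backup step failed: #{cmd.join(' ')}")
end

path  = dump_path(DB_NAME, BACKUP_DIR)
steps = [pg_dump_command(DB_NAME, path), s3_upload_command(path, S3_BUCKET)]

if ARGV.include?("--run")
  FileUtils.mkdir_p(BACKUP_DIR)
  steps.each { |cmd| run!(cmd) }
  FileUtils.rm_f(path) # drop the local copy once the upload has succeeded
else
  # Dry run by default: only show what would be executed.
  steps.each { |cmd| puts cmd.join(" ") }
end
```

A crontab entry such as `0 3 * * * /usr/local/bin/pg_backup.rb --run` (path and time are placeholders) would run it nightly. Defaulting to a dry run is a deliberate choice here, so that running the script by hand never touches the database or the bucket by accident.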

Availability is the second big problem, and is easily handled the same way that you avoid losing your database: have
a second server. This again can be a server whose primary task is something other than your application (it can
be your database). It doesn’t have to be a permanent location for it. It only has to survive until a primary server is
available or you’re willing and able to invest in more hardware.

Closing Thoughts

I spent an initial three evenings, and another week’s worth since, on server configuration and operations. There were
various problems like Postgres not being tuned well enough (pgtune is amazing by the way), DNS being slow (fuck it,
use IPs), and some more minor things that needed addressing throughout that time. All in all, there are basically
zero day-to-day operations concerns, and most of the work happens when I need to expand the system (which is rare).

All of it ended up being an extremely valuable learning experience, but using Chef wasn’t a necessity. I could have
done things the more “amateur” way, but I also now have the benefit of being able to bring a server online, run a few
commands, and have a machine or even a cluster identical to what’s already running.

On the limited hardware I run, that is, two servers that actually service requests (one database,
one app), we’ve serviced around 25 million requests since August 1st, doing anywhere from 500k to 2 million in a
single day. That isn’t that much traffic, but what’s important is it services those requests very quickly, and is
using very little of the resources that are dedicated to it. In the end, this means that Sentry’s revenue will grow
much more quickly than its monthly bill will.
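
To put those daily numbers in perspective, here is a quick back-of-the-envelope conversion to average requests per second. It assumes traffic is spread evenly across the day, which real traffic never is, so actual peaks will be higher than these averages.

```ruby
# Convert the daily request counts quoted above into average requests/second.
SECONDS_PER_DAY = 24 * 60 * 60 # 86,400

def average_rps(requests_per_day)
  requests_per_day.to_f / SECONDS_PER_DAY
end

[500_000, 2_000_000].each do |daily|
  puts format("%d requests/day is about %.1f requests/second", daily, average_rps(daily))
end
# prints:
# 500000 requests/day is about 5.8 requests/second
# 2000000 requests/day is about 23.1 requests/second
```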

GetSentry has been profitable since its 4th month, and currently only spends 10% of its monthly revenue on hardware and
other third-party services. That gap gets larger every month, and I’ve been more than happy to invest some of my
time to keep that gap as large as possible. The irony of it all? I’m selling a service that’s entirely open source,
yet suggesting that you run your own hardware. For some people paying more for convenience is acceptable; for others
it may not be.

Look for a future post with many more details on how I set up Chef (likely incorrectly), with more in-depth code and
configuration from the cookbooks.
