Sailing Past Dependency Hell With Docker

Have you ever been excited to tinker with a software project, only to have dependency hell ruin all of the fun? As a software consultant, I face this situation all the time.

Luckily, technologies like Docker take the pain out of spinning up additional components of your app architecture, like databases or job queues, and installing runtimes like Ruby or Python. By isolating processes from one another with very little overhead, Docker lets you sail past dependency hell.

Over the last year, I’ve worked on a number of different projects, encountered lots of different requirements, and explored all sorts of ways of working through dependency conflicts. This is the story of my search for a solution, the pros and cons of each path I explored, and how I ended up using Docker.

Life as a Consultant, Day 1

When I started working as a consultant, I looked down at my beautiful new laptop and made a promise to treat it right. I wouldn’t install a bunch of system extensions. I wouldn’t tweak too many esoteric settings. I wouldn’t bog it down with databases and other services.

Then came my first client project. Along with it came Python 2.7.6, MySQL 5.1.73, Ruby 1.9.3-p545, Node 0.10.29, Elasticsearch something-dot-something. And as I looked through the 500-line README, my heart sank.

My first reaction was to try to sidestep the dependencies. Occasionally, you can get away without a dependency. Usually, you’re just delaying it for a while. Even for an optional dependency, skipping it meant a delicate dance of stubbed-out methods and dark hallways that must not be entered. Before long, this delicate dance turned into a clumsy stumble.

Faced with the reality that ignoring these dependencies was not a long-term solution, I started to wonder why this project had so many dependencies in the first place. Certainly it had too many dependencies!

I may have been right about this one, but being right wasn’t a solution. Over time, I might be able to steer the project away from some of its dependencies, but that wouldn’t do me any good at the moment.

I was ready to bite the bullet, beg forgiveness from my new laptop, and install all of this stuff. But I wasn’t quite ready to give up without a fight. If I took really detailed notes, I thought, maybe I could just uninstall all of this stuff later. Welcome to the world of “Homebrew post-install notes.txt” and “List of rbenv files.txt”. Now, in addition to a 500-line README file, I had a 600-line UNINSTALL file.

I ended up with a typical developer setup: the base OS, plus all of the services needed to run my application. Or, in a picture, this:

Services running on base OS

Life as a Consultant, Day 2

After a week or so, my first project ended, and I moved on to a new client and a new project. This project had its own unique set of dependencies: Ruby 2.2.1, Redis 2.8.19, MySQL 5.6.21, Elasticsearch something-dot-something-else.

Realizing that installing every dependency of every client project directly on my laptop would not be sustainable in the long run, I used Vagrant to create a virtual machine to install these dependencies on.

This new setup turned out to be pretty nice. Instead of services installed directly on my laptop, I had them conveniently tucked away in a virtual machine, like this:

Services running in VM

Life as a Consultant, Day 3

As I continued to move to new client projects, I accumulated a collection of Vagrant-based virtual machines on my laptop. From time to time, I would even find myself working on a few projects at once, all in their own nicely isolated VMs.

It turns out, though, that managing your application process in a virtual machine is kind of cumbersome. I quickly grew tired of using SSH to connect to a VM every time I wanted to run an application or see a log file. And, when I was connected, I struggled to work with the limited set of command line tools available on the VM. I settled on a middle ground where service dependencies, like Elasticsearch, MySQL, and Redis, ran on the VMs, and my code ran locally, using tools like rbenv and virtualenv to manage Ruby and Python runtimes.

My hybrid setup looked something like this:

A hybrid approach

Switching to Docker

Per-project virtual machines were working pretty well, but they were becoming a real drain on my laptop’s resources. Each project’s VM required an entire Linux OS and file system, and I often needed to have several active at once.

Then one day, a client I was working with switched to Docker for their deployment process. I had never heard of Docker before, and at first I didn’t understand how it was different from Vagrant and the virtual machines I was used to.

And then I saw what was different about Docker, and my eyes widened.

In theory at least, I could use Docker to radically change my setup. I would leave behind all of these virtual machines and live in the world I had dreamed of all along.

A world that would look like this:

Services running in Containers

The individual virtual machines with their individual copies of Linux would be gone, replaced by nothing but fully isolated services running directly on the host OS. Of course, few things are as nice in practice as they are in theory.

In practice, Docker is (currently) limited to running Linux services on a Linux host, so I still needed a Linux virtual machine. Unlike my old setup, though, I only needed a single VM. On top of that VM, I could run all of my individual services, each in a separate Docker container. The containers would be fully isolated from one another, and networked as if they were in separate VMs.

My reality now looked like this:

Services running in Containers in VM

By far the easiest way to achieve this was (and still is) with Boot2Docker. Installing this wonderful piece of software created a lightweight Linux virtual machine that was preconfigured for Docker.

Developing with Docker

My team had to put in some work to configure our project so that it could be built as a Docker image and run as a Docker container. A lot of that work was simply adopting best practices, like the twelve-factor app approach.
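
To give a flavor of what that meant for us, here is a minimal sketch of one twelve-factor habit: reading configuration from environment variables instead of hard-coding it. This is not our actual code. The function name load_settings, the variable names DATABASE_URL, REDIS_URL, and LOG_TO_STDOUT, and the default values are all made up for illustration.

```python
import os


def load_settings(env=os.environ):
    """Build a settings dict from environment variables.

    Every name and default here is hypothetical; a real project
    would pick its own. The defaults stand in for development values.
    """
    return {
        # Connection strings come from the environment, so the same
        # image can run in development, testing, and production.
        "database_url": env.get("DATABASE_URL", "mysql://root@db:3306/app_development"),
        "redis_url": env.get("REDIS_URL", "redis://redis:6379/0"),
        # Environment values are strings, so flags need explicit parsing.
        "log_to_stdout": env.get("LOG_TO_STDOUT", "true").lower() == "true",
    }


if __name__ == "__main__":
    for key, value in sorted(load_settings().items()):
        print(f"{key} = {value}")
```

Because nothing environment-specific is baked into the code, the container can be pointed at a different database or queue just by changing the variables it is started with.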

With that in place, we started to organize our entire development process around Docker.

Instead of 500 lines of setup instructions, ours looked like this:

  1. Install Boot2Docker and Compose
  2. Clone the Git repository
  3. Run make

What was absent from these instructions was any mention of specific versions of MySQL, Ruby, Elasticsearch or anything else. That was all taken care of in our Docker (and Compose) configuration.

There were some ups and downs over the following months, but the effort paid off.

Setting up the environment for a new developer was easier than it had ever been before. Our unit and integration tests ran and passed reliably (sadly, that’s not always a given). When tests passed locally, you could be sure that they wouldn’t fail on the CI server due to a slightly different version of some service. We could tinker with configuration settings without fear of permanently breaking a service, because a fresh setup was just one make command away.

Moving to Deployment

Although I was initially introduced to Docker as a deployment concern, the development and testing benefits I saw made me start to think of it as a development tool. By the time we were ready to deploy our application, I had almost forgotten why we had even started using Docker. I’m so glad we had made the switch, though, because getting our application ready for deployment was a breeze.

We had to make a few changes, like switching from WEBrick to Puma, and adjusting our logging, for our application to be truly production-ready. But, even before making those changes, we were able to run our existing application image in the production environment. And, because we were running from the same Docker images that we were using for development and testing, we had those same assurances that our application wouldn’t crash for no good reason.

The biggest hurdle we faced was working through issues with our own deployment system. As early adopters, we ended up being guinea pigs for the company’s new, home-grown container-based infrastructure.

Being Nice to Your Future Self

If you’re a web application developer like me, then the whole reason you’re writing software is to (hopefully) deploy it some day. By using Docker for the whole chain, from development to testing to deployment, you can make your life easier. “Only in production” bugs are some of the worst things to debug. Anything you can do to reduce the differences between what you see day-to-day, and what you see in production, will help reduce these types of issues.

As an added bonus, when it comes time to deploy your application you’ll have some unbelievably simple options. There are a number of platform-as-a-service companies that allow you to deploy a containerized application in a matter of minutes.

Sailing On

Running supporting services, like databases and job queues, directly on your dev computer can quickly lead you to dependency hell. Strategies for avoiding that often rely on virtual machines that can take up lots of resources and slow your machine down. Docker containers aim to provide the same isolation as those VMs while using only a fraction of the resources.

Using Docker for deployment, combined with Compose for development, is a great option for managing dependencies.

There are still some limitations, though. For example, Compose doesn’t get along very well with Node modules. Disk access in development can be slow because of the Linux virtual machine between you and your containers. And sometimes the whole thing just falls apart and you have to restart the VM. You’ll need to weigh those limitations against the frustration of a trip to dependency hell for yourself. For me, it’s often worthwhile.