Speeding up Docker on Ruby projects - an experimental approach

It has been three days since I started exploring Docker with Rails, spending most of my time trying to figure out the development and production workflows (including setting up a load balancer with nginx, Registrator, and Consul because I wanted zero-downtime deployments).

Let’s take a look at this command:

bundle install

This one command slows our workflow down tremendously once we are using Docker.

Any Ruby project can be slow to build on Docker because whenever the contents of the Gemfile change, even by one or two gems, bundle install has to run all over again during Docker’s build process. That may be fine in production, but in development, where we are constantly altering the gems, we end up spending more time waiting for the image to build than working.

Here are a few approaches that I took to fix that:

First approach: utilizing Docker’s cache

# Choose the official Ruby 2.3.0 image as our starting point
FROM ruby:2.3.0

# Install build tools and a JS runtime, cleaning up in the same layer
# (a cleanup in a separate RUN would not shrink the earlier layer)
RUN apt-get update -qq && apt-get install -y build-essential nodejs \
    && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Set up working directory
RUN mkdir /app

# Copy Gemfiles and install gems
WORKDIR /tmp
COPY Gemfile Gemfile
COPY Gemfile.lock Gemfile.lock
RUN bundle install

# Everything up to here was cached. This includes
# the bundle install, unless the Gemfiles changed.

# Change back to app directory and copy the application code
# (the COPY is optional if we mount local volumes to /app)
WORKDIR /app
COPY . /app

# Start the server
CMD ["rails", "server", "-b", "0.0.0.0"]

That is Docker’s layer cache at work, but as soon as the Gemfile or Gemfile.lock changes, the checksum Docker computes for the copied files no longer matches and the cache is busted: bundle install runs from scratch. There’s no way to avoid this, since we are constantly adding or removing gems during development.
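To make the invalidation concrete, here is a minimal sketch that fingerprints the Gemfile and Gemfile.lock the way a cache key would. It is only an illustration: Docker computes its own checksums internally, and gemfile_fingerprint and cache_would_hit are hypothetical helper names, not Docker or Bundler commands.

```shell
#!/bin/bash
# Illustration only: Docker computes its own per-file checksums internally.
# gemfile_fingerprint and cache_would_hit are hypothetical helpers.

# Print one checksum covering the contents of every file passed in.
gemfile_fingerprint() {
  cat "$@" | cksum | cut -d' ' -f1
}

# Succeed (exit 0) when the files still match a previously saved
# fingerprint, i.e. when a cached "bundle install" layer could be reused.
cache_would_hit() {
  saved="$1"
  shift
  [ "$(gemfile_fingerprint "$@")" = "$saved" ]
}
```

Saving the fingerprint before editing the Gemfile and calling cache_would_hit afterwards shows that even a one-gem change produces a different value, which is why the bundle install layer is rebuilt.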

Second approach: Gemfile hack

This approach is very similar to the first one. This time, we add a new Gemfile each time we add gems, so that the cache for the original Gemfile is not busted (it usually contains the largest number of gems).

# Choose the official Ruby 2.3.0 image as our starting point
FROM ruby:2.3.0

# Install build tools and a JS runtime, cleaning up in the same layer
# (a cleanup in a separate RUN would not shrink the earlier layer)
RUN apt-get update -qq && apt-get install -y build-essential nodejs \
    && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Set up working directory
RUN mkdir /app

# Copy Gemfiles and install gems
WORKDIR /tmp
COPY Gemfile Gemfile
COPY Gemfile.lock Gemfile.lock
RUN bundle install

# Everything up to here was cached. This includes
# the bundle install, unless the Gemfiles changed.

# Hack
RUN mkdir /tmp2
WORKDIR /tmp2
COPY Gemfile2 Gemfile
COPY Gemfile2.lock Gemfile.lock
RUN bundle install

## Now everything up to here was cached.

# Hack again?
RUN mkdir /tmp3
WORKDIR /tmp3
COPY Gemfile3 Gemfile
COPY Gemfile3.lock Gemfile.lock
RUN bundle install

## Now everything up to here was cached (again)?

# Change back to app directory and copy the application code
# (the COPY is optional if we mount local volumes to /app)
WORKDIR /app
COPY . /app

# Start the server
CMD ["rails", "server", "-b", "0.0.0.0"]

This definitely wasn’t a good solution – too much extra work was involved.

Third approach: Data-only containers

Since data inside a container is not persisted from one build to the next, we’ll extract the gems’ data into a separate container and mount that data container into our app. By default, gems are installed in /usr/local/bundle. We will now mount a custom volume at that directory through a data-only container.

Dockerfile

# Choose the official Ruby 2.3.0 image as our starting point
FROM ruby:2.3.0

# Install build tools, nokogiri dependencies and a JS runtime,
# cleaning up in the same layer
RUN apt-get update -qq && apt-get install -y build-essential libxml2-dev libxslt1-dev nodejs \
    && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Set up working directory
RUN mkdir /app
WORKDIR /app

# Set up bundle environment
ENV BUNDLE_JOBS=3 BUNDLE_GEMFILE=/app/Gemfile

docker-compose.yml

# Main application
web:
  container_name: rails_web
  build: .
  command: bash ./start.sh
  volumes:
    - .:/app
  ports:
    - "80:3000"
  volumes_from:
    - bundle_store

# Data-only container for bundler data
bundle_store:
  container_name: rails_bundle_store
  image: cogniteev/echo
  command: echo 'Data Container for Bundler'
  volumes:
    - bundle_store:/usr/local/bundle

start.sh

#!/bin/bash

bundle check || bundle install

rm -f /app/tmp/pids/server.pid

rails s -b 0.0.0.0

All we need to do is:

# execute these for the first time only
docker volume create bundle_store
docker-compose run --rm web bundle install
docker-compose run --rm web rake db:create
docker-compose run --rm web rake db:migrate

# execute this to start up containers in daemon mode
docker-compose up -d

# follow the console log
docker-compose run --rm web tail -f log/development.log

# accessing the file system
docker-compose run --rm web bash

If we want to avoid typing the full commands every time, we can use an alias.

.bash_profile

alias web="docker-compose run --rm web"

The --rm flag is needed so that the one-off containers are removed once their command has finished.

With the alias in place, we can simply run:

web bundle install
web rake db:create
web rake db:migrate
web bash
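An alias is only expanded in interactive shells, so a shell function is a possible alternative when the same shortcut is needed inside scripts. The following is a minimal sketch under that assumption; COMPOSE_BIN is a hypothetical override introduced here for illustration and defaults to the docker-compose command used throughout this post.

```shell
#!/bin/bash
# Sketch of a function-based alternative to the alias.
# COMPOSE_BIN is a hypothetical variable, not something docker-compose reads;
# it only lets us swap in another binary (or a stub) for the real command.
web() {
  # --rm removes the one-off container once the command exits.
  "${COMPOSE_BIN:-docker-compose}" run --rm web "$@"
}
```

It is called the same way as the alias (web bundle install, web rake db:migrate, web bash). Define it instead of the alias rather than alongside it, since both use the name web.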

Here’s an interesting read on the drawbacks of using this method: docker + ruby gems - a failed experiment.

Note: All three approaches above are for development mode only. Although the typical workflow for production is to upload to Docker Hub’s private registry to build, I feel that we could also apply this to production; I have not tried that yet.