Introducing Shipit

After a year of internal use, we’re excited to open-source our deployment tool, Shipit.

With dozens of teams pushing code multiple times a day to a variety of targets, fast and easy deploys are key to developer productivity (and happiness) at Shopify. Along with key improvements to our infrastructure, Shipit plays a central role in making this happen.

Motivation

Coordinating frequent deployments in a large development team poses a number of challenges. You need to ensure that no one else is currently deploying, that the revision you are about to deploy has been successfully tested on CI, and even that there is no ongoing maintenance operation. With smaller teams, it’s easy to give developers deployment access via something like Capistrano, but giving production credentials to many people can become a problem. Additionally, keeping an audit trail of when specific commits were deployed is indispensable when debugging production issues.

The first version of Shipit was built in early 2013 using an open source project by Rackspace called Dreadnot. It allowed our developers to deploy their code without needing to request credentials, understand the production hardware, or involve the Operations team. While this worked well initially, we soon hit limitations. Configuring new projects was difficult, which meant few projects were using it, and we ran into performance problems as well.

During Hack Days, we undertook a complete rewrite to address these problems. Here are some of the improvements we've made:

Synchronization

To ensure code isn’t deployed during incidents or system maintenance, safety mechanisms need to be in place. Shipit allows developers to set a lock that prevents other developers from deploying when it’s unsafe to do so.
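To make the idea concrete, here is a minimal sketch of a deploy lock in Ruby. It is not taken from Shipit's source: the DeployLock class, its method names, and the in-memory storage are all assumptions made for illustration, and a real tool would persist the lock and record who set it.

```ruby
# Hypothetical sketch of a deploy lock. Class and method names are
# illustrative assumptions, not Shipit's actual API.
class DeployLock
  LockedError = Class.new(StandardError)

  attr_reader :locked_by, :reason

  def initialize
    @mutex = Mutex.new
    @locked_by = nil
    @reason = nil
  end

  # Record who locked deploys and why.
  def lock!(user:, reason:)
    @mutex.synchronize do
      @locked_by = user
      @reason = reason
    end
  end

  # Clear the lock so deploys can resume.
  def unlock!
    @mutex.synchronize do
      @locked_by = nil
      @reason = nil
    end
  end

  def locked?
    @mutex.synchronize { !@reason.nil? }
  end

  # Yield the revision to the caller's deploy block, or raise if a lock is set.
  def deploy(revision)
    raise LockedError, "Deploys locked by #{locked_by}: #{reason}" if locked?
    yield revision
  end
end
```

With this sketch, calling lock!(user: "alice", reason: "database maintenance") makes every later deploy call raise LockedError with the reason in the message until unlock! is called.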

Easy setup

To make configuration easier, we adopted a model similar to Travis CI and introduced the notion of a shipit.yml file. This allows deployment recipes to be maintained within each project and kept under version control. Here's the one used to ship Shipit:


We don't need to specify a deployment command - Shipit will infer the necessary steps to deploy to PyPI, RubyGems, and anything using Capistrano.

Audit trail

Shipit keeps logs and metadata of all the deploys and rollbacks performed.

Better performance

We were able to make gains in performance and simplicity by adopting GitHub as the source of truth. Instead of constantly updating a local copy of the repository and polling for CI status, Shipit relies on GitHub's push and status webhook events. This has the added benefit of playing well with the plethora of third-party services that use the Statuses API (including our Docker image builder).
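As a rough illustration of the webhook-driven approach, the sketch below keeps an in-memory record of commits and their CI statuses, updated only from incoming events. It is not Shipit's implementation: CommitTracker, its methods, and the default branch name are assumptions. The payload fields it reads (ref and commits[].id for push events; sha, context, and state for status events) are the ones GitHub documents for those webhooks.

```ruby
require "json"

# Hypothetical in-memory model. CommitTracker and its method names are
# illustrative assumptions, not Shipit's API.
class CommitTracker
  def initialize(branch: "master") # the branch name is an assumption
    @ref = "refs/heads/#{branch}"
    @statuses = {} # sha => { status context => state }
  end

  # event_name is the value of the X-GitHub-Event request header;
  # body is the raw JSON payload. Other event types are ignored.
  def handle(event_name, body)
    payload = JSON.parse(body)
    case event_name
    when "push"   then record_push(payload)
    when "status" then record_status(payload)
    end
  end

  def known?(sha)
    @statuses.key?(sha)
  end

  # Deployable once at least one status has been reported and every
  # reported status is "success".
  def deployable?(sha)
    states = @statuses.fetch(sha, {}).values
    !states.empty? && states.all? { |state| state == "success" }
  end

  private

  # Track new commits, but only for the branch being deployed.
  def record_push(payload)
    return unless payload["ref"] == @ref
    payload.fetch("commits", []).each { |commit| @statuses[commit["id"]] ||= {} }
  end

  # Keep the latest state for each status context of a commit.
  def record_status(payload)
    (@statuses[payload["sha"]] ||= {})[payload["context"]] = payload["state"]
  end
end
```

Because state changes only when GitHub sends an event, nothing in this sketch needs to fetch the repository or poll a CI server, and any service that reports through the Statuses API is taken into account automatically.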

Deployment

Before deployments, Shipit allows you to display key metrics or add a checklist that should be followed. We’ve also made it possible to write visualizations that are shown during deployments, making it easy to monitor progress and abort the deployment if need be.

Shipit and You

We've put together a detailed README to get you started. Please report any issues you might run into, and feel free to submit bug fixes and improvements via pull request.

At Shopify, Shipit deploys over 200 projects (including itself) at the press of a button - something we do several hundred times daily. It handles a broad set of general purpose tasks ranging from updating DNS configurations to publishing new versions of Python eggs, and deploys our applications to Heroku, EC2 and our datacenters. Over time, we've learned from it and improved it, and we’re excited to share it with the community.

Continue reading

Secrets at Shopify - Introducing EJSON

This is a continuation of our series describing our evolution of Shopify toward a Docker-powered, containerized data centre. Read the last post in the series here.

One of the challenges along the road to containerization has been establishing a way to move application secrets like API keys, database passwords, and so on into the application in a secure way. This post explains our solution, and how you can use it with your own projects.

Continue reading

Announcing go-lua

Today, we’re excited to release go-lua as an open source project. Go-lua is an implementation of the Lua programming language written purely in Go. We use go-lua as the core execution engine of our load generation tool. This post outlines its creation, provides examples, and describes some challenges encountered along the way.

Continue reading

There's More to Ruby Debugging Than puts()

"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." - Brian W. Kernighan

Debugging is always challenging, and as programmers we can easily spend a good chunk of every day just trying to figure out what is going on with our code. Where exactly has a method been overwritten or defined in the first place? What does the inheritance chain look like for this object? Which methods are available to call from this context?

This article will take you through some under-utilized convenience methods in Ruby which will make answering these questions a little easier.

    Continue reading

    Building Year in Review 2014 with SVG and Rails

    As we have for the past 3 years, Shopify released a Year in Review to highlight some of the exciting growth and change we’ve observed over the past year. Designers James and Veronica had ambitious ideas for this year’s review, including strong, bold typographic treatments and interactive data visualizations. We’ve gotten some great feedback on the final product, as well as some curious developers wondering how we pulled it off, so we’re going to review the development process for Year in Review and talk about some of the technologies we leveraged to make it all happen.

    Continue reading

    Building and Testing Resilient Ruby on Rails Applications

    Black Friday and Cyber Monday are the biggest days of the year at Shopify with respect to every metric. As the Infrastructure team started preparing for the upcoming seasonal traffic in the late summer of 2014, we were confident that we could cope, and determined resiliency to be the top priority. A resilient system is one that functions with one or more components being unavailable or unacceptably slow. Applications quickly become intertwined with their external services if not carefully monitored, leading to minor dependencies becoming single points of failure.

    For example, the only part of Shopify that relies on the session store is user sign-in - if the session store is unavailable, customers can still purchase products as guests. Any other behaviour would be an unfortunate coupling of components. This post is an overview of the tools and techniques we used to make Shopify more resilient in preparation for the holiday season.

    Continue reading

    Tuning Ruby's Global Method Cache

    I was recently profiling a production Shopify application server using perf and noticed a fair amount of time being spent in a particular function, st_lookup, which is used by Ruby’s MRI implementation for hash table lookups:

    Hash tables are used all over MRI, and not just for the Hash object; global variables, instance variables, classes, and the garbage collector all use MRI’s internal hash table implementation, st_table. Unfortunately, what this profile did not show were the callers of st_lookup. Is this some application code that has gone wild? Is this an inefficiency in the VM?

    Continue reading

    Docker at Shopify: How we built containers that power over 100,000 online shops

    This is the second in a series of blog posts describing our evolution of Shopify toward a Docker-powered, containerized data center. This instalment will focus on the creation of the container used in our production environment when you visit a Shopify storefront.

    Read the first post in this series here.

    Why containerize?

    Before we dive into the mechanics of building containers, let's discuss motivation. Containers have the potential to do for the datacenter what consoles did for gaming. In the early days of PC gaming, each game typically required video or sound driver massaging before you got to play. Gaming consoles, however, offered a different experience:

    • predictable: cartridges were self-contained fun: always ready-to-run, with no downloads or updates.
    • fast: cartridges used read-only memory for lightning fast speeds.
    • easy: cartridges were robust and largely child-proof - they were quite literally plug-and-play.

    Predictable, fast, and easy are all good things at scale. Docker containers provide the building blocks to make our data centers easier to run and more adaptable by placing applications into self-contained, ready-to-run units much like cartridges did for console games.

    Continue reading

    Rebuilding the Shopify Admin: Improving Developer Productivity by Deleting 28,000 lines of JavaScript

    This September, we quietly launched a new version of the Shopify admin. Unlike the launch of the previous major iteration of our admin, this version did not include a major overhaul of the visual design, and would have gone largely unnoticed by users.

    Why would we rebuild our admin without providing any noticeable differences to our users? At Shopify, we strongly believe that any decision should be open to question at any time. In late 2012, we started to question whether our framework was still working for us. This post will discuss the problems in the previous version of our admin, and how we decided that it was time to switch frameworks.

    Continue reading

    Building an Internal Cloud with Docker and CoreOS

    This is the first in a series of posts about adding containers to our server farm to make it easier to scale, manage, and keep pace with our business.  

    The key ingredients are:

    • Docker: container technology for making applications portable and predictable
    • CoreOS: provides a minimal operating system, systemd for orchestration, and Docker to run containers

    Shopify is a large Ruby on Rails application that has undergone massive scaling in recent years. Our production servers are able to scale to over 8,000 requests per second by spreading the load across 1,700 cores and 6 TB of RAM.

    Continue reading

    Kafka Producer Pipeline for Ruby on Rails

    In early fall, our infrastructure team was considering Kafka, a highly available message bus. We were looking to solve several infrastructure problems that had come up around that time.

    • We were looking for a reliable way to collect event data and send it to our data warehouse.

    • We were considering a more service-oriented architecture, and needed a standardized way of message passing between the components.

    • We were starting to evaluate containerization of Shopify, and were searching for a way to get logs out of containers.

    We were intrigued by Kafka due to its highly available design. However, Kafka runs on the JVM, and its primary user, LinkedIn, runs a full JVM stack. Shopify is mainly Ruby on Rails and Go, so we had to figure out how to integrate Kafka into our infrastructure.

    Continue reading