Programming

Announcing go-lua

Today, we’re excited to release go-lua as an Open Source project. Go-lua is an implementation of the Lua programming language written purely in Go. We use go-lua as the core execution engine of our load generation tool. This post outlines its creation, provides examples, and describes some challenges encountered along the way.

Continue reading

Building Year in Review 2014 with SVG and Rails

As we have for the past 3 years, Shopify released a Year in Review to highlight some of the exciting growth and change we’ve observed over the past year. Designers James and Veronica had ambitious ideas for this year’s review, including strong, bold typographic treatments and interactive data visualizations. We’ve gotten some great feedback on the final product, as well as some curious developers wondering how we pulled it off, so we’re going to review the development process for Year in Review and talk about some of the technologies we leveraged to make it all happen.

Continue reading

Building and Testing Resilient Ruby on Rails Applications

Black Friday and Cyber Monday are the biggest days of the year at Shopify with respect to every metric. As the Infrastructure team started preparing for the upcoming seasonal traffic in the late summer of 2014, we were confident that we could cope, and determined resiliency to be the top priority. A resilient system is one that functions with one or more components being unavailable or unacceptably slow. Applications quickly become intertwined with their external services if not carefully monitored, leading to minor dependencies becoming single points of failure.

For example, the only part of Shopify that relies on the session store is user sign-in - if the session store is unavailable, customers can still purchase products as guests. Any other behaviour would be an unfortunate coupling of components. This post is an overview of the tools and techniques we used to make Shopify more resilient in preparation for the holiday season.

Continue reading

Rebuilding the Shopify Admin: Improving Developer Productivity by Deleting 28,000 lines of JavaScript

This September, we quietly launched a new version of the Shopify admin. Unlike the launch of the previous major iteration of our admin, this version did not include a major overhaul of the visual design, and for the most part, would have gone largely unnoticed by the user.

Why would we rebuild our admin without providing any noticeable differences to our users? At Shopify, we strongly believe that any decision should be able to be questioned at any time. In late 2012, we started to question whether our framework was still working for us. This post will discuss the problems in the previous version of our admin, and how we decided that it was time to switch frameworks.

Continue reading

Building a Rack middleware

I'm Chris Saunders, one of Shopify's developers. I like to keep journal entries about the problems I run into while working on the various codebases within the company.

Recently we ran into an issue with authentication in one of our applications, and as a result I ended up learning a bit about Rack middleware. I feel that the experience was worth sharing with the world at large, so here's a rough transcription of my entry. Enjoy!

Continue reading

IdentityCache: Improving Performance one Cached Model at a Time

 
A month ago Shopify was at BigRubyConf where we mentioned an internal library we use for caching ActiveRecord models called IdentityCache. We're pleased to say that the library has been extracted out of the Shopify code base and has been open sourced!
 
At Shopify, our core application has been database performance bound for much of our platform’s history. That means that the most straightforward way of making Shopify more performant and resilient is to move work out of the database layer. 
 
For many applications, achieving a very high cache ratio is a matter of storing full cached response bodies and versioning them based on the associated records in the database, always serving the most current version and relying on the cache’s LRU algorithm for expiration.
 
That technique, called a “generational page cache”, is well proven and very reliable. However, part of Shopify’s value proposition is that store owners can heavily customize the look and feel of their shops. We in fact offer a full-fledged templating language.
 
As a side effect, full page static caching is not as effective as it would be in most other web platforms, because we do not have a deterministic way of knowing what database rows we’ll need to fetch on every page render. 
 
The key metric driving the creation of IdentityCache was our master database’s query rate (queries per second or per minute), and thus the goal was to reduce read operations reaching the database as much as possible. IdentityCache does this by moving the workload to Memcached instead.
 
The inability of a full page cache to take load away from the database becomes even more evident during write-heavy (and thus page-cache-expiring) events like Cyber Monday and flash sales. On top of that, the traffic on our web app servers typically doubles each year, and we invested heavily in building out IdentityCache to help absorb this growth. For instance, during the last pre-IdentityCache sales peak in 2012, we saw 130,000 requests per minute generating 21,000 queries per second. In comparison, the latest flash sale, in April 2013, generated 203,000 requests with only 14,500 queries per second.
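To put those figures on a common scale, here is a rough back-of-the-envelope calculation of database queries per web request. It assumes the 2013 request figure is also per minute, which the text implies but does not state; the helper name is made up for this illustration.

```ruby
# Rough arithmetic sketch. Assumption: both request figures are per minute.
def queries_per_request(queries_per_second, requests_per_minute)
  # Convert queries/second to queries/minute, then divide by requests/minute.
  (queries_per_second * 60.0 / requests_per_minute).round(1)
end

before_identity_cache = queries_per_request(21_000, 130_000)
with_identity_cache   = queries_per_request(14_500, 203_000)

puts before_identity_cache  # 9.7
puts with_identity_cache    # 4.3
```

Under that assumption, each request cost roughly ten database queries before IdentityCache and a little over four afterwards.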

What Exactly is IdentityCache?

IdentityCache is a read-through cache for ActiveRecord models. When reading records from the cache, IdentityCache will try to fetch the requested object from Memcached. If the cache entry doesn't exist, IdentityCache will load the object from the database and store it in Memcached, so the cached copy is available for subsequent reads and they avoid further trips to the database. This behaviour is key during events that expire the cache often.
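The read-through pattern itself can be sketched in a few lines of plain Ruby. This is only an illustration of the idea, not IdentityCache's implementation: a Hash stands in for Memcached, another Hash stands in for the database, and the class name and key format are invented for the example.

```ruby
# Minimal read-through cache sketch. The Hashes standing in for Memcached
# and the database, and the "product:<id>" key format, are assumptions.
class ReadThroughCache
  attr_reader :db_reads

  def initialize(database)
    @database = database # id => record, stand-in for the database
    @cache = {}          # stand-in for Memcached
    @db_reads = 0
  end

  def fetch(id)
    key = "product:#{id}"
    return @cache[key] if @cache.key?(key) # cache hit: no database trip

    @db_reads += 1
    record = @database[id]                  # cache miss: read the database
    @cache[key] = record unless record.nil? # populate for later reads
    record
  end
end

store = ReadThroughCache.new(1 => { title: "Snowboard" })
store.fetch(1) # miss: reads the database and fills the cache
store.fetch(1) # hit: served from the cache
puts store.db_reads # 1
```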
 
Expiration is explicit and does not rely on Memcached's LRU. It is also automatic: objects are expired from the cache by issuing a Memcached delete command as they change in the database, via after_commit hooks. This works because, given a row in the database, we can always calculate its cache key based on the current table schema and the row’s id. There is no need for the user to ever call delete themselves. It was a conscious decision to take expiration away from day-to-day developer concerns.
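As a rough illustration of why a deterministic key makes automatic expiration possible, the sketch below derives a key from a model name, a digest of the column list, and the row id, and deletes that key on every write. The key format, the cache_key helper, and the CachedTable class are all hypothetical; IdentityCache's real key scheme and hooks may differ.

```ruby
require "digest"

# Hypothetical key scheme: model name, short digest of the schema, row id.
def cache_key(model_name, columns, id)
  schema_digest = Digest::MD5.hexdigest(columns.sort.join(","))[0, 8]
  "#{model_name}:#{schema_digest}:#{id}"
end

class CachedTable
  COLUMNS = %w[id title handle].freeze

  def initialize
    @rows = {}  # stand-in for the database table
    @cache = {} # stand-in for Memcached
  end

  def fetch(id)
    key = cache_key("product", COLUMNS, id)
    @cache[key] ||= @rows[id]
  end

  # Plays the role of an after_commit hook: each write deletes the cached
  # entry, so callers never have to expire anything themselves.
  def write(id, attrs)
    @rows[id] = attrs
    @cache.delete(cache_key("product", COLUMNS, id))
  end
end

table = CachedTable.new
table.write(1, { title: "Snowboard" })
table.fetch(1)                    # caches the row
table.write(1, { title: "Skis" }) # expires the cached row
puts table.fetch(1)[:title]       # Skis
```

Because the key depends only on the schema and the id, the write path can always work out which entry to delete without any bookkeeping.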
 
This has been a huge help as the characteristics of our application and Rails have changed. One great example of this is how Ruby on Rails changed what actions would fire after_commit hooks. For instance, in Rails 3.2, touch will not fire an after_commit. Instead of having to add expiry calls and think about all the possible ramifications every time, we added the after_touch hook into IdentityCache itself.
 
Aside from the default key, built from the schema and the row id, IdentityCache uses developer defined indexes to access your models. Those indexes simply consist of keys that can be created deterministically from other row fields and the current schema. Declaring an index will also add a helper method to fetch your cached models using said index.
 
IdentityCache is opt-in, meaning developers need to explicitly specify what should be indexed and explicitly ask for data from the cache. It is important that developers don’t have to guess whether calling a method will return a cached entry or not.
 
We think this is a good thing. Having caching hook in automatically is nice in its simplest form. However, IdentityCache wasn't built for simple applications; it was built for large, complicated applications where you want, and need, to know what's going on.

Down to the Numbers

If that wasn’t good enough, here are some numbers from Shopify itself.
 
 
This is an example of when we introduced IdentityCache to one of the objects that is heavily hit on the shop storefronts. As you can see, we cut out thousands of calls to the database when accessing this model. This was huge, since the database is one of the most heavily contended components of Shopify.
 
 
This example shows similar results once IdentityCache was introduced. We eliminated what was approaching 50K calls per minute (which was growing steadily) to almost nothing since the subscription was now being embedded with the Shop object. Another huge win from IdentityCache.

Specifying Indexes

Once you include IdentityCache into your model, you automatically get a fetch method added to your model class. Fetch will behave like find plus the read-through cache behaviour.
 
You can also add other indexes to your models so that you can load them using a different key. Here are a few examples:

class Product < ActiveRecord::Base
  include IdentityCache
end

Product.fetch(id)

class Product < ActiveRecord::Base
  include IdentityCache
  cache_index :handle
end

Product.fetch_by_handle(handle)

We’ve tried to make IdentityCache as simple as possible to add to your models. For each cache index you add, you end up with a fetch_* method on the model to fetch those objects from the cache.
 
You can also specify cache indexes that look at multiple fields. The code to do this would be as follows:

class Product < ActiveRecord::Base
  include IdentityCache
  cache_index :shop_id, :id
end

Product.fetch_by_shop_id_and_id(shop_id, id)

Caching Associations

One of the great things about IdentityCache is that you can cache has_one, has_many and belongs_to associations as well as single objects. This really sets IdentityCache apart from similar libraries.
 
This is a simple example of caching associations with IdentityCache:

class Product < ActiveRecord::Base
  include IdentityCache
  has_many :images
  cache_has_many :images
end

@product = Product.fetch(id)
@images = @product.fetch_images

What happens here is that the product is fetched from Memcached or, on a cache miss, from the database. We then look for the images in the cache, falling back to the database if we get another miss. This also works for has_one and belongs_to associations, using the cache_has_one and cache_belongs_to IdentityCache methods, respectively.
 
What if we always want to load the images, though? Do we always need to make two requests to the cache?

Embedding Associations

With IdentityCache we can also embed the associations with the parent object, so that when you load the parent the associations are also cached and loaded on a cache hit. This avoids making multiple Memcached calls to load all the cached data. To enable this you simply need to add the ':embed => true' option. Here's a little example:

class Product < ActiveRecord::Base
  include IdentityCache
  has_many :images
  cache_has_many :images, :embed => true
end

@product = Product.fetch(id)
@images = @product.fetch_images

The main difference with this example versus the previous is that the '@product.fetch_images' call won't hit Memcached a second time; the data is already loaded when we fetch the product from Memcached.
 
There are two tradeoffs to using embed. First, your entries in Memcached will be larger, as they have to store data for the model and its embedded associations. Second, the whole cache entry will expire on changes to any of the models cached in it.
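The round-trip saving can be pictured with a small counting sketch. A Hash again stands in for Memcached, and the key names and entry layout are invented for the example rather than taken from IdentityCache.

```ruby
# Counts reads against a Hash standing in for Memcached.
# Key names and entry layout are assumptions made for illustration.
class CountingCache
  attr_reader :reads

  def initialize(entries)
    @entries = entries
    @reads = 0
  end

  def get(key)
    @reads += 1
    @entries[key]
  end
end

# Without embedding: the product and its images live under separate keys.
separate = CountingCache.new(
  "product:1" => { title: "Snowboard" },
  "product:1:images" => ["front.png", "back.png"]
)
separate.get("product:1")
separate_images = separate.get("product:1:images")

# With embedding: the images travel inside the product's own entry.
embedded = CountingCache.new(
  "product:1" => { title: "Snowboard", images: ["front.png", "back.png"] }
)
embedded_images = embedded.get("product:1")[:images]

puts separate.reads # 2
puts embedded.reads # 1
```

The embedded entry is larger and has to be thrown away whenever either the product or any image changes, which is the cost side of the tradeoff.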
 
There are a number of other options and different ways you can use IdentityCache, which are highlighted on the GitHub page: https://github.com/Shopify/identity_cache. I highly encourage anyone interested to take a look at those examples for more details. Please check it out for yourself and let us know what you think!

Continue reading

HackTO: This Saturday, April 14th in Toronto

 

Artwork based on a Creative Commons photo by John R. Southern. Click here to see the original.

HackTO is a hackathon taking place in Toronto this Saturday, April 14th, in which developers will be challenged to come up with and implement an application that takes one or more of the sponsors' APIs and does something interesting, useful or cool -- all in the space of a few hours. The APIs that you'll be able to use at HackTO include:


Shopify will be there! Developer Relations guy David Underwood and Yours Truly, Platform Evangelist Joey deVilla, will be there to walk you through the Shopify API should you make the really excellent decision to include it in your project.

Here's the schedule for Saturday:

Time  What's Happening

9:00 a.m. Breakfast and introductions. It's your chance to meet the HackDays organizers and all the participating developers. Use this time to get settled in, see the final schedule and fuel up for a day's hacking with some breakfast.

9:30 a.m. API Presentations. Each API sponsor, including Shopify, will present their API and show you what's possible with it.

10:15 a.m. Open Planning. Come up with an idea for an application -- and remember, you have only a few hours in which to build it -- and put together a team. This is a whiteboard exercise: if you have an idea that you would like to work on, just put its name down on the whiteboard. If you're looking to build a team, identify yourself to other developers with the skill sets that you need. If you're looking for ideas, identify yourself to other developers as well.

10:30 a.m. Let the hacking begin! Claim a table and get to work! The API sponsors will be available to answer questions.

12:00 noon Lunch. Take a break, grab a bite, hang out with your fellow geeks.

1:00 p.m. Back to work! You've got until 5:30 p.m. to finish.

5:30 p.m. Presentations. Every team gets 3 minutes to present their application, and when you're not presenting, you're watching the other teams and enjoying beer.

7:00 p.m. Judging and prizes. Our panel of experts weighs in and delivers their verdict.

7:30 p.m. Celebratory food and drink at a nearby pub.

There are prizes! HackTO will offer three prizes to the apps deemed by our panel of judges to be the best:

  • First prize: $2,000
  • Second prize: $1,500
  • Third prize: $500

This all takes place at Freshbooks' new HQ, located at 35 Golden Avenue, Suite 105. Golden Avenue is just off Dundas Street West, east and south of where it branches off from Roncesvalles Avenue. The closest subway station is Dundas West; Freshbooks is a short walk away.

If you're up for a programming challenge, the chance to win prizes and the opportunity to meet the folks from some of the coolest startups there as well as the people from Toronto's vibrant tech community, come to HackTO and hack! Register now!


Continue reading

I have an animated idea

We (Dan, Ryan, and myself) launched a small page called "I have a business idea". Check it out, if you haven't already — especially in Chrome or Firefox or Safari. Assuming you have JavaScript turned on (as most of you do), you should see some fun animations as you scroll down the page.

I have to admit that the animations that I had in my head were of a much grander scale but there was a time crunch: the site from design to implementation needed to be pulled off in two days. So how did we do it?

The design idea was to have our man-of-the-hour, Skip, put on the various hats that he'll need to wear to get a new business off the ground. Where do you even begin when you have an idea? We broke it down into a few steps with some helpful links along the way.

The steps were thought of as slides, like a PowerPoint presentation, but clearly better. As the user scrolls from slide to slide, Skip transitions from role to role.

Deciding what slide we're on

The first thing to figure out was what slide the viewer was looking at. This was the easiest part. All slides are the same height and are the same distance apart. We listen for the window scroll event and then divide the scroll position by the height of a slide (plus its margin).

var slideSize = 700;
var scrollTop = $(document).scrollTop() + 100;
var newSlide = Math.round(scrollTop / slideSize);

The 100 that is being added is the margin at the very top of the page. Now that we know which slide is in view, we can change the state of the page and change the state of that slide.

First up is the subtle (or maybe not so subtle) background colour change to match the current slide.

document.body.style.backgroundColor = slides.eq(currentSlide)
    .children('.slide-content')
    .css('background-color');

We used jQuery to handle the heavy lifting. The slides variable is a jQuery collection of all our slides. We grab the current slide and then grab the background colour from the slide content. That gets applied to the body. Just like that, the carpet now matches the drapes.

Handling the slide transitions

The next step in setting slide state was handling transitions. A slide comes in from the top or from the bottom. The animations should be different to reflect the different movement. Each slide has three states: the current (or default state), the before state, and the after state.

As we transition the slides, we set the state on those by changing an attribute:

if (oldSlide > currentSlide) {
    oldSlideContent.attr('data-state', 'is-after');
} else {
    oldSlideContent.attr('data-state', 'is-before');
}

currentSlideContent.attr('data-state', 'is-current');

If the old slide is below (or after) the current slide then it's set to "is-after". If it's above then it's set to "is-before". The current slide is aptly set to "is-current".

Slide CSS

Each slide is then made up of a number of components. If you look at the CSS, it's a little messy but you can see the different components of each slide.

#s4-1 { ... }
#s4-2 { ... }
#s4-3 { ... }

Each of those defines the current (or default) state of each slide. For browsers that don't have JavaScript turned on or don't support CSS transitions and animations, it'll just be the normal static slides. For everybody else, we set the before and after states.

.slide-content[data-state=is-after] #s4-3 { top: 400px; }
.slide-content[data-state=is-before] #s4-3 { top: -200px; }

CSS Transitions do the heavy lifting for us. As we set the state, the browser sees that the top value has changed and transitions that property smoothly over a brief period of time.

Since I knew I was going to be animating a lot of stuff on the page, I just transitioned ALL the things!

* {
    -webkit-transition: all 1s;
    -moz-transition: all 1s;
    -ms-transition: all 1s;
    -o-transition: all 1s;
    transition: all 1s;
}

Animations

We added a few small animations for fun, like the flying money on the last slide. First, we defined the keyframes of our animation.

@-moz-keyframes float {
    0% { -moz-transform: rotate(0deg) translate(0,0); }
    30% { -moz-transform: rotate(-20deg) translate(10px,10px); }
    70% { -moz-transform: rotate(20deg) translate(-10px,10px); }
    100% { -moz-transform: rotate(0deg) translate(0,0); }
}

In this case, the float animation moves the element at different points in the timeline. It rotates and shifts (or translates) the element from its current position.

With the animation keyframes defined, the animation needs to be applied to an element.

#s9-3 {
    -moz-animation: float 8s infinite linear;
}

The animation is done over 8 seconds to create a nice soft effect and loops infinitely. To have each dollar bill float at a different frequency, we just changed the animation length.

#s9-4 {
    -moz-animation: float 10s infinite linear;
}

Browser Compatibility

We used a bunch of cutting edge CSS to do this and as a result, we added all the vendor prefixes for the major browsers out there. For browsers that don't support it, it should still show the default state of the slides.

Tools

In a way, building this felt like doing animations in Adobe Flash. Elements were laid out on a canvas and then keyframes were defined for each of the elements that we wanted to animate. There are a number of tools, like Adobe Edge, that allow you to create these types of animations in a graphical way without necessarily needing to get your hands dirty with code.

Continue reading

RESTful thinking considered harmful - followup

My previous post RESTful thinking considered harmful caused quite a bit of discussion yesterday. Unfortunately, many people seem to have missed the point I was trying to make. This is likely my own fault for focusing too much on the implementation, instead of the thinking process of developers that I was actually trying to discuss. For this reason, I would like to clarify some points.

  • My post was not intended as an argument against REST. I don't claim to be a REST expert, and I don't really care about REST semantics.
  • I am also not claiming that it is impossible to get the design right using REST principles in Rails.

So what was the point I was trying to make?

  • Rails actively encourages the REST = CRUD design pattern, and all tutorials, screencasts, and documentation out there focus on designing RESTful applications this way.
  • However, REST requires developers to realize that stuff like "publishing a blog post" is a resource, which is far from intuitive. This causes many new Rails developers to abuse the update action.
  • Abusing update makes your application lose valuable data. This is irrevocable damage.
  • Getting REST wrong may make your API less intuitive to use, but this can always be fixed in v2.
  • Getting a working application that properly supports your process should be your end goal; having it adhere to REST principles is just a means to get there.
  • All the focus on RESTful design and discussion about REST semantics makes new developers think this is actually more important and messes with them getting their priorities straight.

In the end, having a properly working application that doesn't lose data is more important than getting a proper RESTful API. Preferably, you want to have both, but you should always start with the former.

Improving the status quo

In the end, what I want to achieve is educating developers, not changing the way Rails implements REST. Rails conventions, generators, screencasts, and tutorials are all part of how we educate new Rails developers.

  • Rails should ship with a state machine implementation, and a generator to create a model based on it. Thinking of "publishing a blog post" as a transition in a state machine is a lot more intuitive.
  • Tutorials, screencasts, and documentation should focus on using it to design your application. This would lead to better-designed applications with fewer bugs and security issues.
  • You can always wrap your state machine in a RESTful API if you wish. But this should always come as step 2.
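
To make the idea concrete, here is a minimal plain-Ruby sketch of a blog post modelled as a state machine. It is not a proposal for what Rails should ship: the Post class, the event names, and the history format are all invented for illustration. The point is that each event is checked against the current state and recorded, rather than silently overwriting a column through update.

```ruby
# Hypothetical sketch: a post whose state only changes through named events.
class Post
  TRANSITIONS = {
    publish:   { from: :draft,     to: :published },
    unpublish: { from: :published, to: :draft }
  }.freeze

  attr_reader :state, :history

  def initialize
    @state = :draft
    @history = []
  end

  def fire(event, by:)
    rule = TRANSITIONS.fetch(event) { raise ArgumentError, "unknown event #{event}" }
    raise "cannot #{event} from #{@state}" unless rule[:from] == @state

    # Record who triggered the transition and when, so no data is lost.
    @history << { event: event, from: @state, to: rule[:to], by: by, at: Time.now }
    @state = rule[:to]
  end
end

post = Post.new
post.fire(:publish, by: "alice")
puts post.state        # published
puts post.history.size # 1
```

A RESTful API can then be layered on top as a second step, for example by exposing each event as its own resource.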

Hopefully this clarifies a bit better what I was trying to bring across.

Continue reading
