Production Engineering Lead – Application Scalability

Ottawa, Montreal, Toronto, Waterloo, Remote

Production Engineering at Shopify encompasses the disciplines of site reliability engineering, infrastructure engineering, and developer productivity. Our team ensures that Shopify infrastructure is able to scale massively, while also delivering resilient systems, amazing performance, and impactful tools for our entire engineering team.

You’ll lead the Application Scalability team, whose goal is to improve production reliability and development velocity. You will build the shared tools that let our 100s of applications scale their production infrastructure while keeping engineering productivity high. As we continue to grow, we refuse to slow down. World-class tooling should make development at scale faster, not slower.

The main objective of the Application Scalability team is to share with the rest of the organization the tooling, lessons, and patterns we’ve used to reliably scale Shopify to over 80,000 requests per second. Today, we build Ruby on Rails applications that, on day one of production, become some of the busiest Rails applications on the planet. You’ll be working with many of the teams in Production Engineering to help build the developer experience on top of their tools. This team was formed on the hypothesis that much developer productivity is lost for lack of amazing production tooling, and that we can build that tooling.

Some of the things you may take charge of:

  • Build reliable systems out of unreliable components, as massively distributed systems require. Services should use resiliency tools like Toxiproxy and Semian and adopt chaos engineering practices (e.g. Chaos Monkey) in their stack
  • Working closely with developers across all our offices to ensure they have the best possible infrastructure tools to get their work done, tracking developer productivity as a key metric for the team
  • Evolve RPC across the company. How do we tackle issues like versioning, resiliency, and testing across 100s of services?
  • Make waiting days for database migrations a thing of the past by integrating tools like gh-ost into our architecture in place of LHM, minimizing the impact of migrations
  • Provide primitives for massive parallelized maintenance tasks to decrease the time it takes for developers to do large scale data refactoring

What you’ll need to have to tackle this role:

  • Experience in infrastructure and systems architecture
  • Experience in leading engineering-heavy teams
  • Desire to build tools for people who build products: your customers are both developers (through tools) and merchants (through the reliability achieved with these tools)
  • Equally excited about leading projects, people, and writing code

It’d be pretty cool if you have:

  • Experience with Ruby and Ruby on Rails
  • Relational database chops (especially MySQL)
  • Experience with Go
  • Experience working at large scale already
  • Tried chaos monkeys before and love breaking things
  • Designed APIs consumed by developers before and care deeply about designing pleasurable APIs
  • Experience leading cross functional teams

How to Apply 📄 ➡️ 📬

If you’re interested in helping us shape the future of commerce at Shopify, click the “Apply Now” button to submit your application. Please address your application to King.

Experience comes in many forms, many skills are transferable, and passion goes a long way. If your experience is close to what we’re looking for, consider applying. We know that diversity of thought makes for the best problem-solving and creative thinking, which is why we’re dedicated to adding new perspectives to the team and encourage everyone to apply.
