The Site Reliability team is part of the Infrastructure organization that builds, operates, and improves the heart of Shopify’s technical platform, and unlocks the power of planet-scale infrastructure for all of Shopify’s merchants, buyers, and developers.
Shopify has many critical components, and sometimes they fail. Members of our Site Reliability team ensure we get back to normal operation as fast as possible when that happens. Site Reliability sets the foundation for building and running resilient systems at Shopify. This is a team of engineers with both in-depth operational knowledge of the entire Shopify stack and strong programming fundamentals, who act as first responders and leaders during an incident.
Our goal is to drive incidents to resolution as quickly as possible, and guide teams to build a more resilient Shopify. We build whatever systems and tools are necessary to ensure Shopify is resilient, and that incident response and resolution are fast and reliable. We continuously seek out ways to automate away the manual toil involved in keeping Shopify running.
Commerce happens 24/7, and we have built out a globally distributed team that can respond whenever necessary. Our team hires across four regions: Asia-Pacific (APAC); North America West; North America East; and Europe, the Middle East, and Africa (EMEA). Together, these regions form a follow-the-sun support model that provides 24/7 coverage for incident management.
At Shopify, Staff Site Reliability Engineers (also referred to as Lead Site Reliability Engineers) use their expertise and passion to multiply the overall output of their team. As a technical leader, you’ll help drive your team’s vision to its implementation. You and the team will design and build technically innovative solutions that empower all teams at Shopify to build powerful and resilient distributed cloud software. Merchants that depend on Shopify for a highly scalable, performant, and reliable platform benefit directly from the work you do. You will maintain a high bar for quality, and will lead and mentor other engineers. And of course, you’ll be hands-on in the code and contribute technically.
This is a remote position available in Australia, Japan, and Singapore.
Shopify is now permanently remote and working towards a future that is digital by default. Learn more about what this can mean for you.
What we can offer you:
The opportunity to run Shopify’s planet-scale systems by enabling engineering teams to create resilient systems.
Work focusing on a unique set of interesting and challenging problems that can’t be easily found elsewhere.
The flexibility to define what resiliency and site reliability engineering mean for Shopify.
The means to grow the capacity of our worldwide distributed site reliability engineering teams, and consult with other engineering groups on how to build low-latency, highly resilient systems.
A direct impact on our millions of merchants’ ability to generate revenue for their livelihood, their families, and their employees through the business they’ve built from the ground up on our platform.