Staff Site Reliability Engineer (Remote, APAC)

The Site Reliability team is part of the Infrastructure organization that builds, operates, and improves the heart of Shopify’s technical platform, and unlocks the power of planet-scale infrastructure for all of Shopify’s merchants, buyers, and developers.  

Shopify has many critical components, and sometimes they fail. Members of our Site Reliability team are the ones ensuring we can get back to normal operation as fast as possible when that happens. Site Reliability sets the foundation for building and running resilient systems at Shopify. This is a team of engineers with both in-depth operational knowledge of the entire Shopify stack and strong programming fundamentals, who act as first responders and leaders during an incident.

Our goal is to drive incidents to resolution as quickly as possible, and guide teams to build a more resilient Shopify. We build whatever systems and tools are necessary to ensure Shopify is resilient, and that incident response and resolution are fast and reliable. We continuously seek out ways to automate away the manual toil involved in keeping Shopify running.

Commerce happens 24/7, and we have built out a globally distributed team that can respond whenever necessary. Our team hires across four regions: Asia-Pacific (APAC), North America West, North America East, and Europe, the Middle East, and Africa (EMEA), in a follow-the-sun support model that provides 24/7 coverage for incident management.

At Shopify, Staff Site Reliability Engineers (also referred to as Lead Site Reliability Engineers) use their expertise and passion to multiply the overall output of their team. As a technical leader, you’ll help drive your team’s vision to its implementation. You and the team will design and build technically innovative solutions that empower all teams at Shopify to build powerful and resilient distributed cloud software. Merchants that depend on Shopify for a highly scalable, performant, and reliable platform benefit directly from the work you do. You will maintain a high bar for quality, and will lead and mentor other engineers. And of course, you’ll be hands-on in the code and contribute technically.

This is a remote position available in Australia, Japan, and Singapore.

Shopify is now permanently remote and working towards a future that is digital by default. Learn more about what this can mean for you.

What we can offer you:

  • The opportunity to run Shopify’s planet-scale systems by enabling engineering teams to create resilient systems.

  • Work focusing on a unique set of interesting and challenging problems that can’t be easily found elsewhere.

  • The flexibility to define what resiliency and site reliability engineering mean for Shopify.

  • The means to grow the capacity of our worldwide distributed site reliability engineering teams, and consult with other engineering groups on how to build low-latency, highly resilient systems.

  • A direct impact on our millions of merchants’ ability to generate revenue for their livelihood, their families, and their employees through the business they’ve built from the ground up on our platform.

What you’ll need:

  • You are based in Australia, Japan, or Singapore.

  • Experience handling multiple on-call shifts for mission-critical systems, and responsibility for the tools and processes used to debug and correct failures. 

  • You've navigated more than one incident through to the retrospective process.

  • You know what good observability looks like, but more importantly, how to get there.

  • Strong programming fundamentals—ideally in a variety of languages—primarily in backend software development. 

  • Comfort with hands-on development, navigating through multiple programming languages, digging deep in the stack, and using cloud infrastructure (for example, Google Cloud Platform, Amazon Web Services, Azure, Kubernetes, Docker).

  • Experience with mentorship and helping teammates level up their craft and technical skills. 

  • You understand the meaning of continuous improvement and evolving systems.

  • You reject the idea that on-call rotations have to be a terrible, disruptive experience.

  • You understand how to improve difficult situations through short and iterative projects.

  • A commitment to and drive for quality, technical excellence, and results.

  • If you don’t know all this stuff, don’t worry, we’ll teach you!

Bonus Points:

  • Experience working with a variety of open-source software, including Nginx, Redis, Memcached and MySQL.

  • Familiarity with network and web protocols, from IP to HTTP.


#Senior Software Developer #DistributedSystems

#SRE #DevOps #Data Infrastructure #Reliability Engineer  

Our belief is that a strong commitment to diversity & inclusion enables us to truly make commerce better for everyone. We encourage applications from Indigenous peoples, racialized people, people with disabilities, people from gender and sexually diverse communities, and/or people with intersectional identities. Please take a look at our Sustainability Reports to learn more about Shopify’s commitments to our communities, and our planet.

At Shopify, we understand that experience comes in many forms. We’re dedicated to adding new perspectives to the team - so if your experience is this close to what we’re looking for, please consider applying.