Performance📈, complexity📉: Killer updates from Shopify engineering
January 10, 2024
By Farhan Thawar, VP & Head of Engineering at Shopify
At Shopify, we're obsessed with technical excellence. Always have been. We spend a lot of time on our infrastructure, even when the results aren't immediately obvious.
Often, this infrastructure work involves simplifying our systems. Continually doing this is a requirement for innovation. Why? Not all fast software is great, but all great software is fast. Every millisecond matters for our merchants. That means our software needs to be simple to scale, and not bogged down by overly complex architecture.
As our CEO and founder Tobi Lütke put it, some of the best long-term returns on investment in software come from these simplifications. Here are some of the killer updates we shipped in 2023 to remove complexity and improve performance.
Performance wins
In commerce, performance is king. Performance leads to delightful and seamless user experiences. Performance leads to conversion, which adds dollars to the bottom line of our merchants. And so performance wins were a key focus for us in 2023 (and every year). We:
- Optimized the tail of the Shopify Admin customer and order pages' load times for certain merchants: an 80K list went from ~20 seconds to ~400 milliseconds (ms) to load.
- Dramatically improved the speed of duties calculations for cross-border orders. Our p99* is down from ~550ms to ~80ms. TL;DR: Checkout is critical, commerce is global, and we sweat the details. *This means 99% of requests are completed this fast.
- Shopify admin search results now show up more than 7x faster. This is just one of many ways we're increasing efficiency and giving time back to our merchants.
- For products with multiple variants, it's now 200x faster to serialize them (meaning, assign a unique ID to each). A product with 2,000 variants now takes one second to serialize. This win came when we realized there was a cache slowing this down.
- GraphQL Storefront API queries are now rendered 3x faster. We extracted these queries from Shopify Core into our Storefront Renderer. And good thing, too, because Storefront API traffic has significantly increased in the past year. This is one of many GraphQL improvements we made in 2023.
- We 5x'd the query performance of Observe, the home base for our dashboards tracking Shopify's performance. We created a new query engine that distributes requests across multiple locations, leading to an 80% improvement in query performance. Loading dashboards is now smoother and snappier. 🫰
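The wins in this list are stated three different ways: as a percentile (p99), as a speedup factor (7x, 200x), and as a percentage. As a rough illustration of how those relate, here is a minimal Python sketch. The function names and the sample latencies are hypothetical and are not Shopify's tooling, and nearest-rank is only one common way to define a percentile.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Smallest sample such that at least pct% of all samples are <= it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def speedup_factor(before, after):
    """How many times faster 'after' is than 'before' (same units)."""
    return before / after


def time_reduction_pct(before, after):
    """Percentage of the original time that was removed."""
    return (before - after) / before * 100


def reduction_from_speedup(factor):
    """Convert an 'Nx faster' figure into a percentage time reduction."""
    return (1 - 1 / factor) * 100


# Hypothetical request latencies in ms: 1, 2, ..., 100.
latencies_ms = list(range(1, 101))
print(percentile_nearest_rank(latencies_ms, 99))  # 99: 99% of requests finish this fast or faster

# Figures quoted in the list above, in milliseconds.
print(speedup_factor(20_000, 400))             # 50.0 (list load: ~20 s to ~400 ms)
print(round(time_reduction_pct(550, 80)))      # duties p99: roughly 85% less time
print(round(reduction_from_speedup(5)))        # a 5x speedup is roughly 80% less time
```

Under this reading, a 5x speedup and an 80% reduction in query time describe the same result, which is one way to reconcile the two Observe figures. That is an interpretation, not something the numbers above spell out.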
Cleanup and subtraction
We love addition by subtraction.
Clutter slows things down. Complicates things unnecessarily for our merchants. So we got rid of a ton of it.
Here are just a few complexity-extraction highlights from 2023:
- Removed nearly 3 million lines of code. This is a rough tally from our internal board where we share our "cleanup wins," and I'm willing to bet the real number is far higher.
- Sped up Shopify Admin's developer feedback loop by 20x, including 50% faster continuous integration (CI) using 35% fewer compute resources.
- Archived ~6,800 unused and unnecessary GitHub repositories, vastly more than in past years.
- Merged 702 machine-generated pull requests (PRs) to delete dead code.
- Reduced memory usage of a background process from ~3 gigabytes to 400 megabytes on online-store-web. This is a substantial improvement to our developer experience, which matters to us because we are committed to making Shopify the best place to build.
- Majorly improved Storefront Renderer's use of Ruby's Garbage Collector (GC), a memory management system. Our tweaks led to a 56% decrease in average GC time, and an 80% decrease in p99 GC time. A clear example of how we're focused on making the whole stack work well for commerce.
AI in engineering
Leaning on AI allows us to ship more, faster for our merchants, so we're constantly finding new ways to incorporate AI into our workflows. Our engineering team works in close partnership with AI to make us more efficient in our work.
One of the tools we use to do this is GitHub Copilot. We were their first customer in January '22. Here are some of the ways Copilot has impacted our work:
- About 70% of Shopify's engineering team uses Copilot regularly
- An average suggestion acceptance rate of 21-34%, depending on the programming language
- We estimate we're accepting over 20,000 lines of code every weekday
- ~675K suggestions accepted total
- ~975K lines of code accepted total
These kinds of enhancements contribute to more than 25,000 commits per week across Shopify, and about 1,300-1,400 PRs per day.
We've also been leaning on an internal tool we built called VaultBot, our AI-powered chatbot where Shopifolk can ask Shopify-related questions. VaultBot is currently answering around 32% of all engineering questions.
BFCM performance
The Black Friday-Cyber Monday shopping weekend is the biggest event of the year for many of our merchants. And because of all the traffic it generates, it's the ultimate stress test for our platform. We build all year to handle the traffic spikes of BFCM, and then that level of traffic tends to become our new normal the following year.
We're so proud of this year's BFCM performance because it allowed each of our millions of merchants to show up to their customers with a fast, smooth shopping experience that led to many cha-chings from their Shopify app.
Here are some of the truly wild numbers reflecting Shopify's performance over BFCM 2023, starting with our CEO Tobi Lütke's fave stats:
"Nerd BFCM stats:
Shopify's egress processed 145 billion requests on Friday. App servers handled peak of ~60 million requests per minute. Increase of 38%. Total GMV was $4.1b, up by 22% from last year.
But Rails doesn't scale so what are we even doing 🤷‍♂️" - tobi lutke (@tobi), November 25, 2023
- We achieved 99.999+% uptime, handling 29.7 petabytes of data served from across our infrastructure. That's over 5 terabytes per minute.
- At peak, our core application server handled 967K requests per second, equivalent to 58 million requests per minute.
- During the BFCM rush, our MySQL fleet (a combo of MySQL 5.7 and MySQL 8) handled over 19 million queries per second (QPS).
- At peak, we were indexing 22 GB/sec of logs and 51.4 GB/sec of metrics data! On top of that we ingested 9 million spans a second of tracing data. We constantly monitor second-to-second data on how production systems are performing.
- Our Apache Kafka streaming infrastructure served 29 million messages per second at peak. That's 45% growth from 2022, and we did it more efficiently this year with 14% fewer brokers.
All of this performance enabled our merchants to reach $9.3B in global sales throughout BFCM, with a peak of $4.2 million per minute on Black Friday.
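The throughput figures in the list above are easy to sanity-check with a little unit arithmetic. Here is a small Python sketch with hypothetical helper names. It assumes decimal units (1 PB = 1,000 TB) and a four-day BFCM window from Friday through Monday; the exact measurement window is not stated above, so treat the per-minute result as approximate.

```python
SECONDS_PER_MINUTE = 60

# Assumption: BFCM covers four full days (Friday through Monday).
BFCM_MINUTES = 4 * 24 * 60


def per_second_to_per_minute(rate_per_second):
    """Convert a per-second rate into a per-minute rate."""
    return rate_per_second * SECONDS_PER_MINUTE


def terabytes_per_minute(petabytes, minutes):
    """Average TB served per minute, using decimal units (1 PB = 1000 TB)."""
    return petabytes * 1000 / minutes


def implied_previous_value(current, growth_pct):
    """Back out last year's figure from this year's figure and the growth rate."""
    return current / (1 + growth_pct / 100)


# 967K requests per second at peak, expressed per minute.
print(per_second_to_per_minute(967_000))  # 58020000, i.e. about 58 million

# 29.7 PB served across the assumed four-day window.
print(round(terabytes_per_minute(29.7, BFCM_MINUTES), 2))  # a little over 5 TB per minute

# 29 million Kafka messages per second after 45% growth implies
# roughly 20 million per second at the 2022 peak.
print(round(implied_previous_value(29_000_000, 45)))
```

The first result matches the 58 million requests per minute quoted for the core application server, and the second is consistent with "over 5 terabytes per minute" under the four-day assumption.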
________________________________
"Fast software is a cultural phenomenon, driven by curiosity not ego." - Ian Ker-Seymer (a production engineer at Shopify)
To keep building that culture, we'll keep celebrating and sharing these wins. We are only as strong as our foundation, and our culture helps keep that foundation strong, agile, and delightful.
Why is all of this so important? Because it allows us to show up for our merchants in the best way possible, the way they need us to, so they can keep building their businesses and showing the world the real power of entrepreneurship, backed by a performant AF commerce platform.