Load Testing for 2022 Wrapped


March 31, 2023

Published by Fred Wang, Senior Backend Engineer

Wrapped is Spotify’s global annual year-end marketing campaign that celebrates our listeners with a personalized review of their listening habits over the past year.

Last year, with more than 150 million unique engaged users across more than 111 markets, the Wrapped engineering team had the extraordinary challenge of supporting traffic across many regions and from different devices to ensure the Wrapped experience was a smooth one.

Each year, when Wrapped goes live, the Spotify team is faced with a “thundering herd problem.” Millions of people, all over the world and across all time zones, start watching their data stories on-platform, often at the same time on the morning of the launch. At that moment, our infrastructure for supporting that volume is put to the test. To this end, we’ve leveraged our internal tools to stress test our system with the scale of traffic that we will experience, particularly on day one of launch.

Our main backend service, called “campaigns service,” sends personalized and localized data to the mobile clients to deliver Wrapped to millions of users. In the first three to four hours after the experience goes live, we see some of the most demanding traffic of the campaign, on the order of tens of millions of users. So we need to properly load test and stress test our system to make sure our service, and the upstream services that it depends on, can stand the load.

Moshpit is Spotify’s internal Backstage plugin for load testing our backend services. The tool was originally developed as a Spotify Hack Week project, but it has since evolved into a robust load-testing tool that can send payloads to any internal Spotify service over multiple network protocols (HTTP and gRPC among them). As part of the Backstage ecosystem, it seamlessly integrates with all backend services created and hosted on Backstage.

The load-testing tool works as a driver to send any kind of payload you can create in any format you wish. Spotify internally uses Protocol Buffers as our de facto messaging format, so Moshpit uses this format, encoded in binary, to mock real-life payloads to any service.

Moshpit is also configurable with any initial ramp-up time (say, 60 seconds), total duration (usually 5 to 12 minutes), latency between payloads, and a target requests-per-second rate for sending payloads to the service.

These settings are all tunable in a web UI, which makes it easy for a developer to drive payloads against their service in a simple and configurable way.
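To make the interaction between ramp-up time, total duration, and target requests-per-second concrete, here is a minimal sketch in Python. It is not Moshpit’s API: the `LoadProfile` class, the linear ramp shape, and the 1,200 requests-per-second figure are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadProfile:
    """Hypothetical stand-in for the settings described above (not Moshpit's API)."""
    ramp_up_seconds: int    # e.g. 60
    duration_seconds: int   # total run length, ramp-up included
    target_rps: int         # steady-state requests per second


def rps_at(profile: LoadProfile, second: int) -> int:
    """Request rate for a given second of the run, assuming a linear ramp."""
    if second < 0 or second >= profile.duration_seconds:
        return 0
    if profile.ramp_up_seconds <= 0 or second >= profile.ramp_up_seconds:
        return profile.target_rps
    # Linear ramp: the full target is reached in the last ramp-up second.
    return profile.target_rps * (second + 1) // profile.ramp_up_seconds


def total_requests(profile: LoadProfile) -> int:
    """Total number of requests the whole run would send."""
    return sum(rps_at(profile, s) for s in range(profile.duration_seconds))


# Illustrative numbers only: 60 s ramp, 5 minute run, 1,200 requests per second.
profile = LoadProfile(ramp_up_seconds=60, duration_seconds=300, target_rps=1200)
```

Knowing the total request count up front is useful when sizing the payload sample, since a small sample will be replayed many times over a run of this length.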

Usually, the main challenge a developer faces when using Moshpit is sourcing the test payload to send to the service. The challenge lies in sourcing a sample set that’s both diverse enough and actually representative of a live real-world load.

In this case, our payload structure is fairly simple, consisting of the Spotify user ID and the Accept-Language header, among other fields. To meet our requirements, we decided to use internal employee IDs for several reasons:

  • Some of our data pipelines for external users take longer to complete leading up to launch, but employees’ Wrapped data pulls are ready sooner in the development cycle for internal test sessions.
  • The smaller list of employee Spotify accounts is easier to collect and manage than a bigger and equally diverse set of external Spotify user accounts to test with.
  • Our thousands of employees are distributed across a wide range of countries and regions, and their unique and diverse listening data and language preferences make them a good sample of end users.

Once we have a good sample of users across many different countries, the following is our general plan of attack:
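As a rough illustration of assembling such a sample, the sketch below builds payloads from (user ID, language) pairs and counts how many payloads cover each language. The `SamplePayload` type, the helper names, and the account IDs are invented for this example; the real payload has other fields and is encoded as a Protocol Buffers message.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplePayload:
    """Hypothetical, simplified payload: only the two fields named in the text."""
    user_id: str           # Spotify user ID of an employee test account
    accept_language: str   # value sent in the Accept-Language header


def build_payloads(accounts):
    """Turn (user_id, language) pairs into payloads, skipping repeated user IDs."""
    seen = set()
    payloads = []
    for user_id, language in accounts:
        if user_id in seen:
            continue
        seen.add(user_id)
        payloads.append(SamplePayload(user_id, language))
    return payloads


def language_coverage(payloads):
    """Count payloads per language to gauge how diverse the sample is."""
    return Counter(p.accept_language for p in payloads)


# Made-up accounts for illustration only.
accounts = [
    ("emp-001", "en-US"),
    ("emp-002", "sv-SE"),
    ("emp-003", "ja-JP"),
    ("emp-001", "en-US"),  # duplicate, dropped by build_payloads
    ("emp-004", "sv-SE"),
]
payloads = build_payloads(accounts)
coverage = language_coverage(payloads)
```

A coverage count like this makes it easy to spot languages or markets that are missing from the sample before a test session starts.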

  1. Notify upstream owners.

The main backend service that serves the personalized, localized data uses several key upstream services, notably the metadata service, translation service, image generation service, and personalization data service. All of these are owned by separate squads within Spotify. Before blasting thousands of requests per second at those services and potentially setting off pager duty alerts, we need to give each squad advance warning of our test plans so they can prepare.

  2. Tweak our internal routing to remove employee flags for some upstream services.

In certain services, we can’t accurately load-test and simulate real-world conditions, because the employee flags in our requests tell some of our backend services not to cache any data from the response. This causes extra latency that’s usually not seen in our production load. For these reasons, we have to manually remove the employee flag in our requests to certain services.

  3. Scale pods horizontally for each load-test session.

We need to provision enough pods for our own services to scale, but also make sure that upstream services have enough pods.

  4. Record each load-test session.

We look at our load-test sessions on Grafana and make sure latency, packet drop rates, and horizontal pod autoscaling are acceptable and our CPU/memory resources are sufficient as we vary our test loads across the U.S., E.U., and Asia regions. We test across different data-center regions and scale up to the expected production traffic. We also monitor and check upstream services to make sure they’re OK with the load.

  5. Scale up load testing leading to launch for expected production traffic.

For this year, on the order of tens of thousands of requests per second were expected for our main service across the U.S., E.U., and Asia. Recording all of our runs, the CPU/memory usage, and how many replicas we had scaled up to in each test helps other developers who jump in on the effort to know what was done and what still needs to be done.
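One way to keep such a record is a small structured entry per run, as in the sketch below. This is only an assumption about what a run log could look like: the `LoadTestRun` fields, the helper names, and every number in the example are invented, and it does not describe the format the team actually used.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class LoadTestRun:
    """Hypothetical record of one load-test session."""
    region: str
    target_rps: int
    replicas: int
    p99_latency_ms: float
    peak_cpu_fraction: float      # share of requested CPU used, 0.0 to 1.0
    peak_memory_fraction: float   # share of requested memory used, 0.0 to 1.0
    notes: str = ""


def to_log_line(run: LoadTestRun) -> str:
    """Serialize a run as one JSON line so it can be appended to a shared log."""
    return json.dumps(asdict(run), sort_keys=True)


def largest_passing_run(runs, max_p99_ms):
    """Highest-RPS run whose p99 latency stayed within the budget, or None."""
    passing = [r for r in runs if r.p99_latency_ms <= max_p99_ms]
    return max(passing, key=lambda r: r.target_rps, default=None)


# Made-up runs for illustration only.
runs = [
    LoadTestRun("us", 5000, 40, 180.0, 0.45, 0.50),
    LoadTestRun("us", 10000, 80, 240.0, 0.62, 0.55),
    LoadTestRun("us", 20000, 120, 410.0, 0.91, 0.70, notes="upstream throttled"),
]
best = largest_passing_run(runs, max_p99_ms=300.0)
```

A log like this lets someone joining the effort see at a glance which load levels have already passed in each region and where the next session should start.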

Implementing a thorough load-test plan is only one part, but a critical one, of making sure that the launch of a massive campaign like Spotify Wrapped is successful. Here are some of the key ingredients that we emphasize for success:

Ensure you test with a diverse selection of payloads that represent real-world load

In our case, this means a diverse set of users from many different countries and languages, as well as fetching unique top artists, tracks, playlists, and podcasts that will truly stress test the backend and the data pipeline for obscure or irregular data that our code can’t process.

Provision not only for your service but also for upstream services when you scale your tests

This includes rate limiters and the usual Kubernetes scale-up resources (resource quotas, horizontal pod autoscaling configs, availability of machines). Coordination should be emphasized when scaling up tests that have upstream effects.
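A back-of-the-envelope estimate can help start that conversation with upstream owners. The sketch below is a simplified model, not the method used for Wrapped: the per-pod throughput, the 25% headroom, the fan-out of two upstream calls per request, and the cache hit ratio are all assumed values.

```python
import math


def replicas_needed(target_rps: float, rps_per_pod: float, headroom: float = 0.25) -> int:
    """Pods required to serve target_rps with spare capacity on top.

    rps_per_pod is an assumed, measured per-pod throughput; headroom=0.25
    means provisioning for 25% more traffic than the target.
    """
    if rps_per_pod <= 0:
        raise ValueError("rps_per_pod must be positive")
    return math.ceil(target_rps * (1 + headroom) / rps_per_pod)


def upstream_rps(target_rps: float, calls_per_request: float, cache_hit_ratio: float = 0.0) -> float:
    """Load passed on to an upstream service after fan-out and caching."""
    return target_rps * calls_per_request * (1 - cache_hit_ratio)


# Illustrative numbers only.
own_pods = replicas_needed(target_rps=20000, rps_per_pod=300)
upstream_load = upstream_rps(target_rps=20000, calls_per_request=2, cache_hit_ratio=0.5)
upstream_pods = replicas_needed(upstream_load, rps_per_pod=400)
```

The estimate only gives a starting point; resource quotas, autoscaler limits, and rate limiters still need to be checked against it, and the load test itself is what confirms the numbers.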

Think global

You need to test multiple regions and make sure all your components are scaled up globally (object storage, database, caches, upstream services).

Test all endpoints and regions

We have a multitude of endpoints in addition to our main service that serves the personalized data for the data stories. It’s important to test all of them thoroughly, sometimes in parallel runs and across multiple regions, if possible. This ensures that the resources you provisioned scale up across all endpoints and in all regions simultaneously.
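A simple way to make sure no endpoint or region is skipped is to enumerate every combination and launch the runs together. The sketch below does this in Python with a placeholder in place of a real load-test trigger; the endpoint paths, region names, and `run_load_test` function are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

REGIONS = ["us", "eu", "asia"]
ENDPOINTS = ["/data-stories", "/share-cards"]  # hypothetical endpoint paths


def run_load_test(region: str, endpoint: str) -> str:
    """Placeholder: a real version would trigger a load-test session."""
    return f"{region}{endpoint}: started"


def run_all_in_parallel(regions, endpoints):
    """Start one run per (region, endpoint) pair and collect the results."""
    combos = list(product(regions, endpoints))
    # One worker per combination so that all runs overlap in time.
    with ThreadPoolExecutor(max_workers=max(1, len(combos))) as pool:
        results = list(pool.map(lambda combo: run_load_test(*combo), combos))
    return dict(zip(combos, results))


results = run_all_in_parallel(REGIONS, ENDPOINTS)
```

Running the combinations at the same time matters because shared resources, such as caches and upstream services, only see their true peak when every endpoint is loaded at once.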

In the end, 2022 Wrapped was a huge success, with millions of users able to enjoy their personalized Wrapped experience without any significant technical issues. This is no small feat for a campaign of this size and scope, but with tight coordination between all backend, data, and client engineering teams, we were able to make sure we had enough elastic capacity, warm-up time, and careful monitoring in place to withstand the expected rush of our fans.


