November 15, 2022
Needing to ship faster and more reliably while managing a growing number of contributors and a more complex codebase seems like the fate of every hyper-growth tech company. For platform teams, the challenge is no different. How can we quickly roll out and increase the adoption of new technologies safely with a growing codebase and organization?
Problem area
Spotify’s mobile codebase has been growing tremendously (by 29% in 2019, 49% in 2020, and 23% in 2021). As our business expands, our codebase evolves with new experiences, such as Spotify Live. At the same time, Spotify continues to take on more engineers, increasing the number of changes in the mobile codebase over time.
A year and a half ago, we kicked off an initiative within the Mobile Engineering Strategy program, working on multiple migrations to allow client features to be developed in an isolated environment, similarly to backend microservices.
Since then, about 2,200 Android and iOS components have been connected to a system, and most of our Android and iOS codebases have been migrated to build with Google’s open source build system (Bazel), a huge effort impacting more than 100 squads across the company.
Key challenges
We believe that migrations will become more challenging as our codebase and organization grow in size and complexity. We need to consider such challenging migrations to be the norm going forward if we want to ship platform solutions efficiently and at scale for mobile at Spotify.
Below are some challenges that we faced during our journey. We’ve provided the scenarios with symptoms (“When”), what you should avoid when dealing with the situation (“Don’t”), and what we suggest that you do (“Do”).
Challenge 1: Defining the scope
How can we create change and deliver standardized architectural solutions that support most use cases without undermining others?
WHEN:
- The scope seems large: there’s a sense that numerous use cases need to be addressed.
- It’s difficult to know where to begin given the amount of tech debt, changes required, and use cases to support.
- Stakeholders reach out multiple times, not fully understanding what they’re supposed to do during the migration.
DON’T:
- Try to identify all possible scenarios before rolling out the solution.
- Start a large migration without an estimated roadmap and definition of success.
- Reach out to stakeholders without a clear definition of what they need to do.
DO:
- Know your goals. Create a product brief that explains what you are doing, why you are doing it, and how you’re going to do it.
- Focus on the values. Changing them is time-consuming, but reframe goals to better align when needed.
- Address your audience. Understand their mental models so that you can talk about what’s relevant, connect where their needs are, and find proxies to expand your reach.
- Start small. Create a proof of concept, validate it with stakeholders, and get the migration through alpha, beta, and GA product lifecycles. Add use cases as you discover them, mapping out the most common ones first. This will allow you to get enough feedback and incrementally support the different use cases.
- Collaborate! Collaboration is key in the early stages. Find early adopters who are eager to try your solutions and can become ambassadors in their own tribes. This will eventually help with scalability as well.
Challenge 2: Scaling up
How can we drive large architectural and infrastructure changes faster in a hyper-growing organization?
WHEN:
- Large numbers of squads are impacted by the changes (more than 100 squads).
- There is a large amount of work, including manual refactorings that need to be done by the squads during the migration.
- Progress is slow: there’s a huge number of stakeholders and dependencies.
- Stakeholders are overwhelmed with the ongoing migrations.
DON’T:
- Believe that large infrastructure and architecture changes are impossible or not needed in a large organization.
- Settle for moving slower when making infrastructure/architectural changes.
DO:
- Focus on stakeholder management. Identify the different stakeholders in order of priority. Be in touch with them (e.g., Slack, email groups). Tell them about the importance of your effort.
- Communicate. Keep your audience engaged by sharing the progress through newsletters and workplace posts.
- Look to automation. Simplify the migration process by investing in automation upfront. Do you need to refactor the code? Could the same step be done automatically using a script?
- Make time for spike weeks. Partner with squads and tribes to jointly dedicate a week to work on the migration.
- Use multiple resources to increase your reach. Education programs can help onboard teams to new concepts and migrations, with the potential for applying them right away.
- Introduce the company’s engineering practices in the onboarding programs to ensure that new joiners understand and follow the recommended best practices from the start.
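As one illustration of the automation point above, here is a minimal sketch of a scripted refactoring step. It is not the tooling used in this migration; the import names `com.example.legacy.Logger` and `com.example.platform.Logger`, and the helper names, are hypothetical and chosen only to show the shape of such a script: a pure rewrite function that is easy to test, plus a tree walker with a dry-run mode so squads can review the impact before anything is written.

```python
import re
from pathlib import Path

# Hypothetical import names, used purely for illustration.
OLD_IMPORT = "com.example.legacy.Logger"
NEW_IMPORT = "com.example.platform.Logger"

# Match a whole import line (Kotlin or Java style) for exactly OLD_IMPORT.
_IMPORT_RE = re.compile(
    r"^([ \t]*import[ \t]+)" + re.escape(OLD_IMPORT) + r"([ \t]*;?[ \t]*)$",
    re.MULTILINE,
)


def rewrite_source(text: str) -> tuple[str, int]:
    """Return the rewritten text and the number of import lines changed."""
    return _IMPORT_RE.subn(lambda m: m.group(1) + NEW_IMPORT + m.group(2), text)


def migrate_tree(root: Path, suffixes=(".kt", ".java"), dry_run=True) -> dict[str, int]:
    """Apply rewrite_source to matching files under root.

    Returns {file path: replacements}. With dry_run=True nothing is written.
    """
    changed = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        new_text, count = rewrite_source(path.read_text(encoding="utf-8"))
        if count:
            changed[str(path)] = count
            if not dry_run:
                path.write_text(new_text, encoding="utf-8")
    return changed
```

Running `migrate_tree(Path("."), dry_run=True)` first gives a per-file count of what would change, which can double as a rough estimate of the remaining manual work.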
Challenge 3: Competing priorities
What’s the balance between having platform teams working to reduce tech debt and adopt new technologies versus holding squads accountable for the quality of the code they own?
WHEN:
- Stakeholders are involved in high-priority projects and have little time to adopt your solutions.
- Stakeholders think they’re slowed down by platform migrations and are not motivated to adopt new technologies.
- Projects dilute and the team(s) driving the migration feel unmotivated, possibly with some members leaving.
- Code changes are necessary and in conflict with the direction of the migration.
DON’T:
- Believe that stakeholders understand the importance and impact of your migration and will prioritize it. They have plenty of other work to do as well.
- Give in. “We are busy working on a higher-priority bet” is a common argument not to execute on platform migrations.
- Consider the metrics and goals to be set in stone and difficult to change.
DO:
- Motivate. Showcase the positive impact of your migration to stakeholders and encourage them to get the work done.
- Continuously evaluate. Have regular checkpoints during the quarter to evaluate the speed of the migration to reach your quarterly, half-yearly, or yearly goals.
- Manage risk. If the migration is moving slowly, how can you tweak the process to reach the goals? Can you streamline the process? Do you need more engineers? Can you hire contractors? Can you influence other tribes to bring your efforts to their backlog?
- Take the pain on-platform. When possible, make the required changes on behalf of the organization so they can focus on creating business value for our users.
- Monitor changes that go against migration KPIs and create channels with teams to support them.
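Monitoring changes that go against migration KPIs can be as simple as comparing two snapshots of a metric. The sketch below is an assumption about how that might look, not a description of the actual tooling: it takes a hypothetical per-team count of not-yet-migrated components at two points in time and reports the teams whose count went up, largest increase first. The function name and the data shape are invented for illustration.

```python
def find_regressions(previous: dict[str, int], current: dict[str, int]) -> dict[str, int]:
    """Return {team: increase} for teams whose legacy-component count grew.

    previous and current map a team name to its number of components
    that are not yet migrated (a hypothetical KPI).
    """
    regressions = {}
    for team, count in current.items():
        delta = count - previous.get(team, 0)
        if delta > 0:
            regressions[team] = delta
    # Largest regressions first; ties broken alphabetically for stable output.
    return dict(sorted(regressions.items(), key=lambda kv: (-kv[1], kv[0])))
```

The result is a natural input for the support channels mentioned above: a short list of teams to reach out to, rather than a full report nobody reads.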
Challenge 4: Being accountable
How do we hold teams accountable and keep them engaged in the adoption of new technologies that require large infrastructure changes and refactoring?
WHEN:
- There’s a lack of accountability in the adoption of new technologies, hence migrations are taking longer.
- It’s hard to predict when the migration will end.
DON’T:
- Expect there to be internal/external alignment when driving large changes over time.
- Think that it’s hard to measure the progress and impact of infrastructure changes.
DO:
- Delineate the definition of done. Use data and trend graphs to show projections. It’s important to understand where you are starting from and how fast you are progressing so that you can estimate how fast you’ll move in the future and whether you need to make any adjustments to speed up the migration.
- Present your progress. Take control of the definition of success and communicate often to keep your audience engaged and onboarded with your changes.
- Use dashboards. Metrics and dashboards will communicate the progress and impact, as well as help prioritize your work at scale over time.
- Maintain a timeline. Keep your roadmap up to date. Over time, your team and stakeholders might change, and they’ll need to be aware of your migration and timeline. A roadmap also increases transparency, enables feedback and collaboration, and helps identify any roadblocks.
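To make the idea of projecting from trend data concrete, here is a minimal sketch that fits a straight line through progress snapshots and extrapolates a completion date. It assumes roughly linear progress, which real migrations rarely follow exactly, so the result is a starting point for discussion rather than a forecast. The function name and the snapshot format are assumptions made for this example.

```python
import math
from datetime import date, timedelta


def project_completion(snapshots, total):
    """Estimate when the migrated count reaches total.

    snapshots: list of (date, migrated_count) pairs, oldest first.
    Returns a projected date, or None if no projection is possible.
    """
    if len(snapshots) < 2:
        return None
    start = snapshots[0][0]
    xs = [(day - start).days for day, _ in snapshots]
    ys = [count for _, count in snapshots]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return None  # all snapshots taken on the same day
    # Ordinary least-squares slope: components migrated per day.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    if slope <= 0:
        return None  # no measurable progress, so no finite projection
    intercept = mean_y - slope * mean_x
    days_needed = (total - intercept) / slope
    return start + timedelta(days=math.ceil(days_needed))
```

Re-running the projection at each checkpoint shows whether the estimated end date is stable, moving closer, or slipping, which is the signal that an adjustment is needed.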
Conclusion
Migrations of this nature and magnitude might become a new normal; otherwise, it might be impossible for the company to perform certain changes. New technologies will come, and we will do the necessary migrations, though we should minimize the disruption for teams. Some platform products will inherently require migrations, potentially at a large scale, and we should consider them part of the development lifecycle, together with testing and design.
We’ve learned a great deal from this effort and hope that this information is useful for other teams driving large migrations. If you want to learn more about our challenges and how we solved them, don’t hesitate to reach out to us.
Special thanks to Marvin, Foundation, BoB, and Rubik for driving this effort.