TV Spatial Navigation – Spotify Engineering : Spotify Engineering




May 10, 2023

Published by Sergio Avalos, Senior Engineer

Very often, when developing user interfaces, a lot of focus is given to the visuals, as that’s what primarily gets a user’s attention. However, other aspects are equally impactful for making the experience great but can be overlooked, like the user’s input: how the user interacts with the app. It is totally understandable that it might go unnoticed; most of the time, the platform, say Windows for desktop, iOS for iPhone, or Android TV for your home smart TV, handles it instead of the application’s developer. But what happens when this isn’t the case?

This is the situation our team found itself in. We develop the Spotify experience for big screens, like home TVs and gaming consoles. One case in particular that turned out to be challenging but fun was TV spatial navigation.

What is spatial navigation?

Spatial navigation refers to the act of moving the focus of your application between different elements in a two-dimensional view. This is usually done using the arrow keys of a TV remote control or the joystick of a gamepad for a gaming console. Selecting an element with these controllers is less intuitive than with a mouse in a desktop application, where the movement of the cursor mirrors the movement of the user’s own hand. Here, users have to move the focus in one of four directions until it arrives at the desired destination.

Shouldn’t spatial navigation be supported already?

Talking about spatial navigation and TV controls might sound like a description of the keyboard and mouse; these inputs are handled by the platforms for the desktop or laptop. So it’s inevitable to think, “Shouldn’t the TV manufacturers provide this support as well?”

And yes, smart TVs do provide this support, but it only comes for free when developing a native application (that is, an application that uses the same technology as the manufacturer). For example, Android TVs use the Android SDK, Apple TV has its proprietary software, and Roku TV created its own programming language, BrightScript.

Our team decided to take a different approach for building the Spotify client for TVs and gaming consoles, and we opted for a hybrid application instead of going fully native. In our case, all the applications for each TV manufacturer open a web application that renders the user interface. This choice gave us the versatility of using the same source code on multiple devices, but it came at the cost of losing some of the benefits that come with native applications.

Despite web browsers themselves being advanced software, there isn’t support for this functionality at the time of writing. There is a draft report called CSS Spatial Navigation Level 1, but it’s expected to take a long time to be accepted and implemented.

Therefore, it was necessary to come up with our own solution, since this functionality wouldn’t be provided by the web browsers or TV platforms running the interface. Nonetheless, its implementation proved to be a very fun project, involving geometry and a few concepts from computer science.

Defining our requirements

Before jumping into the actual implementation, let’s define the primary cases:

Basic navigation

This is the Home view as it looks now:

Example of Spotify’s Home view.

From this view, the application needs to mark only those elements that trigger an action, like navigating to a new view, marking an item as a favorite, or starting playback. After extracting them, we obtain a new view with the navigable elements only.

Skeleton of the UI with selectable elements.

In this new view, the selectable elements on the left belong to the menu. To the right are different entities such as albums, playlists, shows, etc., and selecting them takes the user to the corresponding landing pages.

There are other slightly more complex views, like the matrix that’s presented when users navigate to the Search view. Each tile represents a category of audio (and now video) content.

In these cases, users expect to navigate and meet boundaries where they cannot move any further, like hitting the bottom of a tracklist. However, in other cases, users expect no limits.

Cycle navigation

There are cases where it’s more convenient to remove constraints and allow users to continue the navigation even when they reach the limit, like at the bottom of a menu. Pressing the down key there moves the focus to the first item of the menu. We call this cycle navigation.

Example of cyclic navigation in the side menu.
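As a minimal sketch of this wrap-around behavior, assuming the menu items are kept in an ordered list, the next focused index could be computed as below. The helper `nextIndex` and its `cycle` flag are hypothetical names for illustration, not the library’s API.

```javascript
// Hypothetical helper: returns the index that receives focus after moving
// `step` positions (+1 for down, -1 for up) in a list of `length` items.
function nextIndex(current, step, length, cycle) {
  const target = current + step;
  if (cycle) {
    // Wrap around in both directions; the extra "+ length" keeps the
    // result non-negative when moving up from the first item.
    return ((target % length) + length) % length;
  }
  // Without cycling, the focus stops at the boundaries.
  return Math.min(Math.max(target, 0), length - 1);
}
```

With `cycle` enabled, moving down from the last of three items lands on index 0; with it disabled, the focus stays on the last item.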

Block navigation in certain areas

Finally, in other cases, it’s necessary to block all navigation and allow only the most prominent element to be selected (for example, when there is an alert and users are required to respond before moving forward, such as confirming that they want to sign out). The navigable elements behind this alert window are still there, but they can’t be selected while this alert window is present.

Example of the navigation up and down blocked on modal elements.
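One way to picture this requirement is as a filter over the focus candidates: while a modal is open, only the elements inside it remain eligible. The sketch below assumes a flat list of items that each record the container they belong to; the data shape and the function `focusCandidates` are illustrative assumptions, not the actual implementation.

```javascript
// Hypothetical model: each focusable item records the navId of the
// container it belongs to; `blockingContainerId` is the navId of an open
// modal (or null when nothing blocks navigation).
function focusCandidates(items, blockingContainerId) {
  if (blockingContainerId === null) {
    return items;
  }
  // While a modal is open, only its own items may receive focus.
  return items.filter((item) => item.containerId === blockingContainerId);
}

const items = [
  { navId: "search-item", containerId: "sidebar-menu" },
  { navId: "settings-item", containerId: "sidebar-menu" },
  { navId: "confirm-button", containerId: "sign-out-dialog" },
  { navId: "cancel-button", containerId: "sign-out-dialog" },
];

// Only the two dialog buttons remain selectable while the dialog is open.
console.log(focusCandidates(items, "sign-out-dialog").map((i) => i.navId));
```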

Having defined all our needs, we can now jump into solution mode. The evolution of this library is divided into the two main parts outlined below, but in reality, it was far more complex than this super simplified characterization.

We refer to the first iteration as the “naive” version. The term might not do this version justice, since many ideas from this phase were reused for the second iteration. Instead, the use of this term aims to highlight the birth of this library.

Before deciding where to go, one needs a map. In our case, a navigation map. Since our application was built upon React, it was logical to keep a representation of the navigable elements in memory, as that library also uses a similar approach with the virtual DOM. The figure below illustrates how a page is represented as the navigation tree:

Home view and its representation as a navigation tree.
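To make the idea of an in-memory navigation tree concrete, here is a minimal sketch in plain JavaScript. The node shape (`navId` plus `children`) and the helper `collectLeaves` are assumptions made for illustration; the article does not show the library’s internal data structure.

```javascript
// Hypothetical in-memory navigation tree for a page with a side menu and
// a main area. The node shape ({ navId, children }) is an assumption.
const navTree = {
  navId: "root",
  children: [
    {
      navId: "sidebar-menu",
      children: [
        { navId: "search-item", children: [] },
        { navId: "settings-item", children: [] },
      ],
    },
    {
      navId: "main",
      children: [
        { navId: "toplist-link", children: [] },
        { navId: "news-link", children: [] },
      ],
    },
  ],
};

// Collects the navIds of all leaves (the selectable elements), depth-first.
function collectLeaves(node) {
  if (node.children.length === 0) {
    return [node.navId];
  }
  return node.children.flatMap(collectLeaves);
}
```

Containers such as the side menu are inner nodes, and only the leaves can receive focus.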

To implement this, new React components were created to categorize all these nodes; that is, those elements rendered on the screen that can be selected with the TV control.

Navigable elements

In the analogy of a tree, this element can be considered the leaf. Every element that needs to be selected is contained inside a NavNode, and its responsibility is to register this element in the navigable tree, indicate which are the sibling “leaves,” and handle the selection when a user presses the Enter key. To distinguish these elements from each other, each contains a unique ID (or navId), which is used to know where the focus will go.

<NavNode navId="toplist-link" nextRight="news-link">
  <a href="/playlist/123">
    <img src="/sweden-toplist-cover.png" />
    <span>Sweden's top list</span>
  </a>
</NavNode>

Container element

After the leaves comes the branch. This component (aka <NavContainer />) contains all those navigable elements that require special treatment, like menus, as explained above. It takes care of cases such as cyclic navigation or blocking navigation events from bubbling up through the DOM tree, if necessary.

<nav>
  <NavContainer navId="sidebar-menu" cycle>
    <NavNode navId="search-item">
      <a href="/search">Search</a>
    </NavNode>
    <NavNode navId="settings-item" nextTop="search-item">
      <a href="/settings">Settings</a>
    </NavNode>
    ...
  </NavContainer>
</nav>
<main>
  <h1>Good morning, User!</h1>
  ...

Root element

Finally, there is the root component (aka <NavRoot />), which sits at the top of the application. Its function is basically to coordinate all the elements: register a new navigable element, dictate where the current focus lies, update it when it receives a navigation event, and update the navigation tree when users visit a new page.

function MyApp({ children }) {
  return (
    <NavRoot navId="root">
      <nav>
        <NavContainer navId="sidebar-menu" cycle>
          <NavNode navId="search-item">
            <a href="/search">Search</a>
          </NavNode>
          ...
    </NavRoot>
  );
}

Limitations

Although this implementation fulfilled the basic role, it came with a few limitations, which can already be observed in the examples above:

  • The navigation logic is a manual process and thus error-prone. Developers have to state where the next navigable element lies using the nextRight/Left/Up/Down attributes, and that was not always known. The same goes for the navId attribute, which needs to be unique.
  • The DOM tree of the page became more convoluted because of these wrapper elements. While the extra elements weren’t necessarily a problem, they did make our development experience more complex and the application more difficult to debug.
  • Finally, using this library required developers to have prior knowledge of the navigation process. For newcomers, this requirement made the onboarding experience slightly more complicated.

With these limitations in mind, the goal for the next iteration was to reduce the amount of knowledge the user of the library needs to have, in two ways:

  1. Simplifying the API and thus its usage
  2. Reducing the coupling between the elements so they can be properly tested

Thanks to the introduction of Hooks in React 16.8, it was possible to encapsulate all the business logic into a function without requiring an extra component to contain the element that can be focused. Here is an example of how it was implemented:

Before…
function AcceptButton(props) {
  return (
    <NavNode
      navId="ok-button"
      nextLeft="more-button"
      nextRight="cancel-button"
      focusable
      claimFocus
    >
      <button onClick={props.onClick}>
        OK
      </button>
    </NavNode>
  );
}
After…
function AcceptButton(props) {
  const { ref, isFocused } = useFocusRef();

  return (
    <button
      ref={ref}
      className={isFocused ? 'btn-focused' : undefined}
      onClick={props.onClick}>
        OK
    </button>
  );
}


The main highlights:

  • No need for extra DOM elements! Thanks to the Hook function useEffect, it’s possible to detect when a new component is mounted. Fewer DOM elements equals a better debugging experience.
  • No more navigation logic! Instead, we only have a reference to the actual DOM element that can be focused, which is returned by the Hook function.

And here is where the fun begins! Instead of letting the users of the library dictate how the navigation is laid out, that responsibility falls on the library. This is possible thanks to the function Element.getBoundingClientRect from the DOM API; it provides the size and coordinates of an element relative to the viewport.

Viewport. Source: MDN Web Docs.
Source: https://developer.mozilla.org/en-US/docs/Web/API/Element/getBoundingClientRect
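As a rough sketch of how such a measurement could be stored alongside each navigable element, the helper below reads the rectangle and derives its center point. `Element.getBoundingClientRect` is the real DOM API; the helper name `measure` and the shape of the returned object are assumptions for illustration.

```javascript
// Hypothetical helper: reads an element's position and size relative to
// the viewport and adds its center point. `element` is expected to be a
// DOM Element (anything exposing getBoundingClientRect works).
function measure(navId, element) {
  const { left, top, width, height } = element.getBoundingClientRect();
  return {
    navId,
    left,
    top,
    width,
    height,
    centerX: left + width / 2,
    centerY: top + height / 2,
  };
}
```

In a browser, this could be called with the DOM node held by the ref returned from the Hook, for example `measure("ok-button", ref.current)`, assuming the ref follows React’s usual `current` convention.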

With this information in the navTree, it was now possible to recreate the same view with the navigable elements automatically. Using this view, we were able to determine which element would be selected next for a given direction, based on the information contained in the navTree, without adding extra elements to the DOM.

Navtree

The second area for improvement was the coupling between the elements. It may be evident from the section above that NavRoot was doing a lot: receiving the events from the user, dictating which element is currently focused, adding and removing elements in the navigable tree, etc. Thus, it was divided into three new modules with specific roles:

  1. NavEngine: Keeping the navigable tree up to date
  2. FindFocusIn: Keeping track of the currently focused element
  3. Navigate: Acting as the navigation motor, finding the next element that should be focused when users hit any of the arrow keys

Personally, the last module was the one that I found particularly interesting, as it uses some topics from computer science. Since the representation for focusable elements is a tree, it uses the lowest common ancestor algorithm to find the parent element. This is useful for cases like cyclic navigation where the parent element is involved.

Binary tree
Lowest common ancestor in a binary tree.
Source: https://www.geeksforgeeks.org/lowest-common-ancestor-binary-tree-set-1/
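The article does not show how the library computes the lowest common ancestor, so the following is only a sketch of one common approach for a general (not necessarily binary) tree in which every node knows its parent. The `parentOf` map and both function names are hypothetical.

```javascript
// Hypothetical parent lookup: navId -> parent navId (the root has none).
const parentOf = new Map([
  ["sidebar-menu", "root"],
  ["main", "root"],
  ["search-item", "sidebar-menu"],
  ["settings-item", "sidebar-menu"],
  ["toplist-link", "main"],
]);

// Returns the path from a node up to the root, the node itself included.
function pathToRoot(navId) {
  const path = [];
  for (let id = navId; id !== undefined; id = parentOf.get(id)) {
    path.push(id);
  }
  return path;
}

// Lowest common ancestor: the first node on b's path to the root that
// also appears on a's path.
function lowestCommonAncestor(a, b) {
  const ancestorsOfA = new Set(pathToRoot(a));
  return pathToRoot(b).find((id) => ancestorsOfA.has(id));
}
```

For two items in the same menu, the result is the menu container, which is where a setting such as cyclic navigation would be looked up. For an item in the menu and one in the main area, the result is the root.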

Then there were other challenges that I enjoyed learning from, such as choosing the next element according to its dimensions, relative position, and the direction of the selection. In most cases, this is trivial, as the user interface uses matrices or rows, but there are other cases where it isn’t super clear (for example, coming back to the navigation menu from the main view).
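To illustrate the kind of geometry involved, here is a simple heuristic that picks the next element from a list of rectangles. It is not the library’s actual algorithm: the scoring rule (distance along the pressed direction plus twice the misalignment on the other axis) and the names `center` and `findNext` are assumptions chosen to keep the sketch short.

```javascript
// Rect shape assumed here: { navId, left, top, width, height }.
function center(rect) {
  return { x: rect.left + rect.width / 2, y: rect.top + rect.height / 2 };
}

// Hypothetical heuristic: among the rects lying in the given direction,
// pick the one with the lowest score, where misalignment on the cross
// axis is penalized twice as much as distance on the main axis.
function findNext(current, rects, direction) {
  const from = center(current);
  let best = null;
  let bestScore = Infinity;
  for (const rect of rects) {
    if (rect.navId === current.navId) continue;
    const to = center(rect);
    const dx = to.x - from.x;
    const dy = to.y - from.y;
    // Distance travelled along the pressed direction (must be positive).
    const main = { left: -dx, right: dx, up: -dy, down: dy }[direction];
    if (!(main > 0)) continue;
    const cross = direction === "left" || direction === "right" ? dy : dx;
    const score = main + 2 * Math.abs(cross);
    if (score < bestScore) {
      best = rect;
      bestScore = score;
    }
  }
  return best; // null when nothing lies in that direction
}
```

In a regular grid, this returns the direct neighbor. The less obvious cases, like returning to the side menu from a tall tile in the main view, are where the choice of weights starts to matter.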

Building this business logic was not an easy task, especially since we wanted to be able to unit test it. To simplify the process, we needed to create a new project that would produce a representation of a navigable tree without running the application. The idea behind it was to provide a drawing tool where users can simulate different scenarios of how the selectable elements will be positioned, and this tool would return a navigable tree in JSON format. This output would then be used as the input for the unit tests.

Our tooling for creating a JSON representation of a view to be unit tested.
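The article does not document the JSON format the tool produces, so the fixture below is a guess at what such output might look like, together with a helper that turns it into plain data for a test. The JSON shape and the function `leafRects` are assumptions.

```javascript
// Hypothetical output of the drawing tool: a navigable tree in which every
// leaf carries the rectangle drawn for it. The JSON shape is an assumption.
const fixture = JSON.parse(`{
  "navId": "root",
  "children": [
    { "navId": "search-item", "rect": { "left": 0, "top": 0, "width": 80, "height": 40 } },
    { "navId": "toplist-link", "rect": { "left": 100, "top": 0, "width": 200, "height": 200 } }
  ]
}`);

// Flattens the tree into the list of leaf rectangles a test would feed to
// the navigation logic.
function leafRects(node) {
  if (!node.children) {
    return [{ navId: node.navId, ...node.rect }];
  }
  return node.children.flatMap(leafRects);
}

// A unit test can then assert on plain data, without rendering anything.
const rects = leafRects(fixture);
console.assert(rects.length === 2, "expected two focusable elements");
console.assert(rects[1].left === 100, "toplist-link starts at x = 100");
```

Because the fixture is plain data, the navigation logic can be exercised in a test runner without a browser or a TV device.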

Having integrated the latest version of this spatial navigation library into our application, and not having needed any major updates since, we can confidently say that it fulfills the basic need of providing an input that is easy for our users and easy to understand for the development team. Our next step is to closely evaluate the potential performance costs, as Spotify continues to expand to more TVs. This aspect becomes important on low-end devices and those with different content formats.

Special thanks to Erin Depew, Daniel Lopes Alves, Dennis Gulich, Andreina Loriente and Yasa Akbulut. This post wouldn’t have been possible without their guidance and help.

Tags: backend


