{"id":138868,"date":"2025-04-14T21:48:01","date_gmt":"2025-04-14T21:48:01","guid":{"rendered":"https:\/\/showbizztoday.com\/index.php\/2025\/04\/14\/text2tracks-improving-prompt-based-music-recommendations-with-generative-retrieval\/"},"modified":"2025-04-14T21:48:01","modified_gmt":"2025-04-14T21:48:01","slug":"text2tracks-improving-prompt-based-music-recommendations-with-generative-retrieval","status":"publish","type":"post","link":"https:\/\/showbizztoday.com\/index.php\/2025\/04\/14\/text2tracks-improving-prompt-based-music-recommendations-with-generative-retrieval\/","title":{"rendered":"Text2Tracks: Improving Prompt-based Music Recommendations with Generative Retrieval"},"content":{"rendered":"<div>\n<div class=\"published-date\">\n<div class=\"icon-holder\">\n                                                <img decoding=\"async\" src=\"https:\/\/research.atspotify.com\/wp-content\/themes\/spotify\/images\/icon.png\" alt=\"\"\/>\n                                            <\/div>\n<p><span class=\"date\">April 08, 2025<\/span> Published by Enrico Palumbo, Gustavo Penha, Andreas Damianou, Jos\u00e9 Luis Redondo Garc\u00eda, Timothy Christopher Heath, Alice Wang, Hugues Bouchard, Mounia Lalmas<\/p>\n<\/p><\/div>\n<div class=\"img-holder\">\n                                            <img src=\"https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/RS069-Text2Tracks_-Improving-Prompt-based-Music-Recommendations-with-Generative-Retrieval-no-logo.png\" class=\"attachment-post-thumbnail size-post-thumbnail wp-post-image\" alt=\"\" decoding=\"async\" fetchpriority=\"high\" srcset=\"https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/RS069-Text2Tracks_-Improving-Prompt-based-Music-Recommendations-with-Generative-Retrieval-no-logo.png 1800w, 
https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/RS069-Text2Tracks_-Improving-Prompt-based-Music-Recommendations-with-Generative-Retrieval-no-logo-250x131.png 250w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/RS069-Text2Tracks_-Improving-Prompt-based-Music-Recommendations-with-Generative-Retrieval-no-logo-700x368.png 700w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/RS069-Text2Tracks_-Improving-Prompt-based-Music-Recommendations-with-Generative-Retrieval-no-logo-768x403.png 768w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/RS069-Text2Tracks_-Improving-Prompt-based-Music-Recommendations-with-Generative-Retrieval-no-logo-1536x806.png 1536w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/RS069-Text2Tracks_-Improving-Prompt-based-Music-Recommendations-with-Generative-Retrieval-no-logo-120x63.png 120w\" sizes=\"(max-width: 1800px) 100vw, 1800px\"\/><figcaption\/>\n                                        <\/div>\n<p>Imagine asking your music app to \u201c<em>play some old-school rock ballads to relax<\/em>\u201d and immediately receiving the right track recommendation. This kind of personalized music discovery is the goal of Text2Tracks, a new system based on generative AI designed to improve how music is recommended based on your choice of words.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Why Current Methods Fall Short<\/strong><\/h2>\n<p>Some current music recommendation features, such as those seen in Spotify, have begun to incorporate off-the-shelf Large Language Models (LLMs) to process natural language prompts. For instance, LLMs can generate <strong>artist names and track titles<\/strong> as their output (e.g. <em>User: \u201cCan you recommend some chill songs for me?\u201d 
LLM: \u201cSure, I can recommend &lt;artist_name&gt; \u2013 &lt;track_name&gt; for you\u201d<\/em>).\u00a0<\/p>\n<p>While intuitive, identifying tracks by their titles has several limitations:<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Song titles are not descriptors<\/strong>: Song titles don\u2019t always reflect the mood or style of the music. For example, two songs with similar names could have completely different genres or vibes.<\/li>\n<li><strong>Song titles can be ambiguous<\/strong>: Songs often have multiple versions (e.g., live, acoustic, remastered), and it\u2019s not always clear which one to recommend.<\/li>\n<li><strong>Slow and expensive<\/strong>: Song titles and artist names can be quite long. Since LLMs generate their responses one token (i.e. a piece of a word) at a time, generating the full title is computationally expensive and time-consuming.<\/li>\n<\/ul>\n<p>To overcome these challenges, we developed <strong>Text2Tracks<\/strong>, a model that introduces a more efficient and effective way to recommend music by fine-tuning an LLM to directly generate optimized track identifiers (IDs) given a music recommendation prompt.<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"700\" height=\"410\" src=\"https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/image1-700x410.png\" alt=\"\" class=\"wp-image-5949\" style=\"width:820px;height:auto\" srcset=\"https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/image1-700x410.png 700w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/image1-250x147.png 250w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/image1-768x450.png 768w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/image1-1536x901.png 1536w, 
https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/image1-120x70.png 120w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/image1.png 1999w\" sizes=\"auto, (max-width: 700px) 100vw, 700px\"\/><\/figure>\n<\/div>\n<p class=\"has-text-align-center\"><em>Figure 1: Text2Tracks provides music recommendations by generating track IDs that are relevant to the user\u2019s prompt.<\/em><\/p>\n<h2 class=\"wp-block-heading\"><strong>How Text2Tracks Works<\/strong><\/h2>\n<p>Rather than generating actual song titles and matching them to a database, Text2Tracks uses <strong>Generative Retrieval<\/strong>: a generative AI technique where the system is trained to generate track IDs directly from text prompts. These IDs directly identify songs in the music catalog and allow for faster, more accurate recommendations.<\/p>\n<p>Here\u2019s how it works (Fig. 1):<\/p>\n<ul class=\"wp-block-list\">\n<li>The playlist dataset is pre-processed to obtain pairs of playlist titles and track IDs, experimenting with different strategies to create the track identifiers.<\/li>\n<li>An LLM is fine-tuned on the pre-processed playlist data, enabling Text2Tracks to learn the relationship between user requests (like \u201cchill acoustic vibes\u201d) and the songs that match them.<\/li>\n<li>At recommendation time, the system generates a set of IDs for songs that match the query using a diversified beam search strategy (i.e., a decoding technique that balances relevance with diversity by exploring multiple plausible sequences, rather than just the top-ranked ones). 
This avoids the need for an additional search step, while still providing a relevant and diverse set of recommendations.<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\"><strong>How Text2Tracks is Trained and Tested<\/strong><\/h2>\n<p>A key difference between Text2Tracks and off-the-shelf LLMs is that Text2Tracks undergoes an extensive fine-tuning process tailored specifically for music recommendation. We fine-tune the model using a large corpus of selected playlists, where long and descriptive playlist titles (e.g., \u201cenergetic rock vibes,\u201d \u201cchill relaxing at the beach\u201d) serve as a proxy for natural language music prompts. This allows the model to learn associations between textual descriptions and relevant track selections.<\/p>\n<p>To evaluate the effectiveness of this approach, we compare the model\u2019s recommendations to playlists curated by expert editors, using the hits@10 metric to measure how often the model\u2019s top predictions align with expert selections. This fine-tuning step helps the model better capture the nuances of music-related language and generate more relevant and contextually appropriate track recommendations.<\/p>\n<h2 class=\"wp-block-heading\"><strong>The Role of Track IDs<\/strong><\/h2>\n<p>One of the most innovative aspects of Text2Tracks is how it assigns IDs to songs. We experimented with three different strategies to find the best way to represent tracks (Tab. 1):<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Content-Based IDs<\/strong>: Use metadata like artist names and track titles to build the ID (<em>Artist Track Name<\/em>). 
While simple and strong as a baseline, this strategy faces the challenges mentioned above in the \u201cWhy Current Methods Fall Short\u201d section.<\/li>\n<li><strong>Integer-Based IDs<\/strong>: Assign unique integers to songs, either in a naive way where each track corresponds to a different integer (<em>Track Integer<\/em>) or in a structured way that makes use of the artist-track hierarchy (<em>Artist Track Integer<\/em>). This approach is easy to implement but may miss more fine-grained relationships between similar tracks, since it relies only on the track\u2019s artist as metadata.<\/li>\n<li><strong>Semantic (Learned) IDs<\/strong>: Use vector discretization techniques to create IDs based on track characteristics, such as genre, mood, and style. For example, two holiday songs might share part of their ID (\u201c&lt;0&gt;&lt;1&gt;\u201d and \u201c&lt;0&gt;&lt;2&gt;\u201d), indicating their similarity. These identifiers can be seen as track \u201czip codes\u201d, identifying portions of a vector space where a track lives. We tested two different vector spaces, one based on text embeddings (i.e. embedding the titles of playlists where the track appears in the training set) and one based on collaborative filtering embeddings (i.e. built by mining patterns of songs appearing together in playlists).<\/li>\n<\/ul>\n<p>We found that <strong>Semantic IDs<\/strong> built on top of <strong>collaborative filtering vectors<\/strong> worked the best. 
This highlights the ability of Semantic IDs and of the underlying vector space to capture more nuances in track representations, resulting in higher accuracy in the final set of recommendations. The <em>Artist Track Integer<\/em> strategy came in second, which is remarkable considering it is relatively simple to implement: it only uses the artist and track and doesn\u2019t need an embedding space. This shows how important the artist-track hierarchy is for modeling track identifiers and how much it can help the model learn to make good track recommendations.<\/p>\n<p>It is also worth noting that the <em>Artist Track Integer<\/em> strategy outperforms the <em>Artist Track Name<\/em> strategy, which is commonly used in many scenarios, such as off-the-shelf LLM recommendations. This further underscores the advantage of building structured track identifiers rather than identifying songs via their titles.<\/p>\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"700\" height=\"218\" src=\"https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/Screenshot-2025-04-07-at-11.28.53-700x218.png\" alt=\"\" class=\"wp-image-5950\" style=\"width:840px;height:auto\" srcset=\"https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/Screenshot-2025-04-07-at-11.28.53-700x218.png 700w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/Screenshot-2025-04-07-at-11.28.53-250x78.png 250w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/Screenshot-2025-04-07-at-11.28.53-768x239.png 768w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/Screenshot-2025-04-07-at-11.28.53-120x37.png 120w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/Screenshot-2025-04-07-at-11.28.53.png 1134w\" sizes=\"auto, (max-width: 700px) 100vw, 
700px\"\/><\/figure>\n<p class=\"has-text-align-center\"><em>Tab. 1: hits@10 for different strategies to create track identifiers in the generative retrieval setting<\/em><\/p>\n<h2 class=\"wp-block-heading has-text-align-left\"><strong>How Text2Tracks Compares to Other Systems<\/strong><\/h2>\n<p>We compared Text2Tracks to other text-based recommendation methods, including:<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Bi-encoders<\/strong>: Transformer models that detect semantic matches between text prompts and songs. In the zero-shot scenario, we simply use the bi-encoder as is to match the query and the tracks. In the fine-tuned one, we fine-tune the bi-encoder on the same playlist data we use for Text2Tracks.<\/li>\n<li><strong>BM25<\/strong>: A state-of-the-art keyword-based retrieval system.<\/li>\n<\/ul>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"700\" height=\"304\" src=\"https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/Screenshot-2025-04-07-at-11.29.00-700x304.png\" alt=\"\" class=\"wp-image-5951\" style=\"width:672px;height:auto\" srcset=\"https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/Screenshot-2025-04-07-at-11.29.00-700x304.png 700w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/Screenshot-2025-04-07-at-11.29.00-250x108.png 250w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/Screenshot-2025-04-07-at-11.29.00-768x333.png 768w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/Screenshot-2025-04-07-at-11.29.00-120x52.png 120w, https:\/\/storage.googleapis.com\/research-production\/1\/2025\/04\/Screenshot-2025-04-07-at-11.29.00.png 846w\" sizes=\"auto, (max-width: 700px) 100vw, 700px\"\/><\/figure>\n<\/div>\n<p class=\"has-text-align-center\"><em>Tab. 
2: hits@10 for Text2Tracks compared with well-known text retrieval approaches<\/em><\/p>\n<p>We see that <strong>Text2Tracks significantly outperformed these baselines<\/strong>, reaching <strong>127%<\/strong> higher accuracy than the closest competitor. When looking into <strong>why<\/strong> it works so well, we qualitatively observe that it tends to recommend highly relevant and canonical tracks for broad prompts, matching the user\u2019s expectation. For instance, when asked for \u201c<em>Christmas classics<\/em>\u201d, Text2Tracks reliably retrieved the most iconic holiday tracks, whereas other systems provided more generic holiday songs.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Looking ahead<\/strong><\/h2>\n<p>In this post, we introduced Text2Tracks, a new approach to music recommendation that generates track IDs directly from a user prompt, delivering faster and more accurate results. Trained on playlist data and leveraging Semantic IDs built on top of collaborative filtering embeddings, Text2Tracks outperforms existing systems for prompt-based music recommendation.<\/p>\n<p>These results highlight the potential of generative models to enrich music search and recommendation on Spotify. We are excited to continue advancing these technologies, with the goal of building a unified generative AI system that enhances multiple facets of music recommendation. Ultimately, we aim to make music discovery more personalized, intuitive, and enjoyable for everyone.<\/p>\n<p>For more information, please refer to our paper:<br \/><a href=\"https:\/\/arxiv.org\/abs\/2503.24193\" target=\"_blank\" rel=\"noopener\">Text2Tracks: Prompt-based Music Recommendation via Generative Retrieval<\/a><br \/>Enrico Palumbo, Gustavo Penha, Andreas Damianou, Jos\u00e9 Luis Redondo Garc\u00eda, Timothy Christopher Heath, Alice Wang, Hugues Bouchard, Mounia Lalmas. 
<\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>April 08, 2025 Published by Enrico Palumbo, Gustavo Penha, Andreas Damianou, Jos\u00e9 Luis Redondo Garc\u00eda, Timothy Christopher Heath, Alice Wang, Hugues Bouchard, Mounia Lalmas Imagine asking your music app to \u201cplay some old-school rock ballads to relax\u201d and immediately receiving the right track recommendation. This kind of personalized music discovery is the goal of [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":138870,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[38],"tags":[8089,9126,280,9127,7486,8090,9125],"class_list":["post-138868","post","type-post","status-publish","format-standard","has-post-thumbnail","category-spotify","tag-generative","tag-improving","tag-music","tag-promptbased","tag-recommendations","tag-retrieval","tag-text2tracks"],"_links":{"self":[{"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/posts\/138868","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/comments?post=138868"}],"version-history":[{"count":0,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/posts\/138868\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/media\/138870"}],"wp:attachment":[{"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/media?parent=138868"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/categories?post=138868"}
,{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/tags?post=138868"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}