New local AI music generation model produces "first draft" for composers in one minute

04 Dec 2023

Lianhe Zaobao, 4 Dec 2023, 本地新型AI音乐生成模型 为作曲家一分钟内提供“初稿”
(translation)
 
The first "controllable text prompt" artificial intelligence music generation model has been developed locally. Experts believe that it will greatly improve the efficiency of music composition and quickly provide high-quality "first drafts" for composers.
 
The AI software, called Mustango, was developed by a six-person research team led by Soujanya Poria and Dorien Herremans, two assistant professors at the Singapore University of Technology and Design, in about half a year. Developed in March, it can quickly generate musical pieces that meet specific requirements such as chords, beats, tempo and pitch, based on professional music text prompts entered by users.
 
The project builds on TANGO, an AI software that Dr Poria had developed earlier. TANGO can convert text into sounds such as speech and music in a few seconds.
 
On this basis, the research team used an original "data augmentation method" to create a music dataset called MusicBench. The researchers then used their own music information retrieval method to extract the musical features in the dataset and associate these features with text descriptions, allowing Mustango to create music from text within one minute.
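
As a rough illustration of how extracted musical features might be paired with a text description, the Python sketch below appends plain-language sentences about chords, tempo, key and beat to an existing caption. It is a sketch under stated assumptions, not the team's actual pipeline: the `MusicFeatures` class, the `enrich_caption` function and the sentence templates are all hypothetical and are not taken from Mustango or MusicBench.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MusicFeatures:
    """Hypothetical container for features extracted from one audio clip."""
    chords: List[str]      # chord symbols in order, e.g. ["Dm", "Gm", "A7"]
    tempo_bpm: float       # estimated tempo in beats per minute
    key: str               # estimated key, e.g. "D minor"
    beats_per_bar: int     # estimated beat count per bar


def enrich_caption(caption: str, feats: MusicFeatures) -> str:
    """Append one sentence per feature to the original caption.

    The sentence templates are illustrative assumptions only.
    """
    # Normalise the caption so it ends with exactly one full stop.
    parts = [caption.rstrip(". ") + "."]
    if feats.chords:
        parts.append("The chord progression is " + ", ".join(feats.chords) + ".")
    parts.append(f"The tempo is {feats.tempo_bpm:.0f} bpm.")
    parts.append(f"The key is {feats.key}.")
    parts.append(f"There are {feats.beats_per_bar} beats per bar.")
    return " ".join(parts)


if __name__ == "__main__":
    example = MusicFeatures(
        chords=["Dm", "Gm", "A7"], tempo_bpm=50.0, key="D minor", beats_per_bar=4
    )
    print(enrich_caption("A slow, mournful piano piece", example))
```

Pairing each clip with a caption enriched in this way would give a model text that names the musical properties a user may later want to control, which is the general idea the paragraph above describes.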
 
"This AI software focuses specifically on controllability, allowing users to input specified chord sequences and rhythm preferences, providing unprecedented flexibility for composers, sound designers and podcasters," said Poria.
 
This SUTD research has been published as a preprint on arXiv, where it can be shared with peers and receive feedback at any time. The research has also been made available to the public through Hugging Face, an open data and machine learning platform. The platform currently hosts multiple sample text prompts that use professional music terms such as "D minor" and "Largo".

NUS Assistant Professor: May be suitable for creating music clips for social media, advertising, and movies and games
After testing Mustango, Chen Zhangyi, assistant professor (music theory, composition) at the National University of Singapore's Yong Siew Toh Conservatory of Music, pointed out that most of the generated clips sounded quite accurate and were basically consistent with the timbre, style and emotion described in the text prompts.
 
"My first reaction was that this AI software could be useful for creating music clips for social media, advertising, and movies and games," he said.
 
From a composer's perspective, he believes that the software is easy to use and covers a variety of musical styles, genres and sound combinations, which can help composers quickly create a "first draft" of a work. By entering very specific text prompts, composers can also get a general sense of how the target work will sound, which speeds up their preparation before composing.
 
He added, "If you make good use of this software, you can achieve results that complement traditional composition skills."