A brand-new artificial intelligence (AI) tool from Google can create music in any genre from text prompts and can even render a hummed or whistled melody in other instruments. Google Research describes the system, called MusicLM, as a text-to-music generation model: it works by analysing the text and determining how large and intricate the resulting composition should be.
Google's New AI Tool Can Produce Music from Text Descriptions
According to the research paper, "We develop MusicLM, a model that generates high-fidelity music from text descriptions such as 'a relaxing violin melody supported by a distorted guitar riff.'" The researchers show that MusicLM can transform whistled and hummed melodies to match the style described in a written caption, demonstrating that it can be conditioned on both text and a melody.
Using a dataset of 280,000 hours of music, MusicLM was trained to create songs that make sense from written descriptions and to pick up on details such as mood, melody, and instrumentation. Its capabilities go beyond creating short song snippets: according to Google researchers, the system can also build on existing melodies, whether hummed, sung, whistled, or played on an instrument.
In addition, the research shows that MusicLM can take a series of written descriptors, such as "time to meditate," "time to wake up," "time to run," and "time to give 100%," and turn them into a melodic "story" or narrative lasting several minutes. It can also be directed by a combination of picture and caption, or produce audio that is "played" by a specific kind of instrument in a particular genre.
It should be noted that Google is not the first company to attempt this. According to TechCrunch, projects such as Google's AudioLM, OpenAI's Jukebox, and Riffusion (an AI that creates music by visualising it) have all tried. However, owing to technical restrictions and limited training data, none has been able to produce songs with particularly complex compositions at high fidelity. MusicLM, the researchers suggest, may be the first that can.
"MusicLM makes music at 24 kHz that holds steady for several minutes by modelling the process of conditional music synthesis as a hierarchical sequence-to-sequence modelling task. Our tests reveal that MusicLM works better than older algorithms in terms of both audio quality and adherence to the text description," Google researchers stated in the report.
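The staged, hierarchical design the researchers describe can be illustrated with a deliberately toy sketch: a text prompt is first mapped to coarse "semantic" tokens, those are expanded into finer "acoustic" tokens, and the acoustic tokens are decoded into a waveform at 24 kHz. Everything below is a hypothetical placeholder (function names, token rates, and the hash-based "models" are invented for illustration), not Google's actual implementation:

```python
# Toy illustration of hierarchical sequence-to-sequence audio generation,
# loosely mirroring the staged design described for MusicLM.
# All functions and rates here are hypothetical stand-ins.

SAMPLE_RATE = 24_000  # MusicLM outputs audio at 24 kHz

def text_to_semantic(prompt: str, seconds: int) -> list[int]:
    """Stage 1: map a text prompt to coarse 'semantic' tokens (toy: 25/s)."""
    rate = 25
    return [hash((prompt, i)) % 1024 for i in range(rate * seconds)]

def semantic_to_acoustic(semantic: list[int]) -> list[int]:
    """Stage 2: expand each coarse token into finer 'acoustic' tokens."""
    per_token = 4  # finer temporal resolution per semantic token
    return [(s * 31 + j) % 4096 for s in semantic for j in range(per_token)]

def acoustic_to_waveform(acoustic: list[int], seconds: int) -> list[float]:
    """Stage 3: decode acoustic tokens to waveform samples in [-1, 1]."""
    n = SAMPLE_RATE * seconds
    return [((acoustic[i * len(acoustic) // n] % 200) - 100) / 100
            for i in range(n)]

def generate(prompt: str, seconds: int = 2) -> list[float]:
    """Run the three stages end to end."""
    semantic = text_to_semantic(prompt, seconds)
    acoustic = semantic_to_acoustic(semantic)
    return acoustic_to_waveform(acoustic, seconds)

wave = generate("a relaxing violin melody", seconds=2)
print(len(wave))  # 48000 samples = 2 s of audio at 24 kHz
```

The point of the hierarchy is that each stage works at a coarser timescale than the next, which is what lets the real model stay musically coherent over several minutes while still emitting thousands of samples per second.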
However, MusicLM is not faultless. To begin with, some of the audio samples Google included in their research report have a distorted quality to them. While the technology may theoretically produce vocals, TechCrunch reports that these vocals are frequently synthesised and sound like nonsense. Another negative is the occasionally compressed sound quality, which is an unfortunate effect of training.
Google researchers have also acknowledged the numerous ethical issues a system like MusicLM raises, including a tendency for the generated songs to incorporate copyrighted material from the training data. In one experiment, the researchers found that about 1% of the music the system produced exactly replicated songs it had been trained on. That rate is apparently high enough to discourage Google researchers from releasing the AI system in its current form.
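A memorization check of the kind described boils down to comparing generated clips against the training set and reporting the fraction that match verbatim. The sketch below is a hypothetical illustration of that bookkeeping (clips are modelled as byte strings; the data and helper name are invented), not the researchers' actual methodology, which compared audio at the token level:

```python
# Hypothetical sketch of an exact-match memorization check:
# count how many generated clips appear verbatim in the training set.

def exact_match_rate(generated: list[bytes], training: list[bytes]) -> float:
    """Fraction of generated clips found exactly in the training set."""
    train_set = set(training)  # hashing makes membership checks O(1)
    matches = sum(1 for clip in generated if clip in train_set)
    return matches / len(generated)

# Toy data: 100 generated clips, exactly one copied from training.
training = [bytes([i % 256]) * 8 for i in range(100)]
generated = training[:1] + [b"\xff" * 8 for _ in range(99)]

print(exact_match_rate(generated, training))  # 0.01, i.e. the 1% figure
```

Even a 1% exact-match rate is a meaningful legal exposure at scale, which helps explain the decision not to release the model in its current form.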
The co-authors of the paper stated, "We recognise the danger of potential misuse of creative work connected with the use case." They continued, "We clearly emphasise the necessity for additional future effort to address these dangers related to the development of music."