Recently, I started a new open-source project called SongGPT that explores the potential of Language Models in generating original and customizable musical compositions. However, I wanted to ask people who actually understand how LLMs work: is this even a good idea? Or would it be a better approach to use domain-specific models here, since text-based models will never have a true understanding of “music”?
Basically no. LLMs aren’t trained on audio. Roughly speaking, the only way they could ‘learn’ to write music is by ‘reading’ sources where the notes are written out, e.g. an article saying ‘play Fur Elise with these notes: E D# E D#’, and so on.
In the future, AI models will begin to incorporate audio, and then you might be able to try it. But that’s a long way off for now. Save your project for five years’ time.
This currently only uses a simple system message before calling the ChatGPT API. I can imagine that if someone put a bit more effort in here, it could become something. No?
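For reference, here is a minimal sketch of what that looks like: a single system message asking for output in a text-based notation (ABC), then one chat completion call. It assumes the openai Python package (v1+) and an OPENAI_API_KEY in the environment; the prompt wording and model name are illustrative, not SongGPT’s actual setup.

```python
# Sketch: prompt ChatGPT with a system message that constrains it to a
# text-based music notation (ABC), then return the generated score.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a songwriting assistant. Respond only with a melody written in "
    "ABC notation, including a key, a meter, and at least eight bars."
)

def generate_melody(description: str) -> str:
    """Ask the model for an original melody matching a plain-text description."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat-capable model works here
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": description},
        ],
        temperature=0.9,  # a higher temperature gives more varied musical ideas
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(generate_melody("a gentle, melancholic waltz in a minor key"))
```

The point being: everything the model “knows” about music here comes from text, so the interesting work is in the prompt and in how you post-process the notation, not in the model itself.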
Sorry for the necro-post, but I have to say that the limitation of GPTs is essentially data and patterns. If you have enough data reflecting some reasonable relationships, you can basically set up a space that you can ‘mine’ for new material that fits the embedded patterns.
It is strange, but an AI system can already tell a lot about you by posing a few questions and listening to the sound of your voice as you answer. People are different, but for some purposes they fall into surprisingly comprehensive archetypes. AI could get particularly good at communicating exactly what is right for a given person, including music.
AI will keep advancing quickly. GPTs have really opened things up, but they are not the only way automatons can make sense of the world and do useful things. In 2024 I expect the pace of progress in AI to continue accelerating as AI is increasingly used to advance itself.
There are various techniques used to create expectations, introduce changes, build tension and resolution, and structure different parts of music. Different types of music appeal to different people in different ways. In theory, an AI assistant could understand your current state based on factors like the time of day and your physical condition (such as sleepiness, hunger, or temperature), and determine what kind of music would suit you in a particular context. It could then generate a soundtrack that resonates with you in a way you would enjoy.

For example, you might benefit from reflective music when you’re relaxing and unwinding, romantic music when you’re falling in love, or uplifting music when you’ve just left a toxic relationship or quit a job you hate.

The remarkable aspect is that as AI systems advance, they will improve in terms of knowledge, understanding (or at least the appearance of understanding), perception, and their ability to generate and interact with the world. No human artist could match the ability of a well-designed AI, embodied in the right way, to produce precisely what an individual needs at any given moment, considering their current state and the surrounding context.