m-onz

ai music generation

14.07.24

I've written about AI music generation before, but those articles quickly went out of date, and I personally find documenting the stuff I do way more tedious than actually carrying on experimenting. So the articles stop feeling helpful and get taken down. This is my latest attempt to give you a crash course in generating music using gen AI.

I performed a set using generative AI music (soundcloud link below) but I felt bored and dissatisfied performing with it. Even though I managed to get some nice output, it felt like DJ'ing, which I'm really not interested in as an art form. In future I'm going to do live coding with my own Pure Data & GEM system, although I will still use gen AI music in some contexts, such as installations.

There are two different routes you can take when generating music using machine learning or AI. The first is commercial (and controversial) platforms like udio & suno, which will let you generate high fidelity audio from simple prompts like "funky house track". You can attempt to use a prompt like "daft punk" but it will detect that and convert your prompt into "french house", for example. There is a question about whether these platforms trained their models on copyrighted material. My guess is they definitely did... in the output you can hear the influence of tracks that are definitely under copyright. If you generate enough dubstep you will hear motifs from Skream, and it can also generate Beatles-esque tracks.

suno web interface

udio web interface

To use suno or udio you can sign up and pay for a subscription. This will give you tokens/credits to generate audio in pretty much any existing music genre. I was hoping to get these machine learning models to "interpolate" between different genres in the hope of getting new genres of music and something interesting. From my experience, although the audio fidelity is impressive, these models can only regurgitate their training data. With lots of trial and error a certain percentage of the output is cool, but it's very easy to generate crap, stock-music-style output.

suno now allows uploading 60 seconds of your own audio (it scans it for copyrighted material, so in theory you can't upload the Beatles anthology) and I've had some interesting results uploading an indie rock track and then asking it to extend it into a new genre like drum and bass. This is definitely worth playing around with in my opinion.

These commercial platforms are cagey, have fuzzy legal and copyright implications, and might get shut down by the copyright police at some point. I would play around with them while you still can! They only offer a web interface with no API, so those who want to automate this new ML will be disappointed. You cannot write scripts to generate audio and you'll be forced to spend ages clicking and waiting for output.

Open source music generation

Before I started using udio & suno I was experimenting with open source music gen models on replicate.com. Replicate lets you call models from scripts, so I was able to generate many tracks in a short time using Node.js scripts.
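A batch script along these lines might look something like the sketch below. It is not the original script: `runBatch`, `generate` and `concurrency` are hypothetical names, and the function that actually talks to a hosted model is passed in rather than hard-coded, so the batching logic does not depend on any particular service.

```javascript
// Sketch: run a list of prompts through a generation function, a few at a time.
// `generate` is an injected async function (prompt -> output); all names here
// are illustrative assumptions, not part of any real client library.

async function runBatch(prompts, generate, concurrency = 2) {
  const results = new Array(prompts.length);
  let next = 0;

  // Each worker keeps claiming the next unprocessed prompt until none are left.
  async function worker() {
    while (next < prompts.length) {
      const index = next++;
      const prompt = prompts[index];
      try {
        results[index] = { prompt, output: await generate(prompt) };
      } catch (err) {
        // Record the failure and carry on with the rest of the batch.
        results[index] = { prompt, error: String(err) };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, prompts.length));
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results;
}
```

With Replicate's Node.js client, `generate` could be a small wrapper around its `replicate.run(...)` call. The model identifier and the input fields vary from model to model, so take them from the model's own page rather than from this sketch.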

The audio quality is not as good as suno & udio, so I had to do intense prompt engineering to get the output I wanted. These open source models can also be run on your own computer. It's worth being aware of these models because they could be useful in situations where you would like to automate generation or generate offline without the internet.
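One way to make that kind of prompt engineering less manual is to generate prompt variations from a few lists of descriptors and then listen through the results. The sketch below is only an illustration: `buildPrompts` is a hypothetical helper, and the mood and detail strings are made-up examples rather than prompts known to work well with any particular model.

```javascript
// Sketch: combine every mood with every genre and append shared details.
// The descriptor values are illustrative assumptions.

function buildPrompts({ genres, moods, details = [] }) {
  const suffix = details.length ? ", " + details.join(", ") : "";
  const prompts = [];
  for (const genre of genres) {
    for (const mood of moods) {
      prompts.push(`${mood} ${genre}${suffix}`);
    }
  }
  return prompts;
}

const prompts = buildPrompts({
  genres: ["drum and bass", "french house"],
  moods: ["dark", "uplifting"],
  details: ["punchy drums", "no vocals"],
});

console.log(prompts);
```

Each extra descriptor list multiplies the number of prompts, so it is worth starting with short lists and only expanding the ones that seem to change the output.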

Thank you

That was a really quick overview of how to generate music with AI. I wish I had more time to go into more detail and give you more examples of prompts, scripts and other tips and tricks. Have fun!

links

replicate
suno
udio