AI & generative music tools
This is a loose collection of thoughts and notes I’ve gathered on generative music. I started making generative music years ago with Sonic Pi ♥ and later moved to using ToneJS in a web context so I could synchronize it to WebGL visuals. Lately the game has changed dramatically with AI, which has introduced a ton of new tools. It seems to me there’s a huge unexplored space where these two things overlap.
Generative Code Music
Sonic Pi https://sonic-pi.net
Sonic Pi rocks. I used it early on, when I understood little about what I was doing, and it made a lot of stuff fun and easy. It’s written in and based around Ruby, a language I have basically only used in this context, but it really can be a surprisingly good place to start for the code-curious.
I ultimately found the language a bit foreign and the context limiting for my skillset (possibly for a lot of the same reasons that make it great to start with).
This is my collection of Sonic Pi scripts. They require certain folders and must be run in a certain order, and none of that is documented, but maybe someone will find helpful ideas within: https://github.com/kylegrover/sonicpi-algobop
Using Sonic Pi:
Sonic Pi is a standalone environment; that’s part of what makes it so easy to start with. Simply download it and try it out using the built-in project files. They also have a great tutorial:
https://sonic-pi.net/tutorial.html#section-1
ToneJS https://tonejs.github.io/
I switched to ToneJS after Sonic Pi because it runs in the web context I’m familiar with and lets me tie it into other web actions. I still use ToneJS today; for example, on my homepage https://ufffd.com/ it runs the mouseover sounds based on interaction events from the 3D system.
Here is a framework I built in ToneJS to generate songs for a few generative art projects I made: https://github.com/kylegrover/js-algobop
You can see it in practice here: https://www.fxhash.xyz/generative/slug/bop
Using ToneJS:
A lot has been written about ToneJS; it’s a popular library. Try starting here:
- https://github.com/Tonejs/Tone.js/wiki/Installation
- https://medium.com/dev-red/tutorial-lets-make-music-with-javascript-and-tone-js-f6ac39d95b8c
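To give a flavor of the generative approach these tutorials build toward, here is a minimal, hypothetical sketch of the kind of logic a framework like js-algobop is built on (this is not the actual js-algobop code): pick notes from a scale with a seeded random walk, then hand the phrase to a scheduler. The Tone.js playback call is left as a comment so the sketch runs anywhere.

```javascript
// Minimal generative-melody sketch (hypothetical, not the js-algobop API).
// A seeded PRNG (mulberry32) keeps output reproducible; swap in Math.random
// if you want a different phrase on every run.
function mulberry32(seed) {
  return function () {
    seed |= 0; seed = (seed + 0x6D2B79F5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

const SCALE = ["C4", "D4", "E4", "G4", "A4"]; // C major pentatonic

// Random walk over scale degrees: mostly small steps, occasional leaps,
// clamped to the scale so the melody never wanders out of range.
function generatePhrase(length, seed) {
  const rand = mulberry32(seed);
  let degree = Math.floor(rand() * SCALE.length);
  const phrase = [];
  for (let i = 0; i < length; i++) {
    phrase.push(SCALE[degree]);
    const step = rand() < 0.8 ? (rand() < 0.5 ? -1 : 1) : (rand() < 0.5 ? -2 : 2);
    degree = Math.min(SCALE.length - 1, Math.max(0, degree + step));
  }
  return phrase;
}

const phrase = generatePhrase(8, 42);
console.log(phrase);
// In a Tone.js context you would then schedule these notes, e.g. with
// new Tone.Sequence((time, note) =>
//   synth.triggerAttackRelease(note, "8n", time), phrase).start(0);
```

The seeded PRNG is what makes this kind of framework work for generative art platforms like fxhash, where the same seed must always reproduce the same song.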
AI Generative Music
RAVE
RAVE, a realtime generative model made by Antoine Caillon and Philippe Esling at IRCAM in late 2021, is to this day my favorite model I’ve tried. I haven’t found anything more modern (other than RAVE v2) that is as fun to use or allows the same kind of control. Rather than prompting with text, you manipulate some underlying latents and/or do realtime style transfer based on a tuned model to create bizarre musical sounds with a very human touch. I love this so much more than typing in “cool hihat loop 140bpm”, waiting a few seconds, and then receiving back maybe a hihat loop that might be cool and might be at 140bpm (or maybe something entirely unrelated), and then having no control over it beyond the normal sound manipulation means I can use on any other (higher quality!) sample.
Using RAVE:
Demo: https://caillonantoine.github.io/ravejs/
Try out a few pretrained models in a UI format. This shows you what it’s sonically capable of, but it doesn’t demonstrate the awesome realtime possibilities…
Realtime usage: https://github.com/acids-ircam/RAVE#realtime-usage
This is definitely harder and less documented than Sonic Pi or ToneJS, but it’s very doable with any hardware and moderate computer knowledge. Get some pretrained models to use here: https://acids-ircam.github.io/rave_models_download
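Conceptually, “manipulating latents” boils down to simple vector math on the model’s latent representation: encode a sound, blend or bias the resulting vectors, and decode the result. The encode/decode steps belong to RAVE itself; this hypothetical sketch just shows the two operations you would put between them, written as plain array math:

```javascript
// Hypothetical latent-space operations (the encode/decode steps are RAVE's
// job; these are only the blends you'd apply between them).

// Linear interpolation between two latent vectors: t=0 gives a, t=1 gives b.
// Sweeping t over time morphs one sound into the other.
function lerpLatents(a, b, t) {
  return a.map((v, i) => v * (1 - t) + b[i] * t);
}

// Add a constant offset to a single latent dimension, a common way to
// "steer" a decoder's output along one axis of variation.
function biasDimension(latent, dim, amount) {
  const out = latent.slice();
  out[dim] += amount;
  return out;
}

const a = [0.0, 1.0, -0.5, 2.0];
const b = [1.0, 0.0, 0.5, 0.0];
console.log(lerpLatents(a, b, 0.5)); // halfway blend of the two vectors
console.log(biasDimension(a, 1, 0.25)); // nudge dimension 1 upward
```

In the realtime nn~ setup, this is essentially what you do by hand: patching signals into the latent inputs between the encoder and decoder.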
Training:
This is where it gets a little harder and requires either a beefy GPU or some money for cloud credits and a little extra patience. I trained a few rounds of RAVEv1 in the cloud successfully but it was a headache in many ways compared to simply having a decent modern GPU to train on.
[ I will update this section when I train a new RAVE model ]
Someone else’s notes on training v2: https://github.com/acids-ircam/RAVE/issues/312
Colab for training v2: https://colab.research.google.com/drive/1ih-gv1iHEZNuGhHPvCHrleLNXvooQMvI?usp=sharing
I Wish It Had:
- Nicer UI, for training and for inference. nn~ is cool, but it would be great to bundle it into a custom tool or VST. (I do think some VST work exists.)
- A better place to share trained models.
- More checkpoints to start from.
- CLIP/CLAP/prompt training.
Stable Audio Tools & Stable Audio Open 1.0
Stable Audio Tools is a set of scripts and a UI for training and running inference with Stable Audio Open 1.0.
I’ve experimented with it only a bit. I think it’s the best that we’ve got locally, but it feels a little restrained. My general feeling on AI audio is that it’s held back in comparison to images and video because the music industry has a much more codified system of enforcing IP, in the form of sampling and referencing.
Using Stable Audio Tools:
https://huggingface.co/stabilityai/stable-audio-open-1.0/discussions/29
Quality setup info from mildly deranged internet bro Assbang 👍
I Wish It Had:
- keyframing, timeline, prompt traveling
- autolooping mode with variation
- upres
- demix
- basically I want comfyui but for audio i think
- there is a comfyui node…
- higher quality tuning (they cant release v2 because of IP shit)
Google Music FX
https://aitestkitchen.withgoogle.com/tools/music-fx
Meh. Many great brains combined with corporate restriction and misguided financial interests to produce a tool that I find uninspiring and unhelpful for musicians. The best case it can serve is cutting into the market of royalty-free music artists like Epidemic, Free Sounds, and Kevin MacLeod.
Popular Tools I haven’t Used
Suno, Udio, and ElevenLabs are hugely popular tools that I haven’t really tried. I tend not to bother with paid closed services unless it’s a pay-per-use API at a low cost. I do not want to subscribe, and I do not want to purchase credits.
https://github.com/facebookresearch/audiocraft is also supposedly decent; I just haven’t tried it yet.