
high quality steerable ai videos

2024-10-06T03:11:00

i loved deforum's steerable video capabilities but they were always glitchy, so when i found animatediff i was very pleased with the quality but very much missing the control. i had even built out some tools to steer deforum using synesthesia's audio engine and wanted to make use of those with my beefy new gpu. i haven't found anything newer that does this, so my best idea is to fuse them.

so far my best attempts have come from rendering a deforum video and then running animatediff over it in a vid2vid pass. i haven't had any luck with hyper/lightning model quality, so right now it's running pretty slow, taking anywhere from 15 minutes to 2.5 hours just for the vid2vid step on a 120 frame clip.

basic process:

  • process audio data from a song using synesthesia and other tools
  • use audio data to construct keyframes for comfyui (see the schedule sketch after this list)
  • come up with a list of prompts to use at different calculated times in the song
  • render deforum with keyframed motion & prompt
  • vid2vid with animatediff (i'm using a 16 frame window with 8 frame overlap, with the v3_sd15_mm motion module; see the windowing sketch after this list)
  • optionally use hires fix upscaler (veerrry slow) or upscale frames afterwards
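
here's a rough sketch of the audio-to-keyframe step. it's not my actual tooling (that leans on synesthesia's engine); librosa stands in for the audio analysis, and the fps, prompt list, and zoom mapping are all placeholder choices. the point is the shape of what deforum expects: a "frame: (value)" keyframe string for motion parameters, plus prompts keyed by frame number.

```python
# rough sketch, not the real pipeline: librosa stands in for synesthesia's
# audio engine, and FPS / PROMPTS / SECTION_TIMES / the zoom mapping are
# placeholders. output is the shape deforum wants: "frame: (value)" keyframe
# strings for motion params and a frame-indexed prompt dict.
import json

import librosa
import numpy as np

FPS = 15                     # render framerate (placeholder)
PROMPTS = [                  # placeholder prompts, one per song section
    "a neon city dissolving into fractals",
    "an ocean of liquid chrome under strobing light",
]
SECTION_TIMES = [0.0, 32.0]  # seconds where each prompt kicks in (placeholder)


def audio_to_schedules(path):
    y, sr = librosa.load(path, mono=True)
    rms = librosa.feature.rms(y=y)[0]                        # per-hop loudness curve
    hop_times = librosa.frames_to_time(np.arange(len(rms)), sr=sr)

    n_frames = int(len(y) / sr * FPS)
    frame_times = np.arange(n_frames) / FPS

    # resample the loudness curve at video-frame times, normalize to 0..1,
    # and map it onto a small zoom range so loud sections push in harder
    energy = np.interp(frame_times, hop_times, rms)
    energy = (energy - energy.min()) / (energy.max() - energy.min() + 1e-8)
    zoom = 1.0 + 0.05 * energy

    # deforum-style keyframe string: "0: (1.0123), 1: (1.0200), ..."
    zoom_schedule = ", ".join(f"{i}: ({z:.4f})" for i, z in enumerate(zoom))

    # prompts keyed by frame number, matching the keyframed prompt format
    prompt_schedule = {str(int(t * FPS)): p for t, p in zip(SECTION_TIMES, PROMPTS)}
    return zoom_schedule, json.dumps(prompt_schedule, indent=2)


if __name__ == "__main__":
    zooms, prompts = audio_to_schedules("song.wav")
    print(zooms[:120] + " ...")
    print(prompts)
```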
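
and here's what the 16/8 windowing works out to over a 120 frame clip. this is just the sliding-window arithmetic, not comfyui's actual context scheduler, but it shows why the overlap matters: each window shares half its frames with the previous one, which is what keeps the motion module's output consistent across window boundaries.

```python
# toy illustration of the 16 frame window / 8 frame overlap setting on a
# 120 frame clip; only the window arithmetic, not the actual scheduler.
def context_windows(n_frames, window=16, overlap=8):
    stride = window - overlap
    windows = []
    start = 0
    while start < n_frames:
        end = min(start + window, n_frames)
        windows.append((start, end))
        if end == n_frames:
            break
        start += stride
    return windows


if __name__ == "__main__":
    wins = context_windows(120)
    print(len(wins), "windows")   # 14 windows cover the clip
    print(wins[0])                # (0, 16)
    print(wins[1])                # (8, 24), shares 8 frames with the first
```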

things to try:

  • start small and low framerate, scale up both later
  • rife frame interpolation at various steps
  • flux, sdxl, sd3 if possible (existing animatediff built for sd1.5)
    • maybe use in upscale
  • cogvideo (not sure if vid2vid possible here)
