�'s Ramblings
high quality steerable ai videos
2024-10-06T03:11:00
i loved deforum's steerable video capabilities but they were always glitchy, so when i found animatediff i was very pleased with the quality but very much missing the control. i had even built out some tools to steer deforum using synesthesia's audio engine and wanted to make use of those with my beefy new gpu. i haven't found anything newer that does this so my best idea is to fuse them.
so far my best attempts have come from running a deforum video and then using animatediff in a vid2vid process. i haven't had any luck with hyper/lightning model quality so right now it's running pretty slow, taking anywhere from 15 minutes to 2.5 hours for just the vid2vid step on a 120 frame clip.
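for a rough sense of why it's slow: animatediff slides a fixed context window over the clip, and overlapping windows mean most frames get denoised twice. a quick sketch of the window math (window/overlap numbers match the settings i use below; the function name is just mine):

```python
def context_windows(total_frames, window=16, overlap=8):
    """list the (start, end) sliding context windows animatediff will denoise."""
    stride = window - overlap
    starts = list(range(0, total_frames - window + 1, stride))
    # cover any leftover tail frames with one final window
    if starts[-1] + window < total_frames:
        starts.append(total_frames - window)
    return [(s, s + window) for s in starts]

wins = context_windows(120)
print(len(wins))  # 14 windows for a 120-frame clip
```

with an 8-frame stride every frame (except the edges) lands in two windows, so the sampling cost is roughly double what a plain batch of the same length would be.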
basic process:
- process audio data from a song using synesthesia and other tools
- use audio data to construct keyframes for comfyui
- come up with a list of prompts to use at different calculated times in the song
- render deforum with keyframed motion & prompt
- vid2vid with animatediff (i'm using a 16 frame window with 8 frame overlap, with the v3_sd15_mm motion module)
- optionally use hires fix upscaler (veerrry slow) or upscale frames afterwards
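the audio-to-keyframe part of the process above boils down to mapping per-frame audio features onto deforum's `frame:(value)` schedule syntax. a minimal sketch, assuming you've already exported one energy value per video frame (from synesthesia, or e.g. librosa onset strength) — the function name, scaling constants, and keyframe spacing here are made up, tune to taste:

```python
def energy_to_keyframes(energies, base=1.0, gain=0.5, every=4):
    """emit 'frame:(value)' pairs every `every` frames, deforum schedule syntax."""
    parts = []
    for frame, e in enumerate(energies):
        if frame % every == 0:
            parts.append(f"{frame}:({base + gain * e:.3f})")
    return ", ".join(parts)

# example: fake energy ramp for a 12-frame clip, used as a zoom schedule
zoom_schedule = energy_to_keyframes([0.0, 0.2, 0.5, 1.0, 0.8, 0.4,
                                     0.1, 0.0, 0.3, 0.9, 1.0, 0.6])
print(zoom_schedule)  # 0:(1.000), 4:(1.400), 8:(1.150)
```

the same shape of string works for the other animatable deforum parameters (translation, rotation, strength), so one function like this can drive all the motion keyframes from different audio features.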
things to try:
- start small and low framerate, scale up both later
- rife frame interpolate at various steps
- flux, sdxl, sd3 if possible (existing animatediff built for sd1.5)
- maybe use in upscale
- cogvideo (not sure if vid2vid possible here)
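for the frame interpolation idea: rife itself i'd run through a comfyui node or the standalone repo, but as a quick baseline to test whether interpolation helps at all, ffmpeg's motion-compensated `minterpolate` filter works in one line (test clip generated with `testsrc` just so the command is self-contained; filenames are placeholders):

```shell
# make a tiny 12fps test clip, then motion-interpolate it up to 24fps
ffmpeg -y -f lavfi -i testsrc=duration=1:size=64x64:rate=12 in.mp4
ffmpeg -y -i in.mp4 -vf "minterpolate=fps=24:mi_mode=mci" out.mp4
```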