
high quality steerable ai videos

2024-10-06T03:11:00

i loved deforum's steerable video capabilities but they were always glitchy, so when i found animatediff i was very pleased with the quality but very much missing the control. i had even built out some tools to steer deforum using synesthesia's audio engine and wanted to make use of those with my beefy new gpu. i haven't found anything newer that does this, so my best idea is to fuse them.

so far my best attempts have come from rendering a deforum video and then running animatediff over it in a vid2vid pass. i haven't had any luck with hyper/lightning model quality, so right now it's running pretty slow, taking anywhere from 15 minutes to 2.5 hours just for the vid2vid step on a 120 frame clip.

basic process:

  • process audio data from a song using synesthesia and other tools
  • use audio data to construct keyframes for comfyui (see the schedule sketch after this list)
  • come up with a list of prompts to use at different calculated times in the song
  • render deforum with keyframed motion & prompt
  • vid2vid with animatediff (i'm using a 16 frame window with 8 frame overlap, with the v3_sd15_mm motion module; see the windowing sketch after this list)
  • optionally use hires fix upscaler (veerrry slow) or upscale frames afterwards
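
here's a rough sketch of the audio-to-keyframe step. it's not my actual tooling (that leans on synesthesia's engine); librosa stands in for the audio analysis, and the fps, prompt list, and zoom mapping are all placeholder choices. the point is the shape of what deforum expects: a "frame: (value)" keyframe string for motion parameters, plus prompts keyed by frame number.

```python
# rough sketch, not the real pipeline: librosa stands in for synesthesia's
# audio engine, and FPS / PROMPTS / SECTION_TIMES / the zoom mapping are
# placeholders. output is the shape deforum wants: "frame: (value)" keyframe
# strings for motion params and a frame-indexed prompt dict.
import json

import librosa
import numpy as np

FPS = 15                     # render framerate (placeholder)
PROMPTS = [                  # placeholder prompts, one per song section
    "a neon city dissolving into fractals",
    "an ocean of liquid chrome under strobing light",
]
SECTION_TIMES = [0.0, 32.0]  # seconds where each prompt kicks in (placeholder)


def audio_to_schedules(path):
    y, sr = librosa.load(path, mono=True)
    rms = librosa.feature.rms(y=y)[0]                        # per-hop loudness curve
    hop_times = librosa.frames_to_time(np.arange(len(rms)), sr=sr)

    n_frames = int(len(y) / sr * FPS)
    frame_times = np.arange(n_frames) / FPS

    # resample the loudness curve at video-frame times, normalize to 0..1,
    # and map it onto a small zoom range so loud sections push in harder
    energy = np.interp(frame_times, hop_times, rms)
    energy = (energy - energy.min()) / (energy.max() - energy.min() + 1e-8)
    zoom = 1.0 + 0.05 * energy

    # deforum-style keyframe string: "0: (1.0123), 1: (1.0200), ..."
    zoom_schedule = ", ".join(f"{i}: ({z:.4f})" for i, z in enumerate(zoom))

    # prompts keyed by frame number, matching the keyframed prompt format
    prompt_schedule = {str(int(t * FPS)): p for t, p in zip(SECTION_TIMES, PROMPTS)}
    return zoom_schedule, json.dumps(prompt_schedule, indent=2)


if __name__ == "__main__":
    zooms, prompts = audio_to_schedules("song.wav")
    print(zooms[:120] + " ...")
    print(prompts)
```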
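
and here's what the 16/8 windowing works out to over a 120 frame clip. this is just the sliding-window arithmetic, not comfyui's actual context scheduler, but it shows why the overlap matters: each window shares half its frames with the previous one, which is what keeps the motion module's output consistent across window boundaries.

```python
# toy illustration of the 16 frame window / 8 frame overlap setting on a
# 120 frame clip; only the window arithmetic, not the actual scheduler.
def context_windows(n_frames, window=16, overlap=8):
    stride = window - overlap
    windows = []
    start = 0
    while start < n_frames:
        end = min(start + window, n_frames)
        windows.append((start, end))
        if end == n_frames:
            break
        start += stride
    return windows


if __name__ == "__main__":
    wins = context_windows(120)
    print(len(wins), "windows")   # 14 windows cover the clip
    print(wins[0])                # (0, 16)
    print(wins[1])                # (8, 24), shares 8 frames with the first
```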

things to try:

  • start small and low framerate, scale up both later
  • rife frame interpolation at various steps
  • flux, sdxl, sd3 if possible (existing animatediff built for sd1.5)
    • maybe use in upscale
  • cogvideo (not sure if vid2vid possible here)
