FrameBridge: Improving Image-to-Video Generation with Bridge Models
Anonymous Authors
We sincerely thank all the reviewers for their advice on strengthening the experimental part of our work.
We have trained a FrameBridge model with more computational resources (i.e., 100k iterations versus 20k iterations with a batch size of 64 in our initial submission), which effectively improves generation quality.
This demo page is part of our rebuttal material and consists of two sections.
In the first section, we show several samples that are representative of our model's capacity for I2V generation. In the second section, we provide samples with more evident motion (e.g., camera movements, vehicles, animals in motion). We hope these demos address the reviewers' concerns about the generation performance of FrameBridge.
(As all the videos in our training dataset, WebVid-2M, have watermarks, it may be difficult to avoid the presence of watermarks in certain demos.)
Demo samples of FrameBridge-100k:
Sample 1: "camera zoom-in, fantastic and beautiful garden"
Sample 2: "pot boiling on the campfire"
Condition Image:
I2V Result:
Condition Image:
I2V Result:
Sample 3: "huge waves crashing against the shore"
Sample 4: "leaves of maple trees flutter down"
Condition Image:
I2V Result:
Condition Image:
I2V Result:
Sample 5: "young man typing keyboard, looking at the screen"
Sample 6: "ship sailing in the waves"
Condition Image:
I2V Result:
Condition Image:
I2V Result:
Sample 7: "deer sits on the grass, looking around"
Sample 8: "rabbit reading the piano"
Condition Image:
I2V Result:
Condition Image:
I2V Result:
Sample 9: "knight walking with the sword in hand"
Sample 10: "drummer beating the drum"
Condition Image:
I2V Result:
Condition Image:
I2V Result:
Sample 11: "a red panda eating bamboo in a zoo"
Sample 12: "bird on the branch, flapping wings"
Condition Image:
I2V Result:
Condition Image:
I2V Result:
Sample 13: "girl singing under the spotlight with deep affection, close eyes slowly, gently"
Sample 14: "a car is moving on the road"
Condition Image:
I2V Result:
Condition Image:
I2V Result:
Demos with evident motion
Sample 1: "a bald eagle flying over a tree filled forest"
Sample 2: "bird flying off the tree"
Condition Image:
I2V Result:
Condition Image:
I2V Result:
Sample 3: "dandelion blown by the wind"
Sample 4: "camera zoom-in rapidly, close look at the flower"