LatentSync Deployment Test
I tested LatentSync on Lambda using rented A6000 and A100 GPUs. The results:
▪️ On the A6000, generating a video for 20 seconds of audio took over 100 seconds.
▪️ On the A100, generation time was similar to that on the A6000.
Input material:
I uploaded a video (the same one used with MuseTalk) and combined it with the audio; the video is looped during playback.
Generation results:
Mouth details were preserved very well overall; the only shortfall was that the teeth were not rendered clearly enough.
Real‑time performance:
Conclusion:
From testing LatentSync on these two hardware setups, we conclude:
▪️ Performance gap: Although the A6000 and A100 are both high‑performance GPUs, generation speed falls well short of real‑time or near‑real‑time: 20 seconds of audio takes over 100 seconds to generate, more than five times the audio's duration.
▪️ Not suitable for real‑time applications: Based on the results on this hardware, LatentSync is better suited to offline or batch rendering than to applications that require quick or real‑time video generation.
▪️ Hardware requirements: For higher‑quality output or higher‑resolution video generation, stronger GPUs with more VRAM are needed to reduce generation time.
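The gap can be expressed as a real‑time factor (generation time divided by audio duration), where a value at or below 1.0 means generation keeps up with playback. The following is a minimal sketch using the figures reported above; the function names and the 1.0 threshold are illustrative assumptions, not part of LatentSync, and because generation took "over 100 seconds" the result is a lower bound.

```python
def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Return generation time divided by audio duration (hypothetical helper)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def is_real_time(rtf: float, threshold: float = 1.0) -> bool:
    """Treat a factor at or below the threshold as real-time (assumed cutoff)."""
    return rtf <= threshold


# Figures from the test: 20 s of audio, more than 100 s of generation time.
rtf = real_time_factor(generation_seconds=100.0, audio_seconds=20.0)
print(f"Real-time factor: at least {rtf:.1f}x")
print("Real-time capable:", is_real_time(rtf))
```

A factor of at least 5 on both GPUs is what places LatentSync in the offline or batch category here.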
IMTalker Deployment Test
IMTalker has so far been tested remotely, and some bugs remain. After clicking “Generate,” the page must be refreshed manually to trigger backend processing. A fix is in progress, but partial results can already be viewed.
Input material:
Only a single image needs to be uploaded here.

Generation results:
The output video is cropped to a 512×512 region, the generated face blinks automatically, and generation is very fast.
Real‑time performance:
Conclusion:
Based on the IMTalker test, we conclude:
▪️ Image cropping: The input image is cropped to a 512×512 area.
▪️ Real‑time performance: Meets expectations. The video is generated quickly, with synchronized mouth movements.
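To preview which part of an image would survive a 512×512 crop, the crop rectangle can be computed ahead of time. The test did not reveal how IMTalker chooses the crop region (it may well be face-centered), so the sketch below assumes a plain center crop purely for illustration; `center_crop_box` is a hypothetical helper, not an IMTalker function.

```python
def center_crop_box(width: int, height: int, size: int = 512) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a centered size x size crop.

    Assumes a simple center crop; the actual IMTalker strategy is unknown.
    """
    if width < size or height < size:
        raise ValueError("image is smaller than the crop size")
    left = (width - size) // 2
    top = (height - size) // 2
    return (left, top, left + size, top + size)


# Example: a 1920x1080 source image.
print(center_crop_box(1920, 1080))
```

The returned tuple follows the (left, top, right, bottom) convention that common image libraries use for crop boxes, so it can be passed to one to check the framing before uploading.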










