
According to the ByteDance researchers, OmniHuman-1 only needs a single reference image and audio, like speech or vocals, to generate a clip of an arbitrary length. The output video’s aspect ratio is adjustable, as is the subject’s “body proportion” — i.e. how much of their body is shown in the fake footage.
https://techcrunch.com/2025/02/04/deepfake-videos-are-getting-shockingly-good/