EDGE: Editable Dance Generation from Music is an AI tool that generates high-quality choreographies from music, using music embeddings from the Jukebox model. The tool works by encoding the input music into embeddings with a frozen Jukebox model, then mapping those embeddings to a series of 5-second dance clips with a conditional diffusion model.
At inference time, temporal constraints are applied to batches of multiple clips to enforce temporal consistency before stitching them into an arbitrary-length full video. The tool supports arbitrary spatial and temporal constraints, making it suitable for various end-user applications, including dances subject to joint-wise constraints, motion in-betweening, and dance continuation.
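To make the pipeline concrete, below is a minimal Python sketch of the data flow under stated assumptions: the function names, feature dimensions, and the toy denoising step are illustrative stand-ins, not EDGE's actual API. A real implementation would extract activations from a pretrained Jukebox model and run a learned conditional diffusion model in place of the stubs.

```python
# Hypothetical sketch of the EDGE data flow; names and shapes are assumptions.
import numpy as np

CLIP_SECONDS = 5
FPS = 30
FEATURE_DIM = 4800   # assumed size of a Jukebox embedding slice
POSE_DIM = 151       # assumed per-frame pose representation size

def encode_music(audio: np.ndarray, n_clips: int) -> np.ndarray:
    """Stand-in for the frozen Jukebox encoder: one feature sequence
    per 5-second clip. Real EDGE extracts activations from a
    pretrained Jukebox model instead of returning random features."""
    return np.random.randn(n_clips, CLIP_SECONDS * FPS, FEATURE_DIM)

def denoise_step(x: np.ndarray, cond: np.ndarray, t: int) -> np.ndarray:
    """Toy stand-in for one reverse-diffusion step of the conditional
    model; a real model would predict the denoised pose from (x, cond, t)."""
    return x * 0.9 + 0.01 * np.tanh(cond[..., :POSE_DIM])

def sample_clips(features: np.ndarray, n_steps: int = 50) -> np.ndarray:
    """Run the (toy) reverse diffusion for a whole batch of clips."""
    n_clips, n_frames, _ = features.shape
    x = np.random.randn(n_clips, n_frames, POSE_DIM)  # start from noise
    for t in reversed(range(n_steps)):
        x = denoise_step(x, features, t)
    return x

audio = np.zeros(44100 * 15)                  # 15 s of (silent) audio
clips = sample_clips(encode_music(audio, 3))  # 3 five-second clips
dance = np.concatenate(list(clips), axis=0)   # stitch into one sequence
print(dance.shape)                            # (450, 151): 15 s at 30 fps
```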
In addition, EDGE introduces a new Contact Consistency Loss that improves physical realism by learning when the feet should and should not slide: intentional foot-ground sliding is kept intact while unintentional foot sliding is eliminated, ensuring that generated dances are physically plausible.
The tool has been trained with physical realism in mind and has been shown to outperform previous work, as indicated by human raters' strong preference for dances generated by EDGE. Overall, EDGE: Editable Dance Generation from Music is a powerful AI tool for generating high-quality choreographies from music, with potential applications in various industries, including entertainment and the arts.
More details about EDGE
What is EDGE: Editable Dance Generation from Music?
EDGE: Editable Dance Generation from Music is an AI tool that generates high-quality choreographies from music. It uses music embeddings from the Jukebox model and a conditional diffusion model to map these music embeddings to a series of 5-second dance clips.
How does EDGE ensure the physical realism of the generated dances?
EDGE ensures the physical realism of generated dances through its new Contact Consistency Loss, which learns when the feet should and should not slide. This significantly improves physical realism, preserving intentional foot-ground sliding while eliminating unintentional sliding.
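As a rough illustration of how such a loss can work, here is a minimal numpy sketch under the assumption that the model outputs per-frame foot positions (e.g., via forward kinematics) alongside per-frame contact probabilities; the shapes and names below are hypothetical, not taken from the EDGE codebase.

```python
import numpy as np

def contact_consistency_loss(foot_pos: np.ndarray,
                             contact: np.ndarray) -> float:
    """Penalize foot velocity on frames the model itself labels as in-contact.
    foot_pos: (frames, n_feet, 3) foot joint positions,
    contact:  (frames, n_feet) predicted contact probabilities in [0, 1]."""
    velocity = foot_pos[1:] - foot_pos[:-1]    # (frames - 1, n_feet, 3)
    gated = velocity * contact[1:, :, None]    # zeroed where no contact
    return float(np.mean(np.sum(gated ** 2, axis=-1)))

# Toy usage: 150 frames, 2 feet, random walk positions and contact labels.
pos = np.cumsum(np.random.randn(150, 2, 3) * 0.01, axis=0)
labels = (np.random.rand(150, 2) > 0.5).astype(float)
print(contact_consistency_loss(pos, labels))
```

Because the contact labels are themselves predicted, the model is penalized for sliding only on frames where it claims a foot is planted, which is what lets intentional slides survive.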
What is motion in-betweening in the context of EDGE?
Motion in-betweening, in the context of EDGE, refers to the creation of dances that start and end with prespecified motions. It’s one of the temporal constraints that can be implemented using EDGE.
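One common way to impose such a constraint in a diffusion model is to overwrite the constrained frames with the prespecified motion at every denoising step, in the style of diffusion inpainting. The sketch below illustrates the idea with a toy denoising loop; in a real sampler the known frames would first be noised to the current diffusion timestep, which is omitted here for brevity, and the names and shapes are illustrative assumptions.

```python
import numpy as np

def apply_temporal_constraint(x: np.ndarray, known: np.ndarray,
                              mask: np.ndarray) -> np.ndarray:
    """Re-impose the prespecified frames: wherever `mask` is True,
    replace the current sample with the known motion."""
    return np.where(mask[:, None], known, x)

n_frames, pose_dim = 150, 151               # 5 s at 30 fps, assumed pose size
known = np.zeros((n_frames, pose_dim))      # prespecified start/end motion
mask = np.zeros(n_frames, dtype=bool)
mask[:15] = mask[-15:] = True               # constrain first and last 0.5 s

x = np.random.randn(n_frames, pose_dim)     # reverse diffusion starts from noise
for t in reversed(range(50)):
    x = x * 0.95                            # toy stand-in for one denoising step
    x = apply_temporal_constraint(x, known, mask)  # re-impose at every step
```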
Can human raters determine the quality of dances generated by EDGE?
Yes. In human evaluations, raters showed a strong preference for dances generated by EDGE over those of previous work, confirming its ability to generate high-quality dance choreographies.
How does EDGE generate choreographies from music?
EDGE generates choreographies from music by encoding the input music into embeddings using a frozen Jukebox model. Then, a conditional diffusion model is used to map these music embeddings to a series of 5-second dance clips. Temporal constraints are applied to batches of multiple clips for temporal consistency before they are stitched into an arbitrary-length full video.
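As an illustration of the stitching idea, the hypothetical sketch below forces consecutive clips to agree on a shared overlap region and then concatenates them, dropping the duplicated frames. EDGE applies this kind of consistency constraint during sampling rather than as a post-process, and the simple averaging here is only a stand-in for that mechanism.

```python
import numpy as np

def enforce_overlap(batch: np.ndarray, overlap: int) -> np.ndarray:
    """Make each clip's trailing `overlap` frames agree with the next
    clip's leading frames by averaging the two (a simple stand-in for a
    batched temporal-consistency constraint)."""
    out = batch.copy()
    for i in range(len(batch) - 1):
        shared = 0.5 * (out[i, -overlap:] + out[i + 1, :overlap])
        out[i, -overlap:] = shared
        out[i + 1, :overlap] = shared
    return out

def stitch(batch: np.ndarray, overlap: int) -> np.ndarray:
    """Concatenate clips into one sequence, dropping duplicated overlap frames."""
    parts = [batch[0]] + [clip[overlap:] for clip in batch[1:]]
    return np.concatenate(parts, axis=0)

clips = np.random.randn(4, 150, 151)     # 4 clips, 150 frames, 151-dim poses
clips = enforce_overlap(clips, overlap=75)
dance = stitch(clips, overlap=75)
print(dance.shape)                        # (375, 151)
```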