Human-in-the-Loop: Crossmodal AI Alignment between Movement and Audio Latent Spaces for Expressive Sonification in Dance Performance

Koray Tahiroğlu; Mikael Hokkanen; Ariana Marta

poster
Presence: remote
Type: long
Session: Poster Session 1

Abstract:

This paper presents Crossmodal AI alignment, a generative AI framework for expressive sonification of human movement in dance performance. The system connects two variational autoencoders: a Movement VAE encoder, which captures expressive dance movement features in real time, and an Audio VAE (RAVE) decoder, which generates corresponding musical textures and responses. A central alignment module links their latent spaces, allowing dynamic adaptation between movement and sound. Unlike fixed or rule-based mapping approaches, the SonicMove alignment system introduces a human-in-the-loop alignment process in which the dancer calibrates the crossmodal relationship through embodied exploration prior to performance. This enables an adaptive and intuitive co-creative dialogue between performer and AI, producing sonifications that respond to subtle variations in movement. Exploratory sessions with invited dancers, centred on the latent space alignment process, suggest how performer-driven calibration shapes the perceived coherence and expressivity of the generated sound in relation to movement. The sessions point to a possible direction toward more adaptive, multimodal performance systems that integrate movement, sound, and creative interpretation.
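
As an illustration only, the idea of linking two latent spaces through a calibration step can be sketched as an affine map fitted by ridge regression on paired latent vectors. The abstract does not specify the form of the alignment module, so the regression model, the function names `fit_alignment` and `align`, the latent dimensions, and the synthetic calibration data below are all assumptions rather than the authors' method. In a real system the paired vectors would come from the Movement VAE encoder and the RAVE latent space during the dancer's calibration session.

```python
import numpy as np

def fit_alignment(z_move, z_audio, ridge=1e-3):
    """Fit an affine map z_audio ~= z_move @ W + b by ridge regression.

    Hypothetical stand-in for an alignment module; not the paper's method.
    """
    n = z_move.shape[0]
    X = np.hstack([z_move, np.ones((n, 1))])   # append a bias column
    reg = ridge * np.eye(X.shape[1])
    reg[-1, -1] = 0.0                          # leave the bias unpenalised
    A = np.linalg.solve(X.T @ X + reg, X.T @ z_audio)
    return A[:-1], A[-1]                       # weights W, bias b

def align(z_move, W, b):
    """Map movement latents into the audio latent space."""
    return z_move @ W + b

# Synthetic "calibration session": all sizes are assumed for illustration.
rng = np.random.default_rng(0)
move_dim, audio_dim, n_pairs = 8, 16, 200
true_W = rng.normal(size=(move_dim, audio_dim))
true_b = rng.normal(size=audio_dim)
z_move = rng.normal(size=(n_pairs, move_dim))
noise = 0.01 * rng.normal(size=(n_pairs, audio_dim))
z_audio = z_move @ true_W + true_b + noise

W, b = fit_alignment(z_move, z_audio)

# At performance time, each new movement latent would be mapped and then
# passed to the audio decoder (not shown here).
new_frame = rng.normal(size=(1, move_dim))
z_hat = align(new_frame, W, b)
```

An affine map is chosen here only because it is the simplest model that can be fitted from a short calibration recording; a nonlinear or continually updated mapping would fit the same calibrate-then-perform pattern.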