Con Moto: Embodied Steering of Music Transformers for Live Dance Improvisation

Zhixing Chen; Heidi Lei; Cheng-Zhi Anna Huang

Image credit: Zhixing Chen; Heidi Lei; Cheng-Zhi Anna Huang

Abstract:

Con Moto is a real-time generative music system for dance improvisation that supports embodied steering of a transformer model with configurable levels of agency. While existing frameworks demonstrate the potential of embodied music-making and movement sonification in live performance, achieving both high musical coherence and low-latency responsiveness remains an ongoing challenge. In response, we leverage the musical coherence of real-time MIDI-based transformer models to design an integrated system that translates camera motion data into movement parameters, which in turn control the musical output. Con Moto employs two layers of control strategies: 1) inference-time steering of the transformer model and 2) post-generation rendering using Max/MSP as a control interface and Ableton Live for sound synthesis. We present the system through a duet performance for a live audience, supplemented by qualitative reflections from the dancers and the audience. By reconfiguring how the dancers’ movements map to musical functions, we create a system with configurable agency. Fine-grained control over an individual musical voice invited dancers to experience the system as a playable instrument, while abstract musical mappings to a genre’s energy opened space for the system to act as an autonomous creative partner. By navigating the aesthetic friction between human intent and AI agency, we explore a dynamic that facilitates a deep, bidirectional feedback loop.
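The abstract does not specify which movement parameters are extracted or how they steer the model, so the following is only an illustrative sketch of the general idea of mapping camera-derived motion to an inference-time control. It assumes 2D pose keypoints per frame, uses mean keypoint displacement as a stand-in "movement parameter", and uses sampling temperature as a stand-in steering knob. All function names, value ranges, and the choice of temperature are hypothetical and are not taken from Con Moto.

```python
import math

def motion_energy(prev, curr):
    """Mean per-keypoint displacement between two frames of (x, y) keypoints.

    Hypothetical movement parameter; the paper's actual features are not
    given in the abstract.
    """
    if len(prev) != len(curr) or not curr:
        raise ValueError("frames must be non-empty and the same length")
    return sum(math.dist(p, c) for p, c in zip(prev, curr)) / len(curr)

def energy_to_temperature(energy, max_energy=0.5, t_min=0.7, t_max=1.3):
    """Linearly map clamped motion energy onto a sampling temperature.

    The ranges are assumed values chosen for illustration only.
    """
    x = min(max(energy / max_energy, 0.0), 1.0)
    return t_min + x * (t_max - t_min)

def steered_distribution(logits, temperature):
    """Temperature-scaled softmax over next-token logits."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Usage: two frames of normalized keypoints and toy next-token logits.
prev_frame = [(0.0, 0.0), (1.0, 1.0)]
curr_frame = [(0.3, 0.4), (1.0, 1.0)]
energy = motion_energy(prev_frame, curr_frame)
temperature = energy_to_temperature(energy)
probs = steered_distribution([2.0, 1.0, 0.0], temperature)
```

In this sketch, more movement raises the temperature and flattens the next-token distribution, while stillness sharpens it. A real system would likely steer richer musical attributes than a single temperature value.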