Enhancing Expressive Musical Conversation in the jam_bot

Lancelot Blanchard; Perry Naseck; Katherine Liang; Joel Tan; Heidi Lei; Cheng-Zhi Anna Huang; Joseph Paradiso

oral
Presence: in person
Duration: 13
Type: medium
Session: Learning with the Machines

Abstract:

Previous work introduced the jam_bot, a real-time system that embeds live music language models capable of generating symbolic music sequences coherent with a performer’s input. The system supports multiple interaction strategies that have been demonstrated in several public performances. However, these strategies limit expressive musical conversation by constraining tempo, form, or musical roles. We extend the jam_bot to support more expressive, open-ended interaction through four key improvements: (1) modeling velocity, a key dimension of expression in symbolic music; (2) increasing model throughput via a ggml implementation, which is required to accommodate the longer sequences induced by velocity modeling; (3) developing a new training modality that enables free-form call-and-response interaction across varying tempi; and (4) compensating for external MIDI output latency to ensure rhythmic coherence with the performer. We quantitatively evaluate the model throughput improvement and our latency compensation strategy, and offer MIDI samples online. Together, these enhancements enable the jam_bot to engage in natural, expressive musical conversation, removing key musical limitations and enabling the development of future performances and installations.
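
To illustrate the general idea behind output latency compensation (improvement 4), the sketch below sends each generated note early by a fixed, pre-measured latency so that it sounds at its intended time. This is not the authors' implementation: the names `ScheduledNote`, `send_time_ms`, and `playable`, and the assumption of a single constant latency value, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduledNote:
    """A generated note with the time at which it should be heard (hypothetical type)."""
    pitch: int        # MIDI note number, 0-127
    velocity: int     # MIDI velocity, 1-127
    target_ms: float  # intended sounding time on the performance clock


def send_time_ms(note: ScheduledNote, output_latency_ms: float) -> float:
    """Return when the note must be sent so it sounds at note.target_ms.

    Assumes the external output path adds a constant, known delay.
    """
    return note.target_ms - output_latency_ms


def playable(notes: list[ScheduledNote], now_ms: float,
             output_latency_ms: float) -> list[ScheduledNote]:
    """Keep only notes whose compensated send time has not already passed."""
    return [n for n in notes if send_time_ms(n, output_latency_ms) >= now_ms]
```

With an assumed latency of 30 ms, a note meant to sound at 1000 ms would be sent at 970 ms. A note whose compensated send time is already in the past cannot sound on time, so this sketch simply drops it; a real system might instead shift or re-plan such notes.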