Tuning the Transformer: Context-Aware Masking for Controlled Music Generation in MIDI
Audio, images, and text already have well-established data processing pipelines proven to yield amazing results with large deep-learning models. However, applying these methods to music, especially in MIDI format, presents unique challenges. In this talk, we explore the application of context-aware masking techniques to data obtained by recording piano performances in MIDI format.
We demonstrate how methods inspired by masked language modeling, image inpainting, and next-token prediction can be adapted to preprocess MIDI data, capturing the harmonic, dynamic, and temporal information essential for music. These preprocessing strategies yield context-aware infilling tasks that can be used to train large transformer models to generate more emotionally nuanced musical performances.
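As a rough illustration of the kind of preprocessing described above, the sketch below builds a span-infilling example from a list of piano notes: a contiguous stretch of the performance is hidden behind a sentinel, and the hidden notes become the generation target. The `Note` fields, the `mask_fraction` parameter, and the `<MASK>` sentinel are illustrative assumptions, not the talk's actual data schema.

```python
import random
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int     # MIDI pitch number, 0-127
    velocity: int  # key-press velocity, 0-127 (carries the dynamics)
    start: float   # onset time in seconds
    end: float     # release time in seconds


def make_infilling_example(notes, mask_fraction=0.15, rng=random):
    """Split a note sequence into (context, target) for span infilling.

    A contiguous span of notes is removed from the performance and
    replaced by a single sentinel; the model is trained to reconstruct
    the removed span from the surrounding musical context.
    """
    notes = sorted(notes, key=lambda n: n.start)
    span_len = max(1, int(len(notes) * mask_fraction))
    start_idx = rng.randrange(0, len(notes) - span_len + 1)

    target = notes[start_idx:start_idx + span_len]  # notes the model must generate
    context = (
        notes[:start_idx]
        + ["<MASK>"]                                # sentinel marking the gap
        + notes[start_idx + span_len:]
    )
    return context, target


# Toy usage: a short ascending phrase with a gentle crescendo
phrase = [
    Note(pitch=60 + i, velocity=64 + 4 * i, start=0.5 * i, end=0.5 * i + 0.4)
    for i in range(8)
]
ctx, tgt = make_infilling_example(phrase, mask_fraction=0.25)
print(len(ctx), "context items;", len(tgt), "notes to infill")
```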