Date of Thesis

Spring 2025

Description

This thesis investigates the capacity of transformer-based architectures to learn generalized musical patterns through symbolic generation. To support this exploration, a complete music generation pipeline was developed, beginning with the construction and classification of a large-scale dataset of over 170,000 MIDI files. The dataset was processed using rule-based heuristics and custom neural classifiers to separate tracks by musical function and contour. A novel tokenization scheme, MINTii, was introduced to encode musical information compactly through interval-based representations, reducing redundancy and promoting generalization. Using this infrastructure, a transformer model was trained to generate single-track melodic sequences. Its performance was evaluated through both memorization and generalization tests. While harmonic tendencies such as pitch motion and range were learned with moderate success, rhythmic understanding remained limited. Further analysis revealed that the model could not reproduce even short, repeated sequences under ideal conditions, highlighting structural limitations in learning temporal phrasing. These findings suggest that fully data-driven transformer models lack the inductive bias needed to internalize musical form, and that future systems may benefit from rule-based constraints, structural labeling, or loss functions grounded in music theory.
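The abstract describes MINTii as encoding musical information compactly through interval-based representations. The exact MINTii scheme is not specified here, but the core idea of interval encoding can be sketched as follows; the function names and token layout below are illustrative assumptions, not the thesis's actual implementation.

```python
# Hypothetical sketch of interval-based pitch tokenization in the spirit of
# MINTii. The real scheme is not detailed in this abstract; names and the
# token layout (first absolute pitch, then signed intervals) are assumptions.

def to_interval_tokens(pitches):
    """Encode a melody as its starting MIDI pitch plus successive intervals.

    Representing each note relative to the previous one removes
    absolute-pitch redundancy: the same melody transposed to any key
    yields identical interval tokens after the first element, which is
    one way such a scheme can promote generalization.
    """
    if not pitches:
        return []
    return [pitches[0]] + [b - a for a, b in zip(pitches, pitches[1:])]

def from_interval_tokens(tokens):
    """Reconstruct absolute MIDI pitches from the interval encoding."""
    if not tokens:
        return []
    pitches = [tokens[0]]
    for step in tokens[1:]:
        pitches.append(pitches[-1] + step)
    return pitches

# A C-major arpeggio (C4, E4, G4, C5) and its transposition up a whole
# step share the same interval tokens after the starting pitch.
melody = [60, 64, 67, 72]
transposed = [62, 66, 69, 74]
assert to_interval_tokens(melody) == [60, 4, 3, 5]
assert to_interval_tokens(melody)[1:] == to_interval_tokens(transposed)[1:]
assert from_interval_tokens(to_interval_tokens(melody)) == melody
```

The round-trip check at the end shows the encoding is lossless for pitch while being transposition-invariant in all but its first token.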

Keywords

machine learning, music, transformer, symbolic music, music generation

Access Type

Master's Thesis

Degree Type

Master of Science in Electrical Engineering

Major

Electrical Engineering

First Advisor

Robert Nickel
