Date of Thesis
Spring 2025
Description
This thesis investigates the capacity of transformer-based architectures to learn generalized musical patterns through symbolic generation. To support this exploration, a complete music generation pipeline was developed, beginning with the construction and classification of a large-scale dataset of over 170,000 MIDI files. The dataset was processed using rule-based heuristics and custom neural classifiers to separate tracks by musical function and contour. A novel tokenization scheme, MINTii, was introduced to encode musical information compactly through interval-based representations, reducing redundancy and promoting generalization. Using this infrastructure, a transformer model was trained to generate single-track melodic sequences. Its performance was evaluated through both memorization and generalization tests. While harmonic tendencies such as pitch motion and range were learned with moderate success, rhythmic understanding remained limited. Further analysis revealed that the model could not reproduce even short, repeated sequences under ideal conditions, highlighting structural limitations in learning temporal phrasing. These findings suggest that fully data-driven transformer models lack the inductive bias needed to internalize musical form, and that future systems may benefit from rule-based constraints, structural labeling, or loss functions grounded in music theory.
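The details of the MINTii scheme are not given in this abstract, but the general idea of interval-based pitch encoding can be sketched as follows. This is a hypothetical illustration, not the actual MINTii tokenizer: absolute MIDI pitch numbers are replaced by signed semitone intervals, so any transposition of a melody maps to (almost) the same token sequence, which is one way such a representation reduces redundancy and promotes generalization.

```python
# Illustrative sketch of interval-based pitch encoding (NOT the actual
# MINTii scheme; its details are not specified in this abstract).
# Absolute MIDI pitches are replaced by signed semitone intervals, so a
# melody transposed to any key yields the same interval tokens.

def pitches_to_intervals(pitches):
    """Encode a pitch sequence as [first_pitch, interval, interval, ...]."""
    if not pitches:
        return []
    return [pitches[0]] + [b - a for a, b in zip(pitches, pitches[1:])]

def intervals_to_pitches(tokens):
    """Invert the encoding back to absolute MIDI pitch numbers."""
    if not tokens:
        return []
    pitches = [tokens[0]]
    for step in tokens[1:]:
        pitches.append(pitches[-1] + step)
    return pitches

# Two transpositions of the same motif share every interval token;
# only the leading anchor pitch differs.
c_major = [60, 62, 64, 65, 67]   # C D E F G
d_major = [62, 64, 66, 67, 69]   # same contour, up a whole step
assert pitches_to_intervals(c_major)[1:] == pitches_to_intervals(d_major)[1:]
assert intervals_to_pitches(pitches_to_intervals(c_major)) == c_major
```

A real symbolic tokenizer would also encode duration, onset timing, and velocity alongside pitch; this sketch covers only the pitch-interval aspect.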
Keywords
machine learning, music, generation, transformer, symbolic music, music generation
Access Type
Master's Thesis
Degree Type
Master of Science in Electrical Engineering
Major
Electrical Engineering
First Advisor
Robert Nickel
Recommended Citation
Buentello, Ben, "Transformer-Based Symbolic Music Generation" (2025). Master’s Theses. 296.
https://digitalcommons.bucknell.edu/masters_theses/296
Included in
Composition Commons, Music Theory Commons, Other Electrical and Computer Engineering Commons, Other Music Commons, Signal Processing Commons
