The Sound of Text

Research Overview

If words were music, what would they sound like? What is the sound of the book you just read, the email you just sent, the news article you just received, or the research paper you just wrote? Music and words come together in the millions of songs that delight us, and yet for most of the words in the world, their music is silent. This research team will develop algorithms that leverage existing alignments between words and music to produce a musical interpretation of any text. An entire research community works at the intersection of music and computing, and its most common research directions are music retrieval, music analysis, and music generation. Within the space of music composition, very little work is being done on cross-modality music generation, and this project will help fill this gap by developing algorithms that generate music starting from another modality, namely text.

The main goal of the project is to develop a data-science framework that connects language and music, and consequently to develop tools that can produce musical interpretations of text. The team will build a large collection of aligned text and music by drawing from publicly available digital collections of songs and lyrics and by using automatic data-alignment algorithms. They will then develop novel neural-network-based algorithms for text-to-music generation, building on recent advances in sequence-to-sequence deep learning to uncover patterns of connection between language and music that can be used in the generation process. Neural networks have found many successful applications in areas ranging from computer vision to language processing to neuroscience. They are particularly useful for sequence mapping and prediction, and they offer a unique opportunity to bridge music and language, both of which are produced and consumed sequentially. The research team will bring innovation to neural network methodology because they will need to work with an output space in which multiple aspects (pitch, dynamics, rhythm) vary over different time scales and exhibit long-distance dependencies (such as repeated themes or motifs). This methodological innovation could have broad impact well beyond music research.

Research Impact

Research Team

Rada Mihalcea, co-Principal Investigator, Professor, Electrical Engineering and Computer Science, College of Engineering
Anıl Çamcı, co-Principal Investigator, Assistant Professor, Performing Arts Technology, School of Music, Theatre and Dance
Sile O’Modhrain, Professor, Performing Arts Technology, School of Music, Theatre and Dance
Jonathan Kummerfeld, Research Fellow, Electrical Engineering and Computer Science, College of Engineering