ACM Multimedia: Explaining Structure Annotations Using Nonlinear Optimization

Smith, J.B.L. and E. Chew (2013). Using Quadratic Programming to Estimate Musical Attention from Self-Similarity Matrices. In Proceedings of ACM Multimedia, October 21-25, Barcelona, Catalunya, Spain.

To identify repeated patterns and contrasting sections in music, it is common to use self-similarity matrices (SSMs) to visualize and estimate structure. We introduce a novel application for SSMs derived from audio recordings: using them to learn about the potential reasoning behind a listener’s annotation. We use SSMs generated by musically motivated audio features at various timescales to represent contributions to a structural annotation. Since a listener’s attention can shift among musical features (e.g., rhythm, timbre, and harmony) throughout a piece, we further break down the SSMs into section-wise components and use quadratic programming (QP) to minimize the distance between a linear sum of these components and the annotated description. We posit that the optimal section-wise weights on the feature components may indicate the features to which a listener attended when annotating a piece, and thus may help us to understand why two listeners disagreed about a piece’s structure. We discuss some examples that substantiate the claim that feature relevance varies throughout a piece, use our method to investigate differences between listeners’ interpretations, and lastly propose some variations on our method.
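
To make the optimization step concrete, here is a minimal sketch in Python of fitting non-negative, section-wise weights so that a weighted sum of feature components approximates an annotation SSM. It is not the paper's implementation. The function names, the toy data, the row-stripe masking used to form section-wise components, the non-negativity constraint, and the projected gradient solver (standing in for a dedicated QP solver) are all assumptions made for illustration.

```python
import numpy as np

def annotation_ssm(labels):
    """Binary SSM: 1 where two frames carry the same section label."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def section_components(feature_ssms, boundaries):
    """One masked copy of each feature SSM per section.

    Assumption: a component keeps only the rows of its section
    (a row stripe), so components of different sections do not overlap.
    """
    components = []
    for ssm in feature_ssms:
        for start, end in zip(boundaries[:-1], boundaries[1:]):
            masked = np.zeros_like(ssm)
            masked[start:end, :] = ssm[start:end, :]
            components.append(masked)
    return components

def fit_weights(components, target, n_iter=2000):
    """Minimize ||target - sum_k x_k * components[k]||_F^2 subject to x >= 0.

    This is a small QP, solved here by projected gradient descent.
    """
    A = np.stack([c.ravel() for c in components], axis=1)
    b = target.ravel()
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / largest eigenvalue of A^T A
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        # Gradient step on the squared error, then clip to x >= 0.
        x = np.maximum(0.0, x - step * (A.T @ (A @ x - b)))
    return x

# Toy piece: 8 frames, four sections labelled A B A B (hypothetical data).
labels = list("AABBAABB")
boundaries = [0, 2, 4, 6, 8]
target = annotation_ssm(labels)

# "harmony" matches the annotation in the first half of the piece and is
# uninformative (constant 0.5) in the second half; "timbre" is the reverse.
flat = np.full_like(target, 0.5)
harmony = np.vstack([target[:4], flat[4:]])
timbre = np.vstack([flat[:4], target[4:]])

components = section_components([harmony, timbre], boundaries)
weights = fit_weights(components, target).reshape(2, 4)  # rows: features, columns: sections
print(np.round(weights, 3))
```

In this toy case the fitted weights should put harmony near 1 in the first two sections and timbre near 1 in the last two, which mirrors the idea that the feature that best explains the annotation can change over the course of a piece.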

Read more in the paper preprint.