182
pages
English
Documents
2010
Published on
1 January 2010
Number of reads
11
Language
English
File size
14 MB
Automatic Organization of Digital Music Documents –
Sheet Music and Audio
Dissertation
for the
award of the doctoral degree (Dr. rer. nat.)
of the
Faculty of Mathematics and Natural Sciences
of the
Rheinische Friedrich-Wilhelms-Universität Bonn
submitted by
Christian Fremerey
from
Bonn
Bonn, May 2010
Prepared with the approval of the Faculty of Mathematics and Natural Sciences of the
Rheinische Friedrich-Wilhelms-Universität Bonn
First reviewer: Prof. Dr. Michael Clausen
Second reviewer: PD Dr. Meinard Müller
Date of doctoral examination: 21 July 2010
Year of publication: 2010
Automatic Organization of Digital Music Documents –
Sheet Music and Audio
Christian Fremerey
Abstract
This thesis presents work towards automatic organization and synchronization of scanned
sheet music and digitized audio recordings in the scenario of a digital music library. The
organization targeted in the project of this thesis includes the segmentation of sheet music
books and audio CD collections into individual songs or movements, mapping the resulting
segments to corresponding records in a database of metadata, and temporally synchronizing
the sheet music documents and audio recordings that belong to the same piece of music on
a bar-wise level. Building up a digital music library with a large collection of digitized sheet
music and audio recordings requires automated methods for organizing the data, because a
manual organization is too expensive and time-consuming. In this thesis, a complete workflow
addressing the practical issues that arise in building up and maintaining a digital music
library for synchronized sheet music and audio recordings is presented. Algorithms and
approaches for the automatic organization of music documents are proposed and evaluated. We
introduce a software application to be used by library employees for the import of new data
and editing of existing records that integrates the proposed automatic methods. This application,
furthermore, allows semi-automatic or manual interaction where automatic approaches
are not yet reliable enough to meet the high quality standards that are expected in a library
environment. A prototypical user interface for users of the digital library is presented that
allows for applications making explicit use of the synchronization between sheet music books
and audio CD collections.
Keywords: digital music libraries, sheet music, audio recordings, music synchronization,
music alignment, optical music recognition, score-performance matching, user interfaces,
automatic document organization, digital music representations, partial synchronization,
structural differences
Acknowledgements
This work would not have been possible without the support, advice, and encouragement I
received from the people close to me during the last couple of years. In particular, I would
like to express my thanks and gratitude to
my supervisors Michael Clausen and Meinard Müller who always supported and guided
me in exactly the way I needed and who probably taught me more about properly doing
science than I even realize today,
my parents for their never-ending love and extraordinary support,
my wife for her love, her patience, and for believing in me, as well as her parents who
welcomed me in their family and trusted in me like their own child,
my former colleagues at the University of Bonn Frank Kurth, Rolf Bardeli, David Damm,
Sebastian Ewert, and Verena Thomas, with whom I had the joyful experience of working
together for many years,
Axel Mosig and his group at PICB in Shanghai without whose unconditional help and
support during the final phase of this work things would have been so much harder,
as well as all my family and friends for giving me comfort and happiness.
Shanghai, May 13th 2010
Christian Fremerey
Contents
Abstract
Acknowledgements
1 Introduction
1.1 Motivation
1.2 Main Goals of this Thesis
1.3 Contributions of this Thesis
1.4 Related Publications by the Author
1.5 Related Work
1.5.1 Digital Music Libraries and Collections
1.5.2 Synchronization, Alignment, and Score Following
1.5.3 Content-based Comparison of Music Data
1.5.4 Other Related Fields
1.6 Structure of this Thesis
2 Digital Music Representations
2.1 The Data Classes Audio, Symbolic, and Sheet Music
2.2 A Closer Look at the Data Classes
2.3 SharpEye Music Reader
3 Bridging the Gap Between Sheet Music and Audio
3.1 Sheet Music-Audio Synchronization
3.2 Dynamic Time Warping
3.3 Hidden Markov Models
3.4 Local Similarity and Mid-Level Representations
3.5 Subsequence Dynamic Time Warping
4 PROBADO Music Document Organization System
4.1 PROBADO Music Project
4.2 Real-World Challenges
4.2.1 Track Segmentation and Identification
4.2.2 Structural Differences
4.2.3 OMR Quality and Note Events
4.3 Framework and Workflow
4.4 Handling Special Cases
5 From Sheet Music To Note Events
5.1 Optical Music Recognition
5.2 Postprocessing OMR Data Using Simple Heuristics
5.2.1 Voice Detection
5.2.2 Key Signature Correction
5.2.3 Time Signature Correction
5.2.4 Execution Hint Removal
5.2.5 Grandstaff Indentation Detection
5.3 Iterative Approach for Reducing Inconsistency
5.4 Detecting Repeats and Jumps
5.5 Estimation of Onset Times and Tempo
5.6 Deriving Pitches
5.7 Orchestral Scores
6 Track Segmentation and Identification
6.1 Definition of the Task
6.2 Text-Based Comparison
6.3 Content-Based Comparison
6.3.1 Baseline Experiment
6.3.2 Real-World Experiment
6.3.3 Postprocessing Clean-Up
6.4 Conclusions
7 Partial Synchronization
7.1 Types of Structural Differences
7.2 Jumps and Blocks
7.3 Matching Approach
7.4 Path Degeneration Approach
7.5 JumpDTW
7.5.1 Customizable Start Points and End Points
7.5.2 Optimum Average Cost
7.5.3 Special States
7.5.4 Multi-Scale Approach
7.5.5 Experiments and Results
7.5.6 Summary of the JumpDTW Approach
8 Applications
8.1 Score Viewer
8.2 Audio Viewer
8.3 Cross-Domain Presentation and Navigation
8.4 Switching Interpretations
8.5 Cross-Domain Retrieval