Neural mechanisms of feature extraction for the analysis of shape and behavioral patterns
Dissertation for the attainment of the doctoral degree (Dr. rer. nat.)
of the Faculty of Computer Science and Engineering
of Universität Ulm,
submitted by
Ulrich Weidenbacher
from Heidenheim a.d. Brenz
2010
Universität Ulm
Institut für Neuroinformatik

Acting Dean: Prof. Dr. Michael Weber
First reviewer: Prof. Dr. Heiko Neumann
Second reviewer: Prof. Dr. Günther Palm
Date of the doctoral examination: 7 December 2010

Abstract
The human visual system segments 3D scenes into surfaces and objects which can appear at different depths with respect to the observer. Depending on their position in depth, the projection from 3D to 2D partially occludes objects. There is experimental evidence that surface-based features such as occluding contours or junctions serve as cues for the robust segmentation of surfaces. These features are characterized by their robustness against variations in illumination and small changes in viewpoint. We demonstrate that this feature representation can be used to extract a sketch-like representation of salient features that captures and emphasizes perceptually relevant regions on objects and surfaces. Furthermore, this representation is also suitable for learning more complex form patterns such as faces and bodies in different poses.
In this thesis, we present a biologically inspired, recurrent model which extracts and interprets surface-based features from a 2D grayscale intensity input image. Motivated by the neurophysiology of the primate brain, the model builds on a few basic processing mechanisms which are reused at several model stages with different parameterizations. Furthermore, the architecture is characterized by feedforward and feedback connections which lead to temporal dynamics of the model activities. The model simulates the two main processing streams of the primate visual system, namely the form (ventral) and the motion (dorsal) pathway. In the model's ventral pathway, prototypical views of head and body poses (snapshots) as well as their temporal appearances are learned in an unsupervised manner in a two-layer network. In the dorsal pathway, prototypical velocity patterns are generated by local motion detectors. These patterns are combined into typical motion patterns that arise from head and body movements during the establishment of visual contact. Activity from both pathways is finally integrated to extract a combined signal from motion and form features. Based on this initial feature representation, we demonstrate a multi-layered learning scheme that is capable of learning form and motion features utilized for the detection of specific, behaviorally relevant motion patterns (e.g. turn-away and turn-towards actions of the body). We show that the combined representation of form and motion features is superior to single-pathway model approaches.
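
To make the general idea concrete, the following minimal Python sketch illustrates unsupervised learning of prototypical patterns for each pathway and the concatenation of the resulting pathway activities into one combined feature vector. It is only an illustrative sketch: the feature dimensions, the competitive learning rule, the radial-basis read-out, and all parameter values are assumptions for this example and do not reproduce the model described in this thesis.

import numpy as np

rng = np.random.default_rng(0)

def learn_prototypes(samples, n_prototypes, lr=0.05, epochs=20):
    # Simple competitive (winner-take-all) learning: each input moves its
    # closest prototype a small step toward itself (illustrative only).
    idx = rng.choice(len(samples), size=n_prototypes, replace=False)
    prototypes = samples[idx].copy()
    for _ in range(epochs):
        for x in samples:
            winner = np.argmin(np.linalg.norm(prototypes - x, axis=1))
            prototypes[winner] += lr * (x - prototypes[winner])
    return prototypes

def pathway_activity(x, prototypes, sigma=1.0):
    # Graded (radial-basis) response of each prototype unit to the input.
    d = np.linalg.norm(prototypes - x, axis=1)
    return np.exp(-(d ** 2) / (2.0 * sigma ** 2))

# Toy stand-ins for form (snapshot) and motion (velocity-pattern) features.
form_samples = rng.normal(size=(200, 16))
motion_samples = rng.normal(size=(200, 8))

form_protos = learn_prototypes(form_samples, n_prototypes=10)
motion_protos = learn_prototypes(motion_samples, n_prototypes=6)

# Combined representation: concatenate both pathway activities so that a
# later stage can detect behaviorally relevant patterns from form and motion.
combined = np.concatenate([
    pathway_activity(form_samples[0], form_protos),
    pathway_activity(motion_samples[0], motion_protos),
])
print(combined.shape)  # (16,) = 10 form units + 6 motion units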
Acknowledgements
I would like to thank my supervisor, Prof. Dr. Heiko Neumann, who supported me and my work during all the years I spent in his group. I will miss the daily coffee breaks, which were a good opportunity to discuss new ideas or to get feedback on new results. He also gave me the chance to regularly visit international conferences where I could present my work and discuss it with other researchers from different countries. I would also like to thank Prof. Dr. Günther Palm, who kindly agreed to write the second expert's report. I also want to thank Dr. Pierre Bayerl, who supported me with his experience as a researcher during the first two years of my thesis. I thank all other members of the Vision group for valuable discussions and also for the time at numerous social events that we spent together. Finally, I am deeply grateful to my wife Heike, who had to bear the long days and evenings alone with our daughter Hannah while I was working on this thesis.
Contents
1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Biological background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Neural modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Learning and Plasticity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4.1 Biological relevance . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4.2 Supervised learning . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4.3 Unsupervised learning . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 Initial form feature extraction and models for object recognition . . . . . . 9
1.5.1 Form . . . . . . . . . . . . . . . . . . . . . . . . 9
1.5.2 Models for object recognition . . . . . . . . . . . . . . . . . . . . . 10
1.6 Outline of the thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2 Depicting the 3D Shape of Objects and Surfaces 15
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1.1 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.1.2 Previous work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.1 Extracting ground-truth curvature information from the 3d model . 20
2.2.2 Evidence for curvature orientation in image space . . . . . . . . . . 24
2.2.3 A biological model for the extraction of curvature information . . . 24
2.3 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3.1 Model input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3.2 Evaluation of principal curvature orientations and anisotropy . . . . 34
2.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.4.1 Limitations of the model . . . . . . . . . . . . . . . . . . . . . . . . 41
2.4.2 Generalization of the model . . . . . . . . . . . . . . . . . . . . . . 42
3 Extraction of Surface-related Features 45
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.2.1 Overview of the model architecture . . . . . . . . . . . . . . . . . . 47
3.2.2 Detailed description of model components . . . . . . . . . . . . . . 49
3.2.3 Read-out and interpretation of model activities . . . . . . . . . . . 53
3.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.3.1 Robustness to noise . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.3.2 Extraction of junction configurations . . . . . . . . . . . . . . . . . 59
3.3.3 Processing of illusory contours . . . . . . . . . . . . . . . . . . . . . 61
3.3.4 Processing of real-world data . . . . . . . . . . . . . . . . . . . . . 63
3.3.5 Quantitative evaluation and comparison . . . . . . . . . . . . . . . 63
3.3.6 Simulations with dynamic input stimuli . . . . . . . . . . . . . . . . 64
3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.4.1 Summary of findings . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.4.2 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.4.3 Biological plausibility of model components . . . . . . . . . . . . . 70
3.4.4 Evidence for representation of junctions and corners in visual cortex 70
3.4.5 The role of junctions in visual perception . . . . . . . . . . . . . . . 72
4 Learning of form and motion patterns in social interaction 77
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.2 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.2.1 Biological motivation and model overview . . . . . . . . . . . . . . 79
4.2.2 Processing in the form pathway . . . . . . . . . . . . . . . . . . . . 80
4.2.3 Processing in the motion pathway . . . . . . . . . . . . . . . . . . . 82
4.2.4 Combination of motion and form signals . . . . . . . . . . . . . . . 84
4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.3.1 Model input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.3.2 Motion pathway . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.3.3 Form pathway . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.3.4 Combination of form and motion information . . . . . . . . . . . . 92
4.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.4.1 Summary of findings . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.4.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.4.3 Biological Plausibility . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.4.4 Limitation of the model . . . . . . . . . . . . . . . . . . . . . . . . 106
4.4.5 Open questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5 Summary 107
5.1 A survey of major results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.2 Relevant publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
Summary (German) 125
Chapter 1
Introduction
1.1 Motivation
The ability to see is one of the most fundamental skills of our species to interact with
our environment. The perception and correct interpretation of an ordinary scene