ECOLE CENTRALE PARIS
PHD THESIS
to obtain the title of
Doctor of Ecole Centrale Paris
Specialty: APPLIED MATHEMATICS
Defended by
Olivier TEBOUL
Shape Grammar Parsing:
Application to Image-based Modeling
prepared at Ecole Centrale de Paris, MAS laboratory
defended on June 1, 2011
Jury:
Chairman: Pr. Marc SCHOENAUER - Inria Saclay
Reviewers: Pr. Jiri MATAS - Czech Technical University, Prague
Pr. Marc POLLEFEYS - ETH Zurich
Advisor: Pr. Nikos PARAGIOS - Ecole Centrale de Paris
Examiners: Pr. Carsten ROTHER - Microsoft Research Cambridge
Pr. Sylvain LEFEBVRE - Inria Nancy
Pr. Renaud KERIVEN - Ecole Polytechnique - Acute3D
Pr. Iasonas KOKKINOS - Ecole Centrale Paris
tel-00628906, version 1 - 6 Oct 2011
Acknowledgments
First of all, I would like to sincerely acknowledge my advisor Nikos Paragios, both as a supervisor and as a person, for his strong support, the total liberty he gave me and the numerous opportunities he offered me along the way. I had a wonderful time working at Ecole Centrale Paris under his supervision during these three years and I will not forget it. Then I would like to thank Microsoft France and Microsoft Research Cambridge, and more specifically Pierre-Louis Xech and Fabien Petitcolas, for having supported my work with complete trust and freedom, putting me in the best possible conditions to carry on my research.
Second, I would like to thank all the committee members, the two reviewers Jiri Matas and Marc Pollefeys, the chairman Marc Schoenauer, and the examiners Carsten Rother, Renaud Keriven, Sylvain Lefebvre, and Iasonas Kokkinos, for having taken the time to read and evaluate my work, for their relevant comments and questions and for the fruitful remarks they provided in their reports and during the defense.
Third, I would like to acknowledge the people with whom I have collaborated the most, that is to say Loïc Simon and Panagiotis Koutsourakis. We spent a lot of time together, shared a lot together and therefore I owe them a lot both professionally and personally. Moreover, I had great pleasure working with Iasonas Kokkinos on very exciting problems. Iasonas has really been a second adviser to me and he has all my gratitude. I would also like to thank my colleagues who bore with me in the same room during these years with great patience and kindness and whom I really consider true friends: Aris Sotiras, Wang Chaohui, Loïc Simon, Panos Koutsourakis, Ahmed Besbes, Salma Essafi and Radhouene Neji. My gratitude also goes to my colleagues who reviewed my work: Ahmed, Aris, Chaohui, Loïc and Sarah Parisot, and on a larger scale, I would like to thank all the members of the vision and medical imaging team: Maxime, Noura, Martin, Charlotte, Olivier, Regis, Daniel, Mickael, Georg, Fabrice, Nicolas, Pierre-Yves, Katerina, Bo, Sarah, Stravros and Haithem, with whom it was a pleasure to work on a daily basis. I do not forget the people who visited the lab for a while but long enough to become friends: Kostas, who supervised my work at the beginning and with whom it is always a pleasure to share a coffee whenever I visit Athens, Rola from Montreal and José from Barcelona.
Then, I would like to acknowledge all the members of the MAS laboratory, with special thanks to Sylvie Dervin, who has always been one of my first supporters, Erick Herbin, Marc Aiguier, Céline Hudelot and Florian De Vuyst, who offered me the possibility to teach at Ecole Centrale Paris, and Paul-Henry Cournède, Frederic Abergel, Patrick Callet and Gilles Faye for their kindness and with whom it is always a pleasure to have a discussion.
Moreover, I would like to thank the people from outside with whom I had the great opportunity to collaborate. First of all Sylvain Lefebvre, with whom I really enjoyed collaborating on a very exciting project and who invited me, along with Georges Drettakis, to their lab in Sophia Antipolis in 2009. Then Giorgos Tziritas and Nikos Komodakis, who invited me to Heraklio, Crete during the summer of 2009, where I had a great time on a personal and professional level. Alan Yuille, who invited me to UCLA in summer 2010 and with whom I had the great pleasure of taking the time to discuss vision problems in depth.
Outside of the professional sphere, I would like to thank all my family, who always showed me love and support and who were more important to me than I may have shown. In particular, I would like to thank Aga, who has always been my first and unconditional support, always bringing me love and happiness. Beyond that, you all helped me in your own way and to all of you I dedicate this thesis.
There are also some friends whom I consider family and whom I would also like to acknowledge. They all supported and helped me, sometimes without even knowing it. Ali Ezzatyar, Razvan Ionasec, Bastien Grandcoin, Miloud Chahlafi, Remy Beharaing, Sinan Al Awabdh, Alba Jimenez and Christina Papista are all like brothers and sisters and they have my everlasting gratitude.
Ultimately, I would like to thank all my friends and the people who shared some time with me during my Ph.D., making me forget about my research for a while. Those are my numerous friends from Paris, wherever they initially come from, my volleyball team, the people from Crete who hosted me on their beautiful island, all the very nice people I met in California, my Brazilian friends and my Portuguese teacher. Last but not least, I would like to thank Luiza Machado for all she brought me and who certainly changed my life more than anybody else during these three years.
Abstract
The purpose of this thesis was to perform facade image parsing with shape grammars in order to tackle single-view image-based 3D building modeling. The scope of the thesis lay at the border of Computer Graphics and Computer Vision, both in terms of methods and applications.
Two different and complementary approaches have been proposed: a bottom-up parsing algorithm that aimed at grouping similar regions of a facade image so as to retrieve the underlying layout, and a top-down parsing algorithm based on a very powerful framework: Reinforcement Learning. This novel parsing algorithm uses pixel-wise image supports based on supervised learning in a global optimization of a Markov Decision Process.
Both methods were evaluated quantitatively and qualitatively. The second one was shown to support various architectures, several shape grammars and image supports, and showed robustness to challenging viewing conditions: illumination and large occlusions. The second method outperformed the state of the art both in terms of segmentation and speed performance. It also provides a much more flexible framework, in which many extensions may be envisioned.
The conclusion of this work was that the problem of single-view image-based 3D building modeling could be solved elegantly by using shape grammar as a Rosetta stone to decipher the language of Architecture through a well-suited Reinforcement Learning formulation. This solution was a potential answer to large-scale reconstruction of urban environments from images, but also suggested the possibility of introducing Reinforcement Learning in other vision tasks such as generic image parsing, where it has been barely explored so far.
Keywords: Computer Vision, Computer Graphics, Procedural Modeling, Image Parsing, Reinforcement Learning, Image-based Modeling.
Résumé
The objective of this thesis was to solve the problem of facade image parsing with a procedural shape prior, with a view to applying it to 3D building modeling from a single image. The thesis lies at the border of computer graphics and computer vision, both in terms of the methods employed and of the potential applications.
Two complementary approaches have been proposed: a so-called bottom-up method, which seeks to group similar regions of the image in order to reveal the underlying structure of the facade; and a so-called top-down method, based on the powerful principles of reinforcement learning. This new algorithm combines local measurements obtained from supervised learning methods in a global optimization of a Markov Decision Process, which discovers the grammar of the building over the iterations.
Both methods were evaluated qualitatively and quantitatively. The results obtained proved to be much better than the state of the art in terms of speed and segmentation quality, but also in terms of the flexibility of the method and of its possible extensions. This algorithm was extensively tested on different types of shape grammars, on different architectural styles, with different image measurements, and proved to be particularly robust to illumination conditions and occlusions.
In conclusion, shape grammars can be used as a Rosetta stone to decipher the language of architecture and thus make it possible to model a 3D building from a single image, through a new algorithm derived from reinforcement learning. On the one hand, the developed method provides an answer to the problem of large-scale 3D urban reconstruction from images, and on the other hand it suggests potential applications of reinforcement learning in computer vision