Chapter 1
PARAMETRIC LINK MODELS FOR KNOWLEDGE TRANSFER IN STATISTICAL LEARNING
Beninel F.¹, Biernacki C.², Bouveyron C.³, Jacques J.∗² and Lourme A.⁴
¹ CREST-ENSAI, Bruz, France
² Université Lille 1 & CNRS & INRIA, Lille, France
³ Université Paris 1 Panthéon-Sorbonne, Paris, France
⁴ Université de Pau et des Pays de l'Adour, Pau, France
Abstract
When a statistical model is designed for prediction, a major assumption is the absence of evolution in the modeled phenomenon between the training and the prediction stages. Thus, training and future data must lie in the same feature space and must have the same distribution. Unfortunately, this assumption often turns out to be false in real-world applications. For instance, biological motivations could lead to classifying individuals from a given species when only individuals from another species are available for training. In regression, we would sometimes like to use a predictive model on data that do not have exactly the same distribution as the training data used for estimating the model. This chapter presents techniques for transferring a statistical model estimated from a source population to a target population. Three tasks of statistical learning are considered: probabilistic classification (parametric and semi-parametric), linear regression (including mixture of regressions) and model-based clustering (Gaussian and Student). In each situation, the knowledge transfer is carried out by introducing parametric links between both populations. The use of such transfer techniques can improve learning performance while avoiding much of the expensive data labeling effort.
Key Words: Adaptive estimation, link between populations, transfer learning, classification, regression, clustering, EM algorithm, applications.
AMS Subject Classification: 62H30, 62J99.
∗ E-mail address: julien.jacques@polytech-lille.fr
1. Introduction
Statistical learning [18] is a key tool for many scientific and applied areas since it makes it possible to explain and predict diverse phenomena from the observation of related data. It leads to a wide variety of methods, depending on the particular problem at hand. Examples of such problems are numerous:
• Examples $E_1$: In Credit Scoring, predict whether borrowers will pay back a loan, on the basis of information known about these customers; in Medicine, predict the risk of lung cancer recurrence for a patient treated for a first cancer, on the basis of the type of treatment used for the first cancer and of clinical and demographic measurements for that patient.
• Examples $E_2$: In Economics, predict the housing price on the basis of several housing descriptive variables; in Finance, predict the profitability of a financial asset six months after purchase.
• Examples $E_3$: In Marketing, create customer groups according to their purchase history in order to target a marketing campaign; in Biology, identify groups in a sample of birds described by some biometric features, groups which finally reveal the presence of different genders.
In a typical statistical learning problem, a response variable $y \in Y$ has to be predicted from a set of $d$ feature variables (or covariates) $x = (x_1, \ldots, x_d) \in X$. The spaces $X$ and $Y$ are usually quantitative or categorical. It is also possible to have heterogeneous feature variables (both quantitative and categorical, for instance). The analysis always relies on a training data set $S = (\mathbf{x}, \mathbf{y})$, in which the response and feature variables are observed for a set of $n$ individuals and are respectively denoted by $\mathbf{x} = (x_1, \ldots, x_n)$ and $\mathbf{y} = (y_1, \ldots, y_n)$. Using $S$, a predictive model is built in order to predict the response variable for a new individual, for which the covariates $x$ are observed but not the response $y$. This typical situation is called supervised learning. In particular, if $Y$ is a categorical space, it corresponds to a discriminant analysis situation, which aims to solve problems like Examples $E_1$. If $Y$ is a quantitative space, it corresponds to a regression situation and aims to solve problems similar to Examples $E_2$. Note also that if $\mathbf{y}$ is only partially known in $S$, the situation is called semi-supervised learning.
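The supervised setting described above can be illustrated with a minimal sketch in the regression case. It assumes a single covariate ($d = 1$), simulated data and an ordinary least-squares fit; the function names `fit_simple_regression` and `predict` and all numeric values are hypothetical choices made for illustration only.

```python
import random

def fit_simple_regression(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one covariate."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(model, x_new):
    """Predicted response for a new individual with covariate x_new."""
    intercept, slope = model
    return intercept + slope * x_new

rng = random.Random(0)

# Training set S = (x, y): covariate and response observed for n = 200 individuals.
# The generating model y = 1 + 2x + noise is an assumption of this sketch.
x_train = [rng.uniform(0.0, 10.0) for _ in range(200)]
y_train = [1.0 + 2.0 * x + rng.gauss(0.0, 0.5) for x in x_train]

model = fit_simple_regression(x_train, y_train)

# New individual: the covariate is observed, the response is not.
y_hat = predict(model, 4.0)
```

Here `y_hat` plays the role of the predicted response for the new individual; in a discriminant analysis situation the same scheme applies with a categorical response and a classifier in place of the regression line.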
Another typical statistical learning problem consists in predicting all the responses $\mathbf{y}$ without ever having observed them. In this case only the feature variables are known, thus $S = \mathbf{x}$, and it corresponds to an unsupervised learning situation. If $Y$ is restricted to a categorical space (the most frequent case), the purpose is clustering, related problems being illustrated by Examples $E_3$.
In this chapter, we focus on statistical modeling for solving both supervised and unsupervised learning problems. Many classical probabilistic methods exist and we will give useful references, when necessary, throughout the chapter. The reader interested in such references is thus invited to look at the related sections below.
A main assumption in supervised learning is the absence of evolution in the modeled phenomenon between the training of the model and the prediction of the response for a new individual. More precisely, the new individual is assumed to arise from the same statistical population as the training individuals. In unsupervised learning, it is also implicitly assumed that all individuals arise from the same population. Unfortunately, such classical hypotheses may not hold in many realistic situations, as reflected by the revisited Examples $E_1$ to $E_3$:
• Examples $E_1^*$: In Credit Scoring, the statistical scoring model has been trained on a data set of customers but is used to predict the behavior of non-customers; in Medicine, the risk of lung cancer recurrence is learned for a European patient but will be applied to an Asian patient.
• Examples $E_2^*$: In Economics, a real-estate agency long established on the US East Coast aims to conquer new markets by opening several agencies on the West Coast, but both markets are quite different; in Finance, expertise on financial assets from the past year surely differs from that of the current year.
• Examples $E_3^*$: In Marketing, the customers to be classified correspond in fact to a pooled panel of new and older customers; in Biology, different subspecies of birds are pooled together and may consequently have highly different features for the same gender.
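The effect of violating the same-population assumption can be sketched numerically. The example below is only an illustration under simulated data: a regression line is fitted on a source sample and then evaluated both on new source individuals and on individuals from a target population whose slope differs. The populations, the slopes (2 and 3) and the helper names `fit_line`, `mse` and `sample` are assumptions of this sketch.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def mse(model, xs, ys):
    """Mean squared prediction error of the line (a, b) on the sample (xs, ys)."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def sample(rng, n, a, b, noise_sd=0.5):
    """Simulate n individuals from y = a + b * x + Gaussian noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [a + b * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

rng = random.Random(1)
x_src, y_src = sample(rng, 300, 1.0, 2.0)          # training data (source population)
x_src_new, y_src_new = sample(rng, 300, 1.0, 2.0)  # new individuals, same population
x_tgt, y_tgt = sample(rng, 300, 1.0, 3.0)          # individuals from another population

model = fit_line(x_src, y_src)
mse_same_population = mse(model, x_src_new, y_src_new)
mse_other_population = mse(model, x_tgt, y_tgt)
```

In this simulated setting the prediction error on the other population is much larger than on the training population, which is the kind of degradation the revisited examples describe.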
In the supervised setting, the question is $Q_1$: Is it necessary to collect new training data and to build a new statistical learning model, or can the previous training data still be useful? In the unsupervised setting, the question is $Q_2$: Is it better to perform a single clustering on the whole data set or to perform several independent clusterings on some identified subsets?
Question $Q_1$ is addressed by transfer learning, and a general overview is given in [31]. Transfer learning techniques aim to transfer the knowledge learned on a source population $\Omega$ to a target population $\Omega^*$, in which this knowledge will be used for prediction. These techniques are divided into two important situations, according to whether the transfer of a model does need or does not need to observe some response variables in the target domain. The first case is referred to as inductive transfer learning whereas the second one is referred to as transductive transfer learning. Usually, the classification purpose described in Examples $E_1^*$ can be solved by either transductive or inductive transfer learning, the choice depending on the model at hand (generative or predictive models). In contrast, the regression purpose described in Examples $E_2^*$ can only be solved by inductive transfer learning since only predictive models are involved. Question $Q_2$ is addressed by unsupervised transfer learning. It corresponds to the simultaneous clustering of several samples and thus concerns Examples $E_3^*$.

A common expected advantage of all these transfer learning techniques is a real predictive benefit, since knowledge learned on the source population is used in addition to the available information on the target population. However, the common challenge is to establish a transfer function between the source and the target populations. In this chapter, we focus on parametric statistical models. Besides being good competitors to nonparametric models in terms of prediction, these models have the advantage of being easily interpreted by practitioners. Since parametric models will be used, it is natural to model the transfer function by some parametric links. Thus, in addition to a predictive benefit, the interpretability of the link parameters will give practitioners useful information on the evolution of, and the differences between, the source and target populations.
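To make the idea of a parametric link concrete, here is a minimal sketch of inductive transfer in a regression setting. It assumes one hypothetical form of link, chosen for illustration and not necessarily one of the links studied in the later sections: the target intercept equals the source intercept and the target slope equals `lam` times the source slope. The single link parameter `lam` is estimated by least squares from a few labeled target individuals. All data are simulated, and the names `fit_line`, `fit_slope_link`, `mse` and `sample` are illustrative.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def fit_slope_link(source_model, xs, ys):
    """Least-squares estimate of lam in y = a + lam * b * x, with (a, b) held fixed."""
    a, b = source_model
    num = sum(b * x * (y - a) for x, y in zip(xs, ys))
    den = sum((b * x) ** 2 for x in xs)
    return num / den

def mse(model, xs, ys):
    """Mean squared prediction error of the line (a, b) on the sample (xs, ys)."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def sample(rng, n, a, b, noise_sd=0.5):
    """Simulate n individuals from y = a + b * x + Gaussian noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [a + b * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

rng = random.Random(2)
x_src, y_src = sample(rng, 500, 1.0, 2.0)    # large labeled source sample
x_tgt, y_tgt = sample(rng, 10, 1.0, 3.0)     # few labeled target individuals
x_test, y_test = sample(rng, 500, 1.0, 3.0)  # target individuals to predict

source_model = fit_line(x_src, y_src)
lam = fit_slope_link(source_model, x_tgt, y_tgt)  # assumed link: slope* = lam * slope
linked_model = (source_model[0], lam * source_model[1])

mse_source_as_is = mse(source_model, x_test, y_test)
mse_linked = mse(linked_model, x_test, y_test)
```

In this simulation the estimated `lam` is close to 1.5, the ratio of the target slope to the source slope, so the link parameter itself summarizes how the two populations differ, and only one parameter has to be learned from the small target sample instead of a full new model.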
This chapter is organized as follows. Section 2 presents transfer learning for different discriminant analysis contexts: Gaussian model (continuous covariates), Bernoulli model (binary covariates) and logistic model (continuous or binary covariates). Section 3 considers the transfer of regression models for a quantitative response variable in two situations: usual regression and mixture of regressions. Finally, Section 4 proposes models to cluster simultaneously a source and a target population, again in two situations: mixtures of Gaussian and Stu