107
pages
English
Documents
2006
Le téléchargement nécessite un accès à la bibliothèque YouScribe Tout savoir sur nos offres
107
pages
English
Documents
2006
Le téléchargement nécessite un accès à la bibliothèque YouScribe Tout savoir sur nos offres
Publié par
Publié le
01 janvier 2006
Nombre de lectures
20
Langue
English
Poids de l'ouvrage
1 Mo
Publié par
Publié le
01 janvier 2006
Langue
English
Poids de l'ouvrage
1 Mo
orkloadW
A
delingMo
for
Computers
Baiyi
Song
hResearcoticRobInstitute
hnologyecTInformationSection
The
thesis
UniversityofDortmund
theforsubmitted
Doktor-Ingenieur
degree
of
arallelP
tswledgemenknoAc
Iwanttothankallthosewhomadethisworkpossiblebytheirsupports.
Inparticular,Iwanttoexpressmyappreciationtomydoctorialadvisor,
Prof.Dr.-Ing.UweSchwiegelshohn,forthesourceofinstructiveguidance
andsupportsoverthepastyears.
IamgratefultoallpeopleatRoboticResearchInstituteattheUniversity
ofDortmundfortheexcellentworkingatmosphere.Especially,Iwouldalso
liketothankDr.-Ing.RaminYahyapour,whoseeffortsandexpertisehelp
metofinishmystudy.IwouldliketoexpressmygratitudetoDipl.-Inform.
CarstenErnemann,whogivememanyhelpfulsuggestionsandsupports.
IwouldalsoliketothanktheGraduateSchoolofProductionEngineering
andLogisticsattheUniversityofDortmund,whichprovidesmenotonly
thefinancialsupportsbutalsothebroadcontactswithotherdepartments.
IwouldliketothankmanyofmychinesefriendswiththemIhadapleasure
und.Dortmintime
Atlast,Iwouldliketothankmyparents,mywifeandmyson,whoalways
encouragemewiththeirlove,beliefandpatience.
Abstract
Theavailabilityofgoodworkloadmodelsisessentialforthedesignandanal-
ysisofparallelcomputersystems.Aworkloadmodelcanbeapplieddirectly
inanexperimentalorsimulationenvironmenttoverifynewschedulingpoli-
ciesorstrategies.Moreover,itcanbeusedforextrapolatingandpredicting
futureworkloadconditions.Inthiswork,wefocusontheworkloadmod-
elingforparallelcomputers.Tothisend,westartwithanexaminationof
theoverallfeaturesoftheavailableworkloads.Here,wefindastrongse-
quentialdependencyinthesubmissionseriesofcomputationaljobs.Next,
anewapproachusingMarkovchainsisproposedthatiscapableofdescrib-
ingthetemporaldependency.Second,weanalyzethemissingattributes
insomeworkloads.Ourresultsshowthatthemissinginformationcanbe
stillrecoveredwhentherelevantmodelistrainedfromothercompletedata
set.Basedontheresultsofoverallworkloadanalysis,webegintoinspect
theworkloadcharacteristicsbasedonparticularuser-levelfeatures.That
is,weanalyzeindetailhowtheindividualusersuseparallelcomputers.
Inparticular,weclustertheusersintoseveralmanageablegroups,while
eachofthesegroupshasdistinctfeatures.Thesedifferentgroupsprovide
aclearexplanationfortheglobalcharacteristicsofworkloads.Afterwards,
weexaminetheuserfeedbacksandpresentanovelmethodtoidentifythem.
Theseevidencesindicatethatsomeusershaveanadaptivetendencyanda
completeworkloadmodelshouldnotignoretheusers’feedbacks.Thework
endswithabriefconclusiononthediscussedmodelingaspectsandgivesan
outlookonfuturework.
tstenCon
ductiontroIn11.1Motivation..................................
1.2OrganizationoftheThesis..........................
2WorkloadModelingforParallelComputers
2.1ParallelComputingEnvironment......................
2.2PerformanceEvaluationaUsingWorkloadModel.............
2.3ProblemswithExistingApproaches....................
2.4ProposalofaNewWorkloadModel.....................
3InvestigationofTemporalRelations
3.1ObservationsonTemporalRelations....................
3.2ModelingusingMarkovChainModel....................
3.2.1MarkovChainConstruction.....................
3.2.2CombinationofTwoMarkovChains................
3.3ExperimentalResults.............................
3.3.1StaticComparison..........................
3.3.2ComparisonofTemporalRelations.................
3.4Summary...................................
4AnalyzingMissingInformationinWorkloads
4.1MissingEstimatedRuntime.........................
4.2AnalysisofEstimatedRuntimeinCompleteWorkloads.........
4.2.1EstimationAccuracy.........................
4.2.2UserEstimations...........................
4.3ParameterizedDistributionModel.....................
4.3.1ModelSelection............................
4.3.2ModelingEstimatedRuntime....................
4.4ExperimentalResults.............................
i
114
77101216
212124272932323434
373738383939414347
ii
5
6
7
8
CONTENTS
4.4.1StatisticComparison.........................
4.4.2DerivingaGeneralModel......................
4.5Summary...................................
UserGroup-basedAnalysisandModeling
5.1AnalysisofUser-levelSubmissions.....................
5.2ClusteringUsersintoGroups........................
5.2.1DataPreprocessing..........................
5.2.2JobClustering............................
5.2.3UserGrouping............................
5.2.4WorkloadModelingofIdentifiedUserGroups...........
5.3GenerationofSyntheticWorkloads.....................
5.4ExperimentalResults.............................
5.4.1AnalysisofJobCharacteristicfromUserGroups.........
5.4.2StatisticalComparison........................
5.5Summary...................................
ExaminationofImplicitUserFeedbacks
6.1FeedbackExamination............................
6.2FeedbackAnalysis..............................
6.2.1ImplicitFeedbackFactors......................
6.2.2ProblemSpecification........................
6.2.3MethodSelection...........................
6.3Methodology.................................
6.3.1DataPreprocessing..........................
6.3.2FittingLinearRegressionModel..................
6.4ExperimentalResults.............................
6.4.1FeedbackVisualization........................
6.4.2ExclusionfromOtherFactors....................
6.4.3ComparisonoverAllWorkloads...................
6.5Summary...................................
DiscussionsandFurtherWork
7.1MoreObservationsoftheWorkloads....................
7.2FurtherWork.................................
7.3Summary...................................
Conclusion
474849
515153545455565757585960
6767686869707070727273747577
79798183
85
FiguresofList
2.12.22.32.42.52.62.7
3.13.23.33.43.5
4.14.24.34.44.54.6
5.15.25.35.4
Ajobisrepresentedbyarectangleinourstudy..............
Ajobisscheduledbyaschedulingsystem..................
Amediumnumberofuserssubmitjobstoaparallelsystem........
Theglobalworkloadmodelingstructure..................
Histogramofjobruntime..........................
HistogramofjobparallelismintheKTHworkload............
Anovelworkloadmodelstructure.....................
Continuousjobsubmissions.........................
Temporalrelationinthesequenceoftheparallelism...........
Temporalrelationinthesequenceoftheruntime.............
ProcessofcorrelatedMarkovchainsmodel.................
Comparisonofmodeledandoriginaldistributionsofruntimeandparal-
lelism.....................................
Therelationsoftheaccuracytotherealruntime.Here,onlythejobs
whoseaccuraciessmallerthan1.1areconsidered..............
Comparisonofmodeled(SIM)andoriginal(REAL)accuracies.....
Comparisonofmodeled(SIM)andoriginal(REAL)estimatedruntime.
TherelationsbetweentherealruntimeandtheparametersoftheBeta
distributionsforKTH.TherelationssimilarforthetracesofSDSCSP2
andCTC...................................
Comparisonofthesynthetic(SIM)realestimatedruntimes(REAL)..
Comparisonofthesynthetic(SIM)andrealestimatedruntimes(REAL)
Comparisonoftheaveragejobparallelismforindividualusers......
ProcessoftheMUGMmodel........................
AlgorithmforthePAMClustering.....................
2usergroupsareclusteredinKTHwiththeMUGMmodel........
iii
89912141516
2325262733
404444454747
52535659
iv
5.55.65.7
6.16.26.36.4
7.17.27.37.4
LISTOFFIGURES4usergroupsareclusteredinKTHwiththeMUGMmodel........
6usergroupsareclusteredinKTHwiththeMUGMmodel........
4usergroupsareclusteredinLANLwiththeMUGMmodel.......
HistogramofthenumberofwaitingjobsinKTH.............
FeedbackdiscoveriesinKTH........................
FeedbackdiscoveriesinCTC........................
Feedbackdiscoveriesinthetimeframefrom1pmto4pminKTH....
DailycycleinKTH.Left:thenumberofjobsinthedifferenthoursof
day.Right:theaverageruntimeinthedifferenthoursofday......
WeekdayeffectinKTH.Left:thenumberofjobsinthedifferentdays
ofweek.Right:averageruntimeinthedifferentdaysofweek......
CombinationofcorrelatedMarkovchainandtheusergroupmodel..
ApplicationofNeuralnetworkforworkloadmodeling..........
606263
71747576
80808182
ablesTofList
2.1
3.13.23.33.43.53.63.7
4.14.24.34.44.54.64.74.84.9
5.15.25.3
UsedworkloadsfromtheSWFArchive...................
Percentageofneighboringjobswiththesameparallelismvalues.....
ExampleforderivingthestatesoftheMarkovchainfortheparallelism
DimensionsoftheMarkovchainsfortheparallelismandtheruntime..
Correlationcor0andcor1inexaminedworkloads.............
StaticComparisonofthemodeledandtheoriginalworkloads......
Comparisonofthecorrelationoftheparallelismandtheruntimefrom
theMarkovchainmodelandtheLublin/Feitelsonmodel.........
Comparisonoftheautocorrelationρ1oftheparallelismandtheruntime
sequencesfromMCM,PDM,andtheoriginalworkloads.........
18
22282931333434
Summaryofmissingestimatedruntimefortheexaminedworkloads...38
Analysisofthejobsexceedingtheirrealruntimes.............39
SummarizedalignmentoftheestimatedruntimeswithintheKTH,SDSC
SP2,CTCworkloads,totalalignment:82.7%...............41
AlignmentoftheestimatedruntimeswithintheLANLandKTHworkloads