On Improvements of the Varro19 o Benchmark for Graph ...

icon

43

pages

icon

English

icon

Documents

Écrit par

Publié par

Le téléchargement nécessite un accès à la bibliothèque YouScribe Tout savoir sur nos offres

icon

43

pages

icon

English

icon

Documents

Le téléchargement nécessite un accès à la bibliothèque YouScribe Tout savoir sur nos offres

Universitat Karlsruhe (TH)
Forschungsuniversitat gegrundet 1825
Fakultat fur Informatik
Institut fur Programmstrukturen
und Datenorganisation
Lehrstuhl Prof. Goos
On Improvements of the Varr o Benchmark
for Graph Transformation Tools
Rubino Gei Moritz Kroll
December 5, 2007
Technical Report 2007-7
ISSN 1432-7864 ii ABSTRACT
In 2004 G. Varr o, A. Schurr, and D. Varro proposed the rst graph transformation benchmark
ever. Varr o et al. also published results for tools like AGG,PROGRES, andFujaba. While
repeating some of his measurements, we were not able to reproduce his gures and ndings.
This paper reports our suggestions for improving the so-called Varr o benchmark.
iii iv CONTENTS
1 Introduction and Problem 1
1.1 The Published Test Charts . . . . . . . . . . . . . . . . . . . . . 3
1.1.1 What has been Measured? . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.2 The STS Benchmark . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.3 The Log Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.4 Incomplete Log Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.5 The Publications of Benchmark Results . . . . . . . . . . . . . . . . . 4
1.2 The Implementation of the Chronometry . . . . . . . . . . . . . . . 7
1.2.1 Time Measurement for Fujaba and AGG . . . . . . . . . . . . . . . . 7
1.2.2 Timet for PROGRES . . . . . . . . . . . . . . . . . . . 7
1.3 Design of the Chronometry . . . . . . . . . . . . . . . . . . . . . ...
Voir icon arrow

Publié par

Nombre de lectures

60

Langue

English

On
Universit¨atKarlsruhe(TH) Forschungsuniversita¨tgegnu¨r1ted528
Fakult¨atf¨urInformatik Institutf¨urProgrammstrukturen und Datenorganisation Lehrstuhl Prof. Goos
ImprovementsoftheVarro´Benchmark for Graph Transformation Tools
Rubino Geiß
Moritz Kroll
December 5, 2007
Technical Report 2007-7 ISSN 1432-7864
ii
A B S T R A C T
In2004G.Varro´,A.Schu¨rr,andD.Varr´oproposedtherstgraphtransformationbenchmark ever.Varr´oetal.alsopublishedresultsfortoolslikeAGG,PROGRES, andFujaba. While repeating some of his measurements, we were not able to reproduce his figures and findings. Thispaperreportsoursuggestionsforimprovingtheso-calledVarro´benchmark.
iii
iv
1
2
A
B
Introduction
and Problem
C O
N T E N T
S
1
1.1The Published Test Charts. . . . . . . . . . . . . . . . . . . . . 3 1.1.1 What has been Measured?. . . . . . . . . . . . . . . . . . . . . . . . . 3 1.1.2 The STS Benchmark. . . . . .  3. . . . . . . . . . . . . . . . . . . . . . 1.1.3 The Log Files. . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . .  4 1.1.4 Incomplete Log Files. . . . . . . . . . . . . . . . . . . . . . 4. . . . . . 1.1.5 The Publications of Benchmark Results 4. . . . . . . . . . . . . . . . . 1.2The Implementation of the Chronometry 7. . . . . . . . . . . . .. . 1.2.1 Time Measurement forFujabaandAGG. . . . . . . . . . . . . 7. . . 1.2.2 Time Measurement forPROGRES. . .  7. . . . . . . . . . . . . . . . 1.3Design of the Chronometry 9. . . . . . . . . . . . . . . . . . . . . 1.3.1 Timestamp Chronometry withprintfConsidered Harmful 9. . . . . . 1.3.2 Interpretation 11. . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . 1.3.3Varr´osAnalysis. . . . . .  12. . . . . . . . . . . . . . . . . . . . . . . . 1.4Design of Rules for Fujaba. . . . . . . . . . . . . . . . . .. . .  18 1.4.1 Linear-time vs. Constant-time Matching. . . . . . . . . . . . . . 18. . . 1.4.2 Underspecified Meta Model for STSone 20. . . . . . . . . . . . . .. . .
Improvements and Results
2.1Improvements. . . . . . . . . . . . . . . . . . . . .. . . . . 2.2Improved Results forFujaba. . . . . . . . . . . . . . . . . . . . 2.3Conclusion. . . . . . . . . . . . . . . . . . . . . . . . . . .
Source Code
A.1Timestamp Chronometry Considered Harmful. . . . . . . . . . . . . A.2STSmany Improved. . . . . . . . . . . . . . . . . . . . . . . . A.3STSveryOne. . . . . . . . . . . . . . . . . . . . . .. . . . .
The
Benchmark Computer
v
21
21 21 22
25
26 30 33
35
vi
C H A P T E R
1
I N T R O D U C T I O N A N D
P R O B L E M
In 20041htephratgrsorsfantrebnoitamwkramhcnoposasprG.VaedbyA,S.rro´rr,hcu¨ andD.Varro´[VSV05ayenaG.ar.W]hiit´reoat.lnaDdV.raishedperalsopublecnamrof figures for tools likeAGG,PROGRES, andFujabain two articles [VFV05,VSV05b] as well as on a web page [Var05 web page includes a detailed test chart.]. This the Additionally, source codes for the measurements of the contemplated tools are available there. We reused someoftheoriginalguresofVarro´inourownpublications[GBG+06,Kro07]. We reprint these results in Figure1.1; Table1.1contains the details for Figure1.1as well as further results. Figure1.1contains data points for an improved version ofGrGen(SP) that was not available for the ICGT 2006 proceedings [GBG+06prestwas],buocehrefnetnettadceene.W measuredAGGon our own (see section1.1.4). Besides adding measurements for our tool GrGen[BG07,Gei07tcronoafceitocrrmallbyasuressgaVdeo´rrerewlacs]2to match the originalguresofVarro´duetodierenthardware. WerecentlytriedtorepeatsomeofthemeasurementsofVarr´oet.al.[VFV05,VSV05b], especially those forFujaba Surprisingly,, on our own. we failed in reproducing his figures, at least by orders of magnitude. This obviously cannot be explained by different hardware, since we used roughly comparable computers. Therefore we examined the experimental design and setupofVarr´ocloselyinordertoexplainthefounddiscrepancies(seethischapter).Chapter2 presentsourresultsoftheimprovedVarr´obenchmark.
1publication dates you could also say 2005.Depending on the 2ugsihde0ybsertsweesuliplimultsuVeoTerorsra´r.68 which is the speed difference of both processors according to the SPEC organization [Sta05 this is the correct factor or not, is beyond our scope.]. Whether It simply does not matter if its correct value is, e.g. 0.33 or 3.0, because we talk about several orders of magnitude and even different complexity classes.
1
2
time
1h
1min
1s
100ms
10ms
1ms
100
250
500
750
Introduction and Problem
AGG
Varr´oDB PROGRES
GrGen(PSQL)
Fujaba
GrGen(SP)
size 1000
Figure 1.1: Running times of STS mutex benchmark (multiplicity optimizations off, param-eter passing off, simultaneous execution off; for parameter details see [VSV05a])
Table1.1:RunningtimesforseveraloftheVarro´benchmarks(inmilliseconds) Benchmark LTS simult. ALAPSTS ALAP Tool 1000 1000,1 10 10010 100 100 1000 1000 10 PROGRES 471 2,361 610,600 8 942,10012 946 1,267 459,000 21 AGG330 8,300 – – – 6,881,000 270 8,027 13,654,000>107 Fujaba 3,875 344 69 20 2,821 203 32 4,92740 305 Varro´DB4,69719,825593,50089314,088596,8001535373,130593,200 GrGen 1,180 406,000 – 760 27,715 24(PSQL) 30 – – 96,486 GrGen(SP)< 101 1<1 1 11<1<1 9 21
1.1The Published Test Charts
1.1 The Published Test Charts
3
The webpage [Var05ar.VfG]oatnoco´rrevessnitfeharwstsigoineslon)cstragol(etlahcts data acquired during measurements. The structure of these log files is similar for all tools, but unfortunately not identical. In this section, we’ll take a look at the structure of these log files to give clues on how to interpret the data.
1.1.1 What has been Measured?
TherearethreedierentmutexbenchmarksdenedbyVarr´o:STS(shorttransformationse-quence), ALAP (as long as possible), and LTS (long transformation sequence). These bench-marks can be modified to specifically test the impact of some optimizations: PP (parameter passing between rule applications), par (parallel matching and execution of graph updates), andone/many(indicatingthatmultiplicityoptimizationsareonoro,respectively).Varr´o suggests using only the following combinations: STSone, STSonePP, STSmany, STSmanyPP, ALAP, ALAPpar, LTS (cf. [VSV05b]). Moreover his benchmarks have one (or two in case of LTS) size parameters. In the following list we give hints on abnormalities3of the log files with respect to the above classification (the next subsections elaborate on how to read these log files):
AGGof parallel rewriting nor has multiplicity optimizations, so theseis neither capable benchmark variants are left out.
ThePROGRESlog files for ALAP, ALAPpar, and LTS differ because they contain the timestamps for the applications of the initialnewRule, whereas the log files forAGG andFujabaonly contain the “payload” rule applications.
ALAPpar differs forFujabaandPROGRESdue to issues regarding different modes of parallel rewriting of both tools.
asnoteiticausconsiderthemeasuWdenotooDr´ppBaacrobeh,emerfstnhtroraVe publicly available tool.
1.1.2 The STS Benchmark
TheSTSbenchmarkisoneofthemutexbenchmarksusedbyVarro´(cf.section1.1.1). We use this benchmark to illustrate our explanations. The STS benchmark of sizenexecutes the following sequence of rule applications:
applynewRulen2-times
applymountRuleonce
applyrequestRulen-times
repeat the following sequence of rule applicationsn-times:
applytakeRule applyreleaseRule applygiveRule
This can be concisely specified by the following graph rewrite sequence [BG07]:
newRule[n-2] & mountRule & requestRule[n] & (takeRule & releaseRule & giveRule)[n] 3These are not errors per se; this rather tries to prevent some potential misunderstandings.
4
1.1.3
The Log Files
Introduction and Problem
The log files for the three tools (AGG,Fujaba, andPROGRES) consist of three lines per single rule application. The logs for the Java based tools (AGGandFujaba) are prepared with log4j [Fou06]. For thePROGRESmdethctaegstarenVae´orrinerodCcithiolwto used custom code for logging (see section1.2.2). Listings1.1,1.2, and1.3are excerpts of a run of the STSmany benchmark of sizen Table= 10 for the discussed tools.1.2describes the content of each column of theAGG(Listing1.1) andFujaba(Listing1.2) log files, while Table1.3does the analogous forPROGRES(Listing1.3 Different column) log files. separators like “-”, “:”, and spaces are ignored for conciseness. Please note, that even if you consider all log files forPROGRES, you will not find any zero directly after the decimal point (like in 1110237924.096410) and sometimes the numbers after the decimal point are less than 6 digits long. This is erroneous (described in section1.2.2).
1.1.4 Incomplete Log Files
SomeloglesonVarro´swebpage[Var05are truncated, presumably due to prohibitive long] running time of the according benchmarks. OnlyAGG Inis affected. the publications of the benchmark results one can find corresponding interleaves [VFV05,VSV05b]. The authors state that they aborted measurements with long running times. In the following we give a list of all discontinued log or missing files:
STSmany1000 forAGG(see Listing1.4)
STSmanyPP1000 forAGG
ALAP1000 forAGG
STSone10, STSone100, STSone1000, ALAPpar10, ALAPpar100, and ALAPpar1000 for AGGare left out becauseAGGdoes not support the necessary features.
Due to these discontinued log files forAGG, we did our own measurements to get all data points forAGG.
1.1.5 The Publications of Benchmark Results
TherearetwopublicationsmadebytheoriginalauthorscontainingresultsoftheVarr´o benchmark [VFV05,VSV05b]. In this section we paraphrase our thoughts on how to relate the figures presented there with the raw test charts and source code published on the web site [Var05].
1.1
The Published Test Charts
Table 1.2: Description of a log file row for the Java-based tools Column Example from Listing1.1Description 1618Time in milliseconds (ms, 103sec) since the first log entry; this value is computed internally by log4j. 2[main]Irrelevant.The name of the thread calling log4j. This ismainfor all files. 3DEBUGIrrelevant. is ThisThe debug-level.DEBUGfor all files. 4hu. ... .STSmanyIrrelevant.The fully qualified name of the class containing the method calling log4j. 5giveRuleThe name of the current rule. 6updatethe rule application that has justThe part of been completed: init (initialization code), pattern matching (or pm forFujaba), and update (update the graph i.e. do the rewrite; clean-up code is in-cluded). 71110553056610004794of a call to the Java runtime libraryThe result System.nanoTime() result is not inter-. The pretable in a straightforward manner (please see section1.2.1). Albeit other possibilities, in this case this function counts the nanoseconds (ns, 109sec) since the UNIX epoch, roughly speaking. The val-ues of column one and this column are incommen-surable by definition (cf. section1.2.1).
Column 1 2
3
Table 1.3: Description of a log file row forPROGRES Example from Listing1.3Description giveRuleThe name of the current rule. updateThe part of the rule application that has just been completed: init (initialization code), pm (pattern matching), and update (update the graph i.e. do the rewrite; clean-up code is included). 1110237924.983845 numbers TheThe seconds since the UNIX epoch. after the decimal point should be the fractions of the second with microsecond (µs, 106sec) reso-lution (note the difference between resolution and precision). Due to erroneous implementation this is not true (see section1.2.2).
5
Voir icon more
Alternate Text