Zab: High-performance broadcast for primary-backup systems

12 pages · English · Documents · 2013


Published on: 8 May 2013
Number of reads: 36
Language: English
File size: 1 MB

Zab: High-performance broadcast for primary-backup systems
Flavio P. Junqueira, Benjamin C. Reed, and Marco Serafini
Yahoo! Research
{fpj,breed,serafini}@yahoo-inc.com
Abstract—Zab is a crash-recovery atomic broadcast algorithm we designed for the ZooKeeper coordination service. ZooKeeper implements a primary-backup scheme in which a primary process executes client operations and uses Zab to propagate the corresponding incremental state changes to backup processes¹. Due to the dependence of an incremental state change on the sequence of changes previously generated, Zab must guarantee that if it delivers a given state change, then all other changes it depends upon must be delivered first. Since primaries may crash, Zab must satisfy this requirement despite crashes of primaries. Applications using ZooKeeper demand high performance from the service, and consequently, one important goal is the ability to have multiple outstanding client operations at a time. Zab enables multiple outstanding state changes by guaranteeing that at most one primary is able to broadcast state changes and have them incorporated into the state, and by using a synchronization phase while establishing a new primary. Before this synchronization phase completes, a new primary does not broadcast new state changes. Finally, Zab uses an identification scheme for state changes that enables a process to easily identify missing changes. This feature is key for efficient recovery. Experiments and experience so far in production show that our design enables an implementation that meets the performance requirements of our applications. Our implementation of Zab can achieve tens of thousands of broadcasts per second, which is sufficient for demanding systems such as our Web-scale applications.

Index Terms—Fault tolerance, Distributed algorithms, Primary backup, Asynchronous consensus, Atomic broadcast

¹ A preliminary description of Zab was presented as a brief announcement at the 23rd International Symposium on Distributed Computing, DISC 2009.

I. INTRODUCTION

Atomic broadcast is a commonly used primitive in distributed computing and ZooKeeper is yet another application to use atomic broadcast. ZooKeeper is a highly-available coordination service used in production Web systems such as the Yahoo! crawler for over three years. Such applications often comprise a large number of processes and rely upon ZooKeeper to perform important coordination tasks, such as storing configuration data reliably and keeping the status of running processes. Given the reliance of large applications on ZooKeeper, the service must be able to mask and recover from failures [1].

ZooKeeper is a replicated service, and it requires that a majority (or more generally a quorum) of servers has not crashed for progress. Crashed servers are able to recover and rejoin the ensemble as with previous crash-recovery protocols [2], [3], [4]. ZooKeeper uses a primary-backup scheme [5], [6], [7] to keep the state of replica processes consistent. With ZooKeeper, a primary process receives all incoming client requests, executes them, and propagates the resulting non-commutative, incremental state changes in the form of transactions to the backup replicas using Zab, the ZooKeeper atomic broadcast protocol. Upon primary crashes, processes execute a recovery protocol both to agree upon a common consistent state before resuming regular operation and to establish a new primary to broadcast state changes. To exercise the primary role, a process must have the support of a quorum of processes. As processes can crash and recover, there can be over time multiple primaries and in fact the same process may exercise the primary role multiple times. To distinguish the different primaries over time, we associate an instance value with each established primary. A given instance value maps to at most one process. Note that our notion of instance shares some of the properties of views of group communication [8], but it presents some key differences. With group communication, all processes in a given view are able to broadcast, and configuration changes happen when any process joins or leaves. With Zab, processes change to a new view (or primary instance) only when a primary crashes or loses support from a quorum.

Critical to the design of Zab is the observation that each state change is incremental with respect to the previous state, so there is an implicit dependence on the order of the state changes. State changes consequently cannot be applied in any arbitrary order, and it is critical to guarantee that a prefix of the state changes generated by a given primary is delivered and applied to the service state. State changes are idempotent and applying the same state change multiple times does not lead to inconsistencies as long as the application order is consistent with the delivery order. Consequently, guaranteeing at-least-once semantics is sufficient and simplifies the implementation.

As Zab is a critical component of the ZooKeeper core, it must perform well. Some applications of ZooKeeper encompass a large number of processes and use ZooKeeper extensively. Previous systems have been designed to coordinate long-lived and infrequent application state changes [9], [10], [11]. We designed ZooKeeper to have high throughput and low latency, so that applications could use it extensively on cluster environments: data centers with a large number of well-connected nodes.

When designing ZooKeeper, however, we found it difficult to reason about atomic broadcast in isolation. There are re-