EPFL-DACM-07-1-Cours-Intro

Ambient, Continuous & Mobile Data
Michel Adiba Grenoble UniversityFrance Michel.Adiba@imag.fr
GENERAL
© Michel AdibaDACM2007
INTRODUCTION
DB & DBMS evolution
History Ambient Continuous Mobile Data From DB to Data Spaces
© Michel Adiba, Cours DACM 2007
© Michel AdibaDACM2007
© Michel AdibaDACM2007
This document is exclusively for students of EPFL following in 2007 the course on
«Ambient, Continuous & Mobile Data» (given by Michel Adiba)
It contains materials coming from different sources & people cited in the document.
Personal use. No reproduction allowed
Content
• DB & DBMS evolution: from DB to Data Space
• Temporal, Spatial & Active DB
• DB Query & Optimization
• Location Dependent Queries
• Continuous Queries
• Data Streams and DSMS
• Mobile Transactions
• Conclusions
ηι The NanoYotta Paradigm
9  Nano 10 : devices are getting smaller PC, PDA, cell. Phone, smartcard, sensors, etc. 24 Yotta 10 : data are getting larger TB , PB , EB , ZB , YB (12) (15) (18) (21) (24) + Networking :WAN, LAN, WEB, Wireless, bluetooth,…
Ambient Intelligence, Ubiquitous Computing, Mobility Data everywhere…
How DB people deal with that ?
© Michel AdibaDACM2007
1
M.Adiba, DBTA, Berne, 03/03
M.Adiba, DBTA, Berne, 03/03
Main Dimensions of Data Management
© Michel AdibaDACM2007
??
¾Data :types,sizes, static (values), dynamic (operations) ¾Models : ER, Relations, Objects, SSD, … ¾Persistency : Model, Storage ¾Query Languages & Interfaces : SQL,QBE, OQL, QL+PL… ¾Data Access & Query Optimization :I/O bound! ¾Data Integrity & Security : constraints, access rights, crypto,.. ¾Transaction Management : consistency, concurrency, recovery ¾Systems & Architectures :DBMS (from smartcard to DB machine), Client(s)/Server(s), Web, P2P/Grid, MOBILE…

History of DB & DBMS
[Figure: timeline with time marks 1960, 1970, 1980, 1990, 2000, 2005, 2007. Labels: Files; Hierarchical & Network DBMS; Relational (T. Codd); Data Semantics (J.R. Abrial); ER (P. Chen); OLTP, OLCP; Active, Spatial, Temporal, Multimedia, Object & OR, Deductive; Data Warehouse, Mining, OLAP; Web-based DBMS, XML; Integration: Data, Documents, Services; Mobile, Data Space, Sensors, DSMS.]

Relational DB in ONE slide! *
• More than 30 years (1970): maturity!
• Theoretical & practical aspects (DBMS)
• Domains, R ⊆ D1 x D2 x ... x Dn, Algebra, 1st-order Predicate Logic
• Languages: SQL (wins), QUEL, QBE (see MS Access)
• DBMS prototypes (1975), products (1980)
• A MAJOR improvement in DB: provide data independence & a simple, tabular VIEW of data.
• Relations are implemented as Tables, Views & Snapshots (NO "relation" in SQL)
Algebra notation shown beside the list: R x S, set operations on R and S, R[α], R : ϕ, R * S
* Not possible for any other proposed data model!

Data Models Evolution
• Large spectrum, in parallel with Programming Languages
• Hierarchical, Network
• Relational (+ Systems)
• ER, Z, MERISE, UML… (+ Methods)
• Semantic models
• Data vs. Knowledge: Deductive DB?
• Multidimensional models (OLAP)

DATA MODELS
• Complex Objects
  – "relational was too flat"
• Operations: ADT (UDT, UDF)
  – "relational was too static"
• Semi-Structured Data Models
  – Reconcile Documents and Data
  – Deal with the Web & the XML Galaxy
  – Schema-less data

Objects & DB
• Mid-80s: a new approach?
• Basic ideas: PL + DB, Object-Oriented Languages, type extensibility, dynamic
• Two "clans": relational DB vs. OO DB
• Now: integration and Object-Relational DB
• The best of the two worlds ;)
[Figure: CO Model. Labels: ATOMS, tuple, list, p1, r_name, w_name, r_capital, add_town. Example value: tuple(name: "France", capital: v1, towns: set(v1, v2, v3)).]

The best of the two worlds?
• Object Orientation: Complex Objects, Object Identity, Encapsulation, Types or Classes, Inheritance, Late Binding, Extensibility
• DB Concepts: Persistence, Storage management, Security, Transactions, Reliability, Concurrency, Ad hoc Query
• Formalization
• PL & DB, e.g. JDO?

Query Languages
• What vs. How: Data Independence
• Non-procedural, declarative languages
• SQL, QUEL, QBE
  SEQUEL (76), ANSI/ISO SQL (86), SQL (89), SQL2 (92), SQL3 (99, 03), …?
• SQL is not only a query language
• SQL is not a programming language
• Object Query Language OQL & OODB
• XQuery: http://www.w3.org/XML/Query
[Figure labels beside the list: The Web; Structured Databases; Pull vs. Push?; The WEB & Continuous Query; Location dep. Query.]

Distributed Query Optimization
Towns where movies made after 1990 starring S. Stone?
• MOV(T, C, Y, D)
• MD(T, AC)
• PR(T, TN, RN, W, NB)
• MT(TN, TOWN, PN)
• RMT(TN, RN, NB)
[Figure: query graph over MOV (Y > 1990), MD (AC = S.Stone), PR, MT and RMT, joined on T, TN and (TN, RN); result: TOWN, …]
• 5! possible joins
• Access paths
• Intra vs. inter parallelism
• MOV & MD horizontally partitioned over 3 servers: MOV = M1 ∪ M2 ∪ M3
• M3 replicated: M31, M32

Systems & Architectures
• Very large spectrum of DBMS, from smartcard to (parallel) DB machine
• HW influences: machine architecture (P, M, D), networks, data storage units, shared disk, shared nothing
• SW influences: O.S. & middleware
• Monolithic DBMS
• Client(s)/Server(s)
• Main Memory DBMS
• Partition and/or replication

ACID Transactions
• Atomicity: all or nothing
• Consistency
• Isolation
• Durability
• OLTP & OLCP, TP Monitors
• Benchmarks: www.tpc.org & Standards
[Figure: transaction execution along a time axis. Labels: Ei, Ef, failure, Case 1, Case 2, S1, S2.]
User view: writing transactions is a software engineering problem
System view: deal with Integrity, Concurrency & Recovery

From Triggers to Active DB
• Old idea: Triggers (75)
• Integrity Constraints
• Triggers in DBMS & SQL3 (99)
  – Before / After / Instead
  – Row / Statement
  – Old / New state
• Generalization: Active Rules
  – Event / Condition / Action
• Generated vs. handcrafted triggers (new programming paradigm for application development, or system tool)
[Figure: non-active DBMS (application program, DBMS, DB) vs. Active DBMS (application program, EVENT, Active Rule, DBMS, DB).]

Ambient, Continuous & Mobile Data

Pervasive & Ubiquitous Computing, Ambient Intelligence
• Existence of wired & wireless networks (e.g. WiFi)
• Distribution, Autonomy, Mobility: users, servers, data
• Need for ubiquitous data access
• The ability to access any information from anywhere, at any moment, using any kind of device
• Challenges for data management techniques: searching, location dependency (GPS), reliability, consistency
Ex. see Nods Project: http://wwwlsr.imag.fr/Les.Groupes/STORM

Example: European 6th Framework Programme 2003-2006
IST today → "Ambient Intelligence" tomorrow
• PC based → "Our surrounding" is the interface
• "Writing and reading" → Use all senses, intuitive
• "Word" based information search → Context-based knowledge handling
• Low bandwidth, separate networks → Infinite bandwidth, convergence, ..
• Mobile telephony (voice) → Mobile/Wireless full multimedia
• Micro scale → Nano-scale
• Silicon based → + new materials
• e-Services just emerging → Wide adoption (eHealth, eLearning, …)
• Only 5% of global population on-line → >70% of worldwide population on-line

Towards an "all inclusive knowledge society"
"anywhere, anytime, any service, for all": core technologies & "pull-through" applications
Applied IST for major societal and economic challenges:
• IST for societal challenges: Health, eInclusion, mobility, environment safety, cultural heritage
• IST for work & business challenges: e- and m-business, e- and m-work, learning, e-government
• Demanding applications: GRIDS for science, engineering, business and society
• Trust & Security: security, privacy, IPRs, dependability, smart cards, ...
Communication, computing & software technologies:
• Communication & networking: mobile (beyond 3G), fixed (all optical), integrated (IPv6), audiovisual systems
• Software: embedded, distributed, reliability, adaptability
Knowledge & interface technologies:
• Knowledge technologies: context based, semantic based, agent based, scaleable
• Interfaces: all senses, multilingual, intuitive, "surrounding"
Components & µsystems:
• µ, nano & opto electronics: CMOS (the limit), System-on-Chip, nanoscale, new materials
• µ and nano systems: multidisciplinary, new sensing, networked, new materials, nanoscale

Information System Evolution
[Figure: Information System, Data sources, DBMS, workstation, mobile. Labels: Active; Distribution & Mobility.]

DBMS Evolution
• No more monolithic DBMS
• Extensible, lightweight DBMS
• Unbundled technology*
• Component-based architectures* (thick-grain vs. fine-grain)
• Components are providing Services
• Blur the boundaries between OS & DBMS
• Self-adaptive Systems
• Multi-tier architectures, Web, P2P, GRID, …
* See Dittrich, Geppert, Eds, "Component Database Systems", MK 2000
* Chaudhuri & Weikum, "Rethinking Database System Architecture: Towards a Self-tuning RISC-style Database System", VLDB 2000

DB vs. DataSpace (1)
• DB: locally stored data about remote physical objects
• Dataspace
  – Data is inherently dispersed & connected
  – Data lives on the physical object
  – Data is stored with the physical object
  – Data is another characteristic of the object (like weight, color)
• Objects in dataspace produce and store their own data
Wireless Graffiti – Data, data everywhere, T. Imielinski, B. Nath, VLDB02

DB vs. DataSpace (2)
• Mobility & location management for millions of mobile terminals
• Wireless networks & small devices
• Sensors: road, environmental, pollution, vending machines, home sensors, light detectors, body sensors
• Sensors produce and/or store data
• Sensors can also be mobile

DB vs. DataSpace (3)
[Figure: systems placed along two axes, Administrative Proximity (Near to Far) and Semantic Integration (Low to High): DBMS, Desktop Search, Web Search, Virtual organisation, Data Integration Systems, Enterprise Portals, Scientific Repositories.]
From DB to Dataspaces: a new abstraction for Information Management, M. Franklin, A. Halevy, D. Maier, ACM SIGMOD RECORD 2005

Temporal & Historical DB
• Dealing with TIME & Calendars
• Time: Discrete vs. Continuous
• Time Line: left bound (Big Bang), right bound (Big Crunch)?
• Querying historical data
• SQL extensions: SQL2, TempSQL, TQUEL, TSQL2
• A lot of work in the past

Data History
Several versions of an object during its lifetime
[Figure: values v1 … v5 recorded at instants t1 … t5 on a time axis; generic pair (ti, vi).]
History = {(ti, vi)}: set of tuples (timestamp, value)
Three important notions:
– ti: Periodicity
– vi: Modification
– vi: Persistency

Historical Data (2)
• History = {(ti, vi)}
• Interpretation? At time ti, the object value is vi
• Separate history construction (write) from exploitation (read)
• "Append-only" DB semantics: adding tuples (ti, vi) without changing existing ones (what about correcting the past?)
• Reading semantics is not obvious.

Spatial-Temporal Data & Queries
• SQL extensions
• Spatial Objects and Spatial Operations, Relationships
• Spatial Querying: designate an area on the screen
• Graphical output
• Spatial Criteria

    SELECT road.geometry
    FROM   road, town
    WHERE  town.name = "Orono"
    AND    road.name = "Grove Street"
    AND    road.geometry INSIDE town.geometry;

Location Dependent Queries (GPS)
1. Find the cheapest hotel in Paris
2. Find the cheapest and nearest hotel
3. Find the cheapest and nearest hotel to where I'll be in one hour

1: classical DB query
2: the user (mobile or not) must be located, and the result changes as time goes on…
3: mobile user: where is he/she? What is the trajectory? Querying the future…

Cf. Moving Objects Databases, R.H. Güting, M. Schneider, MK, 2005
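
The difference between the three queries can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Hotel` records, the flat (x, y) coordinates in km, the reading of "cheapest and nearest" as "cheapest within a given radius, nearest first on ties", and the straight-line extrapolation of the trajectory are all hypothetical choices, not definitions from the course.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Hotel:
    name: str
    price: float
    x: float  # position in km on a flat map (illustrative)
    y: float


def cheapest(hotels):
    """Query 1: classical query, independent of the user's location."""
    return min(hotels, key=lambda h: h.price)


def cheapest_and_nearest(hotels, pos, radius_km):
    """Query 2: cheapest hotel within radius_km of pos (nearest on ties)."""
    def dist(h):
        return hypot(h.x - pos[0], h.y - pos[1])
    nearby = [h for h in hotels if dist(h) <= radius_km]
    return min(nearby, key=lambda h: (h.price, dist(h))) if nearby else None


def predicted_position(pos, velocity_kmh, hours):
    """Query 3: where the user will be, assuming a straight trajectory."""
    return (pos[0] + velocity_kmh[0] * hours,
            pos[1] + velocity_kmh[1] * hours)


HOTELS = [Hotel("A", 80, 0, 0), Hotel("B", 60, 30, 0), Hotel("C", 120, 1, 1)]
```

Query 1 gives the same answer for every user, query 2 depends on the position passed in, and query 3 is query 2 evaluated at `predicted_position(...)`, that is, a query about the future.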
History: example
History = {(ti, vi)}

T   10   20   30   40   50   60   70   80   90
V   45   46   65   30   30   66  123   78   12

Remarks:
• ti granularity?
• Interpretation: in [ti, ti+1[ the value is vi (between 30 & 39, the value is 65)
• Same values for different timestamps: meaning what?
• Here we note a regular increase of ti
• But time goes on… Data streams?
Topological & Directional Relationships
[Figure: topological relationships between two regions A and B: disjoint(A,B); touch(A,B); overlap(A,B); equal(A,B); inside(B,A) / contains(A,B); covered_by(B,A) / covers(A,B).]
[Figure: directional relationships between two objects O1 and O2: N, NE, E, SE, S, SW, W, NW.]

LDQ: Localisation of Mobile Objects
• Mobile Units: vehicles, mobile phones, PC, PDA, …
• Mobile Users: localisation? Independently of the unit he/she uses
• Mobile Code: agents
• Localisation, two extremes:
  – Look everywhere
  – Store the location everywhere
• Several solutions in this spectrum
Chrysanthis & Pitoura, "Mobile and Wireless Database Access for Pervasive Computing", Tutorial ICDE 2000

Continuous Queries & Data Streams

In The Beginning
[Figure labels: Query, Index, Data, Result. From Michael Franklin, UC Berkeley, NRC June 2002]

Query & Data Duality
[Figure labels: Data, Index, Queries, Result. From Michael Franklin, UC Berkeley, NRC June 2002]

Another look at DB Queries
• Traditional: Persistent Data and Transient Queries
• Repetitive execution of the same query: Continuous Queries
    FOREVER DO
      Execute query Q
      Return results
      Sleep for some time
    ENDLOOP
  – Semantics of CQ?
  – Append-Only DB and Monotone queries
• Streams of Data: Transient Data & Persistent Queries

"Continuous Queries – CQ"
• Execution of a query "forever"
• Similar to Triggers & Active DB
• Query definition, registration, activation, evaluation
• Query evaluation produces a continuous result
• Originally, CQ were defined for "append-only" DB

CQ Execution
    FOREVER DO
      Execute query Q
      Return results to user
      Sleep for some period of time
    ENDLOOP
• Attention: the result depends on when Q is executed (e.g. every hour)
• At each execution, new results are produced and added to the previous ones.
• Inefficient execution if Q is re-executed completely each time.

Continuous & Monotone
    Set τ := −∞
    FOREVER DO
      Set t := current time
      Execute queries QM(t) and QM(τ)
      Return QM(t) − QM(τ) to user
      Set τ := t
      Sleep for some period of time
    ENDLOOP
• Keep track of the last result of Q
• This algorithm works when Q is monotone, i.e. QM(t1) ⊆ QM(t2) when t1 < t2
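
The loop above can be sketched in Python to show what "return only the new results" means. This is an illustration under stated assumptions, not the author's implementation: the append-only table is a hypothetical list of (insert_time, id, A) rows, the monotone query QM is a simple selection on A, and a list of wake-up times stands in for FOREVER / Sleep so that the example terminates.

```python
def q_m(db, t):
    """Monotone query QM(t): rows inserted up to time t with A == 3."""
    return {row for row in db if row[0] <= t and row[2] == 3}


def continuous_query(db, wakeup_times):
    """Yield (t, new results) at each wake-up, as in the loop above."""
    tau = float("-inf")                     # Set tau := -infinity
    for t in wakeup_times:                  # FOREVER DO ... Sleep
        delta = q_m(db, t) - q_m(db, tau)   # QM(t) - QM(tau)
        yield t, delta
        tau = t                             # Set tau := t


# Hypothetical append-only table: (insert_time, id, A)
DB = [(1, "r1", 3), (2, "r2", 5), (4, "r3", 3), (7, "r4", 3)]
```

Because QM is monotone on an append-only table, the union of all the deltas returned so far equals QM(t) at the latest wake-up time, so nothing already sent to the user ever has to be retracted.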

To be or not to be Monotone?
monotone: Q(t1) ⊆ Q(t2) when t1 < t2
Beware of time references inside the query
Example: Getdate() in SQL for the current date
• E < Getdate() or E ≤ Getdate(): YES
• E > (or =, ≥, ≠) Getdate(): NO
    Select * From m where m.expire > getdate()
Problems also with NOT EXISTS

"CQ Semantics"
Problem of non-determinism: the tuples selected by Q depend on when Q is executed (e.g. every hour)
Semantics of "continuous"?
The result of a continuous query is the set of data that would be returned if the query were executed at every instant in time.
How to implement this in a DBMS?

Monotone Queries
monotone: Q(t1) ⊆ Q(t2) when t1 < t2
• Some queries are naturally monotone, others can be converted, and some are not
• Attention: "append-only" DB
    Select * From T where T.A = 3
• Selections and joins are in general monotone

Monotone Query (2)
    Select * From T where T.B α Getdate()
• If α is <: monotone. For a given tuple the predicate is false until t passes T.B, then true for ever.
• If α is >: NON monotone. For a given tuple the predicate is true until t reaches T.B, then false.
[Figure: for each case, truth value (True / False) of the predicate along the time axis t, switching at T.B.]

Example: Q(t1) ⊆ Q(t2) when t1 < t2

Msg   M1   M2   M3   M4   M5   M6   M7   M8
D     10   10   20   30   30   40   40   50

D < Date: Result?
  Date = 5:  { }
  Date = 12: {M1, M2}
  Date = 25: {M1, M2, M3}
  Date = 45: {M1, M2, M3, M4, M5, M6, M7}

D > Date: Result?
  Date = 5:  {M1, M2, M3, M4, M5, M6, M7, M8}
  Date = 12: {M3, M4, M5, M6, M7, M8}
  Date = 25: {M4, M5, M6, M7, M8}
  Date = 45: {M8}
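
The two result series above can be reproduced and checked for monotonicity with a short script. This is a minimal sketch over the example table; the function names `before`, `after` and `is_monotone` are hypothetical, and `date` plays the role of Getdate().

```python
# Message table from the example: Msg -> D
MESSAGES = {"M1": 10, "M2": 10, "M3": 20, "M4": 30,
            "M5": 30, "M6": 40, "M7": 40, "M8": 50}


def before(date):
    """Q: messages with D < Date."""
    return {m for m, d in MESSAGES.items() if d < date}


def after(date):
    """Q: messages with D > Date."""
    return {m for m, d in MESSAGES.items() if d > date}


def is_monotone(query, dates):
    """Check Q(t1) is a subset of Q(t2) for successive evaluation dates."""
    results = [query(d) for d in sorted(dates)]
    return all(r1 <= r2 for r1, r2 in zip(results, results[1:]))
```

With the evaluation dates 5, 12, 25 and 45, the results of `before` only grow, while `after` loses tuples at each step, which is why a time reference such as Getdate() on the right-hand side of ">" breaks monotonicity.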

Data Streams Applications
Applications generating data streams:
• Network monitoring and traffic management
• Call detail records in telecommunications
• Transactions in retail chains, ATM operations in banks
• Log records generated by Web servers
• Sensor network data, RFID tags
• Financial applications
• Manufacturing processes
Characteristics of these applications:
– Large volume of data (terabytes and more)
– Records arriving at a high rate (with or without timestamp)
– Data may come from mobile or ambient units
Problems: pattern finding, queries, real-time statistical analysis on data streams

Sensor Networks
Samuel Madden, UC Berkeley, December 2002, http://www.cs.berkeley.edu/~madden/berkeley.htm
• Habitat monitoring: storm petrels on Great Duck Island, microclimates on James Reserve.
• Earthquake monitoring in shake-test sites.
• Vehicle detection: sensors along a road collect data about passing vehicles.
[Figure: traditional monitoring apparatus.]

CQ & Data Streams
• Unlimited sequence of data or events
• Very often associated with a temporal attribute
  – Timestamp (generated more or less automatically)
  – Logical time (application dependent)
• Continuously generated events, "pushed" or "pulled" from producers to consumers
• CQ execution on one or several data streams
[Figure: stream p along the time axis of the data source, t10 to t16, with elements p1 at t12, p2 at t14 and p3 at t16.]

IP Network Measurement Data
IP session data (collected using NetFlow)

Source      Destination  Duration  Bytes  Protocol
10.1.0.2    16.2.3.7     12        20K    http
18.6.7.1    12.4.0.3     16        24K    http
13.9.4.3    11.6.8.2     15        20K    http
15.2.2.9    17.1.2.1     19        40K    http
12.4.3.8    14.8.7.4     26        58K    http
10.5.1.3    13.0.0.1     27        100K   ftp
11.1.0.6    10.3.4.5     32        300K   ftp
19.7.1.2    16.5.5.8     18        80K    ftp

AT&T collects 100 GBs of NetFlow data each day!
• 300 M call tuples/day (long distance)
• 10 B IP flows/day (backbone)

CQ on Data Streams
[Figure: Input Stream, CQ, Result.]
Result:
– Stream?
– Relation?
– Storage?
• The CQ result is potentially unlimited
• Storage: part of input, part of output?
• Operators: non-blocking (e.g. selection) or blocking (e.g. join)

Data Stream Processing*
A typical streaming query

    SELECT S.city, AVG(temp)
    FROM SOME_STREAM S [range by '5 seconds' slide by '5 seconds']
    WHERE S.state = 'California'
    GROUP BY S.city

Window clause:
– range by '5 seconds': "I want to look at 5 seconds worth of data"
– slide by '5 seconds': "I want a result tuple every 5 seconds"
[Figure: Data Stream cut into successive windows, each producing Result Tuple(s).]
* From HiFi project, Berkeley

Sampling: Basics
Idea: a small random sample S of the data often represents all the data well
– For a fast approximate answer, apply a "modified" query to S
– Example: select agg from R where R.e is odd (n = 12)
    Data stream: 9 3 5 2 7 1 6 5 8 4 9 1
    Sample S:    9 5 1 8
– If agg is avg, return the average of the odd elements in S (answer: 5)
– If agg is count, return the average over all elements e in S of: n if e is odd, 0 if e is even (answer: 12*3/4 = 9)
Unbiased: for expressions involving count, sum, avg, the estimator is unbiased, i.e. the expected value of the answer is the actual answer
Garofalakis, Gehrke, Rastogi, SIGMOD'02 #57
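
The two estimators of the example can be written out directly. This is a minimal sketch that reuses the stream and the sample from the slide; the function names are hypothetical, and the sample is taken as given rather than drawn at random.

```python
STREAM = [9, 3, 5, 2, 7, 1, 6, 5, 8, 4, 9, 1]   # n = 12
SAMPLE = [9, 5, 1, 8]                            # sample S from the slide


def estimate_avg_odd(sample):
    """agg = avg: average of the odd elements of the sample."""
    odd = [e for e in sample if e % 2 == 1]
    return sum(odd) / len(odd) if odd else None


def estimate_count_odd(sample, n):
    """agg = count: average over the sample of (n if e is odd else 0)."""
    return sum(n if e % 2 == 1 else 0 for e in sample) / len(sample)


def exact_count_odd(stream):
    """Exact answer, for comparison with the estimate."""
    return sum(1 for e in stream if e % 2 == 1)
```

On this data the count estimate is 12 * 3/4 = 9 while the exact count of odd elements is 8: a single sample gives an approximate answer, and unbiasedness only says that the estimate is correct on average over all possible samples.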

DSMS: problems
• Relations: set of tuples or sequences?
• DB updates? Append only?
• Continuous and Snapshot Queries?
• Exact or approximate result?
• One-pass query evaluation?
• Access plan fixed or adaptive?
• Limited resources (e.g. memory)
• Real-time processing of data streams
L. Golab, T. Özsu, Issues in Data Stream Management, SIGMOD Record, June 2003

Problems for stream processing
• Taking into account the arrival rate of data
• Sometimes it is impossible to get a precise result. Approximation? Sample?
• Joins: problem of joining two unlimited streams. Window(s)? Approximation?
• Computations: sum, count, avg, group by, etc.: different kinds of functions

DBMS versus DSMS
DSMS = Data Stream Management System
DBMS:
• Persistent relations
• Snapshot queries
• Random access
• Access plan is determined by the query processor and by the physical organization of the DB
DSMS:
• Dynamic streams (& persistent relations)
• Continuous queries
• Sequential access
• Data characteristics and formats of arriving data are not predictable

The 8 Requirements of Real-Time Stream Processing
Michael Stonebraker, Ugur Çetintemel, Stan Zdonik, SIGMOD RECORD Vol. 34 No. 4, Dec 2005
• Rule 1: Keep the Data Moving
  Process messages "in-stream", without any requirement to store them to perform any operation or sequence of operations. Ideally the system should also use an active processing model.
• Rule 2: Query using SQL on Streams (StreamSQL)
  Support a high-level language with built-in extensible stream-oriented primitives and operators (e.g. Window).
• Rule 3: Handle Stream Imperfections (Delayed, Missing and Out-of-Order Data)
• Rule 4: Generate Predictable Outcomes
  A stream processing engine must guarantee predictable and repeatable outcomes.
• Rule 5: Integrate Stored and Streaming Data
  Have the capability to efficiently store, access, and modify state information, and combine it with live streaming data. For seamless integration, the system should use a uniform language when dealing with either type of data.
• Rule 6: Guarantee Data Safety and Availability
  Ensure that the applications are up and available, and the integrity of the data maintained at all times, despite failures.
• Rule 7: Partition and Scale Applications Automatically
  A stream processing system must be able to distribute its processing across multiple processors and machines to achieve incremental scalability. Ideally, the distribution should be automatic and transparent.
• Rule 8: Process and Respond Instantaneously
  A stream processing system must have a highly-optimized, minimal-overhead execution engine to deliver real-time response for high-volume applications.

DSMS Projects
– STREAM (Stanford)
– Aurora & Borealis (Brown, Brandeis, MIT)
– TelegraphCQ & TinyDB (Berkeley)
Cooperation for a stream system benchmark
Others: Hancock, Gigascope (AT&T), NILE (Purdue), StreamMill (UCLA), etc.
• A large number of papers, tutorials, some books
• A very active area
• Impossible to give a detailed view of everything: choices must be made

Handoff
DB U
Mobile Environments
MU
CellWireless LAN (11 Mbps)
MU
DB U MU
BS
Fixed Network FH DB FH
BS
BS
FH
DB MU MU DB MU
© Michel AdibaDACM2007
FH Fixed Host
BS Base Station
MU Mobile Unit
DB
CellWireless radio (9 Kbps  2 Mbps)
Database
Mobile Network (MN) Fixed Network (FN)
© Michel AdibaDACM2007
Mobile DB & Transactions
Transactions: new models (ACID?)
– Mobile Transaction: a transaction where at least one MU is involved in the execution
  • Products: PointBase, Navajo (Poet), Oracle Lite, DB2 Everyplace, Sybase iAnywhere, SQL Server CE
  • Research: Clustering, Two-tier replication, Promotion, Reporting, Semantics-based, Kangaroo transactions, MDSTP, Moflex transactions…
P. Serrano, C. Roncancio, M. Adiba, "A survey of Mobile Transactions", International Journal on Distributed & Parallel Databases 16 (2): 193-230, September 2004
Recovery
– Network partitioning is frequent
– Disconnection is not (always) a failure
– More logging

Mobile Transactions
Mobile environment characteristics
• Frequent disconnections
• Variable bandwidth (55 kbps or 100 Mbps)
• Communication cost may be high
• MU have "limited" capabilities
  – Batteries
  – Computing power
  – Secondary storage

Course Content
• Moving Objects DB & Location Dependent Queries
• Stream Data Management: general introduction & OLAP/Data Mining reminder
  – Sensor DB: tinyDBMS & Tiny SQL
  – STREAM Project (Stanford)
  – AURORA, BOREALIS & StreamBase (Product)
  – Stream Management: other issues
• Mobile Transactions