Data management is ever evolving. The emphasis is increasingly on availability, scalability, flexibility, multimedia, and big (large-scale) data.
At least a few questions draw attention:
• What is the vision and concept behind this evolution?
• What are the capabilities and offerings of Big Data systems compared to traditional databases?
• What is the impact of these capabilities on the consumer industries?
• How can the new capabilities enable new thinking to solve problems that could not be solved before?
I found that the following references answer some of these questions:
First, in [1], Jim Gray et al. emphasize the need for new data stores to accommodate large-scale data, which can happen with the “synthesis of database systems and file systems” when “file systems grow to petabyte-scale archives with billions of files”. I think this paper provides sufficient background for understanding the vision and concept.
In [2], the authors suggest that the “vision not only applies for scientific data management, but also applies to any data intensive system”.
In [3], R. Cattell examines the "Not Only SQL" (so-called NoSQL) data stores “designed to scale simple OLTP style application loads over many servers”.
The author's survey is probably the closest we can get to understanding the basis and strengths of the new data store systems in contrast to the traditional, scalable relational database systems.
Relational database systems are known to provide ACID transactional properties.
In the case of the Not Only SQL systems, "updates are eventually propagated, but there are limited guarantees on the consistency of reads. Some authors suggest a “BASE” acronym in contrast to the “ACID” acronym:
• BASE = Basically Available, Soft state, Eventually consistent
• ACID = Atomicity, Consistency, Isolation, and Durability
The idea is that by giving up ACID constraints, one can achieve much higher performance and scalability".
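To make the "eventually propagated" part of that quote concrete, here is a minimal Python sketch of a toy key-value store with three in-memory replicas. It is only an illustration under my own assumptions: the names `Replica`, `EventuallyConsistentStore`, `write`, `read`, and `propagate` are hypothetical and do not come from any of the systems surveyed in [3]. A write is acknowledged as soon as one replica has it, and the other replicas catch up later, so a read from a lagging replica can return stale data.

```python
from collections import deque


class Replica:
    """A single in-memory copy of the data (hypothetical, for illustration)."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


class EventuallyConsistentStore:
    """Toy store: a write lands on one replica, the others catch up later."""

    def __init__(self, replica_count=3):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = deque()  # updates not yet applied to every replica

    def write(self, key, value):
        # Acknowledge the write once the first replica has the value.
        self.replicas[0].data[key] = value
        # Queue the update for the remaining replicas instead of waiting.
        for replica in self.replicas[1:]:
            self.pending.append((replica, key, value))

    def read(self, key, replica_index):
        # No guarantee that this replica has seen the latest write.
        return self.replicas[replica_index].read(key)

    def propagate(self):
        # Apply queued updates in order; afterwards all replicas agree.
        while self.pending:
            replica, key, value = self.pending.popleft()
            replica.data[key] = value


store = EventuallyConsistentStore()
store.write("balance", 100)
stale = store.read("balance", replica_index=2)  # replica 2 has not caught up
store.propagate()
fresh = store.read("balance", replica_index=2)  # replica 2 now has the value
print(stale, fresh)
```

The write returns without waiting for every replica, which is where the availability and performance gain comes from; the price is the window in which `stale` differs from `fresh`. An ACID system would instead make the update visible everywhere, or nowhere, before acknowledging it.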
References:
1. J. Gray et al. 2005. Scientific data management in the coming decade. SIGMOD Record 34, 4, 34-41.
2. Hung-chih Yang, Ali Dasdan, Ruey-Lung Hsiao, and D. Stott Parker. 2007. Map-reduce-merge: simplified relational data processing on large clusters. In Proceedings of the 2007 ACM SIGMOD international conference on Management of data (SIGMOD '07). ACM, New York, NY, USA, 1029-1040. DOI=10.1145/1247480.1247602 http://doi.acm.org/10.1145/1247480.1247602
3. Rick Cattell. 2011. Scalable SQL and NoSQL data stores. SIGMOD Rec. 39, 4 (May 2011), 12-27. DOI=10.1145/1978915.1978919