Introduction to No-SQL

hello everyone,

I’m starting a new series of blog post around no-sql. Why did I choose the domain ? Because, I think that the future will pass by No-SQL, and that evereybody talks about big data. So before talking about big data, for me, I think it is essential to first talk about No-SQL.

What is No-SQL ?

Before giving any definition, I will present a very important matter in database environment : The CAP Theorem

CAP Theorem

The CAP theorem explains that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees :

  • Consistency (all nodes see the same data at the same time)
  • Availability (a guarantee that every request receives a response about whether it was successful or failed)
  • Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)

All database sytems can only provide two guarantess, and all of them can be classified with the CAP Theorem (include SQL and NO-SQL system) :

cap theorem

As we can seen it, SQL and No-SQL cohabit in database environment and give an answer to differents issues. You cannot simply forget SQL system for using No-SQL.

OK got it, but what is hidden behind the terme NO-SQL ?

Wikipedia say :

A NoSQL or Not Only SQL database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Motivations for this approach include simplicity of design, horizontal scaling and finer control over availability. The data structure (e.g. key-value, graph, or document) differs from the RDBMS, and therefore some operations are faster in NoSQL and some in RDBMS. There are differences though and the particular suitability of a given NoSQL DB depends on the problem to be solved (e.g. does the solution use graph algorithms?). The appearance of mature NoSQL databases has reduced the rationale for Java content repository (JCR) implementations.

NoSQL databases are finding significant and growing industry use in big data and real-time web applications.[1] NoSQL systems are also referred to as « Not only SQL » to emphasize that they may in fact allow SQL-like query languages to be used. Many NoSQL stores compromise consistency (in the sense of the CAP theorem) in favor of availability and partition tolerance. Barriers to the greater adoption of NoSQL stores include the use of low-level query languages, the lack of standardized interfaces, and the huge investments already made in SQL by enterprises.[2] Most NoSQL stores lack true ACID transactions, although a few recent systems, such as FairCom c-treeACE, Google Spanner and FoundationDB, have made them central to their designs.

Remember the CAP theorem, No-SQL only gives two different combinaisons relative to SGDB.

In fact, No-SQL offers some very interesting properties like response time on very large volume (we talk about peta octet), and there is neither schema problem nor data type. On the opposite side, there is neither join operator nor agregation, you need to do this in your code.

Why I won’t compare performance between SQL and No-SQL

On the web, you find lot of comparison between SQL and No-SQL systems. For me, that doesn’t make any sense because they don’t work on the same segment. It’s like adding cabbage and carrots. The only significant possible comparison is about volume, and it’s sure that SQL manages the lowest volume. But everyone agrees with me to say that the volum is only one side of database system problematics.

Database Classification

Now that we have defined No-SQL, let’s classify them. There are four main categories but more do exist  (you can find a list here)

Data Model Performance Scalability Flexibility Complexity Functionality Project Name
Key–value Stores high high high moderate associative array  Voldemort
Column Store high high moderate low columnar database  Hbase, Cassandra
Document Store high variable (high) high low object model  MongoDB, SImpleDB
Graph Database variable variable high high graph theory  Neo4J, AllegroGraph

 

For a developed tabular click here.

In next articles, I will try to introduce thoses databse systems