Databases in the cloud can be the foundation of almost all of a company's web applications and are therefore essential to its success. We previously published the basics of databases in the cloud; this article goes into more detail. New technologies such as cloud computing raise the question of whether the conventional approach, running your own database servers on the corporate network, is still the best way for a company to organize its data management.
This article examines and compares these two approaches more closely. The objective is to give a basic understanding of cloud computing, of common database services for the enterprise and of cloud-based database services, together with an outlook for the future.
The article begins with the basics of cloud computing: it describes the concept and the different types and characteristics of the cloud to provide basic background knowledge. The traditional cloud computing service models are elaborated in the linked article.
---
Databases in the Cloud: Technical Background
Virtualization
Virtualization is a technique that allows cloud computing to be implemented effectively. Logical server systems and services are handled separately from the physical level, which creates an important foundation for cloud computing. Hardware resources such as computing power and storage are made available jointly to multiple virtual servers and their users (resource pooling). The applications nevertheless run completely isolated from each other, so that users working in parallel do not disturb one another. Logically, each user can use the assigned share of the resources and, as long as the remaining capacity is not currently needed by others, possibly that as well, without any additional physical capacity being required.
Services in the cloud are particularly interesting when the end user does not have, or cannot have, powerful hardware. The “power” is then provided by the cloud instead of the user's own location. The only requirement is network access to the cloud, for example an Internet connection.
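The idea of resource pooling can be illustrated with a minimal Java sketch. It is only a toy model under stated assumptions: the class `ResourcePool`, the method `allocate` and the notion of a "guaranteed share" of CPU cores are hypothetical names chosen for illustration, and a real hypervisor schedules resources in a far more sophisticated way.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy model of one physical host whose cores are shared by virtual servers.
class ResourcePool {

    // Distributes totalCores among the virtual servers listed in "guaranteed".
    // Assumes the guaranteed shares add up to at most totalCores.
    static Map<String, Integer> allocate(int totalCores,
                                         Map<String, Integer> guaranteed,
                                         Map<String, Integer> demand) {
        Map<String, Integer> granted = new LinkedHashMap<>();
        int free = totalCores;

        // Pass 1: every virtual server receives its demand, up to its guaranteed share.
        for (Map.Entry<String, Integer> entry : guaranteed.entrySet()) {
            int wanted = demand.getOrDefault(entry.getKey(), 0);
            int cores = Math.min(wanted, entry.getValue());
            granted.put(entry.getKey(), cores);
            free -= cores;
        }

        // Pass 2: capacity that nobody needs right now is lent to servers that want more.
        for (Map.Entry<String, Integer> entry : granted.entrySet()) {
            int wanted = demand.getOrDefault(entry.getKey(), 0);
            int extra = Math.min(wanted - entry.getValue(), free);
            entry.setValue(entry.getValue() + extra);
            free -= extra;
        }
        return granted;
    }
}
```

With 8 cores and a guaranteed share of 4 cores each for two virtual servers, a server asking for 6 cores receives all 6 while its neighbour only needs 1; once the neighbour asks for its full share, both are back at 4 cores.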
Databases in the Cloud
NoSQL
Different types of database services are offered in the cloud. Typical relational databases are offered as a service, and so are non-relational or NoSQL databases. Here “NoSQL” does not stand for “no SQL” but for “Not Only SQL”. This trend evolved to compensate for the weaknesses of relational databases in terms of scalability. Conventional SQL systems typically scale up, i.e. a single server is given more hardware resources, whereas the cloud and NoSQL follow the scale-out method: instead of allocating more resources to one server, another server (a cheap standard server, a virtual machine or a cloud instance) is simply added. Read and write operations are coordinated by a load balancer. Capacity bottlenecks can thus be addressed quickly, and as soon as the resources are no longer needed, the server is removed from the network again. These operations can even be automated, so that no intervention by an administrator is necessary.
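The scale-out idea can be sketched in a few lines of Java. This is a simplified illustration, not how any particular product works: the class `RoundRobinBalancer`, the server names `node-1`, `node-2` and so on, and the fixed capacity per server are all assumptions made for the example. Real NoSQL systems usually distribute writes by partitioning the data rather than by plain round-robin.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical load balancer that spreads requests over a growing or shrinking server pool.
class RoundRobinBalancer {
    private final List<String> servers = new ArrayList<>();
    private int next = 0;

    // Automated scale-out and scale-in: keep just enough servers for the current load.
    void rescale(int requestsPerSecond, int capacityPerServer) {
        // Round up, and always keep at least one server online.
        int needed = Math.max(1, (requestsPerSecond + capacityPerServer - 1) / capacityPerServer);
        while (servers.size() < needed) {
            servers.add("node-" + (servers.size() + 1)); // add another cheap server
        }
        while (servers.size() > needed) {
            servers.remove(servers.size() - 1);          // release a server that is no longer needed
        }
        next = 0;
    }

    // Picks the server that handles the next request.
    String route() {
        if (servers.isEmpty()) {
            throw new IllegalStateException("no servers available");
        }
        String target = servers.get(next);
        next = (next + 1) % servers.size();
        return target;
    }

    int serverCount() {
        return servers.size();
    }
}
```

Calling `rescale(250, 100)` grows the pool to three servers and `route()` then cycles through them; a later `rescale(80, 100)` shrinks the pool back to a single server without any manual step.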
NoSQL databases fall into four main categories: document-oriented databases, key-value databases, column-oriented databases and graph databases. The categories differ in their field of application. Graph databases, for example, specialize in defining and representing relationships between objects. In relational databases, resolving such relationships often requires multiple queries. Neo4j is an example of this type of database. Instead of conventional records, you create nodes that are linked together by the relationships you define between them. An example in Java code:
    // graphDb is assumed to be an embedded Neo4j GraphDatabaseService
    Node firstNode = graphDb.createNode();
    Node secondNode = graphDb.createNode();
    Relationship relationship = firstNode.createRelationshipTo(secondNode, MyRelationshipTypes.OWNS);
Here two nodes are created. A relationship, “owns”, exists between these two elements; its type is defined elsewhere as a simple enum. Information about the nodes and their relationships is stored as properties:
    firstNode.setProperty("name", "Wallace");
    secondNode.setProperty("name", "Gromit");
    relationship.setProperty("pays", "dog tax");

The first node (Wallace) owns the second node (Gromit), and he pays dog tax for it. Based on this simple model, a graph of relationships that is easy to traverse can be built.
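To make the notion of traversing such a graph more concrete without requiring a Neo4j installation, here is a small self-contained sketch. It deliberately does not use the Neo4j API: `TinyGraph`, `relate` and `namesOfTargets` are hypothetical names, and the in-memory structure is only an assumption about how nodes, relationships and properties could be modelled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory property graph; not the Neo4j API.
class TinyGraph {
    static class Node {
        final Map<String, Object> properties = new HashMap<>();
        final List<Edge> outgoing = new ArrayList<>();
    }

    static class Edge {
        final String type;
        final Node target;
        final Map<String, Object> properties = new HashMap<>();

        Edge(String type, Node target) {
            this.type = type;
            this.target = target;
        }
    }

    // Links two nodes and returns the new relationship so that properties can be set on it.
    static Edge relate(Node from, String type, Node to) {
        Edge edge = new Edge(type, to);
        from.outgoing.add(edge);
        return edge;
    }

    // Follows all outgoing relationships of one type and collects the "name" of each target.
    static List<String> namesOfTargets(Node start, String type) {
        List<String> names = new ArrayList<>();
        for (Edge edge : start.outgoing) {
            if (edge.type.equals(type)) {
                names.add(String.valueOf(edge.target.properties.get("name")));
            }
        }
        return names;
    }
}
```

The relationship is followed directly from the node that holds it, which is the step a relational database would typically express as a join or an additional query.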
In document-oriented databases, the term “document” is misleading: it does not refer to documents as such, but to data records or entries that contain all the necessary information about themselves. Each record is therefore self-describing and complete in itself. Example:
    {
      "ID": "700",
      "NAME": "GHOSH",
      "FIRST NAME": "ABHISHEK",
      "ADDRESS": "F04/502, PEERLESS NAGAR",
      "ACTIVITIES": ["BUSINESS", "TECHNOLOGY", "HEALTH"]
    }
    {
      "ID": "750",
      "NAME": "MUELLER",
      "FIRST NAME": "LARS",
      "ADDRESS": "Rentnerweg. 143, 67891 pensioners bush",
      "MARITAL STATUS": "SINGLE"
    }
As you can see, the records do not have the same fields. A search for an individual attribute returns only the records that contain that attribute (ID = 700 or marital status = single). This also explains why this type of database scales well: because each record is complete in itself, it does not matter on which server it is located. Key-value stores are a simplified form of document-oriented databases.
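The behaviour described above, records with different fields and searches that only match records carrying the attribute, can be sketched as follows. The class `DocumentStore` and its methods `insert` and `find` are hypothetical and only model the idea in memory; they are not the API of any real document database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory document store: every document is a free-form map of fields.
class DocumentStore {
    private final List<Map<String, Object>> documents = new ArrayList<>();

    // No schema is checked: each document may bring its own set of fields.
    void insert(Map<String, Object> document) {
        documents.add(document);
    }

    // Returns only the documents that contain the field and hold the given value in it.
    List<Map<String, Object>> find(String field, Object value) {
        List<Map<String, Object>> hits = new ArrayList<>();
        for (Map<String, Object> document : documents) {
            if (document.containsKey(field) && value.equals(document.get(field))) {
                hits.add(document);
            }
        }
        return hits;
    }
}
```

A search for `"MARITAL STATUS"` simply skips every document that lacks this field, so no placeholder values or schema changes are needed when a new attribute appears.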
In column-oriented databases, or wide column stores, the idea is to store the values of each attribute together in a separate table (column-based) rather than storing complete records one after another in a single table, as relational databases do (row-based). This type of storage is mainly used in dimension-oriented applications such as OLAP cubes or data warehouse environments. Today’s systems use a technique that can only partially be called column-oriented and that is based on Google’s BigTable architecture. Google describes this architecture as a “sparse, distributed, persistent multi-dimensional sorted map”. These are usually multi-dimensional tables in the following format:
    n * [Domain / Keyspace] x [Item / Column Family] x [Key x] n * [Key + Value]
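One way to picture this format is as nested sorted maps. The following sketch is an assumption-laden simplification: the class `WideColumnTable` is hypothetical, it covers a single keyspace only, and it leaves out the timestamp dimension that BigTable-style systems also keep per value.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-keyspace table: row key -> column family -> column name -> value.
class WideColumnTable {
    // TreeMap keeps the keys sorted, which mirrors the "sorted map" idea.
    private final TreeMap<String, TreeMap<String, TreeMap<String, String>>> rows = new TreeMap<>();

    // Rows and column families are created on demand, so the table stays sparse.
    void put(String rowKey, String family, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(family, k -> new TreeMap<>())
            .put(column, value);
    }

    // Returns null when the row, the column family or the column does not exist.
    String get(String rowKey, String family, String column) {
        Map<String, TreeMap<String, String>> families = rows.get(rowKey);
        if (families == null) {
            return null;
        }
        Map<String, String> columns = families.get(family);
        return columns == null ? null : columns.get(column);
    }

    // Smallest row key in sort order, or null for an empty table.
    String firstRowKey() {
        return rows.isEmpty() ? null : rows.firstKey();
    }
}
```

Because nothing is stored for missing columns, two rows in the same column family can carry entirely different sets of columns.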
NoSQL databases soften the ACID consistency model. They use the BASE model instead (Basically Available, Soft state, Eventually consistent), which prioritizes availability over data consistency. For this reason, such systems are mainly used to store large amounts of mostly non-critical data, for example in social networks. The popularity of NoSQL in distributed systems is explained by the CAP theorem, which ascribes three properties to a distributed database, of which at most two can be fully guaranteed at the same time:
- Consistency – All members of the cluster see the same state of the data.
- Availability – Even if a cluster member fails, the database is still online.
- Partition Tolerance – Even if messages between cluster members are lost (a network partition), the system can continue to operate.
For this reason, the operators of large distributed systems such as Facebook, Google and Amazon decided to go with partition tolerance and high availability, and to accept weaker consistency in return.
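What “eventually consistent” means in practice can be shown with a small sketch. It assumes a deliberately simple setup that is not taken from any of the systems named above: one primary copy, one replica, and a queue of pending updates; the class `EventuallyConsistentStore` and its method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical store with one primary and one replica that is updated with a delay.
class EventuallyConsistentStore {
    private final Map<String, String> primary = new HashMap<>();
    private final Map<String, String> replica = new HashMap<>();
    private final Queue<String[]> pending = new ArrayDeque<>();

    // A write is acknowledged as soon as the primary has it; the replica is updated later.
    void write(String key, String value) {
        primary.put(key, value);
        pending.add(new String[] {key, value});
    }

    String readFromPrimary(String key) {
        return primary.get(key);
    }

    // May return stale data (or null) until replicate() has run.
    String readFromReplica(String key) {
        return replica.get(key);
    }

    // Applies all queued updates in order; afterwards both copies agree again.
    void replicate() {
        String[] update;
        while ((update = pending.poll()) != null) {
            replica.put(update[0], update[1]);
        }
    }
}
```

Between `write` and `replicate` the two copies disagree, yet both remain available for reads. That window of inconsistency is the price paid for availability and partition tolerance.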
Databases in the Cloud: Providers
- Amazon SimpleDB
- Amazon DynamoDB
- Amazon Relational Database Service (RDS)
- Microsoft SQL Azure – Microsoft, Rackspace
- Oracle Database Cloud Service – Rackspace, Others
- Google App Engine Datastore – Google only
- MySQL-based database services – Rackspace, Salesforce.com / Database.com, SuccessBricks – ClearDB, Xeround