Tuesday, September 2, 2008

Database Sharding Vs. Mysql Clusters

Databases have been an integral part of information technology for a long time. They have taken various forms and shapes, relational, object oriented, hierarchical, in-memory, replicated, partitioned but mostly residing on the same machine. But the present internet party has more participants than ever before. And almost every action carried out by these netizens causes some results, which are stored in some database somewhere. Actions range from creating/deleting accounts, searching a phrase, commenting, rating to just clicking an ad, navigating a link or scrolling through a page. No single database can record the volume of information generated. Since only scaling-up is not an option anymore, more and more businesses are opting for scaling-out (may be in conjunction with scaling-up). Scaling-up is when you add more power to the existing machine or replace the existing machine with a more powerful one. In scaling-out more less expensive nodes are added to the system and the load is shared. In our case, the database is divided and stored on multiple machines. This brings us to sharding.

Broken glass in St Patricks Cathedral by féileacán


What is Sharding?

Sharding is the concept of splitting the table horizontally, and sometimes vertically, each split or shard stored on a different server. This is 'database partitioning' provided by the database providers and more. While partitioning allows only partitioning by primary key or index, whereas these restrictions do not exist in sharding. Master-Slave replication can also be done in sharding to deliver better availability. Please note that in such cases replication helps provide high availability and does not address scalability issues. For example, a database of search terms can be sharded based on the country it originated from. One server would be dedicated to store search terms of USA whereas another would store search terms from countries in Africa.

Sharding : Common Features

Since in sharding, databases are horizontally partitioned and spread across machines, joins are expensive or out-right not possible. Hence normalised tables are not optimal. Related data are put together in the same table, ensuring the queries are minimal. The nodes are less-expensive commodity machines. Blobs and large texts are moved to different servers ensuring the load is balanced. Since the shards function in the capacity of master databases, this is ideal for applications that have high write/read ratio.

See scalability.org for more on Sharding

MYSQL Clustering:

Of recent I found out about MySQL Clustering.

This is MySQL Clustering as described in mysql.com

"MySQL Cluster is a real-time open source transactional database designed for fast, always-on access to data under high throughput conditions. Cluster utilizes a 'shared nothing' architecture which does not require any additional infrastructure investment to provide 99.999% data availability with no single point of failure. "

A first time look gives an impression that it is quite similar to shards. Is it really? I tried to find out and tried to read about MySQL Cluster and Shards.

MySQL Cluster:

MySQL Cluster is a distributed database that ensures high performance using in-memory databases. It uses a distributed node-based architecture with automatic fast fail over in less than a second. While the whole database is usually loaded in-memory, it is possible in MySQL 6.0 to use disk cache for non-indexed fields. It limits Input/Output bottlenecks by asynchronously writing transactions logs to disk. The architecture is composed of three types of nodes.

Database Nodes:

All data is stored on the Database Nodes. The database transactions are handled by these nodes. The data is replicated in such a way that if a node goes down, there is always some other node that has a replication of the down node.

Application Nodes:

These nodes contain the applications that connect to the database. Failure of a database node does not affect the application.

Management Server Nodes:

These handle the system configuration. The management server nodes are used only at startup and for system reconfiguration and function independent of the other nodes.

This architecture ensures there is no single point of failure. Applications continue to run and data remains consistent, even if any one of the database, application, or management server nodes fail.

While the nodes can be distributed across geographies, the official FAQ of MySQL Clusters cautions against it, as it can greatly increase the response time. But if the cluster is geographically distributed, then the replication of data becomes asynchronous.

Sharding Vs MySQL Cluster

Differences in Intention:

Understanding this is very important as this underlies all the design and architectural decisions taken for the these two entities.

Sharding is designed for

1. Quick Scalability

2. Handling high write/read ratios

MySQL Clusters are designed for

1. Fast Responses

2. Always on 99.999% reliability

3. Handling high read/write ratios

Architectural Differences:

MySQL Clusters are geared to handle fast fail overs automatically. This is ensured by replicating the data in several database nodes to handle random node failures.

While replication is required and can possibly done through Master-Slave systems, this is not emphasised and is not the focus of Shards.

While any number of shards can be added to the system as long as the network supports, the number of nodes in a MySQL Cluster is limited. As of MySQL Cluster 5.16 , the maximum number of nodes a Cluster can have is 63 and the maximum number of supported data nodes is 48.

SQL Operations:

On Shards, joins are inherently impossible. MySQL Clusters support joins but they are much slower than on traditional databases as it has to be run against data on different nodes.

The MySQL uses the NDB storage engine which imposes some restrictions on memory consumed. The free memory that results from a row deletion is not recovered. So other tables cannot use this memory. But if another row is inserted into the same table, it can utilize the data. Only dropping the table will actually release the memory. With Shards , one has the freedom to choose the type of storage engine.

MySQL Clusters do not support a few features of the traditional database systems, owing to using the NDB storage engine. Some of them are Temporary tables, indexes on BLOB or TEXT data types, FULLTEXT indexes, partial transactions, partial rollbacks of transactions.

MySQL Cluster does not handle large transactions well; it is better to perform a number of small transactions with a few operations each than to attempt a single large transaction containing a great many operations.

The limitations of MySQL Cluster on naming is as follows: Attribute names are truncated to 31 characters, and if not unique after truncation give rise to errors. Database names and table names can total a maximum of 122 characters. The maximum number of attributes (that is, columns and indexes) per table is limited to 128.

Horizontal Partitioning:

On Shards the partitioning logic can be defined by the user. So essentially any logic can be employed. As of MySQL Cluster 5.1.6, all Cluster tables are by default partitioned by KEY using the table's primary key as the partitioning key. If no primary key exists, a hidden autoincrement column is created and used as primary key which is also used for partitioning.

On Shards, an user application ensures that the query goes to the shard where it is relevant based on the partitioning logic. This can add more work on the developer side, but reduces the no of queries executed.

For MySQL Cluster, there is a single point of contact to connect to the database, and the developer/user is screened from the internals of where the data is stored. The query may be executed on more than one database, (depending on the query).

Rebalancing Data & Backup Strategies:

In shard, These features have been custom implemented currently. There are some libraries that are coming up to assist the same.

On a MySQL Cluster these are implemented and require least work.

No comments: