May 11, 2012 Leave a comment
More and more businesses look into scaling solutions for their DB. That is for good reasons! Usually the number of clients or sales or licenses grows over time as a linear function. As a result we have a linear function for number of DB Transactions, and amount of data accumulated by the business. However the query response time grows exponentially. Thus, the performance of your single DB solution will yield a bad user experience.
Fig.1, DB Query Response Time
As we may clearly see from the diagram, eventually the business will run into the DB bottleneck.
DB Scaling solutions: Scale UP vs. Scale OUT
When scaling the DB you may consider two approaches: Scale Up and Scale Out.
This approach relates to improving the hardware solution of the DB. That means use faster processor, more RAM, faster Hard Drive, etc. This solution is works very well when you have to improve DB performance without changing the architecture. I.e. when dealing with a legacy system. It is not difficult however to see that this approach has limitations, including associated price for the upgraded units.
The Scale OUT approach relates to implementing a DB solution in which the I/O process will be distributed across multiple DB partitions. All options of Scale OUT assume some amount of re-work. What option to choose? Most of the choices revolve around the nature of your data and the nature of use of the data. Let us review the possible options of Scale OUT:
- Table partitioning
- Master/Slave configurations
- Custer Computing
Table Partitioning. In the case of Table Partitioning a large table is split into two or more physically separate partitions. For instance you may have clients who live on the east coast to be stored in the table CustomersEast and clients who live on the West Coast to be stored in the table CustomersWest. In case of such distribution the queries will end up hitting two separate tables. The tables can also be located in different DB files which helps the I/O performance.
Master/Slave configurations. In the Master/Slave solution we have one database which is called “Master” and multiple databases called “Slaves”. The Master DB is set up for read/write access; the Slave databases are read-only. There is a data replication between the Master and the Slaves. Master/Slave configuration is suggested by Microsoft when doing Peer-to-Peer Replication. The drawback of this solution relates to the situation in which the Master fails. The Master/Slave approach does not need complex changes and provides good performance result. Microsoft recommends this solution when you expect moderate amount of data updates.
Cluster Computing. Similar to the Master/Slave in the DB Cluster we have one database node which is a DB available for read/write and multiple nodes which are in the read-only mode. All data updates are received by the “Master” node and replicated into the read-only nodes. The DB Cluster Computing solution addresses the case of the failing Master. In such scenario one of the read-only nodes becomes the new “Master”. Thus the DB Cluster solution is more reliable than the Master/Slave solution.
Sharding. The term “Sharding” or “Shared nothing” was coined by Google engineers and implies a solution in which the application operates with “Shards” which can be physically separate databases. With Sharding there are queries which run inside the shard and queries which are distributed across multiple shards. It is important to design your data model to minimize the amount of distributed queries.
Fig.2, Scaling OUT: Sharding
SQL Azure Federations
Azure Federation is a Sharding technology which allows distributing the DB transactions across multiple databases which are called Federation Members.
Fig.3, SQL Azure Feredation
Let us review some terminology. The SQL Azure Federation have following essential objects:
- Federation object
- Federation Key
- Federation Member
- Atomic Unit
Is an object stored in SQL Azure database. This object is responsible for the distribution schema of the Federation Members as well as the type of the Federation Key. The Federation object is of a Range value. It covers the possible values of the Federation Key.
Federation Key is a column in the Federated Table. The value of the Federation Key determines in which Federation Member the data row is stored. As you can see from the listing, the CREATE TABLE instruction for the Federated table differs from the regular CREATE TABLE by “FEDERATED ON” clause.
Federation Member is a physically separate SQL Azure database. The Federation Member holds data related to a range of Federation Keys, configured for this Federation Member in the Federation object. The Federation Member can have two types of tables: Federated tables and Non-Federated tables.
Atomic Unit represents all rows of data from Federated Table(s) which have a particular value of the Federation Key.
How does it work?
Once you connect to the SQL Azure DB you create a Federation object. The Federation object is created and stored in the DB you connected. This database becomes Federation Root. When you create a Federation object your first Federation Member, a separated SQL Azure DB, is also created. The size and the type of the newly created Federation Member SQL Azure DB is inherited from the size and type of the Federation Root SQL Azure DB.
Fig.4, Azure Federation Databases and Tables
Management of Azure Federations
Azure Federation is an out-of-the box solution. As such it provides an interface for following manipulation of the Federation object:
- Split/Drop Azure Federation
- Monitor Azure Federation
Split/Drop Azure Federation
Once you create the Federation object, your first Federation Member is created. Hopefully your business picks up and your Federation Member DB is not enough to handle the load. In such case you need to add more databases. That is achieved by splitting your Federation Member. On contrary, if you need to downsize, you may want to drop the Federation member.
I would like to provide the example listing which you may use to explore Azure Federation. Run this code segment by segment. Be sure to establish a connection to Azure DB first.
Monitoring Azure Federation
You can monitor your Azure Federation through a set of Diagnostic Views and tables:
- Federation Operation Views. These system diagnostic views contain information about the current operations on Azure Federation Objects
- Federation History Tables. The Federation History Tables contain the historical information about the operations which affect the Federation object.
Federation Metadata Tables – contain the current distribution of federation members in the Federation.
Fig.5 Azure Diagnostics Views and Tables
Pricing and Billing
The pricing of Azure Federations is no different than pricing of SQL Azure:
- DB Charges are prorated to a day.
Example: Cost of 1Gb Web edition is $0.33/day
- All DB that exist in the day are charged
- Only DB that are in ready state are charged
Following operations related to Azure Federations will affect your bill:
- Federation Creation
- Federation Repartitioning Operations
Modification of Federation Billing properties. Please see below how you may alter your SQL Azure DB to reduce/upgrade the size and the edition of your DB.
SQL Azure Federations provide a powerful OUT-OF-THE box solution for scaling out your SQL Azure DB. The solution is based on SQL Azure and comes with all good things of SQL Azure including two backup copies, pay-as –you-go pricing model. Finally the bulk of the db partitioning complexity is implemented “under the hood”. That is very convenient as it allows the developer to concentrate on creating the scalable solution.