Chapter 5. MySQL Replication for Scale-Out
The details of replication are covered to some degree in Baron Schwartz et al.'s High Performance MySQL (O'Reilly, http://oreilly.com/catalog/9780596101718/), but here we will talk about how to set up replication in MySQL to make the best use of scale-out. After some basic instructions for replication, we'll start to develop a Python library that makes it easy to administer replication over large sets of servers, and we'll examine how replication fits into your organization's business needs.
The most common uses for scaling out and replication are:
Load balancing for reads
Since the master is occupied with updating data, it can be wise to have separate
servers to answer queries. Since queries only need to read data, you can use replication to send changes on the master to slaves—as many as you feel you need—
so that they have current data and can process queries.
Load balancing for writes
High-traffic deployments distribute processing over many computers, sometimes
several thousand. Here, replication plays a critical role in distributing the information to be processed. The information can be distributed in many different ways
based on the business use of your data and the nature of the use:
• Distributed based on the information’s role. Rarely updated tables can be kept
on a single server, while frequently updated tables are partitioned over several
servers.
• Partitioned by geographic region so that traffic can be directed to the closest server.
Disaster avoidance through hot standby
If the master goes down, everything will stop—it will not be possible to execute
(perhaps critical) transactions, get information about customers, or retrieve other
critical data. This is something that you want to avoid at (almost) any cost since it
can severely disrupt your business. The easiest solution is to configure a slave with
the sole purpose of acting as a hot standby, ready to take over the job of the master
if it fails.
Disaster avoidance through remote replication
Every deployment runs the risk of having a data center go down due to a disaster,
be it a power failure, an earthquake, or a flood. To mitigate this, use replication to
transport information between geographically remote sites.
Making backups
Keeping an extra server around for making backups is very common. This allows
you to make your backups without having to disturb the master at all, since you
can take the backup server offline and do whatever you like with it.
Report generation
Creating reports from data on a server will degrade the server’s performance, in
some cases significantly. If you’re running lots of background jobs to generate
reports, it’s worth creating a slave just for this purpose. You can get a snapshot of
the database at a certain time by stopping replication on the slave and then running
large queries on it without disturbing the main business server. For example, if you
stop replication after the last transaction of the day, you can extract your daily
reports while the rest of the business is humming along at its normal pace.
Filtering or partitioning data
If the network connection is slow, or if some data should not be available to certain
clients, you can add a server to handle data filtering. This is also useful when the
data needs to be partitioned and reside on separate servers.
Scaling Out Reads, Not Writes
It is important to understand that scaling out in this manner scales out reads, not writes.
Each new slave has to handle the same write load as the master. The average load of the system can be described as:

    Load = (ReadLoad + WriteLoad) / Capacity

So if you have a single server with a total capacity of 10,000 transactions per second, and there is a write load of 4,000 transactions per second on the master, while there is a read load of 6,000 transactions per second, the result will be:

    Load = (6,000 + 4,000) / 10,000 = 1.0

that is, the server is running at 100 percent of its capacity.
Now, if you add three slaves to the master, the total capacity increases to 40,000 transactions per second. Because the write queries are replicated as well, each query is executed a total of four times—once on the master and once on each of the three slaves—which means that each slave has to handle 4,000 transactions per second in write load. The total read load does not increase because it is distributed over the slaves. This means that the average load now is:

    Load = (6,000 + 4 × 4,000) / (4 × 10,000) = 22,000 / 40,000 = 0.55

that is, the deployment runs at 55 percent of its total capacity.

Notice that in the formula, the capacity is increased by a factor of 4, since we now have a total of four servers, and replication causes the write load to increase by a factor of 4 as well.
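To make the arithmetic easy to experiment with, here is a minimal Python sketch of the same calculation, generalized to any number of slaves. The function and its parameter names are our own illustration, not part of the library developed in this chapter:

def relative_load(reads, writes, capacity_per_server, num_slaves):
    """Average load of a master plus num_slaves identical slaves.

    Every write is executed on the master and on each slave, so the total
    write load grows with the number of servers, while the read load is
    simply spread over the added capacity.
    """
    servers = 1 + num_slaves
    total_capacity = servers * capacity_per_server
    total_load = reads + servers * writes
    return total_load / total_capacity

# The numbers used in the text: 6,000 reads/s and 4,000 writes/s,
# with each server able to handle 10,000 transactions per second.
print(relative_load(6000, 4000, 10000, 0))   # 1.0  -> a single server is saturated
print(relative_load(6000, 4000, 10000, 3))   # 0.55 -> four servers run at 55 percent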
It is quite common to forget that replication forwards to each slave all the write queries
that the master handles. So you cannot use this simple approach to scale writes, only
reads. Later in this chapter, you will see how to scale writes using a technique called
sharding.
The Value of Asynchronous Replication
MySQL replication is asynchronous, a type of replication particularly suitable for modern applications such as websites.
To handle a large number of reads, sites use replication to create copies of the master
and then let the slaves handle all read requests while the master handles the write
requests. This replication is considered asynchronous because the master does not wait
for the slaves to apply the changes, but instead just dispatches each change request to
the slaves and assumes they will catch up eventually and replicate all the changes. This
technique for improving performance is usually a good idea when you are scaling out.
In contrast, synchronous replication keeps the master and slaves in sync and does not
allow a transaction to be committed on the master unless the slave agrees to commit it
as well. That is, synchronous replication makes the master wait for all the slaves to keep
up with the writes.
Asynchronous replication is a lot faster than synchronous replication, for reasons our
description should make obvious. Compared to asynchronous replication, synchronous replication requires extra synchronizations to guarantee consistency. It is usually
implemented through a protocol called two-phase commit, which guarantees consistency between the master and slaves, but requires extra messages to ping-pong between
them. Typically, it works like this (a small code sketch of this decision logic follows the list):
1. When a commit statement is executed, the transaction is sent to the slaves and the
slave is asked to prepare for a commit.
2. Each slave prepares the transaction so that it can be committed, and then sends an
OK (or ABORT) message to the master, indicating that the transaction is prepared
(or that it could not be prepared).
3. The master waits for all slaves to send either an OK or an ABORT message.
a. If the master receives an OK message from all slaves, it sends a commit message
to all slaves asking them to commit the transaction.
b. If the master receives an ABORT message from any of the slaves, it sends an
abort message to all slaves asking them to abort the transaction.
4. Each slave is then waiting for either an OK or an ABORT message from the master.
a. If the slaves receive the commit request, they commit the transaction and send
an acknowledgment to the master that the transaction is committed.
b. If the slaves receive an abort request, they abort the transaction by undoing
any changes and releasing any resources they held, then send an acknowledgment to the master that the transaction was aborted.
5. When the master has received acknowledgments from all slaves, it reports the
transaction as committed (or aborted) and continues with processing the next
transaction.
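To summarize the message flow, here is a small Python sketch of the coordinator's decision logic. It is only an illustration of two-phase commit in general, not of any particular MySQL implementation, and the prepare, commit, and abort methods on the slave objects are hypothetical:

def two_phase_commit(transaction, slaves):
    """Coordinator side of the protocol described in the steps above."""
    # Phase 1: ship the transaction and ask every slave to prepare it.
    votes = [slave.prepare(transaction) for slave in slaves]

    # Phase 2: commit everywhere only if every slave voted OK;
    # otherwise tell every slave to undo its changes.
    if all(vote == "OK" for vote in votes):
        for slave in slaves:
            slave.commit(transaction)    # each slave acknowledges the commit
        return "COMMITTED"
    for slave in slaves:
        slave.abort(transaction)         # undo changes, release resources
    return "ABORTED"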
What makes this protocol slow is that it requires a total of four messages, including the
messages with the transaction and the prepare request. The major problem is not the
amount of network traffic required to handle the synchronization, but the latency introduced by the network and by processing the commit on the slave, together with the
fact that the commit is blocked on the master until all the slaves have acknowledged
the transaction. In contrast, asynchronous replication requires only a single message
to be sent with the transaction. As a bonus, the master does not have to wait for the
slave, but can report the transaction as committed immediately, which improves performance significantly.
So why is it a problem that synchronous replication blocks each commit while the slaves
process it? If the slaves are close to the master on the network, the extra messages needed
by synchronous replication make little difference, but if the slaves are not nearby—
maybe in another town or even on another continent—it makes a big difference.
Table 5-1 shows some examples for a server that can commit 10,000 transactions per
second. This translates to a commit time of 0.1 ms (but note that some implementations, such as MySQL Cluster, are able to process several commits in parallel if they
are independent). If the network latency is 0.01 ms (a number we’ve chosen as a baseline
by pinging one of our own computers), the transaction commit time increases to 0.14
ms, which translates to approximately 7000 transactions per second. If the network
latency is 10 ms (which we found by pinging a server in a nearby city), the transaction
commit time increases to 40.1 ms, which translates to about 25 transactions per second!
In contrast, asynchronous replication introduces no such delay, because transactions are reported as committed immediately, so throughput stays at the original 10,000 transactions per second, just as if there were no slaves.
Table 5-1. Typical slowdowns caused by synchronous replication

Latency (ms)    Transaction commit time (ms)    Equivalent transactions per second    Example case
0.01            0.14                            ~7,100                                Same computer
0.1             0.5                             ~2,000                                Small LAN
1               4.1                             ~240                                  Bigger LAN
10              40.1                            ~25                                   Metropolitan network
100             400.1                           ~2                                    Satellite
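The figures in Table 5-1 follow directly from assuming that each commit waits for the four protocol messages, each paying one network latency, on top of the base commit time. Here is a short sketch of that arithmetic (our own illustration, which reproduces the table):

BASE_COMMIT_MS = 0.1   # 10,000 transactions per second on the master alone

def synchronous_commit_ms(latency_ms, messages=4):
    """Commit time when the master waits for the two-phase commit messages."""
    return BASE_COMMIT_MS + messages * latency_ms

for latency in (0.01, 0.1, 1, 10, 100):
    commit_ms = synchronous_commit_ms(latency)
    tps = 1000 / commit_ms   # milliseconds per commit -> commits per second
    print(f"{latency:g} ms latency: {commit_ms:.2f} ms per commit, ~{tps:,.0f} tps")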
The performance of asynchronous replication comes at the price of consistency. Recall
that in asynchronous replication the transaction is reported as committed immediately,
without waiting for any acknowledgment from the slave. This means the master may
consider the transaction committed when the slave does not. As a matter of fact, it
might not even have left the master, but is still waiting to be sent to the slave.
There are two problems with this that you need to be aware of:
• In the event of crashes on the master, transactions can “disappear.”
• A query executed on the slaves might return old data.
Later in this chapter, we will talk about how to ensure you are reading current data,
but for now, just remember that asynchronous replication comes with its own set of
caveats that you have to handle.
Managing the Replication Topology
A deployment is scaled by creating new slaves and adding them to the collection of
computers you have. The term replication topology refers to the ways you connect servers using replication. Figure 5-1 shows some examples of replication topologies: a simple topology, a tree topology, a dual-master topology, and a circular topology.
Figure 5-1. Simple, tree, dual-master, and circular replication topologies
These topologies are used for different purposes: the dual-master topology handles
failovers elegantly, for example, and circular replication and dual masters allow different sites to work locally while still replicating changes over to the other sites.
The simple and tree topologies are used for scale-out. The use of replication causes the
number of reads to greatly exceed the number of writes. This places special demands
on the deployment in two ways:
It requires load balancing
We’re using the term load balancing here to describe any way of dividing queries
among servers. Replication creates both reasons for load balancing and methods
for doing so. First, replication imposes a basic division of the load by specifying
writes to be directed to the masters while reads go to the slaves. Furthermore, you
sometimes have to send a particular query to a particular slave.
It requires you to manage the topology
Servers crash sooner or later, which makes it necessary to replace them. Replacing
a crashed slave might not be urgent, but you’ll have to replace a crashed master
quickly.
In addition to this, if a master crashes, clients have to be redirected to the new
master. If a slave crashes, it has to be taken out of the pool of load balancers so no
queries are directed to it.
To handle load balancing and management, you should put tools in place to manage
the replication topology, specifically tools that monitor the status and performance of
servers and tools to handle the distribution of queries.
For load balancing to be effective, it is necessary to have spare capacity on the servers.
There are a few reasons for ensuring you have spare capacity:
Peak load handling
You need to have margins to be able to handle peak loads. The load on a system
is never even but fluctuates up and down. The spare capacity necessary to handle
a large deployment depends a lot on the application, so you need to monitor it
closely to know when the response times start to suffer.
Distribution cost
You need to have spare capacity for running the replication setup. Replication
always causes a “waste” of some capacity on the overhead of running a distributed
system. It involves extra queries to manage the distributed system, such as the extra
queries necessary to figure out where to execute a read query.
One item that is easily forgotten is that each slave has to perform the same writes
as the master. The queries from the master are executed in an orderly manner (that
is, serially), with no risk of conflicting updates, but the slave needs extra capacity
for running replication.
Administrative tasks
Restructuring the replication setup requires spare capacity so you can support
temporary dual use, for example, when moving data between servers.
Load balancing works in two basic ways: either the application asks for a server based
on the type of query, or an intermediate layer—usually referred to as a proxy—analyzes
the query and sends it to the correct server.
Using an intermediate layer to analyze and distribute the queries (as shown in Figure 5-2) is by far the most flexible approach, but it has two disadvantages:
• Processing resources have to be spent on analyzing queries. This delays the query,
which now has to be parsed and analyzed twice: once by the proxy and again by
the MySQL server. The more advanced the analysis, the more the query is delayed.
Depending on the application, this may or may not be a problem.
Figure 5-2. Using a proxy to distribute queries
• Correct query analysis can be hard to implement, sometimes even impossible. A
proxy will often hide the internal structure of the deployment from the application
programmer so that she does not have to make the hard choices. For this reason,
the client may send a query that can be very hard to analyze properly and might
require a significant rewrite before being sent to the servers.
One of the tools that you can use for proxy load balancing is MySQL Proxy. It contains
a full implementation of the MySQL client protocol, and therefore can act as a server
for the real client connecting to it and as a client when connecting to the MySQL server.
This means that it can be fully transparent: a client can’t distinguish between the proxy
and a real server.
The MySQL Proxy is controlled using the Lua programming language. It has a built-in
Lua engine that executes small—and sometimes not so small—programs to intercept
and manipulate both the queries and the result sets. Since the proxy is controlled using
a real programming language, it can carry out a variety of sophisticated tasks, including
query analysis, query filtering, query manipulation, and query distribution.
Configuration and programming of the MySQL Proxy are beyond the scope of this
book, but there are extensive publications about it online. Some of the ones we find
useful are:
http://dev.mysql.com/tech-resources/articles/proxy-gettingstarted.html
“Getting Started with MySQL Proxy” is Giuseppe Maxia’s classic article introducing the MySQL Proxy.
http://forge.mysql.com/wiki/MySQL_Proxy
The MySQL Proxy wiki page on MySQL Forge contains a lot of information about
the proxy, including a lot of references and examples.
http://forge.mysql.com/wiki/MySQL_Proxy_RW_Splitting
This is a description on MySQL Forge of how you can use MySQL Proxy for read/
write splitting, that is, sending read queries to some set of servers and write queries
to the master.
The precise methods for using a proxy depend entirely on the type of proxy you use,
so we will not cover that information here. Instead, we’ll focus on using a load balancer
in the application layer. There are a number of load balancers available, including:
• Hardware
• Simple software load balancers, such as Balance
• Peer-based systems, such as Wackamole
• Full-blown clustering solutions, such as the Linux Virtual Server
It is also possible to distribute the load on the DNS level and to handle the distribution
directly in the application.
Example of an Application-Level Load Balancer
Let’s tackle the task of designing and implementing a simple application-level load
balancer to see how it works. In this section, we’ll implement read/write splitting. We’ll
extend the load balancer later in the chapter to handle data partitioning.
The most straightforward approach to load balancing at the application level is to have
the application ask the load balancer for a connection based on the type of query it is
going to send. In most cases, the application already knows if the query is going to be
a read or write query and also which tables will be affected. In fact, forcing the application developer to consider these issues when designing the queries may produce other
benefits for the application, usually in the form of improved overall performance of the
system. Based on this information, a load balancer can provide a connection to the right
server, which the application then can use to execute the query.
A load balancer on the application layer needs to have a central store with information
about the servers and what queries they should handle. Functions in the application
layer send queries to this central store, which returns the name or IP address of the
MySQL server to query.
Let’s develop a simple load balancer like the one shown in Figure 5-3 for use by the
application layer. We’ll use PHP for the presentation logic because it’s so popular on
web servers. It is necessary to write functions for updating the server pool information
and functions to fetch servers from the pool.
Figure 5-3. Load balancing on the application level
The pool is implemented by creating a table with all the servers in the deployment in
a common database that is shared by all nodes. In this case, we just use the host and
port as primary key for the table (instead of creating a host ID) and create a common
database to contain the tables of the shared data.
You should duplicate the central store so that it doesn’t create a single
point of failure. In addition, because the list of available servers does
not often change, load balancing information is a perfect candidate for
caching.
For the sake of simplicity—and to avoid introducing dependencies on
other systems—we demonstrate the application-level load balancer using a pure MySQL implementation.
There are many other techniques that you can use that do not involve
MySQL. The most common technique is to use round-robin DNS; another alternative is using Memcached, which is a distributed in-memory
key/value store.
Also note that the addition of an extra query might be a significant
overhead for high-performing systems and should be avoided.
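As a small illustration of the caching idea mentioned above (our own sketch, separate from the example code, with a hypothetical fetch_from_store callable standing in for the actual query against the central store):

import time

_CACHE = {}          # query type -> (timestamp, list of (host, port, sock) rows)
_TTL_SECONDS = 30    # how long a cached server list is trusted

def cached_servers(query_type, fetch_from_store):
    """Return the server list for query_type, hitting the central store
    at most once every _TTL_SECONDS for each query type."""
    now = time.time()
    entry = _CACHE.get(query_type)
    if entry is None or now - entry[0] > _TTL_SECONDS:
        entry = (now, fetch_from_store(query_type))
        _CACHE[query_type] = entry
    return entry[1]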
The load balancer lists servers in the load balancer pool, separated into categories based
on what kind of queries they can handle. Information about the servers in the pool is
stored in a central repository. The implementation consists of a table in the common
database given in Example 5-1, the PHP functions in Example 5-2 for querying the load
balancer from the application, and the Python functions in Example 5-3 for updating
information about the servers.
Example 5-1. Database tables for the load balancer
CREATE TABLE nodes (
    host CHAR(28) NOT NULL,
    port INT UNSIGNED NOT NULL,
    sock CHAR(64) NOT NULL,
    type SET('READ','WRITE') NOT NULL DEFAULT '',
    PRIMARY KEY (host, port)
);
For each host, we store whether it accepts reads, writes, both, or neither. This information is stored in the type field. By setting it to the empty set, we can bring the server offline, which is important for maintenance.
A simple SELECT will suffice to find all the servers that can accept the query. Since we
want just a single server, we limit the output to a single line using the LIMIT modifier
to the SELECT query, and to distribute queries evenly among available servers, we use
the ORDER BY RAND() modifier.
Using the ORDER BY RAND() modifier requires the server to sort the rows
in the table, which may not be the most efficient way to pick a number
randomly (it’s actually a very bad way to pick a number randomly), but
we picked this approach for demonstration purposes only.
Example 5-2 shows the PHP function getServerConnection, which queries for a server
and connects to it. It returns a connection to the server suitable for issuing a query, or
NULL if no suitable server can be found. The helper function connect_to constructs a suitable connection string given a host, port, and Unix socket. If the host is localhost, it will use the socket to connect to the server for efficiency.
Example 5-2. PHP function for querying the load balancer
function connect_to($host, $port, $socket) {
    // Connect over the Unix socket when local, over TCP otherwise.
    $db_server = $host == "localhost" ? ":{$socket}" : "{$host}:{$port}";
    return mysql_connect($db_server, 'query_user');
}

// host, port, and socket are placeholders for the common server's address.
$COMMON = connect_to(host, port, socket);
mysql_select_db('common', $COMMON);

define('DB_WRITE', 'WRITE');
define('DB_READ', 'READ');

function getServerConnection($queryType)
{
    global $COMMON;
    $query = <<<END_OF_SQL
SELECT host, port, sock FROM nodes
WHERE FIND_IN_SET('$queryType', type)
ORDER BY RAND() LIMIT 1
END_OF_SQL;
    $result = mysql_query($query, $COMMON);
    if ($row = mysql_fetch_row($result))
        return connect_to($row[0], $row[1], $row[2]);
    return NULL;
}
The final task is to provide utility functions for adding and removing servers and for updating the capabilities of a server. Since these are mainly to be used from the administration logic, we've implemented them in Python using the Replicant library. The utility consists of three functions:
pool_add(common, server, type)
Adds a server to the pool. The pool is stored at the server denoted by common, and
the type to use is a list—or other iterable—of values to set.
pool_del(common, server)
Deletes a server from the pool.
pool_set(common, server, type)
Changes the type of the server.
Example 5-3. Administrative functions for the load balancer
import MySQLdb

import replicant   # the Replicant library; assumed importable under this name

class AlreadyInPoolError(replicant.Error):
    pass

_INSERT_SERVER = """
INSERT INTO nodes(host, port, sock, type)
    VALUES (%s, %s, %s, %s)"""
_DELETE_SERVER = "DELETE FROM nodes WHERE host = %s AND port = %s"
_UPDATE_SERVER = "UPDATE nodes SET type = %s WHERE host = %s AND port = %s"

def pool_add(common, server, type=[]):
    """Add a server to the pool; raise AlreadyInPoolError if it is already there."""
    common.use("common")
    try:
        common.sql(_INSERT_SERVER,
                   (server.host, server.port, server.socket, ','.join(type)))
    except MySQLdb.IntegrityError:
        raise AlreadyInPoolError

def pool_del(common, server):
    """Delete a server from the pool."""
    common.use("common")
    common.sql(_DELETE_SERVER, (server.host, server.port))

def pool_set(common, server, type):
    """Change what kinds of queries the server accepts."""
    common.use("common")
    common.sql(_UPDATE_SERVER, (','.join(type), server.host, server.port))
These functions can be used as shown in the following examples:
pool_add(common, master, ['READ', 'WRITE'])
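A few more illustrative calls (with hypothetical server objects), tying back to the earlier remark that setting the type to the empty set takes a server out of rotation:

# Add a slave that only serves read queries.
pool_add(common, slave, ['READ'])

# Take the slave offline for maintenance, then put it back.
pool_set(common, slave, [])
pool_set(common, slave, ['READ'])

# Remove a decommissioned server from the pool entirely.
pool_del(common, slave)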