MySQLDump To a Remote Server

I was running out of disk space on a server today. The server had a large database table that was no longer used, so I wanted to archive it and then drop the table. But the server didn’t have enough disk space to dump it out to disk before copying it off to a remote server for archiving.

The first thought was to run mysqldump dump on the destination machine, and to access the database over the network. That however, doesn’t compress or encrypt the data. Plus I would have had to create a mysql user with permission to access the database remotely.

The solution I came up with worked out well: mysqldump directly to the remote host with this command:

mysqldump <DATABASE_NAME> [mysqldump options] | gzip -c | ssh user@remotehost "cat > /path/to/some-file.sql.gz"

That pipes the mysqldump command through gzip, then to through and SSH connection. SSH on the remote side runs the ‘cat’ command to read the stdin, then redirects that to the actual file where I want it saved.

Fixing a Corrupt MySQL Relay Log

As an extension of my post yesterday about skipping corrupt queries in the relay log, I found out that my problem is due to some network problems between the servers which triggers a MySQL bug.

The connection and replication errors in my MySQL log looks like this:

080930 12:26:52 [ERROR] Error reading packet from server: Lost connection to MySQL server 
  during query ( server_errno=2013)
080930 12:26:52 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, 
  log 'mysql-bin.000249' position 747239037
080930 12:26:53 [Note] Slave: connected to master 'replicate@mysqltunnel:13306',replication 
  resumed in log 'mysql-bin.000249' at position 747239037
080930 13:18:49 [ERROR] Error reading packet from server: Lost connection to MySQL server during
   query ( server_errno=2013)
080930 13:18:49 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 
  'mysql-bin.000249' position 783782486
080930 13:18:49 [ERROR] Slave: Error 'You have an error in your SQL syntax; check the manual 
  that corresponds to your MySQL server version for the right syntax to use near '!' at line 6' 
  on query. Default database: 'database'. Query: 'INSERT INTO `sometable`
            SET   somecol         = 3,
                    comeothercol  = 8,
                    othervalue      = NULL!', Error_code: 1064
080930 13:18:49 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and 
  restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000249' 
  position 783781942
080930 13:18:50 [Note] Slave: connected to master 'replicate@mysqltunnel:13306',
  replication resumed in log 'mysql-bin.000249' at position 783782486

When there are network problems between the server, there was some issue where the master didn’t properly detect and notify the slave of the failure. This resulted in parts of queries missing, duplicated, or replaced by random bits in the relay log on the slave. When the slave tries to execute the corrupt query, it will likely generate an error that begins with:

Error You have an error in your SQL syntax; check the manual that corresponds to 
  your MySQL server version for the right syntax to use near . . 

This bug has been fixed in MySQL releases since February 2008, but still hasn’t made its way into the CentOS 5 repositories. Until then, that bug report contained a work-around which forces the slave to re-request the binary log from the master. Run ‘SHOW SLAVE STATUS’ and make note of the Master_Log_File and Exec_Master_Log_Pos columns. Then run ‘STOP SLAVE’ to suspend replication, and run this SQL:

CHANGE MASTER TO master_log_file='<Value from Relay_Master_Log_File>',
  master_log_pos=<Value from Exec_master_log_pos>;

After that, simply run ‘START SLAVE’ to have replication pick up from there again. That evidently has the slave re-request the rest of the master’s binary log, which it should (hopefully) get without becoming corrupt, and replication will continue as normal.

I guess the network connection between my servers is problematic lately. I’ve had to fix this several times in the past couple days. If that keeps up, I may add this fix to my Replication checking-script until I’m able to upgrade to a version of MySQL that contains this fix.

Compression for MySQL Replication

I have a MySQL database that does a fair number of updates and inserts. The server is replicated to an off-site server located across the country. With MySQL replication, any Insert, Update, or Delete statements are written to the binary log, then sent from the master server in San Jose to the slave in New York.

I noticed today that the slave server was falling behind the master and had trouble keeping up. I noticed that there was a sizable amount of bandwidth between the two servers and after investigating for a little while, determined that the bandwidth between the servers wasn’t sufficient to keep up with the replication.

We have applications running on the server in New York that were significantly behind or slow. After a bit of research, I found the slave_compressed_protocol setting in MySQL which allows the master and slave to compress the replication data between the two servers. After enabling that, the slave was able to catch up within a matter of minutes and has stayed caught up just fine. The bandwidth usage has dropped from a consistent 600 kb/s to around 20 kb/s.

Upon looking into MySQL replication, I also experimented with SSH compression since the replication goes through an SSH Tunnel. I had similar success with SSH compression as well.

Don’t Use Integers as Values in an Enum Field

I just got through fixing a messy problem where a database had a table defined with a couple columns that were ENUM’s with integer values.   This leads to extreme amounts of confusion, because there is a lot of ambiguity when doing queries whether the integer is supposed to be treated as the enumerated value, or as the key.

Imagine a table with a column defined as ENUM(‘0’, ‘1’, ‘2’, ‘3’).  When doing queries, if you try to do anything with that column, it is unclear whether you mean to use the actual value you pass in, or the position.  For example, if I as to say ‘WHERE confusing_column = 2’, it could be interpreted as either meaning the value ‘2’, or the item in the second position (ie; ‘1’).    It is even hard to explain because it is so confusing.

The MySQL Documentation does a decent job of explaining it.   I agree with their recommendation:

For these reasons, it is not advisable to define an ENUM column with enumeration values that look like numbers, because this can easily become confusing.

I ended up converting everything to Tinyint’s. It takes a few more bits per row, but worth it in my opinion to avoid the confusion.

MyTop Stops and Beeps on When a Query Contains Binary Data

MyTop is a handy utility for watching the queries being executed on a MySQL server from a terminal window.   It is written in Perl, and is pretty straightforward.  It just does a ‘SHOW FULL PROCESSLIST’ on the database, and then displays the currently running queries.   You can sort by various columns, and in generally is just tons easier than running SHOW PROCESSLIST from the MySQL command prompt.

My database does some inserts that contain binary data.  I noticed that when running mytop, and one of those queries came up, the terminal would beep and it would stop and prompt me to enter something.

To resolve, I added this to about line 970 so that it filters out most non-displayable characters.   Feel free to let me know a better regex to use.  This one is pretty ugly, but works for now. (Also, wordpress might have mangled some of the slashes)

## Try to filter out binary information and still provide all of the necessary detail
$thread->{Info} =~ s/[^\\w\\d\\s\\(\\)\\[\\]\\-\\;\\:\\'\\"\\,\\.\\<\\>\\?\\/\\\\\\*\\~\\!\\@\\#\\$\\%\\^\\&\\*\\-_\\+\\=\\` ]//g;

Poor Performance After Enabling Repliction Due to sync_binlog

I was pretty happy with myself with setting up some fairly complicated MySQL circular replication the other night.  I did it far after peak hours so as not to disturb any visitors if it caused any problems.   Everything appeared to be working great until I started watching things the next morning.

I started to notice that the main MySQL server seemed to be running really slow.   One process that we have usually completes in a couple hours, ended up taking well over 16 hours to complete.   I spent the whole day troubleshooting it, which got me familiar with all sorts of handy tools.   ‘mytop‘ is a handy version of ‘top’ for MySQL queries.  I got familiar with iostat for watching disk I/O performance.

In the end, after a whole day of troubleshooting it came down to the ‘sync_binlog‘ setting that I had enabled because I read some howto that mentioned it was useful for the replication master.  My understanding now of the setting is that it causes the operating system to tell the disk to sync the file to disk after each write to the binary log (every UPDATE, INSERT, or DELETE).   The idea is that when the data is sync’d to disk, the drive physically writes it to the drive, instead of keeping it in a cache.    My application does a ton, of inserts, so it was killing performance.

A Case for Choosing Good Server Names

This morning, I had a client call me bright and early, frantic about some mail problems they were having.  All of their mail servers had stopped accepting incoming SMTP connections for some reason, and they couldn’t figure out why.

After a little bit of investigation, I found that they were using postfix with MySQL-based virtual domains.   The MySQL authentication was failing, which meant that postfix was unable to look up any valid recipient names.   That, in turn was causing tons of retried connections, until they hit the maximum number of connections where Postfix would refuse additional connections.

The problem is that these mail servers were initially set up with some dumb names for some reason.    A new administrator noticed the silly names in their Reverse DNS entries and changed them to some more sensible names.  The MySQL permissions were based off of the hostnames, so when the names in Reverse DNS changed, it broke the permissions, and the clients were unable to connect.

Solving the problem was simple enough – I just corrected the MySQL permissions, and then had to deal with some huge mail queues for a little while as all of the messages waiting to come in were finally allowed all at once.

The moral of the story is to use sensible names to start out with.   These names were chosen to be sortof funny I guess, but it didn’t end up being so amusing in the midst of all of the problems it caused.  As a side note, I usually do MySQL permissions based on IP Address as well, so that you further reduce this kind of problem.

Creating a Permanent SSH Tunnel Between Linux Servers

I recently had a need to create a permanent SSH tunnel between Linux servers. My need was to allow regular non-encrypted MySQL connections over an encrypted tunnel, but there could be many other uses as well. Google can identify plenty of resources regarding the fundamental SSH commands for port forwarding but I didn’t ever find a good resource for setting up a connection and ensuring that it remains active, which is what I hope to provide here.

The SSH commands for port forwarding can be found in the ssh man page. The steps described here will create an unprivileged user named ‘tunnel’ on each server. That user will then be used to create the tunnel and run a script via cron to ensure that it remains up.

First, select one of the servers that will initiate the SSH connection. SSH allows you to map both local and remote ports, so it doesn’t really matter which end of the connection you choose to initiate the connection. I’ll refer to the box that initiates the connection as Host A, and the box that we connect to as Host B.

Create a ‘tunnel’ user on Host A:

[root@hosta ~]# useradd -d /home/tunnel tunnel
[root@hosta ~]# passwd tunnel       ## Set a strong password
[root@hosta ~]# su - tunnel           ## Become the user 'tunnel'

Now create a public/private key pair:

[tunnel@hosta ~]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/tunnel/.ssh/id_rsa):    ## hit enter to accept the default
Enter passphrase (empty for no passphrase):                           ## don't use a  passphrase
Enter same passphrase again:
Your identification has been saved in /home/tunnel/.ssh/id_rsa.
Your public key has been saved in /home/tunnel/.ssh/id_rsa.pub.
The key fingerprint is:
6f:30:b8:e1:36:49:74:b9:32:68:6e:bf:3e:62:d3:c2 tunnel@hosta

Now cat out the id_rsa.pub file which contains the public key that we will need to put on host b:

[tunnel@hosta ~]# cat /.ssh/id_rsa.pub
ssh-rsa blahAAAAB3NzaC1yc2EAAAABIwAAAQEA......6BEKKCxTIxgBqjLP tunnel@hosta

Now create a ‘tunnel’ user on Host B and save the public key for tunnel@hosta in the authorized_keys file

[root@hostb ~]# useradd -d /home/tunnel tunnel
[root@hostb ~]# passwd tunnel       ## Set a strong password
[root@hostb ~]# su - tunnel
[tunnel@hostb ~]# mkdir .ssh
[tunnel@hostb ~]# vi .ssh/authorized_keys   ## Now paste in the public key for tunnel@hosta

At this point you should be able to ssh from tunnel@hosta to tunnel@hostb without using a password. Depending on your configuration, you might need to allow the user ‘tunnel’ in /etc/ssh/sshd_config. You might also set some SSH options like the destination port in ~/.ssh/config.

Now, create this script as hosta:/home/tunnel/check_ssh_tunnel.sh

createTunnel() {
    /usr/bin/ssh -f -N -L13306:hostb:3306 -L19922:hostb:22 tunnel@hostb
    if [[ $? -eq 0 ]]; then
        echo Tunnel to hostb created successfully
    else
        echo An error occurred creating a tunnel to hostb RC was $?
    fi
}
## Run the 'ls' command remotely.  If it returns non-zero, then create a new connection
/usr/bin/ssh -p 19922 tunnel@localhost ls
if [[ $? -ne 0 ]]; then
    echo Creating new tunnel connection
    createTunnel
fi

Save that file and make it executable:

chmod 700 ~/check_ssh_tunnel.sh

This script will attempt to SSH to localhost port 19922 and run the ‘ls’ command. If that fails, it will attempt to create the SSH tunnel. The command to create the SSH tunnel will tunnel local port 13306 to port 3306 on hostb. You should modify that as necessary for your configuration. It will also create a tunnel for local port 19922 to port 22 on hostb which the script uses for testing the connection.

Now just add that script to the user ‘tunnel’s crontab to check every few minutes, and it will automatically create a tunnel and reconnect it if something fails. When it does create a new connection it will send an email to the ‘tunnel’ user, so you can create a .forward file to forward those messages to you.