Web Programming, Linux System Administation, and Entrepreneurship in Athens Georgia

Author: Brandon (Page 13 of 29)

Fixing a Corrupt MySQL Relay Log

As an extension of my post yesterday about skipping corrupt queries in the relay log, I found out that my problem is due to some network problems between the servers which triggers a MySQL bug.

The connection and replication errors in my MySQL log looks like this:

080930 12:26:52 [ERROR] Error reading packet from server: Lost connection to MySQL server 
  during query ( server_errno=2013)
080930 12:26:52 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, 
  log 'mysql-bin.000249' position 747239037
080930 12:26:53 [Note] Slave: connected to master 'replicate@mysqltunnel:13306',replication 
  resumed in log 'mysql-bin.000249' at position 747239037
080930 13:18:49 [ERROR] Error reading packet from server: Lost connection to MySQL server during
   query ( server_errno=2013)
080930 13:18:49 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 
  'mysql-bin.000249' position 783782486
080930 13:18:49 [ERROR] Slave: Error 'You have an error in your SQL syntax; check the manual 
  that corresponds to your MySQL server version for the right syntax to use near '!' at line 6' 
  on query. Default database: 'database'. Query: 'INSERT INTO `sometable`
            SET   somecol         = 3,
                    comeothercol  = 8,
                    othervalue      = NULL!', Error_code: 1064
080930 13:18:49 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and 
  restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000249' 
  position 783781942
080930 13:18:50 [Note] Slave: connected to master 'replicate@mysqltunnel:13306',
  replication resumed in log 'mysql-bin.000249' at position 783782486

When there are network problems between the server, there was some issue where the master didn’t properly detect and notify the slave of the failure. This resulted in parts of queries missing, duplicated, or replaced by random bits in the relay log on the slave. When the slave tries to execute the corrupt query, it will likely generate an error that begins with:

Error You have an error in your SQL syntax; check the manual that corresponds to 
  your MySQL server version for the right syntax to use near . . 

This bug has been fixed in MySQL releases since February 2008, but still hasn’t made its way into the CentOS 5 repositories. Until then, that bug report contained a work-around which forces the slave to re-request the binary log from the master. Run ‘SHOW SLAVE STATUS’ and make note of the Master_Log_File and Exec_Master_Log_Pos columns. Then run ‘STOP SLAVE’ to suspend replication, and run this SQL:

CHANGE MASTER TO master_log_file='<Value from Relay_Master_Log_File>',
  master_log_pos=<Value from Exec_master_log_pos>;

After that, simply run ‘START SLAVE’ to have replication pick up from there again. That evidently has the slave re-request the rest of the master’s binary log, which it should (hopefully) get without becoming corrupt, and replication will continue as normal.

I guess the network connection between my servers is problematic lately. I’ve had to fix this several times in the past couple days. If that keeps up, I may add this fix to my Replication checking-script until I’m able to upgrade to a version of MySQL that contains this fix.

Fix MySQL Replication by Skipping The Problematic Query

MySQL replication can be a bit fragile.  Most of the time replication fails when the slave tries to run an SQL statement that causes an error.   I’ve most frequently seen this when the slave is missing some table that I forgot to import when setting replication up.   Usually it is a simple enough job to create the table, then start the slave SQL thread again.

However, sometimes replication breaks for no particular reason at all. This morning, I had a slave stop because an invalid query somehow got written to the binary log.  I’m not quite sure how that happened, but fortunately I found a way to just skip the bad query.

mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)

mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
Query OK, 0 rows affected (0.00 sec)

mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

The ‘SQL_SLAVE_SKIP_COUNTER setting tells the slave SQL thread to skip that many queries when starting up. Note that you should really know what your database is doing and why it stopped before you just go running this command. It may cause your slave to get seriously out-of-sync with the master server.

Derby Startup Script

I was surprised when installing Derby for a customer, that it only provided a command to start derby running as a server from the terminal. I guess most users likely used it in embedded mode where the application runs derby itself. But surely there are people who would like the multi-user features of having it run as a standalone server.

Google searches didn’t find any suitable startup scripts either, so I wrote my own and figured it might be useful for others. Anybody who is interested can download it here. Simply save it as /etc/init.d/derby and create a ‘derby’ user before using it. It assumes that derby is installed in /usr/local/derby, so be sure to modify the first few lines to match your exact configuration.

Vacation (AutoReply) Message with Virtual Users and Postfix

I’ve previously written about both Virtual Mail users, and about enabling vacation messages for postfix. The next step was to get vacation working with virtual users.

My first thought was to try and make the sendmail ‘vacation’ program work with virtual users, but after digging into that a bit, it looked like more trouble than it was worth. I remembered that PostfixAdmin had some kind of support for this, so I decided to check it out, which proved a much more promising solution.

PostfixAdmin ships with a perl-based script that can be piped an email message and then will send an auto-reply to the sender. The script is able to grab a customized subject and message body from a MySQL database and then reply to senders as appropriate. It also keeps track of who it has auto-replied to so that each sender only gets one auto-reply in a given length of time.

The instructions for implementing it can be found at https://postfixadmin.svn.sourceforge.net/viewvc/postfixadmin/trunk/VIRTUAL_VACATION/INSTALL.TXT?view=markup. I found the documentation to be fairly straightforward.

Essentially, when a user enables the auto-reply, it adds an email address to the aliases table that points to user#[email protected]. You then configure postfix to send everything to the ‘autoreply.yourdomain.com’ domain to the vacation script, which then can read the original recipient’s address and respond as desired.

I now have this working on our hosted mail solution, so that RoundSphere mail customers now have auto-reply functionality. In addition, I made an addition to the webmail application (RoundCube) so that users can modify their vacation message themselves instead of having to have a mail administrator do it through postfixadmin.

Testing for Vulnerable Caching Name Servers

Most of the technical community has probably heard of the recently found DNS weakness.  The basic premise is that if a recursive nameserver doesn’t use sufficently random source ports when making recursive queries, it can be vulnerable to an attacker who is trying to poisen the cache, or fill it with incorrect data.

I’ve now heard reports about it from various news sources who make it sound much more drastic than it actually is.   Granted, it is a serious flaw, but fortunately most companies with any desire for security use SSL, which provides an additional layer for identity verification.  Also, for most any company with an IT staff, patching the DNS server with the required fixes should be a fairly trivial task.   The most important servers to be fixed are those run by ISPs and Datacenters, both of which should have their servers fixed by now.

Tools for testing your DNS servers are fairly easy to come by.  dns-oarc.net has a web-based test, although I don’t know how it discovers your DNS Servers.  For windows users, you can run ‘nslookup’ like this:

C:\Documents and Settings\Brandon>nslookup
Default Server:  cns.manassaspr.va.dc02.comcast.net
Address:  68.87.73.242

> set type=TXT
> porttest.dns-oarc.net
Server:  cns.manassaspr.va.dc02.comcast.net
Address:  68.87.73.242

Non-authoritative answer:
porttest.dns-oarc.net   canonical name = porttest.y.x.w.v.u.t.s.r.q.p.o.n.m.l.k.
j.i.h.g.f.e.d.c.b.a.pt.dns-oarc.net
porttest.y.x.w.v.u.t.s.r.q.p.o.n.m.l.k.j.i.h.g.f.e.d.c.b.a.pt.dns-oarc.net
text =

        "68.87.73.245 is GREAT: 26 queries in 2.3 seconds from 25 ports with std
 dev 16592"
>

To test from a linux machine, you can use dns-oarc’s test with dig like this:

root@server:~# dig porttest.dns-oarc.net in txt +short
porttest.y.x.w.v.u.t.s.r.q.p.o.n.m.l.k.j.i.h.g.f.e.d.c.b.a.pt.dns-oarc.net.
"72.249.0.34 is GREAT: 26 queries in 1.2 seconds from 26 ports with std dev 20533"

Your are looking for a response that contains GOOD or GREAT. If your results contain something else, you should notify your ISP or Data Center to have them fix their servers.

Compression for MySQL Replication

I have a MySQL database that does a fair number of updates and inserts. The server is replicated to an off-site server located across the country. With MySQL replication, any Insert, Update, or Delete statements are written to the binary log, then sent from the master server in San Jose to the slave in New York.

I noticed today that the slave server was falling behind the master and had trouble keeping up. I noticed that there was a sizable amount of bandwidth between the two servers and after investigating for a little while, determined that the bandwidth between the servers wasn’t sufficient to keep up with the replication.

We have applications running on the server in New York that were significantly behind or slow. After a bit of research, I found the slave_compressed_protocol setting in MySQL which allows the master and slave to compress the replication data between the two servers. After enabling that, the slave was able to catch up within a matter of minutes and has stayed caught up just fine. The bandwidth usage has dropped from a consistent 600 kb/s to around 20 kb/s.

Upon looking into MySQL replication, I also experimented with SSH compression since the replication goes through an SSH Tunnel. I had similar success with SSH compression as well.

Don’t Use Integers as Values in an Enum Field

I just got through fixing a messy problem where a database had a table defined with a couple columns that were ENUM’s with integer values.   This leads to extreme amounts of confusion, because there is a lot of ambiguity when doing queries whether the integer is supposed to be treated as the enumerated value, or as the key.

Imagine a table with a column defined as ENUM(‘0’, ‘1’, ‘2’, ‘3’).  When doing queries, if you try to do anything with that column, it is unclear whether you mean to use the actual value you pass in, or the position.  For example, if I as to say ‘WHERE confusing_column = 2’, it could be interpreted as either meaning the value ‘2’, or the item in the second position (ie; ‘1’).    It is even hard to explain because it is so confusing.

The MySQL Documentation does a decent job of explaining it.   I agree with their recommendation:

For these reasons, it is not advisable to define an ENUM column with enumeration values that look like numbers, because this can easily become confusing.

I ended up converting everything to Tinyint’s. It takes a few more bits per row, but worth it in my opinion to avoid the confusion.

SSL Certificate Notes

Whenever I create an SSL Certificate, I find myself going back and forth between several pages of notes.  I’m about to do this with a half-dozen certs that were generated on a Debian box with weak keys.  Here are the OpenSSL commands I find most useful:

Create a new key:

openssl genrsa -out MYDOMAIN.COM.key 2048

Remove the Pass Phrase from an existing key:

openssl rsa -in MYDOMAIN.COM.key.withpassword -out MYDOMAIN.COM.key

Create a Certificate Signing Request (CSR):

openssl req -new -key YOURDOMAIN.COM.key -out YOURDOMAIN.COM.csr

Inspect your CSR (or a previous one to copy values out of):

openssl req -noout -text -in MYDOMAIN.COM.csr

Self-Sign a Certificate

openssl x509 -req -days 3650 -in MYDOMAIN.COM.csr \
  -signkey MYDOMAIN.COM.key \
  -out MYDOMAIN.COM.crt

Inspect a certificate

openssl x509 -in MYDOMAIN.COM.crt -text | head -n 12

Hibernate Your Windows Machine With an Icon or a Script

This command can be used to hibernate a Windows machine from some kind of script:

rundll32 powrprof.dll,SetSuspendState

I have found this useful in a couple situations. One is where I have a computer that I like to have on during the day, but not at night. I configured the Bios to turn the computer on automatically each day at a specified time. And then I used windows task scheduler to run the hibernate command at a specified time at night.

The other situation is some problem on one of my machines where it takes literally three or four minutes to bring up the shut down box after clicking start =>shutdown. Instead of doing that, I just created a shortcut on my desktop that runs the hibernate command and it does that immediately

« Older posts Newer posts »

© 2025 Brandon Checketts

Theme by Anders NorenUp ↑