Web Programming, Linux System Administration, and Entrepreneurship in Athens, Georgia

Category: Linux System Administration

SSH Error: trying to get more bytes 4 than in buffer 0

I ran across this cryptic error message in an SSH log file today:

Feb  7 03:33:21 hostname sshd[19439]: error: buffer_get_ret: trying to get more bytes 4 than in buffer 0
Feb  7 03:33:21 hostname sshd[19439]: fatal: buffer_get_int: buffer error

The problem is actually a corrupt line in a user’s ~/.ssh/authorized_keys file. This user had copy/pasted a new key into his authorized_keys file, and it had a newline after the ‘ssh-rsa’ prefix. Strangely enough, people were still able to authenticate if their key was above the corrupted line. Users whose keys were listed below the corrupt line were not able to log in.
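
Each key in authorized_keys must be on a single line of its own. The corrupt file looked something like this (keys shortened for illustration):

    ssh-rsa AAAAB3NzaC1yc2EAAAADAQ...== user1@desktop    (fine: above the bad entry, still works)
    ssh-rsa
    AAAAB3NzaC1yc2EAAAABIwAAAQ...== user2@laptop         (corrupt: newline after 'ssh-rsa')
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQ...== user3@server     (below the corrupt line, cannot log in)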

Save Internet Audio Streams to MP3s

I’ve got a couple of radio programs that I like to listen to. The only problem is that I’m rarely able to listen to them live. I was wishing that somebody made a good DVR-like device for the radio, but after some thought I figured out a way to do it on my own using the internet audio streams that most radio stations now make available.

Googling for instructions on how to save internet audio streams returns a lot of semi-workable, but mostly garbage, instructions. The best set of instructions I found was at Instructables.com, where the basic concept is to use the command-line mplayer to save the stream as a wave file, then use lame to convert it to an MP3.
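
At its simplest, that concept boils down to something like the following, where the stream URL, file paths, and one-hour duration are all placeholders:

    # Dump the stream to a wav file (runs in the background)
    mplayer -really-quiet -vo null -vc null \
        -ao pcm:file=/tmp/stream.wav "http://example.com/live-stream" &

    # Stop the capture after an hour
    sleep 3600 && kill $!

    # Encode the wav file to an MP3, then remove the large wav
    lame /tmp/stream.wav /tmp/program.mp3
    rm /tmp/stream.wav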

The Instructables tutorial had several downfalls, though. First, it is not able to stop mplayer on its own, so it uses a second cron job to kill the original mplayer command – a little too crude for my taste. Secondly, and more importantly, you have to know the exact stream URL, which is not easy to identify from most internet radio websites. They tend to hide the actual stream source behind layers of JavaScript so that their web-based players can synchronize ads and such while listening to the streams.

I created some PHP code that automates this process and makes it pretty simple. The basic streamsave class provides functions for downloading the wave file and converting it to an MP3. I then extend that class for specific radio stations that I want to save. The extended class provides functions that wade through all of the JavaScript garbage to get to the actual stream source.

Using those classes, this simple script now saves my stream to an MP3 file and emails me the location when it is done:

<?php
require_once dirname(__FILE__).'/ss_640wgst.class.php';

$streamsave = new ss_640wgst();

$streamsave->stream_url = $streamsave->getStreamURL();
$streamsave->seconds = 60 * 60; // One hour

// This saves the stream to a temporary wav file
$streamsave->save_stream();

// Now encode it to an mp3
$output_file = "/tmp/some_directory/some_program_".date('Y-m-d-His').'.mp3';
$streamsave->encode_to_mp3($output_file);

// Delete the large wav file
unlink($streamsave->wavfile);

// And tell me that the file was saved
echo "File saved (if all went okay) to {$streamsave->mp3file}\n\n";
mail('[email protected]', 'Audio File Saved', "File saved to {$streamsave->mp3file}");

// It would be cool to create a podcast XML file here that contains your new file

?>

Downloads

The abstract class file: streamsave.class.php
The extended class specifically for 640 WGST in Atlanta: ss_640wgst.class.php

I’ve created extended classes for the stations that are useful to me. If there seems to be any interest, I can develop this a bit more to generalize it and make it work for more radio stations.

MySQLDump To a Remote Server

I was running out of disk space on a server today. The server had a large database table that was no longer used, so I wanted to archive it and then drop the table. But the server didn’t have enough disk space to dump it out to disk before copying it off to a remote server for archiving.

My first thought was to run mysqldump on the destination machine and have it access the database over the network. That, however, doesn’t compress or encrypt the data. Plus, I would have had to create a MySQL user with permission to access the database remotely.

The solution I came up with worked out well: mysqldump directly to the remote host with this command:

mysqldump <DATABASE_NAME> [mysqldump options] | gzip -c | ssh user@remotehost "cat > /path/to/some-file.sql.gz"

That pipes the mysqldump output through gzip, then through an SSH connection. SSH on the remote side runs the ‘cat’ command to read stdin, then redirects it to the actual file where I want it saved.
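
Restoring from that archive later is just the reverse of the pipeline (run on the remote host; the database name is a placeholder):

    gunzip -c /path/to/some-file.sql.gz | mysql <DATABASE_NAME>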

Preparing WordPress for a Large Traffic Spike

The Hallmark Hall of Fame Movie ‘Front of the Class’ premiered this past weekend with an expected 12-15 million viewers.  We have been preparing the website (ClassPerformance.com) for the event. We expected a significant number of visitors to the website in the 24-48 hours after the movie aired, so I did a number of things to ensure that the site would be able to run without incident during this critical time.

  1. Move temporarily to a higher-powered server.

    The site is normally hosted on an inexpensive shared-hosting plan. I’ve run some shared-hosting servers before and don’t have much faith that they would handle any significant amount of load. They also usually don’t allow you to configure some of the Apache settings that I was planning on using below.

  2. Serve images and other static content from an alternate location.

    I set up a domain alias of ‘static.classperformance.com’ pointed to the same DocumentRoot as the main site. Then I edited the template files to serve most of the background, header, and footer images from that location. For normal usage, serving them from the same server works fine, but this allows the flexibility to move that static content to a separate server if/when it is needed.

    I also copied the entire website to a second server and configured it so that at any time I could change DNS to point ‘static.classperformance.com’ to the second server in order to reduce the bandwidth load on the primary server.
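
    The vhost alias might look roughly like this (the surrounding directives are assumed, not from the original configuration):

        ## Serve the static alias from the same DocumentRoot as the main site
        <VirtualHost *:80>
            ServerName www.classperformance.com
            ServerAlias static.classperformance.com
            DocumentRoot /home/classperformance.com/www
        </VirtualHost>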

  3. Generate static pages wherever possible.

    I used wget to download everything, and then deleted the pages that need to be parsed through PHP (i.e., contact forms, etc.). Most of the pages don’t change from visitor to visitor, so this can be done for the home page, all of the blog posts, and most other pages. This significantly reduces the overhead of database queries, as well as the general overhead of running PHP and including multiple files.
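
    The mirroring step might look something like this (the exact wget flags and destination path are assumptions):

        ## Mirror the site, including images/CSS, into the 'rendered' directory
        wget --mirror --page-requisites --no-host-directories \
             --directory-prefix=/home/classperformance.com/www/rendered \
             http://www.classperformance.com/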

    I then added this to my Apache configuration to tell the web server to use the static content if it exists:

        ## Serve static content for files that exist
        RewriteCond /home/classperformance.com/www/rendered/%{REQUEST_URI} -f
        RewriteRule (.*) /rendered/$1 [L]

        ## For requests without an extension, wget has saved those files as 'index.html'
        ## so the rewrite rule needs to reflect that:
        RewriteCond /home/classperformance.com/www/rendered/%{REQUEST_URI} -d
        RewriteRule (.*) /rendered/$1/index.html [L]
    

    I did some performance tests with ApacheBench, and serving the static content had a dramatic effect on the speed and the number of concurrent users. There is probably a more elegant way to configure mod_cache to do a similar thing in a more automated fashion, but this was quick and easy, and I didn’t have to worry about checking the various HTTP headers. In my opinion, this was the single most effective change. By serving static content, Apache also correctly handles many of the HTTP headers that enable effective caching (ETags, Expires, Last-Modified, etc.).

  4. Install a PHP accelerator.

    I’ve previously written about how easy and effective eAccelerator is to install. There are very few scenarios where it is not effective. Again, ApacheBench tests easily showed a huge increase in the number of concurrent requests when eAccelerator was enabled.

  5. Check Apache settings.

    On a vanilla CentOS install, Apache has ServerLimit set to 256. By serving primarily static content, you will likely reduce the amount of memory that each Apache child requires, leaving memory for more children. I did some quick math and figured that I could run around 800 children before memory became a concern. I also enabled KeepAlive with a very short (1 second) KeepAliveTimeout so that sequential requests from the same user don’t have to re-create TCP sessions.
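
    Those settings would look roughly like this (illustrative values for the prefork MPM, not the exact production config):

        ServerLimit       800
        MaxClients        800
        KeepAlive         On
        KeepAliveTimeout  1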

    Also, by serving static content, I found that WordPress had been handling the 301 redirect from the non-www version of the site to the correct URL. I moved that into Apache with this directive:

        ## Rewrite to the desired domain name
        RewriteCond %{HTTP_HOST} !^www\.classperformance\.com [NC]
        RewriteCond %{HTTP_HOST} !^static\.classperformance\.com [NC]
        RewriteRule ^/(.*) https://www.classperformance.com/$1 [L,R=301]
    
  6. Enable server-side compression.

    The default Apache install doesn’t compress any content. I configured mod_deflate to compress the static content and thus reduce the bandwidth usage. Compression should easily cut the bandwidth for HTML and CSS files in half, and can reduce it to as little as one tenth. This not only reduces your bandwidth bill, but since the 100Mbps switch port is potentially a bottleneck, it allows more concurrent users if traffic approaches that limit (and it may have, had I not enabled compression).
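
    A minimal mod_deflate configuration (assuming the module is already loaded) can be as simple as:

        ## Compress text-based content types
        AddOutputFilterByType DEFLATE text/html text/plain text/css text/javascript application/javascript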

  7. Set up some monitoring.

    I installed MRTG with some basic graphs, configured Apache so that I could view its server-status page, and installed iftop to get a real-time view of bandwidth usage.

With all of these changes, I’m happy to report that we had tens of thousands of visitors during and shortly after the show, and everything ran perfectly. I had the static content running on a separate server for the busiest time, and combined bandwidth usage peaked at around 90 Mbps shortly after the end of the show.

‘Maintenance’ Pages via Apache mod_rewrite

Occasionally, I’ve found it useful to put up a maintenance page while performing some work on a website. It may be useful if you are debugging and want to ensure that regular visitors don’t see any application generated error messages or blank pages or anything.

This method uses mod_rewrite to redirect all requests to a maintenance page that you create. Since the rewrite conditions match on the visitor’s IP address, you can exempt your own address and continue using the site normally while everyone else sees the maintenance page.

First, create maint.html with a message that you want to display to your users. Then add the section below to your Apache configuration to redirect users to that page. Obviously, you’ll need to substitute your own IP address, and you can add multiple lines to include multiple users if necessary. The configuration essentially says that requests not from your IP addresses (notice the exclamation points) will be rewritten to /maint.html, and that this is the last rewrite rule processed for the request.

  ##### Maintenance section
  ## Uncomment and add your IP address for performing maintenance
  ## Add multiple addresses on multiple lines if necessary
  RewriteCond %{REMOTE_ADDR} !^11\.22\.33\.44$
  RewriteCond %{REMOTE_ADDR} !^1\.1\.1\.1$
  ## Don't rewrite the maintenance page itself, which would cause a loop
  RewriteCond %{REQUEST_URI} !^/maint\.html$
  RewriteRule . /maint.html [L]
  ##### End Maintenance section

ddrescue Saves the Day

I had a very bad thing happen to my regular work PC last week. I use a Windows PC for my normal desktop machine, and when I turned it on one morning it refused to boot up. After several attempts, it became obvious that the hard drive was dying and wouldn’t last much longer. I have most of my irreplaceable files backed up to Amazon S3 via JungleDisk, but it is still a huge pain to reinstall an operating system and all of my applications and try to get back to a working system.

Fortunately, at a recent CALUG meeting, we had Barry Grundy give a presentation on data recovery. Barry is the author of LinuxLEO – a pretty comprehensive document about data forensics using open-source tools. In his presentation he covered a number of open-source tools that are super-useful for recovering raw data and then making sense of it. The main tool that I found useful was GNU ddrescue, a variant of dd created specifically to retrieve as much data as possible from a failing drive.

ddrescue works by reading data from the drive. When it encounters a bad sector, it skips forward a ways and tries to read again. If it is unsuccessful, it skips forward a larger amount and continues until it is able to read something successfully. That process repeats until it has gone through the whole drive, retrieving as much good data as possible. You can then run it again with different parameters to go back and retry the error areas and recover as much of the questionable data as possible.
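
The invocations look roughly like this (the device and paths are placeholders):

    # First pass: grab the easily readable data, skipping bad areas quickly
    ddrescue -n /dev/sda /mnt/rescue/disk.img /mnt/rescue/rescue.log

    # Later passes: use the log file to go back and retry the bad areas
    ddrescue -r 3 /dev/sda /mnt/rescue/disk.img /mnt/rescue/rescue.log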

The drive that was failing was a 160 GB SATA drive that was over 4 years old. The first round with ddrescue looked pretty bad – around 25% of the drive was bad or questionable. After some experimenting to figure out the ideal parameters, and a few passes through the entire drive, I ended up recovering the entire drive minus just 110 KB of bad sectors.

At that point I had almost all of my data, but I wasn’t able to boot off of the drive. There were some problems with the master boot record, and the NTFS volume was corrupt and wouldn’t mount cleanly. I ended up attaching the drive to a working machine so that I could run chkdsk on it, which solved the NTFS corruption problems. I had to work around quite a few other problems, but eventually was able to restore it to the point where it booted just fine.

Fixing a Corrupt MySQL Relay Log

As an extension of my post yesterday about skipping corrupt queries in the relay log, I found out that my problem was due to some network problems between the servers, which trigger a MySQL bug.

The connection and replication errors in my MySQL log look like this:

080930 12:26:52 [ERROR] Error reading packet from server: Lost connection to MySQL server 
  during query ( server_errno=2013)
080930 12:26:52 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, 
  log 'mysql-bin.000249' position 747239037
080930 12:26:53 [Note] Slave: connected to master 'replicate@mysqltunnel:13306',replication 
  resumed in log 'mysql-bin.000249' at position 747239037
080930 13:18:49 [ERROR] Error reading packet from server: Lost connection to MySQL server during
   query ( server_errno=2013)
080930 13:18:49 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 
  'mysql-bin.000249' position 783782486
080930 13:18:49 [ERROR] Slave: Error 'You have an error in your SQL syntax; check the manual 
  that corresponds to your MySQL server version for the right syntax to use near '!' at line 6' 
  on query. Default database: 'database'. Query: 'INSERT INTO `sometable`
            SET   somecol         = 3,
                    comeothercol  = 8,
                    othervalue      = NULL!', Error_code: 1064
080930 13:18:49 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and 
  restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000249' 
  position 783781942
080930 13:18:50 [Note] Slave: connected to master 'replicate@mysqltunnel:13306',
  replication resumed in log 'mysql-bin.000249' at position 783782486

When there are network problems between the servers, there is an issue where the master doesn’t properly detect and notify the slave of the failure. This results in parts of queries being missing, duplicated, or replaced by random bits in the relay log on the slave. When the slave tries to execute the corrupt query, it will likely generate an error that begins with:

Error You have an error in your SQL syntax; check the manual that corresponds to 
  your MySQL server version for the right syntax to use near . . 

This bug has been fixed in MySQL releases since February 2008, but the fix still hasn’t made its way into the CentOS 5 repositories. Until it does, that bug report contains a work-around which forces the slave to re-request the binary log from the master. Run ‘SHOW SLAVE STATUS’ and make note of the Relay_Master_Log_File and Exec_Master_Log_Pos columns. Then run ‘STOP SLAVE’ to suspend replication, and run this SQL:

CHANGE MASTER TO master_log_file='<Value from Relay_Master_Log_File>',
  master_log_pos=<Value from Exec_Master_Log_Pos>;

After that, simply run ‘START SLAVE’ to have replication pick up from there again. That evidently has the slave re-request the rest of the master’s binary log, which it should (hopefully) get without becoming corrupt, and replication will continue as normal.

I guess the network connection between my servers has been problematic lately; I’ve had to apply this fix several times in the past couple of days. If that keeps up, I may add it to my replication-checking script until I’m able to upgrade to a version of MySQL that contains the fix.

Derby Startup Script

I was surprised, when installing Derby for a customer, that it only provides a command to start Derby as a server from the terminal. I guess most users run it in embedded mode, where the application runs Derby itself, but surely there are people who want the multi-user features of having it run as a standalone server.

Google searches didn’t find any suitable startup scripts either, so I wrote my own and figured it might be useful for others. Anybody who is interested can download it here. Simply save it as /etc/init.d/derby and create a ‘derby’ user before using it. It assumes that Derby is installed in /usr/local/derby, so be sure to modify the first few lines to match your exact configuration.
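
On a CentOS-style system, that installation looks roughly like this (the downloaded filename ‘derby.init’ is a placeholder):

    useradd -r derby                  # create the 'derby' user the script runs as
    cp derby.init /etc/init.d/derby   # install the startup script
    chmod +x /etc/init.d/derby
    chkconfig --add derby             # register it to start at boot
    service derby start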

Vacation (AutoReply) Message with Virtual Users and Postfix

I’ve previously written about both Virtual Mail users, and about enabling vacation messages for postfix. The next step was to get vacation working with virtual users.

My first thought was to try to make the sendmail ‘vacation’ program work with virtual users, but after digging into that a bit, it looked like more trouble than it was worth. I remembered that PostfixAdmin had some kind of support for this, so I decided to check it out, and it proved a much more promising solution.

PostfixAdmin ships with a Perl-based script that can be piped an email message and will then send an auto-reply to the sender. The script grabs a customized subject and message body from a MySQL database and replies to senders as appropriate. It also keeps track of who it has auto-replied to, so that each sender only gets one auto-reply in a given length of time.

The instructions for implementing it can be found at https://postfixadmin.svn.sourceforge.net/viewvc/postfixadmin/trunk/VIRTUAL_VACATION/INSTALL.TXT?view=markup. I found the documentation to be fairly straightforward.

Essentially, when a user enables the auto-reply, it adds an email address to the aliases table that points to user#[email protected]. You then configure Postfix to send everything addressed to the ‘autoreply.yourdomain.com’ domain to the vacation script, which can then read the original recipient’s address and respond as desired.
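
The Postfix side of that routing looks roughly like the following; the paths and values mirror the PostfixAdmin instructions, but treat the specifics as assumptions to verify against INSTALL.TXT:

    # /etc/postfix/main.cf: enable a transport map
    #   transport_maps = hash:/etc/postfix/transport

    # /etc/postfix/transport: route the autoreply domain to the vacation transport
    autoreply.yourdomain.com   vacation:

    # /etc/postfix/master.cf: define the vacation transport as a pipe to the script
    vacation  unix  -  n  n  -  -  pipe
      flags=Rq user=vacation argv=/var/spool/vacation/vacation.pl -f ${sender} -- ${recipient}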

I now have this working on our hosted mail solution, so RoundSphere mail customers now have auto-reply functionality. I also extended the webmail application (RoundCube) so that users can modify their vacation messages themselves instead of having a mail administrator do it through PostfixAdmin.

Testing for Vulnerable Caching Name Servers

Most of the technical community has probably heard of the recently found DNS weakness. The basic premise is that if a recursive nameserver doesn’t use sufficiently random source ports when making recursive queries, it can be vulnerable to an attacker who is trying to poison the cache, or fill it with incorrect data.

I’ve now heard reports about it from various news sources that make it sound much more drastic than it actually is. Granted, it is a serious flaw, but fortunately most companies with any desire for security use SSL, which provides an additional layer of identity verification. Also, for almost any company with an IT staff, patching the DNS server with the required fixes should be a fairly trivial task. The most important servers to fix are those run by ISPs and datacenters, both of which should have their servers patched by now.

Tools for testing your DNS servers are fairly easy to come by. dns-oarc.net has a web-based test, although I don’t know how it discovers your DNS servers. Windows users can run ‘nslookup’ like this:

C:\Documents and Settings\Brandon>nslookup
Default Server:  cns.manassaspr.va.dc02.comcast.net
Address:  68.87.73.242

> set type=TXT
> porttest.dns-oarc.net
Server:  cns.manassaspr.va.dc02.comcast.net
Address:  68.87.73.242

Non-authoritative answer:
porttest.dns-oarc.net   canonical name = porttest.y.x.w.v.u.t.s.r.q.p.o.n.m.l.k.
j.i.h.g.f.e.d.c.b.a.pt.dns-oarc.net
porttest.y.x.w.v.u.t.s.r.q.p.o.n.m.l.k.j.i.h.g.f.e.d.c.b.a.pt.dns-oarc.net
text =

        "68.87.73.245 is GREAT: 26 queries in 2.3 seconds from 25 ports with std
 dev 16592"
>

To test from a linux machine, you can use dns-oarc’s test with dig like this:

root@server:~# dig porttest.dns-oarc.net in txt +short
porttest.y.x.w.v.u.t.s.r.q.p.o.n.m.l.k.j.i.h.g.f.e.d.c.b.a.pt.dns-oarc.net.
"72.249.0.34 is GREAT: 26 queries in 1.2 seconds from 26 ports with std dev 20533"

You are looking for a response that contains GOOD or GREAT. If your results contain something else, you should notify your ISP or datacenter to have them fix their servers.
