Redirecting WordPress ‘numeric’ permalinks

When I set up my blog, I configured it to use the ‘numeric’ permalinks that look something like http://www.brandonchecketts.com/archives/61

Of course, that is pretty straightforward, but it isn’t very readable, and we can probably get a few extra SEO points by putting the post’s title in the URL. However, just changing the permalink format is a bad idea since I have a bunch (okay, a few) incoming links that I don’t necessarily want to break.

So, I wrote a quick WordPress Plugin that redirects these old numeric links to my new format. Simply create a file named ‘numeric_permalink_redirect.php’ in your wp-content/plugins directory with this content:

<?php
/*
Plugin Name: BC Rewriter
Plugin URI: http://www.brandonchecketts.com/
Description: Redirect old requests to new permalink format
Author: Brandon Checketts
Version: 1.5
Author URI: http://www.brandonchecketts.com/
*/

// redirect old "numeric" type archives to our current permalink structure
function check_numeric_permalink()
{
  if (preg_match("#^/archives/([0-9]+)#", $_SERVER['REQUEST_URI'], $matches)) {
    $post_id = $matches[1];
    $url = get_permalink($post_id);
    header("HTTP/1.1 301 Moved Permanently");
    header("Location: $url");
    exit;
  }
}

add_action('init', 'check_numeric_permalink');

?>

That does a 301 Permanent Redirect to the new URL, so you shouldn’t lose any visitors, and the search engines should update their indexes to point at the new permalinks.

Multi-threaded perl

I’ve been experimenting with multi-threading in Perl for a new project, and I’m impressed with how straightforward it is. Before digging into it, I had never really considered doing anything with it because it always seemed kind of ‘mysterious’ to me. Now I’m seeing how useful it is to have multiple threads that are able to share variables.

In the application I’m rewriting, I used to have one script that listened for network data and saved it out to a file. I had another script that read through the output files and inserted the data into a database. Now, with a multi-threaded program, I just have one thread that listens and another thread (or several) that parses the data and manipulates it however I want. In this case, that saves a lot of disk activity and makes the program more efficient and straightforward.
I’m also able to use the Thread::Queue module to create a queue that the listener thread adds to, and then have ‘worker’ threads that go through the data and format/summarize/do whatever I’m going to do with it.
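Here’s a minimal sketch of that pattern, assuming a listener that just enqueues some made-up lines instead of real network data:

#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Shared queue: the listener enqueues items, the workers dequeue them
my $queue = Thread::Queue->new();

# Listener thread: the real program would read from the network here;
# this one enqueues sample lines, then one undef per worker to say "done"
my $listener = threads->create(sub {
    $queue->enqueue("sample line $_") for 1 .. 10;
    $queue->enqueue(undef) for 1 .. 2;
});

# Worker threads: parse/summarize/store each item as it arrives
my @workers = map {
    threads->create(sub {
        while (defined(my $item = $queue->dequeue())) {
            print "thread ", threads->tid(), " processed: $item\n";
        }
    });
} 1 .. 2;

$_->join() for ($listener, @workers);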

I’m looking forward to seeing how this all works out.  I’m impressed so far.

The coolest, most efficient way to copy a directory between servers

I was recently quizzed about the quickest, most efficient way to copy an entire directory between servers. I typically do this by tar’ing it up on one server, copying it to the other, then extracting it. However, this has a couple of obvious problems. One is that it requires large chunks of disk space to hold the archive file on both the source and the destination; if you are low on disk space, this can be a real pain. The other is that it’s a waste of time, since it reads through all of the files three times (read, copy, extract).

The original thought I had was to use “scp -r” which will recursively copy a directory over to the destination. This, however, doesn’t copy directories that start with a dot, and it doesn’t preserve file ownership information.

The best way is to use a combination of tar and ssh. The idea is to tar the files up to STDOUT, then create an SSH session to the remote host and extract from STDIN. Once you’ve got the idea, the command is pretty simple:

tar -cvzf - /path/to/local/dir | ssh root@remotebox "cd /path/to/extract; tar -xvzf -"

That’s it. One simple command and it can save you tons of time creating, copying, and extracting by doing it all at once.

Cacti stops updating graphs after upgrade to version 0.8.6j

It turns out that the latest update to Cacti, the popular SNMP and RRDTool graphing program, has a bug that keeps graphs based on SNMP data from being updated after the upgrade. The problem has to do with the PHP “snmpgetnext” function, which is unimplemented in PHP 4.

There is a discussion on Cacti’s forum at http://forums.cacti.net/about19199.html  where a developer posts a new ping.php that will resolve the problem.

Internet Explorer Oddities

I spent about an hour debugging a dumb behavior in Internet Explorer. The problem site stores some session data in PHP’s $_SESSION variable for display later. The form uses parameters from the GET request to populate some data in the user’s $_SESSION. Upon trying to retrieve the data on a subsequent page, though, it was missing or incorrect… but only in Internet Explorer.

The failure of PHP sessions is typically a server-side problem, so it didn’t make sense that the browser was causing it. I spent a while verifying that the sessions were, in fact, working properly in all of the different browsers, but that still didn’t explain the problem.
The odd behavior shows up when a page has an image tag with a blank “src” parameter. This causes most browsers to try to fetch the current page, but Internet Explorer tries to fetch the parent directory.
For example, if your page is at http://www.somesite.com/somedirectory/somepage.php, most browsers will try to fetch that URL for images with a blank src parameter. Internet Explorer, though, will try to fetch http://www.somesite.com/somedirectory/

Neither case is really what one would expect. I would think that without a destination, it wouldn’t try to fetch anything. Attempting to fetch the page that called it (obviously not a graphic) or the parent directory (why would it do that?) doesn’t really make any sense.

In this case, fetching the parent directory hit my problem script, since it was the DirectoryIndex (index.php). Calling the script without any parameters erased the saved variable that I was looking for, so the subsequent page I hit was missing it.
I guess the moral of the story is not to leave images with a blank src parameter, because it will do weird things.
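The trigger is nothing more exotic than a page like this (a made-up example, not the actual client’s page):

<html>
  <body>
    <p>Some page that relies on data saved in $_SESSION...</p>
    <!-- the empty src below is what sends IE off to re-request the parent directory -->
    <img src="" alt="placeholder" />
  </body>
</html>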

I’ve written up a sample page at http://www.brandonchecketts.com/examples/ie_blank_img_src_tag/index.php to demonstrate what I’m talking about.

ldssd.org is now live

I’ve spent the past few days working on a new website at ldssd.org. The site has most of the LDS Scriptures available online and can generate an RSS feed that delivers one chapter to you each day. The site still has a couple of small issues that should be fixed soon, but I wanted to make sure it was ‘officially’ launched today, in time for people (me) to keep their New Year’s resolution to read the scriptures each day.

Database encryption made easy

I’ve always wondered how one would securely store sensitive information in a MySQL database. A recent project has given me the opportunity to work on it, and I’ve been impressed with how easy it is to implement. MySQL provides an easy interface for encrypting data before storing it in the database: simply use the AES_ENCRYPT and AES_DECRYPT functions when writing to or reading from a table.

Simply make your column a BLOB field, then use something like this to write to the table (using PEAR::DB syntax):

$db->query("
UPDATE sometable
SET    some_col = AES_ENCRYPT( ?, ?)
WHERE something_else = ?
" array( $sensitive_value, $encryption_key, $index));

and something like this to read it back out:

$value = $db->getOne("
SELECT AES_DECRYPT( some_col, ?)
FROM   sometable
WHERE something_else = ?
", array( $encryption_key, $index));

There’s no reason not to use mod_deflate

I’ve been trying for about a month to convince one of our larger customers to install mod_deflate on their server. They had concerns about compatibility with older browsers and the possibility that it would affect their PageRank, but I finally put enough pressure on for them to let me try it. They have very few users with old browsers (and really, if somebody is running a browser that archaic, how likely are they to buy something on your site?), and I convinced them that there should be no SEO consequences from the change (if anything, the search engines will respect your site more for using less bandwidth and having a knowledgeable administrator).

Early this morning I got it installed (it took all of about 15 minutes), and it’s running great. It’s compressing HTML and JavaScript files to about 20% of their original size, which equates to significant bandwidth savings and quicker page-load times. About 60% of the total bandwidth used on this site is for HTML and JavaScript files (the other 40% is images, movies, and a few other odds and ends). Overall, it looks like about a 30-40% drop in total bandwidth usage, which is very significant. I’ve heard of no problems with browser compatibility either, so everybody is happy.

Overall, I’d say that there is no good reason not to use mod_deflate on your site. Especially if you ever get charged for bandwidth overages.

Here are some useful resources for installing and gathering statistics on mod_deflate:

Awstats – Detailed Web-Based statistics package
Perl-based mod_deflate statistics utility
Apache’s mod_deflate documentation
Firefox plugin to view the HTTP response headers (and lots of other useful stuff)

And here’s a sample Apache configuration section that I picked up from somewhere. I just save this in /etc/httpd/conf.d/deflate.conf and restart Apache, and you are good to go. (This requires that mod_deflate is already compiled and installed; the actual file location may vary depending on your OS, but this works for Red Hat derivatives.)

### Enable mod_deflate to compress output
# Insert filter
SetOutputFilter DEFLATE

# Netscape 4.x has some problems...
BrowserMatch ^Mozilla/4 gzip-only-text/html

# Netscape 4.06-4.08 have some more problems
BrowserMatch ^Mozilla/4\.0[678] no-gzip

# MSIE masquerades as Netscape, but it is fine
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

## Don't compress for IE5.0
BrowserMatch "MSIE 5.0" no-gzip

# Don't compress images
SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png|swf)$ no-gzip dont-vary

# Make sure proxies don't deliver the wrong content
Header append Vary User-Agent env=!dont-vary

## Log some stuff for mod_deflate stats
DeflateFilterNote Input instream
DeflateFilterNote Output outstream
DeflateFilterNote Ratio ratio

LogFormat '"%r" %{outstream}n/%{instream}n (%{ratio}n%%)' deflate
## end mod_deflate stats

#### END mod_deflate configuration

So much for good database design…

I’m installing and modifying WordPress MU (Multi-User) for a client and am amazed at the poor database design. For each blog you set up, it generates 8 new tables that are identical in structure to the same 8 tables it creates for every other blog. This is extremely poor design by any database design standard. Despite that, they do have a good reason for doing it. Quoted from http://mu.wordpress.org/faq/:

Does it scale? (Also: The way you do your databases and tables doesn’t scale!)

WordPress MU creates tables for each blog, which is the system we found worked best for plugin compatibility and scaling after lots of testing and trial and error. This takes advantage of existing OS-level and MySQL query caches and also makes it infinitely easier to segment user data, which is what all services that grow beyond a single box eventually have to do. We’re practical folks, so we’ll use whatever works best, and for the 400k and counting on WordPress.com, MU has been a champ.

The main reason for doing this is that it makes compatibility with existing WordPress plugins much easier. I guess the real source of the problem was poor planning and foresight during the development of the original WordPress application.  They claim it works well, so even though I cringe every time I see it, I guess I’ll just have to live with it and complain.