Category: PHP (Page 2 of 3)

PHP Wrapper Class for a Read-only database

January 4, 2010 / Brandon / 0 Comments

This is a pretty special case of a database wrapper class where I wanted to discard any updates to the database, but want SELECT queries to run against an alternative read-only database. In this instance, I have a planned outage of a primary database server, but would like the public-facing websites and web services to remain as accessible as possible.

I wrote this quick database wrapper class that will pass all SELECT queries on to a local replica of the database, and silently discard any updates. On this site almost all of the functionality still works, but it obviously isn’t saving and new information while the primary database is unavailable.

Here is my class. This is intended as a wrapper to an ADOdb class, but it is generic enough that I think it would work for many other database abstraction functions as well as seamless data pump.

class db_unavailable {
    var $readonly_db;

    function __construct($readonly_db)
    {
        $this->query_db = $readonly_db;
    }

    function query($sql)
    {
        $args = func_get_args();
        if (preg_match("#(INSERT INTO|REPLACE INTO|UPDATE|DELETE)#i", $args[0])) {
            // echo "Unable to do insert/replace/update/delete query: $sql\n";
            return true;
        } else {
            return call_user_func_array(array($this->readonly_db, 'query'), $args);
        }
    }

    function __call($function, $args)
    {
        return call_user_func_array(array($this->readonly_db, $function), $args);
    }
}

I simply create my $query_db object that points to the read-only database. Then create my main $db object as a new db_unavailable() object. Any select queries against $db will behave as they normally do, and data-modifying queries will be silently discarded.

LUG Presentation on SQL Basics

December 17, 2009 / Brandon / 0 Comments

I gave a presentation tonight at my local Linux Users Group meeting on SQL Basics. I had a fun time preparing the presentation and made up a bunch of examples having to do with Santa’s database.

It started out with a simple table for kids who were either naughty or nice. We then added some reports to that. Then imported kids’ wish lists from CSV files. From there we were able to generate some manufacturing reports for the workshop.

When we joined the wish list table with the kids table, we were then able to generate a sleigh-loading report which included only gifts for kids who had been good. Then we got even more complicated and introduced several joins with some complicated mathematics to select gifts for kids within a certain radius from a given zip code.

The presentation is available for download here. And Brian recorded part of the presentation which is available to view on uStream.tv

PHP Code to Sign any Amazon API Requests

June 30, 2009 / Brandon / 22 Comments

Starting next month, any requests to the Amazon Product Advertising API need to be cryptographically signed. Amazon has given about three months notice and the deadline is quickly approaching. I use the Amazon web services on several sites and came up a fairly generic way to convert an existing URL to a signed URL. I’ve tested with several sites and a variety of functions, and this is working well for me so far:

function signAmazonUrl($url, $secret_key)
{
    $original_url = $url;

    // Decode anything already encoded
    $url = urldecode($url);

    // Parse the URL into $urlparts
    $urlparts       = parse_url($url);

    // Build $params with each name/value pair
    foreach (split('&', $urlparts['query']) as $part) {
        if (strpos($part, '=')) {
            list($name, $value) = split('=', $part, 2);
        } else {
            $name = $part;
            $value = '';
        }
        $params[$name] = $value;
    }

    // Include a timestamp if none was provided
    if (empty($params['Timestamp'])) {
        $params['Timestamp'] = gmdate('Y-m-d\TH:i:s\Z');
    }

    // Sort the array by key
    ksort($params);

    // Build the canonical query string
    $canonical       = '';
    foreach ($params as $key => $val) {
        $canonical  .= "$key=".rawurlencode(utf8_encode($val))."&";
    }
    // Remove the trailing ampersand
    $canonical       = preg_replace("/&$/", '', $canonical);

    // Some common replacements and ones that Amazon specifically mentions
    $canonical       = str_replace(array(' ', '+', ',', ';'), array('%20', '%20', urlencode(','), urlencode(':')), $canonical);

    // Build the sign
    $string_to_sign             = "GET\n{$urlparts['host']}\n{$urlparts['path']}\n$canonical";
    // Calculate our actual signature and base64 encode it
    $signature            = base64_encode(hash_hmac('sha256', $string_to_sign, $secret_key, true));

    // Finally re-build the URL with the proper string and include the Signature
    $url = "{$urlparts['scheme']}://{$urlparts['host']}{$urlparts['path']}?$canonical&Signature=".rawurlencode($signature);
    return $url;
}

To use it, just wrap your Amazon URL with the signAmazonUrl() function and pass it your original string and secret key as arguments. As an example:

$xml = file_get_contents('https://webservices.amazon.com/onca/xml?some-parameters');

becomes

$xml = file_get_contents(signAmazonUrl('https://webservices.amazon.com/onca/xml?some-parameters', $secret_key));

Like most all of the variations of this, it does require the hash functions be installed to use the hash_hmac() function. That function is generally available in PHP 5.1+. Older versions will need to install it with Pecl. I tried using a couple of versions that try to create the Hash in pure PHP code, but none worked and installing it via Pecl was pretty simple.

(Note that I’ve slightly revised this code a couple of times to fix small issues that have been noticed)

Array versus String in CURLOPT_POSTFIELDS

May 29, 2009 / Brandon / 6 Comments

The PHP Curl Documentation for CURLOPT_POSTFIELDS makes this note:

This can either be passed as a urlencoded string like ‘para1=val1&para2=val2&…’ or as an array with the field name as key and field data as value. If value is an array, the Content-Type header will be set to multipart/form-data.

I’ve always discounted the importance of that, and in most cases it doesn’t generally matter. The destination server and application likely know how to deal with both multipart/form-data and application/x-www-form-urlencoded equally well. However, the data is passed in a much different way using these two different mechanisms.

application/x-www-form-urlencoded

application/x-www-form-urlencoded is what I generally think of when doing POST requests. It is the default when you submit most forms on the web. It works by appending a blank line and then your urlencoded data to the end of the POST request. It also sets the Content-Length header to the length of your data. A request submitted with application/x-www-form-urlencoded looks like this (somewhat simplified):

POST /some-form.php HTTP/1.1
Host: www.brandonchecketts.com
Content-Length: 23
Content-Type: application/x-www-form-urlencoded

name=value&name2=value2

multipart/form-data

multipart/form-data is much more complicated, but more flexible. Its flexibility is required when uploading files. It works in a manner similar to MIME types. The HTTP Request looks like this (simpified):

POST / HTTP/1.1
Host: www.brandonchecketts.com
Content-Length: 244
Expect: 100-continue
Content-Type: multipart/form-data; boundary=----------------------------26bea3301273

And then subsequent packets are sent containing the actual data. In my simple case with two name/value pairs, it looks like this:

HTTP/1.1 100 Continue
------------------------------26bea3301273
Content-Disposition: form-data; name="name"

value
------------------------------26bea3301273
Content-Disposition: form-data; name="name2"

value2
------------------------------26bea3301273--

CURL usage

So, when sending POST requests in PHP/cURL, it is important to urlencode it as a string first.

This will generate the multipart/form-data version

$data = array('name' => 'value', 'name2' => 'value2');
curl_setopt($curl_object, CURLOPT_POSTFIELDS,  $data)

And this simple change will ensure that it uses the application/x-www-form-urlencoded version:

$data = array('name' => 'value', 'name2' => 'value2');
$encoded = '';
foreach($data as $name => $value){
    $encoded .= urlencode($name).'='.urlencode($value).'&';
}
// chop off the last ampersand
$encoded = substr($encoded, 0, strlen($encoded)-1);
curl_setopt($curl_object, CURLOPT_POSTFIELDS,  $encoded)

KnitMeter is now a Facebook App

May 29, 2009 / Brandon / 3 Comments

KnitMeter.com is a site that I wrote quickly for my wife to keep track of how much she has knit. It generate a little ‘widget’ image that can be placed on blogs, forums, etc and says how many miles of yarn you have knit in some period. The site has been live for about a year and a half now and has a couple thousand registered users.

I have been receiving an increasing number of requests to add a method for adding a KnitMeter it to Facebook. I’ve experimented with a couple of other ideas on Facebook and found that it was pretty straightforward to write an app. KnitMeter seems like a decent candidate for a social app, so I started working on it about a week ago. I’m happy to say that I just made the application live late last night. It is available at https://apps.facebook.com/knitmeter/. If you use other social media apps, then go here where you can buy YouTube subscribers or views, or even Instagram followers.

Features include:

Ability to add projects and add knitted lengths to a project (or not)
Settings for inputting lengths in feet, yards, or meters
Display how much you’ve knit in feet, yards, meters, kilometers, or miles
When entering a new length, you can choose to have it publish a ‘story’ on your profile page
You can add a tab on your profile page that shows each of your projects as well as a total
You can add a KnitMeter ‘box’ to the side of your profile page, or on your ‘boxes’ tab.

I recreated the database from scratch and defined it a little better, so I have a little bit of work to do in migrating the existing site and database over to the new structure. Once that is done users will be able to import their data from the existing KnitMeter.com by providing their email/password.

What is in a gclid?

April 15, 2009 / Brandon / 3 Comments

When you use auto-tagging with your Adwords campaign, all request that are generated by Google Adwords contain a ?glcid parameter in the Request. Adwords uses this to pass some information to Analytics for traffic analysis.

I was curious, about what data the gclid parameter contained. My guess was that it contained some encoded or encrypted information regarding the origin of the click, so I did some analysis on the clicks that I received. Some discussion about it was available on this post.

I ended up writing a quick PHP script that parses through an Apache log file. It finds requests that contain a gclid and then produces a report of which letters occur in which positions of the gclid.

Found 32507 appropriate lines
Character  1 [ 1] C
Character  2 [ 8] IJKLMNOP
Character  3 [32] -CDGHKLOPSTWX_abefijmnqruvyz2367
Character  4 [64] -CDEFG0ABHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz123456789
Character  5 [32] -_0ghijklmnopqrstuvwxyz123456789
Character  6 [32] -IJKLMNOPYZ_abcdefopqrstuv456789
Character  7 [32] -CDGHKLOPSTWX_abefijmnqruvyz2367
Character  8 [64] -ABCDEFG0HIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz123456789
Character  9 [32] 0-_ghijklmnopqrstuvwxyz123456789
Character 10 [ 4] JZp5
Character 11 [ 8] IMQUYcgk
Character 12 [ 1] C
Character 13 [ 1] F
Character 14 [10] QRSUWYZcde
Character 15 [61] -ABCEFGHIJKLMNOPQRSTUVWXYZ_ab0cdefghiklmnopqrstuvwxy123456789
Character 16 [63] -ABCDEFGHIJKLMNOQRSTUVWXYZ_abcde0fghijklmnopqrstuvwxyz123456789
Character 17 [17] DFGHIQabgiknrsx57
Character 18 [ 4] AQgw
Character 19 [ 1] o
Character 20 [ 1] d
Character 21 [64] -ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwx0yz123456789
Character 22 [32] ABCDEFGHQRSTUVWXghijklmnwyz0x123
Character 23 [64] -ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuv0wxyz123456789
Character 24 [64] -ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrs0tuvwxyz123456789
Character 25 [62] 0-ABCDEFHIJKLMOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz123456789
Character 26 [ 4] AQgw

This makes it clear that the parameter has some structure, but I’m still no closer to determining what it contains. Counting up the unique values, it would seem that they have about 95 bits of information available, which might be enough room to store everything it would need to know about the search that created it. Based on the reporting details in Analytics, I would presume that it somehow contains at least the following information:

Campaign (id)
Keyword (id)
Ad Variation (id)
Position

I did some research by clicking an ad multiple times and examining the glcids for those:

        12345678901234567890123456
/?gclid=CNHz5eD_8pkCFRCdnAodzniYQg
/?gclid=CIX_u-X_8pkCFQKenAodlWprSg
/?gclid=CMyI_4OA85kCFRIhnAodc2_oRg
/?gclid=CO_0pYyA85kCFQghnAodDDpaRQ
/?gclid=CIXo9JeA85kCFRIhnAodc2_oRg
/?gclid=CLitgp2A85kCFQubnAod1nx7Qg
/?gclid=CN3_1aOA85kCFQghnAodDDpaRQ
/?gclid=CPyi1quA85kCFRabnAodWnZbRQ 
/?gclid=COq-67OA85kCFRMhnAodyQvSRg
/?gclid=COOplrmA85kCFRCdnAodzniYQg

I noticed that most of the characters which use 32-64 characters vary quite a bit except for character #9, which was always an 8, and character #10 which was a ‘p’ for the first two clicks, and then a ‘5’ for all subsequent clicks. That likely has some significance, but I’m out of time for playing with it for now.

Hopefully the script and this basic analysis might be of use for somebody else to use in digging into it further.

One other thought that I had is that the data (or each field) is somehow encrypted and when you ‘link’ your Analytics account to your Adwords account it shares the decryption key so that it can get at the detail.

Announcing WebPasswd

March 15, 2009 / Brandon / 0 Comments

Do you have users who need access to web-based applications on multiple servers? Managing those users can be a pain when dealing with normal htpasswd-based permissions. Adding or removing users means editing each htpasswd file and remembering where all of them are.

Mod_auth_mysql is a good way to centralize that user database so that you can avoid having all of the separate htpasswd files. The apache module is available from any modern Linux distribution, so installing and configuring it takes less than 5 minutes. I started using it almost 2 years ago, and over that time have made a simple web application for managing the users and granting them permission to each application.

I’ve released the program as WebPasswd for anybody else who wants to use it. Now adding users and granting them access to application can be don with just a few clicks. Granting and revoking access to an application takes just seconds and is applied immediately. Configuring a new application takes a couple clicks, and then you just copy/paste the Apache configuration into the appropriate place on your web server. Try it out with this demo.

I think this will be useful to people. I have not seen another application that does something similar. Let me know if it works for you.

PHP 5.1 Doesn’t have timezone_identifiers_list() by default

February 5, 2009 / Brandon / 0 Comments

According to the PHP documentation for timezone_identifiers_list(), that function should be included in PHP 5.1.x. The note on DateTime installation mentions, however, that it was only experimental support, and had to be compiled specifically to support it.

The fix, then, is to recompile PHP 5.1.x with

CFLAGS=-DEXPERIMENTAL_DATE_SUPPORT=1

or to upgrade to PHP 5.2 where it is enabled by default.

My particular problem surfaced with some Drupal code that required the function.

Save Internet Audio Streams to MP3s

January 21, 2009 / Brandon / 1 Comment

I’ve got a couple of radio programs that I like to listen to. The only problem is that I rarely am able to listen to them live. I was wishing that somebody made a good DVR-like device for the radio, but after some thought figured out a way to do it on my own using internet audio streams that most radio stations now have available.

Googling for instructions on how to save Internet audio streams will return a lot of semi-workable but mostly garbage instructions. The best set of instructions I found was at Instructables.com where the basic concept is to use the command-line mplayer to save the stream as a wave file, then use lame to convert it to an MP3.

The instructables tutorial had several downfalls though. First, it is not able to stop mplayer on its own, so it uses a second cron job to kill the original mplayer command – a little to crude for my taste. Secondly, and more importantly, you have to know the exact stream URL which is not easy to identify from most internet radio websites. They tend to hide the actual stream source behind layers of javascript so that their web-based players can synchronize ads and such while listening to the streams.

I created some PHP code that automates this process and makes it pretty simple. The basic streamsave class provides functions for downloading the wave file and converting it to an MP3. I then extend that class for specific radio stations that I want to save. The extended class provides functions that run through all of the javascript garbage to get to the actual stream source.

Using those classes, this simple script now saves my stream to an MP3 file and emails me the location when it is done:

<?php
require_once dirname(__FILE__).'/ss_640wgst.class.php';

$streamsave = new ss_640wgst();

$streamsave->stream_url = $streamsave ->getStreamURL();
$streamsave->seconds = 60 * 60; // One hour

// This saves the stream to a temporary wav file
$streamsave->save_stream();

// Now encode it to an mp3
$output_file = "/tmp/some_directory/some_program_".date('Y-m-d-His').'.mp3';
$streamsave->encode_to_mp3($output_file);

// Delete the large wav file
unlink($streamsave->wavfile);

// And tell me that the file was saved
echo "File saved (if all went okay) to {$streamsave->mp3file}\n\n";
mail('[email protected]', 'Audio File Saved', "File saved to {$streamsave->mp3file}");

// It would be cool to create a podcast XML file here that contains your new file

?>

Downloads

The abstract class file: streamsave.class.php
The extended class specifically for 640 WGST in Atlanta: ss_640wgst.class.php

I’ve created the extended classes for stations that are useful for me. If there seems to be any interest, I can work on developing that a bit more to make it more generalized and work for more radio stations.

Preparing WordPress for a Large Traffic Spike

December 10, 2008 / Brandon / 1 Comment

The Hallmark Hall of Fame Movie ‘Front of the Class’ premiered this past weekend with an expected 12-15 million viewers.Â Â We have been preparing the website (ClassPerformance.com) for the event. We expected a significant number of visitors to the website in the 24-48 hours after the movie aired, so I did a number of things to ensure that the site would be able to run without incident during this critical time.

Move temporarily to a higher powered server.

The site is normally hosted on an inexpensive shared-hosting plan. I’ve run some shared-hosting servers before and don’t have much faith that they would handle any amount of significant load. They also usually don’t allow you to configure some of the Apache settings that I was planning on using below.

Serve images and other static content from an alternate location.

I set up a domain alias of ‘static.classperformance.com’ pointed to the same DocumentRoot as the main site. Then I edited the template files to serve most of the background, header, and footer images from that location. For normal usage, serving them from the same server works fine, but this allows the flexibility to move that static content to a separate server if/when it is needed.

I also copied the entire website to a second server and had it configured so that at any time I could change DNS to point ‘static.classperformance.com’ to the second server in order to reduce the bandwidth from the primary server

Generate static pages wherever possible.

I used wget to download everything, and then deleted the pages that needed to be parsed through PHP (ie: contact forms, etc). Most of the pages don’t change from visitor to visitor, so this can be done for the home page, all of the blog posts, and any other pages. This significantly reduces the overhead due to database queries and just the overhead of running PHP and including multiple files.

I then added this to my Apache configuration to tell the web server to use the static content if it exists:

    ## Serve static content for files that exist
    RewriteCond /home/classperformance.com/www/rendered/%{REQUEST_URI} -f
    RewriteRule (.*) /rendered/\ [L]

    ## For requests without an extension, wget has saved those files as 'index.html'
    ## so the rewrite rule needs to reflect that:
    RewriteCond /home/classperformance.com/www/rendered/%{REQUEST_URI} -d
    RewriteRule (.*) /rendered/\/index.html [L]

I did some performance tests with ApacheBenchmark, and serving the static content had a dramatic effect on the speed, and the number concurrent users. There is probably a more elegant way to configure mod_cache do a similar thing in a more automated fashion, but this was quick and easy, and I didn’t have to worry about checking the various HTTP headers. In my opinion, this was the single most effective thing to do. By serving static content, Apache also correctly handles many of the HTTP headers that enable effective caching (E-Tags, expires, last-modified, etc).

Installed a PHP Accelerator

I’ve previously written about how easy and effective eAccelerator is to install. There are very few scenarios where this is not effective. Again, ApacheBenchmark tests easily showed a huge increase in the number of concurrent requests when eAccelerator was enabled.

Check Apache settings

On a vanilla CentOS install, Apache has the ServerLimit set to 256. By serving primarily static content, you will likely reduce the amount of memory that each Apache child requires, and have memory for more children. I did some quick math and figured that I could have around 800 children before memory became a concern. I also enabled KeepAlives with a very short (1 second) KeepAliveTimeout so that sequential requests from the same user don’t have to recreate TCP sessions.

Also, by serving static content, I found that WordPress was handling the 301 redirect from the Non-www version of the site to the correct url. I moved that into Apache with this directive:

   ## Rewrite to the desired domain name
    RewriteCond %{HTTP_HOST} !^www\.classperformance\.com [NC] OR
    RewriteCond %{HTTP_HOST} !^static\.classperformance\.com [NC]
    RewriteRule ^/(.*) https://www.classperformance.com/\ [L,R=301]

Enable server-side compression

The default Apache install doesn’t compress any content. I configured mod_deflate to compress the static content and thus reduce the bandwidth usage. Compression should easily reduce the bandwidth for HTML and CSS files by one half (even up to one tenth). This not only reduces your bandwidth bill, but since the 100Mbps switch port is potentially a bottleneck, it enables more concurrent users if it approaches anywhere near that limit (and it may have if I hadn’t enabled compression)

Set up some Monitoring

I installed MRTG with some basic graphs. Also, I configured Apache so that I could view the ServerStatus. I also installed iftop to get a real-time view of the bandwidth usage.

With all of these changes, I’m very happy that we had tens of thousands of visitors during and shortly after the show, and everything ran perfectly. I had the static content running on a separate server for the busiest time and combined bandwidth usage peaked at around 90 Mbps shortly after the end of the show.