What is in a gclid?

Posted on April 15th, 2009 in General,Linux System Administration,PHP,Programming by Brandon

When you use auto-tagging with your Adwords campaign, all requests generated by Google Adwords contain a ?gclid parameter in the request. Adwords uses this to pass some information to Analytics for traffic analysis.

I was curious about what data the gclid parameter contained. My guess was that it contained some encoded or encrypted information regarding the origin of the click, so I did some analysis on the clicks that I received. Some discussion about it was available on this post.

I ended up writing a quick PHP script that parses through an Apache log file. It finds requests that contain a gclid and then produces a report of which letters occur in which positions of the gclid.

The script is available for download here, and it generates a report like this:

Found 32507 appropriate lines
Character  1 [ 1] C
Character  2 [ 8] IJKLMNOP
Character  3 [32] -CDGHKLOPSTWX_abefijmnqruvyz2367
Character  4 [64] -CDEFG0ABHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz123456789
Character  5 [32] -_0ghijklmnopqrstuvwxyz123456789
Character  6 [32] -IJKLMNOPYZ_abcdefopqrstuv456789
Character  7 [32] -CDGHKLOPSTWX_abefijmnqruvyz2367
Character  8 [64] -ABCDEFG0HIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz123456789
Character  9 [32] 0-_ghijklmnopqrstuvwxyz123456789
Character 10 [ 4] JZp5
Character 11 [ 8] IMQUYcgk
Character 12 [ 1] C
Character 13 [ 1] F
Character 14 [10] QRSUWYZcde
Character 15 [61] -ABCEFGHIJKLMNOPQRSTUVWXYZ_ab0cdefghiklmnopqrstuvwxy123456789
Character 16 [63] -ABCDEFGHIJKLMNOQRSTUVWXYZ_abcde0fghijklmnopqrstuvwxyz123456789
Character 17 [17] DFGHIQabgiknrsx57
Character 18 [ 4] AQgw
Character 19 [ 1] o
Character 20 [ 1] d
Character 21 [64] -ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwx0yz123456789
Character 22 [32] ABCDEFGHQRSTUVWXghijklmnwyz0x123
Character 23 [64] -ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuv0wxyz123456789
Character 24 [64] -ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrs0tuvwxyz123456789
Character 25 [62] 0-ABCDEFHIJKLMOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz123456789
Character 26 [ 4] AQgw
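
For anybody who just wants the idea without downloading anything, here is a minimal sketch of that kind of tally. It is not the script from the download: tally_gclids() and print_report() are names made up for this example, and it assumes the gclid appears as a query-string parameter in each log line.

```php
<?php
// Sketch of a per-position character tally for gclid values.
// tally_gclids() and print_report() are hypothetical names.
function tally_gclids(array $lines)
{
    $positions = array(); // position (1-based) => set of characters seen there
    $matched = 0;
    foreach ($lines as $line) {
        // Pull the gclid value out of the request portion of the log line
        if (!preg_match('/[?&]gclid=([A-Za-z0-9_-]+)/', $line, $m)) {
            continue;
        }
        $matched++;
        foreach (str_split($m[1]) as $i => $char) {
            $positions[$i + 1][$char] = true;
        }
    }
    ksort($positions);
    return array($matched, $positions);
}

function print_report($matched, array $positions)
{
    echo "Found $matched appropriate lines\n";
    foreach ($positions as $pos => $chars) {
        // Digit characters come back from array_keys() as integers
        $seen = array_map('strval', array_keys($chars));
        sort($seen, SORT_STRING);
        printf("Character %2d [%2d] %s\n", $pos, count($seen), implode('', $seen));
    }
}

// Example input: two request lines using gclids from my own test clicks
list($matched, $positions) = tally_gclids(array(
    'GET /?gclid=CNHz5eD_8pkCFRCdnAodzniYQg HTTP/1.1',
    'GET /?gclid=CIX_u-X_8pkCFQKenAodlWprSg HTTP/1.1',
));
print_report($matched, $positions);
```

To run it against a real log, something like file('/path/to/access_log') could be passed in place of the example array.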

This makes it clear that the parameter has some structure, but I’m still no closer to determining what it contains. Counting up the unique values, it would seem that they have about 95 bits of information available, which might be enough room to store everything it would need to know about the search that created it. Based on the reporting details in Analytics, I would presume that it somehow contains at least the following information:

  • Campaign (id)
  • Keyword (id)
  • Ad Variation (id)
  • Position
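
The bit estimate above comes from treating each position as an independent choice among the characters seen there and adding up log2 of those counts. Here is a rough sketch of that arithmetic. estimate_bits() is a name made up for this example, and the result is only an upper bound, since the positions may not really be independent.

```php
<?php
// Unique-character counts per position, copied from the report above
$counts = array(1, 8, 32, 64, 32, 32, 32, 64, 32, 4, 8, 1, 1, 10, 61, 63, 17,
                4, 1, 1, 64, 32, 64, 64, 62, 4);

// Upper bound on information content: sum of log2(choices) over all positions
function estimate_bits(array $counts)
{
    $bits = 0.0;
    foreach ($counts as $n) {
        $bits += log($n, 2);
    }
    return $bits;
}

printf("%d positions, at most %.1f bits\n", count($counts), estimate_bits($counts));
```

The total lands in the same ballpark as my estimate; positions that only ever show one character contribute nothing.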

I did some research by clicking an ad multiple times and examining the gclids for those:

        12345678901234567890123456
/?gclid=CNHz5eD_8pkCFRCdnAodzniYQg
/?gclid=CIX_u-X_8pkCFQKenAodlWprSg
/?gclid=CMyI_4OA85kCFRIhnAodc2_oRg
/?gclid=CO_0pYyA85kCFQghnAodDDpaRQ
/?gclid=CIXo9JeA85kCFRIhnAodc2_oRg
/?gclid=CLitgp2A85kCFQubnAod1nx7Qg
/?gclid=CN3_1aOA85kCFQghnAodDDpaRQ
/?gclid=CPyi1quA85kCFRabnAodWnZbRQ
/?gclid=COq-67OA85kCFRMhnAodyQvSRg
/?gclid=COOplrmA85kCFRCdnAodzniYQg

I noticed that most of the positions which use 32-64 different characters vary quite a bit, except for character #9, which was always an ‘8’, and character #10, which was a ‘p’ for the first two clicks and then a ‘5’ for all subsequent clicks. That likely has some significance, but I’m out of time for playing with it for now.
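
One more observation: the only characters that ever show up are A-Z, a-z, 0-9, ‘-’ and ‘_’, which is the URL-safe Base64 alphabet. Assuming the gclid really is Base64 (that is a guess on my part, nothing confirms it here), decoding it to raw bytes might make the clicks easier to compare. decode_gclid() is a made-up helper for this sketch:

```php
<?php
// Assumption: the gclid is URL-safe Base64. decode_gclid() is a hypothetical helper.
function decode_gclid($gclid)
{
    // Map the URL-safe alphabet back to the standard one and restore padding
    $b64 = strtr($gclid, '-_', '+/');
    $b64 .= str_repeat('=', (4 - strlen($b64) % 4) % 4);
    return base64_decode($b64, true); // false if the input is not valid Base64
}

foreach (array('CNHz5eD_8pkCFRCdnAodzniYQg', 'CIX_u-X_8pkCFQKenAodlWprSg') as $gclid) {
    $bytes = decode_gclid($gclid);
    printf("%s => %d bytes: %s\n", $gclid, strlen($bytes), bin2hex($bytes));
}
```

If the guess is right, a 26-character gclid is 19 bytes, which would also explain why the last character is only ever one of A, Q, g or w: those are the four Base64 characters whose low four bits are zero.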

Hopefully the script and this basic analysis might be of use for somebody else to use in digging into it further.

One other thought that I had is that the data (or each field) is somehow encrypted and when you ‘link’ your Analytics account to your Adwords account it shares the decryption key so that it can get at the detail.

Announcing WebPasswd

Posted on March 15th, 2009 in General,Linux System Administration,MySQL,PHP,Programming by Brandon

Do you have users who need access to web-based applications on multiple servers? Managing those users can be a pain when dealing with normal htpasswd-based permissions. Adding or removing users means editing each htpasswd file and remembering where all of them are.

Mod_auth_mysql is a good way to centralize that user database so that you can avoid having all of the separate htpasswd files. The apache module is available from any modern Linux distribution, so installing and configuring it takes less than 5 minutes. I started using it almost 2 years ago, and over that time have made a simple web application for managing the users and granting them permission to each application.

I’ve released the program as WebPasswd for anybody else who wants to use it. Now adding users and granting them access to applications can be done with just a few clicks. Granting and revoking access to an application takes just seconds and is applied immediately. Configuring a new application takes a couple clicks, and then you just copy/paste the Apache configuration into the appropriate place on your web server. Try it out with this demo.

I think this will be useful to people. I have not seen another application that does something similar. Let me know if it works for you.

PHP 5.1 Doesn’t have timezone_identifiers_list() by default

Posted on February 5th, 2009 in General,PHP by Brandon

According to the PHP documentation for timezone_identifiers_list(), that function should be included in PHP 5.1.x. The note on DateTime installation mentions, however, that support was only experimental in that version, and PHP had to be compiled specifically to enable it.

The fix, then, is to recompile PHP 5.1.x with

CFLAGS=-DEXPERIMENTAL_DATE_SUPPORT=1

or to upgrade to PHP 5.2 where it is enabled by default.
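
If you can’t recompile or upgrade right away, code that depends on the function can at least check for it at runtime and fail with a clear message instead of a fatal error. A minimal sketch (has_timezone_support() is a name made up for this example):

```php
<?php
// Hypothetical helper: report whether this PHP build has the timezone functions.
function has_timezone_support()
{
    return function_exists('timezone_identifiers_list');
}

if (has_timezone_support()) {
    $zones = timezone_identifiers_list();
    echo count($zones) . " timezone identifiers available\n";
} else {
    echo "timezone_identifiers_list() is missing; recompile or upgrade PHP\n";
}
```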

My particular problem surfaced with some Drupal code that required the function.

Save Internet Audio Streams to MP3s

Posted on January 21st, 2009 in General,Linux System Administration,PHP,Programming by Brandon

I’ve got a couple of radio programs that I like to listen to. The only problem is that I rarely am able to listen to them live. I was wishing that somebody made a good DVR-like device for the radio, but after some thought figured out a way to do it on my own using internet audio streams that most radio stations now have available.

Googling for instructions on how to save Internet audio streams will return a lot of semi-workable but mostly garbage instructions. The best set of instructions I found was at Instructables.com where the basic concept is to use the command-line mplayer to save the stream as a wave file, then use lame to convert it to an MP3.

The Instructables tutorial had several shortcomings though. First, it is not able to stop mplayer on its own, so it uses a second cron job to kill the original mplayer command, which is a little too crude for my taste. Secondly, and more importantly, you have to know the exact stream URL, which is not easy to identify from most internet radio websites. They tend to hide the actual stream source behind layers of JavaScript so that their web-based players can synchronize ads and such while listening to the streams.

I created some PHP code that automates this process and makes it pretty simple. The basic streamsave class provides functions for downloading the wave file and converting it to an MP3. I then extend that class for specific radio stations that I want to save. The extended class provides functions that run through all of the javascript garbage to get to the actual stream source.
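
To give an idea of what the two steps boil down to, here is a rough sketch of the commands involved. This is not the code from streamsave.class.php: build_capture_commands() is a made-up helper, the example URL and paths are placeholders, and it assumes that mplayer, lame and the coreutils timeout command are installed. Using timeout is just one way to stop mplayer after a fixed time.

```php
<?php
// Sketch only; not the code from streamsave.class.php.
// build_capture_commands() is a hypothetical helper.
function build_capture_commands($stream_url, $seconds, $wavfile, $mp3file)
{
    // mplayer writes the decoded audio to a WAV file; timeout stops it after $seconds
    $capture = sprintf(
        'timeout %d mplayer -really-quiet -vo null -ao %s %s',
        (int) $seconds,
        escapeshellarg('pcm:file=' . $wavfile),
        escapeshellarg($stream_url)
    );
    // lame converts the WAV file to an MP3
    $encode = sprintf('lame %s %s', escapeshellarg($wavfile), escapeshellarg($mp3file));
    return array($capture, $encode);
}

list($capture, $encode) = build_capture_commands(
    'http://example.com/stream', 60 * 60, '/tmp/capture.wav', '/tmp/capture.mp3'
);
echo $capture, "\n", $encode, "\n";
// To actually run them: exec($capture); exec($encode);
```

Building the command strings separately from running them makes it easy to print and try them by hand first.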

Using those classes, this simple script now saves my stream to an MP3 file and emails me the location when it is done:

<?php
require_once dirname(__FILE__).'/ss_640wgst.class.php';

$streamsave = new ss_640wgst();

$streamsave->stream_url = $streamsave->getStreamURL();
$streamsave->seconds = 60 * 60; // One hour

// This saves the stream to a temporary wav file
$streamsave->save_stream();

// Now encode it to an mp3
$output_file = "/tmp/some_directory/some_program_".date('Y-m-d-His').'.mp3';
$streamsave->encode_to_mp3($output_file);

// Delete the large wav file
unlink($streamsave->wavfile);

// And tell me that the file was saved
echo "File saved (if all went okay) to {$streamsave->mp3file}\n\n";
mail('you@yourdomain.com', 'Audio File Saved', "File saved to {$streamsave->mp3file}");

// It would be cool to create a podcast XML file here that contains your new file

?>

Downloads

The abstract class file: streamsave.class.php
The extended class specifically for 640 WGST in Atlanta: ss_640wgst.class.php

I’ve created the extended classes for stations that are useful for me. If there seems to be any interest, I can work on developing that a bit more to make it more generalized and work for more radio stations.

Preparing WordPress for a Large Traffic Spike

Posted on December 10th, 2008 in General,Linux System Administration,PHP by Brandon

The Hallmark Hall of Fame Movie ‘Front of the Class’ premiered this past weekend with an expected 12-15 million viewers.  We have been preparing the website (ClassPerformance.com) for the event. We expected a significant number of visitors to the website in the 24-48 hours after the movie aired, so I did a number of things to ensure that the site would be able to run without incident during this critical time.

  1. Move temporarily to a higher powered server.

    The site is normally hosted on an inexpensive shared-hosting plan. I’ve run some shared-hosting servers before and don’t have much faith that they would handle any amount of significant load. They also usually don’t allow you to configure some of the Apache settings that I was planning on using below.

  2. Serve images and other static content from an alternate location.

    I set up a domain alias of ‘static.classperformance.com’ pointed to the same DocumentRoot as the main site. Then I edited the template files to serve most of the background, header, and footer images from that location. For normal usage, serving them from the same server works fine, but this allows the flexibility to move that static content to a separate server if/when it is needed.

    I also copied the entire website to a second server and had it configured so that at any time I could change DNS to point ‘static.classperformance.com’ to the second server in order to reduce the bandwidth from the primary server.

  3. Generate static pages wherever possible.

    I used wget to download everything, and then deleted the pages that needed to be parsed through PHP (i.e., contact forms, etc.). Most of the pages don’t change from visitor to visitor, so this can be done for the home page, all of the blog posts, and any other pages. This significantly reduces the overhead due to database queries, as well as the overhead of running PHP and including multiple files.

    I then added this to my Apache configuration to tell the web server to use the static content if it exists:

        ## Serve static content for files that exist
        RewriteCond /home/classperformance.com/www/rendered/%{REQUEST_URI} -f
        RewriteRule (.*) /rendered/$1 [L]

        ## For requests without an extension, wget has saved those files as 'index.html'
        ## so the rewrite rule needs to reflect that:
        RewriteCond /home/classperformance.com/www/rendered/%{REQUEST_URI} -d
        RewriteRule (.*) /rendered/$1/index.html [L]

    I did some performance tests with ApacheBench, and serving the static content had a dramatic effect on the speed and on the number of concurrent users. There is probably a more elegant way to configure mod_cache to do a similar thing in a more automated fashion, but this was quick and easy, and I didn’t have to worry about checking the various HTTP headers. In my opinion, this was the single most effective thing to do. By serving static content, Apache also correctly handles many of the HTTP headers that enable effective caching (ETags, Expires, Last-Modified, etc).

  4. Install a PHP accelerator.

    I’ve previously written about how easy and effective eAccelerator is to install. There are very few scenarios where this is not effective. Again, ApacheBench tests easily showed a huge increase in the number of concurrent requests when eAccelerator was enabled.

  5. Check Apache settings.

    On a vanilla CentOS install, Apache has the ServerLimit set to 256. By serving primarily static content, you will likely reduce the amount of memory that each Apache child requires, and have memory for more children. I did some quick math and figured that I could have around 800 children before memory became a concern. I also enabled KeepAlives with a very short (1 second) KeepAliveTimeout so that sequential requests from the same user don’t have to recreate TCP sessions.

    Also, while serving static content, I found that WordPress had been the one handling the 301 redirect from the non-www version of the site to the correct URL. I moved that into Apache with this directive (the two conditions are ANDed, so any host that is neither www nor static gets redirected):

        ## Rewrite to the desired domain name
        RewriteCond %{HTTP_HOST} !^www\.classperformance\.com [NC]
        RewriteCond %{HTTP_HOST} !^static\.classperformance\.com [NC]
        RewriteRule ^/(.*) http://www.classperformance.com/$1 [L,R=301]

  6. Enable server-side compression.

    The default Apache install doesn’t compress any content. I configured mod_deflate to compress the static content and thus reduce the bandwidth usage. Compression should easily reduce the bandwidth for HTML and CSS files by one half (even down to one tenth). This not only reduces your bandwidth bill, but since the 100Mbps switch port is potentially a bottleneck, it enables more concurrent users if traffic approaches anywhere near that limit (and it may have if I hadn’t enabled compression).

  7. Set up some monitoring.

    I installed MRTG with some basic graphs. Also, I configured Apache so that I could view the ServerStatus. I also installed iftop to get a real-time view of the bandwidth usage.
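
The “quick math” for the number of Apache children is just available memory divided by the size of one child. A sketch of that calculation follows; all of the numbers in it are illustrative assumptions, not measurements from the server used here, and max_apache_children() is a made-up name.

```php
<?php
// Back-of-the-envelope ServerLimit estimate. All numbers below are
// illustrative assumptions, not measurements from the real server.
function max_apache_children($total_mb, $reserved_mb, $per_child_mb)
{
    return (int) floor(($total_mb - $reserved_mb) / $per_child_mb);
}

$total_mb     = 8192; // assumed RAM in the machine
$reserved_mb  = 1024; // assumed headroom for MySQL, the OS and file cache
$per_child_mb = 9;    // assumed resident size of one Apache child

echo max_apache_children($total_mb, $reserved_mb, $per_child_mb), " children\n";
```

With these made-up inputs the result comes out just under 800. The per-child figure is the one worth measuring on your own server, since it shifts a lot depending on how much PHP each child ends up running.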

With all of these changes, I’m very happy that we had tens of thousands of visitors during and shortly after the show, and everything ran perfectly. I had the static content running on a separate server for the busiest time and combined bandwidth usage peaked at around 90 Mbps shortly after the end of the show.

Don’t Use Integers as Values in an Enum Field

Posted on July 25th, 2008 in General,MySQL,PHP,Programming by Brandon

I just got through fixing a messy problem where a database had a table defined with a couple of columns that were ENUMs with integer values. This leads to extreme amounts of confusion, because when doing queries there is a lot of ambiguity about whether the integer is supposed to be treated as the enumerated value or as the key.

Imagine a table with a column defined as ENUM('0', '1', '2', '3'). When doing queries, if you try to do anything with that column, it is unclear whether you mean to use the actual value you pass in, or the position. For example, if I were to say ‘WHERE confusing_column = 2’, it could be interpreted as meaning either the value '2', or the item in the second position (i.e., '1'). It is even hard to explain because it is so confusing.

The MySQL Documentation does a decent job of explaining it.   I agree with their recommendation:

For these reasons, it is not advisable to define an ENUM column with enumeration values that look like numbers, because this can easily become confusing.
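
To make the ambiguity concrete, here is a toy model in PHP of the rule as I understand it from the documentation: a bare number is taken as a position in the list (starting at 1), and a quoted string is taken as the value. This is not MySQL code, and enum_resolve() is a name made up for this example.

```php
<?php
// Toy model of how a comparison operand is matched against an ENUM column.
// Not MySQL code; it only mimics the rule that numbers are positions
// (starting at 1) and quoted strings are values.
function enum_resolve(array $members, $operand)
{
    if (is_int($operand)) {
        // Numeric operand: treated as a 1-based position in the ENUM list
        return isset($members[$operand - 1]) ? $members[$operand - 1] : null;
    }
    // String operand: treated as the enumeration value itself
    return in_array($operand, $members, true) ? $operand : null;
}

$members = array('0', '1', '2', '3');
var_dump(enum_resolve($members, 2));   // the row value that "= 2" would match
var_dump(enum_resolve($members, '2')); // the row value that "= '2'" would match
```

The two calls return different members of the same list, which is exactly the trap.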

I ended up converting everything to TINYINTs. An ENUM with up to 255 values takes one byte per row, the same as a TINYINT, so it costs nothing extra in storage, and in my opinion it is well worth it to avoid the confusion.

Checking MySQL Replication

Posted on June 23rd, 2008 in General,Linux System Administration,PHP,Programming by Brandon

MySQL replication is pretty easy to set up, but needs a few extra things to make it more reliable. I wrote this quick PHP script to alert me when replication has failed or is more than 5 minutes behind the master.

<?php

$user = 'username';
$pass = 'password';
$host = 'localhost';
// Grant this user permission to check the status with this mysql statement
// GRANT REPLICATION CLIENT on *.* TO 'user'@'host' IDENTIFIED BY 'password';

$threshold = 300;

$db = mysql_connect($host, $user, $pass);
if (!$db) {
    echo "Error: could not connect to MySQL\n";
    echo mysql_error()."\n";
    exit(1);
}

$result = mysql_query('SHOW SLAVE STATUS');
if (!$result) {
    // Make sure that your user has the 'REPLICATION CLIENT' privilege
    echo "Error 'SHOW SLAVE STATUS' command failed\n";
    echo mysql_error()."\n";
    exit(1);
}

$status = mysql_fetch_array($result);

if (!isset($status['Seconds_Behind_Master'])) {
    echo "Error: Seconds_Behind_Master column not found in result\n";
    print_r($status);
    exit(2);
}

if ($status['Seconds_Behind_Master'] > $threshold) {
    $minutes = floor($status['Seconds_Behind_Master'] / 60);
    echo "Error: Slave is $minutes minutes behind the master server\n";
    exit(3);
}

exit(0);
?>

This script is intended to be run periodically from cron. It doesn’t generate any output unless something is wrong. The behavior of cron is that when a script generates output, it will email the output to the user, so make sure that you have mail on your system configured to send you the cron output correctly. The script also exits with a non-zero status on each error, so you might include this in a more complicated script that attempts to do something else based on the status.

I use something like this in a non-privileged user’s crontab:

*/15 * * * * /usr/bin/php /path/to/check_replication.php
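
One caveat: the mysql_* functions used above were removed in PHP 7, so on a newer install the same check would need mysqli or PDO. Here is a rough sketch of a mysqli port. evaluate_status() and check_replication() are names made up for this example, and I’ve split the decision logic into its own function so that it can be tested without a database.

```php
<?php
// Sketch only: the same check ported to mysqli.
// evaluate_status() and check_replication() are hypothetical names.

// Decide what to report for one row of SHOW SLAVE STATUS output.
// Returns array(exit code, message); code 0 means replication looks healthy.
function evaluate_status($status, $threshold)
{
    if (!is_array($status) || !isset($status['Seconds_Behind_Master'])) {
        // Seconds_Behind_Master is NULL (so isset() is false) when the
        // slave SQL thread is not running.
        return array(2, "Error: Seconds_Behind_Master is missing or NULL");
    }
    $behind = (int) $status['Seconds_Behind_Master'];
    if ($behind > $threshold) {
        $minutes = (int) floor($behind / 60);
        return array(3, "Error: Slave is $minutes minutes behind the master server");
    }
    return array(0, '');
}

function check_replication($host, $user, $pass, $threshold = 300)
{
    mysqli_report(MYSQLI_REPORT_OFF); // return false on errors instead of throwing
    $db = @new mysqli($host, $user, $pass);
    if ($db->connect_error) {
        return array(1, "Error: could not connect: " . $db->connect_error);
    }
    $result = $db->query('SHOW SLAVE STATUS');
    if (!$result) {
        return array(1, "Error: 'SHOW SLAVE STATUS' command failed: " . $db->error);
    }
    $status = $result->fetch_assoc();
    $db->close();
    return evaluate_status($status, $threshold);
}
```

From cron, you would call check_replication() with your credentials, echo the message if it is not empty, and exit() with the returned code, which keeps the silent-unless-broken behavior of the original.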

bcSpamblock Updated to Version 1.3

Posted on May 14th, 2008 in General,PHP,Programming,Security,Spam by Brandon

Thanks to jontiw for pointing out a potential problem in my bcSpamblock code. He noted that the PHP crypt() function returns the salt along with the encrypted value. My code was passing the salt to the visitor, so an attacker could potentially learn the salt value that a website was using and create valid responses.

I modified the code to strip out that salt before passing it to the user. I also modified the data used to create the salt so that it no longer matches the value the previous, vulnerable version used for the site. The WordPress plugin has been updated as well.
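
To see what the problem looks like, here is a small illustration. It is not the bcSpamblock code, and the salt 'ab' and the input string are made up; it only shows that with a classic two-character salt, crypt() hands the salt back as the first two characters of its result.

```php
<?php
// Illustration only; this is not the bcSpamblock code, and 'ab' is a made-up salt.
$salt = 'ab';
$hash = crypt('some value', $salt);

// With a two-character (standard DES) salt, the salt is the first two
// characters of the result.
var_dump(substr($hash, 0, 2) === $salt);

// Drop the salt before sending anything to the visitor
$public = substr($hash, strlen($salt));
echo $public, "\n";
```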

I was happy to see other people looking through my code and pointing this type of issue out.

Problems to Anticipate When Upgrading From PHP4 to PHP5, and MySQL4 to MySQL5

Posted on March 19th, 2008 in Linux System Administration,PHP,Programming by Brandon

A client website just upgraded from PHP4 to PHP5 and MySQL4 to MySQL5 and completely broke. Such significant upgrades should have been tested first, but for some reason that didn’t happen. I got invited to fix it and ran across several problems:

MySQL queries containing some explicit JOINs broke. A simple query like this doesn’t work in MySQL5:

SELECT table1.*, table2.*
FROM table1, table2
LEFT JOIN table3 on table1.col1 = table3.col1

In MySQL 5, the JOIN operator now has a higher precedence than the comma operator, so it interprets the query differently. See this post or the MySQL documentation for more information. The quick fix is to put parentheses around the tables in the FROM clause, like this:

SELECT table1.*, table2.*
FROM (table1, table2)
LEFT JOIN table3 on table1.col1 = table3.col1

The other significant problem was that in the upgrade from PHP4 to PHP5, the XML parsing functions are completely different. PHP 4 used the domxml extension, whereas PHP 5 uses a newer DOM extension.

From http://www.php.net/manual/en/ref.domxml.php:

It will, however, never be released with PHP 5, and will only be distributed with PHP 4. If you need DOM XML support with PHP 5 you can use the DOM extension. This domxml extension is not compatible with the DOM extension.

The solution for fixing this, however, is quite a bit more complicated. I had to rewrite the XML-producing scripts to use the new functionality. Fortunately, the new DOM functionality is pretty straightforward and easier to write, so porting from one to the other is not difficult, but it does require some effort.
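
For anybody facing the same port, here is a minimal example of producing XML with the PHP 5 DOM extension. The element names (“catalog”, “item”) and build_catalog_xml() are made up for this example and have nothing to do with the client’s scripts.

```php
<?php
// Build a small XML document with the PHP 5 DOM extension.
// The element names ("catalog", "item") are made up for this example.
function build_catalog_xml(array $names)
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;

    $root = $doc->createElement('catalog');
    $doc->appendChild($root);

    foreach ($names as $name) {
        $item = $doc->createElement('item');
        // createTextNode() takes care of escaping characters such as & and <
        $item->appendChild($doc->createTextNode($name));
        $root->appendChild($item);
    }
    return $doc->saveXML();
}

echo build_catalog_xml(array('Widgets & Gadgets', 'Sprockets'));
```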

GnuPG Encryption with PHP

Posted on February 26th, 2008 in Encryption,General,PHP,Programming,Security by Brandon

I found PHP’s documentation on the GnuPG functions to be pretty sparse, so I thought I would share the specific steps that I went through in order to get everything working.

Prerequisites

First off, you have to install the GnuPG PHP libraries through pecl. It requires the GnuPG Made Easy (gpgme) packages to get working. The following shell commands will install the OS packages, install the GnuPG PHP libraries, then enable the PHP extension and restart Apache:

# apt-get install gnupg gpgme gpgme-devel

# pecl install gnupg

# echo extension=gnupg.so > /etc/php.d/gnupg.ini

# apachectl restart

Creating GnuPG Keys

Next, you need to create a set of keys to encrypt and decrypt your data. You’ll need to put the keys in a directory that the web server can read and write to. I’ll use /var/www/.gnupg since that is the default home directory for many Apache installations. After running the gpg command, answer the questions as prompted. In the output shown below, the user input is whatever follows each prompt.

# mkdir -p /var/www/.gnupg

# gpg --homedir /var/www/.gnupg --gen-key
gpg: WARNING: unsafe permissions on homedir `/tmp/keys'

gpg (GnuPG) 1.4.5; Copyright (C) 2006 Free Software Foundation, Inc.
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions. See the file COPYING for details.
gpg: keyring `/tmp/keys/secring.gpg' created
gpg: keyring `/tmp/keys/pubring.gpg' created
Please select what kind of key you want:
   (1) DSA and Elgamal (default)
   (2) DSA (sign only)
   (5) RSA (sign only)
Your selection? 1
DSA keypair will have 1024 bits.
ELG-E keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 2048
Requested keysize is 2048 bits
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0) 10y
Key expires at Fri Feb 23 16:35:14 2018 PST
Is this correct? (y/N) y
You need a user ID to identify your key; the software constructs the user ID
from the Real Name, Comment and Email Address in this form:
    "Heinrich Heine (Der Dichter) <heinrichh@duesseldorf.de>"
Real name: Some User
Email address: some@user.com
Comment: This is a key for Some User
You selected this USER-ID:
    "Some User (This is a key for Some User) <some@user.com>"
Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? o
You need a Passphrase to protect your secret key. Enter your passphrase here
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
gpg: /tmp/keys/trustdb.gpg: trustdb created
gpg: key 21CCC3D6 marked as ultimately trusted
public and secret key created and signed.
.... a bunch of random characters here....
gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0  valid:   1  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 1u
gpg: next trustdb check due at 2018-02-24
pub   1024D/21CCC3D6 2008-02-27 [expires: 2018-02-24]
      Key fingerprint = FA45 1EE9 8772 70EF 1CFA  99CE 048A 6139 21CC C3D6
uid                  Some User (This is a key for Some User) <some@user.com>
sub   2048g/A83E754B 2008-02-27 [expires: 2018-02-24]
# chown -R apache:apache /var/www/.gnupg

Make note of the key fingerprint, shown on the line that begins with “Key fingerprint” near the bottom of the output. You’ll need this (with the spaces removed) in your PHP code when referencing the key. Also, make sure that you write down your passphrase somewhere. Your encrypted data will be useless if you don’t have the passphrase.

Your Application

Now you can write your PHP code that will do the encryption. Here is a sample that encrypts, then decrypts something:

<?php
$CONFIG['gnupg_home'] = '/var/www/.gnupg';
$CONFIG['gnupg_fingerprint'] = 'FA451EE9877270EF1CFA99CE048A613921CCC3D6';

$data = 'this is some confidential information';

// Point GnuPG at the keyring before creating the object
putenv("GNUPGHOME={$CONFIG['gnupg_home']}");
$gpg = new gnupg();
$gpg->seterrormode(GNUPG_ERROR_SILENT);
$gpg->addencryptkey($CONFIG['gnupg_fingerprint']);
$encrypted = $gpg->encrypt($data);
echo "Encrypted text: \n<pre>$encrypted</pre>\n";

// Now you can store $encrypted somewhere.. perhaps in a MySQL text or blob field.

// Then use something like this to decrypt the data.
$passphrase = 'Your_secret_passphrase';
$gpg->adddecryptkey($CONFIG['gnupg_fingerprint'], $passphrase);
$decrypted = $gpg->decrypt($encrypted);

echo "Decrypted text: $decrypted";
?>

It would be best to store $passphrase somewhere completely separate from your application configuration. Perhaps an admin user would be required to enter the passphrase when looking up this information. That way your passphrase is not stored in your config file or anywhere that an attacker could potentially gain access to it.

Troubleshooting

Make sure that the web server can write to the GnuPG Home directory. This obviously is not ideal, but seems to be required in the testing that I have done. I’ve been able to set ‘secring.gpg’ to be owned by root, but that does little good since the directory it is in has to be writable.

You can raise the error mode to GNUPG_ERROR_WARNING to generate PHP warnings on GnuPG errors. That might help to track down where errors are occurring.
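
Another option is to check the return value of each call and ask the object what went wrong. A rough sketch of that approach follows. gpg_encrypt_or_fail() is a name made up for this example, and it assumes that addencryptkey() and encrypt() return false on failure and that geterror() describes the last error.

```php
<?php
// Hypothetical helper around the gnupg extension: encrypt $data for the key
// with the given fingerprint, and throw if any step fails.
function gpg_encrypt_or_fail($home, $fingerprint, $data)
{
    if (!extension_loaded('gnupg')) {
        throw new RuntimeException('The gnupg PHP extension is not loaded');
    }
    putenv("GNUPGHOME=$home");
    $gpg = new gnupg();
    $gpg->seterrormode(GNUPG_ERROR_SILENT);

    if (!$gpg->addencryptkey($fingerprint)) {
        throw new RuntimeException('addencryptkey failed: ' . $gpg->geterror());
    }
    $encrypted = $gpg->encrypt($data);
    if ($encrypted === false) {
        throw new RuntimeException('encrypt failed: ' . $gpg->geterror());
    }
    return $encrypted;
}
```

Calling it inside a try/catch block and logging the exception message should narrow down which step is failing.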
