Find MySQL indexes that can be removed to free up disk space and improve performance

I wrote this handy query to find indexes that can be deleted because they have not been in use. It
queries the performance_schema database for usage on the indexes, and joins on INFORMATION_SCHEMA.TABLES
to see the index size.

Indexes that have zero reads and writes are obvious candidates for removal. They take extra write overhead to keep them
updated, and you can improve performance on a busy server by removing them. You can also free up some disk space
without them. The size column below helps to understand where you have the most opportunity for saving on disk
usage.

mysql>
SELECT. OBJECT_NAME,
        index_name,
        SUM(INDEX_LENGTH) AS size,
        SUM(count_star) AS count_star,
        SUM(count_read) AS count_read,
        SUM(count_write) AS count_write
FROM  table_io_waits_summary_by_index_usage
JOIN information_schema.TABLES
    ON table_io_waits_summary_by_index_usage.OBJECT_SCHEMA = TABLES.TABLE_SCHEMA
   AND table_io_waits_summary_by_index_usage.OBJECT_NAME = TABLES.TABLE_NAME
WHERE OBJECT_SCHEMA LIKE 'mydatabase%'
GROUP BY object_name, index_name
ORDER BY count_star ASC, size DESC
LIMIT 20;

+------------------------------+---------------------------------+-------------+------------+------------+-------------+
| OBJECT_NAME                  | index_name                      | size        | count_star | count_read | count_write |
+------------------------------+---------------------------------+-------------+------------+------------+-------------+
| transactions                 | order_id                        | 42406641664 |          0 |          0 |           0 |
| transactions                 | msku-timestamp                  | 42406641664 |          0 |          0 |           0 |
| transactions                 | fkTransactionsBaseEvent         | 42406641664 |          0 |          0 |           0 |
| baseEvent                    | PRIMARY                         | 33601945600 |          0 |          0 |           0 |
| baseEvent                    | eventTypeId                     | 33601945600 |          0 |          0 |           0 |
| orders                       | modified                        | 20579876864 |          0 |          0 |           0 |
| orders                       | buyerId-timestamp               | 20579876864 |          0 |          0 |           0 |
| productReports               | productAd-date-venue            |  8135458816 |          0 |          0 |           0 |
| shipmentEvent                | id                              |  7831928832 |          0 |          0 |           0 |
| shipmentEvent                | eventTypeId                     |  7831928832 |          0 |          0 |           0 |
| historyEvents                | timestamp_venue_entity          |  4567531520 |          0 |          0 |           0 |
| targetReports                | venueId-date-targetId           |  3069771776 |          0 |          0 |           0 |
| productAds                   | venue-productAd                 |  1530888192 |          0 |          0 |           0 |
| keywords                     | venue-keyword                   |   895598592 |          0 |          0 |           0 |
| targetingExpressions         | venue-target                    |   215269376 |          0 |          0 |           0 |
| targetingExpressions         | rType-rValue                    |   215269376 |          0 |          0 |           0 |
| serviceFeeEvent              | PRIMARY                         |    48234496 |          0 |          0 |           0 |
| serviceFeeEvent              | id                              |    48234496 |          0 |          0 |           0 |
| serviceFeeEvent              | eventTypeId                     |    48234496 |          0 |          0 |           0 |
| adGroups                     | venue-adGroup                   |    42336256 |          0 |          0 |           0 |

MySQL Encryption In-Transit Does NOT Require Client-Side Certificates

There are many articles around the Internet that discuss enabling Encryption in-transit to MySQL servers. They all include instructions about creating Client Certificates, but they don’t clearly explain that Client-Side Certificates are not a requirement to achieve end-to-end encryption between client and server.

Creating Client certificates that can be authenticated by the server can be complicated. It is not even possible in some scenarios, such as using servers hosted by AWS RDS, since AWS runs its own Certificate Authority. But don’t let that stop you. Below, I will demonstrate that enabling SSL/TLS on the server, and using a client that supports encryption is sufficient to securely encrypt traffic between the two.

First, I set up a MySQL server on RDS using MySQL version 8.0.25. Nothing special here, except that I’m going to make it “Publicly Accessible” which gives is a Public IP Address so that I can access it over the Internet. My Security Group here already allows inbound port 3307 from my desired IP Addresses for testing:

aws rds create-db-instance \
    --db-instance-identifier=encryption-tester \
    --allocated-storage=20 \
    --db-instance-class=db.t3.micro \
    --engine=mysql \
    --master-username=admin \
    --master-user-password="thepasswordIchose" \
    --vpc-security-group-ids="sg-0bf6fa7080100e55b" \
    --backup-retention-period=0 \
    --port=3307 \
    --no-multi-az \
    --engine-version=8.0.25 \
    --publicly-accessible

It takes several minutes for my Database Instance to be created, then I can log into it with the command:

mysql -h encryption-tester.accountidstuff.us-east-1.rds.amazonaws.com -u admin --port=3307 -p

I run the command show status like 'ssl_cipher'; and look at that! My connection is encrypted already, as indicated by Cipher method present:

$ mysql -h encryption-tester.accountidstuff.us-east-1.rds.amazonaws.com -u admin --port=3307 -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 17
Server version: 8.0.25 Source distribution

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show status like 'ssl_cipher';
+---------------+-----------------------------+
| Variable_name | Value                       |
+---------------+-----------------------------+
| Ssl_cipher    | ECDHE-RSA-AES128-GCM-SHA256 |
+---------------+-----------------------------+
1 row in set (0.01 sec)

I haven’t set up any client certificates or anything special, yet my connection is encrypted. But let’s not take the session variable’s word for it. Lets double-check by capturing and inspecting some packets.

I’ll run tcpdump with this command:

sudo tcpdump -i any host encryption-tester.accountidstuff.us-east-1.rds.amazonaws.com  -s 65535 -w /tmp/initial-connection.pcap

To make it quick, Instead of using a full packet-analysis program, I just run the strings command to look for text strings in the packet capture:

17:05 $ strings -8  /tmp/initial-connection.pcap
=JpgP~eS
mysql_native_password
Washington1
Seattle1"0
Amazon Web Services, Inc.1
Amazon RDS1%0#
Amazon RDS us-east-1 2019 CA0
210824170035Z
240822170850Z0
:encryption-tester.accountidstuff.us-east-1.rds.amazonaws.com1
Amazon.com1
Seattle1
Washington1
Seattle1
Washington1"0
Amazon Web Services, Inc.1
Amazon RDS1 0
Amazon RDS Root 2019 CA
:encryption-tester.accountidstuff.us-east-1.rds.amazonaws.com0
Seattle1
Washington1"0
Amazon Web Services, Inc.1
Amazon RDS1 0
Amazon RDS Root 2019 CA0
190919181653Z
240822170850Z0
Washington1
Seattle1"0
Amazon Web Services, Inc.1
Amazon RDS1%0#
Amazon RDS us-east-1 2019 CA0
HId0%aC>

Looks like a lot of stuff in that output about the certificate and SSL negotiation, but nothing containing the queries I executed.

I’m going to try it again and specifically disable encryption to see what the packets look like and ensure they contain the plain-text statements and responses I expect:

This is my SQL session:

17:05 $ mysql --ssl-mode=DISABLED -h encryption-tester.accountidstuff.us-east-1.rds.amazonaws.com -u admin --port=3307 -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 32
Server version: 8.0.25 Source distribution

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.25    |
+-----------+
1 row in set (0.00 sec)

mysql> show status like 'ssl_cipher';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Ssl_cipher    |       |
+---------------+-------+
1 row in set (0.01 sec)

mysql> \q

The strings in the packets captured during that session clearly contain things relevant to the commands that I executed:

17:05 $ strings -5 /tmp/skip-ssl.pcap
1%aCu
1%a\u
8.0.25
*Bi|tm
UkU-dsK
mysql_native_password
admin
mysql_native_password
Linux
_client_name
libmysql
24817
_client_version
5.7.35  _platform
x86_64
program_name
mysql
select @@version_comment limit 1
@@version_comment
Source distribution
select version()
        version()
8.0.25
show status like 'ssl_cipher'
performance_schema
session_status
session_status
Variable_name
Variable_name
performance_schema
session_status
session_status
Value
Value
Ssl_cipher

Conclusion:

SSL Client-Certificates are NOT required for traffic to be encrypted to a MySQL server. In fact, with a modern client and server, SSL is preferred and is automatically enabled. Just like I’d expect for traffic to be encrypted by default in 2021.

So what is the purpose of Client Certificates during a MySQL Connection

Client Certificates are intended to verify the identity of the Client. They are an extra step of authentication beyond a typical username and password. By presenting a client certificate that has been properly signed by a recognized Certificate Authority, the client is proving that their identity or system has been verified by the Certificate Authority.

Because SSL is complicated and is not well understood, many well-meaning people have instructions for creating a client key and client certificate, and transmitting those to authorized users. While that does provide a second piece of information needed to authenticate to the server, it is not how a secure client should authenticate.

The proper, fully secure method for a client to get a certificate is for the client to create its own private key. It should never share that key, even with the Certificate Authority. With the private key created, it would then create a certificate signing request (CSR), and present only the certificate signing request to the Certificate Authority. The certificate authority takes whatever steps it requires to verify the authenticity of the client, then provide the client back with a Client Certificate signed by the Certificate Authority. That Client Certificate is the client’s evidence that it’s identity has been verified by the Certificate Authority. The Certificate Authority is able to provide the client certificate without ever having the client’s private key.

Best Practices

If you run a MySQL Server and want to require that all clients to connect via SSL/TLS, you can set the global setting require_secure_transport to true. To require SSL only for specific users, use the CREATE USER ... REQUIRE SSL statement when creating the MySQL user

Installing Composer Packages with custom SSH Identities

Several posts around the Internet describe how to use a specific SSH Identity composer packages, but I can never find them when needed.

This is how I use a specific SSH identity file for packages deployed with GitHub via Deploy keys. GitHub allows a deploy key to be used with only a single repository, so if you have multiple repositories, you need a separate SSH key for each.

Create the SSH Key

ssh-keygen -t ed25519 -f ~/.ssh/repo-foobar -N '' -C "foobar-deploy"

Copy the contents of ~/.ssh/repo-foobar.pub into the “Deploy Key” section of the Repository settings in GitHub.

Now, you can script a deploy, including a composer install that includes that repository with the command

Use a custom GIT_SSH_COMMAND during composer install

cd /path/to/codebase
export GIT_SSH_COMMAND="ssh -i /home/username/.ssh/repo-foobar -o 'IdentitiesOnly yes '"
COMPOSER_HOME="/home/username/" composer install

The composer_install command uses the defined SSH command (instead of just plain ssh). In doing so, it uses the identity only from the specified key.

Have multiple repos included in your composer.json file that each need a separate identity?

You’ll need to create the SSH key and upload a separate key to GitHub for each repo. However, you can only specify one SSH key to use during the composer install. While there are more elegant solutions, I’ve found the simplest is just to run composer install multiple times, one for each package, and change the identity file between each one. The first execution will fail, but will keep the downloaded code in the composer cache. The second one won’t need to re-download the first again since it is already in the cache, and if you with as many packages as you have, it will eventually succeed, having downloaded each of them.

LastPass Challenges with Multiple Organizations

As a parallel entrepreneur, I’m a participating member of multiple companies. That brings with it some unique challenges, as many software tools don’t gracefully handle a user belonging to multiple organizations. I’ve learned to deal with that in many situations. Typically I’ll often having to log out and back in as the desired user or have multiple browsers or browser profiles open – one for each organization.

One area that has been particularly challenging has been group password management. There are not a lot of software options, although there are getting to be some new players. LastPass is the most mature option, and is the product that I have used for a long time. I investigated some alternatives including 1Password and DashLane. Both of those looked a little more modern and polished, but neither seemed to have mature support for multiple organizations.

Lastpass does claim to have robust support for organizations, but there is minimal, if any, mention on their website or elsewhere that mentions belonging to multiple organizations. It has taken me a lot of experimenting, but I’ve finally come up with a solution that works well.

You might think, as the diagram above indicates, that each organization to which you belong should invite your personal account to become a member of the organization. You would be wrong. Although this seems like the intuitive relationship, it does not work since LastPass only allows a personal account to attach to exactly one LastPass Enterprise account. Not more.

The correct way to belong to multiple Enterprise Accounts in LastPass is to choose one of the organizations to be your “Main” account to which you log in on a daily basis. You connect your Personal account to this enterprise account so that your personal sites appear alongside your work passwords.

Then, to add additional organizations, you don’t purchase a user license in those other organizations. Instead you create one or more shared folders, and share the folders with the email address for your “Main” organization account. There is a limitation that you can’t be an admin of the shared folders in these other organizations since you are not part of the Enterprise, but sharing and day-to-day password usage works generally as expected.

This method seems less intuitive, but works well now that I’ve figured it out. As I’ve learned more about how LastPass works internally, I understand why this unorthodox configuration is required

A few other quirks I’ve found, which just take some getting used-to:

  • Shared folders from my personal account DO NOT SHOW UP when logged into my enterprise account. You have to share to your main organization email address instead.
  • Folder structure in my Personal Account is not confusing in the User-Interface when browsing passwords in my enterprise account. The folder-within-folder structure doesn’t render well, and it is confusing as to which “level” I’m at.

I hope that the folks at LastPass are able to simplify this or make it more obvious how it is to be configured.

Do you have a better solution for password sharing with multiple organizations? Please let me and others know in the comments.

Configuring a Bitcoin Antminer S9 to Run Only part of the Day During Non-Peak Hours

I’ve started tinkering with Bitcoin Mining. Power usage is the single largest cost, so you need ensure that you are using the absolute least expensive power in order to generate as much profit as possible. I live in Georgia and subscribe to Georgia Power’s ‘Smart Usage’ program, which has lower costs most of the time, but peak periods which are Mondays-Fridays in June-September between 2pm and 7pm have a much higher power cost. My full calculation puts the non-peak price per kWh at about $0.048 and peak prices at about three times higher at $0.151 per kWh. Mining bitcoin on an Antminer S9 is mildly profitable at the non-peak price, but definitely loses money at the peak price.

Since it runs on a 220v plug, I can’t use something off-the-shelf like a smart plug to turn it on and off. I’m a linux geek anyway and would rather do it with software. The Antminer has a very bare-bones Linux OS, but it fortunately has crond installed, even though it is not running. These steps will enable crond and create a cron job that kills the bmminer process during the peak hours. It then reboots when the peak period is ending and starts everything back up.

Note that the machine is still on with fans running. It just doesn’t run the mining process which consumes all of the power.

You can see my power usage in the chart below, showing that power usage dropped significantly during the time from 2pm-7pm.

Here is how to make it work:

  1. SSH into the Antminer. Default user is root, password of admin
    ssh root@192.168.1.200
  2. Have it start cron at boot by adding this line to the bottom of /etc/inittab:
    echo "cron:2345:once:/usr/sbin/crond" >> /etc/inittab
  3. mkdir /var/spool/cron/crontabs
  4. Run crontab -e to edit the root crontab in vi
  5. Paste in this content, modify for your desired time times. Note that times are in Universal Coordinated Time (UTC)
  6. ## Here we will stop `single-board-test` and `bmminer` from running during "Peak" periods for Georgia Power
    ## when it is unprofitable to mine due to increase in power cost
    ## 'Peak' is defined as 2pm-7pm, Monday-Friday, in June-September
    
    ## Since monitorcg is started from inittab and can't effectively be killed, we kill single-board-test and bmminer every minute
    ## during the peak hours
    
    ## kill `single-board-test`, which monitors and restarts `bmminer`
    * 18-22 * 6-9 1-5  /bin/kill `/bin/ps -ef | /bin/grep single-board | /bin/grep -v grep | /usr/bin/head -n1 | /usr/bin/cut -c1-5`
    
    ## Also, obviously kill `bmminer`
    * 18-22 * 6-9 1-5 /bin/kill `/bin/ps -ef | /bin/grep bmminer       | /bin/grep -v grep | /usr/bin/head -n1 | /usr/bin/cut -c1-5`
    
    
    ## Reboot at 6:59pm EDT, which will restart the whole machine, bmminer with it (and takes a few minutes to start back up)
    59 22 * 6-9 1-5 /sbin/reboot
    
  7. Exit vi with saving by typing<ESC> :wq<ENTER>
  8. Finally, just type reboot at the command line to have the machine restart.

PHP Sessions with Redis Cluster (using AWS Elasticache)

I’ve recently been moving some of our project from a single Redis server (or server with a replica) to the more modern Redis Cluster configuration. However, when trying to set up PHP sessions to use the cluster, I found there wasn’t a lot of documentation or examples. This serves as a walk-through for setting up PHP sessions to use a redis Cluster, specifically with Elasticache on AWS.

First, create your Elasticache Redis Instance like so. Note the “Cluster Mode Enabled” is what causes redis to operate in Cluster mode.

AWS Elasticache Redis Creation

Once there servers are launched, make note of the Configuration Endpoint which should look something like: my-redis-server.dltwen.clustercfg.usw1.cache.amazonaws.com:6379

Finally, use these settings in your php.ini file. The exact location of this file will depend on your OS, but on modern Ubuntu instances, You can place it in /etc/php/7.0/apache2/conf.d/30-redis-sessions.ini

Note the special syntax for the save_path where is has seed[]=. You only need to put the main cluster configuration endpoint here. Not all of the individual instances as other examples online appear to use.


session.save_handler = rediscluster
session.save_path = "seed[]=my-redis-server.dltwen.clustercfg.usw1.cache.amazonaws.com:6379"
session.gc_maxlifetime = 1296000

That’s it. Restart your webserver and sessions should now get saved to your Redis cluster.

IIn the even that something goes wrong, you might see something like this in your web server log files:


PHP Warning: Unknown: Failed to write session data (redis). Please verify that the current setting of session.save_path is correct (tcp://my-redis-server.dltwen.clustercfg.use1.cache.amazonaws.com:6379) in Unknown on line 0

MySQL Statistics for Updates/Inserts per-table

For a long time, I’ve never been able to answer some basic questions that I thought fundamental to optimizing server performance. MySQL gives you some server-wide metrics about activity, but none of it is broken down per-table so that an application developer could look into where to reduce the number of writes, or generally where to focus their attention in order to improve the server performance.

I finally got ambitious enough to tackle this problem and asked a question on StackOverflow at http://stackoverflow.com/questions/39459185/mysql-how-to-count-the-number-of-inserts-updates-to-a-table

A commenter named barat pointed me to this post which had the insightful idea of parsing the binary log for analysis.
Since my servers are generally hosted on AWS, I don’t have direct access to the binary log, so I had to retrieve those. The MySQL documentation for the mysqlbinlog command briefly mentions how to read the binary log from a remote server. It took some experimentation to get the right command and output options with all of the data I wanted. Specifically, the `–base64-output=DECODE-ROWS –verbose` options which translate some of the row-based logging into MySQL commands that can be parsed.

The first step is to create a user that has access to the binary logs. I used the main ‘admin’ user that RDS creates because it was convenient. If creating a new user, you probably need to grant the REPLICATION_SLAVE privilege.

You can see which binary logs are available on the server with the SHOW BINARY LOGS; command:

mysql> show binary logs;
+----------------------------+-----------+
| Log_name                   | File_size |
+----------------------------+-----------+
| mysql-bin-changelog.232522 | 16943219  |
| mysql-bin-changelog.232523 | 32300889  |
| mysql-bin-changelog.232524 | 15470603  |
+----------------------------+-----------+

Then you can actually retrieve the log and print to STDOUT using this command:

14:01 $ mysqlbinlog --read-from-remote-server \
  --host myhost.somerandomchars.us-east-1.rds.amazonaws.com \
  --user admin \
  --password="mypassword"
  mysql-bin-changelog.232522

Note that if you get the error below, you need to make sure that your MySQL client and server tools are using the same version. I originally attempted to use MySQL 5.5 tools with a MySQL 5.6 server.

ERROR: Got error reading packet from server: Slave can not handle replication events with
the checksum that master is configured to log; the first event 'mysql-bin-changelog.232519'
at 4, the last event read from '/rdsdbdata/log/binlog/mysql-bin-changelog.232519' at 120,
the last byte read from '/rdsdbdata/log/binlog/mysql-bin-changelog.232519' at 120.

After that, it was just a matter of parsing the file for the relevant commands. I’ve put all of that logic now into a quick PHP script that I can reuse anywhere. Now, I can go through a bunch of binary logs on a server and see which tables are updated the most frequently with output like this:

Parsed 1,096,063 lines spanning 300 seconds between 2016-09-13 03:05:00 and 2016-09-13 03:10:00
master                         metrics                        update          = 43570
master                         metrics                        insert into     = 9
DEFAULT                        accounts                       update          = 501
DEFAULT                        users                          update          = 5
DEFAULT                        logins                         insert into     = 1
mysql                          rds_heartbeat2                 insert into     = 1

I’ve committed this project to Github at https://github.com/sellerlabs/mysql-writes-per-table for others to use.

Docker Syslog Container for Sending Logs to CloudWatch

AWS’s CloudWatch Logs was first available about a year ago, and to my estimation has gone largely unnoticed. The initial iteration was pretty rough, but some recent changes have made it more useful, including the ability to search logs, and generate events for monitoring in CloudWatch from log content.

Unfortunately, the Cloudwatch Logs agent just watches log files on disk and doesn’t act as a syslog server. An AWS blog post explained how to get the the Cloudwatch Logs Agent running inside a container and monitoring the log output from rsyslogd, but the instructions used Amazon’s ECS service, which still doesn’t quite offer the flexibility that CoreOS or Deis offer IMHO. ECS does some magic behind the scenes in passing credentials around that you have to do yourself when using CoreOS.

I’ve just provided a GitHub repository with the tools to make this work pretty easily, as well as a Docker Image with some reasonable defaults.

When trying to pull all of this together to work, I discovered a problem due to a bug in the overlayfs that is in current Deis Releases which causes the AWS Logs agent not to notice changes in the syslog files. A workaround is available that reformats the host OS back to btrfs to solve that particular problem

Note when running on Deis 561+ to revert to btrfs

Deis add Key from an ssh-agent

Evidently it is not possible to add an SSH key directly from an SSH agent. Instead, you can grep the public key from your ~/.ssh/authorized_keys file then, have deis use that key. Or if you only have one line in ~/.ssh/authorized_keys, you can just tell deis to use that file directly with the command

deis keys:add ~/.ssh/authorized_keys

Unattended install of Cloudwatch Logs Agent

So far, I’m pretty impressed with cloudwatch logs. The interface for it isn’t as fancy, and search capability isn’t as deep as other tools like PaperTrail or Loggly, but the cost is significantly less, and I like the fact that you can store different log groups for different lengths of time.

I’m trying to get the cloudwatch logs agent to install as part of an automated script, and couldn’t find any easy instructions to do that, so here is how I got it working with a shell script on an Ubuntu 14.04 host

echo Creating cloudwatch config file in /root/awslogs.conf
cat <<EOF >/root/awslogs.conf
[general]
state_file = /var/awslogs/state/agent-state
## Your config file would have a lot more with the logs that you want to monitor and send to Cloudwatch
EOF

echo Creating aws credentials in /root/.aws/credentials
cat <<EOF > /root/.aws/credentials
[default]
aws_access_key_id = YOUR_AWS_ACCESS_KEY_HERE
aws_secret_access_key = YOUR_AWS_SECRET_KEY_HERE
EOF

echo Downloading cloudwatch logs setup agent
cd /root
wget https://s3.amazonaws.com/aws-cloudwatch/downloads/latest/awslogs-agent-setup.py
echo running non-interactive cloudwatch-logs setup script
python ./awslogs-agent-setup.py --region us-west-2 --non-interactive --configfile=/root/awslogs.conf