Google Docs and Sheets Should Almost Always Be Restricted to Defined Users

Somebody sends you a link to a Google Sheet and it just works. It’s magical.
But that magic comes at a cost. I see far, far too many organizations that regularly share Google Documents and Sheets by using the share with “Anyone with the link” option that Google easily provides.

That is almost ALWAYS a bad idea. The convenience of having it shared with anybody is, at the same time, a potential security problem today and in the future.

But that long link with the 44 random-looking characters would be impossible for somebody to guess, right?

Yes. It would be statistically improbable for somebody to just guess a random string of 44 characters that corresponds to an actual document. It is possible that an attacker could write a program to generate millions and millions of candidate links and try them until it found some documents that actually exist. But that’s not the most likely weakness.

Consider what happens when you email a link to your spreadsheet to somebody else. You have zero control over who accesses it after that. What if the recipient forwards your email with the link to somebody else? Emails to businesses are often forwarded into Customer Relationship Management (CRM) or similar systems, where that link is now accessible to many other people in the organization. What if an attacker has access to a recipient’s email? Or to a CRM system? How about if an employee leaves the company and still has the link in their browser history?

In all of those scenarios, and hundreds more that you can’t imagine, if your document is shared with “Anyone with the link”, literally anybody who sees that link can open it, and you have absolutely no knowledge that they did.

Always share only with specific email addresses.

Sharing with Google Groups

Sharing with specific people can become a headache to maintain as people change roles. Consider using the Google Groups feature in your organization. You can set up a Google Group for something like ‘’ or ‘’ and ask to have documents shared with that group instead of individual people. You can then add and remove people from the groups to provide access to only those that are allowed.
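
If you manage sharing programmatically, the same advice applies. As a rough sketch (the group address, file ID, and OAuth token below are placeholders, not values from this article), the Google Drive v3 API lets you grant access to a Google Group instead of “anyone with the link”:

curl -X POST \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"type": "group", "role": "reader", "emailAddress": "finance-team@example.com"}' \
  "https://www.googleapis.com/drive/v3/files/FILE_ID/permissions"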

See more information about sharing with Groups at

Solving ECS Stuck in Pending and Frozen / Stalled ECS Hosts Problems

We’ve had a strange, hard-to-track-down problem for months now. It has felt like a bug with Amazon ECS, but everything seemed to be working correctly.

The main way that we’ve observed this problem is that ECS would say that it was launching tasks, but they would stay in a “PENDING” state forever. Similarly, when tasks needed to be killed, the desired state would change to Stopped, but the ECS Console would indicate that they were still running. We quickly discovered that some of our ECS Host Servers would become completely unresponsive, sometimes with 100% CPU usage, sometimes with near-zero CPU usage. Terminating the instance and having the Auto-Scaling group recreate it would generally solve the problem, but it’s never good to have things frozen without understanding why.

Often, the host servers would be completely unresponsive, and we were usually unable to SSH into them to investigate. When we could get in, we looked through the logs and found them full of failures about being unable to reach external resources. After diving pretty deep, we figured out that the route table was missing a default gateway. It’s hard to talk to anything when you can only use the local network.

Here is an example of the routing table with the default gateway missing:

[ec2-user@ip-172-31-45-74 ~]$ route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
                                                U     0      0        0 docker0
                                                U     0      0        0 eth0

On a functioning instance, it should look like this. Notice the default route entry: destination 0.0.0.0, flags UG, with the Gateway column containing the IP address of the default gateway:

[ec2-user@ip-172-31-39-228 ~]$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0                         0.0.0.0         UG    0      0        0 eth0
                                                UH    0      0        0 eth0
                                                U     0      0        0 docker0
                                                U     0      0        0 eth0

It was puzzling how the machine would work for a while, and then its default gateway would disappear.

I’m still not certain how exactly that is happening. However, the system log indicates that there is a period of extremely high load
and it gets frozen for minutes (maybe hours) at a time.

Some of these log entries are indicative of major delays:

Jan 20 13:26:44 ip-172-31-123-45.ec2.internal crond[21992]: (root) INFO (Job execution of per-minute job scheduled for 13:25 delayed into subsequent minute 13:26. Skipping job run.)

Jan 17 21:20:31 ip-172-31-45-166.ec2.internal chronyd[2696]: Forward time jump detected!

Notice how these logs are out of order too:

Jan 20 13:39:22 ip-172-31-123-45.ec2.internal kernel: R13: 00007faf9dc777a8 R14: 00000000000031f9 R15: 00007faf9dc7d510
Jan 20 13:28:30 ip-172-31-123-45.ec2.internal dockerd[4660]: http: superfluous response.WriteHeader call from (httputils.go:107)
Jan 20 13:36:03 ip-172-31-123-45.ec2.internal dhclient[3275]: XMT: Solicit on eth0, interval 129760ms.
Jan 20 13:28:30 ip-172-31-123-45.ec2.internal dockerd[4660]: http: superfluous response.WriteHeader call from (httputils.go:107)

Finally, this may be the thing that ultimately disables the networking. It looks like `oom-killer` killed the `dhclient-script`, which may have left the network in a very bad state:

Jan 20 15:28:36 ip-172-31-45-74.ec2.internal kernel: dhclient-script invoked oom-killer: gfp_mask=0x14201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=(null),  order=0, oom_score_adj=0
Jan 20 15:28:36 ip-172-31-45-74.ec2.internal kernel: dhclient-script cpuset=/ mems_allowed=0

You can simply run

sudo dhclient eth0

to have it grab the default gateway from DHCP again. But it’s best to put memory limits in place to keep the host from running out of resources in the first place.
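
On ECS, one way to do that (a sketch only; ECS_RESERVED_MEMORY is a real agent setting, but the 256 MiB figure is just an example to tune for your hosts) is to reserve some host memory that the scheduler will never hand out to tasks, leaving headroom for the OS, dhclient, and the ECS agent:

# /etc/ecs/ecs.config on the ECS container instance:
# memory (in MiB) that will NOT be made available to tasks
echo "ECS_RESERVED_MEMORY=256" | sudo tee -a /etc/ecs/ecs.config

# restart the agent so the setting takes effect
sudo systemctl restart ecs        # Amazon Linux 2 ECS-optimized AMI
# sudo stop ecs && sudo start ecs # older Amazon Linux 1 AMIs

Putting hard memory limits on the containers in your task definitions attacks the same problem from the other direction.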

Find MySQL indexes that can be removed to free up disk space and improve performance

I wrote this handy query to find indexes that can be deleted because they have not been in use. It
queries the performance_schema database for usage of each index, and joins on INFORMATION_SCHEMA.TABLES
to see the index size.

Indexes that have zero reads and writes are obvious candidates for removal. They add write overhead to keep them
updated, so you can improve performance on a busy server by removing them, and you free up some disk space
as well. The size column below helps show where you have the most opportunity for saving disk space.

SELECT  object_name,
        index_name,
        SUM(INDEX_LENGTH) AS size,
        SUM(count_star) AS count_star,
        SUM(count_read) AS count_read,
        SUM(count_write) AS count_write
FROM  performance_schema.table_io_waits_summary_by_index_usage
JOIN information_schema.TABLES
    ON table_io_waits_summary_by_index_usage.OBJECT_SCHEMA = TABLES.TABLE_SCHEMA
   AND table_io_waits_summary_by_index_usage.OBJECT_NAME = TABLES.TABLE_NAME
GROUP BY object_name, index_name
ORDER BY count_star ASC, size DESC;

| OBJECT_NAME                  | index_name                      | size        | count_star | count_read | count_write |
| transactions                 | order_id                        | 42406641664 |          0 |          0 |           0 |
| transactions                 | msku-timestamp                  | 42406641664 |          0 |          0 |           0 |
| transactions                 | fkTransactionsBaseEvent         | 42406641664 |          0 |          0 |           0 |
| baseEvent                    | PRIMARY                         | 33601945600 |          0 |          0 |           0 |
| baseEvent                    | eventTypeId                     | 33601945600 |          0 |          0 |           0 |
| orders                       | modified                        | 20579876864 |          0 |          0 |           0 |
| orders                       | buyerId-timestamp               | 20579876864 |          0 |          0 |           0 |
| productReports               | productAd-date-venue            |  8135458816 |          0 |          0 |           0 |
| shipmentEvent                | id                              |  7831928832 |          0 |          0 |           0 |
| shipmentEvent                | eventTypeId                     |  7831928832 |          0 |          0 |           0 |
| historyEvents                | timestamp_venue_entity          |  4567531520 |          0 |          0 |           0 |
| targetReports                | venueId-date-targetId           |  3069771776 |          0 |          0 |           0 |
| productAds                   | venue-productAd                 |  1530888192 |          0 |          0 |           0 |
| keywords                     | venue-keyword                   |   895598592 |          0 |          0 |           0 |
| targetingExpressions         | venue-target                    |   215269376 |          0 |          0 |           0 |
| targetingExpressions         | rType-rValue                    |   215269376 |          0 |          0 |           0 |
| serviceFeeEvent              | PRIMARY                         |    48234496 |          0 |          0 |           0 |
| serviceFeeEvent              | id                              |    48234496 |          0 |          0 |           0 |
| serviceFeeEvent              | eventTypeId                     |    48234496 |          0 |          0 |           0 |
| adGroups                     | venue-adGroup                   |    42336256 |          0 |          0 |           0 |
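
Once you’re confident an index really is unused (ideally after watching these counters over a long enough window to cover periodic jobs, and bearing in mind that performance_schema counters reset when the server restarts), dropping it is a single statement. The host and database below are placeholders; the table and index names are just examples pulled from my output above:

mysql -h <host> -u admin -p -D <database> -e 'ALTER TABLE transactions DROP INDEX `msku-timestamp`;'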

MySQL Encryption In-Transit Does NOT Require Client-Side Certificates

There are many articles around the Internet that discuss enabling Encryption in-transit to MySQL servers. They all include instructions about creating Client Certificates, but they don’t clearly explain that Client-Side Certificates are not a requirement to achieve end-to-end encryption between client and server.

Creating Client certificates that can be authenticated by the server can be complicated. It is not even possible in some scenarios, such as using servers hosted by AWS RDS, since AWS runs its own Certificate Authority. But don’t let that stop you. Below, I will demonstrate that enabling SSL/TLS on the server, and using a client that supports encryption is sufficient to securely encrypt traffic between the two.

First, I set up a MySQL server on RDS using MySQL version 8.0.25. Nothing special here, except that I’m going to make it “Publicly Accessible”, which gives it a Public IP Address so that I can access it over the Internet. My Security Group here already allows inbound port 3307 from my desired IP Addresses for testing:

aws rds create-db-instance \
    --db-instance-identifier=encryption-tester \
    --allocated-storage=20 \
    --db-instance-class=db.t3.micro \
    --engine=mysql \
    --master-username=admin \
    --master-user-password="thepasswordIchose" \
    --vpc-security-group-ids="sg-0bf6fa7080100e55b" \
    --backup-retention-period=0 \
    --port=3307 \
    --no-multi-az \
    --engine-version=8.0.25 \
    --publicly-accessible

It takes several minutes for my Database Instance to be created, then I can log into it with the command:

mysql -h -u admin --port=3307 -p

I run the command show status like 'ssl_cipher'; and look at that! My connection is already encrypted, as indicated by the cipher shown:

$ mysql -h -u admin --port=3307 -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 17
Server version: 8.0.25 Source distribution

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show status like 'ssl_cipher';
| Variable_name | Value                       |
| Ssl_cipher    | ECDHE-RSA-AES128-GCM-SHA256 |
1 row in set (0.01 sec)

I haven’t set up any client certificates or anything special, yet my connection is encrypted. But let’s not take the session variable’s word for it. Let’s double-check by capturing and inspecting some packets.

I’ll run tcpdump with this command:

sudo tcpdump -i any host  -s 65535 -w /tmp/initial-connection.pcap

To make it quick, instead of using a full packet-analysis program, I just run the strings command to look for text strings in the packet capture:

17:05 $ strings -8  /tmp/initial-connection.pcap
Amazon Web Services, Inc.1
Amazon RDS1%0#
Amazon RDS us-east-1 2019 CA0
Amazon Web Services, Inc.1
Amazon RDS1 0
Amazon RDS Root 2019 CA
Amazon Web Services, Inc.1
Amazon RDS1 0
Amazon RDS Root 2019 CA0
Amazon Web Services, Inc.1
Amazon RDS1%0#
Amazon RDS us-east-1 2019 CA0

Looks like a lot of stuff in that output about the certificate and SSL negotiation, but nothing containing the queries I executed.

I’m going to try it again and specifically disable encryption to see what the packets look like and ensure they contain the plain-text statements and responses I expect:

This is my SQL session:

17:05 $ mysql --ssl-mode=DISABLED -h -u admin --port=3307 -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 32
Server version: 8.0.25 Source distribution

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select version();
| version() |
| 8.0.25    |
1 row in set (0.00 sec)

mysql> show status like 'ssl_cipher';
| Variable_name | Value |
| Ssl_cipher    |       |
1 row in set (0.01 sec)

mysql> \q

The strings in the packets captured during that session clearly contain things relevant to the commands that I executed:

17:05 $ strings -5 /tmp/skip-ssl.pcap
5.7.35  _platform
select @@version_comment limit 1
Source distribution
select version()
show status like 'ssl_cipher'


SSL Client-Certificates are NOT required for traffic to be encrypted to a MySQL server. In fact, with a modern client and server, SSL is preferred and is automatically enabled. Just like I’d expect for traffic to be encrypted by default in 2021.

So what is the purpose of Client Certificates during a MySQL Connection?

Client Certificates are intended to verify the identity of the Client. They are an extra step of authentication beyond a typical username and password. By presenting a client certificate that has been properly signed by a recognized Certificate Authority, the client is proving that their identity or system has been verified by the Certificate Authority.

Because SSL is complicated and is not well understood, many well-meaning people have instructions for creating a client key and client certificate, and transmitting those to authorized users. While that does provide a second piece of information needed to authenticate to the server, it is not how a secure client should authenticate.

The proper, fully secure method for a client to get a certificate is for the client to create its own private key. It should never share that key, even with the Certificate Authority. With the private key created, it then creates a certificate signing request (CSR) and presents only the CSR to the Certificate Authority. The Certificate Authority takes whatever steps it requires to verify the authenticity of the client, then provides the client with a Client Certificate signed by the Certificate Authority. That Client Certificate is the client’s evidence that its identity has been verified by the Certificate Authority. The Certificate Authority is able to provide the client certificate without ever having the client’s private key.
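
As a concrete illustration of that flow (a sketch only; the file names and the CN are made up, and the CA’s own signing process will vary):

# 1. The client generates its own private key and never shares it
openssl genrsa -out client-key.pem 2048

# 2. The client builds a CSR from that key and sends ONLY the CSR to the CA
openssl req -new -key client-key.pem -out client-req.pem -subj "/CN=my-client"

# 3. The CA verifies the client and returns a signed client-cert.pem,
#    which the client can then present along with its private key:
mysql --ssl-ca=ca.pem --ssl-cert=client-cert.pem --ssl-key=client-key.pem -h <host> -u <user> -p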

Best Practices

If you run a MySQL Server and want to require that all clients connect via SSL/TLS, you can set the global setting require_secure_transport to true. To require SSL only for specific users, use the CREATE USER ... REQUIRE SSL statement when creating the MySQL user.
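
For example (a minimal sketch; the host, user name, and password are placeholders, and on RDS you would normally set require_secure_transport through the parameter group rather than with SET GLOBAL):

# Require TLS for every connection to the server
mysql -h <host> -u admin -p -e "SET GLOBAL require_secure_transport = ON;"

# Or require it for a single account only
mysql -h <host> -u admin -p -e "CREATE USER 'app'@'%' IDENTIFIED BY 'a-strong-password' REQUIRE SSL;"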

Installing Composer Packages with custom SSH Identities

Several posts around the Internet describe how to use a specific SSH Identity for composer packages, but I can never find them when needed.

This is how I use a specific SSH identity file for packages deployed with GitHub via Deploy keys. GitHub allows a deploy key to be used with only a single repository, so if you have multiple repositories, you need a separate SSH key for each.

Create the SSH Key

ssh-keygen -t ed25519 -f ~/.ssh/repo-foobar -N '' -C "foobar-deploy"

Copy the contents of ~/.ssh/repo-foobar.pub into the “Deploy Key” section of the Repository settings in GitHub.

Now you can script a deploy, including a composer install that pulls in that repository, with the commands below.

Use a custom GIT_SSH_COMMAND during composer install

cd /path/to/codebase
export GIT_SSH_COMMAND="ssh -i /home/username/.ssh/repo-foobar -o IdentitiesOnly=yes"
COMPOSER_HOME="/home/username/" composer install

The composer install command uses the defined GIT_SSH_COMMAND (instead of just plain ssh), so git fetches the repository using only the specified identity.

Have multiple repos included in your composer.json file that each need a separate identity?

You’ll need to create a separate SSH key and upload it to GitHub for each repo. However, you can only specify one SSH key to use during the composer install. While there are more elegant solutions, I’ve found the simplest is to run composer install multiple times, once per package, changing the identity file between runs. Each run fails on the repositories it can’t reach, but keeps the code it did download in the composer cache. The next run doesn’t need to re-download those packages since they are already cached, so after as many runs as you have packages, it will eventually succeed, having downloaded each of them. A rough sketch of this is shown below.
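
A rough sketch of that repeated-install loop (the key names here are placeholders for however many deploy keys you actually have):

cd /path/to/codebase
for key in repo-foobar repo-bazqux; do
    export GIT_SSH_COMMAND="ssh -i /home/username/.ssh/$key -o IdentitiesOnly=yes"
    # early iterations fail on the repositories this key cannot reach,
    # but each run adds whatever it could download to the composer cache
    COMPOSER_HOME="/home/username/" composer install || true
done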

LastPass Challenges with Multiple Organizations

As a parallel entrepreneur, I’m a participating member of multiple companies. That brings with it some unique challenges, as many software tools don’t gracefully handle a user belonging to multiple organizations. I’ve learned to deal with that in many situations: typically I’ll have to log out and back in as the desired user, or keep multiple browsers or browser profiles open – one for each organization.

One area that has been particularly challenging has been group password management. There are not a lot of software options, although there are getting to be some new players. LastPass is the most mature option, and is the product that I have used for a long time. I investigated some alternatives including 1Password and DashLane. Both of those looked a little more modern and polished, but neither seemed to have mature support for multiple organizations.

LastPass does claim to have robust support for organizations, but there is minimal, if any, mention on their website or elsewhere of belonging to multiple organizations. It has taken a lot of experimenting, but I’ve finally come up with a solution that works well.

You might think, as the diagram above indicates, that each organization to which you belong should invite your personal account to become a member of the organization. You would be wrong. Although this seems like the intuitive relationship, it does not work since LastPass only allows a personal account to attach to exactly one LastPass Enterprise account. Not more.

The correct way to belong to multiple Enterprise Accounts in LastPass is to choose one of the organizations to be your “Main” account to which you log in on a daily basis. You connect your Personal account to this enterprise account so that your personal sites appear alongside your work passwords.

Then, to add additional organizations, you don’t purchase a user license in those other organizations. Instead you create one or more shared folders, and share the folders with the email address for your “Main” organization account. There is a limitation that you can’t be an admin of the shared folders in these other organizations since you are not part of the Enterprise, but sharing and day-to-day password usage works generally as expected.

This method seems less intuitive, but works well now that I’ve figured it out. As I’ve learned more about how LastPass works internally, I understand why this unorthodox configuration is required.

A few other quirks I’ve found, which just take some getting used-to:

  • Shared folders from my personal account DO NOT SHOW UP when logged into my enterprise account. You have to share to your main organization email address instead.
  • Folder structure from my Personal Account does not render well in the User-Interface when browsing passwords in my enterprise account. The folder-within-folder structure is confusing as to which “level” I’m at.

I hope that the folks at LastPass are able to simplify this or make it more obvious how it is to be configured.

Do you have a better solution for password sharing with multiple organizations? Please let me and others know in the comments.

Configuring a Bitcoin Antminer S9 to Run Only Part of the Day During Non-Peak Hours

I’ve started tinkering with Bitcoin mining. Power is the single largest cost, so you need to ensure that you are using the least expensive power available in order to generate as much profit as possible. I live in Georgia and subscribe to Georgia Power’s ‘Smart Usage’ program, which has lower costs most of the time, but with peak periods (Mondays through Fridays, June through September, between 2pm and 7pm) that have a much higher power cost. My full calculation puts the non-peak price at about $0.048 per kWh and the peak price at roughly three times that, about $0.151 per kWh. Mining bitcoin on an Antminer S9 is mildly profitable at the non-peak price, but definitely loses money at the peak price.

Since it runs on a 220v plug, I can’t use something off-the-shelf like a smart plug to turn it on and off. I’m a linux geek anyway and would rather do it with software. The Antminer has a very bare-bones Linux OS, but it fortunately has crond installed, even though it is not running. These steps will enable crond and create a cron job that kills the bmminer process during the peak hours. It then reboots when the peak period is ending and starts everything back up.

Note that the machine is still on with fans running. It just doesn’t run the mining process which consumes all of the power.

You can see my power usage in the chart below, showing that power usage dropped significantly during the time from 2pm-7pm.

Here is how to make it work:

  1. SSH into the Antminer. Default user is root, password of admin
    ssh root@
  2. Have it start cron at boot by adding this line to the bottom of /etc/inittab:
    echo "cron:2345:once:/usr/sbin/crond" >> /etc/inittab
  3. mkdir /var/spool/cron/crontabs
  4. Run crontab -e to edit the root crontab in vi
  5. Paste in this content, modifying it for your desired times. Note that times are in Coordinated Universal Time (UTC):
    ## Here we will stop `single-board-test` and `bmminer` from running during "Peak" periods for Georgia Power
    ## when it is unprofitable to mine due to increase in power cost
    ## 'Peak' is defined as 2pm-7pm, Monday-Friday, in June-September
    ## Since monitorcg is started from inittab and can't effectively be killed, we kill single-board-test and bmminer every minute
    ## during the peak hours
    ## kill `single-board-test`, which monitors and restarts `bmminer`
    * 18-22 * 6-9 1-5  /bin/kill `/bin/ps -ef | /bin/grep single-board | /bin/grep -v grep | /usr/bin/head -n1 | /usr/bin/cut -c1-5`
    ## Also, obviously kill `bmminer`
    * 18-22 * 6-9 1-5 /bin/kill `/bin/ps -ef | /bin/grep bmminer       | /bin/grep -v grep | /usr/bin/head -n1 | /usr/bin/cut -c1-5`
    ## Reboot at 6:59pm EDT, which will restart the whole machine, bmminer with it (and takes a few minutes to start back up)
    59 22 * 6-9 1-5 /sbin/reboot
  6. Exit vi, saving the file, by typing <ESC> :wq <ENTER>
  7. Finally, just type reboot at the command line to have the machine restart.
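
Once it comes back up, a quick sanity check (assuming the BusyBox userland on the stock Antminer firmware):

ps | grep crond    # crond should now be running
crontab -l         # should print the schedule pasted above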

PHP Sessions with Redis Cluster (using AWS Elasticache)

I’ve recently been moving some of our projects from a single Redis server (or a server with a replica) to the more modern Redis Cluster configuration. However, when trying to set up PHP sessions to use the cluster, I found there wasn’t a lot of documentation or examples. This serves as a walk-through for setting up PHP sessions to use a Redis Cluster, specifically with Elasticache on AWS.

First, create your Elasticache Redis instance like so. Note that “Cluster Mode Enabled” is what causes Redis to operate in Cluster mode.

AWS Elasticache Redis Creation

Once the servers are launched, make note of the Configuration Endpoint, which should look something like:

Finally, use these settings in your php.ini file. The exact location of this file will depend on your OS, but on modern Ubuntu instances, you can place it in /etc/php/7.0/apache2/conf.d/30-redis-sessions.ini

Note the special syntax for the save_path, where it has seed[]=. You only need to put the main cluster configuration endpoint here, not all of the individual instances as other examples online appear to use.

session.save_handler = rediscluster
session.save_path = "seed[]"
session.gc_maxlifetime = 1296000

That’s it. Restart your webserver and sessions should now get saved to your Redis cluster.
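
To confirm sessions are actually landing in the cluster, one rough check (the endpoint below is a placeholder, and PHPREDIS_SESSION is the phpredis extension’s default key prefix) is to load a page that starts a session and then scan for session keys. Note that --scan only inspects the node you connect to, so you may need to repeat it against the other nodes:

redis-cli -c -h my-cluster.xxxxxx.clustercfg.use1.cache.amazonaws.com -p 6379 \
    --scan --pattern 'PHPREDIS_SESSION*'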

In the event that something goes wrong, you might see something like this in your web server log files:

PHP Warning: Unknown: Failed to write session data (redis). Please verify that the current setting of session.save_path is correct (tcp:// in Unknown on line 0

MySQL Statistics for Updates/Inserts per-table

For a long time, I’ve never been able to answer some basic questions that I thought fundamental to optimizing server performance. MySQL gives you some server-wide metrics about activity, but none of it is broken down per-table so that an application developer could look into where to reduce the number of writes, or generally where to focus their attention in order to improve the server performance.

I finally got ambitious enough to tackle this problem and asked a question on StackOverflow at

A commenter named barat pointed me to this post which had the insightful idea of parsing the binary log for analysis.
Since my servers are generally hosted on AWS, I don’t have direct access to the binary log, so I had to retrieve it remotely. The MySQL documentation for the mysqlbinlog command briefly mentions how to read the binary log from a remote server. It took some experimentation to get the right command and output options with all of the data I wanted. Specifically, the `--base64-output=DECODE-ROWS --verbose` options, which translate the row-based logging into MySQL statements that can be parsed.

The first step is to create a user that has access to the binary logs. I used the main ‘admin’ user that RDS creates because it was convenient. If creating a new user, you probably need to grant the REPLICATION SLAVE privilege.
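
If you do create a dedicated user for this, a minimal sketch (the host, user name, and password are placeholders; REPLICATION CLIENT is included so the user can also run SHOW BINARY LOGS) would look something like:

mysql -h <host> -u admin -p -e "CREATE USER 'binlog_reader'@'%' IDENTIFIED BY 'a-strong-password';
GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'binlog_reader'@'%';"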

You can see which binary logs are available on the server with the SHOW BINARY LOGS; command:

mysql> show binary logs;
| Log_name                   | File_size |
| mysql-bin-changelog.232522 | 16943219  |
| mysql-bin-changelog.232523 | 32300889  |
| mysql-bin-changelog.232524 | 15470603  |

Then you can actually retrieve the log and print to STDOUT using this command:

14:01 $ mysqlbinlog --read-from-remote-server \
  --host \
  --user admin \
  --base64-output=DECODE-ROWS --verbose \
  mysql-bin-changelog.232522

Note that if you get the error below, you need to make sure that your MySQL client and server tools are using the same version. I originally attempted to use MySQL 5.5 tools with a MySQL 5.6 server.

ERROR: Got error reading packet from server: Slave can not handle replication events with
the checksum that master is configured to log; the first event 'mysql-bin-changelog.232519'
at 4, the last event read from '/rdsdbdata/log/binlog/mysql-bin-changelog.232519' at 120,
the last byte read from '/rdsdbdata/log/binlog/mysql-bin-changelog.232519' at 120.

After that, it was just a matter of parsing the file for the relevant commands. I’ve put all of that logic now into a quick PHP script that I can reuse anywhere. Now, I can go through a bunch of binary logs on a server and see which tables are updated the most frequently with output like this:

Parsed 1,096,063 lines spanning 300 seconds between 2016-09-13 03:05:00 and 2016-09-13 03:10:00
master                         metrics                        update          = 43570
master                         metrics                        insert into     = 9
DEFAULT                        accounts                       update          = 501
DEFAULT                        users                          update          = 5
DEFAULT                        logins                         insert into     = 1
mysql                          rds_heartbeat2                 insert into     = 1
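
If you just want a rough approximation of what the script does, similar counts can be pulled straight from the shell. This sketch (the host is a placeholder) relies on the `### UPDATE` / `### INSERT INTO` comment lines that the --verbose output produces:

mysqlbinlog --read-from-remote-server --host <host> --user admin --password \
    --base64-output=DECODE-ROWS --verbose mysql-bin-changelog.232522 \
  | grep -oE '^### (UPDATE|INSERT INTO|DELETE FROM) `[^`]+`\.`[^`]+`' \
  | sort | uniq -c | sort -rn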

I’ve committed this project to GitHub at for others to use.

Docker Syslog Container for Sending Logs to CloudWatch

AWS’s CloudWatch Logs was first available about a year ago, and by my estimation has gone largely unnoticed. The initial iteration was pretty rough, but some recent changes have made it more useful, including the ability to search logs and to generate events for monitoring in CloudWatch from log content.

Unfortunately, the CloudWatch Logs agent just watches log files on disk and doesn’t act as a syslog server. An AWS blog post explained how to get the CloudWatch Logs Agent running inside a container and monitoring the log output from rsyslogd, but the instructions used Amazon’s ECS service, which still doesn’t quite offer the flexibility that CoreOS or Deis offer, IMHO. ECS does some magic behind the scenes in passing credentials around that you have to do yourself when using CoreOS.

I’ve just provided a GitHub repository with the tools to make this work pretty easily, as well as a Docker Image with some reasonable defaults.

When trying to pull all of this together, I discovered a problem caused by a bug in the overlayfs used in current Deis releases, which prevents the AWS Logs agent from noticing changes in the syslog files. A workaround is available that reformats the host OS back to btrfs to solve that particular problem.

Note: when running on Deis 561+, revert to btrfs.