I have been familiar with some of the benefits of the InnoDB storage engine for a long time. Of particular interest to me has been row-level locking which should significantly reduce some locking problems that I have on a big table during peak times.
I have made various attempts over the past six months or so to convert this table to InnoDB, but each attempt has always taken a tremendously long time and I have always ended up canceling the query before it completes, because it doesn’t seem to be making any progress.
I finally have had a reason and opportunity to dig into this more and have spent the last day or two learning and experimenting with InnoDB. Of particular use was this set of slides from a 2007 presentation on InnoDB performance. It had some very good information about understanding how InnoDB works.
Slide 9 of the presentation includes this extremely helpful bit:
PRIMARY KEY in random order are costly and lead to table fragmentation (primary key inserts should normally be in an ascending order)
Load data in primary_key order if you can
In the data that I have been attempting to convert from MyISAM, my primary keys are inserted in whatever order that that the data has occurred. When converting that to InnoDB, it would basically run a bunch of insert statements using my unordered data. Each insert would require the storage engine to move a bunch of data in the InnoDB table around to keep the primary keys in order.
In my case, I was converting a MyISAM table containing about 1.9 million rows and occupying 600 MB of disk space. That took over 8 hours using the unordered data. After ordering the data and retrying, it then took about 10 minutes.
Also, I learned the usefulness of the ‘SHOW TABLE STATUS’ command. It is semi-useful in seeing the progress of the import. It is a bit strange to me it shows the number of rows in the table changing by 10% every second, but it is better than nothing:
mysql> SHOW TABLE STATUS WHERE name = 'sometable' *************************** 1. row *************************** Name: books_innodb Engine: InnoDB Version: 10 Row_format: Compact Rows: 2158501 Avg_row_length: 286 Data_length: 617611264 Max_data_length: 0 Index_length: 70008832 Data_free: 0 Auto_increment: NULL Create_time: 2009-08-25 10:03:43 Update_time: NULL Check_time: NULL Collation: latin1_swedish_ci Checksum: NULL Create_options: Comment: InnoDB free: 1799168 kB
Also worth noting is the data inserted in random order used about 30% more space according to the Data_length value.