I am learning about MySQL performance tuning. I read the Performance Schema chapter of the MySQL 5.7 manual and I have a MySQL 5.7.20 database running under Linux. I have a test table and I am running a simple query that does a full scan of the table and returns the sum of a couple of columns. The goal was to look at some Performance Schema tables to see what they would tell me about this simple query. I bounce the MySQL database service before running my test script so that the query pulls data from disk and not memory. A zip of my test script and its output is here: zip
I looked at the tables that had this pattern events_*_history_long. These are the four tables:
- events_statements_history_long
- events_transactions_history_long
- events_stages_history_long
- events_waits_history_long
There seems to be a hierarchical relationship among these tables something like this:
- statements->transactions
- statements->stages->waits->waits
There may be other relationships that my simple test did not capture. If you look at the zip you will see all the levels of the hierarchy that I explored but, in this case, the bottom level seems the most interesting. Here is the query that explores the bottom of the statements->stages->waits->waits path:
select * from events_waits_history_long
where
NESTING_EVENT_ID in
(select EVENT_ID from events_waits_history_long
where
NESTING_EVENT_ID in
(select EVENT_ID from events_stages_history_long
where
NESTING_EVENT_ID in
(select EVENT_ID from events_statements_history_long
where SQL_TEXT like concat('%te','st%'))
and NESTING_EVENT_TYPE = 'STATEMENT')
and NESTING_EVENT_TYPE = 'STAGE')
and NESTING_EVENT_TYPE = 'WAIT'
order by EVENT_ID;
The output reminds me of a trace in Oracle. It shows the individual disk reads and waits for internal locks and mutexs.
Most of the time seems to be taken up on reads from a file, which is what I expected. If you look at the wait/io/file/innodb/innodb_data_file waits they seem to be 16384 byte reads from the file associated with the table. I could use this information to build a query to show a profile of the time spent by the query based on EVENT_NAME. It would be kind of like a tkprof of an Oracle trace.
This post is just a simple first test. I am new to MySQL tuning so feel free to leave a comment or email me if you have suggestions or corrections related to this post.
Bobby
P.S. Looking at this again, it does not look like the file reads are really physical I/O because they are too short. The TIMER_WAIT column uses picoseconds so if you divide 3499928 by 1000000 you should get 3.49 microseconds for the first read which is tiny compared to a normal disc read of 1000 microseconds or more. Also, I added up all the wait times from the statements->stages->waits->waits query and the total was about 100 times less than the query run time. The higher levels of this hierarchy all had waits in the half second range, but the lowest level of waits adds to about one hundredth of that. So, I have more work to do at this point.
P.P.S. Looks like the reads are all from memory. I looked at several things but here is a profile of the waits from the stage level and then at the wait level:
+-------------------------------------------+---------------+
| EVENT_NAME | total_seconds |
+-------------------------------------------+---------------+
| wait/io/table/sql/handler | 0.4819 |
| wait/io/socket/sql/client_connection | 0.0001 |
| wait/synch/mutex/sql/THD::LOCK_query_plan | 0.0000 |
| wait/synch/mutex/sql/LOCK_table_cache | 0.0000 |
| wait/synch/mutex/sql/THD::LOCK_thd_query | 0.0000 |
| wait/synch/mutex/innodb/trx_mutex | 0.0000 |
| wait/synch/mutex/sql/THD::LOCK_thd_data | 0.0000 |
+-------------------------------------------+---------------+
+-------------------------------------------+---------------+
| EVENT_NAME | total_seconds |
+-------------------------------------------+---------------+
| wait/io/file/innodb/innodb_data_file | 0.0041 |
| wait/synch/mutex/innodb/buf_pool_mutex | 0.0004 |
| wait/synch/sxlock/innodb/hash_table_locks | 0.0002 |
| wait/synch/mutex/innodb/fil_system_mutex | 0.0001 |
+-------------------------------------------+---------------+
It seems that the wait/io/table/sql/handler event includes the CPU to process the query which takes about half a second whereas the wait/io/file/innodb/innodb_data_file event seems to just cover the time to read the data from the buffer pool. Here is a zip of my script to create these profiles and its output: zip
I got a profile of the query another way and that confirmed that almost all of the half second run time of the query was CPU:
Sending data spends .52 seconds on the CPU which is most of the query’s .54 second run time in this case.
P.P.P.S. This post, https://mysql.wisborg.dk/2018/08/05/what-does-io-latencies-and-bytes-mean-in-the-performance-and-sys-schemas/, says that wait/io/table/sql/handler events can be reads from the buffer pool but that wait/io/file/innodb/innodb_data_file events are actual reads from the file system. I can only assume that the data is cached by a Linux filesystem cache if this is true, because the wait/io/file/innodb/innodb_data_file events are much too short to be a read from disk. By the way, the disk in this case is an external hard drive on my laptop connected through USB. Typical I/O times are around 20 milliseconds. So, reads from the filesystem that finish in a few microseconds have to be coming from memory.
P.P.P.P.S. I looked at the source code and it looks like the wait/io/file/innodb/innodb_data_file wait events are physical I/O. But I had a hard time forcing a really slow read from the table. Setting innodb_flush_method=O_DIRECT in my /etc/my.cnf file and bouncing the MySQL service resulted in a slower query the first time I ran it, but then everything was cached in the database’s buffer pool and run time was back to half a second. It seems that the first time I run the query it does a small amount of physical I/O but the data for this small table is cached somewhere. I am running VirtualBox on top of my Windows 7 laptop so maybe the table’s data is cached in Windows even after turning on direct I/O to bypass the filesystem cache at the Linux level within the VM. This is all the same sort of thing you see with Oracle database tuning. You must navigate the many layers of cache between the database and the physical disk when you look at physical I/O times.
I like how you post things as you are learning – it’s a refreshing .
The wait wait/io/table/sql/handler seems like it comes up for multiple reasons. I’m still trying to figure this one out.
Thanks for your comment Kyle. I have a long way to go for MySQL tuning, but I am trying to document what I learn along the way this year. It helps me remember if nothing else. 🙂
A few months ago I looked at the code for “wait/io/table/sql/handler” and it shows up for row level locks, reading blocks in the cache (i.e. CPU time), and reading off disk. It looks like a mess. Not sure why it hasn’t been cleaned up.
I’m still trying to understand the performance schema views. Don’t see much documentation or examples on google searches.
For your query on events_waits_history_long, wondering why you have the subselects instead of just querying events_waits_history_long directly. What do the subselects for statement and stages give you?
Thanks
Kyle
Kyle,
Thanks for your comment and your question. I reviewed this post and the attached sql script and output. Best I can tell I was trying to show the hierarchy between the different views from the statement level on down to the waits. I think my idea was that you could identify one statement and then ultimately get all the waits associated with it. But if all you have to identify the statement is the sql text then I think you are right that I would not need the subselects and I could just query the waits view with a condition against the sql text. I have not really used these views much since I did this post so I have not gone much further with MySQL tuning so I unfortunately am not much help in this area. But I’m pretty sure my idea is that you would have a handle on one SQL statement in the top level view and then you could join through the hierarchy down to the wait level to see the time spent by the one SQL statement.
Bobby
Pingback: Data Virtualizer » MySQL -> what to do with NESTING_EVENT_ID - ABBEY BLOG