I was trying to tune a MySQL query this week. I ran the same query against Oracle with the same data and got a much faster runtime on Oracle. I couldn’t get MySQL to do a range scan on the column that Oracle was doing it on. So, I just started barely scratching the surface with a simple test of when MySQL will use an index versus a full table scan in a range query. In my test MySQL always uses an index except on extreme out of range conditions. This is funny because in my real problem query it was the opposite. But I might as well document what I found for what it’s worth. I haven’t blogged much lately.
Here is my testcase and its output:
https://www.bobbydurrettdba.com/uploads/mysqlindexuserangequeries.zip
This is on 8.0.26 as part of an AWS Aurora MySQL RDS instance with 2 cores and 16 gigabytes of RAM.
I created a simple test table and put 10485760 rows in it:
create table test
(a integer NOT NULL AUTO_INCREMENT,
b integer,
PRIMARY KEY (a));
The value of b is always 1 and a ranges from 1 to 10878873.
This query uses a range query using the index:
select
sum(b)
from
test
where
a > -2147483648;
This query uses a full table scan:
select
sum(b)
from
test
where
a > -2147483649;
The full scan is slightly faster.
Somehow when you are 2147483650 units away from the smallest value of a the MySQL optimizer suddenly thinks you need a full scan.
There are a million more tests I could do like things with a million variables, but I thought I might as well put this out there. I’m not really any the wiser but it is a type of test that might be worth mentioning.
Bobby
P.S. I did find this document:
https://dev.mysql.com/doc/refman/8.0/en/integer-types.html
-2147483648 is the smallest value for a 4 byte int type so that is probably why the behavior changes at -2147483649. Not sure what that information is worth!
P.P.S I think this explains why -2147483649 leads to the full scan in this case maybe:
https://dev.mysql.com/doc/refman/8.0/en/where-optimization.html
“Beginning with MySQL 8.0.16, comparisons of columns of numeric types with constant values are checked and folded or removed for invalid or out-of-rage values”
Since -2147483648 is the smallest value for an int then with -2147483649 the optimizer removes the condition.
MySQL range performance has dropped substantially since 8.0.18, I suspect a few causes, but still haven’t figured out an acceptable solution.
https://dev.mysql.com/doc/refman/8.0/en/bnl-bka-optimization.html
In MySQL 8.0, the block_nested_loop flag of the optimizer_switch system variable works as follows:
Prior to MySQL 8.0.20, it controls how the optimizer uses the Block Nested Loop join algorithm.
In MySQL 8.0.18 and later, it also controls the use of hash joins (see Section 10.2.1.4, “Hash Join Optimization”).
Beginning with MySQL 8.0.20, the flag controls hash joins only, and the block nested loop algorithm is no longer supported.
The batched_key_access flag controls how the optimizer uses the Batched Key Access join algorithms.
Thanks for your reply! I will review it.
Hey people!!!!!
Good mood and good luck to everyone!!!!!