In an earlier post I described how some queries with long parse times were causing long library cache lock waits. Friday I applied a patch from Oracle that resolved the long parse times. Here are the conditions which may relate to this bug:
- Interval partitioned table
- Partitioned by range
- Sub-partitioned by list
- Open ended range on partitioning column in where clause
- Tens of thousands of sub-partitions
Prior to applying the patch I did an explain plan on two versions of the problematic query. One version specified an open-ended range on the partitioning column and the other closed off the range.
Slow version:
... T392658.CLNDR_DT >= TO_DATE('2014-11-17' , 'YYYY-MM-DD') and ... Explained. Elapsed: 00:00:46.20
Fast version:
... T392658.CLNDR_DT >= TO_DATE('2014-11-17' , 'YYYY-MM-DD') and T392658.CLNDR_DT <= TO_DATE('2014-11-26' , 'YYYY-MM-DD') and ... Explained. Elapsed: 00:00:00.09
The queries are the same except for the extra date condition which closes off the date range. Note that the explain plan took 46 seconds with the open-ended range and less than a tenth of a second with the closed off range.
With the patch the slow version is just as fast as the fast one.
This is bug 20061582 in Oracle’s system.
From my conversations with Oracle it sounds like in certain cases the optimizer is reviewing information for many or all of the sub-partitions and since we have 20,000 for this table in production it can take minutes to parse. I also messed with the dates on the open-ended condition and found that if I made the date early enough the parse time issue went away. So, it seems that there is some set of conditions, which I don’t know how to easily reproduce, which lead the optimizer to look at sub-partition information and slows parse time.
This is on 11.2.0.4 on an Exadata system. I was able to reproduce the long parse times on non-Exadata Linux 11.2.0.4 and 12.1.0.2 systems so it does not occur only on Exadata.
This issue surprised me because I thought that the optimizer would not look at partition or sub-partition statistics in a query that has a range condition which spanned more than one partition. In the past I have always seen the global or table level stats used in these situations. But, now I know that there are some cases where the optimizer will dig into the sub-partition stats even though the query covers more than one partition.
Looking at the plan of my slow parsing query before and after the patch I get a clue that the optimizer is looking at sub-partition stats:
Partition range without the patch:
KEY(SQ)|KEY(SQ)
Partition range with the patch:
1 | 136
Evidently KEY(SQ) relates to some sort of filtering of the sub-partitions which cause the long parse time. The manuals describe KEY(SQ) in a section titled “Dynamic Pruning with Subqueries” so maybe the problem queries have some sub-query that the optimizer was using to choose which sub-partitions that the query needed.
If you have an interval partitioned table with tens of thousands of sub-partitions and parse times in the minutes with open-ended ranges specified on the partitioning column your system could be hitting this same issue.
– Bobby
Pingback: Different plan_hash_value same plan | Bobby Durrett's DBA Blog
Thanks. I am having this issue on a 12.1.0.2.0 DB. Much appreciated. Now I just have to find out if there is a patch for our DB version.
I would open a case with Oracle support and tell them that you think you are seeing the same symptoms as bug 20061582 and ask them if there is a patch for 12.1.0.2. I looked up the bug on Oracle’s support site and it says the fix will be included in 12.2. But, you may be able to request a 12.1.0.2 version of the patch if you have support.
My similar issue was caused by some invalid subpartitions in PK indexes. Once I rebuilt them, my parse time returned to 2/100ths of a second from 85 seconds (no errors). The indexes were invalidated from an error in my a code for creating a partition. The code the did not fail, but created a system generated default subpartition, one partition of each table. When I generated and ran scripts to create my subpartitions correctly, I neglected the indexes.
Thanks for your update. Yes, it has something to do with the optimizer looking at many sub-partitions worth of statistics in cases like this. Of course, I don’t have the code, but that seems to be what’s going on.