Last week I came across what seemed like a simple query tuning problem. A PeopleSoft batch job ran for many hours and when I did an AWR report I found that the top query was doing a full scan when an index should help.
Here is the query and its bad plan:
SQL_ID 1jx5w9ybgb51g -------------------- UPDATE PS_JGEN_ACCT_ENTRY SET JOURNAL_ID = :1, JOURNAL_DATE = TO_DATE(:2,'YYYY-MM-DD'), FISCAL_YEAR = :3, ACCOUNTING_PERIOD = :4, GL_DISTRIB_STATUS = 'D', JOURNAL_LINE = :5 WHERE PROCESS_INSTANCE = 6692638 AND GL_DISTRIB_STATUS = 'J' AND ACCOUNT=:6 AND DEPTID=:7 AND CURRENCY_CD=:8 AND FOREIGN_CURRENCY=:9 Plan hash value: 1919611120 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | UPDATE STATEMENT | | | | 21649 (100)| | | 1 | UPDATE | PS_JGEN_ACCT_ENTRY | | | | | | 2 | TABLE ACCESS FULL| PS_JGEN_ACCT_ENTRY | 1 | 58 | 21649 (5)| 00:01:27 | -----------------------------------------------------------------------------------------
The problematic batch job ran variations of this query with different literal values for PROCESS_INSTANCE corresponding to each flat file being loaded. Three updates of this type were in the awr report for the 16 hour period that covered the run of the batch job:
Elapsed CPU Elap per % Total Time (s) Time (s) Executions Exec (s) DB Time SQL Id ---------- ---------- ------------ ---------- ------- ------------- 16,899 5,836 3,811 4.4 3.5 4h54qqmbkynaj UPDATE PS_JGEN_ACCT_ENTRY SET JOURNAL_ID = :1, JOURNAL_DATE = TO_DATE(:2,'YYYY-M M-DD'), FISCAL_YEAR = :3, ACCOUNTING_PERIOD = :4, GL_DISTRIB_STATUS = 'D', JOURN AL_LINE = :5 WHERE PROCESS_INSTANCE = 6692549 AND GL_DISTRIB_STATUS = 'J' AND A CCOUNT=:6 AND DEPTID=:7 AND CURRENCY_CD=:8 AND FOREIGN_CURRENCY=:9 6,170 2,190 1,480 4.2 1.3 a5rd6vx6sm8p3 UPDATE PS_JGEN_ACCT_ENTRY SET JOURNAL_ID = :1, JOURNAL_DATE = TO_DATE(:2,'YYYY-M M-DD'), FISCAL_YEAR = :3, ACCOUNTING_PERIOD = :4, GL_DISTRIB_STATUS = 'D', JOURN AL_LINE = :5 WHERE PROCESS_INSTANCE = 6692572 AND GL_DISTRIB_STATUS = 'J' AND A CCOUNT=:6 AND DEPTID=:7 AND CURRENCY_CD=:8 AND FOREIGN_CURRENCY=:9 6,141 1,983 1,288 4.8 1.3 1jx5w9ybgb51g UPDATE PS_JGEN_ACCT_ENTRY SET JOURNAL_ID = :1, JOURNAL_DATE = TO_DATE(:2,'YYYY-M M-DD'), FISCAL_YEAR = :3, ACCOUNTING_PERIOD = :4, GL_DISTRIB_STATUS = 'D', JOURN AL_LINE = :5 WHERE PROCESS_INSTANCE = 6692638 AND GL_DISTRIB_STATUS = 'J' AND A CCOUNT=:6 AND DEPTID=:7 AND CURRENCY_CD=:8 AND FOREIGN_CURRENCY=:9
The batch job ran about 15 and a half hours so these three plus others like them surely combined to make up the bulk of the run time.
It made sense to me to just add an index on all the columns in the where clause – PROCESS_INSTANCE,GL_DISTRIB_STATUS,ACCOUNT,DEPTID,CURRENCY_CD,FOREIGN_CURRENCY
Just to check how selective this combination of columns might be I did a count on each grouping of these columns and came up with about 50 rows per combination:
>select max(cnt),avg(cnt),min(cnt) 2 from 3 (select 4 PROCESS_INSTANCE, 5 GL_DISTRIB_STATUS, 6 ACCOUNT, 7 DEPTID, 8 CURRENCY_CD, 9 FOREIGN_CURRENCY, 10 count(*) cnt 11 from sysadm.PS_JGEN_ACCT_ENTRY 12 group by 13 PROCESS_INSTANCE, 14 GL_DISTRIB_STATUS, 15 ACCOUNT, 16 DEPTID, 17 CURRENCY_CD, 18 FOREIGN_CURRENCY); MAX(CNT) AVG(CNT) MIN(CNT) ---------- ---------- ---------- 9404 50.167041 1
The table itself has 3 million rows so this is pretty selective:
OWNER TABLE_NAME NUM_ROWS BLOCKS AVG_ROW_LEN SAMPLE_SIZE LAST_ANALYZED DEGREE INSTANCES -------------------- ------------------------------ ---------- ---------- ----------- ----------- ------------------- ---------- ---------- SYSADM PS_JGEN_ACCT_ENTRY 3145253 82857 204 3145253 2014-04-21 21:07:02 1 1
But, the strange thing was when we added the index on our test system we didn’t see any performance improvement! We ran the largest production file on test and it ran in ten minutes with or without the index. Yack!
So, I tried my own test in sqlplus with the select equivalent of the update and hardcoded values instead of bind variables – quick and dirty. I thought I had extracted some valid values although I later realized they weren’t. Here is what I ran and notice the full scan ran just as fast as with the index:
>select * from 2 sysadm.PS_JGEN_ACCT_ENTRY 3 WHERE PROCESS_INSTANCE = 6138803 AND 4 GL_DISTRIB_STATUS = 'J' AND ACCOUNT=1234567 AND DEPTID=567 AND CURRENCY_CD='USD' AND 5 FOREIGN_CURRENCY = NULL; no rows selected Elapsed: 00:00:00.30 Execution Plan ---------------------------------------------------------- Plan hash value: 1762298626 --------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 203 | 0 (0)| | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS BY INDEX ROWID| PS_JGEN_ACCT_ENTRY | 1 | 203 | 5 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | PSAJGEN_ACCT_ENTRY | 1 | | 4 (0)| 00:00:01 | --------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(NULL IS NOT NULL) 3 - access("PROCESS_INSTANCE"=6138803 AND "GL_DISTRIB_STATUS"='J' AND "CURRENCY_CD"='USD') filter(TO_NUMBER("ACCOUNT")=1234567 AND TO_NUMBER("DEPTID")=567 AND "CURRENCY_CD"='USD') Statistics ---------------------------------------------------------- 1761 recursive calls 0 db block gets 557 consistent gets 14 physical reads 0 redo size 1866 bytes sent via SQL*Net to client 239 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 5 sorts (memory) 0 sorts (disk) 0 rows processed > >select /*+full(PS_JGEN_ACCT_ENTRY) */ * from 2 sysadm.PS_JGEN_ACCT_ENTRY 3 WHERE PROCESS_INSTANCE = 6138803 AND 4 GL_DISTRIB_STATUS = 'J' AND ACCOUNT=1234567 AND DEPTID=567 AND CURRENCY_CD='USD' AND 5 FOREIGN_CURRENCY = NULL; no rows selected Elapsed: 00:00:00.17 Execution Plan ---------------------------------------------------------- Plan hash value: 3728573827 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 203 | 0 (0)| | |* 1 | FILTER | | | | | | |* 2 | TABLE ACCESS FULL| PS_JGEN_ACCT_ENTRY | 1 | 203 | 12185 (2)| 00:02:27 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(NULL IS NOT NULL) 2 - filter("PROCESS_INSTANCE"=6138803 AND "GL_DISTRIB_STATUS"='J' AND TO_NUMBER("ACCOUNT")=1234567 AND TO_NUMBER("DEPTID")=567 AND "CURRENCY_CD"='USD') Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 0 consistent gets 0 physical reads 0 redo size 1873 bytes sent via SQL*Net to client 239 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed
It looks like I passed in a NULL for a column with a NOT NULL constraint and that is what made the full scan version just as fast as the indexed one. The FILTER condition must have realized no rows could meet both NULL and NOT NULL conditions. With both plans the database realized immediately that there were no rows matching this bogus collection of constants. So, then I replaced the NULL with a zero and finally we had proof of the performance improvement of the index:
>select * from 2 sysadm.PS_JGEN_ACCT_ENTRY 3 WHERE PROCESS_INSTANCE = 6138803 AND 4 GL_DISTRIB_STATUS = 'J' AND ACCOUNT=1234567 AND DEPTID=567 AND CURRENCY_CD='USD' AND 5 FOREIGN_CURRENCY = 0; no rows selected Elapsed: 00:00:00.02 Execution Plan ---------------------------------------------------------- Plan hash value: 2047014499 -------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 203 | 5 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| PS_JGEN_ACCT_ENTRY | 1 | 203 | 5 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | PSAJGEN_ACCT_ENTRY | 1 | | 4 (0)| 00:00:01 | -------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("PROCESS_INSTANCE"=6138803 AND "GL_DISTRIB_STATUS"='J' AND "CURRENCY_CD"='USD') filter(TO_NUMBER("ACCOUNT")=1234567 AND TO_NUMBER("DEPTID")=567 AND TO_NUMBER("FOREIGN_CURRENCY")=0 AND "CURRENCY_CD"='USD') Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 3 consistent gets 0 physical reads 0 redo size 1866 bytes sent via SQL*Net to client 239 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed > >select /*+full(PS_JGEN_ACCT_ENTRY) */ * from 2 sysadm.PS_JGEN_ACCT_ENTRY 3 WHERE PROCESS_INSTANCE = 6138803 AND 4 GL_DISTRIB_STATUS = 'J' AND ACCOUNT=1234567 AND DEPTID=567 AND CURRENCY_CD='USD' AND 5 FOREIGN_CURRENCY = 0; no rows selected Elapsed: 00:00:37.11 Execution Plan ---------------------------------------------------------- Plan hash value: 1758291200 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 203 | 12185 (2)| 00:02:27 | |* 1 | TABLE ACCESS FULL| PS_JGEN_ACCT_ENTRY | 1 | 203 | 12185 (2)| 00:02:27 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("PROCESS_INSTANCE"=6138803 AND "GL_DISTRIB_STATUS"='J' AND TO_NUMBER("ACCOUNT")=1234567 AND TO_NUMBER("DEPTID")=567 AND TO_NUMBER("FOREIGN_CURRENCY")=0 AND "CURRENCY_CD"='USD') Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 56110 consistent gets 55409 physical reads 0 redo size 1873 bytes sent via SQL*Net to client 239 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed
So, I can’t tell you what happened in test but I suspect that we were passing null into one of the bind variables and got a similar efficient filter due to some data that was missing running a production file on an out of date test system. But, once I forced the equivalent to the production full scan by supplying non-null values for all the constants the value of the index became clear. It went into production last week and this weekend’s run ran in a few minutes instead of 15 hours.
– Bobby
Bobby, thank you for posting this info. I am actually testing this same issue. Do you have proper index for PS_JGEN_ACCTG_LINE?
We didn’t add an index to PS_JGEN_ACCTG_LINE. You might search Oracle’s support web site to see if there are application fixes that include additional indexes on that table.