Query tuning example

A coworker of mine cleaned up a bunch of old data to improve performance on an older system of ours, and afterward one of the queries started running slower.  It looked like the optimizer was choosing a full table scan when an index existed that was much faster.  So, I took a look at why it was choosing the full scan and what could be done to get the query to run with the index.

This is a 9.2.0.6 64 bit HP-UX PA-RISC system.  The query without any hints runs like this:

Elapsed: 00:04:54.65

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE 
          (Cost=23208 Card=1 Bytes=12)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'MYTABLE' 
              (Cost=23208 Card=68262 Bytes=819144)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     242929  consistent gets
     238854  physical reads
          0  redo size
        203  bytes sent via SQL*Net to client
        242  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

I renamed the production table to MYTABLE to hide its name.  Here is how the query runs with the index hint:

Elapsed: 00:00:00.21

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE 
          (Cost=34865 Card=1 Bytes=12)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'MYTABLE' 
              (Cost=34865 Card=68262 Bytes=819144)
   3    2       INDEX (RANGE SCAN) OF 'MYINDEX' (NON-UNIQUE) 
                (Cost=194 Card=68262)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1180  consistent gets
          0  physical reads
          0  redo size
        203  bytes sent via SQL*Net to client
        242  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

I got this output by putting these commands before the SQL:

set autotrace on
set timing on

The query looks something like this:

SELECT  sum(COLUMN2)
  FROM MYTABLE
   WHERE MYTABLE.COLUMN1 = '01242014';

The index is on COLUMN1 only.

Here is what the query looks like with the index hint:

SELECT /*+ INDEX (MYTABLE MYINDEX) */ sum(COLUMN2)
  FROM MYTABLE
   WHERE MYTABLE.COLUMN1 = '01242014';

So, the question is: why does the optimizer choose a full scan that runs for almost 5 minutes instead of the index scan that runs in a fifth of a second?  I think the answer is that the optimizer assumes there is no caching.  Notice the number of consistent gets and physical reads in each case:

FULL SCAN

     242929  consistent gets
     238854  physical reads

INDEX SCAN

       1180  consistent gets           
          0  physical reads

If you look at the autotrace output and see consistent gets but no physical reads, it means the blocks are being read from memory and not from disk.  So, just about every block read by the full scan came from disk, while in the case of the index scan all the blocks were in memory.
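One way to verify the caching directly is to count how many of the table’s blocks are sitting in the buffer cache.  Here is a sketch, assuming DBA access; V$BH’s OBJD column matches DATA_OBJECT_ID in DBA_OBJECTS:

-- count of MYTABLE blocks currently in the buffer cache
select count(distinct file# || '.' || block#) cached_blocks
from v$bh
where objd = (select data_object_id
              from dba_objects
              where object_name = 'MYTABLE')
and status != 'free';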

Other than the caching, the optimizer’s cost estimates aren’t that far off.  I think the optimizer’s cost units are the equivalent of single block reads; I believe I read this in Jonathan Lewis’ book.  These are the sections of the plan that show the cost of the two ways of reading from the table:

   2    1     TABLE ACCESS (FULL) OF 'MYTABLE' 
              (Cost=23208 Card=68262 Bytes=819144)

   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'MYTABLE' 
              (Cost=34865 Card=68262 Bytes=819144)
   3    2       INDEX (RANGE SCAN) OF 'MYINDEX' (NON-UNIQUE) 
                (Cost=194 Card=68262)

So the full scan has a cost of 23208, which is the equivalent of that many 10 millisecond single block reads.  That’s about 232 seconds, or roughly 4 minutes, which is not far off for the full scan that actually ran in 5 minutes.  Also, full table scans do multiblock reads, so when autotrace says physical reads = 238854 it really means that many blocks were read in through physical reads.  Our system has db_file_multiblock_read_count=16, so probably 16 blocks are read per physical read.  The optimizer’s estimated cost of 23208 is about 10% of the physical blocks, and this wasn’t a bad estimate because the blocks are read together: it assumed the 238854 blocks would be read in the time it would take to do 23208 single block reads, and that was pretty close.
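You can confirm the multiblock read setting from SQL*Plus:

show parameter db_file_multiblock_read_count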

But, the index scan was estimated at 34865 single block reads, which is about 6 minutes.  If there were no caching this wouldn’t be such a bad estimate.  Autotrace says there are 1180 blocks read from memory.  If these were read from disk, each block would be a separate disk I/O, so it would take about 12 seconds.  I checked and found that some values of the column have more rows than others, and the value 01242014 that I was testing with is one of the values with fewer rows.  So, for different values of COLUMN1, without caching, the index scan would be a lot closer to the full scan:

SELECT COLUMN1,count(*) cnt
FROM MYTABLE
group by COLUMN1
order by cnt desc;

COLUMN1         CNT
-------- ----------
11262013     873867
11252013     576299
09222013     237098
08052013     179476
12082013     177359
11102013     175178
09152013     174220
10132013     174204
11172013     174102
08252013     173758
... skipped some ...
01242014      53801
10232013      53785
10072013      53335

So the value 01242014 only has 53801 rows, but 11262013 has 873867 rows, so it probably touches a number of blocks closer to the estimated 34865.  Here is a query to show how many blocks each value of the column has:

SELECT COLUMN1,
count(distinct DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)) cnt
FROM MYTABLE
group by COLUMN1
order by cnt desc;

COLUMN1        CNT
-------- ----------
11262013      16338
11252013      10847
09222013       4409
08052013       3437
12082013       3337
11102013       3305
09152013       3290
10132013       3286
11172013       3285
08252013       3281
10272013       3255
... skipped some ...
01242014        926
01232014        924
09262013        922
01132014        915

So, the value of the column with the most rows covers 16338 blocks, somewhere in the vicinity of the estimated 34865 single block reads.  Anyway, if the optimizer is off by less than a factor of 100 I’d say it is pretty close.  But, the run time is off by a factor of well over 1000 (5 minutes versus .2 seconds), so that is a real problem.
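The index cost estimate comes from the index statistics, mainly the clustering factor scaled by the predicate’s selectivity, as described in Jonathan Lewis’ book.  Here is a sketch of a query to see the numbers the optimizer is working from:

select blevel, leaf_blocks, clustering_factor, num_rows
from dba_indexes
where index_name = 'MYINDEX';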

Oracle’s way of accounting for the caching of index blocks and of table blocks accessed through indexes is controlled by optimizer parameters.  Here are the settings as we have them on our system:

NAME                                 VALUE
------------------------------------ -----
optimizer_index_caching              0
optimizer_index_cost_adj             100

I messed with different values of optimizer_index_cost_adj and found that 66 was the highest value that would cause the plan to switch to an index scan.
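Here is a sketch of how to test a value like this at the session level, which avoids changing the whole database while experimenting:

alter session set optimizer_index_cost_adj = 66;

set autotrace on
SELECT sum(COLUMN2)
  FROM MYTABLE
 WHERE MYTABLE.COLUMN1 = '01242014';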

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE 
          (Cost=23011 Card=1 Bytes=12)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'MYTABLE' 
              (Cost=23011 Card=68262 Bytes=819144)
   3    2       INDEX (RANGE SCAN) OF 'MYINDEX' (NON-UNIQUE) 
                (Cost=194 Card=68262)

Note that the cost of the index access to the table has come down to 23011, which is less than the 23208 estimated for the full scan.  Also note that 23011/34865 = .66 = 66/100.  So, the cost of the index access to the table is simply multiplied by the value of optimizer_index_cost_adj divided by 100.

So, now we have come down to where the rubber meets the road.  What should we do with all of this information?  The development team has told us that they can add a hint to their code, but they are concerned that there is some general problem with the database and that they will have to keep adding hints in various other places.  Note that this is a very old system with vendor-supplied code.  Presumably, before the data was cleaned up, the cost of the full scan exceeded that of the index scan, and it is only because the table now has so much less data that the full scan has become the cheaper plan.

So, we could set optimizer_index_cost_adj to 66 or some other value, but then we would need to test the entire system.  How much effort are we going to put into a system that we need to retire anyway?  This is really where I come down on hints and other performance methods.  I’ve known about using optimizer_index_cost_adj to encourage index scans for a long time.  But, if I’m supporting a bunch of older systems I can’t just go in and start tweaking parameters without appropriate testing.  So, I’d love to adjust the parameter rather than use the hint to band-aid this one query, but we have a lot of databases to support and sometimes you just have to apply the quick fix rather than taking the time to apply a more global solution.

– Bobby


SS Enqueue and sort segment request waits on 8.1.7.4 after trying to drop temp tablespace

Participated in a long forum thread that I started with this title:

SS Enqueue and sort segment request waits on 8.1.7.4 after trying to drop temp tablespace

We have had quite the saga on an old 8.1.7 database.  Really, it has been going on since December 26th.  It was pretty cool to see the interaction with the Oracle community while at the same time getting some good help from Oracle support on such an old release.  I definitely appreciate everyone’s efforts!

The core issue relates to dropping objects in a dictionary managed tablespace that have many small extents.  I had similar issues a number of years ago that I mentioned on my old blog:

April 24, 2008 post

May 19, 2008 post

June 2, 2008 post

The bottom line is that if you still have a system with dictionary managed tablespaces, don’t let your tables and indexes have small extent sizes, or you may end up with an object that is difficult to drop, and various issues will ensue if you do drop it.
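Here is a sketch of a query to spot the problem segments ahead of time, the ones with thousands of small extents (the 1000 cutoff is just an arbitrary starting point):

select owner, segment_name, segment_type, extents
from dba_segments
where extents > 1000
order by extents desc;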

– Bobby


Bug 13930580 Workaround Effective

We put the workaround for Bug 13930580 in on Friday and the results have been much better than I expected.  Here is when the workaround went in as reported in the alert log:

Fri Jan 17 18:38:26 2014
ALTER SYSTEM SET _use_adaptive_log_file_sync='FALSE' SCOPE=BOTH;

Here are the log file sync average times.  Notice how they go down after 7 pm Friday:

END_INTERVAL_TIME          number of waits ave microseconds
-------------------------- --------------- ----------------
17-JAN-14 12.00.49.669 AM            78666       15432.6923
17-JAN-14 01.00.27.862 AM            13380       15509.9778
17-JAN-14 02.00.11.834 AM            15838       17254.2949
17-JAN-14 03.00.56.429 AM            10681       29832.4282
17-JAN-14 04.00.39.502 AM            26127       14880.2097
17-JAN-14 05.00.22.716 AM            21637       10952.5322
17-JAN-14 06.00.01.558 AM            67162       9756.44207
17-JAN-14 07.00.45.358 AM           123705       11755.6535
17-JAN-14 08.00.29.811 AM           223799       11341.2467
17-JAN-14 09.00.19.275 AM           319051       13651.4647
17-JAN-14 10.00.09.089 AM           507335       13991.5543
17-JAN-14 11.00.59.502 AM           583835       11609.8432
17-JAN-14 12.00.44.044 PM           627506       10857.4556
17-JAN-14 01.00.30.133 PM           610232       11233.9348
17-JAN-14 02.00.18.961 PM           664368       10880.3887
17-JAN-14 03.00.05.694 PM           647896       9865.96367
17-JAN-14 04.00.44.694 PM           538270       10425.6479
17-JAN-14 05.00.24.376 PM           343863       9873.98468
17-JAN-14 06.00.11.481 PM           169654       9735.80996
17-JAN-14 07.00.03.087 PM            87590       7046.92633
17-JAN-14 08.00.52.390 PM            69297       2904.62955
17-JAN-14 09.00.29.888 PM            38244       3017.15969
17-JAN-14 10.00.09.436 PM            28166       3876.77469
17-JAN-14 11.00.54.765 PM            23220       11109.3063
18-JAN-14 12.00.33.790 AM            13293       9749.99428
18-JAN-14 01.00.17.853 AM            15332       3797.76839
18-JAN-14 02.00.56.050 AM            16137       6167.15127
18-JAN-14 03.00.33.908 AM            14621       9664.63108
18-JAN-14 04.00.12.383 AM             9708        6024.9829
18-JAN-14 05.00.56.348 AM            14565       3618.76938
18-JAN-14 06.00.39.683 AM            14323       3517.45402
18-JAN-14 07.00.29.535 AM            38243       3753.46422
18-JAN-14 08.00.16.778 AM            44878       2280.22924
18-JAN-14 09.00.01.176 AM            73082       9689.52484
18-JAN-14 10.00.45.168 AM            99302       2094.03293
18-JAN-14 11.00.35.070 AM           148789       1898.40424
18-JAN-14 12.00.23.344 PM           151780       1932.64997
18-JAN-14 01.00.08.631 PM           186040       2183.18563
18-JAN-14 02.00.59.839 PM           199826       2328.87331
18-JAN-14 03.00.45.441 PM           210098        1335.9759
18-JAN-14 04.00.36.453 PM           177331       1448.39219
18-JAN-14 05.00.21.669 PM           150837       1375.07256
18-JAN-14 06.00.59.959 PM           122234       1228.21767
18-JAN-14 07.00.37.851 PM           116396       1334.64569
... skip a few to find some higher load times...
19-JAN-14 10.00.01.434 AM           557020       2131.02737
19-JAN-14 11.00.42.786 AM           700781       1621.16596
19-JAN-14 12.00.31.934 PM           715327       1671.72335
19-JAN-14 01.00.10.699 PM           718417       1553.98083
19-JAN-14 02.00.51.524 PM           730149        2466.6241
19-JAN-14 03.00.38.088 PM           628319       2465.45829

When the system is busy we are seeing less than 3000 microseconds (3 milliseconds) for log file sync, which is good.  Before the workaround we were seeing 10 milliseconds or more, which isn’t that great.

Oracle support has been pushing this for a long time but our own testing wasn’t able to recreate the problem.  Have to hand it to them.  They were right!

Here is a link to my previous post on this issue: url

– Bobby

Posted in Uncategorized | Leave a comment

Getting started with 12c

Back in July I finally got Oracle 12c installed on my laptop as documented in this post: url

But, that was as far as I got.  The last thing I did was get an error message creating a user.  Well, I have now figured out how to create a new user and a few other things.  I’m working with the ORCL database that comes with the install, with all the parameters, etc. left at their defaults.

Evidently the default install comes with a PDB called PDBORCL.  So, I have two TNS entries, one for the parent CDB and one for the child PDB, and they look like this:

ORCL.WORLD =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.133.128)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl.mydomain.com)
    )
  )

PDB.WORLD =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.133.128)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = pdborcl.mydomain.com)
    )
  )

I guess the service name has the name of the PDB in it.

So, if I connect as SYSTEM/password@orcl I’m connected to the CDB, and if I connect as SYSTEM/password@pdb I’m connected to the PDB.  When I connected to the PDB I could create a new user without getting an error.

But, when I first tried connecting to the PDB I got this error, even though the database was up:

ORA-01033: ORACLE initialization or shutdown in progress
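I believe this was because the PDB was still in mounted mode rather than open.  Here is a sketch of how to check from the root, using the V$PDBS view that is new in 12c:

select name, open_mode from v$pdbs;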

So, to bring the database up after booting the Linux VM (by the way, I’m on 64 bit Linux), the following steps were required:

lsnrctl start

sqlplus / as sysdba

startup

alter session set container=PDBORCL;

startup

Probably this could all be scripted but that’s what I did today.
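Here is a rough sketch of what such a script could look like, assuming bash and that the Oracle environment variables are already set:

#!/bin/bash
# start the listener, start the CDB, then open the PDB
lsnrctl start
sqlplus / as sysdba <<EOF
startup
alter session set container=PDBORCL;
startup
EOF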

Interestingly, there is only one pmon process:

$ ps -ef | grep pmon
oracle   29495     1  0 06:52 ?        00:00:00 ora_pmon_orcl

But you get different results when you query dba_data_files depending on whether you are connected to the CDB or the PDB:

CDB

FILE_NAME                                 
------------------------------------------
/u01/app/oracle/oradata/orcl/system01.dbf 
/u01/app/oracle/oradata/orcl/sysaux01.dbf 
/u01/app/oracle/oradata/orcl/undotbs01.dbf
/u01/app/oracle/oradata/orcl/users01.dbf

PDB

FILE_NAME                                                     
--------------------------------------------------------------
/u01/app/oracle/oradata/orcl/pdborcl/system01.dbf             
/u01/app/oracle/oradata/orcl/pdborcl/sysaux01.dbf             
/u01/app/oracle/oradata/orcl/pdborcl/SAMPLE_SCHEMA_users01.dbf
/u01/app/oracle/oradata/orcl/pdborcl/example01.dbf

So, I guess each PDB has its own SYSTEM and SYSAUX tablespaces?
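One way to check is to query the new CDB_DATA_FILES view from the root; it adds a CON_ID column identifying each container.  A sketch:

select con_id, tablespace_name, file_name
from cdb_data_files
order by con_id, tablespace_name;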

Lastly, while running my scripts to poke around, I edited my sqlplus header script to report which container you are in.  It looks like this now:

set linesize 32000
set pagesize 1000
set long 2000000000
set longchunksize 1000
set head off;
set verify off;
set termout off;

column u new_value us noprint;
column n new_value ns noprint;
column c new_value cs noprint;

select name n from v$database;
select user u from dual;
SELECT SYS_CONTEXT('USERENV', 'CON_NAME') c FROM DUAL;

set sqlprompt &ns:&cs:&us>

set head on
set echo on
set termout on
set trimspool on

spool &ns..&cs..logfilename.log

Replace “logfilename” with whatever you want for your script name.

It puts out a prompt like this:

CDB

ORCL:CDB$ROOT:SYSTEM>

PDB

ORCL:PDBORCL:SYSTEM>

And the log file names:

ORCL.CDB$ROOT.sessions.log

ORCL.PDBORCL.sessions.log

Anyway, this is just a quick post about my first attempts to get around in 12c.

– Bobby


Adaptive Optimization Limitation Example

I’ve been reading up on Oracle 12c to get certified and to help advise my company on potential uses for the new version.  I’ve been looking forward to researching the new Adaptive Optimization features because it makes so much sense that the database should change its plans when it finds differences between the expected number of rows each part of a plan sees and the actual number of rows.

I’ve written blog posts in the past about limitations of the optimizer related to its ability to determine the number of rows (cardinality) that steps in a plan will see.  I took the scripts from some of those posts and ran them on a 12c instance to see if the new features would cause any of the inefficient plans to change to the obvious efficient plans.

Sadly, none of my examples ran differently on 12c.  I don’t doubt that there are examples that run better because of the new features but the ones I constructed earlier didn’t see any improvement.  So, I thought I would blog about one such example.

Here is the original blog post with an example run on 11.2 Oracle: url

Here is the same script run on 12c: zip

Here is the query with the bad plan:

select B.DIVNUM
from DIVISION A,SALES B
where a.DIVNUM=B.DIVNUM and
A.DIVNAME='Mayberry'

Plan hash value: 480645376

--------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)|
--------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |   421 (100)|
|*  1 |  HASH JOIN         |          |   500K|  8300K|   421   (2)|
|*  2 |   TABLE ACCESS FULL| DIVISION |     1 |    14 |     2   (0)|
|   3 |   TABLE ACCESS FULL| SALES    |  1000K|  2929K|   417   (1)|
--------------------------------------------------------------------

There is only 1 SALES row that has the DIVNUM associated with DIVNAME=’Mayberry’.  There are 1,000,001 SALES rows and there is an index on SALES.DIVNUM, so an index scan would be the most efficient access method and a nested loops join the most efficient join method.  But the 12c optimizer chooses a hash join and full table scan instead.
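For comparison, the efficient plan can be forced with hints.  Here is a sketch; SALES_DIVNUM_IX is a placeholder since the real index name isn’t shown here:

select /*+ leading(a) use_nl(b) index(b sales_divnum_ix) */ b.DIVNUM
from DIVISION a, SALES b
where a.DIVNUM=b.DIVNUM and
a.DIVNAME='Mayberry';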

According to the 12c SQL Tuning manual there are two types of Adaptive Optimization that might help in this case: Adaptive Plans and Adaptive Statistics.  I tried to tweak my test script to get Adaptive Statistics to kick in by commenting out the dbms_stats calls, but it didn’t help.  Also, I tried running the query several times in a row but it never changed plan.
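One way to see whether the optimizer even produced an adaptive plan for the statement is the new ADAPTIVE format option of DBMS_XPLAN.  A sketch, run right after the query in the same session:

-- serveroutput must be off for display_cursor to find the last statement
set serveroutput off
select * from table(dbms_xplan.display_cursor(format => '+adaptive'));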

I can see why Adaptive Plans wouldn’t work here.  How long will it let the full scan of SALES run before it decides to switch to a nested loops join with an index scan?  If it gets halfway through the table it is too late.  So, I’m not sure how Adaptive Plans could change the plan on the fly when it expects a lot of rows and only finds a few.

On the Adaptive Statistics side, I guess this is just a case that it still can’t handle.  It is like the histogram-across-joins problem, which would be pretty complex to solve in general.

Anyway, this all reminds me of when I first learned about histograms.  I got all excited that histograms would solve all my query performance problems and then came crashing down to earth when I realized it wasn’t the case.  I think the analogy fits.  Histograms improved cardinality estimates and can help in certain cases.  I think the new adaptive features will help improve plans by using real cardinality figures where it can, but they aren’t a cure-all.

I’m not sure that getting cardinality right is a solvable problem in the general case.  You have to have a fast optimizer so there are limits to how much it can do.

I ran all this as the user SYSTEM on the base 12c 64 bit Linux install with all the default parameters unchanged on the standard default database.

– Bobby


New Features Guide Highlights

I just finished reading the main section of the Oracle 12c New Features Guide.  I read pages 17-107 of the version I have – Section 1 titled “Oracle Database 12c Release 1 (12.1) New Features”.  I underlined and asterisked the things that stuck out as interesting in this pass and thought I would include them in this post:

1.1.4.7 New PL/SQL DBMS_UTILITY.EXPAND_SQL_TEXT Procedure

Expands views into SQL automatically.

1.1.6.1 Default Values for Columns Based on Oracle Sequences

Use sequence in column definition for default values.

1.1.6.4 Increased Size Limit for VARCHAR2, NVARCHAR2, and RAW 
Data Types

32K varchar2 columns

1.1.8.6 JDBC Support for Database Resident Connection Pool

Possible alternative to shared servers

1.2.3.1 Asynchronous Global Index Maintenance for DROP and 
TRUNCATE Partition

Global index not made unusable by partition maintenance

1.2.4.1 Adaptive Query Optimization

Plans change when DB sees that its cardinality estimates
were wrong.

1.2.4.15 Session-Private Statistics for Global Temporary Tables

Gather stats on global temp tables for your session only - cool.

1.3.4.3 SecureFiles is the Default for LOB Storage

Not sure what that means, but good to know that the default 
changed.

1.4.1 Database Consolidation

Subsections 2-6 give some clues to the way the multitenant 
architecture works.

1.4.3.1 Cloning a Database

Sounds similar to Delphix

1.4.4.3 Oracle Data Pump Change Table Compression at Import Time

Imported data can be compressed using HCC on target.

1.5.5.7 Multiple Indexes on Same Set of Columns

Can have different kinds of indexes on same set of columns 
(same order I assume)

1.5.9.1 Active Database Duplication Enhancements

Faster clone of an open database using RMAN

1.6.1.1 Enterprise Manager Database Express

Sounds like a better EM tool - would like to check it out
and review the 2-Day dba manuals which show uses of it.

1.6.3.1 Queryable Patch Inventory

Don't have to run lsinventory to see your patches?

1.8.3.1 Multi-Process Multi-Threaded Oracle

Wondering what platforms this is on and what it really means.

1.9.7.1 Unified Context-Based Database Audit Architecture

Sounds like this may replace some third party tools.  Improved 
auditing.

1.12.1.2 Parallel Upgrade

May speed up the upgrade by parallelizing it

I read all 91 pages but there were sections that didn’t really interest me since they are about features we don’t use such as Spatial.  If you are interested in 12c I encourage you to read this as I did.  I printed out about 10 pages at a time.  It’s only 91 pages so it doesn’t take forever to read it.

– Bobby


Bug 13930580 Example

It looks like we hit bug 13930580 on one of our systems, and I wanted to put out some example output showing log file sync times going above 10 milliseconds (10,000 microseconds) after the log writer switches to polling.

Here is the output in the lgwr trace file showing the switch to polling:

*** 2013-12-27 01:50:47.374
kcrfw_update_adaptive_sync_mode: post->poll long#=1 sync#=1
sync=55336 poll=19134 rw=5506 rw+=5506 ack=0 min_sleep=19134

*** 2013-12-27 01:50:47.374
Log file sync switching to polling
Current scheduling delay is 1 usec
Current approximate redo synch write rate is 0 per sec

Here is a report showing the number of log file sync waits and their average time in microseconds:

END_INTERVAL_TIME          number of waits ave microseconds
-------------------------- --------------- ----------------
26-DEC-13 03.00.19.333 PM           976539       2279.06861
26-DEC-13 04.00.43.475 PM           681212       2029.32406
26-DEC-13 05.00.07.975 PM           343932       1575.26284
26-DEC-13 06.00.34.772 PM           163911       1850.74354
26-DEC-13 07.01.00.911 PM            73151       3815.28597
26-DEC-13 08.00.24.463 PM            39304       5038.05427
26-DEC-13 09.00.48.970 PM            32122       5677.00557
26-DEC-13 10.00.13.491 PM           472349       2353.95857
26-DEC-13 11.00.40.521 PM            18679       18655.5294
27-DEC-13 12.00.06.276 AM            19618       17046.2287
27-DEC-13 01.00.30.299 AM            18983       5721.99178
27-DEC-13 02.00.54.261 AM            17724       17106.3415
27-DEC-13 03.00.18.282 AM             9088       25342.7271
27-DEC-13 04.00.42.218 AM            14365          12128.7
27-DEC-13 05.00.06.391 AM            16323       12879.8831
27-DEC-13 06.00.31.379 AM            43758       15326.7298
27-DEC-13 07.00.56.027 AM            83819       14796.2851
27-DEC-13 08.00.20.637 AM           168718       13506.4363
27-DEC-13 09.00.47.262 AM           302827       19116.9491
27-DEC-13 10.00.14.014 AM           480347       19358.6655
27-DEC-13 11.00.41.178 AM           512777       15952.2358
27-DEC-13 12.00.08.220 PM           511091       13799.5512
27-DEC-13 01.00.38.131 PM           576341       10183.4347
27-DEC-13 02.00.06.308 PM           524568       10251.1259

Notice how the average wait time consistently goes above 10,000 microseconds once the log writer switches to polling between 1 and 2 am on 12/27/2013.  I didn’t show all the output, but the long log file sync wait times have continued since the switch.

Also, these long log file sync times don’t correspond to long log file parallel write times.  Here are the number and averages of log file parallel write waits for the same time frame:

END_INTERVAL_TIME          number of waits ave microseconds
-------------------------- --------------- ----------------
26-DEC-13 03.00.19.333 PM           902849       1426.66601
26-DEC-13 04.00.43.475 PM           659701       1394.87763
26-DEC-13 05.00.07.975 PM           344245       1294.92401
26-DEC-13 06.00.34.772 PM           166643       1586.64944
26-DEC-13 07.01.00.911 PM            80457       4019.29429
26-DEC-13 08.00.24.463 PM            46409       5580.67827
26-DEC-13 09.00.48.970 PM            69218       5115.20904
26-DEC-13 10.00.13.491 PM           475297       2219.80541
26-DEC-13 11.00.40.521 PM            40943        19405.052
27-DEC-13 12.00.06.276 AM            38835       18160.8073
27-DEC-13 01.00.30.299 AM            24734       6321.38425
27-DEC-13 02.00.54.261 AM            33617       11723.6698
27-DEC-13 03.00.18.282 AM            36469       17485.2614
27-DEC-13 04.00.42.218 AM            19344       6955.27042
27-DEC-13 05.00.06.391 AM            17857       4399.75718
27-DEC-13 06.00.31.379 AM            45098       4923.02763
27-DEC-13 07.00.56.027 AM            83700       3610.39713
27-DEC-13 08.00.20.637 AM           160919       2841.31507
27-DEC-13 09.00.47.262 AM           266405       3523.86855
27-DEC-13 10.00.14.014 AM           384795        3367.5075
27-DEC-13 11.00.41.178 AM           437806       2729.84248
27-DEC-13 12.00.08.220 PM           448261       2442.81012
27-DEC-13 01.00.38.131 PM           511648       1880.74418
27-DEC-13 02.00.06.308 PM           481106       1919.21158

The average I/O time, log file parallel write, is pretty low when the system is active (more than 100,000 waits per hour), usually less than 4000 microseconds, and yet log file sync is always more than 10,000 microseconds after the switch to polling.  Also, the CPU on the system is consistently less than 30% used, so it isn’t a system load issue.

Here are some Oracle support documents related to this issue:

Adaptive Switching Between Log Write Methods can Cause 
'log file sync' Waits (Doc ID 1462942.1)

Waits for "log file sync" with Adaptive Polling vs Post/Wait 
Choice Enabled (Doc ID 1541136.1)

Bug 13930580: LGWR IS BLOCKING SESSIONS

Here is the script I used to get the wait output:

set linesize 32000
set pagesize 1000
set long 2000000000
set longchunksize 1000
set head off;
set verify off;
set termout off;

column u new_value us noprint;
column n new_value ns noprint;

select name n from v$database;
select user u from dual;
set sqlprompt &ns:&us>

set head on
set echo on
set termout on
set trimspool on

UNDEFINE WAITNAME
UNDEFINE MINIMUMWAITS

spool "&ns.&&WAITNAME..log"

column END_INTERVAL_TIME format a26

select sn.END_INTERVAL_TIME,
(after.total_waits-before.total_waits) "number of waits",
(after.time_waited_micro-before.time_waited_micro)/
(after.total_waits-before.total_waits) "ave microseconds",
before.event_name "wait name"
from DBA_HIST_SYSTEM_EVENT before, 
DBA_HIST_SYSTEM_EVENT after,
DBA_HIST_SNAPSHOT sn
where before.event_name='&&WAITNAME' and
after.event_name=before.event_name and
after.snap_id=before.snap_id+1 and
after.instance_number=1 and
before.instance_number=after.instance_number and
after.snap_id=sn.snap_id and
after.instance_number=sn.instance_number and
(after.total_waits-before.total_waits) > &&MINIMUMWAITS
order by after.snap_id;

spool off

I gave log file sync as the WAITNAME and 1 as MINIMUMWAITS for the first output.  I changed WAITNAME to log file parallel write for the second one with 1 still for MINIMUMWAITS.
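For example, with the script saved as waits.sql (a name I made up):

SQL> @waits

Because of the UNDEFINE commands, SQL*Plus prompts for WAITNAME and MINIMUMWAITS the first time the && substitution variables are referenced, so you can supply a different wait name on each run.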

It looks like this is a new feature in 11.2 that was finally turned on by default in 11.2.0.3.  The workaround is to set a hidden parameter to false to turn off the new feature.  Check out the Oracle support docs I listed above for details.
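For reference, here is a sketch of the workaround those documents describe; the double quotes are required when setting an underscore parameter:

alter system set "_use_adaptive_log_file_sync"=false scope=both;

Since it is a hidden parameter, check with Oracle support and test before changing it in production.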

– Bobby

p.s. I forgot to mention that when I tested on a test database with and without this new feature the log file sync times were lower with the new feature, as they should be.

With _use_adaptive_log_file_sync=TRUE (feature enabled):

Top 5 Timed Foreground Events

Event                       Waits      Time(s)  Avg wait (ms)  % DB time  Wait Class
--------------------------  ---------  -------  -------------  ---------  ----------
log file sync                 639,598    3,466              5      86.74  Commit
DB CPU                                     397                      9.93
buffer exterminate                683       14             21       0.36  Other
virtual circuit wait              912       12             13       0.31  Network
SQL*Net message to client   1,293,196        7              0       0.17  Network

With _use_adaptive_log_file_sync=FALSE (disabled as in earlier versions of 11.2):

Top 5 Timed Foreground Events

Event                       Waits      Time(s)  Avg wait (ms)  % DB time  Wait Class
--------------------------  ---------  -------  -------------  ---------  -----------
log file sync                 639,644    3,553              6      87.31  Commit
DB CPU                                     367                      9.02
buffer exterminate              1,364       28             21       0.69  Other
buffer busy waits               2,679       15              6       0.37  Concurrency
virtual circuit wait              903       13             15       0.32  Network

With the new feature enabled, log file sync was 5 milliseconds instead of the 6 without it.  So, the new feature does speed up log file sync waits when it is working normally.  But, there must be some bug condition where it degrades to log file sync times of greater than 10 milliseconds.


Delphix upgrade in twenty minutes

Just got off of a WebEx with Delphix support.  They upgraded our Delphix server from version 3.0.3.1 to 3.2.5.1, and it took about twenty minutes.  Pretty nice compared to an Oracle database upgrade, I think!

The only thing that took some time was making sure I had enough space.  It looks like Delphix needs your disk space to be no more than 85% utilized to fully function.  We ended up extending our four 1.5 terabyte LUNs to 1.9 TB each to give us space.  Then I cleaned up some archive log space by running a snapsync on each source database to completion.  Our system just needed a little TLC to get some space free again.

But, the upgrade itself, running the scripts and rebooting the server, took 20 minutes and was all done by Delphix support for no extra charge.  Sweet.

– Bobby


Cool Picture of Instance/Database from 12c Concepts

I just think this is a cool picture of an Oracle 12c instance and database from Oracle’s 12c Concepts manual (not my own work):

Instance and Database from 12c Concepts

This is from the concepts manual found here: url

– Bobby


Finished reading Oracle Core

Just finished reading the book by Jonathan Lewis titled “Oracle Core: Essential Internals for DBAs and Developers”.  I think I picked it up at the Collaborate 13 conference in Denver last April but hadn’t had time (or taken the time) to read it until now.

Reading a book like Oracle Core can be a challenge because it is pretty dense with example scripts and outputs, including dumps in hex.  So, I decided to push myself to crash through the book without carefully following every example.  I may only have absorbed about 10% of the material, but if I hadn’t jammed through it I would have gotten 0%!

I picked up Oracle Core because I had read another book by the same author titled “Cost-Based Oracle Fundamentals”, which has paid for itself 100 times over in terms of helping me tune queries.  I highly recommend Cost-Based Oracle Fundamentals without reservation.  But, like Oracle Core, it can be a challenge to just sit down and read it while following every SQL example and output.  It is probably worth making a first pass focusing on just the English text and skimming the examples, then delving into the examples of most interest.

In the case of Oracle Core I haven’t yet put it to practical use but I’m glad to have at least skimmed through it.  Now I know what’s in it and can refer back to it when needed.

Next I hope to start reading up on Oracle 12c since I plan to get my certification this year.  But, I wanted to finish Oracle Core before I moved on, even if I only read it at a high level.

– Bobby
