Shared servers results in lower CPU usage for top query in AWR

In two previous posts I’ve discussed the possibility of switching from the default dedicated server configuration to shared servers to support a large web farm.

The original post I decribed how Tom Kyte recommended a Youtube video that champions reducing the number of connections from a web farm to improve database performance.  Shared servers was one option.  Others would be some sort of connection pooling or a transaction processing monitor such as Tuxedo.

The second post described how I built a test with many dedicated server connections running many short transactions – lots of commits – and showed that switching to shared servers greatly reduced the commit time in this scenario.

Saturday night we switched our production database server to shared servers and we have seen the top query on our AWR report from peak times the previous week running several times faster because it consumes much less CPU per execution.  These results come from the DBA_HIST_SQLSTAT view.  I have not verified that the users themselves are seeing performance gains so I’m assuming that DBA_HIST_SQLSTAT is telling the truth. 🙂

Here is my DBA_HIST_SQLSTAT query:

>column END_INTERVAL_TIME format a25
>
>select ss.sql_id,
  2  ss.plan_hash_value,
  3  sn.END_INTERVAL_TIME,
  4  ss.executions_delta,
  5  ELAPSED_TIME_DELTA/(executions_delta*1000) "Elapsed Average ms",
  6  CPU_TIME_DELTA/(executions_delta*1000) "CPU Average ms",
  7  IOWAIT_DELTA/(executions_delta*1000) "IO Average ms",
  8  CLWAIT_DELTA/(executions_delta*1000) "Cluster Average ms",
  9  APWAIT_DELTA/(executions_delta*1000) "Application Average ms",
 10  CCWAIT_DELTA/(executions_delta*1000) "Concurrency Average ms",
 11  BUFFER_GETS_DELTA/executions_delta "Average buffer gets",
 12  DISK_READS_DELTA/executions_delta "Average disk reads",
 13  ROWS_PROCESSED_DELTA/executions_delta "Average rows processed",
 14  FETCHES_DELTA/executions_delta "Average fetches",
 15  END_OF_FETCH_COUNT_DELTA/executions_delta "Average end of fetch count",
 16  SORTS_DELTA/executions_delta "Average sorts",
 17  PX_SERVERS_EXECS_DELTA/executions_delta "Average PX servers execs",
 18  LOADS_DELTA/executions_delta "Average loads",
 19  INVALIDATIONS_DELTA/executions_delta "Average invalidations",
 20  PARSE_CALLS_DELTA/executions_delta "Average parse calls",
 21  DIRECT_WRITES_DELTA/executions_delta "Average direct writes",
 22  PLSEXEC_TIME_DELTA/executions_delta "Average PS/SQL exec time",
 23  JAVEXEC_TIME_DELTA/executions_delta "Average Java exec time"
 24  from DBA_HIST_SQLSTAT ss,DBA_HIST_SNAPSHOT sn
 25  where ss.sql_id = '1p08fp6u57us5'
 26  and ss.snap_id=sn.snap_id
 27  and executions_delta > 100000
 28  order by ss.snap_id,ss.sql_id;

This averages out various statistics on a per execution basis for the given hour interval.  I’ve excluded any interval with <= 100000 executions so we can look at peak usage times only.  I edited out all the columns of the output that were not different or significant to make this summary output:

PLAN_HASH_VALUE END_INTERVAL_TIME         EXECUTIONS_DELTA Elapsed Average ms CPU Average ms
--------------- ------------------------- ---------------- ------------------ --------------
      156426008 12-JUL-13 03.00.42.806 PM           107606         28.2400079      26.081442
      156426008 15-JUL-13 08.00.21.424 AM           113049         6.54265593     6.45811559

The key details here are that the plan did not change and the number of executions is about the same, but the CPU per execution is considerably lower after the move to shared servers.

Here is an edited version of the query and its plan.  I’ve changed the table and column names.

Plan hash value: 156426008

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |       |       |    11 (100)|          |
|   1 |  HASH JOIN OUTER             |                  |     6 |   702 |    11  (10)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| TABLE_A          |     6 |   492 |     8   (0)| 00:00:01 |
|   3 |    INDEX RANGE SCAN          | TABLE_A_INDEX    |     6 |       |     3   (0)| 00:00:01 |
|   4 |   TABLE ACCESS FULL          | TABLE_B          |    18 |   630 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

SELECT TABLE_A.COLUMN1,
       TABLE_A.COLUMN2,
       TABLE_A.COLUMN3,
       TABLE_B.COLUMN5,
  FROM TABLE_A, TABLE_B
 WHERE TABLE_A.COLUMN1 = :B1 AND 
       TABLE_A.COLUMN2 = TABLE_B.COLUMN4(+);

I don’t really know why the switch to shared servers caused this query to use so much less CPU.  During these peak times we have about 400 database sessions from about 50 web servers.  With dedicated servers that translates to 400 Unix processes with the 100,000 executions of this query spread evenly across them.  With shared servers the workload is processed by only 32 Unix processes.  So, I’m guessing that there are efficiencies that come from doing the work in fewer Unix processes but I haven’t found a statistic that homes in on where those efficiencies are coming from.

– Bobby

 

 

 

About Bobby

I live in Chandler, Arizona with my wife and three daughters. I work for US Foods, the second largest food distribution company in the United States. I have worked in the Information Technology field since 1989. I have a passion for Oracle database performance tuning because I enjoy challenging technical problems that require an understanding of computer science. I enjoy communicating with people about my work.
This entry was posted in Uncategorized. Bookmark the permalink.

4 Responses to Shared servers results in lower CPU usage for top query in AWR

  1. Narendra says:

    Bobby,

    That is quite interesting. You specifically say you ” edited out all the columns of the output that were not different or significant to make this summary output” by which I suppose number of sorts, buffer gets, physical reads all were almost identical?? I thought these factors generally tend to affect CPU usage. So I am also intrigued as to why just a switch to shared server from dedicated server should drop CPU usage significantly without the amount of work being done changing.
    Will keep an eye here when you find out the reason.

  2. Bobby says:

    Narendra,

    Yes you are right. I can upload the output if you want to see it. It was just a very long line. I was looking at the other columns for some reason for the lower CPU and couldn’t find any. Buffer gets for example were the same (I think 14 buffer gets per execution). I’m guessing this has to do with context switching at the OS level or something like that. If you look at our CPU reports the system CPU % has gone way down since moving from dedicated to shared servers. Overall CPU is down some, but system CPU is down a ton – maybe a quarter what it was before.

    Thanks for your comment.

    – Bobby

  3. Narendra says:

    Bobby,

    Thanks for the details. You might be correct that switching from dedicated to shared server connection mode may have had quite an impact on the way OS handles the processing.
    After my earlier comment, I was wondering whether the difference in CPU time could be due to CPU run-queue waits. With dedicated server mode, if the server had to cater to about 400 processes, could it be possible that some of the processes were waiting on CPU run-queue (and hence were recorded as “CPU time” and not a “Wait Time”) ? When you switched to shared server mode, the server has only 32 processes to manage, which might have reduced/eliminated most of the CPU run-queue waits.
    Of course, this point becomes invalid if your server already has enough CPU power to cater to 400 OS processes.
    On a side note, I am bit surprised to notice that Oracle has chosen to do a HASH JOIN instead of a NESTED LOOP join, especially when the “estimated rows” are small. It is possible that I need to “refresh” my knowledge though.

    • Bobby says:

      Peak CPU is in the 35% range so there shouldn’t be much queuing. It is a 16 CPU PA-RISC system running HP-UX with Oracle 10.2.0.3.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.