Real World Index Example

This is a follow-up to my previous post, with a real-world example of how much difference indexes can make on tables that do not have them.

I looked at an AWR report of a peak hour on a busy database and noticed sql_id 028pmgvcryk5n:

(Screenshot: Peak Hour No Indexes)

The top SQL is a PL/SQL call, so it does not have a plan, but 028pmgvcryk5n stood out because it was taking 11.24% of the total elapsed time and averaging over 2 seconds (2.23) per execution. Most queries on this system run in hundredths of a second, like .02 seconds.

I looked at this query’s plan and it included two tables with full scans on both:

I put indexes on each table and the new plan looks like this:

With indexes it ran the query in about 1 millisecond:

Without indexes it ran from 600 to 2000 milliseconds:

I guess I have a simple point. Indexes can make a big difference. A query that runs in 600 milliseconds may be good enough in some cases, but if it can run in 1 millisecond it is worth putting the right indexes on the table.

Bobby

p.s. I used my sqlstat.sql and vsqlarea.sql scripts to get the execution history with and without the indexes. I used getplans.sql to get the plan without indexes from the AWR and extractplansqlid.sql to get the plan with the indexes from the V$ views. Once the indexes were in place the query ran too fast to show up in the AWR, which is why I used the V$ queries to get information about the query after adding the indexes.
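If you do not have my scripts handy, the V$ side of this boils down to something like the following. This is just a minimal sketch, not the actual vsqlarea.sql:

select sql_id,
       executions,
       round(elapsed_time/greatest(executions,1)/1000,3) avg_ms
from v$sqlarea
where sql_id = '028pmgvcryk5n';

ELAPSED_TIME in V$SQLAREA is in microseconds, so dividing by executions and then by 1000 gives average milliseconds per execution.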


Simple Index Example

There is a lot to be said about when to add indexes on Oracle tables, but I wanted to show a simple example. In this case we have a table with no indexes and a query with two equal conditions in the where clause which identify a single row out of many. Here is a zip of the SQL*Plus script and log for this post: zip

The table TEST is a clone of DBA_TABLES. I load it up with a bunch of copies of DBA_TABLES plus a single row that is unique. The zip has the real script, but the setup amounts to something like this (a sketch; each repeated insert doubles the row count):
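create table test as select * from dba_tables;

-- each insert doubles the row count
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;

-- the one unique row (assuming the remaining columns
-- end up nullable in the clone)
insert into test (owner, table_name, tablespace_name)
values ('TESTOWNER', 'TESTTABLE', 'TESTTABLESPACE');

commit;

Then I run the query without any indexes on the table: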

SQL> select TABLESPACE_NAME
  2  from test
  3  where
  4  owner='TESTOWNER' and
  5  table_name = 'TESTTABLE';

TABLESPACE_NAME
------------------------------
TESTTABLESPACE

Elapsed: 00:00:00.15

I add an index on owner and table_name and run it again:

SQL> select TABLESPACE_NAME
  2  from test
  3  where
  4  owner='TESTOWNER' and
  5  table_name = 'TESTTABLE';

TABLESPACE_NAME
------------------------------
TESTTABLESPACE

Elapsed: 00:00:00.00

Going from .15 seconds to .00 seconds (less than .01 seconds) may not seem like a big deal. But if you start running a query like this hundreds of thousands of times per hour, the .15 seconds of CPU per execution can slow your system down.
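To put rough numbers on that (my own back-of-the-envelope math, not from any measured workload): at 200,000 executions per hour, 200,000 x .15 seconds = 30,000 CPU-seconds per hour, and 30,000/3,600 is roughly 8 CPU cores kept completely busy by this one query.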

See the zip for the details. The create index command looked like this:

SQL> create index testi on test (owner, table_name);

Index created.
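To verify that the new index is actually being used, one standard SQL*Plus technique (not part of the original zip) is AUTOTRACE:

SQL> set autotrace traceonly explain
SQL> select TABLESPACE_NAME from test
  2  where owner='TESTOWNER' and table_name = 'TESTTABLE';

The plan should show an INDEX RANGE SCAN on TESTI instead of the TABLE ACCESS FULL you get without the index.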

Bobby


Fixed Broken Links

I had a couple of new comments on older blog pages, and I noticed that some links pointed to things that no longer exist. I fixed a few things that I knew were wrong. Finally, this week I decided to review every post, going back to the beginning 9 years ago, and click on every link.

For dead links I removed the link and added a comment like (DOES NOT EXIST). In some cases, the link still existed but was different. I changed several links to Oracle’s documentation for example. In many cases I put (updated) or something like that to show that there was a new link. I synced up a lot of the old links to the current version of my scripts on GitHub. Hopefully these changes won’t make the older posts unreadable, but at least the links will point to useful and up to date versions of things.

I did not carefully read every post because I was in a hurry, but I did look at every post, and it gave me the chance to see how things changed over the past 9 years. It was kind of a quick review of what I was thinking. Some of the posts seemed kind of dumb. (What was I thinking?) But others are genuinely useful. Either way, it was interesting to see which topics I talked about and how that changed over time. It makes me wonder where things will go in the future. I guess I cannot expect to fully predict the future, but it is good to think about what I should be learning and what skills I should be developing going forward.

Anyway, hopefully the updated links will make the posts a little more helpful to myself and others.

Bobby


SymPy Tutorial Repository

I have been playing with the Python SymPy package and created a repository with my test scripts and notes:

https://github.com/bobbydurrett/sympytutorial

It might be helpful to someone. I just got started.

I had used Maxima before. SymPy and Maxima are both what Wikipedia calls “Computer Algebra Systems.” They have a nice list here:

https://en.wikipedia.org/wiki/List_of_computer_algebra_systems

I got a lot of use out of Maxima, but I think it makes sense to switch to SymPy because it is written in Python and works well with other mainstream Python packages that I use, like Matplotlib. They both fall under the SciPy umbrella of related tools, so if I need some computer algebra I probably should stick with SymPy.

Maxima and SymPy are both free.

Bobby


ORA-14767 when day of month > 28 with interval partitioning month interval

SQL> CREATE TABLE test(
  2  RUN_DATE DATE,
  3  MY_NBR NUMBER(4)
  4  )
  5  PARTITION BY RANGE (RUN_DATE)
  6  INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
  7  (
  8    PARTITION data_p1 VALUES LESS THAN (TO_DATE('01/29/2017', 'MM/DD/YYYY'))
  9  );
CREATE TABLE test(
*
ERROR at line 1:
ORA-14767: Cannot specify this interval with existing high bounds


SQL> 
SQL> CREATE TABLE test(
  2  RUN_DATE DATE,
  3  MY_NBR NUMBER(4)
  4  )
  5  PARTITION BY RANGE (RUN_DATE)
  6  INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
  7  (
  8    PARTITION data_p1 VALUES LESS THAN (TO_DATE('01/28/2017', 'MM/DD/YYYY'))
  9  );

Table created.

To create a range-partitioned table with a date partitioning column and a one-month interval, the starting partition's high bound must have a day of the month less than 29, or you get an ORA-14767 error.

The error message “Cannot specify this interval with existing high bounds” is not helpful. How about something like “Need a day of the month that exists in every month”? February only has 28 days in most years, so 28 is the max.
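The simplest way to stay out of trouble is to anchor the first partition on the first day of a month, as in this variation on the script above (after dropping the TEST table created earlier):

CREATE TABLE test(
RUN_DATE DATE,
MY_NBR NUMBER(4)
)
PARTITION BY RANGE (RUN_DATE)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(
  PARTITION data_p1 VALUES LESS THAN (TO_DATE('02/01/2017', 'MM/DD/YYYY'))
);

Day 1 exists in every month, so the monthly interval math always works.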

Bobby


Free Machine Learning Class from MIT

I noticed this new class from MIT:

https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-036-introduction-to-machine-learning-fall-2020/

It is about machine learning and is free. I think it has some built-in exercises with automatic grading but no instructor to interact with.

Since ML is such a hot topic I thought I would share it. I have not taken the class.

Bobby


60,000 sessions in 64 gigabyte VM using shared servers

Just a quick note. I have an application that is generating thousands of inactive sessions, and with the default dedicated server configuration we are having to add more and more memory to our virtual host to support the connections. We estimate that the application may need 45,000 mostly inactive sessions once it is fully rolled out. So, I thought about how much memory would be required to support 45,000 sessions using shared servers. In an earlier post I mentioned how I got the session count up to about 11,000, so I took the Java program from that post and adjusted memory parameters to support over 45,000. I got it up to over 60,000, so the test was essentially successful. I don’t think I would want to run a system with 60,000 sessions on a single node, but it is nice to see that it is to some degree possible.

I used a 64 gigabyte Linux VM and set these parameters:

sga_max_size=52G
sga_target=52G
shared_pool_size=36G
dispatchers='(PROTOCOL=TCP)(DISPATCHERS=64)'
max_shared_servers=16
shared_servers=16
large_pool_size=512M

I am pretty sure that the large pool grew dynamically to fill the SGA space not taken up by the shared pool: 52 - 36 = 16 gigabytes of large pool.
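A query like this against V$SGASTAT would confirm where the memory went (a standard query, not output I saved from the test):

select pool, round(sum(bytes)/1024/1024/1024,1) gigabytes
from v$sgastat
group by pool
order by pool;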

Anyway, I don’t have time to write this up carefully now, but I wanted to publish the parameters.

Here is the previous post with the Java program I used to open 1000 connections:

https://www.bobbydurrettdba.com/2013/06/26/testing-maximum-number-of-oracle-sessions-supported-by-shared-servers/

I ended up running 30 copies of it on each of 3 servers, for a total of 90,000 potential logins, and got up to over 63,000 sessions.
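To watch a test like this, simple queries along these lines show the session count and whether the shared servers are keeping up:

select count(*) from v$session;

select name, status, requests from v$shared_server;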

Bobby


ORA-00904 DBMS_XPLAN HINT DBMS_LOB GRANT

I ran this query with a hint:

SQL> select /*+ full(my_tables) */ blocks
  2  from my_tables
  3  where
  4  owner = 'SYS' and
  5  table_name = 'TAB$';

    BLOCKS
----------
      1625

I ran this select to get the plan:

select * from table(dbms_xplan.display_cursor(null,null,'ALL'));

I was getting this error:

Column Projection Information (identified by operation id):
 1 - "BLOCKS"[NUMBER,22]
 ORA-00904: : invalid identifier

I found that my user (or PUBLIC) needed an execute grant on DBMS_LOB to fix this:

SQL> grant execute on DBMS_LOB to PUBLIC;

Grant succeeded.

I am not sure why this grant was not in place on this database, but it took a while to figure out, so I thought I would put it out there. I found the error in a trace, which made me suspect the issue was permissions. The trace looked like this:

PARSE ERROR ... err=904
 SELECT ST.* FROM XMLTABLE('/hint_usage/... DBMS_LOB.SUBSTR...

So that gave me the idea that I needed an execute grant on DBMS_LOB. EXECUTE ANY PROCEDURE did not do it.
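A quick way to check whether the grant is in place before hitting the error (DBA_TAB_PRIVS covers packages as well as tables):

select grantee, privilege
from dba_tab_privs
where owner = 'SYS'
and table_name = 'DBMS_LOB';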

After the grant, the hint report shows up. This is on 19c:

Column Projection Information (identified by operation id):
 1 - "BLOCKS"[NUMBER,22]
 Hint Report (identified by...
 Total hints for statement: 1
 1 -  SEL$1 / MY_TABLES@SEL$1
            -  full(my_tables)

Bobby

P.S. Full log of the script that got the error:

Full log of the working script:

Full length trace lines:


Finding queries with bad plans from an AWR report of a load test

I want to document some recent steps that I have been taking to support new development on a transactional system. Every time the system has a new release, if that release includes Oracle SQL changes, I am asked to create and review an AWR report covering the time of a load test (usually several tests) and see if I find any problems. In the past I looked for longer-running application SQL, but recently I changed to looking at anything that averages over .1 seconds per execution, and that has been helpful. So, that is what this post is about. Obviously, if you have faster hardware or different workloads this rule of thumb will not help you. But maybe other higher-volume transactional systems will follow similar patterns.

Here is the top elapsed SQL from a recent load test:

(Screenshot: SQL Ordered by Elapsed Time)

I have only shown some of the columns to hide details that I do not want to make public. Every SQL statement here whose “Elapsed Time per Exec (s)” value is above .1 seconds is not part of the application, except the last one, 6kmnq0uj99a3c. This was a load test on a non-production system that ramped up a bunch of activity over several hours. This problem query only ran 664 times, so if that is representative of how often it runs in production it may not really matter that it is inefficient. But you never know for sure, so I reviewed it anyway. All the queries listed that run in .03, .01, .02, and .00 seconds are representative of the typical queries with good plans on this system. So, that is why .1 ended up standing out. Also, not long ago I found two or three queries running in production with more than .1 seconds average runtime, and they needed new indexes to bring them back under the .1 second threshold. So, for me .1 seconds is the current magic number.
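For what it is worth, the same .1 second cutoff can be pulled straight out of the AWR tables. A minimal sketch (the snapshot ids are made up; plug in the ones covering your load test):

select sql_id,
       sum(executions_delta) execs,
       round(sum(elapsed_time_delta)/greatest(sum(executions_delta),1)/1000000,3) avg_sec
from dba_hist_sqlstat
where snap_id between 12345 and 12350
group by sql_id
having sum(elapsed_time_delta)/greatest(sum(executions_delta),1)/1000000 > 0.1
order by avg_sec desc;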

To test it I used two of my scripts:

  • bind2.sql – to extract the bind variable values used in one of the executions (a stripped-down version is sketched below)
  • test2.sql – to find out which table the execution spent the most time on
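bind2.sql is built around V$SQL_BIND_CAPTURE; the bare-bones version of the idea looks something like this (not the full script):

select name, datatype_string, value_string, last_captured
from v$sql_bind_capture
where sql_id = '6kmnq0uj99a3c'
order by last_captured, position;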

I replaced the bind variables with constants, ran the query through my test2.sql script, and found that most of the time was spent on a range scan of what looked like the correct index. But on closer inspection I realized that a type conversion had prevented the last column of the index from being used. Here is what it looked like, with the columns renamed to hide the production names:

filter((TO_NUMBER(MY_NUMBER)="XYZ"."MY_NUMBER" ...

One table has the “MY_NUMBER” column as a character string and the other as a number, so the comparison was doing a range scan and not a unique scan. I changed the query to convert the number to a character string, and the plan switched to a unique scan:

ABC.MY_NUMBER= to_char(XYZ.MY_NUMBER)

Table ABC was the one doing a range scan on three columns of the index but not on MY_NUMBER, the last column in the index. MY_NUMBER is a character column on ABC. XYZ is the other table, with MY_NUMBER as a NUMBER type column. I am forcing the conversion of XYZ.MY_NUMBER to a character string for the comparison instead of letting the optimizer convert ABC.MY_NUMBER to a number, which would suppress the use of the last column of the index on table ABC.
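Here is a minimal self-contained sketch of the effect. The table and column names are made up, and the index is simplified to a single column, but the predicate rewrites work the same way:

create table abc (my_number varchar2(10));
create table xyz (my_number number);

create index abc_i on abc (my_number);

-- Oracle implicitly rewrites this as TO_NUMBER(abc.my_number) = xyz.my_number,
-- which suppresses the index on abc.my_number
select count(*) from abc, xyz
where abc.my_number = xyz.my_number;

-- converting the NUMBER side instead leaves abc.my_number alone,
-- so the index on it can be used
select count(*) from abc, xyz
where abc.my_number = to_char(xyz.my_number);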

My point was not to talk about implicit type conversions preventing indexes from being used, although that is very interesting. My point is that a plan like this, which is already pretty efficient, could run in less than .1 seconds if the index were used fully. And if the application ends up scaling the use of this query way up, to thousands or tens of thousands of executions per hour, that unique scan could make a huge difference over the range scan that skips the last column of the index. Your CPUs might be 10 times faster than mine, so your threshold might be lower than .1 seconds, but I think the idea is the same. There is some threshold that suggests a simple, frequently used, transactional SQL statement may not be using the right indexes. It does not apply in all cases, but at the moment this is a useful rule of thumb for me.

I had just written the previous paragraphs before getting an email saying that our QA team had run another load test with the explicit to_char type conversion in place. It did not make as great an improvement as I expected. Here are some edited outputs from my sqlstat.sql script:

Original query 6kmnq0uj99a3c:

END_INTERVAL_TIME     EXECUTIONS_DELTA Elapsed Average ms
 
 09-MAR-21 11.00.08 AM               79         171.306696
 09-MAR-21 12.00.35 PM               84         176.152667
 09-MAR-21 01.00.03 PM               80         178.420588
 09-MAR-21 02.00.32 PM               80         171.877913
 09-MAR-21 03.00.01 PM               81         174.509975
 09-MAR-21 04.00.29 PM               83         180.367157

New query 2ndfgypwp3qf0 with the to_char to allow the unique index scan:

END_INTERVAL_TIME     EXECUTIONS_DELTA Elapsed Average ms
 
 12-MAR-21 09.00.15 AM               80         107.822088
 12-MAR-21 10.00.44 AM               83         104.453446
 12-MAR-21 11.00.12 AM               81          105.34042
 12-MAR-21 12.00.42 PM               80          103.05625
 12-MAR-21 01.00.12 PM               79         106.738557
 12-MAR-21 02.00.42 PM               82         101.285183
 12-MAR-21 03.00.12 PM               81         105.172531

Kind of disappointing. I expected a greater improvement based on my testing. Still, .1 seconds per execution is better than .17. Maybe as the tables accumulate more data over time the improvement will become greater.

Even though this query did not turn out to have a dramatic improvement, I did find a way to improve the plan. My .1 seconds cutoff pointed me to a query that was not making ideal use of indexes and led to an improvement in performance. In other cases in the past I have seen 20x improvements, so it is worth reviewing the ones over .1 seconds.

Bobby


Simple tools I use

Just a quick note about two tools I use:

TextPad – my favorite text editor. I know everyone has their own, but this is mine.

UnixUtls – Unix-like tools for the Windows command line.

Bobby

P.S. I need to do a post about my approach to doing database work. I work from a Windows 10 laptop supporting Linux and HP Unix Oracle database servers. I am command line oriented, and I like my Windows text editor. But some things work better on the database server itself, such as a query that dumps out a ton of output, or getting a timing on a query without including the network time from my laptop (now at home over the VPN). But usually it is easier to work on my laptop without any network between me and my files. Plus, my 64-bit editor can open very large files. I use Python as my main programming language, and I have a 32-bit Python installed on my laptop. The UnixUtls let me do things like ls, find, and diff on my laptop. I probably use diff the most. All of this is in the Windows command prompt. I know I am a dinosaur with my command line orientation. Anyway, there is a ton I could say, so rather than post a huge discussion I at least wanted to mention the two tools above.
