After I did my previous post on how adding a column to a table with more than 255 columns can cause a ton of row chaining the question came up about Exadata and the form of compression we use on our > 255 column tables. Would we see the same sort of row chaining on our tables that have more than 255 columns, especially after we add new columns and populate them with data?
So, I reran the same scripts from the previous post unchanged and with the addition of QUERY HIGH compression of the tables after they are loaded with data. Our real tables on our Exadata system are compressed in this way so I wanted to see if, assuming the tables are re-compressed when updates require it, will we still see row chaining.
Bottom line is that after compressing the tables for query high row chaining was almost eliminated even with more than 255 columns and after adding a new column to a table with more than 255 columns.
Here are the test results from the previous post – 16K block, 11.2.0.3 HP-UX Itanium 64-bit. These are the increase in the table fetch continued row statistic after running a query with the given column in the where clause and forcing a serial full scan.
Table Column 1 Column 2 Column 3 Last Column-1 Last Column TEST255C 0 0 0 0 0 TEST256C 0 83333 83333 83333 83333 TEST256CPLUS1 0 83333 1999999 1999999 1999999
Here is on Exadata – V2 quarter rack with 11.2.0.2 BP20 and no compression. Also, block size is 8K – half of what we had before.
Table Column 1 Column 2 Column 3 Last Column-1 Last Column TEST255C 0 0 0 0 14 TEST256C 0 166666 166666 166666 166666 TEST256CPLUS1 0 166666 1999999 1999999 1999999
So, without compression basically the same results. I think the smaller block is probably why there is more chaining for the same amount of data.
But check out the dramatic reduction in chaining if you re-compress the table in each case with QUERY HIGH compression:
Table Column 1 Column 2 Column 3 Last Column-1 Last Column TEST255C 46 46 46 181 182 TEST256C 42 42 42 172 172 TEST256CPLUS1 41 41 41 172 172
Negligible chaining regardless and the elapsed times are all less than .2 seconds.
...>grep Elapsed *.log nexttolastcolumns.log:Elapsed: 00:00:00.09 nexttolastcolumns.log:Elapsed: 00:00:00.08 nexttolastcolumns.log:Elapsed: 00:00:00.09 test255c.log:Elapsed: 00:00:00.09 test256c.log:Elapsed: 00:00:00.09 test256cplus1.log:Elapsed: 00:00:00.07 testfirstcolumns.log:Elapsed: 00:00:00.11 testfirstcolumns.log:Elapsed: 00:00:00.11 testfirstcolumns.log:Elapsed: 00:00:00.09 testsecondcolumns.log:Elapsed: 00:00:00.08 testsecondcolumns.log:Elapsed: 00:00:00.08 testsecondcolumns.log:Elapsed: 00:00:00.09 testthirdcolumns.log:Elapsed: 00:00:00.09 testthirdcolumns.log:Elapsed: 00:00:00.08 testthirdcolumns.log:Elapsed: 00:00:00.08
By comparison the uncompressed Exadata elapsed times were in the seconds with chaining:
...>grep Elapsed *.log nexttolastcolumns.log:Elapsed: 00:00:00.99 nexttolastcolumns.log:Elapsed: 00:00:01.83 nexttolastcolumns.log:Elapsed: 00:00:02.81 test255c.log:Elapsed: 00:00:01.07 test256c.log:Elapsed: 00:00:01.83 test256cplus1.log:Elapsed: 00:00:03.19 testfirstcolumns.log:Elapsed: 00:00:00.72 testfirstcolumns.log:Elapsed: 00:00:10.97 testfirstcolumns.log:Elapsed: 00:00:00.51 testsecondcolumns.log:Elapsed: 00:00:04.80 testsecondcolumns.log:Elapsed: 00:00:04.63 testsecondcolumns.log:Elapsed: 00:00:01.18 testthirdcolumns.log:Elapsed: 00:00:04.36 testthirdcolumns.log:Elapsed: 00:00:04.56 testthirdcolumns.log:Elapsed: 00:00:02.21
On our non-Exadata system the chaining resulted in times in minutes with the most chaining:
...>grep Elapsed *.log nexttolastcolumns.log:Elapsed: 00:00:22.89 nexttolastcolumns.log:Elapsed: 00:00:30.15 nexttolastcolumns.log:Elapsed: 00:04:19.77 test255c.log:Elapsed: 00:00:02.49 test256c.log:Elapsed: 00:00:06.79 test256cplus1.log:Elapsed: 00:00:09.41 testfirstcolumns.log:Elapsed: 00:00:20.91 testfirstcolumns.log:Elapsed: 00:00:23.24 testfirstcolumns.log:Elapsed: 00:00:28.78 testsecondcolumns.log:Elapsed: 00:00:11.34 testsecondcolumns.log:Elapsed: 00:00:15.91 testsecondcolumns.log:Elapsed: 00:00:18.46 testthirdcolumns.log:Elapsed: 00:00:13.29 testthirdcolumns.log:Elapsed: 00:00:17.95 testthirdcolumns.log:Elapsed: 00:04:12.92
So, if we can keep our tables with more than 255 columns compressed for query high row chaining doesn’t appear to be a factor, even if we add new columns and populate them. I believe this is because QUERY HIGH compression spreads the data for the rows in several blocks across the blocks anyway so they really aren’t stored in the same way that uncompressed rows are stored with all the columns for a row kept together.
– Bobby
Bobby, Do you now what is the impact to update/merge time on a compressed row vs an uncompressed row?
From what I have read, it will take longer to update a compresed row because of the time to uncompress.
Jim,
It’s a good question. I’d have to test it to know for sure. I think that with hybrid columnar compression like we have with query high compression the columns for the rows in a group of blocks get grouped together where for one column in all the rows they share the same block. So, if you do an update of a single column that might be pretty efficient. But, if you are updating a bunch of columns it seems like it would have to be less efficient since they would be spread over multiple blocks. Maybe I can put together a test to prove this out.
But, one thing to consider is that in our systems we often have merge statements where almost all the time is spent processing the query behind the merge and not on the update/insert itself. So, if the query that’s part of the merge statement takes most of the time then the time to uncompress may not be important in terms of the overall run time of the merge statement.
– Bobby
Jim,
I did a quick test. Updates on query high compressed tables are much slower than on uncompressed tables. At least, they were in my testcase which was on a table with 256 columns. It didn’t matter which column I updated. It was more than twice as long.
I may put out a blog post on this if I get time today.
– Bobby
Pingback: Updates of tables with query high compression slow | Bobby Durrett's DBA Blog