alter session set WORKAREA_SIZE_POLICY = auto;
create table colours as
select 'red' colour from dual
union all
select 'blue' colour from dual
union all
select 'green' colour from dual
union all
select 'orange' colour from dual
union all
select 'yellow' colour from dual
union all
select 'purple' colour from dual
union all
select 'brown' colour from dual
union all
select 'white' colour from dual
union all
select 'black' colour from dual;
create table bricks as
select rownum brick_id, colour, sysdate insert_datetime, lpad ( 'x', 100, 'x' ) stuff
from colours
cross join (
select * from dual
connect by level <= 250
);
create table pens as
select rownum pen_id, colour, sysdate insert_datetime, lpad ( 'x', 100, 'x' ) stuff
from colours
cross join (
select * from dual
connect by level <= 250
);
create table bricks_indexed as
select * from bricks;
create index brick_colour_i on bricks_indexed ( colour );
alter table bricks_indexed
modify colour not null;
create or replace view complex_query as
select count ( row_id ) row_count from (
select rownum row_id, bricks.*, pens.*
from colours
join pens
on colours.colour = pens.colour
join bricks
on colours.colour = bricks.colour
order by bricks.brick_id
);
select /* init */* from complex_query;
select /* init */count ( colour ) from bricks_indexed;

To make SQL statements faster, you need to reduce the amount of work they do. The two key resources are CPU and I/O.
I/O is usually the limiting factor in databases. All other things being equal, the less I/O a statement does, the better it will scale.
Oracle Database stores rows in blocks. To access a row, it must first read the block the row is stored in. Each time the database reads a new block, this is another logical I/O operation.
So one of the key metrics to measure in Oracle Database is the logical I/O for a query. This has two components:
The first component is consistent mode gets. These access blocks as they existed at some point in the past, usually the time the query started. This ensures the query only returns values that existed when it started; any changes made after this point are ignored.

You'll see these gets when running queries and when processing the WHERE clause of UPDATE and DELETE statements.

The second component is current mode gets. These access the block as it exists right now. You'll see these gets when changing data in INSERT, UPDATE, and DELETE statements.
When reading a block - whether in consistent or current mode - if it's not cached in memory, the database will also do a physical read. This fetches the block from disk.
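If you have access to the V$ performance views, you can also watch these two types of get accumulate for your session. This is a sketch, assuming SELECT privileges on V$MYSTAT and V$STATNAME (not granted to ordinary users by default); the statistic named "db block gets" counts the current mode gets:

```sql
-- Show this session's running totals for the two types of logical I/O.
-- 'consistent gets' = consistent mode gets; 'db block gets' = current mode gets.
select sn.name, ms.value
from   v$mystat ms
join   v$statname sn
on     sn.statistic# = ms.statistic#
where  sn.name in ( 'consistent gets', 'db block gets' );
```

Run a query, then run this again: the consistent gets figure will have increased.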
You can view the I/O for a statement with the IOSTATS format in DBMS_XPlan. This adds a Buffers column to the plan. This sums the consistent and current mode gets for the operations:
select /*+ gather_plan_statistics */count(*) from bricks;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'IOSTATS LAST'));

The IOSTATS format reports the logical I/O in the Buffers column for each operation in the plan. These figures are cumulative. So to find the I/Os for an operation, you need to subtract the values for its direct children.
select /*+ gather_plan_statistics */
count (*)
from colours c
join bricks b
on c.colour = b.colour;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'IOSTATS LAST'));
The Buffers reported at the top of the plan (step 0) is the total I/O for the query. So this query did ~44 gets. The vast majority of these were reading the BRICKS table, with a couple for reading the COLOURS table.
The HASH JOIN itself does few or no gets. Verify this by subtracting the total gets for reading COLOURS and BRICKS from the Buffers at operation 2.
So does this mean the join does no work?! No! The database will attempt to process the join in memory. But if there's not enough memory available it will have to write to temporary disk.
Replace the /*TODO*/ section in the call to DBMS_XPlan to see how many buffers were needed to run this query:
select /*+ gather_plan_statistics */ *
from colours c
join bricks b
on c.colour = b.colour
order by b.brick_id
fetch first 1 rows only;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => '/*TODO*/'));

Sorting and joining operations need to read and write data too. Often these are able to process the rows in memory. You can see how much memory an operation needed by getting the plan with the MEMSTATS format:
select /*+ gather_plan_statistics */* from complex_query;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'MEMSTATS LAST'));
This adds OMem, 1Mem, and Used-Mem columns to the plan:
----------------------------------------------------------------
| Id  | Operation                  | OMem  | 1Mem  | Used-Mem  |
----------------------------------------------------------------
|   0 | SELECT STATEMENT           |       |       |           |
|   1 |  VIEW                      |       |       |           |
|   2 |   SORT AGGREGATE           |       |       |           |
|   3 |    VIEW                    |       |       |           |
|   4 |     SORT ORDER BY          |   17M | 1546K |   15M (0) |
|   5 |      COUNT                 |       |       |           |
|*  6 |       HASH JOIN            | 1856K | 1856K | 1115K (0) |
|   7 |        TABLE ACCESS FULL   |       |       |           |
|*  8 |        HASH JOIN           | 2171K | 2171K | 1241K (0) |
|   9 |         TABLE ACCESS FULL  |       |       |           |
|  10 |         TABLE ACCESS FULL  |       |       |           |
----------------------------------------------------------------
Unlike the I/O figures, these values are specific to each operation in the plan. The OMem and 1Mem figures are estimates. Used-Mem reports how much memory it actually used.
So the database estimated that the HASH JOIN at operation 8 would need 2,171KB of memory. But it only used about half that: 1,241KB.
If memory is limited, or when processing huge data sets, all the rows may not fit in memory. In this case the database needs to write the data to disk to complete the operation. Oracle Database uses the temporary tablespace to do this.
When the database is unable to sort rows in memory it writes them to temporary disk. You can see the amount of disk needed in the Used-Tmp column with the MEMSTATS format.
To view the number of disk reads and writes, check the Reads and Writes columns with the IOSTATS format.
An easy way to display all this information is with the ALLSTATS format. This is shorthand for MEMSTATS IOSTATS ROWSTATS.
The alter session statements in this section limit the amount of memory available for joins and sorts. So these operations will need to write to disk:
alter session set workarea_size_policy = manual;
alter session set sort_area_size = 25000;
select /*+ gather_plan_statistics */ row_count from complex_query;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'ALLSTATS LAST'));
It's rare that you need to set sort_area_size. The above is only to show the principle on a small data set.
As with Buffers, the Reads and Writes column figures are cumulative. You may also see physical reads when querying rows that are not cached in memory (the buffer cache in Oracle Database).
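To see the Reads column fill in, the blocks must be missing from the buffer cache. On a private test database (never a shared or production system), you can force this by flushing the cache. This needs the ALTER SYSTEM privilege, so treat it as an illustration rather than something to run on Live SQL:

```sql
-- Private test database only: empty the buffer cache so the next query
-- must fetch its blocks from disk with physical reads
alter system flush buffer_cache;

select /*+ gather_plan_statistics */ count(*) from bricks;

-- Called with no SQL ID, display_cursor reports the last statement this
-- session ran. The Reads column should now be non-zero for the scan of BRICKS.
select *
from table(dbms_xplan.display_cursor(format => 'IOSTATS LAST'));
```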
When writing a query, it's good practice to only select the columns you need. This reduces the amount of data sent over the network from the database to the client. And it can make operations like sorting and grouping more efficient.
These two queries get the first 25 rows in the table, sorted by colour. But the first fetches all the columns in the table. The second only gets the colour values:
select /*+ gather_plan_statistics */*
from bricks
order by colour
fetch first 25 rows only;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'MEMSTATS LAST'));
select /*+ gather_plan_statistics */ colour
from bricks
order by colour
fetch first 25 rows only;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'MEMSTATS LAST'));
This leads to a big reduction in the memory needed to do the sort. When working with huge data sets, this reduces the chance the database will have to use temporary disk to order the data.
Selecting only the values you need also makes it more likely that the optimizer will use an index.
A full table scan reads all the rows in a table. If the table stores millions of rows, this leads to lots of wasted work if your query only reads a few rows. You can create or change data structures to enable the database to read rows more efficiently.
The most common way to do this is by creating indexes! These can help the database find a few rows quickly.
These two queries show the power an index can have. The BRICKS and BRICKS_INDEXED tables store the same rows. But the second table has an index on COLOUR. Because the queries only select COLOUR, the database is able to answer the second query by reading the index instead of the table:
select /*+ gather_plan_statistics */
colour, count(*)
from bricks
group by colour
order by colour;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'IOSTATS LAST'));
select /*+ gather_plan_statistics */
colour, count(*)
from bricks_indexed
group by colour
order by colour;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'IOSTATS LAST'));
The second query does around a quarter as many gets as the first, a considerable saving! This is because the index is a much smaller data structure than the table. You can verify this by querying *_SEGMENTS, which reports the size of each data structure in bytes:
select segment_name, segment_type, bytes
from user_segments
where segment_name in (
'BRICKS', 'BRICKS_INDEXED', 'BRICK_COLOUR_I'
);

Indexes can also help the database find the rows it's looking for faster. With a full table scan, the database reads every row and checks if it meets the WHERE clause.
This means the database may read more rows than your query returns, leading to wasted I/O. You can verify this by looking at the "Predicate Information" section of an execution plan:
select /*+ gather_plan_statistics */
count(*)
from bricks
where colour = 'red';
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'IOSTATS LAST'));
For this query, the predicate is a filter operation:
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("COLOUR"='red')
Whereas with an index on COLOUR, the database can use this to only read red rows:
select /*+ gather_plan_statistics */
count(*)
from bricks_indexed
where colour = 'red';
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'IOSTATS LAST'));
The predicate is now an ACCESS:
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("COLOUR"='red')
This means the database only reads rows that match the search criteria. In this case it led to a big reduction in work: just three I/Os to read the rows with an index, compared to 40+ for a full table scan.
The next two modules will teach you how to create indexes and get the best out of them in Oracle Database.
Run this query and get its execution stats:
select /*+ gather_plan_statistics */ *
from colours c
join bricks b
on c.colour = b.colour
order by b.colour
fetch first 10 rows only;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'ALLSTATS LAST'));
This only needs to return the COLOUR of the first ten rows stored in BRICKS, sorted by COLOUR. Other columns are unnecessary. What changes can you make to this query to reduce the resources it uses?
Hint: remember BRICKS_INDEXED stores the same data as BRICKS.
Experiment with selecting different values. What's the smallest amount of memory and buffers you can get a query returning the top 10 colours to use?
DBMS_XPlan only returns metrics which have non-zero values for at least one operation in the plan. If a column is missing (e.g. Used-Mem) this means the database needed no extra memory to process the query!