alter session set WORKAREA_SIZE_POLICY = auto;
create table colours as
select 'red' colour from dual
union all
select 'blue' colour from dual
union all
select 'green' colour from dual
union all
select 'orange' colour from dual
union all
select 'yellow' colour from dual
union all
select 'purple' colour from dual
union all
select 'brown' colour from dual
union all
select 'white' colour from dual
union all
select 'black' colour from dual;
create table bricks as
select rownum brick_id, colour, sysdate insert_datetime, lpad ( 'x', 100, 'x' ) stuff
from colours
cross join (
select * from dual
connect by level <= 250
);
create table pens as
select rownum pen_id, colour, sysdate insert_datetime, lpad ( 'x', 100, 'x' ) stuff
from colours
cross join (
select * from dual
connect by level <= 250
);
create table bricks_indexed as
select * from bricks;
create index brick_colour_i on bricks_indexed ( colour );
alter table bricks_indexed
modify colour not null;
create or replace view complex_query as
select count ( row_id ) row_count from (
select rownum row_id, bricks.*, pens.*
from colours
join pens
on colours.colour = pens.colour
join bricks
on colours.colour = bricks.colour
order by bricks.brick_id
);
select /* init */* from complex_query;
select /* init */count ( colour ) from bricks_indexed;

To make SQL statements faster, you need to reduce the amount of work they do. The two key resources are CPU and I/O.
I/O is usually the limiting factor in databases. All other things being equal, the less I/O a statement does, the better it will scale.
Oracle Database stores rows in blocks. To access a row, it must first read the block the row is stored in. Each time the database reads a new block, this is another logical I/O operation.
So one of the key metrics to measure in Oracle Database is the logical I/O for a query. This has two components:
The first component is consistent mode gets. These access blocks as they existed at some point in the past, usually the time the query started. This ensures the query only returns values that existed when it started; any changes made after this point are ignored.

You'll see these gets when running queries and when processing the WHERE clause of UPDATE and DELETE statements.

The second component is current mode gets. These access the block as it exists right now. You'll see these gets when changing data in INSERT, UPDATE, and DELETE statements.
When reading a block - whether in consistent or current mode - if it's not cached in memory, the database will also do a physical read. This fetches the block from disk.
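If you have access to the V$ performance views, you can also watch these two types of get accumulate for your session. This is a sketch, assuming SELECT privileges on V$MYSTAT and V$STATNAME (not granted to ordinary users by default); the statistic named "db block gets" counts the current mode gets:

```sql
-- Show this session's running totals for the two types of logical I/O.
-- 'consistent gets' = consistent mode gets; 'db block gets' = current mode gets.
select sn.name, ms.value
from   v$mystat ms
join   v$statname sn
on     sn.statistic# = ms.statistic#
where  sn.name in ( 'consistent gets', 'db block gets' );
```

Run a query, then run this again: the consistent gets figure will have increased.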
You can view the I/O for a statement with the IOSTATS format in DBMS_XPlan. This adds a Buffers column to the plan. This sums the consistent and current mode gets for the operations:
select /*+ gather_plan_statistics */count(*) from bricks;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'IOSTATS LAST'));

The IOSTATS format reports the logical I/O in the Buffers column for each operation in the plan. These figures are cumulative. So to find the I/Os for an operation, you need to subtract the values for its direct children.
select /*+ gather_plan_statistics */
count (*)
from colours c
join bricks b
on c.colour = b.colour;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'IOSTATS LAST'));
The Buffers reported at the top of the plan (step 0) is the total I/O for the query. So this query did ~44 gets. The vast majority of these were reading the BRICKS table, with a couple for reading the COLOURS table.
The HASH JOIN itself does few or no gets. Verify this by subtracting the total gets for reading COLOURS and BRICKS from the Buffers at operation 2.
So does this mean the join does no work?! No! The database will attempt to process the join in memory. But if there's not enough memory available it will have to write to temporary disk.
Replace the /*TODO*/ section in the call to DBMS_XPlan to see how many buffers were needed to run this query:
select /*+ gather_plan_statistics */ *
from colours c
join bricks b
on c.colour = b.colour
order by b.brick_id
fetch first 1 rows only;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => '/*TODO*/'));

Sorting and joining operations need to read and write data too. Often these are able to process the rows in memory. You can see how much memory an operation needed by getting the plan with the MEMSTATS format:
select /*+ gather_plan_statistics */* from complex_query;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'MEMSTATS LAST'));
This adds OMem, 1Mem, and Used-Mem columns to the plan:
----------------------------------------------------------------
| Id  | Operation                  | OMem  | 1Mem  | Used-Mem  |
----------------------------------------------------------------
|   0 | SELECT STATEMENT           |       |       |           |
|   1 |  VIEW                      |       |       |           |
|   2 |   SORT AGGREGATE           |       |       |           |
|   3 |    VIEW                    |       |       |           |
|   4 |     SORT ORDER BY          |   17M | 1546K |   15M (0) |
|   5 |      COUNT                 |       |       |           |
|*  6 |       HASH JOIN            | 1856K | 1856K | 1115K (0) |
|   7 |        TABLE ACCESS FULL   |       |       |           |
|*  8 |        HASH JOIN           | 2171K | 2171K | 1241K (0) |
|   9 |         TABLE ACCESS FULL  |       |       |           |
|  10 |         TABLE ACCESS FULL  |       |       |           |
----------------------------------------------------------------
Unlike the I/O figures, these values are specific to each operation in the plan. The OMem and 1Mem figures are estimates. Used-Mem reports how much memory it actually used.
So the database estimated that the HASH JOIN at operation 8 would need 2,171KB of memory. But it only used about half that: 1,241KB.
If memory is limited, or when processing huge data sets, all the rows may not fit in memory. In this case the database needs to write the data to disk to complete the operation. Oracle Database uses the temporary tablespace to do this.
When the database is unable to sort rows in memory it writes them to temporary disk. You can see the amount of disk needed in the Used-Tmp column with the MEMSTATS format.
To view the number of disk reads and writes, check the Reads and Writes columns with the IOSTATS format.
An easy way to display all this information is with the ALLSTATS format. This is shorthand for MEMSTATS IOSTATS ROWSTATS.
The alter session statements in this section limit the amount of memory available for joins and sorts. So these operations will need to write to disk:
alter session set workarea_size_policy = manual;
alter session set sort_area_size = 25000;
select /*+ gather_plan_statistics */ row_count from complex_query;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'ALLSTATS LAST'));
It's rare that you need to set sort_area_size. The above is only to show the principle on a small data set.
As with Buffers, the Reads and Writes column figures are cumulative. You may also see physical reads when querying rows that are not cached in memory (the buffer cache in Oracle Database).
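To see the Reads column fill in, the blocks must be missing from the buffer cache. On a private test database (never a shared or production system), you can force this by flushing the cache. This needs the ALTER SYSTEM privilege, so treat it as an illustration rather than something to run on Live SQL:

```sql
-- Private test database only: empty the buffer cache so the next query
-- must fetch its blocks from disk with physical reads
alter system flush buffer_cache;

select /*+ gather_plan_statistics */ count(*) from bricks;

-- Called with no SQL ID, display_cursor reports the last statement this
-- session ran. The Reads column should now be non-zero for the scan of BRICKS.
select *
from table(dbms_xplan.display_cursor(format => 'IOSTATS LAST'));
```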
When writing a query, it's good practice to only select the columns you need. This reduces the amount of data sent over the network from the database to the client. And it can make operations like sorting and grouping more efficient.
These two queries get the first 25 rows in the table, sorted by colour. But the first fetches all the columns in the table. The second only gets the colour values:
select /*+ gather_plan_statistics */*
from bricks
order by colour
fetch first 25 rows only;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'MEMSTATS LAST'));
select /*+ gather_plan_statistics */ colour
from bricks
order by colour
fetch first 25 rows only;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'MEMSTATS LAST'));
This leads to a big reduction in the memory needed to do the sort. When working with huge data sets, this reduces the chance the database will have to use temporary disk to order the data.
Selecting only the values you need also makes it more likely that the optimizer will use an index.
A full table scan reads all the rows in a table. If the table stores millions of rows, this leads to lots of wasted work if your query only reads a few rows. You can create or change data structures to enable the database to read rows more efficiently.
The most common way to do this is by creating indexes! These can help the database find a few rows quickly.
These two queries show the power an index can have. The BRICKS and BRICKS_INDEXED tables store the same rows. But the second table has an index on COLOUR. Because the queries only select COLOUR, the database is able to answer the second query by reading the index instead of the table:
select /*+ gather_plan_statistics */
colour, count(*)
from bricks
group by colour
order by colour;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'IOSTATS LAST'));
select /*+ gather_plan_statistics */
colour, count(*)
from bricks_indexed
group by colour
order by colour;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'IOSTATS LAST'));
The second query does around a quarter as many gets as the first, a considerable saving! This is because the index is a much smaller data structure than the table. You can verify this by querying *_SEGMENTS, which reports the size of each data structure in bytes:
select segment_name, segment_type, bytes
from user_segments
where segment_name in (
'BRICKS', 'BRICKS_INDEXED', 'BRICK_COLOUR_I'
);

Indexes can also help the database find the rows it's looking for faster. With a full table scan, the database reads every row and checks if it meets the WHERE clause.
This means the database may read more rows than your query returns, leading to wasted I/O. You can verify this by looking at the "Predicate Information" section of an execution plan:
select /*+ gather_plan_statistics */
count(*)
from bricks
where colour = 'red';
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'IOSTATS LAST'));
For this query, the predicate is a filter operation:
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("COLOUR"='red')
Whereas with an index on COLOUR, the database can use this to only read red rows:
select /*+ gather_plan_statistics */
count(*)
from bricks_indexed
where colour = 'red';
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'IOSTATS LAST'));
The predicate is now an ACCESS:
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("COLOUR"='red')
This means the database only reads rows that match the search criteria. In this case it led to a big reduction in work: just three I/Os to read the rows with an index, compared to 40+ for a full table scan.
The next two modules will teach you how to create indexes and get the best out of them in Oracle Database.
Run this query and get its execution stats:
select /*+ gather_plan_statistics */ *
from colours c
join bricks b
on c.colour = b.colour
order by b.colour
fetch first 10 rows only;
select *
from table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'ALLSTATS LAST'));
This only needs to return the COLOUR of the first ten rows stored in BRICKS, sorted by COLOUR. Other columns are unnecessary. What changes can you make to this query to reduce the resources it uses?
Hint: remember BRICKS_INDEXED stores the same data as BRICKS.
Experiment with selecting different values. What's the smallest amount of memory and buffers you can get a query returning the top 10 colours to use?
DBMS_XPlan only returns metrics which have non-zero values for at least one operation in the plan. If a column is missing (e.g. Used-Mem) this means the database needed no extra memory to process the query!