Online Partition Operations (1)

Script Name Online Partition Operations (1)
Description Demonstrate ONLINE partition operations (1): NON-PARTITIONED TABLE to RANGE PARTITIONED TABLE.
Note 1: please consult the documentation at docs.oracle.com for more details.
Note 2: the reference at support.oracle.com is "Master Note for Partitioning (Doc ID 1312352.1)".
Area Partitioning
Contributor Thomas Teske (Oracle)
Created Monday February 19, 2018

Statement 1
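We work with ONE table only. DROP it first so that the script builds it entirely from scratch. On a first run the table does not exist yet, so the ORA-00942 error shown after the statement is expected and harmless.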
```
drop table events purge;
```
```
ORA-00942: table or view does not exist 
```
More Details: https://docs.oracle.com/error-help/db/ora-00942
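
If the expected ORA-00942 on a first run is a nuisance, the drop can be wrapped in an anonymous PL/SQL block that ignores only that one error. This is a minimal sketch and not part of the original script; it assumes the block is run by the owner of EVENTS.

```sql
-- Sketch (not part of the original script): drop EVENTS only if it exists.
begin
  execute immediate 'drop table events purge';
exception
  when others then
    -- ORA-00942 (table or view does not exist) is expected on a first run;
    -- any other error is re-raised unchanged.
    if sqlcode != -942 then
      raise;
    end if;
end;
/
```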

Statement 2

Create an empty table with an identity column ...

Create an empty, non-partitioned table having a generated IDENTITY

```
create table events
( event_id integer GENERATED BY DEFAULT AS IDENTITY (START WITH 1) NOT NULL PRIMARY KEY,
  event_date date not null,
  sensor_id  integer not null,
  reading_uom varchar2(32) not null,
  reading_val varchar2(32) not null )
```

Table created.
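
To confirm how the identity column was defined, the data dictionary can be queried. A small sketch, not part of the original script, assuming the table lives in the current schema:

```sql
-- Sketch: show the identity definition of EVENTS.EVENT_ID.
select column_name,
       generation_type,     -- how the identity value is generated
       identity_options     -- START WITH, INCREMENT BY, and so on
from   user_tab_identity_cols
where  table_name = 'EVENTS';
```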

Statement 3

Seed the first set of data: just 20 records, one per sensor.

1st seeding: data for ONE DATE and TWENTY sensors

```
insert into events ( event_date, sensor_id, reading_uom, reading_val )
SELECT to_date ('19981215','YYYYMMDD'),
                 rownum,
                 'Ampere',
                 round( dbms_random.value(1,1000), 6 )
FROM dual CONNECT BY LEVEL <= 20
```

Statement 4

For each existing row, add another nine records with different event dates.

2nd seeding: data for VARIOUS RANDOM DATES within a span of up to TWELVE years

```
insert into events ( event_date, sensor_id, reading_uom, reading_val )
SELECT to_date ('19981215','YYYYMMDD') + round( dbms_random.value(1, 12 * 365 ), 4 ),
       sensor_id,
       reading_uom,
       round( dbms_random.value(1,1000), 6 )
FROM events,
     ( select 'something' FROM dual CONNECT BY LEVEL <= 9 )
```

Statement 5

Generate another set of records - ten new rows for each row currently in the table.

3rd seeding: data for VARIOUS RANDOM DATES within a span of up to FIVE years

```
insert into events ( event_date, sensor_id, reading_uom, reading_val )
SELECT to_date ('19981215','YYYYMMDD') +  12 * 365 + round( dbms_random.value(1, 5 * 365 ), 6 ),
       sensor_id,
       reading_uom,
       round( dbms_random.value(1,1000), 6 )
FROM events,
     ( select 'something' FROM dual CONNECT BY LEVEL <= 10 )
```
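
The seeding steps multiply quickly. Assuming each insert ran exactly once against the freshly created table, statement 3 adds 20 rows, statement 4 adds 20 x 9 = 180 (200 in total), and statement 5 adds 200 x 10 = 2,000 (2,200 in total). A quick check along these lines, which is not part of the original script, should confirm the arithmetic:

```sql
-- Sketch: sanity-check the seeded data after statements 3 to 5.
-- Under the assumptions above, total_rows should be 2200 and sensors 20.
select count(*)                  as total_rows,
       count(distinct sensor_id) as sensors,
       min(event_date)           as earliest_event,
       max(event_date)           as latest_event
from   events;
```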

Statement 6

Now change the non-partitioned table into an explicitly range partitioned table by year.

Change NON-PARTITIONED table to RANGE PARTITIONED table ONLINE including INDEX maintenance

```
alter table events modify
partition by range ( event_date )
(PARTITION observations_PAST    VALUES LESS THAN (TO_DATE('20000101','YYYYMMDD')),
 PARTITION observations_CY_2000 VALUES LESS THAN (TO_DATE('20010101','YYYYMMDD')),
 PARTITION observations_CY_2001 VALUES LESS THAN (TO_DATE('20020101','YYYYMMDD')),
 PARTITION observations_CY_2002 VALUES LESS THAN (TO_DATE('20030101','YYYYMMDD')),
 PARTITION observations_CY_2003 VALUES LESS THAN (TO_DATE('20040101','YYYYMMDD')),
 PARTITION observations_CY_2004 VALUES LESS THAN (TO_DATE('20050101','YYYYMMDD')),
 PARTITION observations_CY_2005 VALUES LESS THAN (TO_DATE('20060101','YYYYMMDD')),
 PARTITION observations_CY_2006 VALUES LESS THAN (TO_DATE('20070101','YYYYMMDD')),
 PARTITION observations_CY_2007 VALUES LESS THAN (TO_DATE('20080101','YYYYMMDD')),
 PARTITION observations_CY_2008 VALUES LESS THAN (TO_DATE('20090101','YYYYMMDD')),
 PARTITION observations_CY_2009 VALUES LESS THAN (TO_DATE('20100101','YYYYMMDD')),
 PARTITION observations_CY_2010 VALUES LESS THAN (TO_DATE('20110101','YYYYMMDD')),
 PARTITION observations_CY_2011 VALUES LESS THAN (TO_DATE('20120101','YYYYMMDD')),
 PARTITION observations_CY_2012 VALUES LESS THAN (TO_DATE('20130101','YYYYMMDD')),
 PARTITION observations_CY_2013 VALUES LESS THAN (TO_DATE('20140101','YYYYMMDD')),
 PARTITION observations_CY_2014 VALUES LESS THAN (TO_DATE('20150101','YYYYMMDD')),
 PARTITION observations_CY_2015 VALUES LESS THAN (TO_DATE('20160101','YYYYMMDD')),
 PARTITION observations_CY_2016 VALUES LESS THAN (TO_DATE('20170101','YYYYMMDD')),
 PARTITION observations_CY_2017 VALUES LESS THAN (TO_DATE('20180101','YYYYMMDD')),
 PARTITION observations_CY_2018 VALUES LESS THAN (TO_DATE('20190101','YYYYMMDD')),
 PARTITION observations_FUTURE  VALUES LESS THAN ( MAXVALUE )
) ONLINE
UPDATE INDEXES;
```
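
After the conversion it is worth checking that the table is now reported as partitioned and that its indexes survived the operation in a usable state. The following dictionary queries are a sketch, not part of the original script, and assume EVENTS is in the current schema:

```sql
-- Sketch: is EVENTS partitioned now?
select table_name, partitioned
from   user_tables
where  table_name = 'EVENTS';

-- Sketch: were the indexes (here the primary key index) maintained?
-- A STATUS of UNUSABLE would indicate that index maintenance did not happen.
select index_name, partitioned, status
from   user_indexes
where  table_name = 'EVENTS';
```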

Statement 7

Get the range of years covered by the data, as a check on the range partitioning.

Simple query - obtain earliest and latest year

```
select sensor_id,
       to_char( min(event_date),    'YYYY')  earliest_year,
       to_char( max(event_date),    'YYYY')  latest_year
from   events
group by cube(sensor_id);
```

Statement 8

We again generate ten times the current data volume, this time in a new range of dates. The reason: we want a skewed data distribution.

4th seeding: data for VARIOUS RANDOM DATES within a span of up to FIVE years

```
insert into events ( event_date, sensor_id, reading_uom, reading_val )
SELECT to_date ('19981215','YYYYMMDD') +  19 * 365 + round( dbms_random.value(1, 5 * 365 ), 6 ),
       sensor_id,
       reading_uom,
       round( dbms_random.value(1,1000), 6 )
FROM events,
     ( select 'something' FROM dual CONNECT BY LEVEL <= 10 )
```

Statement 9
Now we intend to start querying the data. Let us gather table statistics to keep the optimizer well informed.
1st time statistics gathering - the entire table
```
execute DBMS_STATS.GATHER_TABLE_STATS( null, tabname => 'events')
```
Statement 10
Now we want to know how much data is located in each partition.
Provide basic information about the TABLE and PARTITIONS
```
select partition_name, subpartition_count, num_rows 
from user_tab_partitions
order by partition_position
```
Statement 11
A simple query (without any statistical or windowing functions): for each sensor_id, report the earliest and latest year of observations, the number of distinct days with observations, and the number of distinct hours of the day with observations. This gives a brief overview of the data distribution. Is it sparse? Probably yes.
We have all these FUTURE-dated events. Just an overview to identify the need for partition splitting
```
select sensor_id,  
       to_char( min(event_date),    'YYYY')  earliest_year,  
       to_char( max(event_date),    'YYYY')  latest_year, 
       count(*) readings_in_year, 
       count( distinct to_char( event_date, 'YYYYMMDD' ) ) days_having_readings,
       count( distinct to_char( event_date, 'HH24'     ) ) hours_having_readings 
from   events partition ( observations_future ) 
group by cube(sensor_id) 
order by sensor_id
```

Statement 12

Note that we keep a FUTURE partition but move the years 2019 to 2022 out of it into partitions of their own.

Split the partition for FUTURE data into several ones in one step.

```
ALTER TABLE events SPLIT PARTITION observations_future INTO
  (PARTITION observations_CY_2019 VALUES LESS THAN ( to_date( '20200101', 'YYYYMMDD')),
   PARTITION observations_CY_2020 VALUES LESS THAN ( to_date( '20210101', 'YYYYMMDD')),
   PARTITION observations_CY_2021 VALUES LESS THAN ( to_date( '20220101', 'YYYYMMDD')),
   PARTITION observations_CY_2022 VALUES LESS THAN ( to_date( '20230101', 'YYYYMMDD')),
   PARTITION observations_FUTURE
   ) ONLINE;
```

Statement 13

Consider the NEW functionality for partial/efficient statistics management (incremental statistics) - we keep it simple here for clarity. https://blogs.oracle.com/optimizer/efficient-statistics-maintenance-for-partitioned-tables-using-incremental-statistics-part-1

Produce OPTIMIZER STATISTICS for the NEWLY split partitions - show the results

```
execute DBMS_STATS.GATHER_TABLE_STATS( null, tabname => 'events', partname => 'observations_cy_2019');
execute DBMS_STATS.GATHER_TABLE_STATS( null, tabname => 'events', partname => 'observations_cy_2020');
execute DBMS_STATS.GATHER_TABLE_STATS( null, tabname => 'events', partname => 'observations_cy_2021');
execute DBMS_STATS.GATHER_TABLE_STATS( null, tabname => 'events', partname => 'observations_cy_2022');
execute DBMS_STATS.GATHER_TABLE_STATS( null, tabname => 'events', partname => 'observations_future');
```
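
The incremental statistics feature mentioned above could replace the per-partition calls. The following is a hedged sketch, not the script author's method: it sets the INCREMENTAL table preference and then gathers statistics once for the whole table, leaving it to DBMS_STATS to decide which partitions need work. It assumes EVENTS is in the current schema and that the other gathering preferences are left at their defaults, which the linked blog post describes as a precondition.

```sql
-- Sketch: enable incremental statistics for EVENTS, then gather once.
begin
  dbms_stats.set_table_prefs( user, 'EVENTS', 'INCREMENTAL', 'TRUE' );
  dbms_stats.gather_table_stats( user, 'EVENTS' );
end;
/

-- Confirm the preference that is now in effect for the table.
select dbms_stats.get_prefs( 'INCREMENTAL', user, 'EVENTS' ) as incremental_pref
from   dual;
```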

Statement 14

Query OPTIMIZER STATISTICS for ALL partitions - show the results

```
select partition_name, subpartition_count, num_rows
from user_tab_partitions
order by partition_position
```

Statement 15

This should start a long-lasting discussion. What is the best way to work efficiently and effectively with dates? Will you use categories or numbers to represent them? My answer: it depends on your use case!

Just another simple query - trick question: is the predicate on EVENT_DATE clever?

```
select sensor_id,
       to_char( min(event_date),    'YYYYMMDD')  earliest_day,
       to_char( max(event_date),    'YYYYMMDD')  latest_day,
       count(*) readings_in_month,
       count( distinct to_char( event_date, 'YYYYMMDD' ) ) days_having_readings,
       count( distinct to_char( event_date, 'HH24'     ) ) hours_having_readings
from   events
where  to_char( event_date, 'YYYYMM' ) = '201807'
group by cube(sensor_id)
order by sensor_id
```
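
One possible answer to the trick question: wrapping EVENT_DATE in TO_CHAR hides the partition key from the optimizer, so partition pruning is generally not possible and all partitions are visited. A half-open range on the bare column selects the same month and lets the optimizer restrict the scan to the partition that can contain July 2018. The sketch below is not part of the original script; the EXPLAIN PLAN at the end is only a suggested way to compare the Pstart/Pstop columns of both variants.

```sql
-- Sketch: same report as statement 15, with a predicate on the bare partition key.
select sensor_id,
       to_char( min(event_date), 'YYYYMMDD' ) earliest_day,
       to_char( max(event_date), 'YYYYMMDD' ) latest_day,
       count(*) readings_in_month,
       count( distinct to_char( event_date, 'YYYYMMDD' ) ) days_having_readings,
       count( distinct to_char( event_date, 'HH24'     ) ) hours_having_readings
from   events
where  event_date >= date '2018-07-01'   -- first instant of July 2018
and    event_date <  date '2018-08-01'   -- strictly before August 2018
group by cube(sensor_id)
order by sensor_id;

-- Sketch: inspect which partitions the optimizer plans to touch.
explain plan for
select count(*)
from   events
where  event_date >= date '2018-07-01'
and    event_date <  date '2018-08-01';

select * from table( dbms_xplan.display );
```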

Statement 16
At times you need the YEAR or HOUR from the EVENT_DATE - why not just create virtual columns? This is significantly easier than creating VIEWS. Best of all: you can use them with the In-Memory column store in columnar fashion, which can be a significant performance gain.
Add virtual columns CAL_YEAR and CAL_HOUR - make it look nicer with virtual columns
```
alter table events add cal_year as ( to_number( to_char( event_date, 'YYYY' )));
alter table events add cal_hour as ( to_number( to_char( event_date, 'HH24' )));
```
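
Once the virtual columns exist, they can be used like any other column. A short sketch, not part of the original script, that summarises readings per calendar year using CAL_YEAR and CAL_HOUR as defined above:

```sql
-- Sketch: readings per calendar year, via the virtual columns.
select cal_year,
       count(*)                   as readings,
       count( distinct cal_hour ) as hours_having_readings
from   events
group by cal_year
order by cal_year;
```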
Statement 17
Please share your feedback with me. Send an email to thomas.teske@oracle.com or send a Tweet to @thomasteskeorcl - Thank you!
Thank you
```
select 'That is all folks' from dual;
```
