CREATE SYNONYM time_dim FOR av.time_dim;
CREATE SYNONYM product_dim FOR av.product_dim;
CREATE SYNONYM geography_dim FOR av.geography_dim;
CREATE SYNONYM sales_fact FOR av.sales_fact;
CREATE OR REPLACE ATTRIBUTE DIMENSION time_attr_dim
USING time_dim
ATTRIBUTES (
year_name
, quarter_name
, month_name
, month_end_date)
LEVEL year
KEY year_name
LEVEL quarter
KEY quarter_name
DETERMINES (year_name)
LEVEL month
KEY month_name
ORDER BY month_end_date
DETERMINES (month_end_date,quarter_name);
CREATE OR REPLACE HIERARCHY time_hier
USING time_attr_dim (
month CHILD OF
quarter CHILD OF
year);
CREATE OR REPLACE ATTRIBUTE DIMENSION product_attr_dim
USING product_dim
ATTRIBUTES (
department_name
, category_name
, category_id)
LEVEL department
KEY department_name
LEVEL category
KEY category_id
MEMBER NAME category_name
DETERMINES (department_name);
CREATE OR REPLACE HIERARCHY product_hier
USING product_attr_dim (
category CHILD OF
department);
CREATE OR REPLACE ATTRIBUTE DIMENSION geography_attr_dim
USING geography_dim
ATTRIBUTES (
region_name
, country_name
, state_province_name
, state_province_id)
LEVEL region
KEY region_name
LEVEL country
KEY country_name
DETERMINES (region_name)
LEVEL state_province
KEY state_province_id
MEMBER NAME state_province_name
DETERMINES (country_name);
CREATE OR REPLACE HIERARCHY geography_hier
USING geography_attr_dim (
state_province CHILD OF
country CHILD OF
region );
CREATE OR REPLACE ANALYTIC VIEW sales_av
USING sales_fact
DIMENSION BY (
time_attr_dim
KEY month_id REFERENCES month_name
HIERARCHIES (time_hier DEFAULT )
, product_attr_dim
KEY category_id REFERENCES category_id
HIERARCHIES (product_hier DEFAULT)
, geography_attr_dim
KEY state_province_id REFERENCES state_province_id
HIERARCHIES (geography_hier DEFAULT)
)
MEASURES (
sales FACT sales
, units FACT units
, sales_prior_period AS (LAG(sales) OVER (HIERARCHY time_hier OFFSET 1)));
In the Update Tables with Simple Analytic Views - Part 1: Creating a Simple AV, analytic view objects were created using relatively simple DDL. This DDL includes only the elements necessary to create the structure needed for aggregation, hierarchical navigation, and calculation. It relied on defaults wherever possible and omitted descriptive metadata. This analytic view is well-suited for many applications and is easy to create.
Part 2 will demonstrate how easy it is to generate SQL to query this analytic view. Queries will be generated that select from both tables and the analytic view. You will see that the SQL required for tables grows increasingly complex as calculations are added and that the SQL required for the analytic view remains relatively simple.
Although these examples query the analytic view created with simple DDL, they are applicable to any analytic view. The analytic view is created by the Prerequisite SQL for this lab.
Before getting started with querying analytic view objects, take a quick look at the tables in the AV schema on Live SQL. The AV schema contains a simple star schema with three dimension tables and a fact table.
The TIME_DIM table.
SELECT * FROM time_dim;
The PRODUCT_DIM table.
SELECT * FROM product_dim;
The GEOGRAPHY_DIM table.
SELECT * FROM geography_dim;
And the SALES_FACT table.
SELECT * FROM sales_fact;
That is the raw material that we have to work with.
Take a minute and review the analytic view objects.
List the analytic views, hierarchies and levels.
SELECT
analytic_view_name
, hier_alias AS hierarchy
, level_name
, order_num
FROM
user_analytic_view_levels
ORDER BY
hier_alias,
order_num;
List the base (fact) measures. These are the measure that come from the fact table.
SELECT
analytic_view_name
, measure_name
, table_alias AS table_name
, column_name
FROM
user_analytic_view_base_meas;
List the calculated measures of the analytic view.
SELECT
analytic_view_name
, measure_name
, meas_expression
FROM
user_analytic_view_calc_meas;
Review the sample data using queries that select from the dimension lookup tables and hierarchy views.
A subset of columns in the tables was used in the analytic view. The queries will select those columns.
Query the TIME_DIM table.
SELECT
year_name
, quarter_name
, month_name
FROM
time_dim
ORDER BY
month_end_date;
The following query selects the same columns and rows from the TIME_HIER hierarchy view.
SELECT
year_name
, quarter_name
, month_name
FROM
time_hier
WHERE
level_name = 'MONTH'
ORDER BY
hier_order;
The hierarchy view can return rows for all levels of aggregation. Filtering to the lowest level causes the hierarchy view to return the same rows as the table.
The previous query selects attribute columns. Attribute columns reflect the data in the same columns of the dimension table.
Hierarchy views also include hierarchical columns. Hierarchical columns are automatically created in every hierarchy view. Hierarchical columns are very useful for navigating hierarchies. The names and roles are the same for every hierarchy view and analytic view, so they are easily queried with repeatable patterns.
The next query selects hierarchical columns for data at the Year level.
SELECT
member_name
, member_unique_name
, level_name
, hier_order
FROM
time_hier
WHERE
level_name = 'YEAR'
ORDER BY
hier_order;
The MEMBER_UNIQUE_NAME column is the primary key of the hierarchy view. Values will always be unique hierarchy members unique across levels. This solves the problem where the same value is at more than one level in the source tables. For example 'New York' in a CITY column and 'New York' in a STATE column.
To query Quarter level data just change the LEVEL_NAME filter to 'QUARTER'.
SELECT
member_name
, member_unique_name
, level_name
, hier_order
FROM
time_hier
WHERE
level_name = 'QUARTER'
ORDER BY
hier_order;
The hierarchy view can query data at multiple levels. Note that the values are sorted by HIER_ORDER, which sorts children within parents and ancestors.
SELECT
member_name
, member_unique_name
, level_name
, hier_order
FROM
time_hier
WHERE
level_name IN ('YEAR','QUARTER')
ORDER BY
hier_order;
Review data in the other dimension tables and hierarchy views.
This query selects from the PRODUCT_DIM table.
SELECT
department_name
, category_name
FROM
product_dim
ORDER BY
department_name
, category_name;
This query selects from the PRODUCT_HIER hierarchy, returning the same rows and columns.
SELECT
department_name
, category_name
FROM
product_hier
WHERE
level_name = 'CATEGORY'
ORDER BY
hier_order;
This query selects from the GEOGRAPHY_DIM table.
SELECT
region_name
, country_name
, state_province_name
FROM
geography_dim
ORDER BY
region_name
, country_name
, state_province_name;
This query selected from the GEOGRAPHY_HIER hierarchy, returning the same rows and columns.
SELECT
region_name
, country_name
, state_province_name
FROM
geography_hier
WHERE
level_name = 'STATE_PROVINCE'
ORDER BY
hier_order;
The hierarchical columns work well in query templates because they have a consistent set of names. Application developers will find them easy to use in SQL generators.
Select all of the hierarchical columns from the GEOGRAPHY_HIER hierarchy view.
SELECT
member_name
, member_unique_name
, parent_unique_name
, level_name
, hier_order
, depth
, is_leaf
FROM
geography_hier
ORDER BY
hier_order;
The following query selecting from the GEOGRAPHY_DIM table could be used to replicate the hierarchical columns of the hierarchy view.
WITH foo AS (
SELECT DISTINCT
region_name AS member_name
, '[REGION].&['
|| region_name
|| ']' AS member_unique_name
, NULL AS parent_unique_name
, 'REGION' AS level_name
, 0 AS depth
, 0 AS is_leaf
FROM
geography_dim
UNION
SELECT DISTINCT
country_name AS member_name
, '[COUNTRY].&['
|| country_name
|| ']' AS member_unique_name
, '[REGION].&['
|| region_name
|| ']' AS parent_unique_name
, 'COUNTRY' AS level_name
, 1 AS depth
, 0 AS is_leaf
FROM
geography_dim
UNION
SELECT DISTINCT
state_province_name AS member_name
, '[STATE_PROVINCE].&['
|| state_province_name
|| ']' AS member_unique_name
, '[COUNTRY].&['
|| country_name
|| ']' AS parent_unique_name
, 'STATE_PROVINCE' AS level_name
, 2 AS depth
, 1 AS is_leaf
FROM
geography_dim
)
SELECT
member_name,
member_unique_name,
parent_unique_name,
level_name,
depth,
is_leaf
FROM
foo;
Business intelligence applications often let users drill down on an aggregate value to reveal more detail.
The following query could be used to select children of Asia, selecting from the GEOGRPAHY_DIM table.
SELECT
country_name
FROM
geography_dim
WHERE
region_name = 'Asia';
The following query could be used to select children of Asia, selecting from the GEOGRAPHY_HIER hierarchy view.
SELECT
country_name
FROM
geography_hier
WHERE
region_name = 'Asia'
AND level_name = 'COUNTRY';
This query also selects the children of Asia. Because it filters on PARENT_UNIQUE_NAME, the LEVEL_NAME filter is not required.
.
SELECT
member_name
, member_unique_name
, parent_unique_name
, level_name
, hier_order
, depth
, is_leaf
FROM
geography_hier
WHERE
parent_unique_name = '[REGION].&[Asia]'
ORDER BY
hier_order;
This query template is reusable. It can also be used to select the children of Japan.
SELECT
member_name
, member_unique_name
, parent_unique_name
, level_name
, hier_order
, depth
, is_leaf
FROM
geography_hier
WHERE
parent_unique_name = '[COUNTRY].&[Japan]'
ORDER BY
hier_order;
Attribute columns (the original columns from the table) and hierarchical columns can be used in the same query. Use the columns in any way you find useful. In this example, the attribute columns are returned while the hierarchical columns are used in the WHERE and ORDER BY clauses.
SELECT
region_name
, country_name
, state_province_name
FROM
geography_hier
WHERE
parent_unique_name = '[COUNTRY].&[Japan]'
ORDER BY
hier_order;
A comparison of queries access the fact table will begin with a simple aggregation of Sales to the Year, Department, and Year levels.
The table query is relatively straightforward, with JOINS and GROUP BY.
SELECT
t.year_name
, p.department_name
, g.region_name
, SUM(sales)
FROM
time_dim t
, product_dim p
, geography_dim g
, sales_fact f
WHERE
t.month_name = f.month_id
AND p.category_id = f.category_id
AND g.state_province_id = f.state_province_id
GROUP BY
t.year_name
, p.department_name
, g.region_name
ORDER BY
t.year_name
, p.department_name
, g.region_name;
There are two basic forms of this query using the analytic view. The first selects from attribute columns and most closely mimics the table query.
SELECT
year_name
, department_name
, region_name
, sales
FROM
sales_av HIERARCHIES (
time_hier
, product_hier
, geography_hier
)
WHERE
time_hier.level_name = 'YEAR'
AND product_hier.level_name = 'DEPARTMENT'
AND geography_hier.level_name = 'REGION'
ORDER BY
year_name
, department_name
, region_name;
The HIERARCHIES clause replaces joins and GROUP BY.
It would probably be fair to characterize the complexity of these queries as being about the same.
The AV query could also be written to select hierarchical columns as in the following example.
SELECT
time_hier.member_name AS time
, product_hier.member_name AS product
, geography_hier.member_name AS geography
, sales
FROM
sales_av HIERARCHIES (
time_hier
, product_hier
, geography_hier
)
WHERE
time_hier.level_name = 'YEAR'
AND product_hier.level_name = 'DEPARTMENT'
AND geography_hier.level_name = 'REGION'
ORDER BY
time_hier.hier_order
, product_hier.hier_order
, geography_hier.hier_order;
One of the advantages of using hierarchical columns is that they do not need to change when selecting data at different levels of aggregation. Only the value of a LEVEL_NAME filter needs to change to select different aggregate levels.
Stable column names can also be an advantage when mapping the query to visualizations in your application. Those mappings typically need to remain stable. Keeping this in mind, the following queries select data at the Quarter level and Category levels.
The select list, GROUP BY, and ORDER BY need to change. Aliases are added for name stability.
SELECT
t.quarter_name AS time
, p.category_name AS product
, g.region_name AS geography
, SUM(sales) AS sales
FROM
time_dim t
, product_dim p
, geography_dim g
, sales_fact f
WHERE
t.month_name = f.month_id
AND p.category_id = f.category_id
AND g.state_province_id = f.state_province_id
GROUP BY
t.quarter_name
, p.category_name
, g.region_name
ORDER BY
t.quarter_name
, p.category_name
, g.region_name;
In the query selecting from the analytic view, only the level filters need to change. The query template is simply reused.
SELECT
time_hier.member_name AS time
, product_hier.member_name AS product
, geography_hier.member_name AS geography
, sales
FROM
sales_av HIERARCHIES (
time_hier
, product_hier
, geography_hier
)
WHERE
time_hier.level_name = 'QUARTER'
AND product_hier.level_name = 'CATEGORY'
AND geography_hier.level_name = 'REGION'
ORDER BY
time_hier.hier_order
, product_hier.hier_order
, geography_hier.hier_order;
If you are generating queries against tables metadata that defines the meaning of each column might be needed. If you are querying analytic views, queries can be hard-coded to hierarchical column names. And, if you need metadata, it is in the Oracle data dictionary.
It is easy to query multiple levels of aggregation from the analytic view. In this example, data is selected at both the Year and Quarter levels.
SELECT
time_hier.member_name AS time
, product_hier.member_name AS product
, geography_hier.member_name AS geography
, sales
FROM
sales_av HIERARCHIES (
time_hier
, product_hier
, geography_hier
)
WHERE
time_hier.level_name IN ('YEAR','QUARTER')
AND product_hier.level_name = 'CATEGORY'
AND geography_hier.level_name = 'REGION'
ORDER BY
product_hier.hier_order
, geography_hier.hier_order
, time_hier.hier_order;
Time series calculations common to business intelligence applications.
When querying tables a prior period calculation is relatively easy if the assumption is made that the row set is dense over time. Dense means that there are no gaps in the time periods within the selected row set. In this example, it means that for each value of Department and Region, there are sales for every time period. Then this is true, a simple LAG analytic window function can be used in the query of the tables.
The LAG function cannot be used when the query includes GROUP BY, so this is a two-pass query. The first pass aggregates data and the second calculates the LAG.
WITH sum_sales AS (
SELECT
t.year_name
, p.department_name
, g.region_name
, SUM(sales) AS sales
FROM
time_dim t
, product_dim p
, geography_dim g
, sales_fact f
WHERE
t.month_name = f.month_id
AND p.category_id = f.category_id
AND g.state_province_id = f.state_province_id
GROUP BY
t.year_name
, p.department_name
, g.region_name
)
SELECT
year_name
, department_name
, region_name
, sales
, LAG(sales)
OVER(PARTITION BY department_name, region_name
ORDER BY
year_name ASC
) AS sales_prior_period
FROM
sum_sales
ORDER BY
year_name
, department_name
, region_name;
Calculations can be defined in the analytic view. The sample analytic view includes a SALES_PRIOR_PERIOD calculation. The expression can be seen in the data dictionary.
SELECT
measure_name
, meas_expression
FROM
user_analytic_view_calc_meas;
The calculation can be queried as a column of the analytic view. Note that this analytic view query differs from previous examples only by the addition of SALES_PRIOR_PERIOD in the select list.
SELECT
year_name
, department_name
, region_name
, sales
, sales_prior_period
FROM
sales_av HIERARCHIES (
time_hier
, product_hier
, geography_hier
)
WHERE
time_hier.level_name = 'YEAR'
AND product_hier.level_name = 'DEPARTMENT'
AND geography_hier.level_name = 'REGION'
ORDER BY
year_name
, department_name
, region_name;
The following query is the same as the previous one except that it includes the USING form of the FROM clause.
SELECT
time_hier.member_name AS time
, product_hier.member_name AS product
, geography_hier.member_name AS geography
, sales
FROM ANALYTIC VIEW (
USING sales_av HIERARCHIES (
time_hier
, product_hier
, geography_hier
)
)
WHERE
time_hier.level_name = 'YEAR'
AND product_hier.level_name = 'DEPARTMENT'
AND geography_hier.level_name = 'REGION'
ORDER BY
time_hier.hier_order
, product_hier.hier_order
, geography_hier.hier_order;
The user form of the from clause can be used anytime, but it is required when using the ADD MEASURES clause. Calculated measures can be added using ADD MEASURES. In the following example, the Sales Prior Period measure is added in the query.
SELECT
year_name
, department_name
, region_name
, sales
, sales_pp
FROM ANALYTIC VIEW (
USING sales_av HIERARCHIES (
time_hier
, product_hier
, geography_hier
)
ADD MEASURES (
sales_pp AS (LAG(sales) OVER (HIERARCHY time_hier OFFSET 1))
)
)
WHERE
time_hier.level_name = 'YEAR'
AND product_hier.level_name = 'DEPARTMENT'
AND geography_hier.level_name = 'REGION'
ORDER BY
year_name
, department_name
, region_name;
Notice that the calculation is named SALES_PP. The query modifies the definition of the analytic view for the duration of the query. Because the measure SALES_PRIOR_PERIOD is defined permanently in the analytic view a different name is needed for the on-the-fly calculation.
If it is assumed that data is dense over time and it is not, wrong data can be returned. For example, sales for 2015 could be returned as the prior period to 2017. It is important to understand your data when you make certain assumptions.
It would be expected that a time lookup table would have all time periods (no gaps). To guard against sparse data, add a partitioned outer join between the aggregated row set and the TIME_DIM table.
WITH sum_sales AS (
SELECT
t.year_name
, p.department_name
, g.region_name
, SUM(sales) AS sales
FROM
time_dim t
, product_dim p
, geography_dim g
, sales_fact f
WHERE
t.month_name = f.month_id
AND p.category_id = f.category_id
AND g.state_province_id = f.state_province_id
GROUP BY
t.year_name
, p.department_name
, g.region_name
), year_dim AS (
SELECT DISTINCT
year_name
FROM
time_dim
)
SELECT
b.year_name
, a.department_name
, a.region_name
, a.sales
, LAG(a.sales)
OVER(PARTITION BY a.department_name, a.region_name
ORDER BY
b.year_name ASC
) AS sales_prior_period
FROM
sum_sales a
PARTITION BY ( a.department_name
, a.region_name ) RIGHT OUTER JOIN (
SELECT DISTINCT
b.year_name
FROM
year_dim b
) b ON ( a.year_name = b.year_name )
ORDER BY
year_name
, department_name
, region_name;
The analytic view automatically handles this sparse data case with time series calculations so the query selects from the analytic view is unchanged.
SELECT
year_name
, department_name
, region_name
, sales
, sales_pp
FROM ANALYTIC VIEW (
USING sales_av HIERARCHIES (
time_hier
, product_hier
, geography_hier
)
ADD MEASURES (
sales_pp AS (LAG(sales) OVER (HIERARCHY time_hier OFFSET 1))
)
)
WHERE
time_hier.level_name = 'YEAR'
AND product_hier.level_name = 'DEPARTMENT'
AND geography_hier.level_name = 'REGION'
ORDER BY
year_name
, department_name
, region_name;
Change and Percent Change from Prior Periods are also common calculations.
These measures require a third pass in the query selecting from tables.
WITH sum_sales AS (
SELECT
t.year_name
, p.department_name
, g.region_name
, SUM(sales) AS sales
FROM
time_dim t
, product_dim p
, geography_dim g
, sales_fact f
WHERE
t.month_name = f.month_id
AND p.category_id = f.category_id
AND g.state_province_id = f.state_province_id
GROUP BY
t.year_name
, p.department_name
, g.region_name
), year_dim AS (
SELECT DISTINCT
year_name
FROM
time_dim
), sales_prior_period AS (
SELECT
b.year_name
, a.department_name
, a.region_name
, a.sales
, LAG(a.sales)
OVER(PARTITION BY a.department_name, a.region_name
ORDER BY
b.year_name ASC
) AS sales_prior_period
FROM
sum_sales a
PARTITION BY ( a.department_name
, a.region_name ) RIGHT OUTER JOIN (
SELECT DISTINCT
b.year_name
FROM
year_dim b
) b ON ( a.year_name = b.year_name )
)
SELECT
year_name
, department_name
, region_name
, sales
, sales_prior_period
, sales - sales_prior_period AS sales_change_prior_period
, ( sales - sales_prior_period ) / sales_prior_period AS sales_pct_change_prior_period
FROM
sales_prior_period
ORDER BY
year_name
, department_name
, region_name;
Two additional calculations can be added to the query selecting from the analytic view using the same query template.
SELECT
year_name
, department_name
, region_name
, sales
, sales_pp
, sales_change_prior_period
, sales_pct_change_period_period
FROM ANALYTIC VIEW (
USING sales_av HIERARCHIES (
time_hier
, product_hier
, geography_hier
)
ADD MEASURES (
sales_pp AS (LAG(sales) OVER (HIERARCHY time_hier OFFSET 1)),
sales_change_prior_period AS (LAG_DIFF(sales) OVER (HIERARCHY time_hier OFFSET 1)),
sales_pct_change_period_period AS (LAG_DIFF_PERCENT(sales) OVER (HIERARCHY time_hier OFFSET 1))
)
)
WHERE
time_hier.level_name = 'YEAR'
AND product_hier.level_name = 'DEPARTMENT'
AND geography_hier.level_name = 'REGION'
ORDER BY
year_name
, department_name
, region_name;
The comparisons between queries selecting from tables and analytic views show that table queries become increasingly complex as calculations are added, that analytic views can be queried with relatively simple, reusable templates, and that calculations are easily added to analytic view queries.
Analytic view expressions reference elements of the semantic model to make it easy to define calculations. Calculations can use both analytic view expressions and SQL single-row functions.
A few more calculation examples are provided in this section.
Shares calculate the ratio of the current member and the parent, ancestor, or the grand total of the hierarchy.
The following query includes:
SELECT
year_name
, region_name
, sales
, ROUND(sales_share_of_region,2) AS sales_share_of_region
, ROUND(sales_share_of_parent,2) as sales_share_of_parent
, ROUND(sales_share_of_total,2) as sales_share_of_total
FROM ANALYTIC VIEW (
USING sales_av HIERARCHIES (
time_hier
, geography_hier
)
ADD MEASURES (
sales_share_of_region AS (SHARE_OF(sales HIERARCHY geography_hier LEVEL region)),
sales_share_of_parent AS (SHARE_OF(sales HIERARCHY geography_hier PARENT)),
sales_share_of_total AS (SHARE_OF(sales HIERARCHY geography_hier MEMBER ALL))
)
)
WHERE
time_hier.level_name = 'YEAR'
AND geography_hier.level_name = 'STATE_PROVINCE'
ORDER BY
year_name
, region_name;
Rank expressions calculate the ranking of the current member within the parent, ancestor or grand total of the hierarchy.
SELECT
year_name
, region_name
, sales
, sales_rank_geog_region
, sales_rank_geog_parent
, sales_rank_geog_level
FROM ANALYTIC VIEW (
USING sales_av HIERARCHIES (
time_hier
, geography_hier
)
ADD MEASURES (
sales_rank_geog_region AS (RANK () OVER (HIERARCHY geography_hier ORDER BY sales DESC NULLS LAST WITHIN ANCESTOR AT LEVEL REGION)),
sales_rank_geog_parent AS (RANK () OVER (HIERARCHY geography_hier ORDER BY sales DESC NULLS LAST WITHIN PARENT)),
sales_rank_geog_level AS (RANK () OVER (HIERARCHY geography_hier ORDER BY sales DESC NULLS LAST WITHIN LEVEL))
)
)
WHERE
time_hier.level_name = 'YEAR'
AND geography_hier.level_name = 'STATE_PROVINCE'
ORDER BY
year_name
, region_name;
While this is not intended to be a calculation tutorial, it makes the point that this simple query template is easily extended with interesting calculations.