create table employees as
select * from hr.employees
where department_id in ( 90, 100, 60 );
You can use hierarchical queries to travel along parent-child relationships in your data. For example, family trees, computer directory structures, and company organization charts.
This tutorial shows methods to build org charts using the following sample company:
select * from employees
Connect by is an Oracle-specific way to create data trees using SQL. It has two key clauses, start with and connect by.
You state which rows are the roots here. These are the rows that appear at the "top" of the tree.
In a company org chart this is the CEO. Here that's employee_id 100, Steven King. So you can begin the chart with him using:
start with employee_id = 100
But if you do this, you need to change your query when a new CEO replaces him!
It's better to go with a more generic method. The CEO has no manager they report to. So their manager_id is null. So you could identify them with:
start with manager_id is null
You state the parent-child relationship here. This links the columns that store the parent and child values. You access values from the parent row using the keyword prior.
In a company each employee's "parent" is their manager. Thus you need to join the parent row's employee_id to the child's manager_id. So you connect the prior employee_id to the current manager_id, like so:
connect by prior employee_id = manager_id
Put this all together and you get the following query:
select * from employees
start with manager_id is null
connect by prior employee_id = manager_id;
Complete the following query to build a "reverse" org chart. Begin with employee 107 and go up the chain to Steven King. This switches the parent-child relationship so the employee is the parent row for their manager.
select employee_id, first_name, last_name, manager_id
from employees
start with employee_id /*TODO*/
connect by /*TODO*/
This should give the following output:
EMPLOYEE_ID FIRST_NAME LAST_NAME MANAGER_ID 107 Diana Lorentz 103 103 Alexander Hunold 102 102 Lex De Haan 100 100 Steven King <null>
Oracle Database 11.2 introduced another method for accessing trees: recursive subquery factoring. Aka recursive with.
This is the ANSI compliant way to build hierarchies in SQL. It's composed of two queries. A base query and a recursive one.
You use this to define the root rows in your tree. This is like the start with clause in connect by. So to begin the chart with the CEO, use:
select employee_id, first_name, last_name, manager_id from employees where manager_id is null
This maps to the connect by clause. Here you join the source table to the with clause on the columns storing the parent-child values.
For the company chart, you need to join each employee to their manager. This is the employee from the previous row. So you're linking the org_chart to the employees table.
This gives the following query:
select e.employee_id, e.first_name, e.last_name, e.manager_id from org_chart oc join employees e on e.manager_id = oc.employee_id
When you use recursive with, you must provide aliases for all the columns it returns. These go between the query name and "as":
with org_chart ( employee_id, first_name, last_name, manager_id ) as ( ...
To complete the query you need to union all the base and recursive queries together. Put this all together and you get:
with org_chart (
employee_id, first_name, last_name, manager_id
) as (
select employee_id, first_name, last_name, manager_id
from employees
where manager_id is null
union all
select e.employee_id, e.first_name, e.last_name, e.manager_id
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
)
select * from org_chart;
Complete the following query to build the reverse org chart from step three again. But this time using recursive with. It should start with employee_id 107 and go up the company to the CEO. As in module 3, the employee is the parent row, their manager the child.
with org_chart (
employee_id, first_name, last_name, manager_id
) as (
select employee_id, first_name, last_name, manager_id
from employees
where /*TODO*/
union all
select e.employee_id, e.first_name, e.last_name, e.manager_id
from org_chart oc
join employees e
on /*TODO*/
)
select * from org_chart;
This should give the following output:
EMPLOYEE_ID FIRST_NAME LAST_NAME MANAGER_ID 107 Diana Lorentz 103 103 Alexander Hunold 102 102 Lex De Haan 100 100 Steven King <null>
With the queries so far it's hard to tell how senior someone is in the company. Adding the generation they belong to makes this easier. This allows you to see who's at the same level in the hierarchy. How you do this depends on whether you're using connect by or recursive with.
With connect by you can use the pseudo-column level. This returns the current depth in the tree, starting with 1 for the roots. Each new set of children increases the depth by one. So all a manager's reports are one level below them.
You can select this on its own to show the current row's depth in the tree:
select level, employee_id, first_name, last_name, manager_id
from employees
start with manager_id is null
connect by prior employee_id = manager_id;
This helps. But it's still tricky to tell how senior someone is in the company. Indenting their name based on their position makes this easier. Combining level with lpad makes this easy.
The syntax of lpad is:
lpad ( str1, N, str2 )
It adds the characters in str2 before those in str1 until the string is N characters long. So you can use level to place spaces before each employee's name. This makes it easy to see where people place in the organization:
select level, employee_id,
lpad ( ' ', level, ' ' ) || first_name || ' ' || last_name name, manager_id
from employees
start with manager_id is null
connect by prior employee_id = manager_id;
Recursive with doesn't have an in-built equivalent of level. You need to build your own. You can do this by selecting the value 1 in the base query. And incrementing it by one in the recursive part. For example:
with org_chart (
employee_id, first_name, last_name, manager_id, lvl
) as (
select employee_id, first_name, last_name, manager_id, 1 lvl
from employees
where manager_id is null
union all
select e.employee_id, e.first_name, e.last_name, e.manager_id, oc.lvl + 1
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
)
select * from org_chart;
As with connect by, you can use this pad out values in the final select.
When you build a hierarchical query, the database returns the rows in an order matching the tree structure.
Connect by returns rows in depth-first search order. If you use a regular order by you'll lose this sort.
But you can preserve the depth-first tree and sort rows with the same parent. You do this with the siblings clause of order by.
So to show a manager's reports after them, sorting employees with the same manager by hire date (first to last), you can do:
select level, employee_id, first_name, last_name, hire_date, manager_id
from employees
start with manager_id is null
connect by prior employee_id = manager_id
order siblings by hire_date;
Recursive with allows you to choose whether you want to traverse the tree using depth-first or breadth-first search. You define this in the search clause.
This starts at the root. Then picks one of its children. It then gets the child's child. And so on, down the tree accessing child nodes first. When it hits a leaf, it goes back up the tree until it finds an unvisited child.
So it always goes as far down the tree it can before accessing another row at the same level.
To use depth-first search, specify this in the search clause. The columns you sort by defines which order the database returns siblings. And the set clause defines a new column storing this sequence. It starts with 1 at for the first root. For each new row it increments by 1.
So to return employees in a depth first tree, sorting employees with the same manager by the date they were hired (first to last), use:
search depth first by hire_date set hire_seq
Which gives the following complete query:
with org_chart (
employee_id, first_name, last_name, hire_date, manager_id, lvl
) as (
select employee_id, first_name, last_name, hire_date, manager_id, 1 lvl
from employees
where manager_id is null
union all
select e.employee_id, e.first_name, e.last_name, e.hire_date, e.manager_id, oc.lvl + 1
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
) search depth first by hire_date set hire_seq
select * from org_chart
order by hire_seq;
Instead of travelling down the tree, you can go across it. This is breadth-first search.
Again, this starts with the root. But it accesses all the rows at the same level before going down to any children. So if you have many roots, it visits all these first. Then all the first generation children before going down to the second generation. And so on.
The sorting columns define which order you access nodes at the same depth.Unlike depth-first search, successive rows at the same level may alternate between parent rows.
So the following returns all the employees at the same rank next to each other. It sorts these by their hire date, first to last:
search breadth first by hire_date set hire_seq
Which completes the query like so:
with org_chart (
employee_id, first_name, last_name, hire_date, manager_id, lvl
) as (
select employee_id, first_name, last_name, hire_date, manager_id, 1 lvl
from employees
where manager_id is null
union all
select e.employee_id, e.first_name, e.last_name, e.hire_date, e.manager_id, oc.lvl + 1
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
) search breadth first by hire_date set hire_seq
select * from org_chart
order by hire_seq;
Complete the following query to return employees in depth-first order. You should sort employees with the same manager by first_name:
select level, employee_id, first_name, last_name, hire_date, manager_id
from employees
start with manager_id is null
connect by prior employee_id = manager_id
order /*TODO*/
This query should give the following output:
LEVEL EMPLOYEE_ID FIRST_NAME LAST_NAME HIRE_DATE MANAGER_ID 1 100 Steven King 17-JUN-2003 <null> 2 102 Lex De Haan 13-JAN-2001 100 3 103 Alexander Hunold 03-JAN-2006 102 4 104 Bruce Ernst 21-MAY-2007 103 4 105 David Austin 25-JUN-2005 103 4 107 Diana Lorentz 07-FEB-2007 103 4 106 Valli Pataballa 05-FEB-2006 103 2 101 Neena Kochhar 21-SEP-2005 100 3 108 Nancy Greenberg 17-AUG-2002 101 4 109 Daniel Faviet 16-AUG-2002 108 4 111 Ismael Sciarra 30-SEP-2005 108 4 110 John Chen 28-SEP-2005 108 4 112 Jose Manuel Urman 07-MAR-2006 108 4 113 Luis Popp 07-DEC-2007 108
Can you write this using recursive with?
It's possible to store loops in your hierarchy. Usually this is a data error. But some structures may contain loops by design.
For example, the following sets the CEO's manager to be a lowly programmer:
update employees
set manager_id = 107
where employee_id = 100;
This leads to a circle in your data. So you could get stuck in an infinite loop. Luckily Oracle Database has cycle detection to stop this.
If you try and build a hierarchy that contains a loop, connect by throws an ORA-1436 error:
select * from employees
start with employee_id = 100
connect by prior employee_id = manager_id;
You can avoid this using the nocycle keyword. This spots when the query returns to the same row. The database hides the repeated row and continues processing the tree.
To use it, place nocycle after connect by:
select * from employees
start with employee_id = 100
connect by nocycle prior employee_id = manager_id;
You control loop detection using the cycle clause of recursive with. Here you state which columns mark a loop. The database keeps track of the values it sees in these columns. If the current row's values for these appear in one of it's ancestors, you have a loop.
The syntax for this is:
cycle <columns> set <loop_column> to <loop_value> default <default_value>
The <loop_column> is a new one you define. When the database detects a cycle, it changes this from the default value to the <loop_value>
The org chart has a loop if you visit the same employee_id twice. So you want to check this column for cycles.
The snippet below does this and adds the column looped to your results. This starts with the value N. If the database finds a cycle, it sets it to Y:
cycle employee_id set looped to 'Y' default 'N'
You place this after your with clause, which gives the following complete query:
with org_chart (
employee_id, first_name, last_name, manager_id, department_id
) as (
select employee_id, first_name, last_name, manager_id, department_id
from employees
where employee_id = 100
union all
select e.employee_id, e.first_name, e.last_name, e.manager_id, e.department_id
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
) cycle employee_id set looped to 'Y' default 'N'
select * from org_chart;
Unlike connect by, this includes the rows you visit twice. So the CEO, Steven King, appears twice in the results. If you want to exclude these, filter them out in your final where clause using the loop column you defined.
Using recursive with you can choose any columns in your query to mark a "loop". This allows you to stop processing before you get back to the same row.
For example, you can cycle on department_id. The CEO's direct reports, Neena Kochhar and Lex De Haan, are in the same department as him. So if you cycle on department_id, like so:
cycle department_id set looped to 'Y' default 'N'
The database considers these two employees to form a cycle! So the hierarchy only contains three rows:
with org_chart (
employee_id, first_name, last_name, manager_id, department_id
) as (
select employee_id, first_name, last_name, manager_id, department_id
from employees
where employee_id = 100
union all
select e.employee_id, e.first_name, e.last_name, e.manager_id, e.department_id
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
) cycle department_id set looped to 'Y' default 'N'
select * from org_chart;
This can be useful to detect rows with two parents. Without cycle detection, you may access that row and all its children twice. The cycle clause allows you to avoid this. Connect by doesn't have an inbuilt way to do this.
Remove the loop before continuing the tutorial:
update employees
set manager_id = null
where employee_id = 100;
Complete the following query to cycle on job_id. Define a cycle column is_repeat which defaults to N. When accessing the same job_id, set it to Y
with org_chart (
employee_id, first_name, last_name, manager_id, job_id
) as (
select employee_id, first_name, last_name, manager_id , job_id
from employees
where employee_id = 102
union all
select e.employee_id, e.first_name, e.last_name, e.manager_id, e.job_id
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
) cycle /*TODO*/
select * from org_chart;
This should return the following rows:
EMPLOYEE_ID FIRST_NAME LAST_NAME MANAGER_ID JOB_ID IS_REPEAT 102 Lex De Haan 100 AD_VP N 103 Alexander Hunold 102 IT_PROG N 104 Bruce Ernst 103 IT_PROG Y 105 David Austin 103 IT_PROG Y 106 Valli Pataballa 103 IT_PROG Y 107 Diana Lorentz 103 IT_PROG Y
Using level makes it possible to see how deep the current row is in the tree. But it can still be tricky to see how the rows relate to each other. Connect by has many options to help with this.
The operator connect_by_root returns the value of a column from the root row.
It can be useful to see values from all the rows between the root and the current row. Sys_connect_by_path allows you to do this. It builds up a string, adding the value from the first argument for the current row to the end of the list. It separates these using the second argument.
You can identify the leaf rows using connect_by_isleaf. This returns 1 if the current row is a leaf. Otherwise it returns 0.
Combining these allows you to display:
The complete query for this is:
select employee_id, first_name, last_name, manager_id,
connect_by_root last_name,
sys_connect_by_path ( last_name, ', ') chart,
connect_by_isleaf is_leaf
from employees
start with manager_id is null
connect by prior employee_id = manager_id;
Recursive with doesn't have the built-in options like connect by. But you can emulate them.
To return a value from the parent row, select it in the base query. And in the recursive part, return this column from the with clause name.
Again, start by selecting the value you want in the base query. In the recursive part, append the values you want to add with an appropriate separator.
Displaying leaf rows is more complex with recursive with. To find these, add the depth (level) to the tree. Then sort it using depth-first search.
Depth-first search always goes as far down the tree it can. After hitting a leaf, it goes back up to the next unvisited child. So you know a row is a leaf if the next row is at the same or higher depth in the tree. So you need to check if the next row's level is less than or equal to the current.
You can test for this using lead. This gets a value from the next row in the results. Assuming seq follows a depth-first order, the following returns LEAF if the row is a leaf. And null otherwise:
case when lead ( lvl, 1, 1 ) over ( order by seq ) <= lvl then 'LEAF' end is_leaf
Putting these all together gives:
with org_chart (
employee_id, first_name, last_name, manager_id, root_emp, chart, lvl
) as (
select employee_id, first_name, last_name, manager_id,
last_name root_emp, last_name chart, 1 lvl
from employees
where manager_id is null
union all
select e.employee_id, e.first_name, e.last_name, e.manager_id,
oc.root_emp, oc.chart || ', ' || e.last_name, oc.lvl+1
from org_chart oc
join employees e
on e.manager_id = oc.employee_id
) search depth first by employee_id set seq
select oc.*,
case
when lead ( lvl, 1, 1 ) over ( order by seq ) <= lvl then 'LEAF'
end is_leaf
from org_chart oc;