Before diving into the details, here's an example of the use of a streaming table function:
INSERT INTO tickers SELECT * FROM TABLE (doubled (CURSOR (SELECT * FROM stocks))) /
What's going on here? Let's take it step by step "from the inside out":
Sometimes (often?) you will need to perform more than one transformation as part of the streaming process. No problem - you can certainly "string together" multiple invocations of table functions, as you see below (all the code to implement and demonstrate this statement follows in this tutorial).
INSERT INTO tickers SELECT * FROM TABLE (transform2 ( CURSOR (SELECT * FROM TABLE(transform1 ( CURSOR (SELECT * FROM stocks ))))))
In this tutorial, we will start with the stocks table, each row of which contains the open and close trading prices for each stock ticker symbol:
CREATE TABLE stocks ( ticker VARCHAR2 (20), trade_date DATE, open_price NUMBER, close_price NUMBER ) /
Let's load it up with some optimistic data!
BEGIN FOR indx IN 1 .. 1000 LOOP INSERT INTO stocks VALUES ('STK' || indx, SYSDATE, indx, indx + 15); END LOOP; COMMIT; END; /
My transformation is simple: for each row in the stock table, generate two rows for the tickers table (one row each for the open and close prices):
CREATE TABLE tickers ( ticker VARCHAR2 (20), pricedate DATE, pricetype VARCHAR2 (1), price NUMBER ) /
Before continuing, I feel obligated to point out that for this particular transformation (one row in stocks to two rows in tickers), you don't need a table function to get the job done. For example, you can use INSERT ALL to insert into tickers twice:
/* You do NOT need to run this for the tutorial! */ INSERT ALL INTO tickers (ticker, pricedate, pricetype, price) values (ticker, trade_date, 'O', open_price) INTO tickers (ticker, pricedate, pricetype, price) values (ticker, trade_date, 'C', close_price) SELECT * FROM stocks /
You could also use unpivot (thanks, Chris Saxon @chrisrsaxon, for making me aware of this technique!):
/* You do NOT need to run this for the tutorial! */ INSERT INTO tickers (ticker, pricedate, pricetype, price) SELECT * FROM stocks UNPIVOT (price FOR price_type IN (open_price AS 'O', close_price AS 'C')) /
And if you can avoid the use of a table function, implementing your requirement in "pure" SQL instead, then you should by all means do so!
Please assume for the purposes of this tutorial that the transformation is much more complex and requires the use of PL/SQL.
And if you did happen to run either of those INSERTs above, you might want to clear out the table before proceeding:
TRUNCATE TABLE tickers /
As discussed in Module 2 of the Getting Started with Table Functions class (and tutorial), when you need that function to return more than one piece of data in each collection element, you need to create an object type and a collection of those object types.
In this case, I want to move stock data to the tickers table, so I need an object type that "looks like" the tickers table.
CREATE TYPE ticker_ot AUTHID DEFINER IS OBJECT ( ticker VARCHAR2 (20), pricedate DATE, pricetype VARCHAR2 (1), price NUMBER ); / CREATE TYPE tickers_nt AS TABLE OF ticker_ot; /
Since I am going to use the table function in a streaming process, I will also need to define a strong REF CURSOR type that will be used as the datatype of the parameter accepting the dataset inside the SQL statement.
CREATE OR REPLACE PACKAGE stock_mgr AUTHID DEFINER IS TYPE stocks_rc IS REF CURSOR RETURN stocks%ROWTYPE; TYPE tickers_rc IS REF CURSOR RETURN tickers%ROWTYPE; END stock_mgr; /
The main distinction with streaming table functions is that at least one parameter to that function is a cursor variable. The table function could have more than one cursor variable input, and other parameters of other types, such as a string or date). In this tutorial, we will stick with the minimum: a single cursor variable parameter.
Generally, the flow within a streaming table function is:
Now let's see how this pattern unfolds in my doubled function (one row doubled to two). Line numbers may be seen in the worksheet window when you press the Insert into Editor button.
CREATE OR REPLACE FUNCTION doubled (rows_in stock_mgr.stocks_rc) RETURN tickers_nt AUTHID DEFINER IS TYPE stocks_aat IS TABLE OF stocks%ROWTYPE INDEX BY PLS_INTEGER; l_stocks stocks_aat; l_doubled tickers_nt := tickers_nt (); BEGIN LOOP FETCH rows_in BULK COLLECT INTO l_stocks LIMIT 100; EXIT WHEN l_stocks.COUNT = 0; FOR l_row IN 1 .. l_stocks.COUNT LOOP l_doubled.EXTEND; l_doubled (l_doubled.LAST) := ticker_ot (l_stocks (l_row).ticker, l_stocks (l_row).trade_date, 'O', l_stocks (l_row).open_price); l_doubled.EXTEND; l_doubled (l_doubled.LAST) := ticker_ot (l_stocks (l_row).ticker, l_stocks (l_row).trade_date, 'C', l_stocks (l_row).close_price); END LOOP; END LOOP; CLOSE rows_in; RETURN l_doubled; END; /
Regarding FETCH-BULK COLLECT-LIMIT
I used a value of 100 for the LIMIT clause; that's a decent default value - it's the number of rows retrieved by cursor FOR loops with each fetch. But if you are processing an extremely large number of rows and want to squeeze better performance out of your function, you might try a larger LIMIT value. Note, however, that this will consume more Process Global Area memory, and at some point your code will slow down due to excessive memory consumption.
You should also pass the LIMIT value as a parameter to give you the ability to modify the performance-memory profile without recompiling your function, as in:
CREATE OR REPLACE FUNCTION doubled ( rows_in stock_mgr.stocks_rc, limit_in IN INTEGER DEFAULT 100) ... BEGIN LOOP FETCH rows_in BULK COLLECT INTO l_stocks LIMIT limit_in;
In my two-step transformation, I will create a function that returns a nested table of elements that match the stocks table. So I will need an object type and nested table type.
CREATE OR REPLACE TYPE stock_ot AUTHID DEFINER IS OBJECT ( ticker VARCHAR2 (20), trade_date DATE, open_price NUMBER, close_price NUMBER ) / CREATE OR REPLACE TYPE stocks_nt AS TABLE OF stock_ot; /
Now I create a table function accepts a cursor variable each of whose rows contains ticker data, and returns a nested table, each of whose elements looks like a row in the stocks table. As it follows precisely the same pattern as the doubled function, I will not describe the individual lines.
CREATE OR REPLACE FUNCTION singled (tickers_in IN stock_mgr.tickers_rc) RETURN stocks_nt AUTHID DEFINER IS TYPE tickers_aat IS TABLE OF tickers%ROWTYPE INDEX BY PLS_INTEGER; l_tickers tickers_aat; l_singles stocks_nt := stocks_nt (); BEGIN LOOP FETCH tickers_in BULK COLLECT INTO l_tickers LIMIT 100; EXIT WHEN l_tickers.COUNT = 0; FOR indx IN 1 .. l_tickers.COUNT LOOP l_singles.EXTEND; l_singles (l_singles.LAST) := stock_ot (l_tickers (indx).ticker, l_tickers (indx).pricedate, l_tickers (indx).price, l_tickers (indx).price * .5); END LOOP; END LOOP; RETURN l_singles; END; /
Now let's do a two-step transformation: stocks -> tickers -> more stocks!
CREATE TABLE more_stocks AS SELECT * FROM TABLE ( singled ( CURSOR ( SELECT * FROM TABLE (doubled ( CURSOR (SELECT * FROM stocks)))))) / SELECT COUNT (*) FROM more_stocks /
Streaming table functions play a crucial role in data warehouse ETLs (extract-transform-load) operations. Oracle Database makes building such functions easy through its implementation of PL/SQL cursor variables and the CURSOR expression.
Remember that the collection constructed and returned by a streaming table function will consume PGA memory, so very large data sets passed in to the function via the cursor variable could result in errors.
What can you do about that? Make that streaming table function a pipelined streaming table function - which we cover in the next module of the class.