Why would we want to do that? Consider what you might want to return as the result of a text query:
This type of query is difficult or impossible from a single SQL query. Just getting a count of hits requires you to fetch the entire result set back, or do a separate "select count(*)" query as well.
Hence the Result Set Interface. You provide a Result Set Descriptor (RSD), which is a block of XML specifying what you want to get back (but not the search term - that's provided separately), and the RSI returns to you a Result Set, which is a block of XML describing the top-N hits and all the summary information.
This tutorial walks you through using the Result Set Interface on some short, sample data.
We're going to create a table which has some structured information (our "facets") and a text field. All but one row will contain the word 'record' which we'll later use as a full-text search term.
create table docs (
source varchar2(20),
recordtype varchar2(20),
region varchar2(20),
text clob );
And insert some data into that table:
insert into docs values (
'Internal', 'Customer Info', 'South', 'This is record 1');
insert into docs values (
'Internal', 'Customer Info', 'North', 'This record is record 2.');
insert into docs values (
'External', 'Customer Info', 'South-East', 'This is record 3');
insert into docs values (
'External', 'Customer Info', 'South-East', 'This is record 4');
insert into docs values (
'External', 'Office Doc', 'North-West', 'This is record 5');
insert into docs values (
'External', 'Office Doc', 'North-East', 'This is record 6');
insert into docs values (
'External', 'Office Doc', 'North-East', 'No r-word in this one');
Now the setup necessary for the index. Facets must be defined as SDATA (
-- only needed if re-running: exec ctx_ddl.drop_section_group ('mysg')
exec ctx_ddl.create_section_group('mysg', 'HTML_SECTION_GROUP')
exec ctx_ddl.add_sdata_column ('mysg', 'source', 'source')
exec ctx_ddl.add_sdata_column ('mysg', 'recordtype', 'recordtype')
exec ctx_ddl.add_sdata_column ('mysg', 'region', 'region')
Now we'll create the actual index. Notice that the SDATA columns specified above MUST be specified as FILTER BY columns for SDATA fields to be created from them. The section group created above is specified in the PARAMETERS clause.
create index docsindex on docs(text) indextype is ctxsys.context
filter by source, recordtype, region
parameters ('section group mysg');
We now have an index with SDATA sections defined for our three facets, and we're ready to move onto the next module - using the Result Set Interface to query the table and fetch back facets.What we commonly want as the result of a text query is:
These types of queries are almost impossible to do as a single query in SQL without fetching every single hit to get the count and build the summary info.
So instead we provide the XML-based Result Set Interface. The developer provides us with Result Set Descriptor (RSD) which specifies all the information they want returned. We process the RSD and a query string, and return a Result Set (RS). Both the RSD and the RS are written in XML.
The application can then process the XML in order to display the results to the end-user. They can do that using native XML capabilities of the application's language, or by using SQL-XML capabilities built into Oracle Database, which extract strings or relational tables (for repeated info) from the XML text.
For this demo, we're going to use a temporary table to store the RSD and RS. In most situations, you would use a suitable variables in your programming language to store them, but in LiveSQL it's easier to use a temporary table.
So let's create a suitable table:
create table temp_rs (rsd clob, rs_text xmltype);
Rather than using xmltype, we could use a CLOB for the RSD as well, but then we'd need to cast it to xmltype using xmltype(clobname) each time we invoked an XML-specific operation on it.
Now let's construct our Result Set Descriptor:
(Note: only do this once. If you need to change or rerun it, delete from temp_rs first)
begin
insert into temp_rs (rsd) values ('
<ctx_result_set_descriptor>
<count />
<hitlist start_hit_num="1" end_hit_num="2" order="score desc">
<score />
<rowid />
<sdata name="source" />
</hitlist>
<group sdata="source"> <count exact="true"/> </group>
<group sdata="recordtype"> <count exact="true"/> </group>
<group sdata="region"> <count exact="true"/> </group>
</ctx_result_set_descriptor>
');
end;
/
Some explanation is needed there.
For this tutorial, we're only considering single-valued facets. For example REGION might contain NORTH, SOUTH, EAST and WEST. In 19c and later it's possible to use range buckets for numeric or date values (for example price less than $10, price $10 - $20, etc) but we're not going to cover that here.
Before we go any further, let's list our RSD.
select rsd from temp_rs;
(In SQL*Plus you would want to SET LONG 50000 and do select xmltype(rsd)..." in order to format the XML nicely).
Make sure there's one and only one row returned. If you have more than one, future parts of the tutorial will fail. If that's the case, drop the table, and go back to the start of this section (section 2).
Now having specified our Result Set Descriptor, we can run a query using it. We do this using the PL/SQL procedure CTX_QUER.RESULT_SET. We need to specify the index to search, the search term or query string, the Resul Set Descritor as an input, and a variable for the Result Set output. Our query is for the word "RECORD" which should return six of the seven rows in the table.
So this is the actual query procedure:
declare
ResSetDesc clob;
ResultSet clob;
begin
-- fetch the RSD from our temporary table (assumes only we are using the temp table and it was cleared before running the module)
select rsd into ResSetDesc from temp_rs;
-- and run our query using the RSD
ctx_query.result_set(
index_name => 'docsindex',
query => 'record',
result_set_descriptor => ResSetDesc,
result_set => ResultSet );
-- ResSetOut now contains XML result set: write it to temporary table
update temp_rs set rs_text = xmltype(ResultSet);
end;
/
After running that, you'll only see "Statement processed". But the Result Set output has been written to our temporary table. Let's check it. We'll use getClobVal to make sure it's formatted nicely:
select xmltype.getclobval(rs_text) from temp_rs;
Take a few minutes to look through that, and compare it to what we requested in our Result Set Descriptor earlier. Remember we only asked for the top 2 results in our hitlist section.
In particular note the group counts towards the end. We can see that from our entire result set (not just the top 2 hits) there were 4 records with SOURCE=External and 2 records with SOURCE=Internal.
These are our facet counts.
This section shows how you can use the SQL-XML capabilities of Oracle Database to process that block of XML back into SQL-like constructs such as VARCHAR2 values and relational tables.
First of all, we'll use the XMLTable table function to get the hitlist rows.
Let's take a look at the hitlist section of the Result Set output:
<hitlist>
<hit>
<score>6</score>
<rowid>ABLKUlAAkAAAAQcAAB</rowid>
<sdata name="SOURCE">Internal</sdata>
</hit>
<hit>
<score>3</score>
<rowid>ABLKUlAAkAAAAQcAAA</rowid>
<sdata name="SOURCE">Internal</sdata>
</hit>
</hitlist>
We want to fetch that info back as a relational table with two rows in it. We can do that using the XMLTable table function and passing it our Result Set from the temporary table.
XMLTable requires us to specify the XML path to each data item, and how that should appear as column in the table function (its name and datatype). Then since we've got ROWIDs from the original table DOCS, we might as well join our table function with the DOCS table so we can fetch back the original text, or any metadata columns in the original table which are not returned by SDATA values:
select score, d.text, rso.source from
xmltable ( '/ctx_result_set/hitlist/hit'
PASSING ( select rs_text from temp_rs )
COLUMNS
xrowid varchar2(18) PATH 'rowid',
score number PATH 'score',
source varchar2(128) PATH 'sdata[@name="SOURCE"]' ) rso,
docs d
where d.rowid = rso.xrowid
order by score desc;
There you go - our XML hitlist has now become a SQL table.
Getting the total hit count
Getting the total hit count is easier - there's only a single value so we don't need XMLTable, we can use ExtractValue, passing it the path of the hitcount:
select extractvalue( (select rs_text from temp_rs ), '/ctx_result_set/count' ) as "Total Hits" from dual;
And finally we can get the facets and facet counts. We have three facet types, SOURCE, RECORDTYPE and REGION. Each of those have a section in the XML. For SOURCE, it looks like:
<groups sdata="SOURCE">
<group value="External">
<count>4</count>
</group>
<group value="Internal">
<count>2</count>
</group>
</groups>
It's another repeated field, so we need another XMLTable. And we need the same for each facet type:
select * from
xmltable( '/ctx_result_set/groups[@sdata="SOURCE"]/group'
PASSING ( ( select rs_text from temp_rs ) )
COLUMNS
source VARCHAR2(30) PATH '@value',
facet_count NUMBER PATH 'count/text()') as rso;
select * from
xmltable( '/ctx_result_set/groups[@sdata="RECORDTYPE"]/group'
PASSING ( ( select rs_text from temp_rs ) )
COLUMNS
recordtype VARCHAR2(30) PATH '@value',
facet_count NUMBER PATH 'count/text()') as rso;
select * from
xmltable( '/ctx_result_set/groups[@sdata="REGION"]/group'
PASSING ( ( select rs_text from temp_rs ) )
COLUMNS
region VARCHAR2(30) PATH '@value',
facet_count NUMBER PATH 'count/text()') as rso;
And that's it.
We've shown how we can use the XML-based Result Set Interface to fetch back a hitlist and summary information in a single operation, and how to manipulate that XML back into relational form.
The actual mechanism for displaying and allowing the selection of facets is beyond this tutorial, but let's assume there's some sort of checkbox system, and the user has indicated that they want to see only documents where the SOURCE facet has value "Internal".
We then want to re-run the query, but with SOURCE="Internal". We do that by adding an SDATA clause to the query parameter of ctx_query.result_set. We don't have to modify the Result Set Descriptor atat all. So our query procedure is now:
declare
ResSetDesc clob;
ResultSet clob;
begin
-- fetch the RSD from our temporary table (assumes only we are using the temp table and it was cleared before running the module)
select rsd into ResSetDesc from temp_rs;
-- and run our query using the RSD
ctx_query.result_set(
index_name => 'docsindex',
query => 'record AND SDATA(source="Internal")',
result_set_descriptor => ResSetDesc,
result_set => ResultSet );
-- ResSetOut now contains XML result set: write it to temporary table
update temp_rs set rs_text = xmltype(ResultSet);
end;
/
Notice that the 'query' parameter now includes an SDATA clause to restrict the result set. Also note that case is important in SDATA - it's an exact match, not the equivalent of a match in the CONTAINS operator.
Once again we can list the output from our query with:
select xmltype.getclobval(rs_text) from temp_rs;
We can see that both the records selected have "Internal" for the source, and that the facets for source now show only
<groups sdata="SOURCE">
<group value="Internal">
<count>2</count>
</group>
</groups>
What if we want to select multiple facets? Easy enough, we can use AND and OR with the SDATA operators (though be careful to use parentheses properly!)
declare
ResSetDesc clob;
ResultSet clob;
begin
-- fetch the RSD from our temporary table (assumes only we are using the temp table and it was cleared before running the module)
select rsd into ResSetDesc from temp_rs;
-- and run our query using the RSD
ctx_query.result_set(
index_name => 'docsindex',
query => 'record AND ( ( SDATA(source="Internal") OR SDATA(source="External") ) AND SDATA(region="South-East") )',
result_set_descriptor => ResSetDesc,
result_set => ResultSet );
-- ResSetOut now contains XML result set: write it to temporary table
update temp_rs set rs_text = xmltype(ResultSet);
end;
/
select xmltype.getclobval(rs_text) from temp_rs;
So that query looked for External or Internal sources, but only from South-East region.
That's all for now - please try these techniques out with your own data.
Feedback on, or corrections to these tutorials are welcomed. Hit the "Feedback" button at the top of the page, or contact the author, Roger Ford