Oracle XML capabilities in Oracle Database 11gR2, part 1

New Oracle XML capabilities in Oracle Database 11gR2 allow for easier reading and manipulation of XML documents, according to columnist Jeff McCormick. In part 1, McCormick introduces the reader to XML capabilities in Oracle Database and sets up a scenario where a business wants to pull XML documents into its database.

This is Part 1 of a two-part series on Oracle XML capabilities in Oracle Database 11gR2. In part 1, McCormick introduces the reader to XML capabilities in Oracle Database and sets up a scenario where a business wants to pull XML documents into its database. Part 2 details how that XML data can then be used and queried to gain business information. 

The use of XML data continues to increase across enterprises, and databases will play a role in XML data management. Yet many Oracle database professionals have been reluctant to adopt Oracle XML capabilities into their management practices.

That should change with Oracle Database 11gR2. Oracle XML features in the newest version of its database product make it easier to introduce XML into your data management practices with minimal disruption to SQL applications. In this data integration use case, we will demonstrate relational access of XML data while pointing out new Oracle Database 11gR2 improvements along the way.

Relational access of XML data from Oracle Database 11gR2

XML data integration, which has become increasingly more common in IT shops, is relatively easy to implement in the database. It is still normal to find XML documents shredded into relational tables for SQL application access. Shredding is expensive, time consuming and oftentimes unnecessary. A better approach may be to eliminate the shredding step by directly ingesting the XML data into the database and placing relational views over the XML structures. 

The following fictitious example demonstrates the use of native Oracle XML database capabilities supporting relational access of XML data.

The ABC Gum Company’s data integration hub receives source system purchase orders as XML documents, as shown below. The company is looking for a simple solution within Oracle Database to accept well-formed XML documents and present them relationally. Its SQL-based hub maintenance application can then perform data manipulation operations prior to the data being published to downstream XML applications, archived and eventually purged.

 '<?xml version="1.0"?>
 <CustomerName>Joe Smith</CustomerName>
 <SpecialInstructions>Air Mail</SpecialInstructions>
    <LineItem ItemNumber="1">
     <Description>Bubble Gum</Description>
     <Part Id="7155" UnitPrice="19.95" Quantity="2"/>
    <LineItem ItemNumber="2">
     <Description>Blow Pops</Description>
     <Part Id="3742" UnitPrice="15.95" Quantity="2"/>

Notice the XML document contains a LineItem collection (repeating elements) analogous to a master-detail model.

Create XML table, relational view and indexes

We’ll begin by creating a single schema-less XML table to store our complex purchase order structure.Notice the table is created as an XML Type data type. You can also create a table with an XML Type column. So far so good, as we’ve already simplified a multi-table relational data model to a single physical table.

 create table abc_purchase_order of xmltype
 (constraint abc_purchase_order_pk primary key (order_number))
 xmltype store as securefile binary xml
  (compress high)
 virtual columns
  (order_number as (xmlcast(xmlquery('/PurchaseOrder/OrderNumber' passing object_value returning content) as number (5)))
  ,order_date as (xmlcast(xmlquery('/PurchaseOrder/OrderDate' passing object_value returning content) as date))
 partition by range (order_date)
   partition yr_2010 values less than(to_date('01/01/2011','mm/dd/yyyy'))
   ,partition yr_2011  values less than (to_date('01/01/2012','mm/dd/yyyy'))

Notable Oracle XML 11gR2 enhancements include the use of SECUREFILE for large object (LOB) storage, binary XML Type storage and partitioning. SECUREFILE is preferred over BASICFILE for LOB storage for several reasons:  de-duplication, compression, encryption, caching, logging and piecewise updates.  Binary XML Type storage (persistence) is a post-parsed format providing flexibility, storage optimization and performance improvement. Partitioning, enabled by virtual columns, provides manageability, availability and support for information lifecycle management (ILM).

Next, we’ll expose our purchase order relationally for SQL access by creating a relational view over the XML data structure. Here is where views provide and hide the translation from XML data structures (i.e., elements) to relational constructs (i.e., columns) allowing SQL applications to read XML data.

 create view abc_po_master_detail_view as
 select m.order_number, m.order_date, m.customer_name, m.userid, m.special_instructions, d.*
  from abc_purchase_order po,
   xmltable('/PurchaseOrder' passing po.object_value
     order_number number(5) path 'OrderNumber',
     order_date date path 'OrderDate',
     customer_name varchar2(30) path 'CustomerName',
     userid varchar2(10) path 'User',
     special_instructions varchar2(80) path 'SpecialInstructions',
     lineitem xmltype path 'LineItems/LineItem') m,
   xmltable('LineItem' passing m.lineitem
     itemno  number(10) path '@ItemNumber',
      description varchar2(25) path 'Description',
      partno varchar2(14) path 'Part/@Id',
      quantity number(12,2) path 'Part/@Quantity',
      unitprice number(8,4) path 'Part/@UnitPrice') d;

Then we’ll add structured and unstructured indexes on commonly queried portions of our XML document to provide increased performance. New to Oracle 11gR2, structured XMLIndex components use relational content tables (relational index structures) to organize structured XML content. For non-partitioned tables, defining a XMLIndex index to include all fields in a relational view will ensure that a query result of a relational view query will come entirely from the underlying XMLIndex content table, theoretically achieving relational performance. Additionally, standard relational secondary indexes can be created on the columns of the structured index effectively indexing the index for even faster performance. 

 create index abc_purchase_order_ak01 on abc_purchase_order (object_value) indextype is xdb.xmlindex
   ('xmltable abc_po_idx ''/PurchaseOrder''
    order_number  number(5)  path ''OrderNumber''
    ,order_date date path ''OrderDate''
    ,customer_name varchar2(30) path ''CustomerName''
    ,userid varchar2(10) path ''User''
    ,special_instructions varchar2(80) path ''SpecialInstructions''
    ,lineitems xmltype path ''LineItems/LineItem'' virtual
    xmltable abc_po_lineitem_idx ''/LineItem'' passing lineitems
    itemno number(10) path ''@ItemNumber''
    ,description varchar2(25) path ''Description''
    ,partno varchar2(14) path ''Part/@Id''
    ,quantity number(12,2) path ''Part/@Quantity''
    ,unitprice number(8,4) path ''Part/@UnitPrice''
  local parallel;

-- Create secondary B-tree and Full Text indexes on the structured XML index

create index abc_purchase_order_ak02 on abc_po_idx (customer_name) local;
create index abc_purchase_order_ak03 ON abc_po_lineitem_idx (description) indextype is ctxsys.context parameters ('transactional');

Jeff McCormick is an architecture director at a major health service company and president of the Connecticut Oracle User Group. McCormick has worked in IT for more than 20 years as a data and infrastructure architect/administrator.He holds several Oracle, Microsoft and Sybase professional certifications.

Dig Deeper on Oracle XML