Problem solve Get help with specific problems with your technologies, process and projects.

Top 10 new data warehousing features in Oracle Database 10g

Plus Consultancy's Mark Rittman shares what he believes are the top 10 new data warehousing features in Oracle Database 10g.

Mark Rittman

Mark Rittman is a guest contributor. He is an OCP DBA and works as a consulting manager at Plus Consultancy in England, where he specializes in developing business intelligence and data warehousing applications using the Oracle Database, Application Server, Discoverer, Warehouse Builder and Oracle OLAP.

You can find more of Mark's work at

Plus Consultancy's Mark Rittman shares what he believes are the top 10 new data warehousing features in Oracle Database 10g. Use the following list to navigate to each feature.

   1. The SQL Model Clause
   2. The SQLAccess Adviser
   3. Improvements to the Multidimensional OLAP Engine
   4. The Tune MView Advisor and improvements to Query Rewrite
   5. Data Pump, The replacement for import and export
   6. Improvements to storage management
   7. Faster full table scans
   8. Automatic tuning and maintenance
   9. Asynchronous Change Data Capture
   10. Improvements to Oracle data mining


  1. The SQL Model clause
[ Return to Table of Contents ]

The new data warehousing feature in Oracle Database 10g that has probably received the most attention is the SQL Model clause. The SQL Model clause allows users to embed spreadsheet-like models in a SELECT statement, in a way that was previously the domain of dedicated multidimensional OLAP servers such as Oracle Express and Oracle9i OLAP. The SQL Model clause brings an entirely new dimension to Oracle analytical queries and addresses a number of traditional shortcomings with the way SQL normally works.

The SQL Model clause has been designed to address the sort of situation where, in the past, clients have taken data out of relational databases and imported it into a model held in a spreadsheet such as Microsoft Excel. Often, these models involve a series of macros that aggregate data over a number of business dimensions, over varying time periods, and following a set of complex business rules that would be difficult to express as normal SQL. I've worked on many a client engagement where the limitations of SQL meant that a number of standalone Excel spreadsheets had to be used, and while these gave the client the analytical capabilities they required, the usual issues of scalability and reliability of replicated data, and lack of overall control often became apparent after a while.

The aim of the SQL Model clause is to give normal SQL statements the ability to create a multidimensional array from the results of a normal SELECT statement, carry out any number of interdependent inter-row and inter-array calculations on this array, and then update the base tables with the results of the model. An example SQL statement using the MODEL clause would look like;

 SELECT SUBSTR(country,1,20) country, SUBSTR(prod,1,15) prod, year, sales FROM sales_view WHERE country IN ('Italy','Japan') MODEL RETURN UPDATED ROWS PARTITION BY (country) DIMENSION BY (prod, year) MEASURES (sale sales) RULES ( sales['Bounce', 2002] = sales['Bounce', 2001] + sales['Bounce', 2000], sales['Y Box', 2002] = sales['Y Box', 2001], sales['2_Products', 2002] = sales['Bounce', 2002] + sales['Y Box', 2002]) ORDER BY country, prod, year;

More details on the SQL Model clause can be found in "The new Oracle10g SQL Model Clause", the OracleWorld white paper "SQL Model clause of Oracle Database 10g", the OTN article "Announcing the new Model" by Jonathan Gennick and OTN's documentation.


  2. The SQLAccess Adviser
[ Return to Table of Contents ]

Part of the Oracle Database 10g Server Manageability feature, the SQLAccess Adviser recommends the best combination of indexes and materialized views for a given database workload. Available either at the command line (via the DBMS_ADVISOR package) or through the Advisor Central element of the new Web-based Oracle Enterprise Manager, the SQLAccess Adviser is based on the index and summary advisors previously bundled with Oracle9i and provides a one-stop-shop for tuning and summarizing your warehouse data.

More details on the SQLAccess Adviser can be found in this OracleWorld white paper, ""Turbocharge your database: Use the Oracle Database 10g SQLAccess Advisor" by Lillian Hobbs, the OTN documentation and this Oracle-by-example tutorial.


  3. Improvements to the multidimensional OLAP engine
[ Return to Table of Contents ]

With Oracle9i, the previously standalone Express multidimensional engine is now incorporated into the Oracle database, and with Oracle Database 10g, benefits of integration with the traditional relational Oracle engine are starting to become apparent.

First up is improvements to the way large analytic workspaces can be partitioned, introducing into the Oracle OLAP world some of the advanced partitioning options currently enjoyed by Oracle database users. Currently, analytic workspaces, stored as AW$ tables within an Oracle schema, can be partitioned across multiple rows in the AW$ table by specifying a maximum segment size, allowing you to split an individual analytic workspace into (say) 10 GB segments, one in each table row. This table could then be partitioned just like any other Oracle table, allowing you to put one row in one tablespace, another in another, and each of these tablespaces could of course be stored in datafiles on different physical disk units. Although this was of some benefit, splitting by segment size was the only way of partitioning the data, and you couldn't specify what objects within the analytic workspace went in to each partition. Oracle10g OLAP will now include an enhancement where you can specify exactly which objects within the analytic workspace go in to each partition, and you can further subdivide this by segment size if objects are particularly large.

In a similar fashion, variables within the analytic workspace can now be partitioned, either by range of dimension members, a list of dimension members, or by reference to a CONCAT dimension. The 10g multidimensional engine then stores each variable partition as a separate physical object, which can be directed to separate rows in the AW$ table (allowing you to partition these across different tablespaces and physical disk drives); the variable, however, appears as just one object to the application, simplifying the data model and allowing Oracle to do all the hard work in the background.

Another excellent new feature, and a real improvement over what was available with Express, is support for multi-user read-write access to individual analytic workspaces. In the past, one drawback with Express was that only one user could attach to an Express database in read-write mode, leading Express developers to develop a whole range of alternative solutions to allow ad-hoc write access to Express databases. In Oracle 10g OLAP, analytic workspaces can be attached in MULTI mode, whereafter applications then ACQUIRE individual variables in the analytic workspace for read-write access. Once an object has been acquired (and locked by the Oracle multidimensional engine), updates can then take place and the application can make whatever modifications are necessary. After all changes have been made, the UPDATE command is issued against the variable, followed by a COMMIT, and then a RELEASE command is issued against the variable to make it available for other applications to write to. It'll be interesting to see how the multidimensional engine handles multi-write access; in the past with Express databases could balloon in size when one user had read-write access to a database, and others were accessing it in read mode, as Express had to clone the database for each user to ensure that they had a consistent view of the data. I wouldn't be surprised if individual variables were copied out of a 10g analytic workspace into a temporary workspace while updates happened, with updates being propagated back (as with the old Express Excel Add-In) when the changes are finally COMMITted -- the key thing here is how database size is dealt with as the old Express way of doing it was less than optimal.

Aggregation has been improved with Oracle10g OLAP, with formulas now allowed as sources of data for the AGGREGATE command, eliminating the need to calculate and store data at the detail level. Aggregation, particularly dynamic aggregation, is another area where Oracle9i and now 10g OLAP are a distinct improvement over Express and it's well worth looking at this area in more detail if this is an issue with an existing Express system.

More details on Oracle10g OLAP's new features can be found in the OLAP Application Developer's Guide, the OLAP DML Reference and a white paper by Bud Endress, "OLAP Option to the Oracle10g Database."


  4. The Tune MView Advisor and improvements to Query Rewrite
[ Return to Table of Contents ]

Query Rewrite (the ability for Oracle to transparently redirect queries from detail level to summary tables) is one of the best data warehousing features in Oracle8i and 9i, but it's sometimes a bit temperamental and you can often find that queries don't actually get rewritten. Sometimes this is because you've broken one of the Query Rewrite restrictions, sometimes it's because your materialized view doesn't contain the correct columns and aggregates. Oracle 10g has a number of improvements to Query Rewrite and the materialized view tuning process that should make this process a bit more productive. With Oracle Database 10g, query rewrite is now possible when your SELECT statement contains analytic functions, full outer joins and set operations such as UNION, MINUS and INTERSECT. In addition, you can now use a hint, /*+ REWRITE_OR_ERROR */, which will stop the execution of a SQL statement if query rewrite cannot occur.

 SQL> SELECT /*+ REWRITE_OR_ERROR */ 2 s.prod_id, 3 sum(s.quantity_sold) 4 FROM sales s 5 GROUP BY s.prod_id; FROM sales s * ERROR at line 4: ORA-30393: a query block in the statement did not rewrite

Oracle9i came with two packages, DBMS_MVIEW.EXPLAIN_MVIEW and DBMS_MVIEW.EXPLAIN_REWRITE that could be used to diagnose why a materialized view wasn't being used for query rewrite. However, although these packages told you why rewrite hadn't happened, they left it down to you to work out how to alter your CREATE MATERIALIZED VIEW statement to ensure that rewrite happened correctly. Oracle Database 10g comes with a new advisor package, DBMS_ADVISOR.TUNE_MVIEW, that takes as its input a CREATE MATERIALIZED VIEW DML statement, and outputs a corrected version that supports query rewrite and features such as fast refresh.

More details on Query Rewrite improvements and the Tune MView Adviser can be found in the TUNE_MVIEW online documentation, the Query Rewrite online documents and this Oracle Database 10g Oracle-by-Example tutorial.


  5. Data Pump: The replacement for import and export
[ Return to Table of Contents ]

Data Pump is a replacement for the venerable IMP and EXP applications used for creating logical backups of Oracle tables, schemas or databases. Data Pump is a server application (as opposed to IMP and EXP, which were client applications), which in beta testing was twice as fast as the old EXP for exporting data, and 10 times as fast as the old IMP for importing data. Data Pump is callable either through the DBMS_DATAPUMP package, through the replacements for IMP and EXP, known as IMPDB and EXPDB, or through a wizard delivered as part of Oracle Enterprise Manager 10g.

Data Pump (and the new IMPDB and EXPDB applications) offers a number of improvements over the old IMPORT and EXPORT, including resumable/restartable jobs, automatic two-level parallelism, a network mode that uses DBLINKs/listener service names instead of pipes, fine-grained object selection (so you can select individual tables, view, packages, indexes and so on for import or export, not just tables or schemas as with IMPORT and EXPORT), and a fully callable API that allows Data Pump functionality to be embedded in third-party ETL packages.

More details on Data Pump can be found in this OracleWorld white paper by George Claborn, "Data Pump in Oracle Database 10g: Foundation for ultra-high-speed data movement utilities,", OTN's online documentation and this Oracle-by-example tutorial.


  6. Improvements to storage management
[ Return to Table of Contents ]

Automatic Storage Management (ASM) is one of the 'cool new features' in Oracle10g that is meant to reduce the workload for Oracle DBAs. ASM completely automates the process of creating logical volumes, file systems and filenames, with the DBA only specifying the location of raw disks and ASM doing the rest. Disk I/O is managed by evenly distributing the data across blocks within a disk group, with ASM in addition handling disk mirroring and the creation of mirror groups and failure groups.

ASM deals with the problems caused by rapidly expanding data warehouses, where administators can no longer deal with the sheer number of disk units, nodes and logical groupings, and is a key feature of the Oracle 10g Grid Architecture, which aims to 'virtualize' computing power and present database features like processing and storage as utilities that effectively manage themselves.

More details on Automatic Storage Management can be found in this OTN article by Lannes Morris-Murphy, this Oracle-by-example tutorial (which comes with a number of demonstration viewlets) and OTN's online documentation.


  7. Faster full table scans
[ Return to Table of Contents ]

Full table scans are common in data warehousing environments, and with this in mind, table scan performance has been improved in Oracle10g. Code optimization in Oracle Database 10g has decreased CPU consumption and this leads to faster table scan execution (when queries are CPU bound, rather than I/O bound), and gives the potential for greater query concurrency, offering up to 30-40% speed improvements when comparing CPU-bound queries.


  8. Automatic tuning and maintenance
[ Return to Table of Contents ]

Automatic maintenance and tuning has always been one of the key product differentiators for Microsoft SQL Server and with Oracle10g, features that meet and match those found in competitor products are being introduced to the server technology stack.

Surveys show that over 50% of a DBAs time is spend tuning and monitoring the database server, a task that while important is often complex and difficult to get exactly right. With Oracle Database 10g, Oracle has introduced a number of components that together make it possible for the database server to monitor itself, make intelligent changes to configuration, and alert DBAs when situations arise that need manual intervention.

The first component in this framework is the Automatic Workload Repository, which uses an enhanced version of Statspack to collect instance statistics every 30 minutes and stores these for a rolling seven day period. This enhanced version of Statspack now collects a broader range of statistics and has a number of optimizations to streamline the way high-cost SQL statements are captured, ensuring that only SQL activity that has significantly affected performance since the last snapshot are collected. The usage information stored in the Automatic Workload Repository is then used as the basis for all the self-management functionality in Oracle Database 10g.

Next up is the Automatic Maintenance Tasks feature, which acts on the statistics gathered by the Automatic Workload Repository, and carries out tasks such as index rebuilding, refreshing statistics and so on, where such tasks don't require any manual intervention by the DBA. A new scheduling feature known as Unified Scheduler runs these tasks during a predefined maintenance window, set by default to be between 10:00 pm and 6:00 am the next day, although these times can be customized to reduce impact on other tasks (such as batch loads) that might be taking place.

The third component of the self-managing framework is Server Generated Alerts, a method where the database server sends notifications via e-mail to the DBA -- including a recommendation as to how best to deal with the situation. Alerts will normally be raised where the database itself cannot deal with the situation that has arisen, such as when there is insufficient space on a disk unit to extend a datafile.

Lastly, and perhaps the most exiting of all the self-managing component frameworks, is the Automatic Database Diagnostic Monitor. This component analyzes the data captured in the Automatic Workload Repository and uses an artificial intelligence algorithm, similar to that found in Oracle Expert, to analyze areas such as lock contention, CPU bottlenecks, I/O usage and contention, issues with checkpointing and so on, in much the same way that a DBA would currently do by analyzing statspack reports.

More details on the self-tuning and maintenance features in Oracle 10g can be found in this OracleWorld white paper, "Oracle Database 10g: The self-managing database," by Sushil Kumar.


  9. Asynchronous Change Data Capture
[ Return to Table of Contents ]

Oracle Change Data Capture was introduced with Oracle9i, and provided the ability to track changes to tables and store them in a change table, for further consumption by an ETL process. Oracle9i Change Data Capture worked by creating triggers on the source tables, transferring data synchronously but creating a processing overhead and requiring access to the structure of the source tables. Because of the effect that the triggers had on the underlying tables, many warehouse projects did without change data capture and used other methods to capture changes.

Oracle10g introduces Asynchronous Change Data Capture, which instead of using triggers uses the database log files to capture changes and apply them to collection tables. Asynchronous Change Data Capture therefore doesn't require changes to the table structure and doesn't impact on database performance.

More details on Asynchronous Change Data Capture can be found in this OTN white paper, "On-time data warehousing with Oracle 10g -- Information at the speed of your business," which also details improvements to the transportable tablespaces feature in Oracle Database 10g (tablespaces no longer need to be sourced from the same O/S platform).


  10. Improvements to Oracle data mining
[ Return to Table of Contents ]

Alongside the inclusion of the Oracle Express multidimensional OLAP engine, Oracle9i also embedded data mining functionality in the database together, and this data mining functionality has been enhanced with Oracle Database 10g. Oracle Database 10g adds support for two new classification routines, Support Vector Machine (used for top-down rather than a bottom-up calculations, assuming the best possible fit and then working backwards to what can be achieved) and Non-Negative Matrix Factorisation, together with support for Frequent Itemsets, used for such functions as market basket analysis and propensity analysis.

More details on what's new with Oracle10g Data Mining can be found in the OTN online documentation.

About the author:
Mark Rittman works for Plus Consultancy, an Oracle Partner specializing in business intelligence and data warehousing solutions, based in Brighton, England. He manages their data warehousing practice and spends most time designing and building BI&W systems based on the Oracle9i platform. Away from Plus, he's also the chair of the UKOUG Business Intelligence & Reporting Tools SIG, working as a liason between the Oracle BI tools community and Oracle. Check out Mark's Web site at

This was last published in April 2004

Dig Deeper on Oracle Fusion middleware and integration

Start the conversation

Send me notifications when other members comment.

Please create a username to comment.