Data modelers are often faced with the dilemma of how to model logical attributes in their data warehouse schema design. A new feature in Oracle 11g called virtual columns solves this dilemma by exposing logical attributes while reducing object management and storage consumption. Additionally, virtual columns can be used to improve availability, manageability and performance of the database. Virtual columns assist modelers in delivering simple, intuitive and fast performing designs.
Logical attributes
Logical attributes are commonly referred to as expression or function based results. These attributes fall into two categories: calculated and derived values. Calculated values represent the results of mathematical formulas. The following example adds two columns, resulting in a calculated value:
[TABLE]
Derived values, on the other hand, are logic based. The same two columns added together can be evaluated, resulting in a derived value:
[TABLE]
The dilemma
Prior to 11g, data modelers either had to expose logical attributes as physical table columns or virtualize the values in views. Both solutions were less than optimal.
Storing logical attributes as physical table columns allowed values to be exposed to users and were indexable. As physical columns, accurate statistics could be collected resulting in better decisions by the optimizer. However, storing the attributes as table columns meant consuming disk space. Conscientious of exponential data growth within the data warehouse, many designers opted to model these attributes in views.
Views achieved the design goal of exposing the logical attributes without consuming physical disk space. A view can be thought of as a stored query with results materialized at SQL execution. For this reason, optimizer statistics cannot be pre-collected on views. The optimizer is left with no other option but to guess at the cost and cardinality
To continue reading for free, register below or login
To read more you must become a member of SearchOracle.com
');
// -->

of the expression results. Although view columns cannot be indexed or constrained, the introduction of Function Based Indexes (FBIs) in Oracle 8i provided the optimizer with actual statistics for query plan optimization. Views and FBIs seemed a viable solution but meant managing additional database objects which led to data warehouse complexity.
The 11g answer
The introduction of virtual columns in Oracle 11g has provided data modelers with an optimal design for logical attributes. Exposed as table columns, virtual column values are not actually stored on disk but are computed on the fly during SQL execution. They actually share much of the same code path as FBIs but offer visibility. In addition to not being physically stored, virtual columns have other significant advantages over views. Optimizer statistics can be collected on virtual columns and these columns can be indexed, constrained and even used to partition tables. This enables the optimizer to make fact-based intelligent decisions around query access. Virtual columns achieve the design goal of exposing logical attributes without additional storage or object costs.
Show me
Let's demonstrate the use of virtual columns in a fictitious data warehouse design.
1. Create a de-normalized PRODUCT table that you might expect to see in a dimensional Star schema design.
[TABLE]
Notice the new syntax of the virtual columns in bold print. The calculated attribute "total_cost" has inline NULL and check constraints, and is used as the partitioning column. Each derived state flag attribute is based on the "states" column which reflects the aggregation of all states where the product is sold (see Figure 1). For example, a value of 7 in the "states" column reflects a product sold in ARIZONA, CONNECTICUT and RHODE ISLAND (1 + 2 + 4).
[TABLE]
2. Show virtual column expressions are actually stored as metadata in the data dictionary.
[TABLE]
3. Load the PRODUCT table.
[TABLE]
Storing the odd numbers 1 & 3 in the "states" column will ensure all products are sold in ARIZONA and no products are sold in RHODE ISLAND. We will come back to this in later steps.
4. Create local partitioned indexes on the virtual "state flag" columns.
[TABLE]
5. Gather Statistics.
[TABLE]
6. Verify the skewed distribution for ARIZONA and RHODE ISLAND.
[TABLE]
7. Compare ARIZONA and RHODE ISLAND Optimizer Query Plans.
Here, with statistics collection, the skewed results of the column functions are known by the optimizer. This causes the optimizer to treat the cost of each query differently, resulting in different query plans. Notice the number of estimated rows match the actual number of rows processed.
[TABLE]
Here, we see the optimizer has performed partition pruning on the "total_cost" virtual column.
[TABLE]
Virtual columns are a useful new feature in Oracle 11g. This functionality offers data modelers an optimal solution for exposing logical attributes. Understanding how to use features of the database is essential to delivering a data warehouse design that is both fast and simple.
About the author
Jeff McCormick is an Architecture Director at a major health care company and President of the Connecticut Oracle User Group. Jeff has worked in IT for 20 years as a system, storage and database architect/administrator and has over 15 years of experience with DB2, Sybase, SQL Server and Oracle relational database technology. He holds several certifications including Oracle Database 10g Administrator Certified Professional, Microsoft Certified Product (SQL Server) and Certified Sybase Professional Database Administrator. Jeff is a frequent speaker and has authored several papers on data and database architecture.