Problem solve Get help with specific problems with your technologies, process and projects.

Table design in OLAP versus OLTP

How is the table design (relationship between tables and the columns in the table) different when designing for...

an OLAP versus OLTP application? A small example or a site/book having an example (comparing design of OLAP versus OLTP) would be really appreciated.

The big difference when designing for OLAP versus OLTP is rooted in the basics of how the tables are going to be used. I'll discuss OLTP versus OLAP in context to the design of dimensional data warehouses. However, keep in mind there are more architectural components that make up a mature, best practices data warehouse than just the dimensional data warehouse. For more information on this I suggest reading:

  • Corporate Information Factory, 2nd Edition by W. H. Inmon, Claudia Imhoff, Ryan Sousa
  • Building the Data Warehouse, 2nd Edition by W. H. Inmon

With OLTP, the tables are designed to facilitate fast inserting, updating and deleting rows of information with each logical unit of work. The database design is highly normalized. Usually and at least to 3NF. Each logical unit of work in an online application will have a relatively small scope with regard to the number of tables that are referenced and/or updated. Also the online application itself handles the majority of the work for joining data to facilitate the screen functions. This means the user doesn't have to worry about traversing across large data relationship paths. A heavy dose of lookup/reference tables and much focus on referential integrity between foreign keys. The physical design of the database needs to take into considerations the need for inserting rows when deciding on physical space settings. A good book for getting a solid base understanding of modeling for OLTP is The Data Modeling Handbook: A Best-Practice Approach to Building Quality Data Models by Michael C. Reingruber, William W. Gregory.

Example: Let's say we have a purchase oder management system. We need to be able to take orders for our customers, and we need to be able to sell many items on each order. We need to capture the store that sold the item, the customer that bought the item (and where we need to ship things and where to bill) and we need to make sure that we pull from the valid store_items to get the correct item number, description and price. Our OLTP data model will contain a CUSTOMER_MASTER, A CUSTOMER_ADDRESS_MASTER, A STORE_MASTER, AN ITEM_MASTER, AN ITEM_PRICE_MASTER, A PURCHASE_ORDER_MASTER AND A PURCHASE_ORDER_LINE_ITEM table. Then we might have a series of M:M relationships for example. An ITEM might have a different price for specific time periods for specific stores.

With OLAP, the tables are designed to facilitate easy access to information. Today's OLAP tools make the job of developing a query very easy. However, you still want to minimize the extensiveness of the relational model in an OLAP application. Users don't have the wills and means to learn how to work through a complex maze of table relationships. So you'll design your tables with a high degree of denormalization. The most prevalent design scheme for OLAP is the Star-Schema, popularized by Ralph Kimball. The star schema has a FACT table that contains the elements of data that are used arithmatically (counting, summing, averaging, etc.) The FACT Table is surrounded by lookup tables called Dimensions. Each Dimension table provides a reference to those things that you want to analyze by. A good book to understand how to design OLAP solutions is The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses by Ralph Kimball.

Example: let's say we want to see some key measures about purchases. We want to know how many items and the sales amount that are purchased by what kind of customer across which stores. The FACT table will contain a column for Qty-purchased and Purchase Amount. The DIMENSION tables will include the ITEM_DESC (contains the item_id & Description), the CUSTOMER_TYPE, the STORE (Store_id & store name), and TIME (contains calendar information such as the date, the month_end_date, quarter_end_date, day_of_week, etc).

Hope this helps get you started. Best of Luck!


Dig Deeper on Oracle business intelligence and analytics