Q
Problem solve Get help with specific problems with your technologies, process and projects.

Using export/import process to clone to data warehouse

I work for a university and we have built a data warehouse and two data marts for archived data. Currently we are using an export/import process to clone our online system to our test data warehouse. We are trying to determine if we should do nightly incremental updates/inserts instead of cloning the production system. I know that incremental table updates are faster, but are there any cons to the process? Is there a possibility that the files could get out of sync?

I work for a university and we have built a data warehouse and two data marts for archived data. Currently we are...

using an export/import process to clone our online system to our test data warehouse.

Here is my system information: My database is Oracle 9i and operating system is Unix on Sun Solaris. We will go to an ODS in 2007.

We are trying to determine if we should do nightly incremental updates/inserts instead of cloning the production system. I know that incremental table updates are faster, but are there any cons to the process? Is there a possibility that the files could get out of sync?

  1. Do you use an export/import process to refresh your data warehouse?
  2. Do you do daily incremental updates?
  3. How long does it take to refresh your data warehouse/data marts?
  4. Do you create any summary data or materialized views?
  5. How long does it take to do backups?
  6. What is your enrollment?
  7. What about growth rates? How does that impact your system?
  8. What platform are you running on?

Thank you in advance for your time.

Wow, this is a long question that touches on a number of aspects of the types of architectural decisions that we normally consider when constructing or maintaining a data warehouse, as well as your basic question on whether to use export/import as a cloning method.

Let's start with import and export as a cloning method. First off I can tell you that it works; this is a solution that will work for what you are trying to do. However, not knowing how large your warehouse is, this may not be a viable option. Consider a warehouse that is 1 or 2 terabytes; the selection of export/import would not be viable. Options that we have used depend on the requirements. For example, for an enterprise data warehouse, we use a number of system-level options. We use either Oracle's transportable tablespaces or we use an EMC feature to quick-copy disks. Both of these are effective solutions, since they minimize the impact to the database, but these may be more appropriate to a larger warehouse. Another option is replication, but it has not been one that I generally would use for a test database. As you discuss you are concerned about posting updates; I can understand this as you will now need to build an ETL structure just to update your testing database. Sometimes I think we complicate our lives unnecessarily.

As far as the type of data warehouses I work on, the current one is an enterprise warehouse that integrates multiple lines of business. The warehouse is currently multiple terabytes and has seen significant growth in the past while it resides on an HP 12-way Itanium with an EMC array. We perform all of our loading via Informatica, so we utilize direct database access as well as bulk load methods, depending on the transactions that we are processing. All of our processing is delta processing. We capture changes in the sources and then post them to the warehouse; our load window is now at about five to eight hours depending on the daily volumes. As for backups, I do not have a time on this as I leave that to operations. As far as refreshing our development and test databases, we used export/import the first time and that took us three days. We have since moved to a simple Informatica solution that completes in about 12 hours.

So I hope that helps you. If you have any follow-up questions please let me know as I am happy to help. Good luck.

This was last published in January 2006

Dig Deeper on Oracle data warehousing

PRO+

Content

Find more PRO+ content and other member only offers, here.

Have a question for an expert?

Please add a title for your question

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.

Start the conversation

Send me notifications when other members comment.

By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy

Please create a username to comment.

-ADS BY GOOGLE

SearchDataManagement

SearchBusinessAnalytics

SearchSAP

SearchSQLServer

TheServerSide.com

SearchDataCenter

SearchContentManagement

SearchFinancialApplications

Close