Problem solve Get help with specific problems with your technologies, process and projects.

Reducing risks in data warehouse maintenance

I am a data warehousing DBA. This happens to be my first data warehousing project, and it is in development stage. I am curious to know about potential risks involved in maintaining a data warehouse, and could you suggest ways to reduce them?

As a DBA supporting the data warehousing environment, you should be sure the ETL architects design their data aquisition jobs such that you get comprehensive statistics on the results of each load into the data warehouse. It is important that you have a good understanding of what data was loaded, what was rejected and why. You need to be assured they have a sound architecture in place to be able to re-run loads of data without compromising the referential integrity of the data warehouse content.

From a performance managment and tuning perspective, it is important you keep a close watch on the kinds of activity performed against the data warehouse. As the ETL jobs load, are the indexes properly updated to ensure top query performance. When queries are submitted, you need to be able to capture those queries and analyze how they are forcing the database to work to deliver the results. You want to continually monitor and tune the database and the queries to ensure peak performance. You also want to keep an eye on the "hot spots" and "cold spots" and make your data architects aware of these. "Hot Spots" are the areas of the database that get the most activity against them. As you see access trends forming, you should alert the data architect for the potential of designing new or improved data marts. "Cold Spots" are the areas of the database that seldom, if ever, get accessed. Again, you should alert the data architect that there is either a training problem (maybe nobody knows the data is there) or there is relatively meaningless data in the data warehouse that potentially can be removed and/or archived.

From a capacity management perspective, you want to continually monitor the growth of the data warehousing environment and plan for an archival strategy that allows you to maintain the older/seldomly used data on near-line and/or offline storage facilities and keep the fresher/most active data on highspeed online storage facilities. This will help you maintain control over the growth of your storage devices and will help you save lots of money in the long run.

Dig Deeper on Oracle data warehousing

Have a question for an expert?

Please add a title for your question

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.

Start the conversation

Send me notifications when other members comment.

Please create a username to comment.