Q

Help building billion-row system

I need to build a new system with a table that keeps 60 days of data (approximately 1 billion rows). Once the initial data is there, more data will be loaded every day (about 10 million rows per day, loaded from around 40 files with 40,000 rows in each file). The system needs to reject and report the new data whenever a duplicate value is found (about 90% of the data is expected to be valid; the other 10% is expected to be duplicates). I also need to maintain this table so that it always holds the latest 60 days of data.

What approach do you recommend? I was planning to have a table partitioned by day (is it possible to have 60 partitions?), put a PK on the unique values, use an external table (or SQL*Loader) to load the data with the APPEND hint on the insert, and then use another process to drop the oldest partition and create a new one for today's data. I'm not very familiar with partitioned tables, so any example and help is appreciated.

What do you think about this approach? Any other ideas? Thanks.

I would definitely use partitioning for this. A partitioned table can have 60 partitions without any problems. The nice thing is that each day you can drop the oldest partition, create a new one and load the newest data into it. This sounds like the proper approach to me.
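As a rough illustration of that daily rolling window, here is a minimal sketch. The table name `events`, its columns and the partition names are hypothetical, and only two of the 60 daily partitions are written out. It also assumes the PK covers only the unique value, so it is backed by a global index and duplicates are caught across all 60 days; that is why the drop uses UPDATE GLOBAL INDEXES, which keeps the PK index usable at the cost of extra work during the drop.

```sql
-- Hypothetical table; names and column sizes are illustrative only.
CREATE TABLE events (
  event_key  VARCHAR2(64)  NOT NULL,
  load_date  DATE          NOT NULL,
  payload    VARCHAR2(200),
  -- PK on the unique value alone, enforced by a global (non-partitioned) index
  CONSTRAINT events_pk PRIMARY KEY (event_key)
)
PARTITION BY RANGE (load_date) (
  PARTITION p20060801 VALUES LESS THAN (TO_DATE('2006-08-02', 'YYYY-MM-DD')),
  PARTITION p20060802 VALUES LESS THAN (TO_DATE('2006-08-03', 'YYYY-MM-DD'))
  -- one partition per day would follow, 60 in total
);

-- Daily maintenance: add the partition for the new day at the high end...
ALTER TABLE events ADD PARTITION p20060803
  VALUES LESS THAN (TO_DATE('2006-08-04', 'YYYY-MM-DD'));

-- ...and drop the oldest day, maintaining the global PK index as part of the drop.
ALTER TABLE events DROP PARTITION p20060801 UPDATE GLOBAL INDEXES;
```

Without UPDATE GLOBAL INDEXES, dropping a partition that contains rows would leave the global PK index unusable until it is rebuilt, which is expensive on a table of this size.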

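You can use SQL*Loader to load the data into the partitioned table. In a conventional path load, SQL*Loader writes the rows that the database rejects for violating the PK constraint (or some other constraint) to the bad file, which gives you the report of duplicates. (The discard file is different: it holds only the records filtered out by a WHEN clause in the control file.) A similar result can be achieved with external tables, although there the duplicates have to be identified in the SQL that reads the external table, because the external table's own bad file only captures records that fail to parse.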
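For the external table route, one possible way to reject and report duplicates is sketched below. This is an illustration under stated assumptions, not the only method: it swaps the bad-file mechanism for an explicit check in SQL, and the target table `events (event_key, load_date, payload)`, the reject table `events_rejects`, the directory object `load_dir` and the file name `events_01.csv` are all hypothetical. It treats a row as a duplicate if its key already exists in the target or repeats within the same file.

```sql
-- Hypothetical external table over one comma-separated load file.
CREATE TABLE events_ext (
  event_key VARCHAR2(64),
  payload   VARCHAR2(200)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY load_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    MISSING FIELD VALUES ARE NULL
  )
  LOCATION ('events_01.csv')
)
REJECT LIMIT UNLIMITED;

-- Hypothetical table that records the rejected (duplicate) rows for reporting.
CREATE TABLE events_rejects (
  event_key   VARCHAR2(64),
  payload     VARCHAR2(200),
  rejected_on DATE DEFAULT SYSDATE
);

-- Step 1: report rows whose key is already loaded or repeats within the file.
INSERT INTO events_rejects (event_key, payload)
SELECT s.event_key, s.payload
FROM (
  SELECT x.event_key, x.payload,
         ROW_NUMBER() OVER (PARTITION BY x.event_key ORDER BY x.payload) AS rn
  FROM events_ext x
) s
WHERE s.rn > 1
   OR EXISTS (SELECT 1 FROM events e WHERE e.event_key = s.event_key);

-- Step 2: direct-path insert of the remaining rows into today's partition.
INSERT /*+ APPEND */ INTO events (event_key, load_date, payload)
SELECT s.event_key, TRUNC(SYSDATE), s.payload
FROM (
  SELECT x.event_key, x.payload,
         ROW_NUMBER() OVER (PARTITION BY x.event_key ORDER BY x.payload) AS rn
  FROM events_ext x
) s
WHERE s.rn = 1
  AND NOT EXISTS (SELECT 1 FROM events e WHERE e.event_key = s.event_key);

COMMIT;
```

The sketch reads the file twice and loads one file at a time; a direct-path insert locks the segment it loads into, so the 40 daily files would be processed one after another rather than concurrently. Rows with a missing key are not handled here and would need their own check.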

I recommend reading the following: the Oracle Concepts Guide (Chapter 18 talks about partitioned tables and indexes) and the Oracle Utilities Guide (Part II covers SQL*Loader and Part III covers External Tables).

This was first published in August 2006