I have a design situation here. We are working on a Oracle 8i table that has about 3 million records in the activity table. Multiple acitivity records are linked to a single account record (M:1)). We are now migrating to Oracle 9i and our client wants to merge the activity records with the same account record into a single record. This will mean that one column in multiple activity records will be merged into a long column, and other records will be discarded. (Out of 5 columns, there is only one column that has the usable data.)
The client thinks they can save on database size and improve performance by reducing the number of records. It is expected that records will be reduced by 90%. Do you think that performance will improve significantly by doing this? Will the size of the database be reduced?
By redesigning your database as you've indicated, performance can increase and decrease. It all depends on how that data is used. What you are essentially doing is precomputing a join of two tables and storing that precomputed join in another table. This is called denormalizing the data. Denormalization does have its problems in that you may suffer from update anomolies unless you are very careful. So I don't recommend denormalization unless there is a very good reason, and you have thoroughly thought through the entire process.
So as I've said, you are essentially storing a precomputed join. This is the same thing as placing both tables in a cluster. Performance will improve for those queries that would have joined the two tables, because you won't have to perform that time consuming step any more. But performance will decrease for accessing the data that used to be in one table. You said that one table has 3 million rows of data, but you didn't say how much the other table has. Let's assume it is a small amount of rows, say 1,000 rows of data in that table. Prior to combining the tables, if you wanted data from that second table, you had to search through, at worst, 1,000 rows of data. Now, you'll have to search through 3 million rows of data. To make matters worse, you signficantly change the cardinality of that data and may make an index on that data worthless.
And I really don't see how you'll save space. You will be duplicating data. Let's go back to my numbers of 3 million rows in one table and 1,000 rows in the other table. That means that on average, each row from the second table will find a match in 3,000 rows in the first table. That means that each row in the second table will be duplicated 3,000 times when combined with the first table. So take the size of that 1,000 row table and multiply it by a factor 3,000. Surely you can see that space would not be saved by doing this.
Dig Deeper on Oracle database design and architecture
Have a question for an expert?
Please add a title for your question
Get answers from a TechTarget expert on whatever's puzzling you.