We have created an application that combines parts taken from several hundred sources daily and populates a table that reflects the parts and their suppliers. We have problems updating this table during the day because of the amount of activity on it. The table contains about 7 million parts. Updates are handled via inserts into the table using a tool provided by DataJunction. Prior to the insert we do a SQL DELETE to remove the existing records for the supplier; these deletes can affect up to 100,000 rows. At the same time we are updating, we have users querying the database looking for parts. When they query during an update, the response time is unacceptable. Lock escalation happens in some cases, which we continue to monitor. The application is all Java, using JDBC to query the table.
We would like to update/maintain a separate table and move to production every 15 or 30 minutes. Our goal would be not to have our query users impacted by the mass updates that occur.
What areas would you recommend we look at to provide better quality of service to our users?
My suggestion for a solution to this problem would be to use DataJunction to alter the update process slightly. Right now it sounds like DataJunction is trying to do massive DELETE and INSERT operations, involving many rows at once. This manipulates large amounts of data as a single transaction, which makes for huge locks and blocks.
My first suggestion is to change the update process itself. Instead of deleting all the rows, then inserting all the rows, take a simpler approach. Create a Java method to attempt an SQL UPDATE, and if that update fails (due to a missing row), have the process do an SQL INSERT. If it is possible for rows to be removed from the work table, you may have to include a separate SQL DELETE step for these rows in the master table. This change minimizes the amount of data manipulated to do the same update.
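The application in question is Java/JDBC, where the same control flow would use `PreparedStatement.executeUpdate()` and check its return value. The sketch below illustrates the UPDATE-then-INSERT pattern in Python against an in-memory SQLite database, simply so it runs standalone; the table and column names (`parts`, `part_no`, `supplier`, `price`) are hypothetical stand-ins for the real schema.

```python
import sqlite3

def upsert_part(cur, part_no, supplier, price):
    # Try the UPDATE first; rowcount tells us whether an existing row matched.
    cur.execute(
        "UPDATE parts SET price = ? WHERE part_no = ? AND supplier = ?",
        (price, part_no, supplier),
    )
    if cur.rowcount == 0:
        # No existing row for this part/supplier, so INSERT it instead.
        cur.execute(
            "INSERT INTO parts (part_no, supplier, price) VALUES (?, ?, ?)",
            (part_no, supplier, price),
        )

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE parts (part_no TEXT, supplier TEXT, price REAL)")
upsert_part(cur, "A100", "Acme", 1.25)  # no row yet -> INSERT
upsert_part(cur, "A100", "Acme", 1.30)  # row exists -> UPDATE in place
conn.commit()
```

After both calls the table still holds a single row for part A100, at the newer price: the unchanged rows for that supplier are never deleted and re-inserted, which is the point of the approach.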
My next suggestion is a bit more of a technical challenge, but promises much lower impact on the table. This step would break the table update into its component rows: instead of processing an UPDATE for up to 100,000 rows followed by an INSERT for whichever of those rows were not updated, the whole process becomes up to 100,000 single-row UPDATE/INSERT operations. This makes the whole process less efficient in terms of hardware utilization, but much more efficient in terms of reduced locking and blocking. Even if this process works as expected, you'll still need some way to delete any rows dropped from the work tables. These single-row operations could be done in parallel if your database and server support that.
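The row-at-a-time approach, with a trailing cleanup delete, can be sketched as follows. Again this is Python/SQLite purely so the sketch runs standalone (the production code would be Java/JDBC, committing via the `Connection`); the schema names and the batch size of 1,000 are hypothetical. Committing every few hundred or thousand rows keeps each transaction, and therefore each set of locks, short.

```python
import sqlite3

def apply_supplier_feed(conn, supplier, feed, batch_size=1000):
    # feed: list of (part_no, price) rows from this supplier's work table.
    cur = conn.cursor()
    for i, (part_no, price) in enumerate(feed, start=1):
        cur.execute(
            "UPDATE parts SET price = ? WHERE part_no = ? AND supplier = ?",
            (price, part_no, supplier),
        )
        if cur.rowcount == 0:
            cur.execute(
                "INSERT INTO parts (part_no, supplier, price) VALUES (?, ?, ?)",
                (part_no, supplier, price),
            )
        if i % batch_size == 0:
            conn.commit()  # short transactions -> short-lived locks

    # Remove rows this supplier no longer carries (dropped from the work table).
    keep = [part_no for part_no, _ in feed]
    placeholders = ",".join("?" * len(keep))
    cur.execute(
        f"DELETE FROM parts WHERE supplier = ? AND part_no NOT IN ({placeholders})",
        [supplier] + keep,
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_no TEXT, supplier TEXT, price REAL)")
conn.execute("INSERT INTO parts VALUES ('OLD1', 'Acme', 9.99)")  # dropped part
apply_supplier_feed(conn, "Acme", [("A100", 1.25), ("B200", 2.50)])
```

For very large feeds, the `NOT IN` cleanup would be better done by joining against the work table itself, and the per-row loop is what you would parallelize across connections if the database supports it.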