ORLANDO, Fla. -- Toting "DB2 on Linux" hats and green DB2 bumper stickers featuring the friendly Linux penguin logo, IBM is sending a message to DBAs that the database of choice for establishing clusters on the open source platform is DB2 -- and not products from its nemesis, Oracle Corp.
"You have, in some sense of the word, a data explosion going on, and to handle that data, companies are turning to database clusters," Rav Ahuja, IBM's product manager for DB2 on Linux, told an audience at the International DB2 Users Group (IDUG) conference.
"Establishing clusters on Linux is the most cost-effective way to deal with that explosion of data," said Ahuja.
Only days after
Oracle has successfully leveraged its Real Application Clusters (RAC) technology, calling it "grid" in the latest version of its database, Oracle 10g. It recently established the Enterprise Grid Alliance to design standards for grid computing, stealing the spotlight from IBM, which has long been building clustered databases for scientific and academic institutions.
While throwing out Linux hats to the audience, Ahuja told those in attendance that IBM would offer a discount to customers building data center clusters on Linux. Still, Ahuja found an audience less impressed with his message than concerned about whether it would be worthwhile to eliminate the millions of dollars' worth of hefty mainframes that many attendees' firms currently maintain.
"It's hard to justify bringing in smaller machines when the bigger ones are currently doing the job and doing it well," said Mike Skaff, a senior DBA with Greenwood Village, Colo.-based First Data Corp. "There seems to be a generation gap now between us veterans and the young graduates who love Linux."
However, Ahuja said the latest Linux kernel will add enterprise functionality to Linux. For example, he said companies using the latest Linux kernel will have the ability to scale to a large number of CPUs in a system.
"If you want to do a deployment today, there is nothing wrong with running with the 2.4 kernel," Ahuja said. "Deploying with the robust 2.6 kernel will be worth the wait."
IBM is also launching a Design Advisor feature in Stinger, which Ahuja said will allow customers to automatically create indexes, set partitioning keys and establish multidimensional clustering.
While further automation is a bonus for time-crunched DBAs, it's still unclear whether an automated design adviser will be able to set the best possible configuration, said Miguel Celi, a senior DBA and enterprise resource planning applications specialist at Ft. Lauderdale, Fla.-based staffing and executive recruitment firm, Spherion Corp.
"It's all a little complicated for a human being, let alone for a simple program," Celi said. "There is still some convincing that needs to be done before we jump on board. The cost factor is still a bit hazy."
According to Ahuja, most companies establish only one or two CPUs per database partition, and six to 20 disks per partition. Ahuja admitted that the cost increases rapidly as companies add CPUs to a single system.
When clustering a database, companies should also allocate between 4 GB and 8 GB of RAM per 32-bit partition, or between 6 GB and 16 GB per 64-bit partition, Ahuja said. Each partition should have between 100 GB and 300 GB of space for raw data, and the largest table partition should be limited to 100 GB.
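To make those rules of thumb concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration built on the ranges Ahuja quoted. The function name `estimate_cluster`, the `ClusterEstimate` type and the approach of deriving a partition count from raw data volume are assumptions made for this example, not an IBM sizing tool or a DB2 feature.

```python
import math
from dataclasses import dataclass

# Per-partition ranges as quoted in the talk (low, high).
# The names and structure below are hypothetical, for illustration only.
CPUS_PER_PARTITION = (1, 2)
DISKS_PER_PARTITION = (6, 20)
RAW_DATA_GB_PER_PARTITION = (100, 300)
RAM_GB_PER_PARTITION = {32: (4, 8), 64: (6, 16)}


@dataclass
class ClusterEstimate:
    partitions: tuple  # (fewest, most) database partitions
    cpus: tuple        # (low, high) total CPUs
    disks: tuple       # (low, high) total disks
    ram_gb: tuple      # (low, high) total RAM in GB


def estimate_cluster(raw_data_gb, bits=64):
    """Turn a raw data volume into rough low/high cluster totals."""
    low_data, high_data = RAW_DATA_GB_PER_PARTITION
    # Fewest partitions: fill each one to the 300 GB upper bound.
    fewest = max(1, math.ceil(raw_data_gb / high_data))
    # Most partitions: keep each one at or above the 100 GB lower bound.
    most = max(fewest, math.floor(raw_data_gb / low_data))

    ram_low, ram_high = RAM_GB_PER_PARTITION[bits]
    return ClusterEstimate(
        partitions=(fewest, most),
        cpus=(fewest * CPUS_PER_PARTITION[0], most * CPUS_PER_PARTITION[1]),
        disks=(fewest * DISKS_PER_PARTITION[0], most * DISKS_PER_PARTITION[1]),
        ram_gb=(fewest * ram_low, most * ram_high),
    )


if __name__ == "__main__":
    # Example: 1 TB (1,000 GB) of raw data on 64-bit partitions.
    print(estimate_cluster(1000, bits=64))
```

For 1,000 GB of raw data, the sketch yields between four and 10 partitions. The wide spread between the low and high totals shows how much the final design depends on choices the guidelines leave open.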
"There are some very well-established companies being successful with a clustering configuration," Ahuja said. "It's being done and companies are happy with the results."