This is the second of a two-part series on the Oracle big data architecture. The first part looked at various commodity- and appliance-based big data platforms. This installment examines the costs of implementing big data with an appliance versus a build-it-yourself approach.
When pondering how to address big data, many organizations turn to one of two options: building an in-house system with commodity hardware and open source software, or purchasing an appliance that already contains the necessary components. A commodity-based system offers flexibility and comes with an inviting price tag, at least on the surface. An appliance makes implementation fast and easy, which is an attractive alternative if you don't have the necessary in-house resources. Choosing one approach over the other for your Oracle big data architecture is no simple task.
Calculating costs for an Oracle big data architecture
A cost analysis is essential when planning a big data project. Yet coming up with an accurate picture is not always straightforward.
On the appliance side, you can start with the price tag. The full Oracle Big Data Appliance costs $450,000 for an 18-node rack, plus hefty annual fees if you want premier support. And even that doesn't represent all your expenses. You still need to host and maintain the appliance, even if you opt for premier support. It's also likely you'll be implementing additional software, developing custom code and linking to existing systems, such as data warehouses or relational database systems. The appliance might seem like a complete platform, but no system is an island, and you will still need some in-house resources.
Projecting the cost of building a commodity-based system for an Oracle big data architecture is even trickier. Estimating hardware costs is easy enough as long as you're sure to include the necessary network infrastructure, such as switches and bandwidth, as well as any other peripheral devices. But you also need to take into account any software or services you might want to purchase. For example, you'll likely need an application for managing your Hadoop clusters, such as one of the technologies offered by MapR, Cloudera or Hortonworks, as well as the resources necessary to perform ongoing management and maintenance.
If you want to incorporate other open source software into your platform, such as R Analytics or MongoDB NoSQL databases, you must also account for the time and resources necessary to implement and manage that software. Plus, you'll need resources to set up the hardware and connect Hadoop to other systems, and you must develop custom code to access data and support the various systems. Above all, you should be budgeting for data scientists and engineers who understand big data and how to manage it in a Hadoop environment. Without the proper expertise, you open yourself up to unexpected costs and risks -- and possibly to the project being abandoned altogether.
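One way to keep these line items honest is to put them in a small model and total them over several years. The following Python sketch is only an illustration: the `CostModel` class and its category names are hypothetical, and apart from the $450,000 rack price quoted above, every figure is a made-up placeholder that says nothing about which option is actually cheaper.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CostModel:
    """One-time and recurring cost line items for a platform option."""
    name: str
    upfront: Dict[str, float] = field(default_factory=dict)
    annual: Dict[str, float] = field(default_factory=dict)

    def total(self, years: int) -> float:
        """Total cost of ownership over the given number of years."""
        return sum(self.upfront.values()) + years * sum(self.annual.values())


# Only the 450,000 rack price comes from the article; every other
# number is a made-up placeholder to be replaced with real quotes.
appliance = CostModel(
    name="appliance",
    upfront={"18_node_rack": 450_000, "integration_and_custom_code": 100_000},
    annual={"premier_support": 50_000, "hosting": 20_000, "staff": 150_000},
)
commodity = CostModel(
    name="commodity",
    upfront={"servers": 300_000, "network": 50_000, "setup_and_integration": 200_000},
    annual={"cluster_management_software": 60_000, "hosting": 30_000, "staff": 300_000},
)

if __name__ == "__main__":
    # Print the running total for each option at a few planning horizons.
    for option in (appliance, commodity):
        for years in (1, 3, 5):
            print(f"{option.name:<10} {years} yr: ${option.total(years):,.0f}")
```

The value of a model like this lies less in the totals than in the list of categories: it forces you to write down staffing, integration and hosting for both options rather than comparing one sticker price against another.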
Estimating time to market
In all likelihood, you'll find that the time to market of an appliance-based system is shorter than that of a commodity-based one. With an appliance, you don't need to set up servers, wire them together, install the base software, link systems and try to make all the components work together. According to research the Enterprise Strategy Group Inc. conducted on behalf of Oracle, an appliance delivers a 30% time-to-market saving.
You still have to wait for an appliance to be built and tested, but you can spend that time on pursuits other than building your own system. Plus, the faster you can get a system into place, the faster you can start analyzing data and realizing any benefits derived from big data.
Betting on the future
As with any large-scale enterprise effort, you must account for what lies ahead for your Oracle big data architecture. An appliance might seem perfect for your present needs, but what about two years from now? How about five?
For example, suppose you anticipate processing up to 200 TB of data in the foreseeable future. You determine that you can go with a single 18-node Oracle Big Data Appliance or a 100-node commodity-based system. Now suppose that a year from now you discover you need to accommodate another 10 TB of data. If you went the Oracle appliance route, you'll have to purchase a second rack, a 6-node unit that costs $160,000 (at today's prices). If you built a commodity-based platform, however, you'll need only to add and configure five or so computers at a much lower cost. That said, if you need to expand your data capacity at a pace that fits the appliance model, you'll likely find it easier to scale out your racks than to add commodity hardware that must be configured and set up with complex software.
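The arithmetic behind that example can be sketched in a few lines of Python. The sketch assumes that capacity scales linearly with node count, which is a simplification, and the function names are hypothetical. The node counts, the 200 TB and 10 TB figures and the $160,000 rack price come from the example above; the $5,000 price per commodity server is a made-up placeholder.

```python
import math


def nodes_needed(extra_tb: float, tb_per_node: float) -> int:
    """Smallest whole number of nodes that covers the extra capacity."""
    return math.ceil(extra_tb / tb_per_node)


def expansion_cost(extra_tb, tb_per_node, nodes_per_increment, cost_per_increment):
    """Return (increments purchased, total cost) when capacity can only
    be bought in fixed-size increments of nodes."""
    nodes = nodes_needed(extra_tb, tb_per_node)
    increments = math.ceil(nodes / nodes_per_increment)
    return increments, increments * cost_per_increment


# Per-node capacity implied by the example, assuming linear scaling.
APPLIANCE_TB_PER_NODE = 200 / 18    # roughly 11.1 TB per appliance node
COMMODITY_TB_PER_NODE = 200 / 100   # 2 TB per commodity node

# Appliance capacity comes in 6-node racks at $160,000 (from the example).
appliance = expansion_cost(10, APPLIANCE_TB_PER_NODE, 6, 160_000)

# Commodity capacity comes one server at a time; 5,000 is a placeholder price.
commodity = expansion_cost(10, COMMODITY_TB_PER_NODE, 1, 5_000)

if __name__ == "__main__":
    print("appliance racks, cost:", appliance)
    print("commodity servers, cost:", commodity)
```

Under these assumptions, 10 TB works out to five commodity servers, which matches the "five or so" estimate, while the appliance buyer pays for a full 6-node rack that provides far more capacity than the 10 TB required. The gap narrows as the extra capacity approaches what a whole rack provides, which is the sense in which growth can fit the appliance model.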
Commodity-based systems generally provide more flexibility going forward. And it's not just the risk of larger data sets that makes flexibility important. Big data is still a relatively young field, and our understanding of it and the ways we address it are changing rapidly. Even Google is constantly on the lookout for new and better ways to handle its massive amounts of data. An appliance is built with a specific configuration in mind. What happens if that configuration is no longer the best strategy for handling big data?
The cloud will also likely play a larger role in the future of big data. The time might come when the cloud will serve an organization better than in-house technology, whether appliance-based or commodity-based. For some organizations, that time might already be here. If you purchase appliances today, will you be able to make effective use of them tomorrow if they no longer serve your big data needs? With commodity hardware, you might have more opportunities to put those systems to other uses. But even with them, you should be trying to project into the future before making a purchase.
Build vs. buy
There are many factors to take into account when deciding between an appliance-based system and a commodity-based one. If your organization is in a hurry to get going with as little effort as possible, chances are an appliance is the way to go. When you purchase an appliance, you're buying not just the hardware, but also the services and expertise you need to smooth the road to implementation and management. On the other hand, if you have the necessary in-house resources, including a high level of expertise, you might benefit from the flexibility inherent in building it yourself, especially if you have a large number of commodity boxes sitting around.
But neither option should be taken lightly. They both come with a high price tag and their own risks, which are not always apparent on initial examination. Whatever you choose, try to get a realistic estimate of the total costs and time to market. And always be looking to the future.
About the author:
Robert Sheldon is a technical consultant and the author of numerous books, articles and training material related to Microsoft Windows, various relational database management systems, and business intelligence design and implementation.