2003 closed out with an up market and an up alert level on the homeland defense meter. Taken together, the two indicators suggest that businesses are getting back to business and cranking out as much or more data as they have in the past, while terrorists, hackers and other malcontents are hatching new schemes to advance their various causes. And, in the process, data is increasingly at risk.
Given the mix of trends as we enter 2004, it seems we are all going to have to get a bit more savvy about data protection. Specifically, we need to find ways to make sense of the mixed-up messaging around data protection strategies and solutions. We also need to figure out how to build a capability that is right for our applications, environment and budget.
To start this process, we need to acknowledge some simple truths about the joy of data protection.
Simple truth #1: All data protection strategies come down to data replication
Despite the hemming and hawing of the tape backup and disk mirroring crowds, both camps are essentially selling methods for performing the same basic task: make a copy of data at an alternate location to protect the original against loss.
Replication methods differ in the speed and granularity of restore they offer, their support for heterogeneous restore targets, the delta between the original data and the copy, the media being used and, of course, the price tag associated with each approach. By applying what you know about each application's recovery requirements, you can match the right method, or mix of methods, to the data it protects.
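Reduced to its essentials, every one of these approaches performs the copy-and-verify operation below. This is a minimal, illustrative sketch (the `replicate` function and the file paths are hypothetical, not any vendor's API): copy data to an alternate location and confirm the replica matches the original.

```python
import hashlib
import shutil
from pathlib import Path


def replicate(source: Path, replica: Path) -> str:
    """Copy a file to an alternate location and verify the copy.

    Returns the SHA-256 digest shared by the original and the replica.
    """
    replica.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, replica)  # copy the data plus its metadata

    src_digest = hashlib.sha256(source.read_bytes()).hexdigest()
    dst_digest = hashlib.sha256(replica.read_bytes()).hexdigest()
    if src_digest != dst_digest:
        raise IOError("replica does not match original")
    return src_digest
```

Tape, mirroring and remote vaulting all vary the "alternate location," the medium and how often this loop runs; the underlying task is the same.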
Simple truth #2: There is no "universal solution"
The word "solution" connotes a single, all-encompassing method for data protection. However, it is rarely the case that there is a one-size-fits-all approach capable of addressing the recovery requirements of every business application. This is especially true in larger enterprises.
Just as many IT shops have begun to embrace the concept of multiple tiers of disk to satisfy different "data hosting" requirements, so, too, must disaster recovery and business continuity planners begin thinking of data protection as a capability potentially composed of multiple products and services.
Different data has different recovery requirements, so it stands to reason that a single IT shop may have various tape backup subsystems, assorted disk-to-disk asynchronous mirroring configurations, and -- if the current sign-ups reported by Sungard Recovery Services and LiveVault are any indication -- maybe even some remote data vaulting services to boot.
The important thing isn't how many processes make up your data protection strategy, but how efficiently they can be managed by constrained staff resources to deliver the expected value.
Simple truth #3: Continuous is better
Continuous data protection -- the notion of mirroring (or journaling and forwarding) I/O activity to an alternate storage repository or server -- is currently enjoying considerable attention within the industry. It connotes an ongoing protection method in which the copied data stays nearly synchronized with production data, so that recovery can occur by failing over from one repository to another.
Ideally, such a solution would be handled as a function of the business application software itself -- at "Layer 7" of the ISO model. However, most Layer 7 strategies have bombed badly in recent years, including parallel databases in which the primary database logs in to the remote database as a "user" and duplicates the adds, modifies and deletes that have been logged in its own system. So far, such strategies have introduced unacceptable latency into both the local and remote application.
Instead, most of the continuous data protection strategies in the market tend to work lower down in the ISO stack. Traditionally, array makers advised customers to allocate a considerable portion of their array disk capacity for creating and "breaking off" point-in-time mirror copies of data. These strategies worked, but increased the cost of the array substantially (special software, additional disk, and duplicate arrays from the same manufacturer were needed) and reduced the overall cost-efficiency of the array products themselves.
Enter the latest crop of continuous protection strategies from virtualization software vendors like Veritas, Datacore, FalconStor, StoreAge, and Fujitsu Softek, and specialty continuous data protection ISVs like Revivio. The former seek to usurp the on-array software functionality of the big iron vendors and to duplicate-and-forward I/O streams to heterogeneous targets.
Meanwhile, Revivio and others seek to journal block-level data changes to their own repository. If an interruption occurs at the primary site, this method enables data restoration either to any point in time (in the case of Revivio's time-addressable storage) or to fixed, indexed "crash-coherency" points (in most everyone else's strategy).
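The journaling idea can be sketched in a few lines. The class below is a simplified illustration of the concept, not any vendor's implementation: each block-level write is appended to a timestamped journal, and the volume's state as of any chosen moment is reconstructed by replaying writes up to that time. (The class name, method names and in-memory journal are all hypothetical; a real product journals to durable storage.)

```python
import time
from typing import Dict, List, Optional, Tuple


class BlockJournal:
    """Illustrative journal of block-level writes for point-in-time restore."""

    def __init__(self) -> None:
        # Each entry: (timestamp, block number, block contents)
        self._entries: List[Tuple[float, int, bytes]] = []

    def record(self, block: int, data: bytes, at: Optional[float] = None) -> None:
        """Append a write to the journal (default timestamp: now)."""
        self._entries.append((time.time() if at is None else at, block, data))

    def restore(self, as_of: float) -> Dict[int, bytes]:
        """Rebuild the volume image by replaying writes up to time `as_of`."""
        volume: Dict[int, bytes] = {}
        for stamp, block, data in self._entries:
            if stamp <= as_of:
                volume[block] = data  # later writes overwrite earlier ones
        return volume
```

Picking an arbitrary `as_of` corresponds to time-addressable restore; restricting `as_of` to a set of indexed timestamps corresponds to restoring to fixed "crash-coherency" points.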
Both strategies offset some of the drawbacks of vendor-specific array mirroring by enabling virtually any disk to be targeted for restore. However, all need to be scrupulously checked for compatibility with other software you may be using, including low-level operating system utilities.
Simple truth #4: Tape works
Tape, meanwhile, does not afford continuous protection, but may have advantages derived from its low price point and widespread familiarity after 20-plus years of service. New products, like those from Breece Hill, combine tape and disk into an appliance, facilitating low cost disk-to-disk-to-tape strategies. Meanwhile, many other vendors, including the redoubtable Quantum Corporation, have introduced disk appliances to front-end tape libraries in order to expedite backups and local restores while preserving the portability of tape.
Users need to understand that tape technology, when applied judiciously, can solve even the knottiest of data protection problems. Tape can still be used effectively, for example, with multiterabyte databases, provided the database has been designed with tape in mind. In the preponderance of cases, most data in a database does not change. So, pre-positioning this "core" data at a recovery center, then using tape only to restore the changed data, may effectively address the requirements of databases that seem to have outgrown tape from the perspective of either backup or restore.
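The arithmetic behind this approach is simple, and the recovery step amounts to a merge. The sketch below is purely illustrative (the function name and the dictionary representation of records are assumptions for the example): the bulk of the database is the pre-positioned core, and only the much smaller changed set comes back from tape, with changed records winning on conflict.

```python
from typing import Dict


def assemble_recovery_image(core: Dict[str, bytes],
                            changed_from_tape: Dict[str, bytes]) -> Dict[str, bytes]:
    """Merge pre-positioned core data with changed records restored from tape.

    The core set never leaves the recovery center; only the changed
    records (typically a small fraction of the database) must be read
    from tape, so restore time scales with the delta, not the database.
    """
    image = dict(core)               # start from the pre-positioned bulk
    image.update(changed_from_tape)  # changed records overwrite core copies
    return image
```

If, say, 5% of a 10 TB database changes between cycles, the tape restore handles roughly 500 GB rather than the full 10 TB, which is the difference between a database that has "outgrown" tape and one that has not.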
Simple truth #5: You don't need a Fibre Channel (FC) fabric to have an effective data protection capability
This may come as a relief to many smaller and medium-sized businesses that are concerned about the viability of their data protection strategies. The fact is that Gigabit Ethernet with jumbo frames provides a more efficient connection for file-based backups than Fibre Channel does, any day of the week. The only advantage of a fabric is that it begins to enable back-end data movements, including replication processes. The downside is that a poorly managed fabric may increase, rather than decrease, the likelihood of a data disaster.
2004 will see the introduction of workable schemes for "networking" storage based on IP (iSCSI) and Serial-Attached SCSI (SAS) that may well eat into the Fibre Channel market. The new year will also see the development of new options for intra-storage connectivity and data protection.
Keep your eyes and ears open and be sure to let me know how these solutions are working out for you in the coming year.
About the Author: Jon William Toigo heads up an international storage consulting group, Toigo Partners International, and has authored hundreds of articles on storage and technology along with his monthly SearchStorage.com "Toigo's Take on Storage" expert column and backup/recovery feature. He is a frequent site contributor on the subjects of storage management, disaster recovery and enterprise storage. Toigo has authored a number of storage books, including Disaster Recovery Planning: Preparing for the Unthinkable, 3/e. For detailed information on the nine parts of a full-fledged DR plan, see Jon's web site at www.drplanning.org/phases.html.
This was first published in December 2003