Grid computing is not a new concept; the idea of multiple resources working together on a single application dates back to the late 1970s, when network operating systems first came into the picture. The latest iteration operates over IP. Grids have been able to expand to offer more services and more processing power over greater distances thanks to increased site-to-site bandwidth and the ability to process multiple packets at a time, each larger than their earlier counterparts. By sharing data stores and applications across the globe, data that was not available before becomes accessible to the scientific and private sectors alike.
The TeraGrid project is one such grid. "The TeraGrid is a multi-year effort to build and deploy the world's largest, most comprehensive, distributed infrastructure for open scientific research." The project spans several member sites and is well worth checking out. Together, the sites can store nearly a petabyte of data and provide 20 teraflops of computing power. A complete image of the earth is stored once every 24 hours; these images are used to track global warming, fish migration patterns, environmental data, and more. With the increased capability provided through digital imaging, the images are extremely detailed, clear, and, of course, large. Supercomputing centers in several states are linked together in this grid, so scientists can share the massive data stores regardless of their location.
A Data Grid provides several tools to facilitate the movement of large amounts of data across a WAN. This is accomplished partly through replication and through data set discovery, replica selection, and replica management services. A variety of applications run above the Data Grid. The Grid Services layer may include protocols, management, authentication, discovery, policies, and so on, while the Grid Fabric consists of the actual archives and all attached devices, including computers, displays, and networks. All of this is needed because these installations produce multiple terabytes of data per day; CERN alone is reported to produce over 4TB per day. Much of this data is cached for faster access and indexed for faster lookups. Data that is tuned for WAN performance can actually be accessed faster than through a traditional single data store, thanks to caching and the ability to feed multiple parts of a file at the same time.
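To make the replica-management idea above concrete, here is a minimal sketch in Python. The catalog, site names, and latency figures are all hypothetical, and the "transfer" is simulated in memory; the point is only to illustrate the two steps the paragraph describes: selecting the best replica of a logical file, then pulling it as parallel chunks the way WAN-tuned transfers feed multiple parts of a file at once.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical replica catalog: logical file name -> list of replicas.
# Site names, latencies, and contents are illustrative, not real data.
REPLICAS = {
    "climate/earth_daily.img": [
        {"site": "SDSC", "latency_ms": 80, "data": b"abcdefgh"},
        {"site": "NCSA", "latency_ms": 30, "data": b"abcdefgh"},
    ],
}

def select_replica(lfn):
    """Replica selection: choose the copy with the lowest measured latency."""
    return min(REPLICAS[lfn], key=lambda r: r["latency_ms"])

def fetch(lfn, chunk_size=2, workers=4):
    """Fetch the file as parallel chunks from the chosen replica,
    mimicking the multi-stream transfers used for WAN performance."""
    data = select_replica(lfn)["data"]
    offsets = range(0, len(data), chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in offset order, so reassembly is trivial.
        chunks = pool.map(lambda off: data[off:off + chunk_size], offsets)
    return b"".join(chunks)

print(select_replica("climate/earth_daily.img")["site"])  # NCSA (lowest latency)
print(fetch("climate/earth_daily.img"))                   # b'abcdefgh'
```

A real Data Grid would make the selection call against a replica catalog service and stream the chunks over the network, but the division of labor is the same: discovery and selection above, parallel byte movement below.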
As storage has become an insignificant portion of network costs, this model can also be built for large companies with massive storage needs. Tying multiple machines together with multiple data stores creates an inexpensive environment compared to a single supercomputer. What does this mean for your enterprise? These scientific grids are readily being adapted for commercial use at various scales, and this single technology has the potential to eliminate one of the most costly factors for any company: network slowness.
Now, I am not saying that grids are the only way to go. There are still some compatibility problems with both protocols and hardware. However, Lawrence Berkeley National Laboratory put together a grid for under $14,000 using four machines with over a terabyte of storage and a true throughput of 1 Gb/s (250 Mb/s per computer). Grids still have to deal with QoS issues just like voice applications, more needs to be done to make them easy enough for IT personnel to administer, and, of course, security is a concern: each local policy must be adhered to for access. I will say, however, that with 10GBASE-T coming soon, 10G already shipping for fiber and CX4, increased throughput on the WAN side, and all of the work that has been going on in this area for ten years, grids will be the next best thing!
Grid-ready cabling has already been introduced. Scalable server architectures, scalable and redundant data stores, and network QoS issues are all being addressed. Grids may very soon be the best way to become fully fault tolerant across a geographically dispersed area.
Carrie Higbie, Global Network Applications Market Manager, The Siemon Company
Carrie has been involved in the computing and networking industries for nearly 20 years. She has worked with manufacturing firms, medical institutions, casinos, healthcare providers, cable and wireless providers, and a wide variety of other industries in networking design and implementation, project management, and software development for privately held consulting firms and, most recently, Network and Software Solutions.
Carrie currently works with The Siemon Company, where her responsibilities include providing liaison services to electronics manufacturers to ensure harmony between the active electronics and existing and future cabling infrastructures. She participates in the IEEE, TIA, and various consortiums for standards acceptance and works to further educate the end-user community on the importance of a quality infrastructure. Carrie holds an RCDD/LAN Specialist designation from BICSI, an MCNE from Novell, and several other certifications.