ORLANDO, Fla. -- The problem of how best to ensure data quality has been on the radar of enterprises -- and the subject of much IT effort -- for more than a decade now.
So why are substantial improvements in the area of data quality as elusive today as they were back then?
It's a question Arkady Maydanchik, co-founder of Oak Brook, Ill.-based Data Quality Group LLC, sought to answer yesterday in his keynote speech before attendees of The Data Warehousing Institute conference here.
According to Maydanchik, growing demand for data quality tools, professionals and training, along with the rise of new data quality initiatives, indicates that enterprises see data quality as an important issue. But despite that growing awareness and some progress in the quality of individual databases, overall corporate data quality continues to deteriorate, he said.
One of the primary reasons data quality continues to erode has to do with organizations' attitudes toward new technology, Maydanchik explained. Companies tend to believe that new business applications and other technologies will help to solve data quality issues, he said, but this is generally not the case.
"Technology is a magnifying glass. It magnifies the intrinsic value of data," he said. "Better technology combined with good data is a great asset, but better technology combined with bad data is an equally great liability."
Another problem that stands in the way of data quality is organizations' tendency to fill data quality teams with people from IT groups who may know a lot about databases but do not know a great deal about the data quality discipline itself.
"The truth is that data quality is a rather complex discipline," Maydanchik said. "A data quality team needs to be staffed with people that know a lot about this particular discipline."
He added that it's also important to staff data quality teams with representatives from the business side of the organization because, after all, they are the ones using data to drive the business.
"Without business users, you can never improve data quality," he said.
Finally, Maydanchik said, organizations continue to see data quality problems because they don't have a good handle on where their data quality issues reside within the organization. That's why a good data quality program begins with a comprehensive data quality assessment.
"[The data quality assessment] is a cornerstone, and the objective here is to systematically go through the data and identify all data problems and then measure their impact on various business processes," he said. "Then build a data quality scorecard so that you can tell what exactly the problems are and how fixing them is going to improve your data."
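As a rough illustration of the scorecard idea Maydanchik described, the Python sketch below runs a handful of validation rules over a set of records and reports the share that fail each one. The rule names, sample records and the `build_scorecard` helper are all hypothetical, invented here for illustration; the sketch covers only the "identify problems" step, not the measurement of business impact that he also called for.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict[str, object]


@dataclass
class Rule:
    name: str                        # hypothetical rule label
    check: Callable[[Record], bool]  # returns True when a record passes


def build_scorecard(records: list[Record], rules: list[Rule]) -> dict[str, float]:
    """Return, for each rule, the fraction of records that fail it (0.0 to 1.0)."""
    total = len(records)
    scorecard: dict[str, float] = {}
    for rule in rules:
        failures = sum(1 for record in records if not rule.check(record))
        scorecard[rule.name] = failures / total if total else 0.0
    return scorecard


# Made-up sample data; a real assessment would read from the databases under review.
records = [
    {"id": 1, "email": "a@example.com", "age": 34},
    {"id": 2, "email": "", "age": 51},
    {"id": 3, "email": "c@example.com", "age": -5},
    {"id": 4, "email": None, "age": 28},
]

# Illustrative rules only; actual rules would come from business users.
rules = [
    Rule("email_present", lambda r: bool(r.get("email"))),
    Rule("age_in_range", lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120),
]

scorecard = build_scorecard(records, rules)
for name, rate in scorecard.items():
    print(f"{name}: {rate:.0%} of records fail")
```

With the sample data above, half the records fail the email rule and a quarter fail the age rule. Keeping each rule as a small named check is one way to let business users, not just database staff, review and extend the list.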
Does Web 2.0 mean more data quality problems?
The ever-increasing amount of data that organizations deal with and the emergence of Web 2.0 technologies are adding significantly to the data quality problem, according to one conference attendee who sat in on the keynote presentation.
The attendee, an enterprise architecture professional with a U.S. government agency who asked that his identity be kept private, said that the emergence of Web 2.0 systems known as mash-ups -- applications that combine data from different sources and are becoming increasingly popular in corporate environments -- is adding a great deal of complexity to the data quality issue.
"When you start doing mash-ups and Web servicing data in different places, your ability to control and structure the data interchanges is gone, and so you've got a whole new generation of data quality problems," he said. "Structured rules of warehousing are being blown out of the water these days."
The attendee said he agreed with Maydanchik's contention that organizations must form dedicated data quality teams to get some semblance of a handle on data quality issues. These teams, he said, are crucial to helping organizations plan for data quality before beginning any sort of data governance initiative.
"It's a matter of lifecycle management [and] the perennial attempt to get planning ahead of building and buying," he said. "But that doesn't solve the mash-up problem, and it doesn't solve [the problem of] others using your data."