I am currently developing a customer-centric data warehouse for a pharmaceutical company. The company recently merged with another company, and now I have two lists of customer attributes with some customers common between the two companies. How do I go about merging the two lists to create a single dimension?
Of course the answer is "it depends".
If you will continue to receive customer data from these two systems, thus you have two systems of record for customer data, you need to develop a scheme for matching/merging customer that are common across the two sources. This could be done on a set of "fuzzy" logic that does a "like match" on attributes such as name and address. Exact matches can be associated to each other systematically. Close matches may need to be queued up for final determination by the Customer Data Steward.
Set up your target customer table on the data warehouse such that the data warehouse manages the unique keys, using a surrogate key. Maintain the source system unique id/data warehouse unique cross reference within the data warehouse. The ETL tool can populate and manage this. If you need help from the data steward to explicitly identify matches that could not be systematically matched, you may need to provide an end-user maintenance panel or GUI screen so the data steward can update the source/data warehouse unique id cross reference.
An alternative to maintaining the cross reference within the data warehouse would be to make some slight modifications to source systems such that the data steward can identify the logical equivalent customer in the other source system. However, this is messy.
For More Information
- Dozens more answers to tough data warehousing questions from Mike Lampa are available here.
- The Best Data Warehousing and Business Intelligence Web Links: tips, tutorials, scripts, and more.
- Have an DW tip to offer your fellow administrators and developers? The best tips submitted will receive a cool prize. Submit your tip today!
- Ask your technical data warehousing questions -- or help out your peers by answering them -- in our live discussion forums.
- Ask the Experts yourself: Our SQL, database design, Oracle, SQL Server, DB2, metadata, object-oriented and data warehousing gurus are waiting to answer your toughest questions.
Company leverages API integration to improve customer experience
Dig Deeper on Oracle data warehousing
Related Q&A from Mike Lampa
When trying to design a data warehouse, we often try to model the database on the operational data model. Are there any guidelines in trying to ... Continue Reading
What is a surrogate key in a table? Continue Reading
Have a question for an expert?
Please add a title for your question
Get answers from a TechTarget expert on whatever's puzzling you.