Match two words in a column

I have two tables, and both have a column called full_name. I want to match at least two words in the full name column. For example, "John Alder Smith" and "John F Smith" is a match, while "Peter Duncan Doyle" and "Peter Parker" is not a match. Thanks for your help.

The SQL part of the solution to this problem is to use a cross join, and then a WHERE clause to do the match:

select t1.full_name as t1_name
     , t2.full_name as t2_name
  from table_one as t1
cross
  join table_two as t2
 where t1.full_name resembles t2.full_name 

Why a cross join? Because we are comparing every name in table one to every name in table two in order to find matches, and a cross join says so explicitly. At least, that's what the problem sounded like to me.

Now, about this mysterious resembles operator. Of course, there is no such thing, at least not tailor-made to match at least two words in two columns. Our challenge now is to find a way to do this with SQL.

The tough part is breaking up the name into words. For the general case, an unknown number of words, we might employ an auxiliary integers table, but let's assume that for full names, four words (names) is a good working maximum, as this simplifies the query a bit.
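For the general case, here is one possible sketch of the auxiliary integers table approach. It is written for SQLite through Python's sqlite3 module so that it can be run as-is. The integers table, the sample rows and the word helper (a hypothetical user-defined function that returns the n-th word of a name) are assumptions made for illustration, not part of the original problem.

```python
import sqlite3

def word(full_name, n):
    """Hypothetical helper: n-th space-separated word (1-based), or None."""
    parts = (full_name or "").split()
    return parts[n - 1] if 1 <= n <= len(parts) else None

conn = sqlite3.connect(":memory:")
conn.create_function("word", 2, word)  # expose the helper to SQL as word(name, n)
conn.executescript("""
    create table integers (n integer primary key);
    insert into integers values (1),(2),(3),(4),(5),(6),(7),(8);
    create table table_one (full_name text);
    create table table_two (full_name text);
    insert into table_one values ('John Alder Smith'), ('Peter Duncan Doyle');
    insert into table_two values ('John F Smith'), ('Peter Parker');
""")

# One joined row per word position n whose word is found in t1.full_name.
# When word() returns NULL the ON condition is NULL, so that row is dropped.
rows = conn.execute("""
    select t1.full_name as t1_fullname
         , t2.full_name as t2_fullname
         , count(*)     as words_matched
      from table_one as t1
     cross
      join table_two as t2
     inner
      join integers as i
        on ' '||t1.full_name||' '
           like '% '||word(t2.full_name, i.n)||' %'
     group by t1.full_name, t2.full_name
    having count(*) >= 2
""").fetchall()

print(rows)
```

The number of rows in the integers table sets the maximum number of words examined, so it replaces the fixed list of CASE expressions. Note that SQLite's LIKE ignores case for ASCII letters by default; other database systems may behave differently.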

select t1.full_name as t1_fullname
     , t2.full_name as t2_fullname
  from table_one as t1
cross
  join table_two as t2
 where case when ' '||t1.full_name||' ' 
           like '% '||word(t2.full_name,1)||' %'
            then 1 else 0 end
     + case when ' '||t1.full_name||' ' 
           like '% '||word(t2.full_name,2)||' %'
            then 1 else 0 end
     + case when ' '||t1.full_name||' ' 
           like '% '||word(t2.full_name,3)||' %'
            then 1 else 0 end
     + case when ' '||t1.full_name||' ' 
           like '% '||word(t2.full_name,4)||' %'
            then 1 else 0 end
       >= 2

Each CASE expression compares one word from t2.full_name to the complete t1.full_name. Each LIKE comparison has two terms, and each term gets a space added at both the front and the back, like this:

   ' Peter Duncan Doyle ' like '% Peter %'

We need spaces around the name Peter in the right term, '% Peter %', because we don't want to match Peterson. We therefore also need to append a space to both the front and back of the left term, the entire full name, in order to find a word at the beginning or end of the full name.
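The effect of the padding can be checked with a small sketch. It uses SQLite through Python's sqlite3 module purely as a convenient way to evaluate the LIKE expressions; the two helper functions and the sample names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def contains_word(full_name, word):
    """Padded comparison: 1 if word occurs as a whole word in full_name."""
    sql = "select ' ' || ? || ' ' like '% ' || ? || ' %'"
    return conn.execute(sql, (full_name, word)).fetchone()[0]

def contains_substring(full_name, word):
    """Unpadded comparison: 1 if word occurs anywhere in full_name."""
    sql = "select ? like '%' || ? || '%'"
    return conn.execute(sql, (full_name, word)).fetchone()[0]

print(contains_word("Peter Duncan Doyle", "Peter"))   # 1: word at the start
print(contains_word("Duncan Doyle Peter", "Peter"))   # 1: word at the end
print(contains_word("Peterson Doyle", "Peter"))       # 0: Peterson is not Peter
print(contains_substring("Peterson Doyle", "Peter"))  # 1: the false match we avoid
```

Without the padding, Peterson would count as a match for Peter; with it, only whole words match, including words at the very beginning or end of the name.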

Thus we test the first four words of t2.full_name, and for every word found within t1.full_name, we add 1 to a total. And if this total is 2 or more, the full names are considered to match.

The only thing we haven't done yet is explain how to extract the separate words out of t2.full_name. As you probably guessed, there is no standard WORD function. Depending on your database system, you might use some combination of nested POSITION and SUBSTRING functions to extract words based on how many spaces you find in the full name, going from left to right. Granted, by the time you get to the CASE expression for the fourth word, with POSITION and SUBSTRING functions nested four deep, it does get ugly.

For this reason, check whether your database system offers other string functions that make this part easier. For example, MySQL has the SUBSTRING_INDEX function: SUBSTRING_INDEX(SUBSTRING_INDEX(full_name,' ',2),' ',-1) returns the second word of full_name. Be aware that when the name has fewer words than the number requested, this expression returns the last word again, so the same word could be counted twice. And if your database system allows you to declare a user-defined function, you could write your own WORD function and use it exactly as shown above.
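As a minimal sketch of the user-defined function route, the following registers a word function with SQLite through Python's sqlite3 module and runs the four-word query against the names from the question. SQLite is just one system that allows this; the Python implementation of word and the sample rows are assumptions for illustration, and other database systems will need their own syntax.

```python
import sqlite3

def word(full_name, n):
    """Return the n-th space-separated word (1-based), or None if there is none."""
    if full_name is None:
        return None
    parts = full_name.split()
    return parts[n - 1] if 1 <= n <= len(parts) else None

conn = sqlite3.connect(":memory:")
conn.create_function("word", 2, word)  # callable from SQL as word(name, n)
conn.executescript("""
    create table table_one (full_name text);
    create table table_two (full_name text);
    insert into table_one values ('John Alder Smith'), ('Peter Duncan Doyle');
    insert into table_two values ('John F Smith'), ('Peter Parker');
""")

# A missing word is NULL, which makes the LIKE result NULL,
# so that CASE expression falls through to ELSE 0.
QUERY = """
select t1.full_name as t1_fullname
     , t2.full_name as t2_fullname
  from table_one as t1
 cross
  join table_two as t2
 where case when ' '||t1.full_name||' '
           like '% '||word(t2.full_name,1)||' %'
            then 1 else 0 end
     + case when ' '||t1.full_name||' '
           like '% '||word(t2.full_name,2)||' %'
            then 1 else 0 end
     + case when ' '||t1.full_name||' '
           like '% '||word(t2.full_name,3)||' %'
            then 1 else 0 end
     + case when ' '||t1.full_name||' '
           like '% '||word(t2.full_name,4)||' %'
            then 1 else 0 end
       >= 2
"""

matches = conn.execute(QUERY).fetchall()
print(matches)
```

With these rows, only "John Alder Smith" and "John F Smith" reach a total of 2; "Peter Duncan Doyle" and "Peter Parker" share just one word. Returning NULL for a missing word, rather than an empty string or a repeat of the last word, keeps short names from picking up extra counts.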

This was last published in October 2005
