# Match two words in a column

I have two tables, and both have a column called full_name. I want to match at least two words in the full name column. For example, "John Alder Smith" and "John F Smith" is a match, while "Peter Duncan Doyle" and "Peter Parker" is not a match. Thanks for your help.

The SQL part of the solution to this problem is to use a cross join, and then a WHERE clause to do the match:

```sql
select t1.full_name as t1_name
     , t2.full_name as t2_name
  from table_one as t1
cross
  join table_two as t2
 where t1.full_name resembles t2.full_name
```

Why a cross join? Just to acknowledge that we are comparing every name in table one to every name in table two in order to find matches. At least, that's what the problem sounded like to me.
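To see what "every name against every name" means in practice, here is a minimal sketch that runs the cross join on its own. It assumes Python's built-in sqlite3 module as a stand-in for your database, and the sample rows (including 'Mary Jones') are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used purely as a scratch pad (an assumption;
# any SQL database that supports CROSS JOIN behaves the same way here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table_one (full_name text);
    create table table_two (full_name text);
    insert into table_one values ('John Alder Smith'), ('Peter Duncan Doyle');
    insert into table_two values ('John F Smith'), ('Peter Parker'), ('Mary Jones');
""")

# No WHERE clause yet: the cross join pairs every t1 row with every t2 row.
pairs = conn.execute("""
    select t1.full_name, t2.full_name
      from table_one as t1
     cross
      join table_two as t2
""").fetchall()

print(len(pairs))  # 2 rows x 3 rows = 6 candidate pairs
```

The WHERE clause then only has to decide, pair by pair, whether the two names resemble each other. Keep in mind that the number of pairs grows as the product of the two table sizes.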

Now, about this mysterious resembles operator. Of course, there is no such thing, at least not tailor-made to match at least two words in two columns. Our challenge now is to find a way to do this with SQL.

The tough part is breaking up the name into words. For the general case, an unknown number of words, we might employ an auxiliary integers table, but let's assume that for full names, four words (names) is a good working maximum, as this simplifies the query a bit.

```sql
select t1.full_name as t1_fullname
     , t2.full_name as t2_fullname
  from table_one as t1
cross
  join table_two as t2
 where case when ' '||t1.full_name||' '
             like '% '||word(t2.full_name,1)||' %'
            then 1 else 0 end
     + case when ' '||t1.full_name||' '
             like '% '||word(t2.full_name,2)||' %'
            then 1 else 0 end
     + case when ' '||t1.full_name||' '
             like '% '||word(t2.full_name,3)||' %'
            then 1 else 0 end
     + case when ' '||t1.full_name||' '
             like '% '||word(t2.full_name,4)||' %'
            then 1 else 0 end
    >= 2
```

Each CASE expression compares one word from t2.full_name against the complete t1.full_name. Each LIKE comparison consists of two terms, and both are padded with a space at the front and back, like this:

`   ' Peter Duncan Doyle ' like '% Peter %'`

We need spaces around the name Peter in the right term, '% Peter %', because we don't want to match Peterson. We therefore also need to pad the left term, the entire full name, with a space at both the front and back. Otherwise a word at the very beginning or end of the full name would have no space next to it and would never be found.
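The effect of the padding is easy to check with a few literal comparisons. The following is a small sketch, assuming Python's built-in sqlite3 module as the test bed; the `like` helper is a hypothetical convenience wrapper, not part of any library.

```python
import sqlite3

# In-memory SQLite database; no tables are needed to compare literals.
conn = sqlite3.connect(":memory:")

def like(left, pattern):
    """Hypothetical helper: evaluate `left LIKE pattern` in SQLite."""
    row = conn.execute("select ? like ?", (left, pattern)).fetchone()
    return bool(row[0])

# No padding at all: 'Peter' is found inside 'Peterson' (a false positive).
unpadded_false_positive = like("Peterson Doyle", "%Peter%")

# Padded pattern but unpadded name: the first word has no space in front
# of it, so a genuine match is missed.
missed_first_word = like("Peter Duncan Doyle", "% Peter %")

# Both terms padded: whole words match, partial words do not.
padded_match = like(" Peter Duncan Doyle ", "% Peter %")
padded_peterson = like(" Peterson Doyle ", "% Peter %")

print(unpadded_false_positive, missed_first_word, padded_match, padded_peterson)
```

Only the last two comparisons, with both terms padded, behave the way we want for whole-word matching.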

Thus we test the first four words of t2.full_name, and for every word found within t1.full_name, we add 1 to a total. And if this total is 2 or more, the full names are considered to match.
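Outside the database, the same counting rule can be written in a few lines, which makes a handy reference when checking the SQL's results. This is a sketch in Python rather than SQL; the function name `resembles` and its parameters are made up for illustration, and unlike LIKE in some database systems it compares words case-sensitively.

```python
def resembles(t1_name, t2_name, max_words=4, threshold=2):
    """Count how many of the first `max_words` words of t2_name occur
    as whole words in t1_name, then compare the count to `threshold`."""
    t1_words = t1_name.split()
    hits = sum(1 for w in t2_name.split()[:max_words] if w in t1_words)
    return hits >= threshold

print(resembles("John Alder Smith", "John F Smith"))    # John and Smith are shared
print(resembles("Peter Duncan Doyle", "Peter Parker"))  # only Peter is shared
```

Like the query, this adds 1 for each of the first four words of the second name that appears in the first name, so a word repeated in the second name is counted each time it appears.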

The only thing we haven't done yet is explain how to extract the separate words out of t2.full_name. As you probably guessed, there is no standard WORD function. Depending on your database system, you might use some combination of nested POSITION and SUBSTRING functions to extract words based on how many spaces you detect in the full name, going from left to right. Granted, by the time you get to the CASE expression for the fourth word, with POSITION and SUBSTRING functions nested four deep, it does get ugly. For this reason, check whether your database system offers other string-handling functions that make this part easier. For example, MySQL has the SUBSTRING_INDEX function: `SUBSTRING_INDEX(SUBSTRING_INDEX(full_name,' ',2),' ',-1)` returns the second word, although when the name has fewer words than requested it returns the last word instead, so that case needs its own guard. And if your database system allows you to declare a user-defined function, you could write your own WORD function and use it exactly as shown above.
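As an illustration of the user-defined function route, here is a minimal sketch that registers a `word` function and runs the query end to end. It assumes SQLite accessed through Python's built-in sqlite3 module, not any particular production database; the `word` implementation and the sample rows are illustrative only. Note that SQLite's LIKE is case-insensitive for ASCII letters by default, which may differ from your system.

```python
import sqlite3

def word(text, n):
    """Illustrative WORD function: the n-th space-separated word (1-based),
    or None (SQL NULL) when the text has fewer than n words."""
    if text is None:
        return None
    parts = text.split()
    return parts[n - 1] if 1 <= n <= len(parts) else None

conn = sqlite3.connect(":memory:")
conn.create_function("word", 2, word)  # expose word(text, n) to SQL

conn.executescript("""
    create table table_one (full_name text);
    create table table_two (full_name text);
    insert into table_one values ('John Alder Smith'), ('Peter Duncan Doyle');
    insert into table_two values ('John F Smith'), ('Peter Parker');
""")

# One CASE term per word position (1 to 4), built in a loop to avoid
# typing the same expression four times.
terms = " + ".join(
    f"case when ' '||t1.full_name||' ' like '% '||word(t2.full_name,{i})||' %' "
    "then 1 else 0 end"
    for i in range(1, 5)
)
query = f"""
    select t1.full_name, t2.full_name
      from table_one as t1
     cross
      join table_two as t2
     where {terms} >= 2
"""
matches = conn.execute(query).fetchall()
print(matches)
```

When a name has fewer than four words, `word` returns NULL, the LIKE comparison evaluates to NULL, and the CASE expression falls through to its ELSE branch and contributes 0. That is why short names need no special handling in this version.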

This was last published in October 2005
