Q

Fuzzy matching in SQL

Can you do fuzzy matching with SQL? I want to retrieve a set of results based on how closely they match a certain string.

Can you do fuzzy matching with SQL? I want to retrieve a set of results based upon how closely they match to a certain string. For example, if the input string is SMITH, I want to retrieve all similar results, such as SMYTH, AMITH, SMITH, SMYTHE, etc., ideally with a measure of match closeness, e.g., 98%.

The answer is no, not really—but you can come close.

Most databases offer text searching using the familiar LIKE operator. Matches are exact (at best, collation- and case-insensitive), so this is not really a fuzzy search.

Many databases offer fulltext searching. MySQL's is fairy robust (if not always 100% accurate; it ignores words present in more than 50% of all rows, or words shorter than 4 characters). With an ordinary text search, a search for "big dogs" will not return a result such as "dogs with big noses" even though both words that were in the search phrase are present. Fulltext searching overcomes the exact match problem, and may even offer options such as setting a parameter for how "near" one word must be to another.

Fulltext searches may also involve stemming. If you search for "data modelling", stemming would allow the search to return results including "data models" because both "modelling" and "models" have the same stem word.

Some databases offer other special functions, such as the SOUNDEX function in SQL Server. This function would consider SMITH, SMYTH, and SMYTHE to be similar, but not AMITH. Perhaps this is sufficient for your needs.

True fuzzy searching takes into consideration the actual meaning of words. Do a search for "family name" and you should see results with "surname" included. I'm not aware of any database that has this capability in its SQL.

This was first published in October 2006

Dig deeper on Oracle and SQL

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

Have a question for an expert?

Please add a title for your question

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

-ADS BY GOOGLE

SearchDataManagement

SearchBusinessAnalytics

SearchSAP

SearchSQLServer

TheServerSide

SearchDataCenter

SearchContentManagement

SearchFinancialApplications

Close