Problem solve Get help with specific problems with your technologies, process and projects.

Extracting a part number from a text field

Our database contains a text field that users input information in that is not in a specific format. We need to extract data (specifically a part number) from this field for reporting purposes. What would be the best method to extract this information? The part numbers always begin with a 0, 1, 5 or 7 and are 7 characters in length.

If the text field is exactly 7 characters long, then it either contains a part number or it doesn't --

     select textfield
       from yourtable
      where substring(textfield from 1 for 1)
            in ('0','1','5','7')

If it's longer than 7 characters, you're in trouble. You want my honest answer? Write a program. In robust databases like Oracle and SQL/Server, the program can be a stored procedure. But do yourself a favour and don't try it in SQL.

To get an idea how tough this is, consider that the SQL has to extract a 7-character substring from somewhere within the text field, starting at the first position within the text field where a certain character is found. The problem is, you don't have too many tools -- expressions and functions -- at your disposal for this kind of processing.

Let's start by extracting part numbers that begin with a 5 --

     select substring( textfield
                  from position('5' in textfield)
                   for 7 )
       from yourtable
      where position('5' in textfield) > 0

The WHERE condition ensures that you actually do find a '5' somewhere in the text field. That's important, because the POSITION function returns a 0 if it doesn't find what it's looking for, and then the SUBSTRING function would end up taking the first six characters of the text field (since position 0 is outside of the string).

However, the above query will still produce incorrect results. Consider the following rows --

    row   textfield
     1    part number 5093705
     2    5234567 is the part number
     3    i wanted 5 widgets, fred, not 3
     4    6365937 is the part number

The query above will return

      5 widge
      5937 is

So perhaps you need to look for ' 5' instead, i.e. with a leading blank. But that wouldn't find the 2nd row above, so you'd have to test for that condition separately...

     select substring( textfield
                  from position(' 5' in textfield) + 1
                   for 7 )
       from yourtable
      where position(' 5' in textfield) > 0
  union all
     select substring( textfield
                  from 1
                   for 7 )
       from yourtable
      where substring( textfield 
                  from 1
                   for 1 ) = '5'

This would still pick up rows 3 and 4 (incorrectly). And that's only the beginning of our problems, because we've only done part numbers beginning with a 5.

What you really want is a stored procedure that can loop through the text field character by character. PL/SQL and Transact-SQL can do this -- contact your resident database programmer.

For More Information

  • The Best SQL Web Links: tips, tutorials, scripts, and more.
  • Have a SQL tip to offer your fellow DBA's and developers? The best tips submitted will receive a cool prize--submit your tip today!
  • Ask your technical SQL questions--or help out your peers by answering them--in our live discussion forums.
  • Ask the Experts yourself: Our SQL, database design, Oracle, SQL Server, DB2, metadata, and data warehousing gurus are waiting to answer your toughest questions.

Dig Deeper on Oracle and SQL

Start the conversation

Send me notifications when other members comment.

Please create a username to comment.