
# Normalize a column containing a list

I've just completed an assignment for my databases class in which we were given a database and told to put it in 1NF. Several of the columns contained multiple values (e.g. 1;2;3;4), and I wanted a new row for each value in the column. I ended up projecting out the columns with multiple values and then importing them back in split up by semicolons, copying them into the original database and then using a bunch of select statements to get it down to what I wanted. I'm dying to know what a better way to do this is. If you don't understand what I'm talking about, just try to explain how to split a column that contains multiple values separated by a semicolon into separate columns.

The obvious way to do this is with a scripting language. Both Microsoft SQL Server and Oracle have comprehensive languages (Transact-SQL and PL/SQL, respectively) in which this would be both easy and efficient.

The scripting approach involves a loop, in which the target column is searched for an occurrence of the list separator (a semicolon in your case, more often a comma). In pseudocode:

```
variable tempstring varying character
set tempstring = columnvalue
while position(';' in tempstring) > 0
begin
  insert
    into normalizedtable
       ( newcolumn )
  values
       ( left( tempstring
             , position(';' in tempstring)-1) )
  set tempstring =
      substring(tempstring
           from position(';' in tempstring)+1
            for length(tempstring)
                - position(';' in tempstring) )
end
insert
  into normalizedtable
     ( newcolumn )
values
     ( tempstring )
```

The POSITION function finds the separator, or returns zero if one is not found. If a separator is found, then the substring, up to the position before the separator, is extracted with the LEFT function, and used in the insert statement to insert a new row. Then the substring just used is chopped out of the string, by setting the string equal to the remainder of the string starting one position beyond the separator, and looping continues. After looping has finished, the last part of the string is used to generate the last row.
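For readers who want to run the loop, here is a minimal sketch in Python using the standard-library `sqlite3` module. It is only an illustration of the pseudocode above, not a stored procedure: the function name `split_into_rows` and the in-memory database are assumptions made for the example, while `normalizedtable` and `newcolumn` are taken from the pseudocode.

```python
import sqlite3

def split_into_rows(conn, column_value, separator=";"):
    """Insert one row into normalizedtable for each item in column_value."""
    temp = column_value
    pos = temp.find(separator)  # -1 when no separator is left (POSITION returns 0)
    while pos >= 0:
        # Equivalent of LEFT: everything before the separator becomes a new row.
        conn.execute(
            "INSERT INTO normalizedtable (newcolumn) VALUES (?)", (temp[:pos],)
        )
        # Equivalent of SUBSTRING: keep only what follows the separator.
        temp = temp[pos + len(separator):]
        pos = temp.find(separator)
    # Whatever remains after the last separator is the final row.
    conn.execute("INSERT INTO normalizedtable (newcolumn) VALUES (?)", (temp,))

# Hypothetical in-memory database, used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE normalizedtable (newcolumn TEXT)")
split_into_rows(conn, "1;2;3;4")
rows = [r[0] for r in conn.execute(
    "SELECT newcolumn FROM normalizedtable ORDER BY rowid")]
print(rows)  # ['1', '2', '3', '4']
```

Like the pseudocode, this sketch inserts a single row for a value with no separator at all, and it makes no attempt to trim whitespace or skip empty items.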

In a database with a built-in scripting language, such as SQL Server or Oracle, this script would be saved as a stored procedure and invoked with a single call to the database. With an external scripting language, such as PHP with MySQL, control bounces back and forth between the scripting engine and the database engine for each insert. An external script, while perhaps just as easy to write, is therefore substantially less efficient, although still better than doing it manually.
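One way an external script can soften that per-insert cost is to do all the splitting on the script side and hand the rows to the database in a single batched call. The sketch below assumes Python with the standard-library `sqlite3` module; `normalize_in_one_batch` is a hypothetical helper name. SQLite runs in-process, so this shows only the shape of the idea, and how many round trips a batched call really saves depends on the database driver in use.

```python
import sqlite3

def normalize_in_one_batch(conn, values, separator=";"):
    """Split every value in the script, then insert all rows in one batched call."""
    # All string handling happens here, with no database calls in between.
    rows = [(item,) for value in values for item in value.split(separator)]
    # A single executemany call replaces one execute call per row.
    conn.executemany("INSERT INTO normalizedtable (newcolumn) VALUES (?)", rows)
    return len(rows)

# Hypothetical in-memory database, used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE normalizedtable (newcolumn TEXT)")
inserted = normalize_in_one_batch(conn, ["1;2;3;4", "5;6"])
stored = [r[0] for r in conn.execute(
    "SELECT newcolumn FROM normalizedtable ORDER BY rowid")]
print(inserted, stored)  # 6 ['1', '2', '3', '4', '5', '6']
```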

But is there a straight SQL solution, one that does not involve scripting? If there is, it would have to involve a cross join of some sort, since it must be able to generate multiple rows for each row that contains at least one separator. It will more than likely involve a cross join with an integers table, so that if, for example, there are four separators in a given column value, then five rows would be generated, probably by the cross join with integers 0 through 4.
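To make the cross join idea concrete, here is a small sketch run through Python's `sqlite3` module. It is a variant of the approach described above, not the same one: instead of joining on item numbers 0 through 4, it joins on character positions and keeps each position where an item starts. The table names `source` and `integers`, the column names, and the sample data are all assumptions for the example, and the `instr` and `substr` functions are SQLite's; other databases spell them differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, vals TEXT)")
conn.execute("CREATE TABLE integers (n INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO source (id, vals) VALUES (?, ?)",
    [(1, "1;2;3;4"), (2, "10;20"), (3, "7")],
)
# The integers table must reach at least the length of the longest list.
conn.executemany("INSERT INTO integers (n) VALUES (?)",
                 [(n,) for n in range(1, 101)])

query = """
SELECT s.id,
       -- take characters from position n up to the next separator
       substr(s.vals, i.n, instr(substr(s.vals, i.n) || ';', ';') - 1) AS item
  FROM source AS s
 CROSS JOIN integers AS i
 WHERE i.n BETWEEN 1 AND length(s.vals)
   -- an item starts at position 1 or right after a separator
   AND (i.n = 1 OR substr(s.vals, i.n - 1, 1) = ';')
 ORDER BY s.id, i.n
"""
result = conn.execute(query).fetchall()
print(result)
# [(1, '1'), (1, '2'), (1, '3'), (1, '4'), (2, '10'), (2, '20'), (3, '7')]
```

A value with four separators yields five rows, as expected. Unlike the loop, this query produces no row for an empty string, so the two approaches are not interchangeable in every edge case.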

That said, I have seen an SQL solution, one written by Joe Celko. It was truly complex, involving CAST, MAX, COUNT, SUBSTRING, and DATALENGTH functions, a double cross join with two copies of the integers table, and a GROUP BY clause. Not for the faint of heart. My advice is to use a script, or even do it by hand. After all, you don't normalize on the fly all the time, right?

This was last published in November 2002
