
# Table design: go across or go down, continued

In reference to Pat Phelan's answer to the question of whether to "go across or go down": If you understand normalization, then you know that Mr. Phelan is completely wrong.

First normal form simply means ensuring that each attribute represents an atomic, or non-decomposable value. Both the "go across" and "go down" versions of his monthly expenses example meet this requirement.

Second normal form means satisfying 1NF and ensuring that each non-key attribute depends on the whole key. Given that the key for "go down" would be the month plus the expense type, and the key for "go across" would be the month alone, both versions meet this requirement.

Third normal form means satisfying 2NF and ensuring that no non-key attribute represents a fact about another non-key attribute. This is where the "go down" version fails. With one column representing the expense type and another column representing the amount, the expense type column represents a fact about the amount column. In other words, they are not mutually independent.

Aside from this discussion of normalization, the "go down" version is usually the worse option, for several reasons:

1) To store a set of expenses for a month, you have to do an insert for every type of expense. This is a comparatively expensive set of operations; storing the expenses for a month with the "go across" version requires only one insert.
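As a sketch of that cost difference, here is a minimal SQLite version of the two designs (table and column names are illustrative, matching the example below, not taken from any real application):

```python
import sqlite3

# Minimal, hypothetical versions of the two schemas.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE go_down (month TEXT, expense_type TEXT, amount INTEGER)")
conn.execute("CREATE TABLE go_across (month TEXT, salary INTEGER, rent INTEGER)")

# "Go down": recording one month takes one INSERT per expense type.
jan_expenses = [("salary", 500), ("rent", 200)]
for expense_type, amount in jan_expenses:
    conn.execute("INSERT INTO go_down VALUES (?, ?, ?)", ("jan", expense_type, amount))

# "Go across": the same month is recorded with a single INSERT.
conn.execute("INSERT INTO go_across VALUES (?, ?, ?)", ("jan", 500, 200))

down_rows = conn.execute("SELECT COUNT(*) FROM go_down").fetchone()[0]
across_rows = conn.execute("SELECT COUNT(*) FROM go_across").fetchone()[0]
```

With more expense types per month, the gap in insert counts only widens.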

2) Querying is considerably more expensive with the "go down" version. Here's a clarifying example. Assume the following tables:

```
go_down:
month expense_type amount
----- ------------ ------
jan   salary          500
jan   rent            200
feb   salary          300
feb   rent            200

go_across:
month salary rent
----- ------ ----
jan      500  200
feb      300  200
```
Now, as a manager, I ask: in what months was salary greater than 400 and rent greater than 180? Here are the SQL select statements for the two tables:
```
go_down:
select month from go_down
where expense_type = 'salary'
and amount > 400
intersect
select month from go_down
where expense_type = 'rent'
and amount > 180

go_across:
select month from go_across
where salary > 400 and rent > 180
```
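Both queries can be checked against SQLite. This sketch loads the sample data from the tables above and confirms that the two forms return the same answer, with the "go down" form needing two passes over its table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE go_down (month TEXT, expense_type TEXT, amount INTEGER);
CREATE TABLE go_across (month TEXT, salary INTEGER, rent INTEGER);
INSERT INTO go_down VALUES
  ('jan', 'salary', 500), ('jan', 'rent', 200),
  ('feb', 'salary', 300), ('feb', 'rent', 200);
INSERT INTO go_across VALUES ('jan', 500, 200), ('feb', 300, 200);
""")

# "Go down": two scans of the table, combined with INTERSECT.
down = conn.execute("""
    SELECT month FROM go_down WHERE expense_type = 'salary' AND amount > 400
    INTERSECT
    SELECT month FROM go_down WHERE expense_type = 'rent' AND amount > 180
""").fetchall()

# "Go across": a single scan of a single table.
across = conn.execute(
    "SELECT month FROM go_across WHERE salary > 400 AND rent > 180"
).fetchall()

# Both return only January.
```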
As you can see, the query for go_across is both simpler and more efficient than the query for go_down.

Mr. Phelan says an advantage of the go_down version is how easy it is to construct a query to get the total expenses for the month. He's right - it is easy to construct the query. Here it is:

```
select sum(amount) from go_down where month = 'jan'
```
whereas the query for the go_across example is:
```
select salary + rent from go_across where month = 'jan'
```
It is a bit easier to construct the query for the go_down version. BUT it is more efficient to run the query using the go_across version! With the go_down version, the RDBMS has to do more disk reads to get the answer.
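The two monthly-total queries can also be run side by side in SQLite; this sketch (using the same sample data as above) shows they agree:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE go_down (month TEXT, expense_type TEXT, amount INTEGER);
CREATE TABLE go_across (month TEXT, salary INTEGER, rent INTEGER);
INSERT INTO go_down VALUES ('jan', 'salary', 500), ('jan', 'rent', 200);
INSERT INTO go_across VALUES ('jan', 500, 200);
""")

# "Go down": the total is an aggregate (SUM) over several rows.
down_total = conn.execute(
    "SELECT SUM(amount) FROM go_down WHERE month = 'jan'"
).fetchone()[0]

# "Go across": the total is an expression over a single row.
across_total = conn.execute(
    "SELECT salary + rent FROM go_across WHERE month = 'jan'"
).fetchone()[0]

# Both totals are 700.
```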

Lastly, note the caveat in my statement above: the "go down" version is USUALLY worse than the "go across" version. Specifically, the "go down" version is worse when you have a limited number of "types" that can be specified beforehand. Mr. Phelan mentions MS Money and Quicken, saying that they are normalized and that they use the "go down" method. Yes, MS Money and Quicken use the "go down" method, but as we have now seen, that method is not fully normalized. In their case, though, the "go down" method is the best option, for the simple reason that the developers of MS Money and Quicken cannot know what expense types users will define and want to use when entering transactions. Every database designer should be cognizant of these pluses and minuses when designing a database.

I made this mistake in a budget application that I designed the database structure for in roughly 1986, and I cursed the design until we retired the application in 1996! This is a subtle kind of error that often sneaks up on the inexperienced data modeler, or one that has never taken a significant application from "cradle to grave" through the life cycle. Silly me, I assumed that years always had twelve months!

While it seems like you can make a choice to normalize a table in many different ways, and still reach a manageable form, that is rarely if ever the case. The "across" form of the database violates the first normal form as described at http://databases.about.com/library/weekly/aa081901a.htm?terms=normalization because it has more than one kind of expense going across the rows. Naming one column "salary" and one "rent" doesn't make them any different than "expense1" and "expense2" from the standpoint of normalization.

The fundamental issue from my perspective isn't even the normalization. Hardware is cheap. Wetware (people) is expensive! You can afford enormous amounts of disk I/O in order to support a more flexible design, because disk latency is measured in milliseconds while schema changes are measured in hours (for VERY small changes) to months (for normal-size changes). The "go down" schema will support an arbitrary number of expense types, which can be changed at runtime by defining a new type. The "go across" schema can support only whatever existed when the application was written, meaning that any new or deleted expense type requires a person to find and modify every piece of affected code!
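That flexibility difference can be sketched in SQLite terms (schema and names are illustrative, reusing the example tables from earlier): adding a new expense type is a plain INSERT in one design and a schema change in the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE go_down (month TEXT, expense_type TEXT, amount INTEGER)")
conn.execute("CREATE TABLE go_across (month TEXT, salary INTEGER, rent INTEGER)")

# "Go down": a brand-new expense type is just another row value;
# no schema change, so users can define types at runtime.
conn.execute("INSERT INTO go_down VALUES ('jan', 'utilities', 90)")

# "Go across": the same addition requires DDL, plus a hunt through
# every query and program that assumes the old row shape.
conn.execute("ALTER TABLE go_across ADD COLUMN utilities INTEGER")

types = [r[0] for r in conn.execute("SELECT DISTINCT expense_type FROM go_down")]
columns = [c[1] for c in conn.execute("PRAGMA table_info(go_across)")]
```

The DDL statement here is only the visible part of the "go across" change; the rewrites of dependent code are the expensive part.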

With an older database engine that is expensive in terms of primitive DML operations (like INSERT, UPDATE, and DELETE), you sometimes had to denormalize to accommodate the deficiencies of the engine. Older versions of DB2 and Oracle were notorious for these kinds of problems, but current versions don't carry much of a performance penalty.

A few engines also had problems with indexing, such that you had to keep row sizes small to improve performance. MS-SQL had this problem before the release of version 6.5, but for the most part this is a thing of the past too.

While there might be a small performance penalty in the execution of the SELECT, the cost in terms of application rewrites to support an additional expense type is staggering. For the folks who prefer to load their database into memory and write linked-list managers to improve performance, the difference might be significant. To a business manager who wants to write code once and then use it for months or years without a rewrite, the "go down" design is preferred hands down.
