Hope you can help here. I have an application that I am designing that will list the results of a series of golf tournaments. The database schema is as follows:
dates (table) record_id (PK; auto) date date_writeup_text players (table) record_id (PK; auto) player_name player_gender player_age divisions (table) record_id (PK; auto) division_name division_abbr scores (table) record_id (PK; auto) date_record_id player_record_id division_record_id score
First off, is the normalization correct here? Too much or too little? Suggestions?
Secondly, I want to create a points system that will calculate points based on each date. You get one point for participation, four points for winning your division and two points for each player you finish ahead of. Then I need to display the results by division. I've got it to the point you can add/edit/delete any of the categories, and it displays the basic data - scores by date, but I can't get my mind around how to display scores for a given date by division (where the division is listed only once like a heading). I'm not asking anyone to write it for me, just give me some pointers since I am a struggling mid-level hacker who wants to learn a bit more.
The normalization is fine. The SCORES table does not need its own surrogate (autonumber) key; the primary key should be the composite of date_record_id, player_record_id, division_record_id, and of course each of those is a foreign key back to its respective table. Do not use the reserved word DATE to name the date field; choose something like tournament_date.
The query to assign points based on your system is:
select p.player_name , ( select count(*) from scores where player_record_id = p.record_id ) * 1 -- tournaments + ( select count(*) from scores s where player_record_id = p.record_id and score = ( select min(score) from scores where division_record_id = s.division_record_id ) ) * 4 -- divisionwins + ( select count(*) from scores s inner join scores s2 on s.division_record_id = s2.division_record_id and s.player_record_id <> s2.player_record_id and s.score < s2.score where s.player_record_id = p.record_id ) * 2 -- finishaheadof as totalpoints from players p order by 2 desc
You mentioned you wanted no one to write this query for you, but sometimes seeing the result is more helpful than reading a long explanation.
The query uses scalar subselects. Each subselect selects rows to look at based on matching the value of the player_id from the players table in the main query. In the main query, the players table has as alias, p, which you can see being used inside each subselect, in what looks like a join condition. They are called correlated subselects because of this matching. It almost looks like each subselect is "executed" over and over, once for each row of the main players table. Perhaps there is some appeal in imagining correlated subselects to work that way, but it is a poor analogy, because the database optimizer almost always works more efficiently than that.
The important point, though, is that each subselect returns a single value (hence the name scalar) for each row of the main query.
As for your question about listing data by division, and only showing the division once, like a heading, my advice is to avoid completely even thinking about attempting to achieve this with SQL. Do report formatting in your application program.
For More Information
- Dozens more answers to tough SQL questions from Rudy Limeback.
- The Best SQL Web Links: tips, tutorials, scripts, and more.
- Have an SQL tip to offer your fellow DBAs and developers? The best tips submitted will receive a cool prize. Submit your tip today!
- Ask your technical SQL questions -- or help out your peers by answering them -- in our live discussion forums.
- Ask the Experts yourself: Our SQL, database design, SQL Server, DB2, object-oriented and data warehousing gurus are waiting to answer your toughest questions.
This was first published in February 2004