My book table has 4 instances of co-authored books, but I cannot retrieve them. So in other words, I'm trying to list the book (id, title, and cost) and author (id, name) and publisher (id, name) info for all co-authored books in the collection. This is what I have so far to join the tables together:
SELECT book.id, book.title, book.cost, author.id, author.name, publisher.id, publisher.name FROM book, author, publisher, writtenby WHERE book.id = writtenby.bookid AND writtenby.authorid = author.id AND book.publid = publisher.id;
I'm not sure where to go from here to get the four rows.
Whenever a query involves a number of tables, I find it helpful always to start writing the join with the most restrictive table first.
For example, if you are writing a query to retrieve all the books from a specific publisher, the restriction in this query would be the specific publisher. Think of it as the difference between:
Retrieve all books, then retrieve the publisher for each book, and then throw away both the book and the publisher if the publisher isn't the right one.
Retrieve the specific publisher, then retrieve only the books for that publisher.
Does that make sense? See the difference?
Okay, so my rules for writing the FROM clause are:
Start with the most restrictive table first.
If the first table has any many-to-one relationships, join those tables next. Specifically, for each row of the first table, there is at most one row of the joined table(s).
If any of the tables thus far joined has any one-to-many relationships, you can join only one of these. Otherwise, if you join more than one one-to-many table, you get cross join effects.
Continue by joining many-to-one tables as necessary.
So, in your query, the table you want to start with is books. Why is this most restrictive? Because you want only co-authored books, and there are many fewer co-authored books than those written by a single author.
Next, join to publishers, because a given book has just one publisher. (If a book has two publishers, it isn't the same book. It might have the same name, but it'll have a different ISBN.)
Now you can join to writtenby, which is your books-to-authors relationship table. Specifically, for a given book, it is a one-to-many relationship.
Finally, you can join to authors, because from writtenby to authors, it's a many-to-one relationship.
This should all make a lot more sense if you were to abandon the old style FROM clause comma-delimited list of tables with WHERE conditions, and started using JOIN syntax:
SELECT book.id , book.title , book.cost , author.id , author.name , publisher.id , publisher.name FROM book INNER JOIN publisher ON publisher.id = book.publid INNER JOIN writtenby ON writtenby.bookid = book.id INNER JOIN author ON author.id = writtenby.authorid
Doesn't that look better? The intent is certainly clearer, I think.
Now the only remaining task is to restrict the first table, the books table, to only co-authored books. That requires a subquery:
WHERE book.id IN ( SELECT bookid FROM writtenby GROUP BY bookid HAVING COUNT(*) > 1 )
The subquery is uncorrelated, which means it can be executed before the main query, to produce a list of book ids for only co-authored books. Thus the main query is most efficient because it retrieves the publisher and author data only for those books.
Dig Deeper on Oracle development languages
Related Q&A from Rudy Limeback
Read an example of an SQL case expression from our SQL expert Rudy Limeback. Continue Reading
Read SQL expert Rudy Limeback's advice for counting combinations in a table with SQL's GROUP BY clause Continue Reading
Read about the Mimer Validator, a tool used to verify your SQL code, in this tip from SQL expert Rudy Limeback. Continue Reading