PASS Book Review by: Tracy McKibben
Rating: 



(5 out of 5)
Overview
If you are a developer or software architect responsible for writing SQL queries, or a manager overseeing a group of such people, this book is an absolute must-read. There is both a science and an art to writing SQL code. As the author states, nearly any developer today can write functionally correct (the “science”) SQL code, but few can write efficient (the “art”) SQL code. With the ever-increasing volumes of data, and the reliance on that data for making business decisions, queries must not only be correct, but they must be fast. Regardless of the platform, it is becoming imperative that developers know and understand how to write the fastest, most efficient SQL queries possible. They truly need to turn SQL coding into an art form.
Who Will Benefit:
The book is targeted at developers with one year (or more) of database experience, their managers, and the software architects responsible for designing the database components. I would broaden the target audience to include ANYONE who writes SQL code for a living, including experienced DBAs and seasoned developers. I’ve been writing SQL for almost ten years, and I found the book to be an excellent refresher, and even found new insight in a couple of topics. It is important to note that this is NOT a book of problems and solutions, it is a book intended to help you understand SQL techniques, hopefully avoiding problems before they occur.
Chapter Breakdown:
Chapter 1: Laying Plans – Designing Databases for Performance
Speaking to the developer who is new to database design, this first chapter discusses the familiar topics of normalization, atomicity, keys, and attributes. It also discusses briefly the need to consider the accumulation of historical data and how that data will affect performance. As the chapter title indicates, the overall discussion focuses on designing a database with performance in mind, now and in the future.
Chapter 2: Waging War – Accessing Databases Efficiently
This chapter stresses heavily the need to allow the SQL query engine to do its job. One of the hardest things for a developer new to SQL is to make the shift from procedural programming to set-based operations. The authors try to drive home the point that the query engine is extremely powerful, and very good at figuring out the best way to get the data you’re asking for. The focus is on set-based operations, minimizing trips to the database, and telling the query engine what you want, letting it decide the best method of retrieving it.
Chapter 3: Tactical Dispositions – Indexing
This is a concise but informative chapter, illustrating the effect that indexes have on data access and modifications. The focus is on how proper indexes can greatly improve access to data, but that excessive indexes can negatively effect updates of that same data.
Chapter 4: Maneuvering – Thinking SQL Statements
This is probably the most important chapter in the book. It begins by defining the boundaries between reporting requirements, relational theory, and implementation, and the SQL language and the query optimizer relate to these. The reader is encouraged to understand the various factors that affect query performance, the need to filter the result set as quickly as possible, and is introduced to the issue of concurrency. Subqueries are also presented as an alternative method of filtering data. You can view this chapter on the O’Reilly web site at http://www.oreilly.com/catalog/artofsql.
Chapter 5: Terrain – Understanding Physical Implementation
Indexes are again discussed in this chapter, this time the reader is given an explanation of how indexes can serve as alternative physical storage of data, and how clustered indexes affect data storage and retrieval. Data partitioning is also covered, both pros and cons of various partitioning schemes.
Chapter 6: The Nine Situations – Recognizing Classic SQL Patterns
This chapter attempts to coach the reader to recognize the nine basic “patterns” of data access, such as the need to scan or filter a large data set in order to obtain a single row. Also discussed are various methods of joining tables, including self-joins, encouraging the reader to look for less-than-obvious methods of joining related tables for filtering and aggregation.
Chapter 7: Variations in Tactics – Dealing with Hierarchical Data
The discussion here is geared towards representing hierarchical tree structures in a SQL database. The three accepted methods of tree representation are presented (adjacency model, materialized path model, and the nested set model), pros and cons of each are discussed, as are excellent examples of when to use each method, and the performance aspects of each.
Chapter 8: Weaknesses and Strengths – Recognizing and Handling Difficult Cases
Using the proper tool is essential to completing any task, and working with databases is no exception. This chapter demonstrates some cases where using pure SQL is not always the best approach. The use of other programming languages, such as PHP, to build SQL queries is covered, as is the use of “dynamic” queries.
Chapter 9: Multiple Fronts – Tackling Concurrency
Any multi-user database application will have concurrency issues, this chapter explains various techniques for handling those issues, as well as methods to reduce or prevent them. Locking and transactions are covered in detail. Additional coverage of indexes is also provided, along with discussions about hot-spotting, primary keys, and the effects of architectural design on concurrency.
Chapter 10: Assembly of Forces – Coping with Large Volumes of Data
In the real world, databases grow larger and larger, usually leading to one performance problem after another, due to poor design and poor planning in the initial phases of development. The focus of this chapter is on dealing with these performance problems, and also on doing the proper planning to prevent them from occurring in the first place. The reader is introduced to data warehousing and pre-processing techniques, fact tables and dimensions, and ETL processes.
Chapter 11: Stratagems – Trying to Salvage Response Times
As I stated earlier, this is NOT a book full of problems and solutions. This particular chapter deviates from that a bit, by providing examples of some common problem queries, and alternate ways of achieving the same results. Pivot tables, attribute tables, concatenated fields, and all of their related headaches are offered up, with explanations of why each is to be avoided, and how to avoid them. Optimizer hints are also discussed.
Chapter 12: Employment of Spies – Monitoring Performance
The final chapter of the book describes methods of detecting problems proactively, by using various monitoring tools to establish a performance baseline, determining what is “normal” for your environment. Analysis of execution plans is discussed. The reader should finish this chapter understanding exactly what to do when the “database is slow” complaints start coming in.
Pass Book Review by:
Tracy McKibben
tracy@realsqlguy.com
http://www.realsqlguy.com
From the Publisher: O’Reilly Media, Inc.
ISBN: 0596008945
Published: March 2006
Pages: 349
Publisher’s Price: $44.99