August 28, 2008

There are 2 ways to participate in the PASS SIG Book Review program. 

1.  PASS members can click on the "Submit your name here to review this book" link on the books available for review to send an email requesting that book.  

2.  All registered users will be able to submit a book review online for a book that they already possess.  Once the review has been posted, you will be sent another book from our Book Review inventory (subject to availability).    Not a Registered User?   No problem, you can become a registered member of the PASS Special Interest Groups at no cost!  Register now!

For more information on how to participate, please contact the PASS SIG Book Review Committee.

Current Articles | Categories | Search | Syndication

Wednesday, November 29, 2006
Inside Microsoft SQL Server 2005: T-SQL Querying by Itzik Ben-Gan
By Grant Fritchey @ 11:44 AM :: 3914 Views ::

PASS Book Review by: Grant Fritchey
Rating:  star.gifstar.gifstar.gifstar.gifstar.gif (5 out of 5)

Overview

 

The "Inside SQL Server X" series has a long history. It started with, as far as I can determine, SQL Server 6.5. I started reading the series with "Inside SQL Server 7" and continued on to the 2000 version. Now that single volume has split and I'll be reading several books. To my knowledge, there are four planned with three released. This is the first. As with all the previous books in the series, this will be a standard book on most DBA's desktops. Itzik Ben-Gan of Solid Quality Learning has taken the Advanced T-SQL course that he teaches and turned it into a fantastic book.

 

The first two chapters cover the logic of query processing and what physically occurs when you run a query. Very early on Itzik Ben-Gan establishes a practice of defining a problem or a question about T-SQL and then spends time working through the solution. This approach carries through the entire book. This not only allows the user to see the problem solving approach in action, but enables the user in pursuing the solutions themselves. Ben-Gan takes the time to lay out a very detailed methodology for identifying and tuning problem queries from the server, through the database, down to the query that's causing the problems. From these foundation chapters he explores topics such as Subqueries and Table Expressions, Joins and Set Operations, Aggregating and Pivoting Data, Top and Apply, Data Modification and finally Graphs, Hierarchies and Recursive Queries.

 

The greatest thing about the book is that Ben-Gan really explores his solutions. He walks the user into a mind set that accepts only set based operations as the norm for data access.  He shows multiple solutions to many of the problems and outlines where and how those solutions differ. This book acts as a good mechanism for actually learning the details of how to use T-SQL withing SQL Server 2005. He also points out where there are differences from 2000 and supplies 2000 only solutions to a lot of the problems, which can help those that need to maintain two code bases. He regularly uses and addresses 2005 functionality so there is zero indications of this being a 2000 book warmed over for re-publication.

 

Make no mistakes here though, this is a high-end book. If you're just getting started in T-SQL, you may want to start out somewhere else. If, after getting down the concepts of SELECT.. FROM… WHERE… GROUP BY… ORDER BY, etc., then come back here and learn what happens and the right reasons to do the right things with the scripts that you write.

 

Target Audience

 

I always like to read the Forward and Preface of a book to see if they outline who the target audience should be. In this case, it’s not really covered. Instead, we launch immediately into the power and scope and depth of the writing and the scripts contained. That pretty much answers the question for me. This book is for the more advanced user. You are no longer writing simple “SELECT * FROM TABLE x” queries to help people generate a report. You have moved on from a single INNER JOIN to multi-table JOINS using keys as well as constraints. In other words, you are not a beginner, but you’re an intermediate developer or database administrator who needs to increase their understanding of the medium they’re working with. This book is what you need.

 

Strengths

 

All of it. No, I’m serious. There’s too much here to single individual things out. What I will point out as the principle strength of the book is the approach taken. Ben-Gan has designed the book to work like his class. He shows you a common problem and then shows you the common solution. He then shows what’s wrong with the common solution and offers a much better one. After that, he starts pointing out the weaknesses with the solution he just offered and lays out an even better approach. Sometimes he keeps going. Through it all, he’s expanding your understanding by applying the solutions to a problem in such a way that you learn not only the syntax of what he’s teaching, but when and how to apply it. One other strength is that some of the work was done with others, including Lubor Kollar, one of the developers for the SQL Engine.

 

Weaknesses

 

There’s a lot of information here. It will take a long time to go through it all. What’s more, if you’re not terribly bright, like me, you may have some problems seeing how to apply some of these solutions to your own problems if your problems are different than the ones within the book. However, this can largely be overcome by working your way through the problems in the book. Take your time.

 

Chapter Breakdown

1 Logical Query Processing

This chapter can be summed up here:

(8) SELECT (9) DISTINCT (11) TOP

(1) FROM

(3) JOIN

(2) ON

(4) WHERE

(5) GROUP BY

(6) WITH

(7) HAVING

(10) ORDER BY

These are the order and steps that the query optimizer uses within SQL Server 2005. Ben-Gan goes into a lot more detail as to why the order is this way and what it means to data as it goes through this order. He takes a sample query and walks the data through all the logical steps, showing what happens to the data as it is joined (INNER & OUTER, not surprisingly, behave very differently) and the WHERE clause is applied, etc. He introduces new logical processing steps from 2005; APPLY, PIVOT, UNPIVOT, OVER, EXCEPT and INSERSECT.

2 Physical Query Processing

This is the chapter written by Lubor Kollar, one of the developers of the relational engine for 2005. I think the summary is best from the book, “While the previous chapter described what outcome a query execution result should produce, this one will explain how Microsoft SQL Server 2005 attains that outcome.” Mr. Kollar follows Ben-Gan’s approach and uses a query right from the start to show how the engine handles the processing of the data. Surprising for some may be the idea that the optimizer can, and does, ignore the logical steps if it can find faster ways to do things. Kollar takes us through compilation and the algebrizer, a new part of the 2005 engine replacing the normalizer in 2000, and into optimization. This is where the real magic occurs. We’re introduced to the concepts of trivial plans, situations where there is only a single possible plan, so the optimizer basically doesn’t fire. This is where the book introduces the show plan options including the new options in the graphical show plan and the all new XML show plan, which introduces a lot more detailed information in the plans. While written primarily by someone else, this chapter fits seamlessly with the rest of the book and maintains the exact same high quality.

3 Query Tuning

An odd place to put a tuning chapter, right at the beginning, but it’s explained by Ben-Gan. He lays out a method of tuning based on waits on the server. It may not be the exact approach most of us take, but it makes a great deal of sense. He walks through using Performance Monitor and Profiler to gather data to determine a course of action. One of the best parts of the chapter is when explains how to take the output from Profiler and clean the queries such that you can perform aggregates on the data. As he does throughout the book, Ben-Gan offers multiple solutions to this problem using sets, not cursors. He details the tools you’ll use to tune queries and goes over the graphical show plan in more detail than I’ve ever seen. A large part of the chapter covers how indexes work, how they store and access data. The chapter concludes with a detailed examination of set-based vs. iterative processing.

4 Subqueries, Table Expressions, and Ranking Functions

Subqueries are old hat in SQL Server now, but ranking functions and Common Table Expressions, CTE’s, are not. This chapter goes over the old and the new and really delves into the new. CTE’s are derived tables that get defined prior to a query. They’ve got a number of performance advantages over regular derived tables and Ben-Gan goes through all these advantages. They also make query code much cleaner. CTE’s are recursive, which changes how you do hierchies and trees in SQL Server 2005, but that topic has a whole chapter reserved to it later. The analytical ranking functions are somewhat interesting except for the new ROW_NUMBER function which is hugely powerful. It’s so powerful that Ben-Gan uses it in solution after solution throughout the book. It creates row numbers to sets of data with a defined order and can be done across partitions. This makes jobs like eliminate duplicate rows, finding islands and gaps, and various ordering tricks extremely simple, elegant and fast. You’ll be referring to this chapter fairly often as you read the rest of the book and find implementations in your own code.

5 Joins and Set Operations

This is a fairly standard subject that is given a great deal of attention since most of us are writing queries that access multiple tables. Traditional CROSS, INNER and OUTER queries are explored. UNION is covered. The new concepts of EXCEPT, which shows rows that appear in the first set, but not the second, and INTERSECT, which shows rows that appear in both sets. Ben-Gan covers the algorithms for joins, hash, merge and loop, at a low level so you can understand what’s used, when and why. This allows you tell, based on your own data and queries whether you need to worry about that hash join or not, a very useful tool when tuning queries.

6 Aggregating and Pivoting Data

Few of the queries that I write on a daily basis involve data aggregation, so, in terms of my immediate usage, this chapter has less utility. However, if you are writing those types of queries then the coverage of the new OVER clause, which allows you to “request window-based calculations-that is, the calculation is performed over a whole window of values.” Anyway, it allows for calculations on grouped data as a set and it screams in performance compared to any other method I’ve seen for doing the same thing. Running Aggregations, pivoting, custom aggregations and histograms are covered. All these use new approaches such as ROW_NUMBER and OVER as well as the old standards of GROUP BY to get the job done.

7 TOP and APPLY

You wouldn’t think that a topic as simple in concept as TOP would get half a chapter devoted to it, but when you see what can be done with the TOP operator, especially when you include the new capability of using expressions in the TOP statement (which also now requires parentheses in the syntax: TOP (‘some expression’)). The new concept APPLY, “a table operator that applies the right-hand table expression to every row of the left-hand table expression”, is introduced and explored in depth. I’ve been able to put this to immediate use in our queries to realize a three-fold reduction in resources, twice as fast compile and execution times. Ben-Gan goes way past the simple examples show in the Books Online of a table function getting called.

8 Data Modification

This chapter probably is probably the least ground shaking of all the chapters in the book. However, even here, as we explore inserts, deletes and updates at a very low level, new and better methods using SQL Server 2005 functionality are introduced which will affect how you write your TSQL code going forward. New concepts such as the OUTPUT key word are explored. This is a pretty simple concept with a lot of implications for usage. Basically you get access to the inserted and deleted tables that show the new data coming in and the old data going out. The first example Ben-Gan goes over is getting all the identity values from a multi-row insert, something not possible previously without extra queries. He also spends a bit of time going over performance considerations around getting data into your system.

9 Graphs, Trees, Hierarchies, and Recursive Queries

With the introduction of CTE’s in 2005, the methods for accessing data to create an Org Chart or a Bill of Materials have radically changed. Ben-Gan explores these examples and more, in depth, bringing a large array of the new functionality to bear to solve these problems in ways that will radically change your applications going forward. The new functionality allows for set-based access to data that previously required cursor logic, either in the database or in the application code. I found this chapter to be one of the densest and most difficult to read in the book. Many concepts, not just T-SQL, around hierarchies, sets, trees, graphs, etc., are introduced and explored through very complex T-SQL. The time spent on this chapter will bear fruit for anyone that needs to support the types of applications that access data in this manner. Recursive queries using CTE’s are a common sight throughout the chapter.

A Logic Puzzles

Personally, I hate logic puzzles. However, there are amazing T-SQL solutions to problems that should never be solved in T-SQL. The appendix is taken from Itzik Ben-Gan’s column in SQL Server Magazine. It’s worth a look.

Perspective of Reviewer

 

I work primarily as a database developer, doing database design from logical models and business requirements and then developing stored procedures, etc., for accessing data from those databases. This kind of information is absolutely vital to doing the best job I can. I’ve actually attended a week long class with Itzik, twice. It’s a constant education and that, like his book, is an important part of the quality of my work. I think I was predisposed to like this work, but I’ve actually been picking it up during the day to refer back to points that were unclear to me or that I want to better communicate to others, especially from the first two chapters.

Comments
Currently, there are no comments. Be the first to post one!
You must be logged in to post a comment. You can login here
Untitled-1
Featured Publishers
 
 
 
 
Copyright 2005 Professional Association for SQL Server (PASS)