August 28, 2008

There are 2 ways to participate in the PASS SIG Book Review program. 

1.  PASS members can click on the "Submit your name here to review this book" link on the books available for review to send an email requesting that book.  

2.  All registered users will be able to submit a book review online for a book that they already possess.  Once the review has been posted, you will be sent another book from our Book Review inventory (subject to availability).    Not a Registered User?   No problem, you can become a registered member of the PASS Special Interest Groups at no cost!  Register now!

For more information on how to participate, please contact the PASS SIG Book Review Committee.

Current Articles | Categories | Search | Syndication

Thursday, May 18, 2006
Professional SQL Server 2005 Integration Services - by Brian Knight, Allan Mitchell, Darren Green, Douglas Hinson, Kathi Kellenberger, Andy Leonard, Erik Veerman, Jason Gerard, Haidong Ji, Mike Murphy
By sig admin @ 7:21 PM :: 6405 Views ::

PASS Book Review by: Grant Fritchey
Rating:  star.gifstar.gifstar.gifstar.gifstar.gif (5 out of 5)

Overview

 

As of this review, this is absolutely the best book available on SQL Server Integration Services (SSIS). Of course, as of this writing, this is the only book based on the released version of SQL Server 2005 available. The good news is, this is a great book. SSIS was, until late in the release cycle, referred to as DTS and is still referred to by most people as the DTS replacement. When you consider just how fundamentally different this application is from DTS, that comparison breaks down quickly. These differences are so far reaching that, if anything, deep knowledge of DTS could act as a detriment for the user. This book walks the user completely through the creation of SSIS packages from the dropping the first flow control object to writing your own transforms including a GUI to a concluding chapter that builds an entire migration process from a set of business requirements.

The basics are covered quite well. Advanced topics, tips & tricks, performance bottlenecks and benefits are all covered. Some best practices for development and design of your own SSIS packages are also introduced, especially in the final chapter. I’ve got a first printing and there were a few errors, but nothing egregious. The Wrox website has a discussion group for the book and a list of errata that’s worth a visit. The approach taken by the authors, reflective of experience and knowledge of enterprise development and deployment, makes the book extremely useful. The entire book treats the reader like a peer who simply needs information, rather than a simpleton who needs his hand held through the most basic mouse operations. It truly makes the book a joy to read. When this gets added to the amount of information contained within, this book becomes a must for the book shelf of anyone supporting SSIS or developing in SSIS.

 

Who Will Benefit

 

Basically anyone who needs to move data. If you’re a DTS expert, you still need to read this book, because it’s a whole new world in SSIS. Developers and DBA’s who have ETL needs can and will benefit. It is a Wrox book, so some fundamental knowledge is assumed such as writing SQL queries, building a table, logging on to the server, etc..

 

Chapter Breakdown

 

1: Welcome to SQL Server Integration Services

 

A quick run through the basics covering topics like the Import/Export Wizard, Business Intelligence Development Studio (aka Visual Studio), and SSIS Architecture. The new architecture consists of a Package which contains tasks in a flow control linked by precedence constraints. One of the most import flow control tasks being Data Flow task. Data now moves from one or more data sources through a series of transformations to one or more destinations. Sources & Destinations are simply the places where the data is coming from or going to, databases, spreadsheets, XML files, flat files, etc. Transforms are where the data gets manipulated in a variety of fashions including Lookup, Merge, Fuzzy Grouping, Pivot, and all the rest. Error Handling and Logging have been radically expanded & refined from the old DTS equivalents. This chapter touches on all of it, quick & light.

 

2: The SSIS Tools

 

This chapter starts off with how & when to use the Import/Export Wizard. Enough said. The rest of the chapter introduces you to the BI Development Studio and building your first Package. Using the first package as mechanism, most of the windows and tool bars of the BI Development Studio are introduced and explained. Basics such as grouping tasks and creating Connection Managers are covered along with more advanced topics such as Variables, which have come a long way since the old days of DTS (a phrase that could be overused when comparing SSIS & DTS, so I’ll try not to mention it any more). Other, more advanced topics, such as Event Handlers & the Package Explorer are also brought out.

 

3: SSIS Tasks

 

Tasks are the units of work in the Control Flow space in SSIS. The authors take the time to explain the fundamentals of the Control Flow at the same time, they introduce some of the most commonly used tasks such as “Execute SQL,” “Bulk Insert,” “File System,” and “Script.” Most of the other tasks are covered with examples, not simply BOL descriptions.

 

4: Containers and Data Flow

 

Containers are introduced with a nice little package that uses the “For Each Loop” container to work with some files, stepping through each file in a directory, a very common task in most systems. The Data Flow task, where the majority of data manipulation will take place, is dug into for the first time, with most of chapter working on Data Sources and Destinations. The transforms are covered in a, frankly, monotonous listing of each and what they do. The data flow example at the end of the chapter redeems the boring part and details one of the cool new toys called Data viewers, which allows you to view the data within a datapipe on the fly as part of debugging.

 

5: Creating an End-to-End Package

 

Having completed the introductions, the book jumps write into the meat, walking the user through a realist, if simple, data load process. This package is followed by a second that steps up the complexity by FTP’ing files down with some simple data scrubbing, which allows the more drill down on data transforms such as “Conditional Split,” “Lookup” and “Union All.” Yet another cool new toy called “error handling” gets its first run in this chapter. All kidding aside, errors in DTS were a horror show, but SSIS now allows errors to be dealt with and error data can be captured or cleaned.

 

6: Advanced Tasks and Transforms

 

Some of the best advice in the book is right at the start of this chapter: “In a lot of ways you are going to have to restrain yourself in your approach to SSIS package development. It is not necessary to start every new package by opening up an ActiveX script task and using brute force to get things done. In fact, if at first you find yourself in the guts of a script task, just stop.” This, and a bunch of the information around it, is at the start of the chapter, in a section of the book, that, if I weren’t reviewing it, I would have skipped. Don’t. Its little details like this, scattered all over the book, that really make this a must have, not just all the big picture items & code samples covered. The chapter drills down on variables and their use with expressions to make package execution dynamic and configurable. One of the samples uses SQL output parameters to change runtime settings. The next sample drills down on import column including its use with files iterated from a directory. Other more advanced topics are covered in greater detail such as OLE DB Command, Term Lookup, Export Column, Fuzzy Lookups & Grouping, Pivots and Slowly Changing Dimensions. Each topic goes over a few pitfalls & some best practices for using the various objects. This is the longest chapter in the book except for the final chapter which is a complete development project walkthrough reflecting the book’s overall emphasis on important details.

 

7: Scripting in SSIS

 

The chapter on scripting does a fine job of walking through the basics on creating scripts, when, how and where to use them, and best of all, when not to. Accessing variables, logging and debugging all receive some attention. The ability to expand your scripts by using other .Net assemblies, one of the major enhancements to scripting in SSIS, gets addressed. Error handling, also enhanced in SSIS, is explained. Examples include using a script as part of the data pipe, which can be a transform, a source or a destination.

 

8: Accessing Heterogeneous Data

 

I found this chapter a bit dull, although it definitely keeps the same high standards found throughout the book. Getting data out of Access or from Web Services just isn’t the type of thing I do. However, in the Oracle section, it listed a KB article on the Microsoft web site (244661 for the curious) that directly addressed an issue we had run into. Again, the details available in the book are what really makes it good, not just it’s general approach to the broader topics.

 

9: Reliability and Scalability

 

Having a chapter devoted to these topics rather than simply addressing the ability to restart packages once more shows the importance that the authors placed on using this tool in the real world instead of simply a rehash of the Books Online. Checkpoints, yet another SSIS enhancement, are introduced, their use explained, and their shortcomings explored. The examples, this time very simple and not business oriented, work well to explore the variations of behavior that checkpoints will take on in your SSIS project. Package Transactions using DTC get a thorough treatment. The section on Scaling was a very thorough exploration of the options of adding servers or staging data with the pluses and minuses of each well documented.

 

10: Understanding the Integration Services Engine

 

More details to the underpinnings of SSIS, which can help you design & distribute better packages in general. Most of the time was spent in the Data Flow explaining the mechanisms of the data pipeline. The chapter did a great job of explaining the buffering mechanisms of the pipeline and going over which transformations will be blocking, semi-blocking and non-blocking. It explained in detail the synchronous versus asynchronous transformation outputs and broke down how to determine the number of execution trees. All of this will be useful data once you get anywhere beyond running the Import/Export wizard.

 

11: Applying the Integration Services Engine

 

This chapter takes the concepts explained in the last chapter and applies them to designing higher functioning SSIS packages. They take the time to explain how DTS would have handled an ETL process and compare that to how SSIS handles it, including performance statistics (174 seconds to 81 seconds for the curious). Again, the emphasis on real world requirements leads the authors to take the time to explain that, in all circumstances, the pipeline method of data manipulation, while fast & agile, isn’t necessarily the only answer. Situations that require staging data are explained in detail. This chapter also goes through some basic performance tuning best practices for SSIS developers and finish up with some discussion on the new Performance Monitor counters added to the SSIS service.

 

12: DTS 2000 Migration and Metadata Management

 

The book details how the migrate DTS packages to SSIS, but here’s one area where, in an effort to describe how things work, the book deemphasizes reality. For example, I’ve tried migrating six different DTS packages from what I thought of as very simple to overly complex, and didn’t get a single one of them to actually migrate. The authors do detail how DTS packages that can’t be converted will be run in SSIS, which is as DTS packages running through a task called “Execute DTS 2000 Package.” What the chapter describes as metadata management I would have described as deployment, but it does go into the two methods for deployment (I mean metadata management), msdb and file system, and how to control the packages from with the SQL Server Management Studio.

 

13: Error and Event Handling

 

This chapter is a must read. It does a great job taking you through precedence constraints and all the new wrinkles there. It moves on to trapping errors, logging, and event handlers. All the examples were clear and easy to follow and there are a lot of best practices built into the chapter that you can use in your own development and deployments.

 

14: Programming and Extending SSIS

 

While I read through this and the following chapter, I wasn’t able to compile the examples because I still don’t have Visual Studio 5 installed. The examples and explanations seem very clear and I’ll be going back to these chapters soon.

 

 

15: Adding a User Interface to Your Component

 

Nothing else to add beyond what I said about the last chapter.

 

 

16: External Management and WMI Task Implementation

 

First half of this chapter is how to control SSIS from .Net and I’m still facing the same problem, no VS 5.0. The example is clear and easy to follow and I’ll be compiling it, with reference to the book in order to get a good handle on how we might add SSIS to our other applications. More of this chapter goes into the details on package logging and the methods & options associated. These are also programmable externally. The Package Configurations are reviewed in this chapter, but unfortunately, for such an important topic, they are somewhat glossed over, to the point where I went away from this chapter with an idea of how they worked that was 100% incorrect. Rushabh Mehta of Solid Quality Learning set me right. The last part of the chapter details WMI interactions with SSIS including a thorough example.

 

17: Using SSIS with External Applications

 

This is a short little chapter, but it hits a few good points on integrating your SSIS packages with other applications such as Reporting Services, MS Office applications and ASP.NET apps. The flexibility built into SSIS is well displayed by the examples in the chapter.

 

18: SSIS Software Development Life Cycle

 

The chapter starts off with a description of software development life cycles including waterfalls & iterative styles. Then it moves into the meat, integration with source code control. Yes, you can check your .dtsx files into VSS or Team Foundation Services where these XML file are treated just like the other source code instead of big, blind binaries. The book details how VSS interfaces with the BIDS IDE and how to check packages and solutions and their attendant objects into & out of VSS. It also details Team Foundation Server and Team System, the new source code control & process management tool from MS. I couldn’t go through these examples at the time because I didn’t have that tool installed, but we’re migrating to it soon so I’ll be revisiting this part of the book to ensure I don’t miss any of the details the authors are kindly providing. The author of this chapter also spends some time with MSF Agile. I’d love to see even more time spent on agile methodologies and approaches specifically geared to SSIS, but there were very few on the display in the book which concentrated mainly on the BIDS interaction with TFS. For example, how exactly could you incorporate test driven development into SSIS development? I doubt it’s possible, but I’d be curious what some of the experts think about it. The chapter finishes up going over the deployment wizard.

 

19: Case Study: A Programmatic Example

 

This is biggest chapter in the book and I could spend a lot of time writing about it. However, a summary will suffice because any information I try to give you beyond “work through the examples in this chapter if you do nothing else with this book” will really be wasted space. Basically they define a company and a set of business needs and then go through the tedious process of building packages to answer the business needs. It’s very thorough. This chapter is worth the cost of the book.

PASS Book Review by: Grant Fritchey 
Rating:
  star.gifstar.gifstar.gifstar.gifstar.gif (out of 5)
From the Publisher: Wrox 
ISBN: 0-7645-8435-9
Published: January 2006
Pages: 692
Publisher's Price: $49.99

Comments
Currently, there are no comments. Be the first to post one!
You must be logged in to post a comment. You can login here
Untitled-1
Featured Publishers
 
 
 
 
Copyright 2005 Professional Association for SQL Server (PASS)