August 28, 2008
Untitled-1
ALL PASS Bloggers
  • Bob Beauchemin - Bob Beauchemin's Blog - RSS
  • Glenn Berry - Glenn Berry's SQL Server Performance - RSS
  • Mark Caldwell - Ajarn's SQL Corner - RSS
  • Louis Davidson  - The SQL Doctor is In (Real In) - RSS
  • Tara Duggan - Ramblings of a DBA - RSS
  • Randy Dyess - Transact-SQL Blog - RSS
  • Mike Epprecht - To SQL or not to SQL - RSS
  • Rob Farley - Blog - RSS
  • Euan Garden - Euan Garden's Blog - RSS
  • Bill Graziano - The Lazy DBA - RSS
  • Ken Henderson -  Ken Henderson's WebLog - RSS
  • Haidong Ji - The Ji Village News - RSS
  • Don Kiely - Don Kiely's Technical Blatherings - RSS
  • Kevin Kline - In a Nutshell - RSS
  • Thomas LaRock - Crazy DBA - RSS
  • Greg Linwood - Transaction Blog - RSS 
  • Aaron Lowe - AML's Space - RSS
  • Greg Low - The Bit Bucket - RSS
  • Adam Machanic - data manipulation for fun and profit - RSS 
  • Denise McInerney - select * from denisemc.views - RSS
  • Chris Miller - syscomments - RSS
  • Michael Rys - Musings on XML, XQuery and more... - RSS
  • Dejan Sarka - Solid Quality Learning Blogs - RSS
  • Kent Tegels - Enjoy Every Sandwich - RSS
  • Jamie Thomson - Jamie Thomson - Life, the universe and SSIS! - RSS
  • Rob Volk - TRUNCATE TABLE master..sysdatabases - RSS
  • Christian Wade - Christian Wade's Blog - RSS
  • Joe Webb - Musing and observations about SQL Server, other technogies, and sometimes just life in general - RSS
  • Pat Wright - A SQL Blog - RSS
  • Microsoft SQL Server Development Customer Advisory Team - RSS
  • Microsoft VS Data Team Weblog - RSS 

  • Untitled-1
     

    Welcome, this site is our a collecting ground for blogs about the PASS Community Summit experience and associated technologies.  We have links to any board of speakers, volunteers, attendees, and any members of PASS who want to be listed, and there will be an aggregation of several of the best posts every day.  If you would like to be a part of this site please email drsql@hotmail.com. 

    Untitled-1
    Blogs

    Current Articles | Categories | Search | Syndication

    Tuesday, December 19, 2006
    Adam Machanic - Medians, ROW_NUMBERs, and performance
    By drsql@hotmail.com @ 10:48 PM :: 747 Views :: 0 Comments :: Summit Speaker 2005
     
    A couple of days ago, Aaron Bertrand posted about a method for calculating medians in SQL Server 2005 using the ROW_NUMBER function in conjunction with the COUNT aggregate. This method (credited to Itzik Ben-Gan) is interesting, but I discovered an even better way to attack the problem in Joe Celko's Analytics and OLAP in SQL.

    Rather than using a COUNT aggregate in conjunction with the ROW_NUMBER function, Celko's method uses ROW_NUMBER twice: Once with an ascending sort, and again with a descending sort. The output rows can then be matched based on the ascending row number being within +/- 1 of the descending row number.  This becomes clearer with a couple of small examples:

    A

    1

    4

    B

    2

    3

    C

    3

    2

    D

    4

    1

     

    A

    1

    5

    B

    2

    4

    C

    3

    3

    D

    4

    2

    E

    5

    1


    In the first table (even number of rows), the median rows are B and C. These can be matched based on [Ascending Column] IN ([Descending Column] + 1, [Descending Column] - 1). In the second table (odd number of rows), the median row is C, which is matched where [Ascending Column] = [Descending Column]. Note that in the second table, the match criteria for the first table does not apply -- so the generic expression to match either case is the combination of the two:  [Ascending Column] IN ([Descending Column], [Descending Column] + 1, [Descending Column] - 1).

    We can apply this logic within the AdventureWorks database to find the median of the "TotalDue" amount in the Sales.SalesOrderHeader table, for each customer:

    Comments
    Currently, there are no comments. Be the first to post one!
    Click here to post a comment
     
    Copyright 2005 Professional Association for SQL Server (PASS)