Quickie: Use of APPLY Operator

There are many great use-cases for the T-SQL APPLY operator, and if you haven’t tried it yet, I encourage you to check out Kevin Feasel’s excellent presentation on it here.  Today, I want to demonstrate a particularly interesting use-case (at least, to me!) that I found for it while tackling a real business problem — for reporting, specifically, which is what I spend a lot of my time dealing with lately.

The Setup

We have a “history of estimated market values” table, which stores, for some arbitrary dates based largely on user input, values (prices) of our proverbial Widgets.  In other words, we can’t guarantee we have every date in there for a given time period.  Over on side B, we have a “sales” table, which tells us when a given Widget was sold and for how much.  The business question, simply enough, is “When a given Widget was sold, how accurate was its ‘estimated market value’ at the time?”  Or, put another way, “How closely did the sale price of that Widget match our estimated market value?”

we don't make widgets
Yeah, yeah, we’re all tired of the Widget business. If only we could come up with a better term for a generic product to use in discussions of business problem samples.

The Tools

I used two interesting bits of TSQL to accomplish this, the main one being our lovely APPLY operator.  OUTER APPLY, in this case, because I still wanted the rows from “sales” even if we couldn’t find a closely-matching “market value” record.

If you haven’t used this operator yet, think of it conceptually like a JOIN to a sub-query, but with the added value that you can reference the outer tables’ columns within it.  CROSS APPLY is like INNER JOIN, and OUTER APPLY is like LEFT JOIN.

Sorry, RIGHT JOIN fans, no love for you here.  You’re just too weird.

My other little trick involved using TOP(1) in the apply’d sub-query, with a non-standard ORDER BY clause.  As I mentioned, within the APPLY‘s body, you can reference the outer table’s columns.  So I’m easily able to compare the SaleDate (outer table) with the EstimateDate (inner query).  I want “the closest EstimateDate to the SaleDate​”, which means I want the row where the difference between those two dates is the smallest.  Which means making use of our friend DATEDIFF.  So let’s try:

ORDER BY DATEDIFF(day, Sales.SaleDate, MktValueHist.EstimateDate)

Do you see the problem yet?  If I get a negative value, i.e. my SaleDate is way before my EstimateDate, say -100 days, that’s the top 1 row.  I don’t want that!  Let’s try our old friend from many many math classes ago, Mr. Absolute Value.

ORDER BY ABS(DATEDIFF(day, Sales.SaleDate, MktValueHist.EstimateDate)

That’s better!  Now I have the top 1 “closest to my sale date” row from MktValueHist.

All Together Now

I’ll put up a Gist with repro/demo code soon.  Here’s the meat of it, as a sample select, for convenience:

SELECT Sales.WidgetID, Sales.WidgetName, Sales.Price, Sales.SaleDate
, mvh.MarketValue, mvh.EstimateDate
, Accuracy = some_made_up_mathy_thing_here
FROM Sales
OUTER APPLY (
SELECT TOP(1) mValHist.MarketValue, mValHist.EstimateDate
FROM MarketValueHistory mValHist
WHERE mValHist.WidgetID = Sales.WidgetID
ORDER BY ABS(DATEDIFF(day, Sales.SaleDate, mValHist.EstimateDate)
) mvh
WHERE Sales.SaleDate >= '20180101'

There, my completely fabricated yet totally-based-in-reality example of how to get 2018’s Widget Sale Prices with corresponding “closest to the sale-date” Market Value Estimate from our history table.  You could even throw in some fancy math expression for “accuracy”, if you felt like it.  Like maybe “relative difference“.  =)

Note: For simplicity, we’re dealing with “whole dates”, i.e. just date.  Not datetime or datetime2.  If your date values do include times, you’d want to change the datediff interval to something more appropriate, like second or millisecond.

cross apply explained vs inner join and correlated-subquery
Forgive the graininess, it was from a Youtube screen-cap.  In an Oracle presentation, of all things. And I feel like they meant to say “Data exposition“, not “explosion”, given the latter can have negative connotation.  But.. hey, it’s better than a poorly drawn stick-figure or an inappropriate meme, right?  RIGHT?

And that’s how APPLY and ABS() saved my bacon today.  Hooray!

Have an interesting use-case for APPLY?  Or perhaps an alternative approach to a similar problem?  I’d love to hear from you!  Drop me a comment, a tweet, or a LinkedIn msg.  Happy summer!


PS: I’d also recommend checking out Steve Stedman’s useful “Join types poster“, if for no other reason than to have something concrete to use when explaining those concepts to your fellow cube-dwellers.  It has cross/outer apply on the 2nd page in their more commonly used pattern, the TVF (table valued function) relationship.

PPS: Fine, have your meme…

apply yourself, from breaking bad
I’ve never seen the show, so I don’t get it, but it was one of the first Google Image results for “apply yourself meme”. Enjoy that.

Advertisements

Author: natethedba

I'm a SQL Server DBA, family man, and all-around computer geek.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s