Quickie: SSRS Multi-Value Parameter Defaults

a quick tip for passing default values to an SSRS multi-value parameter during report drill-thru.

Advertisements

Inspired by some StackOverflow-ing and this particular answer.

Aka: “How do I pass/assign a default set of values to a multi-value parameter in SSRS?”

We often have an SSRS report that needs to “drill thru” to another report, usually to go from a “high level view” to a “lower level” or more detailed view, or sometimes just to lead the user down the path we’ve chosen as the head analyst / BI architect.  And part of that report navigation involves initializing (set to defaults) the value(s) of the sub-report’s parameters, including multi-value parameters (which I’ll now refer to as mvp, even though, I know, it’s a ridiculously overused acronym).  These are basically arrays, but are often represented as simply comma-delimited strings, so it’s easy to forget their true nature.

beware the array in string's clothing
I see you there!

Let’s fabricate an example.  In our Sales Summary report, we have an element (a textbox, image, placeholder, or whatnot) that, when clicked, should drill-thru to Sales by Person.  Said next report requires a multi-select parameter (another term for the mvp) to have 1 or more values selected, otherwise it doesn’t render.  We’ll call this parameter SelectedNames, with value-label pairings 1=Bob, 2=Alice, 3=Mary.  When we drill-thru to this by-Person report, we want it to initially show (have selected by default) all available people.

So how do we do this?  In the properties of the “clickable” element on Sales Summary, say it’s a text-box named GoToDetails, we go to the Action tab.  We choose the “Go to report” action, select/specify the target report, Sales by Person, and then add the parameters we want to pass to it.  For comparison, I’m going to pass a “regular” (single value) parameter called @ReportDate, as well as the mvp SelectedNames.  Here’s what that all looks like, in picture form.

text box properties > actions > go to report > specify report > add parameters > enter expression
step by step

The single parameter pass-thru is, as you’d expect, very simple.  But for our mvp, we need to use the expression-builder, that little fx button stylized to look like your old high school math class days.  Here’s what that function looks like:

=Split("1,2,3", ",")

And presto!, we have converted a comma-delimited list into an array to pass into our Sales by Person report’s SelectedNames multi-value parameter.  Now it will initially render with all 3 selected people as desired.

So there you have it, a quick tip for passing default values to an SSRS multi-value parameter during report drill-thru.  But what if you wanted to pass the selected values of one mvp down to another?  Or do some clever on-the-fly mapping (conversion) from one to the next?  Well, stay tuned!  I’ll write about that next time.  =)

Thanks for reading!  For a lot more on SSRS and multi-value parameters, check out these articles: @sqlchick, @mssqltips, and @msdn.

you the real mvp meme
Well, not really, see, because MVP is now completely overused and diluted to the point that it’s nearly lost all meaning, so… but yeah, you’re cool. And stuff.

Quickie: TempDB on local SSD

What could possibly go wrong? As it turns out, plenty.

Faithful reader(s), it’s been a while!  I’ve been busy preparing for some big transitions.  I’m also getting better at MDX queries, tweaking SSAS-based reports to more accurately reflect the business rules.  But enough about that, on with the post!

In which we doubt the SAN

A storage area network (SAN) is a management & administration solution, not a performance solution.

-someone wiser than me

SANs are wonderful technology.  They inspire all kinds of geekery and are purported to solve all your storage woes.  But there’s a catch: they’re expensive.  Not just as a capital expense, but in maintenance and licensing costs.  And if you ever want to upgrade it, like add some more drives to a particular tier/pool — fuhgeddaboudit.

So what do we do with SQL on a SAN?  Well, it has tiers, right?  Slower storage with huge capacity, faster storage with less, etc.  We put the data files (heavy random read workload, typically) on the pool optimized for that kind of I/O pattern.  We put the TLog files (heavy sequential write workload) on the pool best suited for that.  And what about good ol’ TempDB?  Its access pattern is fairly unique — random writes and reads, and frequent overwrites, which means it could potentially wear out your typical prosumer SSD relatively quickly.  But we’re not complete cheapskates, we’ll buy enterprise class SSDs, no?

So we go read some stuff and figure, hey, sounds like a great idea, right?  Put TempDB on a local SSD, or better yet, a pair of SSDs in RAID-0 for pure performance (because this is a cluster, we’ve got HA already).  We’ll reduce the load on the SAN I/O channels and make our overworked TempDB happier with lower latency and better throughput.  Right?

what could possibly go wrong
ooh, sparkly!

 

In which we discover what could possibly go wrong.

Once the new drive(s) is(are) installed and “presented” to Windows (that’s my SysAdmin’s term), it’s fairly trivial to do the SQL configuration change — it does of course require a SQL service restart (or cluster failover).  Code example, assuming your new drive is ‘T’:

use master;
alter database tempdb
modify file (name=tempdev, filename='T:\tempdb.mdf')
alter database tempdb
modify file (name=tempdb2, filename='T:\tempdb2.ndf')
--etc...

You do of course have multiple TempDB data files, yes?  Good.

Side-debate

Should we put templog (TempDB’s transaction log) on the same drive as the TempDB data files, or put it on the same storage pool as the regular DBs’ TLogs?  As usual, “it depends” — ask your favorite SQL gurus and do some testing.

Back on topic

We’ve made the change, we’ve done the cluster failover.  TempDB is now running on our spankin’ new SSD.  So we start monitoring performance metrics.  Things like file I/O stats (from SQL DMV sys.dm_io_virtual_file_stats), latency and waits (from our monitoring tools), and good ol’ PerfMon.

But wait, what’s this?  I/O stalls are higher?  Write latency is higher?!?  Perfmon agrees?

nooooooooooooo-vader
why god why?!?

Write latency on the TempDB files was over 10x higher than it was when they were on the SAN (the performance tier, to be clear).  The file_stats DMV showed large increases in I/O stalls.  Sad-trombone.

 

In which we have several theories

Then ensued various conversations and brainstorms among my colleagues.

Someone check the firmware/drivers!

It’s up-to-date.

Maybe it’s got the wrong block-size.

Nope, 64k.

Well, it’s only 6Gbps SAS… maybe we should’ve sprung for the 12Gbps.

The write latencies went up by a factor of 10.  I don’t think an improvement by a factor of 2 is going to win you any trophies.

Why didn’t we get an NVMe or M.2 one?

Because the damn blades don’t have those slots, goober.

Another interesting observation, and potentially the silver lining.  Overall instance waits (wait stats), according to our monitoring tool, went down.  That’s good news, right?  Maybe.  Does application performance & user experience corroborate it?  Possibly!  We’ll be observing the patient for another week or so.

Let’s turn to the community again to see what others have experience.

In which we eat some crow

And by “we” I mean “me”.  Being the DBA and the primary proponent of the SSD addition, because I knew our workloads were very TempDB-heavy, I had to hang-tail and admit that the SAN gods won this round.

Maybe.

But wait, what about the fact that our wait stats are down?  What about app/user experience?  Valid arguments, I agree.  That’s why we’re still observing.  But I’m not optimistic, given the follow-up links above.  We may utilize local SSDs for something else (index filegroups?) — but if those write latencies don’t improve, I’m concerned that it won’t help anybody.

keep calm because only time will tell
And beer. Lots of beer.

In which I ask for your help

Yes, you!  If you have ideas on what we did wrong, what we’re missing, or any other advice about getting the most “bang for the buck” out of a direct attached SSD on a converged-infrastructure Cisco UCS blade server platform with a VNX SAN, by all means, drop me a line.  I’m all ears.

Quickie: Timing a HUGE Data Copy Operation

We think the total operation will take several hours, possibly days.  So we want to be able to check up on it.

While I try to compose my resources and samples for the previous two teasers, I found this to be a particularly interesting problem and thought I’d share my solution.

Scenario

trump it's gonna be yuge
yuuuuge

We have a very large table, half a billion rows.  It needs to be copied (well actually, indexed, but that’s another part of the discussion).  We know that standard TSQL would suck for this, even if we wrote a batch-loop proc for it (sure, it might be less blocking, but it’ll take forever).  Plus, we might be going between two different servers (well, not in my case, but you very well could be, and in fact that’s probably the more likely case, given the implications of such a large operation).  SSIS to the rescue!

Now, again, we’re lazy, so we use the Import/Export Data wizard (a component of SSIS) to build the .dtsx package.  Then we can schedule it via a SQL Agent Job.

Sure, we could build it with Visual Studio and/or BIML and be super-awesome, but where’s the fun lazy in that?

Based on some preliminary estimates of the throughput using SSIS for a small sub-set of the data, we think the total operation will take several hours, possibly days.  So we want to be able to check up on it — to see its status & elapsed time, and get an estimate of time remaining.  Sound good?

Assumptions

assumptions this way
over thar!

Two assumptions going into this, to keep the example simple.

  1. We know the Agent Job’s scheduled start time, and it does start on-schedule.
  2. We’re only dealing with 2 tables — 1 source, 1 destination.  Furthermore, they’re both on the same server instance; or if not, we can query a linked-server connection once to get the “source” table size-stats, because they won’t change.

(Okay that was slightly more than 2, but again, simple.  One can imagine expanding this to multiple tables & multiple instances, but then at some point you’re going to need a “witness” that can talk to all the disparate sources of data and conglomerate those bits & pieces together, and then you have to ask yourself “is it really worth it, or can I just give a SWAG & move on with my day?”)

Before I move on: helpful SO answer that reminded me how surprisingly-not-that-difficult (and-still-pretty-efficient) it is to convert a time interval (seconds, in this case) in a “human friendly format” like Days.hh:mm:ss.  Yay.

I’m sure I’ve done it before, and subsequently forgotten about it.  Because, again, it’s one of those things that should be done in the presentation layer.

The Outline

One key component of any kind of “check-up” or general monitoring solution is, you need it to be light weight, i.e. have low overhead.  We don’t want our method of monitoring the process to add some non-trivial extra load to that process (or the servers doing said processing).  So let’s avoid COUNT(*) shall we?  Unfortunately, the top Google results don’t readily point to this, but MS actually gave us a handy-dandy built-in way of measuring this, sys.sp_spaceused.  Yay again.

Granted, it doesn’t get you “up to the millisecond” accuracy, but it’s truly close enough for what we’re doing; and frankly, it’s your only sane option when asking the question “how many rows?” of a TOUS.

So we’re going to use the output of that system proc, specifically the rows column, to measure how much data has been bulk-copied into our destination table.  We’ll also measure the source table, once, as stated, because (we’re assuming!) that won’t change.

Finally, we’ll use those row-counts, with some simple tricks & nonsense — basic math, dateadd/datediff — to calculate a percent-complete, elapsed time, estimated total time, and ETA (estimated time of arrival completion).

it's all a lot of simple tricks and nonsense
The Force? Hah!

The Code

See Gist.

In the next post, I’ll zoom out a bit and explain why I needed to do this, and in what situations it should come in handy.  Stay tuned!