Quickie: Timing a HUGE Data Copy Operation

We think the total operation will take several hours, possibly days.  So we want to be able to check up on it.

While I try to compose my resources and samples for the previous two teasers, I found this to be a particularly interesting problem and thought I’d share my solution.


trump it's gonna be yuge

We have a very large table, half a billion rows.  It needs to be copied (well actually, indexed, but that’s another part of the discussion).  We know that standard TSQL would suck for this, even if we wrote a batch-loop proc for it (sure, it might be less blocking, but it’ll take forever).  Plus, we might be going between two different servers (well, not in my case, but you very well could be, and in fact that’s probably the more likely case, given the implications of such a large operation).  SSIS to the rescue!

Now, again, we’re lazy, so we use the Import/Export Data wizard (a component of SSIS) to build the .dtsx package.  Then we can schedule it via a SQL Agent Job.

Sure, we could build it with Visual Studio and/or BIML and be super-awesome, but where’s the fun lazy in that?

Based on some preliminary estimates of the throughput using SSIS for a small sub-set of the data, we think the total operation will take several hours, possibly days.  So we want to be able to check up on it — to see its status & elapsed time, and get an estimate of time remaining.  Sound good?


assumptions this way
over thar!

Two assumptions going into this, to keep the example simple.

  1. We know the Agent Job’s scheduled start time, and it does start on-schedule.
  2. We’re only dealing with 2 tables — 1 source, 1 destination.  Furthermore, they’re both on the same server instance; or if not, we can query a linked-server connection once to get the “source” table size-stats, because they won’t change.

(Okay that was slightly more than 2, but again, simple.  One can imagine expanding this to multiple tables & multiple instances, but then at some point you’re going to need a “witness” that can talk to all the disparate sources of data and conglomerate those bits & pieces together, and then you have to ask yourself “is it really worth it, or can I just give a SWAG & move on with my day?”)

Before I move on: helpful SO answer that reminded me how surprisingly-not-that-difficult (and-still-pretty-efficient) it is to convert a time interval (seconds, in this case) in a “human friendly format” like Days.hh:mm:ss.  Yay.

I’m sure I’ve done it before, and subsequently forgotten about it.  Because, again, it’s one of those things that should be done in the presentation layer.

The Outline

One key component of any kind of “check-up” or general monitoring solution is, you need it to be light weight, i.e. have low overhead.  We don’t want our method of monitoring the process to add some non-trivial extra load to that process (or the servers doing said processing).  So let’s avoid COUNT(*) shall we?  Unfortunately, the top Google results don’t readily point to this, but MS actually gave us a handy-dandy built-in way of measuring this, sys.sp_spaceused.  Yay again.

Granted, it doesn’t get you “up to the millisecond” accuracy, but it’s truly close enough for what we’re doing; and frankly, it’s your only sane option when asking the question “how many rows?” of a TOUS.

So we’re going to use the output of that system proc, specifically the rows column, to measure how much data has been bulk-copied into our destination table.  We’ll also measure the source table, once, as stated, because (we’re assuming!) that won’t change.

Finally, we’ll use those row-counts, with some simple tricks & nonsense — basic math, dateadd/datediff — to calculate a percent-complete, elapsed time, estimated total time, and ETA (estimated time of arrival completion).

it's all a lot of simple tricks and nonsense
The Force? Hah!

The Code

See Gist.

In the next post, I’ll zoom out a bit and explain why I needed to do this, and in what situations it should come in handy.  Stay tuned!