T-SQL Tuesday #104: Code You’d Hate to Live Without


It’s that time of the month again!  Bert‘s fantastic invitation complete with YouTube vid is tempting indeed.  There are so many wonderful community scripts and tools for SQL Server DBAs.  But, in an interesting twist, he specifically asks us about something self-written or self-created.  And I’ll admit, I sometimes have trouble being exceptionally ‘proud’ of my home-brewed stuff.  Sure, I’ve blogged and GitHub‘d, and hopefully those have helped someone along the way.  Most of the time, though, those are specifically suited to a given use-case.

The more difficult code to share, I think, is the stuff that we keep in our “daily-grind” “Get-Stuff-Done” folders.  Scripts that we’ve tweaked and commented sporadically throughout the years, finding this edge-case or that tweak for a given scenario, most of which ends up being pretty environment-specific.  And that’s where we could use a little encouragement, I think — another DBA saying “Hey, it’s not just me!”, or “I knew there was another guy/gal out there who always has to do this kind of thing!”.  Thus, here I will attempt to show off something along those lines and see what you kind folks think.

Where Does it Hurt?

No, I don’t mean the famous wait-stats queries by Paul Randal (those are awesome).  I mean, what are the top few areas of your SQL environment where you’re always questioning something, or being asked to find some answer that’s not immediately obvious?  For me, there are 3.

Replication

Transactional replication is, in a word, ‘brittle’.  It works well when it’s working, and usually you don’t have to think about it.  But when you do need to check on it, or change anything about it (god forbid), or troubleshoot it (more like shoot it, amirite?), it feels a bit like trying to mess with a half-played Jenga stack.  Meaning, you might be just fine, but you might send the whole thing crashing down and have to rebuild it all.

peanut-brittle-from-the-joy-of-baking
As brittle as this stuff. But not nearly as delicious.

I won’t go into the whole troubleshooting aspect here; that’s too much scope.  But there’s a question that often comes up, especially from Devs or Biz-Analysts: “Hey, is TableX being replicated?”  And rather than subject my poor eyeballs to the replication properties GUI, I just run a little query, which reads from the distribution database and the actual database that’s being published (the ‘publisher’ db), and tells me a list of tables & columns that are being replicated.

Here it is.  Be sure to replace [dbName] with your actual published DB name.  Like StackOverflow or WideWorldImporters, or AdventureWorks <shudder>.
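A bare-bones sketch of the idea, reading just the published database’s transactional-replication metadata (my full version also touches the distribution database, as mentioned above), goes something like this:

USE [dbName];  -- the published (publisher) database

SELECT
    PublicationName = p.name,
    TableName       = OBJECT_SCHEMA_NAME(a.objid) + '.' + OBJECT_NAME(a.objid),
    ArticleName     = a.name,
    ColumnName      = c.name
FROM dbo.syspublications p
JOIN dbo.sysarticles a
    ON a.pubid = p.pubid
JOIN dbo.sysarticlecolumns ac
    ON ac.artid = a.artid
JOIN sys.columns c
    ON c.object_id = a.objid
   AND c.column_id = ac.colid
ORDER BY TableName, ColumnName;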

Report Subscriptions (SSRS)

Another question I’m often asked is, “Hey, did ReportX run?”  What they really mean is, “Did ReportX‘s email subscription get sent to BigWigUserY?”  We have an unholy amount of SSRS reports with email subscriptions.  And because I don’t care to bloat my inbox by BCC’ing myself with every single one, I rely on the users to speak up when they don’t receive something they’re expecting.

“This is a terrible idea,” you say.  “Never trust the users!”

Yes, well, such is life.  If you have a better — er, lazier — solution, I’m all ears.

So here’s a little script I wrote to query the ReportServer database (which MS will tell you is “officially unsupported”, snore).  If I know some keyword from the report title, or the supposed recipient’s email address, this will tell me whether it ran successfully or not (LastRun, LastStatus).  A couple of other useful bits: where it lives (ReportPath) and what its SQL Agent job’s name is (JobName).

That last bit is peculiar.  The JobName looks like a GUID (because it is, because SSRS just loves GUIDs), but it’s also the actual name of the agent job, which you can use to re-run said job — via exec msdb.dbo.sp_start_job — if the failure wasn’t systemic.  As I often do.
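For the curious, a stripped-down sketch of that lookup against the standard ReportServer catalog tables (your SSRS database name, and the columns you care about, may vary) looks something like this:

USE ReportServer;

DECLARE @search nvarchar(200) = N'%SomeKeywordOrEmail%';  -- report title keyword or recipient address

SELECT
    ReportName = c.Name,
    ReportPath = c.[Path],
    LastRun    = s.LastRunTime,
    LastStatus = s.LastStatus,
    JobName    = CONVERT(nvarchar(128), rs.ScheduleID)    -- doubles as the SQL Agent job name
FROM dbo.[Catalog] c
JOIN dbo.Subscriptions s
    ON s.Report_OID = c.ItemID
JOIN dbo.ReportSchedule rs
    ON rs.SubscriptionID = s.SubscriptionID
WHERE c.Name LIKE @search
   OR CAST(s.ExtensionSettings AS nvarchar(max)) LIKE @search   -- recipients live in this settings blob
ORDER BY s.LastRunTime DESC;

-- If the failure wasn't systemic, re-run the subscription via its agent job:
-- EXEC msdb.dbo.sp_start_job @job_name = N'<JobName value from above>';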

Disk Space

Last but not least, everybody’s favorite topic to try and forget about or pawn-off to the SysAdmins.  “How much space are those databases / data files / log files eating up?”  Well, mister suddenly-cares-about-disk-space-but-is-OK-with-storing-all-domain-users’-iTunes-music-libraries-on-the-central-fileshare-along-with-their-documents-because-that’s-what-we’ve-always-done.  {True story.}  Let me tell you.

keep calm and release the bitter
It’s healthy, I swear!

This script has a lot of comments and commented-out lines because I will tweak it depending on what I need to see.  Sometimes it’s “show me the DBs where the logical filename doesn’t follow my preferred pattern” (the DB name with ‘_data’ or ‘_log’ after it); or perhaps “Only show me files over 5GB with a lot of free space” when I’m itching to shrink something.

“Never shrink your files!  (In production!)”

Yes, thank you, knee-jerk reactionary.  I’m dealing with servers in lower environments, usually, with this one.  😉
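For reference, here’s a pared-down sketch of the kind of query underneath it all, minus my pile of comments and commented-out filters.  Note that FILEPROPERTY is database-scoped, so the free-space half has to run inside the DB in question.

-- Instance-wide file sizes:
SELECT
    DatabaseName    = DB_NAME(mf.database_id),
    LogicalFileName = mf.name,
    PhysicalName    = mf.physical_name,
    FileType        = mf.type_desc,                              -- ROWS or LOG
    SizeGB          = CAST(mf.size / 128.0 / 1024 AS decimal(10, 2))
FROM sys.master_files mf
-- WHERE mf.size / 128.0 / 1024 > 5                              -- e.g. only files over 5 GB
ORDER BY SizeGB DESC;

-- Per-database free space (run this inside the database of interest):
SELECT
    LogicalFileName = df.name,
    SizeMB          = CAST(df.size / 128.0 AS decimal(10, 2)),
    FreeMB          = CAST((df.size - FILEPROPERTY(df.name, 'SpaceUsed')) / 128.0 AS decimal(10, 2))
FROM sys.database_files df;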

What do you think?

I’d love to hear your feedback in the comments!  Happy Tuesday!!  =)

Credit where credit is due.

I did not magically come up with these all by myself.  They’re pieced together from many a StackOverflow answer and/or other community blog post or contribution that I’ve since forgotten.  I usually store a link to the source in these kinds of things, when it’s true copy-pasta, but in these cases I added enough of my own tweaks & style that I no longer tracked the original linkage.  If you see anything that looks familiar, please do tell me where to give kudos and shout-outs!  😀

The Horror of Conflicting Database Collations

All these queries with all these JOINs on columns with mismatched collation lead to very sad pandas: RBAR operations (row-by-agonizing-row, unable to use index seeks) and high CPU.

It’s not even close to Halloween, but I promised I would get to this someday, and that someday is now. Strap in, grab the popcorn, and turn up the volume.

Oh wait, this is just a textual medium. Forget the volume bit.

If you’re not sure what this collation thing is, you should go back and read my teaser post, and as always there’s the docs. The one-liner recap on why it matters and how it became a target of my ire is this: legacy DBs use the old default SQL latin1 collation, but an upgraded ERP system’s DB now uses the new default Windows latin1 collation; these DBs all talk to each other, including linked-server JOINs, and performance of those queries has gone to shit.

Pardon the French. “Gone to hell in a hand-basket.”

where exactly are we going in this handbasket?
It can’t be any hotter than a California summer!

So why did this happen? Let’s try to find out. But first, let’s get specific about how all this wailing and gnashing of teeth actually manifests in real technical terms.

The Problem

Essentially what happens here is an implicit conversion problem. There are several blog posts from our distinguished community leaders on this topic and its woes. It causes heavy CPU load as the SQL engine tries desperately to match values of different data types. Even though both columns may be, say, nvarchar(20), the fact that one uses collation SQL_Latin1_General_CP1_CI_AS and the other uses Latin1_General_100_CI_AS, makes them somewhat strangers — they might as well be an INT and a varchar!

Now again, this is my example. We have dozens of critical application queries using their own little sandbox-y databases, joining to the central ERP system DB to fetch Customer or Sales data. This is already a recipe for sadness.

“Use a middle tier or service layer, dammit!” you’d say.

“You’re preaching to the choir,” I’d reply.

Hell, you’re preaching to the preacher, at that point. But it’s not that simple, as I’ll touch on later.

There’s a subtle difference here, vs. those many community blog posts, which I’ll repeat.  The columns are of the same type.  Just different collations.

And when the collation on the join predicates is different, bad things happen. Let’s take CustomerNumber for example. On the ERP side, it’s an nvarchar(20) collate Latin1_General_100_CI_AS. On the internal & web apps side, it’s a varchar(20) collate SQL_Latin1_General_CP1_CI_AS. As you might imagine, this is a prime field for joining because it’s the main customer identifier throughout all the systems.

Let’s be clear here. This is a numeric value only. Did it need to support Unicode? Absolutely not. Should it have been an int or bigint? Probably. But did the ERP designers choose to make it a Unicode string anyway? Yep.

Premature optimization may be a root of evil in software development, but inappropriate data typing is at least as evil in database development.

Anyway. The point of this post is not to rant and whine about the design of that particular system. I’ll save that for another day. That won’t stop me from complaining about the improper ways it’s used.

this is why we can't have nice things
Exactly.

As stated above, all these queries with all these JOINs on columns with mismatched collation lead to very sad pandas: RBAR operations (row-by-agonizing-row, unable to use index seeks) and high CPU.  Under duress, my 32-core servers ground to a halt with blocked sessions, ASYNC_NETWORK_IO waits, and 99% CPU utilization.  Needless to say, these were not good days.

Side-note: I really enjoyed Bert’s post because it helped put things into very easy-to-understand terms, and is targeted at a wider audience than just the DBA.  Read it, and watch the video too!

Attempt to Treat the Symptoms

Because the ERP system has been upgraded a couple times in the last decade, the team came up with a [very thin] abstraction layer manifested as a series of “integration views” that UNION similar core entities from the older and newer incarnations of the database. Like Sales records. These have permeated many many areas of the ecosystem, to the point that almost all apps use them instead of the “raw” source tables. Which sounds great, right? Riiiight.

Until you throw in that monkey wrench of conflicting collations. Remember, Devs are lazy (in a good way). Their apps and queries need to not care about such a low level detail as database collation. So to support that “not caring”, we set up these integration views to apply the older default collation (the one that matches everything else in the environment) to the output columns. That way, no extra work is required to consume them in the same way they’ve been consumed for the last 10+ years.

Basically, we can add the keywords COLLATE DATABASE_DEFAULT after each column declaration of the view, like so (in the form “alias = sourceColumn”): CustomerNo = erp.CustomerNo COLLATE DATABASE_DEFAULT.
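To picture it, the rough shape of one of those integration views looks like this (database, view, and column names are all invented for the example):

CREATE VIEW dbo.vw_Customers_Integrated
AS
SELECT
    CustomerNo   = new_erp.CustomerNo   COLLATE DATABASE_DEFAULT,
    CustomerName = new_erp.CustomerName COLLATE DATABASE_DEFAULT
FROM NewERP.dbo.Customer AS new_erp
UNION ALL
SELECT
    CustomerNo   = old_erp.CustomerNo   COLLATE DATABASE_DEFAULT,
    CustomerName = old_erp.CustomerName COLLATE DATABASE_DEFAULT
FROM OldERP.dbo.Customer AS old_erp;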

This was a terrible idea.

It made sense at the time. But as the months passed and the complaints of performance degradation mounted, the signs continued to point back at these views which now used this collation-conversion mechanism (prior to the latest upgrade, they did not).

The typical work-arounds involved temp tables and/or going straight to the “raw” source. Neither of these is ideal — the latter breaks that abstraction (however thin it was), while the former risks over-burdening tempdb (sometimes referred to as the “communal toilet” of SQL Server). Generally this was acceptable, and often resulted in orders-of-magnitude improvements to performance. But it continued to rack up that technical debt.
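A sketch of that temp-table workaround, with the linked-server and object names made up:

-- Pull the ERP data across the linked server once...
SELECT CustomerNo, CustomerName
INTO   #ErpCustomer
FROM   ERPLINKEDSERVER.ErpDb.dbo.Customer;

-- ...re-collate (and re-type) the join column locally to match the app DB...
ALTER TABLE #ErpCustomer
    ALTER COLUMN CustomerNo varchar(20) COLLATE SQL_Latin1_General_CP1_CI_AS;

-- ...then join with matching type & collation on both sides.
SELECT o.OrderID, c.CustomerName
FROM   dbo.AppOrders AS o
JOIN   #ErpCustomer  AS c
       ON c.CustomerNo = o.CustomerNo;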

One thing I tried was to remove those collation conversions from all columns in the view except the join predicates, because the consumers fed those values straight into the object or ORM layer, at which point they all became C# strings anyway, so it didn’t matter what collation they’d used or whether they were ANSI or Unicode at that point. But alas, because the core pitfall of these queries was still very present — that implicit conversion — performance still suffered.

Treating the Root Cause

Here I re-link the two scary articles that warn of the dangers and gotchas of changing your database and server (instance-level) default collations: StackOverflow answer, and blog post.  Given all that, it’s a daunting task.  But if we’re going to get our performance back up to snuff, it’ll probably have to happen at some point.  As the great Mike Rowe says…

it's a dirty job but somebody's gotta do it

And unlike the previous blogger, I will do my very best to follow up here and post about my journey as we undertake this harrowing trek.

The way I see it, there are three major attack vectors.

  1. We can try converting the ERP database to the old SQL collation.

Pros: smaller effort than #2, less integration/regression testing overall.
Cons: unsupported by ERP vendor, downtime for ERP system, high amount of risk.

  2. We can try converting all other DBs in the environment (across all SQL instances) to the new Windows collation to match that of the ERP DB.

Pros: supported by ERP vendor, future-proof, less tech-debt.
Cons: largest effort, heaviest testing burden, high risk.

  3. We could utilize some kind of data-replication to maintain copies of the required data on the other SQL instances in their default (matching) collation.

Pros: support not an issue, lowest effort & testing burden, lowest risk.
Cons: replication maintenance burden & overhead, loss of “real-time” (added data latency), and some tech-debt.

As the lone DBA, most if not all effort falls to me, so I’m quite inclined toward #3. And we were somewhat already going in this direction with the temp-table workarounds, i.e. pulling in the ERP data (usually via a linked-server) to the target app DB & using that temp-table for joins — essentially, that’s “lightweight replication”.

The technical debt we’re incurring here is that we’re leaving all of our legacy DBs (and servers) in the older SQL collation.  At some point, likely the far-future, these will be unsupported, or at least obsolete, in the sense that all new applications & DBs will prefer the newer Windows collation.  Perhaps during the next big “hardware refresh” cycle, i.e. when we have to plan and execute a big SQL server upgrade/migration, we can consider integrating the collation-change into that project.

But wait, you argue, what about a 4th option?

Oh sure, you mean to say, “Screw it, not my problem!”…

  4. Force all ERP DB data access up the stack to the application layers, i.e. the apps pull the data into memory and join/merge it there (or relate to & interact with it however the developers want to, in that layer).

But this has several downsides that the business and the development teams would [justifiably] push back on: dev time & effort, the drastic-ness & unprecedented-ness of the change, the fear of not catching every single place & usage of the ERP in the mysterious myriad of apps that it seems nobody can ever quite get a unified visibility handle on (some of which they can’t even build or deploy anymore without diving down some seriously arcane rabbit-holes of ancient tech).  The risk is just too high.

More than that, and as much as I would love to say “ain’t my problem” and pass it off to a larger group of smart people, the fact is that over 50% of dependencies on the ERP DB are from my own BI/reporting queries.  There’s no shortage of reports that need to examine and relate legacy LOB app data with ERP customer & sales data.  And it’s not just to build a paginated SSRS report — where I could, arguably, do what I said above in faux-option 4: pull the data from 2 separate data-sets, then merge it in the report layer.  It’s heavily customized, painstakingly crafted mini-data-warehouses and data-marts that massage and tailor the data for an assortment of different purposes and reporting needs.  Thus, even with this, most of the burden still falls to me.

some guy with glasses posing under a globe like he's carrying the weight of the world
To be clear, this is NOT me; it’s the 4th pic I found on Google image search for “weight of the world” filtered for “noncommercial reuse” images. 😉  Kinda looks like Landry from Friday Night Lights, no?

 

So, to borrow a closing line from one of my favorite podcasts…

May your server lights blink, your database collations be identical, and your cables be cleanly managed.

Favorite TSQL Tuesday #101 Posts

 

Since I didn’t even come close to making it in time for this month’s T-SQL Tuesday, I figured I’d highlight my 5 favorite posts from the community, and then share a few of my own tips/tools.

I use a Central Management Server too, and although I don’t often use it to run a query against multiple instances, it’s definitely a handy built-in feature to take advantage of.  A minor downside is that it only supports Windows Authentication (not SQL logins), so I can’t use it for my AWS RDS instances, nor for the CMS server itself — those I have to keep stored in my local “Registered Servers” section.  Another tool for running queries against multiple instances, with a good deal more flexibility, is Red Gate’s MultiScript, though it’s not free.  ;o)

Ethervane Echo, a clipboard manager and history-remember-er, is similar to something I use called Clipboard Fusion — in fact, it might even be better.  And who doesn’t love dbatools and dbachecks?  If you’re not using them yet, don’t wait; start getting into PowerShell today by at least trying out some of the ‘get’ cmdlets from dbatools.

Telegraf looks absolutely stunning, as a monitoring system.  It does take some setup work and some maintenance, but it’d be a great branch-out learning opportunity to touch a few different technologies that a traditional SQL DBA might not normally think of.  Hats off to the people behind that, and may it continue to grow.

Leave it to Bert to go “outside the box” a bit, with these tools that help you be a better presenter and collaborator.  I use BeyondCompare, which is similar to WinMerge (tho, again, not free); I’ve fallen in love with its features that go beyond file diff/merge, but it’s nice to have a free option whenever I’m not on my main machine.

This is a broad sweeping post but it captures a LOT of what the community is and should be.  We’re inclusive, we want people to participate, grow & learn from each other, and ultimately advance their careers.  Tons of useful gems in here, from the Slack workspace to the event links to the networking advice.  Excellent stuff; go read it.

Honorable mention:

The SQL DB Modeler beta looks really interesting as an alternative to traditional big-$$$ tools like ER/Studio & Erwin.  If only I wasn’t stuck in brown-field legacy data models 95% of the time… =D

And finally, although they’ve probably been mentioned a few times already, pastetheplan and statisticsparser are two amazingly simple tools from the Brent Ozar folks that make sharing and comparing query performance so much easier.  My M.O. is to use PasteThePlan links in a dba.stackexchange post so that others can easily see the graphical execution-plan to offer feedback; while I use StatisticsParser to compare between A/B-testing runs of a stored-proc I’m trying to refactor & improve.

TSQL Tuesday #100 – Predictions for 2026

Yeah, so I missed the boat by a few days — more like a week.  That’s pretty much my M.O.  This month’s T-SQL Tuesday #100 is hosted by the author of sp_WhoIsActive and the creator of T-SQL Tuesday himself, the legendary, the incomparable, Adam Machanic.

applause-please
You ain’t never had a friend like the SQL blogger community ;D

The Year is 2026

Despite IT’s best efforts to kill the relational database, it’s still alive and kicking.  Sure, it’s mostly in the cloud, and we’ve largely solved the problems of scalability, availability, and “traditional” maintenance, but the DBA still plays a critical role in the IT organization.  He/she is more of an architect and an automator, someone who understands the business and development needs as they relate to data — its storage, availability, security, and performance — and can leverage cohesive data platform technologies to provide those services and meet those needs.  But the fundamental issue of data quality still haunts even the best environments, because at the end of the day, when you rely on a human to enter text into a field, you’re gonna get garbage — er, inconsistency.  Thus, we’re still fighting that fight, if only to appease our “data scientists” and machine-learning models so that they stop whining about it.

SQL Server itself has evolved.  After realizing that it was pretty silly to bolt-on a hacky “graph db” component to what is, at its core, a relational engine, MS broke that off into its own product, “Microsoft GraphDB Server”.  But the good news is, SQL & GraphDB talk to each other seamlessly; in fact all of the data-platform products integrate and inter-operate much more smoothly than 10 years ago.

We finally have a single unified CE (Cardinality Estimator), which is intelligent enough to know which paths/plans to use for a given query, so we don’t need to mess with those awful trace-flags anymore.  Indexes and Statistics are all but self-maintaining; the DBA rarely has to step in and mess with them.  Part of the reason for this is that SQL Server yells at you if you try to make a GUID the clustering-key, or other such nonsense.  =D

Columnstore is everywhere; traditional row-store (b-tree) indexes barely exist.  JSON storage & indexing inside SQL Server is much better, but it’s still preferable to use a document-store DB if you can.  Hierarchical structures (not to be confused with graphs) are easily implemented and supported, without having to resort to old hacky models.  And user-defined functions (all types) perform nearly on-par with stored procedures.

They’ve replaced sp_who and sp_who2 with the code from sp_WhoIsActive, and made SSMS Activity Monitor suck less & actually be semi-useful as a basic first-response monitor.  Profiler was officially killed off, and XEvents has come into general widespread usage — largely because MS finally dedicated some hard-core dev time to improving its GUI & making it much easier to use.  Native Intellisense finally works, and works well, for all but the most obscure/weird things, and is much less chatty in terms of network traffic to/from the server.

And finally.  FINALLY.  Each database has its own TempDB.

and there was much rejoicing.. yay
We’d only been asking for it for.. 10 years?

T-SQL Tuesday #98: Orphaned Users Redux

It’s that time again!  The first #Tsql2sday of 2018.  Thanks to the Blobeater for this month’s invite: “your technical challenges conquered”.

Because I’m already ridiculously late, I have a short one.  This is about orphaned users — you know, when you restore a database and its users aren’t mapped to the server logins that they should be or used to be.

orphan-movie-poster
Not that kind of orphan… pretty decent movie tho!

The typical solution is sp_change_users_login with the auto_fix or update_one option.  But guess what?  Yep, that’s deprecated.  By the way, did you know that it also has a report option?  Apparently that’s got some bugs…ish?  Try it sometime and see — compare the output of sys.sp_helpuser where the ‘LoginName’ column is null, with sp_change_users_login 'report'.  Preferably on a DB you’ve restored from another server.  😉

So what’s the correct solution?  ALTER USER [theUser] WITH LOGIN = [theLogin].  Simple, no?  Let’s get more general.  Could we come up with a half-decent way to apply this kind of fix dynamically?  Well sure, the nice folks at DBATools have already solved that problem.  And that’s great, really.  But just in case that doesn’t work… ^_^

One of the many things I love about SQL Prompt is the right-click option to “Script as INSERT” (from the results grid).  This is a quick & easy way to build a temp-table for the results of an exec statement so you can do the ol’ insert #tmp exec sys.sp_blah!  Then we can query the list of DB users for the null LoginNames and write a little set of queries to fix them!  Sound good?

UPDATE: Behold the code!
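In a nutshell, it looks roughly like this (a minimal sketch — sp_helpuser’s result-set columns can vary slightly by version, so adjust the temp table to match):

CREATE TABLE #DbUsers (
    UserName      sysname,
    RoleName      sysname       NULL,
    LoginName     sysname       NULL,
    DefDBName     sysname       NULL,
    DefSchemaName sysname       NULL,
    UserID        int           NULL,
    [SID]         varbinary(85) NULL
);

INSERT INTO #DbUsers
EXEC sys.sp_helpuser;

-- Orphans: DB users with no mapped login, where a same-named login exists on the instance.
SELECT DISTINCT
    FixStatement = 'ALTER USER ' + QUOTENAME(u.UserName)
                 + ' WITH LOGIN = ' + QUOTENAME(u.UserName) + ';'
FROM #DbUsers u
JOIN sys.server_principals sp
    ON sp.name = u.UserName
WHERE u.LoginName IS NULL;

DROP TABLE #DbUsers;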

Happy Tuesday!

PS: Coincidentally, today’s (Thursday) SQL Server Central newsletter featured a very similar post by a gentleman over at Madeira Data.  Go check it out, it’s another great solution to this problem!  And while you’re at it, get the SQL Server Radio podcast (created by a couple guys from the same company) – it’s a terrific addition to your iTunes library.

Adventures in Database Renaming

Renaming a Database looks easy, but it’s really not. Unless nobody’s touching it. Let’s find out why!

Databases sometimes need to get a new name.  Or you need to swap names.  It can be a royal PITA if the DB is a mission-critical always-being-used everybody-and-their-mother-touches-it-all-the-time thing.

Errors like “The database must be in a state in which a CHECKPOINT can happen” or “Cannot obtain exclusive access to the database” or “This transaction was chosen as the deadlock victim” abound.  Sometimes it’s better to fall back on our old friends, detach & attach.

Let’s paint a picture!

lets paint a happy little tree
And if you tell ANYONE, that that tree is there, I will come to your house, and I will CUT YOU…

Our ERP database has been chosen by the IT gods to get moved to the shiny new flash storage array, off the old spinning-rust SAN.  This is fantastic news for the business users.  But lo, the executives warn us, “You must do this with no downtime!” (said in my best Brent Ozar PHB-imitation voice).  Of course when we tell them that’s impossible, they say, “OK, you must do this with minimal downtime.”  That’s mo’ betta’.

So what are our typical options for doing a database migration?  Or, more specifically, a data file migration.  See, we’re not moving to a new server, and we’re not moving a bunch of databases together; we’re just moving this one ERP database.  And we’re keeping it on the same SQL instance, we’re just swapping the storage underneath.  Now yes, in a perfect world, both storage arrays (both SANs) would inter-operate and they’d be able to take a snapshot, pass it over, spin it up, swap it out, and our SQL instance would be none-the-wiser.  But alas.

Other options?  Log-shipping, sure; it’s been around forever, it’s very mature, simple, and relatively easy to operate, even if it lacks a bit in terms of automation capability.  But let’s face it, the DBA is going to be sitting in front of his/her screen babysitting this operation no matter what, so that’s no biggie.  Mirroring?  Meh.  Expensive Edition’s still got all the hot action there, and we’re not that fancy.  Availability Groups?  Sure, if we already had them set up & running.  But we don’t.  Sad-trombone.

Back to Basics

So we circle back to the classics, the “backup & restore” and “detach-move-attach” routines.  But there’s a hidden gem of a hybrid here: we can back up, restore as a new name, then switch the names!  The catch is, we need to put the original DB in read_only mode while we do this, to prevent new/updated data from coming in.  Now, when we ask the Dev-managers if their overlaying apps/websites can handle the DB being in read-only mode for an hour, they’re much more inclined to say OK to that, as opposed to the DB being actually ‘down down’.

clinton-depends-on-definition-of-down
are we talking about “down” time or “downtime”?

Now, there’s a way to do this in T-SQL, and there’s a way to do this in PowerShell.  Both are fine!  I’m not arguing for one or the other — pick a flavor and go with it.  This happened to be a day where I was feeling SQL-ish, not PoSh, hence the flavor I chose.

The Walkthru

In my example we have our “slow” storage drives ‘D’ and ‘L’ (for Data and Logs, respectively).  We’re moving our blessed database to “fast” flash storage drives ‘E’ and ‘M’ (you know, because they come after ‘D’ and ‘L’).  The team managing the dependent apps have given their blessing to have a ‘read-only mode’ maintenance window for the database, as long as the actual ‘downtime’ is mere seconds.

Once the DB is in READ_ONLY, we do a backup, preferably to the flash storage so that it’s faster (and the restore, from flash to flash, will be super-fast!).  We then restore that backup to a new DB on the flash storage drives, and, as expected, it’s speedy-gonzales.

he makes speedy gonzales look like regular gonzales
Flash arrays are awesome, btw. =D

Once our two copies of the database are up, we’re ready to take that minuscule down-time.  We kick the users out of it (with SINGLE_USER mode), detach both DBs, and re-attach them (create database .. on (file), (file) .. for attach) with different names!  Presto-change-oh, the ‘original’ DB is now resident on flash storage, and the old slow files are now a DB called “whatever the heck you want it to be, because it doesn’t matter anymore!”.

Disclaimer:  I in no way support nor condone the naming of databases with spaces or special characters.  If you actually name a database something like that, no matter how temporarily, I will find you, and throw sporks at your keyboard.

Assuming you didn’t run into any crazy locking problems or sticky-sessions that denied your attempt to force SINGLE_USER mode or detach the DBs, you should be good to go!  I even considered building all this into an Agent Job, but… I’m worried it would get tripped up and leave the DBs in a half-arsed state that wouldn’t make our Dev-managers very happy.  But hey, that’s what we have test environments for, right?!?

RIGHT.

Here’s the code!  Enjoy.  =)
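For the skimmers, a condensed sketch of those steps follows — the database name, drive letters, paths, and logical file names below are invented, so adjust them for your environment:

-- 1. Freeze writes on the original DB.
ALTER DATABASE ERP SET READ_ONLY WITH ROLLBACK IMMEDIATE;

-- 2. Back up (ideally to the fast storage), then restore under a new name onto the new drives.
BACKUP DATABASE ERP TO DISK = N'E:\Backup\ERP_migrate.bak' WITH COPY_ONLY, COMPRESSION;
RESTORE DATABASE ERP_flash
    FROM DISK = N'E:\Backup\ERP_migrate.bak'
    WITH MOVE N'ERP_data' TO N'E:\Data\ERP.mdf',     -- logical names must match your DB's
         MOVE N'ERP_log'  TO N'M:\Logs\ERP_log.ldf';

-- 3. The brief 'down down' window: kick everyone out and detach both copies.
ALTER DATABASE ERP       SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
ALTER DATABASE ERP_flash SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
EXEC sys.sp_detach_db @dbname = N'ERP';
EXEC sys.sp_detach_db @dbname = N'ERP_flash';

-- 4. Re-attach with swapped names: the flash-storage copy takes over the original name.
CREATE DATABASE ERP
    ON (FILENAME = N'E:\Data\ERP.mdf'),
       (FILENAME = N'M:\Logs\ERP_log.ldf')
    FOR ATTACH;
CREATE DATABASE ERP_old_spinning_rust
    ON (FILENAME = N'D:\Data\ERP.mdf'),
       (FILENAME = N'L:\Logs\ERP_log.ldf')
    FOR ATTACH;

-- 5. Put the 'new' ERP back into service.
ALTER DATABASE ERP SET READ_WRITE;
ALTER DATABASE ERP SET MULTI_USER;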

Finding Trigger Dependencies

Preamble

In SQL, TRIGGERs are often a subject of contention.  Some data professionals have strong opinions about them, especially on the “they’re bad!” side.  But like anything, they’re a specific kind of tool that has a place in a specific problem.  One of the common scenarios for their usage is what we call an “Audit trail” — a running log of any & all changes made to a table or data-set (or even a database schema, as RedGate DLM Dashboard employs).

Yes, there are newer product features built around this very requirement, such as Change Data Capture (CDC) and Temporal Tables, but they require particular versions and editions of the product, and not every environment is going to be there.  So we need something native and compatible to the lowest common/supportable denominator.

Like most areas of the database environment, triggers are often poorly documented.  So what happens when we need to review all the triggers in a database and show what tables they’re on & what columns they’re “tracking” (sticking with the audit-trail example)?  Or, more generally, you could say, we want to find all table & column dependencies of our triggers.  Well sure, we could use the SSMS Object Explorer GUI — find the table, the triggers, right-click and “Show Dependencies”.. but that’s not good enough, for two reasons: A) it doesn’t get down to the column level, and B) it’s not script-able.

System catalogs and DMVs to the rescue!

dmv-line-of-people
No, not THAT DMV…

Disclaimer: this is heavy on MS Docs links.  There was a StackOverflow Q/A that helped point me in that direction, but sadly I lost the tab and didn’t save the link.

The Goals

In this script, I mostly focus on the UPDATE triggers, because they’re the most heavy-handed in terms of auditing.  Whereas with a typical INSERT or DELETE, the triggered audit action is basically “Hey, a row was inserted/deleted, here’s its primary key” — and sure you can get fancier and log what all the values were in the entire row — those are fairly straightforward.  It’s basically “all or nothing”.  In contrast, with an UPDATE audit, we want to know just what was updated, and from what (old value) to what (new value).

So the goal of documenting the dependencies is to see which columns we care about — because, inevitably, there are some columns that we either “don’t care about” for auditing, or some that just plain don’t change (by their nature — they’re an identity or a primary key, or controlled via other means).  In turn, this helps us understand how to reproduce these audit mechanisms when a better technology comes along — such as CDC or temporal tables, as alluded to earlier.

cdc-centers-for-disease-control
No, not THAT CDC… yeesh.

The Walkthru

We start with the system catalog views sys.tables and sys.triggers.  This establishes our primary relationship — a trigger ‘lives’ on (is attached to) a single table.  Then we immediately step into murky water.  A trigger, like other programmable objects, has a definition script, which you can view in a myriad of ways — Object Explorer, sys.sp_helptext, sys.sql_modules, sys.syscomments (deprecated), or OBJECT_DEFINITION(OBJECT_ID).  So if we really wanted to go spelunking, we could just read the trigger’s definition and see what columns it deals with.  But after about 2 or 3 of those, our eyes are going to glaze over and our fingers will start to twitch.  We need something better, something that can handle hundreds of triggers and output a nice unified result!
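For instance, with a made-up trigger name:

SELECT OBJECT_DEFINITION(OBJECT_ID(N'dbo.trg_Orders_Audit')) AS TriggerDefinition;
-- or:
EXEC sys.sp_helptext N'dbo.trg_Orders_Audit';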

Oh wait, what’s this?  Object Explorer has a “View Dependencies” function.. aw, bummer, it doesn’t show columns.  Just tables.  As we know, most of the GUI functionality is driven by existing system meta-data/catalogs/dmv’s, so there’s definitely something that keeps track of these dependencies.  Let’s hazard a guess… sys.sp_depends or sys.sql_expression_dependencies?  The former is deprecated, the latter doesn’t give us a way to get to columns either.  But thankfully, on the docs, the deprecated note points us toward the correct path: sys.dm_sql_referenced_entities and sys.dm_sql_referencing_entities.

Now we’re cooking with gas!  So how do I use these suckers?  Well, there’s a couple intermediary joins I need to go thru, and in fact these are functions, not views, so we need to know the proper arguments.  But fear not, I have done the hard work for you!  Check out the code and let me know what you think.

But wait, there’s more!  There’s a slight monkey-wrench here.  You see, in the column dependencies, we’re getting both the “audited” columns and the “key” columns.  But we’d like those to be distinguished from each other.  When we document and/or replicate the auditing using newer technology, the primary keys aren’t what we need to track (they’re in a sense “given”, known) — we need to know which columns are “audit-able” and which aren’t.  Hence the two queries with a UNION ALL.
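Here’s a stripped-down sketch of the core lookup — the full script adds that UNION ALL to separate the key columns from the audit-able ones, which I’ve left out here:

SELECT
    TableName     = SCHEMA_NAME(t.schema_id) + '.' + t.name,    -- the table the trigger lives on
    TriggerName   = tr.name,
    ReferencedObj = COALESCE(re.referenced_schema_name + '.', '') + re.referenced_entity_name,
    ReferencedCol = re.referenced_minor_name                    -- NULL = a table-level reference
FROM sys.triggers tr
JOIN sys.tables t
    ON t.object_id = tr.parent_id
CROSS APPLY sys.dm_sql_referenced_entities(
        QUOTENAME(SCHEMA_NAME(t.schema_id)) + N'.' + QUOTENAME(tr.name),
        N'OBJECT') AS re
-- Note: this DMF can raise an error if a trigger references an object it can't resolve.
ORDER BY TableName, TriggerName, ReferencedObj, ReferencedCol;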

union-vs-confederate
That is ALSO inappropriate. Good lord, who’s choosing these pictures today?

Conclusion

So there you have it, documenting some audit-trail triggers.  Or triggers in general.  Complete with the tables & columns on which they depend.  Enjoy!  ‘Til next time folks.