Follow-up: Cribbage “15’s Counter”

The actual method involves joining 5 copies of the table together, with each right-side table including only cards with higher ID values than the table to its left.

To be honest, my T-SQL Tuesday puzzle was a bit of a last-minute idea, which is why I didn’t have a solution ready-made. But, dear reader, you’re in luck! I have one now.

The code is over here in Gist. You can read thru it, but since the final query — the actual “answer” — is kinda ugly, let me explain my thought process.

Modeling is Important

Even when I’m putting together a silly little demo script like this, I feel that good habits and fundamentals are important. You never know what future developer might read it, copy-paste it, and say to themselves “Cool, I’m gonna follow this example when I do this other thing over here!” So you’ll see my formatting preferences, naming convention (though I must admit, I argued with myself over whether to pluralize the table names or not!), and correctly allocated Primary Keys. And since we’re modeling a card deck, even though I didn’t need to store the ‘NumValue’ (which is what you’d use for a straight/run, where the Jack is 11, Queen is 12, etc.), I did anyway.
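
For flavor, here's a rough sketch of that model. To be clear, this is a reconstruction from the description above, not the literal Gist script, so treat the names and types as assumptions:

    -- Sketch of the deck model (a reconstruction; the Gist has the real script).
    CREATE TABLE dbo.Suits
    (
        SuitID    tinyint      NOT NULL  CONSTRAINT PK_Suits PRIMARY KEY
    ,   SuitName  varchar(10)  NOT NULL  -- Hearts, Spades, Diamonds, Clubs
    );

    CREATE TABLE dbo.Cards
    (
        CardID    tinyint      NOT NULL  CONSTRAINT PK_Cards PRIMARY KEY
    ,   SuitID    tinyint      NOT NULL  CONSTRAINT FK_Cards_Suits
                                         FOREIGN KEY REFERENCES dbo.Suits (SuitID)
    ,   CardName  varchar(2)   NOT NULL  -- 'A', '2'..'10', 'J', 'Q', 'K'
    ,   NumValue  tinyint      NOT NULL  -- for runs: J=11, Q=12, K=13
    ,   PtValue   tinyint      NOT NULL  -- for counting: face cards are worth 10
    );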

Now, when we set up our “Hands”, we’re going to use two ‘PlayerNum’s, just so we can test two different hands at the same time. Cribbage can be played with 3 or 4 players, but we’re keeping this simple. Also, I could have built the hands more aesthetically, i.e. by selecting from Cards using PtValue and Suit, but again, I was trying to script quickly, so I just used the IDs that I knew from the previous query (the “full deck”). And again, there’s a “little extra” tidbit, the ‘IsCut’ indicator — we won’t be using that right now. If you’re still not sure what that means, go read the rules.
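
In the same hedged spirit, a sketch of the Hands setup (the PlayerNum and IsCut columns come straight from the description; the CardID values here are placeholders, not the actual IDs from the script):

    -- Sketch of the Hands table; the CardID values below are placeholders.
    CREATE TABLE dbo.Hands
    (
        PlayerNum  tinyint  NOT NULL
    ,   CardID     tinyint  NOT NULL  CONSTRAINT FK_Hands_Cards
                                      FOREIGN KEY REFERENCES dbo.Cards (CardID)
    ,   IsCut      bit      NOT NULL  CONSTRAINT DF_Hands_IsCut DEFAULT (0)
    ,   CONSTRAINT PK_Hands PRIMARY KEY (PlayerNum, CardID)
    );

    -- Two 4-card hands plus the shared 'cut' card (IsCut = 1).
    INSERT INTO dbo.Hands (PlayerNum, CardID, IsCut)
    VALUES (1,  3, 0), (1, 19, 0), (1, 32, 0), (1, 45, 0), (1, 51, 1)
    ,      (2,  7, 0), (2, 11, 0), (2, 24, 0), (2, 38, 0), (2, 51, 1);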

The Method

At the end of the original post, I mentioned loops and cursors as possible routes to a solution. That may still be true, but I decided to challenge myself to avoid them. Not because they’re “always bad”, as popular media would have you believe; they’re just often an indicator that a developer isn’t thinking in set-theory when they probably should be.

Let’s start with some basic principles. You have 5 cards in your hand. It takes a minimum of two cards to make 15 (examples include Jack+5, 6+9, etc.), and up to a maximum of… you guessed it, five cards. So we need to check all combinations of any two, three, four, or five cards: that’s C(5,2) + C(5,3) + C(5,4) + C(5,5) = 10 + 10 + 5 + 1 = 26 candidate combos. We cannot re-use a card within the same combination; and putting the same three cards in a different order, for example, does NOT count as a separate combo (another ’15’).

So as you start to think about these rules, and if you’ve been around data for a while, especially data with identity values, you might have a little light-bulb moment. “Aha! I know how to do that. We can simply order the combos by the ID value, and that way we won’t allow duplicates!” And that’s kinda what I did, by enforcing, in the JOIN predicates, that every subsequent derived table has a ‘CardID’ greater than the prior one. But I’m getting ahead of myself.

The actual method here involves JOINing 5 copies of the table together, mainly just on PlayerNum, but also, as I said, with each right-side table including only cards with higher ID values than the table to its left. In this way, we ensure that we’re not allowing the same cards to be “joined” to each other, i.e. we’re removing them from the right-side tables.

And finally, we have four OR‘d conditions: simply, “do any of those combinations add up to 15, by the Card’s PtValue?” These are echoed in the CASE expression in the SELECT list, where we want to essentially “show the combo”, i.e. tell you what cards make up the ’15’. (Again, for style’s sake, we have an ELSE, but we don’t really need it because it’ll never actually happen.)
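
Put together, the shape of it is roughly this. It's a sketch, not the verbatim Gist query: I've folded the five derived tables into one CTE for brevity, and the DISTINCT is there because a two-card '15' would otherwise repeat once for every way the later joins can extend it:

    -- Sketch of the final query's shape; see the Gist for the real thing.
    WITH H AS
    (
        SELECT h.PlayerNum, h.CardID, c.PtValue
        ,      CONCAT(c.CardName, LOWER(LEFT(s.SuitName, 1))) AS Card  -- e.g. '6s'
        FROM dbo.Hands h
        JOIN dbo.Cards c ON c.CardID = h.CardID
        JOIN dbo.Suits s ON s.SuitID = c.SuitID
    )
    SELECT DISTINCT
        h1.PlayerNum
    ,   CASE
            WHEN h1.PtValue + h2.PtValue = 15
                THEN CONCAT(h1.Card, '+', h2.Card)
            WHEN h1.PtValue + h2.PtValue + h3.PtValue = 15
                THEN CONCAT(h1.Card, '+', h2.Card, '+', h3.Card)
            WHEN h1.PtValue + h2.PtValue + h3.PtValue + h4.PtValue = 15
                THEN CONCAT(h1.Card, '+', h2.Card, '+', h3.Card, '+', h4.Card)
            WHEN h1.PtValue + h2.PtValue + h3.PtValue + h4.PtValue + h5.PtValue = 15
                THEN CONCAT(h1.Card, '+', h2.Card, '+', h3.Card, '+', h4.Card, '+', h5.Card)
            ELSE 'n/a'  -- unreachable given the WHERE; kept for style
        END AS Combo
    FROM H h1
    LEFT JOIN H h2 ON h2.PlayerNum = h1.PlayerNum AND h2.CardID > h1.CardID
    LEFT JOIN H h3 ON h3.PlayerNum = h2.PlayerNum AND h3.CardID > h2.CardID
    LEFT JOIN H h4 ON h4.PlayerNum = h3.PlayerNum AND h4.CardID > h3.CardID
    LEFT JOIN H h5 ON h5.PlayerNum = h4.PlayerNum AND h5.CardID > h4.CardID
    WHERE h1.PtValue + h2.PtValue = 15
       OR h1.PtValue + h2.PtValue + h3.PtValue = 15
       OR h1.PtValue + h2.PtValue + h3.PtValue + h4.PtValue = 15
       OR h1.PtValue + h2.PtValue + h3.PtValue + h4.PtValue + h5.PtValue = 15;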

Now, it does look kinda ugly. It’s not very extensible — meaning, if you wanted to scale it up to find the ’15’s in a 6- or 7-card hand, or you wanted to look for other kinds of combos (like ’18’s or ’27’s), you’d end up re-writing a good portion of it, or at least copy-pasting a lot. Fortunately for us, Cribbage is fairly simple in this regard — your hand is always the same size, and you only ever care about ’15’s.

(Well, and pairs, 3- and 4-of-a-kinds, straights, flushes, knobs, etc., but again, read the rules if you’re curious. We kept this very simple by limiting ourselves to just one small fraction of the game mechanics.)

The cool thing about this sample, though, at least to me, is that you’re already set up to build on it if you want to try out other Cribbage mechanics. Or even other card games, if you just use the base Suits & Cards.

What Did We Learn?

What’s the point of a puzzle like this? Well, besides introducing you to a fantastic card game (if you didn’t already know about it), the point is to make your brain think in a different way than usual. Are any of us programming card games using a SQL back-end? Probably not. (Although an in-memory equivalent like SQLite or something might be viable!) But the next time you have a “combinations problem” with some real-world data, you might wonder if a method like this could come in handy. Or at least, if it could work out better than a double-nested-loop. =)

PS: I believe, instead of the LEFT JOINs, we could have used OUTER APPLYs. We’d move the conditions from the JOINs into the inner WHERE clause of each derived table, i.e. “this ID > previous ID” and “PlayerNums are equal”. If you’re curious, try it out!
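
For the curious, here's a sketch of that variant, trimmed to the two-card case (the full version would keep the same CASE expression and four OR'd conditions as above):

    -- OUTER APPLY variant: the predicates move into the derived table's WHERE.
    WITH H AS
    (
        SELECT h.PlayerNum, h.CardID, c.PtValue
        FROM dbo.Hands h
        JOIN dbo.Cards c ON c.CardID = h.CardID
    )
    SELECT h1.PlayerNum, h1.CardID AS Card1, h2.CardID AS Card2
    FROM H h1
    OUTER APPLY (SELECT x.CardID, x.PtValue
                 FROM H x
                 WHERE x.PlayerNum = h1.PlayerNum  -- "PlayerNums are equal"
                   AND x.CardID    > h1.CardID     -- "this ID > previous ID"
                ) h2
    WHERE h1.PtValue + h2.PtValue = 15;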

T-SQL Tuesday #114: A Puzzle

One of the main things a new cribbage player needs to learn is how to easily spot the combos that make ‘a 15’ (the ways to combine cards to add up to a numeric value of 15). Let’s do that with SQL!

It’s that time again! The 2nd Tuesday of the month, T-SQL Tuesday. This month’s invitation is on the lighter side, which is nice, and it comes from Matthew McGiffen (b | t). The theme is “Puzzle Party!” And I’m going to cheat, since it’s getting horribly late already and I’m lacking in inspiration.

So, I propose a puzzle! Which you must solve using SQL. Then I’ll post my own solution in a day or two. Bwahahaha.

I actually really wanted to do a Sudoku solver, but @SQLRnnr beat me to it. By a few years. =P   I might still work on that when I’m bored, just to have a standby for another blog post. Maybe we’ll compare notes.

But for now…

Do You Even Cribbage, Bro?

If you’ve never heard of the card game cribbage, it might sound weird. When you read the rules, it sounds even weirder. Legend has it that it was invented by drunk Englishmen in a pub. Reality is actually not that far off. It’s also heavily played by Navy submariners, and that’s how it was passed down in my family.

There are already many great mobile & web versions of the game, and it will quickly become obvious to anyone who’s tried to program a card game before, that a query language like T-SQL is NOT suited (omg see what I did there?) to the task. However, we can probably come up with a small sub-task of the game that’s acceptable for our purposes.

Enter: the hand scorer. There’s a nice example of a finished product here. The input would be a set of 5 ‘cards’ — the ‘hand’ has 4, and the ‘cut’ adds 1 more, used as part of each player’s hand in scoring (like community property). A ‘card’ is simply an alphanumeric value — 1-10 plus JQK (which are ‘worth’ 10 for arithmetic, but can be used like normal for ‘straights’ aka ‘runs’) — and a ‘suit’ (heart, spade, diamond, club). Think for a moment on how you’d store that as a data structure.
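
If it helps to have something concrete to stare at, here's one possible shape (purely hypothetical; picking your own structure is half the puzzle):

    -- One possible (hypothetical) way to hold the input. Not "the answer"!
    DECLARE @Hand table
    (
        CardValue  varchar(2)  NOT NULL  -- 'A'..'10', 'J', 'Q', 'K'
    ,   Suit       char(1)     NOT NULL  -- 'h', 's', 'd', 'c'
    ,   IsCut      bit         NOT NULL  -- the 5th, shared card
    ,   PRIMARY KEY (CardValue, Suit)
    );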

The output, then, is a single numeric value, the ‘score’. But how do you score? You look for the following: combinations of any numeric values that add up to 15; pairs, 3-of-a-kinds, or 4-of-a-kinds; straights (suit does not matter); a flush, if all 4 ‘hand’ cards are the same suit (and a bonus point if the ‘cut’ card matches as well). And then there’s a funky thing where you get an extra point if you have a Jack that matches the suit of the ‘cut’ card. o_@

Dude… What?

Wow, that sounds complicated, no? Let’s make it simpler. One of the main things a new cribbage player needs to learn is how to easily spot the combos that make ‘a 15’ (the ways to combine cards to add up to a numeric value of 15). For each ’15’ you make, you score 2 points. That sounds pretty feasible in SQL, right?

For starters, we don’t really care about suit anymore. But we do need some way to distinguish the cards from each other. This is a single-deck game, so you’re never going to have more than 4 of the same number; never more than one of the same card (like the Ace of Spades). And when you’re counting combinations (or is it permutations?), you can’t use the same card twice. So let’s still use the suits for card distinction; I’ll just suffix the number with an ‘h’, ‘s’, ‘d’, or ‘c’.

We also don’t care about differentiating a 10 or J/Q/K, since they’re all just worth 10, numerically. So your ‘input’ can just consist of five numbers between 1 and 10. Cool? Just find the ’15’s!

Example:

  • Your hand is 3h, 6s, 6d, 9c, and the ‘cut’ is 3c.
  • Combos for ’15’: 6s+9c, 6d+9c, 3h+3c+9c, 3h+6s+6d, 3c+6s+6d.

That’s five unique combos, for a total of 10 points! Good job; that’s a bit better than an average hand. In cribbage lingo, you’d say it like so: “fifteen two, fifteen four, fifteen six, fifteen eight, and fifteen ten.” Or if you’re more experienced, you’d abbreviate it to simply “two four six eight ten”.

In “normal” programming land, we’d probably use a loop and some branching logic. What will we do in SQL? A loop, a cursor, or something more (or less!) elegant? You decide!

I’ll come up with something solution-y soon. Enjoy! ❤

cribbage board close-up of winning peg and partial hand
Red won by 2 points! Close game.

T-SQL Tuesday #113: Personal-Use Databases

So when I dived down the rabbit-hole of the Nested Set Model, of course I created a sample database to write & test the code against.

It’s that time again! This month, Todd Kleinhans (b/t) asks us how we use databases in our day to day life, i.e. personal use or “outside of our work / day-job”. Actually, the question is kinda vague — because if you think about it, we all use TONS of databases in our daily lives. Your phone’s contact list, your calendar, online shopping, banking.. the list goes on. As I’ve said before, Data is everything.

But what I think he meant, and the way most of the community has interpreted it, is “How do you manage/administrate/build/work-with/develop databases in your day-to-day life outside of work?”. So we’ll go with that.

Now this may out me as “not a real DBA” or some such nonsense, but honestly.. I don’t spend much of my time creating silly playground databases. Not that anybody else’s are ‘silly’ — just look at some of the fantastic posts for this month! Such neat ideas brought to life.

Special shout-out to Kenneth Fisher, who, if you look closely at his screenshot (and it’s not even related to this post), committed the abhorrent sin of creating a database name of pure emojis — FOR SHAME sir! But also you’re awesome. ❤

Me, I’m more of a quick-n-dirty spreadsheet guy. If I need, say, an inventory of my computer parts & gadgets so I know what I can & can’t repair, what materials I have to work with as I tinker, etc.. well, I create a Google Sheet. And it serves my needs well enough. (Spoiler alert: yes, you can view that one; I shared it. But it’s fairly outdated since I moved in March and haven’t had time to re-do inventory since.. last autumn.)

But for blogging in the tech field, you gotta get your hands dirty. So when I dived down the rabbit-hole of the Nested Set Model, of course I created a sample database to write & test the code against. And there have been some additional bits & pieces for blog demos and GitHub samples.

Most of the time, I’m creating databases / entities on SQL 2016 Developer Edition. Of course by now, that’s 2 major versions ‘behind’, but since I don’t run Linux personally (yet?), and I’m not a conference speaker (yet??), I don’t feel a burning need to upgrade. It’s FAR superior to Express Edition, though, so please for the love of all that is holy, if you find yourself using Express for personal/playground use, save yourself the headache and go grab Developer.

Containers/Docker? Meh. If you want to start playing with those, definitely look at 2017 or higher. It sounds appealing in theory — “just spin it up when you need it, spin it down when you don’t!” — and I’m sure that’s great if you’re starved for resources on whatever laptop you’re working with, but if you’ve done your due diligence and set your local SQL instance to appropriate resource limitations (hello, ‘max server memory’ and file-growths!), I’ve found that its impact is quite tolerable.

But come now. Surely this isn’t just a “shameless self-promotion” post or a grumpy-old-DBA “get off my lawn” post. Right?? Right!

To you folks out there creating your own nifty little databases for personal projects, learning/development, or even hopes & dreams of building a killer app on top of it one day — you’re amazing! Keep doing what you do, and write about it, because I love reading about it. Heck, go try to create the same model in PostgreSQL or MariaDB and see how it goes. We could all use a little cross-stack exposure once in a while.

That’s all I have for this month; short & sweet. I need to finalize plans for virtualizing our main SQL instances (which is really just a migration off bare-metal & onto VMs) within the coming weeks. Yes, we’re that far behind the curve. Now get off my lawn!

=P

clint eastwood frowning angrily
I’m old and racist! But I’m still adorable for some reason!

T-SQL Tuesday #112: Cookies!!

..this analogy of “dipping into the cookie jar”. What events or accomplishments can I take sustenance from, draw strength from, during these times?

Hi folks. It’s been a minute. Frankly it’s been a rough 5 months. From losing my wife, to dealing with the holidays and her birthday, to moving houses again. On the career front, I’m faced with the challenge of virtualizing our core business-critical SQL instances with minimal downtime. And obviously, because of all that personal/life stuff, it’s been difficult to stay focused and productive.

So this #tsql2sday‘s topic is poignant, I suppose — this analogy of “dipping into the cookie jar”. What events or accomplishments can I take sustenance from, draw strength from, during these times?

As usual, we must thank our host, Shane O’Neill (b|t), and remind readers to check out tsqltuesday.com.

Encouragement

Back when I first took this current job, I was worried sick about doing the commute every day. One and a half to two hours in traffic each way. Even if I used toll-roads, it might save 10-15 minutes, but it was still super stressful. But my wife never stopped encouraging me, telling me it would pay off. She put up with the crazy hours, she checked on me periodically, she stayed on the phone to keep me awake sometimes. She reminded me often, when the time was right, to have the telecommute/remote-work conversation with management.

And of course, to nobody’s surprise, she was right. I now work from home 4 days a week, take a vanpool the 5th day, and am much happier and more productive (in general!), much less stressed, and a markedly better driver. More importantly, because of her unwavering support, I can still look back and draw renewed energy from those memories and from her still-present spirit that stays with me in my heart.

the wife is always right
GOTO 1

Accomplishment

One of the first big projects on my plate was upgrading & migrating the SQL 2005 instances to new hardware and SQL 2016. We didn’t use anything super fancy for the actual migration, just backup shipping, essentially. DbaTools came in very handy for copying over the logins/users without having to hash/unhash passwords. The main hiccups were around Agent Jobs and Replication. Jobs were turned off at the old server before they were enabled on the new one, but due to a lack of documentation and understanding of some of them, it became difficult to see which were critically needed “now” vs. which could wait a while, and which might have dependencies on other SQL instances (via linked-servers) that may or may not have moved.

And replication is just a PITA in general. Fortunately, I took this as a ‘growth opportunity’ to more fully document and understand the replication environment we were dealing with. So at the end of the project, we had a full list of replication pub-subs with their articles, a sense of how long they each took to re-initialize, and a decision-process to consult when adding-to or updating articles within them.

Continuous Learning

A similar upgrade-migration project came to fruition in 2017: the ERP system upgrade. This was a delicious combo meal of both the database instance upgrade (SQL 2008R2 to 2016) and the application & related services/middleware (Dynamics NAV 2009 to 2017). And as I blogged about, it came with a super sticky horrible side-effect that we’re still dealing with over a year later: a different collation than the rest of our DB environment.

Which reminds me, I need to do some follow-up posts to that. Briefly, the “best” band-aids so far have been thus:

  1. If only dealing with the ERP entities, without joins to other DBs, just leave collations alone. The presentation layer won’t care (whether that’s SSRS or a .NET app).
  2. If relating (joining) ERP entities to other DB entities, I’ll load the ERP data into temp-tables that use the other DB’s collation, then join to those temp-tables. The “conversion” is essentially ‘free on write’, meaning when we load the weird-collation data into a temp-table that’s using the normal collation, there’s no penalty for it. (See the sketch just below.)
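
Roughly, band-aid #2 looks like this. It's a sketch with made-up names; the ERP-side database, table, and columns are all placeholders:

    -- Sketch of band-aid #2 (all names hypothetical). The temp-table takes the
    -- current DB's collation via DATABASE_DEFAULT, so the conversion happens
    -- once, at load time, and the later joins need no COLLATE clauses
    -- (which would otherwise spoil index seeks).
    CREATE TABLE #ErpCustomers
    (
        CustomerNo    varchar(20)   COLLATE DATABASE_DEFAULT  NOT NULL PRIMARY KEY
    ,   CustomerName  varchar(100)  COLLATE DATABASE_DEFAULT  NOT NULL
    );

    INSERT INTO #ErpCustomers (CustomerNo, CustomerName)
    SELECT erp.[No_], erp.[Name]
    FROM NAV2017DB.dbo.[MyCompany$Customer] erp;  -- placeholder ERP entity

    SELECT o.OrderID, c.CustomerName
    FROM dbo.Orders o                             -- placeholder local entity
    JOIN #ErpCustomers c ON c.CustomerNo = o.CustomerNo;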

As I said, I’ll try to dive into this more in a future post. It’s been a life-saver for performance problems that cropped up as a result of the upgrade & the different collations.

But the point here is that, even though this project didn’t end up as wildly successful as we’d hoped, it’s still a success, because we learned some key lessons and were able to pivot effectively to address the problems in a reasonable way. And frankly, there was no going back anyway; it’s not like the business would have said “Oh, never mind, we’ll just stick with the old versions of everything” (even though some reticent managers would probably have enjoyed that!). So even when things seem bleak, there’s always a way around.

when the going gets tough, the tough take a coffee break
Wait a minute…

Conclusion

I’m still trying to figure out what this new chapter of my life looks like, without her. I’m still trying to be the very best DBA & BI-Dev I can, supporting the dozens of requests & projects that the business throws at us. Fortunately, with the incredible SQL Community, a wonderfully supportive family, and a fantastic team of colleagues, I remember how far I’ve come, on whose shoulders I stand, and how much promise the future holds.

Even though the one person I was meant to share it all with is gone, she still smiles down and encourages me in subtle ways.

…with love & light ❤

T-SQL Tuesday 106: 5 Things Not to Do With Triggers

Here are a handful of anti-patterns that I’ve seen with triggers in my time…

This month’s invite comes once again from our benevolent overlord* community pillar Steve Jones, head of SQLServerCentral.com! If you haven’t been there, stop reading immediately and go check out the helpful forums and ‘Stairways’ articles. Some truly excellent content to be had.

No, don’t stop reading immediately… save it to your favorites/reading-list and look at it in 5 minutes when you’re done here.  =)

*Though I’ve not had the pleasure of meeting him in person, I’ve heard Steve is a phenomenally humble and down-to-earth guy, so my silly comment about him is even sillier in that light. ❤

Triggers – Love ’em or Hate ’em

Borrowing a bit of a mini-game from the CodingBlocks guys, the first “Google-feud” (auto-complete result) for “sql server triggers are ” is “sql server are triggers bad”.  Well, they can be, if used improperly / to excess.  Like most bits of tech, they have their place and their use-cases.

I’ve blogged about a few such use-cases here (a “who did what” audit-trail type of trigger) and here (as part of the Nested Set Model implementation), which you should definitely read. I didn’t condone click-baity titles back then. Alas…

click this right now omg
All we need now is a “blink” tag…

Here are a handful of anti-patterns that I’ve seen with triggers in my time.

Thing 1: Using them as Queues

Repeat after me:

A trigger is not a queue.

Triggers are executed within the same transaction as the query that fires them. Meaning, they’re subject to all that ACID-y goodness of a transactional system. This is a double-edged sword. If all is successful, it guarantees that the trigger will do its job when the calling query runs, which is great for audit-ability. On the other hand, if the trigger has a problem, anything and everything that triggers it will also fail.
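
A quick, contrived illustration of that second edge:

    -- Contrived demo: when the trigger fails, the calling INSERT fails with it.
    CREATE TABLE dbo.Orders (OrderID int NOT NULL PRIMARY KEY);
    GO
    CREATE TRIGGER dbo.trg_Orders_NotAQueue ON dbo.Orders
    AFTER INSERT
    AS
    BEGIN
        -- pretend this is "notify the downstream system"
        THROW 50000, 'Downstream system unavailable.', 1;
    END
    GO
    INSERT INTO dbo.Orders (OrderID) VALUES (1);   -- errors out...
    GO
    SELECT COUNT(*) AS OrderCount FROM dbo.Orders; -- ...and the row is gone: 0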

The fundamental distinction between this and a queue, is that the success of the queued action is not immediately critical to the continued operation of the thing that called (queued) it.

So if your requirement matches the latter behavior, and not the former, do us all a favor and use a real queue. Heck, find one of the few people who know Service Broker. (Hint: one’s a Warewolf, another’s a Poolboy.)

Thing 2: Making them WAY TOO BIG

Mostly because of the transactional thing, the rule of thumb with triggers is K.I.S.S. “Keep it Small and Simple.” Even the audit-trail example is a bit much if the table being audited is under significant write load. Typically, if the business requirements are both high-throughput and high audit-ability, you’ll be implementing a much more complicated tech-stack than just plain ol’ SQL Server with triggers.

Some of the offenders I’ve seen include: A trigger that wanted to write to several other tables with IF conditions that branched based on what column was being updated. And a trigger that required near-SA level permissions to do some back-end maintenance-y stuff. Those are both recipes for problem pie.

pie that looks like a kraken attacking a boat
We’re gonna need a bigger… fork?

Thing 3: Doing Nothing Useful

Somewhat opposite of above, there’s no point in introducing the management and performance overhead of triggers if they’re not performing a transaction-critical operation. For instance, something better left to a queue.

Thing 4: Housing Business Logic

There’s always been a holy war about whether “business logic” belongs in the database or in the application code. And, since geeks love sub-classifying things to the Nth degree, there’s a sub-holy-war about what “business logic” actually means. But I won’t go into that here.

If you fall on the 1st side of the fence, and you feel the desperate desire to embed some logic in the data layer, it belongs in stored procedures, not in triggers. Reasons include maintainability, discoverability, performance, and documentability. Not to mention source-control on procs is a helluva lot easier than on triggers.

Thing 5: Too Many of Them

there's too many of them
Stay on target..

While multiple triggers can be defined for the same action on the same table, that’s not an invitation. You can enforce trigger execution order to an extent (first and last), but any more than that and you’re asking for confusion. Falling back on the KISS principle, if you need more than one trigger on a table & action (insert, update, or delete), you probably need to rethink the underlying design.
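
For the record, the ordering knob (such as it is) is sp_settriggerorder, and 'First'/'Last' is about all it gives you:

    -- First and Last are the only guarantees; everything in between fires in
    -- undefined order. (The trigger name here is hypothetical.)
    EXEC sys.sp_settriggerorder
        @triggername = N'dbo.trg_Orders_Audit'
    ,   @order       = N'First'
    ,   @stmttype    = N'INSERT';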

Bonus

Using INSTEAD OF vs. AFTER: it’s fairly self-explanatory, but just be aware of what you’re doing, especially with the former. You’re literally replacing the desired action of, say, an update query with the contents of your INSTEAD OF UPDATE trigger. If this is not obvious to all users of this table, you’re in for some really surprised faces and angry messages.
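
A tiny, hypothetical example of how sneaky that can be:

    -- Hypothetical demo: the trigger, not the caller, decides what happens.
    CREATE TABLE dbo.Settings
    (
        SettingID  int          NOT NULL PRIMARY KEY
    ,   Val        varchar(50)  NOT NULL
    ,   IsLocked   bit          NOT NULL DEFAULT (0)
    );
    GO
    CREATE TRIGGER dbo.trg_Settings_Upd ON dbo.Settings
    INSTEAD OF UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- silently skips locked rows; the caller is none the wiser
        UPDATE s
        SET    s.Val = i.Val
        FROM   dbo.Settings s
        JOIN   inserted i ON i.SettingID = s.SettingID
        WHERE  s.IsLocked = 0;
    END
    GO
    -- An UPDATE against a locked row appears to succeed... and changes nothing.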

And that’s all I have for today folks! Enjoy triggering stuff. In moderation, as all things. =)

Just Another Brick (in the) Wall

Over-engineering may be foolish, but under-engineering is just as perilous.

This month’s T-SQL Tuesday topic, thanks to Wayne Sheffield, is a doozy.  It’s difficult to admit failures, even if we eventually overcame them.  Mostly, it’s tough to fess up to those hours upon hours of what feels like wasted time preceding the hopefully inevitable breakthrough.

Let me tell you a tale of square pegs and round holes.

Get your mind out of the gutter.

And strap in.  This puppy went over 2k words.  I didn’t intend it that way, but, c’est la vie.

A Short History Lesson

I used to work for a software company that made K-12 educational assessment & reporting products.  A large part of K12ED is dealing with learning standards, these historically fraught named concepts that students are supposed to have “mastered” and learned during their time in the US public school system.  You ever hear a reference in pop-culture or media referring to “teaching to the test” or “state standardized testing” or similar?  That’s what they’re talking about.

In the late 90’s, a bunch of states, including California in 1997, finalized and centralized a “list” of sorts, almost a database (but not formally, because let’s face it, this was still the early days of what we know as modern IT), of said Standards, which came to be unofficially known as the “CA ’97 Standards”.  For over a decade, these learning goals dictated what was taught when in our schools.  Government funding incentives, based on state standardized test scores, drove instruction to the point that teachers had little if any time to dynamically mold their students’ minds into critically thinking, multi-faceted, creative individuals.

But this article isn’t about my bias.  This is about the broad, sweeping shift in the K12ED landscape called “Common Core”.  As the technology sector grew more mature and started permeating more “traditional” industries, a vocal minority of thought-leaders had what they deemed an epiphany.

Hey, what if we took the old-guard educational bureaucracy, and all those disparate state standards, and turned it into a unified, technology-driven learning ecosystem?

Sounds great in theory, right?  I mean, surely their intentions were good?  Well, you know what they say about the road to Hell and how it’s paved…

Gosh, there goes my bias again.  Sorry, let me just tuck that back in its pocket.

Anyway.  These new Core Standards introduced some new ways of thinking.  For example, that some learning concepts are inter-related and “cross-cutting” (a fancy way of saying that sometimes a Math concept requires a fundamental Reading-Literacy knowledge-point to fully grasp).  This had some very interesting impacts on the underlying technology systems which relied on, and housed, said Standards.  Systems which, I might add, had existed for over a decade at this point, in many cases.

Bringing it Back to Data

Our company’s system was one such.  Our key partner’s system, from which we quite literally inherited the traditional, relational database structure to house the CA ’97 Standards, was another.  You see, back before RESTful APIs ran the world, software systems relied heavily on what we call “local data stores”.  In order to show the teachers and administrators, who primarily used our system, the Standards, in which their students were performing well (or poorly), we had to relate those Standards to the test questions that said students were tested on month after month, year after year.  And, like so many other small businesses of the late 90’s / early 00’s, we had a trusty ol’ SQL Server lying around, just waiting to be loaded with all our precious data.

This was fine, for the most part.  The legacy Standards conformed reasonably well to a relational data model, even though we had to throw in a bit of hierarchy (using the good ol’ adjacency list scheme).  There wasn’t a complicated set of relationships from Standards in different subjects (Math, Science, History) to each other, and people didn’t care to do much in-depth analysis beyond “are my students getting well-prepared for the state tests?”.

Enter Common Core

You parents thought these things were complicated and convoluted — just Google “common core math problem” and you’ll find no shortage of critical satire.  Well, the underlying data structures required to store these new Standards were going to be significantly more complex as well.

One of my main jobs, for about a year or two, was to oversee the importation of said Core Standards into our SQL database system.  On the surface, it seemed reasonable — we had a hierarchy concept already, and we had a roll-up & drill-down mechanism for the handful of different “levels” of said hierarchy.  But it was all very static.  What that means, for us tech-heads, is that it was not easy to change or extend; not adaptable to new and expanded requirements.  The older Standards adhered to a fairly strict hierarchy, and each level of roll-up had a distinct name.  With Common Core, they broke out of that old mold, while simultaneously keeping some of the names only to change their meaning depending on context.

Think of it this way.  A group of cattle is called a herd.  A group of sheep is also called a herd.  And a group of seagulls is called a flock.

And I ra-a-an.. I ran so far a-wa-a-ay…

Sorry, where was I?  Right, group names.  So what if the government suddenly decided for us that a group of sheep will from now on be called a ‘gaggle’.  But only if they’re all female.  If the group contains any male sheep, it’s called a ‘herd’ still.  And groups of cattle will still be called herds, unless it’s purely a group of bulls being raised for beef, in which case we call it a ‘meatsock’.

Have I lost you yet?  Of course I have!  This is pure nonsense, right?  Language does not work this way.  Moreover, hierarchies of groups of things do not work this way.

But I digress.  There was, despite my jest, a method to the madness of the new Common Core hierarchy groupings.  And I did learn it and understand it, for the most part.  The problem was that it threw our existing legacy data model to the wind.

Enter Academic Benchmarks

As usual with a legacy software system in a small company, the resources and buy-in for a data-layer overhaul were nil.  So it fell to us intrepid DB-Devs to shove that snowflake-shaped peg into the very square hole of the old relational model.  We sketched out plenty of ERDs, brainstormed how to tack-on to our existing structures, but nothing short of a miracle would make this new data conform to these old static norms.

Thankfully, the “geniuses” (and yes, that is used sarcastically) over at Academic Benchmarks, or AB for short (at least for the purposes of this post), had already done this.  And we paid them, thousands of dollars per year, for the convenience of using their GUIDs to identify Standards across different systems and vendors.  Never mind that they were just perpetuating the bad model of yesteryear; never mind that they provided zero support for data quality feedback loops.  We could happily take their Excel files or CSVs and load them nearly straight into our database.

Enter, and Exit, ASN

While I was searching for the words to express how insufficient our data model was, I came across this little gem from the Gates Foundation: Achievement Standards Network, or ASN.  (ASN stands for many other things, so as with all acronyms, it’s all about context; just fair warning for the Google-happy.)  The architects here had understood that learning standards needed a better and more flexible model, not only in terms of storage, but also in terms of data interchange format.  This new kid on the block called JSON had been making waves for a while, and was subsequently widely adopted by the tech industry in general, so it stood to reason that this would be the preferred format for publishing and serving the Standards from ASN.

Bonus: it was FREE.  Yes, really.  What a wonderful thought, I said to my team, to imagine never having to pay those crooks at AB another red cent!  Eventually.  After years of transition.  But alas, it was not to be.  See, AB had been around the industry for a long time.  They had their hooks in almost every learning content publisher and assessment vendor.  So as tempting as this shiny new source of academic truth may have been, sadly, it was not practical.

Enter the Contractor

Somewhere around the same time, we took on a promising new developer who not only had a very strong background in Math and CS fundamentals, but had also proven his worth with real-world applications and actual “in the wild” deployments and users.  He was a bit arrogant, actually.  One could argue that he’d earned it, perhaps, but we didn’t appreciate always being told everything we were doing wrong, constantly, to the point that it was hard to hear the more important bits of wisdom and insight into how we could improve.

Despite that ego, one of his biggest contributions to the team was a fresh impetus to learn new things and branch out to new technologies.  To start looking at new Javascript frameworks.  To revisit OO fundamentals like Dependency Injection, and stop writing such procedural code.  To consider NoSQL data stores.  In particular, graph data stores.

Sadly, that last part came in too little, too late.

Side-note, on the crest of the micro-services wave, he diagrammed and proposed an entire system re-write for our core software product, using micro-services architecture and the concept of small purpose-dedicated data stores.  It looked real purty, but was ultimately and completely impractical for a company of our size.  If we were a “true startup” with millions in VC funding coming out the ears, and could afford to throw a brand new, young & hungry “2 pizza team” at the product, then sure.  But that was not reality.

The Brick Wall

No two bones about it: we had to support these new Standards.  So we soldiered on with our relational database tables.  And we tacked-on additional entities and relationships, logic and code, to make it “almost” do what the Common Core Standards wanted to do.  Effectively, what we were trying to do, was to shove that pretty round sparkly peg of graph data, into the dingy old square hole of a 15-year-old SQL table structure.

Somewhere along the way, AB caught up with the times and started offering other, less “flat” formats, for their Standards data.  So even though ASN was off the table, not all of our work toward JSON ingestion/conversion went to waste.  Consuming the data exports from the vendors wasn’t a problem — we’d already been doing this.  That was not the issue.

The issue, the brick wall against which we continually banged our heads, was the fact that we just plain couldn’t support the advanced & complex relationships and groupings (categorizations) of the new Standards.  Which turned out, in retrospect, not to be the end of the world, because honestly it would take years, if not decades, for the educational system’s old-guard mentality to even comprehend such relationships and categorizations, let alone how they could help shape classroom instruction toward better outcomes for their students.

Good lord, that sounded like a bunch of jargon.

What I mean, in plainer English, is that we spent a lot of time worrying and arguing about something that did not matter as much as we thought it did.  The consumers of our system, the teachers and principals and such, honestly didn’t care about all that.  They just wanted to know if their kids were on-track to be “Proficient” on the state tests so their funding would remain steady.  (I don’t give them enough credit, I know; teachers themselves also needed to know, on a regular basis, which kids needed special attention in what areas of learning, but that’s beyond the scope of most generalized reporting tools.)

Hindsight is Always 20/20

So what was the lesson here?  I don’t want to overlook the fact that we were still using the wrong data storage technology for the problem, fundamentally.  Or at least, the wrong data model.  But, we live in a real world where software needs to be delivered on-time, in-budget, with less than perfect teams who have less experience and expertise than they feel they should.  So instead of butting our heads against that brick wall, let’s try to remember to be pragmatic.  To be realistic about what’s feasible for who & when; and to adapt the parts and segments of our application infrastructure, appropriately and efficiently, to the business problems and use-cases at-hand.  Over-engineering may be foolish, but under-engineering is just as perilous.

T-SQL Tuesday #104: Code You’d Hate to Live Without

And that’s where we could use a little encouragement, I think — another DBA saying “Yay, it’s not just me!” makes it all worthwhile.

It’s that time of the month again!  Bert‘s fantastic invitation complete with YouTube vid is tempting indeed.  There are so many wonderful community scripts and tools for SQL Server DBAs.  But, in an interesting twist, he specifically asks us about something self-written or self-created.  And I’ll admit, I sometimes have trouble being exceptionally ‘proud’ of my home-brewed stuff.  Sure, I’ve blogged and GitHub‘d, and hopefully those have helped someone along the way.  Most of the time, though, those are specifically suited to a given use-case.

The more difficult code to share, I think, is the stuff that we keep in our “daily-grind” “Get-Stuff-Done” folders.  Scripts that we’ve tweaked and commented sporadically throughout the years, finding this edge-case or that tweak for a given scenario, most of which ends up being pretty environment-specific.  And that’s where we could use a little encouragement, I think — another DBA saying “Hey, it’s not just me!”, or “I knew there was another guy/gal out there who always has to do this kind of thing!”.  Thus, here I will attempt to show off something along those lines and see what you kind folks think.

Where Does it Hurt?

No, I don’t mean the famous wait-stats queries by Paul Randal (those are awesome).  I mean, what are the top few areas of your SQL environment where you’re always questioning something, or being asked to find some answer that’s not immediately obvious?  For me, there are 3.

Replication

Transactional replication is, in a word, ‘brittle’.  It works well when it’s working, and usually you don’t have to think about it.  But when you do need to check on it, or change anything about it (god forbid), or troubleshoot it (more like shoot it, amirite?), it feels a bit like trying to mess with a half-played Jenga stack.  Meaning, you might be just fine, but you might send the whole thing crashing down and have to rebuild it all.

peanut-brittle-from-the-joy-of-baking
As brittle as this stuff. But not nearly as delicious.

I won’t go into the whole troubleshooting aspect here, that’s too much scope.  But there’s a question that can often come up, especially from Devs or Biz-Analysts, and that is: “Hey, is TableX being replicated?”  And rather than subject my poor eyeballs to the replication properties GUI, I just run a little query, which reads from the distribution database and the actual database that’s being published (the ‘publisher’ db), and tells me a list of tables & columns that are being replicated.

Here it is.  Be sure to replace [dbName] with your actual published DB name.  Like StackOverflow or WideWorldImporters, or AdventureWorks <shudder>.
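
If you just want the flavor without clicking through, the shape is roughly this (a sketch, not the linked script itself; the linked one also consults the distribution database):

    -- Sketch of the publisher-side half: lists replicated tables & columns
    -- from the standard transactional-replication metadata tables.
    USE [dbName];  -- your actual published DB name
    SELECT
        pub.name  AS PublicationName
    ,   art.name  AS ArticleName
    ,   col.name  AS ColumnName
    FROM dbo.syspublications pub
    JOIN dbo.sysarticles art
        ON  art.pubid = pub.pubid
    JOIN dbo.sysarticlecolumns ac
        ON  ac.artid = art.artid
    JOIN sys.columns col
        ON  col.[object_id] = art.objid
        AND col.column_id   = ac.colid
    ORDER BY ArticleName, ColumnName;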

Report Subscriptions (SSRS)

Another question I’m often asked is, “Hey, did ReportX run?”  What they really mean is, “Did ReportX‘s email subscription get sent to BigWigUserY?”  We have an unholy amount of SSRS reports with email subscriptions.  And because I don’t care to bloat my inbox by BCC’ing myself with every single one, I rely on the users to speak up when they don’t receive something they’re expecting.

“This is a terrible idea,” you say.  “Never trust the users!”

Yes, well, such is life.  If you have a better lazier solution I’m all ears.

So here’s a little script I wrote to query the ReportServer database (which MS will tell you is “officially unsupported”, snore).  If I know some keyword from the report title, or the supposed recipient’s email address, this will tell me if it ran successfully or not — LastRun, LastStatus.  A couple other useful bits: where it lives, ReportPath, and what its SQL Agent Job’s Name is, JobName.

That last bit is peculiar.  The JobName looks like a GUID (because it is, because SSRS just loves GUIDs), but it’s also the actual name of the agent job, which you can use to re-run said job — via exec msdb.dbo.sp_start_job — if the failure wasn’t systemic.  As I often do.
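
The shape of that one is roughly as follows (a sketch, not the script itself; since the schema is unsupported, verify column names against your ReportServer version, and note the keyword and email below are placeholders):

    -- Sketch of the subscription checker, against the unsupported
    -- ReportServer schema; keyword/email are placeholders.
    USE ReportServer;
    SELECT
        cat.[Path]                           AS ReportPath
    ,   cat.Name                             AS ReportName
    ,   sub.LastRunTime                      AS LastRun
    ,   sub.LastStatus
    ,   CAST(sch.ScheduleID AS varchar(36))  AS JobName  -- = the Agent job's name
    FROM dbo.[Catalog]      cat
    JOIN dbo.Subscriptions  sub ON sub.Report_OID     = cat.ItemID
    JOIN dbo.ReportSchedule sch ON sch.SubscriptionID = sub.SubscriptionID
    WHERE cat.Name LIKE N'%SomeKeyword%'
       OR CAST(sub.ExtensionSettings AS nvarchar(max)) LIKE N'%bigwig@example.com%';

    -- And if the failure wasn't systemic, re-run it:
    -- EXEC msdb.dbo.sp_start_job @job_name = N'<that GUID-looking JobName>';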

Disk Space

Last but not least, everybody’s favorite topic to try and forget about or pawn-off to the SysAdmins.  “How much space are those databases / data files / log files eating up?”  Well, mister suddenly-cares-about-disk-space-but-is-OK-with-storing-all-domain-users’-iTunes-music-libraries-on-the-central-fileshare-along-with-their-documents-because-that’s-what-we’ve-always-done.  {True story.}  Let me tell you.

keep calm and release the bitter
It’s healthy, I swear!

This script has a lot of comments and commented-out lines because I will tweak it depending on what I need to see.  Sometimes it’s “show me the DBs where the logical filename doesn’t follow my preferred pattern” (the DB name with ‘_data’ or ‘_log’ after it); or perhaps “Only show me files over 5GB with a lot of free space” when I’m itching to shrink something.
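
The skeleton of it, minus all those toggles, looks about like this (per-file free space has to come separately, e.g. via FILEPROPERTY(name, 'SpaceUsed') run from each database's context):

    -- Skeleton of the disk-space script; the real one has many more toggles.
    SELECT
        DB_NAME(mf.database_id)  AS DatabaseName
    ,   mf.name                  AS LogicalName
    ,   mf.type_desc             AS FileType     -- ROWS or LOG
    ,   mf.physical_name         AS PhysicalPath
    ,   mf.size / 128            AS SizeMB       -- size is counted in 8 KB pages
    ,   CASE WHEN mf.is_percent_growth = 1
             THEN CAST(mf.growth AS varchar(10)) + ' %'
             ELSE CAST(mf.growth / 128 AS varchar(10)) + ' MB'
        END                      AS FileGrowth
    FROM sys.master_files mf
    -- e.g. logical names that don't follow the preferred '_data'/'_log' pattern:
    -- WHERE mf.name NOT IN (DB_NAME(mf.database_id) + '_data',
    --                       DB_NAME(mf.database_id) + '_log')
    ORDER BY SizeMB DESC;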

“Never shrink your files!  (In production!)”

Yes, thank you, knee-jerk reactionary.  I’m dealing with servers in lower environments, usually, with this one.  😉

What do you think?

I’d love to hear your feedback in the comments!  Happy Tuesday!!  =)

Credit where credit is due.

I did not magically come up with these all by myself.  They’re pieced together from many a StackOverflow answer and/or other community blog post or contribution that I’ve since forgotten.  I usually store a link to the source in these kind of things, when it’s true copy-pasta, but in these cases, I added enough of my own tweaks & style that I no longer tracked the original linkage.  If you see anything that looks familiar, please do tell me where to give kudos and shout-outs!  😀