The Nested Set Model

The #1 rule of the Nested Set Model is: FAST READs. The #2 rule of the Nested Set Model is: see #1

There are definitely several articles out there which cover the SQL implementation of the Nested Set Model, aka “modified preorder tree traversal” (which is more the name of the algorithm by which you traverse the tree, rather than the structure itself).  But I found it interesting enough, and more importantly, applicable enough to my job experience, that I feel it deserves some treatment.  Not the basic “how to”, but more an example of a particular operation and a specific pitfall to avoid. (Jump straight to the example diagrams.)

Now, we’re not going to debate about whether this model is “the best” representation of hierarchical data in an RDBMS (some argue that Closure Tables, aka “Ancestor Tables”, or some kind of hybrid approach is better, and I’d probably agree).  The fact is, sometimes (read: almost always) as a DBA/DBDev, you’re “stuck with” an existing database in a legacy application environment that you pretty much can’t change — or if you can, changes need to be small, incremental, and non-disruptive.

Okay, with that disclaimer out of the way, let’s dive in.  First things first:

The #1 rule of implementing the Nested Set Model is: FAST READs.

I can’t stress that enough.  Fast SELECTs.  Everything else pales in comparison.  In other words, we don’t care how long and painful and slow write operations are against this table (updates, inserts, deletes), as long as our SELECTs remain super speedy.  If that is not your use-case, consider a different model.

The #2 rule of the Nested Set Model is: see #1

Moving on…

The #3 rule is: encapsulate tree operations to maintain its integrity & structure.

Put another way, the #3 rule is that you should always operate on the tree (CrUD ops) using stored-procedures and/or triggers that encapsulate all the nitty-gritty details of maintaining the correct position values during said insert/update/delete operations.  Of course, somebody is responsible for writing those stored-procs.  Any volunteers?  Easy now, don’t raise your hands all at once!  Generally, this responsibility falls to the DBA(s) or DBDev(s).
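
Just to make rule #3 concrete, here’s roughly what that encapsulation looks like as a skeleton.  To be clear, this is illustrative scaffolding, not the actual proc from my environment; the names are invented, and the numbered steps are exactly what we’ll walk through below.

CREATE PROCEDURE dbo.MoveSubtree
    @NodeID int         -- root of the subtree being moved
  , @NewParentID int    -- the node it's moving under
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRY
        BEGIN TRAN;
        -- 1. look up the subtree's width & the new parent's position values
        -- 2. make a gap at the destination (shift positions up)
        -- 3. shift the subtree into the gap & re-point its ParentID
        -- 4. close the gap left behind (shift positions back down)
        COMMIT;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK;
        THROW;  -- SQL 2012+; use RAISERROR on older versions
    END CATCH
END

The transaction is the important part: a failure halfway through steps 2-4 would leave the position values in a corrupted, half-shifted state.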

The problem at-hand, in my current situation, was that of “moving a sub-tree”, i.e. taking a node and all its descendants, and moving it to place it under another “parent” node.  In some models, and/or in some languages, this is a simple recursive operation.  However, SQL is not spectacular at recursion — after all, we’re working in a relational engine — so let’s try to play to its strengths:

namely, SET-BASED operations!

A previous DBDev had written a stored-proc for just such an operation.  However, as (somewhat) expected, it was horribly slow, to the tune of hours of run-time.  This is not acceptable, even given the #1 rule stated above.

Well it turns out that most of it was pretty efficient, but the last step, in which they attempted to “fix” the left/right values in the entire table “just to make sure we didn’t leave any gaps”, was, frankly, quite silly, because the only “gaps” you create are created by the previous steps in the proc, and you know exactly how big that gap is (the width of the subtree you’re moving), and where it is, so you can target that specific area of the tree and close the gap more intelligently, using some simple math. (addition and subtraction — the simplest math there is!)

Doing that improved the performance of the whole proc by a factor of 10.  That’s huge.  Or, “yuuuuge“.

So let’s get specific.  As you’ll see from my diagrams, the model actually is a hybrid, combining an Adjacency List (each record knows its “parent”) with a Nested Set (each record has a “left” & “right” position value).  We do this for two big reasons.  First, having the parent relationship along with the position values makes all that nasty book-keeping (rule #3) a bit easier to manage (and to check our work).  And second, because, conveniently, we can store the data from both models in one table.
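
In table form, that hybrid looks something like this.  (A minimal sketch inferred from the diagrams; the exact names & datatypes in your schema may differ.)

CREATE TABLE dbo.Cats
(
    CatID    int IDENTITY(1,1) PRIMARY KEY
  , CatName  varchar(50) NOT NULL
  , ParentID int NULL REFERENCES dbo.Cats (CatID)  -- Adjacency List: who's my parent?
  , PLeft    int NOT NULL  -- Nested Set: left position value
  , PRight   int NOT NULL  -- Nested Set: right position value
);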

On to the examples!

First, we have our tree of Cats.

cat-tree-1
Or, as a coincidentally cute table alias, CatTree

Now, we want to move Jack & his children to become descendants of Mittens (Jack being the child, Smush & Smash being grandchildren).  So we start by “making a gap” of the subtree’s “width” (6, the distance between Jack’s PLeft and PRight inclusive of end-points).  We add that amount to all PRight values >= Mittens’ original PRight,  and add it to all PLeft values > Mittens’ PRight — see the blue #s in diagram below, and code here:

--Step 1: make a gap at the destination.
--Shift everything at/after the new parent's (Mittens') original
--PRight upward by the subtree's width (@SubtreeSize = 6 here).
UPDATE Cats
SET PLeft = (CASE WHEN PLeft > @NewParentRight
             THEN PLeft + @SubtreeSize
             ELSE PLeft END)
  , PRight = (CASE WHEN PRight >= @NewParentRight
             THEN PRight + @SubtreeSize
             ELSE PRight END)
WHERE PRight >= @NewParentRight;

The red values haven’t changed (yet) but are now wrong, so we’ll have to fix them next.  And of course the green values are the moved subtree’s new positions based on the new parent’s (Mittens) PLeft.

cat-tree-2
Jack is now Mittens’ child.
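
For completeness, the middle step (actually sliding Jack & company into the gap) is just one more set-based UPDATE, plus a quick fix-up of the Adjacency List side.  Here’s one way to do it, in the same variable-naming style as the other snippets; note that @SubtreeOldLeft, @MovedNodeID, and @NewParentID are names I’m inventing here, and this assumes (as in our example) the subtree sat to the left of the newly-made gap:

-- Shift the subtree into the gap. Its nodes still have their old
-- (red) positions, so we slide them all up by a fixed offset; they
-- land just inside the new parent's (new) PRight, as its last child.
DECLARE @Offset int = @NewParentRight - @SubtreeOldLeft;

UPDATE Cats
SET PLeft  = PLeft  + @Offset
  , PRight = PRight + @Offset
WHERE PLeft >= @SubtreeOldLeft
  AND PRight <= @SubtreeOldRight;

-- And keep the Adjacency List honest:
UPDATE Cats
SET ParentID = @NewParentID
WHERE CatID = @MovedNodeID;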

Finally, now that we’ve moved Jack & his children under Mittens, we need to “close the gaps” that we created at first, to make sure that the tree’s position values remain contiguous.  This isn’t as difficult as it sounds: if we’ve stored Jack’s original PRight value (10), we can use that as a cutoff to subtract the subtree width from higher position values and intelligently (and quickly) close the gaps we created before.  Again, code & diagram:

--Notice this looks very similar to the previous
--code snippet! (We're basically doing the reverse)
UPDATE Cats
SET PLeft = (CASE WHEN PLeft > @SubtreeOldRight
             THEN PLeft - @SubtreeSize
             ELSE PLeft END)
  , PRight = (CASE WHEN PRight >= @SubtreeOldRight
             THEN PRight - @SubtreeSize
             ELSE PRight END)
WHERE PRight >= @SubtreeOldRight;

cat-tree-3
Red values indicate “closing the gap” that was created by removing the subtree of Jack. Blue values indicate the incidental gap closures for the rest of the tree (above and right). Green values, you’ll notice, are “reverted” (i.e. same as they were originally).

SQL-wise, this should translate pretty well.  I’ve posted the setup and stored-proc scripts to GitHub, so the discerning reader can review and offer feedback.  In theory, there’s probably a way to exclude the green reverted values from the first pass operation (gap-making) so that we don’t have to revert them (at gap-closing), but again, since we’re doing SQL set-based operations, it seems hardly worth the effort — i.e. the potential speed gain would be outweighed by the logical/maintenance complexity.

So what’s the lesson here?  Well hopefully, if you’re “stuck with” a SQL DB with a Nested Set Model table containing a hierarchical tree of data, you don’t have to completely re-invent the wheel and write your CrUD ops from scratch.  But if your predecessors didn’t plan for certain kinds of operations, and this “move a subtree to a new parent” happens to be one of those, this should help you (re)implement it efficiently.

I’d love to get some feedback on this.  Let me know if I’ve missed anything conceptually, if there are better ways or methods to doing any of this, or any other tips & tricks that folks might have for dealing with such data.  Leave me a comment!

[footnote 1]
The root of the problem, in this case, was simply taking the code from a slideshare presentation and copy-pasting it into the routine without analyzing its effectiveness and efficiency.  It proposed re-calculating the position values after a move, across the entire tree, by using a triple-cartesian-product (or cross-join) to “get the count of nodes to the left/right of each node” for every node, which should sound dirty even as you say it silently in your head, let alone attempt to write it in query form!

[footnote 2]
There’s a 3rd model that we could consider storing in the same table, called “Enumerated Path” or “Materialized Path” or “Breadcrumbs”, which may look good on paper and to your human eyeballs, but breaks down spectacularly when you start talking performance and scale — but to be fair, so do most of these models, eventually, in one way or another, which is why we’ve invented fantastic alternative technologies to address these problems… and frankly, if you’re using all 3 models at once, you’re #DoingItWrong, creating a veritable maintenance nightmare for yourself and everyone around you.  Note that the elusive 4th model, the Ancestor Table, requires (as the name would imply) another table — not an argument for or against anything, just an observation.

PS: Happy 2017!

Dates, Times, and Datetimes, Oh My!

There’s a tool for every job. Just stop abusing the tool!

This MSDN page, CAST and CONVERT (T-SQL), specifically the section on DATETIME conversion formats, is easily one of my most frequently visited links.

It really shouldn’t be.

SQL Server is very good at storing and manipulating Date/Time values.  There are dedicated data-types for all flavors — DATETIME, DATE, TIME, the newer DATETIME2, and the less common SMALLDATETIME and DATETIMEOFFSET.  Then there are the functions & operators that let you do all sorts of fun stuff with them — DATEDIFF, DATEADD, DATEPART, GETDATE, ISDATE, and even some newer ones like EOMONTH and DATEFROMPARTS.  These are really powerful tools in the hands of a DB-Developer or DBA.
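
A quick taste of a few of those in action:

-- First & last day of the current month:
SELECT DATEFROMPARTS(YEAR(GETDATE()), MONTH(GETDATE()), 1) AS MonthStart
     , EOMONTH(GETDATE()) AS MonthEnd;

-- How many days since the new year?
SELECT DATEDIFF(DAY, '20170101', GETDATE()) AS DaysSinceNewYear;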

i just want to use dates
Is that so much to ask?  Courtesy of this guy’s blog, which sounds like a great place to learn iOS programming if one was interested in such things…

But you know what SQL Server is not so great at?  Reading your mind.  Oh, wait, that goes for most applications & systems.  Let me rephrase.  SQL is not the best platform for knowing how end-users will want their Dates/Times displayed in a contextually/culturally sensitive manner, and executing said preferences.

That’s what we have UX/UI layers for!

While it’s true that the underlying data store (SQL, in this case) needs to be aware of localization & globalization requirements, it shouldn’t be asked to serve-up, say, a Sales-Order-Date in 5 different flavors just because Report X wants it in typical USA fashion (mm/dd/yyyy), User B wants it in “long-form” because they’re reading it like prose (“Jan 13 2016 08:32pm”) in an email, and SSIS Package FooBar needs it in “ISO” format (yyyymmdd) because it’s using the date in a filename!  Actually, of those 3 examples, the last is the most “legit” — or at least, the most justifiable use-case.  The other two should have been handled by the overlaying application or middleware — SSRS in the first case, or whatever automation app produced User B’s email in the second.
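
(For the record, that last use-case really is a one-liner; the style codes come straight from that MSDN page I keep visiting: 112 is ISO yyyymmdd, 101 is USA mm/dd/yyyy, and 100 is the “long-form” mon dd yyyy hh:miAM.)

SELECT CONVERT(varchar(8),  GETDATE(), 112) AS IsoForFilenames  -- 20170113
     , CONVERT(varchar(10), GETDATE(), 101) AS UsaStyle         -- 01/13/2017
     , CONVERT(varchar(20), GETDATE(), 100) AS LongForm;        -- Jan 13 2017  8:32PM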

i can has string to store dates
Because once wasn’t enough…

But surely there’s a good reason that the T-SQL gods included the CAST/CONVERT functionality with all those special date-format arguments, right?  Obviously.  There are always valid use-cases; or, more proverbially, “There’s a tool for every job.”  Just stop abusing the tool!

broken-hammer-pulling-nail
Abused tools can fail you…

A typical DBA or DB-Dev is often asked to write ad-hoc queries or build one-off reports to meet some business request, and he/she often doesn’t have the time or the resources to offload the nitty-gritty formatting details to the appropriate layer of abstraction.  So yes, that’s why these conversion options exist (among other reasons).  And there’s nothing wrong with that, in and of itself.  But like anything, if it becomes a bad habit and a hindrance to overall productivity, it’s time to take a step back and re-examine the situation.  Ask the hard questions, like “Why am I being asked to create these one-off reports all the time, which sound so similar to each other, yet inevitably are always a bit different?”, or “Have my business users developed unrealistic expectations about what can/should be done by me vs. by other teams/contributors in the organization?”

This isn’t about passing the buck — I’ve already established that’s not my style.  It’s about working smarter, bringing more value to the organization by leveraging better technologies and techniques than obsolete habits and old-guard mentality would otherwise allow.

calvin-and-susie-arguing
“Why are you making me write these horrible queries!?” .. “Because you’re the DBA!” .. “Fine, but give me the resources to automate this for the future.”

So, dear reader, take the time to learn about SQL’s Date/Time types & functions, including the myriad formatting options of CONVERT.  But do yourself a favor and consider, when you find yourself using & abusing them, whether the task at-hand is truly best suited for the database layer, or if it really belongs somewhere else.

Thanks for reading!

Views & Mismatched Datatypes

If you’ve ever wondered “who would win in a fight” amongst SQL datatypes…

Allow me a short revisit to the previous topic, because this example came from “real life” in my actual job in a production environment, and I wanted to share.  And remember, take this with a grain of salt — there are very few absolutes, a DBA’s favorite answer is “it depends”, and even when we spout rules or decrees, there are usually some exceptions.

Now, let’s start at the top.  T-SQL is a strongly typed language.  SQL Server, like most RDBMSs, counts datatypes as a foundational element in its ecosystem.  Things like bit, int, datetime, varchar (aka string), etc.  We also have this concept of NULL.  Any element of any datatype can be NULL, which, coincidentally, means a bit field can actually be 3-valued instead of just binary; BUT, that’s not the topic of this post.  However, it does involve a bit field.  And, as the title says, a VIEW.

First, the larger problem, or “what these little oversights / mis-steps can lead to”, as blogged by someone much smarter than me: Jonathan Kehayias – Implicit Conversions.

Now the diagram:

category, categorygroup, view-category-by-group
In English, the view’s DisplayInMenu field is defined as “If Category’s IsRoot bit is false, always use False; otherwise, use the Group’s DisplayInMenu bit.”  In even plainer English, it means we only want to “Display this Category in the Menu if it’s a ‘Root’ Category and it’s a member of a Group that gets displayed in the Menu.”

Do you see the problem?  It’s not so obvious if you’re not thinking about datatypes, but look closely.  Yes, our view’s DisplayInMenu field is a different datatype than its base field (origin).  That’s because the result of the CASE expression that defines it is “promoted” to the int type, instead of remaining a bit.  The same would be true of a COALESCE or ISNULL expression.  This is an example of datatype precedence.  If you’ve ever wondered “who would win in a fight?” amongst the SQL datatypes, Microsoft has the answer right there in black & white.
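
To make it concrete, here’s a boiled-down repro.  (Table & column names are approximated from the diagram, not our real schema; the datatype behavior is the real deal.)

-- The CASE mixes an int literal (0) with a bit column, so datatype
-- precedence promotes the whole expression to int.
CREATE VIEW dbo.vwCategoryByGroup
AS
SELECT c.CategoryID
     , CASE WHEN c.IsRoot = 0
            THEN 0
            ELSE g.DisplayInMenu
       END AS DisplayInMenu  -- surprise: this is an int, not a bit!
FROM dbo.Category c
JOIN dbo.CategoryGroup g ON g.CategoryGroupID = c.CategoryGroupID;

The schema-side fix is simply to wrap the expression: CAST(CASE … END AS bit).  (But read on before you rush off to deploy that.)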

This isn’t necessarily a problem, all by its lonesome.  So why did it bite us in the proverbial behind?  Two reasons.  First, query & usage patterns: vwCategoryByGroup.DisplayInMenu happens to be an important field in our queries, and usually it’s compared against a bit parameter or variable that’s passed down from the app’s lower tier.  Sometimes it’s even JOINed to another column, say, GroupMenuProperty.DisplayInMenu — which is, of course, a bit.  But because it’s an int in the view, SQL is doing extra work every time to implicitly convert those bits to ints so that each side of the comparison operation is of the same type.  And again, not always, but sometimes, this causes performance problems.

ms-sql-stop-making-me-work-so-hard
I’m tired! Can’t you see I’ve been implicitly converting these datatypes all day?

The second reason is, admittedly, a bit beyond my understanding, so I’ll just explain the symptoms and leave the technical details to the more inquisitive minds.  Basically, during a performance crisis, one of the measures we took was to “fix” the view by turning DisplayInMenu back into a bit.  However, we found that it literally broke the dependent .NET application sitting on top, which started throwing exceptions of the “invalid cast” flavor.  I believe it’s using Entity Framework.  Thanks to a helpful tip from the Brent Ozar Office Hours webcast, I found out that it’s because the EF entity mapped to this view had that field (property?) defined as an int, and in order to make it “see” the datatype change, the code itself would need to be changed, i.e. that property would need to be defined as a bit.  Yay learning!

the-more-you-know
Also, did you know that EF really isn’t all that bad? 😉

So, dear reader, be conscious of your decisions with datatypes, especially when it comes to views with superficial computed columns.  But more to the point, beware of implicit conversions and mis-matched datatypes.  They can be, at best, a source of technical debt; at worst, a silent killer.

Til next time!

Views – the Good, the Bad, & the Lazy

If your VIEW has a dependency tree more than 2 levels deep, you’re doing it wrong.

Let’s talk for a minute about views.  No, not like scenery.  Not like my family (they’re good peeps).  I mean SQL views – i.e. CREATE VIEW vwFooBar which combines commonly used fields (columns) from tables Foo and Bar into a single entity, which datavelopers (term borrowed from Richie Rump, who has an awesome name btw) can then use & abuse without having to know the details of and relationships between those two tables.

So far, this sounds like a great idea, right?  Encapsulation & reusability are great principles in IT/Dev land.  And I agree!  Views have their place, for sure.  But for some reason, I find myself constantly deconstructing them – refactoring complex queries & stored-procedures that use them — into their base tables (in the above example, that’s Foo and Bar).  And I find myself asking the proverbial WHY?  Why is this such a common misstep of devs & query writers, that the DBA has to spend such time “fixing” the issues this can cause?  Naturally, that train of thought spawned a blog post.

Let’s dive in!

Organization, Customer, Order: vwCustomerOrder
A very simple example of a 3-table view.

So, somebody created a nice high-level view for us that summarizes Customer, Organization, and Order data into a single entity that should prove pretty handy for things like reports, ad-hoc analysis, and maybe even a display-grid on a page somewhere.  And the happy developers don’t need to care about the foreign keys between the tables or how to JOIN them properly.

But!  Do we see a problem yet?  Yes, our vwCustomerOrder doesn’t include our Customer’s Email.  Well what if we need that in our code?  Do we go through change-management procedures and get the column added to the view?  Do we JOIN the view to the Customer table again just to get the Email?  Neither of those options is ideal.  The latter means that now we’re referencing the Customer table twice; the former involves refactoring, and is even more difficult if the view is an indexed view.

Okay, well let’s say we’ve added Email to the view, and it’s working fine.  Now let’s add another layer of complexity.  Say Customer is actually a VIEW, which consists of base-tables CustomerHeader and CustomerContact, where the latter stores a collection of contact entries for each Customer.  Now, vwCustomerOrder is thus a 2-level nested view.  These aren’t really super fun, but they’re not the most offensive thing in the database.  But again, what happens when we need something from CustomerContact that’s not already in our “master” top-level view vwCustomerOrder?  Maybe the primary Phone1, for example.  So to build that query, we now have to combine our 2-level nested view with a redundant table (i.e. a table that’s already in the view)!  But because it’s so terribly important to some reporting modules or some code bits, it becomes a permanent fixture of the schema.  And on it goes.
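
In code, the pattern looks something like this.  (All names invented to match the diagram, and I’ve trimmed Organization out for brevity.)

CREATE VIEW dbo.vwCustomer
AS
SELECT ch.CustomerID, ch.CustomerName, cc.Email  -- but no phone numbers!
FROM dbo.CustomerHeader ch
JOIN dbo.CustomerContact cc ON cc.CustomerID = ch.CustomerID;
GO
-- 2-level nesting: a view built on a view...
CREATE VIEW dbo.vwCustomerOrder
AS
SELECT o.OrderID, o.OrderDate, c.CustomerID, c.CustomerName, c.Email
FROM dbo.[Order] o
JOIN dbo.vwCustomer c ON c.CustomerID = o.CustomerID;
GO
-- ...and the moment somebody needs Phone1, we're JOINing CustomerContact
-- AGAIN, even though it's already buried inside vwCustomer:
SELECT vo.OrderID, vo.CustomerName, cc.Phone1
FROM dbo.vwCustomerOrder vo
JOIN dbo.CustomerContact cc ON cc.CustomerID = vo.CustomerID;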

Expanded example from earlier:

organization-customer-contact-order-views
Because vwCustomer doesn’t include Phone2, if we built vwCustomerOrder on top of it, we’d have to JOIN again to CustomerContact, when we know it’s already included in vwCustomer. Thus, it’s smarter to build vwCustomerOrder from the base-tables and avoid that double-join (duplicate reference).

This is the problem with VIEWs that are grown organically, re-actively, without careful attention to the underlying schema and dependencies.  And look, I get it.  Mature software systems DO evolve organically; iteration is the name of the game, and that goes for the database too, not just the app code.  But let’s agree that, like the boyscout principle espoused earlier in my ode to Clean Code, we can try to leave things a little better than how we found them.  DB refactoring is … not necessarily harder, but definitely different (than app code refactoring).  You probably have to jump thru more hoops to get the changes pushed to production.  Regardless, it’s worthwhile.  Why?  Glad you asked!

Nested views are icky for a couple reasons.  First, they’re more difficult to troubleshoot – not because they annoy the snot out of your DBA, but because they legitimately make index tuning more tedious.  Second, laziness begets more laziness – when you’ve got these views, you’re automatically tempted to simply use, re-use, & abuse them, instead of taking the time to analyze their effectiveness and refactor, replace, or reevaluate the problem.  Rampant ill-constructed views can lead to some serious technical debt.  And…

technical-debt-is-bad-mmkay

There’s another nugget of wisdom in software development that’s appropriate here: Less is More.  Your views should address a specific need or use-case, do it well, and call it a day.  Feature-creep is not a goal; views that try to be many things to many people ultimately fail at it, because you end up committing sins against set-theory & the relational model in favor of “oh just one more thing!”  While the purpose of views is ultimately abstraction, we should not forget the underlying parts — tables, indexes, statistics, schema relationships.  In fact, a good view can help us construct the right indexes and visualize the relationships.  Whereas a BAD view will obfuscate indexing needs and confuse (or even outright lie about) the base relationships.

Venn-diagram time.  We’ll use the circle areas to represent that “scope” of each view, meaning the base-tables that comprise it.  In a healthy schema, there will be mostly independent views with a few overlaps in small portions.  These views “do one thing well”, with little redundancy and hardly any nesting.  Conversely, in a haphazard ill-managed schema, there is lots of overlap and redundancy, multi-level nesting, and ugly colors (who the heck likes brown, anyway?).

views-family-good
Views in happy harmony & balanced dependence vs. independence.
views-family-bad
Views in sad land with ugly brown and black overlapping areas. (I was going to try for puce or burnt-umber but Paint is sadly not up to Crayola 64-pack standards.)

So, dear reader, what’s the point?  (I feel like I rhetorically ask that fairly often.)  Build your SQL views with targeted use-cases and clear purpose.  Avoid the laziness trap and the temptation to tack-on “one more thing”.  A view should be a concrete, concise, abstracted representation of the underlying tables & relationships.  Evaluate existing views for how well they meet the developer needs, and don’t be afraid to deprecate, drop, or rewrite bad code.  And above all, stop the insane nesting.

If your view has a dependency tree more than 2 levels deep, you’re doing it wrong.

Or, more visually…

ridiculous-swiss-army-knife
This is not a useful tool, despite the sheer volume of functionality.

That’s all for this week!  Thanks for reading, and apologies for the slightly longer delay between posts.

SQL Server for the Developer

And now for a brief interlude while I work on my next “real” post… continuing with the same theme… SQL Server for the developer/programmer!  Warning: potentially excessive snark ahead.

caution-sarcasm-ahead

SQL Server is a mystical fairy-tale land where you can store all your application data, and never have to worry about it again!  Sure, it “should” be relational, but you also “should” drive the speed limit.. PSHH!  Let’s throw blobs in there too — HTML, XML, hell, even JSON now that 2016’s got nice happy JSON-centric system functions.  And files?  Heck yes, what else is FILESTREAM for?  Sure, performance may be abysmal, but that’s somebody else’s problem, so why should we care?

It’s the only database platform your company invested in, so it must be good for everything.  Document storage?  Sure, not like there’s anything else better suited for the job, especially for ‘Free’!  Ooh, how about key-value data, and volatile temporal data like .NET Session-State?  Obviously!  Nothing else exists, and Microsoft makes it so simple to set it up in SQL.

sarcasm-sign-everybody-needs-one
Here’s your sign…

It’s that wonderful database system where we send all our Entity Framework queries and data just magically appears!  We don’t need to worry about how nasty and convoluted those queries turn out when they finally get parsed down to actual TSQL code.  It was only 3 nice simple lines of LINQ, why should it be any more complex than that?  Oh, and we can write all sorts of abhorrent ad-hoc queries against it too, nobody will even notice.  Let that code build all sorts of IN SELECTs & sub-queries, chains of unreadable JOINs, and Inception-level nested VIEWs, it’s perfectly fine!  Performance, shmerformance!  That’s what we have DBAs for.  Not that they’ll be able to help you when you refuse to refactor or rewrite your code that’s responsible for these abominable queries.

what has been seen cannot be unseen
Your terrible queries are scaring the dog…

SQL Server is so easy!  Microsoft gives me a free Developer Edition, I can code for all these awesome features that I know we’ll have in production, right?  Oh, we only run Standard Edition?  Woops.  Ooh, I know!  Let’s put everything in Azure.  It can do everything and make us breakfast!  Then we won’t need to worry about feature differences, backups, RTO/RPOs, index maintenance, performance tuning, scaling, or even performance in general!  Wait, no?  Oh well, that’s what we have DBAs for!  Besides, it’s so easy to move everything over, we just take our SQL backups and upload them to blob storage and then restore them on Azure SQL Database, right?  No?

this is why we can't have nice things
Exactly.

Hmm.  Maybe we should hire a DBA.  Or four.

That’s all, see ya next time!  XD

SQL Server for the IT Helpdesk, part 2

Part 2, as promised.  We were talking about the components of SQL Server.  So far we hit “the big ones”: the database engine sqlservr.exe, the data files (mdf, ldf), and the agent sqlagent.exe.

Thirdly, less of a “component” and more of a “tool-set”, the venerated SSMS – SQL Server Management Studio. (Formerly known as SQL Server Enterprise Manager, if you’ve been around for a long time.) SSMS is the “stock” application interface for SQL Server, the DBA’s basic swiss army knife – all of the essential functionality, a couple odd-ball modules here & there, and no “fluff”. It’s not the prettiest, nor the sexiest, but it gets the job done. It even has some basic monitoring capability; as an infrastructure person, you would most likely use it to check up on DB statuses, Agent Job runs, and perhaps the occasional “everything is slow!” panic moment by peeking at the Activity Monitor (right-click on a server instance).

swiss-army-knife
Just the essentials!

SSMS is powerful, and with great power comes great responsibility. Your DBA should be practicing good security principles and granting limited access to all but the most trusted, experienced team members, because that top tier access, the sysadmin role (home of the infamous sa login), is literally capable of destroying the server (maybe not physically, but logically!). SSMS is a client-side application, primarily – meaning that yes, while you do “get it for free” when you install SQL Server on your server (at least, until SQL Server 2016, when they finally broke it out to a standalone installer, praise the gods), you should really install it on your own workstation and use it from there.

Key takeaway: ask your DBA to install SSMS on your workstation, IF it’s appropriate (and trust me, they will tell you if it’s not); often, there are better monitoring tools and infrastructure-focused apps that can “talk to” the SQL servers without needing you to dive down deep into SSMS-land. But, in a pinch, it helps to know a few basic things about what it can do.

with-great-power-comes-great-responsibility
The original; don’t you forget it!

Finally, and less commonly, we have a collection of supplemental services that work within and around those two main components to provide additional functionality and support for things like bulk data interchange, reporting, data warehousing, and scale-out options. Again, the curious reader can easily find more detailed info at his/her favorite reference sites, so I’ll define these briefly and move on. They all start with “SS”, which of course stands for SQL Server. So, here we go.

  SSIS – Integration Services – bulk data import/export, ETL (extract/transform/load), and similar tasks.
  SSRS – Reporting Services – lets us build & manage reports, and provides a basic no-frills web interface from which any business user can view said reports to which they’re granted access.
  SSAS – Analysis Services – data warehousing, cubes, “analytics” – basically, we’re filling up disk-space with pre-defined roll-ups and drill-downs of our data for faster reporting queries & data analysis.

So as you can imagine, often those last two go hand-in-hand, in a mature environment that’s cognizant of its “business intelligence” (BI) needs. But that’s another topic way outside the scope of this post!

Key takeaway: if any of those SS*S components are a part of your environment, monitor them at the service/process level, and, just like with the Agent, have a procedure in place to address incidents.

So what have we learned? A SQL server machine runs one (or more, but hopefully/usually just one!) instances of SQL Server. We should primarily be aware of the engine & agent service processes, the locations of the mdf & ldf files, and the agent job schedules; and secondarily, if applicable, the other components (SS-whatever-S).
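
Bonus: if you have SSMS access, you can eyeball the service side of that yourself with one query.  (This DMV exists on SQL 2008 R2 SP1 and later.)

SELECT servicename        -- e.g. SQL Server (MSSQLSERVER), SQL Server Agent (MSSQLSERVER)
     , status_desc        -- hopefully Running!
     , startup_type_desc  -- hopefully Automatic!
     , service_account
FROM sys.dm_server_services;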

Oh, but wait! What about performance concerns?!? Glad you asked.

SQL Server EATS disk for breakfast, memory for lunch, and CPU for dinner. In that order.

om-nom-nom-nom-wait
I may have made a slight miscalculation..

So your dedicated SQL server machines better have FAST disks, LOTS of RAM, and a decently fast, multi-socket multi-core CPU. The disk speed helps keep transaction log throughput & data file read/write activity up-to-snuff, so faster == better. Memory is for holding the “hottest” data for fastest access so it doesn’t always have to go to disk for everything, so more == better. And CPU cores help it serve 100’s (or even 1000’s) of concurrent requests in a sane, efficient manner.

zoidberg-wants-moar
Because moar… and Zoidberg.

In reality, SQL Server performance is a huge topic; there are entire consulting companies dedicated to it, and tons of excellent blogs about it. You want a good DBA on your team who knows how to troubleshoot and address performance issues with your SQL servers. But when you’re tasked with helping spec-out a new physical box, or sizing a new VM, it doesn’t hurt to know what kind of demands you’ll need to meet.

Which brings us to our last key point…

Key takeaway: SQL Server is primarily a scale-UP technology, not scale-out. If you’re not OK with that, you might want to look into other DBMS’s.

scale-out-and-scale-up-illustrated
I have to give full credit to this guy because his slide really illustrates my point well.

That’s not to say you can’t scale-out with it; obviously, large corporations with mature ecosystems often have 100’s of instances in some resource-groups, but as I discussed earlier in What’s in a Name?, the applications which depend on those SQL servers are also architected with that scaling in mind, and it’s not like an “easy switch” you can make to an existing monolithic app.

Anyway, I hope this helps IT Helpdesk and SysAdmins/Engineers get a little more visibility into the SQL DBA’s world, and bit more understanding of how things work behind-the-scenes.

Thanks for reading, see you next time!

SQL Server for the IT Helpdesk

Know which servers are running SQL Server, monitor the services, and have an escalation procedure for failures.

Today’s post will be much less philosophical and (hopefully) much more practical. This is a topic that I’ll actually be presenting at work to my fellow IT team members, who are mostly all Helpdesk and Infrastructure people, with no Dev or database background. My goal is to explain the basics of SQL Server from an IT infrastructure perspective, and I hope that others find it useful as well as my team! So let’s get to it. Fair warning: lots of abbreviations ahead! But fear not, I will define them – I don’t want to assume any prior knowledge, other than basic IT skills & Windows familiarity.

admin-disabled-contact-admin
This might be a bad sign…

I also decided to break this up into 2 posts, because it got pretty lengthy.  So, here goes part 1!

SQL Server is, obviously, Microsoft’s relational database management system, or RDBMS. “Relational” means it’s based on the concept of relationships between entities (or objects, if you will; though mostly we mean tables). In layman’s terms, the relations can be expressed as “is a..”, “has a..”, or “has many..”; for example, a Customer object “has a” Contact-Info object, which probably “has many” Phone#s, Emails, etc. SQL Server is a transactional, fully ACID compliant database system – Atomicity, Consistency, Isolation, Durability. I won’t bore you with the computer science theory stuff; you can read all about it on Wikipedia or your favorite reference site, so let’s move on.

rdbms-family
One of the family!

There are a handful of components that make up SQL Server, and I’ll list them in what I think is the “most to least important” order, or at least most to least commonly used. Infrastructure-wise, at its core, SQL Server is basically an “application” and persistent data storage mechanism. It’s a set of processes (services) that run on your server and operate on specific sets/types of files, which we’ll cover in a bit.

First and foremost of said core components is the database engine. This is the DBA’s bread & butter – it’s the service process, sqlservr.exe, which handles all the queries and scripts, manipulates the data files & transaction logs, ensures referential integrity and all that other ACID-y goodness. Like any good service, it is best run under a dedicated service account, especially in a Windows domain environment. Otherwise, it will run as a built-in system account, like NT Service\MSSQLSERVER. As a service, it’s set to “Automatic” run-mode and it accepts a number of startup parameters which can help govern and tweak its behavior. Most of those details are outside the scope of this post, so go ask your friendly (or surly, depending on the day) DBA if you’re curious.

msdn-sql-core-components
It’s an older infographic, sir, but it checks out.

A bit of key terminology: each running database engine is what we call a SQL Server instance. Typically you dedicate one physical (or virtual) server to each instance; however, in rare cases (usually in the lower tier environments like dev/test), you will see multiple instances on one server – this is called instance stacking, and it’s not usually recommended by DBAs due to resource governance complications. You typically reference a SQL Server instance simply by the server name that it’s running on, because the “default instance”, while technically named MSSQLSERVER, does not require any special referencing. So if my SQL server box was named FooBar, I would connect to the SQL instance with the name FooBar. If you get into stacked instances, you have to start naming the other instances uniquely, and your connections now have to include them with a backslash, e.g. FooBar\SQL2, FooBar\SQL3, etc.

As I said, the engine manages the data files. We’re talking about a permanent data storage system, so that data has to “live” somewhere – namely, mdf and ldf files (and sometimes ndf too). The mdf (and ndf) files hold the “actual data”, i.e. the tables, rows, indexes, & other objects in each database. The ldf files represent the transaction logs, or “T-logs” for short, which, again, are a fundamental part of the RDBMS. They allow for and enforce those fancy ACID properties. All of these files are exclusively locked by the sqlservr.exe process, meaning, nothing else is allowed to touch or modify those files but SQL Server itself, while SQL Server is running – not your antivirus, not your backup software (SQL has its own backup methodology), not even Windows itself. The only way to interact with and manipulate those files, or rather, the data within those files, is through the engine.

cant-touch-this-permission-denied
You can look, but you can’t touch!

Key takeaway: know which servers are running SQL Server, monitor the sqlservr.exe service(s), and work with your DBA to ensure that your AV software excludes the locations of those data files, and that your backup strategy includes & works in conjunction with SQL-based backups.
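
Handy trick: this query lists the full path of every data & log file on the instance, which makes building that AV exclusion list a copy-paste job instead of a scavenger hunt:

SELECT DB_NAME(database_id) AS DatabaseName
     , type_desc      -- ROWS = mdf/ndf, LOG = ldf
     , physical_name  -- full path on disk
FROM sys.master_files
ORDER BY DatabaseName, type_desc;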

In a close second, we have the SQL Server Agent, sometimes just “agent” for short. Think of this like Windows Task Scheduler on steroids, but specifically focused within the realm of SQL Server. Like the engine, it runs as a service process – sqlagent.exe – again, typically set to automatic startup and running under a dedicated service account. (There’s a small debate on whether it’s best to use the same service account as the engine, or a separate one; in the end it doesn’t really matter much, so it’s mostly at the discretion of your DBA team and/or the Active Directory admin.)

sql-agent-man
Shamelessly borrowed from the Database Whisperer

SQL Server Agent is basically free automation – we can use it to schedule all kinds of tedious operations and database maintenance tasks, anything from “run this query every day at 12pm because Mary in accounting runs her big report after lunch and it needs to have data from X, Y, & Z”, to running the hallowed “good care & feeding of indexes & stats” that any DBA will tell you is not only essential to decent database performance, but essential to life itself (or at least, your SQL server’s life). These sets of work are called Agent Jobs (or just “jobs”), which consist of one or more Job Steps. A job has one or more Schedules, which dictate when it runs – such as, again, daily 12pm, or “every Saturday & Sunday at 5pm”, or “every 15 minutes between 8am and 9pm”. Ideally, jobs should be configured to alert the DBA (or, in some cases, the broader IT team) when they fail, so that appropriate action can be taken.
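
Curious what jobs live on a given instance?  They’re all catalogued in the msdb system database; here’s a simple starting point:

SELECT name          -- the job's display name
     , enabled       -- 1 = active, 0 = disabled
     , date_created
FROM msdb.dbo.sysjobs
ORDER BY name;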

Key takeaway: monitor the service sqlagent.exe and have an alert & escalation procedure in place for handling failed Agent Jobs.

database-automation
Automation is good, mmkay?

A quick sidebar about transaction logs.  SQL Server uses what’s called a “write-ahead log” methodology, which basically means that operations are literally written to the T-logs before they’re written to the actual data files (MDFs/NDFs).  As stated so eloquently in this blog:

[When] neglected, it can easily become a bottleneck to our SQL Server environment.

It’s the DBA’s job to ensure the T-logs are being properly backed up, maintained, and experiencing good throughput.  And I’ll have another blurb about performance at the end of part 2.  However, as an infrastructure person, it’s worth knowing two things: A) these aren’t your “traditional” logs, like error or event logs; these are a fundamental, core part of SQL Server.  And B) —

Key takeaway: The disks which house the transaction logs should be FAST.  Like, stupid-fast.  Srsly.  At least at sequential writes.

samsung-960pro-m2-ssd
Yes please, I’ll take several… just charge it to the CIO’s platinum card.

Edge Cases

“Oh, that’ll never happen!”

I have a confession. I enjoy edge cases. More than I probably should. I take an unhealthy amount of pleasure in finding those unexpected oddities and poking holes in the assumptions and “soft rules” that business users and even developers make about their processes, applications, and most importantly, their data.

But let’s back up a minute. What exactly is an edge case? Well, I’ll quote Wikipedia’s definition at you, then I’ll put my own spin on it.

An edge case is a problem or situation that occurs only at an extreme (maximum or minimum) operating parameter.

edge-cases-99-to-1
it’s in a pie chart, so it must be true…

In my mind, there are really two types of edge cases. The results and treatment are largely similar, so it’s not a terribly important distinction; but for conversation’s sake, here they are.

  1. Known: while we can (and usually do) identify these cases, they are typically thought of as ultra-rare and/or “too difficult to reproduce”; thus, the plan for handling them involves tedious one-off or ad-hoc procedures, which are often completely undocumented.
  2. Unknown: the saying goes, “You don’t know what you don’t know” – these cases don’t even cross our minds as a possibility, either due to ignorance or because they are [supposedly] contrary to our business rules or our [assumed/remembered] application logic.

Again, the end result is the same: panic, discord, technical debt, and wasted hours of remediation. So why do we do this to ourselves? Well, one common justification tends to be “Oh, that’ll never happen!”, and we sweep it under the rug. Then there’s the laziness, busy-ness / lack of time, pressure to deliver, gap in abilities or tool-sets, passing the buck, etc. We’re all guilty of at least two of these in any given week.

basic-technical-debt-legos

So let’s move on to the important part: What can we do about it? Largely, we can simply take the excuses & reasons and simply turn them on their heads. Take ownership, commit to learning, communicate with management, make time for planning and documentation, and once a path is laid out for remediation, actually do the work. It’s often not easy or pretty, but a little pain now beats a lot of pain later.

tech-debt-cute-line-graph

I know, easier said than done, right? :o)

Example time.

Let’s say our sales offices are closed on Sunday – this is our “operating assumption”. Therefore, no orders will be processed on Sundays – this is our “business rule”. So because of this, we’ve decided to do some ETL processing and produce a revenue report for the previous week. Now, we’re using some antiquated tooling, so our first batch of ETL, which takes the orders from the sales system and loads them into the bookkeeping system, runs from, say, 2am to about 5am. Then we have a second batch, which moves that bookkeeping data into the report-staging area, from 6am to about 7am. We need those hours of “buffer zones” because the ETL times are unpredictable. And finally, our reporting engine churns & burns at 8am. But along comes Overachieving Oliver, on a Sunday at 5:30am, and he’s processed a couple orders from the other day (so perhaps he’s really Underachieving Oliver, but he’s trying to make up for it, because he enjoys alliteration almost as much as I do).

lumberg-work-on-sunday

Woah nelly! What happened? Oliver’s sales didn’t make it into the report! Not only that, but they don’t even exist according to bookkeeping. But come Monday, if he tries to re-process those orders, he’s probably going to get an error because they’re already in the sales system. So now he’s gotta get IT involved – probably an analyst, a developer, and maybe even a DBA. Damn, that’s a lot of resources expended because an assumption and a rule were broken!

warning-assumptions-ahead

Here’s another one. An order consists of 1 or more order-lines, which each contain 1 or more of a given item (product). Let’s say we store these in our database in a table called OrderLines, and each line has a LineNumber. Now, we have an assumption that those LineNumbers are always sequential. It’s even a rule in our applications – maybe not all parts or all modules, but at least some, and enough to cause a fuss if there’s a gap in that sequence or if a line is moved around somehow without proper data dependency updates (which, by the way, are application-controlled, not database-controlled). Plus there’s some manager who depends on this old reporting metric that also breaks when those line numbers are out-of-whack. But this should never happen, right?

assumptions-blind-spots-blanchard

The operative word there being “should”. But apparently there was a bug in an “update order” routine that left a gap in the sequence. Or maybe the DBA was asked to delete something from an order post-mortem, and there’s no way within the app’s ecosystem to do it, so he had to write some queries to make it work. And guess what? Because the Dev team is super busy working on the hot new feature that everybody wants, it will be 2 weeks before they can circle back around to address that update-bug or add that utility function for line-deletion. So now the DBA’s writing a stored-proc to wrap in a scheduled job to “fix” those order-line sequences every night, to prevent that one app module from breaking and keep those management reports accurate. And, to quote my very first post ever, the DBA waxe[s] wroth.
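
(That nightly “fix”, by the way, is a classic set-based renumbering.  A rough sketch, with table & column names invented for illustration:)

;WITH Renumbered AS
(
    SELECT LineNumber
         , ROW_NUMBER() OVER (PARTITION BY OrderID
                              ORDER BY LineNumber) AS NewLineNumber
    FROM dbo.OrderLines
)
UPDATE Renumbered
SET LineNumber = NewLineNumber      -- close the gaps...
WHERE LineNumber <> NewLineNumber;  -- ...touching only the rows that changed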

picard-why-we-cant-have-nice-things
Picard knows best..

So, prevention? Well, I’d probably start by locking down the order entry system during that off-limits window. We should also wire-up those big 3 processes so that there’s no need for indeterminate buffer-zones and inconsistent timing. And yeah, it could be a big change needing lots of team buy-in and approvals, but it’s worth the investment. Right? Right!

Hope you enjoyed reading! Until next time…

Drafted with StackEdit, finished with WordPress

What’s in a Name?

a thing by any other name… is a different thing.

Let’s just start with a universal truth:

Names matter.

What do I mean by that? In technology, specifically. Well, let me explain.

The name you choose for a thing – whether it’s a server, a database, an application, a service, a class, a file, a method, a function, a procedure, ad nauseum – that name is immediately and irrevocably associated with that thing. Even if you miraculously get to change that name somewhere down the road, it will still be (at least for a while) “formerly known as” the original name. Legacy teams will continue to use that old name until they’ve convinced everybody the new name is worth the trouble of switching all dependency references & documentation. Managers will continue to use the old name even longer because they’re not close enough to the tech to be aware of, let alone understand, the full implications of said name change.

you-keep-using-that-name
Inigo Montoya prefers easy and descriptive names, like “Six-Fingered-Man”.

So that’s why names matter. They’re a unique and semi-permanent identifier of something; they encapsulate the essence of that thing in a single word or phrase. And they become embedded in the business jargon and technology team lore. Such lofty expectations of such a simple (hopefully), short (usually), and often under-appreciated element of our field.

It’s unbelievably easy to find bad examples of IT naming schemes, too. Just look around.

I don’t know about you, but I like my names to be pronounceable. I don’t want to need dozens of syllables and two extra breaths just to say a couple server names in conversation. It’s not that hard to create meaningful multi-part names that are still phonetic. We speak these things just as much as we type them, let’s make them speak-able!

Thus, once again,

Names matter. REALLY.

And so, dear reader, choose them wisely, with care and consideration. With lots and lots of context. Maybe even some conventions, or at least conventional wisdom. And consistency. Plan for the future, while being aware of the past. Don’t let legacy remnants stand in the way, but don’t let delusions of grandeur force you into a naming scheme that breeds confusion or resentment.

dilbert-namingconventions

Allow me to illustrate. We have a handful of SQL servers as part of our core IT infrastructure. Let’s call them FooDB, BarDB, and CharDB. Now, they each serve a fairly distinct and semi-isolated set of apps/websites. In other words, they each fill a role. (Of course, there are cross-dependencies and integrations, but nothing extreme.) Now, we want to migrate them to new hardware and plan for scale, assuming the business keeps growing. So what do we name the new servers? We know that SQL Server doesn’t really scale horizontally, at least not very easily & not without major application rewrites. But we don’t want to paint ourselves into a corner by assuming that we’ll “never” scale out. Ok, how about this: DC-FooDB-P1, DC-BarDB-P1, etc. The P stands for production, because, along with the fancy new hardware, we’ve got some additional infrastructure room to build a proper Dev & QA environment.

dev_test_prod_white
Seriously. You need tiered environments. That’s a whole ‘nother blog post.

So what have we planned for? Tiered environments, as well as a bit of scale-out room – 9 servers in each “role”, so to speak. What if we grew to the point where we needed 10 servers? Well, as a DBA, I’d argue that we should consider some scale-up at that point; but also, to seriously start looking at a broad infrastructure revamp, because that kind of growth should indicate some serious cash influx and a much higher demand on IT-Dev resources.

will-this-scale
Don’t ask me what happened to rules #1-5. I don’t know, I didn’t read the article.

Why should the breaking point be 10 and not 100? or 1000? Well look, it definitely depends on the technology we’re dealing with. SQL Server is not Mongo, or Hadoop, or IIS, or Apache. Certain tech was made to scale-out; other tech is more suited for scale-up. And there’s nothing better or worse about either approach – both have their place, and both are important. Scale-out is arguably more important, but for traditional IT shops, it’s also more difficult.

This is where that big C-word comes in. Context. If we’re talking about a tech that scales-out, sure, leave room for 100 or 1000 of those cookie-cutter instances that you can spin-up and shuffle around like cattle. But if it’s a scale-up tech, don’t kid yourself by thinking you’re going to get to the triple-digits.

star-wars-episode-vi-han-solo
Delusions of grandeur… we all have them!

The point is, if you’re starting at 1, it’s not too difficult to scale-out to a few more “nodes” with simple tricks and tweaks; i.e. without drastically changing your IT-Dev practices and environment. But when you start talking multi-digits, of redundant nodes fulfilling the same role, you need to seriously PLAN for that kind of redundancy and scale. Your apps have to be aware of the node topology and be concurrency-safe, meaning “no assumptions allowed” about where they’re running or how many other instances are running simultaneously. And since we’re talking about database servers, that means replication, multi-node identity management, maybe sharding, data warehousing, availability groups, and other enterprise-level features. We’re talkin’ big $$$. Again, it stands to reason that if the business is seeing the kind of growth that leads in this direction, it can afford to invest in the IT-Dev improvements required to support it.

show-me-the-money

I’d love to know your thoughts! Leave me a comment or two. Until next time!

Header image: Keira loved her little pet penguin… and then later she destroyed it and literally ate its face off.

Drafted with StackEdit, finished with WordPress.

Is Coding Style still a thing?

Make it readable, minimize scrolling.. provide context, clarify complexity.

I can hear it now.. the moans and groans, the wailing and gnashing of teeth.. “Another post about coding style / formatting? REALLY!? Why can’t we all just get along (er, read/write/paste code in our own environment) and let our IDE/toolset worry about the format?” I hear you, really I do. But you’re missing the point. The single most important trait of good code is readability. Readability should be platform-independent, i.e. it should not vary (too greatly) depending on your viewing medium or environment.

The ratio of time spent reading (code) versus writing is well over 10 to 1 … (therefore) making it easy to read makes it easier to write.

-Robert C. “Uncle Bob” Martin, Clean Code

And yet, all around the web, I continue to find terribly formatted, nigh-unreadable code samples. Specifically, SQL scripts. That’s me; that’s my focus, so that’s what I pay the most attention to. But I have no doubt that the same applies to most other languages that Devs & IT folks find themselves scouring the tubes for on a daily basis. There is just an excessive amount of bad, inconsistent, language-inappropriate formatting out there. And honestly, it’s not that surprising — after all, there are millions of people writing said code — but it’s still a source of annoyance.

Clean code always looks like it was written by someone who cares.

-Michael Feathers, Clean Code

Don’t you care about your fellow programmers?

Maybe it stems from my early days working in a small shop, where we all had to read each other’s code all the time, and we didn’t have these fancy automated toolsets that take the human eyeballs out of so much of the workflow/SDLC. So we agreed upon a simple set of coding style/layout rules that we could all follow, mostly in-line with Visual Studio’s default settings. And even though we occasionally had little tiffs about whether it was smarter to use tabs or spaces for indentation (protip: it’s tabs!), we all agreed it was helpful for doing code reviews and indoctrinating (er, onboarding) new team members.

My plea to you, dear reader, is simple. Be conscious of what you write and what you put out there for public consumption. Be aware of the fact that not everybody will have the exact same environment and tooling that you use, and that your code sample needs to be clean & structured enough to be viewed easily in various editors of various sizes. Not only that, but be mindful of your personal quirks & preferences showing through your code.

That last part is really difficult for me. I’m a bit of an anal-retentive bastard (ahem, a stickler) for SQL formatting, particularly when it doesn’t match “my way”. So that’s not the point of this post – I don’t mean to, nor should I ever, try to impose my own personal style nuances on a wide audience. (Sure, on my direct reports, but that’s why we hire underlings, no? To force our doctrine upon them? Oh, my bad…)

All I’m saying is, try to match the most widely accepted & prevalent style guidelines for your language, especially if it’s largely enforced by the IDE/environment-of-choice or platform-of-choice for said language. And I’m not talking about syntax-highlighting, color-coded keywords, and that stuff – those are things given to us by the wonderful IDEs we use. Mostly, I mean the layout stuff – your whitespace, your line breaks, your width vs. height ratio, etc.

For example, Visual Studio.  We can all agree that this is the de-facto standard for C# coding projects, yes?  Probably for other MS-stack-centric languages as well, but this is just an example.  And in its default settings, certain style guides are set: curly braces on their own lines, cascading smart-indents, 4-space tab width, etc.  Could we not publish C# code samples to StackOverflow that have 7- or 8-space tabs, curly braces all over the place (some on the same line as the header code or the closing statement of the block; some 3 lines down and spaced 17 tabs over because “woops, I forgot a closing paren. in a line above, I fixed it but I was too lazy to go down and fix the indentation on the subsequent lines!”)?  GRRR!

Always code as if the [person] who ends up maintaining your code will be a violent psychopath who knows where you live.

-John Woods

Now, on to SQL.  Specifically, TSQL (MS SQL).  Okay, I will concede that the de-facto environment, SSMS, does NOT have what we’d call “great” default style/layout settings.  It doesn’t even force you to use what little guidelines there are, because it’s not an IDE, so it doesn’t do smart auto-formatting (or what I’ve come to call “auto-format-fixing”, in VS) for you.  And sure, there are add-ins like RedGate SQL Prompt, SSMSBoost, Poor Man’s T-SQL Formatter, etc.  But even they can’t agree on a nice happy compromised set of defaults.  So what do we do?  Maybe just apply some good old fashioned common sense.  (Also, if you’re using such an add-in, tweak it to match your standards – play with those settings, don’t be shy!)

I can’t tell you how frustrating it is to have to horizontally-scroll to read the tail-end of a long SQL CASE expression, only to encounter a list of 10+ literal values in an IN () expression in the WHERE clause, that are broken-out to 1-per-line! Or even worse, a BETWEEN expression where the AND <end-value> is on a brand new line – REALLY?? You couldn’t break-up the convoluted CASE expression, but you had to line-break in the middle of a BETWEEN? Come on.

youre-killin-me-smalls
You’re killin’ me, Smalls!

Or how about the classic JOIN debate? Do we put the right-hand table on the same line as the left-hand table? Do we also put the ON predicates in the same line? Do we sub-indent everything off the first FROM table? Use liberal tabs/spaces so that all the join-predicates line up way over on the right side of the page? Again, common sense.

Make it readable, minimize scrolling, call-out (emphasize) important elements/lines (tables, predicates, functional blocks, control-flow elements, variables). It’s not revolutionary; just stop putting so much crap on one line that in a non-word-wrapped editor, you’re forced to H-scroll for miles. (FYI, I try to turn word-wrap on in most of my editors, then at least if I have some crazy-wide code, I can still read it all.)

SWYMMWYS. Say what you mean, mean what you say. (“Swim wiz”? “Swimmies”? I don’t know, I usually like to phoneticize my acronyms, but this one may prove troublesome.)

swimmies
swimmies!

Modern languages are expressive enough that you can generally make your structure match your intended purpose and workflow. Make it obvious. And if you can’t, comment. Every language worth its salt supports comments of some kind – comment, meaning, a piece of text that is not compiled/run, but rather just displayed to the editor/viewer of the code. Comments need to provide context, convey intent, and clarify complexity.

So what’s my point? Basically, let’s all try to make our code samples a bit easier to read and digest, by using industry-standard/accepted best-practices in terms of layout & style.

Drafted with StackEdit, finished with WordPress.


PS: Just to eat my own words, here’s an example of my preference for how a somewhat complex FROM clause would look, if I were king of the auto-formatter world:

SQL-sample-dark-code
Hipster dark theme FTW!
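
(And in case that image ever rots away, here’s a rough text approximation of the same spirit.  The schema is made-up; the layout is the point.)

SELECT o.OrderID
     , c.CustomerName
     , SUM(ol.Quantity * ol.UnitPrice) AS OrderTotal
FROM dbo.[Order] o
    JOIN dbo.Customer c
        ON c.CustomerID = o.CustomerID
    JOIN dbo.OrderLines ol
        ON ol.OrderID = o.OrderID
    LEFT JOIN dbo.Organization org
        ON org.OrganizationID = c.OrganizationID
WHERE o.OrderDate >= '20160101'
    AND o.OrderDate <  '20170101'
GROUP BY o.OrderID
       , c.CustomerName;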

Send comments, thumb-ups, and hate-mail to my Contact page. 😛

Header image: our husky Keira at 2.5 months, being a sleepy-head!