Nested Set++ Wrap-Up

So we’ve built our Cat Tree. But how do we know it’s all correct?

Advertisements
nsm-cat-swap-node

One more time, with feeling!  Not that this dead horse needs another beating, but I did promise…

So we’ve built our Cat Tree.  We’ve written our CrUD ops, our “move” op, and even some readers.  But how do we know it’s all correct?  We can select from our Cats view, of course.  But we want to be really sure.  Plus there’s that pesky SwapCatNode method.

Easy one first.  SwapCatNode can mean swapping sibling order, or switching a parent with a child or grandchild, or toggling nodes that are in completely different places in the tree & not related at all!  This is the least logical operation, if you think about a proper hierarchy, but it turns out to be necessary sometimes.  We’re just swapping the nodes’ position values & ParentIDs with each other, and updating ParentIDs on their children to each others’ IDs.

I really don’t even need to draw this one… but because I needed a header image, I did.  Anyway, just get the rows with the given target IDs, swap the PLeft, PRight, Depth, and ParentID values, and call it a day.

Now the complex.  To validate that our tree is properly structured, the following statements need to be true:

  1. Each node’s Right value is greater than its Left.
  2. More to the point, each node’s Right value is greater than all of its ancestors’ Left values.
  3. Similarly, each node’s Left value is less than all of its descendants’ Left values (and Right values, obviously!)
  4. Leaf nodes have no gaps between Left & Right: Right = Left + 1
  5. Depth is easy to verify because we already wrote the rCTE to calculate it!
  6. And of course, no orphans – all ParentIDs lead to an actual parent node, except of course if they’re NULL (root nodes).

We can either go thru lots of logical checks in different queries, or we can try building a mock tree out of the base adjacency-list structure (ParentIDs) and compare values.  The latter will only help us with #1-5; the orphans problem is a different animal, but it’s also not part of the model per-se, so it’s actually good to separate that check from the rest.  (And it’s really simple – use a not exists query on ParentIDs and presto, orphans checked!)

Building a mock-tree, or a “position re-builder”, will come in handy for another reason:  Let’s say we need to completely revamp a subtree, i.e. insert & update a bunch of nodes at once because somebody royally screwed up that branch.  And we’ve got our shiny fixed data, 100’s of rows, ready to go, if only those damn triggers weren’t there, preventing us from doing bulk operations!  What we’d really like to do is, knowing the starting ParentID, just insert all our new nodes with PLeft values in sequence to each other (and not care about the rest of the tree); and/or, update a few sets or families of nodes to massively re-order them, without having to call the Swap routine one-at-a-time ad-nauseum.  We also don’t want to care about figuring out correct PRight & Depth values.  After that’s all done, our new subtree will have “bad” position values, so we need to rely on some other routine to fix them for us, so that the tree can again be well-formed and things can go back to normal.

nsm-cat-bulk-reorder-insert-rebuild
Simon & Tigger got re-ordered. Then we bulk-added 3 new Cats under Mittens and didn’t know what their position values would be, so we let the rebuilder take care of it.

In our RebuildCatTree routine, we actually need to re-number all nodes to the right & above our “bulk-inserted” subtree, just in case we’ve caused things to move.  And since we’ve re-ordered some siblings elsewhere, it turns out to be easiest, in practice, to re-number the whole tree.  This is where our fair-weather friend recursion comes in — and not just another rCTE, but real stored-procedure recursion.  This can get dicey; SQL only supports a certain # of recursion levels, and it can really eat up those CPU cycles & RAM buffers.  So this should be done rarely, and preferably during a time where the tree is not under heavy usage.

The code samples are now available on my GitHub page.  Comments abound!

I hope you’ve enjoyed this little mini-series.  And now, I promise to move on to new topics & rantings of various nature!  Thanks for reading.

~Fin~

Update:

I’d like to point future readers at two very informative articles for those interested in deep-diving down the hierarchical rabbit-hole: Aaron Bertrand, and Jeff Moden.  There are many more tweaks and enhancements that can be made to the “classical” Nested Set model, which those lucky Devs/DBAs who are in a position to actually [re]implement their hierarchies will want to read about and take advantage of.

Author: natethedba

I'm a SQL Server DBA, family man, and all-around computer geek.

2 thoughts on “Nested Set++ Wrap-Up”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s