One more time, with feeling! Not that this dead horse needs another beating, but I did promise…
So we’ve built our Cat Tree. We’ve written our CrUD ops, our “move” op, and even some readers. But how do we know it’s all correct? We can
select from our
Cats view, of course. But we want to be really sure. Plus there’s that pesky
Easy one first.
SwapCatNode can mean swapping sibling order, or switching a parent with a child or grandchild, or toggling nodes that are in completely different places in the tree & not related at all! This is the least logical operation, if you think about a proper hierarchy, but it turns out to be necessary sometimes. We’re just swapping the nodes’ position values & ParentIDs with each other, and updating ParentIDs on their children to each others’ IDs.
I really don’t even need to draw this one… but because I needed a header image, I did. Anyway, just get the rows with the given target IDs, swap the
ParentID values, and call it a day.
Now the complex. To validate that our tree is properly structured, the following statements need to be true:
- Each node’s Right value is greater than its Left.
- More to the point, each node’s Right value is greater than all of its ancestors’ Left values.
- Similarly, each node’s Left value is less than all of its descendants’ Left values (and Right values, obviously!)
- Leaf nodes have no gaps between Left & Right:
Right = Left + 1
Depthis easy to verify because we already wrote the rCTE to calculate it!
- And of course, no orphans – all ParentIDs lead to an actual parent node, except of course if they’re
We can either go thru lots of logical checks in different queries, or we can try building a mock tree out of the base adjacency-list structure (ParentIDs) and compare values. The latter will only help us with #1-5; the orphans problem is a different animal, but it’s also not part of the model per-se, so it’s actually good to separate that check from the rest. (And it’s really simple – use a
not exists query on
ParentIDs and presto, orphans checked!)
Building a mock-tree, or a “position re-builder”, will come in handy for another reason: Let’s say we need to completely revamp a subtree, i.e.
update a bunch of nodes at once because somebody royally screwed up that branch. And we’ve got our shiny fixed data, 100’s of rows, ready to go, if only those damn triggers weren’t there, preventing us from doing bulk operations! What we’d really like to do is, knowing the starting
insert all our new nodes with
PLeft values in sequence to each other (and not care about the rest of the tree); and/or,
update a few sets or families of nodes to massively re-order them, without having to call the
Swap routine one-at-a-time ad-nauseum. We also don’t want to care about figuring out correct
Depth values. After that’s all done, our new subtree will have “bad” position values, so we need to rely on some other routine to fix them for us, so that the tree can again be well-formed and things can go back to normal.
RebuildCatTree routine, we actually need to re-number all nodes to the right & above our “bulk-inserted” subtree, just in case we’ve caused things to move. And since we’ve re-ordered some siblings elsewhere, it turns out to be easiest, in practice, to re-number the whole tree. This is where our fair-weather friend
recursion comes in — and not just another
rCTE, but real stored-procedure recursion. This can get dicey; SQL only supports a certain # of recursion levels, and it can really eat up those CPU cycles & RAM buffers. So this should be done rarely, and preferably during a time where the tree is not under heavy usage.
The code samples are now available on my GitHub page. Comments abound!
I hope you’ve enjoyed this little mini-series. And now, I promise to move on to new topics & rantings of various nature! Thanks for reading.
I’d like to point future readers at two very informative articles for those interested in deep-diving down the hierarchical rabbit-hole: Aaron Bertrand, and Jeff Moden. There are many more tweaks and enhancements that can be made to the “classical” Nested Set model, which those lucky Devs/DBAs who are in a position to actually [re]implement their hierarchies will want to read about and take advantage of.