The EAVil Cycle, Part 2

In which we discuss more about the EAV model and some of its merits and pitfalls.

Advertisements

continued from last week…

The Ugly (aka the “Wow really?!?”)

You’ll see this ‘creep’ even in product-catalog systems as mature as Amazon’s. If you search for (geeky as we are) graphics cards, and use the product attribute filters in the left pane to narrow it down, you’ll find that some correctly have their memory type (GDDR5, etc.) listed, while others may not. If you’re really unfortunate, there will be two semi-redundant attribute-sets that you’ll have to “juggle” between to really get at what you want. TVs, for example, may see both an “HDR support” (yes/no) and an “HDR type” (standard, ultra, etc.) — I’m kinda pulling those out of my arse for example’s sake, but you get the point.

Why does this happen? Because at some level, humans are still responsible for ‘tagging’ those products (for lack of better word). And as much encouragement and guidance as the ‘admin software’ may give them, they can (and do) still decide at times to side-step that guidance and say “Nope, I know better; make me this new thing!”

But isn’t that a problem with nearly all data-driven systems? Sure, of course it is. Yet with a model as flexible as EAV, the problem is intensely magnified by the fact that it’s made so easy to do — to ‘extend’.

so preoccupied with whether or not you could, you didn't stop to think if you should
It’s probably not the exact quote, for you pedants, but it’s close enough for government blog work.

And unfortunately, the biggest contributor to this problem is the lack of development-time and forethought given to the administration, or management, of the data. You see, this problem could be largely assuaged if the admin-toolset were the FIRST thought and priority in the roadmap. But so often, that thought comes LAST, if at all. So sure, your product feature tagging system looks great, it’s flexible and your customers love it. But you’re throwing tickets over the wall to your data team every time a requirement or use-case changes, or when you need to fix a data-quality problem caused by the users not knowing there was already a “Widget Type” before creating their new “Widget Kind” tag, or misspelling “Is Wierd” because English is weird and has more exceptions to the “I before E” rule than not.

Does this problem go away with a dedicated search-index or NoSQL technology like Elasticsearch or (shudder) MongoDB? Of course not! If anything, it may be worse. Maybe. But wait, those systems make it easier to de-dupe and manage redundancy & data quality, don’t they? Probably. I can’t speak from experience myself, but I’ve heard good things. Once again, it all comes down to the effort you’re willing to invest in the system. If you make data quality a priority, you’ll be happier with the experience. If you don’t, well you’re just another amateur data scientist complaining about dirty non-standardized/non-validated address fields, aren’t ya?  =P

I joke with the data scientists, of course. What they do is awesome. They just tend to often re-invent the wheel of data-cleansing/data-wrangling that we DBAs have been doing for a few decades, because they didn’t know the right questions to ask or the right place to look. We need to get better at working together WITH them, not ‘for’ or ‘against’ them.

ninja cat riding a unicorn with laser-eyes
How the data scientists see themselves…

The Why or When (aka “Is it a decent model for this?”)

The long-story-short version is, consider your business and your data. Try to plan for the future, and anticipate potential changes and growth. It’s not easy, and we never “get it right the first time”. But we can try.

When your attributes are fairly static, and you know that you can tightly control them, you might consider a more rigid model. Something with a handful of lookup tables referenced by the main product entity. This is advantageous for performance and management, at the expense of scalability and extensibility.

When you literally need to support on-the-fly extension, and you’re absolutely married to SQL (i.e. not ready to venture out into NoSQL land just yet), the EAV model may fit the bill. Aaron’s article, and the comments therein, present some fairly valid and reasonable implementation suggestions to make it a little more palatable. Just beware the date — that was written back in 2009. Before we had such things as Elasticsearch and its ilk. I’d heavily encourage the consideration of purpose-built data-stores for this sort of thing, if you have any hope of scaling-out.

Other tools in your toolbox can help with this, too. For example, consider an in-memory data-grid for super-fast reads. The vast majority of data-access to these attributes & values is going to be reading, using it to filter & slice & dice a data-set. You can pay the small performance cost (e.g. write to the underlying SQL database) on the rare occasion when a write/change needs to occur.

In Conclusion

Proving the age-old rule of “Just because you CAN, doesn’t mean you SHOULD”, the EAV model is sometimes okay and sometimes not. You need to understand your business and your data to make that call. And you need to consider the magnitude of effort that may be involved in pivoting from one model to another. Unfortunately, in many cases, that part overshadows the rest, and the show business must go on.

queen the show must go on
You’re welcome again, ears.

Still, I encourage you to at least think about it, and be ready with that knowledge of pros/cons when the time is right to discuss it with stakeholders.

The EAVil Cycle

In which we discuss the EAV model and some of its merits and pitfalls.

EAV, or Entity-Attribute-Value, is an data model that’s been around the block. It’s typically injected into a relational database at some point during the overall application/architecture life-cycle, somewhere between when the team realizes that they’ve got way too many lookup tables for a “main business entity” thing, and when they finally make the shift into polyglot data stores.

Wake me up when that actually happens, successfully.

I’m not going to rehash aging internet arguments here, nor bore you with replicated diagrams that you could just as easily look up on Google Images. No, instead, I’m going to tell you why this model is not great, why it’s not bad, and how you should go about deciding if it’s not wrong for you.

Yes, I did negate all those adjectives on purpose. Mostly because nobody really enjoys working with these structures, regardless of how they’re implemented; customers and business stakeholders are ALWAYS CHANGING the requirements and the attributes in question. But, we press on.

Refs:

PS: Postgres is looking really freakin’ cool these days.

the good the bad and the ugly
Another Clint Eastwood pic… apologies if you’re not a fan. =P

The Good (aka “Not Bad”)

Proponents tell us that this model is easily searchable and easy to administer. The “searchable” bit is true; especially when you have the attribute-values pre-defined and don’t rely on end-user text-entry. But that’s true of basically any data model. The key here is that all attribute-values are effectively in one “search index”. But wait, don’t we have purpose-built search indexes nowadays? (Hint: see Elasticsearch.) This will come up again later.

Administerable? Administrable? Administratable? Damn you English! Anyway. Yes, again, if you’re fairly confident in your business users’ ability to effectively track and de-dupe (de-duplicate) incoming requirements/requests using their own brains/eyeballs and the admin tool-set that you build for them.

Oh, right, did I mention that? You have to build the admin app. Because you do NOT want to be writing ad-hoc SQL queries every time a new attribute requirement comes in. (Still, it’s better than making schema changes for new req’s, as I’ll discuss in a bit.)

Mainly, though, the biggest ‘pro’ of this model is that your business requirements, i.e. your attributes and the values they’re allowed to contain, can be flexible. The model allows a theoretically infinite amount of customization to suit your needs; though in practice, as Allen writes in the CodingBlocks article, you do run up against some pretty hard scalability hurdles right-quick. So in practice, you might want to consider more horizontally-scalable data stores, or (God help you) try scaling-out your SQL databases. (Spoiler-alert: big money big money!)

shut up and take my money
Millions of query-bucks…

The Bad (aka the “Not Great”)

Which brings me to the first ‘con’. Performance. If you’re modeling this in a relational DB, and you expect it to scale well, you’re probably overly optimistic. Or very small. (If the latter, great! But you don’t always want to be small, right? Right!)

Don’t get me wrong; you can make it work half-decent with good indexing and sufficient layers of abstraction (i.e. don’t write a “kitchen-sink view” that’s responsible for pivoting/piecing-together all the attributes for a product straight outta SQL). But again, is it really the right tool for the job?

Momentary digression. SQL Server, or more generally, the relational database, has long been touted as the “Swiss army knife” of IT; we’ve thrown it at so many problems of different size and shape, that we’ve almost lost track of what it’s actually very GOOD at. Hint: it’s about relationships and normalization.

Another argument against it seems to be data integrity and enforcement. I can understand that, but again, with some clever software overlay and user-guidance, this can become almost a non-issue. But remember, your developers are the ones building said software. So that effort needs to be considered.

The Ugly (to be continued…)

The biggest problem, and quite a legit one, is ‘creep’ — both scope and feature. See, the inherent flexibility in the model will almost encourage data managers to be less careful and considerate when deciding when to add an attribute or value-set, and how to govern the data-set as a whole.

creep wish i was special
No, not THAT creep. But, you’re welcome ears.

Stay tuned for more…

Nested Set++ Wrap-Up

So we’ve built our Cat Tree. But how do we know it’s all correct?

One more time, with feeling!  Not that this dead horse needs another beating, but I did promise…

So we’ve built our Cat Tree.  We’ve written our CrUD ops, our “move” op, and even some readers.  But how do we know it’s all correct?  We can select from our Cats view, of course.  But we want to be really sure.  Plus there’s that pesky SwapCatNode method.

Easy one first.  SwapCatNode can mean swapping sibling order, or switching a parent with a child or grandchild, or toggling nodes that are in completely different places in the tree & not related at all!  This is the least logical operation, if you think about a proper hierarchy, but it turns out to be necessary sometimes.  We’re just swapping the nodes’ position values & ParentIDs with each other, and updating ParentIDs on their children to each others’ IDs.

I really don’t even need to draw this one… but because I needed a header image, I did.  Anyway, just get the rows with the given target IDs, swap the PLeft, PRight, Depth, and ParentID values, and call it a day.

Now the complex.  To validate that our tree is properly structured, the following statements need to be true:

  1. Each node’s Right value is greater than its Left.
  2. More to the point, each node’s Right value is greater than all of its ancestors’ Left values.
  3. Similarly, each node’s Left value is less than all of its descendants’ Left values (and Right values, obviously!)
  4. Leaf nodes have no gaps between Left & Right: Right = Left + 1
  5. Depth is easy to verify because we already wrote the rCTE to calculate it!
  6. And of course, no orphans – all ParentIDs lead to an actual parent node, except of course if they’re NULL (root nodes).

We can either go thru lots of logical checks in different queries, or we can try building a mock tree out of the base adjacency-list structure (ParentIDs) and compare values.  The latter will only help us with #1-5; the orphans problem is a different animal, but it’s also not part of the model per-se, so it’s actually good to separate that check from the rest.  (And it’s really simple – use a not exists query on ParentIDs and presto, orphans checked!)

Building a mock-tree, or a “position re-builder”, will come in handy for another reason:  Let’s say we need to completely revamp a subtree, i.e. insert & update a bunch of nodes at once because somebody royally screwed up that branch.  And we’ve got our shiny fixed data, 100’s of rows, ready to go, if only those damn triggers weren’t there, preventing us from doing bulk operations!  What we’d really like to do is, knowing the starting ParentID, just insert all our new nodes with PLeft values in sequence to each other (and not care about the rest of the tree); and/or, update a few sets or families of nodes to massively re-order them, without having to call the Swap routine one-at-a-time ad-nauseum.  We also don’t want to care about figuring out correct PRight & Depth values.  After that’s all done, our new subtree will have “bad” position values, so we need to rely on some other routine to fix them for us, so that the tree can again be well-formed and things can go back to normal.

nsm-cat-bulk-reorder-insert-rebuild
Simon & Tigger got re-ordered. Then we bulk-added 3 new Cats under Mittens and didn’t know what their position values would be, so we let the rebuilder take care of it.

In our RebuildCatTree routine, we actually need to re-number all nodes to the right & above our “bulk-inserted” subtree, just in case we’ve caused things to move.  And since we’ve re-ordered some siblings elsewhere, it turns out to be easiest, in practice, to re-number the whole tree.  This is where our fair-weather friend recursion comes in — and not just another rCTE, but real stored-procedure recursion.  This can get dicey; SQL only supports a certain # of recursion levels, and it can really eat up those CPU cycles & RAM buffers.  So this should be done rarely, and preferably during a time where the tree is not under heavy usage.

The code samples are now available on my GitHub page.  Comments abound!

I hope you’ve enjoyed this little mini-series.  And now, I promise to move on to new topics & rantings of various nature!  Thanks for reading.

~Fin~

Update:

I’d like to point future readers at two very informative articles for those interested in deep-diving down the hierarchical rabbit-hole: Aaron Bertrand, and Jeff Moden.  There are many more tweaks and enhancements that can be made to the “classical” Nested Set model, which those lucky Devs/DBAs who are in a position to actually [re]implement their hierarchies will want to read about and take advantage of.

The Nested Set Model++

This time we talk about adding a Depth field, and good ol’ CrUD ops – Create, Update, Delete.

Since my first post on this topic got a lot of attention and traction, I felt it appropriate to expand on the topic a bit, even if it’s been largely covered by other bloggers in the past.  I’ve also found it very useful to have a “depth” field, which isn’t canonically part of the model (hence the “++” in the title!), but is quite handy not only for display purposes (while you’re querying & testing the thing), and also for making certain “get” ops easier.  Sure, it adds a wee bit more to structural maintenance, but since that’s already the most complicated part of the model anyway, it’s hardly worth a second thought.  So let’s dive in!

The big topic last time was this operation of “move a subtree” — of course, sometimes you’re just moving one node, but only if it’s a leaf; otherwise you’re moving a node and all its descendants, so I’ve kept the procedure name MoveCatSubtree intact.  This time we’ll talk about good ol’ CrUD ops – Create, Update, Delete.  In my implementation, I chose to handle these with table triggers.  Some would argue in favor of stored-procs, and while that would seem “more consistent” with the precedent set, I’d counter with 2 points:

  1. To be really fool-proof, you’ll need to prevent ungoverned inserts/updates/deletes anyway; you could either do this with GRANT/DENY permissions, or triggers.  Permissions would be more complex because you’d still need your users to be able to exec the CrUD procs, so you’d end up using some convoluted security mechanisms that can be tricky to maintain over time.
  2. With triggers, we can allow the consumers of the data (apps, users) to continue to use “plain-ol’-TSQL” to access and manipulate the data, instead of having to remember stored-proc names and hunt for documentation on them.  (The exception being, of course, MoveCatSubtree, which, honestly, could be integrated into the insert trigger, but I’ll leave that as an exercise to the reader!)

Again, yes, we could easily do the same implementations in stored-proc form, and you’re welcome to fork my GitHub repo if you feel like exploring that.

Let’s outline the steps and draw some pictures.

1. InsertMake a hole!

When we INSERT a node, we want to specify its parent and a name, and let the triggers do the rest!  We place it at the right of its siblings-to-be, and update the position values of all nodes to the right so that everything stays kosher.  This should sound familiar — it’s essentially that “make a gap” part of the subtree-move op.  In terms of depth, we just +1 to the parent’s.

nsm-cats-insert-gadget-under-stripes
Stripes is a breeder; Gadget comes in and makes Fluffy & children move over.

Also, for some reason, our cats reproduce asexually…

 

2. Delete: Think of the children!

Similarly, to DELETE a node, we want to “close the gap” left by said deleted node.  But what of the children?  We don’t want to leave any orphans behind!  So we “promote” the children of our deleted node to the level (depth) of their parent, sandwiching them in between the deleted node’s siblings (aka their former aunts/uncles!).  This is easier than it sounds.

nsm-cast-delete-fluffy
He killed Fluffy!

Fluffy is survived by his children, who are now for some reason his siblings, and are very confused by their sudden increase in age & status.

3. UpdateRename; everything else is encapsulated.

Finally, we only allow UPDATEs on the Name, because everything else (position values, depth, parent) is structural, and encapsulated by our tree maintenance logic.  Moving a node or subtree?  MoveCatSubtree.  Swapping positions with another node?  SwapCatNode (TBD!).

4. Depth: Set it once, & encapsulate it!

Depth is pretty simple to add if you’ve already got a tree full of data.  We can use a recursive common table expression, or “rCTE“.  While normally these are frown-worthy (remember, recursion is not SQL’s strong suite), we’re only using it one time to populate an existing data-set, so we can keep on smiling.

;WITH CatTree AS
(
    SELECT CatID, ParentID, Name, PLeft, PRight, Depth = 0
    FROM nsm.Cat
    WHERE ParentID IS NULL
  UNION ALL
    SELECT cat.CatID, cat.ParentID, cat.Name
        , cat.PLeft, cat.PRight, Depth = tree.Depth + 1
    FROM CatTree tree
    JOIN nsm.Cat cat
        ON cat.ParentID = tree.CatID
)
UPDATE cat SET cat.Depth = CatTree.Depth
FROM CatTree
JOIN nsm.Cat cat
    ON cat.CatID = CatTree.CatID

The last order of business (for now) is to add Depth support to our MoveCatSubtree method.  As illustrated below, we have to move the subtree “up” or “down” in Depth depending on its new parent’s position relative to its old position.  The details are, of course, in the GitHub repo, but here’s a quick snippet of what that looks like: NodeNewDepth = /*NodeCurrent*/Depth + (@NewParentDepth - @SubtreeOldDepth) + 1  (where @SubtreeOldDepth is the depth of the top node of the moving subtree.)

nsm-cast-move-jack-to-mittens
Move Jack to under Mittens; I won’t repeat the Left/Right logic, just note the Depth logic.

 

In a future little addendum, I’ll briefly go over the “get” queries and that TDB SwapCatNode method.  For now,  enjoy the cats (again)!  Thanks or sticking around, I know it’s been a few more weeks than normal.

PS: A big thank-you to the dudes in the CodingBlocks #blogging Slack channel for their encouragement and motivation to get this done!  You guys rock.  Check out their blogs for some terrific content: http://dotnetcore.gaprogman.com/ , http://www.codeshare.co.uk/ , http://thereactionary.net/ .

Update:

I’d like to point future readers at two very informative articles for those interested in deep-diving down the hierarchical rabbit-hole: Aaron Bertrand, and Jeff Moden.  There are many more tweaks and enhancements that can be made to the “classical” Nested Set model, which those lucky Devs/DBAs who are in a position to actually [re]implement their hierarchies will want to read about and take advantage of.

The Nested Set Model

The #1 rule of the Nested Set Model is: FAST READs. The #2 rule of the Nested Set Model is: see #1

There are probably definitely several articles out there which cover the SQL implementation of the Nested Set Model, aka “modified preorder tree traversal” (which is more the name of the algorithm by which you traverse the tree, rather than the structure itself).  But I found it interesting enough, and more importantly, applicable enough to my job experience, that I feel it deserves some treatment.  Not the basic “how to”, but more an example of a particular operation and a specific pitfall to avoid. (Jump straight to the example diagrams.)

Now, we’re not going to debate about whether this model is “the best” representation of hierarchical data in an RDBMS (some argue that Closure Tables, aka “Ancestor Tables“, or some kind of hybrid approach is better, and I’d probably agree).  The fact is, sometimes (read: almost always) as a DBA/DBDev, you’re “stuck with” an existing database in a legacy application environment that you pretty much can’t change — or if you can, changes need to be small, incremental, and non-disruptive.

Okay, with that disclaimer out of the way, let’s dive in.  First things first:

The #1 rule of implementing the Nested Set Model is: FAST READs.

I can’t stress that enough.  Fast SELECTs.  Everything else pales in comparison.  In other words, we don’t care how long and painful and slow write operations are against this table (updates, inserts, deletes), as long as our SELECTs remain super speedy.  If that is not your use-case, consider a different model.

The #2 rule of the Nested Set Model is: see #1

Moving on…

The #3 rule is: encapsulate tree operations to maintain its integrity & structure.

Put another way, the #3 rule is that you should always operate on the tree (CrUD ops) using stored-procedures and/or triggers that encapsulate all the nitty-gritty details of maintaining the correct position values during said insert/update/delete operations.  Of course, somebody is responsible for writing those stored-procs.  Any volunteers?  Easy now, don’t raise your hands all at once!  Generally, this responsibility falls to the DBA(s) or DBDev(s).

The problem at-hand, in my current situation, was that of “moving a sub-tree”, i.e. taking a node and all its descendants, and moving it to place it under another “parent” node.  In some models, and/or in some languages, this is a simple recursive operation.  However, SQL is not spectacular at recursion — after all, we’re working in a relational engine — so let’s try to play to its strengths:

namely, SET-BASED operations!

A previous DBDev had written a stored-proc for just such an operation.  However, as (somewhat) expected, it was horribly slow, to the tune of hours of run-time.  This is not acceptable, even given the #1 rule stated above.

Well it turns out that most of it was pretty efficient, but the last step, in which they attempted to “fix” the left/right values in the entire table “just to make sure we didn’t leave any gaps“, was, frankly, quite silly.  Because the only “gaps” you create are created by the previous steps in the proc, and you know exactly how big that gap is (the width of the subtree you’re moving), and where it is, so you should be able to target that specific area of the tree and close the gap more intelligently, using some simple math. (addition and subtraction — the simplest math there is!)

Doing that improved the performance of the whole proc by a factor of 10.  That’s huge.  Or, “yuuuuge“.

So let’s get specific.  As you’ll see from my diagrams, the model actually is a hybrid, combining an Adjacency List (each record knows its “parent”) with a Nested Set (each record has a “left” & “right” position value).  We do this for two big reasons.  First, having the parent relationship along with the position values makes all that nasty book-keeping (rule #3) a bit easier to manage (and to check our work).  And second, because, conveniently, we can store the data from both models in one table.

On to the examples!

First, we have our tree of Cats.

cat-tree-1
Or, as a coincidentally cute table alias, CatTree

Now, we want to move Jack & his children to become descendants of Mittens (Jack being the child, Smush & Smash being grandchildren).  So we start by “making a gap” of the subtree’s “width” (6, the distance between Jack’s PLeft and PRight inclusive of end-points).  We add that amount to all PRight values >= Mittens’ original PRight,  and add it to all PLeft values > Mittens’ PRight — see the blue #s in diagram below, and code here:

UPDATE Cats
SET PLeft = (CASE WHEN PLeft > @NewParentRight
             THEN PLeft + @SubtreeSize
             ELSE PLeft END)
  , PRight = (CASE WHEN PRight >= @NewParentRight
             THEN PRight + @SubtreeSize
             ELSE PRight END)
WHERE PRight >= @NewParentRight

The red values haven’t changed (yet) but are now wrong, so we’ll have to fix them next.  And of course the green values are the moved subtree’s new positions based on the new parent’s (Mittens) PLeft.

cat-tree-2
Jack is now Mittens’ child.

Finally, now that we’ve moved Jack & his children under Mittens, we need to “close the gaps” that we created at first, to make sure that the tree’s position values remain contiguous.  This isn’t as difficult as it sounds: if we’ve stored Jack’s original PRight value (10), we can use that as a cutoff to subtract the subtree width from higher position values and intelligently (and quickly) close the gaps we created before.  Again, code & diagram:

--Notice this looks very similar to the previous
--code snippet! (We're basically doing the reverse)
UPDATE Cats
SET PLeft = (CASE WHEN PLeft > @SubtreeOldRight
             THEN PLeft - @SubtreeSize
             ELSE PLeft END)
  , PRight = (CASE WHEN PRight >= @SubtreeOldRight
             THEN PRight - @SubtreeSize
             ELSE PRight END)
WHERE PRight >= @SubtreeOldRight
cat-tree-3
Red values indicate “closing the gap” that was created by removing the subtree of Jack. Blue values indicate the incidental gap closures for the rest of the tree (above and right). Green values, you’ll notice, are “reverted” (i.e. same as they were originally).

SQL-wise, this should translate pretty well.  I’ve posted the setup and stored-proc scripts to GitHub, so the distinguishing reader can review and offer feedback.  In theory, there’s probably a way to exclude the green reverted values from the first pass operation (gap-making) so that we don’t have to revert them (at gap-closing), but again, since we’re doing SQL set-based operations, it seems hardly worth the effort — i.e. the potential speed gain would be outweighed by the logical/maintenance complexity.

 

So what’s the lesson here?  Well hopefully, if you’re “stuck with” a SQL DB with a Nested Set Model table containing a hierarchical tree of data, you don’t have to completely re-invent the wheel and write your CrUD ops from scratch.  But if your predecessors didn’t plan for certain kinds of operations, and this “move a subtree to a new parent” happens to be one of those, this should help you (re)implement it efficiently.

I’d love to get some feedback on this.  Let me know if I’ve missed anything conceptually, if there are better ways or methods to doing any of this, or any other tips & tricks that folks might have for dealing with such data.  Leave me a comment!

[footnote 1]
The root of the problem, in this case, was simply taking the code from a slideshare presentation and copy-pasting it into the routine without analyzing its effectiveness and efficiency.  It proposed re-calculating the position values after a move, across the entire tree, by using a triple-cartesian-product (or cross-join) to “get the count of nodes to the left/right of each node” for every node, which should sound dirty even as you say it silently in your head, let alone attempt to write it in query form!

[footnote 2]
There’s a 3rd model that we could consider storing in the same table, called “Enumerated Path” or “Materialized Path” or “Breadcrumbs”, which may look good on paper and to your human eyeballs, but breaks down spectacularly when you start talking performance and scale — but to be fair, so do most of these models, eventually, in one way or another, which is why we’ve invented fantastic alternative technologies to address these problems… and frankly, if you’re using all 3 models at once, you’re #DoingItWrong, creating a veritable maintenance nightmare for yourself and everyone around you.  Note that the elusive 4th model, the Ancestor Table, requires (as the name would imply) another table — not an argument for or against anything, just an observation.

PS: Happy 2017!