Let’s talk for a minute about views. No, not like scenery. Not like my family (they’re good peeps). I mean SQL views – i.e.
CREATE VIEW vwFooBar which combines commonly used fields (columns) from tables
Bar into a single entity, which datavelopers (term borrowed from Richie Rump, who has an awesome name btw) can then use & abuse without having to know the details of and relationships between those two tables.
So far, this sounds like a great idea, right? Encapsulation & reusability are great principals in IT/Dev land. And I agree! Views have their place, for sure. But some reason, I find myself constantly deconstructing them– refactoring complex queries & stored-procedures that use them — into their base tables (in the above example, that’s
Bar). And I find myself asking the proverbial WHY? Why is this such a common misstep of devs & query writers, that the DBA has to spend such time “fixing” the issues this can cause? Naturally, that train of thought spawned a blog post.
Let’s dive in!
So, somebody created a nice high-level view for us that summarizes
Order data into a single entity that should prove pretty handy for things like reports, ad-hoc analysis, and maybe even a display-grid on a page somewhere. And the happy developers don’t need to care about the foreign keys between the tables or how to
JOIN them properly.
But! Do we see a problem yet? Yes, our
vwCustomerOrder doesn’t include our Customer’s
JOIN the view to the
Customer table again just to get the Email? Neither of those options are ideal. The latter means that now we’re referencing the
Customer table twice; the former involves refactoring, and is even more difficult if the view is an indexed view.
Okay, well let’s say we’ve added Email to the view, and it’s working fine. Now let’s add another layer of complexity. Say
Customer is actually a
VIEW, which consists of base-tables
CustomerContact, where the latter stores a collection of contact entries for each Customer. Now,
vwCustomerOrder is thus a 2-level nested view. These aren’t really super fun, but they’re not the most offensive thing in the database. But again, what happens when we need something from
CustomerContact that’s not already in our “master” top-level view
vwCustomerOrder? Maybe the primary
Phone1, for example. So to build that query, we now have to combine our 2-level nested view with a redundant table (i.e. a table that’s already in the view)! But because it’s so terribly important to some reporting modules or some code bits, it becomes a permanent fixture of the schema. And on it goes.
Expanded example from earlier:
This is the problem with
VIEWs that are grown organically, re-actively, without careful attention to the underlying schema and dependencies. And look, I get it. Mature software system systems DO evolve organically; iteration is the name of the game, and that goes for the database too, not just the app code. But let’s agree that, like the boyscout principle espoused earlier in my ode to Clean Code, we can try to leave things a little better than how we found them. DB refactoring is … not necessarily harder, but definitely different (than app code refactoring). You probably have to jump thru more hoops to get the changes pushed to production. Regardless, it’s worthwhile. Why? Glad you asked!
Nested views are icky for a couple reasons. First, they’re more difficult to troubleshoot – not because they annoy the snot out of your DBA, but because they legitimately make index tuning more tedious. Second, laziness begets more laziness – when you’ve got these views, you’re automatically tempted to simply use, re-use, & abuse them, instead of taking the time to analyze their effectiveness and refactor, replace, or reevaluate the problem. Rampant ill-constructed views can lead to some serious technical debt. And…
There’s another nugget of wisdom in software development that’s appropriate here: Less is More. Your
views should address a specific need or use-case, do it well, and call it a day. Feature-creep is not a goal; views that try to be many things to many people ultimately fail at it, because you end up committing sins against set-theory & the relational model in favor of “oh just one more thing!” While the purpose of views is ultimately abstraction, we should not forget the underlying parts — tables, indexes, statistics, schema relationships. In fact, a good
view can help us construct the right indexes and visualize the relationships. Whereas a BAD
view will obfuscate indexing needs and confuse (or even outright lie about) the base relationships.
Venn-diagram time. We’ll use the circle areas to represent that “scope” of each
view, meaning the base-tables that comprise it. In a healthy schema, there will be mostly independent views with a few overlaps in small portions. These views “do one thing well”, with little redundancy and hardly any nesting. Conversely, in a haphazard ill-managed schema, there is lots of overlap and redundancy, multi-level nesting, and ugly colors (who the heck likes brown, anyway?).
So, dear reader, what’s the point? (I feel like I rhetorically ask that fairly often.) Build your SQL
views with targeted use-cases and clear purpose. Avoid the laziness trap and the temptation to tack-on “one more thing”. A
view should be a concrete, concise, abstracted representation of the underlying tables & relationships. Evaluate existing
views for how well they meet the developer needs, and don’t be afraid to deprecate, drop, or rewrite bad code. And above all, stop the insane nesting.
viewhas a dependency tree more than 2 levels deep, you’re doing it wrong.
Or, more visually…
That’s all for this week! Thanks for reading, and apologies for the slightly longer delay between posts.