T-SQL Tuesday #122: Impostors!

Be humble, but be confident in your own expertise. Realize that you will sometimes be the teacher, and sometimes the student.

Howdy folks! It’s been a while. My last post was in October of last decade (har har)! I’ve been busy with other projects, work, and life in general. ūüôā

Now, this month’s invitation is hosted by John Shaulis. I’m always excited to see someone in the #SQLcommunity that I’ve never heard of before. Looks like he’s familiar with the same WordPress templates as I am. Heh. I like it.

Side-note, I’ve been planning to buy my own domain and remove the ads here for a better reading experience (and more professional feel), so hang in there… it’s happening soon!

And now without further ado.

Who, Me?

Everybody reading this knows what impostor syndrome is. I won’t bore you with that. Fact is, if you’re in tech, you either have it or you know someone who does. If neither, well you’re probably a narcissist surrounded by other narcissists and you might wanna look for another job. Heh heh.

oh, get a job? just get a job? why don't I strap on my job helmet, get into a job cannon, and fire off into job land, where jobs grow on jobbies!
I think this is from “It’s Always Sunny in Philadelphia” but I’ve never seen the show.

Yes, You!

I’ll share a personal and ongoing example of where I’m constantly feeling “not good enough at my job”. It’s about server migrations. You know, where you take a running production SQL Server instance, and you have to move it to new hardware/infrastructure. Maybe it’s a physical (bare metal) cluster to a virtualized environment. Maybe it’s just one VM to another VM on a newer platform. It typically involves taking a maintenance window and convincing business stakeholders that the “downtime” is worthwhile.

Now surely we’ve all heard of DBATools by now, right? (If not, go there right now and check ’em out!) They were BUILT to do migrations. Yet I’ve not been able to successfully use them as the whole and ONLY toolset for that purpose. Let me clarify: I’ve used pieces of the framework (such as the copy logins / users command, and others) to HELP me in the migration project, but never been able to simply fire off Start-DbaMigration and go get coffee and watch as things magically successfully fall into place.

dog in a room on fire, casually drinking coffee, saying "this is fine."
It’s working EXACTLY AS IT SHOULD!

Couldn’t Be!

Even when I’ve built a beautiful check-list, lined up all my scripts, scheduled things via Agent Jobs, double-checked names and permissions and networks and drive letters and shares. Still, something inevitably goes wrong. Maybe it’s just that a database refuses to go into READ_ONLY mode when I ask it to. Maybe it’s that a whole series of jobs get lost because they were in the wrong category. Maybe user permissions don’t come over on this database because I didn’t sacrifice a chicken on the night of the blood-moon while the wind blew north-east.

Surely, by this point in my 10+ years as a database professional, 4 years at this particular company and role, I would be able to do this with my eyes closed. With no hiccups or misfires. Right? RIGHT??

murphy's law: if anything can go wrong, it will go wrong.
Full plaque samples, which are quite entertaining, can be seen here & here, to name a few.

Then Who?

Wrong. Murphy’s Law is a real thing. More than that, things CHANGE. The environment is always evolving to fit the business needs. The technology is always just a little ahead of my own knowledge-base. And every SQL instance in this environment is, unfortunately, still, a ‘pet‘. Not a ‘cattle‘ (cow?). They’re each carefully and fearfully constructed to suit a specific set of application and business needs, each with their own nuances and barely-documented dependencies.

Plus, it’s not like we’re doing these migrations very often. Anything you don’t do on a constant basis, at least a few times per week, tends to sink back toward the bottom of the mental stack, and you forget the details and gotchas that were involved because your brain needs to keep more important and relevant things near the top, for what you’re working on “right now”. This is normal, at least in my mind.

DevOps stole the cookie from the cookie jar!

Just hire a DevOps Engineer to solve all your problems. Then everything will be all sunshine and rainbows and unicorns. All your servers will be cattle, replaceable little containers that are allowed to fail because there’s always another one waiting to spin-up and replace it. Not like your DATA is important or anything. It’s only the lifeblood of the entire company.

Not really sure where I was going with this. I love DevOps, don’t get me wrong, and it does have a part to play in making the traditional RDBMS platform more agile, but the devilish details and difficulties therein are ALWAYS overlooked/brushed-off in popular discourse, and it’s got me a little jaded.

I mostly wanted to finish the song I’d started with the title-blocks. =P

N.

But srsly. Instead of feeling guilty about your impostor syndrome, just own it. Be humble, but be confident in your own expertise. Realize that you will sometimes be the teacher, and sometimes the student. And hopefully more often the latter, because there’s exponentially more to learn every year we evolve on this crazy spinning rock we call Earth.

live as if you were to die tomorrow. learn as if you were to live forever. -Mahatma Gandhi
Good advice in general.

Dealing with Time Zones in SQL

In my sample script, I have a “transaction table” where I insert a few events for each office, with the ‘original date’ in PST, and I show how to easily convert that to each office’s local time, or visa-versa.

I’ve ranted about times, datetimes, and the like before. As most programmers & IT pros do, I loathe time zones with a mild passion. As one of my favorite #SQLCommunity members quips:

daylight savings time was actually created by a government works project to ensure that programmers could forever write tedious conditional logic in their date-based queries [and code].

Bert Wagner

Thankfully, in SQL 2016, MS has somewhat heard our outcries and given us an easier way to convert things between time-zones in a somewhat sensible manner.

Resources:

  • The at time zone keyphrase
  • The system metadata table that drives it
  • A good blog post that explains essentially what I’m about to (because it’s great to have different perspectives and unique voices!)
  • Two StackOverflow answers (one, two) on the topic
  • Bert’s succinct and fun post about the same topic, with video!
give me utc or give me death
This is Patrick Henry. He famously said “Give me liberty or give me death!”. Now you know.

A Use-Case

In fact, I’m not even going to repeat what 2 (and many more, I’m sure) people have already said. Go check out their posts above! But since you’re here, this is how I use it for my reporting environment.

To start with, the transactional data is in PST/PDT — i.e. Pacific Time with DST fluctuation. Yes, it’s horrible. No, I don’t know what happens to events or jobs at 2am on the “Fall Back” date, or between 2am and 3am on the “Spring Forward” date. No, I can’t change it right now. Stop whining.

Now, I have offices in Paris France, Hong Kong, and Beijing China. These are 3 different “time zones”, but only 2 different offsets — China and Hong Kong are in the same bucket, namely, UTC +08:00. More on that later.

So I have my OfficeLocation lookup table:

Office  | TimeZone
--------|-------------
Paris | 'Central Europe Standard Time'
Beijing | 'China Standard Time'
HK | 'China Standard Time'

(Again, see below for why we can’t call HK’s zone “Hong Kong Time” like most websites/APIs would assume.)

Now, the cool thing about this is, we can pull those strings into a variable, or use them straight from the table, to convert our PST/PDT times to the appropriate zone.

Here’s a variable example:

DECLARE @TimeZoneStr sysname; --"sysname" is just nvarchar(128)
SELECT @TimeZoneStr = TimeZone
FROM OfficeLocation
WHERE Office = 'Paris'

DECLARE @MyTimeNow datetime = GETDATE();
DECLARE @TimeInParis datetime;
SELECT @TimeInParis = @MyTimeNow
AT TIME ZONE 'Pacific Standard Time' --Converts to datetimeoffset
AT TIME ZONE @TimeZoneStr --Shifts it to Paris time

PRINT ('The time now in Paris is ' + CONVERT(varchar(30), @TimeInParis, 121);

And here’s an example using the field straight from the table.

SELECT Office
    , [Time in my location] = GETDATE()
    , [Time in remote office location] = CONVERT(datetime, GETDATE()
        AT TIME ZONE 'Pacific Standard Time' 
        AT TIME ZONE ol.TimeZone)
FROM OfficeLocation ol

See the Gist for a full-fledged sample script. In it, I have a “transaction table” where I insert a few events for each office, with the ‘original date’ in PST, and I show how to easily convert that to each office’s local time, or visa-versa.

working with time is easy -jon skeet
Well YEAH, if you’re frickin’ JON SKEET! =P
But srsly, go watch his presentation and read his stuff.

Gotchas

Here’s the major catch. The information available to YOUR instance of SQL Server is pulled from that server’s Windows Registry hive. No, I’m not making this up. So if that box doesn’t know about, say, ‘Hong Kong Standard Time’, and you try to use that in your SQL statement.. you’re hosed.

And yes, that is a real example from my own experience.

This article shows the “Windows standard format” time zone list. As you can see, they merged some zones with others because.. they felt like it? But apparently Central Europe Standard Time, Central European Standard Time, and Romance Standard Time (all UTC +01:00) were completely necessary to keep separate. Go figure.

In my use-case above, then, I couldn’t actually store the string Hong Kong Time, because my SQL instances (hence my Windows Servers) don’t know what that is. Thankfully, at least for this decade, it doesn’t look like Hong Kong and China will diverge in terms of their geopolitical directions, and we’re safe to assume that HKT = CST (China Standard Time, not to be confused with US Central Time!).

In another example, the typical go-to site for timezone questions says Japan observes “Japan Standard Time”. Obviously enough. But Microsoft, in their infinite wisdom, decided to call that “Tokyo Standard Time”. Go figure again.

It also kinda makes you wonder.. how does this work on SQL on Linux? No such thing as “the registry” there. I’m sure there’s an internal OS data-store that houses time-zone info, of course. Heck, they might even be better at it than Windows. But it makes you think.

world time zone map
Disclaimer: this is probably already out-of-date. And it’s from Wikipedia.

So What?

If you’re not already running SQL 2016 or upward, this should give you yet another compelling reason to upgrade. Seriously.

And don’t do what I did and attempt to store a “business locations with time-zone offsets” table, that you have to remind yourself every 6 months to go update (manually), and will inevitably fail to do so, and will not support any sensible manner of long-term historical reporting.

More to the point, don’t try to implement dynamic time-zone logic and calendaring yourself, in general. Because trust me, you’re not gonna get it right. Use the built-in tools, use the community resources, and be smart.

That’s all for today! ‚̧

Facepalms Per Hour

My current velocity is sometimes measured in FPH – facepalms per hour.

This is a rant. Fair warning.

I guess the new ‘Millenial’ colloquialism for “grumpy” or “sarcastic” is “salty“. So I’m feeling extra salty this week. For several reasons. One, it’s audit season. Two, I had to churn out about a dozen new reports in the span of 4 days because the manager who was supposed to be tracking that project dropped the ball and forgot they were due by the end of this month until… yeah, last Friday. Wheeeeee!

Thus, I decided, my current ‘velocity’ (a SCRUM/DevOps term for “how much work are you getting done”) shall be measured in FPH – Facepalms Per Hour. Currently I’m at 3. Earlier this week I was approaching the double-digits, when the lovely report consumers kept thinking of “just one more little thing” they forgot about until after I’d delivered the ‘final’ product.

‘Final’ actually being a meaningful adjective in this context approximately NEVER.

facepalm original picard
The original gangsta.

Change Logs

How best to describe this scenario while still maintaining separation of “real job” from “blog land”… Hrm. So let’s say we have a CRM, like most companies. This stores customers, among other things, in a database. And since it also stores sales transactions and financials, it’s heavily audited — it has a lot of change-tracking mechanisms.

Now, auditors come along and want a report of some specific type of change over time. I happily oblige. Then… PANIC! And not at the disco. “What are all these changes to these customers by these users who don’t have permission to make said changes?!?”

K, calm down sparky. Try not to sound the alarm; auditors are a sensitive bunch.

Turns out, those changes are, in a word, “fake”. You see, there’s this background “customer sync” process that keeps them up to date with another part of the CRM where the actual changes were made. But, because it’s written poorly, it thinks that ANY field change, even just the Name or Address (which a lot more CSR’s, customer service reps, have the permission to change, because, you know, that’s their job), constitutes a change to the ENTIRE customer record on the other end. So the change tracker logs a change to every single field on the receiving end of that sync process, even though nothing really changed on the source side except maybe one or two fields.

With me so far? Great. So now the question is, “Well, can we get a report that doesn’t show those ‘fake’ changes?” But wait, it has to be “system generated” and you’re not allowed to “filter” or “add special exceptions” to it, because it still needs to be audit-able.

So what you’re saying is, give me a report that shows me what I care about, but you’re not allowed to change the logic behind said report.

Riiiiiiight.

So I give them a new report. I don’t explain how the sausage is made, I just make it and serve it up. “But why is this different from the original report?”

double facepalm

Well, do you want the audit-able answer, or the real answer? The audit-able answer is, “We made a system change that allowed us to prevent the ‘fake’ changes from being logged incorrectly.”

The real answer is, “B*tch, I AM the system!” — meaning yes, I excluded those with some hacky logic, and you need to stop asking questions about it.

Anyway. Change Logs are super fun.

Reports

Speaking of reporting. I could really go on for pages about how terrible and broken this whole system of “request-based report development” is. But it’s frankly all we have right now. Until there’s sufficient business buy-in to the concept of agile data warehousing and collaborative cross-functional data modeling, shit just comes in one funnel and goes out another with a little sparkle spackled to it. And we call it a report.

Example, you say? Sure! Let’s say we run a special sale on certain types of widgets every quarter. We want to track how these ‘specials’ perform — do they increase our sales of those widgets? By what factor, compared to the other not-on-sale widgets? Can we trend this over several quarters?

Oh but wait. The data structures that govern widget pricing and time-span-based sale pricing, and the logic that relates customer orders to what pricing structure they used at the time of ordering, is awful, terrible, and changes every time there’s a new quarterly promotional sale.

So you’re saying you want a report that trends sales of widgets based on arbitrarily changing promotional pricing as compared to other widgets that may or may not be subject to ‘normal’ pricing during that same time period, all without a simple definitive data-point that says “This is a Quarterly Promo sale, and That is Not.”

elrond's facepalm
Mister Anderson… I mean, wait.

Let’s try to get at the root of the problem, shall we? The business doesn’t seem to understand that the way they implement promo-sales is detrimental to long-term/comparative reporting. The data model makes this harder, not easier. Can we perhaps put some heads together and come up with a compromise that both A) makes more business sense, and B) improves the data model to be a bit more intuitive?

In Closing…

What’s your FPH? What causes you to facepalm on a regular basis? Let me know in the comments!  :o)

Quickie: SSRS Sort Descending

Let’s make it real simple for the end-user to get the ‘best’ behavior by default. Let’s sort by our ‘Revenue’ column in DEscending order first.

You ever have a paginated report, i.e. a standard old-fashioned report with a big table (tablix, matrix), where you want the users to be able to sort by any column they want? And let’s say one such column is ‘Revenue’ — some money amount where the “biggest is the most important”. Like your top customer, for example.

Now, you know about ‘Interactive Sorting’ right? No?!? Oh it’s awesome! Let me remind you. It’s super easy to configure: just get to the Text Box Properties of the header box of whatever column you want to use, & go to the ‘Interactive Sorting’ tab. Like so:

ssrs interactive sorting property
Simple example — we just have ‘Details’ rows, no other row-groups or hierarchies.

Make sure you pick the appropriate Field in the Dataset to “Sort by”. Hint: it’s usually the one you’re displaying in that column!

But you know what’s kinda annoying? You can’t dictate a “first sort direction” — it just assumes that the first time you click the sort-arrows, you want ‘Ascending’ (lowest first). Then you can switch to ‘DEscending’ (highest first). This makes perfect sense for alpha values (strings), but not always for numeric values — at least not when you’re dealing with money, when generally the highest dollar amount is the most important!

So let’s make it real simple for the end-user to get the ‘best’ behavior by default. Let’s try to make it sort by our ‘Revenue’ column in DEscending order first. Ready?

ssrs interactive sort expression negative revenue
Set the ‘Sort by’ expression to the NEGATIVE of the field value! Simple as that.

It’s so small in that picture you might miss it. I’m doing this, but I’ll add whitespace below for easier reading:

= - Fields!Revenue.Value
..^look, there's the minus sign!
(assuming your viewing device hasn't mangled the fixed-width text)

You could also get more pedantic/readable and spell-it-out like so:

=(-1)*(Fields!Revenue.Value)

Right? Beautiful. So what does it look like on the report?

ssrs interactive sort by revenue descending
I clicked it! And it sorted highest-first just like I wanted!

Notice my silly hyphenated column names, like ‘Custom-er No’ and ‘Rev-enue’. That’s because those sort-icons take up space within the textbox, and you can’t control it, so the text wraps in a non-obvious manner. Thus, I purposefully added a hyphen and a line-break to the text, so I could control how they looked, instead of leaving it to the wiles of the SSRS render-er.

That’s all there is to it! Happy reporting. =)

 

The EAVil Cycle, Part 2

In which we discuss more about the EAV model and some of its merits and pitfalls.

continued from last week…

The Ugly (aka the “Wow really?!?”)

You’ll see this ‘creep’ even in product-catalog systems as mature as Amazon’s. If you search for (geeky as we are) graphics cards, and use the product attribute filters in the left pane to narrow it down, you’ll find that some correctly have their memory type (GDDR5, etc.) listed, while others may not. If you’re really unfortunate, there will be two semi-redundant attribute-sets that you’ll have to “juggle” between to really get at what you want. TVs, for example, may see both an “HDR support” (yes/no) and an “HDR type” (standard, ultra, etc.) — I’m kinda pulling those out of my arse for example’s sake, but you get the point.

Why does this happen? Because at some level, humans are still responsible for ‘tagging’ those products (for lack of better word). And as much encouragement and guidance as the ‘admin software’ may give them, they can (and do) still decide at times to side-step that guidance and say “Nope, I know better; make me this new thing!”

But isn’t that a problem with nearly all data-driven systems? Sure, of course it is. Yet with a model as flexible as EAV, the problem is intensely magnified by the fact that it’s made so easy to do — to ‘extend’.

so preoccupied with whether or not you could, you didn't stop to think if you should
It’s probably not the exact quote, for you pedants, but it’s close enough for government¬†blog work.

And unfortunately, the biggest contributor to this problem is the lack of development-time and forethought given to the administration, or management, of the data. You see, this problem could be largely assuaged if the admin-toolset were the FIRST thought and priority in the roadmap. But so often, that thought comes LAST, if at all. So sure, your product feature tagging system looks great, it’s flexible and your customers love it. But you’re throwing tickets over the wall to your data team every time a requirement or use-case changes, or when you need to fix a data-quality problem caused by the users not knowing there was already a “Widget Type” before creating their new “Widget Kind” tag, or misspelling “Is Wierd” because English is weird and has more exceptions to the “I before E” rule than not.

Does this problem go away with a dedicated search-index or NoSQL technology like Elasticsearch or (shudder) MongoDB? Of course not! If anything, it may be worse. Maybe. But wait, those systems make it easier to de-dupe and manage redundancy & data quality, don’t they? Probably. I can’t speak from experience myself, but I’ve heard good things. Once again, it all comes down to the effort you’re willing to invest in the system. If you make data quality a priority, you’ll be happier with the experience. If you don’t, well you’re just another amateur data scientist complaining about dirty non-standardized/non-validated address fields, aren’t ya?¬† =P

I joke with the data scientists, of course. What they do is awesome. They just tend to often re-invent the wheel of data-cleansing/data-wrangling that we DBAs have been doing for a few decades, because they didn’t know the right questions to ask or the right place to look. We need to get better at working together WITH them, not ‘for’ or ‘against’ them.

ninja cat riding a unicorn with laser-eyes
How the data scientists see themselves…

The Why or When (aka “Is it a decent model for this?”)

The long-story-short version is, consider your business and your data. Try to plan for the future, and anticipate potential changes and growth. It’s not easy, and we never “get it right the first time”. But we can try.

When your attributes are fairly static, and you know that you can tightly control them, you might consider a more rigid model. Something with a handful of lookup tables referenced by the main product entity. This is advantageous for performance and management, at the expense of scalability and extensibility.

When you literally need to support on-the-fly extension, and you’re absolutely married to SQL (i.e. not ready to venture out into NoSQL land just yet), the EAV model may fit the bill. Aaron’s article, and the comments therein, present some fairly valid and reasonable implementation suggestions to make it a little more palatable. Just beware the date — that was written back in 2009. Before we had such things as Elasticsearch and its ilk. I’d heavily encourage the consideration of purpose-built data-stores for this sort of thing, if you have any hope of scaling-out.

Other tools in your toolbox can help with this, too. For example, consider an in-memory data-grid for super-fast reads. The vast majority of data-access to these attributes & values is going to be reading, using it to filter & slice & dice a data-set. You can pay the small performance cost (e.g. write to the underlying SQL database) on the rare occasion when a write/change needs to occur.

In Conclusion

Proving the age-old rule of “Just because you CAN, doesn’t mean you SHOULD”, the EAV model is sometimes okay and sometimes not. You need to understand your business and your data¬†to make that call. And you need to consider the magnitude of effort that may be involved in pivoting from one model to another. Unfortunately, in many cases, that part overshadows the rest, and the show business must go on.

queen the show must go on
You’re welcome again, ears.

Still, I encourage you to at least think about it, and be ready with that knowledge of pros/cons when the time is right to discuss it with stakeholders.

The EAVil Cycle

In which we discuss the EAV model and some of its merits and pitfalls.

EAV, or Entity-Attribute-Value, is an data model that’s been around the block. It’s typically injected into a relational database at some point during the overall application/architecture life-cycle, somewhere between when the team realizes that they’ve got way too many lookup tables for a “main business entity” thing, and when they finally make the shift into polyglot data stores.

Wake me up when that actually happens, successfully.

I’m not going to rehash aging internet arguments here, nor bore you with replicated diagrams that you could just as easily look up on Google Images. No, instead, I’m going to tell you why this model is not great, why it’s not bad, and how you should go about deciding if it’s not wrong for you.

Yes, I did negate all those adjectives on purpose. Mostly because nobody really enjoys working with these structures, regardless of how they’re implemented; customers and business stakeholders are ALWAYS CHANGING the requirements and the attributes in question. But, we press on.

Refs:

PS: Postgres is looking really freakin’ cool these days.

the good the bad and the ugly
Another Clint Eastwood pic… apologies if you’re not a fan. =P

The Good (aka “Not Bad”)

Proponents tell us that this model is easily searchable and easy to administer. The “searchable” bit is true; especially when you have the attribute-values pre-defined and don’t rely on end-user text-entry. But that’s true of basically any data model. The key here is that all attribute-values are effectively in one “search index”. But wait, don’t we have purpose-built search indexes nowadays? (Hint: see Elasticsearch.) This will come up again later.

Administerable? Administrable? Administratable? Damn you English! Anyway. Yes, again, if you’re fairly confident in your business users’ ability to effectively track and de-dupe (de-duplicate) incoming requirements/requests using their own brains/eyeballs and the admin tool-set that you build for them.

Oh, right, did I mention that? You have to build the admin app. Because you do NOT want to be writing ad-hoc SQL queries every time a new attribute requirement comes in. (Still, it’s better than making schema changes for new req’s, as I’ll discuss in a bit.)

Mainly, though, the biggest ‘pro’ of this model is that your business requirements, i.e. your attributes and the values they’re allowed to contain, can be flexible. The model allows a¬†theoretically infinite amount of customization to suit your needs; though in practice, as Allen writes in the CodingBlocks article, you do run up against some pretty hard scalability hurdles right-quick. So in practice, you might want to consider more horizontally-scalable data stores, or (God help you) try scaling-out your SQL databases. (Spoiler-alert: big money big money!)

shut up and take my money
Millions of query-bucks…

The Bad (aka the “Not Great”)

Which brings me to the first ‘con’. Performance. If you’re modeling this in a relational DB, and you expect it to scale well, you’re probably overly optimistic. Or very small. (If the latter, great! But you don’t always want to be small, right? Right!)

Don’t get me wrong; you can make it work half-decent with good indexing and sufficient layers of abstraction (i.e. don’t write a “kitchen-sink view” that’s responsible for pivoting/piecing-together all the attributes for a product straight outta SQL). But again, is it really the right tool for the job?

Momentary digression. SQL Server, or more generally, the relational database, has long been touted as the “Swiss army knife” of IT; we’ve thrown it at so many problems of different size and shape, that we’ve almost lost track of what it’s actually very GOOD at. Hint: it’s about relationships and normalization.

Another argument against it seems to be data integrity and enforcement. I can understand that, but again, with some clever software overlay and user-guidance, this can become almost a non-issue. But remember, your developers are the ones building said software. So that effort needs to be considered.

The Ugly (to be continued…)

The biggest problem, and quite a legit one, is ‘creep’ — both scope and feature. See, the inherent flexibility in the model will almost encourage¬†data managers to be less careful and considerate when deciding when to add an attribute or value-set, and how to govern the data-set as a whole.

creep wish i was special
No, not THAT creep. But, you’re welcome ears.

Stay tuned for more…

Career Day: SQL DBA

Build a solid CS/IT foundation, build your soft-skills, and then learn the latest & greatest tech stack. Or the mature enterprise stuff. There’s plenty of room for both!

Our benevolent overlord¬†prompted us (months ago; I know, I’m always late to the party) to write about what we do in our day-to-day jobs as SQL Server professionals, aka #SQLCareer. The actual idea is to do 4 posts in a row, for 4 working days (either adjacently or the same day each week for 4 weeks).

Sticklers, us? Nope, never.

Without any kind of future guarantees, here is my first.

Morning Mash-up

Emails, tickets, and alerts, oh my! Today was actually pretty quiet on the monitoring front — no major performance issues, no interruptions, and no job failures. That one that occasionally takes 8+ hours did okay last night. More often than not, I have to kill it if I find it’s been running into the work-day. That should be automated. /backlogged

Re-wrote some analyst’s data change script to be more “friendly” and set-based, instead relying on identity values — which is important when they can differ between Dev & Production environments. Deployed a few change requests, including that one.

virtualize all the things
Even SQL? Yes! Especially SQL. Just know the caveats.

On the side of less tedium, I fleshed-out more of the plan to virtualize our remaining bare-metal SQL servers. The usual ‘gotchas’ have to be discussed with the Server Admins — reserving compute resources, preventing the ‘noisy neighbor’ problem, and having enough storage to do the migration.

Yes, that means about 2x the storage as they’re currently using. Got a problem with that? No? Good. ūüėČ

Finally, I worked on some code for querying the SSRS ReportServer¬†database to obtain report metadata (including stored-proc usage, parameters, and user activity). The core concepts came from Ted Stathakis‘s SQLSaturday¬†session on SSRS 2016. Hopefully, some of that code will be coming to GitHub soon.

Afternoon Delight

“Delight” being sarcastic. No really, I love my job, but sometimes there are just some things that make you go ‘grr arg’.

First up, developing and finalizing some “data cleanup effort” reports. These are things that show the business users “what’s wrong with the data” and give them some guidance on how to “fix it”. Now, because there is no easy “management GUI” for this data, and because actual changes need to go through change-control, it’s not going to be a cake-walk. But given the right inputs and some domain knowledge, they can make the decisions as to what those changes should be, and then hand them off to the BA’s (business analysts). Usually in Excel form.

Excel spreadsheet with woman screaming in front of it
What? It’s under 1,048,576 rows! Barely…

Next we have the ‘grr’ part. This is where I spent about 40 minutes trying to explain to the 3rd-party software vendor exactly where and how to integrate with our core data systems. Most of the info should not be ‘new’ to them, but since I never know which support tech will be assigned to the issue, it’s best to err on the side of verbosity. To make things more confusing, the folks on my end, who have been interfacing with the vendor thus far, aren’t intimately familiar with the underlying technology bits, so the conversation gets a little dicey before we all arrive on the same page.

Fortunately, since I had their attention, I was able to sneak in a suggestion to upgrade the back-end database for the software (which we self-host, but is essentially managed by them & their software) to something from this decade. MySQL 5.0 is old, guys, mmkay?

Wrapping Up

A developer needs a new table & some initial data-load to production; it’s passed QA. Great! They wrote the query a little weird, but nothing ol’ SQLPrompt can’t fix.

Commit outstanding stored-proc code & RDL files (SSRS reports) to source control. Even if I’m the only one working on them, I’d rather they be safe & track-able.

Ruh-roh.¬†A developer set off some proc in a Dev db, it’s been running for over 30 minutes… and then they cancelled it! But wait, it’s stuck in rollback. Yes, kids, rolling-back a transaction can take even longer than the transaction itself, due to rollbacks being single-threaded.

Now, since the user’s hitting of the ‘Cancel’ button (aka “Stop” in SSMS) does not wholly and entirely cause a kill SPID¬†command, I do that for them (since only sysadmin¬†or processadmin can kill sessions). Then I run ye olde kill 64 with statusonly¬†(64 happened to be the SPID) to check on its rollback status. Good news, everyone! It’s at… 0%.

bad news nobody
You heard me.

And it stays at 0%. For fifteen minutes.

I move on to other tasks of course, since there’s nothing more I can do (short of pulling the plug on the server, which is never¬†a good idea (unless you’re testing your Disaster Recovery Hope plan!). We also create a ticket to fix this stored-proc so that it doesn’t “take forever”, even if it has to churn through millions of records.

Finally, the statusonly¬†check starts moving. It’s about 45% rolled back, with about “300 seconds remaining”. This, as anybody who’s been a DBA will tell you, is in “Microsoft time”. You know those progress-bars you typically get in Windows while something is happening? They’ll start off fast, make it to around 90-some percent, tell you there’s about 1 minute remaining, then take at least five minutes to “finish that last little bit”? Yeah, that’s a “Microsoft minute”. Patent-pending.

But fortunately for us, it does finish in just about the time it promised, maybe a bit longer. Just in time for the Dev to test what they wanted to test before going home for the day. And just in time for me to have my evening not ruined by a runaway transaction.

PS: Anybody who’s ever worked with SQL, including yours truly, has done this (set off a long transaction and/or tried to rollback a long transaction) at least once in their career. If you ask a seasoned DBA or DB-Dev, and they say “No, I’ve never done that!”, they’re a dirty rotten liar.¬† =D

Are You Not Entertained?

If the life of a DBA sounds like fun to you, get in touch with your local IT recruiter and find out what the job market is like! You’ll hear all sorts of doom & gloom about how “the Cloud is taking over everything” and “the DBA is dead; long live DevSecDataOpsUnicorns!”.

No, I joke. Yes, some of the fundamental responsibilities of the DBA are shifting to the cloud vendors, to IaaS/PaaS services, but at the end of the day, being in IT is about learning tech and working with people. Build a solid CS/IT foundation, build your soft-skills, and then learn the latest & greatest tech stack. Or the mature enterprise stuff. There’s plenty of room for both!

tee-shirt that says "tough enough to be a dba, crazy enough to love it"
We end on a happy note.