Today’s post is brought to you by an unplanned AWS RDS outage, the desire to search its SQL Error Logs for events during the target time-frame, and the horrendously inefficient interface which AWS gives you in the GUI console for reading said logs.
Even the dedicated ‘admin’ user that you create for your instance doesn’t have permission to read the error logs via the SSMS GUI, nor with xp_readerrorlog. You can, however, use sp_readerrorlog. That’s with an ‘sp‘, not an ‘xp‘.
The parameters here are quite arcane: namely @1, @2, @3, and @4. Unfortunately, none of them allow you to filter on a time-span (those are the 5th and 6th parameters of the underlying xp_readerrorlog, which we don’t have access to, as noted above).
My solution involves a #temptable, a loop of insert / exec commands, and then whatever queries you desire to search/filter/analyze the results. I also throw in a conversion to local time (from UTC, which is what the AWS servers use).
Details and The Why
You can check out the script; it’s short enough to embed, IMHO, so I’ll just leave it right here.
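The original embedded script isn’t reproduced in this copy of the post, but the approach it describes can be sketched roughly like this. The @NumLogFiles, @StartDate, and @EndDate parameter names come from the post itself; the temp-table column names and the target time zone are my own illustrative assumptions.

```sql
-- Sketch of the log-search approach: temp table + loop of INSERT/EXEC, then filter.
DECLARE @NumLogFiles int      = 10                      -- how many log files to read (0 = current)
      , @StartDate   datetime = '2019-04-01 02:00'      -- local time
      , @EndDate     datetime = '2019-04-01 02:30';

IF OBJECT_ID('tempdb..#ErrorLogs') IS NOT NULL DROP TABLE #ErrorLogs;
CREATE TABLE #ErrorLogs (
    LogDateUTC  datetime
  , ProcessInfo nvarchar(50)
  , LogText     nvarchar(max)
);

-- Loop over the log files; sp_readerrorlog is the 'sp' version that RDS allows.
DECLARE @i int = 0;
WHILE @i < @NumLogFiles
BEGIN
    INSERT INTO #ErrorLogs (LogDateUTC, ProcessInfo, LogText)
        EXEC sp_readerrorlog @i;
    SET @i += 1;
END;

-- Filter on the time-span, converting from UTC (what the AWS servers use) to local.
-- 'Eastern Standard Time' is just an example zone; substitute your own.
SELECT  CONVERT(datetime, LogDateUTC AT TIME ZONE 'UTC' AT TIME ZONE 'Eastern Standard Time') AS LogDateLocal
      , ProcessInfo
      , LogText
FROM    #ErrorLogs
WHERE   CONVERT(datetime, LogDateUTC AT TIME ZONE 'UTC' AT TIME ZONE 'Eastern Standard Time')
            BETWEEN @StartDate AND @EndDate
ORDER BY LogDateLocal;
```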
Lines 25-26 are particularly interesting to me, and only work with SQL 2016 and up. I got the idea from this StackOverflow answer. You can chain two AT TIME ZONE commands together to convert a given datetime value from one zone to another. The reason you still need the CONVERT is because the output of the AT TIME ZONE command is always a datetimeoffset type, which, while quite useful in its own right, has its quirks, and doesn’t serve our purposes for ease-of-readability.
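In isolation, the chained conversion looks like this (the target zone here is just an example):

```sql
-- The first AT TIME ZONE stamps the naive datetime as UTC (producing a datetimeoffset);
-- the second shifts it to the target zone; CONVERT then strips the offset for readability.
SELECT CONVERT(datetime,
       CAST('2019-04-01 07:00' AS datetime)
           AT TIME ZONE 'UTC'
           AT TIME ZONE 'Central Standard Time') AS LocalTime;
```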
If you’re not running 2016, at least in RDS, you’ve got nearly no excuse. The upgrade process is vastly simpler with RDS than with traditional on-prem servers. Although, I did run into my share of snags with it recently, which I’ll blog about later.
You should plug in whatever values suit your need & environment — the @NumLogFiles and @StartDate & @EndDate. I used 2-2:30am, because… well, that’s always when those damn outages seem to happen, ain’t it?
As I mentioned, “the Why” is basically because AWS RDS limits your permissions (even as an admin) in some key ways, and one of those limitations prevents you from reading the error logs in the more “normal” ways — SSMS GUI, xp_readerrorlog, etc. And the interface given to read the logs in the AWS console GUI is quite a sad-panda. They offer a wrapper proc, rdsadmin.dbo.rds_read_error_log @index = 0, @type = 1, which really only serves the purpose of naming the parameters more nicely; under the hood it’s doing the exact same thing.
Of course we can’t prove that, because it’s encrypted, but the results are identical.
So there you have it. Go check out your AWS RDS SQL Error Logs! See what tickles your curiosity. =)
Triggers aren’t bad, if used for the right reasons. Here we look at an “audit-trail” use-case.
Inspired by a brief conversation in the #CodingBlocks community Slack: A short discussion and example of a “who-dunnit” (“who done it”, a colloquialism for a murder-mystery type thing) trigger, to find out what user is doing deletions against a certain table.
The Background Check
Let’s name our hypothetical database CSI. In it, we have a table, dbo.Victims, where data seems to be getting deleted at random times. As we all know, this is impossible — computers never do anything truly randomly, much less RDBMSes.
Insert witty counter-example here. You know you have one.
So we want to find out who’s doing these deletions. One DBA says, “Hey I got an idea… Let’s put an AFTER DELETE trigger on the table!” Another DBA says “I abhor triggers; let’s log sp_WhoIsActive every 5 seconds to try to catch the suspect ‘in-the-act’.”
Both approaches have their merits, and neither is that uncommon. However, the latter is much more regularly blogged about, so I’m going to present the former, because it kinda helped remind me of a few things that I hadn’t used in a while. I’d also argue that the latter is much less of a “guaranteed capture”, since you’re gambling pretty liberally on the fact that the delete transaction will even last that long; it’s statistically more likely that you’ll miss it.
Here’s a SQL snippet that shows a basic after delete trigger created on our dbo.Victims table. Notice the use of the special Deleted table reference — this is a “temporary, memory-resident” table according to the Docs, and it holds all the records that were/are-about-to-be deleted from the target table.
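The snippet itself isn’t embedded in this copy of the post, but a minimal version of such a trigger might look like the following. The aud.Evidence table and its column names are illustrative assumptions on my part; the real script lives in the GitHub repo.

```sql
CREATE TRIGGER dbo.trg_Victims_AfterDelete
ON dbo.Victims
AFTER DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- The special 'Deleted' table holds the rows just removed from dbo.Victims.
    INSERT INTO aud.Evidence (VictimID, DeletedAtUTC, DbUser, ServerLogin, OriginalLogin)
    SELECT  d.VictimID
          , SYSUTCDATETIME()
          , USER_NAME()        -- database-context user
          , SUSER_SNAME()      -- server-level login
          , ORIGINAL_LOGIN()   -- login before any impersonation
    FROM    Deleted AS d;
END;
```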
I feel like it used to be called a “temporal table”, but that now refers to a new feature in 2016, where SQL keeps a hidden history-tracking copy of your table that you can reference like a time-machine; which, incidentally, almost* negates the need for such things as these triggers we’re talking about, but that’s another topic for another time.
*(The ‘almost’ is because temporal tables don’t tell you “WHO”, which is our primary motivator here.)
The interesting bits are how we identify our suspect, our ‘killer’ if you will. See, we not only want to know who they are in the database context, we also (and likely, more importantly) want to know who they are at the server level context. And just in case they’re impersonating another login, we want to check that too.
So we actually have a lot of options here. There’s CURRENT_USER or USER_NAME(), for the DB context user. Then we have SUSER_SNAME(), SUSER_NAME(), SYSTEM_USER, and ORIGINAL_LOGIN() for the server context. If you’re curious, you could also get things like @@SPID (server session id), SUSER_ID() (server login id), and SESSION_USER (database session user).
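A quick way to see what each of these returns in your current session:

```sql
SELECT  CURRENT_USER       AS [CurrentUser]     -- DB-context user
      , USER_NAME()        AS [UserName]        -- same idea, as a function
      , SUSER_SNAME()      AS [SuserSname]      -- server login name
      , SYSTEM_USER        AS [SystemUser]      -- server login (impersonation-aware)
      , ORIGINAL_LOGIN()   AS [OriginalLogin]   -- login before any impersonation
      , @@SPID             AS [Spid];           -- server session id
```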
ORIGINAL_LOGIN() may be the most potentially interesting, especially if we want to write our trigger with elevated (impersonated) permissions to be able to write to the logging table that we’ve set up to capture its detective-work. I did not need it for this example, but it’s worth keeping in mind.
So we’ve got our evidence table, we’ve got our detective trigger, now we just need a suspect. Thankfully we can test it out first, to make sure our operation will succeed when the real perp comes along. We can do this, of course, by impersonation. Or by using different SSMS query-windows with different logins — your choice.
Our faux-suspect’s login name is DummySuspect. We map him to the db_datawriter and db_datareader roles in our CSI database — we know the real perp at least has write permission on the table dbo.Victims, otherwise he/she wouldn’t be able to delete those poor victim rows! And we’re probably the db_owner, which is fine. Let’s call our own login SergeantX.
Now we can pretend to be DummySuspect and execute a DELETE against CSI.dbo.Victims, and make sure it writes to our auditing table, which we called aud.Evidence.
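That test can be done with impersonation, something like this (the table and column names are illustrative; the full version is in the repo):

```sql
-- Pretend to be the suspect, delete a row, then revert and inspect the evidence.
EXECUTE AS LOGIN = 'DummySuspect';
    DELETE FROM CSI.dbo.Victims WHERE VictimID = 1;
REVERT;

SELECT * FROM CSI.aud.Evidence;
```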
Yes, in practice, we’d probably want to put our Evidence table in a separate database, to really ensure those pesky Suspects can’t update it or delete from it, i.e. “cover their tracks” — here, I’ve settled for simply using a different schema, to keep the example workable. Otherwise we’d have to deal with cross-DB permissions and such, which goes beyond the scope of one little blog post.
Ready? Let’s try it!
Go on over to the GitHub repo and check out the code. There are 3 easy steps. In ‘Step 1’, create the objects, including a new DB to house them called CSI; you’ll see the trigger in there as well. Then you can try ‘Step 2’, where I impersonate DummySuspect and delete a row from Victims, and then check the Evidence log when done. And finally, ‘Step 3’ is a similar test, but assumes that you’ve actually connected that SSMS window/tab/query AS DummySuspect instead of impersonating him (or her!). After you’ve done that, check out aud.Evidence again to make sure it logged the 2nd delete.
And there you have it. A simple example of how to write and test an after delete trigger that writes the action info to a separate auditing table for investigation.
Hope you enjoyed! Leave a comment here or on GitHub; I welcome all feedback.
One more thing…
Apparently I’ve been fork‘d! Someone on GitHub liked my take on the Nested Set Model so much that they decided to pull it into their own library and have a play with it. Yay!! 😀 Whoever you are, thank you and good luck building with it. Enjoy!
Just a brief post on adding/removing users (database level users) to/from roles (database level roles). It’s relevant because several shops are still stuck supporting at least a few 2008 (or hopefully, 2008R2) instances, and there is a key difference between those and newer (2012 & up) versions in the “preferred” method of doing this security task.
There are reams of documentation and books and articles written about SQL security in general. That is beyond the scope of this post (and indeed, beyond the scope of any single blog, unless you’re an SME on the subject!). But a typical part of a DBA’s work-week might involve the occasional DB user-role-membership management, so I hope this helps the lone-wolf DBAs out there and/or the developers who need to know what to ask for, when they’re planning/deploying a new app against their SQL DB(s).
The “old” method involves calling system stored-procedures, sp_addrolemember and sp_droprolemember, in which you pass the role-name and username. The “new” method, supported starting with SQL 2012, is to use the command-phrases ALTER ROLE [role] ADD MEMBER [user], and ALTER ROLE [role] DROP MEMBER [user].
The latter is more ‘standard‘, while the former is more ‘Microsoft-y‘. I couldn’t easily find whether it’s part of the official ANSI standard or not… that’s an exercise for the reader. What I find very interesting is that Azure’s data warehouse offerings require the old method. Of course, hopefully in a DW setting you’re not messing with security nearly as much as a typical OLTP system, but… yeah.
Does that mean those Azure services are built on top of older SQL engine versions? Possibly. MSFT isn’t too open about the deep internals of such tech, but neither is any other cloud vendor, so we can’t really ask them such a question and expect anything more than a blank-stare. But it is curious, no?
Syntax examples: Let’s add the user foo to the database Bard, in the db_datareader built-in role. Then we’ll remove him. (Or her, I guess; “foo” is a pretty gender-neutral name.) Creating said user is easy, so I’ll start with that, and it’s the same in all supported versions. You need a server-level login to link it to; if you don’t have one, I’ll show you how to create it first.
Create server-level login:
--preferably, you create a login for an existing AD/Windows account:
CREATE LOGIN [yourdomain\foo] FROM WINDOWS;
--or, you can just create a SQL login (not connected to domain/Windows/ActiveDirectory; also less secure, as discussed here and here)
CREATE LOGIN [foo] WITH PASSWORD = 'foobar';
Create database-level user:
--if you made the domain/Windows login:
CREATE USER [foo] FOR LOGIN [yourdomain\foo];
--or, if you just made the SQL login:
CREATE USER [foo] FOR LOGIN [foo];
Add user to role:
ALTER ROLE db_datareader ADD MEMBER [foo];
Check (see above)
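For the “check” step, one option is a quick query against the catalog views:

```sql
-- Who belongs to which database roles? (Filtered here to our example user.)
SELECT  r.name AS RoleName
      , m.name AS MemberName
FROM    sys.database_role_members AS drm
JOIN    sys.database_principals  AS r ON r.principal_id = drm.role_principal_id
JOIN    sys.database_principals  AS m ON m.principal_id = drm.member_principal_id
WHERE   m.name = 'foo';
```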
Remove user from role:
ALTER ROLE db_datareader DROP MEMBER [foo];
Notice that, because the “old way” is simply executing sys-sp’s, we can actually run it from any database context. Whereas the “new way” requires you to connect to the database in question.
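To illustrate the difference: the old procs can be invoked with a three-part name from any database context, while ALTER ROLE needs you connected to the target database.

```sql
-- "Old way": works from anywhere, via the system-proc three-part-name trick.
EXEC Bard.dbo.sp_addrolemember  @rolename = 'db_datareader', @membername = 'foo';
EXEC Bard.dbo.sp_droprolemember @rolename = 'db_datareader', @membername = 'foo';

-- "New way": must be connected to (or USE) the database in question first.
USE Bard;
ALTER ROLE db_datareader ADD MEMBER [foo];
ALTER ROLE db_datareader DROP MEMBER [foo];
```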
Note: I am in no way shape or form responsible for you screwing up your database or SQL instance, nor for you getting yelled at by your DBA or security admin or any other form of verbal assault you may incur as a result of running these commands. But since you need server-admin & database-owner equivalent permissions anyway, you’re probably one of those people already, so you’ll just end up yelling at yourself.
Cleanup (just so you don’t muddy your instance/DB up with a silly example user):
DROP USER [foo];
DROP LOGIN [foo];
If you have any questions, feel free to reach out to me!