Replication “Just Trust Me”

For what seems like years, I’ve bemoaned the fact that SQL Transactional Replication doesn’t come with a “Just Trust Me” option. I’ll explain more about what I mean in a moment. The other thing I’ve complained about is that there’s no “Pause” button — which is not entirely accurate, since obviously you could just stop the distribution and subscription agents. But specifically what I mean is, it’s not easy to ‘put it on hold so you can make some schema changes to one of the tables that’s being replicated’, and then easily “Resume” it after you’re done with said changes.

Well, I’m happy to say that now I have both of these tools/methodologies in my arsenal!

Quick level-set: If you’ve been living under a virtual rock, SQL replication is an old-hat “tried-and-true” method of producing readable copies of your data on other SQL servers, whether for reporting or DR. It’s not an HA technology per se, although I suppose you could use it for that if you were feeling adventurous. It’s more for “I need a reasonably up-to-date copy of my data ‘over there’ so I can run reports / crappy user-formed / EF-generated queries against it without slowing down my production OLTP system.”

Yes, I did just take a pot-shot at Entity Framework. #DealWithIt

i don't always break replication but when i do it drives me to drink
But not Dos Equis. That stuff is terrible. =P

Why?

The word that comes to most DBAs’ minds when they think of replication is ‘brittle’. And for good reason — when it breaks, it breaks hard, and you’re often left trying to pick up the pieces while wondering how much worse it could be if you just started over from scratch (i.e. dropped all the replications and re-created them). Which, honestly, sometimes is easier. But not if you have a large volume of data, and certainly not if that data is indexed and you don’t want your apps to experience a performance-crisis!

Now, because this post has been sitting in my ‘Drafts’ area for far too long, I’m going to break this up into 2 parts, so I can get something out the door. In part 1, I’ll briefly explain each of the key components of the process. In part 2, I’ll dive into a little more step-by-step detail.

Primary resources that went into this: docs, article1, article2, article3. And my very own dba.SEanswer where I apparently went through a similar process back in 2016 and subsequently forgot about it (mostly).

Key 1: Sync-Type

TL;DR: the “Just Trust Me” option is to specify @sync_type = 'none' when you create the subscription with sys.sp_addsubscription. Huge thanks to @garethn in the SQL Community Slack.

Sidebar: if you haven’t yet joined the SQL Community Slack, WHAT ARE YOU WAITING FOR?!?!? DO IT, DO IT NOW!!!

Ahnold ‘teh Governator’

@sync_type = 'replication support only' may be applicable in some scenarios as well, but I’m not 100% clear on the difference / use-cases at the moment. More to come later, hopefully.

Key 2: Script Publication Procs

Protip: sys.sp_scriptpublicationcustomprocs @publication = 'PublicationName' generates the internal repl-procs that control the table creations/updates on the subscriber. You run this ‘script’ command on the publisher, then get the results (the script it generates), copy-paste to a new SQL file, and run on the subscriber.

This has come in handy on several recent occasions, wherein I had to either swap tables behind-the-scenes due to a PK change, or make a column & index change that involved truncation. Using the “stop, shuffle, start” method, which I’ll get into in part 2, I’m able to tell the subscriber “Hey, the definition of this table has changed, you need to grab these new repl-procs so you can handle it correctly!”

Key 3: Publication Properties

In order to tell our publication that “We’re gonna be making some changes, don’t panic!”, we want to turn OFF 2 properties (assuming they’re true, which they likely are by default) using sys.sp_changepublication @publication='MyPub'. The properties are 'allow_anonymous' and 'immediate_sync', and you simply append the arguments to the proc call like so: @property='allow_anonymous', @value='false' / @property='immediate_sync', @value='false'.

Later, after we’re all done with our under-the-hood changes, we’ll want to turn them back on, in reverse order: first enable 'immediate_sync', then 'allow_anonymous'. Cool? Don’t ask me why; DBAs much smarter than I have decreed it so.
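
Since the ordering is the part I'd be most likely to get backwards, here is a minimal Python sketch that only prints the T-SQL calls in the order described above. It does not connect to anything. The function names (change_publication, pause_script, resume_script) are made up for illustration, and 'MyPub' is a placeholder; the only things taken from the text are the proc name, the two property names, and the off/on ordering.

```python
def change_publication(publication, prop, value):
    """Render one sys.sp_changepublication call as T-SQL text (nothing is executed)."""
    safe_name = publication.replace("'", "''")  # double any single quotes for the T-SQL literal
    return (
        f"EXEC sys.sp_changepublication @publication = N'{safe_name}', "
        f"@property = N'{prop}', @value = N'{value}';"
    )

# Order from the post: switch off allow_anonymous first, then immediate_sync.
DISABLE_ORDER = ["allow_anonymous", "immediate_sync"]

def pause_script(publication):
    """Statements to run before the schema changes."""
    return [change_publication(publication, p, "false") for p in DISABLE_ORDER]

def resume_script(publication):
    """Statements to run afterwards: same properties, reverse order."""
    return [change_publication(publication, p, "true") for p in reversed(DISABLE_ORDER)]

if __name__ == "__main__":
    steps = pause_script("MyPub") + ["-- ...schema changes go here..."] + resume_script("MyPub")
    for line in steps:
        print(line)
```

Copy the printed statements into a query window on the publisher and review them before running anything.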

OMG, remember Xena Warrior Princess? Holy wow that’s some nostalgia for ya.

Honorable Mention: Pull Subscriptions

In one instance, I was using a PULL subscription (as opposed to PUSH). I had to re-start the Distribution agent (on the subscriber) twice for it to work (to start actually synchronizing). It STILL shows as ‘Uninitialized Subscription’ in the repl-monitor, though. Kinda annoying.

Pull subscriptions can be nice because they shift the burden to the subscriber DB, so that your publisher (master, primary, whatever you wanna call it) doesn’t get too bogged-down. But as always, there are trade-offs. Check out this handy little comparison guide on the topic from a fellow DBA blogger.

That’s all for now; stay tuned for more as I go into detail about how I used these in what scenarios. Thanks for reading! ❤

Facepalms Per Hour

My current velocity is sometimes measured in FPH – facepalms per hour.

This is a rant. Fair warning.

I guess the new ‘Millennial’ colloquialism for “grumpy” or “sarcastic” is “salty”. So I’m feeling extra salty this week. For several reasons. One, it’s audit season. Two, I had to churn out about a dozen new reports in the span of 4 days because the manager who was supposed to be tracking that project dropped the ball and forgot they were due by the end of this month until… yeah, last Friday. Wheeeeee!

Thus, I decided, my current ‘velocity’ (a SCRUM/DevOps term for “how much work are you getting done”) shall be measured in FPH – Facepalms Per Hour. Currently I’m at 3. Earlier this week I was approaching the double-digits, when the lovely report consumers kept thinking of “just one more little thing” they forgot about until after I’d delivered the ‘final’ product.

‘Final’ actually being a meaningful adjective in this context approximately NEVER.

facepalm original picard
The original gangsta.

Change Logs

How best to describe this scenario while still maintaining separation of “real job” from “blog land”… Hrm. So let’s say we have a CRM, like most companies. This stores customers, among other things, in a database. And since it also stores sales transactions and financials, it’s heavily audited — it has a lot of change-tracking mechanisms.

Now, auditors come along and want a report of some specific type of change over time. I happily oblige. Then… PANIC! And not at the disco. “What are all these changes to these customers by these users who don’t have permission to make said changes?!?”

K, calm down sparky. Try not to sound the alarm; auditors are a sensitive bunch.

Turns out, those changes are, in a word, “fake”. You see, there’s this background “customer sync” process that keeps them up to date with another part of the CRM where the actual changes were made. But, because it’s written poorly, it thinks that ANY field change, even just the Name or Address (which a lot more CSRs, customer service reps, have the permission to change, because, you know, that’s their job), constitutes a change to the ENTIRE customer record on the other end. So the change tracker logs a change to every single field on the receiving end of that sync process, even though nothing really changed on the source side except maybe one or two fields.

With me so far? Great. So now the question is, “Well, can we get a report that doesn’t show those ‘fake’ changes?” But wait, it has to be “system generated” and you’re not allowed to “filter” or “add special exceptions” to it, because it still needs to be audit-able.

So what you’re saying is, give me a report that shows me what I care about, but you’re not allowed to change the logic behind said report.

Riiiiiiight.

So I give them a new report. I don’t explain how the sausage is made, I just make it and serve it up. “But why is this different from the original report?”

double facepalm

Well, do you want the audit-able answer, or the real answer? The audit-able answer is, “We made a system change that allowed us to prevent the ‘fake’ changes from being logged incorrectly.”

The real answer is, “B*tch, I AM the system!” — meaning yes, I excluded those with some hacky logic, and you need to stop asking questions about it.

Anyway. Change Logs are super fun.

Reports

Speaking of reporting. I could really go on for pages about how terrible and broken this whole system of “request-based report development” is. But it’s frankly all we have right now. Until there’s sufficient business buy-in to the concept of agile data warehousing and collaborative cross-functional data modeling, shit just comes in one funnel and goes out another with a little sparkle spackled to it. And we call it a report.

Example, you say? Sure! Let’s say we run a special sale on certain types of widgets every quarter. We want to track how these ‘specials’ perform — do they increase our sales of those widgets? By what factor, compared to the other not-on-sale widgets? Can we trend this over several quarters?

Oh but wait. The data structures that govern widget pricing and time-span-based sale pricing, and the logic that relates customer orders to what pricing structure they used at the time of ordering, is awful, terrible, and changes every time there’s a new quarterly promotional sale.

So you’re saying you want a report that trends sales of widgets based on arbitrarily changing promotional pricing as compared to other widgets that may or may not be subject to ‘normal’ pricing during that same time period, all without a simple definitive data-point that says “This is a Quarterly Promo sale, and That is Not.”

elrond's facepalm
Mister Anderson… I mean, wait.

Let’s try to get at the root of the problem, shall we? The business doesn’t seem to understand that the way they implement promo-sales is detrimental to long-term/comparative reporting. The data model makes this harder, not easier. Can we perhaps put some heads together and come up with a compromise that both A) makes more business sense, and B) improves the data model to be a bit more intuitive?

In Closing…

What’s your FPH? What causes you to facepalm on a regular basis? Let me know in the comments!  :o)

Quickie: SSRS Sort Descending

Let’s make it real simple for the end-user to get the ‘best’ behavior by default. Let’s sort by our ‘Revenue’ column in DEscending order first.

You ever have a paginated report, i.e. a standard old-fashioned report with a big table (tablix, matrix), where you want the users to be able to sort by any column they want? And let’s say one such column is ‘Revenue’ — some money amount where the “biggest is the most important”. Like your top customer, for example.

Now, you know about ‘Interactive Sorting’ right? No?!? Oh it’s awesome! Let me remind you. It’s super easy to configure: just get to the Text Box Properties of the header box of whatever column you want to use, & go to the ‘Interactive Sorting’ tab. Like so:

ssrs interactive sorting property
Simple example — we just have ‘Details’ rows, no other row-groups or hierarchies.

Make sure you pick the appropriate Field in the Dataset to “Sort by”. Hint: it’s usually the one you’re displaying in that column!

But you know what’s kinda annoying? You can’t dictate a “first sort direction” — it just assumes that the first time you click the sort-arrows, you want ‘Ascending’ (lowest first). Then you can switch to ‘DEscending’ (highest first). This makes perfect sense for alpha values (strings), but not always for numeric values — at least not when you’re dealing with money, when generally the highest dollar amount is the most important!

So let’s make it real simple for the end-user to get the ‘best’ behavior by default. Let’s try to make it sort by our ‘Revenue’ column in DEscending order first. Ready?

ssrs interactive sort expression negative revenue
Set the ‘Sort by’ expression to the NEGATIVE of the field value! Simple as that.

It’s so small in that picture you might miss it. I’m doing this, but I’ll add whitespace below for easier reading:

= - Fields!Revenue.Value
..^look, there's the minus sign!
(assuming your viewing device hasn't mangled the fixed-width text)

You could also get more pedantic/readable and spell-it-out like so:

=(-1)*(Fields!Revenue.Value)
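
If the trick feels like magic, here is a tiny Python sketch of why it works: an ascending sort on the negated value is the same as a descending sort on the value itself. The sample customers and amounts are invented for illustration; only the "negate the sort key" idea comes from the report.

```python
# Hypothetical rows standing in for the report's dataset.
rows = [
    {"customer": "Acme", "revenue": 1200.50},
    {"customer": "Globex", "revenue": 98000.00},
    {"customer": "Initech", "revenue": 450.25},
]

def first_click_order(rows):
    """Ascending sort on the NEGATIVE revenue, like the 'Sort by' expression."""
    return sorted(rows, key=lambda r: -r["revenue"])

if __name__ == "__main__":
    for r in first_click_order(rows):
        print(r["customer"], r["revenue"])
```

The first click still sorts "ascending" as far as the sort logic is concerned; it just happens to be ascending on a number whose sign has been flipped, so the biggest revenue lands on top.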

Right? Beautiful. So what does it look like on the report?

ssrs interactive sort by revenue descending
I clicked it! And it sorted highest-first just like I wanted!

Notice my silly hyphenated column names, like ‘Custom-er No’ and ‘Rev-enue’. That’s because those sort-icons take up space within the textbox, and you can’t control it, so the text wraps in a non-obvious manner. Thus, I purposefully added a hyphen and a line-break to the text, so I could control how they looked, instead of leaving it to the wiles of the SSRS render-er.

That’s all there is to it! Happy reporting. =)

 

Quickie: SSRS Multi-Column List/Grid

While there’s no native control like it, with a few simple tricks and visual sleight-of-hand, we can fake it.

Today I present to you, a hopefully less confusing and more intuitive explanation of how to make a “multi-column list” control on your SSRS report.  And if the term is confusing by itself, I’ll illustrate in a second.

Three links that helped me get going in the right direction:

They all roughly boil down to the same concept: placing redundant or repetitive controls (like a set of 2 or 3 tablixes), and using a RowNumber() expression with some modulo arithmetic to set “visibility” properties of the dataset rows shown in each.

We don’t quite get to “dynamic # of columns and rows based on the total # of records” nirvana (e.g. make it a 3×5 grid if you have somewhere around 15 records, but only make it 2×4 if you have 8 or fewer records, etc.), but I’d estimate that this approach would suffice in 90% of use-cases.  And whoever at Microsoft decided to not build a control like this into the toolset, curse you!!

Yes, we’re not even past the intro section and I’ve already cursed someone at MSFT.  Doesn’t change my passion for the platform in general; it’s just a minor curse.  =)

Concepts

SSRS uses a dataset to fill one or more controls with data; controls such as the tablix. In my regular implementations, a dataset is usually a stored-proc. The tablix shows the results of the proc in tabular form. The problem with the tablix control is that you can’t turn it into a multi-column layout, as you might see with some more advanced UI controls. You might think that the matrix control would be suitable for this, but no; that is just a table-like structure that can support hierarchical column-groups and row-groups.

For example, let’s say you have a table of Users, with columns ID, Name, and Role. Like so:

ID | Name    | Role   
---|---------|--------
1  | Bob     | Admin
2  | Alice   | User
3  | Jack    | User
4  | Jill    | User
5  | Joe     | Admin

But with many more records. We want to display this on a report, which is typically viewed on a PC screen, i.e. landscape orientation (wider than tall). So wouldn’t it be nice if we could display it like so?

ID | Name  | Role  || ID | Name  | Role  || ID | Name | Role
---|-------|-------||----|-------|-------||----|------|------
1  | Bob   | Admin || 2  | Alice | User  || 3  | Jack | User
4  | Jill  | User  || 5  | Joe   | Admin ||

etc. Right? Right. So essentially what we want is 3 “sections” of data — 3 “columns” each with its own “part” of the table.

Bonus: Yes, this works even when the # of rows you have is not evenly divisible by the # of sections you want to break into!

Implementation

The actual work is fairly simple. We add 3 tablixes to the report and place them side-by-side. We connect them all to the same dataset, i.e. stored-proc — which will only be executed once, conveniently! On the data row (as opposed to the header row) of each tablix, right click to Row Visibility. Then use an expression like so:

=IIf(RowNumber(Nothing) Mod N = X, False, True) — where N is the # of sections (tablixes) you want, and X is going to vary from 1 to (# of columns) - 1, and finally to 0 for the last (right-most) tablix.

Remember, the expression defines the row’s Hidden property, not “Visible”. I know, it seems backwards, but that’s how it is.

In plain English, and if we just want 3 sections (tablixes), we could say the following. For Section 1, left-most: “If the row is the 1st row in each group of 3, show it, else hide it”. Likewise, for Section 2, the middle, we’d say: “If the row is the 2nd row in each group of 3, show it.” And finally, for Section 3, the right-most: “If the row is the 3rd row in each group of 3, show it.” And so on.

So those Hidden expressions would look like this:

=IIf(RowNumber(Nothing) Mod 3 = 1, False, True) //section 1
=IIf(RowNumber(Nothing) Mod 3 = 2, False, True) //section 2
=IIf(RowNumber(Nothing) Mod 3 = 0, False, True) //section 3

Not bad, right?
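
To convince yourself that the modulo logic deals the rows out the way the mock-up earlier shows, here is a small Python simulation of the three Hidden expressions. It is only a sketch: the hidden and split_into_sections names are invented here, and the user list is the sample table from above.

```python
def hidden(row_number, sections, x):
    """Mirror of =IIf(RowNumber(Nothing) Mod N = X, False, True)."""
    return not (row_number % sections == x)

def split_into_sections(records, sections=3):
    """Return one list per tablix, holding the rows that tablix would show."""
    result = []
    for k in range(1, sections + 1):
        x = k % sections  # X is 1, 2, ... N-1, then 0 for the right-most tablix
        # RowNumber() is 1-based, so enumerate from 1.
        shown = [rec for n, rec in enumerate(records, start=1)
                 if not hidden(n, sections, x)]
        result.append(shown)
    return result

if __name__ == "__main__":
    users = ["Bob", "Alice", "Jack", "Jill", "Joe"]
    for i, section in enumerate(split_into_sections(users), start=1):
        print(f"section {i}: {section}")
```

With the five sample users, section 1 gets Bob and Jill, section 2 gets Alice and Joe, and section 3 gets Jack, which matches the side-by-side layout sketched earlier, including the uneven last row.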

Final Thoughts

It’s certainly odd that there’s no native way of building such a report. But with a few simple tricks and visual sleight-of-hand, we can fake it.

ssrs-multiple-tablix
I highlighted the middle tablix so you can see that there are 3 separate ones, but we want to make sure they’re precisely adjacent to each other so the end-user can’t tell them apart.

VS SSRS Project Gotchas

I now present to you, a recent adventure in building & configuring an SSRS 2016 / VS2015 project to replace our aging SSRS 2008R2 / VS2008 setup.  In trying to make things ‘better’, I wanted to centralize the storage of the Data Sources.  I found a tip on StackOverflow with a neat (albeit hacky) idea on how to do this.  So I started diving into it.  Here’s how that went.

The Setup

I have a bunch of published reports on a ‘real’ report server (SSRS instance), which were created off-the-cuff using Report Builder 3.0 (no source-control connection or anything).  So the first thing I had to do was go and download each RDL (thanks, Microsoft… ugh!), to my local VS2015 project folder, so I could bring them into my SSRS project.  I didn’t bother trying to download/copy the Data Sources (they’re basically just connection-strings, and anyway you can’t download them from Report Manager), so I settled for re-creating those in VS.

Then it was time to set up the solution & projects.  I prefer to organize my reports in context-relevant folders, like “Operations”, “Marketing”, etc.  In order to do this sensibly, within the VS solution, you need to create a Project for each major folder you’ll have.  Makes sense; this is how the old solution was constructed too.  Here’s where I tried to improve things:  I created a “Datasources” project, which would house just the shared Data Sources.  Here’s where I created my new data-sources, setting them up with saved SQL auth (login & pwd).  Sure, maybe not the “best practice” but it keeps things simple — my SSRS user only ever has read-only db access, and the pwd is managed well enough to keep auditors happy.

Thus, my plan (in following the SO tip) was to have all the other projects’ Data Sources be pointers to these shared Data Source files (RDS’s).  Sounds good in theory, right?

what could possibly go wrong
ooh, sparkly!

Starting to Implement

Well.  I started small with just one project, one report, and one data source.  I went into my Nate test project, did “add existing item” into the Reports folder, and browsed to pick the rdl that I had just downloaded from my live SSRS server.  I then went to my Datasources project, copied the rds that I needed, pasted it into Nate test project’s Shared Data Sources folder.

Now at this point there are 2 copies of this file in my solution folder.  We don’t want that.  So I opened up trusty ol’ Notepad++ on the Nate test.rptproj file, and edited the XML node under <DataSources><ProjectItem><Name>My-Source.rds</Name><FullPath>My-Source.rds</FullPath></ProjectItem></DataSources> , changing the FullPath node value to <FullPath>..\Datasources\My-Source.rds</FullPath>.  I then deleted the physical copy of the file at \MySolution\Nate test\My-Source.rds , so that only the shared one physically remains in the filesystem (\MySolution\Datasources\My-Source.rds).
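
That Notepad++ edit is a good candidate for scripting. Below is a minimal Python sketch of the same change, run against an in-memory sample rather than a real file. The sample XML wraps the DataSources fragment quoted above in a bare Project element, which is an assumption; a real rptproj has more nodes and may differ by VS version. The repoint_data_sources name and the shared_dir parameter are also made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed, simplified stand-in for the relevant part of an rptproj file.
SAMPLE_RPTPROJ = """<Project>
  <DataSources>
    <ProjectItem>
      <Name>My-Source.rds</Name>
      <FullPath>My-Source.rds</FullPath>
    </ProjectItem>
  </DataSources>
</Project>"""

def repoint_data_sources(xml_text, shared_dir=r"..\Datasources"):
    """Point every .rds FullPath under DataSources at the shared folder."""
    root = ET.fromstring(xml_text)
    for sources in root.iter("DataSources"):
        for item in sources.iter("ProjectItem"):
            full_path = item.find("FullPath")
            if full_path is None or not (full_path.text or "").lower().endswith(".rds"):
                continue
            # Keep only the file name, then prefix it with the shared folder.
            file_name = full_path.text.replace("/", "\\").split("\\")[-1]
            full_path.text = shared_dir + "\\" + file_name
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(repoint_data_sources(SAMPLE_RPTPROJ))
```

Running it twice gives the same result, since an already re-pointed path is reduced to its file name and prefixed again. Deleting the leftover physical copy of the rds is still a separate manual step.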

Another way to accomplish this, you may read, is to right-click Shared Data Sources within your project, e.g. Nate test, and say “Add existing item..”, and go select the global shared RDS from your master Datasources project-folder.  However, this still results in a physical copy of the file within Nate test folder!  So in either case you’ll end up doing some editing & deletion.

With me so far?  Okay, here’s where it gets interesting.  I could successfully build and deploy this report to my SSRS server, verifying that it showed my “new thing” (I added a text-box that simply said “This is the new hotness!”).  But I could not preview the report in VS!  Sad panda.  And of course, the most vague generic error message in the world:

An error occurred during local report processing
An error occurred during report processing

sarcasm-smiley
How specific and informative!

More Googling led to a somewhat random tip in the vein of “Go to the report properties, Open the report’s Data Source, and re-select the Shared Data Source from the dropdown”.  Combine with “Edit the Shared Data Source, ensure your saved login/pwd is filled in, and Save it”.  Presto!  Previewing now worked.

But why is this?  Well, as it turns out, it’s not all that surprising.  The Data Source files (RDS’s) are stored without the saved login/pwd, because of course they’re just plain-text XML files.  So to “protect you”, MSFT doesn’t store your SQL auth credentials, even if (and no matter how many times) you click “Save my password” on the connection-properties window (connection-string builder).

Great.  Well, I did that once, for Nate test project.  Let’s see what happens with another project (aka report folder), say Test again.  So I add an existing RDL, I copy-paste the Data Source from the global shared Datasources project, manually edit the rptproj in Notepad++, delete the copied rds, etc.  I then try to preview the report… and behold, the same maddeningly vague error message!

“Well duh!” you may think to yourself, “the credentials still aren’t stored anywhere un-encrypted!”.  Yes, dear reader, you are correct.  But do I want to right-click on this project’s Data Sources, edit it, fill in the credentials, save, rinse, repeat, for every single project?!  A resounding NO!

nobody-got-time-old-fashioned
ANGTFT

So what should I do?  I could go type in the credentials to the actual connection-strings contained within the ‘master’ rds files.  Of course then they’re in plain-text for all to see… but wait, do I care?  Another no.  Let’s go edit some connection-strings!

As a reminder, here’s what they will look like:

<ConnectString>Data Source=MyServer;Initial Catalog=MyReportDB;User ID=MyReportUser;Password=ThisIs@nAw3s0meP@ssw0rd!;</ConnectString>

Why did MSFT choose to name the XML node ConnectString in their rds schema instead of ConnectionString?  Don’t ask me!

The Gotchas

Here’s the first gotcha:  My reports (RDLs) need to be re-pointed at the correct Shared Datasource, because currently, as they were downloaded from my ‘real’ SSRS server — which had my desired top-level folder layout of “Datasources”, “Nate test”, “Marketing”, etc. — their Data Sources xml node points at \Datasources\MyDatasource.rdl.  This was the correct path on the SSRS server and still will be the correct path when all is said & done.  BUT, to support local debugging/previewing, the build output has to make a copy of the rds in the Project’s bin\Debug folder, which it will happily do, and then just as happily proclaim Could not find a part of the path 'C:\Users\Nate\Documents\Visual Studio 2015\Projects\MyReportSolution\Nate test\bin\Debug\Datasources\MyDatasource.rdl'. — because DUH, the build system didn’t make you a ‘Datasources’ sub-folder under the Debug folder, why would it!?  So by either manually editing the rdl file to remove the \Datasources\ path from the node, or by clicking on the Report’s Data Sources node in the Report Data pane and re-pointing it at the local Shared Data Source (which again, is itself a pointer to the globally shared Datasources!), you can fix this issue.

In the rdl XML, that looks like this; the part you remove is the leading \Datasources\ inside the DataSourceReference node:

<DataSources>
  <DataSource Name="MyReport_DataSource">
    <DataSourceReference>\Datasources\MyDatasource</DataSourceReference>
    <rd:SecurityType>None</rd:SecurityType>
    <rd:DataSourceID>gobbledegook-guid</rd:DataSourceID>
  </DataSource>
</DataSources>
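
If you have more than a handful of RDLs, the same edit can be scripted. Here is a minimal Python sketch that strips any folder prefix from DataSourceReference values, working on plain text so it doesn't have to deal with the rd: namespace. The strip_reference_folder name is invented for illustration, and the fragment is a trimmed copy of the XML above, not a complete RDL.

```python
import re

# Trimmed sample based on the fragment above; a real RDL has much more in it.
RDL_FRAGMENT = r"""<DataSources>
  <DataSource Name="MyReport_DataSource">
    <DataSourceReference>\Datasources\MyDatasource</DataSourceReference>
  </DataSource>
</DataSources>"""

REFERENCE = re.compile(r"(<DataSourceReference>)([^<]*)(</DataSourceReference>)")

def strip_reference_folder(rdl_text):
    """Reduce each DataSourceReference to the bare data source name."""
    def fix(match):
        # Drop everything up to the last slash or backslash.
        name = match.group(2).replace("/", "\\").split("\\")[-1]
        return match.group(1) + name + match.group(3)
    return REFERENCE.sub(fix, rdl_text)

if __name__ == "__main__":
    print(strip_reference_folder(RDL_FRAGMENT))
```

A reference that is already a bare name passes through unchanged, so it should be safe to run over files that were fixed by hand. Try it on a copy first.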

And there’s another gotcha:  After we do all this, if we open the rds from within VS solution-explorer, it still won’t “think” it saved the authentication!  But the connection-string will be clearly visible with the plain password.  Here’s a couple pictures of what that looks like.

shared-datasource-properties-ssrs-vs2015
Data Source properties

datasource-connection-properties-ssrs-vs2015
Connection properties of the Data Source

Oh, by the way, did I mention how many times VS crashed on me while doing these gymnastics?  At least a few.

Lessons Learned

So, what have we learned?  Well, for one, this is a crappy situation born of poor in-product support.  I should be able to configure Solution-level shared Data Sources, use them in as many Projects (within said Solution) as I want, and have VS configuration management support them; bonus points for doing so with saved & encrypted credentials.  Ideally, when we check this into source-control, we’d check in the “DEV” environment flavor connection-configs.  Then, when the reports get deployed to the “PROD” SSRS server, the same globally shared Data Sources are already present (and they don’t get over-written, thankfully by default!), configured by the DBA with prod credentials, and nobody in the development pipeline needs to know said credentials.  Yay?

But alas.  We live in an imperfect world.  I also live in a world where I’m the Report Developer, the Report Manager, and the Production DBA all at once.  So guess what I get to do?  That’s right, not care about the connection-string security!

Okay, yes, I care a little.  Depending on my mood.  But not enough to turn my whole dang world upside-down trying to design and implement a robust multi-tiered-environment solution for one measly SSRS repository.  Kudos to you if you do — you’re a better man than I.

Or “better woman”.  But I’m not a woman.  So I guess the correct phrasing would be “You’re a better woman than I am a man.”  No, that’s not right.  “You’re a better person than I am?”  Sure, I guess that works.

In Review

To recap:

  1. Create the solution & projects
  2. Create the global shared Datasources project, and create your Data Sources (RDS’s)
  3. Import your Reports (RDL’s) to each requisite report-project
  4. Manually edit your rptproj to point the data-sources at the central shared path
  5. Manually edit your RDL’s to remove previously established paths to the data-sources
  6. Configure your rptproj‘s to output to the correct folders on the report server — this should be done by default, i.e. VS has the “intelligence” to guess these for you, but just double-check.
  7. Build, Preview, and Deploy!
  8. Commit it all to source-control and let the collaboration begin.

And now to go automate all this with PowerShell… right after this beer.

Anyway.  Until next time, folks!  If I’ve made any mistakes, missteps, or otherwise offended your technical sensibilities, and/or if you know of a better way to accomplish what I’m after, I’d love to hear from you in the comments!  =)