You might have heard people talking about all the cool stuff that Microsoft introduced with SQL Server 2016 Service Pack 1. There’s lots to get excited about: finally, those of us stuck on Standard Edition get a lot (but not all) the bells and whistles of Enterprise Edition. A lot of the developers and BI people I work with are super excited to leverage some of the cool features of 2016 now in instances that aren’t Enterprise licensed.
I can get excited by change data capture, columnstore indexes, and dynamic data masking as much as the next person, but buried in the release from Microsoft was something that jumped out at me:
DROP TABLE support for replication – DROP TABLE DDL support for replication to allow replication articles to be dropped. For more information, refer KB 3170123.
… and then I woke up; apparently I passed out from a combination of excitement and surprise that replication got an update. To be fair, though, this change wasn’t just something introduced with SP1 for 2016; Microsoft initially tucked this change into SQL Server 2014 SP2.
A major complaint I hear about SQL Server replication is around how well it doesn’t play with things like automated deployments or continuous integration. It can be hard to maintain a replication set up that mirrors your production environment; space requirements can be high, and refreshing a non-production environment means automating your replication settings so that when refreshes complete, you have all the publications and subscriptions set up. A lot of DBAs (myself included) forgo replication in DEV/INT/QA for that reason.
Two of the biggest offenders are when you’re trying to rename a table (or move it to a new schema) or outright drop a table. So that means your deployments will fail, which is never a good thing in agile environments that need to get code out the door fast. To correct the issue, you’d either need to make sure a table is taken out of replication ahead of the deployment (which means you’d have to know about the change ahead of time) or you’d need to react to the failed deployment and yank the table out and then redeploy.
Both aren’t great solutions, but with this recent change, you can take some of that sting out. 2016 SP1 and 2014 SP2 now include a new publication option called “allow_drop” that allows you to drop a table that is marked for replication. So now, if you or any developers you work with check in code to drop deprecated tables, it won’t throw an error anymore. There are some requirements (more on that in a minute), but to enable this change on a published database hosted on a 2016 SP1 or 2014 SP2 server, you’d run the following command (this isn’t something you can set in the GUI, at least not yet):
exec sp_changepublication @publication = '<Publication Name>', @property = 'allow_drop', @value = 'true'
And it works! Neat! Everything is great, let’s just end this article here, shall we? Too bad, because this is replication we’re talking about. This isn’t ‘Nam, there are rules.
To enable this change though, there’s more to it than just “you need to be running this service pack level on your publisher.” It’s far, far more than that:
I can’t make that last point any bolder; that’s not documented anywhere that I could find. I’ll explain why later, but for now, roll with it. In fact, let’s see what happens if we don’t upgrade our distributor as well. Time to go to the lab!
In my lab, I have the following configuration:
For my test, I have one publication with multiple articles. For simplicity sake, I am going to just use the table dbo.AWBuildVersion since it’s a nice, easy one row table and it’s not foreign keyed to the hilt, so it’s a nice test case since the drop will be easy. Next, I’m going to issue a DROP TABLE command for the table. It fails, because we haven’t turned anything on yet:
To prepare, let’s first run the T-SQL statement that will enable the new feature. Remember, this command will fail if you’re not running the appropriate service pack.
exec sp_changepublication @publication = 'AdventureWorks_TR', @property = 'allow_drop', @value = 'true'
Next, I’m going to issue a DROP TABLE command, and bingo: the table is dropped even though it’s marked for replication. Everything is awesome! Or is it?
If we open up replication monitor on our distributor, we might see a red x. Which, if you’ve been using replication as long as I have, that elicits a Pavlovian response to reach for the liquor.
We’ve got an error in our subscription saying that the there’s an invalid distribution command type in the queue. Strange; does that mean this transaction (the DROP TABLE DDL statement) is trying to be replicated down to the subscribers? And how do we deal with this error? Well, that’s where we go back to the bolded requirement I talked about earlier: in order for this functionality to work, not only do you have to make sure your publisher is updated to the correct service pack, your distributor needs the correct patch level, too.
Why? Well, probably because of what’s included here from the KB:
The log reader agent will post a cleanup command for the distribution database of the dropped table and do the cleanup of the publisher metadata. If the log reader hasn’t processed all the log records that refer to the dropped table, then it will ignore new commands that are associated with the dropped table. Already processed records will be delivered to distribution database. They may be applied on Subscriber database if the Distribution Agent processes them before Log Reader cleans up the obsolete (dropped) article(s).
So whatever this functionality entails, your distributor needs to be able to handle the command as well. Fortunately, if you have your distributor on the same server as your publication(s), this isn’t a big deal, and you’ll probably not encounter this scenario. However, if you have a dedicated distributor, you need to plan accordingly to make sure it’s updated as well. And if you’re stuck with a SQL Server 2012 or earlier instance for it, you won’t be able to use this feature, no matter what version your publisher is using. Ouch.
So how do we fix this error? There’s three ways:
SELECT [publisher_database_id] ,[xact_seqno] ,[type] ,[article_id] ,[originator_id] ,[command_id] ,[partial_command] ,[command] ,[hashkey] ,[originator_lsn] FROM [distribution].[dbo].[MSrepl_commands] WHERE xact_seqno = 0x0000007E00000480000200000000 and command_id = 1 DELETE FROM [distribution].[dbo].[MSrepl_commands] WHERE xact_seqno = 0x0000007E00000480000200000000 and command_id = 1
That should only affect one row and after a few seconds, your replication monitor should go green for the given again and start delivering any other transactions queued up. You can test it by putting a tracer token in and making sure it delivers.
If you’re suffering from failed deployments or constantly running into situations where a DROP TABLE statement is causing your heartburn, then you could consider it. Just be mindful of the requirements.
While other DDL statements can filter down in transnational replication, this statement won’t. So that still means you’ll be dealing with tables that were dropped in production living on in your replicas. I’d also argue that with a change like this in place, you’re probably less likely to remember to get rid of tables on your subscribers since it won’t break replication. Think about it: if you’re not sure what’s going to production already, how likely is this change likely to let you know something isn’t in your replicated tables anymore. No, you’d need some additional monitoring to look for changes in your table objects and compare them to what’s in replication vs. what is still on your replicas.
While it’s nice to see some replication changes in service packs, I’m hoping that Microsoft continues to consider those of us that continue to rely on this technology. This was a good start. I probably won’t be turning this on in my environments. My advice to you is pretty much the advice of all IT changes: test the hell out of it. Even if it means building a non-production distributor to test with and setting up some replication in at least your QA environment.
The other consideration is that this command, unlike other DDL statements has no bearing on your subscribers. As usual, dropping a table on your publisher will not drop it on your subscriber(s). Personally, I’m okay with that: it’s more work for me track down deprecated tables that are still on replicas than it is to just have tables disappear for reporting users. And if you’re replicating data to third parties, they probably wouldn’t appreciate it either.
If you’re constantly being bitten by code that conflicts with replication breaking your deployments, instead of relying on this enhancement (or if you’re unable to take advantage of it) you can educate your developers on how to test for replication. You could show them a query like this…
select * from sys.tables where is_published = 1
… and how they can use it to know if a table is replicated or not, and test for it. Will something like that solve all your code deployment problems, or easy any pains when you’re trying to remove objects that are published for replication? Probably not, but it at least can help move your release process or testing scenarios to help reduce, or hopefully eliminate these difficulties for you.
Pingback: Replication Support For Dropping Tables – Curated SQL
I’m going to guess that MS didn’t mention the distributor being updated as a requirement because at one point their recommendation was that the Publisher and Remote Distributor should always be the same version.
That’s probably accurate, and really good practice to follow. It’s a good thing to know, though, just in case you have a well-established replication environment already. And with SQL 2016 gaining more adoption for new installs, I’m willing to bet this may bite some people. I know if I hadn’t tested it, it would have bit me pretty hard here just because our distributor is still running 2012.