Availability Group Seeding

First off, I had some fun with the AI-generated images for this one. I think silly images are the way to go.

Automatic seeding for Availability Groups is one of those features that’s fantastic when it works and incredibly frustrating when it doesn’t. When seeding is healthy, databases just show up on the secondary and life is good. When it’s not, you’re left staring at vague status messages, wondering whether anything is actually happening at all. I really hate how the GUI handles this: whether the seeding is working or not, you get no feedback whatsoever until it’s basically done.

Luckily, there are scripts to help here, but if you don’t have them handy, you aren’t getting any information.

The first place I always check is:

SELECT * FROM sys.dm_hadr_automatic_seeding;

This DMV tells you whether seeding started, whether it failed, how many retries have occurred, and whether SQL Server recorded an error code. If seeding failed outright, this is usually where you’ll see the first clue as to why.
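The raw output of that DMV is all IDs, so if you want friendlier results, a join along these lines maps the IDs back to names. This is a quick sketch assuming the standard AG metadata views; double-check the column names on your build:

SELECT ag.name AS AgName,
       adc.database_name AS DatabaseName,
       s.current_state,
       s.failure_state_desc,
       s.error_code,
       s.number_of_attempts,
       s.start_time,
       s.completion_time
FROM sys.dm_hadr_automatic_seeding AS s
INNER JOIN sys.availability_groups AS ag
    ON ag.group_id = s.ag_id
INNER JOIN sys.availability_databases_cluster AS adc
    ON adc.group_database_id = s.ag_db_id
ORDER BY s.start_time DESC;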

Next:

SELECT * FROM sys.dm_hadr_physical_seeding_stats;

When things are healthy, this view can show progress and estimated completion. When things are not healthy, it can be empty, partially populated, or frozen in a state that never changes. Use that to your advantage: if seeding is supposedly “in progress” but this DMV isn’t showing anything, something is wrong.
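When the view is populated, a trimmed-down query like this gives you the transfer rate and a rough percent complete. Again, a sketch; the column names here are from memory, so verify them against your version:

SELECT local_database_name,
       role_desc,
       transferred_size_bytes / 1048576 AS TransferredMB,
       database_size_bytes / 1048576 AS DatabaseMB,
       transfer_rate_bytes_per_second / 1048576 AS TransferRateMBSec,
       transferred_size_bytes * 100.0 / NULLIF(database_size_bytes, 0) AS PercentComplete,
       estimate_time_complete_utc
FROM sys.dm_hadr_physical_seeding_stats;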

Check Whether Data Is Actually Moving (Secondary)

That brings us to the gnawing question: is anything actually happening right now?

On the secondary replica, I use performance counters to answer that question. This script samples backup/restore throughput over a short window to see if seeding activity is occurring:

----- RUN ON SECONDARY ------
-- Test if there are processes for the seeding occurring right now

IF OBJECT_ID('tempdb..#Seeding') IS NOT NULL
    DROP TABLE #Seeding;

-- First sample of the backup/restore throughput counters
SELECT GETDATE() AS CollectionTime,
       instance_name,
       cntr_value
INTO #Seeding
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Backup/Restore Throughput/sec';

-- Give the counters a few seconds to accumulate
WAITFOR DELAY '00:00:05';

-- Second sample, diffed against the first for bytes/sec per database
SELECT LTRIM(RTRIM(p2.instance_name)) AS [DatabaseName],
       (p2.cntr_value - p1.cntr_value)
           / DATEDIFF(SECOND, p1.CollectionTime, GETDATE()) AS ThroughputBytesSec
FROM sys.dm_os_performance_counters AS p2
INNER JOIN #Seeding AS p1
    ON p2.instance_name = p1.instance_name
WHERE p2.counter_name LIKE 'Backup/Restore Throughput/sec%'
ORDER BY ThroughputBytesSec DESC;

If you see throughput here, seeding is still moving data, even if the DMVs look suspicious. If you see nothing, seeding is probably broken.
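One more sanity check I like on either replica is searching the error log for seeding messages. xp_readerrorlog is undocumented, but this usage (log 0, the SQL Server log, one search string) is a long-standing pattern:

-- Search the current SQL Server error log for seeding-related messages
EXEC xp_readerrorlog 0, 1, N'seeding';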

Restarting Seeding (Without Restarting SQL)

When seeding is stuck, sometimes the fastest path forward is to effectively “kick” the process. On the primary replica, toggling the seeding mode can force SQL Server to restart the automatic seeding workflow:

-----*** RUN ON PRIMARY ******-----
-- Change to your AG name and server names

ALTER AVAILABILITY GROUP [YourAgName]
MODIFY REPLICA ON 'SecondaryServer1'
WITH (SEEDING_MODE = AUTOMATIC);

ALTER AVAILABILITY GROUP [YourAgName]
MODIFY REPLICA ON 'SecondaryServer2'
WITH (SEEDING_MODE = AUTOMATIC);

This isn’t magic, and it doesn’t fix underlying problems like permissions, disk space, or network throughput, but it often clears up cases where seeding simply stopped progressing for no obvious reason. I use these scripts all the time to verify that there is data movement happening on an AG that stopped syncing overnight or after a patch.

A Cautionary Tale About “Helpful” AI

I like to test AI to see what it suggests on problems I’m troubleshooting. Lots of times it tells me what I already know, but one time an AI tool confidently suggested a SQL command that would “restart AG data movement.”

That sounded amazing. I got excited. Surely this was a new command from a recent release that I just didn’t know about?

No…It didn’t exist. It was just a hallucination.

AI can be a great accelerator, but you still need to verify everything against reality, especially when something sounds too good to be true.

Final Thoughts

AG seeding failures are rarely caused by one thing, and no single DMV tells the whole story. You have to look at:

  • Seeding status and error codes
  • Physical progress
  • Data movement
  • And sometimes, a forced reattempt of the whole process

The good news is that with the right scripts and a little patience, most seeding issues can be diagnosed without guesswork. The bad news is that when things break, SQL Server is still not very good at telling you why unless you know exactly where to look.

Hopefully, scripts like these save you a little time the next time seeding decides to go wrong.

Power BI: Easy Data Wins – but Annoying SQL Connections

I want to talk about two things in Power BI today. First, I will give a generic pitch for why it is so great. Then, I will discuss a specific gripe I have about its connection window, which I find lacking.

Power BI: One of the Easiest Wins in Data

I’ve been working with data for a long time. One thing that hasn’t changed is how much easier everything gets once the data is in a pretty picture. Whether you’re troubleshooting a weird performance spike, trying to understand a trend, or making sense of raw logs, a simple visualization can highlight things you’d never catch in a wall of text.

But that’s probably obvious to more data-driven people. Power BI is also awesome for people who wouldn’t normally think it is for them. I’ve taught Power BI basics to dozens of people, and so far everyone (even the skeptics) has appreciated it once I explained how useful it can be.

Developers, DBAs, analysts, managers, administrators…almost anyone can find a good use for Power BI. You don’t have to be a reporting expert to get value out of it. Ingest a dataset from a CSV, drag a couple of visuals in, and suddenly you can explain something in 10 seconds that used to take a 20-minute conversation and a whiteboard. You can send a quick report to your manager to showcase a win or a loss.

Even if you don’t plan on publishing dashboards or rolling out a reporting platform, Power BI is great as a personal tool. You can explore a dataset, build a quick visual, understand the patterns, and move on. It turns “staring at CSV files” into something productive and, dare I say, fun. There are other tools like it, but I think Power BI is perhaps the easiest to get started with. I still love SSRS for easier complex reports, but the performance is horrible and it is deprecated. Databricks has Dashboards that are shockingly similar to Power BI; if you know how to use one tool, you’ll be able to use the other with minimal effort.

Direct SQL Connection Woes

I do have a few annoyances with Power BI, and the one I want to mention today is how it handles direct SQL connections.

If your SQL Server has an untrusted certificate but encryption is not forced, the native SQL connector will complain (see below) but still let you in.

[Image: the certificate warning from the native SQL connector]

However, if you are forcing encryption and have an untrusted certificate, things get bad. Ideally you want your certificate issued from a trusted certificate authority, but I know that doesn’t always happen quickly. So…unlike SSMS, there is no checkbox for trusting the certificate. You’ll just get a connection error about the untrusted cert.

[Image: the connection error caused by the untrusted certificate]
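As an aside, if you aren’t sure whether a server is actually forcing encryption, you can check what an existing session negotiated from any client that can connect, such as SSMS:

-- encrypt_option = TRUE means this session's traffic is encrypted
SELECT session_id, encrypt_option
FROM sys.dm_exec_connections
WHERE session_id = @@SPID;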

This really irked me, and I thought I was at an impasse until I found a connection workaround.

The Workaround: Use the OLE DB Connection Instead

The good news is that Power BI has other connection types. One option here is using OLE DB. It’s not as obvious or user-friendly as the standard SQL connector, but it gives you something the default connection doesn’t: the trust certificate checkbox.

[Image: the OLE DB connector in Power BI]
[Image: the trust certificate checkbox in the OLE DB connection options]
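For what it’s worth, that checkbox maps to the provider’s trust server certificate setting. If you ever need the same behavior outside the GUI, an OLE DB connection string looks roughly like this. This is a sketch assuming the MSOLEDBSQL provider, and the keyword names are from memory, so verify them before relying on this:

Provider=MSOLEDBSQL;Data Source=MyServer;Initial Catalog=MyDb;Integrated Security=SSPI;Use Encryption for Data=True;Trust Server Certificate=True;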

Bottom Line: Get a Trusted Certificate

This workaround shouldn’t replace proper security. If your SQL Server has an untrusted certificate, the real solution is to fix the certificate. I’ll admit I’m too lazy on most of my dev boxes to do that, but it’s the right way for production.

I’ll probably post a blog sometime about all the nuances of requesting and building a SQL Server certificate, and the requirements for one, but that’s for another day.

SSMS 2016 Policy Management Quote Parsing Error

I discovered a bug today in SSMS 2016 when creating and updating policies. It drove me crazy and cost me a lot of time until I realized what was going on. Hopefully this will get fixed fast; we are reporting it immediately because I couldn’t find any existing references to it. Special thanks to Kenneth Fisher for helping confirm that it wasn’t just affecting me.

The Problem

In the latest releases of SSMS 2016 (16.5.1 and newer), saving a policy condition strips its quotes, causing parse errors.

Vote up the Connect item. A fix for this should be released in the next few weeks, but it doesn’t hurt to show your support for Policy Based Management.

Example

I’ll walk through a full, simplified policy creation showing how I discovered the problem, but it can be recreated by just editing a condition.

I created a new policy named Test and a new condition, also named Test. I set the condition facet to Server and entered the following code into the field to create an ExecuteSql statement. Any quote needed inside the string has to be escaped as at least a doubled single quote.


Executesql('string',' Select ''One'' ')

[Image: conditionscript]

Once the code was input, you can see below that it parsed correctly. SSMS was happy with it, so I hit OK to continue.

[Image: conditionready]

I finished creating the policy, and everything was still looking fine.

[Image: createpolicy]

I then went to Evaluate the policy. The policy failed, as I expected; that’s not the point. If you look closely, you’ll notice that the Select One statement is no longer surrounded by doubled quotes. That shouldn’t have happened.

[Image: evalresults]

I opened the Condition itself and received a parse error. Without the required doubled quotes, the Condition was broken.

[Image: parseerror]
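In other words, after the save SSMS had effectively rewritten the condition to something like this, which no longer parses:

Executesql('string',' Select 'One' ')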

Summary
I tested this by creating or editing a condition on its own, without a policy or an evaluation, and got the same results using SSMS 2016 on two separate computers, versions 16.5.1 and 17.0 RC1. When using SSMS 2012 or 2014, the code was not altered and everything worked as it should have. Finally, Kenneth happened to have an older version of SSMS 2016 and could not reproduce my error until he updated to the latest version, indicating that this is a recently introduced bug.

And again, if you haven’t already, vote up the Connect item.

Failover Cluster Manager Connection Error Fix

A few days ago I encountered a new error with Failover Cluster Manager. A couple of servers had been rebuilt to upgrade them from Windows Server 2008 to 2012, and they were added back to the cluster successfully. However, one of the servers would not open Failover Cluster Manager properly, and tracking down the solution took a long time.

The problem server successfully joined the cluster, but it would not connect to the cluster using Failover Cluster Manager. If you opened the application, it didn’t try to connect automatically, and manually connecting with the fully qualified name failed too. Below is the generated error.

[Image: failoverclustermanager_wmierror]

I love how this error has absolutely no useful information in it. Luckily, I was able to track Error 0x80010002 down online.

Research indicated that there was some sort of WMI error on the computer. Rebooting didn’t help, and numerous attempts to correct/rebuild the WMI repository accomplished very little. Eventually, the server could connect to the cluster, but only about 30% of the time, and it nearly timed out even when it did succeed! The cluster still never connected automatically.

After further poking around on the internet, I found a few suggested solutions, with my ultimate fix closely following this post. I still had to combine everything and run scripts all over the cluster before things returned to normal.

First of all, this is a condensed version of the Cluster Query from the TechNet post linked above.

1) Cluster Query


$Nodes = Get-ClusterNode
ForEach ($Node in $Nodes)
{
    If ($Node.State -eq "Down")
    {
        Write-Host "$Node : Node down skipping"
    }
    Else
    {
        Try
        {
            # A successful PacketPrivacy query means this node's Cluster WMI provider is healthy
            $Result = (Get-WmiObject -Class "MSCluster_CLUSTER" -NameSpace "root\MSCluster" -Authentication PacketPrivacy -ComputerName $Node -ErrorAction Stop).__SERVER
            Write-Host -ForegroundColor Green "$Node : WMI query succeeded"
        }
        Catch
        {
            Write-Host -ForegroundColor Red "$Node : WMI Query failed" -NoNewline
            Write-Host " //" $_.Exception.Message
        }
    }
}

Any server that throws an error with the above query needs to have the following scripts run on it:

2) MOF Parser
This re-registers the Cluster WMI provider by parsing cluswmi.mof into the WMI repository.

cd c:\windows\system32\wbem
mofcomp.exe cluswmi.mof

FCM was still not working correctly, so I reset WMI with the following command.

3) Reset WMI Repository


Winmgmt /resetrepository

That command restarts the WMI service, so you’ll probably have to run it multiple times until all the dependent services have stopped. Either way, it shouldn’t take more than a few seconds to process.

After that, the server that failed the Cluster Query (1) was reporting good connections, but FCM still wouldn’t open properly!

I decided to try the two WMI commands (2 & 3) again on the original server that couldn’t connect to FCM. I had already run those commands there during the initial troubleshooting, so I was starting to think this was a dead end. Still, it couldn’t hurt, so I gave it a shot.

I reopened FCM and voila! Now the cluster was automatically connecting and looking normal.

As a further note, after everything appeared to be working correctly, SQL Server had trouble validating connections to each node in the cluster during an install, and I had to run commands 2 & 3 on yet another node before things worked 100%, even though that node never had a connection error using the Cluster Query (1).

SQL Server Storage: Reading Block Size

This is the final post in the SQL Server Storage line of blog posts I’ve made. First we discussed Pages & Extents, then how to Read A Page using T-SQL, and finally this post will be about Disk Partition Offsets and Allocation.

Knowing that Pages are 8 KB and Extents are 64 KB, it’s understandable that SQL Server would perform best when the disks are aligned in the same manner. However, in earlier versions of Windows, this was never the case. Instead, Windows used an alignment of 63 sectors of 512 bytes each, for a total of 31.5 KB. Since the way SQL Server reads and writes was not aligned with Windows, havoc ensued in the form of increased I/O operations. As of Windows Server 2008, partitions are aligned at 1024 KB by default, which produces fewer I/O operations since 1024 is a multiple of 64.

I tried to keep the above explanation simple, but it still got wordy. The takeaway should be: you want SQL Server disks to have a Starting Offset that is a multiple of 64 KB, with the desirable default being 1024 KB. Higher values are fine for special circumstances, as long as your Starting Offset is evenly divisible by 64.
As for your block size, that should also be 64 KB, the size of a single Extent, to improve performance.

There is a single script to find both Block Size and Starting Offset, and it almost even works!

Get-WmiObject win32_DiskPartition |
	SELECT SystemName, Name, BlockSize, StartingOffSet |
	FT -Auto

That script returns the correct Starting Offset, but that Block Size is a screwy number that should not be trusted. To make matters worse, you get a Partition Number but no Drive Letter, which doesn’t help me tell whether my data disk is formatted correctly. In comes another query.

Get-WmiObject -Class win32_Volume |
    FT DriveLetter, Label, BlockSize -Auto

Great, now we have an accurate BlockSize and a Drive Letter! The only problem is, now there is no Partition Number to match up with the OffSet from the first query. Getting ALL the information at one time is a pain. Luckily there are Custom Objects to help us smash results together and Win32_LogicalDisk to help us relate the drives and partitions.

# Gather drive letter, partition, block size, and starting offset in one result
$Disks = Get-WMIObject Win32_LogicalDisk | WHERE { $_.DriveType -eq '3' }
$Result = ForEach ( $Disk in $Disks )
{
    # Find the partition associated with this logical disk
    $OffSet = Get-WmiObject -Query "Associators of {Win32_LogicalDisk.DeviceID='$($Disk.DeviceID)'} WHERE ResultRole=Antecedent"
    [PSCustomObject] @{
        ComputerName   = $Disk.SystemName
        DriveLetter    = $Disk.DeviceID
        Partition      = $OffSet.Name
        BlockSize      = ( Get-WmiObject Win32_Volume | WHERE { $_.DriveLetter -eq "$($Disk.DeviceID)" } | SELECT -Expand BlockSize )
        StartingOffSet = $OffSet.StartingOffSet
    }
}
$Result | FT -Auto

Now you can determine the Block Size and Starting Offset of your drives easily. Changing those values, well, that’s another story. Normally that requires formatting the drive, so it’s best to get it right in the first place. Make sure all your SQL Server drives are formatted correctly before you get started. Just educate your friendly Storage Admin on what SQL Server needs so everything is correct by the time you get it.

SQL Server Storage: Reading Pages with T-SQL

This is a continuation of last week’s SQL Server Storage: Pages and Extents explanation. Since the description was a bit longer than I originally expected, I decided to split the scripts into more posts.

First, I’m going to quickly cover how you can see information about a specific page using T-SQL. Now, normally this won’t be of much use to you, but it’s fun to play around with a bit just to see how things are actually stored. You might have need to read a page during some heavy troubleshooting at some point in the future too.  This procedure is undocumented though, so information is scarce and the feature could disappear without notice in future versions.

To read a page, you’ll need to utilize DBCC PAGE which I’ve listed the basic layout for below.


DBCC PAGE
(
  'DbName' OR DbId -- Database name or Database ID, either one!
  ,FileNumber -- File Number of the Page
  ,PageNumber -- Page Number in the File
  ,PrintOption -- Display option ranging from 0-3 with different info from each
);

Now we need to know what FileNumber and PageNumber to supply to DBCC PAGE. Random numbers might work, but if you are actually trying to do anything halfway useful, they won’t get you far. To solve this problem, we have to utilize another procedure…DBCC IND.


DBCC IND
(
  'DbName' OR DbId -- Database name or Database ID, either one!
  ,TableName -- Table Name...I don't think this really needs a comment
  ,IndexId -- index_id from sys.indexes; -1 = indexes and IAMs, -2 = IAMs
);

Using DBCC IND we can get some relevant data to pass into DBCC PAGE. The problem is, you still need a relevant index_id for the selected table. The following query can help with that; just supply the desired TableName in the WHERE clause.


--Get the index_id relating to your desired TableName to pass into DBCC IND
SELECT * FROM sys.indexes
WHERE OBJECT_NAME(object_id) = 'TableName'

[Image: SysIndexesResults]

A sample result of sys.indexes. Grab the index_id and plug that into DBCC IND

 

With an index_id and a TableName in mind, we can get some results from DBCC IND.

[Image: DBCC_Ind_Results]

If you are just testing, index_id = 1 is not a bad idea to check.
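As a side note, if you’re on SQL Server 2012 or later, there is a newer (also undocumented) alternative to DBCC IND: sys.dm_db_database_page_allocations. A quick sketch, with column names from memory, that returns file and page numbers you can feed straight into DBCC PAGE:

SELECT allocated_page_file_id AS FileNumber,
       allocated_page_page_id AS PageNumber,
       page_type_desc
FROM sys.dm_db_database_page_allocations(DB_ID('DbName'), OBJECT_ID('TableName'), 1, NULL, 'DETAILED')
WHERE is_allocated = 1;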

Awesome, now we have meaningful IDs to use with DBCC PAGE. You’ll need to do one more thing before you run it though: trace flag 3604 has to be set for SQL Server to provide output. Without it, you won’t get any results at all.

[Image: NoTraceFlagSet]

No Trace Flag means no results


DBCC TRACEON(3604)
DBCC PAGE('DbName',1,100,3) WITH TABLERESULTS;
GO

With the Trace Flag on, you’ll get more data than you probably know what to do with.

[Image: DBCC_Page_Results]

DBCC PAGE provides LOTS of information; I only captured a tiny snippet for your visual delight

Now you know how to get page information about your tables, so long as these undocumented procedures are available. Since this post again got longer than I originally expected, I’ll cover reading and setting block sizes on your disks next week. That will involve some fun PowerShell too!

SQL Server Storage: Pages and Extents

It’s time for another SQL Server refresher today! This time we will discuss some storage basics, specifically Pages and Extents and how they relate to each other. There are a lot of resources out there discussing these storage units, but I’ve tried to put my own spin on things and aggregate as much data as I could find about them.

Pages

Naturally we are going to discuss pages first, since they are the most fundamental unit of storage in SQL Server. Pages store everything in the database and are only 8 KB in size. All of your disk I/O is performed at the page level.

[Image: Page]

An example data page layout is easier to visualize than explain

Pages have three major components: a page header, records, and the offset array. Pages start with a 96-byte header, which contains metadata like the page number, the owner’s object id, and the page type. Pages end with the offset array, which is 36 bytes and holds pointers to each new row stored in the page. These pointers are stored last to first, which is more easily explained in the picture; the offset array is essentially the index for the page. The middle of the page holds the records and consists of the remaining 8060 bytes of stored data.

There are different types of pages, such as data, index, image, and a number of informational pages. Probably the most interesting type is the overflow page. If a row is greater than 8060 bytes, the data can be stored on overflow pages, which are linked together. Overflow pages can store as much as 2 GB in a single column, but obviously this is less than ideal, since each extra page increases read times. The most obvious example is the VARCHAR(MAX) or VARBINARY(MAX) datatype. Datatype limitations normally relate directly to the size of a page; VARCHAR(8000) and NVARCHAR(4000) are the normal limits and are based on the size of a single page, while (MAX) datatypes effectively bypass the limit and cause overflow pages. Using (MAX) datatypes that span multiple pages increases reads, causing less than stellar performance. Queries like SELECT * can grab poorly performing datatypes accidentally and should be avoided as much as possible.
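If you’re curious whether one of your own tables has overflow data, sys.dm_db_partition_stats breaks page counts out by allocation type. The table name below is just a placeholder:

SELECT OBJECT_NAME(object_id) AS TableName,
       index_id,
       in_row_data_page_count,
       row_overflow_used_page_count, -- pages for rows that exceeded 8060 bytes
       lob_used_page_count           -- pages for (MAX) and other LOB data
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('YourTable');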

Extents

[Image: Extent]

Extents can be Uniform and have all the same type of pages or Mixed with a combination of page types

Simply put, extents are groups of pages. An extent consists of exactly eight contiguous pages, and every page is part of an extent. In other words, pages always come in groups of eight, so data grows in a minimum of 64 KB increments. Unlike the many types of pages, there are only two types of extents.

Mixed Extents: In these extents, pages are allocated to multiple objects. New tables and indexes are put into mixed extents for storage efficiency: when a small table is made that would consist of fewer than eight pages, it gets stored in a mixed extent with other similarly small objects. If an object grows large enough to fill an entire extent, it can utilize the next type of extent.

Uniform Extents: These extents have pages that are all allocated to the same object. Larger databases often have extents with identical page types, such as data pages or index pages. More data can be read in a single read operation with uniform extents, so performance can see an improvement.

Originally I had planned to provide some example scripts to discover information about your pages, and storage setup, but in an effort to keep the information in byte-sized chunks, I’ll continue with that next week.

Remotely Enable Always On

Always On Availability Groups is the new High Availability feature in SQL Server 2012. It’s been out for a while now, but unless you have Enterprise Edition, you might not have been able to use it much.

Of course you need a cluster to utilize Always On, but once that is complete, you also have to enable Always On in Configuration Manager on every server that will participate in the AG.

Continuing on with my lazy, automated DBA goals of logging into computers as rarely as possible, I developed the below PowerShell script to connect to SQL Servers, enable Always On, and then restart the SQL Service in order for the changes to take effect.

The only thing you need to change below is the computer names; the script should automatically detect your SQL instance names. If that doesn’t work (I haven’t been able to test every possible name-parsing possibility), you can supply the instance names yourself.


## List Servers in AG Here ##
$Computers = 'Computer1','Computer2'
## Everything Else is Automated ##

# Finds the servers running the services, and the services' names
Invoke-Command -ComputerName $Computers -Scriptblock {
    $Services = ( Get-Service -Include MsSql* | Where { $_.Status -eq 'Running' } )
    $Nodes = @()

    # Parses the SQL instance names out of the service display names
    ForEach ( $Service in $Services )
    { $Nodes += $Env:ComputerName + '\' + $Service.DisplayName.Split('(')[1].Replace(')','') }

    # Loops through each instance and enables AlwaysOn, restarting with -Force
    # (Enable-SqlAlwaysOn comes from the SQLPS module, which must be available on each server)
    ForEach ( $Node in $Nodes )
    { Enable-SqlAlwaysOn -ServerInstance $Node -Force }

    # Starts SQL Service on the affected server(s) if it's still stopped
    If ( $Services -ne $NULL )
    { Start-Service -DisplayName ($Services.DisplayName) }
}

The key code here for enabling Always On is this snippet below.

Enable-SQLAlwaysOn -ServerInstance $Node -Force

If the longer script cannot automatically detect your ServerInstance, you can provide it manually and run the command. Restart your SQL service for the change to take effect.
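Once the service is back up, you can confirm the change took effect with a quick query:

-- Returns 1 when Always On is enabled on the instance
SELECT SERVERPROPERTY('IsHadrEnabled') AS IsHadrEnabled;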

SQL Server Changing Passwords and an SSPI Context Error

The other day I encountered a login error when connecting to a SQL Server. The circumstances seemed strange compared to similar errors described online, and many of those made finding the real solution look rather complicated. Since this server had been around for a while, it was unlikely that some major Active Directory change would be necessary to resolve the issue.

This SQL Server was part of an Availability Group, and the connection worked fine when using the Server/Instance name. However, when attempting to connect via the Listener, the following error occurred.

Cannot connect to Server/Instance.
The target principal name is incorrect.
Cannot generate SSPI context (Microsoft SQL Server)

Articles online indicated this was an SPN, Kerberos, and/or Active Directory issue and that something needed to be reset, but the only way to know for sure was to work down a long troubleshooting list. Luckily, the problem was simpler than that, but still very strange.
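One quick check that is worth running before heading down the Kerberos rabbit hole is seeing which auth scheme a working session actually negotiated (run it over a connection that succeeds, such as the instance name):

-- KERBEROS means SPNs are resolving; NTLM can point at an SPN problem
SELECT session_id, auth_scheme
FROM sys.dm_exec_connections
WHERE session_id = @@SPID;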

I had reset the service account passwords the afternoon before this error became apparent. Each service was restarted afterwards to verify the change worked properly, and I was able to connect to SQL successfully. Everything seemed fine from my perspective.

The next day, some users attempted to connect using the Listener, and that’s when the errors started. I don’t normally connect via the Listener, so I hadn’t thought to check it; I didn’t think it would be necessary.

Troubleshooting the easy solutions first seemed like a good idea, so I tried restarting the SQL service, which immediately failed everything over to another server in the cluster. The services came online, and now both the instance and the Listener could be connected to. OK, well, probably sort of solved.

I failed it to a third node in the cluster, everything still worked great. Cool. This was looking even better.

Next I failed it back to the original node. This time, the SQL service came online, but not the Listener. Strange; how did it work in the first place? Everything was running on that server before I restarted the service, even if it wasn’t running correctly. I re-entered the passwords in SQL Server Configuration Manager and then restarted the services. Everything worked perfectly now.

In summary, all the services somehow restarted on the server after the password change, but the Listener had a bad password and was not allowing connections. When I attempted to restart the Listener again, it failed until the password was corrected. I still don’t know how this happened, but it’s a good reminder to be especially careful when changing service passwords. Changing passwords on a cluster can be even more dangerous, since you have extra services to update that may not even be running on the server at the time, so verifying everything went smoothly can take a few extra steps.

Tales of when a Log Fails to Shrink in an Availability Group

I received a report that one of my servers had 7% free space on its log drive. Sounded like something fun to resolve. I checked on what was going on and found a log file that was 99% free and a hundred GB in size. While shrinking a log file is not a good practice, and I’m not advocating it by any means because it’s just going to grow again and your storage is there specifically to hold logs, this situation was out of the ordinary and we needed space.

The problem was, this log would not shrink. It was being extremely uncooperative. I took a full backup and log backups and made multiple shrink attempts, but it wouldn’t budge. The message returned was a big clue though.

The log for database ‘dbname’ cannot be shrunk until all secondaries have moved past the point where the log was added.

As you might have guessed, this server was a SQL Server 2012 instance in an Always On Availability Group. The log could not shrink because the database was participating in the AG.

It wasn’t an ideal fix, but by removing the database from the Availability Group, I was able to shrink the log to a more manageable size. No, I did not truncate it to the minimum size; I adjusted it to a reasonable amount based on its normal workload, because I didn’t want the log to just have to grow again. The shrink worked flawlessly, and with adequate drive space, I attempted to add the database back to the AG via the wizard.
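For reference, the removal and shrink looked roughly like this. The names and target size are placeholders, and it all runs on the primary:

-- Remove the database from the AG (its secondary copies go into RESTORING)
ALTER AVAILABILITY GROUP AgName REMOVE DATABASE DbName;
GO

USE DbName;
GO
-- Shrink the log to a reasonable working size (in MB), not the minimum
DBCC SHRINKFILE (DbName_log, 10240);
GO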

The AG wizard refused to help. The database was encrypted, and the AG wizard will not let you add a database if it is encrypted. No explanation why; it just doesn’t like that. You can add an encrypted database to an AG via script though. You can even script the change from the wizard by using a non-encrypted database and then changing the database name in the scripted result. The resulting script is exactly what the AG wizard would run; it just cannot execute it automatically.


ALTER AVAILABILITY GROUP AgName
ADD DATABASE DbName;
GO

With free space and an encrypted database safely back in my AG, I was off to new adventures!