Migrating SQL Server Database Files Between Storage Subsystems

In my role at Pure Storage, I often engage with customers who want to migrate their SQL Server databases off their prior storage and onto our hardware. After digging around for previously published material, I was surprised to find there really wasn’t much that was comprehensive. After all, one doesn’t change SANs too often. But when it does happen, it’s nice to have some reference material from others who have done it. So I decided to try to give a good overview of how I’d approach the challenge.

This is meant to be a “food for thought” kind of post. I’m going to keep things somewhat high level, but will provide links to other blogs and material that can help you continue down whatever path you choose. And for simplicity, I’m going to limit this scope to a single SQL Server.

Questions First

  1. Is your SQL Server virtualized or not?
    • VMware or another hypervisor?
    • If VMware, are your existing volumes RDM, VMFS, or vVols?
  2. How much downtime can you take for the actual cutover?
  3. Are you migrating your databases to a new SQL Server instance or keeping everything on the same instance?

VMware VM?

First, the easy one. If your SQL Server instance is on VMware VMFS or vVols, use Storage vMotion and call it a day. Are you one of the teeny-tiny percentage of folks running on Hyper-V? That’s cool – there’s storage migration functionality in there too, both for standalone Hyper-V VMs and Hyper-V VMs that are clustered and residing on Cluster Shared Volumes.

If you’re on RDMs, know that from VMware’s perspective, RDMs are dead and vVols are the way of the future, so take this migration opportunity to consider changing that as well. One legacy reason SQL Servers needed RDMs had to do with clustered storage, but the capabilities needed for that are now available in VMFS and/or vVols. Still have a Failover Cluster Instance with RDMs? Then check out this oldie-but-goodie blog for starters.

Cutover Downtime

The most critical question to answer next is how much downtime can you take for the actual migration? Do you have the luxury of being able to take an entire weekend because your workload is only Mon-Fri? Cool, you have it easy. More typically, I see teams that can afford a few hours of downtime overnight. And occasionally, I’ll hear “we can’t have ANY downtime!”

The long and short of it is this… your migration solution will most likely increase in complexity as the amount of downtime available to you decreases.

My Preferred Option

Generally speaking, I would pre-stage the databases via backup/restore to the new storage. If you are also migrating to a new SQL Server instance, use Log Shipping! It just works!

If you are remaining on the same SQL Server instance, you can use a “log-shipping-like” methodology. Present new volumes from your new SAN to the same instance and restore the databases side-by-side (named differently, of course). Then, if you want, you can use DIFFs and/or T-log backups to keep that copy up to date. This makes the most sense if you, say, leisurely pre-stage the databases 1-2 weeks beforehand, take a DIFF the morning of the cutover, and restore that during the workday. Then at cutover time, you’d take a final transaction log backup (the tail), restore it, and swap database names. Of course, you can also switch drive letters/mount points if you really want to.
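To make that flow concrete, here’s a minimal T-SQL sketch of the side-by-side restore chain. Everything in it is a placeholder for illustration: the database names (MyDB, MyDB_Staged), logical file names, and backup/volume paths are all hypothetical, so adjust them for your environment.

-- Pre-stage: restore the most recent full backup side-by-side onto the new
-- volumes, leaving it in NORECOVERY so additional backups can be applied later
RESTORE DATABASE MyDB_Staged
FROM DISK = N'\\BackupShare\MyDB_Full.bak'
WITH MOVE N'MyDB' TO N'X:\NewSAN\Data\MyDB_Staged.mdf',
     MOVE N'MyDB_log' TO N'Y:\NewSAN\Log\MyDB_Staged.ldf',
     NORECOVERY;

-- Morning of cutover: catch the copy up with a differential
-- (apply any intervening log backups in sequence the same way)
RESTORE DATABASE MyDB_Staged
FROM DISK = N'\\BackupShare\MyDB_Diff.bak'
WITH NORECOVERY;

-- At cutover: stop activity on the original and capture the tail of the log
ALTER DATABASE MyDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
BACKUP LOG MyDB TO DISK = N'\\BackupShare\MyDB_Tail.trn';

-- Apply the tail and bring the staged copy online
RESTORE LOG MyDB_Staged FROM DISK = N'\\BackupShare\MyDB_Tail.trn' WITH RECOVERY;

-- Swap names so connections land on the copy sitting on the new storage
ALTER DATABASE MyDB MODIFY NAME = MyDB_Old;
ALTER DATABASE MyDB_Staged MODIFY NAME = MyDB;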

Here’s a SQL Server Central blog that covers this in a bit more detail.

PowerShell Is Your Friend

I’d also strongly suggest making your life easier by using dbatools.io. If you’re staying on the same instance, check out Move-DbaDbFile. I like this one for same-instance scenarios because it’s stupid simple, with the trade-off that it takes each database offline during the operation (though there is a restore option too). And it handles the metadata for you, which is great.

And if you’re moving to a new instance, utilize Start-DbaMigration. If you’re new to dbatools.io, no worries! There’s a wealth of resources available – go search “dbatools” on YouTube and you’ll find a ton of amazing presentations showcasing how to use it, like this one from Jess Pomfret!

One other benefit to PowerShell… well, remember how I said this blog was only covering a single SQL instance? If you have a TON of SQL Servers to migrate, PowerShell is an amazing way to code up a solution once and apply it across all of them!

Filegroups – aka: I Didn’t Know I Could Do That!

Filegroups are one of those underrated capabilities in SQL Server that I find most folks (myself included) don’t use. When you create a basic database, you get a data file, a log file, and a single PRIMARY filegroup. But what you can do is create another filegroup, add underlying files to that filegroup, and then migrate existing data over to it. The cool thing about this option is that it’s fully online! The one drawback is that it may not be as fast as one might like, because it’s a size-of-data operation. But you remain online at least, so it can be done over a long period of time. There is one more quirk – it’ll introduce a ridiculous amount of external fragmentation. And remember that though external fragmentation is not as impactful as it once was on spinning platters, it still hurts read-ahead operations, so it could negatively impact your workloads.
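Here’s a minimal T-SQL sketch of that approach. The database, filegroup, file path, table, and index names are all hypothetical, and note that ONLINE = ON requires Enterprise Edition.

-- Create a new filegroup whose file(s) live on volumes from the new array
ALTER DATABASE MyDB ADD FILEGROUP NewSAN_FG;
ALTER DATABASE MyDB ADD FILE
    (NAME = N'MyDB_NewSAN_1', FILENAME = N'X:\NewSAN\Data\MyDB_NewSAN_1.ndf')
    TO FILEGROUP NewSAN_FG;

-- Rebuild the table's clustered index onto the new filegroup; this is an
-- online, size-of-data operation that physically moves the table's pages
-- (assumes an existing clustered index named CIX_BigTable on the same key)
CREATE UNIQUE CLUSTERED INDEX CIX_BigTable
    ON dbo.BigTable (BigTableID)
    WITH (DROP_EXISTING = ON, ONLINE = ON)
    ON NewSAN_FG;

-- Repeat per table/index, then optionally make the new filegroup the default
ALTER DATABASE MyDB MODIFY FILEGROUP NewSAN_FG DEFAULT;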

If you’re curious to learn more about this, check out Bob Pusateri’s blog post here. He presents a hybrid solution, because of specific requirements he had, but it’s extremely insightful on how to orchestrate this. And yes, this is a complex approach, but as I said earlier, if you must absolutely minimize downtime, a more complex orchestration will be your trade-off.

Thanks for reading!

Pure Storage FlashArray – Will a DELETE and/or SHRINK reclaim space?

In my current role at Pure Storage, I have the privilege of working with two amazingly smart, awesome SQL Server nerds: Andrew Pruski (b) and Anthony Nocentino (b). We often find ourselves facing interesting questions about SQL Server and storage, and today was no exception.

Andrew had a customer who wanted to know what happens on our FlashArray, from a space usage perspective, when they first delete a large volume of data in a database’s data file, then subsequently shrink the database’s data file.

DELETE Records

To properly answer what happens on FlashArray, one must look at all of the steps that happen in between.

First, what happens when SQL Server executes a DELETE operation against a ton of data? In a nutshell, SQL Server marks the affected records and pages as ghosted, and the ghost cleanup task eventually comes along and marks those data pages for reuse. The data pages are not deallocated in the Windows file system, so no space is given back to Windows.

Aside: Individual record/singleton deletions are a bit of a different story, which I may explore in a future blog. Today, we’re only focusing on mass-deletion of tons of records/tons of data pages.
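If you want to see this for yourself, a couple of quick DMV queries will show it. The database and table names below are just placeholders; run them in the context of the database you deleted from.

-- Ghosted records left behind by the DELETE, per index level
SELECT index_id, index_level, ghost_record_count, version_ghost_record_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.BigTable'), NULL, NULL, 'DETAILED');

-- The data file itself has not shrunk: size is reported in 8 KB pages
SELECT name, type_desc, size * 8 / 1024 AS size_mb
FROM sys.database_files;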

SHRINK DATABASE

So now that we’ve deleted a bunch of data, for whatever reason (good or bad) we want to run a SHRINK DATABASE to reclaim storage space. In a SHRINK operation, SQL Server essentially physically moves and consolidates data pages within each data file. If there are gaps within the data file, SQL Server will move data pages around, “smooshing” them together, then deallocate the newly freed space at the end of the data file. And in Windows on NTFS, this means that a deallocation is sent to the file system, which is essentially pushed down to the storage array.

To use an analogy, you have a large storage container (aka your data file) and little boxes (data pages) randomly sitting about inside. When you SHRINK, you’ll move boxes and stack them all in one corner. Then imagine you can use magic to shrink the larger storage container into a smaller one. So now your storage container takes up less overall space.
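In T-SQL, that page consolidation plus end-of-file deallocation is what DBCC SHRINKFILE and DBCC SHRINKDATABASE do. A quick sketch, with hypothetical file/database names and target sizes:

USE MyDB;
GO
-- Consolidate pages toward the front of the file, then release the freed
-- space at the end of the file back to the file system (target size in MB)
DBCC SHRINKFILE (N'MyDB_Data', 16384);

-- Or shrink every file in the database, leaving 10 percent free space
DBCC SHRINKDATABASE (N'MyDB', 10);

-- TRUNCATEONLY only releases already-free space from the end of the file
-- without moving any pages, so it avoids the fragmentation penalty
DBCC SHRINKFILE (N'MyDB_Data', TRUNCATEONLY);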

Aside: Paul Randal goes into greater detail in this blog post, along with why SHRINK is a bad idea.

Aside 2: If you’re VMware virtualized on VMFS, there are additional caveats that will not be covered in this blog post today.

FlashArray

Within FlashArray, we do a variety of different things with the underlying data that is written to us. These operations include thin provisioning, data reduction algorithms, global deduplication, and compression. Capacity utilization needs to be thought of a bit differently on our array. And we use an algorithm called redirect-on-write, which I like to think of as an Append-Only or Insert-Only methodology for writing data (over-generalizing: we never update, just insert new).

So would sending a Windows NTFS deallocation to FlashArray cause FlashArray to reclaim space?

Testing

To prove this out, I adapted Paul Randal’s demo script from his SHRINK data files blog, and added additional steps to check storage utilization on the FlashArray. You can review my published test script here.

To outline what I did, I created a new database whose data file was isolated on a dedicated volume (don’t care about the transaction log for this test). Then I used Paul’s methodology to create a table and hydrate it with a bunch of data. Then I created another table, hydrated that, then recorded my storage utilization from both SQL Server’s perspective and from FlashArray’s.

To begin the actual test, I dropped the first table, then recorded FlashArray’s space utilization on the volume where the data file resided. Next, I ran DBCC SHRINKDATABASE, and then recorded space utilization again.
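My published script is linked above, but the skeleton of the test looks roughly like this. The object names are illustrative, and the FlashArray volume utilization is checked out-of-band with the array’s own tooling.

-- 1. Baseline: space usage from SQL Server's point of view
EXEC sp_spaceused;

-- 2. Drop the first table, simulating the mass delete
DROP TABLE dbo.BigTable1;

-- 3. Check again: the data file is still the same allocated size,
--    and the FlashArray volume shows no change either
EXEC sp_spaceused;
SELECT name, size * 8 / 1024 AS size_mb FROM sys.database_files;

-- 4. Shrink so SQL Server deallocates the freed space back to Windows,
--    which pushes the deallocation down to the array
DBCC SHRINKDATABASE (N'ShrinkTest');

-- 5. Check SQL Server and FlashArray utilization one more time:
--    this time the array-side usage drops as well
EXEC sp_spaceused;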

Results

Before dropping the first table (our stand-in for the mass DELETE), my database consumed 31.11 GB on FlashArray. After the drop, nothing changed. This is expected, because Windows did not deallocate anything – the SQL Server data file remained the same allocated size. Then after the SHRINK operation, which deallocated the freed space, the size on FlashArray dropped to 15.94 GB. Space was reclaimed on FlashArray.

Conclusion – TL;DR

When deleting records from a table, the space consumed is not impacted or reclaimed on the FlashArray. If you want to reclaim space, you must execute an operation to cause Windows to deallocate, which a SHRINK operation would do.

Thanks for reading!