Data Saturday Chicago – One Month Away!


It’s been a hot minute since we (the organizing team) announced Data Saturday Chicago, and I can’t believe it’s a month away!

If you haven’t signed up yet, the team would love to invite you to join over 100 already-registered data professionals for Data Saturday Chicago! It’s an in-person training event for anyone interested in the Microsoft Data Platform, brought to you by the team that has been bringing you SQL Saturday Chicago since 2010!

This year, there’ll be THREE DAYS’ worth of training options available to you!!!


Why Attend Data Saturday?

Meet and learn from over 35 world-class experts in an intensive, one-day format. You’ll have the opportunity to gain actionable knowledge on a variety of topics including SQL Server performance, Azure Architecture, Microsoft Fabric, and AI Integration.

Beyond the technical sessions, you’ll have the opportunity to connect and network with the regional data community. Who knows, you may meet your next boss or coworker!

View the Full Schedule and see the value for yourself.


Level Up Even Further: Pre-cons on Friday

For those who want advanced, deep-dive training, we are offering two full-day workshops!

Modern Data Warehousing with Microsoft Fabric

  • Focus: Architectural foundations, data modeling and ingestion, migration strategies, & pitfall prevention
  • Presented by Kristyna Ferris & Chris Hyde
  • Cost: $188.58 (includes lunch & Eventbrite fees)
  • Learn more & Register

Advanced T-SQL Triage: The Art of Fixing Terrible Code

  • Focus: Advanced T-SQL performance optimization and refactoring
  • Presented by Erik Darling
  • Cost: $188.58 (includes lunch & Eventbrite fees)
  • Learn more & Register

Deepen Your Data Perspectives: Redgate Summit on Thursday

Our Platinum Sponsor Redgate is offering its own all-day training event nearby on Thursday!

This event is focused on putting you in control of your databases with visibility, simplicity, and security across three focused tracks: Security & Compliance, AI, and The Redgate Clinic.

  • Featured Speakers: Grant Fritchey, Kellyn Gorman, Steve Jones, John Martin, and Pat Wright
  • Keynote by Bob Ward (Microsoft) presenting “From the Ground to the Cloud: Building Trustworthy AI Applications with SQL.”
  • Cost: FREE
  • Learn more & Register

Data Saturday: Event Details & Registration

Main Event Page: https://datasaturdaychicago.com/

Friday, Mar 13, 2026 – Pre-Conference Sessions
Saturday, Mar 14, 2026 – Data Saturday Conference

Location:
Wojcik Conference Center @ Harper College, Palatine IL

Cost: $28.52 (includes Saturday admission, lunch, and Eventbrite fees)

CLICK HERE to REGISTER


Taking Snapshots of Databases on VMFS Datastores


Every now and then, I’ve talked about how one can leverage storage-based snapshots on Pure Storage against SQL Server databases to do cool things like orchestrate object-level restores. We have a number of example scripts published on our GitHub that you can use as building blocks for time-saving workflows, like the old-fashioned “restore Prod backups a bazillion times to non-Prod.”

Almost all of those examples had one thing in common: they were written with vVols in mind. Why? Because storage array snapshots are volume-based, and vVols are awesome for a number of reasons. But then Broadcom decided to deprecate them in a future VCF release (note that the article no longer explicitly states VCF 9.1 like it did originally) – but that’s a different topic for a different time.

TL;DR – Where Can I Get Code Examples?

Back to VMFS & VMDK files

Many folks who virtualize their SQL Servers on VMware choose to rely on VMFS datastores with VMDK files underneath. And that poses an interesting challenge when it comes to taking storage array snapshots.

Remember how I said that our snapshots are volume-based? On VMware, a storage volume corresponds to a VMFS datastore. And what’s inside a VMFS datastore? A filesystem containing a bunch of VMDK files, each representing a virtual disk that is presented to a virtual machine. Thus, if one takes a snapshot of a datastore, one could be snapping a dozen SQL Server VMs and ALL of their underlying disks! Fortunately, FlashArray is space efficient because it globally deduplicates data, but this nuance still presents a headache when one just wants to clone some SQL Server databases from one machine to another.

With vVols, RDMs, and bare metal, we can isolate our snapshot workflows to snap only the volume(s) containing our databases. If you’ve separated your user databases onto individual volumes – say, D:\ for data files and L:\ for log files – that means we can skip your C:\ with the OS, T:\ with TempDB, S:\ with system databases, M:\ with miscellaneous, etc.

To accomplish the same with a VMFS datastore, we need to work with the underlying VMDK files to achieve virtual machine disk level granularity.

Explaining the Workflow

First, if you want to follow along with my example code, hop over to our GitHub and pull up the PowerShell script. Each discrete step is commented, so I’ll just give a higher-level overview here.

In this example scenario, there’ll be a Production SQL Server (source) and a non-Production SQL Server (target). The source will reside on one VMFS datastore and the target will reside on a different datastore. In this example, the target will also reside on a different array, to showcase asynchronous snapshot replication.

Like our normal workflow, we will take a crash-consistent snapshot of the source datastore, then clone it into a new volume on the array. That new volume will then be attached/presented to the ESXi host where our target SQL Server VM resides. Next (after prepping the SQL and Windows layers), we will take the VMDK file(s) that contain the database(s) we want to refresh, and detach and discard them from the target VMFS datastore. Don’t panic! Remember, we don’t need them anymore because we’re refreshing them with newer cloned ones from our source! After discarding the old VMDKs, we then connect the VMDK files we want from the cloned datastore. Once that’s done, we can bring our disks and databases back online in Windows and SQL Server, and work can resume on the target SQL Server instance.
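If you just want to see the SQL-layer “bookends” of that workflow in isolation, here’s a minimal T-SQL sketch. The database name is hypothetical, and the actual detach/attach and Windows disk work in between is handled by the PowerShell script on our GitHub.

    -- Minimal sketch of the SQL-layer prep around the VMDK swap (hypothetical DB name).
    -- Before detaching the old VMDKs: take the database(s) being refreshed offline.
    ALTER DATABASE [ExampleUserDB] SET OFFLINE WITH ROLLBACK IMMEDIATE;

    -- ...detach/discard the old VMDKs, attach the cloned VMDKs, online the Windows disks...

    -- After the cloned VMDKs are attached and the Windows volumes are back online:
    ALTER DATABASE [ExampleUserDB] SET ONLINE;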

But we’re not quite done. At this point, the target SQL Server VM has two datastores attached to it: the original datastore it resided on, plus the cloned datastore holding the newly cloned VMDKs from our source. We don’t want to leave those VMDK files there. This is where we leverage VMware Storage vMotion! We’ll conduct an online Storage vMotion operation that’ll swing the VMDK files over from the cloned datastore to the primary datastore. Even better, VMware should leverage XCOPY under the covers, meaning the FlashArray won’t actually copy the VMDK files but will just update pointers, which’ll save a tremendous amount of time! Once the Storage vMotion is completed, we will go through the steps to disconnect and then discard the cloned datastore, because we no longer need it.

Voila, now we’re done!

From an operational standpoint, this will take longer than “just a few seconds,” mostly because of the additional VMware steps involved, like re-scanning storage HBAs (twice). But all in all, it should only take about a minute or so, end to end. That’s still a tremendous time savings versus copying a full backup file to your non-prod environment and then executing a traditional restore.

What about T-SQL Snapshot Backup?

Great question! T-SQL Snapshot Backup is one of my favorite features in SQL Server, so I also put together an example that leverages that capability with a VMFS datastore! You can find it in the Point in Time Recovery – VMFS folder! It’s fundamentally the same workflow as before, but with the extra VMFS/VMDK steps I outlined above.
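If you haven’t seen T-SQL Snapshot Backup before, the heart of it is just two statements wrapped around the array snapshot. Here’s a hedged, minimal sketch – the database name and backup path are hypothetical, and the full VMFS/VMDK orchestration lives in the repo.

    -- Freeze write I/O so the array snapshot is application-consistent.
    ALTER DATABASE [ExampleUserDB] SET SUSPEND_FOR_SNAPSHOT_BACKUP = ON;

    -- ...take the FlashArray snapshot of the underlying datastore volume here...

    -- Record the backup metadata; this also releases the I/O suspension.
    BACKUP DATABASE [ExampleUserDB]
    TO DISK = N'X:\backups\ExampleUserDB.bkm'
    WITH METADATA_ONLY, FORMAT;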

Thanks for reading – hope you enjoy your time saved by using FlashArray snapshots!

T-SQL Tuesday #193: What I Wish I Knew Back When…

T-SQL Tuesday Logo

Welcome back to another edition of T-SQL Tuesday. This month’s edition is hosted by Mike Walsh, who asks authors to write two short notes – one to your past self, and one to your future self.

A Note to My Past Self…

Or otherwise titled, learn from what I wish I knew then and know now.

Understand Business Value

And how to communicate it. This is the lesson that I wish I knew then, that I know now. As a technologist, especially my younger self, I never cared about the business side of things. I went to school for a computer science degree – not a business degree. Business folks are good at what they do, and I wanted to do what I wanted to do, which was play in tech.

But as we all know, oftentimes in our careers, to play with tech we need buy-in from the business. There are a lot of things we must justify, and we must understand the whys behind them.

  • Want to fix some code? Translate that to how it impacts the business.
  • Want to attend a conference or training? Understand how increased knowledge and skill will bring value to your business.

Along those lines, I also wish I knew and understood the value of writing a business plan. Or at least a business justification document. And there are several principles of communication that I wish I understood better… like focusing on solutions that drive to desired outcomes. Former me would have brushed off those terms as business-speak jargon, but nowadays I firmly believe in this mindset.

That’s it for now. Mike may have asked for a message to my future self, but instead Sebastian the dog is demanding his evening walk. 🙂

Until next time – thanks for reading!

SQL Server 2025 – Love for Standard Edition

Super excited that, as of today, SQL Server 2025 is now Generally Available!

Standard Edition Enhancements

While all of the new features were available during Public Preview, one thing that was not public until now is that Standard Edition limits have been increased! Yay! The CPU core count limit is now 32 cores (or 4 sockets, whichever is less), and the max buffer pool memory per instance is now 256GB! Additionally, Resource Governor is now available for Standard Edition. And in SQL Server 2025, Resource Governor can also help you manage TempDB!
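If you’ve never touched Resource Governor because it used to be Enterprise-only, here’s a minimal sketch of what a basic setup looks like – a pool, a workload group, and a classifier function. All of the names below are hypothetical; treat this as a starting point, not a tuning recommendation.

    USE master;
    GO
    -- Cap a hypothetical reporting workload at 30% CPU.
    CREATE RESOURCE POOL ReportingPool WITH (MAX_CPU_PERCENT = 30);
    CREATE WORKLOAD GROUP ReportingGroup USING ReportingPool;
    GO
    -- Classifier: route sessions from a hypothetical reporting login into that group.
    CREATE FUNCTION dbo.rg_classifier()
    RETURNS sysname
    WITH SCHEMABINDING
    AS
    BEGIN
        RETURN CASE WHEN SUSER_SNAME() = N'reporting_svc'
                    THEN N'ReportingGroup'
                    ELSE N'default'
               END;
    END;
    GO
    ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.rg_classifier);
    ALTER RESOURCE GOVERNOR RECONFIGURE;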

Why This Matters

I still encounter many organizations that rely on SQL Server Standard Edition. And many often delay upgrading SQL Server for one reason or another. In my opinion, the new hardware limits give organizations a much more compelling reason to consider 2025 as their next upgrade rather than going to 2022 next. Remember, if you don’t want any of the other new “fancy” 2025 features, you don’t have to use them and/or can turn them off via Database Scoped Configurations.

Check it Out Today

Developer download ISOs are available now. And oh yeah, don’t forget that you can now install Standard Developer edition to match your Standard edition prod environment too!

T-SQL Tuesday #192: SQL Server 2025!!!

T-SQL Tuesday Logo

Welcome back to another edition of T-SQL Tuesday! This month’s edition is hosted by Steve Jones, who is asking authors to share what they’re excited about in this new release.

Going Beyond Vector Search

Through the tight partnership between Pure Storage and Microsoft, I’ve had the privilege of “knowing what’s coming next” for quite a while now. It’s why I was able to get a jump start on Vector Search work for example. But that’s not what I’m going to talk about today.

At this point, Vector Search and AI may steal the spotlight, but there’s SO MUCH MORE in the new release! What I’m going to highlight may be the least “shiny” and “flashy” but I believe it’ll have broad impact – Availability Group enhancements.

AGs Under the Covers

In my role at Pure Storage, I’ve learned a tremendous amount about how AGs work under the covers. Pure offers storage array options for HA/DR replication, and I regularly have conversations with customers about the pros and cons of mixing and matching application-level options like AGs with storage array options.

One thing I never really fully understood, until I stumbled upon this blog about a year back, is the construct of Flow Control in AGs. In fact, if you’ve never checked out SQLTableTalk.com, take a moment to do so – they offer a tremendous amount of deep content about SQL Server (with the two original founders being Microsoft employees – RIP Yvonne).

But I digress. All of the new AG options, like max ucs send boxcars, are there to improve the performance, resiliency, and failover efficiency of Availability Groups. And I feel many DBAs do not fully appreciate how AGs really work under the covers, which is why I feel it is worthwhile to highlight these resources.
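If you want a quick, first-pass check of whether flow control has been throttling your own AGs, the cumulative wait stats are one place to look. A minimal sketch:

    -- Cumulative time this instance has spent throttled by AG flow control
    -- (database-level and transport-level) since the last restart or stats clear.
    SELECT wait_type,
           waiting_tasks_count,
           wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type IN (N'HADR_DATABASE_FLOW_CONTROL',
                        N'HADR_TRANSPORT_FLOW_CONTROL');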

Your Homework

If you use AGs today, I would encourage you to refresh your knowledge of how AGs communicate behind the scenes. Once you do, then look into the 2025 enhancements, and I believe you’ll come away with a better appreciation of how 2025 further improves Availability Groups.

TIL: How to Add Commas to Numeric Output

One thing that’s always driven me crazy is having large numbers in my resultsets with no commas for readability. For a lot of different things I do, the more commas a number has, the more attention I want to give to a given value.

[Images: query results with large, unformatted numbers]

When blown up like this, these are easier to read… but when I have hundreds of values that I want to skim, not so much. I’d much prefer my used_space to also display only 2 decimal places… it’s less noise, and I don’t need any more precision for what I’m doing. I did not feel like doing something like CAST(column AS DECIMAL(10, 2)) to force the decimal places, and it still would not get me the commas I wanted.

Out of annoyance, I decided to search to see if someone had a simple UDF out there that could solve my problems. And lo and behold, the heavens opened up as I got an AI-generated response in addition to my usual search results…

[Image: AI-generated search result suggesting the FORMAT() function]

Mind… blown!!!

I had sort of remembered when FORMAT() came out, but will confess that I never really bothered looking into it. I knew it could do date and time manipulation, but had no idea it could add comma separators AND limit decimal places!!!

[Images: the same results using FORMAT(), with comma separators]

Note ‘N2’ for two decimal places and ‘N0’ (zero) for no decimal places.
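Here’s a quick sketch of both in action (the table and column in the second query are hypothetical – substitute your own):

    -- Comma separators via the 'N' standard numeric format string.
    SELECT FORMAT(1234567.8912, 'N2') AS two_decimals,  -- 1,234,567.89
           FORMAT(1234567.8912, 'N0') AS no_decimals;   -- 1,234,568

    -- Against a real column (hypothetical names):
    -- SELECT FORMAT(used_space_mb, 'N2') AS used_space FROM dbo.VolumeSizing;

One thing to keep in mind: FORMAT() returns a string (nvarchar), so it’s best saved for the final display step rather than anything you need to sort or do math on.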

Moral of this story – it’s never too late to learn something new.

T-SQL Tuesday #189: Musings on AI’s Impact at Work

T-SQL Tuesday Logo

Welcome back to another edition of T-SQL Tuesday. This month’s blog party host is Taiob Ali, who asks, “How is AI changing our careers?”

Rewind a Year Ago…

Had you asked me this question a year ago, I would have said not at all. In the summer of 2024, I still looked at AI as a hyped-up fad technology. At that point, all I really knew of AI were stories of people using ChatGPT for research with flawed results, writing papers and blogs with hidden inaccuracies, and generating ridiculous images. To me, it was overblown hype and fluff.

What Changed Since Then?

Two things started to change my perspective. First was SQL Server 2025 – I got involved in Private Preview, and one of the flagship enhancements was Vector Search support. The second was Anthony Nocentino (b), who started sharing with me how he was making use of AI while coding. I had a number of “what, really, you can do that?” moments. He showed me how these tools were more than just “dumb chatbots” – they could take wider inputs like existing code, and one could iteratively work with them to modify and make changes.

The industry is not only changing, but changing TOO FAST.

I had the fortune of attending an AI and Data conference here in Boston where I heard that quote, and it really resonated with me. I think back to other shifts and evolutions in technology over the last 30 years… the rise of the Internet… smartphones… computing… and I would argue that none of them moved as fast as AI tech has evolved in the last handful of years. While I hadn’t kept tabs on it before, I am very much doing so now.

Where Am I Today?

If you’ve been tracking my blog, you’ll see that I’ve really dug into Vector Search on SQL Server 2025. My “Practical AI in SQL Server 2025: Ollama Quick Start” blog has been one of my most popular blogs ever! And I am super happy with my new presentation: A Practical Introduction to Vector Search in SQL Server 2025.

How Am I Using AI Day to Day?

I don’t spend my day-to-day living in code like I once did. But I do find myself having to dig deep and troubleshoot interesting things. Just the other day, I had a scenario where a customer was setting up a new WSFC cluster, but without shared storage, and they were confused as to why some nodes saw the presented volumes and others did not. We have a Gemini enterprise license, so I used the Pro model to do some deeper research on my behalf. It generated a very in-depth report that taught me a ton about how Windows recognizes presented volumes on a standalone install versus on a node in a cluster.

What was more important to me is that this Deep Research functionality provided over 40 different citations, enabling me to dig into the sources it used to further validate its findings. Hallucinations (or false output) are definitely a real possibility (and the “why” is a completely separate but fascinating topic), so having a more robust output, with citations that I can validate, is a turning point for me.

Like It Or Not, Change is Here…

There’s still a lot of hype and a lot of noise to cut through. And there are also very sharp edges and definite threats and dangers too. But I also strongly believe that we must look past the ragebait headlines and dig deeper into the component technology pieces themselves. I believe that we, as technologists, need to better understand each component so that we can utilize these tools in a responsible, ethical manner.

The other aspect that I am very mindful of is that as a data professional, what drives all of these tools? Data! I feel that we data professionals have an amazing opportunity to future-proof our career journey, if we embrace and build our expertise in this industry.

And here’s an AI-generated image of a dog, using the AI image generation built into WordPress. I still prefer real photos, but this ain’t too shabby.

Prompt: Generate a realistic image of a brown and white dog sitting a desk, with a laptop in front of him, facing the camera. The dog should be a chihuahua or labrador mix, with pointed ears. Behind the dog should be an array of books and bookshelves.

Practical AI in SQL Server 2025: A Vector Demo Database For You

Today, I have the honor and pleasure of debuting a new presentation for MSSQLTips: A Practical Introduction to Vector Search in SQL Server 2025 (you can watch the recording here too). To accompany that new presentation, I opted to create a new demo database instead of retrofitting one of my existing demo databases. And I’m sharing it with you so you don’t have to go through the headache of taking an existing database and creating vector embeddings.

RecipesDemoDB

Background about the Database

This new database was built with SQL Server 2025 CTP 2.1. Backed up using ZSTD (high) compression, it weighs in at around 16GB, striped across 8 backup files.

The dbo.recipes table contains just under 500k recipes and weighs in at about 2GB. This data was sourced from Kaggle and is a dump of recipes from food.com.

Next, there are other tables under the vectors schema that contain vector embeddings. The naming scheme is such that each table corresponds to the same-named column in dbo.recipes, e.g. dbo.recipes.description -> vectors.recipes_description. There is one table called recipes_other_cols, which is a JSON concatenation of some of the shorter columns from dbo.recipes – name, servings, and serving_size. Each of the vectors.* tables also has a vector index. All of the vector data is about 22-23GB, bringing the total database to about 24-25GB in full.

And finally, there are a few example stored procedures with KNN and ANN code examples. I would also suggest checking my Practical Intro to Vector Search repo, which has some other demo code.
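To give a feel for what the KNN side looks like, here’s a hedged sketch of a query against the demo database. The embedding column and join key names below are my shorthand assumptions – check the stored procedures in the database (and the repo) for the actual code – and @search_vector is assumed to already hold a 768-dimension embedding generated with nomic-embed-text via Ollama.

    DECLARE @search_vector VECTOR(768);  -- populate with the embedding of your search text

    SELECT TOP (10)
           r.name,
           VECTOR_DISTANCE('cosine', v.embedding, @search_vector) AS distance
    FROM   vectors.recipes_description AS v
    JOIN   dbo.recipes AS r
           ON r.recipe_id = v.recipe_id  -- hypothetical join key
    ORDER BY distance;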

You’ll still need to have Ollama set up and make a few changes to match your own environment. Make sure you use the same embedding model that I did (nomic-embed-text) so any vector embeddings you subsequently create will match.

There is also a sub-folder in the demo-dbs repo that has all of the different “steps” I took to create the various tables and generate the vector embeddings.

Why Should I Use this Database? Creating Vector Embeddings

I am running a Lenovo P14s with an NVIDIA GeForce 3080 GPU connected via Thunderbolt 3 to an external GPU enclosure. For the ~500k recipes and 5 or 6 embedding tables, the process took an entire weekend. I don’t have an exact time, because I’d kick off one table to process, come back later or the next day, validate the data, then run the next one. So yeah, it took a while – hence why I thought I’d share this database to save time for others.

Wrapping Up

If you decide to start using this demo database for your own learning and testing of vector search, I’d love to hear about it. And if you write any interesting demo code that you’d be willing to share, that’d be amazing as well! As always, please let me know if you run into any quirks or have any feedback.

Happy learning – thanks for reading!

Announcing Data Saturday Chicago 2026!!

On behalf of the team that brought you SQL Saturday Chicago, I am ecstatic to announce that we are returning as Data Saturday Chicago on March 14, 2026 at Harper College.



Sign up to receive E-mail Updates


How We Got Here

Chicago has a rich history of SQL Saturday events – NINE events since 2010!!!

  1. SQLSaturday #31 – April 17, 2010
  2. SQLSaturday #67 – March 26, 2011
  3. SQLSaturday #119 – May 19, 2012
  4. SQLSaturday #211 – April 13, 2013
  5. SQLSaturday #291 – April 26, 2014
  6. SQLSaturday #484 – March 5, 2016
  7. SQLSaturday #600 – March 11, 2017
  8. SQLSaturday #719 – March 17, 2018
  9. SQLSaturday #825 – March 23, 2019

And we were ready to celebrate our 10th SQL Saturday event, SQL Saturday #945, on March 21st of 2020. But well, we all know how that went straight to hell.

My Personal Journey

Writing this blog post took me down memory lane. I first attended SQL Saturday Chicago in 2012, #119. For #211 in 2013, I opted to volunteer and get involved. As chance would have it, I went from helping stuff attendee bags on Friday night, to being in charge of room monitors all day Saturday! And that started my journey as an official co-organizer, from 2014 onward.

After we got sunk in 2020, I moved to Boston in 2021. As such, I stepped down as co-leader of the Chicago Suburban SQL Server User Group and stepped away from SQL Saturday Chicago. The remaining team found it challenging to reorganize and start again. Frankly speaking, we all got our asses kicked by 2020. The landscape for in-person events is a shadow of its former self, and rebooting an event is a hell of a lot more complicated than you might think.

Why Am I Doing This Again (Remotely)?

The answer is simple: I love our data community and the people.

By the community… for the community.

This community has given so much to me and I am more than happy to keep giving back. And I believe that I speak for the Data Saturday Chicago team when I say that this is one of our core values. Speaking of whom…

Introducing the 2026 Organizing Team

I am honored to be a part of an amazing team of volunteers, working together to put on this event for you (in alphabetical order):

  • Andy Yun
  • Bill Lescher
  • Bob Pusateri
  • Brandon Leach
  • Dave Bland
  • Frank Gill
  • Jared Karney
  • Lowry Kozlowski
  • Rod Carlson
  • Ross Reed
  • Wendy Pastrick – who has literally been organizing SQL Sat Chicago since the very first!

Why Rebrand as Data Saturday?

Since its inception, SQL Saturday was all about SQL Server. But back then, SQL Server was a much smaller product. It was possible to learn the entirety of what SQL Server had to offer (who remembers Microsoft Certified Masters – MCMs?).

But today, like the sizes of our databases (and my waistline), SQL Server has grown and evolved. We decided, collectively as a team, to also evolve and rebrand under Data Saturdays. The Microsoft Data Platform will continue to be our primary focus, but nowadays that encompasses so much more than simply SQL Server.

What to Expect Next

We’ve accomplished the hardest task, which was to lock down a suitable venue and date. Next, we will focus on our website, getting a registration platform set up, and opening a Call for Speakers. Please be patient with us, as we are still a team of volunteers working on this in our spare time.

I hope you are as excited as we are. See you in March 2026!

P.S. Sign up to receive e-mail updates!

T-SQL Tuesday #188 – Growing the Next Generation

T-SQL Tuesday Logo

Welcome back to another edition of T-SQL Tuesday. This month’s topic is an inspirational one, hosted by John Sterrett. John would like us to blog about what we can do to help grow the data community.

Growing new speakers is a topic that I have done a lot with already, so I won’t beat that drum. Instead, what I will suggest is something that anyone can do:

TL;DR – Recognize, Ask, & Offer

Step 1: Recognize Talent and Potential in Others

If you’re reading this, you’re part of the data community. You may not speak or blog regularly, but nonetheless, you are a participant. As such, you can watch for and recognize others. Maybe you’re reading an interesting thread on the SQL Server sub-Reddit… maybe you’re watching a conversation on the SQL Slack channel… Look for statements by individuals that pique your interest.

Step 2: Ask or Invite Them to Contribute

You’ve noticed someone talking about something really interesting? Ask them to share more of what they know about that topic. Invite them to write something more in-depth about it. Being asked and invited to share more is validation for that individual. And I believe that each one of us should always be looking for opportunities to validate others and raise each other up.

Step 3: Offer to Help

Anyone can help anyone else.

Stop – re-read that first statement again.

You can help.

There’s a mental trap that many of us fall into (I used to be one of them), where we believe that we are not in a position to help because we do not believe we are an expert. You do not have to be an expert to help. You can help simply by encouraging someone to get their ideas out there and become more involved in the data community. Words of encouragement… or saying thank you for something you found useful… these are all ways to help.

One More Twist

This blog has been written from the angle of asking you, the reader, to recognize potential in someone else, encouraging you to ask them to contribute, and offering your own help to further grow them.

Are YOU, whoever is reading this right now, the one who has that potential?

Look in the mirror… the answer is yes. You can be a part of the next generation… and there are many out there who will help you. Don’t know where to find someone? Ask… ask me… you’re reading my blog right now… contact me.