Today, I have the honor and pleasure of debuting a new presentation for MSSQLTips: A Practical Introduction to Vector Search in SQL Server 2025 (you can watch the recording here too). To accompany that new presentation, I opted to create a new demo database instead of retrofitting one of my existing demo databases. And I’m sharing it with you so you don’t have to go through the headache of taking an existing database and creating vector embeddings.
RecipesDemoDB
- Download Link Here – GitHub (~16GB)
- Code to Reverse Engineer – GitHub
- Ollama Quick Start Guide
- Practical Intro to Vector Search Demo Code – GitHub
Background about the Database
This new database was built with SQL Server 2025 CTP 2.1. The backup uses ZSTD-high compression and weighs in at around 16GB, striped across 8 backup files.
The dbo.recipes table contains just under 500k recipes and weighs in at about 2GB. This data was sourced from Kaggle and is a dump of recipes from Food.com.
Next, there are other tables under the vectors schema that contain vector embeddings. Each of those tables is named after the dbo.recipes column it corresponds to, e.g. dbo.recipes.description -> vectors.recipes_description. The one exception is recipes_other_cols, which is built from a JSON concatenation of some of the shorter columns from dbo.recipes: name, servings, and serving_size. Each of the vectors.* tables also has a vector index. All of the vector data comes to about 22-23GB, bringing the total database to about 24-25GB.
And finally, there are a few example stored procedures with KNN (exact k-nearest-neighbor) and ANN (approximate nearest neighbor) code examples. I would also suggest checking out my Practical Intro to Vector Search repo, which has some other demo code.
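If the KNN vs. ANN distinction is new to you, here is a minimal Python sketch of what exact KNN does conceptually: measure the cosine distance from the query vector to every stored vector, sort, and keep the closest k. ANN skips the full scan by consulting a vector index, trading a little accuracy for speed. This is not the code from the stored procedures; the function names and the tiny 3-dimensional vectors are made up purely for illustration.

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity (0 means same direction).

    Assumes neither vector is all zeros.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def knn(query, rows, k):
    """Exact KNN: score every row against the query, sort, keep the k closest."""
    scored = [(cosine_distance(query, vec), row_id) for row_id, vec in rows]
    scored.sort()
    return [row_id for _, row_id in scored[:k]]

# Hypothetical (id, embedding) pairs standing in for rows of a vectors.* table.
rows = [
    (1, [1.0, 0.0, 0.0]),
    (2, [0.9, 0.1, 0.0]),
    (3, [0.0, 1.0, 0.0]),
    (4, [0.0, 0.0, 1.0]),
]

nearest = knn([1.0, 0.05, 0.0], rows, k=2)
print(nearest)
```

The full scan is what makes exact KNN expensive at ~500k rows per table, and it is the reason the vector indexes exist.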
You’ll still need to have Ollama set up and make a few changes to match your own environment. Make sure you use the same embedding model that I did (nomic-embed-text) so that any vector embeddings you subsequently create are comparable with the ones already in the database.
There is also a sub-folder on the demo-dbs repo that has all of the different “steps” that I took to create the various tables and generate the vector embeddings.
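As a rough illustration of what "using the same embedding model" looks like outside of SQL Server, here is a small Python sketch that asks a local Ollama instance for an embedding. It is not how the embeddings in this database were generated. It assumes Ollama is listening on its default local port (11434) and exposes the /api/embed endpoint, and the helper names build_embed_request and embed are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/embed"
# Same embedding model that was used to build RecipesDemoDB.
MODEL = "nomic-embed-text"

def build_embed_request(text, model=MODEL, url=OLLAMA_URL):
    """Build (but do not send) the HTTP request for one embedding."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(text):
    """Send the request and return the embedding as a list of floats."""
    with urllib.request.urlopen(build_embed_request(text)) as resp:
        body = json.load(resp)
    # /api/embed returns one embedding per input under "embeddings".
    return body["embeddings"][0]

# Usage (needs Ollama running with the model already pulled):
#   vec = embed("chocolate chip cookies")
#   print(len(vec))
```

Vectors from a different model will generally not be comparable with the stored ones, even if the dimensions happen to match, so keep the model name consistent everywhere.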
Why Should I Use this Database? Creating Vector Embeddings
I am running a Lenovo P14s with an Nvidia GeForce 3080 GPU connected via Thunderbolt 3 (TBT3) to an external GPU housing. For the ~500k recipes and 5 or 6 embedding tables, the whole process took an entire weekend. I don’t have an exact time, because I’d kick off one table to process, come back later or the next day, validate the data, and then run the next one. So yeah, it took a while, which is why I thought I’d share this database to save time for others.
Wrapping Up
If you decide to start using this demo database for your own learning and testing of vector search, I’d love to hear about it. And if you write any interesting demo code that you’d be willing to share, that’d be amazing as well! As always, please let me know if you run into any quirks or have any feedback.
Happy learning – thanks for reading!

