
I’ve been focusing on evaluation frameworks lately because I believe the hardest problem when working with LLMs is making sure they behave properly. Are you getting the right outputs, grounded in your data? Are outputs free of harmful content and PII? When you change your RAG pipeline or your prompts, do outputs get better or worse? How do you know? You don’t know unless you measure. What do you measure, and how? These are the sorts of questions you need to answer, and that’s where evaluation frameworks come into the picture.
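To make “measure” a little more concrete, here is a minimal sketch of a regression-style check: run the same small eval set through two versions of a pipeline and compare scores. Everything here is hypothetical and simplified — `run_pipeline_v1`, `run_pipeline_v2`, the tiny eval set, and the crude substring scoring are placeholders for whatever pipeline and metrics you actually use.

```python
# Minimal sketch: compare two pipeline/prompt versions on the same eval set.
# The pipeline functions below are stand-ins for real RAG pipeline calls.

eval_set = [
    {"question": "What is the capital of France?", "expected": "Paris"},
    {"question": "Who wrote Hamlet?", "expected": "William Shakespeare"},
]

def run_pipeline_v1(question: str) -> str:
    # Placeholder: call your current pipeline / prompt here.
    return "Paris" if "France" in question else "Shakespeare"

def run_pipeline_v2(question: str) -> str:
    # Placeholder: call the changed pipeline / new prompt here.
    return "Paris" if "France" in question else "William Shakespeare"

def score(answer: str, expected: str) -> float:
    # Crude substring match; real evals use richer metrics
    # (groundedness checks, PII scanners, LLM-as-judge, etc.).
    return 1.0 if expected.lower() in answer.lower() else 0.0

def evaluate(pipeline) -> float:
    results = [score(pipeline(ex["question"]), ex["expected"]) for ex in eval_set]
    return sum(results) / len(results)

print(f"v1 score: {evaluate(run_pipeline_v1):.2f}")
print(f"v2 score: {evaluate(run_pipeline_v2):.2f}")
```

Even something this small answers the “better or worse?” question with a number instead of a gut feeling, which is the core idea an evaluation framework builds on.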