<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Teradata - Medium]]></title>
        <description><![CDATA[Technical articles for developers using the Teradata Vantage Platform - Medium]]></description>
        <link>https://medium.com/teradata?source=rss----8adfdb056496---4</link>
        <image>
            <url>https://cdn-images-1.medium.com/proxy/1*TGH72Nnw24QL3iV9IOm4VA.png</url>
            <title>Teradata - Medium</title>
            <link>https://medium.com/teradata?source=rss----8adfdb056496---4</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Thu, 07 May 2026 20:31:45 GMT</lastBuildDate>
        <atom:link href="https://medium.com/feed/teradata" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Build AI, Not Infrastructure: Inside Teradata’s Autonomous Knowledge Platform]]></title>
            <link>https://medium.com/teradata/build-ai-not-infrastructure-inside-teradatas-autonomous-knowledge-platform-01003812eb00?source=rss----8adfdb056496---4</link>
            <guid isPermaLink="false">https://medium.com/p/01003812eb00</guid>
            <category><![CDATA[analytics]]></category>
            <category><![CDATA[llm]]></category>
            <category><![CDATA[ai]]></category>
            <category><![CDATA[data-science]]></category>
            <category><![CDATA[agents]]></category>
            <dc:creator><![CDATA[Janeth Graziani]]></dc:creator>
            <pubDate>Thu, 07 May 2026 15:29:22 GMT</pubDate>
            <atom:updated>2026-05-07T15:29:22.379Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/690/1*Sy1oeOUpojJ499sqOjC_YQ.png" /></figure><p>There’s a hidden tax you, as a data scientist or ML engineer, pay every day when working in notebooks, and it starts before the work even begins.</p><p>You spin up a notebook and wait. Five minutes. Ten. Sometimes longer, just to import a library. Other times the environment is already running, but now you’re guessing compute sizes, watching costs creep up, and hoping your experiment doesn’t trigger a FinOps alert.</p><p>And when you need something like GPU access, production data, or updated permissions, it turns into a ticket. Then a sprint. Sometimes two. By the time everything is ready, the momentum is gone. The idea you had at the start of the week? Buried under infrastructure friction.</p><p>This isn’t just an inconvenience. It’s one of the biggest reasons AI projects stall between pilot and production.</p><p>Developers and teams can get models working in notebooks. They can connect data, run experiments, and even show early results. But the moment they try to scale — when workloads become unpredictable, concurrency increases, costs need to be controlled, and production SLAs need to be met — the project starts to slow down.</p><p>What’s missing isn’t another tool. It’s an autonomous AI platform that can anticipate demand, manage resources, and enforce business SLAs. 
That’s what the Teradata Autonomous Knowledge Platform is designed to do.</p><h4><strong>Core capabilities of the Autonomous Knowledge Platform</strong></h4><p>At its core, the Autonomous Knowledge Platform provides three foundational capabilities:</p><ul><li><strong>Autonomous Tera agent</strong> execution that continuously optimizes performance, cost, and scale across production workloads</li><li>A compute layer with always‑on <strong>active compute</strong> for mission‑critical and agentic workloads, alongside <strong>elastic compute</strong> for on‑demand workloads</li><li>A <strong>connected data foundation</strong> that brings together low-latency local storage and cost-optimized object stores under a single architecture, with support for open table formats and Enterprise Vector Store</li></ul><p>These capabilities operate continuously, and AI Studio is where developers experience them in practice. Let’s explore the platform, Tera agents, and notebooks via AI Studio.</p><h4><strong>AI Studio: the developer workspace for building and scaling AI</strong></h4><p>AI Studio is the unified workspace where developers build, manage, and scale AI outcomes on the Autonomous Knowledge Platform.</p><figure><img alt="AI Studio unified workspace on the Autonomous Knowledge Platform for notebooks, ModelOps, and ModelHub" src="https://cdn-images-1.medium.com/max/1024/0*2Am64E-yvBY8xjqs" /><figcaption>AI Studio: where developers build, manage, and scale outcomes on the Autonomous Knowledge Platform</figcaption></figure><p>It isn’t a replacement for existing Teradata environments. 
Instead, it runs on top of your Teradata systems without requiring data migration or environment re-creation.</p><p>From AI Studio, developers work with integrated:</p><ul><li><strong>Notebooks</strong> for SQL, Python, and analytics workflow experimentation, development, and collaboration with team members</li><li><strong>ModelHub </strong>to access, monitor, and manage production‑ready models (including embedding and chat models) with visibility into token and cost usage</li><li><strong>ModelOps </strong>to create, manage, and run models at scale</li><li><strong>Vector Store </strong>to store and manage data as vector embeddings for search integration and RAG applications</li><li><strong>Built</strong>‑<strong>in Tera agents</strong> that perform a range of tasks from data analysis to continuously managing infrastructure</li></ul><p>Everything we’ll explore below happens inside AI Studio, running directly on the Autonomous Knowledge Platform.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/318/0*SBHZDByNFnb7SDaf" /><figcaption>A single workspace for notebooks, models, vectors, and agents</figcaption></figure><h4><strong>Tera and autonomous AI agents for production workloads</strong></h4><p>Tera is Teradata’s autonomous AI-powered workspace, serving as the natural language interface with enterprise-grade agent execution environments.</p><p>You can ask Tera to:</p><ul><li>Get information about your system or environment</li><li>Retrieve and analyze schema information, table structures, and column definitions</li><li>Assist in identifying data suitable for visual representation</li></ul><p>Tera operates under existing Teradata permissions, ensuring users only see data they are authorized to access.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1004/0*dUNaWg2_zEjeHQAr" /><figcaption>Tera is a natural language interface with enterprise-grade agent execution environments.</figcaption></figure><p>Tera includes built-in modes for data 
analysis with Tera Analyze, coding with Tera Code, and multi-agent system automation and orchestration with Tera Claw.</p><h4><strong>How Tera agents automate scaling, cost, and governance</strong></h4><p>Tera agents operate across the spectrum of autonomy from deterministic automation to policy‑governed autonomous actions, within a secure agent harness and runtime.</p><figure><img alt="Tera AI interface showing a prompt about scaling a healthcare claims analytics workload with cost, audit, and performance requirements." src="https://cdn-images-1.medium.com/max/1024/0*5RkUtmL5hf3y3BdQ" /><figcaption>Tera provides a natural-language interface for governed execution and agentic workflows.</figcaption></figure><p>For example, a healthcare claims team runs conversational analytics over three years of data. The pilot proved value. Users love it! Now the challenge begins when moving the pilot into production.</p><p>Traditionally, moving to production means:</p><ul><li>Refactoring notebook‑based code and models to run repeatedly, autonomously, and at scale</li><li>Capacity planning meetings</li><li>Cost modeling sessions</li><li>Tickets for elastic clusters</li><li>Manual tuning as concurrency spikes</li></ul><p>With the Autonomous Knowledge Platform, an administrator expresses their intent to Tera in plain language and the agents handle that work automatically.</p><p><em>“We’re expecting ~300 users for claims and provider data access via MCP, with workload demand varying throughout the day. Monthly compute spend must be under 8,000 units. Enforce access logging on all queries for later audit. 
Aim for query response under 5 seconds.”</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*YDgomWGLWl-QghxY" /><figcaption>Define intent (users, cost ceiling, SLAs); the autonomous knowledge platform proposes a deployment plan.</figcaption></figure><p>Once the administrator approves the proposed deployment, the agents:</p><ul><li>Analyze pilot data around usage, performance, and cost</li><li>Identify usage patterns and inefficiencies</li><li>Recommend elastic compute configurations</li><li>Show projected cost, performance, and impact up front</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*Ifjuf6ob4Kb2_GhD" /><figcaption>Autonomous agents surface tradeoffs up front: projected cost, performance, and impact.</figcaption></figure><p>Changes that fall within predefined guardrails are executed automatically. Larger or riskier changes wait for human approval.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/954/0*NftZw8InSZ4dxook" /><figcaption>Built-in guardrails help an autonomous knowledge platform automate safely</figcaption></figure><p>Every action is logged. Every decision is auditable.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*XPPIeb6U-TQL5Uyg" /><figcaption>Real-time visibility into auto-scaling, usage, and cost — managed automatically by Tera agents within defined guardrails.</figcaption></figure><p>What this means for you is simple:</p><ul><li>You don’t wait for infrastructure</li><li>You don’t tune clusters</li><li>You don’t negotiate for resources</li><li>You don’t move data out of the platform</li></ul><p>Tera agents give you the ease, speed, and flexibility developers love, while delivering the governance, cost control, and predictability enterprises demand.</p><p>Tera agents remove infrastructure pain points for productionizing AI workloads. Notebooks continue to be where developers explore data, build models, and iterate ideas during development. 
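</p><p>To make the guardrail split concrete, here is a deliberately simplified sketch in Python. The names, thresholds, and routing rules are invented for illustration; this is not Tera’s actual policy engine, only the shape of the automate-or-escalate decision described above (the 8,000-unit ceiling echoes the administrator’s stated intent).</p>

```python
# Hypothetical sketch of guardrail-based change routing. Not Tera's actual
# implementation -- names and thresholds are invented for illustration.
from dataclasses import dataclass

@dataclass
class Guardrails:
    max_monthly_spend: float       # cost ceiling from the administrator's intent
    auto_apply_spend_delta: float  # largest cost increase agents may apply alone

def route_change(current_spend: float, proposed_spend: float, g: Guardrails) -> str:
    """Decide whether a proposed scaling change runs automatically or waits."""
    if proposed_spend > g.max_monthly_spend:
        return "rejected"                  # would break the cost ceiling
    if proposed_spend - current_spend <= g.auto_apply_spend_delta:
        return "auto-applied"              # small change, within guardrails
    return "pending human approval"        # larger change, escalate

g = Guardrails(max_monthly_spend=8000, auto_apply_spend_delta=500)
print(route_change(6000, 6300, g))  # small bump -> auto-applied
print(route_change(6000, 7500, g))  # large bump -> pending human approval
print(route_change(6000, 9000, g))  # over the ceiling -> rejected
```

<p>Small, low-risk changes flow straight through; anything bigger, or anything that would break the cost ceiling, is escalated for a human decision.</p><p>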
Agents take over when those ideas move toward production. Once a workflow proves value, platform-level agents handle deployment, scaling, cost controls, and governance, using real telemetry and human-defined intent.</p><h4><strong>Running AI Workloads at Scale with Notebooks + In-Database AI</strong></h4><p>Let’s explore how we build AI solutions with Notebooks in AI Studio. From the <strong>Notebooks </strong>tab in AI Studio, we can start a session and select our desired compute resources. Compute pools are configurable by each organization and can scale to meet the performance, cost, and workload requirements of the business. For example, in my environment I can select from:</p><ul><li><strong>General Compute: </strong>for data exploration, analytics, and in-database functions. This profile runs on up to five nodes with 16 vCPUs and 64 GB of RAM.</li><li><strong>High Memory:</strong> for large datasets and memory‑intensive workloads that require substantial in‑memory processing. This profile contains up to three nodes with 64 vCPUs and 512 GB of RAM.</li></ul><p>In this example, I’ve selected <strong>General Compute</strong> to demonstrate text analytics with native LLMs in ModelHub.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*VQTf4_7yOEM0XLH-" /><figcaption>Start a notebook session and select the right compute pool for your workload.</figcaption></figure><p>Choose a notebook and kernel and begin running Python against data in Teradata.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*Ip3BUCnA1JJPdDOJ" /><figcaption>Run Python/SQL where the data lives without moving data out of the platform</figcaption></figure><p>Below is an example of using Python to run complex text analytic workflows using Teradata’s generative AI package and production-ready LLMs and embedding models.</p><p>First, we establish a connection to the Teradata system.</p><pre> from teradataml import create_context<br><br># ── Connect to Vantage 
────────────────────────────────────────────────────────<br>eng = create_context(<br>host=&#39;{db_host}&#39;, # Replace with your database host<br>username=&#39;{db_username}&#39;, # Replace with your database username<br>password=&#39;{db_password}&#39;, # Replace with your database password<br>logmech=&#39;TD2&#39;)</pre><p>We load sample product review data into Teradata with the copy_to_sql() method, which converts a pandas DataFrame into a Teradata table.</p><pre>import pandas as pd<br>from teradataml import copy_to_sql, DataFrame<br><br># ── Sample data: customer feedback with PII embedded in text ──────────────────<br>feedback_pd = pd.DataFrame({<br>&#39;review_id&#39;: [1, 2, 3, 4, 5],<br>&#39;product&#39;: [&#39;SmartWatch X1&#39;, &#39;SmartWatch X1&#39;, &#39;Laptop Pro 15&#39;, &#39;Laptop Pro 15&#39;, &#39;EarBuds Z&#39;],<br>&#39;review&#39;: [<br>&quot;Hi, this is Emily Carter from San Francisco. I love the battery life on the SmartWatch X1 — it lasts nearly 5 days on a single charge. You can reach me at emily.carter@gmail.com if you have questions.&quot;,<br>&quot;I’m Michael Rodriguez in San Jose. The watch looks great, but the strap broke after two weeks. Support asked me to call 408-555-0147, but I haven’t been able to get through yet.&quot;,<br>&quot;Feedback from Sarah Nguyen: the Laptop Pro 15 has blazing-fast performance and a stunning display. My receipt was emailed to snguyen@outlook.com.&quot;,<br>&quot;This is David Thompson (dthompson@company.com). I use this laptop for heavy development workloads, and it gets very hot. The fan noise is pretty distracting during long builds.&quot;,<br>&quot;Jessica Lee here. The EarBuds Z have crystal-clear audio and excellent noise cancellation. 
Feel free to text me at 310-555-0168 if you’d like more detailed feedback.&quot;<br>]<br>})<br><br>copy_to_sql(<br>feedback_pd,<br>table_name=&#39;customer_feedback&#39;,<br>if_exists=&#39;replace&#39;,<br>index=False<br>)<br><br>df = DataFrame(&#39;customer_feedback&#39;)<br>print(&#39;Sample data loaded into Vantage:&#39;)<br>df</pre><p>Here’s a preview of the data we just loaded, using a TeradataML DataFrame. This DataFrame represents a structured dataset that now resides on our analytic platform.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*d8ncc1gSzjYqIZXZ" /><figcaption>Customer review data loaded into Teradata, ready for large-scale text analytics and PII masking.</figcaption></figure><p>A TeradataML DataFrame can reference a table, a view, or even a complex query spanning open table formats and object storage. These datasets may range from thousands to billions of rows and often represent data joined across hundreds of tables.</p><p>While interacting with a DataFrame feels local, all computation is executed on Teradata’s massively parallel analytic cluster — enabling fast, scalable operations on data at any size.</p><p>We can now demonstrate how to run large-scale text analytics like masking sensitive data, analyzing sentiment, extracting key phrases, translating text, and generating summaries using models available in ModelHub.</p><p>Let’s start by configuring our connection to ModelHub and selecting the LLM we want to run tests with.</p><p>To begin running natural language processing operations with LLMs, we need a ModelHub endpoint, ModelHub key, and model name.</p><pre>from teradatagenai import TeradataAI<br>from teradatagenai.text_analytics import TextAnalyticsAI<br>from teradataml import DataFrame<br><br># ── Configuration ─────────────────────────────────────────────────────────────<br># Replace with the endpoint URL and API key copied from AI Modelhub<br>MODELHUB_ENDPOINT = &#39;{your_endpoint_url}&#39; # e.g., 
&#39;https://your_instance.teradata.com/one-td/litellm/v1/chat/completions&#39;<br>MODELHUB_API_KEY = &#39;{your_api_key}&#39; # Replace with your API key<br>MODEL_NAME = &#39;{model_name}&#39; # as shown in Modelhub<br><br>llm = TeradataAI(<br>api_type=&quot;nim&quot;,<br>model_name=MODEL_NAME,<br>api_base=MODELHUB_ENDPOINT,<br>api_key=MODELHUB_API_KEY<br>)<br><br>analytics = TextAnalyticsAI(llm=llm)</pre><p>Select the <strong>ModelOps </strong>tab from the left bar menu.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/318/0*OV6aq4Rt18UFxJgG" /><figcaption>Use ModelOps to move from notebook experimentation to governed deployment</figcaption></figure><p>This will open ModelOps in a new tab. ModelOps is Teradata’s model lifecycle management capability, enabling data scientists and ML engineers to train, evaluate, deploy, monitor, and retrain task‑specific ML models. Models developed in notebooks can move seamlessly into production with unified governance, observability, and auditability. 
The same lifecycle controls apply to LLMs, which is why ModelOps integrates seamlessly with AI ModelHub to manage both traditional ML and generative AI within a single platform.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*uotzOxwrC6A1UfJP" /><figcaption>Operationalize and manage machine learning models at scale with ModelOps and integrated AI Model Hub.</figcaption></figure><p>Select <strong>Open </strong>in the bottom fold to explore the AI model hub, and enter your virtual API key to view the models your user has access to.</p><p><strong>AI ModelHub</strong></p><p>The AI ModelHub in AI Studio is a catalog of production-ready AI models, including LLMs, embedding models, and domain-specific models deployed and served within your Teradata environment as well as models from cloud service providers.</p><p>These models are exposed via LiteLLM and are designed to be accessed from Python using the teradatagenai package.</p><p>These same model endpoints can be reused for generative and agent-driven workflows. Both built-in Tera agents and customer-defined agents can invoke these endpoints, enabling centralized management, consistent access policies, and visibility into model usage across the platform.</p><p>This design keeps all inference inside your secure environment. No data leaves the platform, and no external API keys or internet access is required.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*m5-_Tq0X-liOwa_8" /><figcaption>Browse and manage production-ready LLMs from Model Hub with centralized access, governance, and usage visibility.</figcaption></figure><p>Select a model card to view the required model name and the endpoint in the overview page.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*oh5P3dQBtwx21wf3" /><figcaption>ModelHub model card details</figcaption></figure><p>With the proper credentials, we can now run our operation using teradatagenai. 
teradatagenai provides TextAnalyticsAI, a higher-level class that wraps the LLM to run text analytics operations directly on Teradata DataFrames. These operations combine the language model with your data at scale, pushing results back into Teradata without moving data to the notebook.</p><p>Supported operations include:</p><ul><li>analyze_sentiment() — classify emotional tone as positive, negative, or neutral</li><li>extract_key_phrases() — identify the most important terms in each text</li><li>summarize() — condense long text into a concise summary</li><li>translate() — convert text between languages</li><li>mask_pii() — redact personally identifiable information</li></ul><pre>masked_data = analytics.mask_pii(<br>data = df,<br>column = &#39;review&#39;,<br>id_col = &#39;review_id&#39;<br>)<br>masked_data</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*ZDTCRtHTFuG4MIie" /><figcaption>Run governed text analytics in-database and return results as tables.</figcaption></figure><p>Because this runs in-database, the results come back as a table, just like other analytic results.</p><p>The important point is that we executed an LLM inside Teradata, against enterprise data — without leaving the platform, and with full security. This is an example of a first-class AI capability embedded in your data platform.</p><p>This is what it means to build AI in production without managing infrastructure. 
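</p><p>For intuition only, here is what a masked review looks like, sketched locally with regular expressions. This is not how mask_pii() works internally (the real operation runs an LLM in-database); the patterns below are a rough stand-in that only illustrates the input/output shape on text like the sample reviews above.</p>

```python
# Rough local analogue of PII masking -- for illustration only. The real
# mask_pii() operation runs an LLM inside Teradata; these regexes are a
# simplified stand-in that catches only email addresses and US-style phones.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]*\w")
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")

def mask_pii_locally(text: str) -> str:
    """Replace emails and phone numbers with redaction tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

review = ("Support asked me to call 408-555-0147; my receipt was "
          "emailed to snguyen@outlook.com.")
print(mask_pii_locally(review))
# -> Support asked me to call [PHONE]; my receipt was emailed to [EMAIL].
```

<p>In the real workflow this transformation happens inside Teradata at table scale, and the masked text lands in a result table rather than a local string.</p><p>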
Developers work in notebooks as usual, while Tera agents handle execution, optimization, and governance when it’s time to take your workloads into production.</p><p>Request a demo of AI Studio by visiting <a href="https://www.teradata.com/about-us/contact">https://www.teradata.com/about-us/contact</a>.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=01003812eb00" width="1" height="1" alt=""><hr><p><a href="https://medium.com/teradata/build-ai-not-infrastructure-inside-teradatas-autonomous-knowledge-platform-01003812eb00">Build AI, Not Infrastructure: Inside Teradata’s Autonomous Knowledge Platform</a> was originally published in <a href="https://medium.com/teradata">Teradata</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Building Smarter AI Agents for Data Science Workflows at Scale]]></title>
            <link>https://medium.com/teradata/building-smarter-ai-agents-for-data-science-workflows-at-scale-174fd51bf66b?source=rss----8adfdb056496---4</link>
            <guid isPermaLink="false">https://medium.com/p/174fd51bf66b</guid>
            <category><![CDATA[mcp-server]]></category>
            <category><![CDATA[agentic-ai]]></category>
            <category><![CDATA[skills]]></category>
            <category><![CDATA[artificial-intelligence]]></category>
            <category><![CDATA[data-science]]></category>
            <dc:creator><![CDATA[Janeth Graziani]]></dc:creator>
            <pubDate>Fri, 01 May 2026 19:15:25 GMT</pubDate>
            <atom:updated>2026-05-01T19:15:24.591Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="Screenshot showing an AI agent using tdsql MCP to guide in‑database time series analysis, with TD ARIMA model diagnostics generated alongside a dashboard forecasting air passenger demand at scale." src="https://cdn-images-1.medium.com/max/1024/1*1TJvRehCqU5SqRoBTLF4vA.png" /></figure><h3>Building Smarter AI Agents for Data Science Workflows on Enterprise Data Platforms</h3><p>How do we get AI agents to actually operate the right way at scale across <a href="https://www.teradata.com/platform">data platforms</a>, whether that’s a lakehouse, data lake, or data warehouse?</p><p>If you’ve tried connecting large language models, custom agents, or agentic systems to your data platform, you’ve probably already run into a few familiar problems. The agent generates SQL that <em>technically</em> runs, but it’s wildly inefficient. It pulls millions or billions of rows out of the database just to do local processing instead of using the platform’s <a href="https://www.teradata.com/platform/clearscape-analytics/in-database">in-database processing engine</a>. Or worse, it confidently hallucinates platform-specific functions and results that were never computed.</p><p>So the challenge becomes:</p><p><strong>How do we force AI systems and agents to work the <em>right way</em> on specific data platforms at scale?</strong></p><p>That’s exactly what Kevin Sturgeon, Director of Cloud Engineering at Teradata, set out to solve in his latest work. 
In our recent <a href="https://youtu.be/ecAdqImEH3U?si=5yKfO0LjJ3lckHWn&amp;t=10">Teradata DevTalk</a>, Kevin walked through how he designed an <strong>open‑source MCP server</strong> with a <strong>Skills‑based architecture</strong>, and demonstrated how context‑aware agents using <a href="https://docs.claude-mem.ai/progressive-disclosure">progressive disclosure</a> can improve data science and analytics workflows, speed up onboarding for new data scientists to a specific platform, and reduce costly inefficiencies at scale. More specifically, Kevin used MCP as a transport layer and repackaged Teradata expertise (Teradata documentation, best practices, and SQL patterns) as injectable skills, creating a system where agents learn <em>how to work correctly</em> on a platform without prompt overload, filesystem coupling, or framework lock‑in.</p><iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2FecAdqImEH3U%3Ffeature%3Doembed%26start%3D10&amp;display_name=YouTube&amp;url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DecAdqImEH3U&amp;image=https%3A%2F%2Fi.ytimg.com%2Fvi%2FecAdqImEH3U%2Fhqdefault.jpg&amp;type=text%2Fhtml&amp;schema=youtube" width="854" height="480" frameborder="0" scrolling="no"><a href="https://medium.com/media/09a2011c23ef26b5c28782ee661ff70e/href">https://medium.com/media/09a2011c23ef26b5c28782ee661ff70e/href</a></iframe><p>This post is a quick recap of what we covered in our DevTalk. 
All of the projects discussed here are available as open source for anyone to explore, and the MCP + Skills architecture itself is not tied to Teradata; it can be applied to any data platform.</p><ul><li><strong>GitHub Repo: tdsql MCP Server:</strong> <a href="https://github.com/ksturgeon-td/tdsql-mcp/blob/main/README.md">https://github.com/ksturgeon-td/tdsql-mcp/blob/main/README.md</a></li><li><strong>GitHub Repo: LangGraph tdsql Agent:</strong> <a href="https://github.com/ksturgeon-td/tdsql-agent">https://github.com/ksturgeon-td/tdsql-agent</a></li><li><strong>Teradata Free Trial:</strong> <a href="https://www.teradata.com/getting-started/demos/clearscape-analytics">https://www.teradata.com/getting-started/demos/clearscape-analytics</a></li></ul><h4>The Core Problem with Today’s Agentic AI Workflows</h4><p>Modern LLMs are incredibly good at reasoning. They understand <a href="https://www.teradata.com/insights/data-analytics/descriptive-statistics">statistical concepts</a>, <a href="https://www.teradata.com/insights/ai-and-machine-learning/machine-learning-models">machine learning algorithms</a>, and <a href="https://www.teradata.com/insights/ai-and-machine-learning/building-agentic-workflows-and-systems">analytical workflows</a>. They know what ARIMA models are. They know how feature engineering works. They know how to evaluate models.</p><p>What they <strong>don’t</strong> know is how to do those things <em>well</em> on your data platform.</p><p>When you connect an agent to a database without enough context, a few things tend to happen:</p><ul><li>The agent writes generic or inefficient SQL</li><li>It defaults to client-side processing (pulling data into notebooks or local environments)</li><li>It guesses at functions that don’t exist</li><li>It makes up accuracy metrics or results</li></ul><p>This isn’t a Teradata-specific issue; it’s an industry-wide problem. 
Agents aren’t incentivized to minimize data movement, reduce cost, or use platform-native functions unless we explicitly guide them.</p><p>And at scale, those mistakes get expensive fast.</p><h4>Letting Agents Do What They’re Good At (and Platforms Do the Rest)</h4><p>One of the themes Kevin kept coming back to is:</p><blockquote>Let agents reason.<br>Let data platforms execute.</blockquote><p>Teradata has spent decades solving a problem that many open-source ecosystems still struggle with: <strong>how to operationalize complex analytics at speed and massive scale</strong>.</p><p><a href="https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/Database-Unbounded-Array-Framework-Time-Series-Functions/Series-Forecasting-Functions">Time series forecasting</a>, <a href="https://www.youtube.com/watch?v=Claa4iXHr0U">geospatial analysis</a>, <a href="https://www.teradata.com/insights/ai-and-machine-learning/machine-learning-models">machine learning</a>, <a href="https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/Database-Engine-20-In-Database-Analytic-Functions/Model-Evaluation-Functions">model evaluation</a>: none of these are new problems. 
Teradata already has <strong>hundreds of native, </strong><a href="https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/Database-Analytic-Functions"><strong>highly optimized analytical functions</strong></a> designed to run directly where the data lives.</p><p>The challenge is helping agents <em>discover</em> and <em>use</em> those capabilities without stuffing the entire <a href="https://docs.teradata.com/">Teradata documentation</a> into a prompt.</p><h4>Progressive Context for AI Agents (Not Prompt Overload)</h4><p>One solution people have been experimenting with is “skills”: text-based instructions that tell an agent <em>how</em> to do things correctly.</p><p>That works… up to a point.</p><p>Skills tend to be:</p><ul><li>File-system based</li><li>Hard to reuse across tools</li><li>Limited to certain clients (like Claude Desktop)</li></ul><p>Kevin identified the need for an additional architectural layer that helps large language models retrieve <em>the right</em> information about how to work on a platform <em>when</em> they need it, rather than relying solely on static instructions.</p><p>That layer is the <strong>Model Context Protocol (MCP)</strong>, an open‑source standard introduced by Anthropic in late 2024.</p><p>MCP allows agents to dynamically request tools, documentation, and behavioral guidance as they work, instead of carrying all context upfront in a single prompt. 
You can think of it as <strong>progressive disclosure for agent intelligence</strong>: providing context on demand, based on what the agent is actively trying to do.</p><p>In the open‑source <strong>tdsql MCP server</strong>, Kevin combined MCP with a Skills‑based approach by doing the heavy lifting of translating Teradata documentation, best practices, and SQL patterns into a reusable Teradata skills library.</p><p>These skills are then exposed as injectable MCP tools, making them discoverable and callable by agents at runtime.</p><p>For example, the tdsql MCP server exposes platform‑verified SQL and analytics guidance as a callable MCP tool instead of static prompt instructions:</p><pre>@mcp.tool()<br>def get_syntax_help(topic: str = &quot;index&quot;) -&gt; str:<br>    &quot;&quot;&quot;Return Teradata SQL syntax reference for a given topic.<br><br>    IMPORTANT: Call this tool BEFORE writing any analytics, transformation, ML, or data<br>    preparation SQL. Teradata Vantage has native distributed table operators for most<br>    operations — scaling, encoding, binning, statistics, clustering, classification, text<br>    analytics, vector search, and more. These outperform hand-written SQL and should always<br>    be preferred. Do not write manual SQL for an operation if a native function exists.<br><br>    Recommended call order:<br>      1. get_syntax_help(topic=&#39;guidelines&#39;) — see the canonical mapping of common SQL<br>         patterns to native Teradata functions (start here if unsure what exists)<br>      2. get_syntax_help(topic=&#39;index&#39;) — browse all available topics and the Workflows<br>         section that maps use cases to topic sequences<br>      3. get_syntax_help(topic=&#39;&lt;specific-topic&gt;&#39;) — load exact syntax for a topic<br><br>    Args:<br>        topic: The topic name (e.g. 
&#39;data-prep&#39;, &#39;ml-functions&#39;, &#39;vector-search&#39;).<br>               Use &#39;index&#39; to list all available topics.<br>               Use &#39;guidelines&#39; for the native-functions-first reference.<br><br>    Returns:<br>        Markdown reference text for the requested topic, or a list of valid topics<br>        if the requested topic is not found.<br>    &quot;&quot;&quot;</pre><p>Instead of embedding long instructions directly in prompts, agents can now retrieve structured, platform‑aware guidance at runtime.</p><figure><img alt="Visual Studio Code editor view showing the tdsql MCP skills library with Teradata SQL analytics guidelines, highlighting rules that instruct AI agents to prefer native in‑database functions over hand‑written SQL to minimize data movement at scale." src="https://cdn-images-1.medium.com/max/1024/1*Dowa1qQ13E9a-8r0m2zP6Q.png" /></figure><h4><strong>Why This Matters</strong></h4><p>As a result, the agent workflow changes fundamentally. Rather than guessing or overloading its prompt, the agent can:</p><ul><li>Check platform guidelines before generating SQL</li><li>Look up supported syntax and native analytical functions</li><li>Apply Teradata-specific best practices for scale and execution</li><li>Build context incrementally as it reasons through a task</li></ul><p>This keeps prompts smaller, focuses the agent’s attention, and dramatically reduces inefficient queries and hallucinated behavior while still allowing the agent to do what it’s best at: reasoning and decision-making.</p><p>This pattern, using MCP as a transport layer and delivering platform expertise as injectable skills, is emerging across the industry. Several vendors and frameworks now explicitly recommend combining MCP and Skills. 
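</p><p>To make the recommended call order concrete, here is a minimal, purely illustrative progressive-disclosure lookup in plain Python. The topics and strings below are invented for this sketch; the real skills library lives in the tdsql repo.</p>

```python
# Stripped-down illustration of the progressive-disclosure pattern behind
# get_syntax_help(). Topic names and contents are invented, not the actual
# tdsql skills library.
SKILLS = {
    "guidelines": "Prefer native in-database functions over hand-written SQL.",
    "time-series": "Use native forecasting operators; avoid pulling rows client-side.",
}

def get_syntax_help(topic: str = "index") -> str:
    """Return guidance for one topic, or list topics so the agent can drill in."""
    if topic == "index":
        return "Available topics: " + ", ".join(sorted(SKILLS))
    return SKILLS.get(topic, "Unknown topic. Call with topic='index' first.")

# The agent starts broad, then loads only the context it needs:
print(get_syntax_help())              # discover what exists
print(get_syntax_help("guidelines"))  # load one skill on demand
```

<p>The agent pays the context cost of a skill only at the moment it decides it needs that skill, which is what keeps prompts small as the library grows.</p><p>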
What differentiates Kevin’s work is how far this idea is pushed: fully removing filesystem coupling, aggressively enforcing progressive disclosure, and adapting deep analytic platform knowledge into a reusable MCP‑delivered skills library.</p><h4>What This Enables in Practice for AI Agents on Data Platforms</h4><p>Once you give an agent structured, platform‑aware context, some interesting things start to happen.</p><p>And when combined with interactive chat environments like <strong>Claude Desktop</strong>, <strong>ADK</strong>, or <strong>Goose</strong>, this becomes a powerful way to both ground agent reasoning and maintain a direct connection to the database.</p><p>Kevin demonstrated a few advanced analytical examples with Claude Desktop first.</p><h4>1. Ad Hoc Analytics That Actually Scale</h4><p>From a simple natural language prompt like <em>“perform time series analysis,”</em> the agent will load the MCP tools and syntax, and then:</p><ul><li>Discover relevant tables</li><li>Assess seasonality and stationarity</li><li>Run native Teradata time-series functions</li><li>Compare multiple forecasting models</li><li>Evaluate results at scale</li></ul><figure><img alt="Screenshot of an AI agent orchestrating a time series analysis pipeline using Teradata native functions, showing MCP guided steps for ARIMA modeling, validation, and forecasting without moving data out of the platform." src="https://cdn-images-1.medium.com/max/1024/1*D77fnzUIWgb0Y5kfKh7IUQ.png" /></figure><p>All <strong>without</strong> pulling raw data out of the database.</p><figure><img alt="Screenshot showing an AI agent executing a Teradata native time series analysis pipeline, with ARIMA model diagnostics and dashboard displaying air passenger demand and a 24 mo‑period forecast." src="https://cdn-images-1.medium.com/max/1024/1*BTfbQnFGwTNl2S9XugT6lQ.png" /></figure><h4>2. 
From Experimentation to Operational Pipelines</h4><p>One of my favorite parts of the demo was this task:</p><p>After exploring the data and models interactively, we asked the agent:</p><blockquote><em>“How do I operationalize this?”</em></blockquote><p>Instead of stopping at basic analysis, the agent:</p><ul><li>Tuned and optimized SQL queries</li><li>Created reusable views and model tables</li><li>Produced a stored procedure pattern that can be used and scheduled</li></ul><figure><img alt="AI agent workflow showing an MCP guided Teradata forecasting pipeline, including model artifacts, validation tables, and a production dashboard describing an ongoing ARIMA based forecast process." src="https://cdn-images-1.medium.com/max/1024/1*XxLHf3NUIOTjXvNM4hxyPg.png" /></figure><p>This is the part that usually breaks down in data science workflows: getting from a notebook to something reliable and scalable. The agent doesn’t magically solve that, but it <strong>accelerates the path</strong> dramatically.</p><p>Kevin also demoed the versatility of his progressive disclosure MCP tool by connecting to Claude Code attached to VS Code.</p><h4>3. AI Code Generation Tools That Learn From Their Mistakes</h4><p>Even with good context, agents still make errors. That’s expected.</p><figure><img alt="AI agent workflow in a code editor showing MCP‑guided SQL execution, data profiling, and platform aware syntax validation for Teradata in‑database analytics." 
src="https://cdn-images-1.medium.com/max/1024/1*cAh8Ehm1zehOYfbQznEbXA.png" /></figure><p>The difference here is that the agent can:</p><ul><li>Explain failed queries</li><li>Look up correct syntax</li><li>Retry with improved SQL</li><li>Optimize execution patterns</li><li>Produce a .sql file that can be run and verified</li></ul><p>You’re not just getting generated code; you’re getting an <strong>iterative collaborator</strong> that can reason about its own outputs.</p><p>And because MCP is model-agnostic, this works across tools like Claude, Copilot, Gemini, and others that support the MCP protocol.</p><h3>What This Is (and What It Is Not)</h3><p>It’s important to be very clear about this.</p><p>This does <strong>not</strong> replace:</p><ul><li>Data scientists</li><li>Data engineers</li><li>Validation, testing, or governance workflows</li></ul><p>Hallucinations are still possible. Even with strong context injection, LLMs are not 100% deterministic.</p><p>The right mindset is <strong>trust, but verify</strong>.</p><p>What this <em>does</em> give you is:</p><ul><li>Faster onboarding to platform-specific analytics</li><li>Safer, more efficient agent behavior</li><li>A huge productivity boost for exploratory and operational analytics</li></ul><h3>Getting Started</h3><p>The MCP server and examples Kevin demoed are available on GitHub as an open-source project. 
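</p><p>The explain-and-retry behavior described in section 3 can be sketched in a few lines. Everything below is a hypothetical stand-in for illustration, not the tdsql server’s actual API:</p>

```python
# Hedged sketch of an explain-and-retry loop: run SQL, and on failure
# feed the error plus platform guidance back to the model for a rewrite.
# execute_sql, get_guidance, and ask_model are hypothetical stand-ins.
def run_with_retry(sql, execute_sql, get_guidance, ask_model, max_attempts=3):
    last_err = None
    for _ in range(max_attempts):
        try:
            return execute_sql(sql)          # success: return result rows
        except Exception as err:
            last_err = err
            guidance = get_guidance()        # ground the fix in reference text
            sql = ask_model(
                f"Query failed with: {err}\n"
                f"Platform guidance: {guidance}\n"
                f"Rewrite this SQL:\n{sql}"
            )
    raise RuntimeError(f"query still failing after retries: {last_err}")
```

<p>The point of the sketch is the loop shape: each retry is grounded in reference material rather than leaving the model to guess. 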
There’s nothing proprietary here; it’s a reorganization of Teradata’s best practices into a form agents can actually use.</p><p>You can also try this yourself using a free <a href="https://www.teradata.com/getting-started/demos/clearscape-analytics">Teradata Trial</a>, which provides an evaluation environment with sample datasets and analytics capabilities ready to go.</p><p>Plug the two together, point an agent at your data, and start exploring.</p><p><strong>Resources:</strong></p><ul><li><strong>GitHub Repo: tdsql MCP Server:</strong> <a href="https://github.com/ksturgeon-td/tdsql-mcp/blob/main/README.md">https://github.com/ksturgeon-td/tdsql-mcp/blob/main/README.md</a></li><li><strong>GitHub Repo: LangGraph tdsql Agent:</strong> <a href="https://github.com/ksturgeon-td/tdsql-agent">https://github.com/ksturgeon-td/tdsql-agent</a></li><li><strong>Teradata Free Trial:</strong> <a href="https://www.teradata.com/getting-started/demos/clearscape-analytics">https://www.teradata.com/getting-started/demos/clearscape-analytics</a></li></ul><h3>Final Takeaway</h3><p>AI agents aren’t going away, but poorly structured agents can be dangerous, expensive, and misleading at scale.</p><p>The future isn’t about smarter prompts. It’s about <strong>smarter context</strong>.</p><p>By grounding agents in platform-native capabilities and guiding <em>how</em> they reason, not just <em>what</em> they output, we can finally make agentic workflows practical for real-world data science.</p><p>If you watched the DevTalk live, this article should help you slow things down and unpack the details. 
If you missed it, now you know why this space is worth paying attention to.</p><p>More soon.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*QgOMbDHsmm8gvqRX15wKUw.png" /><figcaption>Teradata Monthly DevTalks</figcaption></figure><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=174fd51bf66b" width="1" height="1" alt=""><hr><p><a href="https://medium.com/teradata/building-smarter-ai-agents-for-data-science-workflows-at-scale-174fd51bf66b">Building Smarter AI Agents for Data Science Workflows at Scale</a> was originally published in <a href="https://medium.com/teradata">Teradata</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Carbon Footprint Analytics: a data integration story]]></title>
            <link>https://medium.com/teradata/carbon-footprint-analytics-a-data-integration-story-d9f37e95f2d7?source=rss----8adfdb056496---4</link>
            <guid isPermaLink="false">https://medium.com/p/d9f37e95f2d7</guid>
            <category><![CDATA[data-analysis]]></category>
            <category><![CDATA[ghg-emissions]]></category>
            <category><![CDATA[esg]]></category>
            <category><![CDATA[advanced-analytics]]></category>
            <category><![CDATA[climate-change]]></category>
            <dc:creator><![CDATA[Gregory Leduc]]></dc:creator>
            <pubDate>Mon, 13 Apr 2026 14:50:47 GMT</pubDate>
            <atom:updated>2026-04-13T14:30:40.598Z</atom:updated>
            <content:encoded><![CDATA[<p>Measuring GHG emissions is a data integration challenge. See how Teradata’s CFA solution accelerator tackles it.</p><figure><img alt="Galina Nelyubova
 Earth Day — a collection of environmental 3D illustrations" src="https://cdn-images-1.medium.com/max/975/1*gXIDV8mS_xphxbw0XDSGYg.png" /></figure><p><em>Galina Nelyubova; Earth Day — a collection of environmental 3D illustrations</em></p><p>As the climate emergency intensifies, companies must play a significant role in ensuring a sustainable future. Because climate change is one of the biggest systemic risks, mitigating adverse financial effects and adapting to rapidly changing circumstances are key for companies to ensure long-term going concern and growth. The centerpiece of these activities is proper reporting and reduction of companies’ greenhouse gas (GHG) emissions. And doing so requires much more than good intentions: it demands robust data integration, and the right platform.</p><h4><strong>Why Carbon footprint analytics matters</strong></h4><p>Every company’s operations — from logistics and manufacturing to IT and facilities — generate emissions. These emissions are environmental concerns, and they can also become a financial and reputational risk. Regulatory bodies are moving toward mandatory enterprise-wide GHG reporting, and stakeholders are demanding transparency and action.</p><p>To meet these expectations, companies must answer critical questions:</p><p>· How large are the GHG emissions we are responsible for?</p><p>· Which activities contribute the most?</p><p>· What’s our roadmap to a sustainable level of GHG emissions? (One would say “path to net-zero”, but let’s focus on reduction first before compensating).</p><p>The answers lie within the data companies already possess: financial systems, operational logs, energy consumption records… All these contain the clues needed to build a comprehensive GHG emissions profile. And here is precisely the challenge most companies are facing: transforming existing data into actionable insights.</p><figure><img alt="Figure 1 — Calculating a carbon footprint is easy… until you start digging into the details!" 
src="https://cdn-images-1.medium.com/max/661/1*gPTBQxdqaN6fy4vYg3v9ew.png" /></figure><p><em>Figure 1 — Calculating a carbon footprint is easy… until you start digging into the details!</em></p><h4>Why it’s a data integration story</h4><p>Carbon footprint analytics is fundamentally a data integration challenge. As the saying goes, “What gets measured, gets managed”. And measuring emissions accurately means harmonizing data from disparate sources, each with its own format, granularity, units, and quality. Trying to report GHG emissions without data integration is like assembling a jigsaw puzzle where pieces are taken from different puzzles: you’ll never see the full picture.</p><p>The Carbon Footprint Analytics (CFA) solution accelerator designed by Teradata tackles this complexity challenge. Based on a flexible data model, it ingests enterprise-specific data using proven algorithms, and outputs detailed GHG emissions across all scopes (1, 2 and 3) defined by the <a href="https://ghgprotocol.org/">GHG Protocol</a>.</p><p>Key capabilities include:</p><p>· <strong>Unit harmonization:</strong> automatically converts measures to configurable preferred units (pounds vs kilograms, kWh vs MWh, etc.)</p><figure><img alt="Figure 2 — Automated management of measure systems heterogeneity" src="https://cdn-images-1.medium.com/max/704/1*_MOnELgq78QslHRBBj064Q.png" /></figure><p><em>Figure 2 — Automated management of measure systems heterogeneity</em></p><p>· <strong>Geospatial awareness:</strong> identifies the right emission factors based on location, which is particularly critical for electricity consumption (electricity mix, thus GHG emissions, significantly differ <a href="https://app.electricitymaps.com/map/all/monthly">between countries and regions</a>)</p><p>· <strong>Temporal modeling:</strong> the CFA design reflects evolving emission factors and activity periods.</p><p>A key point to understand is that CFA is not a reporting tool. 
It is an engine to which we can connect reporting tools. This engine integrates and enriches data to deliver a harmonized, enterprise-wide view of emissions.</p><h4>Think before you jump: design matters</h4><p>Before diving into carbon analytics, companies must design a robust and flexible data model. A poorly structured model can lead to wasted efforts and inconsistent results.</p><p>The CFA solution accelerator emphasizes thoughtful architecture:</p><p>· A core data model that supports multiple aggregation levels and units</p><p>· Pre-integrated calculation rules for seamless processing and explainable results</p><p>· Advanced analytics capabilities, including machine learning, geospatial, time-series and string similarities.</p><figure><img alt="Figure 3 — Overview of the CFA data model, designed for scope 1, 2 and 3 GHG emissions" src="https://cdn-images-1.medium.com/max/975/1*nT9Z8eWEynEGFAvll72CNw.png" /></figure><p><em>Figure 3 — Overview of the CFA data model, designed for scope 1, 2 and 3 GHG emissions</em></p><p>This foundation ensures that companies can not only report emissions, but also forecast, simulate and optimize their GHG emissions reduction strategy.</p><h4>Why Teradata is the right platform</h4><p>Teradata Vantage is well positioned to support Carbon Footprint Analytics:</p><p>· <strong>Geospatial capabilities</strong>: visualize GHG emissions and apply location-specific factors.</p><figure><img alt="Figure 4 — Geospatial capabilities facilitate segregation of GHG emissions according to location" src="https://cdn-images-1.medium.com/max/700/1*1zHzQzNv2UL5vrF--gvr4A.png" /></figure><p><em>Figure 4 — Geospatial capabilities facilitate segregation of GHG emissions according to location</em></p><p>· <strong>Temporal intelligence</strong>: track changes over time with time-aware tables.</p><p>· <strong>Advanced analytics</strong>: leverage ClearScape Analytics™ features to observe the trends and to simulate the effect of your strategic decisions for 
multiple scenarios.</p><p>· <strong>Scalable integration</strong>: harmonize data from multiple sources with high performance and reliability.</p><figure><img alt="Figure 5 — Multiple data sources must be efficiently integrated, both internal and external" src="https://cdn-images-1.medium.com/max/668/1*QQfSGE-14XddC6sXGakWgQ.png" /></figure><p><em>Figure 5 — Multiple data sources must be efficiently integrated, both internal and external</em></p><p>The <a href="https://clearscape.teradata.com">ClearScape Analytics Experience</a> demo showcases CFA in action — transforming raw data into <a href="https://nbviewer.org/github/Teradata/jupyter-demos/blob/85d6dd55cfe0e8b509e536bb4e61d9e45a2cb5c9/UseCases/Carbon_Footprint_Analytics/Pop_Carbon_Footprint_Analytics_PY_SQL.ipynb">meaningful carbon insights</a>. So, what are you waiting for? Deploy your own CFA solution accelerator, with Teradata ready to support you every step of the way.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=d9f37e95f2d7" width="1" height="1" alt=""><hr><p><a href="https://medium.com/teradata/carbon-footprint-analytics-a-data-integration-story-d9f37e95f2d7">Carbon Footprint Analytics: a data integration story</a> was originally published in <a href="https://medium.com/teradata">Teradata</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Building Multimodal Agentic Search Over Images and Documents — A DevTalk]]></title>
            <link>https://medium.com/teradata/building-multimodal-agentic-search-over-images-and-documents-a-devtalk-fdf1bf3029e5?source=rss----8adfdb056496---4</link>
            <guid isPermaLink="false">https://medium.com/p/fdf1bf3029e5</guid>
            <category><![CDATA[retrieval-augmented-gen]]></category>
            <category><![CDATA[unstructured-data]]></category>
            <category><![CDATA[teradata]]></category>
            <category><![CDATA[image-recognition]]></category>
            <category><![CDATA[multimodal-ai]]></category>
            <dc:creator><![CDATA[Daniel Herrera]]></dc:creator>
            <pubDate>Thu, 09 Apr 2026 12:55:14 GMT</pubDate>
            <atom:updated>2026-04-09T12:55:13.358Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/480/1*mNnGddOhjpio4Mw-tnHa3w.jpeg" /><figcaption>Teradata DevTalk — Multimodal Agentic Semantic Search</figcaption></figure><p>Recently I had the opportunity to host a <a href="https://www.youtube.com/live/G_KXqo8jXuc">DevTalk</a> that focused on something many teams are actively experimenting with right now: how to reason over unstructured data using multimodal embeddings and agentic retrieval patterns.</p><p>Together with <a href="https://www.linkedin.com/in/ajaykrish/">Ajay Gopalan</a> from <a href="https://unstructured.io/">Unstructured.io</a>, we walked through a live, <a href="https://github.com/Teradata/jupyter-demos/blob/main/VantageCloud_Lake/UseCases/Multimodal_Agentic_Semantic_Search/Agentic_Semantic_Search_Teradata_Unstructured.ipynb">end‑to‑end demo</a> that shows how images, documents, and text can be brought together. The goal: start from raw unstructured data and end with an experience where you can select an image, ask questions about it, and receive answers grounded in real enterprise documentation.</p><p>The full session is available on YouTube, and if you are working with vector search, multimodal models, or agentic workflows, it is worth watching the demo in full.</p><iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2FG_KXqo8jXuc%3Ffeature%3Doembed&amp;display_name=YouTube&amp;url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DG_KXqo8jXuc&amp;image=https%3A%2F%2Fi.ytimg.com%2Fvi%2FG_KXqo8jXuc%2Fhqdefault.jpg&amp;type=text%2Fhtml&amp;schema=youtube" width="854" height="480" frameborder="0" scrolling="no"><a href="https://medium.com/media/79e5bd6086933a7020f876e3cdf68e87/href">https://medium.com/media/79e5bd6086933a7020f876e3cdf68e87/href</a></iframe><p>Below is a summary of what we covered and why it matters:</p><h4>Why Multimodal Search Keeps Coming Up</h4><p>Images exist everywhere in enterprise 
environments. They appear inside PDFs, research documents, medical records, financial reports, and technical documentation. On their own, images rarely provide enough context to answer useful questions. The surrounding text, tables, and metadata usually carry the explanation.</p><p>The challenge is connecting those pieces in a way that scales.</p><p>During the session, we framed the problem around a simple interaction. A user selects an image and asks a question. The system retrieves the relevant documentation that explains that image and uses it to produce a grounded answer. That interaction requires coordinated ingestion, embedding, storage, retrieval, and reasoning across multiple data modalities.</p><h4>Structuring Unstructured Data with Unstructured</h4><p>We started with ingestion, using Unstructured to process a mix of composite PDFs and standalone image files. Ajay walked through how Unstructured workflows break documents into structured elements such as text blocks, tables, and images. These elements are then enriched and prepared for downstream use.</p><p>One point that came up during the demo is how useful the Unstructured UI is during experimentation. You can design and test workflows visually, refine parsing strategies, and then transition to the API when you are ready to automate ingestion at scale. The same pipeline is used in both cases.</p><h4>Storing and Searching Embeddings in Teradata VantageCloud</h4><p>Once the documents and images are processed, embeddings are generated and stored inside Teradata using Enterprise Vector Store capabilities. This keeps vector search close to the rest of the enterprise data, with the same governance, access control, and operational model teams already rely on.</p><p>In the demo, we created a vector store from the generated embeddings and preserved metadata such as file names and record locators. 
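</p><p>Conceptually, each stored entry pairs an embedding with just enough provenance to trace it back to its source. A rough sketch of the shape of such a record (field names are illustrative, not the exact schema used in the demo):</p>

```python
# Illustrative shape of a vector-store record that keeps provenance
# metadata next to the embedding. Field names are made up for this
# sketch; the demo's actual schema may differ.
from dataclasses import dataclass

@dataclass
class EmbeddedChunk:
    record_id: str       # unique id for the chunk
    filename: str        # source document, for tracing answers back
    record_locator: str  # where in the source the chunk came from
    embedding: list      # vector produced by the multimodal model

chunk = EmbeddedChunk(
    record_id="doc-001-chunk-03",
    filename="tissue_imaging_guide.pdf",   # hypothetical source file
    record_locator="page 4",
    embedding=[0.12, -0.03, 0.88],
)
```

<p>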
That metadata becomes important later, when retrieved results need to be traced back to their original documents.</p><p>From a data platform perspective, this aligns well with how many teams already think about indexing. The mental model remains focused on accelerating discovery and retrieval, with similarity used as the matching mechanism instead of exact values.</p><h4>Adding Agentic Reasoning on Top of Retrieval</h4><p>Retrieval alone gets you relevant context. Agentic workflows turn that context into an interactive experience.</p><p>Using LangChain integrations with Teradata, we built a small set of tools that the agent can use. One tool performs similarity search over embeddings to locate the most relevant documentation for a selected image. Another tool retrieves and displays the source document associated with that result.</p><p>The agent prompt describes how these tools should be used together. When a user selects an image and asks a question, the agent retrieves the closest matching documentation and generates an answer grounded in that evidence. The reasoning step stays tied to retrieved content instead of relying on general knowledge.</p><p>During the demo, we asked questions about medical images, including tissue health and visible markers. Each response referenced information pulled from the most relevant documents in the library.</p><h4>Observations from the Live Discussion</h4><p>The audience questions touched on topics many teams are evaluating right now.</p><p>Cost considerations came up early, particularly around embeddings and multimodal models. As with most architectural decisions, the balance depends on scale and workload. Bringing vector search into the same platform as analytics removes certain integration and duplication costs that often appear with external systems.</p><p>We also discussed entry points. Multimodal search is compelling, though many teams begin with document retrieval, tables, policies, and reports. 
These use cases already benefit from structured parsing and vector search and provide a solid foundation before adding image‑based reasoning.</p><p>Another area of interest was familiarity. Vector stores can feel unfamiliar at first, though the underlying ideas map closely to indexing strategies DBA teams already use. Teradata’s approach keeps vector operations accessible through both SQL and Python, which helps bridge that gap.</p><h4>Try the Demo Yourself</h4><p>The full demo runs inside <a href="https://www.teradata.com/getting-started/demos/clearscape-analytics?utm_campaign=gbl-clearscape-analytics-devrel&amp;utm_content=demo&amp;utm_id=7016R000001n3bCQAQ">Teradata.com/experience</a>, which is available at no cost. It includes the notebooks, ingestion pipeline, vector store setup, and agentic workflow shown during the livestream. Unstructured also provides free credits that make it easy to reproduce and extend the example.</p><p>If you are exploring multimodal embeddings, agentic retrieval patterns, or enterprise‑grade vector search, the livestream walks through the full system working together. We have also prepared a detailed <a href="https://medium.com/teradata/end-to-end-multimodal-rag-and-agents-with-teradata-and-unstructured-c3686b5c6198">blog for step-by-step guidance</a>.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=fdf1bf3029e5" width="1" height="1" alt=""><hr><p><a href="https://medium.com/teradata/building-multimodal-agentic-search-over-images-and-documents-a-devtalk-fdf1bf3029e5">Building Multimodal Agentic Search Over Images and Documents — A DevTalk</a> was originally published in <a href="https://medium.com/teradata">Teradata</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[End-to-End Multimodal RAG and Agents With Teradata and Unstructured]]></title>
            <link>https://medium.com/teradata/end-to-end-multimodal-rag-and-agents-with-teradata-and-unstructured-c3686b5c6198?source=rss----8adfdb056496---4</link>
            <guid isPermaLink="false">https://medium.com/p/c3686b5c6198</guid>
            <category><![CDATA[unstructured-data]]></category>
            <category><![CDATA[retrieval-augmented-gen]]></category>
            <category><![CDATA[teradata]]></category>
            <category><![CDATA[agentic-ai]]></category>
            <category><![CDATA[vector-database]]></category>
            <dc:creator><![CDATA[Daniel Herrera]]></dc:creator>
            <pubDate>Mon, 23 Mar 2026 12:06:15 GMT</pubDate>
            <atom:updated>2026-03-23T12:06:14.172Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="Multimodal RAG and Agents with Teradata &amp; Unstructured" src="https://cdn-images-1.medium.com/max/1024/1*6hQeuUsHuF70O4j6KCXSXg.png" /><figcaption>Multimodal RAG and Agents with Teradata &amp; Unstructured</figcaption></figure><h3><strong>The challenge of grounding the work of agentic systems</strong></h3><p>Agentic systems work across multiple steps to achieve a specific goal. Precision and predictability in achieving that goal require rich context that often goes beyond structured transactional data. Policies, procedures, diagnostics, design documents, and customer conversations are often stored as documents, PDFs, images, and audio files. When that content is transformed into embeddings, making it searchable, agents gain something closer to long-term memory: they can retrieve relevant context, cite sources, and ground their responses in what is true for the organization, making them accurate and traceable.</p><p>Most enterprise data stacks split these two worlds cleanly. Transactional facts live in relational tables. Policies, manuals, and media live somewhere else, in object storage or file systems. 
Bringing them together requires additional data-processing pipelines and adds governance overhead.</p><p>It is possible, though, to bridge the gap between these two worlds, and it can be done on the same platform, without data duplication, data movement overhead, and extra governance layers.</p><h3><strong>Multimodal agentic development with Teradata and Unstructured</strong></h3><p>Teradata Enterprise Vector Store has unleashed a myriad of use cases for Teradata customers, including:</p><ul><li>Customer 360 enrichment and contact center intelligence</li><li>Autonomous customer insights</li><li>Compliance and policy intelligence</li><li>Insurance claims and contract analysis</li><li>R&amp;D and engineering knowledge retrieval</li><li>Large‑scale enterprise search</li><li>Defense and public‑sector operational intelligence</li></ul><p>These use cases are enabled by performing vector and metadata-aware search across structured and unstructured data in a single operation, directly within the Teradata system as the single source of truth.</p><p>Teradata’s partnership with Unstructured extends these capabilities by introducing a unified multimodal ingestion and enrichment layer capable of parsing <a href="https://docs.unstructured.io/api-reference/supported-file-types">70+ file types</a> and generating embeddings for text and images, with audio support on the roadmap. The Unstructured pipelines handle chunking, metadata extraction, enrichment, adaptive model selection, and embedding generation, and deliver these enriched artifacts directly into Teradata’s enterprise analytics platform. The Teradata platform provides a native security model, object‑level governance, and lineage propagation. 
Because the Enterprise Vector Store is implemented on top of Teradata’s MPP execution framework, vector indexing, similarity search, and hybrid semantic–lexical retrieval scale transparently with other analytic and operational workloads, inheriting Teradata’s concurrency, data locality, and high‑throughput characteristics.</p><p>For development, the <a href="https://github.com/Teradata/langchain-teradata">langchain-teradata</a> library offers a familiar interface for incorporating Teradata Enterprise Vector Store into your agentic workflows, providing the standard agentic primitives: chat models, tools, and retrievers. The result is a single system where agents can search, and act, without the extra pipelines or governance overhead that come with stitching separate tools together.</p><h3><strong>Experience multimodal agentic development with Teradata and Unstructured</strong></h3><p>In the walkthrough that follows, we will build a life sciences image question-and-answering agent powered by Teradata Enterprise Vector Store, using data ingested and embedded by Unstructured.</p><p>The final product is an agentic system where a user can pick an image from a gallery of medical image samples and ask questions about it. 
The agent relies on a library of medical image documentation, finds the documentation most like the selected image, and uses it to answer the user’s question.</p><figure><img alt="Image 1 — Image grid gallery" src="https://cdn-images-1.medium.com/max/736/1*UgaSv1ekYAZWxiyI7x3oFQ.png" /><figcaption>Image 1 — Image grid gallery</figcaption></figure><p>The process of building this agent consists of the following general steps:</p><ul><li>Build an ingestion and processing pipeline for composite PDF documents and images using Unstructured, which also generates the embeddings.</li><li>Store the embeddings in Teradata’s enterprise analytics platform and index them with Teradata Enterprise Vector Store.</li><li>Match a user‑selected image to its most similar documents using vector similarity search.</li><li>Wrap the workflow in a LangChain agent that allows users to query the results in natural language.</li></ul><p><strong>Setting up your development environment</strong></p><ul><li>If you haven’t already, start by creating an account at <a href="https://www.teradata.com/getting-started/demos/clearscape-analytics?utm_campaign=gbl-clearscape-analytics-devrel&amp;utm_content=demo&amp;utm_id=7016R000001n3bCQAQ">teradata.com/experience.</a></li><li>Once you’re logged in, create <strong>a new environment.</strong> This ensures you’re working with the latest features and capabilities.</li></ul><figure><img alt="Image 2 — Create an environment" src="https://cdn-images-1.medium.com/max/607/1*PMyrgYrB4mpixfnFdYfbZw.png" /><figcaption>Image 2 — Create an environment</figcaption></figure><ul><li>Start the Jupyter Notebook environment by clicking “Run demos.”</li></ul><figure><img alt="Image 3 — Run demos" src="https://cdn-images-1.medium.com/max/624/1*1g_Xy-nEcg24xL4lFUT59g.png" /><figcaption>Image 3 — Run demos</figcaption></figure><p><strong>Exploring the notebook</strong></p><p><strong>Setting up needed libraries and access to Teradata Enterprise Vector 
Store</strong></p><ul><li>First, we install the packages needed.</li></ul><pre>%%capture<br>!pip install -r &quot;./utils/requirements.txt&quot; --quiet</pre><ul><li>After restarting the kernel, we import the libraries that will power our agentic multimodal system.</li></ul><pre># Required imports<br><br># General imports<br>from teradatagenai import VSManager<br>from langchain_teradata import TeradataVectorStore<br>from teradataml import *<br>import os<br>import json<br>import re<br>import sys<br>import time<br><br># Credentials and configuration management<br>from dotenv import load_dotenv<br>from getpass import getpass<br><br><br># LangChain imports<br>from langchain.chat_models import init_chat_model<br>from langchain.agents import create_agent<br>from langchain_core.messages import HumanMessage<br>from langchain.tools import tool<br><br># Suppress warnings<br>import warnings<br>warnings.filterwarnings(&#39;ignore&#39;)<br>display.suppress_vantage_runtime_warnings = True<br><br># Widget display<br>from IPython.display import display, HTML, IFrame<br>import ipywidgets as widgets<br><br># Import utilities<br>from unstructured_utils.teradata_ingest import ingest<br>from utils.image_grid import display_image_grid</pre><ul><li>Our trial experience provides access to a Teradata instance with Enterprise Vector Store capabilities.</li></ul><pre># env_vars is populated earlier in the notebook from the .env file<br>ues_uri=env_vars.get(&quot;ues_uri&quot;)<br>if ues_uri.endswith(&quot;/open-analytics&quot;):<br>  ues_uri = ues_uri[:-15]  # strip the trailing 15-char &quot;/open-analytics&quot; suffix<br><br>if set_auth_token(base_url=ues_uri,<br>         pat_token=env_vars.get(&quot;access_token&quot;), <br>         pem_file=env_vars.get(&quot;pem_file&quot;)<br>         ):<br>  print(&quot;UES Authentication successful&quot;)<br>else:<br>  print(&quot;UES Authentication failed. 
Check credentials.&quot;)<br>  sys.exit(1)<br><br>VSManager.health()</pre><p><strong>Ingestion and processing pipeline with Unstructured</strong></p><ul><li>Unstructured offers a graphical user interface (GUI) to create workflows. The same building blocks used in the GUI are available from the <a href="https://docs.unstructured.io/api-reference/overview">Unstructured API</a>. Namely: sources, destinations, workflows, and jobs.</li><li>Workflows orchestrate data processing between a source and a destination with steps such as partitioning, chunking, enriching, embedding, etc.</li><li>Jobs are specific executions of a workflow.</li><li>We have built a function that abstracts our defined workflow. The full implementation is contained in the /unstructured_utils folder and imported into the notebook.</li></ul><p>The workflow has three sequential nodes:</p><ol><li><strong>Partitioning:</strong> Uses Claude as a vision-language model (VLM) to extract structured content from raw documents. Using a VLM rather than a traditional text extractor means it can handle complex layouts, images, and tables by “looking” at the document rather than just parsing text.</li><li><strong>Chunking:</strong> Splits the partitioned content into pieces using a title-aware strategy. It targets 1,500 characters per chunk, caps at 2,048, and adds a 100-character overlap between consecutive chunks so context isn’t lost at boundaries. It also retains the original elements alongside the chunks, which is useful for traceability.</li><li><strong>Embedding:</strong> Converts each chunk into a vector using Voyage AI’s multimodal model. 
The multimodal part matters here because the upstream VLM partitioner may have preserved image-based content, and a multimodal embedder can represent that meaningfully rather than ignoring it.</li></ol><ul><li>In the notebook, we call this function to execute the ingestion of both composite PDFs and images.</li></ul><pre>ingest(api_key=unstructured_api_key, <br>    td_host=env_vars.get(&quot;host&quot;), <br>    td_user=dbuser, <br>    td_password=dbpwd, <br>    td_database=default_db, <br>    td_table=&#39;composite_pdfdocs_embedded&#39;, <br>    s3_uri=&quot;s3://dev-rel-demos/teradata-unstructured/healthcare-assets/composite-pdfs/&quot;, <br>    s3_anonymous=True)<br><br>ingest(api_key=unstructured_api_key, <br>    td_host=env_vars.get(&quot;host&quot;), <br>    td_user=dbuser, <br>    td_password=dbpwd, <br>    td_database=default_db, <br>    td_table=&#39;image_samples_embedded&#39;, <br>    s3_uri=&quot;s3://dev-rel-demos/teradata-unstructured/healthcare-assets/images/&quot;, <br>    s3_anonymous=True)</pre><p><strong>Teradata Enterprise Vector Store: The engine for processing unstructured data</strong></p><ul><li>The first step to create our Vector Store is to ingest the data loaded by Unstructured into teradataml DataFrames. We can inspect a sample of the records.</li><li>The preview displays the <strong>record_id</strong>, <strong>filename</strong>, and <strong>record_locator</strong> columns. The embedding vectors are omitted here for readability, but they are present in the underlying table and will be used for similarity search in later steps.</li></ul><figure><img alt="Image 1 — Ingested data in Teradata DataFrame" src="https://cdn-images-1.medium.com/max/1024/1*Uglp77liNx_3XVnPumogzg.png" /><figcaption>Image 4 — Ingested data in Teradata DataFrame</figcaption></figure><ul><li>Once the documents are ingested and embedded, we create a Vector Store based on the embeddings through the `TeradataVectorStore` class `from_embeddings` method. 
This class is provided by the `langchain-teradata` package.</li></ul><pre>vs_emb = TeradataVectorStore.from_embeddings(name = f&quot;{default_db}_mm_embeddings_vs&quot;,<br>                   data = image_documentation_bank,<br>                   data_columns = &quot;embeddings&quot;,<br>                   key_columns = [&quot;id&quot;, &quot;record_id&quot;],<br>                   embedding_data_columns = &quot;text&quot;,<br>                   metadata_columns = [&quot;text&quot;,&quot;date_created&quot;, &quot;date_modified&quot;, &quot;record_locator&quot;, &quot;filename&quot;],)</pre><ul><li>The Vector Store registers and creates a new vector index in VantageCloud over the embedding column, enabling sub-second approximate nearest-neighbor lookups at scale. Key parameters include the data table, the column containing the embedding vectors (`data_columns`), the primary key columns for deduplication, and the metadata columns to carry through into search results.</li><li>The `TeradataVectorStore` class includes methods for performing semantic or lexical searches, for asking questions about search results (powered by an included LLM client), and for updating, re-indexing, or destroying the Vector Store.</li><li>We can use the Vector Store to perform similarity searches between any image in the image bank and the documentation library.</li></ul><p>In this case, we retrieve the top record from the DataFrame containing the image data and perform a similarity search by vector to retrieve the most similar document.</p><pre>response = vs_emb.similarity_search_by_vector(data = raw_images_df.head(1), column=&#39;embeddings&#39;, top_k=1)</pre><figure><img alt="Image 5 — Similarity search results" src="https://cdn-images-1.medium.com/max/872/1*GFwesq9rPgw3E8YE6Fmecg.png" /><figcaption>Image 5 — Similarity search results</figcaption></figure><p>The match is driven by the image contained in the composite document. 
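</p><p>Under the hood, this kind of similarity search is a nearest-neighbor lookup over embedding vectors. The pure-Python sketch below illustrates the idea with toy 3-dimensional vectors and brute-force cosine similarity; the actual store works with high-dimensional Voyage AI embeddings and a database-side vector index, not a Python loop:</p>

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy documentation bank: record id -> embedding (hypothetical values)
doc_embeddings = {
    "immune_cell_doc": [0.9, 0.1, 0.0],
    "mri_scan_doc": [0.0, 1.0, 0.1],
    "xray_doc": [0.1, 0.0, 1.0],
}

# Embedding of the user-selected image (hypothetical)
query = [1.0, 0.0, 0.0]

# top_k=1: keep the record whose embedding is closest to the query
best = max(doc_embeddings, key=lambda k: cosine_similarity(query, doc_embeddings[k]))
print(best)  # immune_cell_doc
```

<p>The vector index avoids this exhaustive scan with approximate nearest-neighbor structures, which is what keeps lookups sub-second at scale.</p><p>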
The retrieval returns the associated text along with metadata included when creating the Vector Store index, such as the `record_locator` and `filename`. In an agentic system, this metadata allows an agent to fetch the original document if needed, while the text provides information to answer questions about the selected image.</p><p>Teradata Enterprise Vector Store includes a built-in LLM integration that enables natural language queries over vector search results.</p><pre>question=&#39;I need to recover The title, description and record id, and locator of the most similar record?&#39;<br>prompt=&#39;Format the response in a conversational way.&#39;<br>response = vs_emb.prepare_response(question=question, similarity_results=response, prompt=prompt)</pre><p>The response object contains the requested data in the requested format as processed by the chat model included in the Teradata Enterprise Vector Store.</p><blockquote>Based on the provided data, here is the information you requested:<br> <br> Title: Image 4: Immune Cell Infiltration in Solid Tumor Tissue<br> Description: Microscopic image showing immune cell infiltration in solid tumor tissue. The image displays a colorful fluorescent staining pattern with bright yellow circular structures scattered throughout a predominantly pink and magenta background with blue areas. Dark black voids are visible throughout the tissue section. The staining reveals cellular structures and immune markers in the tumor microenvironment.<br> <br> This image captures a solid tumor section with visible immune cell infiltration. The bright yellow circular structures likely represent lipid droplets or specific immune cell subtypes such as macrophages or dendritic cells. The dark voids may be necrotic foci or gland-like structures within the tumor. The predominant pink and magenta signals indicate widespread expression of tumor or immune markers throughout the tissue. 
This kind of staining is commonly used in studies examining the tumor immune microenvironment, particularly in the context of immunotherapy response.<br> <br> Record ID: c724ea12-fa51-554d-b552-cb2d5de3b0a7<br> Record Locator: {“protocol”: “s3”, “remote_file_path”: “s3://dev-rel-demos/teradata-unstructured/healthcare-assets/composite-pdfs/”}</blockquote><p>The capabilities we have explored open up many possibilities for an agentic system, such as the one we are building.</p><h3><strong>Building a LangChain agent powered by Teradata Enterprise Vector Store</strong></h3><p>Our agent is wrapped in a widget UI that allows users to select an image from the gallery, and a chat widget that enables conversations. These widgets are merely UI abstractions; the components of the agent itself are the same as any LangChain agent. We need a `chat_llm` component and a set of tools.</p><p>Our trial experience includes a proxy LLM client to power the `chat_llm`, while Teradata Enterprise Vector Store includes all the functionality needed to power the core tool.</p><pre>llm_key = env_vars.get(&quot;litellm_key&quot;)<br>llm_url = env_vars.get(&quot;litellm_base_url&quot;)<br><br>llm = init_chat_model(<br>  model=&quot;openai-gpt-41&quot;,<br>  model_provider=&quot;openai&quot;,<br>  base_url=llm_url,<br>  api_key=llm_key,<br>)</pre><p>The main tool is a thin abstraction over the `similarity_search` and `prepare_response` methods of Teradata Enterprise Vector Store. The image grid widget is generated from the DataFrame containing the image data, with selection buttons that update the `selected_id` parameter used to look up the corresponding embedding. From there, the process follows the same steps described above.</p><pre>@tool<br>def search_similar_image_documentation(dummy: str = &quot;&quot;) -&gt; str:<br>  &quot;&quot;&quot;<br>  Runs a similarity search using the currently selected image and displays<br>  the most similar result. 
Call this whenever the user asks about similar images.<br>  &quot;&quot;&quot;<br>  if result.selected_id is None:<br>    return &quot;No image is selected. Please select an image from the grid first.&quot;<br><br>  response_similarity = vs_emb.similarity_search_by_vector(<br>    data=raw_images_df[raw_images_df[&quot;record_id&quot;] == result.selected_id],<br>    column=&quot;embeddings&quot;,<br>    top_k=1,<br>    return_type=&quot;json&quot;,<br>  )<br>  question = &quot;I need to recover the title, description, filename, record id, and file_locator of the most similar record?&quot;<br>  prompt = &quot;Format the response as JSON object.&quot;<br>  response_chat = vs_emb.prepare_response(<br>    question=question,<br>    similarity_results=response_similarity,<br>    prompt=prompt,<br>  )<br><br>  with chat_output:<br>    display(HTML(<br>      f&quot;&lt;div style=&#39;font-family:sans-serif; padding:12px; background:#f9f9f9;&quot;<br>      f&quot;border-radius:6px; margin:8px 0;&#39;&gt;{response_chat}&lt;/div&gt;&quot;<br>    ))<br><br>  return response_chat</pre><p>The agent itself includes specific instructions to interact with the tools and respond to the users’ intent.</p><blockquote>You are an image analysis assistant.</blockquote><blockquote>Tools available:</blockquote><blockquote>- search_similar_image_documentation: Find images similar to a given input and retrieve associated documentation</blockquote><blockquote>- display_pdf_from_locator: Render a PDF document</blockquote><blockquote>Workflow:</blockquote><blockquote>1. When a user provides an image or asks for similar images, call search_similar_image_documentation</blockquote><blockquote>2. If the user asks to view/open a PDF from the results, extract `filename` and `remote_file_path` from the search response and pass them to display_pdf_from_locator</blockquote><blockquote>3. 
If the user asks to describe features from the image in questions like “What are the XYZ features of this image?”</blockquote><blockquote>a. First call search_similar_image_documentation</blockquote><blockquote>b. If the documentation contains relevant information, answer based on it</blockquote><blockquote>c. If not, inform the user that the image library documentation does not contain information to answer their question, then display the PDF using display_pdf_from_locator so they can review it directly</blockquote><blockquote>d. Never ask to upload the image as the user is referring to the selected image.</blockquote><blockquote>If a search returns no results or the PDF fields are missing, let the user know.</blockquote><p>The widget UI and the agent enable the fully agentic experience.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/736/1*4oU2PMznbUXAn2u1xYwlGg.png" /><figcaption>Image 6 — Agentic medical image question-and-answer experience</figcaption></figure><p><a href="https://www.teradata.com/platform/clearscape-analytics/enterprise-vector-store">Learn more about Teradata multimodal agentic capabilities.</a></p><p>Build the demo yourself at <a href="https://www.teradata.com/getting-started/demos/clearscape-analytics?utm_campaign=gbl-clearscape-analytics-devrel&amp;utm_content=demo&amp;utm_id=7016R000001n3bCQAQ">teradata.com/experience</a>.</p><p><strong>Conclusion</strong></p><p>Bringing together Teradata Enterprise Vector Store and Unstructured creates a unified foundation for multimodal agentic systems, one where structured and unstructured data coexist naturally, governance stays simple, and developers can build advanced retrieval-augmented workflows without extra pipelines or infrastructure overhead. High-fidelity ingestion, scalable vector indexing, and agentic tooling combine into a stack that reduces the friction between experimentation and production. 
The walkthrough here is one example of the pattern, but the same approach applies to customer experience, automation, and context-aware intelligence across industries. As multi-modal data and agentic AI continue to expand, this integrated stack gives organizations a practical path to accurate, grounded, and enterprise-ready agentic systems built for scale.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=c3686b5c6198" width="1" height="1" alt=""><hr><p><a href="https://medium.com/teradata/end-to-end-multimodal-rag-and-agents-with-teradata-and-unstructured-c3686b5c6198">End-to-End Multimodal RAG and Agents With Teradata and Unstructured</a> was originally published in <a href="https://medium.com/teradata">Teradata</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Introducing Teradata Enterprise MCP: Build Your First Data Analyst Agent]]></title>
            <link>https://medium.com/teradata/introducing-teradata-enterprise-mcp-build-your-first-data-analyst-agent-c546dbd2d424?source=rss----8adfdb056496---4</link>
            <guid isPermaLink="false">https://medium.com/p/c546dbd2d424</guid>
            <category><![CDATA[data-analysis]]></category>
            <category><![CDATA[mcp-server]]></category>
            <category><![CDATA[teradata]]></category>
            <category><![CDATA[generative-ai-tools]]></category>
            <category><![CDATA[agentic-ai]]></category>
            <dc:creator><![CDATA[Daniel Herrera]]></dc:creator>
            <pubDate>Fri, 30 Jan 2026 12:02:23 GMT</pubDate>
            <atom:updated>2026-02-02T10:16:57.661Z</atom:updated>
<content:encoded><![CDATA[<p>Teradata Enterprise MCP is designed to empower organizations to build advanced data analyst agents through secure, scalable, and agentic AI-powered analytics.</p><figure><img alt="Teradata Enterprise MCP" src="https://cdn-images-1.medium.com/max/975/1*vQUNvGB84MEO_xJNxgPCYw.png" /></figure><p>An <a href="https://www.teradata.com/insights/ai-and-machine-learning/autonomous-ai-knowledge-platform">autonomous enterprise</a> is one that uses its knowledge, applies agentic reasoning, and delivers faster, higher‑quality outcomes for customers. It operates through a modern stack that unifies a knowledge layer, an intelligent agent layer, and an outcomes layer, enabling systems to reason and act with context rather than rely solely on manual workflows.</p><p>The agentic layer combines the reasoning capabilities of Large Language Models (LLMs) with trusted data, policies, context, and analytics, enabling agents to perform actions such as querying structured and unstructured data, executing advanced analytics, running predictions, retrieving documents, and grounding responses in enterprise reality. This connection transforms AI from experimentation into operational impact.</p><p>Teradata Enterprise MCP acts as a core pillar of this autonomy by providing secure access to data, tool standardization, context management, auditability, and high scalability for agent actions. It ensures that agents can interact with enterprise systems in a trusted, governed, and consistent way while using analytics and knowledge capabilities at production scale.</p><p><strong>The Model Context Protocol (MCP) in brief:</strong></p><p>MCP is an <a href="https://www.anthropic.com/news/model-context-protocol">open-source standard designed to make AI systems talk to the outside world in a predictable, structured way</a>. 
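</p><p>For instance, every tool an MCP server exposes is described in the same shape: a name, a human-readable description, and a JSON Schema for its inputs. A simplified sketch of such a descriptor, assuming a hypothetical SQL tool (the field names follow the MCP tool-listing format):</p>

```python
import json

# Hypothetical tool descriptor in the shape used by MCP's tools/list response:
# a name, a description, and a JSON Schema describing the expected arguments.
tool_descriptor = {
    "name": "run_sql_query",
    "description": "Execute a read-only SQL query and return the rows.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "sql": {"type": "string", "description": "The SQL statement to run"},
        },
        "required": ["sql"],
    },
}

# Because the shape is standardized, any MCP client can discover and
# validate tools generically instead of parsing a bespoke format.
print(json.dumps(tool_descriptor["inputSchema"]["required"]))  # ["sql"]
```

<p>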
Instead of every developer inventing a new format for tools, capabilities, or metadata, MCP provides a common contract defining all these aspects, standardizing LLM interactions with external tools. For a developer, this means spending more time designing the behavior of AI-powered agents, not worrying about how to package or describe needed tools or necessary parameters. The protocol takes care of that for you.</p><p>Teradata research teams and a group of community contributors built the community-driven <a href="https://github.com/Teradata/teradata-mcp-server">Teradata MCP</a> server. This open-source effort has taken off in ways none of us fully expected. Suddenly, people were building agents that could explore data, run analytics, and automate workflows directly inside the Teradata ecosystem. The best part is that the project keeps getting better as contributors continue to refine the server, add new tools, tighten the interface, and push the boundaries of what MCP-powered agents can do.</p><figure><img alt="Teradata MCP server architecture" src="https://cdn-images-1.medium.com/max/734/1*lWtDvq-4GS5Pf3AdIgdAUg.png" /><figcaption>Teradata MCP server architecture</figcaption></figure><p><strong>The challenges of deploying and implementing an MCP server</strong></p><p>MCP isn’t without its challenges. While implementing an MCP server for an enterprise knowledge platform, the same questions always come up. Can it scale across teams? Can it connect to sensitive data with the right governance and privacy controls? What happens when teams start chaining servers together and bump into context‑overflow issues?</p><p>In consideration of these pain points, Teradata developed Teradata Enterprise MCP, which tackles the core operational and governance challenges head‑on. 
It enables data professionals to focus on building agents that deliver value, easing the concerns around deployment, scaling, and access controls that are common to MCP as a protocol.</p><p><strong>How Teradata Enterprise MCP contributes to your Agentic AI projects</strong></p><p>Teradata Enterprise MCP ships with a set of features that tackle common implementation challenges.</p><figure><img alt="Features of Teradata Enterprise MCP" src="https://cdn-images-1.medium.com/max/975/1*vQUNvGB84MEO_xJNxgPCYw.png" /><figcaption>Features of Teradata Enterprise MCP</figcaption></figure><p>Security:</p><ul><li>Every user connects to Teradata Enterprise MCP using their own credentials, whether that’s basic TD2 authentication or JWT and Bearer tokens validated through the organization’s identity provider. This ensures the server only exposes the data that specific users are authorized to see, instead of routing all users through a single service account, unless the organization deliberately chooses a single-service-account implementation.</li><li>RBAC controls tool visibility. MCP roles, with their corresponding assigned toolsets, can be mapped to Teradata database roles, so you can define exactly which tools a user can access according to the user’s role.</li><li>SQL injection protection, security scans, and end-to-end logging provide an enterprise-grade security posture without adding friction for developers.</li></ul><p>Performance:</p><ul><li>The server is built for multiuser, multi-session workloads with automatic connection pooling and sub-second response times. 
Teams across the company can reliably share one MCP server without creating bottlenecks.</li></ul><p>Tool Context Management:</p><ul><li>A deep catalog of more than a hundred production-ready tools, from database queries and DBA functions to ML, vector search, and RAG, could be difficult for LLMs to manage. However, because tool exposure is governed per role, each agent stays within a clean, compliant sandbox.</li></ul><p><strong>Experience Teradata Enterprise MCP</strong></p><p>You can experience Teradata Enterprise MCP directly inside <a href="https://www.teradata.com/platform/clearscape-analytics">ClearScape Analytics Experience</a>. We’ve put together a <a href="https://github.com/JH255095/teradata_enterprise_mcp/blob/main/Langchain_Teradata_MCP.ipynb">sample notebook</a> that shows a simple LangChain integration. This is only a starting point that you can extend according to your needs.</p><p>You can easily experiment on your own by following the steps below:</p><p><strong>Setting up your development environment</strong></p><ul><li>If you haven’t used ClearScape Analytics Experience before, start by creating an account at <a href="https://www.teradata.com/getting-started/demos/clearscape-analytics?utm_campaign=gbl-clearscape-analytics-devrel&amp;utm_content=demo&amp;utm_id=7016R000001n3bCQAQ">teradata.com/experience.</a></li><li>Once you’re logged in, create <strong>a new environment.</strong> This ensures you’re working with the latest features and capabilities. 
Take note of your chosen password as you will need it to create the MCP server configuration later.</li></ul><figure><img alt="Create an environment on ClearScape Analytics Experience" src="https://cdn-images-1.medium.com/max/948/1*MdF9wcmXWgHqiXYCb7IGAA.png" /><figcaption>Create an environment on ClearScape Analytics Experience</figcaption></figure><ul><li>Start the Jupyter Notebook environment by clicking “Run demos”.</li></ul><figure><img alt="Run demos on ClearScape Analytics Experience" src="https://cdn-images-1.medium.com/max/975/1*SRlqJmCqOQpQYOxGaZ82Hg.png" /><figcaption>Run demos on ClearScape Analytics Experience</figcaption></figure><ul><li>In the Jupyter Notebook environment, open the “Getting Started” folder, look for the folder named Enterprise_MCP_Data_Analyst_Agent, and open the notebook inside that folder.</li></ul><p><strong>Exploring the notebook</strong></p><ul><li>First, we install the packages needed for the MCP integration with LangChain.</li></ul><pre>pip install -U langchain langchain-mcp-adapters langchain-openai httpx</pre><ul><li>After restarting the kernel, we import the libraries that will power our agent: asyncio for async operations, the MCP client adapter, and LangChain’s chat model and agent components.</li></ul><pre>import asyncio, sys, os, httpx<br>import time<br>from dotenv import load_dotenv<br>from langchain_mcp_adapters.client import MultiServerMCPClient<br>from langchain.chat_models import init_chat_model<br>from langchain.agents import create_agent<br>from langchain_core.messages import HumanMessage<br>from langchain_core.messages import AIMessage</pre><ul><li>The notebook pulls in your environment settings from a .env file. 
This keeps your credentials secure and your code clean.</li></ul><pre>environment_path = &quot;/home/jovyan/JupyterLabRoot/VantageCloud_Lake/.config/.env&quot;<br>load_dotenv(dotenv_path=environment_path) <br>llm_key = os.getenv(&#39;litellm_key&#39;)<br>llm_url = os.getenv(&#39;litellm_base_url&#39;)<br>username = os.getenv(&#39;username&#39;)<br>password = os.getenv(&#39;my_variable&#39;)</pre><ul><li>Here we establish the connection to the Enterprise MCP Server using LangChain’s MultiServerMCPClient. The MCP server uses the credentials retrieved from the environment variables to connect on your behalf and returns the available tools.</li></ul><pre>async def get_mcp_tools():<br>    client = MultiServerMCPClient(<br>        {<br>            &quot;teradata&quot;: {<br>                &quot;transport&quot;: &quot;http&quot;,<br>                &quot;url&quot;: &quot;http://host.docker.internal:8001/mcp&quot;,<br>                &quot;auth&quot;: httpx.BasicAuth(username, password),<br>            }<br>        }<br>    )<br>    tools = await client.get_tools()<br>    return client, tools<br><br>client, tools = await get_mcp_tools()<br>print([t.name for t in tools])</pre><ul><li>We initialize a chat model through the LLM proxy available in ClearScape Analytics Experience, then create a LangChain agent by passing in the Teradata MCP server tools. 
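</li></ul><p>Before handing the tools to the agent, you can also narrow them down client-side, for example to expose only read-oriented tools to an analyst agent. A small sketch of filtering by tool name (the tool names here are hypothetical; role-based tool exposure on the server remains the governed way to restrict access):</p>

```python
from types import SimpleNamespace

# Stand-ins for the tool objects returned by client.get_tools();
# real LangChain tool objects likewise expose a .name attribute.
tools = [
    SimpleNamespace(name="read_query"),
    SimpleNamespace(name="write_query"),
    SimpleNamespace(name="table_space_report"),
]

# Allow-list for a read-only analyst agent (hypothetical names)
allowed = {"read_query"}

analyst_tools = [t for t in tools if t.name in allowed]
print([t.name for t in analyst_tools])  # ['read_query']
```

<p>The filtered list can then be passed to `create_agent` in place of the full toolset.</p><ul><li>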
The system prompt tells the agent it’s a data analyst working with Teradata.</li></ul><pre>llm = init_chat_model( <br>    model=&quot;openai-gpt-41&quot;, <br>    model_provider=&quot;openai&quot;, <br>    base_url=&quot;https://llmlite.ci.clearscape.teradata.com&quot;, <br>    api_key=llm_key, ) <br><br>agent = create_agent(<br>        model=llm, <br>        tools=tools, <br>        system_prompt=(<br>        &#39;&#39;&#39; You are a data analyst that responds to user questions regarding data in a teradata system &#39;&#39;&#39; <br>        ), )</pre><ul><li>With everything set up, you can send natural language queries to the agent. It interprets your question, picks the right MCP tools, and executes them against your Teradata environment.</li></ul><pre>user_query = input(&#39;\n What would you like to know about your data? &#39;)<br>result = await agent.ainvoke(<br>    {&quot;messages&quot;: [HumanMessage(content=user_query)]}<br>)</pre><figure><img alt="Query submitted to the agent" src="https://cdn-images-1.medium.com/max/878/1*oPkm_4QXOs3kL2ImAPKZTw.png" /><figcaption>Query submitted to the agent</figcaption></figure><ul><li>The agent’s answer to this specific question is as follows:</li></ul><pre>print(result[&quot;messages&quot;][-1].content)</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/975/1*b7YgVCl-bsObqlVOvEmdOg.png" /><figcaption>Agent response to the user query</figcaption></figure><p>ClearScape Analytics Experience is free of cost with access to limited resources; for a full demo on a production-grade Teradata environment, feel free to reach out to us at <a href="https://www.teradata.com/platform/agent-stack">Teradata Enterprise AgentStack</a>.</p><p><strong>Conclusion:</strong></p><p>The shift toward autonomous enterprises is no longer theoretical. 
With agentic AI, unified knowledge layers, and governed execution environments, organizations now have the foundations they need to build systems that reason, act, and deliver value with less manual intervention. The Teradata Enterprise MCP Server plays a significant role in making this real. It removes the operational and security hurdles that traditionally slow down MCP adoption and replaces them with a platform that feels stable, scalable, and ready for enterprise use. By giving teams a secure, governed, and high‑performance way to connect LLMs and agents to trusted data, the Enterprise MCP Server frees developers and data practitioners to do what they do best: build. Whether you are exploring early prototypes, scaling agentic workloads, or preparing for production‑grade deployments, the Enterprise MCP Server provides the backbone that lets you move faster, stay compliant, and unlock the full potential of agentic AI inside your organization.</p><p><strong>About the Author:</strong></p><p>Daniel Herrera is Principal Developer Advocate at Teradata, specializing in AI/ML and advanced analytics. He helps developers and data practitioners unlock enterprise-scale innovation through ClearScape Analytics and VantageCloud. With deep expertise in data engineering, cloud architecture, and model deployment, Daniel bridges technical complexity with practical solutions. 
A frequent speaker and community contributor, he focuses on enabling scalable AI use cases across industries.</p><p>You can reach out to Daniel in <a href="https://www.linkedin.com/in/daniel-herrera/">LinkedIn</a></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=c546dbd2d424" width="1" height="1" alt=""><hr><p><a href="https://medium.com/teradata/introducing-teradata-enterprise-mcp-build-your-first-data-analyst-agent-c546dbd2d424">Introducing Teradata Enterprise MCP: Build Your First Data Analyst Agent</a> was originally published in <a href="https://medium.com/teradata">Teradata</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[A Gateway to Enhanced Agents & AI Workloads: Arrow Flight SQL with QueryGrid®]]></title>
            <link>https://medium.com/teradata/a-gateway-to-enhanced-agents-ai-workloads-arrow-flight-sql-with-querygrid-e6b4c6721c2a?source=rss----8adfdb056496---4</link>
            <guid isPermaLink="false">https://medium.com/p/e6b4c6721c2a</guid>
            <category><![CDATA[data-fabric]]></category>
            <category><![CDATA[data-science]]></category>
            <category><![CDATA[ai]]></category>
            <category><![CDATA[ai-agent]]></category>
            <category><![CDATA[apache-arrow]]></category>
            <dc:creator><![CDATA[Sebastian]]></dc:creator>
            <pubDate>Thu, 29 Jan 2026 14:36:49 GMT</pubDate>
            <atom:updated>2026-01-29T14:36:48.428Z</atom:updated>
            <content:encoded><![CDATA[<h4>Unveiling the power of the Arrow Flight SQL connector for QueryGrid®</h4><p>The introduction of the Arrow Flight SQL connector for QueryGrid® marks a significant advancement for organizations seeking to accelerate data-driven &amp; AI-driven innovation. By enabling streamlined, high-performance access to diverse data sources without the overhead of traditional ETL processes, this capability empowers teams to unlock faster insights, enhance collaboration, and optimize their AI workflows. As a result, customers can maximize the value of their existing data infrastructure while future-proofing their operations for emerging AI and agentic workloads.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*GXpNQqtsXD4crfriTOFtyg.png" /><figcaption>QueryGrid(r) connector stack</figcaption></figure><h3>Introduction to Apache Arrow</h3><p><a href="https://arrow.apache.org/">Apache Arrow</a> is a cross-language development platform designed for in-memory data processing. It provides a standardized columnar memory format for flat and hierarchical data, enabling efficient analytics operations. Arrow’s key strength lies in its ability to handle large datasets with minimal overhead, facilitating rapid data access and reduced memory usage. Over the last few years, Arrow has gained traction and interest in the Data Science community, with commits to its codebase added every day.</p><h3>Arrow Flight and Arrow Flight SQL</h3><p><a href="https://arrow.apache.org/docs/format/Flight.html">Arrow Flight</a> is an extension of Apache Arrow that introduces high-performance messaging for data transfer. It leverages gRPC to enable efficient, secure, and highly scalable data transport between systems. 
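</p><p>The columnar layout underpinning all of this is easy to picture. The toy sketch below contrasts row-wise and column-wise storage in plain Python; Arrow itself keeps each column in contiguous, typed memory buffers, which is what makes analytic scans and zero-copy transfers cheap:</p>

```python
# Row-wise layout: one object per record (the typical JDBC/ORM shape)
rows = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": 20.5},
    {"id": 3, "amount": 30.0},
]

# Columnar layout: one contiguous sequence per column (Arrow's shape)
columns = {
    "id": [1, 2, 3],
    "amount": [10.0, 20.5, 30.0],
}

# An analytic query such as SUM(amount) touches a single column;
# the columnar layout reads it without unpacking every row object.
total_rowwise = sum(r["amount"] for r in rows)
total_columnar = sum(columns["amount"])
print(total_rowwise, total_columnar)  # 60.5 60.5
```

<p>Arrow Flight streams such column buffers between processes in parallel over gRPC.</p><p>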
Building upon Arrow Flight, <a href="https://arrow.apache.org/docs/format/FlightSql.html">Arrow Flight SQL</a> provides a protocol for executing SQL queries over Arrow Flight, allowing seamless and efficient interaction with SQL-based systems. The performance achieved is over 15 times higher than what a single-threaded protocol like JDBC can deliver (based on lab tests).</p><h3>Meet QueryGrid® Arrow Flight SQL connector</h3><p>QueryGrid® <a href="https://medium.com/teradata/understanding-querygrid-a-gateway-to-optimized-distributed-data-access-7d381db1e2ac">is a sophisticated query fabric</a> that seamlessly connects diverse data sources and processing engines. It enables businesses to execute queries across multiple platforms, integrating disparate data environments into a <a href="https://www.teradata.com/platform/data-fabric">unified analytical ecosystem</a>. The integration of QueryGrid® with Arrow Flight SQL extends this capability to a broader range of engines and systems, and provides a net new capability: now client tools can also connect through the QueryGrid® Arrow Flight SQL connector. All is done leveraging Arrow’s modern and efficient data format and transport protocols, maintaining QueryGrid’s legacy: providing high-throughput parallel data exchange, and minimizing data movement when possible.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Xv8xGSasO3uMLAkG0CZghg.png" /><figcaption>QueryGrid(r) Arrow Flight SQL Connector</figcaption></figure><h4>Client Server Architecture</h4><p>QueryGrid® Arrow Flight SQL connector comes in two flavors: Client &amp; Server. When the client capability is configured, the <a href="https://www.teradata.com/">Teradata Vantage</a> database can read data in Arrow format from other engines on the fly, without the need to copy the data upfront or perform any ETL processes (capability is available today). 
When the Server is configured, the Teradata Vantage database is exposed as an Arrow Flight SQL endpoint, allowing not only engines, but also tools, to read and write data in parallel, leveraging this highly efficient columnar format and transfer protocol. The server capability will be released soon, but don’t let this block you from planning how to leverage it in your future architecture; you can start the conversation with us today and <a href="https://www.teradata.com/insights/videos/data-federation-for-ai">see this in action</a>.</p><h4>Expanding the Ecosystem</h4><p>Integrating QueryGrid® with Apache Arrow allows enterprises to tap into a wide array of data processing engines supported by Arrow Flight SQL, such as DuckDB, Dremio, and Apache Doris, just to name a few. It also allows any ADBC-compliant driver to read data from Vantage <em>(coming soon with server capability).</em></p><h4>Enhancements in Data Processing</h4><p>QueryGrid® and Apache Arrow facilitate smooth, high-speed parallel data transfer and query execution across multiple data sources. Combined with the intelligent push-down processing that QueryGrid® provides, this enables efficient data exchange between applications, something that is key in a distributed environment.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/997/1*Q-PhpcDU8TMBYylEOCIvmw.png" /><figcaption>QueryGrid MCP Server CE enables Agentic workflows</figcaption></figure><h3>Built for AI Workloads &amp; Agents</h3><p>The integration of QueryGrid® with Apache Arrow is particularly advantageous for AI and Agentic workloads. These workloads often require the processing of massive datasets, real-time data streaming, and the execution of complex queries. 
Apache Arrow’s columnar format and the high-performance transmission via Arrow Flight and Arrow Flight SQL provide the robustness and speed necessary for these demanding tasks, while also simplifying the interface between systems.</p><h4>Accelerating Training and Inference</h4><p>With the capability to efficiently handle large volumes of data, the integration aids in speeding up both the training and inference phases of machine learning models. Data scientists can seamlessly access and manipulate data from various sources, reducing the time spent on data preparation and transformation.</p><h4>Enhanced Interoperability and Flexibility</h4><p>This integration ensures that models can leverage the best-suited data processing engines, enhancing flexibility and scalability. We know that agents are greedy: they require massive amounts of data, as well as the consolidation of multiple systems. The QueryGrid® Arrow Flight SQL connector exposes a simple interface for agents to efficiently grab data from multiple engines and perform complex tasks. Imagine the power you can get if you also connect this with <a href="https://medium.com/teradata/chat-with-your-data-in-visual-studio-code-through-an-mcp-server-071673bd3ffe">Teradata’s</a> and <a href="https://medium.com/teradata/why-youll-want-querygrid-mcp-server-community-edition-in-your-toolkit-cb0872bf9c8a">QueryGrid</a> MCP Servers, or the <a href="https://www.teradata.com/press-releases/2026/teradata-unveils-enterprise-agentstack">Teradata AgentStack</a>.</p><h3>Conclusion</h3><p>The QueryGrid® Arrow Flight SQL connector delivers high performance and seamless integration for AI and Agentic workloads. By utilizing Apache Arrow’s efficient columnar data format and high-speed transport protocols, QueryGrid® enables robust connectivity across numerous engines and systems, empowering organizations to execute advanced analytics and AI/ML tasks with ease. 
This integration not only streamlines data workflows but also provides agents with an efficient and unified interface to access large volumes of data from diverse sources, unlocking new possibilities in data science and machine learning.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=e6b4c6721c2a" width="1" height="1" alt=""><hr><p><a href="https://medium.com/teradata/a-gateway-to-enhanced-agents-ai-workloads-arrow-flight-sql-with-querygrid-e6b4c6721c2a">A Gateway to Enhanced Agents &amp; AI Workloads: Arrow Flight SQL with QueryGrid®</a> was originally published in <a href="https://medium.com/teradata">Teradata</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Why You’ll Want QueryGrid® MCP Server Community Edition in Your Toolkit]]></title>
            <link>https://medium.com/teradata/why-youll-want-querygrid-mcp-server-community-edition-in-your-toolkit-cb0872bf9c8a?source=rss----8adfdb056496---4</link>
            <guid isPermaLink="false">https://medium.com/p/cb0872bf9c8a</guid>
            <category><![CDATA[data-fabric]]></category>
            <category><![CDATA[cloud]]></category>
            <category><![CDATA[ai]]></category>
            <category><![CDATA[mcps]]></category>
            <category><![CDATA[agents]]></category>
            <dc:creator><![CDATA[Sebastian]]></dc:creator>
            <pubDate>Tue, 06 Jan 2026 12:41:04 GMT</pubDate>
            <atom:updated>2026-01-06T12:40:32.667Z</atom:updated>
            <content:encoded><![CDATA[<h4>Streamline, automate, and secure your QueryGrid management with this new release</h4><p>Let’s face it, managing large data resources can become complicated quickly. If you’re working with <a href="https://www.teradata.com/platform/data-fabric">QueryGrid</a>®, you already know that staying on top of your environment means juggling a lot of moving parts. Enter the <a href="https://github.com/Teradata/teradata-qg-mcp-server">QueryGrid® MCP Server Community Edition</a> (CE), a freshly launched toolkit that’s about to make your life a whole lot easier.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/997/1*O4X21e5sQbPu0XpFvenwQQ.png" /><figcaption><em>QueryGrid(r) MCP Server components</em></figcaption></figure><h3>What Is the QueryGrid MCP Server CE?</h3><p>Think of the <a href="https://www.teradata.com/insights/ai-and-machine-learning/model-context-protocol-mcp">MCP Server CE</a> as your new command center for all things QueryGrid® management. It’s designed to help <a href="https://www.teradata.com/press-releases/2025/mcp-server-agentic-ai-at-scale">AI assistants and other MCP clients</a> talk directly to the QueryGrid Manager (QGM) using a standardized protocol. The focus here isn’t on querying data directly: it’s about making administrative and operational tasks faster, smarter, and more secure. And if you’re already running QueryGrid 3.x, you’re good to go; this release is fully compatible with your environment.</p><h3>Key Features</h3><ul><li><strong>Massive Toolkit</strong>: Over 120 built-in tools support just about every QueryGrid operation you can think of: monitoring resources, adding and removing sources, tweaking configs, creating and dropping links, and tons more.</li><li><strong>Controlled Security</strong>: Basic authentication for QueryGrid Manager means better access control. 
Plus, logging is totally configurable, so you get the transparency you need.</li><li><strong>RESTful API Integration</strong>: Full support for Create, Read, Update, and Delete (CRUD) operations. All you need for the automation you’ve been looking for.</li></ul><h3>Why Switch to MCP Server CE?</h3><ul><li><strong>Boosted Efficiency</strong>: The toolkit is comprehensive and fully tested, backed by a robust API layer provided by the QueryGrid® Manager service. That means even complex management tasks are handled quickly and consistently, so you can accelerate your development and deployment cycles.</li><li><strong>Automation is King</strong>: With the ability to programmatically manage QueryGrid® resources, you can automate all your admin workflows — whether through chatbots, agents, or other tools — and get back to the fun stuff.</li><li><strong>Built-in Security</strong>: Only authorized users or systems can make changes, thanks to secure authentication mechanisms.</li><li><strong>Ready for Scale</strong>: Built for production from the ground up, with configurable logging for transparency and traceability. Start with small projects and climb to mission-critical deployments.</li></ul><h3>Takeaways</h3><p>Whether you’re a developer, data engineer, DBA, or someone who wants smoother QueryGrid® management, the QueryGrid® MCP Server Community Edition is a game-changer. It’s secure, flexible, and packed with features that will make your admin processes more efficient and reliable. If you’re ready to embrace modern automation and development practices, this toolkit deserves a spot in your environment. 
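To ground the RESTful API point: every CRUD operation ultimately travels as an authenticated HTTP request to QueryGrid Manager. Here is a minimal sketch using only the Python standard library; the host and route below are hypothetical placeholders, so check the QueryGrid Manager API documentation for the real endpoints.

```python
import base64
import urllib.request

QGM_HOST = "https://qgm.example.com"  # hypothetical host
ROUTE = "/api/systems"                # hypothetical route; see the QGM API docs

# Basic authentication: base64-encode "user:password" into an Authorization header.
token = base64.b64encode(b"admin:secret").decode("ascii")

req = urllib.request.Request(
    QGM_HOST + ROUTE,
    headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    method="GET",  # POST/PUT/DELETE cover the rest of CRUD
)
print(req.get_header("Authorization"))  # Basic YWRtaW46c2VjcmV0
```

The MCP Server CE sits on top of this same API layer, exposing the operations as tools that an agent or chatbot can call, so you don’t have to hand-craft requests like this yourself.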
And remember, you can expand the power of this freshly released server with <a href="https://github.com/Teradata/teradata-mcp-server/">Teradata’s MCP Server Community Edition</a>.</p><p>Start your journey now by downloading your copy from <a href="https://github.com/Teradata/teradata-qg-mcp-server">Teradata GitHub</a>.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=cb0872bf9c8a" width="1" height="1" alt=""><hr><p><a href="https://medium.com/teradata/why-youll-want-querygrid-mcp-server-community-edition-in-your-toolkit-cb0872bf9c8a">Why You’ll Want QueryGrid® MCP Server Community Edition in Your Toolkit</a> was originally published in <a href="https://medium.com/teradata">Teradata</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Building Autonomous Customer Intelligence: A Developer’s Guide to Teradata’s Customer Intelligence…]]></title>
            <link>https://medium.com/teradata/building-autonomous-customer-intelligence-a-developers-guide-to-teradata-s-customer-intelligence-a28d8dd77c41?source=rss----8adfdb056496---4</link>
            <guid isPermaLink="false">https://medium.com/p/a28d8dd77c41</guid>
            <category><![CDATA[mcp-server]]></category>
            <category><![CDATA[ai-agent]]></category>
            <category><![CDATA[llm]]></category>
            <category><![CDATA[ai]]></category>
            <category><![CDATA[customer-experience]]></category>
            <dc:creator><![CDATA[Vidhan Bhonsle]]></dc:creator>
            <pubDate>Tue, 06 Jan 2026 12:32:58 GMT</pubDate>
            <atom:updated>2026-01-06T12:32:55.861Z</atom:updated>
            <content:encoded><![CDATA[<h3>Building Autonomous Customer Intelligence: A Developer’s Guide to Teradata’s Customer Intelligence Framework</h3><h3>Introduction</h3><p>The way customer data systems are built is undergoing a remarkable transformation. Across the industry, significant milestones have been reached — unified Customer 360 views, sophisticated machine learning (ML) models, and real-time streaming pipelines. These foundations set up the next leap: systems that don’t just analyze customer behavior, but actively shape experiences in real time.</p><p>The <a href="https://www.teradata.com/solutions/customer-experience"><strong>Teradata Customer Intelligence Framework</strong></a> represents this next generation of customer intelligence architecture. Built on established data engineering and machine learning practices, it enables autonomous, real-time decision-making. The framework equips data scientists, data engineers, ML engineers, and platform architects to create systems that sense needs, understand context, and take immediate action, powered by AI for CX that delivers measurable business impact.</p><h3>The challenge</h3><p>Modern customer data architectures hit the same three walls repeatedly. Data lives in disconnected systems, making a reliable Customer 360 fragile at best. Batch-heavy pipelines introduce latency, so insights arrive after the moment has passed. And unresolved customer tasks often escalate to costly channels, driving up operational costs while eroding satisfaction.</p><figure><img alt="Pain points of customer experiences" src="https://cdn-images-1.medium.com/max/1024/1*UpWTFRDjVpFES2mS7jtLeQ.png" /><figcaption>Pain points of customer experiences</figcaption></figure><p>The Customer Intelligence Framework addresses these limits with an event-driven approach: unify raw streams into reusable data products, detect signals that carry business meaning in real time, and trigger guard-railed agent decisions that act immediately. 
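As a toy illustration of that closed loop (all names here are invented for the sketch, not framework APIs), a guard-railed sense-decide-act cycle looks like this:

```python
# Toy sense -> decide -> act loop; names are illustrative, not framework APIs.
EVENTS = [
    {"customer": "C1", "logins_last_30d": 12},
    {"customer": "C2", "logins_last_30d": 1},  # engagement drop
]

def detect_signal(event):
    """Sense: turn a raw event into a business-meaningful signal (or None)."""
    if event["logins_last_30d"] < 3:
        return {"customer": event["customer"], "signal": "engagement_drop"}
    return None

def decide(signal, offers_sent_today=0, max_offers_per_day=100):
    """Decide: apply a business guardrail before any action fires."""
    if offers_sent_today >= max_offers_per_day:
        return "suppress"
    return "send_retention_offer"

actions = []
for event in EVENTS:
    signal = detect_signal(event)                         # sense
    if signal is not None:
        actions.append((signal["customer"], decide(signal)))  # decide + act

print(actions)  # [('C2', 'send_retention_offer')]
```

The framework does each of these steps at enterprise scale: signal detection runs continuously in-database, and the guardrails are configured policies rather than hard-coded constants.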
In short, move from fragmented data and delayed reports to a closed loop that senses, decides, and acts.</p><h3>Architecture: from data to intelligence</h3><p>The Customer Intelligence Framework transforms raw data into autonomous intelligence through three core components working across five integrated layers:</p><h4><strong>Foundational Elements</strong></h4><figure><img alt="Core components of Customer Intelligence Framework" src="https://cdn-images-1.medium.com/max/1018/1*WEE7_RXJCmdzs_5J1WJsNg.png" /><figcaption>Core components of Customer Intelligence Framework</figcaption></figure><ul><li><strong>Data products</strong> — Reusable, structured assets that organize complex customer data. These multi-dimensional building blocks can be assembled differently for various use cases, with full version control and API accessibility via REST, Python SDKs, or SQL.</li><li><strong>Signals</strong> — Real-time patterns detected from customer behavior, not just transactions. The framework continuously identifies meaningful events like engagement drops, anomalous patterns, or service escalations, enabling immediate response rather than batch processing.</li><li><strong>Agents</strong> — Intelligent entities that autonomously execute decisions within configured guardrails. These agents operate across the end-to-end framework — from data orchestration and feature engineering to signal detection and service activation. They provide natural language interfaces, coordinate with other agents, and maintain full audit trails of their reasoning and actions.</li></ul><h4><strong>Architecture Components</strong></h4><figure><img alt="Customer Intelligence Framework Diagram" src="https://cdn-images-1.medium.com/max/1024/1*gZ40YGrZBBh6ERjXtX3Edw.png" /><figcaption>Customer Intelligence Framework Diagram</figcaption></figure><ul><li><strong>Data ingestion</strong>. 
Structured, semi-structured, and unstructured sources land through streaming, ETL, object storage, and APIs, and are catalogued for lineage.</li><li><strong>Feature engineering</strong> (in-database). Transform inputs into robust features and vector representations close to the data to minimize movement.</li><li><strong>Models and rules</strong>. Hybrid decisioning blends ML models, heuristics, and policy rules to balance accuracy, cost, and control.</li><li><strong>Signal processing</strong>. Detect, evaluate, and fuse patterns; a semantic layer maps technical signals (scores, thresholds, sequences) to business concepts (intent, risk, opportunity).</li><li><strong>Service activation</strong> (VCX). Publish decisions and signals via pub/sub so downstream applications subscribe to just-in-time intelligence (NBA/NBO, retention, remediation). Outcomes flow back for continuous learning.</li></ul><p>The architecture flows from data ingestion (supporting structured, semi-structured, and unstructured sources) through feature engineering and model deployment using <a href="https://www.teradata.com/platform/clearscape-analytics">ClearScape Analytics®</a> for in-database execution.</p><p>Signal processing interprets these patterns through a semantic layer that maps technical signals to business concepts, while the service activation layer, powered by <a href="https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/Vantage-Customer-Experience-User-Guide/What-Is-Vantage-Customer-Experience">Teradata VCX</a>, streams real-time intelligence to downstream applications via publish/subscribe patterns.</p><p>Key technical advantages:</p><ul><li>Industry Data Models provide pre-built schemas for rapid deployment</li><li>In-database processing eliminates data movement overhead</li><li>Multi-model orchestration combines rules, ML models, and heuristics</li><li>Every application can subscribe to relevant signals via REST APIs</li></ul><h3>Customer lifetime value multi-agent system 
demo</h3><p>This demo shows how autonomous customer intelligence turns CLV forecasting and growth into a production workflow. The multi-agent system runs in <a href="https://www.teradata.com/insights/ai-and-machine-learning/agentic-ai-stack-with-teradata-agentbuilder?utm_medium=organic">Teradata AgentBuilder</a>, keeps computation inside Teradata Vantage, and analyzes historical behavior directly in-database. LLMs — GPT-4.1 (routing/response) and Claude Sonnet-4 (analysis/strategy) — drive intent understanding, exploration, and decisioning, combining reasoning with real-time signals to answer questions like “Which customer type has the highest CLV and why?” and to recommend next-best actions that lift retention and cross-sell.</p><p>The implementation uses Flowise inside AgentBuilder for low-code composition; the pattern is integration-friendly and supports additional connectors.</p><figure><img alt="Architecture of the Customer Lifetime Value multi-agent system" src="https://cdn-images-1.medium.com/max/1024/1*UeERc7qoAtCQnZTsZ7HVzg.png" /><figcaption>Architecture of the Customer Lifetime Value multi-agent system</figcaption></figure><p>In the multi-agent system architecture, a request enters at “Start”, the conditional agent classifies intent, and exactly one specialist is selected — data exploration, insights generator, or strategic reasoning. The “check if chart required” gate then decides whether a visual adds clarity. If a chart is warranted, the “Visualization Agent” prepares a spec and a custom JavaScript function renders the graphic using <a href="https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/Teradata-Package-for-Python-User-Guide/Plotting-in-teradataml">Teradata plotting</a>; otherwise, the LLM node composes a concise text response. 
The result is a governed path from question to decision, with visuals used only when they improve understanding.</p><p>The multi-agent system includes the following nodes:</p><ul><li><strong>Start node</strong>: Initializes the run by capturing the user’s input and shared context (customer/segment, time window, policy flags) for downstream nodes.</li><li><strong>Condition agent nodes</strong>: Route the flow based on intent (exploration/insights/strategy) and presentation needs (whether a chart is required).</li><li><strong>Agent nodes</strong>: Execute scoped tasks — data exploration, insight generation, strategic reasoning, and visualization using <a href="https://github.com/Teradata/teradata-mcp-server">Teradata MCP</a> tools against Vantage with clear guardrails.</li><li><strong>LLM node</strong>: Composes the final, business-ready response by fusing agent outputs, rationale, KPIs, and (if present) a chart caption.</li><li><strong>Custom function node</strong>: Renders artifacts like charts from a validated spec and returns an embeddable image/URI for the response.</li></ul><p>Let’s look at the flow of the whole system!</p><h4><strong>Start node</strong></h4><p>The start node initializes the flow, captures the user’s input (chat or form) and sets shared variables — question, optional customer/segment, time window, table handles, and policy flags, for all downstream nodes.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/589/1*a9nTjKmaVv3icO6aKfw_Kw.png" /><figcaption>Start Node configuration</figcaption></figure><h4><strong>Condition agent node: intent identification agent</strong></h4><p>The Intent Identification Agent is the first of the two condition-agent nodes in the system. 
It decides which agent to call from the three available specialists, based on the scenario expressed in the user’s query.</p><p>There are three scenarios:</p><ul><li>User is asking for basic details about tables, databases, columns, etc.<br>- Routes to <strong>Data Exploration Agent</strong></li><li>User is asking about banking churn, customer lifetime value (CLV), correlation, sentiment analysis, or charts like pie charts, histograms, etc.<br>- Routes to <strong>Insights Generator Agent</strong></li><li>User is asking for recommendations to reduce churn: what the main factors leading to churn or low CLV are, or which factors matter most for churn.<br>- Routes to <strong>Strategic Reasoning Agent</strong></li></ul><figure><img alt="Intent Identification Agent node configuration" src="https://cdn-images-1.medium.com/max/577/1*NQB6iwiVU96AMAnSJjYELg.png" /><figcaption>Intent Identification Agent node configuration</figcaption></figure><p>Based on the prompt instructions, an LLM such as <a href="https://azure.microsoft.com/en-us/blog/announcing-the-gpt-4-1-model-series-for-azure-ai-foundry-developers/">Azure ChatOpenAI (GPT-4.1)</a> interprets the intent, selects the appropriate scenario, and triggers the corresponding agent node. 
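In production the routing is done by the LLM prompt itself; a deterministic keyword stand-in (purely illustrative) captures the three-way split:

```python
# Illustrative stand-in for the Intent Identification Agent; the real node
# uses an LLM prompt rather than keyword matching.
STRATEGY_HINTS = ("recommend", "reduce churn", "main factors", "important factor")
INSIGHT_HINTS = ("churn", "clv", "lifetime value", "correlation", "sentiment",
                 "chart", "histogram", "pie")

def route(question: str) -> str:
    q = question.lower()
    if any(h in q for h in STRATEGY_HINTS):
        return "Strategic Reasoning Agent"
    if any(h in q for h in INSIGHT_HINTS):
        return "Insights Generator Agent"
    return "Data Exploration Agent"  # tables, databases, columns, ...

print(route("Which tables are available?"))         # Data Exploration Agent
print(route("Show a histogram of CLV by segment"))  # Insights Generator Agent
print(route("Recommend methods to reduce churn"))   # Strategic Reasoning Agent
```

Note the ordering: strategy-style questions often mention churn or CLV too, so the most specific scenario is checked first, which mirrors the precedence the prompt instructions establish.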
The model can be chosen per task; in this agent node, GPT-4.1 is used and configured with an API key (plus endpoint/deployment where applicable).</p><figure><img alt="LLM configuration of Intent Identification Agent node" src="https://cdn-images-1.medium.com/max/541/1*x0jEDP-vjvZ-eu9sVyUJwQ.png" /><figcaption>LLM configuration of Intent Identification Agent node</figcaption></figure><h4><strong>Agent node: data exploration agent</strong></h4><p>Reports what data is available (e.g., databases/tables, key columns, grain, joins, and freshness) and can return small, PII-safe previews using Teradata MCP tools against Vantage.</p><figure><img alt="Data Exploration Agent node configuration" src="https://cdn-images-1.medium.com/max/571/1*lj42_xbEpnT6lq4JwA427Q.png" /><figcaption>Data Exploration Agent node configuration</figcaption></figure><p>The LLM used here is <a href="https://www.anthropic.com/news/claude-4">Anthropic Claude Sonnet 4</a>; it can be configured the same way as GPT-4.1. All access to LLMs is provided through a cloud service provider. The specific models available, along with credentials and base URLs, depend on the deployment of Teradata and AgentBuilder.</p><figure><img alt="LLM configuration of Data Exploration Agent node" src="https://cdn-images-1.medium.com/max/549/1*Cca7AQQZkh5uvIqim3tdCg.png" /><figcaption>LLM configuration of Data Exploration Agent node</figcaption></figure><p>Access to the Teradata MCP server is configured through the “Tool” panel. As with LLM integration, the connection URL and credentials depend on the specific deployment of Teradata and AgentBuilder.</p><p>Providing the right tools via Teradata MCP is critical for accurate LLM outputs. 
Select and scope these tools in the “Available Actions” panel so each agent can call exactly what it needs for governed, reliable results.</p><p>Tools can only be selected under Available Actions once the MCP server is configured.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/526/1*phUieatgMOndHCMqIJPlJg.png" /><figcaption>MCP configuration of Data Exploration Agent node</figcaption></figure><h4><strong>Agent node: insight generator agent</strong></h4><p>Interprets the question and derives findings only from the governed data, following a reasoning-first workflow: quick data checks (columns, types, missingness), method selection (aggregations, correlations, simple tests), and a brief rationale. It performs the calculation, not just a description, and explains the process before stating results.</p><p>For simple scalar asks, it stays concise; for richer questions, it returns a compact business summary. Output is markdown with “Reasoning” and “Conclusion” sections for clarity and reuse.</p><figure><img alt="Insight Generator Agent node configuration" src="https://cdn-images-1.medium.com/max/564/1*fFdRn3NQzMp3NoVb8oGWEg.png" /><figcaption>Insight Generator Agent node configuration</figcaption></figure><p>The LLM and MCP configurations mirror the Data Exploration Agent’s configuration, with the addition of <strong>Sentiment Extractor</strong> in Available Actions when needed.</p><figure><img alt="MCP tools configuration of Insight Generator Agent node" src="https://cdn-images-1.medium.com/max/468/1*nTjveqMvf5MDrfDpennfjg.png" /><figcaption>MCP tools configuration of Insight Generator Agent node</figcaption></figure><h4><strong>Agent node: strategic reasoning agent</strong></h4><p>Converts analytical findings into actionable churn-reduction strategies by reviewing feature importance, explaining how key factors drive attrition, and prioritizing where to intervene. 
It proposes evidence-based methods (e.g., onboarding improvements, complaint triage, targeted cross-sell) and links each recommendation to the contributing drivers with clear rationale and guardrails. Responses are concise and structured in markdown with an intro, numbered reasoning steps, recommended methods to reduce banking churn, and a brief summary of the highest-impact actions.</p><figure><img alt="Strategic Reasoning Agent node configuration" src="https://cdn-images-1.medium.com/max/565/1*LVhcY53VzB0HR3WiCRXfZQ.png" /><figcaption>Strategic Reasoning Agent node configuration</figcaption></figure><p>The MCP and LLM configuration remains the same, with only changes in the MCP’s Available Actions.</p><figure><img alt="MCP tool configuration of Strategic Reasoning Agent node" src="https://cdn-images-1.medium.com/max/423/1*dnGQcblY4uTp_Bf6q-DBeQ.png" /><figcaption>MCP tool configuration of Strategic Reasoning Agent node</figcaption></figure><h4><strong>Condition agent: check if chart required</strong></h4><p>Decides whether the answer needs a visualization or should remain textual. It inspects the prompt and, only when the user asks to plot, selects a chart type from <strong>Line</strong>, <strong>Polar</strong>, <strong>Pie</strong>, or <strong>Radar</strong>; single-value outputs (e.g., <em>corr: 0.89</em>) skip charts. If a valid chart is warranted, it passes a clean spec to the Visualization Agent and Custom Function to render; otherwise it defaults to a table or plain response and routes directly to the “Generate Response” LLM node.</p><figure><img alt="Check if chart required node configuration" src="https://cdn-images-1.medium.com/max/555/1*2Nba8xE4xrcnS7BElBzpTw.png" /><figcaption>Check if chart required node configuration</figcaption></figure><p>The LLM configuration mirrors the earlier condition agent; this node takes the question as input and chooses between two scenarios — chart-related or general. 
If “chart required” is selected, it triggers the Visualization Agent to plot a graph; otherwise, it returns a straightforward textual output.</p><figure><img alt="Input and Scenarios detail of Check if chart required node" src="https://cdn-images-1.medium.com/max/564/1*vkh7GETDIbrnuT1UabtWZA.png" /><figcaption>Input and Scenarios detail of Check if chart required node</figcaption></figure><h4><strong>LLM node: generate response</strong></h4><p>Composes the final, business-ready answer using the context passed from upstream agents, focusing on clear reasoning and actionable guidance (no code). It identifies the domain of the ask (visualization, tables, churn reduction, database design), applies best-practice logic, and balances detail with brevity for simple scalar questions. The output is 2–4 tight paragraphs that explain the “why,” list concrete next steps, note caveats, and tie recommendations to measurable impact.</p><figure><img alt="Generate Response node configuration" src="https://cdn-images-1.medium.com/max/559/1*pdbSI2alUgN0GbkiALHttg.png" /><figcaption>Generate Response node configuration</figcaption></figure><p>For clean presentation, <strong>Return Response As</strong> is set to <em>Assistant Message.</em></p><figure><img alt="Selection of Assistant message in Generate Response node" src="https://cdn-images-1.medium.com/max/547/1*zz8JOI6PTJQPHh1i8D2vsg.png" /><figcaption>Selection of Assistant message in Generate Response node</figcaption></figure><h4><strong>Agent node: visualization agent</strong></h4><p>Emits a strict JSON payload — no prose, notes, or code. If unsure about columns, it first verifies with MCP tools before invoking any chart tools. 
When a chart is needed, it returns only this structure:</p><pre>{ &quot;type&quot;: &quot;&lt;line|polar|pie|radar&gt;&quot;, &quot;title&quot;: &quot;&lt;chart title&gt;&quot;, &quot;labels&quot;: […], &quot;datasets&quot;: […] }</pre><p>If a chart isn’t appropriate, it returns JSON for a simple table instead, without adding or removing fields or including extra text.</p><figure><img alt="Visualization Agent node configuration" src="https://cdn-images-1.medium.com/max/544/1*IFIh11AxpGEvLGp2XaDv1Q.png" /><figcaption>Visualization Agent node configuration</figcaption></figure><p>Teradata MCP <strong>Available Actions</strong> include Teradata plotting tools that help generate the requested visual.</p><figure><img alt="MCP tool configuration of Visualization Agent node" src="https://cdn-images-1.medium.com/max/409/1*-r-5LnX593SScn01nDCiNg.png" /><figcaption>MCP tool configuration of Visualization Agent node</figcaption></figure><p>It can generate four chart types:</p><ul><li><strong>Line</strong> — trends over time (e.g., weekly churn rate by segment)</li><li><strong>Polar</strong> — circular comparison of magnitudes (e.g., complaints by category)</li><li><strong>Pie</strong> — part-of-whole (e.g., share of customers by risk tier)</li><li><strong>Radar</strong> — multi-metric profile (e.g., cohort scores across engagement, spend, friction)</li></ul><h4><strong>Custom function: draw function</strong></h4><p>The Draw Function turns the Visualization Agent’s JSON into a live chart. It expects a clean payload with type, title, labels, and datasets; if that structure is present, it renders an embedded Chart.js graphic in an iframe. If the input isn’t valid chart data (or the task doesn’t require a chart), it gracefully falls back to a plain message instead of breaking the flow. 
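The downstream validate-or-fall-back contract can be sketched in a few lines (an illustrative Python approximation; the production renderer is a JavaScript custom function inside AgentBuilder):

```python
import json

ALLOWED_TYPES = {"line", "polar", "pie", "radar"}

def prepare_chart(input_data):
    """Return a validated chart spec dict, or fall back to the raw text."""
    try:
        spec = json.loads(input_data)
    except (TypeError, ValueError):
        return input_data                  # not JSON at all: pass text through
    if (
        isinstance(spec, dict)
        and spec.get("type") in ALLOWED_TYPES
        and isinstance(spec.get("labels"), list)
        and isinstance(spec.get("datasets"), list)
    ):
        return spec                        # valid: hand off to the chart renderer
    return input_data                      # malformed spec: graceful fallback

good = '{"type": "pie", "title": "CLV by tier", "labels": ["A", "B"], "datasets": [[60, 40]]}'
print(prepare_chart(good)["title"])      # CLV by tier
print(prepare_chart("No chart needed"))  # No chart needed
```

Because anything that fails validation is returned unchanged, a non-chart answer flows through the same node without special casing.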
Supported chart types match the gating node: line, polar, pie, and radar.</p><p>To wire it in the flow, pass the Visualization Agent’s output to the function as “<em>input_data = {{$flow.state.main_vis_op}}</em>”. The function parses that value, validates that labels and datasets are arrays, and then builds the chart with the provided title. If validation fails, the function simply returns the text it received (e.g., “No chart needed”), keeping responses clean and predictable.</p><p>This approach keeps rendering logic outside the agents, avoids code in the final answer, and ensures visuals appear only when they add clarity, everything else stays fast, lightweight, and easy to audit.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/568/1*gjxRblF_w4tm9_FZnY4b1Q.png" /><figcaption>Draw Function node configuration</figcaption></figure><h4><strong>Output</strong></h4><p>The CLV multi-agent system tailors each response to the question — returning plain, business-ready text when a single number suffices (e.g., “What is the average CLV?” → $412.37), and a chart when it adds clarity. 
Two quick examples from the demo:</p><ul><li><strong>Text only ask</strong>: “What is the average value of CLV”<br>- <strong>Flow</strong>: Start → Intent Identification Agent → Insights Generator Agent → Generate Response</li></ul><p>The assistant computes the mean CLV in-database and returns a concise explanation (no visualization).</p><figure><img alt="Response and process flow details of “What is the average value of CLV” question" src="https://cdn-images-1.medium.com/max/883/1*Imm730hwR6qvZmKzSiT7wA.png" /><figcaption>Response and process flow details of “What is the average value of CLV” question</figcaption></figure><ul><li><strong>Visualization a</strong>sk: “Create a visualization for average value of balance and CLV”<br>- <strong>Flow</strong>: Start → Intent Identification Agent → Insights Generator Agent → Check if chart required → Visualization Agent → Draw Function</li></ul><figure><img alt="Response details of “Create a visualization for average value of balance and CLV” question" src="https://cdn-images-1.medium.com/max/883/1*d7DiESAzc0xfHXnOArvfTQ.png" /><figcaption>Response details of “Create a visualization for average value of balance and CLV” question</figcaption></figure><p>The chart gate detects a plotting request, the Visualization Agent emits a JSON spec, and the Draw Function renders a bar chart.</p><figure><img alt="Process flow details of “Create a visualization for average value of balance and CLV” question" src="https://cdn-images-1.medium.com/max/861/1*9WgMLf2DKkOwOXgDdXTD1w.png" /><figcaption>Process flow details of “Create a visualization for average value of balance and CLV” question</figcaption></figure><p>Every response is traceable, the process flow shows which nodes executed, and the output links back to the data and policies used. 
Visuals appear only when requested or clearly helpful; otherwise, the answer stays fast, readable, and ready to act on.</p><h3><strong>Conclusion</strong></h3><p>Teradata’s Customer Intelligence Framework closes the loop from <strong>data → signal → activation</strong>. Data remains governed in Vantage; signals are detected and interpreted in real time; and activation happens through guard-railed next-best actions that are explainable and measurable. Built in <a href="https://www.teradata.com/press-releases/2025/teradata-unveils-agentbuilder">AgentBuilder</a>, the CLV demo shows <strong>autonomous customer intelligence</strong> in practice: agents understand intent, assemble evidence, and trigger the right response — text for simple answers, visuals when they add clarity, and recommendations when action is warranted.</p><p>The pattern extends beyond customer lifetime value to use cases such as churn reduction, cross-sell optimization, and service issue deflection, without redesigning the underlying architecture. Performance comes from keeping computation in Teradata Vantage; auditability comes from end-to-end traces of signals, decisions, and actions; and delivery speed comes from reusable prompts, tools, and node templates assembled in AgentBuilder.</p><p><a href="https://www.teradata.com/platform/clearscape-analytics/agent-ai">Ready to turn signals into outcomes?</a></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=a28d8dd77c41" width="1" height="1" alt=""><hr><p><a href="https://medium.com/teradata/building-autonomous-customer-intelligence-a-developers-guide-to-teradata-s-customer-intelligence-a28d8dd77c41">Building Autonomous Customer Intelligence: A Developer’s Guide to Teradata’s Customer Intelligence…</a> was originally published in <a href="https://medium.com/teradata">Teradata</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Chat with your data in Visual Studio Code through an MCP server]]></title>
            <link>https://medium.com/teradata/chat-with-your-data-in-visual-studio-code-through-an-mcp-server-071673bd3ffe?source=rss----8adfdb056496---4</link>
            <guid isPermaLink="false">https://medium.com/p/071673bd3ffe</guid>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[teradata]]></category>
            <category><![CDATA[mcp-server]]></category>
            <category><![CDATA[ai]]></category>
            <category><![CDATA[ai-agent]]></category>
            <dc:creator><![CDATA[Joshitha Recharla]]></dc:creator>
            <pubDate>Wed, 17 Dec 2025 17:45:17 GMT</pubDate>
            <atom:updated>2025-12-17T17:44:46.853Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="Teradata’s Model Context Protocol (MCP)" src="https://cdn-images-1.medium.com/max/1024/1*ibiQZwku-XO4iMWjaV4Pvg.png" /><figcaption>Teradata’s Model Context Protocol (MCP) with GitHub Copilot</figcaption></figure><p>If you build data pipelines or analytic Python scripts, like dbt models, in Visual Studio Code (VS Code), you know the pain of context-switching just to sanity-check a column list or scan for NULLs. Traditional SQL clients are great, but bouncing between these database clients and VS Code can be distracting and break your flow when your focus is the code you’re writing on VS Code.</p><p>Teradata’s Model Context Protocol (MCP) server helps you by allowing GitHub Copilot Chat to talk to your data directly, providing a handy assistant for drafting analytics scripts.</p><p>Teradata MCP server is an open and extensible backend server that acts as a bridge between large language models (LLMs) and your Teradata environment,<strong> </strong>allowing you to query data, explore metadata, and even validate logic just by asking questions in plain English.</p><p>You stay in VS Code, GitHub Copilot routes data-related requests to the MCP server, and results come back in the chat.</p><p><strong>With this setup, GitHub Copilot can:</strong></p><ul><li>Interpret your natural language questions</li><li>Translate them into SQL queries behind the scenes</li><li>Connect to your Teradata Vantage® instance</li><li>Retrieve structured answers directly in your Copilot chat window</li></ul><p><strong>In this article, I’ll walk you through:</strong></p><ul><li>How to set up your development environment to take advantage of Teradata MCP Server</li><li>How to set up your own MCP server to work with GitHub Copilot Chat in VS Code</li><li>How to query your Teradata database as if<em> </em>you’re having a conversation, simplifying chores like discovering tables, validating filters, and validating data quality in VS 
Code</li></ul><p>Let’s begin the setup.</p><h3><strong>Requirements</strong></h3><ul><li>A free Teradata VantageCloud environment. <a href="https://www.teradata.com/getting-started/demos/clearscape-analytics?utm_campaign=gbl-clearscape-analytics-devrel&amp;utm_content=demo&amp;utm_id=7016R000001n3bCQAQ">Get one in just a few minutes via ClearScape Analytics® Experience.</a></li><li>Visual Studio Code with the Copilot Chat and MCP-Client extensions. The download is available at <a href="https://code.visualstudio.com/">https://code.visualstudio.com/</a>.</li><li>A Teradata MCP server. An open-source community edition is available in Teradata’s GitHub organization; later we’ll cover the details for cloning this repository and setting up the server.</li></ul><h3><strong>Preparing the development environment</strong></h3><h4><strong>Log into ClearScape Analytics® Experience</strong></h4><ul><li><a href="https://www.teradata.com/getting-started/demos/clearscape-reg?utm_campaign=gbl-clearscape-analytics-devrel&amp;utm_content=demo&amp;utm_id=7016R000001n3bCQAQ">Sign up for a free account</a> if you haven’t already, or log in.</li><li>In the console, select “create new environment.” Take note of your host and password; the username and database, which you’ll also need when setting up your server later, default to demo_user in ClearScape Analytics® Experience.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*vFATLUA3xFoU8Qe4FnzFUg.png" /><figcaption>Creating a new ClearScape Analytics environment</figcaption></figure><h4><strong>Install uv</strong></h4><p>We use the uv package manager to run the Teradata MCP server in an isolated Python environment to avoid package conflicts and ease the setup.</p><ul><li>Install uv by following the steps at <a href="https://docs.astral.sh/uv/getting-started/installation/">https://docs.astral.sh/uv/getting-started/installation/</a></li><li>To verify that uv is properly installed, open your terminal and run:<pre>uv --version</pre></li></ul><p>This will display the installed version of uv. If you see an error, make sure you followed the installation instructions, then restart the terminal.</p><h3><strong>Setting up your own Teradata MCP server to work with GitHub Copilot in Visual Studio Code</strong></h3><h4><strong>Clone the Teradata MCP Server</strong></h4><pre>git clone https://github.com/Teradata/teradata-mcp-server.git</pre><h4><strong>Set environment variables</strong></h4><p>Create a .env file with the following structure:</p><pre>DATABASE_URI=teradata://username:password@host:1025/databasename<br>LOGMECH=TD2 # TD2 or LDAP<br>MCP_TRANSPORT=stdio<br>PROFILE=dba</pre><p>We’ll use the stdio transport method in this setup, though other methods are also available. Check the <a href="https://github.com/Teradata/teradata-mcp-server/blob/main/docs/server_guide/CONFIGURATION.md#-transport-modes">relevant documentation</a> in the repository for details.</p><ul><li>Replace username, password, host, and database name with your own Teradata Vantage® credentials.</li><li>The DATABASE_URI tells the server where to point your queries.</li><li>Given the tasks we’re performing, we need a profile that enables the relevant tools; setting PROFILE=dba provides that toolset.</li></ul><h4><strong>Start the MCP Server</strong></h4><p>From your project directory (i.e., the folder where you cloned the teradata-mcp-server), test starting the MCP server using the following command.</p><pre>uv run teradata-mcp-server</pre><p>This boots up the MCP server to verify that everything is working as expected.
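</p><p>As a quick sanity check before this test run, you can verify that the DATABASE_URI in your .env is well-formed with a few lines of Python. This helper is purely illustrative and not part of the repository; the server does its own parsing.</p>

```python
from urllib.parse import urlparse

def check_database_uri(uri):
    """Report obvious problems in a teradata:// connection URI.
    Illustrative helper only -- not part of teradata-mcp-server."""
    parts = urlparse(uri)
    problems = []
    if parts.scheme != "teradata":
        problems.append("scheme should be 'teradata'")
    if not parts.username or not parts.password:
        problems.append("expected username:password before '@'")
    if not parts.hostname:
        problems.append("host is missing")
    if parts.port != 1025:
        problems.append("port is usually 1025")
    if not parts.path.lstrip("/"):
        problems.append("database name is missing after the port")
    return problems  # an empty list means the URI looks well-formed

print(check_database_uri("teradata://demo_user:secret@myhost:1025/demo_user"))
# → []
```

<p>An empty list means the URI parses cleanly; anything else points at the piece to fix before starting the server.</p><p>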
If everything is in order, you can interrupt this run by pressing Ctrl + C.</p><h4><strong>Connecting the MCP server to GitHub Copilot in VS Code</strong></h4><p>Now let’s link this server to your GitHub Copilot:</p><ol><li>Open VS Code</li><li>Press Ctrl + Shift + P to open the Command Palette</li><li>Select MCP: Add Server</li><li>Choose <strong>stdio</strong> as the transport type</li><li>Fill in the dialog as follows:</li><li>Command: “uv”</li><li>ID: For example, “teradata-mcp-server”</li><li>Confirm and save</li></ol><p>An mcp.json file will be created automatically. Complete the required arguments as shown below:</p><pre>{<br>&quot;mcp&quot;: {<br>&quot;servers&quot;: {<br>&quot;teradatastdio&quot;: {<br>&quot;type&quot;: &quot;stdio&quot;,<br>&quot;command&quot;: &quot;uv&quot;,<br>&quot;args&quot;: [<br>&quot;--directory&quot;,<br>&quot;&lt;Full Path&gt;/teradata-mcp-server&quot;,<br>&quot;run&quot;,<br>&quot;teradata-mcp-server&quot;<br>]<br>}<br>}<br>}<br>}</pre><p>Make sure to replace &lt;Full Path&gt; with the actual location of the teradata-mcp-server directory if it wasn’t filled in automatically.</p><h4><strong>Launch the MCP Server in VS Code</strong></h4><p>Once configured, start the server from within VS Code if it didn’t start automatically:</p><ul><li>Start, restart, and stop controls are overlaid on the mcp.json file for convenient operation</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*mAmw1CjrQIrEHgClGBZElA.png" /><figcaption>Configuring the MCP server in mcp.json to run the Teradata MCP server via stdio.</figcaption></figure><h4><strong>Chat with your data</strong></h4><p>To interact with your Teradata database in natural language, you’ll need to use Copilot in Agent Mode, which allows GitHub Copilot Chat to talk directly to the MCP server.</p><p>Here’s how to do it.</p><h4><strong>Open GitHub Copilot Chat in VS Code</strong></h4><ul><li>In your sidebar, click on the GitHub Copilot Chat
icon.</li><li>If you don’t see it, open the Command Palette (Ctrl + Shift + P) and search for “GitHub Copilot: Open Chat View”</li></ul><h4><strong>Switch to Agent Mode</strong></h4><ul><li>Go to the Copilot Chat panel in VS Code</li><li>Ensure it’s set to Agent Mode (not Chat Mode). If you don’t see this option, open the Command Palette and run “Copilot: Toggle Agent Mode.”</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*M_kRqPZZ9hk2mgvQlMnWBA.png" /><figcaption>Set GitHub Copilot Chat to Agent Mode to allow MCP-based tool and database interactions.</figcaption></figure><p>Once in Agent Mode, Copilot will be ready to route your questions through the MCP server to your Teradata backend and return results right inside your chat window.</p><p><strong>Note on data:</strong> Our walkthrough uses demo_user as the default database. This database is empty in a new ClearScape Analytics® environment, so load sample data first or update the database/table names in the examples to match tables that exist in your environment.
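</p><p>If your demo_user database is empty, a minimal sample table is enough to try the example prompts that follow. The snippet below is illustrative only: it uses SQLite as a portable stand-in so you can see the shape of the SQL Copilot generates behind the scenes; against Vantage you would run equivalent statements through the teradatasql driver or any SQL client, and the table and column names here are hypothetical.</p>

```python
import sqlite3

# Stand-in for demo_user.sales_fact; names and schema are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_fact (region TEXT, amount REAL);
    INSERT INTO sales_fact VALUES
        ('North', 120.0), ('North', 80.0), ('South', 200.0);
""")

# Roughly the SQL behind the prompt "Give me the total sales by region":
rows = conn.execute(
    "SELECT region, SUM(amount) AS total_sales "
    "FROM sales_fact GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # → [('North', 200.0), ('South', 200.0)]
```

<p>The point is not the SQLite detail but the translation: a plain-English ask becomes an ordinary aggregate query, executed where the data lives.</p><p>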
This ensures you can explore and query your data.</p><p>Try prompts like:</p><ul><li>“Show me the first 5 rows from demo_user.dim_customers”</li><li>“What tables exist in the demo_user schema?”</li><li>“Give me the total sales by region from demo_user.sales_fact”</li><li>“Which columns are available in the demo_user.orders table?”</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*vOm7VydQFTEoRqisrsJyjg.png" /><figcaption>Querying the Teradata database in natural language via GitHub Copilot Chat, with requests executed through the MCP server.</figcaption></figure><h3><strong>Conclusion</strong></h3><p>By combining Teradata’s MCP Server with GitHub Copilot Chat in Visual Studio Code, we unlock a powerful way to make data feel more conversational, accessible, and integrated into the daily developer workflow.</p><p>If you’re someone who works with Teradata and builds in VS Code, I highly recommend giving this workflow a try. Running the queries and getting results in the chat might boost your productivity.</p><p>Got questions or want to share your own use case? Feel free to reach out or leave a comment.</p><h3>About Joshitha Recharla</h3><p>Joshitha Recharla is a developer advocate intern at Teradata and a graduate student in data science at SUNY Buffalo. She enjoys breaking down complex AI and analytics workflows into hands-on, developer-friendly tutorials and demos. 
Through her internship, she has explored how tools like MCP can make data more accessible and useful where developers work.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=071673bd3ffe" width="1" height="1" alt=""><hr><p><a href="https://medium.com/teradata/chat-with-your-data-in-visual-studio-code-through-an-mcp-server-071673bd3ffe">Chat with your data in Visual Studio Code through an MCP server</a> was originally published in <a href="https://medium.com/teradata">Teradata</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>