Cloud Blog

Log Analytics is now Observability Analytics: Query logs and traces with SQL

Tue, 23 Jun 2026 16:00:00 +0000

To effectively operate and troubleshoot applications, developers and site reliability engineers (SREs) need to understand the full context of their system's behavior, typically as part of their logging and observability tooling. Today, we’re excited to announce a variety of new capabilities in our Google Cloud Observability suite:

Log Analytics is now Observability Analytics.
Trace data within Observability Analytics is generally available (GA).
The Observability API for management and configuration is GA.

Together, these capabilities unify logs and traces into a single experience, helping you go from viewing high-level trends to deep, contextual root-cause analysis for agentic as well as traditional workloads, and to configure and manage those workloads programmatically, as part of observability buckets.

Further, support for SQL in Cloud Trace is an important new tool in your toolbelt. You can, for instance, write a single SQL query that joins your application logs with your distributed trace spans and find any checkout requests that took longer than 5 seconds, to instantly see which internal microservice spent the most time processing them. Or, for AI agents, you can analyze telemetry across thousands of runs to identify which tool calls most frequently fail, or calculate the aggregated P95 response time for all external tool executions to pinpoint performance bottlenecks. The possibilities are endless!

In this blog, let’s take a closer look at Observability Analytics, and a few key use cases leveraging traces and logs, so you can put these new capabilities to work in your environment right away.

What is Observability Analytics?

Observability Analytics, formerly Log Analytics, brings the power of BigQuery and SQL to your telemetry data directly within Cloud Observability. It allows you to run complex analytical queries joining high-volume log and trace data to identify patterns, troubleshoot issues, and generate insights into the health and performance of your agents and applications without having to move or duplicate data. This brings a number of important benefits:

Unified telemetry: Run SQL queries to analyze and JOIN high-volume log and trace data in a single place.
Business correlation: Join your observability datasets with business-critical data stored in BigQuery (e.g., conversion rates, revenue, operational costs) to quantify the business impact of technical issues.
In-place analysis: Analyze your data where it’s already stored (in Cloud Logging and Cloud Trace), reducing duplicate export storage costs and complexity.

For instance, with Cloud Observability, you can analyze how application latency impacts conversion rates or identify the financial implications of service outages, transforming raw telemetry into actionable business intelligence.

Unlock deeper insights with traces and logs

Correlating logs and traces in a single analytics view breaks down data silos and accelerates troubleshooting. You can now analyze performance trends from trace data and directly correlate them with corresponding application or infrastructure logs to understand the “why” behind the “what.” Let’s take a couple of examples.

Use case 1: AI agent optimization (analyzing tool failures and latency at scale)

AI agents often perform complex, multi-step tasks by executing various external tools (e.g., database queries, web searches, API calls). When optimizing agents at scale, inspecting individual trace graphs in a UI often isn't enough. You need to answer systemic questions like “Which tools are failing most frequently?” and “Which ones are causing latency bottlenecks?”

With Observability Analytics, you can run aggregate queries across millions of span events to calculate failure rates and latency percentiles (like P95) for every tool in your system.

Example query: Rank agent tools by failure rate and 95th percentile latency over the last 7 days.

```sql
SELECT
  JSON_VALUE(attributes, '$."agent.tool.name"') AS tool_name,
  COUNT(span_id) AS total_calls,
  -- Calculate failure rate (status.code = 2 represents ERROR in OpenTelemetry)
  SAFE_DIVIDE(COUNTIF(status.code = 2), COUNT(span_id)) * 100 AS failure_rate_percentage,
  -- Calculate P95 latency in milliseconds
  APPROX_QUANTILES(duration_nano / 1000000, 100)[OFFSET(95)] AS p95_latency_ms
FROM
  `YOUR_PROJECT_ID.us._Trace.Spans._AllSpans`
WHERE
  name = 'Agent.executeTool' -- Filter for spans representing tool execution
  AND start_time BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY) AND CURRENT_TIMESTAMP()
GROUP BY
  tool_name
ORDER BY
  failure_rate_percentage DESC, p95_latency_ms DESC
LIMIT 10
```
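To make the aggregation in this query concrete, here is a minimal Python sketch that mirrors it on a handful of in-memory spans. The record keys (`tool_name`, `status_code`, `duration_nano`), the sample data, and the helper names are assumptions for illustration. It also uses an exact nearest-rank percentile, whereas `APPROX_QUANTILES` returns an approximation, so results on real data may differ slightly.

```python
import math
from collections import defaultdict


def nearest_rank_percentile(values, pct):
    """Exact nearest-rank percentile (a stand-in for APPROX_QUANTILES)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def rank_tools(spans, limit=10):
    """Group spans by tool, then rank by failure rate and P95 latency."""
    by_tool = defaultdict(list)
    for span in spans:
        by_tool[span["tool_name"]].append(span)

    rows = []
    for tool_name, group in by_tool.items():
        total_calls = len(group)
        # status_code == 2 is treated as ERROR, as in the SQL query.
        failures = sum(1 for s in group if s["status_code"] == 2)
        latencies_ms = [s["duration_nano"] / 1_000_000 for s in group]
        rows.append({
            "tool_name": tool_name,
            "total_calls": total_calls,
            "failure_rate_percentage": failures / total_calls * 100,
            "p95_latency_ms": nearest_rank_percentile(latencies_ms, 95),
        })

    # Same ordering as the ORDER BY clause: both keys descending.
    rows.sort(key=lambda r: (-r["failure_rate_percentage"], -r["p95_latency_ms"]))
    return rows[:limit]


# Hypothetical sample data, not taken from a real project.
SAMPLE_SPANS = [
    {"tool_name": "WebSearchTool", "status_code": 2, "duration_nano": 400_000_000},
    {"tool_name": "WebSearchTool", "status_code": 0, "duration_nano": 100_000_000},
    {"tool_name": "WebSearchTool", "status_code": 0, "duration_nano": 200_000_000},
    {"tool_name": "WebSearchTool", "status_code": 0, "duration_nano": 300_000_000},
    {"tool_name": "DatabaseQueryTool", "status_code": 0, "duration_nano": 8_000_000_000},
    {"tool_name": "DatabaseQueryTool", "status_code": 0, "duration_nano": 50_000_000},
]

ranked = rank_tools(SAMPLE_SPANS)
```

In this sample, the tool with failures ranks first even though the other tool has a much higher P95, because failure rate is the primary sort key.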

With the above query, you can:

Spot bottlenecks: Instantly see if a tool like DatabaseQueryTool has a P95 latency of 8 seconds, indicating you need to optimize database indexes or connections.
Identify flaky tools: Discover if a specific API tool has a 15% failure rate, suggesting API rate limits or integration bugs.
Drill down to the prompt: Once you identify a flaky tool, you can write a follow-up query joining these trace spans with application logs to extract the exact LLM prompt and reasoning that led to the failures. Here’s that SQL query:

```sql
SELECT
  t.name AS tool_name,
  l.timestamp,
  -- Retrieve the agent's thoughts and the prompt from application logs
  JSON_VALUE(l.json_payload.agent_thoughts) AS agent_reasoning,
  JSON_VALUE(l.json_payload.llm_prompt) AS prompt_sent_to_llm
FROM
  `YOUR_PROJECT_ID.us._Trace.Spans._AllSpans` t
JOIN
  `YOUR_PROJECT_ID.us._Default._AllLogs` l
ON
  t.trace_id = SPLIT(l.trace, '/')[SAFE_OFFSET(3)]
  AND t.span_id = l.spanId
WHERE
  t.name = 'Agent.executeTool'
  AND JSON_VALUE(t.attributes, '$."agent.tool.name"') = 'NameOfFlakyTool'
  AND t.status.code = 2 -- Filter for failed tool calls
  AND l.severity = 'ERROR'
```
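The join key in the `ON` clause deserves a closer look. Cloud Logging stores a log entry's trace as a resource name of the form `projects/PROJECT_ID/traces/TRACE_ID`, so splitting on `/` and taking element 3 yields the bare trace ID that the span table uses. The Python sketch below mimics that expression, including the `SAFE_OFFSET` behavior of returning NULL rather than raising an error; the function name and the sample values are hypothetical.

```python
def trace_id_from_log(trace):
    """Mimic SPLIT(trace, '/')[SAFE_OFFSET(3)].

    Returns None (the analogue of SQL NULL) when the input is missing or
    has fewer than four '/'-separated parts.
    """
    if trace is None:
        return None
    parts = trace.split("/")
    return parts[3] if len(parts) > 3 else None


# Hypothetical values for illustration only.
full_name = "projects/my-project/traces/06796866738c859f2f19b7cfb3214824"
extracted = trace_id_from_log(full_name)
```

A log entry whose trace field is empty or is not in the full resource-name form produces no trace ID, so it never matches a span in the join.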

Use case 2: Identify latency impact on specific customers (business context)

If you don't propagate user or customer identifiers in your trace attributes (e.g., for privacy or technical reasons), but you do log them in your application access logs, you can join traces and logs to identify which customers are experiencing the worst performance.

Example query: Find the top 10 customers experiencing the highest 95th percentile latency.

```sql
SELECT
  JSON_VALUE(l.json_payload.customer_id) AS customer_id,
  AVG(t.duration_nano / 1000000) AS avg_latency_ms,
  APPROX_QUANTILES(t.duration_nano / 1000000, 100)[OFFSET(95)] AS p95_latency_ms,
  COUNT(t.span_id) AS total_requests
FROM
  `YOUR_PROJECT_ID.us._Trace.Spans._AllSpans` AS t
JOIN
  `YOUR_PROJECT_ID.us._Default._AllLogs` AS l
ON
  t.trace_id = SPLIT(l.trace, '/')[SAFE_OFFSET(3)]
  AND t.span_id = l.spanId
WHERE
  t.start_time BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY) AND CURRENT_TIMESTAMP()
  AND t.kind.name = 'SPAN_KIND_SERVER'
  AND JSON_VALUE(l.json_payload.customer_id) IS NOT NULL
GROUP BY
  customer_id
ORDER BY
  p95_latency_ms DESC
LIMIT 10
```
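One property of this join is worth checking against your own data. An inner join emits one row per matching span and log pair, so a span with several matching log entries is counted several times, and a span with no matching log entry drops out. The Python sketch below illustrates that behavior on in-memory records. The record keys, sample values, and function names are hypothetical, and the log records are assumed to already carry a normalized `trace_id`.

```python
from collections import defaultdict


def inner_join(spans, logs):
    """Pair each span with every log entry sharing its (trace_id, span_id)."""
    logs_by_key = defaultdict(list)
    for log in logs:
        logs_by_key[(log["trace_id"], log["span_id"])].append(log)

    joined = []
    for span in spans:
        key = (span["trace_id"], span["span_id"])
        # .get() avoids creating empty entries for unmatched spans.
        for log in logs_by_key.get(key, []):
            joined.append((span, log))
    return joined


def requests_per_customer(joined_rows):
    """Count joined rows per customer, like COUNT(t.span_id) with GROUP BY."""
    counts = defaultdict(int)
    for _span, log in joined_rows:
        customer_id = log.get("customer_id")
        if customer_id is not None:
            counts[customer_id] += 1
    return dict(counts)


# Hypothetical sample: one span with two log lines, one span with none.
SPANS = [
    {"trace_id": "t1", "span_id": "s1"},
    {"trace_id": "t2", "span_id": "s2"},
]
LOGS = [
    {"trace_id": "t1", "span_id": "s1", "customer_id": "cust-a"},
    {"trace_id": "t1", "span_id": "s1", "customer_id": "cust-a"},
]

rows = inner_join(SPANS, LOGS)
counts = requests_per_customer(rows)
```

Here a single request is counted twice for `cust-a` because it logged two entries. If your services log more than once per span, counting distinct span IDs is one way to keep a request count from inflating.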

You can find more query examples for traces in this GitHub repo.

Observability Analytics page vs. log and trace explorers

Cloud Logging and Trace will both continue to offer log and trace explorers — tools that are optimized for finding and inspecting individual log entries and traces, making them ideal for investigating a specific issue.

Observability Analytics, in contrast, is designed for aggregations and in-depth analysis. Think of it as your tool for answering broad questions about your services, such as "What is the 95th percentile latency for my checkout service over the last week?" or "Which API endpoints have the highest error rate after our last deployment?"

Enabling AI agents to query traces and logs using SQL

Finally, with rapid growth in agentic assistants, you need to be able to access your telemetry programmatically. The Observability API lets you create linked BigQuery datasets for your observability buckets, making the data available to query directly from the BigQuery ecosystem. Now, your AI agents or analytical workloads can query this data directly via standard BigQuery APIs and tooling.

Get started today

You can start analyzing your trace data in Observability Analytics today. Simply navigate to the Observability Analytics page in the Google Cloud console to begin exploring your trace data. Make sure you have enabled the Observability API to unlock configuration and management capabilities.

Verifiable, private AI: Google Cloud expands Confidential Computing frontiers

Tue, 23 Jun 2026 16:00:00 +0000

Protecting sensitive data used with AI is a critical part of our commitment to providing advanced and secure cloud infrastructure. Confidential Computing cryptographically protects data in use in hardware-based Trusted Execution Environments (TEEs) with verifiable data integrity.

We are thrilled to share our latest Confidential Computing innovations across our hardware ecosystem that help further strengthen verifiable privacy in cloud AI deployments.

Confidential AI at global scale

By scaling our Confidential AI capabilities globally, we help ensure that AI inference and fine-tuning workloads can run with enforceable privacy guarantees.

Democratizing Confidential AI: Confidential G4 VMs with NVIDIA RTX PRO 6000 Blackwell GPUs in preview

We are excited to announce a landmark moment for accessible Confidential AI at global scale: Confidential VMs and Confidential GKE Nodes on the accelerator-optimized G4 machine series, featuring NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs.

What makes this a game-changer is its global scale and flexibility. Confidential G4 is available in every Google Cloud region where the standard G4 is available, across multiple consumption models including On Demand, Reservations, DWS Flex Start, and Spot/Preemptible.

"As organizations scale AI across multiple infrastructure environments, maintaining privacy and control over data and execution becomes increasingly challenging. Google Cloud Confidential G4 VMs powered by NVIDIA RTX PRO 6000 Blackwell GPUs are a meaningful addition to the expanding Confidential AI infrastructure ecosystem. As AI workflows now span agents, data sources, and infrastructure boundaries, Super Protocol provides a consistent Confidential AI operating model across Google Cloud Confidential VMs, other clouds, and on-premises environments — abstracting away confidential computing complexity and allowing teams to focus on AI outcomes," said Yulia Gontar, COO, Super Protocol.

Powered by 5th Generation AMD EPYC Turin CPUs leveraging AMD SEV, the G4 machine series with NVIDIA RTX PRO 6000 Blackwell GPUs activates robust hardware-based security. This architecture helps ensure that sensitive data is protected during processing inside the TEE, while also encrypting data as it travels between the CPU and GPU.

"GCP's Confidential G4 VM was the obvious choice for Vertebrae because privacy and security are non-negotiable for our customers. Our product processes sensitive work discussions, so we need to support hardware-signed attestation that both CPU and GPU are running in a trusted execution environment. Using confidential computing on Google Cloud lets us deliver the frontier of AI privacy in the cloud," said Andy Qin, CEO, Vertebrae.

With Confidential G4, you can unlock AI inference, fine-tuning, HPC, and use cases involving highly restricted data, sensitive models, or private prompts, all with minimal performance impact. Get started with Confidential G4 VMs and Confidential G4 GKE Nodes.

Enabling end-to-end private inference: Open-source Prompt Encryption SDKs

Even as we make Confidential AI accessible, we understand that protecting sensitive data in AI workloads goes beyond securing the model execution environment. The prompts and responses themselves can contain highly-confidential information. To provide cryptographic protection for the entire inference lifecycle, we are happy to announce the open-source launch of our Prompt Encryption SDKs, now available on GitHub.

This toolkit helps you establish an end-to-end secure channel for your AI inference workloads, ensuring that prompts are cryptographically protected from the moment they leave the client until they are processed in the TEE; model responses are similarly protected all the way back to the client.

Prompt and response encryption using Prompt Encryption SDK.

The Client SDK is integrated into the client application and works in tandem with the Server SDK integrated into the inference server running in the TEE. Once the SDKs have been used to establish an attested TLS session, the client can be confident that the server is running an authorized workload within a verified Confidential Computing environment.

The client app can then send encrypted prompts to the inference server, knowing that only this server will be able to decrypt and process them in the TEE. Once the server has a response ready, it sends it back to the client app over the same encrypted channel.

You can get started today with the GitHub repository and the Codelab.

Enabling Apple Private Cloud Compute on Google Cloud

Our commitment to privacy is exemplified by our collaboration with Apple to expand Private Cloud Compute (PCC) on Google Cloud.

We are proud to collaborate with Apple to extend Apple’s privacy and security commitments to PCC on Google Cloud. Our platform supports Apple’s PCC privacy commitments with a layered security approach built upon Google Cloud’s infrastructure. This includes leveraging Google Cloud Confidential Computing with Intel TDX, NVIDIA Confidential Computing with NVIDIA Blackwell GPUs, our Titanium security architecture with the Titan chip, and a co-engineered open-source host stack to ensure verifiable transparency.

Together, these technologies help Apple PCC on Google Cloud meet stringent requirements for data protection and user privacy. To dive deeper into this collaboration, read our blog post: Powering the next era of Confidential AI.

Advancing confidential foundations

Google Cloud is committed to making Confidential Computing capabilities broadly available across our infrastructure. Our goal is to integrate hardware-based security features deeply into our foundational compute offerings, allowing customers to enhance data protection without compromising performance or operational flexibility.

Bringing Intel Trust Domain Extensions (TDX) to the C4 machine series

Confidential VMs with Intel TDX on the C4 machine series will be available in preview soon.

Powered by the latest 6th Generation Intel Xeon processors, this integration offers a significant leap in compute density and performance for data-intensive workloads. By using Intel TDX, C4 instances create hardware-isolated Trust Domains (TDs) that protect sensitive applications and data from the underlying host and hypervisor.

This architecture provides confidentiality and privacy while enabling remote attestation so you can cryptographically verify the environment before processing sensitive data. Best of all, you can turn Confidential Computing on with a few clicks and no code changes.

Expanding Live Migration capabilities

Running mission-critical production environments requires high availability and continuous uptime, even during scheduled cloud maintenance.

Live Migration on C3D-based Confidential VMs is now generally available. This capability allows Google Cloud to perform planned hardware maintenance without interrupting workloads or exposing encrypted guest memory, ensuring seamless uptime for long-running confidential applications.

Enhancing trust and collaboration: Innovations in Confidential Space

Confidential Space is a Confidential Computing environment designed to enable secure multi-party computation and data sharing. It allows organizations to collaborate on sensitive data, such as for joint machine learning or data analytics, without revealing the data to each other or to Google Cloud.

“Google Cloud Confidential Space allows us to provide financial institutions with security guarantees similar to or better than an on-prem service,” said Olivier Richaud, vice-president, Platforms and Site Reliability Engineering, Symphony. “Transitioning such security and privacy-sensitive customers to a cloud-based SaaS service would have been impossible without the power of Confidential Computing.”

A key design principle of Confidential Space is to remove the workload operator from the trust boundary, providing cryptographic assurance that only the authorized, attested workload can access the data.

“As AI systems increasingly act on behalf of consumers in financial services, trust in how data is processed becomes paramount. At Sahamati, we see Google Cloud Confidential Space as a foundational technology for enabling privacy-preserving AI in India’s Open Finance ecosystem, creating the trust needed for innovation while maintaining strong security and accountability guarantees,” said Kiran Gopinath, chief innovation officer, and Head, Sahamati Labs.

Our new advancements for Confidential Space provide greater flexibility and stronger assurances. Key updates include:

Independent Verification: Integration with Intel Trust Authority

We are pleased to announce that Intel Trust Authority (ITA) is now generally available as an independent attestation verifier service for Confidential Space.

This integration enables organizations to independently verify the integrity of the Confidential Space environment using Intel’s hardware-rooted attestation before encryption keys are released to workloads. By decoupling attestation verification from the cloud service provider, customers benefit from enhanced transparency, stronger assurance, and a more robust trust model.

"With Confidential Computing woven into our core infrastructure, Google Cloud and Intel are making hardware‑rooted security and independent attestation part of the default fabric of modern compute. From Intel TDX‑powered C4 Confidential VMs running production workloads, to Confidential Space with Intel Trust Authority — now generally available — enabling verifiable multi‑party collaboration, customers can now encrypt, verify, and scale their most sensitive AI and data workflows without rewriting applications or compromising performance, even in the most demanding regulatory environments,” said Anand Pashupathy, general manager and vice-president, Intel Product Assurance and Security (IPAS), Intel Corporation.

Accelerating secure collaboration: Confidential Space with H100 GPU support

To power secure multi-party AI and machine learning, Confidential Space support for NVIDIA Hopper GPUs is now generally available. This can help multiple parties pool their data for training and inference within a Confidential Space environment, using the power of Hopper GPUs, while ensuring that their individual data remains protected from other participants and from Google Cloud.

Confidential Space unlocks use cases like federated learning on sensitive datasets, and building joint models without centralizing data.

“Confidential GPU support in Google Cloud Confidential Space removes one of the biggest barriers to adopting secure AI: the tradeoff between protecting sensitive workloads and achieving production-grade performance,” said Adi Hirschtein, VP Product, Duality. “For Duality customers in healthcare, financial services, and government, this enables federated learning, confidential AI, and encrypted RAG workflows to run on sensitive data at scale while keeping data and models protected throughout processing.”

Next steps

Confidential Computing is becoming an essential layer of cloud computing in the AI era. Explore our expanding portfolio of Confidential VMs, accelerated hardware, and open-source tools to see how you can enable secure collaboration and private AI innovation within your organization.

To learn more, join us at the Confidential Computing Summit on June 23 and 24, 2026.

Open models, global networks: How AT&T and GSMA are accelerating telecom innovation with Gemma

Tue, 23 Jun 2026 12:00:00 +0000

Telecommunications is an incredibly complex, highly specialized domain. Modern mobile networks are inherently multi-vendor, featuring diverse and often proprietary data structures. While AI has made massive leaps in general language and coding, telecom domain knowledge is rarely accessible on the open internet — there is simply no "Wikipedia" for telecoms.

This data scarcity creates a major hurdle for AI models trying to deeply understand network operations. When operating at an immense global scale that connects billions of people hundreds of billions of times a day, the industry requires absolute precision. Yet, according to GSMA Intelligence, only 16% of total AI deployments in telecoms are on the network, largely due to the difficulty of training models on specialized domain knowledge.

While general-purpose AI models have come a long way, the scale, complexity, and specificity faced by telecom providers mean domain-specific models remain the best way to achieve the dramatic network and process automation and agentic workflows that are at the heart of the AI era. And it takes an open model to deliver the flexibility and dynamism global networks require.

Why domain-specific models matter

Generalized frontier models are incredibly capable at broad reasoning and language tasks, but they lack the foundational context required to manage critical infrastructure. General models still struggle with highly specialized vocabulary, complex network topologies, and vendor-specific telemetry data unique to the telecom sector.

Telco-specific models solve this by anchoring the AI in the actual realities of network operations. By training on domain-specific datasets, these tailored models can interpret nuanced technical logs, diagnose network performance bottlenecks, and understand standard industry protocols with the high degree of accuracy and precision required for real-time systems.

Google’s Gemma models: Underpinning Open Telco AI

To address this challenge, the GSMA recently launched the Open Telco AI platform to build accurate, efficient, and trusted telco-grade AI. As a core part of this collaborative effort, AT&T post-trained a family of open telco models, called OTel, on different architectures including Google’s open-source Gemma models.

These models were trained on a specialized telco-specific dataset curated by GSMA and its collaborators, including telecom operators, network equipment providers, and academia. The initiative successfully delivered 30 models across a range of sizes and architectures, optimizing the balance between accuracy and efficiency.

Crucially, these models are built with safety at their core, being trained for abstention using retrieval augmented generation (RAG) to drastically reduce hallucinations — an absolute necessity in highly regulated telecom environments that are so central to modern life.

“The Open Telco AI platform represents a critical milestone in establishing trusted, domain-specific intelligence for the telecommunications industry,” said Louis Powell, director for AI technologies at GSMA. “By leveraging open-source foundations like Gemma, we are proving that highly accurate, efficient, and reproducible models can be built through global industry collaboration.”

Gemma emerges as a leading model

AT&T’s tests during OTel development highlight the strength of Gemma compared to other architectures, demonstrating strong performance gains across the entire OTel model family after telecom-specific fine-tuning. Notably:

The gemma-4-E4B-it model returned the correct response 91.74% of the time, achieving the highest overall accuracy of all models tested.
The baseline version of Gemma 3 with 27 billion parameters delivered the strongest performance in initial model training across the models tested by AT&T.
The Gemma 3 model with 300-million telco-related embeddings saw a significant retrieval improvement.

"Gemma models have increasingly been setting the standard for open-source fine-tuning," said Mark Austin, VP of data science and AI at AT&T. "By training these models specifically on telco data, we'll be able to outperform legacy models several times its size in certain telco scenarios. This can help increase accuracy while driving down costs at the same time."

Empowering the future with Google Cloud's full-stack solutions

The impact of this open collaboration has been immediate, with over 18 million downloads of the models to date. Today, OTel stands as one of the top models on the Open Telco Benchmarks, demonstrating that tailored, smaller models can outperform massive frontier models when optimized for specific domains.

Looking ahead, Google Cloud is committed to supporting telecom operators globally in developing and deploying their own custom telco AI models.

By providing a comprehensive, full-stack solution — including robust AI-optimized infrastructure, AI development tools, and open models like Gemma — we can help operators, vendors, and innovators fine-tune these models further with their own data. This enables telecom operators to accelerate their journey in AI adoption while deploying telco-grade AI safely using Gemma’s built-in support and guardrails.

Together, the telecom industry can replicate the incredible progress seen in coding and reasoning, bringing those advanced capabilities into critical telecom sub-domains such as automated network configuration and self-healing systems.

Boost BigQuery with Python: Managed Python UDFs now generally available

Mon, 22 Jun 2026 17:00:00 +0000

SQL is the industry standard for high-performance structured data analysis. However, expressing complex procedural logic, scientific computations, advanced string manipulations, or machine learning workflows in pure SQL can be highly challenging, if not impossible. That kind of work is better done with Python. Data practitioners often take on additional infrastructure management tasks — maintaining custom images and containers, and working with additional compute services — just to run simple helper functions with custom Python code and libraries.

Today, we are thrilled to announce the general availability (GA) of BigQuery Managed Python User-Defined Functions (UDFs).

This launch represents a major milestone in BigQuery’s extensibility strategy, allowing data scientists, engineers, and analysts to execute custom Python code directly and securely inside BigQuery using standard SQL queries or BigQuery DataFrames (BigFrames) in Python. With this release, Python UDFs are fully supported for production enterprise workloads and completely integrated into BigQuery's billing SKUs.

Bridging SQL and the Rich Python Ecosystem

BigQuery Managed Python UDFs run on BigQuery-managed serverless resources that automatically scale to billions of rows, so you don't have to set up infrastructure or manage containers. BigQuery automatically handles the compilation, image building, security patching, deployment, and execution of your Python code, making it simple to use Python functions in your SQL.

Core benefits

Flexibility: Access the vast Python ecosystem, including top-tier scientific and mathematical libraries like NumPy, SciPy, pandas, and scikit-learn, directly in your SQL SELECT statements.
Tight external API integration: Clean and enrich your BigQuery tables in real time by calling external web APIs or Google Cloud services such as Cloud Translation, Gemini Enterprise Agent Platform or custom microservices securely within your queries.
Fully managed and serverless: BigQuery handles the underlying container infrastructure and auto-scales performance dynamically.

Code example

Here is an example of a Python UDF that uses a popular Python package, beautifulsoup, to remove HTML tags. We use this function to process Stack Overflow answer bodies that are stored in a BigQuery public table:

```sql
CREATE OR REPLACE FUNCTION `your_project.your_dataset.clean_html`(html_content STRING)
RETURNS STRING
LANGUAGE python
OPTIONS (
  runtime_version = 'python-3.11',
  entry_point = 'strip_tags',
  packages = ['beautifulsoup4>=4.12.0']
) AS r'''
from bs4 import BeautifulSoup

def strip_tags(html_content):
    if not html_content:
        return ""
    soup = BeautifulSoup(html_content, "html.parser")
    return soup.get_text(separator=" ")
''';
```

How to query it:

```sql
SELECT
  id,
  `your_project.your_dataset.clean_html`(body) AS cleaned_answer_body
FROM
  `bigquery-public-data.stackoverflow.posts_answers`
LIMIT 100
```
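Before deploying a function like this, it can help to sanity-check the tag-stripping logic locally. The sketch below is a standard-library approximation that swaps in Python's `html.parser` for Beautiful Soup so that it runs without extra packages. It is not the UDF itself: the class and function names are hypothetical, and its output may differ from `get_text(separator=" ")` on malformed or unusual HTML.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the text nodes of a document and ignores all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags_approx(html_content):
    """Return the text of html_content with tags removed.

    Text nodes are joined with a single space, mirroring the separator
    argument used in the UDF above.
    """
    if not html_content:
        return ""
    collector = _TextCollector()
    collector.feed(html_content)
    collector.close()
    return " ".join(collector.chunks)


cleaned = strip_tags_approx("<p>Hello <b>world</b></p>")
```

Because each text node is joined with a space, text that already ends in whitespace produces a double space in the result; normalize whitespace afterwards if that matters for your use case.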

Advanced capabilities

For advanced users, Python UDFs add a set of capabilities for tuning performance and monitoring usage. Here are some examples.

Vectorized processing with Pandas PyArrow
To maximize throughput, the GA release supports direct processing of vectorized input as PyArrow RecordBatches. By processing columns of data in bulk rather than row-by-row, PyArrow eliminates Python serialization and conversion overhead, boosting performance by up to 10x for data-intensive calculations.

Configurable container resources
For heavy-duty data science and ML data preparation, you can now provision container memory (up to 16 GB) and CPU (up to 4 vCPUs) per function. This enables memory-intensive workloads (such as loading large serialized models or geospatial datasets) to run directly within the sandbox.

Customizable concurrency
Optimize your throughput and resource efficiency by configuring concurrent requests per container (up to 1,000 concurrent operations). This helps ensure that your scale-out execution is highly cost-effective and performs exceptionally well under heavy parallel loads.

Streaming logs and real-time metrics
Easily debug and monitor your production workloads. The BigQuery console now features a direct link from your query results to real-time CPU, memory, and concurrency metrics in Cloud Monitoring.

Billing

BigQuery Managed Python UDFs are billed under the BigQuery Services SKU. This SKU is fully eligible for BigQuery spend-based committed use discounts (CUDs), allowing you to maximize budget efficiency.

You can also get cost observability through INFORMATION_SCHEMA.JOBS as well as through the billing labels MANAGED_ROUTINE_EXECUTION and MANAGED_ROUTINE_BUILD.

See more details in the Pricing section of the documentation.

Getting started

To get started with BigQuery Python UDFs, first check out the product documentation.

Then, try out the functions published in the public BigQuery dataset. For example, run the following code in a BigQuery project to tokenize country names from BigQuery public data. Under the hood, the tokenize UDF uses the o200k_base tokenizer.

```sql
SELECT
  country_code,
  country_name,
  `bigquery-public-data`.python_udfs.tokenize(country_name) AS name_tokens,
  ARRAY_LENGTH(`bigquery-public-data`.python_udfs.tokenize(country_name)) AS token_count
FROM
  `bigquery-public-data.census_bureau_international.country_names_area`
ORDER BY
  country_name
```

Or, try out this code lab to explore some advanced scenarios.

Then, to learn how to implement other advanced design patterns, we encourage you to explore our official public documentation guides:

Calling Google Cloud or online services (with connections): To connect securely to first-party Google Cloud services such as Gemini Enterprise Agent Platform or Cloud Translation, or to external API endpoints, using Cloud Resource connections, check out the Call Google Cloud or online services in Python code guide.
BigQuery DataFrames (BigFrames) Python UDFs: To learn how to write, deploy, and scale custom Python functions natively from standard Jupyter notebook or Colab environments using BigQuery DataFrames, visit the Customize Python functions for BigQuery DataFrames guide.

Bring your Python workflows out of isolation and directly into the heart of your data warehouse today!

The Starter Tier for Google AI Studio explained

Mon, 22 Jun 2026 14:00:00 +0000

You've got a working prototype in Google AI Studio. A React frontend, a Node.js backend, maybe a database. Now you want a live URL to share with your team, your users, or a friend who wants to try it.

Google Cloud gives you a full platform for deploying production applications, with fine-grained IAM controls, billing management, and region selection. That's exactly what you want when you're building something serious. But when you just need to get a prototype online in the next ten minutes, there's now a faster path.

Google Cloud Starter Tier resources like Cloud Run, Cloud Firestore, Cloud SQL for PostgreSQL, and Firebase Authentication are provisioned in a fully managed project. You can start using them without a payment method (like a credit card) or a billing account. Your Google Account is enough to go from prompt to live URL, with a database and auth all baked in.

What the Starter Tier actually is

When you set up any of the Starter Tier services within Google AI Studio, Google provisions a fully managed project behind the scenes. You don't create it, configure it, or administer it. Google handles the region selection, API enablement, and security policies for you.

Who can use it? The Starter Tier is currently available to individual Google Accounts. If you are signed in with a corporate or educational Google Workspace account, organization-level administrative policies may restrict your ability to deploy resources. It is also bound by the regional availability of Google AI Studio.

This is different from a standard Google Cloud project where you'd manage IAM roles, enable APIs, and link a billing account. The Starter Tier project is minimalist by design. You can't enable BigQuery or Pub/Sub in it. You can't change the region of any resources. And that's the point: fewer knobs means fewer ways to go off track.

The console experience matches this philosophy. Instead of the full Google Cloud console with hundreds of product pages, Starter Tier users get a simplified view focused on what matters for a prototype: application logs, performance metrics, and basic container configuration. If you navigate to an unsupported product, you'll be prompted to start a separate Free Trial instead of accidentally provisioning billable resources.

One thing to know: Starter Tier resources aren't governed by the standard Google Cloud Terms of Service. They fall under the Starter Tier Additional Terms. For prototyping and business applications, these terms won't get in your way.

What you get: the pre-wired stack

The Starter Tier doesn't give you the entire Google Cloud catalog. Instead, it offers a pre-wired stack of four products that are provisioned on demand as your application's architecture requires them.

Cloud Run

Cloud Run is the compute layer. Every Google AI Studio deployment creates a Cloud Run service that handles HTTP traffic. Under the Starter Tier, you can deploy up to two active web applications at a time per Google Account. Cloud Run services scale automatically based on incoming traffic and scale down to zero when idle, meaning your prototypes don't consume resources when not in use. They run in a single region that is locked in when you first provision your Starter Tier environment.

Firebase Authentication

If your app needs user login, the Starter Tier includes Firebase Authentication with Google Sign-In preconfigured. The AI agent in Google AI Studio can detect when your prompt implies user identity (for example, "build a shared to-do list") and will offer to enable auth automatically.

If your application builds on Google Workspace integrations, this sign-in flow simplifies credentials. Once a user logs in, your application can request OAuth access scopes to securely interact with their Gmail, Docs, Calendar, or Sheets data, making it straightforward to prototype internal tools like summarizers or inbox sorters.

Cloud Firestore

Cloud Firestore is a database service that handles NoSQL data storage. The Google AI Studio agent can provision it automatically when your prompt implies the need for structured data storage. The AI agent generates the client-side sync code (typically a /src/lib/firebase.ts file), and drafts application-appropriate Firebase Security Rules (for example, utilizing request.auth.uid to restrict document access to the authenticated creator).

If you hit a "Missing or insufficient permissions" error, you can click "Fix error" in Google AI Studio, and the agent will rewrite the security rules to match your updated app logic. It's worth reviewing these security rules manually before sharing your app broadly, though. AI-generated security rules are a starting point, not a guarantee.
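When reviewing generated rules by hand, it can help to write down which requests you expect to pass and which to fail. The sketch below models the owner-only pattern described above as a plain Python predicate. It is only a reasoning aid: real rules are written in the Firebase Security Rules language, and the `ownerUid` field name and the `owner_only` helper are hypothetical.

```python
from typing import Optional


def owner_only(auth_uid: Optional[str], document: dict) -> bool:
    """Model of an owner-only rule: the caller must be signed in AND
    their uid must match the uid stored on the document.

    `ownerUid` is an assumed field name; use whatever your app stores.
    """
    return auth_uid is not None and document.get("ownerUid") == auth_uid


# Cases worth checking against your real rules before sharing the app.
todo = {"title": "Buy milk", "ownerUid": "alice"}
cases = {
    "owner": owner_only("alice", todo),
    "other signed-in user": owner_only("bob", todo),
    "signed-out visitor": owner_only(None, todo),
    "document missing owner field": owner_only("alice", {"title": "orphan"}),
}
for name, allowed in cases.items():
    print(f"{name}: {'allow' if allowed else 'deny'}")
```

Only the first case should be allowed. If your generated rules grant access in any of the other three, that is a sign they are broader than intended.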

All Firestore databases created by the Google AI Studio agent share a usage quota (more on that in the limits section below).

Cloud SQL for PostgreSQL Developer edition

When you need relational data with proper schemas, joins, and ACID compliance, the Starter Tier provisions Cloud SQL for PostgreSQL Developer edition, which is designed to work seamlessly with the Google AI Studio agent. The Developer edition offers instant provisioning and scales to zero, which makes for a fast, low-cost developer experience. You also get the full power of open source PostgreSQL with capabilities like pgvector, so you can build semantic search or RAG applications without bolting on a separate vector database.

As you iterate on your application using prompts, the Google AI Studio agent automatically generates the required schema and migrates it as you build and publish your application.
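To make the pgvector mention more concrete, here is a minimal pure-Python sketch of the ranking a cosine-distance search performs. pgvector's `<=>` operator returns cosine distance (1 minus cosine similarity). The document titles, the three-dimensional vectors, and the `rank_by_cosine_distance` helper are illustrative assumptions, not the output of a real embedding model.

```python
import math
from typing import Dict, List, Tuple


def cosine_distance(a: List[float], b: List[float]) -> float:
    """Cosine distance = 1 - cosine similarity (what pgvector's <=> computes)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_cosine_distance(
    query: List[float], docs: Dict[str, List[float]], top_k: int = 2
) -> List[Tuple[str, float]]:
    """Return the top_k documents closest to the query, nearest first."""
    scored = [(name, cosine_distance(query, vec)) for name, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]


# Toy "embeddings"; a real app would store model-generated vectors.
DOCS = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "return an item": [0.8, 0.2, 0.1],
}
results = rank_by_cosine_distance([1.0, 0.0, 0.0], DOCS)
for name, distance in results:
    print(f"{name}: {distance:.3f}")
```

In PostgreSQL with pgvector, the same ranking would typically be an `ORDER BY embedding <=> $1 LIMIT 2` clause, where `embedding` is a hypothetical vector column.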

From prompt to live URL in five steps

1. Open Google AI Studio Build Mode. Go to Google AI Studio and switch to Build Mode. No payment method, no project setup.

2. Describe your app. Type a prompt like "Build a shared to-do list app using Firebase as a backend." The agent generates a React frontend and a Node.js backend, with a live preview on the right side of the screen.

3. Enable Firebase (if prompted). If your prompt involves user data or authentication, the agent shows a configuration card to enable Firebase. Click the Settings icon to pick a region (this locks in the Cloud Run region too), then confirm.

4. Click Publish > Get Started > Publish App. The agent packages your code and provisions a Cloud Run service in your Starter Tier project.

5. Grab your URL. Within seconds, you'll have a live .run.app URL. You can monitor it from the simplified Google Cloud console view that shows logs and metrics for your deployed containers.

That's it. No Dockerfile, no gcloud CLI, no YAML configuration files.

How the Starter Tier compares

Google Cloud offers several ways to explore for free. Below, we compare the Starter Tier to the Free Trial, the most common entry point for new users.

	Starter Tier	Free Trial
What you get	Pre-wired stack of four products with limited quota: Cloud Run, Firestore, Cloud SQL, and Firebase Authentication	$300 Welcome credit; Google Cloud Free Tier; other product-specific free trials; 90-day exploration with no risk of being billed
What we need from you	A Google Account; acceptance of the Starter Tier Additional Terms of Service	Acceptance of the Google Cloud Terms of Service; a form of payment for anti-fraud purposes
Time limit	None	90 days
Project control	Google-managed	Full control
Console experience	Simplified	Full
Best for	Prototyping from AI Studio	Evaluating the full Google Cloud platform
What happens when you are ready for more?	Upgrade to a paid account by adding a payment method. If you’ve never had a billing account before, you will receive the $300 Welcome credit and access to the Free Tier. You will then be billed for usage that the Free Tier and $300 credit cannot cover.	Upgrade to a paid billing account to keep your existing project, remaining credits, and Free Tier and full platform access. You will then be billed for usage that the Free Tier and any remaining credit cannot cover.

Starter Tier is best for AI Studio prototyping. Choose the Free Trial if you need BigQuery, GKE, or Gemini Enterprise Agent Platform, or if you want the 90-day period to evaluate GCP broadly with no risk of being billed. Both paths allow you to seamlessly upgrade to a paid account for the full experience whenever you are ready.

How to plan for limits

The Starter Tier is generous for prototyping, but it does have boundaries. Knowing them upfront saves you from unpleasant surprises.

Two-app cap. You can deploy a maximum of two applications. Note that if you want to replace one of your active applications, you should deploy over or overwrite the existing app slot in Google AI Studio rather than attempting to delete the service manually in the Cloud Console.

Single region. All resources in your Starter Tier project are pinned to one region, chosen whenever the first Starter Tier service is provisioned. For example, if a Firestore database is provisioned before deploying to Cloud Run, then the region is chosen at that time.

Locked API surface. You can't enable additional Google Cloud APIs (BigQuery, Pub/Sub, Cloud Functions, etc.) in a Starter Tier project. If you need them, you'll need to upgrade.

Ephemeral filesystem. Because your published Google AI Studio app runs inside a serverless Cloud Run container, it inherits a temporary filesystem. Any files you write directly to disk (like uploaded images, generated PDFs, or local SQLite databases) will vanish when the container scales to zero or gets redeployed. Since Google AI Studio redeploys your container with each prompt iteration, this happens frequently. Store persistent data in Firestore or Cloud SQL for PostgreSQL.

Firestore shared quota. All Firestore databases created by the Google AI Studio agent share a single shared-quota group. In Google Cloud, a quota represents a usage limit or daily budget to protect the project and prevent abuse. It is not a guarantee of reserved server capacity.

Quota Metric	Starter Tier Maximum Limit
Total Stored Data	1 GiB total
Network Egress	10 GiB per month
Write Operations	40,000 writes per day
Read Operations	50,000 reads per day
Real-Time Updates	50,000 updates per day

If any database in the group exhausts a daily limit, all databases in the group pause until roughly midnight Pacific Time. Firebase Authentication usage is metered separately, so a spike in logins won't eat into your database quota.
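A quick back-of-the-envelope check can tell you whether a prototype fits inside these daily limits. The sketch below uses the read, write, and real-time update limits from the table above; the per-user usage profile and the `max_daily_users` helper are hypothetical, so substitute numbers measured from your own app.

```python
# Daily Firestore limits copied from the Starter Tier quota table.
DAILY_LIMITS = {"reads": 50_000, "writes": 40_000, "realtime_updates": 50_000}


def max_daily_users(ops_per_user: dict) -> tuple:
    """Return (user_count, limiting_metric) for a per-user daily usage profile.

    The metric that runs out first determines how many users fit.
    """
    caps = [
        (DAILY_LIMITS[metric] // ops, metric)
        for metric, ops in ops_per_user.items()
        if ops > 0
    ]
    if not caps:
        raise ValueError("profile must use at least one metered operation")
    return min(caps)


# Hypothetical profile: what one active user costs per day across all your apps.
profile = {"reads": 200, "writes": 25, "realtime_updates": 100}
users, bottleneck = max_daily_users(profile)
print(f"~{users} daily users before '{bottleneck}' runs out")
# prints: ~250 daily users before 'reads' runs out
```

Because the quota is shared by every database in the group, the profile should add up usage from all of your Starter Tier apps, not just one.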

Cloud SQL shared quota. You can build a maximum of two apps with Cloud SQL. The Google AI Studio agent automatically falls back to Firestore if the Cloud SQL quota is exceeded. You can get more quota by growing out of the sandbox, as described below.

Growing out of the sandbox

The best part of the Starter Tier is how you upgrade from it. There's no migration, no data export, no DNS cutover. When you're ready to scale, you upgrade in place.

From the Projects page in Google AI Studio, click "Set up billing." You'll create a Cloud Billing account, enter a payment method, and accept the standard Google Cloud Terms of Service. If you are eligible, you will automatically receive the $300 Welcome credits, which will offset your usage costs during the trial period. The upgrade happens with zero downtime: your Cloud Run services keep running, your databases keep their data, and your .run.app URLs don't change.

After upgrading, you get full IAM control, the ability to enable any Google Cloud API, and access to all regions and scaling options. The following cost safeguards are recommended:

Set a budget alert: Go to the Google Cloud Billing console and set up a budget alert (e.g., at $10) to notify you if usage exceeds your expectations.
Set a Cloud Run max instance cap: In the Starter Tier, Google pins your maximum container instances to 1. Once you upgrade, configure an instance limit (e.g., --max-instances 5) to prevent unexpected scaling charges from sudden traffic spikes.
Configure API quotas: Set caps on API calls (such as the Gemini API or Firestore reads/writes) to enforce a hard ceiling on usage.
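As a rough illustration of how a budget alert behaves, the sketch below checks which alert thresholds a given spend has crossed. The threshold percentages and the `crossed_thresholds` helper are assumptions for illustration; the actual thresholds are whatever you configure on the budget.

```python
def crossed_thresholds(spend: float, budget: float, thresholds=(0.5, 0.9, 1.0)) -> list:
    """Return the budget fractions that the current spend has reached.

    thresholds are illustrative fractions of the budget amount.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    return [t for t in thresholds if spend >= t * budget]


# With the $10 budget from the example above and $9.50 of usage so far,
# the 50% and 90% alerts would have fired, but not the 100% alert.
fired = crossed_thresholds(spend=9.50, budget=10.00)
print(fired)  # [0.5, 0.9]
```

Keep in mind that a budget alert only notifies you. It does not stop usage, which is why the instance cap and API quotas in the list above matter as hard limits.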

One caveat: Firestore databases created by the Google AI Studio agent stay in the shared-quota group even after you add billing. To get more usage quota for a database, go to the Firebase console, navigate to your Firestore database, and click "Upgrade database". This removes the database from the shared-quota group and puts it on standard billing, although standard Firestore Free Tier limits still apply before you are charged.

The continuity across paths makes this process smooth. You can start with a prototype on the Starter Tier, iterate on it for weeks, and then flip it to a production-grade Google Cloud project when it's ready, without rebuilding anything.

Got questions about the Starter Tier, or want to share what you've built with it? Share your thoughts with the community on the r/GoogleCloud and r/Firebase subreddits.

What’s new with Google Cloud

Fri, 19 Jun 2026 16:00:00 +0000

Want to know the latest from Google Cloud? Find it here in one handy location. Check back regularly for our newest updates, announcements, resources, events, learning opportunities, and more.

Tip: Not sure where to find what you’re looking for on the Google Cloud blog? Start here: Google Cloud blog 101: Full list of topics, links, and resources.

Jun 15 - Jun 19

Join us for a deep dive into agentic AI control with AppyThings
Your integrations aren’t failing—they are evolving. When users interact with AI agents, they no longer arrive directly at your site, resulting in experiences stripped of your context, expertise, and intended experience. Join us on Thursday, June 25, for a community tech talk in partnership with AppyThings to learn how to solve this new gateway challenge. We will explore how MTN laid an integration foundation with the Model Context Protocol (MCP) to deliver accurate, consistent experiences. Our technical experts will demonstrate how to leverage Apigee as a centralized tools management solution to govern agent access.

Register for the session
Optimize Spot VM Deployments with Capacity Advisor for Spot, Now in Public Preview
Google Compute Engine has launched Capacity Advisor for Spot in Public Preview, now open to all customers. This tool turns Spot capacity discovery into a data-driven process by providing real-time deployment recommendations to maximize obtainability and minimize preemption risks. Query the Capacity Advisor API for obtainability and minimum estimated uptimes, or use the new Console UI featuring a global availability map, spot price lookups, and historical preemption rate trends to visually find the most cost-efficient compute capacity.

Get started today and optimize your Spot VM deployments!
Build a multi-tenant agentic AI system
When scaling generative AI across different business units, your teams need specialized AI agents with unique operational rules and tools. Our new reference architecture helps you build a centralized multi-tenant platform to prevent fragmented silos, eliminate data exposure risks, and maintain unified compliance. Read the guide to design and deploy a multi-tenant agentic AI system in Google Cloud.
How to Configure Gemini Enterprise to Connect to a Custom MCP Server
The Gemini Enterprise MCP Connector was a big announcement at Google Cloud Next because it introduces the ability to connect Gemini Enterprise to MCP servers. This blog post provides a step-by-step guide on how to configure your first Custom MCP Server connector using the Google Maps Ground Lite MCP server as an example. Once you understand this flow, you can configure multiple MCP servers with Gemini Enterprise to bring all the context you need.

Jun 8 - Jun 12

Simplify Multi-Cloud Planning with Cloud Location Finder, now Generally Available
Cloud Location Finder provides up-to-date data on public regions, zones, and Google Distributed Cloud Connected locations across Google Cloud, AWS, Azure, and OCI. You can now programmatically discover locations based on provider, proximity, territory, and carbon footprint to optimize your global infrastructure strategy for performance, compliance, and sustainability.

Get started for free today

Jun 1 - Jun 5

Modeling the physical world with BigQuery Graph
Managing complex supply chains requires more than just spreadsheets; it requires a digital replica of the physical world. In this post, Guru Rangavittal and Candice Chen explore how BigQuery Graph enables organizations to build a digital twin by turning physical assets into an interconnected map of nodes and edges. By moving beyond traditional relational databases, businesses gain real-time clarity into operations—from executing surgical ingredient recalls to analyzing weather-driven logistics risks. Discover how BigQuery Graph transforms reactive firefighting into proactive, precision modeling, allowing you to see critical connections in seconds and future-proof your supply chain.
Apigee for AI: Govern LLMs and MCP Servers (Presented in Spanish)
Learn how to securely transition your AI initiatives from experimental prototypes to enterprise-ready deployments. Join Luis Cuellar on June 18 for a technical deep dive (presented in Spanish) exploring Apigee’s latest AI gateway capabilities. Discover how to centralize governance over Model Context Protocol (MCP) servers, protect Large Language Models (LLMs) with robust API gateway security policies, and manage token-based quotas.

Register for the June 18 Spanish Community TechTalk

May 25 - May 29

Anthropic’s Claude Opus 4.8 is now available on Gemini Enterprise Agent Platform. As we continue to expand our platform's model offerings, this addition gives organizations more options for handling complex, multi-stage enterprise workflows. Claude Opus 4.8 brings strong capabilities in agentic coding, allowing developers to manage extensive refactors and track dependencies over extended sessions.
API Horizon Munich July 6, 2026: Orchestrating the Next Era of AI and APIs
Master the orchestration of next-gen AI and digital ecosystems. Join Google Cloud experts and DACH tech leaders on July 6 for an exclusive look at the Apigee roadmap, Agent Management, and Model Context Protocol (MCP). Gain real-world insights and connect with the regional integration community.

Register now
Securing AI Agents: The Extended Agent Gateway Pattern
Learn how to prevent autonomous AI agents from invoking unauthorized APIs. Join Apigee Specialist Joel Gauci on June 4 for a technical deep dive into the Extended Agent Gateway pattern. This session covers enforcing Fine-Grained Authorization (FGA), implementing secure token exchange, and establishing Model Context Protocol (MCP) governance at the API gateway layer to protect enterprise backend services.

Register for the June 4 Community TechTalk
API-to-Agent Security: Exposing REST APIs to Gemini Enterprise via MCP
Connect Gemini Enterprise agents to core data without creating security hazards. Join Google Cloud Specialist Nigel Walters on June 11 to learn how to instantly transform legacy REST APIs into secure Model Context Protocol (MCP) servers. We’ll cover how to safely register tools with Gemini while enforcing gateway-level guardrails like rate limiting and access control policies.

Register for the June 11 Community TechTalk

May 18 - May 22

Chinese Webinar | June 4: AI Command and Control
As AI agents move from experimental pilots to core enterprise functions, governance has become a critical next step. Join Google Cloud on June 4th at 10:00 AM (Beijing Time) to learn how to build a secure AI management layer architecture. We'll explore how to develop governed MCP (Model Context Protocol) endpoints, manage tool access to enterprise data, and leverage robust audit logs to operationalize AI. This session also includes a practical demonstration of these governance frameworks on Google Cloud.

Register here
GCP Announces New Features to Benchmark and Optimize LLMs for On-Device Use Cases
Deploying fine-tuned LLMs from GCP to edge devices like smartphones is complex due to fragmented hardware. Google AI Edge Portal bridges this gap, giving GCP developers the ability to test AI performance on 120+ Android devices, representing the full diversity of high, medium, and low tier smartphones on the market today. This week at I/O, we announced brand new capabilities to benchmark and debug LLM performance across these devices. Sign up to use these new features in private preview today.

May 11 - May 15

Build Your AI & MCP Control Tower for Universal Governance
Master the future of agentic security with Apigee. Join our Community TechTalk on May 21 to discover how Apigee serves as a central "Control Tower" for the Model Context Protocol (MCP). We will explore how new JSON-RPC tool authorization enables fine-grained access policies across your organization, ensuring secure and scalable AI deployments. Whether managing internal tools or external users, learn to govern your agentic ecosystem with absolute precision. This session is designed for global coverage across EMEA and AMER regions.

Register for the May 21 Community TechTalk

Apr 27 - May 1

Master Your Launch: The Apigee Production Go-Live Checklist
Ensure a secure launch with the Apigee production guide. Join Nicola Cardace on May 28 to explore security guardrails, including IAM roles, mTLS configurations, and encrypted KVM migrations. Scheduled at 11 AM EDT / 5 PM CEST to support EMEA and AMER teams, this TechTalk provides the technical roadmap you need to flip the switch with absolute confidence.

Register for the May 28 Community TechTalk
Transforming APIs into Governed Agentic Tools on the Google Cloud Agentic Platform
Turn your APIs into secure, governed agentic tools on the Google Cloud Agentic Platform. Join Specialist Christophe Lalevée on May 7 for a technical deep dive into AI productization. Scheduled at 5 PM CEST / 11 AM EDT to maximize coverage for developers across EMEA and AMER, this session explores the integration and governance frameworks required to scale enterprise-ready AI with confidence.

Register for the May 7 Community TechTalk
Fractional G4 VMs are Generally Available, providing a highly efficient and cost-effective entry point for AI and graphics workloads. These new configurations, using NVIDIA virtual GPU (vGPU) technology, allow you to leverage the power of the NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs in flexible, smaller increments, so you can right-size your infrastructure to match the specific demands of your applications. By providing more granular access to advanced hardware, fractional G4 VMs let you optimize resource allocation and reduce overhead without sacrificing performance. You can now select from additional GPU slice sizes for your specific needs:
- 1/2 GPU: Ideal for more intensive tasks such as LLM inference, robotics sensor simulation, and high-fidelity 3D rendering.
- 1/4 GPU: Optimized for mainstream workloads, including mid-range creative design, video transcoding, and real-time data visualization.
- 1/8 GPU: Great for lightweight applications such as remote desktops, productivity tools, and entry-level streaming services.
Transitioning AI from a sandbox prototype to an enterprise-grade system is a major hurdle. A monolithic script won't suffice for widespread deployment. To achieve true scale and reliability with Gemini, organizations must adopt service-oriented micro-agent architectures, establish Zero-Trust security, and implement rigorous EvalOps. Master the "Agentic Maturity Ladder" to ensure your AI & Agentic solutions are robust, secure, and ready for the real world.

Watch the deep dive and read the developer blog to learn more.
ML Development in VS Code with Google Cloud Power: Workbench Extension Now Available
Data scientists and developers can now combine the local productivity of VS Code with the scalable infrastructure of Google Cloud. The new Google Cloud Workbench Notebooks extension allows you to connect to and run notebooks on managed cloud environments directly within your local IDE. This integration streamlines the ML lifecycle by eliminating context switching and providing high-performance compute for complex workloads in a familiar interface. As part of our commitment to the developer ecosystem, the extension is fully open-sourced to support community-driven innovation.
- Install from Marketplace: GoogleCloudTools.workbench-notebooks
- Contribute on GitHub: colab-enterprise-vscode

Apr 20 - Apr 24

Announcing the 2026 Google Cloud Partners of the Year
Google Cloud is honored to celebrate the winners of the 2026 Partner of the Year awards! These awards recognize an exceptional group of partners across AI, Security, Infrastructure, and more, who have demonstrated a commitment to customer success. From global system integrators to specialized startups, these winners are leveraging the power of Google Cloud to solve complex challenges and drive digital transformation worldwide. Join us in congratulating these organizations for their innovation, collaboration, and impactful results over the past year.

See the 2026 Partner Award winners

Apr 13 - Apr 17

We're excited to announce the Public Preview of Datastream’s metadata integration with Knowledge Catalog. This is the first step in our vision to provide a centralized, "single pane of glass" for all Datastream assets. The enhancement automatically synchronizes Streams, Connection Profiles, and Private Connections, eliminating data silos. It enhances discoverability, allowing you to search for Datastream assets using the same interface as BigQuery tables. Centralized governance is also provided, making your real-time data estate more transparent and easier to manage.
Upgrading Apigee OPDK to 4.53 with OS Modernization
Modernize your infrastructure using Google’s official, sequential upgrade path. Our technical expert, Rakesh Talanki, outlines how to upgrade Apigee OPDK to v4.53 while migrating to a supported OS (RHEL 8.x/9.x). This guide covers the "build-out" methodology, including multi-data center syncing, to ensure a stable, zero-downtime transition.

Read the guide
Cloud Run Worker Pools and CREMA: Powering Serverless AI at Scale
Google Cloud has announced the General Availability of Cloud Run worker pools, a new resource type designed specifically for pull-based, non-HTTP workloads. Unlike traditional Cloud Run services that scale based on request traffic, worker pools provide an "always-on" environment for background tasks like processing message queues or running large-scale AI inference. To support this, Google Cloud also open-sourced the Cloud Run External Metrics Autoscaler (CREMA). Built on KEDA, CREMA enables queue-aware autoscaling for worker pools, allowing them to dynamically scale based on external signals like Pub/Sub backlog or Kafka lag.
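To give a feel for what queue-aware autoscaling means, here is a simplified model of the target-per-replica rule commonly used by KEDA-style scalers: divide the backlog by the number of messages each worker should handle, round up, and clamp to the configured bounds. This is a sketch under that assumption, not CREMA's actual implementation, and the function and parameter names are hypothetical.

```python
import math


def desired_workers(
    backlog: int,
    target_per_worker: int,
    min_workers: int = 0,
    max_workers: int = 10,
) -> int:
    """Workers needed so each handles at most target_per_worker queued messages."""
    if target_per_worker <= 0:
        raise ValueError("target_per_worker must be positive")
    raw = math.ceil(backlog / target_per_worker)
    # Clamp to the configured floor and ceiling.
    return max(min_workers, min(max_workers, raw))


# A Pub/Sub backlog of 250 messages at 100 messages per worker needs 3 workers.
print(desired_workers(backlog=250, target_per_worker=100))  # 3
# An empty queue scales down to the floor; a huge backlog stops at the cap.
print(desired_workers(backlog=0, target_per_worker=100))  # 0
print(desired_workers(backlog=5000, target_per_worker=100))  # 10
```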
Apigee Model Context Protocol (MCP) now Generally Available
Expose enterprise APIs as MCP tools for agentic AI applications with the General Availability of MCP in Apigee. This update allows developers to transform APIs into AI-ready tools using OpenAPI Specifications, removing the need for local MCP servers or additional infrastructure. With managed endpoints and semantic search in API hub, you can now provide AI agents with secure, governed access to enterprise data at scale.

Explore the MCP overview

Apr 6 - Apr 10

Community TechTalk: Powering Retail Agents with ADK, UCP & Apigee X
Move beyond basic chatbots to secure, transactional AI experiences. Join our Community TechTalk on April 16 to learn how Apigee X and Gemini build a "Trust Layer" for AI shopping assistants using UCP standards. We’ll demonstrate how to block prompt injections with Model Armor and implement cost governance via token limits to secure the path from discovery to purchase.

Register for the TechTalk
Implement multimodal capabilities in your AI agents
Explore three new reference architectures for building sophisticated multi-agent AI systems that can process and analyze multimodal data. To analyze disparate multimodal data and produce a high-confidence classification, see Classify multimodal data. To create a fluid conversational AI that processes audio and video streams in real time, see Enable live bidirectional multimodal streaming. To consolidate fragmented multimodal data into a searchable knowledge graph, see Multimodal GraphRAG resource orchestration.
Automate SecOps workflows with an agentic AI system
To accelerate incident response and reduce manual toil for your security team, you need a system that can automate remediation playbooks. Our new reference architecture helps you build an AI agent that orchestrates complex triage and investigation workflows across disparate security tools, such as SIEM, CSPM, and EDR, from a single interface. See the full guide to orchestrate security operations workflows.

Mar 30 - Apr 3

ASEAN Webinar | April 30: Mastering Agentic Governance at Scale with GCP
As AI agents move from experimental pilots to core enterprise functions, governance is the critical next step. Join Google Cloud experts Shilpi Puri & Wely Lau for a webinar on April 30th at 11:00 AM SGT to learn how to architect a secure AI Management layer. We’ll explore developing governed MCP endpoints, managing tool access to enterprise data, and operationalizing AI with robust audit logs. The session includes a live demo of these frameworks in action on Google Cloud.

RSVP here.

Mar 23 - Mar 27

Turn your API sprawl into an agent-ready catalog
As organizations scale, APIs often become scattered across multiple gateways, creating "blind spots" that hinder AI adoption. To solve this, we’ve introduced two new capabilities for Apigee API hub: a new integration with API Gateway to automatically centralize API metadata into a single control plane, and a specification boost add-on (now in public preview). This add-on uses AI to enhance your API documentation with the precise examples and error codes that AI agents need to function reliably.

Read the full blog post to get started.
Webinar | April 16: AI Command & Control
As AI agents move from experimental pilots to core enterprise functions, governance is the critical next step. Join Google Cloud expert Satyam Maloo for a webinar on April 16th at 11:00 AM IST to learn how to architect a secure AI Management layer. We’ll explore developing governed MCP endpoints, managing tool access to enterprise data, and operationalizing AI with robust audit logs. The session includes a live demo of these frameworks in action on Google Cloud.

RSVP here.
Modernizing and Decoupling Event Ingestion with Apigee
In modern cloud-native architectures, decoupling producers from consumers is critical for building resilient systems. While Google Cloud Pub/Sub provides a scalable backbone, exposing it directly to external clients can introduce security and management overhead. This new guide explores how to leverage Apigee as an intelligent HTTP ingestion point. Learn how to handle security, mediation, and traffic control before messages reach your internal bus using the PublishMessage policy or Pub/Sub API.

Read the full guide.
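
As a rough illustration of what sits behind such an ingestion point, the sketch below builds the JSON body that the Pub/Sub REST publish method expects, with each message payload base64-encoded. The helper name build_publish_request, the project and topic IDs, and the sample event are hypothetical. In the pattern described above, Apigee would handle security, mediation, and traffic control before a request like this reaches Pub/Sub.

```python
import base64
import json


def build_publish_request(project_id, topic_id, events, attributes=None):
    """Return (url, body) for a Pub/Sub REST publish call.

    Hypothetical helper: it only builds the request and sends nothing.
    """
    url = (
        "https://pubsub.googleapis.com/v1/"
        f"projects/{project_id}/topics/{topic_id}:publish"
    )
    messages = []
    for event in events:
        # Pub/Sub expects the message payload as a base64-encoded string.
        payload = json.dumps(event, sort_keys=True).encode("utf-8")
        message = {"data": base64.b64encode(payload).decode("ascii")}
        if attributes:
            message["attributes"] = dict(attributes)
        messages.append(message)
    return url, {"messages": messages}


# Made-up identifiers, for illustration only.
url, body = build_publish_request(
    "example-project", "orders", [{"order_id": 42}], {"source": "partner-api"}
)
```

Keeping the request construction separate from the HTTP call makes it easier to see which parts a gateway could validate or rewrite before forwarding.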

Mar 16 - Mar 20

Gemini-powered Assistant in BigQuery Studio Gets Context-Aware Upgrades
The Gemini-powered assistant in BigQuery Studio has been transformed into a fully context-aware analytics partner, supporting your entire data lifecycle. The new capabilities include intelligent resource discovery, which uses Dataplex Universal Catalog search to find resources across projects and deep dive into metadata using natural language. You can now automate tasks, such as scheduling production-grade queries directly through the chat interface, and instantly troubleshoot long-running or failed jobs with root cause analysis and cost control auditing.

Explore the full range of what the assistant can do.

Mar 9 - Mar 13

Want to use Gemini to develop code and don't know where to start?
This article walks through a couple of examples of developing code with Gemini prompts and identifies the changes that were needed to get the generated code working. It also points to other examples that are available on GitHub.

Mar 2 - Mar 6

Introducing Gemini 3.1 Flash-Lite, our fastest and most cost-efficient Gemini 3 series model. Built for high-volume developer workloads at scale, 3.1 Flash-Lite delivers high quality for its price and model tier. Gemini 3.1 Flash-Lite can tackle tasks at scale, like high-volume translation and content moderation, where cost is a priority. And it can also handle more complex workloads where more in-depth reasoning is needed, like generating user interfaces and dashboards, creating simulations or following instructions.

Starting today, 3.1 Flash-Lite is rolling out in preview to enterprises via Vertex AI and developers via the Gemini API in Google AI Studio.
TechTalk: Implementing Device Authorization Grant (RFC 8628) for Apigee
Learn how to authorize "headless" devices like Smart TVs or AI agents that lack keyboards and browsers. Join our Community TechTalk on March 19 (5PM CET / 12PM EDT) to go under the hood of Apigee X/Hybrid. We’ll cover the real-world mechanics of state management, polling, and human-in-the-loop security patterns for devices and autonomous agents.

Register for the TechTalk
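
For readers new to the polling mechanics, here is a minimal sketch of the client-side loop that RFC 8628 describes: wait, ask the token endpoint, and react to authorization_pending and slow_down. It is not Apigee-specific. The request_token callable is a hypothetical stand-in for an HTTP POST to a token endpoint, and the replies in the usage example are simulated.

```python
import time

DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"


def poll_for_token(request_token, client_id, device_code,
                   interval=5, expires_in=600, sleep=time.sleep):
    """Poll until the user approves, denies, or the device code expires.

    `request_token` is a hypothetical callable that posts the form fields
    to the token endpoint and returns the decoded JSON response as a dict.
    """
    waited = 0
    while waited < expires_in:
        sleep(interval)
        waited += interval
        response = request_token({
            "grant_type": DEVICE_GRANT,
            "client_id": client_id,
            "device_code": device_code,
        })
        error = response.get("error")
        if error is None:
            return response          # success: contains the access token
        if error == "authorization_pending":
            continue                 # the user has not finished yet
        if error == "slow_down":
            interval += 5            # RFC 8628: add 5 seconds to the interval
            continue
        raise RuntimeError(error)    # access_denied, expired_token, ...
    raise TimeoutError("device code expired before the user approved")


# Simulated endpoint replies; no network access is involved.
replies = iter([
    {"error": "authorization_pending"},
    {"error": "slow_down"},
    {"access_token": "example-token"},
])
delays = []
token = poll_for_token(lambda form: next(replies), "example-client",
                       "example-device-code", sleep=delays.append)
```

Injecting the sleep function keeps the loop testable without real waiting, and the recorded delays show the interval growing after slow_down.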

Feb 23 - Feb 27

Pro-level image generation gets faster and more accessible with Nano Banana 2
Nano Banana 2 is our state-of-the-art image generation and editing model. It delivers Pro-level image generation and editing at the speed you expect from Flash — making the quality, reasoning, and world knowledge you loved about Nano Banana Pro more accessible. Learn more about the model here.

The Intelligent Path to Compliance: Transforming Regulatory QC with Google Cloud
Reducing "Refuse to File" (RTF) risks and submission cycle times is critical for life sciences leaders. Google Cloud’s Regulatory Submission Semantic QC Auditor leverages Gemini and RAG architecture to transform Quality Control from a manual burden into an active, intelligent workflow.

By automating semantic cross-referencing, narrative coherence checks, and dynamic guidance-based auditing, this solution ensures rigorous accuracy and auditability. Operating within a secure GxP-ready environment, it empowers teams to detect subtle inconsistencies and generate remediation plans without sacrificing data privacy.

Learn more.
Stop typing, start interacting! The Gemini Live Agent Challenge is here. Build immersive agents that can help you see, hear, and speak using Gemini and Google Cloud. Compete for your share of $80,000+ in prizes and a trip to Google Cloud Next '26!

Submissions are open from February 16, 2026 to March 16, 2026. Learn more and register at geminiliveagentchallenge.devpost.com

Feb 9 - Feb 13

Introducing Gemini 3.1 Pro on Google Cloud.
3.1 Pro is a noticeably smarter, more capable baseline for complex problem-solving. We’re shipping 3.1 Pro at scale, building upon our goal to help you transform your business for the agentic future. Learn more about the model’s capabilities here. Gemini 3.1 Pro is available starting today in preview in Vertex AI and Gemini Enterprise. Developers can access the model in preview via the Gemini API in Google AI Studio, Android Studio, Google Antigravity, and Gemini CLI.
Automate Storage Compatibility with GKE Dynamic Default Storage Classes
Managing storage across mixed-generation VM clusters in GKE just got easier. With the new Dynamic Default Storage Class, Google Kubernetes Engine automatically selects between Persistent Disk (PD) and Hyperdisk based on a node's specific hardware compatibility. This abstraction eliminates the need for complex scheduling rules and manual pairing, ensuring your volumes "just work" regardless of the underlying infrastructure. By defining both variants in a single class, you reduce operational overhead while maintaining peak performance and cost-efficiency across your entire cluster.

Explore automated disk type selection
Community TechTalk: AI-Powered Apigee Development with strofa.io
Join the Apigee community on February 26 for a deep dive into strofa.io. Guest speaker Denis Kalitviansky will demonstrate how this new AI-powered tool automates and orchestrates Apigee development, from local emulators to large-scale hybrid environments. Discover how to scale your API management and streamline team collaboration using the latest in AI-driven automation.

Register now to reserve your spot.

Jan 26 - Jan 30

Simplify API Governance with Native OpenAPI v3 Support
Eliminate integration debt and accelerate deployment velocity with the General Availability of OpenAPI v3 (OASv3) support for API Gateway and Cloud Endpoints. You no longer need to downgrade modern specifications to OASv2. Instead, you can now define API contracts and enforce critical policies—including telemetry, quotas, and security—using native Google-specific extensions directly within your OASv3 files. This update ensures your APIs are secure by design while remaining fully compatible with the modern developer ecosystem and Google Cloud’s AI services.

Get started with OpenAPI v3 on API Gateway and Cloud Endpoints.

Accelerate API Testing with the New Open Source API Tester
Start validating your APIs with API Tester, a simple, YAML-based Test Driven Development (TDD) framework. Designed for the Apigee community, this tool allows you to write human-readable tests, run them instantly via a web client or CLI, and perform deep unit testing on Apigee proxies. With native support for JSONPath assertions and Apigee shared flows, you can verify everything from payload data to internal variables like proxy.basepath without leaving your terminal.

Explore the API Tester guide and start testing your proxies today.
Secure Sensitive Data with Kubernetes Secrets in Apigee hybrid
Enhance security in Apigee hybrid by accessing Kubernetes Secrets directly within your API proxies. This hybrid-exclusive feature keeps sensitive credentials within your cluster boundary and prevents replication to the management plane. It supports strict separation of duties: operators manage secrets via kubectl, while developers reference them as secure flow variables—ideal for high-compliance and GitOps workflows.

Implement Kubernetes Secrets in your hybrid proxies.
See the Console in a Whole New Light: Dark Mode is Now Generally Available in Google Cloud
Elevate your cloud management workflow with Dark Mode, now generally available in the Google Cloud console. We have delivered a modern, cohesive, and accessible experience reimagined for maximum comfort and productivity—especially during extended working hours and low-light environments. Dark Mode can be enabled automatically based on your operating system's preference, or manually through the Settings -> Appearance menu.

Switch to Dark Mode today to enjoy a modern, comfortable, and productive environment!
Apigee X Networking: PSC or VPC Peering?
Deciding how to connect Apigee X? Watch this video to compare Private Service Connect and VPC Peering. We break down northbound and southbound routing, IP consumption, and how to reach targets on-prem or in the cloud. Learn to simplify your architecture and avoid common networking "gotchas" for a smoother deployment.

Watch the video.

Jan 19 - Jan 23

Bridge the Gap: Excel-to-API Conversion in Apigee Portals
Give your customers more ways to connect! This new article by Tyler Ayers explores how to extend the Apigee Integrated Portal to support direct Excel file uploads. By leveraging SheetJS and custom portal scripts, you can enable users to upload spreadsheets, preview data, and submit it directly to your APIs, all without writing a single line of integration code themselves. It’s a powerful way to simplify onboarding for those who aren't yet API-ready.

Learn how to build it.
Elevate your applications with Firestore’s new advanced query engine
We have fundamentally reimagined Firestore with pipeline operations for Enterprise edition. Experience a powerful new engine featuring over a hundred new query features, index-less queries, new index types, and observability tooling to improve query performance. Seamlessly migrate using built-in tools and leverage Firestore’s existing differentiated serverless foundation, virtually unlimited scale, and industry-leading SLA. Join a community of 600K developers to craft expressive applications that maximize the benefits of rich queryability, real-time listen queries, robust offline caching, and cutting-edge AI-assistive coding integrations.

Learn more about Firestore pipeline operations.

Scaling Ray Serve LLM on GKE: Performance without losing the developer experience

Thu, 18 Jun 2026 16:00:00 +0000

Developers looking for LLM inference and model serving often turn to Ray Serve, a scalable model serving library with developer-friendly, Python-native APIs built by Anyscale. Combined with Google Kubernetes Engine (GKE), developers have a powerful, unified platform optimized for demanding LLM serving use cases, spanning from initial model development to online production serving.

However, that flexibility and feature set used to come at a cost to performance. But today, in partnership with Anyscale, we are delivering up to 5x higher throughput and 8x lower latency in Ray Serve, meeting the growing demands and rigorous performance requirements of state-of-the-art distributed inference, without having to sacrifice ease of use.

Scaling inference without the bottlenecks

Through our joint engineering partnership, we are introducing three major architectural optimizations that dramatically improve Ray Serve LLM's performance characteristics:

Ray Serve HAProxy integration: Ray Serve now builds in HAProxy to manage internal request routing and load balancing. This setup drastically reduces proxy overhead and prevents the Python runtime from saturating under high traffic.
Direct token streaming architecture: This architecture decouples the initial request path from the return stream. Tokens stream directly from individual model replicas back to the proxy, bypassing the ingress router completely for the streaming data path to cut latency.
v2 Ray executor backend for vLLM: The revamped Ray backend for vLLM moves Ray out of the data plane to enable asynchronous scheduling. This unifies the code path with native vLLM executors, closing the performance gap and helping to ensure Ray users benefit from the latest engine-level optimizations.

Benchmarking performance on GKE

We’ve also collaborated with Anyscale to benchmark the updated Ray Serve LLM on GKE clusters utilizing next-generation AI hardware, including Google Cloud A4 VMs powered by NVIDIA HGX B200 systems. We chose to run Gemma 4 E2B as a small, efficient model to isolate bottlenecks introduced from orchestration and routing. Our benchmarks compared the new Ray Serve LLM to its prior performance, as well as a plain vLLM setup using the Ray executor.

These technical enhancements deliver a transformative impact on performance, offering up to 5x higher throughput and 8x better latency compared to previous Ray Serve configurations.

The improved Ray Serve LLM demonstrated a remarkable improvement on a serving cluster with eight replicas, with a scaling pattern that far exceeds its previous performance and with performance comparable to running vLLM natively, without giving up the flexibility that Ray brings to the table.

We observe that with an increasing number of concurrent users, Ray is now able to scale up throughput while maintaining a low 99th percentile time-to-first-token, where previously it struggled. Now LLM practitioners don’t have to sacrifice Ray’s rich features and ecosystem to get production-grade performance on Kubernetes.
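
To make the two metrics concrete, the sketch below shows one common way to derive a 99th percentile time-to-first-token and an aggregate token throughput from per-request timings. The record fields and the sample numbers are invented for illustration. They are not the benchmark data, and the benchmark harness may compute these figures differently.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(requests):
    """requests: dicts with start, first_token, end (seconds) and output_tokens."""
    ttft = [r["first_token"] - r["start"] for r in requests]
    wall_clock = max(r["end"] for r in requests) - min(r["start"] for r in requests)
    total_tokens = sum(r["output_tokens"] for r in requests)
    return {
        "p99_ttft_s": percentile(ttft, 99),
        "throughput_tok_per_s": total_tokens / wall_clock,
    }


# Made-up timings for three overlapping requests.
summary = summarize([
    {"start": 0.0, "first_token": 0.25, "end": 2.0, "output_tokens": 100},
    {"start": 0.0, "first_token": 0.5, "end": 4.0, "output_tokens": 200},
    {"start": 1.0, "first_token": 1.25, "end": 3.0, "output_tokens": 100},
])
```

Throughput here is measured over the wall-clock window of the whole run, so it rewards serving concurrent users in parallel, while the tail percentile shows whether any of them waited noticeably longer for a first token.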

Why choose GKE for Ray Serve

GKE provides the foundational infrastructure that makes these software optimizations shine. When using the Ray Operator add-on for GKE, you get turnkey deployment across Google Cloud's AI accelerators, including automated horizontal scaling, monitoring, multi-cluster scaling, and built-in fault tolerance. GKE abstracts the complex parts of orchestrating distributed physical hardware, so your team can focus on refining your models and application logic with Ray.

Try Ray Serve LLM on GKE

We encourage developers to try out these enhancements in the latest Ray release (2.56 and later) and experience the future of high-performance LLM serving on GKE.


Scaling the Next Generation of Global Innovation: How Google Supports Top Startups Around the World

Thu, 18 Jun 2026 12:51:00 +0000

In the high-stakes world of tech entrepreneurship, the leap from a brilliant prototype to a scalable, market-defining business can be brutal. Founders need much more than capital; they need deep architectural guidance, sovereign-level policy alignment, and technical systems engineered to enable rapid growth.

Joy’s Law states: "[N]o matter who you are, most of the smartest people work for someone else."

We recognize that true innovation inherently happens “elsewhere.” This philosophy drives our active support of global accelerators across a diverse geographic footprint of innovation markets to tap into this decentralized brilliance. For over a decade, our Google accelerator program has acted as a catalyst for this exact transition. By bridging the gap between raw entrepreneurial ambition and Google’s world-class engineering ecosystem, the program has quietly built one of the most resilient, high-performing startup portfolios on Earth.

The Power of the Network: A Decade by the Numbers

While many startup accelerators struggle with significant failure rates, our accelerator program has set a high bar for long-term success. By pairing top-tier founders and CTOs with customized, deeply technical engagement from Google, along with learned industry best practices, the program has consistently helped build both highly valuable companies and products.

The scope of this global network is impressive:

Metric	Impact to Date
Global Footprint	2,011 startups supported across 88 countries
Program Experience	144 cohorts graduated over 10 years
Survival Rate	93% portfolio survival rate
Financial Momentum	$46.3B in funding raised; $135.1B collective portfolio valuation
Startup Job Creation	305,900 employees across the entire startup portfolio

The Developer Value-Add: By design, this isn't a high-level business bootcamp. The founders of Accelerator startups identify a deeply technical problem, then work to solve it with bespoke support from Google. These startups get access to Google engineers and product managers, as well as to our platforms and tools. From advising on architectures to optimizing AI model pipelines, Google experts work directly with the founding teams to help tackle some of their most complex technical hurdles.

Strategic Momentum: Geopolitics, Green Infrastructure, and Robotics

The startup ecosystem is shifting rapidly, and our accelerator program is evolving along with it. This year, Google launched new initiatives to support global economic development and explore and evolve critical environmental infrastructure. Just a few examples:

Sovereign-Level Policy & Strategic Wins

Australia: Accelerator alumni have successfully anchored the Google AI stack directly into the country's national R&D strategy, engaging directly with Members of Parliament in Canberra.
Canada: The Canadian Office of Innovation, Science, and Economic Development officially recognized and cited the impact of the Canada accelerator program in its formal report for the G7 Summit.

Cutting-Edge Frontier Programs

This year marks a major expansion into specialized, frontier tech verticals:

The Google DeepMind Accelerator (Europe): Dedicated strictly to hardening technical builds for AI-native robotics companies, effectively bridging the gap between lab prototyping and commercial market success.
The GDM Accelerator (AI for Planet) in APAC: A joint initiative between Google DeepMind and Google's Sustainability teams. The program focuses heavily on biodiversity foundation models to position Google at the forefront of the critical ESG (Environmental, Social, and Governance) infrastructure market.
Japan Relaunch: Marking a major strategic re-entry into one of Asia's most vital technology hubs.

The hive mind opportunity

To maximize the power of this unique network, earlier this year we transitioned our disparate regional alumni networks into a Unified Alumni Community. We now bring together more than 1,750 startups and 3,000 founders across 90+ countries through shared online channels and in-person events. At these events, founders get access to Google senior leadership and our newest models and tech, can directly influence the development of new Google products to better support their businesses’ growth, and can learn from and support each other.

Don't Miss It: Upcoming Demo Days

The culmination of each of our intense accelerator journeys is Demo Day, where top-tier cohorts showcase their technical builds and new market-defining concepts. You can watch these milestones live streamed directly via the Google for Startups events on YouTube. Mark your calendar for the remaining 2026 showcases:

Summer & Fall 2026

Africa Accelerator: June 19
Middle East, North Africa, and Turkey Accelerator: June 26
Korea Accelerator: July 15
Brazil Accelerator: July 16
Europe DeepMind Accelerator (Robotics): September 11
India: September 30

Winter 2026

India Accelerator: November 4
Southeast Asia Accelerator: November 13
North America Accelerator (Energy): November 19
South Africa Accelerator: December 11
Europe and Israel (Energy): December 11
Global Google.org Accelerator (Government Innovation): December 11

Open & Upcoming Applications

If you are a founder or CTO looking to radically scale your technical infrastructure, optimize your product-market fit, and gain equity-free support from Google's global talent pool, applications are officially moving.

Applications Open Right Now:

GFSA Southeast Asia (Leverage the newly launched AI Startup Innovation Corridor connecting SEA to Silicon Valley)
GFSA China
Google.org Accelerator: AI for Science

Agent Factory Recap: 100X engineering with AI agents in Google Antigravity 2.0

Thu, 18 Jun 2026 07:00:00 +0000

In this episode of the Agent Factory, I sat down with Rody Davis, one of Google’s top agentic engineers. We dive into the massive shift from traditional IDEs to agent-first platforms, the reality of code reviews in an AI-driven world, and how to use "skills" to perform at a 100X level.

This post guides you through the key ideas from our conversation. Use it to quickly recap topics or dive deeper into specific segments with links and timestamps.

Google Antigravity 2.0 - What is it?

Antigravity 2.0 has evolved from a simple agentic IDE into a full-scale agent-first platform. It now consists of four core pillars: a standalone desktop Agent Manager for orchestration, a robust CLI for server-side work, an SDK for custom Python-based workflows, and a specialized IDE. This unbundled approach allows developers to compose their own environment, managing multiple folders and complex project structures without being forced into a single-workspace layout.

Rody Davis on 100X Engineering

We explored the strategies elite engineers use to scale their impact and reduce the "cognitive toil" of daily development.

Scaling Impact and Reducing Toil

Timestamp: 01:55

Rody explains that AI isn't just about writing code; it's about accelerating the entire lifecycle. He uses agents to write richer test suites and prototype multiple versions of an app before committing to a framework. By offloading "toil", like building marketing sites, he can focus on high-level architecture and problem-solving.

Skills as "Context Cheat Sheets"

Timestamp: 03:05

A core philosophy in Rody’s workflow is the use of "Skills." He views skills as a way to compress context for the model. "It’s literally a cheat sheet for the agent," Rody notes. By providing the agent with specific design systems or API documentation, the model becomes significantly faster and more accurate, avoiding the latency of searching through massive, unorganized docs.

Customizations, Skills, and MCP Servers

Timestamp: 04:17

Rody walks us through the customizations tab in Antigravity 2.0, showing how to extend an agent's capabilities:

Android CLI: Building and deploying mobile apps directly from the command line.
Modern Web Guidance: Grounding the agent in the latest CSS and accessibility standards.
MCP Servers: Using the Model Context Protocol to enable features like hot reloading for Flutter and Dart.

The Bonsai Approach to Code Review

Timestamp: 05:27

Rody compares maintaining a codebase to being a Bonsai artist: constantly pruning to keep things simple. He advocates for flat architectures where state, UI, and data are strictly separated. This makes it easier for a human to "steer" the agent; if the agent starts putting files in the wrong place, the architectural violation is immediately obvious.

Do you review 100% of agent-generated code?

Timestamp: 07:11

Rody’s answer depends on the task. For a marketing site, he focuses on the visual output rather than the code. However, for backend logic, he cares deeply about API contracts and schemas. He recommends writing the first example yourself so the agent can simply "copy the pattern" for the rest of the codebase.

Building Extensions to Solve Daily Friction

Timestamp: 09:05

To solve the problem of managing files across multiple Git projects, Rody used Antigravity to build a custom macOS Finder extension in Swift. This tool allows him to filter files by time boxes (today, last week, etc.), demonstrating how agents can build specialized utilities that reduce daily friction.
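
The filtering idea itself is simple to sketch. The example below is a Python illustration of time-box filtering only, not Rody's Swift Finder extension. The box names, the seven-day cutoff, and the file paths are assumptions made for the example.

```python
from datetime import datetime, timedelta


def time_box(modified, now):
    """Classify a modification time into a coarse, hypothetical time box."""
    start_of_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if modified >= start_of_today:
        return "today"
    if modified >= start_of_today - timedelta(days=7):
        return "last week"
    return "older"


def filter_by_box(files, box, now):
    """files: mapping of path -> modification datetime."""
    return sorted(path for path, modified in files.items()
                  if time_box(modified, now) == box)


# Made-up files spread across several projects.
now = datetime(2026, 6, 18, 15, 0)
files = {
    "api/main.go": datetime(2026, 6, 18, 9, 30),
    "web/index.html": datetime(2026, 6, 14, 18, 0),
    "docs/notes.md": datetime(2026, 5, 1, 12, 0),
}
recent = filter_by_box(files, "last week", now)
```

A real tool would read modification times from the file system across each Git project. Passing them in as a mapping keeps the sketch deterministic.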

Do AI engineers still write code by hand?

Timestamp: 10:22

"Oh yeah," Rody says. He still loves the syntax of languages like Go and the challenge of controlling computers. He believes it's vital to understand the building blocks deeply so that when you face a problem two years down the road, you know exactly which "old project" to reach back for.

Powering Personal Websites with Gemma 4

Timestamp: 11:42

Rody showcases his personal website, which uses Gemma 4 and Embedding Gemma to provide dynamic content recommendations offline. By vectorizing post summaries at compile time, the site can suggest related content via a local vector database without needing a live backend server.
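
The lookup step can be sketched in a few lines, assuming each post already has an embedding vector computed at build time. The three-dimensional vectors and post slugs below are toy values standing in for real embeddings, and the in-memory dictionary stands in for the local vector database. None of this is taken from Rody's site.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def related_posts(index, slug, top_k=2):
    """Rank the other posts by similarity to `slug` using precomputed vectors."""
    query = index[slug]
    scored = [(cosine(query, vec), other)
              for other, vec in index.items() if other != slug]
    scored.sort(reverse=True)
    return [other for _, other in scored[:top_k]]


# Toy vectors; a real index would hold embeddings of the post summaries.
index = {
    "flutter-state": [0.9, 0.1, 0.0],
    "dart-isolates": [0.8, 0.2, 0.1],
    "go-cli-tools": [0.1, 0.9, 0.2],
    "sqlite-tips": [0.0, 0.3, 0.9],
}
suggestions = related_posts(index, "flutter-state")
```

Because the vectors are produced ahead of time, the only work left at browse time is this similarity ranking, which is why no live backend is needed.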

The Factory Floor

The Factory Floor is our segment for getting hands-on. Here, we moved from high-level concepts to practical code with live demos.

Multi-Agent Parallelism in Action

Timestamp: 14:02

In this demo, Rody uses a single stream-of-thought voice prompt to build a full-stack application. We watched as Antigravity:

Spun up parallel sub-agents, including a dedicated DevOps engineer and a QA engineer (see 19:48).
Built a multilingual note-taking app using Vite, Go, and SQLite.
Orchestrated the entire stack via Docker Compose.
Localized the app into five different languages simultaneously.

Unbundling the IDE Ecosystem

Timestamp: 15:35

We discussed why Google separated the IDE from the Agent Manager. Rody highlights that this unlocks different workflows: the CLI is perfect for SSH sessions on a Raspberry Pi, while the Agent Manager handles general knowledge work and orchestration across multiple folders.

Turning Documentation into Reusable Skills

Timestamp: 25:41

Rody shares his process for turning documentation into skills. He wrote a Go CLI that parses websites into markdown, allowing him to install hundreds of skills for the sites he visits frequently. This ensures the agent always has access to the specific version of the docs he is using.
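
As a minimal sketch of the HTML-to-markdown step, the example below uses Python's standard html.parser module rather than Go, and handles only headings, paragraphs, and list items. The class name, the supported tags, and the sample page are assumptions for illustration. Fetching pages, pinning doc versions, and packaging the result as a skill are not shown.

```python
from html.parser import HTMLParser


class MarkdownExtractor(HTMLParser):
    """Convert a small subset of HTML to markdown lines.

    Limitation: block tags nested inside each other (for example a
    paragraph inside a list item) are not handled.
    """

    PREFIXES = {"h1": "# ", "h2": "## ", "h3": "### ", "p": "", "li": "- "}

    def __init__(self):
        super().__init__()
        self.lines = []
        self.current = None   # tag of the block being collected, if any
        self.buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.PREFIXES:
            self.current = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.current:
            # Collapse whitespace so wrapped source lines become one line.
            text = " ".join("".join(self.buffer).split())
            if text:
                self.lines.append(self.PREFIXES[tag] + text)
            self.current = None


def html_to_markdown(html):
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


# Made-up documentation page.
page = ("<h1>Widgets API</h1><p>Use <code>GET /widgets</code> to list "
        "widgets.</p><ul><li>v2.1 only</li></ul>")
markdown = html_to_markdown(page)
```

Inline tags such as code are flattened to plain text here. A fuller converter would preserve them, along with links and code blocks.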

Rapid Fire: Future Tech Predictions

Timestamp: 27:35

We put Rody on the spot with some controversial takes:

Vibe Coding: Rody believes a non-technical founder will launch a company using only vibe coding by 2026, but the real test will be maintaining it in years 2 through 5.
Production Failures: Rody agrees that vibe coding will cause significant production failures, leading to a new hot job for software engineers: consulting to solve those failures.
Codebase Health: Rody argues that poor codebase health, not context windows, is the biggest bottleneck in AI speed.

Grounding Yourself in a Changing Landscape

Timestamp: 31:10

Rody advises engineers to focus on why they were hired: to solve problems and engineer things that didn't exist before. He suggests using AI to provide better communication handoffs between colleagues, making artifacts so easy to approve that they are "ready to sign off" the moment they are handed over.

Conclusion

The era of agentic engineering is here, but as Rody Davis demonstrated, it requires more architectural discipline, not less. By treating your codebase like a Bonsai tree and your agents like an orchestra, you can move past the "toil" and focus on building the frameworks of the future.

Your turn to build

Are you ready to build anything? We’ve officially launched the #NapkinChallenge. Take a handwritten sketch of an app idea, use Antigravity 2.0 to build it, and share your creation on social media.

Try Antigravity 2.0: antigravity.google
Join the Challenge: Napkin Challenge Details
Rody’s personal website, GitHub repo, and skills

Connect with us

Rody Davis → X, LinkedIn
Shir Meir Lador → X, LinkedIn

Choice, compliance, and collaboration: Europe’s path to open digital sovereignty

Thu, 18 Jun 2026 07:00:00 +0000

The European Commission’s Tech Sovereignty Package comes at a defining moment for the continent's digital future. European competitiveness and security are top of the agenda for European business, institutions, and citizens, and a significant investment in European digital capacity is needed to deliver those goals. In that context, it is understandable that Europe is considering how to boost the European Union digital footprint from chips, to cloud adoption, to AI data infrastructure.

The European Commission’s strategy is to be grounded in "openness, partnership, and fair competition." Indeed, the package contains bold measures consistent with these principles on interoperability to address vendor lock-in and an open source strategy for the public sector, as well as on more rapid data center deployment.

We will work cooperatively with the EU institutions providing our best knowledge about how to achieve these stated objectives in practical terms. To that end, we believe certain elements of the Cloud and AI Development Act (CADA) should be changed to avoid unintended market isolation, ensuring that trusted global partners can continue to support Europe’s security and scaling goals under a framework of true openness.

Our approach to sovereignty, developed over many years, is grounded in delivering tangible, technical, and verifiable control and open choice, while investing in the growth and security of Europe’s digital infrastructure — consistent with what we understand to be the goals of this strategy.

We have engineered a comprehensive menu of Sovereign Cloud solutions, designed to meet Europe's tiered compliance requirements at every level. From standard public cloud configurations with strict European data boundaries to independently operated regional cloud services to fully air-gapped solutions for the most sensitive public-sector operations, we ensure that compliance never requires sacrificing technological excellence.

Through our deep “Made with Europe” collaborations with regional champions — including S3NS in France; Thales, the Schwarz Group, and T-Systems in Germany; PSN in Italy; Clarence in Luxembourg; and Telefónica in Spain — we are actively delivering the operational resilience and jurisdictional controls designed to meet the highest regulatory standards of existing sovereignty frameworks at national level.

Across our partner-led sovereign solutions, the S3NS offering in France has been qualified to meet SecNumCloud 3.2, Europe’s highest sovereignty regulatory bar. Our partners Clarence and S3NS, together with Mistral, offer services that have been approved by the EU Directorate-General for Digital Services (DIGIT) for use by EU Institutions that have sovereign cloud needs. We believe this is what constitutes a true trusted partnership and encourage the Commission to follow this existing path, which is already meeting sovereign expectations across Europe today.

1. Refining sovereign certification

A primary concern within the CADA proposal is the design of the Union Assurance Levels (UALs). While harmonizing sovereignty criteria across member states is a constructive step, criteria at each of the four UALs would limit or exclude global providers, regardless of the security mitigations they offer.

Regulations should create space for innovative and effective technology approaches to sovereign control, instead of rigid geographic criteria that sacrifice the potential to have control without undue disruption to global supply chains.

We understand and support the data sovereignty and extra-territorial risk-mitigation priorities of European policymakers. Through capabilities like Cloud External Key Manager (EKM), one of the tools within our suite of sovereign solutions, Google Cloud allows customers to maintain their encryption keys outside of Google's infrastructure. This control creates a technical barrier to unauthorized access to unencrypted data by third parties without the explicit consent and awareness of the customer.

The EU has already designed an alternative, more balanced model in the proposed Industrial Accelerator Act. This framework has the potential to successfully maintain collaboration with trusted non-EU partners under a default presumption that trusted partners can operate as EU origin, underpinned by robust global trade rules and strong back-stop powers. We urge co-legislators to apply a similar philosophy to CADA.

2. Promoting interoperability, combating vendor lock-in, and reforming procurement

Sovereignty must empower end-users with more choice, not less. A healthy European digital ecosystem requires open foundations that prevent the vendor lock-in that restricts choice and drives up costs.

We strongly support CADA's goal to foster an open, interoperable cloud ecosystem. To make this meaningful, we believe that the policy must align with a commitment to openness across every level of the digital stack — infrastructure, models, and applications.

Our own approach is built on this foundation: We offer open, portable infrastructure with no data transfer exit fees, we champion open AI models like Gemma, and we support open-standards applications. Our stack-wide open approach is designed to help European enterprises build, migrate, and scale without friction.

Yet organizations can’t maximize the benefits of an open approach because restrictive licensing practices lock customers into a single ecosystem. To restore true choice, we advocate for three straightforward reforms: allowing users to move their software licenses freely, ensuring fair pricing for legacy software, and guaranteeing that software runs equally well on any cloud platform.

3. Building sustainable, open infrastructure for Europe's AI future

Physical compute infrastructure is the bedrock of digital sovereignty. While we support the ambitions of the Chips Act 2.0 to invest €30 billion in European semiconductor research and development, we believe that establishing regulatory rules that attract large-scale investments in compute infrastructure is just as important as this investment.

To help achieve that goal, we recommend the measures outlined below. As a long-standing investor in European data infrastructure, operating 13 European cloud regions and deepening that commitment with recent investments in Germany, Belgium and Sweden, we hope to see a policy that leverages the pace and scale of committed global investors like us.

We welcome the introduction of "special project" status to streamline permitting, grid access, and power purchase agreements (PPAs) in designated zones. To ensure these measures succeed, we support:

Prioritizing fast-track permitting benefits for highly sustainable infrastructure projects.
Aligning national sustainability criteria with the upcoming EU-wide rating scheme, ensuring it does not penalize energy-efficient technologies like water cooling.
Ensuring that these acceleration zones do not artificially constrain the geographic location of new sites, and extending supportive grid connection measures to viable data centers operating outside of designated zones.

The path forward: Made with Europe

As ministers prepare to gather for the upcoming Council Summit, Europe has a historic opportunity to build a resilient, competitive, and truly open digital future.

By championing open-source software — from our contributions to Kubernetes, Chromium, Android, TensorFlow, and open AI models like Gemma — and by co-engineering solutions with Europe's industrial leaders, we are proving that global innovation and European values can be furthered together.

We look forward to collaborating with Member States, European policymakers and our regional partners to ensure that the final Tech Sovereignty Package fosters local economic growth, safeguards national security, and keeps Europe at the cutting edge of global AI innovation.

Cloud Network Insights: end-to-end observability for the Cross-Cloud Network

Wed, 17 Jun 2026 19:30:00 +0000

In today’s digital landscape, the network is no longer confined to a single data center or even a single cloud provider. Enterprises are increasingly adopting cross-cloud strategies, connecting Google Cloud workloads to on-premises environments, other clouds like AWS and Azure, and a vast array of internet-facing applications. While this flexibility drives innovation, it can also introduce significant operational complexity. When a user experiences degradation in application performance, the critical question remains: Is it the network, the application, or something else?

We are excited to announce the general availability of Cloud Network Insights, an out-of-the-box, Google Cloud-native solution that provides comprehensive visibility into network and digital experience performance across complex multi-cloud and hybrid environments.

Closing the visibility gap with active monitoring

Cloud Network Insights, offered in partnership with Broadcom AppNeta, expands your observability beyond Google Cloud to your entire global deployment. By utilizing active synthetic probing, the solution monitors network routes even when no user traffic is present, allowing teams to be proactive rather than reactive.

Whether the source of degradation is in the cloud, on-premises data centers, internet applications, ISPs, or last-mile connectivity, Cloud Network Insights helps you pinpoint the exact location of the bottleneck.

Cloud Network Insights integrates directly into the Google Cloud Observability suite, bringing sophisticated network intelligence into the tools you already use. With Cloud Network Insights, you get:

End-to-end network path visibility: Gain a hop-by-hop visualization of the network path between your sources and destinations. Monitor critical metrics like round-trip time (RTT), packet loss, and jitter across networks you don’t directly manage.
Digital experience insights: Go beyond the network layer to monitor digital experience for web applications. Measure DNS resolution times, HTTP response codes, and full browser page-load times to identify whether an application's degradation is due to the network or the application itself.
Proactive detection and alerting: Use synthetic testing to identify performance dips before they impact your customers. Alarms are integrated with Cloud Monitoring and Cloud Logging, enabling alerting via email, Slack, or PagerDuty.
SLA validation: Arm your team with the data needed to verify if ISPs and service providers are meeting their performance commitments.
Rapid root-cause analysis: Quickly differentiate between network problems, application-level issues, or browser performance impacts.
Integrated monitoring: Access metrics and logs directly within Google Cloud, leveraging Cloud Monitoring and Cloud Logging for dashboards and alerting. Utilize the open partner ecosystem of Google Cloud as well as support for the OpenTelemetry protocol for metrics and logs, allowing direct ingestion by OTel SDKs and collectors.
Agentic workload monitoring: Use synthetic testing to monitor connectivity and network performance to help ensure optimal connectivity to your agents and tools.

Network performance and multi-path routes to/from Google Cloud, AWS, and Azure in one view

How it works: active synthetic probing

Cloud Network Insights uses active synthetic probing technology that consists of three main components:

Monitoring Points: You deploy lightweight software agents, called Monitoring Points, into critical network segments, such as a central VPC, a remote branch, or an on-premises data center. These can be deployed as containers or virtual machines.
Synthetic probes: These Monitoring Points send small, frequent bursts of synthetic traffic (simulating a user or application) to a target destination. This allows you to monitor performance 24/7, even when no real users are on the network.
Data synchronization: The Monitoring Points send real-time performance telemetry to a central backend service. This data is then synchronized back to Google Cloud, with metrics exported to Cloud Monitoring, and alarms and events sent to Cloud Logging.
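The probing loop described above can be sketched in a few lines of Python. This is only an illustration and not how Monitoring Points are implemented: it assumes a TCP handshake as the probe, and the helper names (`tcp_probe`, `summarize_probes`) and the jitter definition (mean absolute change between consecutive round-trip times) are assumptions made for this example.

```python
import socket
import statistics
import time
from typing import Optional


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> Optional[float]:
    """Time one TCP handshake in milliseconds; return None if the probe is lost."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError:
        return None
    return (time.perf_counter() - start) * 1000.0


def summarize_probes(rtts_ms: list[Optional[float]]) -> dict:
    """Reduce one burst of probe results to RTT, packet loss, and jitter."""
    answered = [r for r in rtts_ms if r is not None]
    lost = len(rtts_ms) - len(answered)
    # Jitter (assumed definition): mean absolute change between consecutive RTTs.
    deltas = [abs(b - a) for a, b in zip(answered, answered[1:])]
    return {
        "avg_rtt_ms": statistics.fmean(answered) if answered else None,
        "loss_pct": 100.0 * lost / len(rtts_ms) if rtts_ms else 0.0,
        "jitter_ms": statistics.fmean(deltas) if deltas else 0.0,
    }


# Example burst: three answered probes and one lost probe (None).
print(summarize_probes([10.0, 12.0, None, 11.0]))
```

In a scheduled setup, `tcp_probe` would run in small bursts against each target, and each burst summary would be exported as a set of metrics.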

Core capabilities

Cloud Network Insights supports two primary types of monitoring to give you a full picture of your infrastructure:

1. Network performance monitoring (Layers 3 and 4)

This provides a hop-by-hop visualization of the network between a source and a destination, including:

Metrics captured: Round-trip time (RTT), packet loss, jitter, and path changes.
Single-ended mode: The agent probes an external target (like a URL, IP address or an API endpoint) that doesn't have a Monitoring Point installed.
Dual-ended mode: The Monitoring Point probes another Monitoring Point. This provides richer data, including precise one-way latency and the ability to detect asymmetric routing (when data takes a different path going out than it does coming back).
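To see why a second Monitoring Point makes one-way latency measurable, consider a four-timestamp exchange between two endpoints. The sketch below is a simplified illustration, not the product's method: it assumes both endpoints have synchronized clocks, and the names `ProbeTimestamps`, `one_way_latencies`, and `looks_asymmetric` are hypothetical. A large gap between forward and reverse latency is only a hint that the two directions take different paths.

```python
from dataclasses import dataclass


@dataclass
class ProbeTimestamps:
    """Timestamps in milliseconds for one dual-ended probe exchange."""
    sent_by_a: float      # endpoint A sends the probe
    received_by_b: float  # endpoint B receives it
    sent_by_b: float      # endpoint B sends the reply
    received_by_a: float  # endpoint A receives the reply


def one_way_latencies(ts: ProbeTimestamps) -> dict:
    """Split one exchange into forward, reverse, and round-trip latency."""
    forward = ts.received_by_b - ts.sent_by_a
    reverse = ts.received_by_a - ts.sent_by_b
    # The round trip excludes the time B spent preparing its reply.
    rtt = (ts.received_by_a - ts.sent_by_a) - (ts.sent_by_b - ts.received_by_b)
    return {"forward_ms": forward, "reverse_ms": reverse, "rtt_ms": rtt}


def looks_asymmetric(ts: ProbeTimestamps, ratio: float = 2.0) -> bool:
    """Flag exchanges where one direction is at least `ratio` times slower."""
    lat = one_way_latencies(ts)
    slow = max(lat["forward_ms"], lat["reverse_ms"])
    fast = min(lat["forward_ms"], lat["reverse_ms"])
    return fast > 0 and slow / fast >= ratio


sample = ProbeTimestamps(sent_by_a=0.0, received_by_b=30.0,
                         sent_by_b=32.0, received_by_a=42.0)
print(one_way_latencies(sample), looks_asymmetric(sample))
```

A single-ended probe only observes the first and last timestamps, which is why it can report round-trip time but not the per-direction split.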

Network path metrics in Google Cloud console

2. Digital experience monitoring (Layer 7)

With digital experience monitoring, you can track the end-to-end experience of a web application. Here, you can choose from:

Browser mode: Uses a real browser engine (Selenium) to load full web pages, execute JavaScript, and render content. It measures complete page-load times to validate the actual user experience.
HTTP mode: Sends synthetic HTTP/S requests to a URL or API endpoint. This is a lightweight check for server availability, response time, and DNS/TLS performance.
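As a rough illustration of what an HTTP-mode check measures, the following sketch times a single request with the Python standard library. It is a simplified stand-in rather than the product's probe: the function name `http_check` and the rule that 2xx and 3xx statuses count as healthy are assumptions, and it reports total response time only, without separating DNS or TLS timing.

```python
import time
import urllib.error
import urllib.request


def http_check(url: str, timeout: float = 5.0) -> dict:
    """Send one synthetic GET and report status code and elapsed time."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()  # include body download in the measurement
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code  # server answered with a 4xx or 5xx status
    except OSError:
        status = None  # DNS failure, refused connection, or timeout
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return {
        "url": url,
        "status": status,
        "elapsed_ms": elapsed_ms,
        "ok": status is not None and 200 <= status < 400,
    }
```

Calling `http_check("https://example.com/")` on a schedule and recording `elapsed_ms` and `status` gives the kind of lightweight availability signal that HTTP mode describes. Browser mode goes further by rendering the page.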

Intelligence and automation

Cloud Network Insights also offers a variety of monitoring and troubleshooting capabilities.

Proactive alarms: Cloud Network Insights leverages auto-baselining to establish dynamic performance thresholds based on your historical metric data. If a metric deviates from your defined parameters, the system instantly triggers an event in Google Cloud, routing alerts directly to your team via email, Slack, or PagerDuty.
Monitoring policies: You can automate monitoring setups across large-scale environments by defining policies that dynamically create or remove paths based on custom tags. For instance, you can automatically track a core web application's performance from specific geographic regions.
Root-cause analysis: Because Cloud Network Insights extends visibility into traditionally "unwatched" areas like ISPs and transit networks, it instantly pinpoints whether a slowdown is occurring within Google Cloud, at the ISP level, or inside another cloud environment like AWS or Azure.
AI-driven insights: With integration to Gemini Cloud Assist, you can use natural language to interrogate Cloud Network Insights telemetry alongside your broader infrastructure data. Rather than manually pivoting between dashboards, ask Gemini to cross-reference specific Cloud Network Insights metrics against other Google Cloud metrics, reducing mean time to resolution (MTTR).
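The idea behind auto-baselining can be illustrated with a small sketch. The rule used here (mean of the historical samples plus `k` standard deviations) is an assumption chosen for clarity, not a description of the algorithm Cloud Network Insights uses. The function names are hypothetical.

```python
import statistics


def baseline_threshold(history_ms: list[float], k: float = 3.0) -> float:
    """Derive a dynamic alarm threshold from historical metric samples."""
    mean = statistics.fmean(history_ms)
    spread = statistics.pstdev(history_ms)
    return mean + k * spread


def is_anomalous(value_ms: float, history_ms: list[float], k: float = 3.0) -> bool:
    """Return True when a new sample exceeds the baseline-derived threshold."""
    return value_ms > baseline_threshold(history_ms, k)


# Hypothetical RTT history (milliseconds) for one monitored path.
history = [20.0, 22.0, 21.0, 19.0, 20.0, 22.0, 21.0, 19.0]
print(is_anomalous(35.0, history), is_anomalous(22.0, history))
```

Because the threshold is recomputed from recent history, a path that is normally slow does not alarm constantly, while a normally fast path alarms on a smaller absolute change.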

What customers are saying

We are already seeing strong interest from customers looking to simplify their cross-cloud operations. Organizations like Sabre and Pexip are already using Cloud Network Insights to gain clarity in their hybrid environments.

"In an environment as complex and high-scale as Sabre’s, total visibility isn't just a luxury — it's a requirement for operational resilience. Cloud Network Insights will enable us to further shift our posture towards proactive optimization. By providing granular, real-time telemetry across our global cloud footprint, it helps eliminate the traditional 'black box' of the network, allowing our teams to resolve bottlenecks before they impact the traveler experience." - Alfredo Rodriguez, VP of Cloud and Infrastructure, Sabre

“Cloud Network Insights closes the 'visibility gap' between the private corporate network and the public cloud, empowering our joint customers to pinpoint performance bottlenecks in seconds rather than hours.” - Alan Davidson, CIO, Broadcom

Get started today

Navigating complex digital ecosystems shouldn't mean sacrificing visibility. Cloud Network Insights bridges the gap across multi-cloud and hybrid environments by combining deep network performance metrics with digital experience monitoring. Coupled with direct integrations into Google Cloud Observability and Gemini Cloud Assist, your teams are empowered with intelligent alerting, robust SLA validation, and rapid root-cause analysis. We look forward to helping you gain a clearer, unified view of your Cross-Cloud Network.

You can get started in the Google Cloud console today. To learn more:

Explore our product documentation for deep dives into deploying Monitoring Points and configuring policies.
Check out the latest release notes to stay updated on new features.
Watch the overview video
Hear more about the partnership between Google Cloud and Broadcom.

How growing UK midsize businesses are building in the AI era

Wed, 17 Jun 2026 08:00:00 +0000

The UK’s 5-million-plus small and midsize businesses and enterprises (SMBs) are the backbone of our economy. Today, we’re seeing these critical businesses begin to put AI to work, to operate more efficiently, move faster, and ultimately deliver better outcomes for their customers.

This shift is driven by tangible day-to-day results. According to recent research from Enterprise Nation published in partnership with Google, 71% of AI adopters surveyed in the UK say the technology helps them save time on routine tasks, and 64% report a direct boost in productivity. On top of this, AI-enabled productivity tools (like Google Workspace with Gemini) are delivering a 20% boost in productivity for SMBs, which effectively hands them back one full working day every single week.

At Google Cloud, we have a front row seat to this shift: SMBs have long utilized platforms like Google Workspace, and today they’re transforming with Google’s AI platform and models. In fact, we’ve seen the number of UK-based SMBs using Google Cloud AI nearly double year-over-year. This includes our Gemini models and products like Gemini Enterprise and AI Studio, which are helping SMBs do things like:

Roll out better customer support systems to help escalate and resolve customer support calls more quickly.
Automate repetitive actions in areas like payroll and accounting.
Help more employees understand and leverage data at work — even those not trained as data analysts.
Rapidly create and implement new designs for marketing collateral.
Help more people build their own AI agents to help them in their everyday jobs.
Conduct complex research projects at a speed and price point previously unavailable.

At today’s Google Cloud London Summit, we’re showcasing a number of innovative SMB customers who are actively using our AI tools to transform how they work, including companies who have recently expanded their work with us:

Neural Alpha, a sustainability fintech company, is using Gemini models to read unstructured environmental and corporate sustainability reports to automatically find and organize thousands of key facts, cutting months of slow, manual research down to a fraction of the time.
Sep 2, a digital security provider, uses Gemini Enterprise to deploy autonomous AI agents for 24/7 threat monitoring — accelerating incident detection and quickly neutralizing security threats for its customers.
Sunhouse, a strategic brand design agency, uses Gemini Enterprise to easily find archived design work stored on Google Drive, enabling its teams to spend less time hunting for files and more time growing its business with global brands.
Terrapinn, a global B2B events company, is transforming its operations by leveraging Gemini models, NotebookLM, Looker, and BigQuery to turn manual tasks into automated workflows, accelerating how its teams design, market, and deliver world-class conferences.
VoCoVo, a telecommunications provider, is integrating Google Cloud AI across its systems to turn isolated data into actionable intelligence and build autonomous workflows, streamlining routine operations so their team can focus on high-impact innovation.

Empowering Your Team: AI Upskilling Resources for Growing British Businesses

To help midsize teams maximize their impact and confidently navigate the modern AI landscape, we’ve developed a suite of dedicated, no-cost upskilling resources. Whether you want to train your existing teams or democratize data tools across your entire workforce, these programs will help you build an AI-ready organization:

SMB-Focused Programs: Explore our new SMB Learning Path or enroll in the Gemini Enterprise Agent Ready (GEAR) program for specialized training in agentic AI.
Google Skills for Organizations: Access our no-cost, on-demand learning platform featuring over 3,000 AI courses and hands-on labs created by experts at Google Cloud and Google DeepMind.
Get Certified: Ready to validate your team's expertise? This premium, cohort-based program offers instructor-led training, technical mentorship, and AI-infused skill badges designed to prepare your team for industry-recognized certifications.

By offering a full suite of SMB technology and training — from productivity in Workspace, to all our Ads services, and now powerful AI tools — Google is helping small and midsize firms thrive, no matter where the future takes us.

From AI potential to agentic reality: Driving the UK’s next chapter

Wed, 17 Jun 2026 08:00:00 +0000

The United Kingdom, and London in particular, continues to be one of the great hubs for AI development in Europe and the world. We’re home to Google DeepMind, of course, as well as significant AI unicorns — and Google Cloud customers — like Ineffable Intelligence, which is today announcing an important partnership with us.

A year ago, we joined you for the London Summit to showcase the vast potential of generative AI, including a major investment in upskilling the UK civil service. Today, as we welcome our partners once again to the historic vaults of Tobacco Dock, that potential has become an industrial-scale reality. In my conversations with leaders across both Whitehall and The City, the focus has moved from chatbots and media experiments to full-production execution. This is the moment of the agentic enterprise, where we shift from systems that simply chat with us to systems that can reason, plan, and execute multi-step workflows.

This transition is the cornerstone of the UK’s projected £400 billion economic boost from AI by 2030. At Google Cloud, we are the only provider offering the full integrated stack — custom silicon, frontier models, and planet-scale infrastructure — required to turn the Agentic Enterprise into a reality.

The new frontier of British enterprise and research

The banking sector is a key proving ground for this shift. And HSBC, one of the largest and most important financial institutions in the world, is showing the way. Today, we’re announcing a multi-year transformational partnership with HSBC to accelerate AI adoption across HSBC’s products and services globally. This new collaboration will further accelerate the shift towards AI-enabled ways of working across HSBC’s global operations.

HSBC will work with Google Cloud and Google DeepMind engineering teams to collaborate on new AI-powered tools and programmes, with access to Google’s latest agentic AI capabilities – including Gemini models and the Gemini Enterprise Agent Platform. The initial delivery will focus on three areas: hyper‑personalised wealth management support, stronger financial crime risk management, and AI tools to enhance frontline/relationship manager client service.

UK startups also continue to break new ground with technology, and AI in particular, as demonstrated by the work of frontier labs like Ineffable Intelligence. The company, which launched earlier this year, has chosen Google Cloud as its preferred cloud partner, utilizing Google’s full stack of AI-optimized hardware and tools to build and train Ineffable’s first generation of foundational models.

Led by David Silver, a former Google DeepMind researcher who was instrumental in the AlphaGo project, Ineffable Intelligence is taking a unique approach to AI development. The team are building systems that learn primarily from their own experience through reinforcement learning, instead of relying on the large-scale human-generated datasets behind language models. The ambition is to create a “superlearner” that develops knowledge through trial and error. This year, Ineffable Intelligence set a record for a European seed funding round of $1.1 billion, and now Ineffable Intelligence will support its training work by deploying one of the largest clusters of A5X, powered by the NVIDIA Vera Rubin NVL72 platform on Google Cloud, delivering massive computational scale.

To move from experimentation to true industrial production, businesses need more than just models; they need a roadmap. To help show them the way, we’re expanding our partnership with Deloitte, which will open a new AI Studio at its London campus. Developed in collaboration with Google Cloud, the studio will help British organisations move beyond AI experimentation to deploy autonomous, action-oriented AI systems at scale.

Deloitte is also committing to upskill 1,000 members of its UK AI and data workforce on Gemini Enterprise. This certification program will ensure that Deloitte’s AI and data engineers are equipped with the technical expertise to implement Google’s most advanced agentic architecture, providing UK clients with one of the largest pools of certified AI talent in the region.

Building a future-ready public sector

The blueprint for a modern digital government requires moving away from rigid legacy contracts toward agile, AI-driven public services. In collaboration with the Ministry of Housing, Communities and Local Government (MHCLG), the i.AI incubator, Google DeepMind, and Faculty, we are delivering tangible public sector reform and tools for reinvention that directly support the national goal to "get Britain building."

Agencies like MHCLG are already using a tool called Extract which was built using Google technology to help transform planning processes by reducing document processing times from two hours to just two minutes. Simultaneously, we are supporting trials of an AI planning tool — co-created with local planning authorities in Barnet, Dorset, and Camden — which aims to cut decision times for everyday applications by 50%. Furthermore, the Department for Transport (DfT) is utilizing Gemini to streamline public consultation analysis, a move projected to save £4 million annually.

Innovation on this scale also requires a secure, sovereign foundation. That is why Google Cloud is working to strengthen our UK data residency commitments, including measures like making Gemini 3.5 Flash, which features in-country AI processing, available by late June 2026 for sensitive sovereign use cases. We are giving British organizations the confidence to innovate within strict compliance boundaries.

To help keep businesses safe from the challenges posed by bad actors using AI and other digital threats, we also recently announced a comprehensive AI-powered cybersecurity platform — Google AI Threat Defense — which combines Wiz, Mandiant, Gemini & CodeMender to find, fix, and protect our customers from vulnerabilities.

Proven impact from the high street to public service

Autonomous agents are no longer a future prospect; they are delivering value across the UK economy today. Our work with THG Ingenuity, an ecommerce solutions provider, has delivered an 8x higher conversion rate via its AI Shopping Assistant. Starling is similarly empowering customers with "spending intelligence" tools for instant habit analysis around purchases and expenses. And Rightmove has launched a beta version of an AI-powered conversational property search, built with Google’s Gemini models, enabling users to search for homes in their own words.

The breadth of this impact is visible across every sector: Kingfisher is pioneering retail-specific agentic applications; Openreach is driving field service optimization in telecommunications; and Unilever is using AI at scale across the entire value chain to drive growth and build desirable brands in the new era of consumer goods.

Meanwhile, VMO2 is streamlining complex data operations; Vodafone is executing a $1 billion partnership to redefine network performance; and WPP is integrating Gemini across creative workflows, whether that's generating high-fidelity campaign assets at speed and scale, powering AI agents, or training robotic camera operators.

Empowering the engine of growth for small to medium businesses and startups

The true measure of Britain’s AI success lies in its small and medium enterprises and startup ecosystem. Our AI Works research highlights a pivotal moment: AI has the potential to boost productivity for small and medium enterprises by 20% and unlock £198 billion in output for the UK economy. With 56% of smaller firms already seeking guidance, we have launched the AI Works for Britain upskilling initiative to ensure no business is left behind.

We also continue to foster the next generation of British unicorn startups through our ongoing partnership with Tech Nation at the London AI Hub. This sustained commitment ensures founders have the resources and community needed to scale, and this September, we will further this mission by hosting the Gemini Startup Forum: Cybersecurity in London to help startups build secure-by-design AI applications.

The Model Garden at Platform 37

Our belief in the UK’s potential is reflected in our physical footprint, too. We are continuing to invest in the UK's digital infrastructure to support growing demand: Our state-of-the-art data center in Waltham Cross launched in September 2025, a key part of our two-year, £5 billion investment to help power the UK's AI economy. And earlier this year, we opened our new London office in King's Cross, Platform 37, along with plans for The AI Exchange, a new public space dedicated to deepening understanding of AI.

Building on this momentum, we are excited to introduce The Model Garden at Platform 37, launching in the fourth quarter of 2026. This London-based hub is far more than a physical space; it serves as a strategic investment designed to fundamentally elevate how we engage with our most important customers. Blending the timeless aesthetics of a classic English garden with immersive, high-tech innovation — from living digital walls to a three-story atrium — The Model Garden acts as a physical marketplace for our best ideas.

The blueprint for the agentic enterprise

For UK businesses, civic leaders, and organizations to continue to lead in the AI moment, they must not only rethink the technology they use but also fundamental aspects of how we work. As we support thousands of organizations and millions of teams here and around the globe, we see three core strategies helping achieve success with AI:

Culture: We must reimagine our organizations for the future. True transformation means getting teams excited, enabled, and equipped to work with AI agents in completely new ways. It is about human-AI collaboration, not just automation.
Responsibility: We must build with safety and security in mind from day one. Protecting your users, your customers, and your brand is paramount. Our frontier models are built on a foundation of rigorous AI principles and secure-by-design infrastructure.
Sustainability: In an era of rising compute demands, we must scale in a way that is both financially viable and positive for our planet. At Google, we are committed to carbon-free energy 24/7, ensuring that the UK’s AI growth does not come at the cost of our climate goals.

Architecting the future together

Google Cloud is the primary partner for the UK’s agentic transition. We are moving beyond the hype of experimentation into the rigor of production. From the research labs of King's Cross to the diverse enterprises powering the high street, we are architecting a resilient, sovereign, and prosperous future for the United Kingdom.

Thank you to everyone who’s joining us in London — yesterday, today, and into the future. This year we’ve packaged up an exclusive on-demand experience, allowing you to stream the defining London Summit moments, available anywhere, anytime.

Build and Deploy a Remote MCP Server to GKE in 30 Minutes

Wed, 17 Jun 2026 00:00:00 +0000

Integrating context from tools and data sources into LLMs can be challenging, which impacts the ease of development for AI agents. To address this challenge, Anthropic introduced the Model Context Protocol (MCP), which standardizes how applications provide context to these models. Developers often want to build an MCP server for their APIs to make them available to fellow developers, allowing them to use it as context in their own applications. Google Kubernetes Engine (GKE) provides a scalable, reliable, and secure environment to deploy these remote MCP servers.

This guide shows the straightforward process of setting up a secure remote MCP server on GKE.

MCP transports

The Model Context Protocol follows a client-server architecture. It initially only supported running the server locally using the stdio transport. The protocol has since evolved and now supports remote access transports, specifically Streamable HTTP.

With Streamable HTTP, the server operates as an independent process that can handle multiple client connections. This transport uses HTTP POST and GET requests. The server must provide a single HTTP endpoint path that supports both POST and GET methods, such as https://example.com/mcp. You can learn more about the different transports in the official documentation.
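To make the transport concrete, the sketch below builds (but does not send) the first message a client would POST to such an endpoint: a JSON-RPC `initialize` request. The endpoint URL is the placeholder from the text above, the client name and version are made up for illustration, and the protocol version string is an assumption that should be checked against the MCP specification revision you target.

```python
import json
import urllib.request


def build_initialize_request(endpoint: str) -> urllib.request.Request:
    """Build the JSON-RPC initialize request for a Streamable HTTP endpoint."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumed spec revision
            "capabilities": {},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The client accepts either a single JSON response or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


req = build_initialize_request("https://example.com/mcp")
print(req.get_method(), req.full_url)
```

In practice, a client library such as FastMCP handles this handshake for you. The sketch only shows that the transport is ordinary HTTP carrying JSON-RPC messages.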

Benefits of running an MCP server on GKE

Running an MCP server remotely on GKE provides several architecture benefits:

Scalability: GKE Autopilot is built to handle highly variable traffic. Since MCP servers are stateless, GKE can scale horizontally to handle spikes in demand efficiently.
Centralized access: Teams can share access to a centralized MCP server, allowing developers to connect from local machines, agents, or pipelines instead of running redundant local servers. Updates to the central server immediately benefit everyone.
Enhanced security: The Kubernetes Gateway API combined with SSL certificates provides an easy way to enforce encrypted traffic, so that only secure connections reach the MCP server.

Prerequisites

Before starting, ensure the following tools are installed:

Python 3.10 or higher
uv (for package and project management, see the installation documentation)
Google Cloud SDK (gcloud)
kubectl command-line tool

Installation

Prepare environment variables

```bash
export PROJECT_ID=$(gcloud config get-value project)
export REGION=us-central1
```

Create a folder, mcp-on-gke, to store the code for the server and deployment.

```bash
mkdir mcp-on-gke && cd mcp-on-gke
```

Now configure the Google Cloud credentials and set the active project.

```bash
gcloud auth login
gcloud config set project $PROJECT_ID
```

Initiate the GKE Autopilot cluster creation in the background. This process takes a few minutes, so starting it now allows the cluster to provision while you complete the rest of the setup. Make sure to use an Autopilot version that ensures Cost-Optimized Compute (CCOP) is enabled for fast autoscale.

```bash
gcloud container clusters create-auto mcp-cluster \
  --region $REGION \
  --release-channel rapid \
  --async
```

Use uv to create a project, which will generate a pyproject.toml file.

```bash
uv init
```

Next, create the additional files needed: server.py for the MCP server code, test_mcp_server.py for testing, and a Dockerfile for the container deployment.

Math MCP server

Large language models are excellent at non-deterministic tasks, such as generating text, summarizing ideas, and reasoning about concepts. However, they can be unreliable for deterministic tasks like math operations. To solve this, developers can create tools that provide valuable context. Using FastMCP, a framework for building MCP servers in Python, it is possible to create a simple math server with two tools: add and subtract.

First, add FastMCP as a dependency.

```bash
# asyncio is part of the Python standard library, so only FastMCP needs to be added
uv add fastmcp
```

Copy the following code into server.py to create the server.

```python
from fastmcp import FastMCP
from starlette.requests import Request
from starlette.responses import PlainTextResponse
import asyncio
import logging

logger = logging.getLogger(__name__)
logging.basicConfig(format="[%(levelname)s]: %(message)s", level=logging.INFO)

mcp_port = 3000

# Initialize the FastMCP server
server = FastMCP(
    "Math Server",
)


@server.tool()
def add(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b


@server.tool()
def subtract(a: int, b: int) -> int:
    """Subtract the second number from the first."""
    return a - b


@server.custom_route("/healthz", methods=["GET"])
async def health_check(request: Request) -> PlainTextResponse:
    """Simple health check endpoint that returns a 200 OK response"""
    return PlainTextResponse("OK")


if __name__ == "__main__":
    logger.info(f"MCP server started on port {mcp_port}")
    # Could also use the 'sse' transport. host="0.0.0.0" makes the server listen
    # on all interfaces so it is reachable from outside the container.
    asyncio.run(
        server.run_async(
            transport="streamable-http",
            host="0.0.0.0",
            port=mcp_port,
        )
    )
```

This example uses the streamable-http transport, which is recommended for remote servers. It also exposes a /healthz route, which the Kubernetes liveness and readiness probes call later in this walkthrough.

Testing the MCP server locally

Create a test_mcp_server.py script that connects to the MCP server and calls its tools. It is useful for validating the server before deploying it to GKE.

```python
from fastmcp import Client
import asyncio

# Connect to the MCP server (plain HTTP when it runs locally)
client = Client("http://localhost:3000/mcp")

async def test_remote_server():
    async with client:
        # Basic server interaction
        await client.ping()

        # List available operations
        tools = await client.list_tools()
        print(f"Available tools: {tools} \n")

        # Execute add operation
        result = await client.call_tool("add", {"a": 5, "b": 3})
        print(f"Result of addition: {result} \n")

        # Execute subtract operation
        result = await client.call_tool("subtract", {"a": 5, "b": 3})
        print(f"Result of subtraction: {result} \n")

if __name__ == "__main__":
    asyncio.run(test_remote_server())
```

Run the MCP server locally to test the connection:

```bash
uv run server.py
```

Then execute the test script in a new terminal to verify the connection.

```bash
uv run test_mcp_server.py
```

The output should list the available tools and the results of invoking the add and subtract tools, confirming that the MCP server is functional.

Building the container image

To speed up the deployment process, build the container image while the cluster is still creating.

First, prepare the Dockerfile:

```dockerfile
FROM python:3.10-slim
COPY --from=ghcr.io/astral-sh/uv:0.4.15 /uv /bin/uv
WORKDIR /app
COPY pyproject.toml .
COPY server.py .
RUN uv sync
CMD ["uv", "run", "server.py"]
```

Now, set up the Artifact Registry and build the container image.

Set up Artifact Registry

```bash
gcloud artifacts repositories create mcp-repo \
  --repository-format=docker \
  --location=$REGION
```

Build and push the image in parallel

```bash
gcloud builds submit --tag $REGION-docker.pkg.dev/$PROJECT_ID/mcp-repo/math-mcp-server:latest
```

Once the image build is complete, verify that the cluster is ready and retrieve its credentials. If the cluster's STATUS column does not show RUNNING, wait until it does before continuing.

```bash
gcloud container clusters list
gcloud container clusters get-credentials mcp-cluster --region $REGION
```

Deploying to GKE with Gateway API and SSL

The next step is to deploy the server workloads and expose them using the Kubernetes Gateway API rather than the legacy Ingress. TLS is terminated at the load balancer with an SSL certificate, so client traffic is encrypted in transit.

Create a deployment.yaml file to define the Kubernetes Deployment and Service. Replace $REGION and $PROJECT_ID in the image path with your actual region and project ID, because kubectl does not expand shell variables in manifests.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mcp-server
  template:
    metadata:
      labels:
        app: mcp-server
    spec:
      containers:
      - name: mcp-server
        image: $REGION-docker.pkg.dev/$PROJECT_ID/mcp-repo/math-mcp-server:latest
        ports:
        - containerPort: 3000
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /healthz
            port: 3000
          initialDelaySeconds: 15
          periodSeconds: 20
        readinessProbe:
          httpGet:
            path: /healthz
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-service
spec:
  selector:
    app: mcp-server
  ports:
  - port: 80
    targetPort: 3000
```

Apply this configuration to the cluster:

```bash
kubectl apply -f deployment.yaml
```

Check that the pods are up and running:

```bash
kubectl get pods
```

To confirm that the MCP server is reachable inside the cluster, forward a local port to the Service:

```bash
kubectl port-forward svc/mcp-service 8080:80
```
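With the port-forward active, the /healthz endpoint used by the liveness and readiness probes can also be checked directly. The following is a minimal sketch using only the Python standard library. It assumes the port-forward is listening on localhost:8080, and the helper name `is_healthy` is illustrative. It follows the Kubernetes convention that an HTTP status from 200 to 399 counts as a successful probe.

```python
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status (4xx/5xx).
        return False


if __name__ == "__main__":
    # Assumes `kubectl port-forward svc/mcp-service 8080:80` is running.
    print("healthy" if is_healthy("http://localhost:8080/healthz") else "unhealthy")
```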

Run the test script to verify the connection. Make sure to first change the MCP server URL in the test script to http://localhost:8080/mcp.

```bash
uv run test_mcp_server.py
```
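Editing the URL by hand for each environment is easy to get wrong. One option is to let the test script read the endpoint from an environment variable. The sketch below assumes a hypothetical `MCP_SERVER_URL` variable and a helper named `resolve_mcp_url`; neither is part of FastMCP. The returned string would then be passed to `Client(...)` in test_mcp_server.py.

```python
import os
from urllib.parse import urlparse

DEFAULT_URL = "http://localhost:3000/mcp"  # the local server started by server.py


def resolve_mcp_url(env=os.environ) -> str:
    """Pick the endpoint from MCP_SERVER_URL (hypothetical), else the local default."""
    url = env.get("MCP_SERVER_URL", DEFAULT_URL)
    parsed = urlparse(url)
    # Reject values that are not absolute HTTP(S) URLs before connecting.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Not a valid HTTP(S) URL: {url!r}")
    return url


if __name__ == "__main__":
    print(resolve_mcp_url())
```

With a script adapted this way, `MCP_SERVER_URL=http://localhost:8080/mcp uv run test_mcp_server.py` would target the port-forward without any code change.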

Now let's secure the connection. To do so, we'll use a Google-managed SSL certificate and attach it to a Gateway API resource. First, reserve a static IP address for your load balancer:

```bash
gcloud compute addresses create mcp-server-ip --global
export MCP_SERVER_IP=$(gcloud compute addresses describe mcp-server-ip --global --format="value(address)")
echo "Your IP: $MCP_SERVER_IP"
```

Point the DNS A record for your domain (for example, mcp.yourdomain.com) at $MCP_SERVER_IP.

Create a Google-Managed Certificate. Replace mcp.yourdomain.com with your actual domain.

```bash
gcloud compute ssl-certificates create mcp-cert --domains mcp.yourdomain.com --global
```

Create a gateway.yaml file to provision the load balancer and configure Transport Layer Security (TLS) termination.

```yaml
# Gateway: HTTPS load balancer with the managed certificate and static IP
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: mcp-gateway
spec:
  gatewayClassName: gke-l7-global-external-managed
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      options:
        networking.gke.io/pre-shared-certs: mcp-cert
  addresses:
  - type: NamedAddress
    value: mcp-server-ip
---
# HTTPRoute: forward traffic to the MCP Server
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: mcp-route
spec:
  parentRefs:
  - name: mcp-gateway
  hostnames:
  - "mcp.yourdomain.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /mcp
    backendRefs:
    - name: mcp-service
      port: 80
---
# The GCPBackendPolicy configures session affinity and other backend settings.
# Since MCP Servers are stateful we enable session affinity. This ensures that
# requests from the same client are sent to the same backend.
apiVersion: networking.gke.io/v1
kind: GCPBackendPolicy
metadata:
  name: mcp-backend-policy
spec:
  default:
    sessionAffinity:
      type: CLIENT_IP
  targetRef:
    group: ""
    kind: Service
    name: mcp-service
---
# The HealthCheckPolicy configures custom health probes for the MCP Server.
apiVersion: networking.gke.io/v1
kind: HealthCheckPolicy
metadata:
  name: mcp-health
  namespace: default
spec:
  default:
    checkIntervalSec: 15
    timeoutSec: 5
    healthyThreshold: 1
    unhealthyThreshold: 2
    logConfig:
      enabled: false
    config:
      type: HTTP
      httpHealthCheck:
        port: 3000
        requestPath: /healthz
  targetRef:
    group: ""
    kind: Service
    name: mcp-service
```

Deploying this configuration creates the infrastructure required to route external traffic securely to the MCP server.

```bash
kubectl apply -f gateway.yaml
```

Wait a few minutes for the load balancer to become active and the certificate to provision. Developers can check the status using kubectl get gateway mcp-gateway.
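To see why the GCPBackendPolicy matters for a stateful server, consider what CLIENT_IP affinity promises: requests from one client address keep landing on the same backend. The sketch below illustrates that idea with a simple hash. It is not how the Google Cloud load balancer is implemented, and the pod names and the `pick_backend` function are made up for illustration.

```python
import hashlib


def pick_backend(client_ip: str, backends: list[str]) -> str:
    """Map a client IP to one backend deterministically (illustrative only)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Hypothetical pod names standing in for the two replicas in deployment.yaml.
pods = ["mcp-server-a", "mcp-server-b"]

first = pick_backend("203.0.113.7", pods)
second = pick_backend("203.0.113.7", pods)
print(first == second)  # the same client IP always maps to the same pod
```

Without this kind of stickiness, a follow-up request could reach a replica that has no record of the client's MCP session.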

Finally, try to reach the remote MCP server. Change the MCP server URL in the test script to https://mcp.yourdomain.com/mcp, then run the script to verify the connection.

```bash
uv run test_mcp_server.py
```

Cleanup

```bash
kubectl delete -f deployment.yaml
kubectl delete -f gateway.yaml
gcloud compute addresses delete mcp-server-ip --global
gcloud compute ssl-certificates delete mcp-cert --global
gcloud artifacts repositories delete mcp-repo --location=$REGION
gcloud container clusters delete mcp-cluster --region $REGION
```

Deploying Model Context Protocol servers to Kubernetes enables new use cases for integrated agents and AI workflows. To dive deeper into these capabilities, explore the following resources:

Google named a Leader in IDC MarketScape SIEM 2026 Vendor Assessment

Tue, 16 Jun 2026 17:30:00 +0000

Security operations teams are under immense pressure to defend against adversaries who use AI to act with unprecedented speed, scale, and sophistication. To navigate these moments, secure mission-critical workloads, and build confident defense programs, organizations rely on modern security information and event management (SIEM) systems as the backbone of their security operations.

We are proud to announce that Google has been named a Leader in the 2026 IDC MarketScape for Worldwide SIEM Vendor Assessment (#US54126826, June 2026). We believe this recognition reflects our sustained investment and innovation in Google Security Operations, bringing together Mandiant's frontline expertise, comprehensive automation, and advanced AI agents to empower defenders.

According to the report, Google was recognized for several key strengths, including:

The Alert Triage and Investigation agent collects evidence, runs correlated searches, and produces a transparent verdict, reducing the security analyst workload. The additional agents announced at Google Cloud Next extend agentic workflows beyond triage into proactive hunting and rule generation.
Google designs the silicon, runs the infrastructure, develops the Gemini foundation models through DeepMind, and encodes its internal security expertise into agent evaluation loops. Vertical AI integration supports unit economics that would be difficult to achieve through third-party model APIs and gives Google tighter control over the iteration cycle that improves agent accuracy on security-specific tasks.
Curated detection content authored by Mandiant analysts is mapped to MITRE ATT&CK and refreshed on a regular cadence. Customers report that the higher-tier curated rule sets deliver useful detections out of the box.
Search performance over large data volumes is a consistently cited technical strength. The unified data lake, combined with all-time UDM search and multistage search with cross joins, allows analysts to query the full retention period without the performance degradation common on legacy on-premises platforms.

IDC MarketScape vendor analysis model is designed to provide an overview of the competitive fitness of technology and service suppliers in a given market.  The research methodology utilizes a rigorous scoring methodology based on both qualitative and quantitative criteria that results in a single graphical illustration of each vendor’s position within a given market. The Capabilities score measures vendor product, go-to-market and business execution in the short-term. The Strategy score measures alignment of vendor strategies with customer requirements in a 3-5-year timeframe. Vendor market share is represented by the size of the circles. Vendor year-over-year growth rate relative to the given market is indicated by a plus, neutral or minus next to the vendor name.

Google Security Operations, powered by AI

Speed and accuracy are crucial in threat detection and incident response. Google continues to drive security operations innovation to help defenders work smarter, not harder. By deeply embedding Gemini in Google Security Operations, we enable analysts to perform complex natural language searches across vast amounts of security telemetry. We have also added agents such as the Triage and Investigation agent that enhance analyst productivity by accelerating event summarization, dynamically generating detection rules, and building automated response playbooks in seconds instead of hours.

“With Google Security Operations, we’re able to take in large volumes of telemetry, introduce AI into our workflows, and we saw a 97% reduction in alerts,” said Daniel Peterpaul, VP, Information Security, Sunrun.

Unparalleled access to threat intelligence

A modern SIEM must go beyond data aggregation; it requires context. Google Threat Intelligence combines Mandiant's frontline expertise, the global reach of the VirusTotal community, and the unparalleled visibility of Google's services and devices into Google Security Operations.

Our applied threat intelligence capability enables security teams to spend less time on manual monitoring and more time contextualizing alerts for better decision-making. Through services like Mandiant Hunt, we integrate our proactive experts directly into Google Security Operations to help defenders search for undetected attacks and adversary tactics, techniques, and procedures (TTPs) before they escalate.

Ensuring operational resilience for global enterprises

Organizations around the globe are making significant leaps in both the technology they use and the way they think about security operations by partnering with Google. The ability to stitch together security telemetry and threat intelligence gives organizations visibility to full-service recovery and holistic security transformation.

“Our engineers in the SOC are working on high fidelity, true positives only. So, you've got a high fidelity true positive that's fired, and frankly, you want that alarm then to be enriched with as much contextual information as possible, that's the shift that Gemini in SecOps will allow us to get to. We want AI to work in service of our people, and then we want people to use their human brilliance, creativity, big picture problem-solving to think about attack paths and predicting them, and really making our environment a hard target,” said Matt Rowe, chief security officer, Lloyds Banking Group.

Take the next step in advancing your cyber defenses

Organizations that seek to work with a globally capable security leader with strong threat intelligence capabilities and a holistic approach to security operations should consider Google. To learn more about our capabilities and why Google has been named a Leader, read a complimentary excerpt of the 2026 IDC MarketScape for Worldwide SIEM Vendor Assessment here.

Introducing Brazos: Bringing liquid cooling to air-cooled data centers

Tue, 16 Jun 2026 16:00:00 +0000

Next-generation artificial intelligence (AI) and high-performance computing (HPC) chips routinely exceed 1000 W Thermal Design Power (TDP). Simply put, standard air cooling cannot manage these extreme heat loads. The alternative — retrofitting entire data center facilities with chilled water loops — requires extensive amounts of capital and time. To solve this problem, Google developed Brazos, a rack-mounted, closed-loop liquid-to-air cooling system that lets you deploy high-density, liquid-cooled equipment inside existing air-cooled environments. Brazos is generally available, and our manufacturing suppliers are ready to engage the broader industry to market and produce the Google Brazos design.

Data center facility updates can take months. Brazos avoids that delay by allowing simple, one-rack-at-a-time installations. By separating the internal-to-IT liquid loop from the facility water supply, Brazos delivers high-performance liquid cooling with the operational simplicity of standard air systems.

Figure 1: Brazos OCP ORV3 Sidecar Configuration showing three units providing cooling to an adjacent IT rack.

Brazos functions as a self-contained liquid ecosystem, capturing heat via liquid at the component level and rejecting it into the data center's hot aisle using high-efficiency liquid-to-air heat exchangers. This plug-and-play architecture can be rapidly installed in any legacy facility that has sufficient power and standard air handling.

Figure 2: Photograph of three Brazos modular units in a sidecar rack.

System design and technical specifications

Brazos is a modular system that includes three cooling units and integrated rack manifolds, all engineered for high reliability. Each modular chassis occupies 11 Open Units (OU) of rack height and interfaces with standard Open Compute Project (OCP) ORv3 form-factor racks. Key design and performance parameters include:

Rack thermal capacity: Supports a 60 kW nominal thermal load per rack across three modular units
Coolant compatibility: Runs using either deionized (DI) water or a 25% propylene glycol mixture (PG25)
Power delivery: Operates on a 40–60 V DC input designed to connect directly with standard rack busbars
Safety features: Certified to UL/CSA/IEC 62368-1 standards and features built-in leak detection alongside pressure relief valves
Control plane: Local monitoring uses a built-in human-machine interface (HMI), while remote management connects via Modbus over TCP
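As a back-of-the-envelope reading of these figures, 60 kW shared across three units is 20 kW per unit, and three 11 OU chassis occupy 33 OU in total. The sketch below also estimates the coolant flow each unit would need, using Q = m·c_p·ΔT. The 10 K coolant temperature rise and the water properties are assumptions chosen for illustration, not published Brazos specifications.

```python
RACK_LOAD_W = 60_000        # nominal thermal load per rack (from the list above)
UNITS_PER_RACK = 3          # modular units sharing that load
CHASSIS_HEIGHT_OU = 11      # rack height of one chassis

# Assumptions for illustration only, not published Brazos values.
WATER_CP_J_PER_KG_K = 4186  # approximate specific heat of water
WATER_DENSITY_KG_PER_L = 1.0
ASSUMED_DELTA_T_K = 10.0    # assumed coolant temperature rise across the IT load


def flow_l_per_min(load_w: float, delta_t_k: float) -> float:
    """Coolant flow needed to carry `load_w` at the given temperature rise."""
    mass_flow_kg_s = load_w / (WATER_CP_J_PER_KG_K * delta_t_k)
    return mass_flow_kg_s / WATER_DENSITY_KG_PER_L * 60


per_unit_w = RACK_LOAD_W / UNITS_PER_RACK
print(f"Load per unit: {per_unit_w / 1000:.0f} kW")
print(f"Total chassis height: {UNITS_PER_RACK * CHASSIS_HEIGHT_OU} OU")
print(f"Flow per unit: {flow_l_per_min(per_unit_w, ASSUMED_DELTA_T_K):.1f} L/min")
```

Under these assumptions, each unit would need roughly 29 L/min of water. A smaller temperature rise or a glycol mixture such as PG25 would require more flow for the same load.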

The mechanical design prioritizes field serviceability. The chassis sits on low-friction slides so it can easily be extended for rapid component access. Crucial components like pumps and fans are designed as hot-swappable, field-replaceable units (FRUs) to minimize mean time to repair (MTTR).

Rapid deployment and industry adoption

In the coming months, we will formally open-source the technical specifications, design principles, and visual assets of Brazos through industry forums. As part of a broader infrastructure portfolio that continues to leverage waterless air-cooled systems alongside liquid cooling, Brazos represents one of many innovations we are contributing to the open hardware ecosystem. We invite system architects, manufacturers, and thermal engineers to evaluate these designs to scale rack-mounted cooling infrastructure for the high-power computing demands of the future.

Next steps

To optimize your legacy data center infrastructure for liquid cooling, follow our upcoming open-source design submissions through the Open Compute Project forum.

Introducing new Explores and Merge Queries in Looker

Tue, 16 Jun 2026 16:00:00 +0000

A key goal for many enterprises in the AI era is to empower their employees to uncover actionable data insights on their own. To help, we are evolving Looker Explore with a streamlined interface and integrated AI, so every user can confidently turn data into a clear path to action.

A team of AI assistants

At the heart of the new Explore release is a suite of AI capabilities that guides users from their very first click with new insight and expression assistants.

AI-assisted Quick Start

We are virtually eliminating the cold start from an empty canvas. If the data modeler hasn't built predefined Quick Starts, Looker automatically generates a query for the user, tapping into Google’s latest Gemini models to generate ad hoc Quick Starts that can help users dive deep into the data, beyond visible fields, and surface potential questions the data can tackle.

The new Explore interface in Looker

Insight Assistant

Users can now prompt Looker Explores in natural language to modify data tables and visualizations. The Insight Assistant uses the Conversational Analytics API to identify relevant fields, apply filters, sort data, and construct the data table. We expect this feature to be a significant time-saver that can provide a rapid starting point for complex analysis.

You can ask questions in natural language to update data tables in Looker

Expression Assistant

Users can also describe a custom calculation in natural language, and Looker will automatically fill in the appropriate syntax, so they don't have to learn Looker Expression (Lexp) syntax. Users can also re-prompt the assistant to iterate on custom field expressions.

AI-generated Explore summary

If a user-generated description does not exist for an Explore, Looker will provide an AI-generated summary, to help data analysts rapidly gain familiarity.

An intuitive, modernized UI

In addition to these new assistants, we’ve updated the Looker user interface to be more modern and polished. There, you’ll find:

A customizable workspace: The new interface features a resizable field picker pane, with more easily readable long field names.
Data table contextual menus: Looker now offers powerful functionality right in the data table. Users can access quick menus on columns to switch data granularities, apply filters like 'IS NOT BLANK' or 'IS NOT NULL', and instantly add complex table calculations like '% of column' or 'running total'.
Visual pivots: Users will soon be able to drag and drop fields into a panel to pivot data into columns, rows, and aggregated values.

Connect data with redesigned merge queries

Looker’s new interface to quickly join modeled data

In addition, we redesigned Looker Explore’s Merge Query workflow with a unified, in-window architecture that includes:

A dynamic three-panel interface: The new design maintains context beautifully by displaying three simultaneous panels: a "configure joins" list on the left, a dynamic field picker in the middle, and your data preview/visualization on the right. You can edit a source query without losing the context of the overarching join configuration.
Smart join suggestions: The new panel automatically suggests optimal join fields, such as state and month, and shows the combined fields.
Instant query linking: If you have an existing query you want to use, you can paste a prebuilt query URL to start a join.
Expanded row limits: We've increased the default row limit for non-BigQuery sources to 50,000 rows.

By pairing conversational AI with a dramatically simplified user interface, Looker’s new Explore experience gives your business users the tools they need to investigate their data with confidence. Reach out to your Looker administrator today to enable this feature. For more information, click here for detailed documentation.

How Atlas scales hundreds of merchant databases with Cloud SQL Enterprise Plus edition

Tue, 16 Jun 2026 16:00:00 +0000

Atlas is building the operating system for restaurants. Online storefronts, point of sale, third-party logistics, food platform integrations, customer loyalty, and AI tools represent everything a restaurant needs to start, run, and grow. We work with brands like SaladStop, Killiney, Haidilao, Raffles Hotel, Lo and Behold Group and the Les Amis Group in Singapore, helping merchants increase basket sizes, grow sales, and reduce operational costs.

Every merchant on Atlas gets their own dedicated Cloud SQL for PostgreSQL database. Restaurants are very different from each other. A single-outlet cafe and a multi-outlet chain should not look the same underneath. Isolated databases give us full data separation, predictable performance even during peak lunch and dinner rushes, and the flexibility to scale, tune, or migrate each merchant independently. As Atlas grows, the number of databases grows with us.
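A database-per-merchant layout needs some way to route each request to the right instance. The sketch below shows one hypothetical way to do that with a small in-memory registry. The class name, host naming scheme, and DSN format are illustrative assumptions and do not describe Atlas's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MerchantDatabaseRegistry:
    """Maps each merchant ID to the DSN of its own dedicated database."""

    _dsns: dict[str, str] = field(default_factory=dict)

    def provision(self, merchant_id: str) -> str:
        """Register a dedicated database for a new merchant (hypothetical naming)."""
        if merchant_id in self._dsns:
            raise ValueError(f"{merchant_id} already has a database")
        dsn = f"postgresql://db-{merchant_id}.internal.example:5432/{merchant_id}"
        self._dsns[merchant_id] = dsn
        return dsn

    def dsn_for(self, merchant_id: str) -> str:
        """Look up the DSN for a merchant; unknown merchants raise KeyError."""
        return self._dsns[merchant_id]


registry = MerchantDatabaseRegistry()
registry.provision("cafe-001")
registry.provision("chain-042")

# Each merchant resolves to a different database, so their data stays separate.
print(registry.dsn_for("cafe-001"))
print(registry.dsn_for("chain-042"))
```

Because every lookup returns a single merchant's DSN, one merchant can be tuned, scaled, or migrated by changing only its own entry.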

The challenge: Scaling beyond standard

We started on the standard Cloud SQL Enterprise edition. It was a solid foundation, but as we onboarded more merchants and shipped more features, the operational layer required to manage our databases became a bottleneck.

We were managing connection pooling as a separate layer, which meant more services to run, secure, and monitor. When a query caused a CPU spike, we needed to know exactly what happened and which merchant triggered it, but we were spending too much time reconstructing problems from limited signals. With a lean team and no dedicated database engineers, every extra component multiplied the maintenance load.

The shift to Enterprise Plus edition

When we needed to provision new database instances, the Google Cloud team introduced us to Cloud SQL Enterprise Plus edition. We were already asking ourselves how much more operational overhead this was going to add, and what stood out was that Enterprise Plus edition removed whole categories of work we would otherwise have to own.

Managed connection pooling: Now built directly into Cloud SQL, we no longer run pooling as a separate layer. This means fewer moving parts, less to maintain, and a smaller security surface area.
Query insights: This was the most impactful feature for our needs. We can now see exactly which queries are expensive and which merchant is triggering them. It turns performance tuning from guesswork into something concrete and actionable. For a platform running hundreds of databases, this visibility is a "superpower."
Data cache: This keeps read performance consistent even as merchant datasets grow. Since restaurants generate more data every day, the data layer needs to stay fast as that complexity compounds.
Near-zero downtime scaling: We can now scale instances as merchants grow without disrupting service during off-peak hours.

After seeing the results on the new instance, we migrated all our existing databases to Enterprise Plus edition as well.

The impact: Focus on innovation, not plumbing

Atlas today powers thousands of restaurant outlets and processes tens of thousands of orders daily using hundreds of managed databases. The biggest change is where engineering time goes. We spend 30% less time on database operations and more time building products. Merchant onboarding got simpler because a new merchant is provisioned in seconds with a ready-to-use managed database. We are much more proactive on performance now, catching and fixing issues before they reach merchants. Day to day, we are not thinking about database plumbing. We are thinking about how to serve merchants better, and that has allowed Atlas to grow 200% to 300% year over year.

Looking ahead: An AI-first future

We are investing deeply in AI, both internally and externally. Internally, we have gone all in on agentic engineering through AI-assisted development workflows that let a lean team build, review, and ship code significantly faster. Externally, we are building AI-powered tools that help restaurant operators make better decisions and act on them. We have a lot of experimental ideas on the roadmap, including new product surfaces and new ways to help restaurants grow. The thing that gives us confidence to move fast on all of this is that the foundational layer, Cloud SQL and Google Kubernetes Engine (GKE), is battle-tested and does not get in the way.

Google Cloud handles the infrastructure complexity. Atlas stays focused on building the best tools for restaurants.

Cloud SQL Enterprise Plus gave us a database architecture that is flexible, observable, and easy to scale. We are not thinking about infrastructure anymore, we are thinking about our merchants. As we go deeper on AI and continue growing the platform, Google Cloud gives us the confidence to move fast without worrying about what is underneath.

Ready to scale your database architecture?

Don't let infrastructure bottlenecks slow down your innovation. Whether you are managing tens or hundreds of databases, see how Google Cloud SQL can streamline your operations, enhance observability, and give your engineering team the freedom to focus on what matters most.

Explore Cloud SQL Enterprise Plus edition today
Sign up to try Cloud SQL for free.

How Siemens "slices the elephant," advancing agentic workflows for industrial software development

Tue, 16 Jun 2026 14:00:00 +0000

For technology companies like Siemens, software is the nervous system of factories, energy grids, and transportation networks worldwide.

As a global leader in industrial AI, industrial software, and industrial automation, Siemens brings decades of domain expertise across factory and process automation, energy infrastructure, and intelligent transportation — expertise that no off-the-shelf AI solution can replicate. But innovation carries a heavy anchor: legacy code.

With codebases spanning hundreds of millions of lines developed over more than a decade, Siemens faced a challenge that standard AI tools couldn't solve: understanding and modernizing this code and the applications that run on it. The scale and depth of industrial-grade software demand a fundamentally different approach. Existing coding assistants lacked the contextual depth required to navigate complex, multi-layered industrial codebases, a gap Siemens set out to close.

To solve this, Siemens and Google Cloud created Knowledge Fabric, an AI system for automating the software development lifecycle. It was built using knowledge graphs on Spanner Graph, the Google Agent Development Kit, Gemini API, Agent Platform, Gemini CLI, and Anthropic Claude Code. In a pilot migrating existing frontends to web-based interfaces, Knowledge Fabric reduced implementation effort, freeing engineers to focus on customer innovations while maintaining full system compatibility.

“By ingesting the entire software ecosystem into an intelligent agentic system equipped with custom knowledge graphs, we aren’t just helping developers optimize their development time; we are enabling autonomous agents to reason across the past to build the future,” said Franz Menzl, senior vice president, product creation excellence at Siemens. “This is about freeing engineers from repetitive work so they can focus on higher-value problem solving.”

The challenge: the complexity of industrial software

Modernizing large-scale industrial-grade software systems is often compared to rebuilding a jet while flying it. For Siemens, the challenge had four dimensions:

Scale: The repositories are massive — far exceeding the context windows of standard large language models.
Fragmentation: Critical knowledge was scattered across code, Jira tickets, Confluence pages, and scanned PDF manuals from the early 2000s.
Complexity: Tracing the link between a specific line of code and a functional requirement document from 10 years ago presented a challenge that no manual or conventional tooling approach could address efficiently. It’s a reality shared across the industry.
Responsibility: Systems must adhere to strict quality, compliance, and lifecycle requirements, often over 15 to 20 years of operation. AI‑generated outputs must therefore be explainable, traceable, and verifiable. Hallucinated or unvalidated changes are not merely inefficient but operationally unacceptable.

"We realized that standard RAG (retrieval-augmented generation) wasn't enough," said Agata Gołębiowska, technical lead, Google Cloud. "Code isn't just text; it has inherent structure. A class belongs to a file, which belongs to a module. Flattening that into a vector database meant losing the representation of relationships elements of the codebase."

The solution: A domain-aware Knowledge Fabric

To make this sprawling software environment navigable for AI-driven workflows, the teams built the Knowledge Fabric agent. This agent goes beyond keyword matching to “understand” the relationships between assets.

We use Spanner Graph to model the inherent structure of the codebase, applying the same rigor to documentation across formats. By mapping connections between these domains, we can link specific code snippets directly to requirements in a design document. Agents then traverse this graph, using tools to query the structure via Graph Query Language (GQL).
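To make the idea of linking code to requirements concrete, here is a minimal in-memory sketch in Python. It is not Spanner Graph or GQL, and it is not the Siemens implementation: the node names, relation labels, and helper functions are all hypothetical, chosen only to show how following typed edges can connect a code element to the design document that describes its requirement.

```python
from collections import defaultdict

# Hypothetical edge list: (source, relation, target). All names are illustrative.
EDGES = [
    ("AxisPanel.render", "DEFINED_IN", "axis_panel.py"),
    ("axis_panel.py", "PART_OF", "hmi_module"),
    ("AxisPanel.render", "IMPLEMENTS", "REQ-101"),
    ("REQ-101", "DESCRIBED_IN", "design_doc.pdf"),
]


def build_index(edges):
    """Group outgoing edges by source node for quick lookup."""
    index = defaultdict(list)
    for src, rel, dst in edges:
        index[src].append((rel, dst))
    return index


def follow(index, start, relations):
    """Follow a fixed sequence of relation types from `start`.

    Returns the nodes reached after the last hop.
    """
    frontier = [start]
    for wanted in relations:
        frontier = [
            dst
            for node in frontier
            for rel, dst in index[node]
            if rel == wanted
        ]
    return frontier


INDEX = build_index(EDGES)

# Which document describes the requirement this method implements?
docs = follow(INDEX, "AxisPanel.render", ["IMPLEMENTS", "DESCRIBED_IN"])
# Which module contains this method?
modules = follow(INDEX, "AxisPanel.render", ["DEFINED_IN", "PART_OF"])
```

In a real graph database the same two-hop pattern would be expressed as a declarative path query rather than a Python loop; the sketch only illustrates why keeping edges typed preserves structure that a flat vector store would lose.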

But GQL is only one piece. To enable semantic understanding, we generate embeddings for every node, using Spanner's Approximate Nearest Neighbors (ANN) algorithm to perform efficient vector search across the full codebase. Finally, we give agents full-text search capabilities, which can be combined with GQL to pinpoint nodes and edges with precision.

Combining these three methods lets an LLM agent answer complex queries, such as: "Which functions need to be updated if I change the logic in the Axis Control Panel?" The system traverses the graph — weighing keyword and semantic similarity — to identify dependencies, retrieve relevant documentation, and present a precise impact analysis.
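As a rough illustration of how the three methods can work together, the Python sketch below picks a starting node by blending a keyword-overlap score with cosine similarity, then walks call edges in reverse to list everything that depends on it. The node names, toy two-dimensional embeddings, the `alpha` weighting, and every function here are assumptions made for the example; a production system would use real embeddings, a full-text index, and graph queries instead.

```python
import math
from collections import deque

# Hypothetical nodes: name -> (description text, toy embedding vector).
NODES = {
    "axis_panel.update": ("axis control panel update logic", [0.9, 0.1]),
    "motion.plan": ("plans motion for each axis", [0.7, 0.3]),
    "ui.refresh": ("redraws widgets", [0.2, 0.8]),
    "report.export": ("exports usage report", [0.1, 0.2]),
}

# Hypothetical call graph: caller -> list of callees.
CALLS = {
    "motion.plan": ["axis_panel.update"],
    "ui.refresh": ["axis_panel.update"],
    "report.export": [],
    "axis_panel.update": [],
}


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def keyword_score(query, text):
    """Fraction of query words that also appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words)


def find_seed(query, query_vec, alpha=0.5):
    """Pick the node with the best blend of keyword and semantic similarity."""
    def score(name):
        text, vec = NODES[name]
        return alpha * keyword_score(query, text) + (1 - alpha) * cosine(query_vec, vec)
    return max(NODES, key=score)


def impacted(seed):
    """Breadth-first walk over reversed call edges: all direct and transitive callers."""
    callers = {}
    for caller, callees in CALLS.items():
        for callee in callees:
            callers.setdefault(callee, []).append(caller)
    seen, queue = set(), deque([seed])
    while queue:
        node = queue.popleft()
        for caller in callers.get(node, []):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return sorted(seen)


seed = find_seed("axis control panel", [1.0, 0.0])
affected = impacted(seed)
```

Here the search steps locate where the change starts, and the graph step answers what else is affected, which is the division of labor the impact analysis above relies on.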

This precise context is what lets a coding agent produce a valid, usable, and maintainable implementation.

"Slicing the elephant:" the agentic workflow

A key insight from the project was that AI agents struggle with massive, ambiguous tasks. To succeed, the team adopted a design pattern dubbed "slicing the elephant."

The system breaks a sweeping request like “refactor this module” into smaller, more manageable tasks, each handled by a specialized agent built with the Google Agent Development Kit (ADK):

Search agent: Acts as a deep-research specialist. It uses tools to explore the code graph and cross-reference findings with documentation in Agent Search.
User story agent: Interviews the product owner to gather requirements, then drafts detailed user stories with acceptance criteria linked to existing system contexts.
Architecture impact agent: Analyzes proposed changes against the graph to predict side effects before a single line of code is written.
Task breakdown agent: Consumes the analysis from the architecture impact agent and breaks the work into small, manageable tasks, each carrying all the context relevant to a specific change.
Coding agent: Implements the change described in a specific task. Reaching this step without context and prior analysis produces unusable code.
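The hand-off between these agents can be pictured as a simple staged pipeline. The Python sketch below uses plain functions rather than the ADK, and every function name, dictionary key, and placeholder value is hypothetical. It only shows the shape of the pattern: each stage enriches a shared context, the coding stage receives the accumulated analysis, and a reviewer callback can stop the run after any stage.

```python
def search_agent(ctx):
    # Placeholder for graph and documentation research.
    return {**ctx, "findings": f"graph and docs relevant to: {ctx['request']}"}


def user_story_agent(ctx):
    # Placeholder for requirements gathered from the product owner.
    return {**ctx, "stories": [f"As a user, I want: {ctx['request']}"]}


def architecture_impact_agent(ctx):
    # Placeholder for side effects predicted from the graph.
    return {**ctx, "impact": ["component A", "component B"]}


def task_breakdown_agent(ctx):
    # One small task per impacted component, each carrying its own context.
    tasks = [{"target": c, "context": ctx["findings"]} for c in ctx["impact"]]
    return {**ctx, "tasks": tasks}


def coding_agent(ctx):
    # Runs last, so it sees the findings, stories, impact, and tasks.
    return {**ctx, "changes": [f"patch for {t['target']}" for t in ctx["tasks"]]}


PIPELINE = [
    search_agent,
    user_story_agent,
    architecture_impact_agent,
    task_breakdown_agent,
    coding_agent,
]


def run(request, approve):
    """Run each stage in order; `approve(stage_name, ctx)` is the human checkpoint."""
    ctx = {"request": request}
    for stage in PIPELINE:
        ctx = stage(ctx)
        if not approve(stage.__name__, ctx):
            raise RuntimeError(f"{stage.__name__} output rejected by reviewer")
    return ctx


result = run("refactor this module", approve=lambda name, ctx: True)
```

Passing an `approve` callback that returns `False` for a given stage halts the pipeline there, which mirrors the review step between agents.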

The system keeps a human in the loop at every step, which ensures reliable, production‑grade outcomes and keeps engineers focused on meaningful work rather than routine implementation.

"By slicing the elephant — breaking complex refactoring jobs into smaller, agent-led tasks — we observed a significant productivity increase," said Alexander Lomakin, project lead at Siemens. "We essentially gave the AI the roadmap it needed to navigate the complexity."

Pilot results: Faster, more efficient engineering

Developers saw results almost immediately.

Analyzing dependencies for a new feature once required senior engineers to spend several days navigating codebases and legacy documentation. With the Knowledge Fabric, the same work now takes far less time.

In a recent production pilot migrating legacy control panels to modern web‑based interfaces, the Knowledge Fabric reduced overall coding effort while preserving system integrity and industrial quality standards.

Engineers now spend more time creating customer value and less on repetitive work.

Get started

The Knowledge Fabric shows that generative AI can do more than write boilerplate code; it can also help teams modernize the legacy systems their businesses depend on most.

To learn more about building graph-based agents for your own legacy modernization:

Read about Spanner Graph.
Explore Agent Platform and find pre-built production-grade agents on Agent Garden.
Check out the Agent Development Kit.
Read more on how Siemens is advancing industrial AI.

How customer collaboration is shaping the future of GenAI security with Model Armor

Tue, 16 Jun 2026 07:00:00 +0000

At Google Cloud, we believe that the best products are built in partnership with our customers. Their feedback and real-world experiences are invaluable in helping refine our services and deliver solutions that truly meet our customers’ needs. In January 2026, our Google Cloud Developer Advocacy team participated in a high-velocity technical sprint with a major Google Cloud customer, a leader in the telecommunications industry.

This collaborative engagement provided us with deep insights, leading to significant enhancements to the information experience for Model Armor, our service for runtime security for generative and agentic AI.

Accelerating GenAI adoption through "radical empathy"

The objective of this engagement was to support the productionization of a next-generation GenAI customer support platform built using Google Cloud's Agent Development Kit (ADK) and Agent Platform. By sitting directly with the customer's developers and security specialists, we gained a unique opportunity to observe how developers interact with Gemini Enterprise Agent Platform in a live, complex environment.

This experience provided something traditional documentation cycles cannot replicate: radical empathy. By logging friction points as developers worked, we translated functional blockers into technical insights in real time, identifying exactly where developers were hindered by ambiguous configuration guidance or a lack of granular detail.

Key discoveries from the front lines

By observing the development workflow firsthand, we identified four critical friction points:

Search-first workflows: Developers rarely navigate through documentation hierarchies; instead, they rely on search to jump straight to specific code examples. A lack of comprehensive, copy-pasteable snippets for common use cases—like PII redaction—was a primary point of friction.
Balancing confidence levels: Finding the right balance between comprehensive threat detection and minimizing disruptive false positives proved challenging. For instance, using aggressive settings like "low and above" often caused a high volume of false positives that interrupted legitimate customer support flows.
The need for granular guidance: While the core concepts of Model Armor were understood, developers needed more detail on how different enforcement methods function in practice to balance security with usability.
Integration roadblocks (the 403 error): When integrating Model Armor with other services like Apigee, developers frequently encountered 403 PERMISSION_DENIED errors. This indicated a gap in our documentation regarding necessary cross-service IAM roles and permissions.

Turning insights into action

The insights gained from this partnership were immediately channeled into a comprehensive overhaul of Model Armor’s documentation and guidance:

Tested, copy-pasteable code samples: We have added numerous tested, ready-to-use code samples throughout the documentation to support search-first workflows.
The confidence level matrix: We introduced a new technical reference to help users understand the trade-offs between different filter levels. We now explicitly recommend "High" or "Medium" thresholds for general content to minimize false positives, reserving "Low and above" for high-security threats like prompt injection and jailbreak detection.
Explicit integration guides: We updated our integration guides, with a focus on Apigee, Gemini Enterprise Agent Platform, and GKE. These now clearly outline the specific IAM roles required (such as roles/modelarmor.user) to ensure smooth, error-free deployments.
Deeper technical documentation: We have enhanced the documentation to provide in-depth explanations of enforcement methods and their real-world applications.
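The threshold trade-off is easier to see with a small example. The Python sketch below is not the Model Armor API: the `Confidence` enum, the filter names, and `should_block` are hypothetical stand-ins. It assumes a detection is blocked when its confidence meets or exceeds the threshold configured for that filter, with a medium threshold for general content and "low and above" for prompt injection, in line with the recommendation above.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Ordered confidence levels; a higher value means a more certain detection."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical per-filter thresholds (not real Model Armor settings).
THRESHOLDS = {
    "general_content": Confidence.MEDIUM,   # fewer false positives
    "prompt_injection": Confidence.LOW,     # "low and above": catch everything
}


def should_block(filter_name, detected):
    """Block when the detection's confidence meets or exceeds the filter's threshold."""
    return detected >= THRESHOLDS[filter_name]


# A low-confidence match on general content passes through...
general_low = should_block("general_content", Confidence.LOW)
# ...while the same confidence on a prompt-injection filter is blocked.
injection_low = should_block("prompt_injection", Confidence.LOW)
```

Lowering a threshold widens what gets blocked, which is why an aggressive setting applied to every filter can interrupt legitimate conversations.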

The power of partnership

Getting "in the room" with our customers allowed us to bridge the gap between technical accuracy and operational utility. This journey of co-innovation ensures that Model Armor serves as a genuine catalyst for your success. We encourage you to explore the updated documentation and share your feedback as we continue to build the most secure platform for your GenAI workloads.

Get started:

Explore the updated Model Armor documentation