DZone Spotlight

Saturday, March 14
Google Cloud AI Agents With Gemini 3: Building Multi-Agent Systems That Actually Work

By Jubin Abhishek Soni
The transition from large language models (LLMs) as simple chat interfaces to autonomous AI agents represents the most significant shift in enterprise software since the move to microservices. With the release of Gemini 3, Google Cloud has provided a foundational model capable of the long-context reasoning and low-latency decision-making required for sophisticated multi-agent systems (MAS). However, building an agent that "actually works" — one that is reliable, observable, and capable of handling edge cases — requires more than a prompt and an API key. It requires a robust architectural framework, a deep understanding of tool use, and a structured approach to agent orchestration.

The Architecture of a Modern AI Agent

At its core, an AI agent is a loop. Unlike a standard LLM call, which is a single input-output transaction, an agent uses the model's reasoning capabilities to interact with its environment. In the context of Gemini 3 on Google Cloud, this environment is managed through Vertex AI Agent Builder.

The Agentic Loop: Perception, Reasoning, and Action

1. Perception: The agent receives a goal from the user and context from its internal memory or external data sources.
2. Reasoning: Using Gemini 3's advanced reasoning capabilities (such as Chain of Thought or ReAct), the agent breaks the goal into sub-tasks.
3. Action: The agent selects a tool (a function call, an API, or a search) to execute a sub-task.
4. Observation: The agent evaluates the output of the action and decides whether to continue or finish.

System Architecture

To build a multi-agent system, we must move away from a monolithic agent. Instead, we use a modular approach in which a "Manager" or "Orchestrator" agent delegates tasks to specialized "Worker" agents. In this architecture, the Manager/Orchestrator serves as the brain: it uses Gemini 3's high reasoning capacity to determine which worker agent is best suited for the current task.
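The perceive-reason-act-observe cycle described above can be sketched as a minimal controller loop. This is an illustrative skeleton, not the Vertex AI SDK: the reason and execute_tool callables, and the decision dictionary shape, are hypothetical placeholders you would back with a real model and real tools.

```python
# Minimal agentic-loop skeleton (illustrative; not the Vertex AI SDK).
# `reason` and `execute_tool` are hypothetical callables.

def run_agent(goal, reason, execute_tool, max_iterations=10):
    """Drive the perceive -> reason -> act -> observe cycle until done."""
    context = [f"GOAL: {goal}"]           # perception: goal + accumulated context
    for _ in range(max_iterations):
        decision = reason(context)        # reasoning: pick next action or finish
        if decision["type"] == "finish":
            return decision["answer"]
        observation = execute_tool(decision["tool"], decision["args"])  # action
        context.append(f"OBSERVED: {observation}")                      # observation
    raise RuntimeError("Agent exceeded max_iterations without finishing")

# Toy usage: a "reasoner" that calls one tool, then finishes.
def fake_reason(context):
    if any(line.startswith("OBSERVED:") for line in context):
        return {"type": "finish", "answer": context[-1]}
    return {"type": "act", "tool": "echo", "args": {"text": "hi"}}

def fake_tool(name, args):
    return args["text"].upper()

print(run_agent("demo", fake_reason, fake_tool))  # OBSERVED: HI
```

The max_iterations guard also foreshadows the "infinite loop" problem discussed later: a bounded controller is the simplest defense against an agent that never decides to finish.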
This prevents "token bloat" in worker agents, as they only receive the context necessary for their specific domain.

Why Gemini 3 for Multi-Agent Systems?

Gemini 3 introduces several key advantages for agentic workflows that weren't present in previous iterations:

• Native function calling: Gemini 3 is fine-tuned to generate structured JSON tool calls with higher accuracy, reducing the "hallucination" rate during API interactions.
• Expanded context window: With a massive context window, Gemini 3 can retain the entire history of a multi-turn, multi-agent conversation without needing complex vector database retrieval for every step.
• Multimodal reasoning: Agents can now "see" and "hear," allowing them to process UI screenshots or audio logs as part of their reasoning loop.

Feature Comparison: Gemini 1.5 vs. Gemini 3 for Agents

Feature | Gemini 1.5 Pro | Gemini 3 (Agentic)
Tool call accuracy | ~85% | >98%
Reasoning latency | Moderate | Optimized low-latency
Native memory management | Limited | Integrated session state
Multimodal throughput | Standard | High-speed stream processing
Task decomposition | Manual prompting | Native agentic reasoning

Building a Multi-Agent System: Technical Implementation

Let's walk through the implementation of a multi-agent system designed for a financial analysis use case. We will use the Vertex AI Python SDK to define our agents and tools.

Step 1: Defining Tools

Tools are the "hands" of the agent. In Gemini 3, tools are defined as Python functions with clear docstrings, which the model uses to understand when and how to call them.
Python

import vertexai
from vertexai.generative_models import GenerativeModel, Tool, FunctionDeclaration

# Initialize Vertex AI
vertexai.init(project="my-project-id", location="us-central1")

# Define a tool for fetching stock data
get_stock_price_declaration = FunctionDeclaration(
    name="get_stock_price",
    description="Fetch the current stock price for a given ticker symbol.",
    parameters={
        "type": "object",
        "properties": {
            "ticker": {"type": "string", "description": "The stock ticker (e.g., GOOG)"}
        },
        "required": ["ticker"],
    },
)

stock_tool = Tool(
    function_declarations=[get_stock_price_declaration],
)

Step 2: The Worker Agent

A worker agent is specialized. Below is an example of a "Data Agent" that uses the stock tool.

Python

model = GenerativeModel("gemini-3-pro", tools=[stock_tool])
chat = model.start_chat()

def run_data_agent(prompt):
    """Handoff logic for the data worker agent."""
    response = chat.send_message(prompt)
    # Handle function calling logic
    part = response.candidates[0].content.parts[0]
    if part.function_call:
        # In a real scenario, you would execute the function here
        # and send the result back to the model.
        return f"Agent wants to call: {part.function_call.name}"
    return response.text

Step 3: The Orchestration Flow

In a complex system, the data flow must be managed to ensure that Agent A's output is correctly passed to Agent B. We use a sequence diagram to visualize this interaction.

Advanced Pattern: State Management and Memory

One of the biggest challenges in multi-agent systems is "state drift," where agents lose track of the original goal during long interactions. Gemini 3 addresses this with native session state management in Vertex AI. Instead of passing the entire conversation history back and forth (which increases cost and latency), we can use context caching. This allows the model to "freeze" the initial instructions and background data, only processing the new delta in the conversation.
Code Example: Context Caching for Efficiency

Python

import datetime

from vertexai.generative_models import GenerativeModel
from vertexai.preview import caching

# Large technical manual context
long_context = "... thousands of lines of documentation ..."

# Create a cache (valid for a specific TTL)
cache = caching.CachedContent.create(
    model_name="gemini-3-pro",
    contents=[long_context],
    ttl=datetime.timedelta(seconds=3600),
)

# Initialize the agent with the cached context. The agent now has
# "memory" of the documentation without re-sending it on every call.
agent = GenerativeModel.from_cached_content(cached_content=cache)

Challenges in Multi-Agent Systems

Building these systems isn't without hurdles. Here are the three most common technical challenges and how to solve them:

1. The "Infinite Loop" Problem

Agents can sometimes get stuck in a loop, repeatedly calling the same tool or asking the same question. Solution: Implement a max_iterations counter in your Python controller and use an "Observer" pattern in which a separate model monitors the agentic loop for redundancy.

2. Tool Output Ambiguity

If a tool returns an error or unexpected JSON, the agent might hallucinate a solution. Solution: Use strict Pydantic models for function outputs and feed the validation error back into the agent's context, allowing it to self-correct.

3. Context Overflow

Despite Gemini 3's large window, multi-agent systems can produce massive amounts of logs. Solution: Use an "Information Bottleneck" strategy: the Orchestrator summarizes the output of each worker before passing it to the next agent, ensuring only high-signal data moves forward.

Testing and Evaluation (LLM-as-a-Judge)

Traditional unit tests are insufficient for agents. You must evaluate the reasoning path. Google Cloud's Vertex AI Rapid Evaluation allows you to use Gemini 3 as a judge to grade the performance of your agents based on criteria like:

• Helpfulness: Did the agent fulfill the intent?
• Tool efficiency: Did it use the minimum number of tool calls?
• Safety: Did it adhere to the defined system instructions?
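The Pydantic-based self-correction described under "Tool Output Ambiguity" can be sketched in a few lines: validate the tool's raw output against a schema and, on failure, return the validation message so it can be fed back as an observation. StockQuote is a hypothetical schema for the get_stock_price tool, not part of any SDK.

```python
# Sketch: validate tool output with Pydantic and surface the error so
# the agent can self-correct. `StockQuote` is a hypothetical schema.
from pydantic import BaseModel, ValidationError

class StockQuote(BaseModel):
    ticker: str
    price: float

def validate_tool_output(raw: dict):
    """Return (quote, None) on success, or (None, error_text) to feed back."""
    try:
        return StockQuote(**raw), None
    except ValidationError as e:
        # The error text goes back into the agent's context as an observation.
        return None, f"TOOL_OUTPUT_INVALID: {e.errors()[0]['msg']}"

quote, err = validate_tool_output({"ticker": "GOOG", "price": "not-a-number"})
# err is non-None here; the agent retries with the validation message in context
```

The key design choice is that the agent never sees raw malformed JSON as a "fact"; it sees a labeled validation failure it can reason about.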
Evaluation Metric | Description | Target Score
Faithfulness | How well the agent sticks to retrieved data | > 0.90
Task completion | Success rate of complex multi-step goals | > 0.85
Latency per step | Time taken for a single reasoning loop | < 2.0s

Conclusion

Gemini 3 and Vertex AI Agent Builder have fundamentally changed the barrier to entry for building intelligent, autonomous systems. By utilizing a modular multi-agent architecture, leveraging native function calling, and implementing rigorous evaluation cycles, developers can move past the prototype stage and build production-ready AI systems. The key to success lies not in the size of the prompt, but in the elegance of the orchestration and the reliability of the tools provided to the agents. As we move into the era of agentic software, the role of the developer shifts from writing logic to designing ecosystems where agents can collaborate effectively.
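The "tool efficiency" criterion from the evaluation section can be made concrete with a small metric over an agent trace. The trace format here is hypothetical; a real Vertex AI evaluation would operate on actual run logs.

```python
# Sketch: a tool-efficiency metric over an agent trace (hypothetical
# trace format: a list of step dicts). Efficiency is the minimal
# expected number of tool calls divided by actual calls, capped at 1.0.

def tool_efficiency(trace: list[dict], minimal_calls: int) -> float:
    actual = sum(1 for step in trace if step["type"] == "tool_call")
    if actual == 0:
        return 0.0
    return min(1.0, minimal_calls / actual)

trace = [
    {"type": "tool_call", "name": "get_stock_price"},
    {"type": "tool_call", "name": "get_stock_price"},  # redundant repeat
    {"type": "finish"},
]
print(tool_efficiency(trace, minimal_calls=1))  # 0.5
```

A score well below 1.0, as in this toy trace, flags the redundant-call behavior that the Observer pattern is meant to catch.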
2026 Developer Research Report

By Carisse Dumaua
Hello, our dearest DZone Community! Last year, we asked you for your thoughts on emerging and evolving software development trends, your day-to-day as devs, and workflows that work best — all to shape our 2026 Community Research Report. The goal is simple: to better understand our community and provide the right content and resources developers need to support their career journeys. After crunching some numbers and piecing the puzzle together, at last, it is in (and we have to warn you, it's quite a handful)! This report summarizes the survey responses we collected from December 9, 2025, to January 27 of this year, and includes an overview of the DZone community, the stacks developers are currently using, the rising trend in AI adoption, year-over-year highlights, and so much more. Here are a few takeaways worth mentioning:

• AI use climbs this year, with 67.3% of readers now adopting it in their workflows.
• While most use multiple languages in their developer stacks, Python takes the top spot.
• Readers visit DZone primarily for practical learning and problem-solving.

This is just a small glimpse of what's waiting in our report, made possible by you. You can read the rest of it in the free 2026 Community Research Report. We really appreciate you lending your time to help us improve your experience and nourish DZone into a better go-to resource every day. Here's to new learnings and even newer ideas!

— Your DZone Content and Community team

Trend Report

Database Systems

Every organization is now in the business of data, but they must keep up as database capabilities and the purposes they serve continue to evolve. Systems once defined by rows and tables now span regions and clouds, requiring a balance between transactional speed and analytical depth, as well as integration of relational, document, and vector models into a single, multi-model design. At the same time, AI has become both a consumer and a partner that embeds meaning into queries while optimizing the very systems that execute them. These transformations blur the lines between transactional and analytical, centralized and distributed, human-driven and machine-assisted. Amidst all this change, databases must still meet what are now considered baseline expectations: scalability, flexibility, security and compliance, observability, and automation. With the stakes higher than ever, it is clear that for organizations to adapt and grow successfully, databases must be hardened for resilience, performance, and intelligence. In the 2025 Database Systems Trend Report, DZone takes a pulse check on database adoption and innovation, ecosystem trends, tool usage, strategies, and more — all with the goal of helping practitioners and leaders alike reorient our collective understanding of how old models and new paradigms are converging to define what's next for data management and storage.


Refcard #388

Threat Modeling Core Practices

By Apostolos Giannakidis

Refcard #401

Getting Started With Agentic AI

By Lahiru Fernando

More Articles

Engineering an AI Agent Skill for Enterprise UI Generation

Large language models have recently made it possible to generate UI code from natural language descriptions or design mockups. However, applying this idea in real development environments often requires more than simply prompting a model. Generated code must conform to framework conventions, use the correct components, and pass basic structural validation. In this article, we describe how we built an Agent Skill called zul-writer that generates UI pages and controller templates for applications built with the ZK framework. For readers unfamiliar with it, ZK is a Java-based web UI framework, and ZUL is its XML-based language used to define user interfaces. A typical page is written in ZUL and connected to a Java controller that handles the application logic.

The goal of this agent skill is to transform textual descriptions or UI mockups into ZUL pages and a Java controller scaffold, while validating the output to ensure it conforms to ZK's syntax and component model. This article focuses on the technical design of the agent, including prompt design, validation steps, and how we guide the model to generate framework-specific UI code.

Architecting the Agent: Guiding the LLM Toward Valid UI Code

When building tools for enterprise developers, free-form LLM generation is a liability. LLMs often invent non-existent tags, use unsupported properties, and mix architectural patterns. The solution is to strictly architect the agent's constraints.

The prompt constraints (SKILL.md): Instead of writing a prompt that "teaches" the LLM how to write ZUL, we use Markdown frontmatter and structured sections inside SKILL.md to establish ironclad constraints. These constraints bind the LLM to a strict four-step process, effectively removing its freedom to improvise outside of our defined architecture.
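For concreteness, a skill file of this shape typically opens with YAML frontmatter naming the skill, followed by the binding process steps. The fields and file paths below are an illustrative sketch, not the actual zul-writer SKILL.md.

```markdown
---
name: zul-writer
description: Generates ZK ZUL pages and Java controller scaffolds from
  text descriptions or UI mockups, then validates the output.
---

## Workflow (MUST follow in order)
1. Gather requirements: target ZK version (9 or 10), MVC or MVVM, layout preferences.
2. Generate ZUL using only components listed in references/ui-to-component-mapping.md
   and the base templates under assets/.
3. Validate the result with validate-zul.py; on failure, fix and re-validate.
4. Generate the matching Java controller (Composer or ViewModel).
```

The imperative "MUST follow in order" phrasing is the mechanism that turns a suggestion into a constraint the model reliably obeys.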
Structuring the context (RAG in practice): To prevent the LLM from guessing components, we feed it an exact UI-to-component mapping (references/ui-to-component-mapping.md) and base XML templates (assets/). By providing these reference assets directly within the skill, we minimize the LLM's chance of making up invalid UI tags or layout structures. It doesn't need to guess how an MVVM ViewModel should look; it just follows mvvm-pattern-structure.zul.

Designing a Deterministic Workflow for LLMs (The 4-Step Process)

Why does free-form prompting fail for complex UI generation? Because generating a full UI requires multiple context switches: understanding the layout, mapping the components, writing the XML, validating the schema, and finally wiring the backend controller. To handle this, zul-writer uses dual input modes (text vs. image), natively supporting both descriptive text requirements and direct image inputs (like mockups or screenshots). Here is the deterministic workflow the skill enforces:

1. Requirement gathering and visual analysis: If an image is provided, the agent performs a visual analysis to identify layouts, tables, and buttons. It then asks necessary clarifying questions: Target ZK version (9 or 10)? MVC or MVVM? Layout preferences?
2. Context-aware generation: The agent generates the ZUL using the exact component mappings and base XML templates provided in the assets/ directory.
3. Local validation: (Covered in the next section.)
4. Controller generation: Ensuring the Java code (Composer or ViewModel) is generated to match the IDs and bindings of the generated ZUL perfectly.

Trust, But Verify: Validating AI Output

In a professional engineering workflow, you cannot blindly trust AI-generated code. XML-based languages are particularly prone to LLMs inventing invalid attributes or placing valid attributes on the wrong tags, e.g., putting an iconSclass on a textbox.

Why local script validation (cost and efficiency)? You might think: "Why not just ask the LLM to validate its own code against the XSD?" Validating against massive XSD schemas via LLM prompts consumes huge amounts of tokens, takes too long, and is prone to "sycophancy" (the LLM telling you it looks fine when it doesn't). Offloading this to a local Python script is deterministic, vastly cheaper, and significantly faster.

The zul-writer skill employs a local Python validation script (validate-zul.py) featuring a four-layer validation strategy:

• Layer 1: XML well-formedness.
• Layer 2: XSD schema validation.
• Layer 3: Attribute placement checks (catching context-specific errors).
• Layer 4: ZK version-specific compatibility checks.

The agentic loop: If the local script throws an error, the agent intercepts the stack trace, understands what went wrong, and self-corrects the ZUL file before presenting the final code to the developers.

Test-Driven AI Development

Building an AI workflow requires applying traditional software engineering practices — specifically, testing.

Testing the agent with Google Stitch and human-in-the-loop: To test zul-writer, I used Google Stitch to rapidly generate diverse UI screenshots to serve as test inputs. The iteration loop looks like this:

1. Feed the Stitch-generated image into zul-writer.
2. Manually review the generated ZUL output for layout accuracy and component misuse.
3. Identify the AI's "bad habits" and write explicit rules/constraints into SKILL.md to prevent future recurrences. (This is prompt optimization in action.)

Codebase testing: The repository includes a test/ directory with known good and bad ZUL files to independently verify the Python validation script. Furthermore, a zulwriter-showcase/ gallery serves as a runnable integration test to prove that the AI-generated UIs (like enterprise Kanban boards and feedback dashboards) actually render correctly.

Developer pro-tip: During the development of the zul-writer skill, managing file changes can be tedious.
Instead of repeatedly copy-pasting the skill directory into Claude Code's skill folder every time you make a change, use a macOS symbolic link (ln -s /path/to/your/project ~/.claude/skills/zul-writer) to point ~/.claude/skills/zul-writer directly to your actual local project directory. This single trick saves endless context switching and allows for instant testing during development!

The ZUL-Writer Showcase

The screenshot generated by Stitch:

The ZUL page generated by zul-writer:

As you can see, the generated result is very similar to the mockup. But what makes the result particularly useful is that the generated page is not just a generic HTML layout. The agent understands the ZK component ecosystem and generates the interface using ZK components and icon libraries. As a result, the generated page is usually very close to what a developer would write manually. Layouts, components, and event hooks are already structured correctly for a typical ZK application. Developers typically only need to:

• Adjust minor UI details
• Refine component properties
• Implement the actual business logic inside the generated composer

In practice, this reduces a large portion of repetitive scaffolding work and allows developers to focus on application logic rather than UI boilerplate.

Conclusion and Key Takeaways

Large language models are becoming increasingly capable of generating code, but producing reliable results in real development environments usually requires additional structure. In this article, we explored how an agent skill can guide the model to generate framework-specific UI code and validate the output through simple checks such as XML and XSD validation. While this example focuses on generating ZUL pages and Java controller templates, the same approach can be applied to many other libraries and technologies. By combining LLM prompts, domain knowledge, and lightweight validation, developers can build agent skills that automate repetitive scaffolding tasks.
Hopefully, this article provides some ideas and inspiration for building similar agent skills for the frameworks and tools you use in your own projects. Also, if you are interested in trying out the ZUL-writer, it is available on GitHub.
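As a closing illustration, the first and third layers of the four-layer validation strategy described earlier (well-formedness and attribute placement) can be approximated with stdlib Python. This is a sketch, not the actual validate-zul.py, and the allowed-attribute table is a made-up fragment of ZK's component model.

```python
# Sketch of validation layers 1 and 3 (not the real validate-zul.py).
# ALLOWED_ATTRS is a made-up fragment of ZK's component/attribute model.
import xml.etree.ElementTree as ET

ALLOWED_ATTRS = {
    "textbox": {"id", "value", "width"},
    "button": {"id", "label", "iconSclass"},
}

def validate_zul(source: str) -> list[str]:
    """Return a list of validation errors (empty list = valid)."""
    try:
        root = ET.fromstring(source)            # layer 1: well-formedness
    except ET.ParseError as e:
        return [f"not well-formed: {e}"]
    errors = []
    for elem in root.iter():                    # layer 3: attribute placement
        allowed = ALLOWED_ATTRS.get(elem.tag)
        if allowed is None:
            continue                            # unknown tag: handled by layer 2 (XSD)
        for attr in elem.attrib:
            if attr not in allowed:
                errors.append(f"<{elem.tag}> does not support '{attr}'")
    return errors

print(validate_zul('<window><textbox iconSclass="z-icon-user"/></window>'))
# ["<textbox> does not support 'iconSclass'"]
```

Returning a list of errors, rather than raising on the first one, is what lets the agent fix several problems in a single self-correction pass.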

By Hawk Chen
Essential Techniques for Production Vector Search Systems, Part 4: Multi-Vector Search

After implementing vector search systems at multiple companies, I wanted to document efficient techniques that can be very helpful for successful production deployments. I want to present these techniques by showcasing when to apply each one, how they complement each other, and the trade-offs they introduce. This is a multi-part series that introduces the techniques one by one, with code snippets to quickly test each. Before we get into the real details, let us look at the prerequisites and setup. For ease of understanding and use, I am using the free cloud tier from Qdrant for all of the demonstrations below.

Steps to Set Up Qdrant Cloud

Step 1: Get a Free Qdrant Cloud Cluster

1. Sign up at https://cloud.qdrant.io.
2. Create a free cluster: click "Create Cluster," select Free Tier, choose a region closest to you, and wait for the cluster to be provisioned.
3. Capture your credentials:
   • Cluster URL: https://xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.us-east.aws.cloud.qdrant.io:6333
   • API key: click "API Keys" → "Generate" → copy the key.

Step 2: Install Python Dependencies

Shell

pip install qdrant-client fastembed numpy python-dotenv

Recommended versions:

• qdrant-client >= 1.7.0
• fastembed >= 0.2.0
• numpy >= 1.24.0
• python-dotenv >= 1.0.0

Step 3: Set Environment Variables or Create a .env File

Shell

# Add to your ~/.bashrc or ~/.zshrc
export QDRANT_URL="https://your-cluster-url.cloud.qdrant.io:6333"
export QDRANT_API_KEY="your-api-key-here"

Alternatively, create a .env file in the project directory with the following content. Remember to add .env to your .gitignore to avoid committing credentials.

Plain Text

# .env file
QDRANT_URL=https://your-cluster-url.cloud.qdrant.io:6333
QDRANT_API_KEY=your-api-key-here

Step 4: Verify Connection

We can verify the connection to the Qdrant cluster with the following script. From this point onward, I am assuming the .env setup is complete.
Python

from qdrant_client import QdrantClient
from dotenv import load_dotenv
import os

# Load environment variables from .env file
load_dotenv()

# Initialize client
client = QdrantClient(
    url=os.getenv("QDRANT_URL"),
    api_key=os.getenv("QDRANT_API_KEY"),
)

# Test connection
try:
    collections = client.get_collections()
    print("Connected successfully!")
    print(f"Current collections: {len(collections.collections)}")
except Exception as e:
    print(f"Connection failed: {e}")
    print("Check your .env file has QDRANT_URL and QDRANT_API_KEY")

Expected output:

Plain Text

python verify-connection.py
Connected successfully!
Current collections: 2

Now that we have the setup out of the way, we can get into the meat of the article. Before the deep dive into multi-vector search, let us look at a high-level overview of the techniques covered in this multi-part series.

Technique | Problems Solved | Performance Impact | Complexity
Hybrid Search | Purely semantic search misses exact matches. | Large accuracy increase, close to 16% | Medium
Binary Quantization | Memory costs scale linearly with data. | 40x memory reduction, 15% faster | Low
Filterable HNSW | Post-filtering wastes computation. | 5x faster filtered queries | Medium
Multi-Vector Search | A single embedding cannot capture the importance of various fields. | Handles queries across multiple fields (such as title vs. description); requires about twice the storage. | Medium
Reranking | Vector search is optimized for speed over precision. | Deeper semantic understanding, 15-20% ranking improvement | High

Keep in mind that production systems typically combine two to four of these techniques. For example, a typical e-commerce website might use hybrid search, binary quantization, and filterable HNSW. We covered hybrid search in the first part of the series, binary quantization in the second part, and filterable HNSW in the third part. In this part, we will cover multi-vector search.
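At its core, multi-vector search embeds each field separately and fuses the per-field similarity scores with field weights. The fusion step can be sketched in a few lines; the vectors and weights below are toy values, not real embeddings.

```python
# Toy sketch of multi-vector score fusion: each field (title, description)
# has its own embedding; the final score is a weighted sum of per-field
# cosine similarities. Vectors and weights here are made-up toy values.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def fused_score(query_vecs, doc_vecs, weights):
    """Weighted sum of per-field cosine similarities."""
    return sum(
        weights[field] * cosine(query_vecs[field], doc_vecs[field])
        for field in weights
    )

query = {"title": np.array([1.0, 0.0]), "description": np.array([0.6, 0.8])}
doc = {"title": np.array([1.0, 0.0]), "description": np.array([0.0, 1.0])}
weights = {"title": 0.6, "description": 0.4}

print(round(fused_score(query, doc, weights), 3))  # 0.92
```

In Qdrant this per-field storage is implemented with named vectors on a single point, so a document matching strongly on either field can still rank well overall.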
Multi Vector Search Before we get into multi-vector search, we should understand that single vector search treats all the text fields equally. The problem with this approach is that there is a high chance of missing the structural importance of the various fields. For example, a product titled "Engine Oxygen Sensor" is more important for keyword matching than a detailed description mentioning "sensor" buried in specifications. High-Level Conceptual Flow Diagram for Multi-Vector Search Let us look at how the query vector is used with multiple fields and related vectors to arrive at a fusion score as an output. Let us now take a look at it in more detail with the code below. Python """ Example usage of the multi_vector module. This demonstrates Named Vectors (Multi-Field Vector Search) with Qdrant, and shows concrete value: when title-only or description-only search misses relevant results, and how multi-vector fixes it. """ from multi_vector import ( multi_vector_search, single_vector_search, display_multi_vector_results, get_qdrant_client, get_collection_vector_names, create_demo_collection, cleanup_demo_collection, ) from dotenv import load_dotenv load_dotenv() client = get_qdrant_client() EXISTING_COLLECTION_NAME = "automotive_parts" DEMO_COLLECTION_NAME = "multi_vector_demo" # --- Collection setup --- available_vectors = get_collection_vector_names(EXISTING_COLLECTION_NAME, client) print("=" * 80) print("COLLECTION CONFIGURATION CHECK") print("=" * 80) use_demo_collection = False if available_vectors: print(f"✓ Found named vectors in '{EXISTING_COLLECTION_NAME}': {', '.join(available_vectors)}") COLLECTION_NAME = EXISTING_COLLECTION_NAME vector_names = available_vectors[:2] if len(vector_names) == 1: vector_names = [vector_names[0], vector_names[0]] weights = {name: 1.0 / len(vector_names) for name in vector_names} else: print(f"⚠️ No named vectors in '{EXISTING_COLLECTION_NAME}'. 
Using demo collection.") print() if create_demo_collection(DEMO_COLLECTION_NAME, client, force_recreate=False): COLLECTION_NAME = DEMO_COLLECTION_NAME vector_names = ["title", "description"] weights = {"title": 0.6, "description": 0.4} use_demo_collection = True print("Demo collection ready. Running value demonstrations.\n") else: print("Failed to create demo collection. Exiting.") exit(1) LIMIT = 5 def _name(payload): return payload.get("part_name", payload.get("name", "Unknown")) def run_value_demo(query: str, title_hint: str, description_hint: str): """Run title-only, description-only, and multi-vector search; show misses and value.""" title_results = single_vector_search( COLLECTION_NAME, query, vector_name="title", client=client, limit=LIMIT ) desc_results = single_vector_search( COLLECTION_NAME, query, vector_name="description", client=client, limit=LIMIT ) multi_results = multi_vector_search( COLLECTION_NAME, query, vector_names=vector_names, weights=weights, client=client, limit=LIMIT ) title_ids = {r["id"] for r in title_results} desc_ids = {r["id"] for r in desc_results} # Items in description top but not in title top → "missed by title-only" missed_by_title = [r for r in desc_results if r["id"] not in title_ids] # Items in title top but not in description top → "missed by description-only" missed_by_desc = [r for r in title_results if r["id"] not in desc_ids] # Items in multi top 3 that weren't in both single-field top 3 title_top3_ids = {r["id"] for r in title_results[:3]} desc_top3_ids = {r["id"] for r in desc_results[:3]} both_top3 = title_top3_ids & desc_top3_ids multi_only_top = [r for r in multi_results[:3] if r["id"] not in both_top3] print(f"Query: \"{query}\"") print("-" * 80) print("Title-only (top {}):".format(LIMIT)) for i, r in enumerate(title_results, 1): print(f" {i}. {_name(r['payload'])} (score: {r['score']:.4f})") print() print("Description-only (top {}):".format(LIMIT)) for i, r in enumerate(desc_results, 1): print(f" {i}. 
{_name(r['payload'])} (score: {r['score']:.4f})") print() print("Multi-vector / fused (top {}):".format(LIMIT)) for i, r in enumerate(multi_results, 1): print(f" {i}. {_name(r['payload'])} (fused: {r['score']:.4f})") print() if missed_by_title: print("Value of multi-vector:") print(f" • Found by DESCRIPTION but not in title-only top {LIMIT}:") for r in missed_by_title[:3]: print(f" - {_name(r['payload'])} (description score: {r['score']:.4f})") print(f" → {title_hint}") if missed_by_desc: print(f" • Found by TITLE but not in description-only top {LIMIT}:") for r in missed_by_desc[:3]: print(f" - {_name(r['payload'])} (title score: {r['score']:.4f})") print(f" → {description_hint}") if not missed_by_title and not missed_by_desc and multi_only_top: print("Value of multi-vector:") print(" • Multi-vector ranking surfaces the best overall match even when single-field rankings differ.") print() # --- Example 1: Query where DESCRIPTION field shines --- print("=" * 80) print("EXAMPLE 1: Query Where DESCRIPTION Field Adds Value") print("=" * 80) print("Query: \"device that monitors exhaust gases\"") print("Many relevant items describe exhaust monitoring in the description but not in the short title.") print("Title-only search can miss them; description-only and multi-vector find them.\n") run_value_demo( "device that monitors exhaust gases", title_hint="Title-only misses these because 'exhaust' is in the description, not the title.", description_hint="Description-only finds these; multi-vector combines both signals.", ) # --- Example 2: Query where TITLE field shines --- print("=" * 80) print("EXAMPLE 2: Query Where TITLE Field Adds Value") print("=" * 80) print("Query: \"brake pad\"") print("Products with 'brake pad' in the title are easy for title search; description may be generic.") print("Title-only finds them; multi-vector keeps them at the top.\n") run_value_demo( "brake pad", title_hint="Description-only may rank these lower; title has the exact phrase.", 
description_hint="Multi-vector keeps title matches at the top while still using description signal.", ) # --- Example 3: Query that needs DESCRIPTION (short title) --- print("=" * 80) print("EXAMPLE 3: Query That Matches DESCRIPTION, Not Title") print("=" * 80) print("Query: \"device that measures air flow\"") print("MAF Sensor has a short title ('MAF Sensor'); the phrase 'air flow' is in the description.") print("Description-only finds it; title-only may miss or rank it lower.\n") run_value_demo( "device that measures air flow", title_hint="Title 'MAF Sensor' doesn't contain 'air flow'; description does.", description_hint="Multi-vector finds this item by combining title and description.", ) # --- Example 4: Multi-vector search only (summary) --- print("=" * 80) print("EXAMPLE 4: Multi-Vector Search (Single View)") print("=" * 80) print("One query, multi-vector result: combines title and description for best relevance.\n") multi_only = multi_vector_search( collection_name=COLLECTION_NAME, query="engine sensor", vector_names=vector_names, weights=weights, client=client, limit=5, ) display_multi_vector_results( multi_only, "engine sensor", show_fields=["part_name", "part_id", "category", "description"], ) # --- Summary --- print("\n" + "=" * 80) print("SUMMARY: When Multi-Vector Search Adds Value") print("=" * 80) print(""" • Example 1: \"exhaust monitoring\" → Description field finds O2/exhaust items that title-only misses. Multi-vector includes them and ranks well. • Example 2: \"brake pad\" → Title field finds brake pads; multi-vector keeps them at top while still using description. • Example 3: \"measures air flow\" → Description finds MAF/air flow sensor (title is just \"MAF Sensor\"). Multi-vector combines both. • Takeaway: Users search in different ways. Single-vector (title OR description) can miss relevant results. Multi-vector (title + description, fused) covers both short/keyword and detailed/contextual queries. 
""") if use_demo_collection: print("=" * 80) print("DEMO COLLECTION") print("=" * 80) print(f"Collection '{DEMO_COLLECTION_NAME}' is available for further runs.") print(f"To delete: cleanup_demo_collection('{DEMO_COLLECTION_NAME}', client)") print("To force refresh data: create_demo_collection('{DEMO_COLLECTION_NAME}', client, force_recreate=True)") Now let us look at it with the help of the output for multi vector search Plain Text ================================================================================ EXAMPLE 1: Query Where DESCRIPTION Field Adds Value ================================================================================ Query: "device that monitors exhaust gases" Many relevant items describe exhaust monitoring in the description but not in the short title. Title-only search can miss them; description-only and multi-vector find them. Query: "device that monitors exhaust gases" -------------------------------------------------------------------------------- Title-only (top 5): 1. Air Flow Meter (score: 0.4396) 2. Engine Oxygen Sensor (score: 0.4032) 3. Knock Sensor (score: 0.4024) 4. Coolant Temperature Sensor (score: 0.3895) 5. O2 Sensor (score: 0.3571) Description-only (top 5): 1. O2 Sensor (score: 0.8199) 2. Exhaust Oxygen Sensor (score: 0.6804) 3. Catalytic Converter (score: 0.6628) 4. Engine Oxygen Sensor (score: 0.6308) 5. Air Flow Meter (score: 0.6086) Multi-vector / fused (top 5): 1. O2 Sensor (fused: 0.5422) 2. Air Flow Meter (fused: 0.5072) 3. Engine Oxygen Sensor (fused: 0.4943) 4. Exhaust Oxygen Sensor (fused: 0.4761) 5. Coolant Temperature Sensor (fused: 0.4602) Value of multi-vector: • Found by DESCRIPTION but not in title-only top 5: - Exhaust Oxygen Sensor (description score: 0.6804) - Catalytic Converter (description score: 0.6628) → Title-only misses these because 'exhaust' is in the description, not the title. 
  • Found by TITLE but not in description-only top 5:
    - Knock Sensor (title score: 0.4024)
    - Coolant Temperature Sensor (title score: 0.3895)
    → Description-only finds these; multi-vector combines both signals.

================================================================================
EXAMPLE 2: Query Where TITLE Field Adds Value
================================================================================
Query: "brake pad"
Products with 'brake pad' in the title are easy for title search; description may be generic.
Title-only finds them; multi-vector keeps them at the top.

Query: "brake pad"
--------------------------------------------------------------------------------
Title-only (top 5):
  1. Performance Brake Pads (score: 0.6525)
  2. Brake Pad Set (score: 0.6431)
  3. Brake Rotor (score: 0.5653)
  4. Shock Absorber (score: 0.2641)
  5. Radiator (score: 0.1838)

Description-only (top 5):
  1. Brake Rotor (score: 0.3998)
  2. Performance Brake Pads (score: 0.3483)
  3. Shock Absorber (score: 0.3471)
  4. Brake Pad Set (score: 0.3052)
  5. Catalytic Converter (score: 0.1464)

Multi-vector / fused (top 5):
  1. Performance Brake Pads (fused: 0.5308)
  2. Brake Pad Set (fused: 0.5080)
  3. Brake Rotor (fused: 0.4991)
  4. Shock Absorber (fused: 0.2973)
  5. Radiator (fused: 0.1560)

Value of multi-vector:
  • Found by DESCRIPTION but not in title-only top 5:
    - Catalytic Converter (description score: 0.1464)
    → Description-only may rank these lower; title has the exact phrase.
  • Found by TITLE but not in description-only top 5:
    - Radiator (title score: 0.1838)
    → Multi-vector keeps title matches at the top while still using description signal.

================================================================================
EXAMPLE 3: Query That Matches DESCRIPTION, Not Title
================================================================================
Query: "device that measures air flow"
MAF Sensor has a short title ('MAF Sensor'); the phrase 'air flow' is in the description.
Description-only finds it; title-only may miss or rank it lower.

Query: "device that measures air flow"
--------------------------------------------------------------------------------
Title-only (top 5):
  1. Air Flow Meter (score: 0.5513)
  2. O2 Sensor (score: 0.3919)
  3. Mass Air Flow Sensor (score: 0.3841)
  4. Engine Oxygen Sensor (score: 0.3820)
  5. Exhaust Oxygen Sensor (score: 0.3720)

Description-only (top 5):
  1. Air Flow Meter (score: 0.6539)
  2. O2 Sensor (score: 0.6490)
  3. Exhaust Oxygen Sensor (score: 0.5451)
  4. Mass Air Flow Sensor (score: 0.5413)
  5. Engine Oxygen Sensor (score: 0.4760)

Multi-vector / fused (top 5):
  1. Air Flow Meter (fused: 0.5923)
  2. O2 Sensor (fused: 0.4947)
  3. Mass Air Flow Sensor (fused: 0.4470)
  4. Exhaust Oxygen Sensor (fused: 0.4412)
  5. Engine Oxygen Sensor (fused: 0.4196)

Value of multi-vector:
  • Multi-vector ranking surfaces the best overall match even when single-field rankings differ.

================================================================================
EXAMPLE 4: Multi-Vector Search (Single View)
================================================================================
One query, multi-vector result: combines title and description for best relevance.

Multi-Vector Search Results for: 'engine sensor'
================================================================================
Found 5 results using multi-vector search (weighted fusion)

1. Oxygen Sensor for Engine
   Part_name: Engine Oxygen Sensor
   Part_id: DEMO-007
   Category: Engine Components
   Description: High-precision oxygen sensor for engine exhaust monitoring. Detects oxygen levels in exhaust gases t...
   Fused Score: 0.7479
--------------------------------------------------------------------------------
2. Engine Knock Sensor
   Part_name: Knock Sensor
   Part_id: DEMO-009
   Category: Engine Components
   Description: Piezoelectric sensor that detects engine knock or detonation. Protects engine by adjusting ignition ...
   Fused Score: 0.6744
--------------------------------------------------------------------------------
3. Engine Coolant Temperature Sensor
   Part_name: Coolant Temperature Sensor
   Part_id: DEMO-008
   Category: Engine Components
   Description: Thermistor sensor that monitors coolant temperature. Sends signal to ECU for fuel and ignition tunin...
   Fused Score: 0.6651
--------------------------------------------------------------------------------
4. O2 Sensor
   Part_name: O2 Sensor
   Part_id: DEMO-003
   Category: Engine Components
   Description: Device that monitors oxygen levels in exhaust gases. Critical for fuel mixture control and emission ...
   Fused Score: 0.5716
--------------------------------------------------------------------------------
5. Wideband Oxygen Sensor
   Part_name: Exhaust Oxygen Sensor
   Part_id: DEMO-004
   Category: Engine Components
   Description: Precision sensor for monitoring exhaust gas composition. Used for engine tuning and emission diagnos...
   Fused Score: 0.5532
--------------------------------------------------------------------------------

================================================================================
SUMMARY: When Multi-Vector Search Adds Value
================================================================================
• Example 1: "exhaust monitoring" → Description field finds O2/exhaust items that title-only misses. Multi-vector includes them and ranks well.
• Example 2: "brake pad" → Title field finds brake pads; multi-vector keeps them at top while still using description.
• Example 3: "measures air flow" → Description finds MAF/air flow sensor (title is just "MAF Sensor"). Multi-vector combines both.
• Takeaway: Users search in different ways. Single-vector (title OR description) can miss relevant results. Multi-vector (title + description, fused) covers both short/keyword and detailed/contextual queries.

Benefits

As you can clearly see from the results, multi-vector search handles different query styles seamlessly.
Some queries work well with title-only vectors while description-only search struggles; for others, the title contributes little and the description vector is the stronger signal. Fused scoring lets multi-vector search adapt to the query style automatically and prevents missed results regardless of how users phrase their searches.

Costs

The biggest cost driver of multi-vector search is its roughly 2x storage overhead. In the results shown above, the title vector and the description vector are stored separately, so if a catalog of one million parts needs 1.5 GB with a single vector, the same catalog needs roughly 3 GB with two. Other costs include longer indexing time for the additional fields, noticeably higher query latency from running two vector searches plus the fusion-scoring logic, and the extra implementation complexity of the fusion logic itself, which depends on the specific search use case.

When to Use
- When you have structured product data with many distinct fields.
- When search behavior differs across users.
- When relevant product detail is spread across more than one field.
- When latency and complexity are manageable and storage is not a concern.

When Not to Use
- When there is no structural distinction between fields.
- When most of the fields are semantically similar.
- When latency or storage constraints are tight.
- When a single vector already does the job.

Efficiency Comparison (From the Results)

Let us quickly compare the efficiency based on the results.
| Query Type | Title only | Description only | Multi-vector | Coverage |
|---|---|---|---|---|
| Brake Pad | 0.6525 | 0.3483 | 0.5308 | Preserves title quality |
| monitor exhaust gases | 0.4396 | 0.8199 | 0.5422 | Uses description |
| Coverage | 60% | 70% | 90% | Found items others missed |

Performance Characteristics

Based on the results, the performance characteristics are as follows:

| Metric | Title only | Description only | Multi-vector | Evidence from the data |
|---|---|---|---|---|
| Short-query accuracy | Excellent | Poor | Excellent | Search for "Brake Pad" |
| Long-query accuracy | Poor | Excellent | Excellent | Search for "monitors exhaust" |
| Query latency | Low | Low | Roughly double | Added latency for dual search + fusion |
| Adaptability | Fixed | Fixed | Automatic | Adjusts to the query style |

Conclusion

Adopting multi-vector search is primarily driven by the needs of the use case. It surfaces the value of each field: as the results show, title-only search completely missed items that multi-vector search captured. If the application can accept the trade-off of roughly 2x storage and added latency in exchange for comprehensive coverage across all query styles, then multi-vector search is the way to go. In the next and final part of the series, we will look into reranking and recap all of the techniques and their applications.
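The fused scores in the output above come from combining per-field similarities. The exact fusion used here is not spelled out in the article, so the sketch below assumes a simple normalized weighted average; the function and weight names are hypothetical, and it illustrates the mechanic rather than reproducing the exact numbers above.

```python
# Minimal sketch of weighted score fusion for multi-vector search.
# Assumes per-field similarity scores are already computed; the helper
# name and the 0.5/0.5 weights are illustrative assumptions.

def fuse_scores(field_scores: dict, weights: dict) -> float:
    """Combine per-field similarity scores into one fused score
    via a normalized weighted average."""
    total_weight = sum(weights[f] for f in field_scores)
    return sum(field_scores[f] * weights[f] for f in field_scores) / total_weight

# Example: title and description weighted equally.
fused = fuse_scores(
    {"title": 0.6525, "description": 0.3483},
    {"title": 0.5, "description": 0.5},
)
```

Tuning the weights shifts the balance: weighting the title higher favors short keyword queries, while weighting the description higher favors long contextual queries.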

By Pavan Vemuri DZone Core CORE
Standards as the Invisible Infrastructure of Software

Standards are often treated as bureaucracy — something slow, heavy, and occasionally disconnected from “real engineering.” Yet if we look at history with a bit more rigor, that narrative collapses quickly. Software development is not exempt from the forces that shaped civilization. It is, in fact, one of the latest chapters in a very old story: the story of standardization as a multiplier of human capability. The word “standard” itself comes from the Old French estandart — a rallying flag, a visible sign under which people gather. That origin is revealing. A standard is not merely documentation; it is a coordination mechanism. It allows independent actors to align around shared expectations. Without that alignment, scaling knowledge becomes nearly impossible. Human progress has always depended on standards. Writing systems transformed ephemeral speech into persistent knowledge. Once symbols were formalized, ideas could survive generations. When Isaac Newton famously wrote that he stood “on the shoulders of giants,” he was acknowledging accumulated knowledge preserved through standardized language, notation, and scholarly norms. Mathematics itself works only because its symbols and rules are shared and stable. The Industrial Revolution amplified this principle. Mass production did not emerge merely from better machines; it required interchangeable parts. That meant tolerances, measurements, and repeatable specifications. A bolt had to fit a nut regardless of where it was manufactured. Standards turned craftsmanship into scalable industry. Why should software be different? In the browser you are using right now, a silent agreement is in effect: the World Wide Web Consortium defines the specifications for HTML, CSS, and related technologies. Browser vendors interpret and implement these documents independently, yet the web works because they converge on shared behavior. Imperfectly? Sometimes. 
But without a common reference, the web would fragment into incompatible silos. A typical software standard has multiple structural elements. First, there is the specification — the formal description of APIs, semantics, and expected behavior. A specification is not an implementation. It defines what must happen, not how it must be achieved. Second, there is the committee or expert group responsible for evolving that specification. This governance layer is often misunderstood. Its purpose is not control; it is consensus-building. Diverse stakeholders — vendors, independent experts, and community representatives — debate trade-offs so that no single company dictates the ecosystem. Third, there are vendors or implementers. These are the runtime engines, frameworks, or tools that bring the specification to life. Multiple implementations introduce competition, innovation, and resilience. Finally, there is verification. In some ecosystems, partial conformance is tolerated. In others, compliance is binary. In the Java ecosystem, the Technology Compatibility Kit (TCK) model enforces strict alignment: you either pass all required tests, or you cannot claim compatibility. This binary approach dramatically reduces ambiguity and vendor fragmentation. To understand the power of standards in modern software, consider the Java platform. Java is not defined solely by a single company’s runtime. It is defined by specifications maintained through the Java Community Process. Multiple vendors implement the language and platform — yet all must adhere to the same specification and compatibility requirements. This is what makes Java portable in practice rather than merely in marketing. Without a standard, “Java” would become a brand attached to incompatible dialects. With a standard, it becomes a contract. The same architectural logic applies to Jakarta EE, stewarded by the Eclipse Foundation. Jakarta EE is not just a collection of APIs. 
It is a coordinated effort: specifications, open governance, multiple compatible implementations, and TCK validation. This structure enables innovation at the implementation layer while protecting portability at the application layer. Standards are not limited to languages and platforms. They shape architectural thinking itself. Design patterns, for example, became influential not because they were mandated, but because they were documented, named, and conceptually standardized. A shared vocabulary — “Factory,” “Strategy,” “Observer” — allows teams across continents to collaborate efficiently. In architecture, REST became dominant because Roy Fielding formalized its constraints. Naming and formalization are acts of standardization. At this point, it is worth distinguishing between related but distinct concepts: open source and open standards. Open source refers to software whose source code is publicly available, typically under licenses that allow inspection, modification, and redistribution. It is an implementation model. An open standard, on the other hand, refers to a publicly accessible specification developed through a transparent and inclusive process. It is a governance model. You can have open source without an open standard — a single project, fully transparent, but controlled by one vendor without formal specification processes. You can also have an open standard implemented by proprietary software. The most resilient ecosystems combine both: open governance for the specification and open implementations that compete on quality and performance. When these two forces align, something powerful happens. Developers gain portability. Organizations reduce vendor lock-in. Innovation accelerates because differentiation shifts from reinventing interfaces to optimizing implementations. Standards also provide long-term stability. In a world obsessed with rapid iteration, it is tempting to view standards as a form of friction. 
But from an engineering perspective, stability is not the enemy of innovation; it is its enabler. When APIs remain predictable, teams can invest in higher-level abstractions, tooling, and architecture without fear of constant foundational shifts. Critics often argue that standards slow progress. The more precise question is: slow progress compared to what? A proprietary ecosystem may move faster initially, but it risks fragmentation, lock-in, and incompatibility. Standards introduce negotiation and consensus, which can feel slower, yet they produce ecosystems that endure for decades. Consider the analogy with manufacturing. Interchangeable parts may have required additional upfront coordination, but they enabled exponential scaling. Software standards function similarly. They are coordination overhead that unlocks systemic scale. For software engineers, standards are not merely abstract governance constructs. They are daily tools. Every HTTP request relies on standardized semantics. Every SQL query depends on decades of evolving agreement. Every JVM-based application trusts compatibility guarantees defined outside its codebase. If we step back, the skeptical question becomes unavoidable: what would modern software look like without standards? Likely a patchwork of incompatible protocols, isolated frameworks, and fragile integrations. Standards are not glamorous. They rarely trend on social media. But they are the invisible infrastructure that allows distributed collaboration at a planetary scale. They enable thousands of engineers, across companies and continents, to build systems that interoperate reliably. In the end, standards are not about restriction; they are about shared foundations. They embody the same principle that enabled writing, science, and industrialization: codify knowledge, align expectations, and allow independent actors to build upon a stable base. Software is no different from any other engineering discipline. 
If history teaches us anything, it is this: progress scales when agreement scales. Standards are how agreement becomes durable.

By Otavio Santana DZone Core CORE
Understanding Custom Authorization Mechanisms in Amazon API Gateway and AWS AppSync

AWS provides Lambda-based authorization capabilities for both API Gateway and AppSync, each designed to secure a different API paradigm. Understanding where the two implementations overlap and where they diverge makes it much easier to choose the right one for a given API. Amazon API Gateway positions Lambda authorizers as a security checkpoint between incoming requests and backend integrations — whether Lambda functions or HTTP endpoints. The authorizer validates credentials, executes custom authentication workflows, and produces IAM policy documents that explicitly grant or deny access. These policies guide API Gateway’s decision to forward or reject requests to backend services. In contrast, AppSync integrates Lambda authorizers directly into the GraphQL request lifecycle, intercepting operations before resolver execution. The authorizer examines request credentials (tokens, headers, or other authentication artifacts) and returns an identity context object upon successful validation. This context propagates through the resolver chain, enabling fine-grained, context-aware authorization decisions at the data access layer.

Common Characteristics

The shared attributes of Lambda authorizers in AppSync and API Gateway emphasize their core security capabilities:

- Security Enforcement: Both validate incoming requests and enforce access-control decisions, ensuring requesters possess sufficient privileges to access protected resources.
- Serverless Execution Model: Authorization logic runs within AWS Lambda functions in both cases, providing a consistent serverless compute foundation for custom authentication workflows.
- Flexible Authorization Logic: Developers can implement bespoke authentication and authorization rules within the Lambda function to meet diverse security requirements and business needs.
- AWS Service Integration: Both authorizer types integrate seamlessly with the broader AWS ecosystem.
AppSync can leverage DynamoDB or Lambda as data sources, while API Gateway connects to Lambda functions and HTTP endpoints.
- IAM-Based Access Decisions: Both services use IAM policy frameworks to govern access permissions. The authorizer generates policy documents that explicitly define allowed or denied actions for authenticated principals.

Key Distinctions

These distinctions clarify the operational scope, timing, and architecture of Lambda authorizers in AppSync versus API Gateway:

- Functional Scope: AppSync authorizers operate at the GraphQL operation layer, making access decisions before resolver invocation. API Gateway authorizers function at the API route level, protecting endpoints before backend integration occurs.
- Execution Timing and Context Flow: AppSync authorizers produce an identity context object that propagates through the resolver chain. API Gateway triggers authorizers before backend integration, generating IAM policies that determine request disposition.
- Architectural Integration: AppSync authorizers integrate directly into the GraphQL resolver pipeline within GraphQL-specific contexts. API Gateway authorizers function as infrastructure components alongside endpoints, deployment stages, and API configurations.
- Data Format Expectations: AppSync authorizers process GraphQL requests, accessing operation types, field selections, and input arguments. API Gateway authorizers handle requests in formats dictated by integration types — JSON, form data, or others.
- Authorization Output Patterns: AppSync authorizers return identity context objects, enabling resolvers to make granular decisions. API Gateway authorizers generate explicit IAM policy documents defining precise access permissions for API resources.

These architectural differences influence implementation strategies and should guide service selection based on API paradigms, use cases, and authorization needs.
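The contrast in output patterns can be made concrete with two minimal response shapes. The field names follow the contracts described in this article; the values (principal, ARN, roles) are illustrative placeholders.

```python
# Illustrative response shapes for the two authorizer types.
# Values are placeholders; field names follow the documented contracts.

# API Gateway Lambda authorizer: returns an explicit IAM policy document.
api_gateway_response = {
    "principalId": "user-123",
    "policyDocument": {
        "Version": "2012-10-17",
        "Statement": [{
            "Action": "execute-api:Invoke",
            "Effect": "Allow",
            "Resource": "arn:aws:execute-api:us-east-1:123456789012:abc/prod/GET/items",
        }],
    },
    "context": {"role": "admin"},  # optional metadata passed to the backend
}

# AppSync Lambda authorizer: returns a boolean decision plus identity context.
appsync_response = {
    "isAuthorized": True,
    "resolverContext": {"userId": "user-123", "role": "admin"},
    "deniedFields": [],   # optional field-level denials
    "ttlOverride": 300,   # optional result-cache TTL in seconds
}
```

The API Gateway shape answers "what may this principal invoke," while the AppSync shape answers "is this request allowed, and what identity should resolvers see."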
Amazon API Gateway Authorization Flow

1. Client initiates a request to the API Gateway endpoint.
2. Lambda authorizer intercepts the request, executing authentication and authorization validation.
3. Upon successful authorization, the authorizer returns the IAM policy document defining access scope.
4. API Gateway evaluates the policy, allowing or denying the request.
5. Approved requests forward to the backend integration (Lambda, HTTP endpoint, etc.).
6. Backend processes the request and generates a response payload.
7. API Gateway returns the backend response to the client.

Lambda Authorizer Contract for API Gateway:

Input Parameters:
- event: Request metadata object containing HTTP headers, method, path parameters, and request context.
- context: Lambda execution context providing AWS account ID, request ID, and function metadata.

Output Parameters:
- principalId: String identifier representing the authenticated principal for tracking and auditing.
- policyDocument: IAM policy document defining allowed or denied actions on API resources.
- context: Optional key-value object for passing additional metadata downstream.

AWS AppSync Authorization Flow

1. Client submits GraphQL request to AppSync API endpoint.
2. Lambda authorizer intercepts the request, performing authentication and authorization validation.
3. Successful authorization produces an identity context object.
4. Request proceeds to the appropriate GraphQL resolver with the identity context.
5. Resolver executes data source operations (DynamoDB queries, Lambda invocations, etc.).
6. Resolver returns a GraphQL response to the client.

Lambda Authorizer Contract for AppSync:

Input Parameters:
- event: GraphQL request object containing operation payload, HTTP headers, and request metadata.
- context: Lambda execution context providing request ID, function version, and runtime information.
Output Parameters:
- isAuthorized: Boolean flag indicating the authorization decision (true/false).
- context: Identity context object propagated to downstream resolvers.
- resolverConfig: Optional object for customizing resolver behavior, including caching strategies and field-level authorization rules.

Use Case Scenarios

Lambda authorizers address diverse authentication and authorization requirements across both services. The following scenarios illustrate practical applications for each implementation:

API Gateway Lambda Authorizer Applications:
- Token-Based RESTful API Security: When building RESTful APIs that require route-specific protection, Lambda authorizers validate authentication tokens and generate IAM policies that control resource access. This pattern suits applications needing granular endpoint-level authorization with explicit access control definitions.
- External Authentication Provider Integration: Organizations that leverage OAuth 2.0, OpenID Connect, or other third-party identity providers can implement Lambda authorizers to validate external tokens, perform claims-based authorization, and map external identities to AWS IAM policies for seamless integration.
- Complex Business Rule Authorization: Applications that require authorization beyond standard IAM capabilities benefit from Lambda authorizers that can invoke external services, evaluate business rules, query databases, and apply multidimensional access control policies before granting API access.

AppSync Lambda Authorizer Applications:
- Field-Level GraphQL Authorization: GraphQL APIs requiring fine-grained data access control leverage AppSync Lambda authorizers to enforce schema-specific authorization rules.
The authorizer validates credentials and generates identity contexts that enable resolvers to make field-level and type-level access decisions based on user attributes.
- Multi-Factor Authentication Workflows: GraphQL applications implementing MFA can utilize Lambda authorizers to orchestrate multistep verification processes, integrating with external MFA providers, validating time-based tokens, or enforcing adaptive authentication policies based on request context and risk assessment.
- Federated Identity Resolution: Complex identity scenarios involving multiple authentication sources, user attribute aggregation, or cross-system identity federation benefit from AppSync Lambda authorizers that can resolve identities across disparate systems, enrich user contexts, and provide unified identity information to GraphQL resolvers.

API Gateway authorizers excel in RESTful API protection, third-party authentication integration, and policy-based access control. AppSync authorizers specialize in GraphQL-specific authorization patterns, context-aware data access control, and identity-enriched resolver workflows.
Quick Comparison

| Feature | API Gateway | AppSync |
|---|---|---|
| API Type | RESTful APIs | GraphQL APIs |
| Authorization Level | Route/endpoint level | Operation/field level |
| Output Format | IAM policy document | Identity context object |
| Caching | Built-in (TTL: 0-3600s) | Not built-in |
| Timeout | 30 seconds max | 10 seconds max |
| Use Case | Token validation, OAuth | Fine-grained data access |
| Integration Point | Before backend | Before resolver |
| Policy Generation | Explicit IAM policies | Context-based decisions |

Implementation Examples

API Gateway Lambda Authorizer (Python)

Python
import json
import jwt
from typing import Dict, Any

def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """API Gateway Lambda authorizer for JWT token validation"""
    try:
        # Extract token from Authorization header
        token = event['authorizationToken'].replace('Bearer ', '')
        # Validate JWT token (simplified - use proper validation in production)
        decoded = jwt.decode(token, 'your-secret-key', algorithms=['HS256'])
        # Extract user information
        principal_id = decoded['sub']
        # Generate IAM policy
        policy = generate_policy(principal_id, 'Allow', event['methodArn'])
        # Add context data (available in backend)
        policy['context'] = {
            'userId': decoded['sub'],
            'email': decoded.get('email', ''),
            'role': decoded.get('role', 'user')
        }
        return policy
    except jwt.ExpiredSignatureError:
        raise Exception('Unauthorized: Token expired')
    except jwt.InvalidTokenError:
        raise Exception('Unauthorized: Invalid token')
    except Exception as e:
        raise Exception(f'Unauthorized: {str(e)}')

def generate_policy(principal_id: str, effect: str, resource: str) -> Dict[str, Any]:
    """Generate IAM policy document"""
    return {
        'principalId': principal_id,
        'policyDocument': {
            'Version': '2012-10-17',
            'Statement': [
                {
                    'Action': 'execute-api:Invoke',
                    'Effect': effect,
                    'Resource': resource
                }
            ]
        }
    }

AppSync Lambda Authorizer (Node.js)

TypeScript
const jwt = require('jsonwebtoken');

exports.handler = async (event) => {
  try {
    // Extract token from request headers
    const token =
      event.authorizationToken.replace('Bearer ', '');

    // Validate JWT token
    const decoded = jwt.verify(token, process.env.JWT_SECRET);

    // Return authorization response with identity context
    return {
      isAuthorized: true,
      resolverContext: {
        userId: decoded.sub,
        email: decoded.email,
        roles: decoded.roles || [],
        permissions: decoded.permissions || []
      },
      deniedFields: [],  // Optional: fields to deny access
      ttlOverride: 300   // Optional: cache TTL in seconds
    };
  } catch (error) {
    console.error('Authorization failed:', error.message);
    // Return unauthorized response
    return {
      isAuthorized: false,
      resolverContext: {},
      deniedFields: ['*']  // Deny all fields
    };
  }
};

Performance and Cost Considerations

Caching Strategies
- API Gateway: Built-in authorizer result caching (TTL: 0-3600 seconds) significantly reduces Lambda invocations and latency.
- AppSync: No built-in caching; implement custom caching in the authorizer using ElastiCache or DynamoDB.
- Recommendation: Enable caching for high-traffic APIs to reduce costs and improve response times.

Cold Start Impact
- Lambda cold starts add 100-500 ms of latency to authorization.
- Mitigation: Use provisioned concurrency for critical APIs or implement connection pooling.
- Consider keeping authorizer functions warm with scheduled invocations.

Cost Optimization
- API Gateway caching reduces Lambda invocations by 90%+ for repeated requests.
- Typical cost: $0.20 per million authorizer invocations.
- A cache hit rate of 80% can cut authorization costs by roughly $0.16 per million requests.

Security Best Practices

Token Validation
- Always validate token signatures using proper cryptographic libraries.
- Verify token expiration (exp claim) and not-before (nbf claim).
- Check token issuer (iss) and audience (aud) claims.
- Implement token revocation checking for critical operations.

Secret Management
- Store JWT secrets and API keys in AWS Secrets Manager or Parameter Store.
- Rotate secrets regularly using automated rotation policies.
- Never hardcode secrets in Lambda function code.
- Use IAM roles for Lambda to
access secrets securely.

Error Handling
- Return generic error messages to clients (avoid leaking security details).
- Log detailed error information to CloudWatch for debugging.
- Implement rate limiting to prevent brute-force attacks.
- Use AWS WAF for additional protection against common attacks.

Example: Secure Secret Retrieval

Python
import boto3
import json
from functools import lru_cache

@lru_cache(maxsize=1)
def get_jwt_secret():
    """Retrieve and cache the JWT secret from Secrets Manager"""
    client = boto3.client('secretsmanager')
    response = client.get_secret_value(SecretId='jwt-secret')
    return json.loads(response['SecretString'])['secret']

Error Handling and Responses

API Gateway Error Responses

When authorization fails, API Gateway returns:
- 401 Unauthorized: Invalid or missing token
- 403 Forbidden: Valid token but insufficient permissions
- 500 Internal Server Error: Authorizer function error

AppSync Error Responses

AppSync returns GraphQL errors:

JSON
{
  "errors": [
    {
      "message": "Unauthorized",
      "errorType": "Unauthorized",
      "locations": [
        { "line": 1, "column": 1 }
      ]
    }
  ]
}

Custom Error Handling

Python
from typing import Any, Dict

def handle_authorization_error(error_type: str) -> Dict[str, Any]:
    """Log the specific failure, then raise a generic error"""
    error_messages = {
        'expired': 'Token has expired',
        'invalid': 'Invalid token format',
        'missing': 'Authorization token required',
        'insufficient': 'Insufficient permissions'
    }
    # Log detailed error for monitoring
    print(f"Authorization error: {error_messages.get(error_type, error_type)}")
    # Return generic error to client
    raise Exception('Unauthorized')

Limitations and Constraints

API Gateway Authorizer Limits
- Timeout: Maximum 30-second execution time
- Payload: Request/response limited to 10 KB
- Cache TTL: 0-3600 seconds (1 hour maximum)
- Concurrent Executions: Subject to Lambda account limits

AppSync Authorizer Limits
- Timeout: Maximum 10-second execution time
- Payload: Request limited to 1 MB
- No Built-in Caching: Must implement custom caching
- Resolver Context: Limited to 5 KB

Best Practices for Limits
- Keep authorizer
logic lightweight and fastImplement connection pooling for external service callsUse caching aggressively to avoid timeout issuesMonitor CloudWatch metrics for timeout and error rates Monitoring and Debugging Key Metrics to Monitor Invocation Count: Track authorization request volumeError Rate: Monitor failed authorizationsDuration: Track authorization latency (target: <100ms)Cache Hit Rate: For API Gateway authorizersThrottles: Identify capacity issues CloudWatch Logs Python import logging logger = logging.getLogger() logger.setLevel(logging.INFO) def lambda_handler(event, context): logger.info(f"Authorization request: {event['methodArn']}") # Authorization logic logger.info(f"Authorization granted for user: {principal_id}") Debugging Tips Enable detailed CloudWatch logging for authorizer functionsUse AWS X-Ray for distributed tracingTest authorizers locally using SAM CLIImplement structured logging for easier analysis across disparate systems, enrich user contexts, and provide unified identity information to GraphQL resolvers. API Gateway authorizers excel in RESTful API protection, third-party authentication integration, and policy-based access control. AppSync authorizers specialize in GraphQL-specific authorization patterns, context-aware data access control, and identity-enriched resolver workflows. Summary Lambda authorizers serve distinct yet complementary roles across AWS's API service portfolio. The selection between AppSync and API Gateway authorizers hinges on your API architecture, authorization complexity, and integration requirements. API Gateway's Lambda authorizer implementation excels in RESTful API scenarios, providing robust route-level protection through IAM policy generation. The authorizer validates credentials before backend integration occurs, producing explicit policy documents that govern resource access. 
This approach proves remarkably effective for third-party authentication integration (e.g., OAuth, OpenID Connect) and for custom authorization workflows that extend beyond standard IAM capabilities. Built-in caching capabilities make it cost-effective for high-traffic applications.

AppSync's Lambda authorizer operates within the GraphQL paradigm, intercepting requests at the operation level before resolver execution. Rather than generating explicit IAM policies, it produces identity context objects that flow through the resolver pipeline, enabling granular, field-level authorization decisions. This model suits GraphQL APIs requiring fine-grained data access control and complex identity resolution workflows.

The architectural distinctions between these implementations are substantial. AppSync authorizers integrate directly into the GraphQL resolver chain, processing GraphQL-formatted requests and returning context objects for downstream authorization logic. API Gateway authorizers function as infrastructure-level components, handling various request formats (JSON, form data) and producing IAM policies that determine the disposition of requests.

Your implementation choice should align with your API paradigm: GraphQL applications benefit from AppSync's context-aware authorization model, while RESTful APIs leverage API Gateway's policy-based approach. Consider factors including authentication provider integration requirements, authorization granularity needs, caching requirements, timeout constraints, and the level of customization your security model demands. Both services provide serverless, scalable authorization mechanisms; the optimal choice depends on your specific architectural context and security requirements.

When implementing Lambda authorizers, prioritize security best practices, including proper token validation, secret management, error handling, and monitoring.
Leverage caching strategies to optimize performance and costs, and ensure your authorizer logic remains lightweight to avoid timeout issues. Regular monitoring of authorization metrics helps identify potential security issues and performance bottlenecks before they impact production systems.

References

AWS Official Documentation

API Gateway Lambda Authorizers
- Use API Gateway Lambda authorizers
- Lambda authorizer input and output format

AWS AppSync Authorization
- AWS AppSync Lambda authorizers
- AppSync authorization use cases

AWS Lambda
- Lambda function handler in Python
- Lambda function handler in Node.js
- Best practices for working with AWS Lambda functions

Security and Best Practices
- AWS Secrets Manager
- IAM JSON policy reference
- Security best practices in AWS Lambda

Monitoring and Debugging
- Monitoring REST APIs with CloudWatch
- Monitoring and logging for AWS AppSync
- Using AWS X-Ray with API Gateway

Additional Resources
- AWS API Gateway Pricing
- AWS AppSync Pricing
- AWS Lambda Pricing
- JWT.io - JSON Web Token Introduction
- OAuth 2.0 Authorization Framework

By Leslie Daniel Raj
DPoP: What It Is, How It Works, and Why Bearer Tokens Aren't Enough

DPoP is one of the most exciting developments in the identity and access management (IAM) space in recent years. Yet many backend developers either have not heard of it or are unsure what it actually changes. In this article, I will break down what DPoP is, what problem it solves, and walk through a working implementation with Keycloak and Quarkus.

What Is DPoP?

Demonstrating Proof of Possession (DPoP) is an OAuth 2.0 security mechanism defined in RFC 9449. Its core purpose is simple: cryptographically bind an access token to the client that requested it. This way, even if a token is intercepted, it cannot be used by another client. In the traditional bearer token model, anyone who possesses the token is considered authorized. DPoP changes this model; to use a token, the client must also prove possession of the corresponding private key.

The Problem: Bearer Tokens and the "Finders Keepers" Risk

Bearer tokens are tokens carried in the HTTP Authorization header and accepted by the server without any additional verification of the presenter. RFC 6750 explicitly states that possession of the token is the sole authorization criterion. This means any party that obtains the token can act as if it were the legitimate client. This is not a theoretical risk. Real-world breaches have shown, time and again, that stolen Bearer tokens translate directly into unauthorized access:

- Codecov Supply Chain Attack (2021): Attackers who infiltrated Codecov's CI/CD process harvested tokens stored in customers' environment variables.
These tokens potentially granted access to private repositories of hundreds of organizations, including HashiCorp, which confirmed it was affected.
- GitHub OAuth Token Leak (2022): OAuth tokens belonging to Heroku and Travis CI were stolen, allowing attackers to list private repositories and access repository metadata across dozens of GitHub organizations, including npm.
- Microsoft SAS Token Incident (2023): Microsoft's AI research team accidentally shared an overly permissive SAS token in a GitHub repository. This token made it possible to access 38 TB of internal data.

The common thread across these incidents is that a token was obtained and seamlessly used in a different context by a different actor. What makes this possible is the Bearer token model's core assumption: whoever presents the token = the authorized actor. The model checks who holds the token, not who the token belongs to.

How Does DPoP Work?

DPoP requires the client to send a DPoP proof JWT with every request. This proof is signed with the client's private key and contains the following claims:

- htm and htu (HTTP method and URL): Restrict the proof to a specific endpoint, preventing a proof generated for one resource from being used against another.
- jti (JWT ID): Each proof carries a unique ID. The server records used jti values and rejects any proof that attempts to reuse one.
- iat (Issued At): Indicates when the proof was generated, allowing the server to enforce a validity window and reject stale proofs.
- ath (Access Token Hash): Specifies which access token the proof is associated with.

The flow works as follows:

Plain Text

1. Client generates an asymmetric key pair.
2. During the token request, the client sends a DPoP proof JWT whose header contains the public key (JWK).
3. The authorization server issues a DPoP-bound access token containing the JWK thumbprint (cnf.jkt).
4. When calling a protected resource, the client sends:
   - Authorization: DPoP <access_token>
   - DPoP: <signed proof JWT>
5.
The resource server:
   - Verifies the proof signature
   - Checks that the proof's public key matches the token's cnf.jkt
   - Validates htm, htu, iat, jti
   - Verifies the ath claim binding the proof to the access token

With this model, stealing the token alone is not enough. The attacker cannot generate valid proofs without the private key, limiting any potential misuse to an already captured, unused proof within its narrow validity window. Compare this to the Bearer model, where a stolen token grants unrestricted access until it expires. DPoP does not eliminate token theft, but it makes stolen tokens fundamentally harder to exploit.

Configuring DPoP in Keycloak

For this article, I use Keycloak (v26.5.5) as the identity provider. It is open-source, widely adopted, and provides built-in DPoP support with a straightforward configuration. DPoP was introduced as a preview feature in Keycloak 23.0.0 and became officially supported in version 26.4, working out of the box without any additional client configuration. If a client sends a DPoP proof during the token request, Keycloak validates it and includes the key thumbprint in the issued token. No extra setup is needed for this default behavior. However, if you want to enforce DPoP for a specific client, meaning Bearer tokens will no longer be accepted for that client's resources, follow these steps:

Step 1: In the Keycloak Admin Console, navigate to the relevant realm and select the client from the Clients menu.

Step 2: In the Settings tab, locate the Capability config section.

Step 3: Enable the Require DPoP bound tokens switch.

With this option enabled, the client must include a DPoP proof with every token request. Requests without a valid proof will be rejected, and Bearer tokens will not be accepted to access this client's resources.

DPoP in Action With Quarkus

To see DPoP in practice, I built a Quarkus application with protected REST endpoints and tested them using a k6 script. The full source code is available on GitHub.
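Before looking at the project setup, it helps to see how small the token-binding computation actually is. Per RFC 9449, the ath claim described above is simply the unpadded base64url encoding of the SHA-256 hash of the access token's ASCII value. Here is a minimal, self-contained sketch; the class name and sample token value are illustrative, not part of the demo project:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

public class AthExample {

    // ath = base64url(SHA-256(ASCII(access token))), without padding (RFC 9449)
    static String ath(String accessToken) throws Exception {
        byte[] hash = MessageDigest.getInstance("SHA-256")
                .digest(accessToken.getBytes(StandardCharsets.US_ASCII));
        return Base64.getUrlEncoder().withoutPadding().encodeToString(hash);
    }

    public static void main(String[] args) throws Exception {
        // Illustrative token value, not a real Keycloak token
        System.out.println(ath("sample-access-token"));
    }
}
```

Because the resource server recomputes this hash from the presented access token and compares it with the proof's ath claim, a proof captured alongside one token can never be attached to a different token.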
Project Setup

The application uses Quarkus 3.32.2 with the following key extension: OpenID Connect. Quarkus provides extensions for OpenID Connect and OAuth 2.0 access token management, focusing on acquiring, refreshing, and propagating tokens.

XML

<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-oidc</artifactId>
</dependency>

The quarkus.oidc.auth-server-url property specifies the base URL of the OpenID Connect (OIDC) server, which points to the Keycloak instance in this case:

Properties files

quarkus.http.port=8180
quarkus.oidc.auth-server-url=http://localhost:8080/realms/master
quarkus.oidc.client-id=dpop-demo
quarkus.oidc.token.authorization-scheme=dpop

The key line here is quarkus.oidc.token.authorization-scheme=dpop. This property tells the Quarkus OIDC extension to expect the Authorization: DPoP <token> scheme and to perform the full DPoP proof verification process as defined by RFC 9449. This includes validating the proof's signature, htm, htu, ath, and the cnf thumbprint binding between the token and the proof's public key.

Protected Endpoints

The application exposes three endpoints under the /api path, all requiring authentication. Each endpoint returns the caller's name and the token type (Bearer or DPoP) by checking the presence of the cnf claim in the JWT:

Java

@Path("/api")
@Authenticated
public class ProtectedResource {

    private final JsonWebToken jwt;

    public ProtectedResource(JsonWebToken jwt) {
        this.jwt = jwt;
    }

    @GET
    @Path("/user-info")
    @Produces(MediaType.TEXT_PLAIN)
    public String getUserInfo() {
        return buildResponse();
    }

    @POST
    @Path("/user-info")
    @Produces(MediaType.TEXT_PLAIN)
    public String postUserInfo() {
        return buildResponse();
    }

    @POST
    @Path("/list-users")
    @Produces(MediaType.TEXT_PLAIN)
    public String listUsers() {
        return buildResponse();
    }

    private String buildResponse() {
        return "Hello, %s! Token type: %s".formatted(
                jwt.getName(),
                jwt.containsClaim("cnf") ?
"DPoP" : "Bearer" ); } } Having both GET and POST on /user-info plus a separate /list-users endpoint is intentional. These allow us to demonstrate how DPoP proof claims (htm and htu) restrict token usage to a specific HTTP method and URL. Replay Protection With a jti Filter As mentioned above, Quarkus OIDC extension handles the core DPoP verification. However, jti replay protection is not part of that process, since tracking used values requires server-side state, which falls outside the scope of a stateless token validation layer. I added a minimal @ServerRequestFilter that records each proof's jti and rejects any reuse: Java @Singleton public class DpopJtiFilter { private final Set<String> usedJtis = ConcurrentHashMap.newKeySet(); @ServerRequestFilter public Optional<Response> checkJti(ContainerRequestContext ctx) { String dpopHeader = ctx.getHeaderString("DPoP"); if (dpopHeader == null || dpopHeader.isBlank()) { return Optional.empty(); } String[] parts = dpopHeader.split("\\."); if (parts.length != 3) { return Optional.empty(); } try { String payloadJson = new String( Base64.getUrlDecoder().decode(parts[1])); String jti = extractJti(payloadJson); if (jti != null && !usedJtis.add(jti)) { return Optional.of(Response.status(Response.Status.UNAUTHORIZED) .type(MediaType.TEXT_PLAIN) .entity("DPoP proof replay detected: jti '%s' has already been used" .formatted(jti)) .build()); } } catch (Exception e) { // Let Quarkus OIDC handle malformed proofs } return Optional.empty(); } // ... } In this example, I use an in-memory ConcurrentHashMap to keep the demo simple. In a production environment, you would use a distributed store such as Redis or Infinispan to track used jti values across multiple application instances and to apply TTL-based eviction aligned with the proof's validity window. It is worth noting that Keycloak already performs jti replay protection at the authorization server level. 
Internally, its DPoPReplayCheck uses the SingleUseObjectProvider, which is backed by Infinispan's replicated cache. When a DPoP proof arrives at the token endpoint, Keycloak hashes the jti combined with the request URI using SHA-1 and stores it with a TTL derived from the proof's iat claim. If the same proof is submitted again, the putIfAbsent call fails, and the request is rejected.

However, this protection only covers requests made to Keycloak itself. Once a DPoP-bound token is issued, the resource server is responsible for its own jti tracking. A stolen proof could be replayed against the Quarkus application, and Keycloak would have no visibility into that. This is why I added the jti filter at the resource server level, creating a two-layer defense: Keycloak guards the token endpoint, and the filter guards the application endpoints.

Testing With k6

The repository includes a k6 test script (k6/dpop-test.js) that exercises the full DPoP flow. Run it with:

Shell

k6 run k6/dpop-test.js

The script performs seven HTTP calls in sequence. The first request obtains a DPoP-bound token from Keycloak, the next three are happy-path requests (one per endpoint), and the final three test failure scenarios. Let's take a closer look at what happens behind the scenes at both the Keycloak and Quarkus layers:

1. Token Request (Keycloak)

Before any resource access, the script requests a DPoP-bound access token:

- The script generates an EC key pair (P-256) using the WebCrypto API.
- It creates a DPoP proof JWT targeting Keycloak's token endpoint (htm: POST, htu: .../protocol/openid-connect/token), signed with the private key. The public key is embedded in the proof's jwk header.
- It sends a POST to the token endpoint with the DPoP header and user credentials (grant_type=password).
- Keycloak validates the DPoP proof (signature, structure, claims), then issues an access token containing a cnf (confirmation) claim with the SHA-256 thumbprint of the client's public key.
This binds the token to that specific key pair. Notice the typ: DPoP and the cnf.jkt field in the issued token:

JSON

{
  "typ": "DPoP",
  "azp": "dpop-demo",
  "sub": "830783f9-ab1b-4c41-9c23-fa6a335de1bc",
  "cnf": {
    "jkt": "8iU6dz7Uclsxek7kgyreJc8sc2LjZIbFqtUUFpWKZIc"
  },
  "scope": "email profile",
  "preferred_username": "hakdogan"
}

2. GET /user-info (Happy Path)

The script creates a fresh DPoP proof for GET /api/user-info with a new jti, current iat, and an ath computed from the access token's SHA-256 hash. The proof payload looks like this:

JSON

{
  "jti": "6f0bf628-309d-489b-9243-38ed169e1d8c",
  "htm": "GET",
  "htu": "http://localhost:8180/api/user-info",
  "iat": 1772897361,
  "ath": "3yFPVhSab16gaSgMAFtZCgm7GXpBMx5t3ZYCeuWqT0w"
}

- It sends GET /api/user-info with Authorization: DPoP <token> and DPoP: <proof>.
- The Quarkus jti filter checks the proof's jti against the used-jti store. This is a new jti, so the request passes through.
- The Quarkus OIDC extension validates the DPoP proof as required by RFC 9449 (Section 7.1), which assigns this responsibility to the resource server. It verifies the proof's signature, confirms htm matches GET, htu matches the request URL, ath matches the token hash, and the cnf thumbprint in the token matches the proof's public key. All checks pass.
- The endpoint reads the cnf claim from the token, identifies it as a DPoP token, and responds:

Plain Text

HTTP 200: Hello, hakdogan! Token type: DPoP

The script repeats this same flow for POST /user-info and POST /list-users, each with a fresh proof matching the target method and URL. Both return 200 with the same response.

3. GET /user-info (Replay Attack)

- The script sends the exact same proof that was used in the happy-path request.
- The Quarkus jti filter checks the jti and finds it already in the used-jti store. The request is rejected before reaching OIDC validation:

Plain Text

HTTP 401: DPoP proof replay detected: jti '...'
has already been used

Note: The error message above includes the jti value for demonstration purposes, making it easy to observe what the filter caught. In a production environment, avoid exposing internal claim values in error responses. A generic 401 Unauthorized with no body, or a minimal message like "invalid DPoP proof", is sufficient and prevents information leakage.

4. POST /user-info (Method Mismatch - htm)

- The script creates a new proof with htm: GET targeting /api/user-info, but sends it as a POST request.
- The Quarkus jti filter passes the request (new jti).
- The Quarkus OIDC extension compares the proof's htm (GET) with the actual request method (POST). They do not match. The request is rejected:

Plain Text

HTTP 401

5. POST /list-users (URL Mismatch - htu)

- The script creates a new proof targeting POST /api/user-info.
- It sends the request to POST /api/list-users instead.
- The Quarkus jti filter passes the request (new jti).
- The Quarkus OIDC extension compares the proof's htu with the actual request URL. They do not match. The request is rejected:

Plain Text

HTTP 401

All seven checks pass:

Plain Text

✓ Token request succeeds
✓ GET /user-info returns 200
✓ POST /user-info returns 200
✓ POST /list-users returns 200
✓ Replay attack returns 401
✓ htm mismatch returns 401
✓ htu mismatch returns 401

In contrast, if the same requests were sent as plain Bearer tokens without DPoP proofs, all of them would succeed with 200. The replay, method mismatch, and URL mismatch scenarios would go undetected because there is no proof to validate. This is exactly the gap that DPoP closes.

Conclusion

Bearer tokens follow a simple rule: whoever holds the token is authorized. DPoP changes this by binding each token to a cryptographic key pair and requiring a fresh, signed proof on every request. A stolen token alone is no longer sufficient. The IAM ecosystem is moving in this direction.
Identity providers like Keycloak and frameworks like Quarkus already offer built-in DPoP support, making adoption straightforward. Bearer tokens are not going away, but for access to sensitive resources, adopting DPoP is becoming less of a choice and more of a necessity.

By Hüseyin Akdoğan
What's New in Java 25: Key Changes From Java 21

On the 16th of September 2025, Java 25 was released. Time to take a closer look at the changes since the last LTS release, which is Java 21. In this blog, some of the changes between Java 21 and Java 25 are highlighted, mainly by means of examples. Enjoy!

Introduction

What has changed between Java 21 and Java 25? A complete list of the JEPs (JDK Enhancement Proposals) can be found at the OpenJDK website. Here you can read the nitty-gritty details of each JEP. For a complete list of what has changed per release since Java 21, the Oracle release notes give a good overview. In the next sections, some of the changes are explained by example, but it is mainly up to you to experiment with these new features in order to get acquainted with them. Do note that no preview or incubator JEPs are considered here. The sources used in this post are available at GitHub. Check out an earlier blog if you want to know what has changed between Java 17 and Java 21, and another earlier blog for the changes between Java 11 and Java 17.

Prerequisites

Prerequisites for reading this blog are:

- You must have JDK 25 installed. I advise using SDKMAN for this purpose, so that you can switch easily between JDKs.
- You need some basic Java knowledge.

JEP512: Compact Source Files and Instance Main Methods

When you are new to Java, you are confronted with quite a few concepts before you can even get started. Take a look at the classic HelloWorld.java.

Java

public class ClassicHelloWorld {

    public static void main(String[] args) {
        System.out.println("Hello World!");
    }
}

You are confronted with the following concepts:

- You must know the concept of a class;
- You must know the concept of access modifiers (public in this case);
- You must know what a static modifier is and the difference between a static method and an instance method;
- You must know what a void return type is;
- You must know what a String array is;
- You must know about the strange thing System.out.

That is quite a lot!
And the only thing you have achieved is a simple text output to the console. The purpose of JEP512 is to remove all of the boilerplate code in order to make the onboarding to Java much more accessible. This means in practice:

- The static method is removed and replaced with simply void main;
- There is no need to create a class;
- A new IO class within the java.lang package is introduced, which contains basic IO-oriented methods.

The new HelloWorld in Java 25 looks as follows:

Java

void main() {
    IO.println("Hello Java 25 World!");
}

Much simpler, right? Do note that even package statements are not allowed here. The purpose is just to provide as easy a starting point as possible in order to use Java.

JEP513: Flexible Constructor Bodies

In a constructor, it is not possible to add statements before invoking this() (invoking another constructor within the same class) or before invoking super() (invoking a parent constructor). This causes some limitations when you want to validate input parameters, for example. Assume the following Vehicle class.

Java

public class Vehicle {

    int numberOfWheels;

    Vehicle(int numberOfWheels) {
        if (numberOfWheels < 1) {
            throw new IllegalArgumentException("a vehicle must have at least one wheel");
        }
        this.numberOfWheels = numberOfWheels;
        print();
    }

    void print() {
        System.out.println("Number of wheels: " + numberOfWheels);
    }
}

Class Java21Car extends the parent Vehicle class.
Java

public class Java21Car extends Vehicle {

    Color color;

    Java21Car(int numberOfWheels, Color color) {
        super(numberOfWheels);
        if (numberOfWheels < 4 || numberOfWheels > 6) {
            throw new IllegalArgumentException("A car must have 4, 5 or 6 wheels.");
        }
        this.color = color;
    }

    @Override
    void print() {
        System.out.println("Number of wheels: " + numberOfWheels);
        System.out.println("Color: " + color);
    }
}

Two issues exist:

- If the condition in the subclass constructor evaluates to true, the constructor of the parent Vehicle class has already been invoked unnecessarily.
- If you instantiate a Java21Car with numberOfWheels equal to 1, an IllegalArgumentException is thrown, but the overridden print method will print null for the color because the value has not been assigned yet.

Run the FlexibleConstructor class in order to see this result.

Java

public class FlexibleConstructor {

    static void main() {
        Java21Car java21Car = new Java21Car(1, Color.BLACK);
    }
}

The output is:

Plain Text

Exception in thread "main" java.lang.IllegalArgumentException: A car must have 4, 5 or 6 wheels.
	at com.mydeveloperlanet.myjava25planet.constructor.Java21Car.<init>(Java21Car.java:12)
	at com.mydeveloperlanet.myjava25planet.constructor.FlexibleConstructor.main(FlexibleConstructor.java:7)
Number of wheels: 1
Color: null

With the introduction of JEP513, these issues can be solved. Move the validation code in the constructor of the subclass above the invocation of super().

Java

public class Java25Car extends Vehicle {

    Color color;

    Java25Car(int numberOfWheels, Color color) {
        if (numberOfWheels < 4 || numberOfWheels > 6) {
            throw new IllegalArgumentException("A car must have 4, 5 or 6 wheels.");
        }
        this.color = color;
        super(numberOfWheels);
    }

    @Override
    void print() {
        System.out.println("Number of wheels: " + numberOfWheels);
        System.out.println("Color: " + color);
    }
}

Create an instance just like you did before.
Java

Java25Car java25Car = new Java25Car(1, Color.BLACK);

Running this code results in the IllegalArgumentException, but this time super() is not invoked, and as a consequence, the print method is not invoked either.

Plain Text

Exception in thread "main" java.lang.IllegalArgumentException: A car must have 4, 5 or 6 wheels.
	at com.mydeveloperlanet.myjava25planet.constructor.Java25Car.<init>(Java25Car.java:11)
	at com.mydeveloperlanet.myjava25planet.constructor.FlexibleConstructor.main(FlexibleConstructor.java:8)

Create an instance with valid input arguments.

Java

Java25Car java25Car = new Java25Car(4, Color.BLACK);

The super() is invoked and the print method outputs the data as expected.

Plain Text

Number of wheels: 4
Color: java.awt.Color[r=0,g=0,b=0]

JEP456: Unnamed Variables and Patterns

Sometimes it occurs that you do not use a variable. For example, in a catch block, you may not want to do anything with the exception being thrown. However, in Java 21, it was still mandatory to give the variable a name. Assume the following example, where the NumberFormatException must be given a name, ex in this case.

Java

String s = "data";

try {
    Integer.parseInt(s);
} catch (NumberFormatException ex) {
    System.out.println("Bad integer: " + s);
}

With the introduction of JEP456, you can be more explicit about this by making this variable an unnamed variable. You do so by using an underscore.

Java

String s = "data";

try {
    Integer.parseInt(s);
} catch (NumberFormatException _) {
    System.out.println("Bad integer: " + s);
}

The same applies to patterns. Assume the following classes.

Java

abstract class AbstractFruit {}

public class Apple extends AbstractFruit {}

public class Pear extends AbstractFruit {}

public class Orange extends AbstractFruit {}

You create a switch where you test which kind of fruit the instance is. You had to give the case elements a name, even if you did not use them.
Java

AbstractFruit fruit = new Apple();

switch (fruit) {
    case Apple apple -> System.out.println("This is an apple");
    case Pear pear -> System.out.println("This is a pear");
    case Orange orange -> System.out.println("This is an orange");
    default -> throw new IllegalStateException("Unexpected value: " + fruit);
}

Also in this case, you can be more explicit about it and use the underscore.

Java

AbstractFruit fruit = new Apple();

switch (fruit) {
    case Apple _ -> System.out.println("This is an apple");
    case Pear _ -> System.out.println("This is a pear");
    case Orange _ -> System.out.println("This is an orange");
    default -> throw new IllegalStateException("Unexpected value: " + fruit);
}

JEP506: Scoped Values

Scoped values are introduced by JEP506 and will mainly be of use between framework code and application code. A typical example is the processing of HTTP requests where a callback is executed in framework code. The handle method is invoked from within the framework, and from the application code a callback is executed in the method readUserInfo.

Java

public class Application {

    Framework framework = new Framework(this);

    //@Override
    public void handle(Request request, Response response) {
        // user code, called by framework
        var userInfo = readUserInfo();
    }

    private UserInfo readUserInfo() {
        // call framework
        return (UserInfo) framework.readKey("userInfo");
    }
}

In the framework, data is stored in a framework context within a thread. By means of ThreadLocal, the CONTEXT is created (1). Request-specific data is stored in this context (2) before the application code is invoked. When the application executes a callback to the framework, the CONTEXT can be retrieved again (3).
Java

public class Framework {

    private final Application application;

    public Framework(Application app) {
        this.application = app;
    }

    private static final ThreadLocal<FrameworkContext> CONTEXT = new ThreadLocal<>(); // (1)

    void serve(Request request, Response response) {
        var context = createContext(request);
        CONTEXT.set(context); // (2)
        application.handle(request, response);
    }

    public UserInfo readKey(String key) {
        var context = CONTEXT.get(); // (3)
        return context.getUserInfo();
    }

    FrameworkContext createContext(Request request) {
        FrameworkContext frameworkContext = new FrameworkContext();
        UserInfo userInfo = new UserInfo();
        // set data from request
        frameworkContext.setUserInfo(userInfo);
        return frameworkContext;
    }
}

class FrameworkContext {

    private UserInfo userInfo;

    public UserInfo getUserInfo() {
        return userInfo;
    }

    public void setUserInfo(UserInfo userInfo) {
        this.userInfo = userInfo;
    }
}

The FrameworkContext object is a hidden method variable. It is present with every method call, but you do not pass it as an argument. Because everything is running within the same thread, readKey has access to the thread's own local copy of the CONTEXT. There are three problems with using ThreadLocal:

- Unconstrained mutability: Every ThreadLocal variable is mutable; when code is able to invoke the get method, it is also able to invoke the set method.
- Unbounded lifetime: The value of a ThreadLocal exists during the entire lifetime of the thread, or until the remove method is called. The latter is often forgotten, so per-thread data lives longer than it should.
- Expensive inheritance: When a child thread is created, the value of the ThreadLocal variable is copied to the child thread. The child thread needs to allocate extra storage for this. No shared storage is possible, and when using a lot of threads, this can have a severe impact.

With the introduction of virtual threads, these design flaws have seen increased impact.
The solution is scoped values. A scoped value is a container object that allows a data value to be safely and efficiently shared by a method with its direct and indirect callees within the same thread, and with child threads, without resorting to method parameters. It is a variable of type ScopedValue. It is typically declared as a static final field, and its accessibility is set to private so that it cannot be directly accessed by code in other classes.

The previous framework code can be rewritten as follows. Instead of creating a ThreadLocal variable, a variable of type ScopedValue is created (1). When invoking application code, the static method ScopedValue.where is used to assign the value (2). The readKey method remains unchanged (3). The main advantage is that the context value is only available during the lifetime of the run method.

Java

public class Framework {

    private final Application application;

    public Framework(Application app) {
        this.application = app;
    }

    private static final ScopedValue<FrameworkContext> CONTEXT = ScopedValue.newInstance(); // (1)

    void serve(Request request, Response response) {
        var context = createContext(request);
        where(CONTEXT, context) // (2)
                .run(() -> application.handle(request, response));
    }

    public UserInfo readKey(String key) {
        var context = CONTEXT.get(); // (3)
        return context.getUserInfo();
    }

    FrameworkContext createContext(Request request) {
        FrameworkContext frameworkContext = new FrameworkContext();
        UserInfo userInfo = new UserInfo();
        // set data from request
        frameworkContext.setUserInfo(userInfo);
        return frameworkContext;
    }
}

Copying data to child threads is possible by means of structured concurrency (using StructuredTaskScope); scoped values of the parent are automatically available within child threads. Structured concurrency is in its fifth preview, so it will probably be finalized soon.

JEP485: Stream Gatherers

A stream consists of three parts:

- Creation of the stream;
- Intermediate operations;
- A terminal operation.
Terminal operations are extensible, but intermediate operations are not. JEP485 introduces a new intermediate stream operation, Stream.gather(Gatherer), which can process elements by means of a user-defined entity. Creating a gatherer is complex and out of scope for this blog, but there are some built-in gatherers, which are discussed below. The stream used is a stream of integers.

Java

List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9);

1. Gatherer: fold

Gatherer fold is a many-to-one gatherer that constructs an aggregate incrementally and emits that aggregate when no more input elements exist.

Java

int sum = numbers.stream()
        .gather(Gatherers.fold(() -> 0, (acc, x) -> acc + x))
        .findFirst() // fold produces a single result
        .orElse(0);
System.out.println("fold sum = " + sum);

The output is:

fold sum = 45

2. Gatherer: mapConcurrent

Gatherer mapConcurrent is a stateful one-to-one gatherer that invokes a supplied function for each input element concurrently, up to a supplied limit.

Java

List<Integer> squares = numbers.stream()
        .gather(Gatherers.mapConcurrent(4, x -> x * x)) // 4 = parallelism hint
        .toList();
System.out.println("mapConcurrent squares = " + squares);

The output is:

mapConcurrent squares = [1, 4, 9, 16, 25, 36, 49, 64, 81]

3. Gatherer: scan

Gatherer scan is a stateful one-to-one gatherer that applies a supplied function to the current state and the current element to produce the next element, which it passes downstream.

Java

List<Integer> runningSums = numbers.stream()
        .gather(Gatherers.scan(() -> 0, (acc, x) -> acc + x))
        .toList();
System.out.println("scan running sums = " + runningSums);

The output is:

scan running sums = [1, 3, 6, 10, 15, 21, 28, 36, 45]

4. Gatherer: windowFixed

Gatherer windowFixed is a stateful many-to-many gatherer that groups input elements into lists of a supplied size, emitting the windows downstream when they are full.
Java

int size = 2;
List<List<Integer>> windows = numbers.stream()
        .gather(Gatherers.windowFixed(size))
        .toList();
System.out.println("windowFixed(2) = " + windows);

The output is:

windowFixed(2) = [[1, 2], [3, 4], [5, 6], [7, 8], [9]]

5. Gatherer: windowSliding

Gatherer windowSliding is a stateful many-to-many gatherer that groups input elements into lists of a supplied size. After the first window, each subsequent window is created from a copy of its predecessor by dropping the first element and appending the next element from the input stream.

Java

int size = 3;
List<List<Integer>> windows = numbers.stream()
        .gather(Gatherers.windowSliding(size))
        .toList();
System.out.println("windowSliding(3) = " + windows);

The output is:

windowSliding(3) = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7], [6, 7, 8], [7, 8, 9]]

6. Gatherers Final Words

To conclude this paragraph about gatherers, some final words of advice given by Venkat Subramaniam. Venkat's four steps to use gatherers:

1. Use familiar functions like map, filter, etc.
2. Use a built-in gatherer.
3. Call a friend for advice.
4. Create a gatherer, but ... this is complex and a lot of work.

JEP458: Launch Multi-File Source-Code Programs

When writing scripts with Java, you probably want to split the code between different files when the script becomes too long. Also, when writing scripts, you most likely are not using a build tool to create a jar file. JEP458 allows you to resolve other classes needed by your main class without extra effort. Assume the following class.

Java

public class Application {

    public static void main() {
        Helper helper = new Helper();
        helper.run();
    }
}

Run this class using Java 21. When you use SDKMAN, switch to a Java 21 JDK. Execute the Application.java file.
$ java Application.java
Application.java:6: error: cannot find symbol
        Helper helper = new Helper();
        ^
  symbol:   class Helper
  location: class Application
Application.java:6: error: cannot find symbol
        Helper helper = new Helper();
                            ^
  symbol:   class Helper
  location: class Application
2 errors
error: compilation failed

As you can see, the compilation fails because the Helper class, which is located next to the Application class, cannot be found.

Switch to a Java 25 JDK. Execute the Application.java file, and now the Helper class is found and the program executes as expected.

$ java Application.java
Do something

JEP467: Markdown Documentation Comments

With Java 21, you can format comments by means of HTML tags, as can be seen in this example.

Java

/**
 * Returns a hash code value for the object. This method is
 * supported for the benefit of hash tables such as those provided by
 * {@link java.util.HashMap}.
 * <p>
 * The general contract of {@code hashCode} is:
 * <ul>
 * <li>Whenever it is invoked on the same object more than once during
 *     an execution of a Java application, the {@code hashCode} method
 *     must consistently return the same integer, provided no information
 *     used in {@code equals} comparisons on the object is modified.
 *     This integer need not remain consistent from one execution of an
 *     application to another execution of the same application.
 * <li>If two objects are equal according to the {@link
 *     #equals(Object) equals} method, then calling the {@code
 *     hashCode} method on each of the two objects must produce the
 *     same integer result.
 * <li>It is <em>not</em> required that if two objects are unequal
 *     according to the {@link #equals(Object) equals} method, then
 *     calling the {@code hashCode} method on each of the two objects
 *     must produce distinct integer results. However, the programmer
 *     should be aware that producing distinct integer results for
 *     unequal objects may improve the performance of hash tables.
 * </ul>
 *
 * @implSpec
 * As far as is reasonably practical, the {@code hashCode} method defined
 * by class {@code Object} returns distinct integers for distinct objects.
 *
 * @return a hash code value for this object.
 * @see java.lang.Object#equals(java.lang.Object)
 * @see java.lang.System#identityHashCode
 */
public int htmlHashCode() {
    return 0;
}

However, Markdown is also used quite a lot by developers. With JEP467, you are able to use Markdown for documentation comments. Some notes about it:

- Markdown comments are indicated by means of ///.
- <p> is no longer necessary and can be replaced by a blank line.
- Markdown bullets can be used.
- Font changes use the Markdown syntax, for example, an underscore for italic font.
- Backticks can be used for the code font.
- Markdown links are also supported.
- The Markdown syntax to be used is the CommonMark syntax.

The previous example rewritten with Markdown:

Java

/// Returns a hash code value for the object. This method is
/// supported for the benefit of hash tables such as those provided by
/// [java.util.HashMap].
///
/// The general contract of `hashCode` is:
///
/// - Whenever it is invoked on the same object more than once during
///   an execution of a Java application, the `hashCode` method
///   must consistently return the same integer, provided no information
///   used in `equals` comparisons on the object is modified.
///   This integer need not remain consistent from one execution of an
///   application to another execution of the same application.
/// - If two objects are equal according to the
///   [equals][#equals(Object)] method, then calling the
///   `hashCode` method on each of the two objects must produce the
///   same integer result.
/// - It is _not_ required that if two objects are unequal
///   according to the [equals][#equals(Object)] method, then
///   calling the `hashCode` method on each of the two objects
///   must produce distinct integer results.
///   However, the programmer
///   should be aware that producing distinct integer results for
///   unequal objects may improve the performance of hash tables.
///
/// @implSpec
/// As far as is reasonably practical, the `hashCode` method defined
/// by class `Object` returns distinct integers for distinct objects.
///
/// @return a hash code value for this object.
/// @see java.lang.Object#equals(java.lang.Object)
/// @see java.lang.System#identityHashCode
public int markdownHashCode() {
    return 0;
}

Conclusion

In this blog, you took a quick look at some features added since the last LTS release, Java 21. It is now up to you to start thinking about your migration plan to Java 25 and a way to learn more about these new features and how you can apply them to your daily coding habits. Tip: IntelliJ will help you with that!

By Gunter Rotsaert
AI Is Rewriting How Product Managers and Engineers Build Together

For years, product and engineering teams have relied on a familiar operating model. Product defines the problem, engineering builds the solution, and correctness can be reasoned about before launch. That model worked well in deterministic systems, but AI is quietly breaking this contract. Once models are embedded into core product flows such as transaction routing, risk evaluation, or decision automation, behavior stops being fully predictable. Outcomes depend not just on code, but on data distributions, external dependencies, retry paths, latency budgets, and second-order effects that only appear at scale. As a result, product managers and engineers can no longer operate in parallel lanes. They must rethink how they work together.

From Deterministic Logic to Living Systems

I remember the first time we experimented with a transaction routing model in my role as a product lead focused on increasing authorization rates. At the time, routing decisions were driven by static rules. Processor preferences, issuer heuristics, and historical success rates formed the backbone of the logic. It was explainable, auditable, and increasingly limited. We ran the model in shadow mode for several weeks. It evaluated transactions in real time and proposed routing decisions, while humans retained final control. When we analyzed the results, we could clearly see that our hypothesis held: authorization performance improved. More importantly, the model surfaced edge cases that our rules never caught. Subtle interactions between issuer behavior, merchant category, retry sequencing, and time-of-day effects emerged almost immediately. That experiment changed how we viewed the product. We were no longer shipping a routing feature. We were operating a system whose behavior would evolve continuously, shaped by data, traffic patterns, and downstream constraints. That realization forced us to evaluate how product and engineering collaborate.
Why AI Changes Collaboration

In traditional product development, PMs aim to define behavior clearly enough that engineering can implement it deterministically. With AI, that clarity disappears. Objectives and constraints can be defined, but outcomes cannot be fully specified. In transaction routing, a decision can be correct according to model metrics and still produce a poor product outcome. A retry path that increases authorization rates may also increase transaction costs, extend latency, or strain partner relationships. Correctness becomes contextual rather than absolute. This is where the handoff model breaks down. PMs cannot define success purely in business terms without understanding how systems behave in production. Engineers cannot design systems without grappling with business tradeoffs that change over time. Product behavior emerges from the interaction between model predictions, infrastructure limits, retry logic, and external network responses. AI forces collaboration upstream. Alignment cannot be established just once during planning; instead, it becomes continuous work as the system learns and adapts.

How Product Managers Must Evolve

PMs working on AI-enabled systems need enough model literacy to reason about tradeoffs. This does not mean tuning models, but it does require understanding confidence thresholds, drift, false positives, and latency impacts. Without that context, it becomes difficult to define realistic success metrics or assess whether the system is behaving acceptably. Data also becomes a first-class product dependency. Data accuracy, completeness, and schema stability directly affect outcomes and must be treated as product constraints, not implementation details. PMs must also define the boundaries of uncertainty. When should the system retry? When should it fall back to deterministic logic? When is human review required? These decisions shape engineering architecture and determine how much risk the product can safely absorb.
How Engineering Teams Must Evolve

Engineering teams must move from building features to operating decision systems. For example, in AI-driven transaction routing, responsibility extends far beyond deploying a model endpoint. Teams must design for observability into how decisions are made to ensure they get the desired outcomes from their products. That includes tracking retry behavior, understanding cost accumulation, monitoring confidence distributions, and detecting drift before it becomes a business issue. Models are probabilistic by nature, which means systems must degrade gracefully. Engineers should align with their PMs to determine fallback logic and latency budgets for worst-case execution paths to ensure they deliver the desired customer experiences. Architecture decisions such as model complexity, retry depth, and deployment strategy shape product behavior as much as any requirements coming from the PM. Engineering input can no longer arrive late in the cycle. It must actively influence product design from the start.

A New PM and Engineering Operating Model

Instead of PMs defining requirements and engineers validating feasibility, both sides should co-own outcomes. Product articulates business priorities and acceptable tradeoffs. Engineering should translate those into system constraints and operational guardrails. PMs and engineers should make decisions together, with a shared understanding of risk. One way to optimize desired outcomes and limit exposure would be for teams to establish an operating model where all launches go through shadow deployments and closely monitored rollouts. PMs and engineers review the same dashboards, examining not just success metrics but how those outcomes are achieved.
Case Study: Optimizing Routing and Discovering the Real Objective

We shadow-deployed an AI/ML model to optimize transaction routing across multiple acquirer-processor combinations, with the goal of increasing authorization rates through intelligent retries and eventually establishing the most optimal paths. The model identified alternative paths that static rules would not attempt, and authorization rates improved as expected. After running the model for a few weeks, however, the results showed that transaction costs would rise, because each retry carried a charge. While individual decisions made sense in isolation, aggregate behavior revealed a mismatch between model optimization and business reality. The system was maximizing approvals without sufficient sensitivity to cost and latency. Product and engineering reframed success together, shifting from a single-metric goal to a balanced objective that accounted for authorization rate, cost, and execution time. As a result, we created a better feature where authorization performance remained strong, costs stabilized, and the team established a repeatable framework for evaluating future optimizations.

Conclusion

AI is advancing what teams build, but more importantly, it is changing how they think, decide, and collaborate. When product behavior emerges from systems rather than code paths, shared ownership becomes essential. The most successful AI-enabled teams are those with the strongest product and engineering partnerships. They treat uncertainty as a design input, align early on tradeoffs, and evolve their systems together. In an AI-native world, product and engineering cannot afford to work in parallel lanes. Successful teams of the future will rethink how they build together.

By Raman Aulakh
What Enterprise Architects Get Wrong About “Simple” SaaS Integrations

Enterprise IT landscapes today are overflowing with cloud-based applications. The typical company now runs around 130 different SaaS tools, spanning everything from CRM and HR to analytics. Integrating these disparate apps is essential to break down data silos and enable automated workflows. However, many enterprise architects mistakenly believe that connecting SaaS applications is simple. This article examines common misconceptions about “simple” SaaS integrations and explains the real challenges lurking beneath the surface. We also outline best practices to help architects approach SaaS integration with the rigor it requires.

The Illusion of “Simple” SaaS Integration

SaaS vendors often tout easy integration: plug in an API key here, use a connector there, and voilà — your systems talk to each other. In theory, modern SaaS comes with well-documented REST APIs or pre-built connectors that make integration straightforward. In reality, enterprise integration is crucial but complicated. Organizations must often tie together applications built in different eras and tech stacks, deal with multiple cloud vendors’ security quirks, and manage a web of point-to-point data flows across finance, HR, sales, and other domains. The result is a complex integration environment that is anything but trivial. Enterprise architects sometimes underestimate these complexities. Below, we debunk a few prevalent myths and highlight what often goes wrong when architects assume SaaS integrations will be simple.

Assuming APIs Make Integration Trivial

Misconception: “Our SaaS apps have APIs, so hooking them up will be easy.” Many architects assume that because a SaaS platform provides a REST or SOAP API, integration is just a matter of writing a few scripts or using an ETL tool.

Reality: APIs are necessary but not sufficient for seamless integration. Each SaaS application’s API comes with its own data model, quirks, and limitations.
Writing custom scripts to Extract, Transform, and Load (ETL) data between systems can be difficult and costly, often requiring specialized expertise and significant development time. Moreover, not all vendor APIs are well-documented or reliable. Developers often struggle with inconsistent or outdated documentation, slowing integration work. Even once the technical connection is made, data from one system might not align with the schema or business rules of another system without substantial transformation and mapping logic. In short, having an API is just the starting point. Building robust integrations involves rigorous design, error handling, and testing to ensure data moves correctly and triggers the right processes. Underestimating this effort is a recipe for brittle integrations that fail under real-world conditions.

Treating Pre-Built Connectors as Plug-and-Play

Misconception: “The vendor provides pre-built connectors, so we don’t need to worry — it’s plug-and-play.” Many SaaS products offer connector libraries and templates for common integrations. Enterprise architects might assume these out-of-the-box connectors eliminate the need for custom work.

Reality: Connectors can accelerate integration but don’t eliminate complexity. For example, Workday, a popular cloud HCM/ERP platform, provides APIs and pre-built integration templates. Yet companies implementing Workday quickly learn that significant effort is still required to fit those integrations into their unique environment. Workday often needs to exchange data with other systems, and experts warn that “it can be difficult to integrate Workday with your present alternatives,” advising organizations to develop a strategy before adoption. In practice, even “simple” connector-based integrations demand careful configuration, field mapping, and alignment of business processes. Pre-built connectors also need validation.
Thorough testing is essential to ensure data flows correctly end-to-end and to prevent conflicts or duplicates. Workday’s documentation encourages teams to collaborate with IT or integration specialists to guarantee smooth data movement. In other words, plug-and-play is rarely plug-and-forget — it still requires plug-and-plan. Architects who rely blindly on connector promises may find themselves firefighting issues later.

Neglecting Security and Compliance

Misconception: “We use secure SaaS platforms, so integration doesn’t add new security concerns.” It’s easy to think that if each cloud application is secure on its own, connecting them won’t introduce problems. Some architects treat security and governance as afterthoughts in integration projects.

Reality: Integration can open up new vulnerabilities and compliance risks. Every data pipeline between systems is a potential window for exposure. Poorly controlled integrations might inadvertently create “shadow IT” data flows or provide unauthorized access points for insider threats. Without proper oversight, confidential data could slip through cracks, leading to GDPR or SOC 2 compliance violations. For instance, integrating an HR SaaS with a third-party analytics tool could unintentionally expose personal employee information if access controls aren’t strictly managed. Enterprise architects must design integrations with multi-layered security in mind. This means enforcing centralized identity and access management (IAM), using strong authentication, and encrypting data both in transit and at rest. It also involves monitoring integration logs and data flows for anomalies. SaaS integrations are not automatically secure — they require the same diligence as any enterprise system, if not more, given they span multiple platforms and jurisdictions. Ignoring security in the integration design phase is a serious mistake that can lead to breaches or compliance failures.
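One concrete form of the “strong authentication” advice above: many SaaS vendors sign webhook payloads with an HMAC, and the receiving side should verify that signature before trusting the data. A minimal sketch follows — the secret, payload, and verification flow are hypothetical illustrations, not any specific vendor's scheme:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

public class WebhookSignature {

    // Computes the Base64-encoded HMAC-SHA256 of the payload with the shared secret.
    static String sign(String secret, String payload) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
        return Base64.getEncoder().encodeToString(mac.doFinal(payload.getBytes(StandardCharsets.UTF_8)));
    }

    // Compares signatures in constant time to avoid timing side channels.
    static boolean verify(String secret, String payload, String receivedSignature) throws Exception {
        byte[] expected = sign(secret, payload).getBytes(StandardCharsets.UTF_8);
        byte[] received = receivedSignature.getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, received);
    }

    public static void main(String[] args) throws Exception {
        String secret = "shared-secret";                    // hypothetical shared secret
        String payload = "{\"event\":\"user.updated\"}";    // hypothetical webhook body
        String signature = sign(secret, payload);           // what the vendor would send in a header

        System.out.println(verify(secret, payload, signature));       // untampered payload passes
        System.out.println(verify(secret, payload + " ", signature)); // tampered payload fails
    }
}
```

In a real integration, the signature would arrive in an HTTP header alongside the payload, and the shared secret would come from a secrets manager rather than a literal.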
Building Point-to-Point Links Without Architecture

Misconception: “We just need to connect Application A to B. A few direct integrations will do the job.” Architects under time pressure might greenlight a series of quick point-to-point interfaces between SaaS apps, believing this ad-hoc approach is the fastest solution.

Reality: Point-to-point integrations don’t scale and become a maintenance nightmare. While a single hardwired interface might be manageable, enterprises rarely stop at one or two integrations. Soon, dozens of applications need interconnection — HR with finance, finance with CRM, CRM with support systems, and so on. Managing multiple custom point-to-point integrations leads to inevitable complexity. Each new connection increases dependencies, making the overall system fragile and hard to change. This “spaghetti integration” can collapse when any component is updated or if volumes spike. Enterprise architects have learned that using an integration platform or hub is a more sustainable approach. Most successful SaaS integration strategies leverage an iPaaS to simplify connectivity and avoid one-off interfaces. An iPaaS provides a standardized way to manage data flows, apply transformations, and handle errors centrally. It reduces operational costs and makes it easier to adjust when SaaS vendors roll out updates. Treating integration as a quick point-to-point plumbing job is short-sighted; a smart architecture from the start prevents brittle connections.

Underestimating Scalability and Maintenance Challenges

Misconception: “If the integration works in a dev test, we’re done — it will handle production loads.” This view overlooks how integrations behave as data volume and usage grow. Architects might assume that once built, integrations run smoothly indefinitely.

Reality: Many “simple” integrations fail under scale or break over time. Workflows that handle 100 transactions may collapse at 10,000. SaaS APIs often have rate limits and throttling.
Without queueing and retry logic, integrations can hit API limits and fail under load. Complex workflows may also become unmanageable, leading to data lags or consistency issues if not architected for scale. Robust integrations require features like message queues, retry logic, modular design for version changes, and proactive monitoring. Ignoring these needs can turn a “simple” project into a firefighting ordeal.

Relying on Limited Resources and Expertise

Misconception: “Any good developer or our existing team can handle integration as a side task.” Some architects allocate minimal resources to integration, thinking it’s a one-time development effort.

Reality: SaaS integration requires specialized knowledge. Implementing integrations involves understanding each system’s data nuances, creating mappings, handling errors and retries, enforcing security, and designing orchestration flows. Many teams struggle due to lack of expertise, spinning their wheels on problems a specialist could foresee. Enterprise architects should recognize that successful SaaS integration may call for dedicated teams or specialized tools. Low-code platforms or integration specialists can fill skill gaps. The key is to treat integration as a first-class part of the architecture, not an afterthought.

Best Practices to Make SaaS Integration Smoother

- Start with a clear integration strategy: Plan integration as a fundamental part of any SaaS adoption. Define which data and processes need to sync across systems and map out the workflows before building anything. As one guide emphasizes, determine your business goals, the critical workflows to integrate, and each application’s API capabilities upfront to avoid nasty surprises later. A well-defined strategy ensures you focus on integrations that deliver real value and avoid unnecessary complexity.
- Prioritize and phase the integrations: Not every connection is mission-critical.
Identify which integrations will have the highest business impact and tackle those first. Rolling out integrations in phases allows you to manage complexity and learn as you go, rather than attempting a “big bang” approach.
- Leverage the right tools and architecture: Use integration platforms or middleware to handle the heavy lifting wherever possible. An iPaaS provides scalability, monitoring, and built-in connectors that can significantly reduce custom coding. These platforms are designed to expand and adapt with your needs. When custom code is unavoidable, keep it modular and use well-known integration patterns to decouple systems. The goal is to avoid brittle point-to-point hacks by implementing a resilient integration architecture from day one.
- Embed security and governance: Treat security as a foundational design aspect, not an afterthought. Use secure authentication, encrypt data in transit and at rest, and enforce least-privilege access for all integration users and services. Implement centralized monitoring and logging for all integration activities so that any irregular access or data leak can be quickly detected. Also ensure compliance requirements are met by controlling where data flows and how it’s stored. When evaluating integration solutions or iPaaS vendors, pay attention to how they handle security and compliance features.
- Document and test thoroughly: Maintain up-to-date documentation of your integration workflows, data mappings, API endpoints, and credentials. Future maintainers (or even your future self) will need this map to navigate the integration landscape. Relying on tribal knowledge is risky if key team members leave. In parallel, test integrations rigorously before and after go-live. Create unit tests for transformation logic, run integration tests to see systems working together, and simulate high loads to catch performance bottlenecks. Don’t forget to include failure scenario testing.
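The queueing-and-retry advice from the scalability discussion can be sketched as a simple exponential-backoff wrapper. The attempt counts, delays, and the simulated flaky endpoint below are illustrative assumptions, not any vendor's documented requirements:

```java
import java.util.concurrent.Callable;

public class RetryWithBackoff {

    // Retries the call up to maxAttempts times, doubling the delay after each failure.
    static <T> T callWithRetry(Callable<T> call, int maxAttempts, long initialDelayMs) throws Exception {
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {            // e.g., an HTTP 429 "rate limited" response
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);       // back off before the next attempt
                    delay *= 2;                // exponential backoff
                }
            }
        }
        throw last;                            // give up after the final attempt
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated flaky SaaS endpoint: fails twice, then succeeds.
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) throw new RuntimeException("429 Too Many Requests");
            return "ok";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Production code would typically add jitter to the delay and cap the total wait, but the structure is the same.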
Conclusion

With the right mindset and tools, SaaS integration pitfalls can be avoided. Recognizing that integration is complex but critical is the first step. Taking a strategic approach using integration platforms, planning for scale, embedding security, and allocating expertise will set organizations up for success. Enterprise architects who approach SaaS integrations with rigor and realism enable their companies to reap the benefits of a connected cloud ecosystem without the surprises that come from false simplicity. In short, don’t let the “simple” label fool you — treat SaaS integration as the sophisticated engineering effort it truly is.

By Suresh Kurapati
AWS EventBridge as Your System's Nervous System: The Architecture Nobody Talks About

I was sitting in a Tuesday standup when our VP dropped the bomb. "Stripe is deprecating their v2 webhooks. We have 90 days." My stomach dropped. We had webhook handlers scattered across seven different services: order processing, inventory updates, email notifications, analytics pipelines. Each one was directly coupled to Stripe's webhook format — the kind of technical debt you promise yourself you'll fix "next quarter" for two years running. Platform engineering did the math on the whiteboard. Updating seven services, each with its own deployment pipeline, database migrations, and testing requirements. Conservative estimate: $180K in engineering costs. Best case: six weeks. Worst case: we miss the deadline and payments break. Except we didn’t spend $180K. We spent $1,200 and finished in four days. Because nine months earlier, I’d convinced the team to rebuild our event architecture around EventBridge. And everyone thought I was crazy.

The Architecture That Seemed Like Overkill

Let me take you back to that architecture review meeting in March. I’d just spent two weeks building a proof of concept that routed all external webhooks through EventBridge instead of having them hit our services directly. The pushback was immediate and predictable. "Why add another layer? Webhooks work fine." "This seems like premature optimization." "Do we really need AWS EventBridge for this?" Here’s what I was seeing that nobody else was: our system didn’t have a nervous system. It had seven independent organisms that happened to live in the same body. When Stripe sent a payment_intent.succeeded event, it would directly wake up the order service. That service would manually call the inventory service, which would call the notification service, which would call analytics. It worked. But it was brittle as hell. The EventBridge architecture I proposed flipped this completely. Instead of services talking to each other, they all talked to EventBridge.
Every external webhook hit a single Lambda function that normalized the event and published it to our event bus. Services subscribed to the events they cared about. The beauty? No service knew where events came from. The order service didn’t know whether a payment.completed event came from Stripe, PayPal, or a test harness. It just knew: payment completed, process order.

Why It’s Actually a Nervous System

The nervous system metaphor isn’t just cute marketing. It’s architecturally precise. Your nervous system doesn’t have your hand directly talk to your foot. When you touch something hot, sensory neurons send signals to the spinal cord. The spinal cord routes those signals to the brain. The brain processes them and sends signals back through motor neurons. Each part is specialized, decoupled, and completely unaware of the others’ implementation details. That’s exactly what EventBridge does. External webhooks are sensory input. The webhook normalizer is the spinal cord — immediate, reflexive processing. EventBridge is the neural pathway. Individual services are motor neurons — specialized responders. The killer feature isn’t the technical architecture. It’s what happens when you need to change something. Just like you can train your nervous system to respond differently to stimuli, you can rewire your EventBridge rules without touching application code. When Stripe changed its API, here’s what we did: We updated the webhook normalizer Lambda (68 lines of code). Changed the event shape from Stripe’s v2 format to v3. Published to the same EventBridge event pattern. Done. Every downstream service kept working because they consumed normalized events, not Stripe’s raw payloads. We did it on a Wednesday afternoon. Zero downtime. Zero service updates. Zero database migrations.

The Patterns Nobody Documents

Here’s what I learned building this that you won’t find in AWS documentation.

Pattern 1: Event Normalization at the Edge

Don’t let raw external events onto your bus.
Ever. Your webhook handler should transform vendor-specific payloads into domain events. When we integrated PayPal, our services didn’t care. They still received payment.completed events with the same schema. Pattern 2: Event Versioning from Day One We screwed this up initially. Six months in, we needed to change the event schema. Half our services were still consuming v1 events. Now every event includes a version field, and EventBridge rules route based on version. Services can migrate on their own schedule. Pattern 3: Dead Letter Queues for Everything This saved us during Black Friday. A bug in the inventory service caused it to reject 15% of order.created events. Because we had DLQs configured, those events sat safely in a queue while we fixed the bug, then we replayed them. Zero lost orders. Pattern 4: Archive Anything That Touches Money EventBridge archiving is criminally underused. We archive every payment-related event for 90 days. When customers dispute charges, we have perfect audit trails. When the finance team needs transaction reports, we replay archived events. Cost? $47/month for 2.1M archived events. The Real Cost Let’s talk money, because everyone assumes EventBridge is expensive. We process 4 million events per day. That’s 120M events per month. At $1.00 per million events published and $0.20 per million events delivered to targets, our monthly EventBridge bill is around $280. For that $280, we eliminated: $180K in migration costs (already paid for itself)14 inter-service API calls that were costing $890/month in NAT gateway charges3 RDS instances we were using for “event storage” ($440/month)Approximately 40 hours per month spent debugging service-to-service communication issues The ROI is absurd. But it gets better. The Second-Order Benefits Six months after deployment, weird things started happening. The analytics team built a new real-time dashboard without asking engineering for anything. 
They just subscribed to relevant events and built their pipeline. Two days. Zero meetings.

When we needed to add fraud detection, we didn't modify existing services. We deployed a new fraud service that subscribed to payment.initiated events and published fraud.detected events. The payment service added a rule to listen for fraud events. Done.

We started treating events like data. Product managers would ask, "What events do we have around checkout?" — and we could actually answer. We built an internal event catalog. New developers could browse available events before writing a single line of code. The system became genuinely antifragile. It got stronger from shocks instead of weaker.

What I'd Do Differently

If I were starting over, I'd invest in event schema validation earlier. We use JSON Schema now, but we should have started with it. Too many bugs came from services expecting fields that didn't exist.

I'd also set up better monitoring from day one. We can see EventBridge metrics, but tracking event flow through the entire system required custom CloudWatch dashboards and Lambda instrumentation. There's no out-of-the-box "nervous system health" view.

And I'd push back harder against the "this seems like overkill" argument. The pattern only seems like over-engineering until you need to change something. Then it suddenly becomes the only sane architecture. The best architectures are the ones that make future changes easy, not the ones that make the initial build fast.

When NOT to Use This Pattern

I'm not saying EventBridge is always the answer. If you have three microservices that rarely change, direct HTTP calls are fine. If your entire system fits in a single Lambda function, you don't need this. But if you're building something that will integrate with external systems, if you expect service boundaries to shift, if you're growing a team that needs to work on different parts of the system independently — this pattern will save you. It did for us.
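The version-based routing described in Pattern 2 can be expressed as an EventBridge rule event pattern. A minimal sketch, with hypothetical source and field names; the `matches` function is a tiny, simplified stand-in for EventBridge's matcher (exact string matching only, no prefix or numeric filters) so the routing behavior can be seen locally:

```python
# EventBridge rule event pattern: deliver only normalized v2
# payment events to a target. Names here are illustrative.
V2_PAYMENT_PATTERN = {
    "source": ["webhooks.normalizer"],
    "detail-type": ["payment.completed"],
    "detail": {"version": ["2"]},
}

def matches(pattern: dict, event: dict) -> bool:
    """Simplified EventBridge matching: every pattern key must exist
    in the event, and its value must be in the allowed list."""
    for key, allowed in pattern.items():
        if isinstance(allowed, dict):
            if not isinstance(event.get(key), dict):
                return False
            if not matches(allowed, event[key]):
                return False
        elif event.get(key) not in allowed:
            return False
    return True

v1_event = {"source": "webhooks.normalizer",
            "detail-type": "payment.completed",
            "detail": {"version": "1", "amount": 4200}}
v2_event = {"source": "webhooks.normalizer",
            "detail-type": "payment.completed",
            "detail": {"version": "2", "amount": 4200}}

print(matches(V2_PAYMENT_PATTERN, v1_event))  # False
print(matches(V2_PAYMENT_PATTERN, v2_event))  # True
```

In AWS, the same dict (serialized with `json.dumps`) goes into the `EventPattern` argument of `events.put_rule`, which is what lets v1 and v2 consumers coexist while services migrate on their own schedule.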
That Tuesday, when our VP announced the Stripe migration, the room went quiet for about five seconds. Then I said, "We’re fine. The normalizer handles this." Everyone looked confused. Then relieved. Then slightly annoyed that they’d braced for a crisis that wasn’t coming. That’s the nervous system working exactly as designed. The hand touches something hot, signals fire, the system reacts — and you barely notice the complexity underneath. That’s the architecture nobody talks about.

By Dinesh Elumalai
Shifting Bottleneck: How AI Is Reshaping the Software Development Lifecycle

The AI Promise and the Reality

The software development industry has witnessed an unprecedented transformation with the integration of artificial intelligence tools into the development lifecycle. GitHub's 2024 Developer Survey reveals that 87% of developers using AI coding assistants report significantly faster development cycles, with productivity gains of up to 41% on routine coding tasks [11]. Yet paradoxically, many organizations are discovering that accelerating one phase of development merely exposes — or creates — bottlenecks elsewhere in the pipeline. This phenomenon, which I term "the shifting bottleneck paradox," represents one of the most critical challenges facing software engineering teams today.

As Bain & Company's 2025 Technology Report notes, while two-thirds of software firms have rolled out generative AI tools, the reality is stark: teams using AI assistants see only 10% to 15% productivity boosts, and often the time saved is not redirected toward higher-value work [4].

The Coding Acceleration: A Double-Edged Sword

Individual Developer Productivity Gains

Research from GitHub has consistently demonstrated that AI-powered coding assistants can significantly improve developer efficiency at the individual level. A 2024 study published in Communications of the ACM by Ziegler et al. analyzed 2,631 survey responses from developers using GitHub Copilot and found that 73% reported staying in a flow state more effectively, while 87% preserved mental effort during repetitive tasks [1]. A case study from Zoominfo involving over 400 developers showed an average acceptance rate of 33% for AI suggestions and 20% for lines of code, with developer satisfaction scores reaching 72% [7].

Research from the automotive industry, published in the Americas Conference on Information Systems 2024 Proceedings, demonstrated improvements across throughput, cycle time, code quality, defects, and developer satisfaction when using GitHub Copilot within the SPACE framework [6].
The study found that AI copilots showed the greatest efficacy in software-building tasks, with developers reporting 45% time savings in coding activities.

The Surprising Counter-Evidence

However, not all research paints such an optimistic picture. A study by METR (Model Evaluation and Threat Research) in July 2025 produced surprising findings that challenge the narrative of universal AI productivity gains [2]. The randomized controlled trial of early-2025 AI tools with experienced open-source developers found that when developers used AI tools, they took 19% longer to complete tasks compared to working without AI assistance.

This counterintuitive result highlights a critical insight: AI's impact on productivity is highly context-dependent. As the METR researchers note, AI tools may be most beneficial for less experienced developers or those working in unfamiliar codebases, rather than for experienced developers working in their own repositories [2]. This nuance is often lost in broad claims about AI productivity.

The New Bottleneck: Code Review and Merge Approval

The 77% Human Control Problem

While AI has accelerated code generation, it has inadvertently created what colleagues at LinearB describe as "AI dams" — points where human processes block the flow of AI-accelerated work [5]. Their 2024 analysis of more than 400 development teams revealed a striking disparity: 67% of developers use AI for coding, yet merge approvals remain 77% human-controlled, with only 23% adoption of AI assistance.

As Suzie Prince from Atlassian observed during a recent industry webinar, "80% of coding time for a developer — or the time that a developer spends — is not coding. It's planning, documentation, reviews, and maintenance." The irony is clear: the industry has optimized the 20%, while the bottlenecks persist in the remaining 80%.
According to the State of Code Review 2024, the median engineer at a large company takes approximately 13 hours to merge a pull request, spending the majority of this time waiting on code review. A 2023 Stack Overflow Developer Survey found that developers spend roughly 15–20% of their time waiting for code reviews, with this number climbing even higher in teams with tight release schedules or large codebases.

The AI Code Review Response

The industry has responded with a proliferation of AI-powered code review tools. Research published on Medium by API4AI in May 2025 indicates that teams using AI-assisted code review tools report 30% faster merge request approvals — especially for small and medium-sized changes — with fewer back-and-forth review cycles and reduced load on senior engineers [13]. GitHub's July 2025 research reveals that developers who run AI-powered reviews before opening pull requests often eliminate entire classes of trivial issues, such as missing imports or inadequate tests, reducing iterative review cycles [12].

However, as GitHub's blog post "Code Review in the Age of AI" emphasizes, AI changes none of the fundamental accountability requirements — it merely shifts the bottlenecks. The merge button still requires a developer's approval because AI cannot make nuanced decisions about privacy implications, technical debt prioritization, or architectural trade-offs.

The Testing and Quality Assurance Shift

The Shift-Left Imperative

The acceleration of code generation through AI has reinforced the importance of shift-left testing — moving quality assurance earlier in the development lifecycle. According to Barry Boehm's seminal research, later validated in multiple contemporary studies, fixing a bug in production can cost up to 100 times more than addressing it during the requirements phase [16].
The Ponemon Institute's 2017 research found that vulnerabilities detected early in development cost approximately $80 on average, whereas the same vulnerabilities cost around $7,600 to fix if detected after deployment [17]. The 2024 World Quality Report highlights that 72% of QA teams now integrate automation into their workflows alongside manual testing to enhance their capacity to deliver faster and more reliable results [15]. This integrated approach ensures that quality is embedded throughout the development lifecycle, aligning with the shift-left principle, which aims to identify and address issues early in the process.

The Software Lifecycle Pipeline Challenge

As AI accelerates individual coding productivity, the downstream pipeline must keep pace. Bain & Company's 2024 Technology Report emphasizes that broad AI adoption requires process changes: "If AI speeds up coding, then code review, integration, and release must speed up as well to avoid bottlenecks" [3]. Leading companies like Netflix and Intuit have recognized this and shifted testing and quality checks earlier using the shift-left approach to ensure that rapidly generated code does not remain stuck waiting on slow tests.

Forrester's 2024 State of DevOps Report reveals that organizations leveraging AI in DevOps pipelines have reduced their release cycles by an average of 67% [10]. However, as reported in [5], low-impact areas in AI adoption tell a revealing story: security reviews hover at just 15% AI adoption due to risk concerns, while production debugging remains at 12% due to its complexity.

The Organizational Bottleneck: Process and Culture

The Human Adaptation Challenge

Perhaps the most significant bottleneck is not technical at all — it is human. At a Fortune dinner held in collaboration with AMD in May 2024, technology leaders agreed that AI is "developing faster than an individual's or company's ability to adapt to it" [8].
One executive warned of "process growth pains and cycles of job displacement and creation that would require the most human of features: grace and dignity." The group identified multiple bottleneck categories: regulatory bottlenecks as policymakers react, organizational bottlenecks as corporations cope, strategic bottlenecks as leaders plan, and technical bottlenecks. But the consensus was clear: the biggest bottleneck of all is human — the people operating the systems [8].

XB Software's July 2025 analysis notes that only 12% of business leaders surveyed by MIT report that generative AI has fundamentally transformed how their solutions are developed [9]. However, 38% believe that generative solutions will bring major changes to the software development lifecycle within the next one to three years, with an additional 31% anticipating transformative shifts over four to ten years.

The Skills and Enablement Gap

Microsoft research reveals that it can take up to 11 weeks for users to fully realize the satisfaction and productivity gains of using AI tools. This finding underscores the importance of proper enablement and training programs. Organizations must invest in teaching developers not just how to use AI tools, but how to integrate them effectively into existing workflows.

Forrester's 2024 survey reveals a critical insight: developers spend only about 24% of their time writing code [10]. The rest is devoted to essential tasks such as creating software designs, writing and running tests, debugging issues, and collaborating with stakeholders. These activities require critical thinking, creativity, and communication — skills that current AI tools cannot replicate or replace.
The Three Phases of AI Evolution in Software Development

Research [5] identifies three distinct phases in the evolution of AI in software development:

Phase 1 (2021-2023): Individual Adoption

Developers adopted AI copilots and assistants, with productivity gains remaining localized and no real process changes. Teams experienced isolated improvements, but systemic bottlenecks remained unchanged.

Phase 2 (2024-2025): Workflow Integration

Teams are connecting AI tools across workflows, introducing automated PR reviews, intelligent test generation, and smart deployments. This phase reveals the bottleneck-shifting phenomenon most acutely, as organizations discover that accelerating one stage exposes constraints elsewhere.

Phase 3 (2026 and beyond): AI-Native Development

Bain & Company's 2025 report describes this emerging phase as requiring companies to "frame their roadmap as an AI-native reinvention of the software development lifecycle" [4]. This involves designing processes from scratch with AI capabilities in mind, rather than retrofitting existing workflows.

The Maturity Model: Four AI Adoption Patterns

The analysis of enterprise adoption patterns reveals four distinct quadrants [5]:

AI Newbie (50% of enterprises, 60% of startups)

These teams show minimal AI adoption, with traditional workflows dominating. They might have GitHub Copilot licenses, but manual processes control everything. They're stuck due to risk aversion, lack of DevEx support, and unclear ROI.

Vibe Coder (16% of startups, rare in enterprises)

Heavy AI code generation meets traditional processes. Developers use Copilot extensively, but PRs still wait days for review. AI creates; humans slowly evaluate. These teams plateau because process bottlenecks negate generation speed, with warning signs including rising technical debt and reviewer burnout.

AI Orchestrator (fewer than 1% of startups)

This rare configuration maintains human-crafted code with AI-driven workflows.
The emphasis is on process automation rather than code generation.

AI-Native (8% of enterprises with 300+ developers)

These organizations have achieved full integration, using AI throughout the development lifecycle with processes redesigned to accommodate AI capabilities.

Strategic Responses: Breaking Through the Bottlenecks

Visibility Through Engineering Intelligence

A key imperative for managing successful AI adoption while avoiding bottlenecks is the need for data and visibility across the software lifecycle. Logilica has shown [19] that engineering intelligence platforms can help organizations obtain and maintain visibility across software lifecycle silos, which is paramount for engineering leaders to make responsive, up-to-date, data-driven decisions. Some key outcomes of Logilica's research are:

  • Baseline your engineering processes across the software lifecycle
  • Measure AI adoption across teams and track adoption rates over time
  • Correlate the impact of AI adoption with software lifecycle bottlenecks
  • Drive process and tooling change according to the insights gained
  • Establish a continuous improvement and feedback loop

Comprehensive Pipeline Optimization

Organizations achieving the highest gains from AI, up to 30% efficiency improvements according to Bain, take a comprehensive approach that goes beyond code generation [3]. Intuit's initiative to move from "scrappy testing" to scaled development exemplifies this approach, focusing on increasing development velocity while leveraging generative AI benefits across the entire development platform. The key dimensions include:

  • Focusing on the right work that creates the most value
  • Ensuring speedy, high-quality execution with full AI potential
  • Optimizing resourcing costs across the development lifecycle

Tool Integration and Orchestration

Research reveals that the average team uses 4.7 AI tools, but only 1.8 integrate with each other [5].
This creates "AI silos" where productivity gains in one area create bottlenecks in another. Smart teams solve this through API-first tool selection, workflow platforms to orchestrate tools, standard formats ensuring interoperability, and gradual rollout that integrates one stage at a time.

Security and Compliance Integration

As AI reshapes the software development lifecycle, the focus on security and risk prediction is surpassing productivity gains as a top priority. Generative AI's growing influence on planning, coding, testing, and deployment introduces new challenges, especially as AI-driven reverse engineering and attack tools become increasingly sophisticated. The 2024 DORA report emphasizes the importance of tying engineering practices to user and business value, noting that AI appears in the toolchain, but value shows up only when practices, platforms, and metrics align.

Looking Ahead: The Autonomous Development Era

The next wave of AI in software development — agentic or autonomous AI — raises the stakes even higher. Bain's 2025 report notes that autonomous agents that can manage multiple steps of development with little to no human intervention are emerging [4]. Start-up Cognition introduced "Devin," an AI "software engineer," in 2024 that can build and troubleshoot applications from natural language prompts.

Vitor Monteiro from Poolside describes a roadmap evolving from today's code assistants to junior developers, eventually progressing to senior developers, and ultimately to autonomous development systems [20]. However, Poolside has identified two critical bottlenecks: compute power and data. With approximately 3 trillion tokens of source code available worldwide for training, all AI companies are working with the same limited dataset, driving innovation in synthetic data generation.
Conclusion: Embracing the Complexity

The integration of AI into the software development lifecycle represents a profound transformation — but not the simple productivity multiplier that early enthusiasts envisioned. Instead, we are witnessing a complex reordering of constraints and capabilities across the development pipeline. The evidence is clear: AI excels at structured, repeatable tasks but struggles with nuanced decision-making. Success in this new landscape requires organizations to:

  • Measure the software lifecycle: Gain visibility with engineering intelligence (SEI).
  • Think holistically: Optimize the entire development pipeline, not just coding.
  • Invest in people: Provide comprehensive training and enablement programs.
  • Redesign processes: Build AI-native workflows rather than retrofitting existing ones.
  • Integrate tools: Ensure AI tools work together rather than creating new silos.

As we move toward Phase 3 of AI evolution in software development, the winners will be organizations that recognize AI not as a magic bullet, but as a powerful tool that requires thoughtful integration into a reimagined development lifecycle. The bottlenecks will continue to shift, but with strategic planning and comprehensive process transformation, organizations can turn these challenges into competitive advantages.

The future of software development is neither purely human nor purely AI — it is a carefully orchestrated collaboration that amplifies human creativity and judgment while leveraging AI's speed and pattern-recognition capabilities. Understanding and managing shifting bottlenecks will be the defining challenge of software engineering leadership for the next decade. To manage this transition successfully, data and visibility will be crucial [19]. Even here, AI can provide automated guidance and deep insights to ensure the guardrails for success are in place [22].
References

[1] Ziegler, A., Kalliamvakou, E., Li, X. A., Rice, A., Rifkin, D., Simister, S., Sittampalam, G., & Aftandilian, E. (2024). "Measuring GitHub Copilot's Impact on Productivity." Communications of the ACM, 67(3). https://cacm.acm.org/research/measuring-github-copilots-impact-on-productivity/
[2] METR. (2025). "Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity." https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
[3] Bain & Company. (2024). "Beyond Code Generation: More Efficient Software Development." Technology Report 2024. https://www.bain.com/insights/thriving-as-the-software-cycle-slows-tech-report-2024/
[4] Bain & Company. (2025). "From Pilots to Payoff: Generative AI in Software Development." Technology Report 2025. https://www.bain.com/insights/from-pilots-to-payoff-generative-ai-in-software-development-technology-report-2025/
[5] LinearB. (2024). "AI in Software Development: The Complete Guide to Tools, Productivity & Real ROI." https://linearb.io/blog/ai-in-software-development
[6] Smit, D., Smuts, H., Louw, P., Pielmeier, J., & Eidelloth, C. (2024). "The Impact of GitHub Copilot on Developer Productivity from a Software Engineering Body of Knowledge Perspective." AMCIS 2024 Proceedings. https://aisel.aisnet.org/amcis2024/ai_aa/ai_aa/10/
[7] Zoominfo. (2025). "Experience with GitHub Copilot for Developer Productivity at Zoominfo." arXiv. https://arxiv.org/html/2501.13282v1
[8] Fortune. (2024). "AI's Biggest Bottlenecks, According to CIOs and CTOs." https://fortune.com/2024/05/01/ai-bottlenecks-regulatory-technical-organizational-strategic-humans/
[9] XB Software. (2025). "Generative AI in Software Development: 2024 Trends & 2025 Predictions." https://xbsoftware.com/blog/ai-in-software-development/
[10] Forrester. (2024). "State of DevOps Report."
[11] GitHub. (2024). "Research: Quantifying GitHub Copilot's Impact on Developer Productivity and Happiness." https://github.blog/news-insights/research/research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/
[12] GitHub. (2025). "Code Review in the Age of AI: Why Developers Will Always Own the Merge Button." https://github.blog/ai-and-ml/generative-ai/code-review-in-the-age-of-ai-why-developers-will-always-own-the-merge-button/
[13] API4AI. (2025). "AI Code Review in DevOps Workflows." Medium. https://medium.com/@API4AI/ai-in-devops-enhancing-code-review-automation-55beb25111a8
[14] CodeAnt AI. (2025). "AI Code Review Metrics That Reduce Backlog (Not Just Comments)." https://www.codeant.ai/blogs/ai-code-review-metrics-reduce-backlog
[15] Capgemini, Sogeti, & Micro Focus. (2024). "World Quality Report."
[16] Boehm, B. (1981). Software Engineering Economics. Prentice Hall.
[17] Ponemon Institute. (2017). "The Cost of Software Vulnerabilities."
[18] McKinsey & Company. (2024). "AI Adoption Survey."
[19] Logilica. (2025). "The Impact of AI on Software Engineering Productivity." https://www.logilica.com/blog/the-impact-of-ai-on-software-engineering-productivity
[20] Monteiro, V. (2024). "How Poolside Is Pursuing AGI for Software Development." ai-Pulse Conference. https://www.frenchtechjournal.com/ai-pulse-2024-how-poolside-is-pursuing-agi-for-software-development/
[21] InformationWeek. (2025). "Breaking Through the AI Bottlenecks." https://www.informationweek.com/machine-learning-ai/breaking-through-the-ai-bottlenecks
[22] Logilica. (2025). "Boost Insights with Logilica's AI Advisor." https://www.logilica.com/blog/boost-insights-with-logilica-ai-advisor

By Ralf Huuck
