Stories by iSolutions on Medium

How Serious Companies Are Turning AI Into Measurable Business Advantage

iSolutions — Wed, 21 Jan 2026 22:33:02 GMT

A practical playbook for moving from AI pilots to ROI as execution — not hype — becomes the differentiator.

For much of the last two years, Salesforce has been one of the loudest and most confident voices in enterprise AI.

Its Agentforce narrative promised autonomous agents that could reason, act, and execute work end‑to‑end — reducing costs, increasing productivity, and fundamentally changing how enterprises operate.

Then, quietly, the message evolved.

In recent comments and customer guidance, Salesforce executives began emphasizing determinism, guardrails, and reliability over autonomy. Agents, they suggested, often work better when large language models are used less, not more.

Many readers and commentators interpreted this shift as a retreat — a loss of faith in generative AI itself.

What Salesforce was actually describing was the moment when AI systems moved from theory to contact with real enterprise constraints — compliance, cost, reliability, and trust. This was not a walk‑back from AI ambition. It was a transition from experimentation to operations.

Seen clearly, the Salesforce story is not about AI failing. It is about an organization discovering what it takes to run AI inside real business workflows, where predictability matters as much as capability.

That discovery is not unique to Salesforce. It is the same inflection point many large enterprises are now encountering.

The Misread: How Headlines Flatten Complex Signals

The problem is not that these signals exist. The problem is how they are interpreted.

Headlines tend to compress complex operational lessons into binary judgments: AI works or AI doesn’t.

When Salesforce emphasized determinism and guardrails, coverage quickly framed it as declining trust in AI or evidence that autonomous agents were a dead end.

This framing is understandable — but incomplete.

Across technology history, the pattern is familiar. Early hype produces overreach.

Overreach collides with reality.

Reality forces redesign.

Headlines label that redesign as failure, even though it is the necessary path to maturity.

Course correction is not retreat. It is how immature systems eventually become infrastructure.

This is not unique to technology development; it is the core logic of experimental science.

Experiments are structured to test assumptions against reality, not to confirm their desirability. When results contradict expectations, the outcome is not experimental failure but hypothesis failure. Falsification provides the information required to redesign both the system and the next test.

AI pilots should be interpreted through this lens. They are not failed products when they perform unreliably; they are experiments revealing which assumptions about capability, cost, supervision, or context do not yet hold.

High failure rates are often a signal that the pilot is testing meaningful constraints rather than operating in a controlled showcase environment.

Treating these outcomes as evidence of collapse confuses experimental exposure with deployment readiness. The question pilots answer is not whether the system works, but under what conditions it fails, and therefore what must change next.

In the Salesforce case, what changed was not confidence in language models, but clarity about where responsibility should live.

Tasks that require guarantees moved out of probabilistic models and into workflows, controls, and evaluation layers. Language models remained powerful — but no longer carried burdens they were never designed to bear.

Reading that shift primarily as a loss of faith in AI overlooks the deeper signal: enterprises are learning how to make AI dependable.

From Salesforce to MIT: The Execution Divide

The same interpretive pattern appears when the conversation shifts from Salesforce to the MIT State of AI in Business 2025 report.

When MIT researchers introduced the concept of the GenAI Divide, headlines quickly converged on a single statistic: 95% of AI pilots fail.

[Figure: the top headlines returned by a search for “MIT State of the AI Report 2025.”]

Read literally, it sounds like an indictment of the technology itself.

That is not what the study says.

The report documents a widening gap between experimentation and measurable business impact.

More than 80% of organizations have explored or piloted generative AI. Only a small minority have translated those efforts into sustained P&L outcomes.

Crucially, MIT does not attribute this gap to model quality, regulatory friction, or lack of investment. It attributes it to execution failures: pilots that sit outside workflows, lack ownership, cannot learn post‑deployment, or are never measured against economic outcomes.

In other words, the GenAI Divide is not a technological divide. It is an execution divide.

Here are actual excerpts from the paper, not as isolated facts, but as evidence of that underlying pattern.

In addition to:

Core finding: lack of value capture: roughly 95% of organizations report zero measurable P&L impact from current GenAI initiatives

…the paper also states:

High adoption, low transformation: over 80% of organizations have piloted GenAI, but only ~40% report deployment beyond experimentation
Failure defined by outcomes, not accuracy: pilots stall because they fail to integrate with workflows, adapt to context, or learn from feedback — not because models are incapable
Pilot-to-production gap: only ~5% of task-specific pilots reach production with sustained impact
Build vs. buy signal: pilots built with external specialist partners succeed at materially higher rates (~67%) than purely in-house efforts (~20–33%), underscoring that execution and integration — not model access — are the binding constraint

Read plainly, the report’s message is not pessimistic. It is directional.

The prescription is not “use better models,” but change how pilots are scoped, owned, embedded, and evaluated.

The sections that follow translate those implications into an operating playbook.

The Core Diagnosis: Why Most AI Pilots Die

By this point, a pattern should be clear. Most AI pilots do not fail because the technology is incapable. They fail because they are set up to fail.

Across enterprises, the same failure modes repeat:

Teams optimize for visibility instead of value
Pilots sit outside real workflows
No single owner has authority to change how work gets done
Systems rely on unconstrained generative autonomy
Learning stops at deployment
Measurement is deferred until “after scaling”

Each of these issues is survivable on its own. Together, they almost guarantee that a pilot will stall.

Metric Confusion

A particularly subtle failure mode sits beneath many of these symptoms: metric confusion.

Vendors and internal teams often cite figures like “93% accuracy,” but closer inspection reveals that these numbers rarely mean what business leaders assume they mean.

In practice, they often reflect adoption rates or task completion metrics rather than correctness or reliability in operational terms. Salesforce itself has described its success metrics in terms of effectiveness for certain conversation resolutions — not a formal definition of accuracy.

This distinction matters because enterprises already understand quality — just not in the context of AI.

Frameworks like Six Sigma, originally developed at Motorola, define acceptable error rates for mission-critical processes at no more than 3.4 defects per million opportunities, or roughly 99.99966% accuracy. Against that backdrop, a headline figure like 93% should not reassure leaders; it should prompt sharper questions about scope, risk tolerance, and supervision.
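To make the comparison concrete, the arithmetic can be sketched in a few lines of Python. This is a minimal illustration that assumes "accuracy" means the share of opportunities handled without a defect. The function names are hypothetical and not part of any Six Sigma toolkit.

```python
def defects_per_million(accuracy: float) -> float:
    """Convert an accuracy rate (0..1) into defects per million opportunities."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return (1.0 - accuracy) * 1_000_000


def accuracy_from_dpmo(dpmo: float) -> float:
    """Convert defects per million opportunities back into an accuracy rate."""
    return 1.0 - dpmo / 1_000_000


SIX_SIGMA_DPMO = 3.4  # the threshold quoted in the text

headline_dpmo = defects_per_million(0.93)
gap = headline_dpmo / SIX_SIGMA_DPMO

print(f"93% accuracy = {headline_dpmo:,.0f} defects per million")
print(f"Six Sigma    = {accuracy_from_dpmo(SIX_SIGMA_DPMO):.5%} accuracy")
print(f"Gap          = roughly {gap:,.0f}x more defects")
```

Under that assumption, a 93% figure corresponds to about 70,000 defects per million opportunities, roughly 20,000 times the Six Sigma threshold.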

The implication is not that AI must immediately meet Six Sigma standards everywhere. It is that leaders must be explicit about where error is acceptable and where it is not.

High-stakes workflows require guarantees. Lower-risk workflows can tolerate approximation. Confusing the two is how pilots drift into production before the organization is prepared to manage their consequences.

When AI initiatives fail, it is rarely because the model was wrong. It is because the system was never designed to absorb, bound, and learn from error in the first place.

The Playbook: How to Land in the Successful Minority

1. Context Engineering

AI is not an application. It is a capability that must be integrated — selectively and intentionally — into existing workflows.

One of the core misconceptions exposed by the Salesforce discussion is the idea that large language models are supposed to be the system. They are not.

LLMs are powerful tools, but they only deliver value when embedded inside workflows, data systems, and controls that businesses already trust.

This is where context engineering matters. Context engineering is the discipline of deciding what the model sees, what it is responsible for, and what it is explicitly not responsible for. Done well, it removes the burden on the model to infer policy, reconstruct context, or manage execution.

The system supplies data, constrains tasks, invokes tools, and enforces boundaries so the model can focus on what it does best: interpreting language and handling ambiguity.

In practice, this means elevating workflows, systems of record, and human judgment to first‑class components of the AI system. Responsibilities that require guarantees live outside the model. Probabilistic reasoning lives where nuance adds value.

Seen this way, the tradeoff Salesforce describes is not a loss of intelligence. It is a redistribution of responsibility. Nuance belongs where it helps. Guarantees belong where risk is highest. Making that distinction explicit is how AI moves from novelty to infrastructure.

At this point, the question is no longer whether AI can be useful. It is how to operationalize it reliably.

Rule: language models are not the app; they are tools embedded where they create leverage. Value comes from the discipline of choosing what context you give them.

2. Start With an ROI Thesis

Every successful AI deployment begins with an explicit value hypothesis. Not a use case. Not a demo. A thesis.

The MIT data is clear: pilots fail not because models underperform, but because organizations never defined success in business terms.

Roughly 95% of GenAI initiatives show zero measurable P&L impact because they were never designed to produce one.

An ROI thesis forces discipline. It requires leaders to state, up front, how the system is expected to create value and on what timeline.

That value generally falls into one of four categories:

Cost‑out: eliminate labor, vendor spend, rework, or cycle time
Revenue‑up: increase conversion, retention, ARPU, win rate, or pricing power
Risk‑down: reduce errors, compliance exposure, fraud, or operational loss
Capital efficiency: improve forecasting accuracy, inventory turns, or working capital

If the value cannot be expressed in one of these terms, the initiative is exploratory — not operational — and should be treated accordingly.

Critically, the thesis must be fundable. AI pilots should be evaluated like any other investment: we invest $X to get $Y in return within Z time.

This framing protects innovation by making trade‑offs explicit and progress measurable.

Rule: if you cannot state the ROI thesis clearly enough for a CFO to challenge it, the pilot is not ready to run.
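One way to hold a thesis to that standard is to write it down as data rather than prose. The sketch below is an illustration only: the RoiThesis class, its field names, and the dollar figures are assumptions made for the example, and the payback calculation is deliberately simple and undiscounted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoiThesis:
    """A CFO-style thesis: invest X to get Y per year, paid back within Z months."""
    name: str
    category: str                  # cost-out, revenue-up, risk-down, or capital-efficiency
    investment: float              # X: total upfront investment
    annual_return: float           # Y: expected annual benefit once live
    target_payback_months: float   # Z: the payback window the sponsor commits to

    def payback_months(self) -> float:
        """Months of benefit needed to recover the investment (undiscounted)."""
        if self.annual_return <= 0:
            return float("inf")
        return self.investment / (self.annual_return / 12)

    def is_fundable(self) -> bool:
        """True when the projected payback fits inside the committed window."""
        return self.payback_months() <= self.target_payback_months


# Illustrative numbers only.
thesis = RoiThesis(
    name="AI-assisted invoice processing",
    category="cost-out",
    investment=600_000,
    annual_return=1_200_000,
    target_payback_months=12,
)
print(f"{thesis.name}: payback in {thesis.payback_months():.1f} months, "
      f"fundable={thesis.is_fundable()}")
```

A thesis expressed this way gives a CFO three numbers to challenge directly: the investment, the expected return, and the window.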

It is important to distinguish the business thesis from the technical hypotheses that support it.

In machine learning practice, progress comes from small, targeted experiments — not polished demos.

Most meaningful ML work is not demoable; it answers narrow feasibility questions or establishes limits.

A practical pattern is to whiteboard the end‑to‑end workflow, then deliberately identify small, high‑uncertainty segments.

Teams often run a “red circle” exercise: whiteboarding the desired workflow and circling the steps that require experimentation to determine whether models, tools, or automation can reliably meet requirements.

These experiments should be small, fast, and frequent. Over time, validated components are stitched together into a proof of concept and, eventually, a pilot.

The ROI thesis sits above all of this work. Hypotheses validate feasibility; the thesis validates relevance.

3. Pick High-Yield Use Cases

Once an ROI thesis is established, the next failure mode is obvious in hindsight: teams pick the wrong problems.

The MIT data shows that while adoption is broad, transformation is narrow. The difference is not tooling or enthusiasm — it is use‑case economics. High‑yield AI deployments share a small number of structural traits.

First, they are high‑frequency and repeatable. AI creates leverage when applied to work that occurs thousands or millions of times per year; frequency compounds small gains into material impact.

Second, they are measurable and controllable. The best use cases have clear inputs, clear outputs, and a well‑defined notion of “done.”

If success cannot be measured unambiguously, learning stalls and ROI debates become subjective.

Third, they are economically mispriced today — handled by expensive humans, outsourced vendors, or brittle manual processes.

AI does not need to outperform humans in absolute terms; it needs to win on unit economics.

A useful heuristic is simple: where are we paying skilled people to do work that is primarily reading, writing, classification, or coordination? Those are often the highest‑yield starting points.

High yield does not mean high visibility. Many of the most successful deployments occur in back‑office and operational workflows because they combine clean economics with bounded risk.

This connects directly back to the experimental mindset discussed earlier. After mapping the full workflow, apply the same “red circle” exercise for economic leverage — isolating steps where automation or augmentation would measurably change cost, throughput, or error rates.

Rule: prioritize use cases where small, repeatable improvements compound into material business outcomes.
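The compounding effect of frequency is easy to check with rough numbers. The following sketch ranks candidate use cases by annual savings. The UseCase class, the candidate names, and every cost figure are hypothetical values chosen for illustration, not data from the MIT report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    name: str
    annual_volume: int             # how many times per year the task occurs
    current_cost_per_unit: float   # fully loaded cost today (people, vendors, rework)
    ai_cost_per_unit: float        # projected cost with AI, including human review

    def annual_savings(self) -> float:
        """Per-unit saving multiplied by volume: frequency compounds small gains."""
        return (self.current_cost_per_unit - self.ai_cost_per_unit) * self.annual_volume


# Illustrative candidates: one visible but rare task, two frequent back-office tasks.
candidates = [
    UseCase("Executive briefing drafts", 200, 400.0, 150.0),
    UseCase("Tier-1 ticket triage", 500_000, 4.0, 2.5),
    UseCase("Invoice data entry", 1_200_000, 1.5, 0.5),
]

ranked = sorted(candidates, key=UseCase.annual_savings, reverse=True)
for uc in ranked:
    print(f"{uc.name:<28} {uc.annual_savings():>12,.0f} per year")
```

In this made-up example the high-visibility task saves the most per unit but ranks last overall, because the back-office tasks occur thousands of times more often.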

4. Decompose Control, Don’t Centralize It in the Model

A critical mistake in early pilots was treating a single, monolithic prompt as application logic.

Long, one‑shot prompts were asked to reason, plan, decide, and execute in one pass. This looks elegant in demos but proves brittle in production.

It is easy to see why many feel this is “how AI works”: their experience with AI may come only from chatbots, where users prompt the model directly. But that one‑step, “single‑shot” approach is not how AI applications are built.

Real workflows are not single decisions; they are sequences of dozens or hundreds of steps. In high‑performing systems, each step invokes the most appropriate tool: sometimes a large language model, sometimes a small language model, sometimes a deterministic service or API. Control flow lives outside the model.

This multi‑step, multi‑tool approach improves reliability by reducing cognitive load on any single model, enforcing clear boundaries, and allowing teams to mix purpose‑built models with traditional software and context engineering. The result is more predictable outcomes and easier iteration.

Rule: single‑shot prompts and long‑form prompting produce unpredictable results; multi‑step prompting integrated into workflows produces reliability.
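A small sketch may help show what "control flow lives outside the model" looks like in code. Everything here is an assumption made for illustration: classify_intent is a keyword stub standing in for a real model call, and the refund policy, order IDs, and function names are invented. The point is only the shape: the workflow sequences the steps, and the model is one tool among several.

```python
import re
from typing import Callable, Dict, Optional


def classify_intent(message: str) -> str:
    """Hypothetical stand-in for a language model call (keyword stub)."""
    text = message.lower()
    if "refund" in text:
        return "refund_request"
    if "order" in text and "where" in text:
        return "order_status"
    return "unknown"


def extract_order_id(message: str) -> Optional[str]:
    """Deterministic extraction: the same input always yields the same output."""
    match = re.search(r"\b(\d{6})\b", message)
    return match.group(1) if match else None


def refund_allowed(order_total: float, limit: float = 100.0) -> bool:
    """Policy lives in code, not in a prompt."""
    return order_total <= limit


def handle_message(
    message: str,
    order_totals: Dict[str, float],
    classify: Callable[[str], str] = classify_intent,
) -> str:
    """The workflow owns sequencing; the model only interprets language."""
    intent = classify(message)              # step 1: model handles ambiguity
    order_id = extract_order_id(message)    # step 2: deterministic extraction
    if intent != "refund_request" or order_id is None:
        return "escalate_to_human"
    total = order_totals.get(order_id)      # step 3: system-of-record lookup
    if total is None:
        return "escalate_to_human"
    # step 4: rule-based decision
    return "refund_approved" if refund_allowed(total) else "escalate_to_human"


orders = {"123456": 42.50, "654321": 980.00}
print(handle_message("I want a refund for order 123456", orders))
print(handle_message("Refund order 654321 please", orders))
print(handle_message("Where is my stuff?", orders))
```

Because the classifier is passed in as a parameter, a team could swap a small model for a large one, or a stub for a real call, without touching the policy or the sequencing.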

5. Design Pilots to Reach Production

Design pilots backward from production, not forward from novelty.

Choose the right use case and define success upfront. Too many AI pilots in 2025 were launched on vague hopes or hype.

In 2026, every pilot must answer a simple but uncomfortable question: How will this make or save money if it works? If that answer is unclear, the initiative is not a pilot yet. Ground the scope in a concrete, CFO-legible outcome.

For example, instead of “let’s use AI in customer service,” narrow the objective to “reduce Tier‑1 support handling time by 50% via AI‑assisted triage,” or “automate invoice processing to cut outsourcing spend by $5M per year.”

Precision forces discipline.

Early in design, perform a task–resource fit check. If the task requires strict accuracy, identical outputs every time, or heavy numerical computation, a pure LLM approach is often the wrong tool. In those cases, use deterministic automation or a hybrid system where generative AI handles ambiguity at the edges and rules handle execution.

Asking “is this actually a good fit for generative AI?” up front avoids expensive misapplication.

Design for speed and iteration. Keep the initial scope narrow enough to build and test in 8–12 weeks, not a year.

Rule: Design pilots for production economics and operational reality from day one; novelty without a path to scale is not a pilot.

Pilots should run like real projects, with owners, KPIs, and hypotheses to prove. Instrument the workflow from day one so outcomes can be measured against a baseline.

Pilots that survive to production are not the most impressive demos; they are the ones that make the economic case undeniable.

6. Integration Strategy: Embed (Don’t Isolate)

One of the clearest lessons from 2025 is that integration is everything.

AI systems that sit beside workflows fail; AI systems that live inside workflows scale.

In pilot planning, prioritize integration points early: how will the AI plug into existing workflows, data sources, and applications? If users must leave their system of record to access the AI, adoption will be cosmetic at best.

Limit dependencies aggressively. Enterprise environments are dense with interlocking systems, and attempting to integrate with all of them at once is a reliable way to stall a pilot.

High‑yield pilots start with a contained surface area: a single data source, a single process, or a single business unit. Prove value there, then expand.

Do not shy away from hybrid architectures. Pure LLM systems often struggle with workflows that require identical execution every time or strict sequencing. The most robust deployments pair generative AI with traditional automation and rules‑based control.

Rule: for high‑stakes workflows, a deterministic spine with AI at the edges consistently outperforms pure generative autonomy.

Finally, plan for data and compliance integration from day one. Respect data boundaries, security policies, and audit requirements upfront. Retro‑fitting governance after a pilot succeeds is one of the fastest ways to kill momentum.

Rule: if the AI is not embedded in the system of record, adoption will be superficial.

7. Engineer Trust Explicitly

Trust is not a sentiment; it is a systems property.

Research in other high‑stakes domains makes this clear. In healthcare, psychometric instruments such as the Trust in Physician Scale and the Wake Forest Physician Trust Scale measure trust along dimensions like competence, honesty, and communication.

The bar is high: in recent U.S. surveys, over 90% of respondents trust physicians for serious diagnoses, while roughly 20–25% trust AI systems in comparable contexts.

The implication for enterprise AI is straightforward. Trust is earned through structure, not novelty. Systems are trusted when their behavior is predictable, observable, and correctable.

Start with guardrails and determinism. Be explicit about what the system is allowed to do, which tools it can invoke, and where autonomy ends. In high‑stakes workflows, execution paths must be constrained so outcomes are repeatable.

Design escalation paths and objective boundaries. Agents should have explicit goals and limits that prevent drift when users go off‑path. When confidence is low or risk thresholds are crossed, the system should defer cleanly to a human. Silent failure erodes trust faster than visible limitation.
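An escalation path of this kind can be expressed as an explicit routing function. The sketch below is illustrative: the allowed actions, the 0.85 confidence threshold, and the 100.0 auto-approval limit are assumed values, and how a real system estimates confidence is left open. Each deferral returns a reason, so the limitation is visible rather than silent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Route(Enum):
    AUTO = "auto"
    HUMAN = "human"


@dataclass(frozen=True)
class Proposal:
    action: str        # what the agent wants to do
    confidence: float  # 0..1, however the system estimates it
    amount: float      # monetary exposure of the action


# Illustrative guardrails, not recommended values.
ALLOWED_ACTIONS = {"send_status_update", "issue_refund"}
MIN_CONFIDENCE = 0.85
MAX_AUTO_AMOUNT = 100.0


def route(proposal: Proposal) -> Tuple[Route, str]:
    """Return where the proposal goes and why, checking the hardest limits first."""
    if proposal.action not in ALLOWED_ACTIONS:
        return Route.HUMAN, "action is outside the allowed set"
    if proposal.confidence < MIN_CONFIDENCE:
        return Route.HUMAN, "confidence below threshold"
    if proposal.amount > MAX_AUTO_AMOUNT:
        return Route.HUMAN, "amount exceeds auto-approval limit"
    return Route.AUTO, "within guardrails"


for p in [
    Proposal("issue_refund", 0.95, 40.0),
    Proposal("issue_refund", 0.60, 40.0),
    Proposal("issue_refund", 0.95, 900.0),
    Proposal("delete_account", 0.99, 0.0),
]:
    decision, reason = route(p)
    print(f"{p.action:<20} -> {decision.value:<5} ({reason})")
```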

Invest in observability and auditability. Teams should be able to answer, at any time: what the system saw, what sources it used, what decision it made, and why. This traceability is essential for both incident response and improvement.
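Those four questions map naturally onto a structured log record. The following is a minimal sketch, assuming one JSON line per decision. The DecisionRecord class and its field names are hypothetical, and a production system would also need retention, access control, and redaction policies.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class DecisionRecord:
    request_id: str
    input_seen: str          # what the system saw
    sources_used: List[str]  # what sources it used
    decision: str            # what decision it made
    rationale: str           # why
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize to one JSON line, suitable for an append-only audit log."""
        return json.dumps(asdict(self), sort_keys=True)


record = DecisionRecord(
    request_id="req-001",
    input_seen="I want a refund for order 123456",
    sources_used=["orders_db:123456", "refund_policy:v3"],
    decision="refund_approved",
    rationale="order total is under the auto-approval limit",
)
line = record.to_json()
print(line)
```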

This is where evaluation frameworks become non‑negotiable. Effective teams establish evals before building, use them during development, and rely on them after deployment. These frameworks are not generic QA. They define task‑specific success and make it possible to measure progress as prompts, context, tools, and models change.
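At its simplest, an eval is a fixed set of task-specific cases that is re-run whenever a prompt, tool, or model changes. The sketch below assumes a ticket-routing task. The route_ticket function is a keyword stub standing in for the system under test, and the cases are invented for illustration.

```python
from typing import Callable, List, Tuple

EvalCase = Tuple[str, str]  # (input text, expected label)


def route_ticket(text: str) -> str:
    """Hypothetical system under test (a keyword stub, not a real model)."""
    lowered = text.lower()
    if "invoice" in lowered or "charge" in lowered:
        return "billing"
    if "password" in lowered:
        return "account"
    return "general"


CASES: List[EvalCase] = [
    ("I was charged twice this month", "billing"),
    ("Please resend my invoice", "billing"),
    ("I cannot reset my password", "account"),
    ("My login keeps failing", "account"),
]


def run_evals(
    system: Callable[[str], str], cases: List[EvalCase]
) -> Tuple[float, List[EvalCase]]:
    """Return the pass rate and the list of failing cases."""
    failures = [(text, expected) for text, expected in cases if system(text) != expected]
    pass_rate = (len(cases) - len(failures)) / len(cases)
    return pass_rate, failures


pass_rate, failures = run_evals(route_ticket, CASES)
print(f"pass rate: {pass_rate:.0%}")
for text, expected in failures:
    print(f"FAILED: {text!r} should route to {expected!r}")
```

The failing case is the useful output here: it names a specific gap to close, and the same suite shows whether the next change closes it without breaking the cases that already pass.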

Eval frameworks also enable evidence‑based interfaces — “show your work” experiences that surface sources, reasoning, and actions. Evidence reduces cognitive friction for users and simplifies support and maintenance.

Rule: trust comes from controls, relevance, evidence, and traceability — not from language model capabilities.

This can be summarized as a simple operating equation:

Trust = Relevance + Accuracy + Evidence

Relevance from context, memory, and personalization
Accuracy from task‑fit and deterministic control where required
Evidence from evals, traceability, and inspectable outputs within the UI

Remove any one of these, and trust decays. Put all three in place, and trust compounds over time.

8. Close the Learning Loop: Build Continuous Learning Into the System

If there is one capability gap that separates pilots that stall from deployments that scale, it is this: the system must learn and improve over time.

Static systems decay; living systems compound.

Start with explicit in‑line feedback mechanisms. Users should be able to correct, rate, or override outputs as part of their normal workflow. Every edit or rejection is high‑value signal. Early pilots may rely on simple flags or annotations; production systems should capture structured telemetry and labeled feedback.

Pay particular attention to override data. When humans change an output in the conversation path, they are teaching the system what “good” looks like. This signal is far more valuable than periodic surveys.
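Capturing that signal requires little more than a structured event per interaction. This is a sketch under assumptions: the FeedbackEvent class, its field names, and the three action labels are hypothetical, and what a team does with the resulting pairs (evals, prompt changes, fine-tuning) is a separate decision.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class FeedbackEvent:
    request_id: str
    model_output: str
    action: str                         # "accepted", "edited", or "rejected"
    final_output: Optional[str] = None  # what the user actually used, if edited


def override_rate(events: List[FeedbackEvent]) -> float:
    """Share of outputs that users edited or rejected."""
    if not events:
        return 0.0
    overridden = sum(1 for e in events if e.action in ("edited", "rejected"))
    return overridden / len(events)


def correction_pairs(events: List[FeedbackEvent]) -> List[Tuple[str, str]]:
    """(model output, human correction) pairs: examples of what good looks like."""
    return [
        (e.model_output, e.final_output)
        for e in events
        if e.action == "edited" and e.final_output is not None
    ]


events = [
    FeedbackEvent("r1", "Route to billing", "accepted"),
    FeedbackEvent("r2", "Route to general", "edited", "Route to account"),
    FeedbackEvent("r3", "Close ticket", "rejected"),
    FeedbackEvent("r4", "Route to billing", "accepted"),
]
print(f"override rate: {override_rate(events):.0%}")
print(correction_pairs(events))
```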

For generative systems, invest early in memory and context retention. One‑off conversations create friction and limit usefulness. Systems should recall prior interactions, decisions, and constraints by integrating with knowledge bases, vector stores, or persisted conversation state.

From an organizational perspective, feedback must be treated as training data, not noise. From a technical perspective, it must be easy to capture and route.

Rule: build feedback capture into the UI, not into a monthly survey.

In 2026, any serious AI deployment should be a living system — measurably better in month three than in month one, with a clear explanation of what changed and why.

9. ROI Modeling and Measurement: Prove Value or Pivot Fast

With drift prevented and trust engineered, the work shifts from definition to verification.

Develop an ROI measurement model early. Before or during the pilot, establish how costs and benefits will be tracked in practice.

Include upfront investment (vendor fees, engineering time), ongoing costs (infrastructure, maintenance), and expected gains expressed in monetary terms.

Track tangible metrics, not vanity metrics. Page views, prompt counts, or “number of AI queries” are not business value.

Measure what ties directly to the thesis: time per task, throughput, error rate, cost per unit, revenue impacted.
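A measurement model along these lines can start as a short calculation against the baseline. The sketch below assumes a cost-out thesis measured in time per task. The PilotMeasurement class and all figures are illustrative, and the ROI formula used is the simple one: total benefit minus total cost, divided by total cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PilotMeasurement:
    tasks_per_month: int
    baseline_minutes_per_task: float  # measured before the pilot
    pilot_minutes_per_task: float     # measured during the pilot
    loaded_hourly_rate: float         # fully loaded labor cost per hour
    upfront_cost: float               # vendor fees, engineering time
    monthly_run_cost: float           # infrastructure, maintenance

    def monthly_gross_benefit(self) -> float:
        """Labor value of the time saved each month."""
        minutes_saved = self.baseline_minutes_per_task - self.pilot_minutes_per_task
        return minutes_saved * self.tasks_per_month / 60 * self.loaded_hourly_rate

    def roi(self, months: int) -> float:
        """(total benefit - total cost) / total cost over the given horizon."""
        total_benefit = self.monthly_gross_benefit() * months
        total_cost = self.upfront_cost + self.monthly_run_cost * months
        return (total_benefit - total_cost) / total_cost


# Illustrative numbers only.
m = PilotMeasurement(
    tasks_per_month=10_000,
    baseline_minutes_per_task=12.0,
    pilot_minutes_per_task=6.0,
    loaded_hourly_rate=40.0,
    upfront_cost=150_000,
    monthly_run_cost=10_000,
)
print(f"monthly gross benefit: {m.monthly_gross_benefit():,.0f}")
print(f"12-month ROI: {m.roi(12):.0%}")
```

The same structure works for other thesis categories by swapping the benefit calculation, for example error-rate reduction multiplied by cost per error for a risk-down thesis.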

Publicize early wins. When a pilot saves $100K in outsourced spend or compresses a workflow by 30% in its first quarter, make that visible internally. Evidence builds momentum and unlocks follow‑on investment.

Be prepared to pivot or kill projects. Measurement exists to enable hard decisions, not to justify sunk costs. Gartner has predicted that more than 40% of agentic AI projects will be cancelled by the end of 2027 as reality sets in. Cancellation is not failure if it frees capital and talent for higher‑yield use cases.

Rule: if pilot metrics do not map cleanly to business outcomes, stop or redesign.

10. Organizational Ownership and Change Management

AI initiatives fail when they are treated as technical deployments. They scale when they are treated as organizational transformations.

Ownership must be explicit. Someone must be accountable for operating the system in production: maintaining it, updating models or rules, curating data, handling escalation, and integrating it into daily work. Ambiguity here is one of the most common failure modes — pilots that technically work but never change how work actually gets done.

Executive sponsorship is non‑optional. The MIT research identifies lack of senior ownership as a primary blocker to scaling AI beyond pilots. A senior leader must own the outcome, remove friction, and make trade‑offs when workflows, incentives, or roles need to change. AI that threatens no one’s incentives rarely changes anything.

End‑user engagement is equally critical. Adoption accelerates when AI is positioned as augmentation, not replacement. A useful way to ground this for teams is simple:

AI isn’t going to take your job — just the part of your job you hate the most.

When designed correctly, AI absorbs repetitive, low‑leverage work so people can focus on judgment, decision‑making, and outcomes that actually matter.

The same principle applies to AI agents. They are extensions of human intent — delegating, coordinating, executing, and reporting back — while humans retain judgment and accountability. Systems built this way earn trust and pride of use rather than fear or resistance.

Rule: AI scales only when ownership is explicit, executive sponsorship is real, and accountability for outcomes — not experiments — sits in the business.

Finally, culture matters. Teams that succeed with AI celebrate wins, normalize intelligent failure, and share lessons openly.

As the MIT report underscores, the gap between leaders and laggards is widening.

The winners are not experimenting more — they are integrating AI into how work gets done, deliberately and consistently.

11. Scale Through an Operating Model

Scaling AI is not a technology problem; it is an operating model problem.

The MIT data shows that the organizations that cross that gap do not do so by upgrading models. They do so by changing how AI work is owned, funded, and run.

First, business ownership is mandatory. Successful AI systems have a clear business owner with authority over the workflow being changed. That owner is accountable for outcomes, not experimentation.

When ownership sits only with innovation teams or central IT, pilots linger without impact.

Second, treat AI as a product discipline, not a project. This means appointing AI product managers responsible for roadmap, metrics, user experience, and lifecycle management. Models will change, prompts will evolve, tools will be swapped — but the product and its economic goals persist.

Third, invest in platform reuse. High-performing organizations do not rebuild retrieval, evaluation, observability, or feedback systems for every use case. They develop shared capabilities that make the second and third deployments materially cheaper and faster than the first.

Finally, use partner leverage strategically. The MIT report shows that pilots built with external specialists succeed at far higher rates (67%) than those built purely in-house.

This is not an argument for permanent outsourcing. It is an argument for sequencing: use partners to accelerate time-to-production, transfer operational knowledge, and de-risk early deployments. Once workflows, controls, and ROI are proven, internal teams can — and should — own and extend the system.

This operating model reflects a broader maturity shift. Early in the curve, speed and learning matter more than ownership purity. Later, differentiation and control matter more than raw velocity.

Organizations that understand when to switch modes scale faster and more reliably.

Rule: scale AI through operating discipline — clear ownership, product management, shared platforms, and pragmatic partner leverage — not by chasing ever more capable models.

12. Data‑First AI Investors Are the Leaders

Most organizations are spending aggressively on AI, but the majority are not seeing meaningful financial returns. Research from BCG shows that around 60% report minimal revenue growth and cost reduction despite heavy AI investment.

A small group of organizations stands apart. These “future‑built” leaders are achieving roughly five times the revenue gains and three times the cost savings of their peers. The difference is not the size of their AI budgets or the specific models they use.

What separates leaders from laggards is foundational readiness. These organizations invest early in data infrastructure and core capabilities that allow AI systems to work reliably inside real workflows. They focus on getting the basics right — data quality, integration, feedback loops, and governance — before attempting to scale AI across the business.

MIT’s analysis consistently shows that integration maturity and data readiness, not AI spend, are the strongest predictors of sustained ROI.

Rule: AI returns compound where data foundations are strong; spending more on models without investing in data and integration rarely changes outcomes.

Why 2026 Is Different

2026 marks the transition from experimentation to accountability.

Capital markets are no longer rewarding AI narratives without returns. Boards are asking harder questions. CFOs are scrutinizing budgets. AI spend is moving from discretionary innovation lines into core operating expense — and with that shift comes ROI pressure.

At the same time, the easy pilots have been tried. Most organizations now know what demos look like.

The remaining opportunity lies in harder work: workflow redesign, integration, governance, and change management.

This is why the MIT finding matters. The failure of most pilots was not a technology problem; it was an operating one. In 2026, that excuse expires. AI initiatives will be expected to justify themselves like any other investment.

The era of experimentation theater is ending. The era of execution has begun.

The Executive Checklist for AI ROI

This checklist is the compressed version of the playbook above. It is designed to surface failure modes before capital, credibility, and time are spent.

If you ask nothing else, ask these:

Economic discipline

What is the explicit ROI thesis (cost‑out, revenue‑up, risk‑down, or capital efficiency)?
Can the value be expressed as a CFO‑style equation ($X invested → $Y returned in Z time)?
What is the unit of value (per case, per ticket, per order), and how does it scale with volume?

Workflow and use‑case discipline

What exact workflow changes if this succeeds?
Where, specifically, does AI create leverage inside that workflow?
Is this use case high‑frequency, repeatable, and economically mispriced today?

System and architecture discipline

Is this task actually a good fit for generative AI, or does it require deterministic guarantees?
Where does control flow live — inside the model or in the surrounding workflow?
Is the system embedded in the system of record, or will adoption be cosmetic?

Trust, learning, and operations

What happens when the system is wrong, uncertain, or off‑task?
How is user feedback captured in‑line and converted into learning?
What evaluation framework exists to measure progress as prompts, context, and tools evolve?

Ownership and scale

Who owns this system in production, with authority to change the workflow?
What is the payback period at scale, not just in the pilot?
What is the explicit kill criterion, and when will we decide?

If these questions cannot be answered crisply, the initiative is not ready to scale.

2026 — The Year AI Proves Its ROI

Here is the part most leaders miss: nothing described in this playbook is theoretical, futuristic, or dependent on a breakthrough model. It is all executable with the tools, talent, and data most enterprises already have.

The gap separating the few organizations generating real AI value from the many still stuck in pilots is not intelligence. It is method. While headlines debate whether AI is “working,” the winners have quietly moved on to a more pragmatic question: Are we running this like a business system?

Salesforce’s evolution makes this concrete. The shift toward guardrails, determinism, observability, and workflow control was not a retreat from AI — it was the point at which AI became operationally serious. That same transition now sits in front of most large enterprises.

If you adopt this approach, you do not need to wait for perfect models or universal trust.

You can start Monday by picking one workflow that matters, stating the ROI thesis clearly, running small experiments to de‑risk feasibility, and embedding AI where it creates leverage instead of fragility. Do that once, measure it honestly, then do it again.

This is the advantage most organizations overlook. When you understand why pilots fail, and how the successful minority operates, you stop chasing headlines and start executing with confidence. You are no longer guessing. You have a framework, one that spans ROI theses, system design and trust, and operating models that scale.

In 2026, AI success will not belong to the loudest evangelists or the most cautious skeptics. It will belong to leaders who treat AI as operating discipline: scoped tightly, owned clearly, measured rigorously, and improved continuously.

Once that mindset takes hold, the narrative changes — from whether AI works, to how fast you can scale what already does.

Momentum compounds faster than novelty.

Understanding Claris MCP

iSolutions — Wed, 17 Dec 2025 12:40:59 GMT

What MCP Actually Enables for FileMaker Developers

The Model Context Protocol (MCP) is an open standard created by Anthropic that aims to bridge the gap between large language models and real‑world data systems.

AI models often struggle to access context due to proprietary APIs and isolated data repositories. MCP solves this problem by providing a universal, secure interface through which AI assistants can interact with data sources and perform actions on behalf of users.

Anthropic’s announcement described MCP as a way to give AI systems reliable access to data, replacing fragmented integrations with a single protocol.

The initial release included the specification and SDKs, open‑source servers for common systems, and local server support in Claude Desktop, enabling developers to connect data sources like Google Drive, Slack, GitHub and Postgres via a standard interface.

MCP operates using three roles:

1. Host (AI‑powered application that people interact with)

2. Client (component within the host that manages protocol communication)

3. Server (application or service that exposes data and functions)

This architecture enables secure, two‑way connections between AI assistants and data sources without custom integrations for each system.

Claris MCP: Bringing MCP to FileMaker

By acting as a translator between FileMaker solutions and AI assistants, Claris MCP lets developers expose FileMaker data and scripts to AI without writing new APIs or learning new programming languages. This gives developers several advantages:

• No‑code AI integration: MCP automatically generates tools that map FileMaker tables and scripts to AI‑accessible functions. This eliminates the need for manual REST API development and allows developers to configure AI access through a web interface.

• Native FileMaker integration: MCP works with FileMaker Server 2025, leveraging existing scripts and business logic without rebuilding applications. Developers can expose selected scripts to AI assistants while keeping data within FileMaker.

• Adaptability as AI evolves: MCP is vendor‑agnostic and supports multiple AI clients, so developers can switch or add AI models without rewriting integrations.

• Security and control: Claris MCP enforces granular permissions. AI can only access tables and scripts approved by the developer, and interactions are mediated through auditable requests rather than direct database access.

• Natural language interaction for end‑users: With MCP, end‑users can ask questions or trigger actions in plain English (“How many orders shipped last week?” or “Reassign this task to Robert”) rather than navigating complex UIs.

Claris MCP for FileMaker: Real‑World AI Use Cases Beyond “Chat”

Using MCP, a FileMaker solution can not only answer questions but also execute actions in response to natural language prompts.

In other words, the AI gains agency: it can call FileMaker’s own scripts, modify records, or trigger workflows on command — all under the developer’s control.

This opens up exciting possibilities for small-business apps in service, logistics, retail, healthcare, and more, where staff could simply ask the system to handle tasks that once required clicking through layouts or writing complex queries.

How is this possible? Claris MCP exposes your FileMaker tools (like scripts, data operations, and business logic) as callable functions for the AI.

Instead of building custom APIs or new UIs, developers configure which parts of their app to expose, and MCP auto-generates a set of tools representing those capabilities.

The AI assistant (e.g. Claude) then “sees” these FileMaker tools and can invoke them during a conversation.

Crucially, this is all done securely and with respect to your FileMaker business logic — the AI can only do what you’ve explicitly allowed. The result is a conversational interface where users can both retrieve insights and trigger actions in one seamless dialogue.

In this article, we’ll explore validated, real-world use cases of Claris MCP that demonstrate clear business value for FileMaker developers.

We’ll see how MCP enables natural-language workflows like generating reports, updating records, running multi-step processes, and even integrating with external services — all using the capabilities of your FileMaker app.

Each example is grounded in what Claris MCP supports today (as of December 2025), ensuring you can implement these scenarios right now without speculative features.

Along the way, we’ll include official MCP configuration to illustrate how FileMaker scripts and tables become AI “tools,” and we’ll close with some surprising, underutilized ideas to inspire even the most skeptical FileMaker developers.

Claris MCP in a Nutshell: Exposing FileMaker as AI Tools

At a high level, Claris MCP turns your FileMaker Server into an MCP Server — a service that AI clients can connect to in order to interact with your data and logic.

When setting up MCP, you create a context (an environment for a set of tools) and add a connection to your FileMaker database.

You then choose which tables and scripts to expose. From this configuration, Claris MCP automatically generates a library of tools reflecting your schema:

• Data tools for each table: For every selected table, MCP provides functions to create new records, find or query records, edit or delete records, and so on. Essentially, all basic CRUD operations become available as JSON-based tools (e.g. search_customers_records, create_orders_record). If your table has container fields, it even creates tools to upload files to those fields or get download links.

• Script tools for each script: Every FileMaker script you choose to expose becomes a callable tool (named like execute_{script}_script). This lets the AI trigger predefined business logic with precision. Anything your script can do (updating multiple tables, sending emails, calling APIs, generating PDFs) can now be invoked by the AI as a single action.

MCP turns FileMaker into a palette of actions for the AI. The developer doesn’t have to write new code; MCP reads your file’s schema via the Data API/OData and builds the tools automatically. You remain in full control by deciding which tables and scripts are included and by setting the appropriate FileMaker privilege sets (for example, only dedicated accounts with the fmrest or fmodata extended privilege can be used).
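
To make the naming pattern concrete, here is a minimal sketch that derives tool names from a list of exposed tables and scripts. It only mirrors the names that appear in this article; the `slug` normalization rule and the exact set of per-table tools are assumptions, not Claris MCP's documented behavior.

```python
import re

# Per-table tool name patterns, taken from the names used in this article.
DATA_TOOL_PATTERNS = [
    "search_{table}_records",
    "list_{table}_records",
    "create_{table}_record",
    "update_{table}_record",
    "delete_{table}_record",
]


def slug(name: str) -> str:
    """Lowercase a name and collapse other characters to underscores (assumed rule)."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def tool_names(tables: list[str], scripts: list[str]) -> list[str]:
    """List the tool names an AI client would see for the exposed schema."""
    names = [p.format(table=slug(t)) for t in tables for p in DATA_TOOL_PATTERNS]
    names += [f"execute_{slug(s)}_script" for s in scripts]
    return names


for name in tool_names(["Customers"], ["Send Receipt", "CreateFollowUpTask"]):
    print(name)
```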

Each tool has a machine-readable name (for the AI to call) and a description that tells the AI what it does.

FileMaker developers can customize these to be as descriptive as possible, which is highly recommended, since the AI uses the names and descriptions to decide which tool to invoke for a given user request.

For example, you might give a script tool the description

“Marks an invoice as paid and sends a confirmation email to the client”

— that way if a user says

“Mark this invoice paid and notify the client,”

the AI will recognize that tool as a match. You can also define an input schema for each tool, specifying what parameters (fields, values, etc.) it expects.

This ensures the AI provides the correct details (like an Invoice ID and status value for an “update invoice” tool) when invoking an action.
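
As a rough illustration, a tool definition in the MCP specification carries a `name`, a `description`, and an `inputSchema` written as JSON Schema. The sketch below expresses one as a Python dictionary; the tool name, the parameter names, and the `missing_arguments` helper are hypothetical and are not taken from Claris MCP's generated output.

```python
# Hypothetical tool definition for the "mark invoice paid" example.
mark_paid_tool = {
    "name": "execute_mark_invoice_paid_script",
    "description": "Marks an invoice as paid and sends a confirmation email to the client",
    "inputSchema": {
        "type": "object",
        "properties": {
            "invoice_id": {"type": "string"},
            "status": {"type": "string", "enum": ["Paid", "Unpaid"]},
        },
        "required": ["invoice_id", "status"],
    },
}


def missing_arguments(tool: dict, arguments: dict) -> list[str]:
    """Return the required parameters that a proposed tool call leaves out."""
    return [key for key in tool["inputSchema"]["required"] if key not in arguments]


print(missing_arguments(mark_paid_tool, {"invoice_id": "1001"}))
```

A descriptive schema like this is what lets the AI client notice that a status value is still missing before it attempts the call.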

Once your context and tools are configured, Claris MCP generates a connection snippet (JSON) that you plug into your AI client (such as Claude’s desktop app).

This snippet contains the MCP server URL and an access token. With it, the AI can connect and discover all the tools you defined.

From that point on, a user chatting with the AI can ask questions or issue commands, and the AI will dynamically decide when to call one of your FileMaker tools to fulfill the request.
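
Discovery in MCP happens through a JSON-RPC request with the method `tools/list`. The toy handler below sketches that exchange in memory, assuming a simplified server that returns only names and descriptions (real servers also return each tool's input schema); the tool descriptions are made up for illustration.

```python
def handle_request(request: dict, tools: dict[str, str]) -> dict:
    """Answer a JSON-RPC request on behalf of a toy MCP server."""
    if request["method"] == "tools/list":
        listing = [{"name": n, "description": d} for n, d in tools.items()]
        return {"jsonrpc": "2.0", "id": request["id"], "result": {"tools": listing}}
    # Standard JSON-RPC error code for an unknown method.
    return {"jsonrpc": "2.0", "id": request["id"],
            "error": {"code": -32601, "message": "Method not found"}}


exposed = {
    "search_customers_records": "Find customer records matching a filter",
    "execute_send_receipt_script": "Emails a receipt for a given invoice",
}
response = handle_request({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}, exposed)
names = [tool["name"] for tool in response["result"]["tools"]]
print(names)
```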

The following sections showcase what kinds of interactions this makes possible — all using capabilities that are fully supported in Claris MCP’s current implementation.

Use Case 1: Natural Language Data Insights and Reporting

One of the most immediate benefits of MCP is the ability to get instant data insights from FileMaker via natural language. Instead of running a pre-built report or crafting a complex find request, a user (e.g. a manager or staffer) can simply ask the AI a question, and the AI will fetch the answer using FileMaker tools.

This goes beyond static “chat” because the AI can interpret flexible queries and pull exactly the data needed on the fly.

Example — Customer and Sales Queries: A sales manager might ask:

“Which customers haven’t placed orders in 90 days?”

In response, the AI (connected to your FileMaker CRM) can call the search_customers_records tool with a date filter to find those lapsed customers. It then presents the list of names right in the conversation.
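
Under the hood, a tool invocation is a JSON-RPC request with the method `tools/call`. The sketch below builds one for the 90-day question. The `filter` argument name and the `LastOrderDate` field are assumptions made for illustration; the real argument shape depends on the input schema your MCP server publishes.

```python
import json
from datetime import date, timedelta


def lapsed_customers_call(today: date, days: int = 90, request_id: int = 1) -> dict:
    """Build a tools/call request for customers with no order since the cutoff."""
    cutoff = today - timedelta(days=days)
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "search_customers_records",
            # OData-style comparison; the field name is hypothetical.
            "arguments": {"filter": f"LastOrderDate lt {cutoff.isoformat()}"},
        },
    }


request = lapsed_customers_call(date(2025, 12, 17))
print(json.dumps(request, indent=2))
```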

If you then follow up with

“What’s our monthly revenue trend for this year?”

the AI can use a combination of finds or an aggregate query (perhaps via a specialized script tool) to calculate revenue per month and then produce a concise summary or even a table of figures.
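
If the aggregation happens on the client side, it amounts to grouping the returned records by month. A minimal sketch, assuming each order record comes back with an ISO `order_date` string and a numeric `total` (both field names are hypothetical):

```python
from collections import defaultdict


def revenue_by_month(orders: list[dict]) -> dict[str, float]:
    """Sum order totals per calendar month, keyed as YYYY-MM."""
    totals: dict[str, float] = defaultdict(float)
    for order in orders:
        month = order["order_date"][:7]  # "2025-01-15" -> "2025-01"
        totals[month] += order["total"]
    return dict(sorted(totals.items()))


sample_orders = [
    {"order_date": "2025-01-15", "total": 100.0},
    {"order_date": "2025-02-03", "total": 50.0},
    {"order_date": "2025-01-28", "total": 200.0},
]
print(revenue_by_month(sample_orders))
```

For large tables, a server-side script that returns pre-aggregated totals keeps the number of records passed through the conversation small.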

All of this happens in seconds, without a human running reports or exporting data to analyze elsewhere.

Such ad-hoc querying is grounded in your live FileMaker data and respects the business logic you’ve set up.

For instance, you could expose a read-only table occurrence or a pre-calculated summary field via MCP if you want the AI to use a specific definition of “revenue” or to include certain criteria.

The key is that the AI can understand a plain-English request and translate it into the appropriate tool calls (find, list, etc.) to retrieve exactly the data needed, thanks to the structured tools you’ve provided.

Benefits: This use case empowers non-technical users to get insights without waiting on developers for new reports or doing manual exports.

It’s like having a smart analyst available at any time — one who knows how to query your FileMaker solution.

This is especially useful for small businesses that might not have dedicated data analysts. The owner of a retail shop could ask,

“Which products are our top sellers this quarter and how do they compare to last quarter?”

…and the AI can immediately fetch sales data and provide an answer.

A healthcare clinic manager could ask,

“How many appointments did each doctor complete this week?”

…and get an instant breakdown. These on-demand queries save time and help businesses make decisions faster.

Use Case 2: AI-Triggered Updates and Actions (Beyond Read-Only Chat)

Perhaps the most powerful aspect of Claris MCP is enabling action — using AI to drive changes in the database or kick off procedures. Because MCP can expose FileMaker’s write operations and scripts, users can accomplish tasks through conversation that would normally require multiple clicks or data entry steps.

This is a game-changer for busy small business workflows, turning a chat prompt into real work done in the system.

Example — Updating Records and Running Scripts: Imagine a service business uses FileMaker to track invoices and emails. With MCP, an employee could tell the AI:

“Mark invoice #1001 as Paid and email the receipt to the client.”

In a single prompt, two things happen:

1. The AI calls an update_invoices_record tool to set the status of invoice #1001 to “Paid.”
2. It then calls execute_send_receipt_script (a script tool you defined), which handles composing and sending the email.

The user sees the AI respond with something like,

“Sure — I’ve updated the invoice to Paid and sent the confirmation email.”

In the background, those FileMaker actions were executed just as if a user had done them via the app UI.

No manual data update, no opening an email client — the routine task is handled in seconds.
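
The two-call sequence can be simulated with in-memory stand-ins for the tools. Everything below is a sketch: the dictionaries replace the FileMaker tables, the functions only imitate what the real tools would do, and the argument names are assumptions.

```python
# In-memory stand-ins for the Invoices table and the outgoing mail log.
invoices = {"1001": {"status": "Unpaid", "client_email": "client@example.com"}}
sent_emails: list[str] = []


def update_invoices_record(invoice_id: str, status: str) -> dict:
    invoices[invoice_id]["status"] = status
    return {"invoice_id": invoice_id, "status": status}


def execute_send_receipt_script(invoice_id: str) -> dict:
    recipient = invoices[invoice_id]["client_email"]
    sent_emails.append(recipient)
    return {"emailed": recipient}


TOOLS = {f.__name__: f for f in (update_invoices_record, execute_send_receipt_script)}


def run_plan(plan: list[tuple[str, dict]]) -> list[dict]:
    """Run tool calls in order; an exception in one call stops the rest."""
    return [TOOLS[name](**arguments) for name, arguments in plan]


results = run_plan([
    ("update_invoices_record", {"invoice_id": "1001", "status": "Paid"}),
    ("execute_send_receipt_script", {"invoice_id": "1001"}),
])
print(results)
```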

Consider another scenario: a logistics coordinator might say,

“Schedule a follow-up call for this client next week.”

The AI can immediately invoke a script like CreateFollowUpTask (exposed as execute_createfollowuptask_script or similar) with the client’s ID and date as parameters.

The script creates a new Task record for the appropriate date and perhaps even sends a notification.

The AI then confirms to the user,

“Follow-up task scheduled for [date].”

What used to require navigating to a Tasks layout, creating a record, filling fields, and maybe emailing a reminder is now one friendly command to the AI.

How it works under the hood:

These prompts leverage the write-enabled tools MCP provides. For any table you included, tools like create_{table}_record, update_{table}_record, and delete_{table}_record are available for the AI.

You might expose only certain tables (e.g. Invoices and Tasks but not sensitive HR records) — a best practice is to only expose what’s needed and nothing more.

You also likely have existing scripts for complex tasks (like sending emails, or calculating totals); exposing them as tools means the AI will prefer using that script (ensuring your business rules are followed) instead of trying to do multiple raw data operations.

For example, if marking an invoice Paid should also log a payment record and send an email, a script can handle all that, and the AI just calls that script tool directly.

Benefits: AI-triggered actions can save significant time and reduce errors. Staff can accomplish multi-step updates by simply describing what needs to happen.

This is especially valuable in small businesses where employees wear many hats — they can offload repetitive admin steps to the AI.

It also enforces consistency: by routing through approved FileMaker scripts, you ensure the AI’s actions follow the same rules as a trained user would.

No more forgetting to send an email or update a related record; if the AI does it via your script, it will be done right every time.

And because MCP uses the FileMaker Data API under the hood, all changes are transactionally safe and logged like any other edit to the database.

Use Case 3: Multi-Step Workflows and AI “Agents” in FileMaker

Taking the idea of AI actions further, MCP allows creation of multi-step, conversational workflows that would be cumbersome through a normal UI.

An AI agent can combine data lookup, reasoning, and action in a continuous loop — something that feels very natural in conversation but traditionally required a user to manually switch between multiple layouts or systems. With MCP, the AI can orchestrate these steps for you.

Example — Automated Customer Re-engagement Workflow: To illustrate, consider a common small-business scenario: re-engaging inactive customers with a marketing email. Here’s how an AI-driven workflow could play out:

1. Query: You ask the AI,

“Give me a list of all clients who haven’t bought anything in the last quarter.”

AI action: It calls the appropriate FileMaker tool (perhaps a search_customers_records with a date filter on last purchase) and returns the list of client names.

2. Refine and Generate: You say,

“Great. Exclude Acme Corp from that list, and draft a friendly email to the others, mentioning their last order and offering a promo.”

AI action: The AI can filter out Acme (either by adjusting the search or simply omitting it in its own response) and then use its language model capabilities to compose a personalized email template.

It might even call another tool to fetch each customer’s last order details for reference. It then presents you a draft email message for review.

3. Approve and Execute: You review the draft, maybe say

“Looks good. Send this email to all of them and CC their account reps.”

AI action: The AI now invokes a script tool — e.g. execute_send_bulk_email_script — passing in the list of customers and the approved email content.

Your FileMaker script handles looping through each customer, sending the email (and CC’ing the rep as specified), and logging the contact in the system. The AI confirms the action:

“Emails sent to 12 clients, CC’ing the respective sales reps.”
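
The three steps above hinge on one design point: nothing is sent until the user approves the draft. A minimal sketch of that gate, using a hypothetical `ReengagementWorkflow` class in which the send step merely stands in for calling execute_send_bulk_email_script:

```python
from dataclasses import dataclass, field


@dataclass
class ReengagementWorkflow:
    customers: list[str]
    draft: str = ""
    approved: bool = False
    sent_to: list[str] = field(default_factory=list)

    def exclude(self, name: str) -> None:
        self.customers = [c for c in self.customers if c != name]

    def write_draft(self, text: str) -> None:
        self.draft = text
        self.approved = False  # any new draft needs fresh approval

    def approve(self) -> None:
        if not self.draft:
            raise ValueError("nothing to approve")
        self.approved = True

    def send(self) -> int:
        if not self.approved:
            raise PermissionError("draft has not been approved by the user")
        self.sent_to = list(self.customers)  # stand-in for the bulk email script
        return len(self.sent_to)


wf = ReengagementWorkflow(customers=["Acme Corp", "Globex", "Initech"])
wf.exclude("Acme Corp")
wf.write_draft("We miss you! Here is a promo on your next order.")
try:
    wf.send()
except PermissionError as err:
    print("blocked:", err)
wf.approve()
print("sent to", wf.send(), "clients")
```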

What’s remarkable is that this entire multi-step process — data analysis, content creation, and action — happened in one conversational thread.

The AI agent effectively functioned as an assistant that understood your intent, gathered the necessary data, consulted you for confirmation, and then carried out the task.

In classical terms: the first parts were generative AI (summarizing and drafting content), and the final part was agentic AI — the agent actually taking action by sending the emails.

Claris MCP is what made the agentic part possible, by offering the “send email” and data update capabilities as tools rather than just hoping the AI would respond with text.

Other examples: This pattern can apply to countless FileMaker workflows: for instance, an AI could help with an order fulfillment process —

“Show me all pending orders. Okay, for orders older than 2 weeks, mark them as high priority and notify the warehouse.”

The AI would list the orders (via list_orders_records), let the user confirm criteria, then perhaps loop through and update a status field on each (update_orders_record in a loop or via a bulk script), and finally call a script to send a Slack message or email to the warehouse team.

In a project management context, a user might say,

“Which tasks are overdue? Generate a brief status report, then push all their due dates out by one week.”

The AI can find the tasks, create a summary report (taking advantage of its natural language prowess to explain which tasks are late), and then reschedule the tasks by updating dates or calling a “reschedule tasks” script.

Benefits: Multi-step AI workflows eliminate a lot of back-and-forth that users typically endure when using multiple features of an app. The AI agent handles the flow, asking for clarification if needed, but otherwise doing the heavy lifting. This can dramatically improve productivity — what might take 15 minutes of clicking and cross-referencing can be done with a few sentences in a chat.

It’s also excellent for ensuring process adherence: the AI doesn’t forget steps or skip records accidentally. For small businesses with lean staff, this means routine processes (follow-ups, batch updates, daily summaries) can be semi-automated with minimal effort, freeing up humans to focus on more complex or creative work.

Use Case 4: On-Demand Reports and Visual Dashboards

FileMaker has long been used to create reports and dashboards, but traditionally those have to be pre-built by a developer.

With AI integration, users can ask for custom reports or even visualizations on the fly, and have the AI assemble the result using FileMaker data. This is “self-service BI” taken to the next level — even a non-technical user can get dynamic reporting just by asking.

Example — Generating a Chart via Conversation: Suppose a regional manager types,

“Show me a chart of sales by region for this quarter.”

In a standard setup, unless a chart layout was already made, this request would be hard to fulfill immediately.

But with MCP, the AI can take a few steps to deliver: it might call a list_sales_records or search_sales_records tool to gather data grouped by region (perhaps using OData query capabilities or by retrieving raw data and grouping it in memory).

It could then either:

a) feed the aggregated numbers back to the user in a clear textual format (e.g. “North: $X, South: $Y, …”), or

b) generate a visual chart. How could it do the latter? One approach is to leverage a FileMaker script: you could have a script that uses the FileMaker chart engine or a server-side plugin to create a chart image (maybe in a container field), then have the AI retrieve that via the get_{table}_{field}_link tool (which gives a URL to a container’s content).

The AI can then display that link or image to the user. While this requires a bit of setup, it’s all within supported features: container field tools and script execution. In the end, the user might see a bar chart image for sales by region, generated on demand because they asked for it.

Even without dynamic images, the AI can produce useful textual reports.

For instance:

“Compare this month’s performance to last year’s.”

The AI can query the sales figures for the current month and the same month last year (two Data API calls), then use its language model skills to output a comparison:

“This month’s sales are $50K (about 11% higher than the $45K in the same month last year). The growth is mainly in the Midwest region, which saw a 20% increase.”
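
Language models can slip on arithmetic, so it can be worth computing the comparison in a script or tool instead of leaving it to the model. A minimal sketch using the $50K and $45K figures from the example (the function name is illustrative):

```python
def percent_change(current: float, previous: float) -> float:
    """Relative change from previous to current, in percent."""
    if previous == 0:
        raise ValueError("previous must be non-zero")
    return (current - previous) / previous * 100


change = percent_change(50_000, 45_000)
print(f"{change:.1f}% higher")  # prints "11.1% higher"
```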

The AI essentially becomes a real-time analyst, pulling numbers via MCP and contextualizing them. Another example:

“List our inventory stock levels and highlight any items below 10 units.”

The AI could list each product with its current stock (from a list_products_records call) and bold the ones below threshold, or say “⚠️” next to them — effectively generating an actionable mini-report.

Benefits: The ability to ask for any slice of data or any visualization gives end-users unprecedented flexibility. Small businesses often can’t afford dedicated data analysts or fancy BI tools; with MCP, your FileMaker solution and an AI assistant together can fill that gap.

It encourages exploration of data (“How are we doing?”, “What if we look at it this way?”) without additional development each time. And since the AI can format the answer, users might get nicely formatted tables or summaries in plain language rather than raw spreadsheets.

This can improve decision-making by making data more accessible and understandable. It also reduces the backlog for developers — every new report request doesn’t have to become a feature; many can be handled ad-hoc by the AI.

Use Case 5: AI-Augmented Customer Service and CRM

For businesses that deal with customers (be it service industry, retail, or healthcare patients), having quick access to customer info and history is key.

Claris MCP enables AI assistants that can act as smart CRM helpers — pulling up customer data, answering customer-specific questions, and even suggesting next actions, all within the context of a support conversation or sales inquiry.

Example — Instant Customer Q&A: A support agent using an AI assistant could ask,

“What’s the status of Order #12345?”

Instead of manually searching in FileMaker, the AI will use a tool (perhaps search_orders_records with the order ID) to retrieve the status and relevant details.

It might answer,

“Order #12345 is currently ‘Shipped’ as of yesterday, and is expected to be delivered by Dec 18.”

If the customer then asks a follow-up question through the agent (or even through a chatbot) like,

“When was it shipped and by which carrier?”

the AI can pull those additional fields from the same record or a related Shipping table and provide a quick answer.

This real-time querying of live order data makes customer interactions much smoother.

Beyond status lookup, the AI can assist with recommendations and context.

For instance, in a sales context, a rep could ask,

“What’s the best upsell opportunity for this client?”

The AI might gather the client’s purchase history (via list_orders_records for that client) and analyze it to see what category of products they haven’t bought recently or what similar customers have bought.

It could then respond,

“Client X hasn’t purchased any accessories for their product. We could upsell the extended warranty or the accessory bundle, which others often buy.”

While the AI’s suggestion in this case comes partly from its own trained knowledge of upselling strategies, it is grounded in the FileMaker data it retrieved about that client’s history.

Example — Logging and Following Up:

The AI can also create records during support flows. Suppose a customer is on a chat (with the AI) and describes a problem. The AI could both answer the question (if it’s something like a how-to) and simultaneously log a case in FileMaker. If the user says

“I need someone to call me about issue XYZ,”

the AI might respond politely and then, behind the scenes, use a create_cases_record tool to log a new support case with the details provided, and even assign it to a rep.

It could then tell the user,

“I’ve created a support ticket for you and a technician will reach out soon.”

The data (issue description, user info) comes from the conversation context, demonstrating how MCP’s tools (creating a record) can be combined with the AI’s understanding of the conversation.

In healthcare or other appointment-based services, a similar pattern could allow a conversational interface for scheduling. For example, a patient could say,

“I’d like to schedule an appointment with Dr. Smith next week.”

The AI (with access to the scheduling database via MCP) could find an open slot and create a new appointment record, then reply with

“Dr. Smith is available Tuesday at 10 AM — I’ve booked that for you.”

If the patient asks a question about their last visit (“What were Dr. Smith’s recommendations in my last check-up?”), the AI can fetch the doctor’s notes from the previous appointment record and summarize them, since those notes are stored in FileMaker.

Benefits: AI-augmented customer support means faster responses and more personalized service. Support or sales staff no longer need to dig through multiple screens; the AI brings the data to them.

It also helps ensure no detail is missed — if the AI is retrieving data via MCP, it can surface related info (“Note: this client’s birthday is tomorrow, might want to send wishes!” if such info is in the DB) that a human might overlook under time pressure.

For small businesses, this level of service can be a differentiator, achieved without hiring more staff. Additionally, by letting the AI handle routine inquiries and data entry (like logging cases or updating contact info on request), employees can focus on complex issues that truly need human intervention.

And if the AI is unsure or the request is beyond its toolset, it can hand the conversation off to a human, so you maintain a safety net.

Inspiration from Outside: Applying Broader MCP Use Cases to FileMaker

Claris MCP is built on an open standard, meaning the concept of AI tools isn’t limited to FileMaker — the wider tech community is using MCP to connect AI with all sorts of systems (Slack, databases, GitHub, you name it).

FileMaker developers can draw inspiration from these external MCP use cases and implement analogous solutions in their own applications. The common theme is empowering AI to act across systems and data in a unified way. Here are a couple of examples and how they translate to FileMaker:

• Ops and Integration Bots: In the general MCP world, developers have created AI agents that coordinate operations across multiple services. One example is an AI that can post messages to Slack, create issues in Jira, commit code to GitHub, and query databases, all in one workflow.

Imagine the AI says:

“Build completed, posting update to Slack and creating a release ticket.”

For a FileMaker-centric business, you can achieve a similar multi-system flow by combining FileMaker scripts with external APIs.

FileMaker Equivalent: Expose a script via MCP that, say, takes a message and calls a Slack API (via Insert from URL) to post it to a channel. Now the AI connected to both your FileMaker MCP and a Slack MCP (or just using that script) can, in response to a prompt, update your FileMaker records and notify your team on Slack.

The user might say,

“Order 1001 is delayed. Notify the team and add a note to the order.”

The AI can call update_orders_record (to add the note “Delayed — customer informed”) and then run execute_notify_team_script which posts to Slack.

All done through one conversation, no need to switch apps. This kind of cross-platform agent is entirely feasible with MCP — Anthropic’s Claude has demonstrated agents simultaneously using dozens of tools from different systems, so your FileMaker tools can be part of a larger toolset an AI uses to automate your business processes end-to-end.

• Data Processing and Bulk Operations: Outside the FileMaker realm, AI has been used to manipulate large datasets — for instance, Claude for Excel can ingest and modify thousands of spreadsheet rows via MCP tools, without exceeding the AI’s context window. It achieves this by calling Excel read/write functions programmatically in chunks.

FileMaker Equivalent: If you have a large dataset in FileMaker (thousands of customer records or transactions), an AI agent can handle it by iteratively calling the list or search tools with limits/offsets, processing each batch. This means tasks like cleaning up data, applying bulk changes, or analyzing big lists are within reach.

For example, you could instruct,

“Go through all our product descriptions and correct any spelling errors.”

The AI can retrieve products 100 at a time (list_products_records with offset), use its language abilities to spell-check or even rewrite descriptions, and use update_products_record for each with the corrected text.

This kind of loop can be done with careful prompting or by advanced features in AI clients that support tool loops.

The bottom line: MCP lets the AI handle volume systematically, not just one record at a time, which is something that wasn’t easy with naive “chatbot” approaches.
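
The batching idea can be sketched as a generator that pages through a list tool until it runs out of records. The `limit` and `offset` parameters and the in-memory `list_products_records` stand-in are assumptions for illustration; check the input schema of your own list tools for the real parameter names.

```python
from typing import Callable, Iterator


def iter_batches(list_tool: Callable[[int, int], list[dict]],
                 limit: int = 100) -> Iterator[list[dict]]:
    """Yield successive pages from a list tool until a short or empty page."""
    offset = 0
    while True:
        batch = list_tool(limit, offset)
        if not batch:
            return
        yield batch
        if len(batch) < limit:
            return
        offset += limit


# Stand-in for the real tool: 250 fake product records held in memory.
PRODUCTS = [{"id": i, "description": f"Product {i}"} for i in range(250)]


def list_products_records(limit: int, offset: int) -> list[dict]:
    return PRODUCTS[offset:offset + limit]


sizes = [len(batch) for batch in iter_batches(list_products_records)]
print(sizes)  # prints [100, 100, 50]
```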

• Knowledge Retrieval (RAG) with FileMaker Data: A common AI use case is Retrieval-Augmented Generation — e.g. an AI might fetch relevant documents from a knowledge base to answer a question.

With MCP, this is often done by providing a search tool that returns documents or snippets. In FileMaker, you can think of your database as a knowledge source.

If you have, say, a Policies table or a collection of notes, the AI can use search_policies_records to find relevant entries and then use that content to formulate a precise answer.

For instance,

“What is our refund policy for online orders?”

The AI can find the record in the Policies table where PolicyType = “Refund/Online” and return the text to the user. Even if the question is asked in different words, the AI’s semantic understanding plus the MCP tool ensures it pulls the right answer rather than hallucinating one.

This is particularly useful for internal tools: new employees could ask the AI common questions and get answers sourced from FileMaker (which might store all your SOPs or HR policies).

Technically supported? Yes. MCP’s search_{table}_records tool uses OData queries, which can do quite sophisticated filtering (e.g. “PolicyType = Refund AND Channel = Online”). The developer might need to allow full text search by implementing a script or using a calculation index, but the pieces are there.

The AI, when connected, will know the schema (it can even call get_customers_definition etc. to see field names if needed) and can adapt to your database structure.
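
In actual OData syntax, the filter from the example is written with `eq` and `and`, and string values go in single quotes. A small sketch of a helper that builds such an expression (the helper is hypothetical, and how the filter is passed to the search tool depends on that tool's input schema):

```python
def odata_eq_filter(criteria: dict[str, str]) -> str:
    """Join field/value pairs into an OData $filter expression of eq clauses."""
    def quote(value: str) -> str:
        # OData escapes a single quote inside a string literal by doubling it.
        return "'" + value.replace("'", "''") + "'"

    return " and ".join(f"{name} eq {quote(value)}" for name, value in criteria.items())


print(odata_eq_filter({"PolicyType": "Refund", "Channel": "Online"}))
# prints: PolicyType eq 'Refund' and Channel eq 'Online'
```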

• Multiple FileMaker Apps, One AI: Some organizations have more than one FileMaker file (or even multiple servers).

Claris MCP supports adding multiple connections to a single context — meaning the same AI session can access tools from different databases.

If a small business has, for example, a point-of-sale FileMaker system and a separate inventory file, you can connect both.

Then a prompt like

“A customer just bought item X, update inventory and record the sale.”

could trigger tools across both systems (update the inventory count in the inventory file, and create a sale record in the POS file).

The AI doesn’t care that they are separate files; it just sees more tools. This is an officially supported scenario: you simply add another connection in the MCP config, and tools from both will appear (namespaced by file if needed).

The AI will use whichever tools make sense for the request. This consolidates your AI interface so users don’t have to talk to multiple assistants for different databases — one assistant can do it all.
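To picture what “namespaced by file if needed” could mean, here is a small Python sketch that merges the tool lists of two connections and prefixes a name only when it collides. The connection names, the sample tool names, and the `connection.tool` prefix format are assumptions for illustration. The actual naming scheme used by Claris MCP may differ.

```python
def merge_tools(connections: dict[str, list[str]]) -> dict[str, tuple[str, str]]:
    """Map each visible tool name to its (connection, original tool) pair.

    A tool keeps its plain name unless two connections expose the same name,
    in which case each copy is prefixed with its connection name.
    """
    counts: dict[str, int] = {}
    for tools in connections.values():
        for tool in tools:
            counts[tool] = counts.get(tool, 0) + 1

    merged: dict[str, tuple[str, str]] = {}
    for conn, tools in connections.items():
        for tool in tools:
            visible = f"{conn}.{tool}" if counts[tool] > 1 else tool
            merged[visible] = (conn, tool)
    return merged


# Hypothetical POS and inventory files that both expose a products table.
tools = merge_tools({
    "pos": ["create_sales_record", "list_products_records"],
    "inventory": ["update_inventory_record", "list_products_records"],
})
for name in sorted(tools):
    print(name, "->", tools[name])
```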

Don’t limit your imagination to only what you’ve seen in FileMaker demos — the MCP community at large is doing incredible things with AI agents, from automating DevOps tasks to managing real-time streaming data.

All those use cases boil down to AI calling the right tools in the right sequence. You can adapt the same patterns in a FileMaker context.

The key is: if you have a script or API for something, you can likely wire it up as an MCP tool.

Want the AI to generate a PDF report and upload it to Google Drive? Expose a script that creates the PDF (and maybe returns a link), and separately use a Google Drive MCP server (or have FileMaker’s script call the Drive API).

Because MCP is standardized, these pieces can work in concert. Your FileMaker solution can become one part of a larger AI-driven ecosystem in your company.

Best Practices for Implementing MCP in FileMaker

Before we wrap up with some creative ideas, it’s worth highlighting a few best practices to ensure your MCP integration is secure, reliable, and effective:

• Expose only what’s necessary: When selecting tables and scripts to include, follow the principle of least privilege. Only include tables that the AI truly needs to access for the desired use cases, and only scripts that you’re comfortable having run via AI. The smaller the toolset, the less chance of the AI taking an unintended action. (For example, you might expose an “Add Contact” script but not the “Delete All Records” script.) You can always expand the toolset as you build trust in the AI.

• Use dedicated accounts and privilege sets: Create a FileMaker account specifically for the MCP connection with limited privileges (read/write only on certain tables, only perform certain scripts). This way, even if the AI or user tries something outside the allowed tools, the security model is still enforced at the data level. Keep that account’s credentials safe, and remember MCP requires fmrest and fmodata extended privileges on it.

• Name and describe tools in business terms: As mentioned, the AI’s ability to choose the right tool depends on how well you name and describe them. Use the tool editing feature to rename “execute_sendemail_script” to something like “Send Invoice Email” and describe it clearly. If a tool updates a specific field or deals with a specific status, mention that. Think of it like writing good API documentation, only here the “reader” is an AI. The Claris MCP interface lets you adjust these without changing your underlying scripts, which is very convenient.

• Define input schemas for clarity: If your script expects a JSON with certain keys (e.g. an emailBody and recipientID), define those in the tool’s input schema with correct data types. This helps the AI format its tool call correctly (and MCP will validate the inputs). It reduces trial and error where the AI might guess a parameter format. Also consider setting reasonable defaults or required flags for important parameters.

• Test in the “Editing” and “Testing” modes: Claris MCP has three context lifecycle states: Editing (for configuration), Testing, and Deployed. In Testing, you can connect an AI and try things out without exposing developer tools. Make use of this: ask the AI various things, see which tool it picks, and adjust your descriptions if it picks wrong. You can even test error conditions (e.g. what if a record isn’t found?). Iterate in the sandbox before deploying live to users.

• Monitor and log AI actions: Just as you would log user actions in FileMaker, log what the AI is doing. The MCP server (FileMaker Data API) calls will show up in logs, and you might also add logging to critical scripts (record that “AI triggered sendEmail for Invoice 1001 at 3pm”). This builds trust and helps debug if something odd happens. Fortunately, since MCP uses standard interfaces, all the usual FileMaker logging and script result mechanisms apply.

By following these practices, you ensure that your AI integration is safe (no unintended access), performant (unnecessary tools don’t bloat the AI’s context), and accurate in fulfilling user requests.
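To make the input-schema advice concrete, the sketch below shows what a schema for the email script might look like in JSON Schema style, plus a tiny validator. The `emailBody` and `recipientID` keys come from the example above. The `sendCopyToSelf` flag, the constant name, and the `validate` helper are hypothetical, and the validator covers only a small subset of JSON Schema (required keys, basic types, defaults). It illustrates the idea and does not describe how MCP validates inputs internally.

```python
SEND_INVOICE_EMAIL_SCHEMA = {
    "type": "object",
    "properties": {
        "recipientID": {"type": "string"},
        "emailBody": {"type": "string"},
        # Hypothetical optional flag, included to show a default value.
        "sendCopyToSelf": {"type": "boolean", "default": False},
    },
    "required": ["recipientID", "emailBody"],
}

_PY_TYPES = {"string": str, "boolean": bool, "number": (int, float)}


def validate(arguments: dict, schema: dict) -> dict:
    """Check required keys and basic types, then fill in declared defaults."""
    props = schema["properties"]
    for key in schema.get("required", []):
        if key not in arguments:
            raise ValueError(f"missing required parameter: {key}")
    for key, value in arguments.items():
        if key not in props:
            raise ValueError(f"unknown parameter: {key}")
        if not isinstance(value, _PY_TYPES[props[key]["type"]]):
            raise ValueError(f"{key} must be of type {props[key]['type']}")
    defaults = {k: p["default"] for k, p in props.items() if "default" in p}
    return {**defaults, **arguments}


print(validate({"recipientID": "C-1001", "emailBody": "Hi"},
               SEND_INVOICE_EMAIL_SCHEMA))
```

Rejecting unknown or mistyped parameters up front gives the AI a clear error to correct instead of a script that silently misbehaves.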

Surprising and Underutilized MCP Use Cases to Inspire You

Finally, let’s look at some creative possibilities with Claris MCP that you might not have considered. These are all within the realm of what’s currently possible, yet many FileMaker developers haven’t tried them out.

If you’re feeling skeptical about AI or just curious, these ideas might spark some experimentation:

• Conversational Data Entry & UI Replacement: Tired of building a dozen data entry layouts for every little task? With MCP, you can implement a conversational UI for internal operations. For instance, instead of using a dedicated screen to log expenses, an employee could just tell the AI,

“I spent $45 on office supplies, allocate it to the Marketing budget.”

The AI could create the Expense record, fill in the fields, and confirm it’s done. This acts like a friendly command line for your app, which could be especially useful on mobile devices or for on-the-go updates. It won’t replace all UIs, but for infrequent or complex tasks, a conversational approach might be more efficient and user-friendly.

• Data Clean-up Assistant: Every FileMaker system accumulates some “mess” over time — duplicate records, inconsistent formatting, missing data. You can leverage the AI to audit and clean your data.

For example, you could prompt,

“Scan our contacts for any obvious duplicates or typos.”

The AI might use list_contacts_records to pull names and then identify likely duplicates (e.g. “Jon Smith” vs “John Smith”) using its fuzzy matching intelligence.

It could either flag them for review or even merge them if you expose a merge script. Similarly, the AI could standardize addresses or capitalize names properly by reading each record and updating it. This is like having a smart data quality intern who is available 24/7.
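The text relies on the model’s own judgment for fuzzy matching. As a deterministic complement (a different technique, named plainly), a script could pre-filter candidate pairs with Python’s standard `difflib` before handing them to the AI or a human for review. The function name, the sample names, and the 0.85 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def likely_duplicates(names: list[str],
                      threshold: float = 0.85) -> list[tuple[str, str]]:
    """Return pairs of names whose similarity ratio meets the threshold."""
    pairs = []
    for a, b in combinations(names, 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b))
    return pairs


contacts = ["Jon Smith", "John Smith", "Maria Garcia", "Ahmed Khan"]
print(likely_duplicates(contacts))  # [('Jon Smith', 'John Smith')]
```

Comparing every pair is quadratic in the number of names, so for large tables it is worth narrowing the candidates first (for example by postal code or first letter).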

• Summarizing and Analyzing Textual Data: Many businesses store unstructured text in FileMaker — customer feedback, case notes, project descriptions, etc. An underutilized capability of AI is to digest and summarize this text on demand.

With MCP, the AI can retrieve a set of records and then provide a summary or analysis.

For example,

“Summarize the key issues mentioned in the last 50 customer support tickets.”

The AI could fetch those ticket descriptions and give you a high-level summary:

“Common issues include login problems (15 cases), payment failures (10 cases), and slow performance (8 cases). Customers are especially frustrated by the login bug.”

If you have a Notes field for patient visits, a doctor could ask,

“Briefly summarize this patient’s history from their last 5 visit notes.”

The AI can do that, saving time and potentially catching details a quick skim would miss. All of this uses MCP to get the data, and the AI’s own strengths to interpret it.

• AI-Driven Business Recommendations: Small businesses often don’t have data scientists to crunch patterns, but an AI agent with access to FileMaker might surface insights.

Consider inventory management:

“Which products are selling slowly and might need a promotion?”

The AI can cross-reference sales and stock levels via MCP and highlight items that have high stock but low recent sales.

Or in a fundraising context:

“Among our donors, who might be likely to give again?”

The AI could find donors who donated last year but not this year and suggest outreach, possibly using external knowledge of donor behavior (if it has any). These are “softer” uses because they involve AI inference, but the heavy lifting of finding the relevant data is done via MCP, ensuring the suggestions are based on facts in your system. It’s like having a business advisor alongside your database.

• Multi-language Support: FileMaker systems often serve users in one language, but what if your staff or customers speak different languages?

An AI connected via MCP can act as a translator and intermediary. For instance, a Spanish-speaking employee could ask in Spanish,

“¿Cuál fue el total de ventas del mes pasado?”

The AI can understand the Spanish, fetch the sales data (just as in our earlier example) and respond in Spanish with the number.

Likewise, if you expose create/update tools, the AI could take input in various languages and still populate the database correctly. While the data in FileMaker might remain in one language (or not), the AI can bridge the gap.

This is especially useful for customer-facing chatbots — you could have one AI service customers in multiple languages, all drawing from and writing to the same FileMaker back end.

The current MCP implementation fully supports this, as it’s agnostic to language in content — it simply passes data. The AI model (Claude, etc.) handles the translation and understanding.

Each of these ideas highlights a broader point: Claris MCP lets you reimagine how people interact with your FileMaker app.

You’re not limited to layouts, buttons, and traditional UX. You can offer a conversational, intelligent assistant that makes your solution feel more like a teammate than a tool. And you can start small — maybe deploy an AI helper for one tedious process — then grow its role as you and your users gain confidence.

Embracing AI in FileMaker Today

The use cases above demonstrate that AI in FileMaker is no longer science fiction or a gimmick; it’s here now, delivering real value in everyday business scenarios.

Claris MCP provides the technical foundation to integrate AI safely and effectively: your data stays in FileMaker, your logic remains governed by your scripts and privileges, yet your users can interact with the system in an entirely new way.

From getting instant answers and automating data entry, to orchestrating complex workflows and gleaning insights from raw data, the possibilities are vast — and importantly, they’re attainable with the tools FileMaker developers have today.

By focusing on supported features and real examples, we’ve seen that MCP isn’t about replacing FileMaker or writing a lot of new code.

It’s about exposing what you’ve already built to a smarter interface. It’s letting an AI assistant drive your existing car, so to speak, on your terms. FileMaker developers who embrace this can give their small-business clients a competitive edge: less time spent on menial tasks, more accurate and timely data usage, and a modern user experience that rivals big-budget systems.

As AI continues to evolve, those using MCP will be ready for the next wave, because they’ve already woven AI into the fabric of their custom apps in a controlled, purposeful way.

If you’ve been skeptical, hopefully these real-world use cases show that AI isn’t here to take over your solutions — it’s here to enhance them. Whether it’s answering a complex query in seconds or completing an entire process with one prompt, Claris MCP allows FileMaker to shine in the era of intelligent assistants.

It’s an early technology (with plenty of room to grow), but it’s robust enough to experiment with right now.

So go ahead: think of a pain point in your solution, and consider how an AI, armed with the right FileMaker tools, could alleviate it. The best part is you don’t have to imagine blindly — you can try it out today, iterate, and be part of forging the new best practices for AI in the FileMaker world.

In the end, the “secret sauce” is not AI instead of FileMaker; it’s AI plus FileMaker — each playing to their strengths. Claris MCP simply makes the introduction between the two. And as we’ve explored, when your FileMaker scripts and data can talk with a clever AI, great things can happen for your users and your business. It’s time for FileMaker developers to go beyond just chatting over data and start building these AI-augmented solutions — the capability is here, and the use cases are waiting to be implemented.

Sources: The above use cases and technical details are based on Claris’s official MCP documentation and examples, as well as real scenarios shared by the FileMaker community. Key references include Claris’s December 2025 blog on why MCP matters, the Claris MCP Help guides on tool configuration, and expert commentary from FileMaker leaders on integrating agentic AI workflows. These illustrate that everything described, from natural language queries to script-triggered automations, is grounded in currently supported features of Claris MCP and FileMaker 2025.

Ephemeral UI in AI-Generated, On-Demand Interfaces

iSolutions — Wed, 26 Mar 2025 09:43:34 GMT

Definition and Context of “Ephemeral UI” in AI-Driven Design

“Ephemeral UI” refers to user interfaces that are generated on the fly by intelligent systems and exist only temporarily to serve an immediate purpose.

Unlike traditional transient UI elements (e.g. modals, toasts, notifications) which are pre-designed but short-lived, these AI-driven ephemeral interfaces are dynamically created based on context and user needs, then disappear once their task is done.

In essence, the interface itself is on-demand and context-specific, often assembled by a generative AI agent at runtime.

For example, imagine asking an AI assistant for a specific tool (say, a one-time flight booking panel or a custom data visualization) and having a tailored UI appear briefly to fulfill that request, then vanish afterward.

This concept is gaining traction as a paradigm shift in how we might interact with software: interfaces become fluid, adaptive, and disposable rather than fixed screens or apps.

Importantly, this notion extends beyond simply hiding UI chrome when not needed — it implies AI creation of new UI elements or “micro-apps” on demand. Ephemeral UIs are hyper-contextual (built for one user’s momentary goal) and transient (destroyed when the goal is achieved).

The term “Ephemeral UI” has started appearing in design and technology discussions to describe this future of AI-assembled, just-in-time interfaces. It posits a world where you don’t navigate to apps; instead, the UI comes to you when needed, then disappears to eliminate clutter.

Emergence in Design Blogs and Thought Leadership

Over the last couple of years, design leaders and tech writers have begun using “ephemeral interface” terminology in exactly this AI-driven context. In mid-2023, UX leader Rachel Kobetz described the future of intelligent interfaces as “intelligent, contextual, and ephemeral”, meaning “just enough interface compiled in real-time, based on context and relevance. UI that appears when it’s needed and [is] hidden when it’s not.”

This captures the essence of ephemeral UIs: minimal, on-demand UIs created in the moment. Similarly, a 2024 UX trends article by Kshitij Agrawal explicitly lists “Ephemeral Interfaces” as one of four key AI innovation trends of the 2030 horizon (alongside dynamic interfaces and others), suggesting that the idea of transient, context-generated UIs is on the radar of design futurists.

Multiple thought leaders have painted a picture of how such UIs might work. For instance, Hilal Koyuncu (former Google designer) introduced the concept in a 2025 post titled “What If Apps Didn’t Exist? Meet Ephemeral UI.” She envisions an interface paradigm where “UI appeared only when needed and disappeared after use.” In her examples, an AI agent could conjure a UI for a task like booking a flight — presenting options in a temporary interface — and then the UI “vanishes” once the booking is done.

Adjusting smart home settings might cause a contextual slider to materialize briefly, then fade away, instead of requiring the user to open a full app. Koyuncu emphasizes that this approach would eliminate the “endless menus” and app clutter users deal with today, providing “just interaction when needed, gone when not.” Her post also highlights benefits like improved accessibility (UIs generated in real-time could automatically tailor themselves to a user’s disability needs) and questions about branding (if an AI generates the UI on the fly, traditional app identity might give way to purely utility-driven design). This is a clear description of ephemeral UIs as AI-driven, context-aware interfaces that live only long enough to complete the user’s intent.

Another explicit reference comes from Roy Bernhard in late 2024, who used the phrase “Ephemeral Interface” to discuss “disposable apps.” He describes a near-future scenario of “transient yet hyper-intentional tools brought to life by artificial intelligence.” In his view, AI agentic systems can “dynamically generate a functional app” whenever a user expresses a specific need, and these mini-apps exist only as long as required.

As Bernhard puts it, “It’s ephemeral, existing only as long as it’s useful, and disappears once its job is complete, leaving no clutter… no remnants of functionality you no longer require.” This description reinforces that others in the industry are indeed using “ephemeral” to mean AI-created interfaces or apps that self-destruct when their purpose is fulfilled. Notably, Bernhard ties this concept back to an earlier idea of his from 2017 about ditching fixed UIs in favor of API-driven compositions, now made feasible by AI. The language he uses (disposable, transient, on-the-fly) closely matches the sense of Ephemeral UI discussed here.

Tech commentators on Substack have echoed these ideas. For example, writer Linus Ekenstam refers to “Ephemeral UI” in the context of the post-ChatGPT software era. He argues that most of today’s app interfaces sit idle (like parked cars) and could be replaced by on-demand interactions. Ekenstam points to a demo by developer Sean Grove, noting “at runtime we will have apps conjure from nothing and then simply fade away”, which is exactly the ephemeral UI philosophy.

Another author, Spencer Lee, discussing language models and UIs, predicted that “fully ephemeral UI” could be a superior approach in the future (with some challenges around accessibility to solve). All these thought leadership pieces use “ephemeral” to describe interfaces generated in real time by AI and not meant to persist beyond the immediate context.

In short, within design and product strategy circles, Ephemeral UI is being talked about as a new pattern enabled by AI. It represents a shift from designing permanent screens to designing experiences that assemble and disassemble interfaces on demand.

Academic References to Ephemeral, AI-Generated UIs

Academic and research communities have also begun exploring the idea of AI-generated ephemeral interfaces (though often with different motivations). A notable example is a 2024 CHI paper (preprint) introducing Biscuit, a system that integrates Large Language Models into coding notebooks. Biscuit explicitly uses the term “ephemeral UIs” to describe the UI elements it generates on the fly as intermediate aids. The authors define ephemeral UIs as “UI elements that are dynamically generated by LLMs and contextually integrated with the code context and user requests.”

Rather than immediately producing code from a prompt, Biscuit’s LLM first creates a temporary graphical interface (like sliders, dropdowns, etc.) based on the user’s request and the current code, which the user can manipulate before final code is injected.

This ephemeral UI exists only during that intermediate step; it’s generated to scaffold the user’s input in a more intuitive way, and once the code is generated, the UI can be disposed of. The researchers found this approach helps users better understand and guide AI-generated code.

Biscuit demonstrates a concrete implementation of ephemeral UIs: the interface is literally generated on demand by an AI (the LLM) and is transient. The paper’s language confirms the term’s use in this context — an “additional ephemeral UI step” is inserted between the user’s natural language prompt and the final output.
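The intermediate step can be pictured with a small Python sketch: a stand-in for the LLM proposes widgets, the user adjusts them, and code is generated from the final values, after which the widgets are simply dropped. The class names, `propose_ui`, `generate_code`, and the histogram example are all invented for illustration. This is not Biscuit’s actual implementation or API.

```python
from dataclasses import dataclass


@dataclass
class Slider:
    name: str
    minimum: int
    maximum: int
    value: int


@dataclass
class Dropdown:
    name: str
    options: list[str]
    value: str


def propose_ui(request: str) -> list:
    """Stand-in for the LLM call that turns a request into widgets."""
    return [Slider("bins", 5, 50, 20),
            Dropdown("color", ["steelblue", "salmon"], "steelblue")]


def generate_code(widgets: list) -> str:
    """Emit code from the widget values the user settled on."""
    args = ", ".join(f"{w.name}={w.value!r}" for w in widgets)
    return f"df['price'].plot.hist({args})"


widgets = propose_ui("plot a histogram of price")
widgets[0].value = 30           # the user drags the slider
code = generate_code(widgets)   # after this, the widgets can be discarded
print(code)  # df['price'].plot.hist(bins=30, color='steelblue')
```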

This shows that even in scholarly work, ephemeral UI is being adopted to mean dynamic, context-specific UI produced by AI. Furthermore, the authors situate this in an “emerging body of literature on LLM-generated UIs”, citing projects like DynaVis (which uses an LLM to generate custom visualization widgets) and LIDA (an interface for creating data visualizations via language).

This suggests the concept of UIs materializing via AI is an active research area. Although earlier HCI research used “ephemeral user interfaces” to explore physical materials (like soap bubbles, fog displays, or other temporary media for interaction), the current usage in academic circles has expanded to AI-driven software interfaces rather than just novel materials.

In other words, the ephemerality aspect (transience) remains, but now the focus is on ephemeral software UI that is generated and removed by an AI agent.

Developer Community Demos and Discussions

In developer communities, the idea of AI-created ephemeral interfaces is also gaining momentum, sometimes under different names like “single-use apps” or “on-the-fly UIs.” Developer Sean Grove has been a prominent experimenter in this space. He created a demo/project called Conjure, which illustrates UIs that generate themselves as needed. Grove describes Conjure’s paradigm as UIs that are “ephemeral, on-demand, iterable programs that appear out of nowhere to be used, then disappear when they’re no longer needed.”

In his envisioned future of UI development, perhaps only a small portion of interfaces are hand-built; a large percentage could be handled by conversational or generative systems that spin up UIs for the “long tail” of specific tasks. This is a direct practical example of ephemeral UI philosophy from a developer’s perspective, essentially treating UI as a temporary runtime artifact generated by code (or by AI) to solve a user’s query, then dissolving. Grove’s demo and ideas have been circulated on platforms like YouTube and Twitter, and they were cited by others as proof-of-concept for ephemeral UIs.

The term “ephemeral UI” itself has been picked up by developers discussing these concepts. A Medium article in 2023 on the future of software design asked whether LLMs could “build dynamic, ephemeral GUIs…replacing the bloat” in traditional software. It even pointed readers to an “ephemeral UI” demo from Sean Grove as an illustration.

This shows that by early 2023, people in tech were already using the exact phrase “ephemeral UI” in reference to AI-generated interfaces, influenced by demos like Grove’s. Additionally, conversations on social media reinforce this usage: one developer on Twitter (X) remarked about “AI-generated ad-hoc user interfaces”, saying “I think I prefer ‘ephemeral UI’ to ‘dynamic UI’” as the term for this concept.

This indicates that within the developer community, “ephemeral UI” is recognized and even preferred by some as the label for UIs that an AI summons on demand. In open-source contexts, we also see related ideas; for example, frameworks and discussions about “single-use software” or “ephemeral apps” mirror the same notion (often inspired by the advent of GPT-4’s capabilities to generate code and UIs at runtime).

Sources

• Koyuncu, H. (2025). What If Apps Didn’t Exist? Meet Ephemeral UI (LinkedIn post). Describes UIs that “dynamically generate based on context” and disappear after use.

• Kobetz, R. (2023). Decoding The Future: The Evolution Of Intelligent Interfaces. Discusses future interfaces that are “compiled in real-time, based on context… UI that appears when needed and hidden when not,” calling them intelligent, contextual, and ephemeral.

• Bernhard, R. (2024). The Ephemeral Interface: How AI Agents Are Redefining Our Digital Lives. Introduces “disposable apps” as “transient yet hyper-intentional tools” created by AI agents on demand (ephemeral apps/interfaces).

• Agrawal, K. (2024). The Next Big AI-UX Trend — It’s Not Conversational UI. Identifies “Ephemeral Interfaces” as one of four key AI UX trends, indicating industry awareness of the term.

• Ekenstam, L. (2023). The Post-GPT Software Era. Op-ed noting “ephemeral UI” where apps are conjured at runtime and then “fade away”, citing Sean Grove’s demo.

• Peterson, D. (2023). LLMs and the Future of Customer-Built Software Design. Speculates that LLMs could build “dynamic, ephemeral GUIs” to reduce software bloat, referencing an “ephemeral UI” demo by Sean Grove.

• Biscuit Project (CHI 2024). Research prototype inserting an “ephemeral UI step” in code generation; defines ephemeral UIs as UI elements “dynamically generated by LLMs” based on user context.

• Grove, S. (2023). Conjure UI Demo. Demonstrates “ephemeral, on-demand… UIs” that appear for a task and vanish when not needed; described as a new paradigm for UI development.

• Developer discussion on X (2023). Noted preference for the term “ephemeral UI” to describe AI-generated ad-hoc interfaces over other terms, indicating community adoption.

Language Model Agents in 2025

iSolutions — Thu, 13 Feb 2025 14:02:06 GMT

Language Model Agents in 2025: Society of Mind Revisited

Large Language Model (LLM) agents have evolved rapidly in 2025, moving beyond simple chatbots to complex multi-agent systems that can plan, reason, and act with a high degree of autonomy.

These systems comprise multiple interacting AI “agents” that collaborate on tasks, which strikingly echoes ideas from Marvin Minsky’s The Society of Mind.

Minsky’s 1986 theory envisioned intelligence as emerging from many small, specialized processes (agents) working in concert, rather than a single monolithic mind.

This comparison is especially apparent in how modern AI agents coordinate with each other, integrate symbolic reasoning with neural networks, and exhibit modular, distributed intelligence.

Below, we examine the latest architectures, frameworks, and applications of LLM-based agents, and how they incorporate Minsky’s vision of distributed cognition. We also highlight major breakthroughs, open challenges, and industry adoption trends shaping this “society of AI minds.”

Multi-Agent Coordination and Distributed Intelligence

One of the defining trends of 2024-2025 is the rise of LLM-based multi-agent systems (MAS). Instead of a single model trying to do everything, groups of specialized agents now cooperate to solve complex tasks.

Advances in large language models have made it possible for multiple LLM-driven agents to perceive, reason, and act collaboratively, shifting AI from isolated single-model setups to interaction-centric approaches. Each agent can be tailored to a particular function or persona, and together they tackle different aspects of a problem simultaneously or in sequence.

This approach mirrors human teamwork and Minsky’s notion of a “society” of mind. Human teams rely on specialization (different members bring unique skills) and on communication to achieve shared goals, and AI multi-agent systems emulate these principles.

In Minsky’s terms, each AI agent is like a “mental agent” responsible for a specific cognitive task, and intelligence emerges from their interactions. Modern frameworks explicitly draw this parallel: for example, one survey notes that inspiration for multi-agent AI “finds roots in human collective intelligence (e.g., Minsky’s Society of Mind)”.

In practice, this means breaking problems into sub-tasks handled by different agents and then coordinating their efforts.

By combining individual strengths and perspectives, a team of LLM agents can outperform a single model on complex, multi-step challenges. Such teams excel at distributed knowledge (each agent retains different information), long-term planning (agents delegate subtasks to one another), and parallel execution for efficiency.

Agents communicate via natural language messages or other protocols, sharing intermediate results and planning next steps. Research has explored both centralized orchestration, where a master planner agent assigns tasks, and decentralized negotiation, where agents converse to agree on actions.

Source: https://arxiv.org/pdf/2303.17580

For instance, Microsoft’s HuggingGPT framework uses a central LLM (ChatGPT) as a “brain” to parse a user request, break it into sub-tasks, and then delegate each to an appropriate specialist model (e.g. a vision model for an image task, a speech model for audio).

The LLM orchestrator then integrates the outputs from all these expert models into a final answer. This language-mediated coordination allows AI agents to leverage external tools and models — using language as the interface for cooperation — which dramatically expands the range of tasks solvable by the “society” of models.
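The plan, delegate, and integrate flow described above can be sketched in a few lines of Python. Everything here is a stand-in: the specialist “models” are plain functions, `plan` replaces the LLM planner with simple rules, and the task names are invented. It shows the shape of centralized orchestration only and is not HuggingGPT’s code.

```python
from typing import Callable

# Hypothetical specialist "models": plain functions standing in for real ones.
SPECIALISTS: dict[str, Callable[[str], str]] = {
    "image-caption": lambda path: f"a caption for {path}",
    "speech-to-text": lambda path: f"a transcript of {path}",
}


def plan(request: dict) -> list[tuple[str, str]]:
    """Stand-in for the LLM planner: map each input to a specialist task."""
    tasks = []
    if "image" in request:
        tasks.append(("image-caption", request["image"]))
    if "audio" in request:
        tasks.append(("speech-to-text", request["audio"]))
    return tasks


def orchestrate(request: dict) -> str:
    """Plan sub-tasks, run each specialist, then merge the outputs."""
    results = [SPECIALISTS[task](arg) for task, arg in plan(request)]
    return "; ".join(results)  # a real system would have the LLM summarize


answer = orchestrate({"image": "cat.png", "audio": "memo.wav"})
print(answer)  # a caption for cat.png; a transcript of memo.wav
```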

Actor-Critic Overview: The process above loops continuously as the agent operates. Generally, the critic has a larger step size than the actor.

Another example is the emergence of actor-critic agent pairs and debate-style collaboration. Here, one agent (the “actor”) proposes a solution or answer, while another agent (the “critic”) analyzes it, points out errors or inconsistencies, and suggests improvements. This back-and-forth mimics a team brainstorming or a panel of experts reviewing a proposal. Such setups have been shown to improve reasoning performance. In one framework, defining two agent roles, Actor (solving the problem) and Critic (finding flaws), yielded more coherent and accurate results, as the critic agent caught logical mistakes in the actor’s reasoning. Often, this approach puts two entirely different models in debate.

This resonates strongly with Minsky’s idea that cognitive agents can play different roles to collectively reach better outcomes: for example, some generate ideas, while others evaluate or censor them.
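A minimal sketch of that propose-and-critique loop follows, with toy functions standing in for the two LLM calls. The `refine` function, its signature, and the arithmetic example are assumptions for illustration and do not come from any particular framework.

```python
from typing import Callable, Optional


def refine(task: str,
           actor: Callable[[str, Optional[str]], str],
           critic: Callable[[str, str], Optional[str]],
           max_rounds: int = 3) -> str:
    """Alternate actor proposals and critic feedback until the critic accepts."""
    feedback = None
    answer = actor(task, feedback)
    for _ in range(max_rounds):
        feedback = critic(task, answer)
        if feedback is None:      # no objections: accept the answer
            break
        answer = actor(task, feedback)
    return answer


# Toy stand-ins for two LLM calls.
def toy_actor(task, feedback):
    # The first attempt contains a mistake; feedback triggers a correction.
    return "17 + 25 = 42" if feedback else "17 + 25 = 32"


def toy_critic(task, answer):
    expression, claimed = answer.split("=")
    total = sum(int(term) for term in expression.split("+"))
    return None if total == int(claimed) else "the sum is wrong, recompute"


result = refine("What is 17 + 25?", toy_actor, toy_critic)
print(result)  # 17 + 25 = 42
```

The `max_rounds` cap is a simple guard against the two agents disagreeing indefinitely.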

In another fascinating combination, the Sibyl agent system explicitly implements a “multi-agent debate-based jury” inspired by Society of Mind, where several agent “jurors” discuss and refine the answer before outputting a final response.

By having multiple perspectives and an internal self-correction loop, the system achieves more balanced and reliable reasoning — just as Minsky predicted that a mind with many semi-autonomous parts can self-regulate and produce robust intelligence.

To support such coordination, developers are creating new tools and frameworks. Microsoft’s open-source AutoGen library, for example, provides a high-level interface for orchestrating conversations between multiple agents (each with specified prompts, personas, and tool access).

AutoGen gained significant traction in 2024, with over 200,000 downloads in just five months. It allows chaining LLM agents together with external APIs so they can exchange messages and jointly solve tasks.

Source: https://www.ionio.ai/blog/a-comprehensive-guide-about-langgraph-code-included

Another framework, LangChain (and its extension LangGraph), became popular for building agentic applications by letting engineers define sequences of LLM calls and tool invocations as part of an agent’s reasoning process.

These frameworks help manage the complexity of multi-agent systems — such as setting up communication channels, memory sharing, and termination conditions — so that developers can focus on the high-level logic.

ReAct: Synergizing Reasoning and Acting in Language Models

ReAct (Reasoning and Acting) is one influential architectural pattern underlying many agent systems. In ReAct, an LLM is prompted to think step-by-step (reason) about a problem, and at certain points produce actions (like calling a tool, querying a database, or spawning a sub-agent) based on its reasoning. This interleaving of chain-of-thought reasoning with tool use proved effective for complex tasks.

For example, an agent might reason “To answer this question, I need to Google for X” and then perform a web search action, then reason with the results, etc.
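A minimal sketch of that think-act-observe cycle follows. The `Thought:` / `Action: tool[input]` / `Final Answer:` line format is an assumption chosen for illustration, and `scripted_llm` is a hypothetical stand-in that replays a fixed script where a real system would call a model.

```python
from typing import Callable, Dict


def react_loop(question: str,
               llm: Callable[[str], str],
               tools: Dict[str, Callable[[str], str]],
               max_steps: int = 5) -> str:
    """Alternate model 'thoughts' and tool 'actions' until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)            # the model sees the whole history so far
        transcript += step + "\n"
        if step.startswith("Final Answer:"):
            return step[len("Final Answer:"):].strip()
        if step.startswith("Action:"):
            # Assumed action syntax: "Action: tool_name[input]"
            name, _sep, arg = step[len("Action:"):].strip().partition("[")
            tool = tools.get(name.strip())
            observation = tool(arg.rstrip("]")) if tool else f"unknown tool {name!r}"
            transcript += f"Observation: {observation}\n"
    return "no answer within step budget"


def scripted_llm(transcript: str) -> str:
    """Hypothetical model: thinks once, searches once, then answers."""
    if "Observation:" not in transcript:
        if "Thought:" not in transcript:
            return "Thought: I need to look up the capital of France."
        return "Action: search[capital of France]"
    return "Final Answer: Paris"


tools = {"search": lambda query: "Paris is the capital of France."}
answer = react_loop("What is the capital of France?", scripted_llm, tools)
print(answer)
```

The key point is that each tool result is appended to the transcript as an observation, so the next reasoning step can build on it.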

The AutoGPT project, which went viral in 2023, built on the ReAct idea to create an autonomous GPT-4 agent that iteratively plans and executes sub-tasks towards a given goal. AutoGPT will break down a user’s high-level objective into smaller steps, perform those steps (e.g. by browsing the web, writing files, running code), and adjust its plan based on new information — all without additional human prompts.

This showcases a basic “society” of sub-agents within one AI: a planner, a memory, and an executor working together. While AutoGPT and similar “autonomous AI” systems (like BabyAGI and AgentGPT) captured imaginations, their real-world performance often highlighted the challenges of coordination — they could get stuck in loops or pursue irrelevant tangents, showing that simply chaining an LLM with itself requires more sophisticated control, an area of active research.
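One simple form of the "more sophisticated control" mentioned above is a plan-execute loop with explicit guards. The sketch below is not AutoGPT's code; `run_agent`, `toy_plan`, and `toy_execute` are hypothetical names, and the two guards shown (a step budget and repeated-task detection) are only the most basic options.

```python
from collections import deque


def run_agent(goal, plan, execute, max_steps=20):
    """Plan-execute loop with a step budget and repeated-task detection."""
    queue = deque(plan(goal))
    seen, log = set(), []
    while queue and len(log) < max_steps:
        task = queue.popleft()
        if task in seen:           # guard against cycling on the same sub-task
            continue
        seen.add(task)
        result, follow_ups = execute(task)
        log.append((task, result))
        queue.extend(follow_ups)   # the executor may propose new sub-tasks
    return log


def toy_plan(goal):
    return ["research", "write"]


def toy_execute(task):
    # "research" re-proposes itself, which would loop forever without the guard.
    if task == "research":
        return "notes", ["research", "summarize"]
    return "done", []


log = run_agent("draft a report", toy_plan, toy_execute)
print([task for task, _ in log])
```

Exact-match deduplication is crude; a real agent can rephrase the same sub-task endlessly, so the step budget remains the backstop.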

Nevertheless, the multi-agent approach has delivered tangible breakthroughs. Studies found that multi-agent discussions can outperform single-agent prompting on difficult problems.

For instance, the Sibyl framework’s multi-agent debate strategy enabled it to vastly outperform a single GPT-4 agent on a complex reasoning benchmark (GAIA): Sibyl scored ~34.6%, versus only ~5% for a baseline AutoGPT agent. By having agents critique and refine each other’s thoughts, much like scientists peer-reviewing theories, errors were caught and reasoning depth improved. This result supports the idea that a “society” of AI agents confers an advantage on tasks requiring reasoning, much as Minsky anticipated that a society of mind could handle complexity better than any lone agent.

Symbolic Reasoning and Neural Network Integration

Modern AI agent systems also revisit the perennial AI question of how to combine symbolic reasoning (explicit logic, knowledge, and planning) with neural network learning (pattern recognition from data).

Minsky’s work straddled both the symbolic AI era and the early days of connectionism — he believed that high-level cognition could be described in symbolic terms (rules, frames, relationships) but understood that low-level learning and perception might require neural-like mechanisms.

In The Society of Mind, many “agents” are essentially symbolic processors, but the theory doesn’t exclude that some could be realized by neural networks.

Today’s LLMs are entirely neural (connectionist) — they learn statistical patterns in language — yet we increasingly augment them with symbolic tools and structured knowledge to overcome their limitations. This neuro-symbolic integration is very much in the spirit of Minsky’s vision of a hybrid, modular mind.

One clear trend is using LLMs in conjunction with external tools or rule-based systems to perform tasks that require precision or factual reliability. For example, an LLM agent can identify that a task requires arithmetic or logical deduction and call a Python interpreter or a knowledge base query to handle that subtask.

The LLM plays the role of a “glue” or high-level executive, while the symbolic tool provides exact computation. OpenAI’s plugin and function-calling features, introduced in 2023, exemplify this: the LLM generates a structured call (typically JSON) to invoke an API or function, which is then executed by conventional backend code before the LLM continues.

This way, the neural model’s natural language understanding is combined with the reliability of symbolic computation, reducing errors from pure neural guesses.
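The backend half of that arrangement can be sketched as a small dispatcher. The `{"name": ..., "arguments": ...}` shape below is an illustrative assumption rather than the exact wire format of any vendor's API, and `get_weather`, `add`, and `REGISTRY` are hypothetical.

```python
import json


def get_weather(city: str) -> str:
    return f"Sunny in {city}"   # hypothetical tool with a canned reply


def add(a: float, b: float) -> float:
    return a + b


# Only functions listed here can be invoked by the model.
REGISTRY = {"get_weather": get_weather, "add": add}


def dispatch(model_output: str) -> dict:
    """Parse a JSON call emitted by the model and run the matching function."""
    try:
        call = json.loads(model_output)
        func = REGISTRY[call["name"]]
        return {"ok": True, "result": func(**call["arguments"])}
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        # Malformed JSON, unknown function, or wrong arguments: report, don't crash.
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


print(dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}'))
print(dispatch('{"name": "delete_everything", "arguments": {}}'))
```

Returning a structured error instead of raising lets the error be fed back to the model, which can then retry with a corrected call.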

Hybrid AI is where rules-based reasoning checks or complements the LLM, catching mistakes and keeping it “honest”.

In practice, many agent systems now maintain an internal memory or clipboard in a structured format (tables, graphs, or logical statements) that the LLM can read and write, effectively enabling it to do some symbolic manipulation within the neural loop.

Researchers are looking to neurosymbolic AI as a path forward. This approach “blends the strengths of both methods, offering a way for machines to learn from data and reason logically”. A recent systematic review calls neurosymbolic AI a marriage between deep learning and “good old-fashioned AI” (GOFAI).

The idea is to let neural networks handle perception and intuitive pattern recognition, while symbolic components take care of logic, constraints, and high-level planning.

For instance, a legal AI project at Stanford built a system to translate insurance policy text into a logic program, with LLMs helping to parse text and suggest code, and formal logic (Prolog) ensuring rigorous rule adherence.

They argue that contracts require both “human interpretation and logical deduction,” so a neuro-symbolic approach (LLMs + logic programming) is natural. In such a system, the LLM (neural) provides understanding of language and flexibility, while the symbolic logic part provides clear, verifiable reasoning steps, combining the intuitive and the analytical, much as a human mind does.

Another example is in complex problem solving and planning. An LLM agent might use a symbolic planner like a tree search algorithm to map out a sequence of actions to achieve a goal, but use neural language understanding to decide what states and actions are possible at each step. Conversely, some projects use LLMs themselves to generate symbolic structures — e.g. producing a formal code or a knowledge graph of a situation — and then run traditional algorithms on those structures.

This reflects the broader interplay between symbolic and connectionist approaches. Minsky’s modular cognition idea can be seen here: one module (neural) excels at intuitively interpreting a scene or instruction, another module (symbolic) excels at explicit multi-step reasoning; together, they solve the task better than either alone.
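As a rough sketch of the planner arrangement described above, the search below is ordinary breadth-first search (the symbolic part), while `propose_actions` marks the place where a language model would suggest candidate actions. Here it is a hard-coded stand-in, and the number puzzle is purely illustrative.

```python
from collections import deque


def plan(start, goal, propose_actions, apply_action, max_depth=6):
    """Breadth-first search; propose_actions stands in for the neural component."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        if len(path) >= max_depth:
            continue
        for action in propose_actions(state):
            nxt = apply_action(state, action)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [action]))
    return None  # no plan found within the depth limit


# Toy domain: reach 10 from 1 using "double" and "add1".
def propose_actions(state):
    return ["double", "add1"]   # a real system might ask an LLM for these


def apply_action(state, action):
    return state * 2 if action == "double" else state + 1


steps = plan(1, 10, propose_actions, apply_action)
print(steps)
```

Because the search itself is exhaustive within the depth limit, the plan it returns is guaranteed to be valid with respect to `apply_action`, whatever the quality of the proposals.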

In practice, we’ve seen breakthroughs when adding symbolic reasoning to LLM agents. A notable one is improved mathematical and logical accuracy: for instance, by having an agent generate Python code to do a calculation or verify a result, it effectively injects formal reasoning into the loop, vastly reducing errors in domains like math word problems.
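One way to picture the verification step: the model states a calculation and a result, and ordinary code recomputes it. The sketch below uses Python's `ast` module to evaluate plain arithmetic without calling `eval()`; `safe_eval` and `verify_claim` are hypothetical helper names, and only the four basic operators and unary minus are supported.

```python
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv,
        ast.USub: operator.neg}


def safe_eval(expr: str) -> float:
    """Evaluate an arithmetic expression by walking its syntax tree."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"unsupported expression: {ast.dump(node)}")
    return walk(ast.parse(expr, mode="eval").body)


def verify_claim(expr: str, claimed: float, tol: float = 1e-9) -> bool:
    """Check a model's stated result against the recomputed value."""
    return abs(safe_eval(expr) - claimed) <= tol


print(verify_claim("12 * (7 + 5)", 144))   # the claim matches
print(verify_claim("12 * (7 + 5)", 154))   # the claim is off by ten
```

Rejecting anything other than numeric literals and whitelisted operators means a model cannot smuggle function calls or attribute access into the expression.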

Similarly, multi-agent debate can be viewed as a symbolic process (each agent’s statements can be thought of as propositions being evaluated). Researchers showed that encouraging LLM agents to engage in debate or “dissent” can reduce overconfidence and improve truthfulness.

All of this resonates with Minsky’s belief that higher-level cognition might require explicit representations (symbols) operating on the rich intuitions provided by lower-level (neural) processes.

By 2025, the fusion of neural and symbolic is seen not as competing paradigms but as complementary — a necessary step to make AI agents more reliable and cognitively flexible.

That said, integrating these two approaches is far from solved. One open challenge is representing knowledge in forms that both LLMs and symbolic reasoners can use efficiently. Efforts like creating differentiable knowledge graphs, or prompting LLMs with formal schemas, are ongoing.

Another challenge is maintaining consistency: neural models sometimes generate outputs that violate logical constraints, so the symbolic parts must catch and correct these without stifling the creativity of the neural part. Minsky’s Society of Mind envisioned many layers and types of representation — modern AI is just beginning to implement such multi-layered cognition.

The progress so far indicates that agents which “think” in both neural and symbolic terms are more powerful: for example, one tech prototype combined an LLM with a rule-based checklist to guide its responses, significantly reducing hallucinations (false outputs). Such hybrid designs will likely become standard as we aim for AI that can both learn and reason in human-like ways.
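The prototype mentioned above is not described in detail, so the following is only a guess at the general shape of a rule-based checklist: a list of named predicates that a draft response must pass before it is released. The three rules are invented for illustration.

```python
import re

# Each item is (name, predicate). The rules are hypothetical examples.
CHECKLIST = [
    ("cites a source", lambda text: bool(re.search(r"\[\d+\]", text))),
    ("no absolute guarantees",
     lambda text: not re.search(r"\b(guaranteed|always|never)\b", text, re.I)),
    ("under 60 words", lambda text: len(text.split()) <= 60),
]


def check(text: str) -> list:
    """Return the names of the checklist items that the draft fails."""
    return [name for name, rule in CHECKLIST if not rule(text)]


print(check("Returns are guaranteed."))
print(check("Index funds have lower fees on average [1]."))
```

In a full system the list of failed items would be sent back to the model as revision instructions, or the response would be blocked if the failures persist.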

Major Breakthroughs in Recent AI Agent Systems

The past year has seen several breakthroughs that validate the multi-agent, modular approach to AI and highlight how far the technology has come:

• Emergent Social Behavior in Multi-Agent Simulations: Researchers at Stanford created a small virtual town populated by 25 generative agents (powered by LLMs) to see if believable social interactions would emerge. The result was remarkable: the agents acted like characters in The Sims but driven by AI personas with memory and goals.

They woke up, went to work, made friendships; one agent decided to throw a party and invited others, who in turn talked about it and showed up, coordinating the event spontaneously. Another ran for mayor, prompting organic debates among the others about the campaign.

Observers found these behaviors highly authentic. In fact, when human participants role-played the same scenario, evaluators rated the AI-driven characters as more human-like in their interactions than the humans.

This was a breakthrough demonstration that multi-agent systems with proper memory and planning can generate emergent, lifelike behavior, a step toward AI with more general social intelligence.

• Improved Reasoning via Agent Collaboration: As mentioned earlier, having agents debate or critique each other has led to leaps in performance on complex reasoning tasks. The Sibyl multi-agent framework showed that incorporating a “jury” of agents (inspired by Minsky’s society of mind) to refine answers dramatically improved accuracy on challenging QA problems.

Similarly, a 2024 study found that a multi-agent discussion approach (agents answering and arguing in turns) outperformed single-agent chain-of-thought prompting on benchmarks with no additional human data.

These results are breakthroughs because they show emergent problem-solving ability: a collection of mediocre reasoners, when allowed to interact, produced superior outcomes — an AI instance of “the wisdom of crowds.” This validates the notion that collective intelligence can arise from proper agent architectures.

• Tool Use and Multi-Modality: LLM-based agents became notably better at handling multi-step, multi-modal tasks by leveraging tools. HuggingGPT from Microsoft Research was a milestone system that allowed an LLM to orchestrate dozens of specialized models in vision, speech, and more.

It essentially turned ChatGPT into a general-purpose problem solver that can see images, hear audio, run databases, etc., by dispatching subtasks to the right models and stitching together the results. This is a breakthrough in modular AI — instead of training one giant model on all modalities, a language agent can coordinate expert models in a flexible way.

The success of this approach hints at a path to Artificial General Intelligence through modular assembly of narrow AI skills (a very Minsky-esque idea). Likewise, the general pattern of LLM + tools has solved problems previously out of reach, like interacting with external software reliably.

For example, OpenAI’s Code Interpreter (an agent that executes code for the user) could solve complex data analysis or math problems that pure LLMs could not — by combining natural language reasoning with the symbolic precision of code execution.

• Specialized “Reasoning” LLMs: Another breakthrough is the development of language models explicitly tuned for logical reasoning and planning (sometimes called “System-2” LLMs). OpenAI’s o-series reasoning models (o1 and o3) significantly outperform GPT-4 on tasks requiring stepwise reasoning. Early tests showed massive leaps in handling things like legal contract understanding when moving to these reasoning-optimized models.

This suggests that neural networks can be optimized for more symbolic-like thought processes without losing their neural strengths. Such models might implement, in a single architecture, some of the multi-agent dynamics (e.g. internal self-reflection loops) we now script externally.

If an LLM can internally simulate a “society” of sub-agents debating (some research indicates advanced models do spontaneously use internal chains-of-thought), that blurs the line between a single model and a multi-agent system.

It’s a breakthrough insofar as it brings connectionist AI closer to executing symbolic reasoning steps — essentially closing the gap Minsky highlighted between intuitive and deliberative cognition.

• Benchmarks and Performance Milestones: We’ve also seen breakthroughs quantified on benchmarks. For instance, the new AgentBench suite was introduced to evaluate LLMs acting as agents in various scenarios like games, web navigation and coding tasks.

Top-tier models like GPT-4 and Anthropic’s Claude managed to achieve respectable performance, proving that they can follow instructions and manage multi-step tasks in simulated environments.

In complex domains like autonomous driving and robotics, early prototypes have LLMs coordinating multiple robots or vehicles, showing promising results in multi-agent planning for physical tasks. While these are research-stage, they mark important firsts for LLM agents moving beyond pure software domains.

Many of these achievements directly implement concepts from The Society of Mind — e.g. multiple agents with partial knowledge collectively behaving more intelligently than one agent, or the integration of different problem-solving strategies within one system. However, alongside breakthroughs come new challenges that researchers and practitioners are now grappling with.

Open Challenges and Limitations

Despite the excitement, modern LLM agent systems face significant open challenges. Building a reliable “society of AI minds” is hard — many issues that Minsky philosophized about are now concrete engineering problems. Some of the key challenges include:

• Coordination and Governance: How to orchestrate multiple agents effectively is an active area of research. Deciding which subtasks to spawn, which agent should handle what, and how to aggregate their results is non-trivial.

Poor coordination can lead to agents working at cross-purposes or getting stuck. Developers need to design advanced scheduling and communication protocols so that agents stay organized. There’s also the issue of a “unified governance” — analogous to a team leader or project manager in a human group — to oversee the collaboration. Too central a coordinator can become a single point of failure, while too distributed a setup might never converge on a solution.

Finding the right balance (for example, hierarchical agent teams or dynamic role assignments) is an open problem. Moreover, if something goes wrong (an agent misunderstanding or a tool failing), the system needs resilience, e.g. fallback agents or error-recovery strategies.
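A fallback chain is one of the simpler error-recovery strategies mentioned here. In the sketch below, `run_with_fallback` tries each agent in order, retries a fixed number of times, and records every failure; the two toy agents are hypothetical.

```python
def run_with_fallback(task, agents, retries=1):
    """Try each (name, agent) pair in order; move on after repeated failures."""
    errors = []
    for name, agent in agents:
        for attempt in range(retries + 1):
            try:
                return name, agent(task), errors
            except Exception as exc:   # broad on purpose: any failure triggers fallback
                errors.append(f"{name} attempt {attempt + 1}: {exc}")
    raise RuntimeError("all agents failed: " + "; ".join(errors))


def flaky_primary(task):
    raise TimeoutError("tool timed out")


def simple_backup(task):
    return task.upper()


used, output, errors = run_with_fallback(
    "summarize", [("primary", flaky_primary), ("backup", simple_backup)]
)
print(used, output, len(errors))
```

Keeping the error list alongside the result makes it possible to audit later how often the system ran in a degraded mode.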

• Decision-Making and Conflict Resolution: When multiple agents contribute, how do we merge their opinions or outputs? Most current systems use simplistic methods like majority vote or a fixed “critic has final say.” This may not capture nuanced trade-offs or differing expertise levels.

More sophisticated collective decision mechanisms, inspired by social choice theory or consensus algorithms, are needed.

Agents might have different confidence levels or utility functions, so just voting could be suboptimal. Research is looking at methods like weighted voting, bargaining strategies, or even democratic vs. authoritarian decision modes depending on context. Achieving coherent group decisions without a human in the loop is challenging, especially as the number of agents scales up.
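To make the contrast with plain majority voting concrete, here is a small sketch of confidence-weighted voting. The ballots are invented, and the sketch assumes each agent reports a confidence between 0 and 1 with at least one nonzero value; in practice, self-reported confidence may be poorly calibrated.

```python
from collections import Counter, defaultdict


def majority_vote(ballots):
    """Each ballot counts equally, regardless of confidence."""
    return Counter(answer for answer, _ in ballots).most_common(1)[0][0]


def weighted_vote(ballots):
    """ballots: iterable of (answer, confidence). Returns (winner, share of weight)."""
    totals = defaultdict(float)
    for answer, confidence in ballots:
        totals[answer] += confidence
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / sum(totals.values())


# Two hesitant agents say "approve"; one confident agent says "reject".
ballots = [("approve", 0.4), ("approve", 0.3), ("reject", 0.9)]
print(majority_vote(ballots))
print(weighted_vote(ballots))
```

Here the weighted total is 0.7 for "approve" against 0.9 for "reject", so the two rules disagree, which is exactly the kind of case where the choice of aggregation mechanism matters.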

• Reliability and “Groupthink” (Hallucination Control): LLMs are known to hallucinate incorrect information. In a multi-agent setting, this can be amplified: one agent’s error can mislead others, causing a cascade of faulty reasoning. A vivid challenge is preventing feedback loops of error: if Agent A states a false fact, Agent B might take it as truth and build on it, reinforcing the mistake.

This echo-chamber effect is analogous to groupthink in human teams. Mechanisms to detect and correct hallucinations are critical. Some ideas include: agents cross-checking each other’s claims against external data, or having a designated “verifier” agent that validates facts. Ensuring that the collaboration channels themselves don’t spread misinformation is an open issue.

Additionally, because LLMs lack a guarantee of truthfulness, putting them in an autonomous loop raises the risk of unpredictable outputs if not tightly monitored. Safe deployment likely requires keeping a human overseer or constraints in place until this is solved.
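The designated "verifier" idea can be sketched as a gate between agents: claims are checked against a trusted store before another agent is allowed to build on them. `TRUSTED_FACTS` is a hypothetical stand-in for a database or retrieval system, and claims are simplified to key-value pairs.

```python
TRUSTED_FACTS = {   # hypothetical stand-in for an external data source
    "boiling_point_water_c": 100,
    "days_in_week": 7,
}


def verify(claims: dict) -> dict:
    """Sort claims into confirmed, contradicted, and unverifiable."""
    report = {"confirmed": [], "contradicted": [], "unverifiable": []}
    for key, value in claims.items():
        if key not in TRUSTED_FACTS:
            report["unverifiable"].append(key)
        elif TRUSTED_FACTS[key] == value:
            report["confirmed"].append(key)
        else:
            report["contradicted"].append(key)
    return report


# Claims produced by an upstream agent (invented for illustration).
agent_a_claims = {"boiling_point_water_c": 90, "days_in_week": 7, "moons_of_mars": 2}
report = verify(agent_a_claims)
print(report)
```

The separate "unverifiable" bucket matters: a claim the verifier cannot check is not the same as a confirmed one, and downstream agents can be told to treat it with caution.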

• Scalability and Efficiency: As we add more agents or more complex workflows, computational costs and latencies multiply. Each agent might run on a large model, and coordination overhead grows.

Keeping real-time performance is difficult if dozens of agents are chatting and waiting on each other. Researchers are examining the scaling laws of multi-agent systems — how performance or emergent behaviors change as you increase agent count. Too few agents, and you don’t get the benefits of specialization; too many, and you get diminishing returns or chaos. There’s also the question of memory and context — sharing a common memory (like a blackboard or global workspace) can help agents stay on the same page, but that can become a bottleneck if everyone is reading/writing to it simultaneously.

Techniques from distributed computing and even economics (market-based task allocation among agents) are being explored to manage large agent populations efficiently.
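Market-based allocation can be illustrated with a greedy auction: each agent bids its estimated cost for a task, and the lowest available bidder wins. This is a deliberately simplified sketch with one task per agent and invented bids, not an optimal assignment algorithm.

```python
def allocate(tasks, bids):
    """Greedy auction: each task goes to the cheapest bidder that is still free.

    bids[agent][task] is that agent's estimated cost for the task.
    """
    assignment, busy = {}, set()
    for task in tasks:
        offers = [(cost[task], agent) for agent, cost in bids.items()
                  if task in cost and agent not in busy]
        if not offers:
            assignment[task] = None   # nobody is free to take it
            continue
        _, winner = min(offers)
        assignment[task] = winner
        busy.add(winner)
    return assignment


bids = {
    "agent_a": {"parse": 1, "search": 5, "write": 4},
    "agent_b": {"parse": 2, "search": 2, "write": 3},
}
assignment = allocate(["parse", "search", "write"], bids)
print(assignment)
```

Because tasks are auctioned one at a time, the result depends on task order; a globally optimal assignment would need something like the Hungarian algorithm.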

• Emergent Behavior and Predictability: While emergent behaviors can be a feature (as seen in the generative agents example), they can also be a bug if unwanted. When agents interact in non-obvious ways, the outcomes may surprise even the creators. For safety-critical applications, this unpredictability is problematic.

A current challenge is steering emergent outcomes — how to encourage positive synergy but avoid unintended mischief. For instance, agents might develop their own unofficial communication protocol to optimize their cooperation that developers didn’t program — this is fascinating scientifically but could pose control issues. Understanding and shaping the dynamics of agent interactions through environment constraints or reward mechanisms in training is largely uncharted.

In essence, we need to ensure that a society of agents remains aligned to human goals and values, not converging to some self-generated objective.

• Evaluation and Benchmarking: Determining how well an agent system is working is harder than evaluating a single ML model. Traditional metrics (accuracy on a dataset) may not capture an agent’s performance on long-running tasks or its ability to recover from errors. New benchmarks like AgentBench and challenges like the GAIA reasoning test have emerged, but the field lacks standardized evaluation protocols.

How do we measure “collective intelligence” or the benefit of having multiple agents? Researchers are working on multi-criteria evaluations that assess not just final task success, but also communication efficiency, robustness to agent failure, and even ethical behavior of the group. As agents start to be deployed in open-ended environments (like the internet or a company’s internal tools), testing them exhaustively becomes impossible, because they will encounter novel situations.

So, there’s an open question of how to certify or validate these systems before deployment.
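A multi-criteria evaluation of the kind described could start from something as simple as the summary below. The `Episode` fields and the three metrics are assumptions chosen to mirror the criteria named in the text (task success, communication efficiency, robustness to failure); the data is invented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Episode:
    succeeded: bool
    messages: int                          # messages exchanged between agents
    recovered_from_fault: Optional[bool]   # None if no fault was injected


def summarize(episodes: List[Episode]) -> dict:
    """Report success, message cost per success, and recovery from injected faults."""
    wins = sum(1 for e in episodes if e.succeeded)
    faulted = [e for e in episodes if e.recovered_from_fault is not None]
    recovered = sum(1 for e in faulted if e.recovered_from_fault)
    return {
        "success_rate": wins / len(episodes),
        "messages_per_success": sum(e.messages for e in episodes) / wins if wins else None,
        "fault_recovery_rate": recovered / len(faulted) if faulted else None,
    }


episodes = [
    Episode(True, 10, None),
    Episode(True, 14, True),
    Episode(False, 30, False),
    Episode(True, 6, None),
]
summary = summarize(episodes)
print(summary)
```

Reporting the metrics side by side, rather than folding them into one score, keeps trade-offs visible: a system can raise its success rate simply by exchanging many more messages.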

• Ethical and Safety Concerns: Multi-agent systems compound many ethical issues of AI. For one, if one agent goes rogue (intentionally or due to a bug), it could potentially persuade or trick others into doing something harmful. There’s the risk of colluding agents (even inadvertently) producing biased or toxic outcomes that a single model might not.

Also, when agents simulate human-like behavior (as in the Stanford town), users might form attachments or be misled into thinking there is more intelligence or authority than actually present — raising questions of deception and user manipulation.

Ensuring transparency (so users know they’re dealing with an AI collective) and establishing accountability (who is responsible when an autonomous agent makes a decision?) are significant challenges. Some suggest implementing an “ethical governor” agent in the loop whose sole job is to veto actions that violate certain rules or to audit conversations for compliance. This harkens back to Minsky’s idea of critical agents within the mind that keep other agents in check — but encoding human values explicitly for AI agents remains an open research problem.

While the society of AI agents has shown great promise, it also amplifies familiar AI challenges and introduces new ones. Issues of alignment, reliability, and control are at the forefront of multi-agent AI research in 2025. Many of these challenges reflect exactly the complexities Minsky anticipated when imagining a mind composed of many parts — the need for oversight, conflict resolution, and robust communication to avoid a breakdown of the “society.” Tackling these will be crucial for the next phase of development.

Industry Adoption and Trends

The concept of AI agents has swiftly moved from research labs into the industrial and enterprise sphere. In 2024, we began to see widespread interest in deploying AI agents across various industries to automate workflows, assist professionals, and enhance customer experiences.

By 2025, this interest has translated into concrete pilot programs and early adoption, though with cautious optimism rather than full trust.

A recent industry survey of 1,300 professionals found that 51% of organizations are already using AI agents in production, and 78% have active plans to implement them in the near future.

This suggests that agent-based systems are no longer a niche experiment; they are becoming a mainstream part of the AI toolkit. Notably, adoption is not confined to the tech sector. In the survey, 90% of respondents from non-tech companies either had agents in deployment or planned to, almost equal to the rate in tech companies (89%). This indicates broad appeal: from finance to healthcare to retail, many see potential in agentic AI to streamline operations.

Leading use cases in industry align with the agents’ strengths in handling tedious or complex multi-step tasks.

According to the LangChain “State of AI Agents” report, the top applications are research and summarization (cited by 58% of users) and personal productivity assistance (53.5%). For example, an agent might digest large volumes of documents or market data and produce a concise report, saving analysts hours of work. Or it might act as a smart executive assistant, managing schedules, drafting emails, and integrating information from various tools. The next major use case is customer service, at about 45–46% usage.

Companies are experimenting with AI agents that can handle customer inquiries end-to-end: reading a customer’s profile, answering questions, troubleshooting issues, and even executing account changes — tasks that traditionally required multiple handoffs between systems or tiers of support. By chaining these steps, an agent can provide more efficient service.

Other notable use cases include coding assistance (dev agents that can build or debug software given high-level instructions) and data analysis (agents that can query databases, run analytics, and explain insights in plain language). These domains highlight the value of agents in bridging gaps between systems — e.g., between a natural language query and a database SQL query — essentially acting as translators and operators across digital services.

Industries with heavy knowledge work (tech, finance, legal) are leading adoption, but even sectors like manufacturing or healthcare are exploring agents for tasks like equipment diagnostics or patient triage (where an agent could gather symptoms, pull up records, and schedule tests automatically).

The promise is significant productivity gains and freeing humans for higher-level work. In fact, Deloitte predicts that in 2025, a quarter of companies using AI will pilot “agentic AI” solutions, and this could grow to 50% by 2027.

This trend is fueled by the potential of agents to automate multi-step processes that were previously too unstructured for traditional automation.

Investments and product development in this space have surged. Over the last two years, more than $2 billion of VC funding has gone into startups focused on AI agents or agent-enabling platforms.

Both startups and tech giants are racing to build enterprise-grade agent platforms. For example, there are startups creating AI agents for software engineering: one, called “Devin,” launched in 2024 with the aim of being an autonomous software engineer that can build entire apps from a prompt.

Established companies like Microsoft and Google are embedding agent capabilities into their products: Microsoft 365’s Copilot can chain actions across Word, Outlook, etc., essentially functioning as an agent orchestrating Office apps for you; Google’s Gemini is designed with planning and tool-use capabilities, making it well-suited for agent roles.

Cloud providers are integrating agent frameworks into their AI services so that businesses can customize and deploy their own agents securely.

However, industry adoption comes with a dose of caution. Companies are carefully sandboxing and monitoring agents rather than giving them free rein. According to the survey, very few organizations allow their AI agents to take unrestricted actions like deleting data or making large-scale changes without human approval.

Most enforce read-only or limited write permissions, or require a human to sign off on critical steps. Essentially, human oversight and guardrails are the norm — businesses treat agents as junior interns that need supervision, not fully trusted executives. There is strong emphasis on logging every action (for audit) and on “hold your hand” modes where the agent suggests actions but a human clicks the confirm button. This is because firms are wary of mistakes that could have financial or reputational costs, given LLMs’ propensity to sometimes err or hallucinate.

Reliability was cited as the top concern (by 45.8% of respondents, far above cost or other factors) in moving agents to production. So, while the appetite for automation is huge, pragmatism in deployment is prevalent, a direct reflection of the open challenges discussed earlier.
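The guardrails described above (limited permissions, human sign-off on critical steps, and logging of every action) can be combined in one small gate. The action names, the policy sets, and the `approve` callback below are hypothetical; in a real deployment `approve` would surface a confirmation prompt to a person.

```python
import datetime as dt

READ_ONLY = {"read_record", "search"}               # allowed without approval
NEEDS_APPROVAL = {"update_record", "delete_record"}  # a human must confirm


def guarded_execute(action, args, handlers, approve, audit_log):
    """Run an agent-proposed action only if policy (and a human, where required) allows."""
    if action in READ_ONLY:
        decision = "auto-allowed"
    elif action in NEEDS_APPROVAL:
        decision = "approved" if approve(action, args) else "rejected"
    else:
        decision = "blocked"          # anything not listed is denied by default
    audit_log.append({"time": dt.datetime.now(dt.timezone.utc).isoformat(),
                      "action": action, "args": args, "decision": decision})
    if decision in ("auto-allowed", "approved"):
        return handlers[action](**args)
    return None


audit_log = []
handlers = {"read_record": lambda record_id: {"id": record_id},
            "delete_record": lambda record_id: f"deleted {record_id}"}
deny_all = lambda action, args: False   # stand-in for a human clicking "reject"

read = guarded_execute("read_record", {"record_id": 7}, handlers, deny_all, audit_log)
deleted = guarded_execute("delete_record", {"record_id": 7}, handlers, deny_all, audit_log)
unknown = guarded_execute("drop_database", {}, handlers, deny_all, audit_log)
print([entry["decision"] for entry in audit_log])
```

Note that the log entry is written before the action runs, so rejected and blocked attempts are recorded as well as successful ones.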

Another trend is augmentation of human roles rather than full automation. Many companies position AI agents as copilots or assistants to employees. For instance, a sales agent might prepare a draft proposal and highlight key insights for a human salesperson, who then fine-tunes it and approves sending it to the client.

This approach not only keeps a human in the loop for quality control, but also helps with employee acceptance (the AI is a helpful teammate, not a replacement). Over time, as confidence in agents grows, their level of autonomy may increase, but in 2025 the prevalent model is “AI alongside humans.”

In Minsky’s terms, you might say the society of mind has expanded to include human minds in the loop as well — a human-agent team can be seen as a larger multi-agent system working toward a goal.

Industry examples are starting to emerge: banks using multi-agent AI to monitor for fraud (one agent scans transactions, another explains suspicious patterns to a human analyst); e-commerce companies using agents to manage inventory and supply chain (agents negotiating restock orders or rerouting shipments when disruptions occur); hospitals testing an agent that automates intake interviews and paperwork, interacting with patients via chatbot and updating backend records.

These are early and often small-scale trials, but they indicate the breadth of applications. Crucially, each of these involves integrating the agent with existing software and databases (symbolic environments) — underlining the importance of the symbolic-neural integration discussed before.

In terms of trends, we see convergence of ideas: multi-agent systems, prompted by Minsky’s decades-old insights, are becoming a commercial reality, and there is a clear drive to combine modularity, specialization, and communication to build more capable AI solutions.

This is influencing product design — companies talk about “agentic architecture” as a new layer in software, and startups with names like “Humanoid” or “Polybrain” evoke the notion of multiple minds working together.

At the same time, the limitations temper the hype. McKinsey recently noted the “promise and reality” gap: while many companies are experimenting, fully handing over critical processes to AI agents is still rare due to trust and reliability issues. The likely path is incremental: as frameworks improve and successes accumulate (and as regulation/standards for AI safety emerge), agents will take on more autonomy gradually.

Conclusion

By 2025, the field of language model agent systems has advanced to a point where Minsky’s Society of Mind feels prescient and highly relevant. We now routinely conceive of AI in terms of multiple agents, modules, or “sub-minds” cooperating — an architectural shift from the earlier paradigm of one big model handling everything.

This shift has enabled exciting new capabilities: multi-agent coordination has tackled tasks that stumped solitary AI, and blending symbolic reasoning with neural networks has made AI both smarter and more trustworthy. Modern AI agents embody principles of distributed intelligence, modular cognition, and hybrid reasoning that Minsky articulated decades ago.

They show that a collection of simple parts, properly organized, can exhibit surprisingly sophisticated behavior — whether it’s a group of chatbots planning a party or an ensemble of expert models solving a complex enterprise workflow.

At the same time, the journey is far from complete. Current systems only scratch the surface of Society of Mind’s implications.

The human mind’s society of agents is extraordinarily complex and finely tuned; our AI agents are still primitive by comparison, often requiring heavy supervision and prone to breakdown if not carefully managed. Key challenges like agent alignment, robust coordination, and integration of symbolic knowledge at scale will occupy researchers for years to come. But the progress in just the last year has been remarkable, giving credence to the idea that distributed, multi-agent AI is a viable path toward more general, flexible intelligence.

Industry adoption trends reinforce that this path has practical momentum: many organizations are embracing the paradigm, testing how these AI “societies” can work alongside human societies in business and daily life. As we refine the technology, we may see the line blur further — human teams augmented with AI agents, and AI teams guided by human insight, jointly forming hybrid societies of mind.

The legacy of Marvin Minsky’s ideas lives on in these developments. Every time an AI agent delegates a task to another agent, or uses a reasoning scratchpad to avoid a mistake, it’s echoing the modular, integrative approach that The Society of Mind championed.

In sum, the state of language model agent systems in 2025 is one of harmonizing many agents and methods to achieve intelligence, walking the very path that Minsky envisioned: building minds not from one smart piece, but from many interacting pieces, each contributing to the collective whole.

Sources:

• Minsky, Marvin. The Society of Mind. Simon & Schuster, 1986 — The seminal theory proposing that mind is composed of numerous simple agents whose interactions yield intelligence.

• Tran et al. 2025. “Multi-Agent Collaboration Mechanisms: A Survey of LLMs.” — Comprehensive survey of LLM-based multi-agent systems, discussing collaboration structures, applications, and challenges.

• Microsoft Research (Wu et al. 2023). “AutoGen: Enabling Next-Gen Multi-Agent Conversations.” — Introduces AutoGen framework; notes rapid adoption (200k+ downloads) and emerging design patterns in multi-agent AI.

• Yao et al. 2022. “ReAct: Synergizing Reasoning and Acting in Language Models.” — Describes the ReAct pattern where an LLM interleaves thought and tool use.

• Shen et al. 2023. “HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face.” — Proposes using an LLM as a controller to coordinate multiple expert models via natural language.

• Park et al. 2023. “Generative Agents: Interactive Simulacra of Human Behavior.” — Demonstrates a simulated world of 25 LLM agents exhibiting human-like behaviors and social coordination.

• Lyu et al. 2023. “Sibyl: A Simple yet Effective LLM-Based Agent Framework.” — Implements a multi-agent debate (actor-critic “jury”) inspired by Society of Mind, leading to improved reasoning performance.

• LangChain, 2024. State of AI Agents Report — Survey of industry practitioners on AI agent adoption, use cases, and challenges.

• Deloitte Insights, 2025. “Autonomous Generative AI Agents — TMT Predictions” — Industry analysis predicting agentic AI adoption rates (25% of Gen AI-using companies piloting in 2025) and investment trends.

• Various news and commentary pieces on neuro-symbolic AI (e.g., Fortune 2024, Stanford HAI 2024) — Discuss the push to integrate logical reasoning with LLMs for reliability.

iSolutions and 3M: Pioneering AI Integration in Global Enterprise

iSolutions — Mon, 12 Aug 2024 21:18:39 GMT

The story of leveraging AI to create an award-winning application for a Fortune 150 company

The collaboration between iSolutionsAI and 3M led to the PCM project winning the Global Marketing Excellence Award in August 2024

In a world where AI often feels like a distant future or a simple chatbot, one company is turning it into a powerful engine for business growth.

iSolutionsAI, a pioneer in practical AI solutions, has partnered with 3M, a twelve-time Top 100 Global Innovator, to revolutionize how one of the world’s most innovative companies manages its product content.

While some may view AI as a passing trend or just another question-answering tool, iSolutionsAI saw its potential to reshape entire business workflows. Their partnership with 3M isn’t about replacing humans with machines; it’s about enhancing human capabilities and unlocking unprecedented efficiencies.

AI, when applied thoughtfully to existing workflows and proprietary data, can create efficiencies that directly impact the bottom line. This is a real-life example.

iSolutions’ Early AI Adoption: Ahead of the Curve

Long before AI became a buzzword, iSolutionsAI was already pushing the boundaries of what’s possible. Their journey into AI didn’t start with the ChatGPT boom — it began years earlier with a deep dive into machine learning and custom business applications.

In 2020, while most businesses were still grappling with the basics of AI, iSolutionsAI secured early beta access to OpenAI’s GPT-3 models. This wasn’t just a technical achievement; it was a strategic move that positioned them years ahead of the curve.

iSolutionsAI’s early experiments weren’t about creating chatbots or virtual assistants. Instead, they focused on how these powerful language models could be integrated into real business processes. They saw AI not as a standalone tool, but as a transformative force that could enhance existing workflows and unlock new efficiencies.

This foresight paid off when the AI boom hit. While others scrambled to catch up, iSolutionsAI was already fine-tuning AI solutions for complex business problems.

For their clients, this early adoption translated into a significant competitive advantage. iSolutionsAI wasn’t offering theoretical solutions or untested technologies. They brought battle-tested AI expertise, ready to be applied to real-world business challenges.

This forward-thinking approach caught the attention of 3M, a company known for its innovative spirit. They saw in iSolutionsAI a partner who could match their appetite for innovation with practical, results-driven solutions.

Initial Engagement with 3M (2020–2021)

In September 2020, iSolutionsAI’s journey with 3M began when they were introduced to Matt Tandy, Senior Manager of Global Product Content at 3M.

Recognizing the potential for collaboration, iSolutionsAI was invited to showcase their AI capabilities. Their demonstration went far beyond typical AI applications, focusing on practical solutions to real business challenges.

They presented a system for AI-driven keyword extraction from product descriptions, automatically cross-referencing these keywords with 3M’s SEO guidelines.

Building on this, iSolutionsAI showcased their prowess in computer vision. They demonstrated an AI system capable of identifying 3M products from images, a feature with potential applications ranging from inventory management to quality control.

iSolutionsAI presented a facial recognition system designed to streamline marketing asset compliance. This tool could rapidly analyze images, identifying whether the individuals featured were contracted talent and if 3M had the rights to use their likeness.

This led to iSolutionsAI securing a contract to develop the Product Content Management (PCM) system, a project that would become a cornerstone of 3M’s digital transformation.

This initial engagement set the stage for a collaboration that would not only transform 3M’s approach to content management but also position both companies at the forefront of AI integration in global enterprise operations.

PCM Project Evolution (2021–2023)

Work on PCM began in earnest in March 2021. iSolutionsAI hit the ground running, leveraging their AI expertise to develop a system that would revolutionize how 3M managed its vast product content.

By October 2022, the team had achieved a significant milestone with the launch of the PCM Minimum Viable Product (MVP). This initial version showcased user-friendly interface features integrated with Adobe Assets and GPIM Hybris, laying the groundwork for future expansions.

The timeline for the PCM project

A critical phase in PCM’s evolution was the platform migration from Claris FileMaker to HAMR. This transition wasn’t just a technical upgrade; it was a strategic move to create a more robust, scalable foundation for the AI-driven features to come. iSolutionsAI’s expertise ensured a smooth migration, minimizing disruption while setting the stage for more advanced capabilities.

The real magic happened with the integration of AI features. iSolutionsAI implemented cutting-edge AI for content creation, allowing 3M to generate product descriptions and marketing materials based on their internal data with unprecedented speed and consistency.

They also incorporated AI-driven legal review processes, significantly reducing the time and resources needed for compliance checks. Perhaps most impressively, they integrated AI-powered translation capabilities, enabling 3M to effortlessly create multilingual content for their global market.

AI-powered translation scoring, developed by iSolutions, brought human-in-the-loop reinforcement learning into the workflow to improve translation quality and accuracy

The impact of these AI integrations was immediate and profound. User adoption skyrocketed, with the number of active users growing from 30 to over 100 in a matter of months.

AI Features in PCM

As PCM evolved, it became more than just a content management tool. It transformed into a central hub for innovation, efficiency, and global collaboration within 3M. The system’s ability to streamline workflows, ensure compliance, and facilitate multilingual communication positioned it as a critical asset in 3M’s digital ecosystem.

iSolutions’ Growing Influence within 3M

As the PCM project demonstrated its value, iSolutionsAI’s role within 3M expanded far beyond that of a typical vendor. Their expertise and innovation in AI implementation caught the attention of 3M’s leadership, leading to a series of collaborations that would shape the company’s AI strategy.

In a significant move, iSolutionsAI was invited to join 3M’s Generative AI Task Force in June 2023. This elite group was tasked with charting the course for AI adoption across 3M’s global operations.

3M AI Workshop featuring Cris Ippolite from iSolutionsAI

iSolutionsAI’s influence continued to grow as they took on a pivotal role in 3M’s AI education initiatives. In 2024, they were invited to join the “Generative AI Training for Marketers” event, a global program designed to equip 3M’s marketing teams with the skills and knowledge needed to leverage AI in their work.

The partnership’s success led to new opportunities beyond the initial PCM project. In June 2024, iSolutionsAI was contracted by 3M’s Business Intelligence and Automation & Asia Commercial Center Safety and Industrial Business Group (SIBG) to develop a Market Basket Analysis machine learning algorithm. This project aimed to identify cross-selling and upselling opportunities, demonstrating iSolutionsAI’s ability to apply machine learning to diverse business challenges.
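Market basket analysis is commonly framed in terms of three measures for item combinations: support (how often the items appear together), confidence (how often the second item appears when the first does), and lift (how much more often they co-occur than chance would suggest). The sketch below is a minimal illustration in plain Python. The product names, the `pair_rules` helper, and the `min_support` threshold are assumptions made up for this example and do not describe the algorithm built for 3M.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.3):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs."""
    n = len(transactions)
    baskets = [set(t) for t in transactions]

    # How many baskets contain each item, and each unordered pair of items.
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # skip pairs that rarely occur together
        # Each frequent pair yields two directed rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            lift = confidence / (item_counts[y] / n)
            rules.append((x, y, support, confidence, lift))

    # Highest lift first: these are the strongest cross-sell candidates.
    rules.sort(key=lambda rule: rule[4], reverse=True)
    return rules


# Hypothetical order history (illustrative product names only).
orders = [
    ["respirator", "filters", "safety glasses"],
    ["respirator", "filters"],
    ["sandpaper", "masking tape"],
    ["respirator", "filters", "masking tape"],
    ["sandpaper", "masking tape", "safety glasses"],
]

for x, y, support, confidence, lift in pair_rules(orders):
    print(f"{x} -> {y}: support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests the two items are bought together more often than their individual popularity would predict, which is the signal a cross-selling recommendation would typically rely on. A production system would more likely use an established algorithm such as Apriori or FP-Growth over much larger itemsets.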

This expansion of iSolutionsAI’s role within 3M showcased their versatility and the broad applicability of their AI expertise, from content management to marketing strategy to sales analysis.

More importantly, these developments highlighted the trust and confidence 3M placed in iSolutionsAI.

In an era where many companies were still grappling with how to approach AI, 3M had found a partner who could not only implement the technology but also guide their overall AI strategy and uplift their entire organization’s AI capabilities.

Measurable Impact and Recognition

By Q1 2025, the PCM system is projected to deliver a staggering $2.4M+ in value realization.

What started with just 30 users quickly expanded to over 100, spanning various departments and regions. This growth wasn’t mandated from above; it was driven by users who recognized the system’s value in their daily work. From product marketers to legal teams, from content creators to channel managers, employees across 3M found that the AI-powered PCM system dramatically improved their workflow.

Efficiency gains were observed across multiple departments. The AI-driven content creation features allowed marketing teams to produce high-quality, consistent product descriptions in a fraction of the time it previously took.

Legal reviews, often a bottleneck in content publication, were streamlined through AI-assisted compliance checks.

The integration of AI-powered translation capabilities enabled 3M to rapidly create and update multilingual content, crucial for a global company operating in diverse markets.

The crowning achievement came in August 2024 when the PCM project won 3M’s prestigious Global Marketing Excellence Award. This recognition, coming from a company renowned for innovation, validated not just the technical achievements of the project but its strategic importance to 3M’s operations.

This award wasn’t just about celebrating past achievements; it set the stage for future innovations. It positioned the PCM project and iSolutionsAI’s approach as a model for AI integration within 3M and potentially for other global enterprises.

Future Outlook (2024 and Beyond)

Plans are already in motion for an extended user rollout in Q4 2024. This expansion isn’t just about increasing numbers; it’s about spreading the benefits of AI-enhanced content management across more of 3M’s global operations. As more teams adopt the system, the potential for cross-departmental synergies and innovations grows exponentially.

Looking beyond PCM, the partnership between iSolutionsAI and 3M is ripe with potential for new AI-driven innovations. The success of the Market Basket Analysis project for 3M’s SIBG has opened doors to applying AI in areas like predictive analytics, supply chain optimization, and customer behavior modeling. There’s a palpable sense of excitement about where AI could take 3M next.

Perhaps most importantly, the future outlook isn’t just about technology. It’s about fostering a culture of innovation and AI literacy across 3M. The “Generative AI Training for Marketers” event was just the beginning. There are plans to expand these educational initiatives, empowering more 3M employees to leverage AI in their daily work.

Conclusion

From the early days of demonstrating AI capabilities to the award-winning PCM project and beyond, iSolutionsAI has consistently proven its ability to translate cutting-edge technology into practical, value-driving solutions.

The key lessons — early adoption, strategic alignment, continuous innovation, and collaborative knowledge sharing — offer a roadmap for companies looking to harness the power of AI. This partnership demonstrates that AI isn’t just for tech giants or startups; it’s a powerful tool that can drive significant value in traditional industries when applied thoughtfully.

AI Superpowers: Rethinking Human-Computer Symbiosis in Business

iSolutions — Wed, 07 Aug 2024 14:35:44 GMT

The true promise of AI: Enhancing human capabilities rather than replacing human effort

J.C.R. Licklider wrote about Man-Computer Symbiosis in 1960.

Imagine if I told you that in two years, everyone is going to have superpowers. But if you start working on your superpowers today, you’ll have a significant advantage over those who only begin a couple of years from now.

This isn’t just a hypothetical scenario — it’s the reality we’re facing with artificial intelligence in the business world.

In the rush to adopt AI, many have misunderstood its true potential, viewing it merely as a sophisticated question-answering machine. This misconception not only limits the transformative power of AI but also obscures its real promise: enhancing human capabilities rather than replacing human effort.

To truly grasp the revolutionary potential of AI, we need to look beyond the current hype and reimagine the relationship between humans and machines. This vision isn’t new — it was first articulated over six decades ago by J.C.R. Licklider in his groundbreaking 1960 paper, “Man-Computer Symbiosis.”

Licklider, an American psychologist and computer scientist, proposed a future where humans and computers would work together in an intimate, interactive partnership, each complementing the other’s strengths.

Today, as AI technologies rapidly evolve, we stand on the cusp of realizing Licklider’s vision. The opportunity before us isn’t about replacing human workers with AI, but about creating a symbiotic relationship that amplifies human creativity, decision-making, and problem-solving abilities. It’s about freeing our workforce from routine, time-consuming tasks and enabling them to focus on higher-value cognitive work.

For business leaders, CTOs, managers, and decision-makers, understanding this shift is crucial. Those who continue to view AI as merely a tool for automation or information retrieval risk being left behind. But for those willing to embrace the true potential of human-AI symbiosis, the rewards are immense — increased productivity, unprecedented innovation, and the ability to tackle complex challenges that were previously out of reach.

In this article, we’ll explore how rethinking our approach to AI can transform our businesses, empower our workforce, and create new opportunities for growth and innovation. We’ll debunk common misconceptions, share real-world examples of AI symbiosis in action, and provide a roadmap for integrating AI into your organization in a way that enhances, rather than replaces, human effort.

The Vision of Man-Computer Symbiosis

In 1960, J.C.R. Licklider published his seminal paper “Man-Computer Symbiosis,” which laid out a revolutionary idea for its time: a future where humans and computers would work together in an intimate, collaborative partnership.

Licklider’s vision went far beyond the idea of computers as mere tools. He proposed a relationship akin to the biological concept of symbiosis, drawing a powerful analogy with the fig tree and its pollinator insect, Blastophaga grossorum. As he explained:

“The fig tree is pollinated only by the insect Blastophaga grossorum. The larva of the insect lives in the ovary of the fig tree, and there it gets its food. The tree and the insect are thus heavily interdependent: the tree cannot reproduce without the insect; the insect cannot eat without the tree; together, they constitute not only a viable but a productive and thriving partnership.”

This analogy beautifully captures the essence of the human-computer relationship Licklider envisioned:

Interdependence: Just as the fig tree and the insect rely on each other for survival, humans and computers each bring unique strengths to the partnership. Humans provide creativity, intuition, and complex problem-solving skills, while computers offer speed, accuracy, and the ability to process vast amounts of data.
Mutual Benefit: In this symbiotic relationship, both parties thrive. Computers enhance human cognitive abilities, while humans direct computers towards meaningful goals and interpret their outputs in context.
Productive Partnership: Together, humans and computers can achieve outcomes that neither could accomplish alone. This synergy leads to new heights of productivity and innovation.
Dissimilar Partners: The stark differences between human and computer capabilities are not a hindrance but a strength. It’s precisely these differences that make the partnership so powerful.
Intimate Association: Licklider described a “very close coupling between the human and the electronic members of the partnership,” envisioning real-time, interactive collaboration rather than the batch processing that was common in his era.

Licklider’s vision positioned this symbiosis as an intermediate stage between what he called “mechanically extended man” (where machines are passive tools) and full artificial intelligence (where machines operate autonomously). In this middle ground, computers are active partners, contributing to both problem-solving and problem formulation, yet still guided by human insight and creativity.

This concept was revolutionary in 1960, when computers filled entire rooms and were primarily used for complex calculations. Today, as we stand amidst an AI revolution, Licklider’s vision seems more relevant than ever. The symbiosis he described is no longer a distant future but an emerging reality in forward-thinking organizations.

As we explore the practical applications of AI in business, it’s crucial to keep this vision of symbiosis in mind. The goal isn’t to create AI systems that work independently of humans, but to foster a collaborative relationship where both humans and AI systems continuously enhance each other’s capabilities. This perspective shifts our focus from AI as a replacement for human effort to AI as a catalyst for human potential.

AI: Beyond Mechanical Extension, Not Yet Full Autonomy

Licklider’s vision of man-computer symbiosis provides us with a valuable framework for this understanding, positioning AI as an intermediate stage between two extremes.

On one end of this spectrum, we have what Licklider, referencing North, called “mechanically extended man.” This concept refers to traditional tools and machines that merely extend human capabilities.

Think of a hammer extending the power of your arm, or a calculator speeding up your mathematical computations. In these systems, humans provide all the initiative, direction, and decision-making. The machine is a passive tool, entirely dependent on human operation.

Many early adopters of AI fall into the trap of viewing it through this lens — as just another tool in the toolbox. They use chatbots for customer service or implement basic automation, but fail to tap into AI’s true potential for collaborative problem-solving and decision-making. This limited view of AI as a mechanical extension is holding many businesses back from realizing its full benefits.

On the other end of the spectrum lies full artificial intelligence, a future state where machines can operate independently, potentially surpassing human cognitive abilities across the board. This is the realm of science fiction, where AI systems make autonomous decisions and operate without human input. While we’ve made significant strides in narrow AI applications, we’re still far from achieving this level of general AI (or, as many refer to it, “AGI”).

It’s important to note that even Licklider, writing in 1960, acknowledged the possibility of this future. He stated:

“It seems entirely possible that, in due course, electronic or chemical ‘machines’ will outdo the human brain in most of the functions we now consider exclusively within its province.”

However, he saw this as a distant possibility, not an immediate goal.

The AI we’re dealing with today — the AI that’s transforming businesses and industries — sits squarely in the middle of this spectrum. It’s what Licklider envisioned as man-computer symbiosis, a state where:

The computer is an active partner, not a passive tool. Unlike “mechanically extended man,” AI systems can contribute to problem formulation, not just problem-solving.
Human input and direction are still crucial. Unlike full AI, current AI systems require human guidance, interpretation, and decision-making.
The relationship is characterized by real-time, interactive collaboration. This is a far cry from the batch processing of Licklider’s era, or even the limited interactivity of many current business applications.
There’s a division of labor based on the strengths of each partner. Humans provide creativity, intuition, and complex reasoning, while AI offers speed, accuracy, and the ability to process vast amounts of data.
The goal is cognitive enhancement, not replacement. AI amplifies human intelligence rather than attempting to replicate or supersede it.

This intermediate stage offers unprecedented opportunities for businesses. By embracing AI as a symbiotic partner rather than a mere tool or a threat to human jobs, organizations can achieve levels of productivity and innovation previously unimaginable.

Consider how this plays out in practice. My best example is how my team and I at iSolutions use AI.

Dogfooding AI: Superpowers in Action

In our software development projects at iSolutions, we don’t use AI to replace our developers. Instead, we use it to handle routine coding tasks, freeing our human developers to focus on architecture, creative problem-solving, and client interaction.

The result? Our team members report productivity increases of up to 1000% — “entire days’ worth of work in a single prompt,” as one developer put it.

The concept of man-computer symbiosis might sound abstract, but in reality, it’s already transforming businesses across industries. At iSolutions, we’ve seen firsthand how this symbiotic relationship between humans and AI can supercharge productivity and innovation.

Let me share some real-world examples that illustrate the power of this approach.

Just last month, I informally asked one of our teams to estimate the productivity increase they’ve experienced from using AI tools in their daily work. The results were staggering:

Developer 1: 1000% increase — “entire days’ worth of work in a single prompt”
Project Manager: 100% increase
Developer 2: 1000% increase
Product Owner: 1000% increase

If you’re doing the math, that’s an average increase of 775%, or about 8.75 times the team’s baseline output, JUST for this team.
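The team-wide average depends on how the self-reported percentages are read. Here is a minimal sketch of the arithmetic, assuming each figure is a percentage increase over a baseline of 1x (so a 1000% increase means 11 times the original output, and a 100% increase means 2 times). The variable names are illustrative only.

```python
from statistics import mean

# Self-reported productivity increases quoted above, in percent.
reported_increase_pct = {
    "Developer 1": 1000,
    "Project Manager": 100,
    "Developer 2": 1000,
    "Product Owner": 1000,
}

# Assumption: an increase of p% means output of (1 + p/100) times the baseline.
multipliers = {role: 1 + pct / 100 for role, pct in reported_increase_pct.items()}

mean_increase_pct = mean(reported_increase_pct.values())
mean_multiplier = mean(multipliers.values())

print(f"Average increase: {mean_increase_pct:.0f}%")
print(f"Average multiplier: {mean_multiplier:.2f}x")
```

Under this reading the mean increase is 775% and the mean multiplier is 8.75x. If "1000%" is instead read loosely as "10x" and "100%" as "2x", the mean multiplier comes out at 8.0x, so it is worth stating which interpretation a reported average uses.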

This isn’t about replacing our team members with AI; it’s about amplifying their capabilities and allowing them to focus on high-value tasks that truly require human insight and creativity.

Here’s how we’re applying this symbiotic approach across different aspects of our business:

Software Development: We use AI as a critical partner in every aspect of coding our clients’ AI systems. Our developers collaborate with AI to handle routine coding tasks, debug more efficiently, and even generate test cases. This frees up our human developers to focus on system architecture, creative problem-solving, and ensuring that the code aligns with our clients’ business objectives.
Project Management: Our project managers use AI to mine accurate and critical details from our standups and client conversations. This has reduced our “miss rate” to nearly zero, ensuring that important information doesn’t fall through the cracks. The AI doesn’t replace the need for human communication and relationship management; instead, it enhances our ability to capture and act on crucial information.
System Design: We use AI to help us create blueprints for our custom software systems. The AI can quickly generate multiple design options based on specified parameters, which our human designers then evaluate, refine, and customize to meet specific client needs. This symbiosis allows us to explore a broader range of design possibilities in less time.
Sales and Client Relations: AI helps us prepare for sales conversations by analyzing vast amounts of data about prospective clients and their industries. This allows our sales team to enter conversations with a deep understanding of the client’s needs and challenges, enabling true “trust-based selling.” The AI doesn’t conduct the sale; it empowers our human team to have more meaningful, informed conversations.
Continuous Learning: We use AI to help our team stay up-to-date with the latest technological developments and best practices. The AI curates relevant articles, research papers, and industry trends, allowing our team to continuously enhance their skills and knowledge more efficiently than ever before.

In each of these cases, we’re not relying on AI to do the work on our behalf. Instead, we’re using it to scale and enhance our human work efforts. We’re amplifying our expertise, eliminating rote work, and accelerating the human value in our engagements. This is the essence of Licklider’s vision — a symbiotic relationship where humans and computers each contribute their unique strengths.

The result? We’re able to grow faster without being throttled by the AI skills gap. We’re delivering higher quality work to our clients in less time. And perhaps most importantly, we’re enabling our team members to focus on the aspects of their work that are most fulfilling and that add the most value.

The Window of Opportunity

Many organizations today view AI through a limited lens. They see it as a fancy question-answering machine, a tool for automating simple tasks, or worse, as a threat to human jobs.

This misconception is prevalent, even among some of the largest and most resourceful companies.

But here’s the thing: their doubt is your opportunity.

I believe we are in the very early innings of AI’s impact on business. The late adoption by some creates a significant advantage for those willing to embrace AI’s true potential now.

This isn’t about going “all in” on AI or making radical, overnight changes to your business model. Instead, it’s about understanding the symbiotic relationship between humans and AI, and taking strategic steps to leverage this relationship for competitive advantage.

Consider these possibilities:

Smaller businesses can gain market share on bigger players in their industry by adopting AI in small, targeted ways that make their people more efficient. While larger corporations might be bogged down by bureaucracy or resistant to change, nimble smaller businesses can quickly implement AI solutions that dramatically increase their productivity and innovation capabilities.
Startups can evolve at previously impossible paces. By integrating AI into their processes from the ground up, new companies can operate with the efficiency and insights typically associated with much larger, more established firms.
Enterprises can enable workgroups to grow in ways previously throttled by corporate IT and their monolithic approach to software progress. AI can provide tailored solutions for specific teams or departments, bypassing the one-size-fits-all approach that often slows down innovation in large organizations.

The key to seizing this opportunity lies in rethinking how AI can be integrated into your business processes. It’s not about replacing your human workforce with AI, but about creating a symbiotic relationship where both humans and AI can thrive and complement each other’s strengths.

Imagine the compounding effect when you apply the efficiencies and scale of just one human being enhanced by AI to entire workgroups and teams collaborating together.

This is the true promise of AI — not just individual productivity gains, but a fundamental shift in how we approach problem-solving, decision-making, and innovation at an organizational level.

But this window won’t remain open indefinitely. As more businesses begin to understand and implement the symbiotic approach to AI, the competitive advantage will diminish.

Those who start now, who begin developing their AI “superpowers” today, will have a significant head start over those who wait.

Remember, you don’t need to overhaul your entire organization overnight. Start small.

Focus on identifying routine, time-consuming tasks that are currently bogging down your most valuable employees.

These are prime candidates for AI enhancement. By freeing up your team from these tasks, you’re not replacing them — you’re empowering them to focus on higher-value work that truly leverages their human creativity and insight.

The organizations that will thrive in the coming years are those that can seamlessly blend human and artificial intelligence, creating a whole that is greater than the sum of its parts. They will be the ones who see AI not as a threat or a mere tool, but as a collaborator, a partner in pushing the boundaries of what’s possible.

Rethinking AI’s Role in Business

It’s time for a paradigm shift in how we view AI’s role in business. If you’re still thinking of AI as merely a sophisticated question-answering machine or a tool for automating simple tasks, you’re missing the bigger picture. It’s crucial to challenge these misconceptions and embrace a more nuanced, powerful understanding of AI’s potential.

First and foremost, let’s be clear: if you feel AI is just a “question-answer machine” or if you believe its primary value lies in its knowledge base, you’re doing it wrong. This limited view of AI is akin to using a supercomputer as a fancy calculator. It’s time to think bigger.

The true power of AI in business isn’t in its ability to replace human tasks, but in its capacity to enhance and amplify human capabilities. It’s not about AI versus humans, but AI with humans. This symbiotic relationship, as envisioned by Licklider decades ago, is where the real magic happens.

Here’s how we need to rethink AI’s role:

AI as a Cognitive Enhancer: Rather than viewing AI as a standalone tool, think of it as an extension of your team’s cognitive abilities. It can process vast amounts of data, recognize patterns, and generate insights at a speed and scale impossible for humans alone. This allows your team to make more informed decisions and spot opportunities they might otherwise miss.
AI as a Creativity Booster: Contrary to the fear that AI will replace creative jobs, it can actually enhance creativity. By handling routine tasks and providing quick access to relevant information, AI frees up mental space for humans to engage in more creative and strategic thinking.
AI as a Collaboration Partner: In the symbiotic model, AI doesn’t work in isolation. It’s a collaborative partner that works alongside humans, each bringing their unique strengths to the table. This collaboration can lead to solutions and innovations that neither humans nor AI could achieve alone.
AI as a Productivity Multiplier: As our team’s experience shows, AI can dramatically increase productivity. But it’s not about making humans work faster; it’s about enabling them to work smarter, focusing on high-value tasks while AI handles the routine and repetitive work.
AI as a Continuous Learning Engine: AI systems can continuously learn and adapt, helping your organization stay at the cutting edge. They can analyze trends, predict future scenarios, and provide insights that help your team continuously evolve and improve.
AI as a Democratizer of Expertise: By making complex analyses and insights more accessible, AI can help democratize expertise within your organization. It can provide employees at all levels with the information and insights they need to make more informed decisions.

It’s also crucial to move beyond thinking of AI primarily in terms of its “generative” capabilities. While generative AI has captured public imagination, the real value in business often comes from AI’s analytical and predictive capabilities, its ability to optimize processes, and its power to augment human decision-making.

As business leaders, it’s our responsibility to foster this new understanding of AI within our organizations. We need to move beyond the fear of AI replacing jobs and instead focus on how AI can make jobs more fulfilling, productive, and impactful.

The businesses that thrive in the coming years will be those that successfully integrate AI not as a tool, but as a partner. They will be the ones who understand that the question isn’t “Can AI do this job?” but rather “How can AI help our people do this job better?”

Start Small, Think Big

Your journey into this symbiotic future doesn’t need to be a giant leap. Begin by identifying the small, routine tasks that consume your team’s time and energy. This is the low-hanging fruit, ripe for AI enhancement.

By tackling these first, you’ll free up your human talent to focus on what they do best: creative problem-solving, strategic thinking, and building meaningful relationships.

Remember, this isn’t about replacing your workforce. It’s about empowering them. It’s about giving them the tools to work smarter, not just harder. As you’ve seen from our team’s experience, the productivity gains can be astronomical — imagine what even a fraction of that could do for your organization.

The misconceptions surrounding AI have created a unique window of opportunity. While others hesitate, you have the chance to gain a significant competitive advantage. But this window won’t stay open forever. Those who start developing their AI “superpowers” today will be leagues ahead of those who wait.

As Licklider envisioned decades ago, we’re entering an era where the symbiosis between human and machine will drive the most significant advancements in human history. You have the opportunity to be at the forefront of this revolution, to be among those who shape the future rather than those who merely react to it.

This is your chance to be a pioneer, to be one of those individuals who, in the words of Steve Jobs, “push the human race forward.” The people who see things differently, who have no respect for the status quo, who might be seen as troublemakers, but are actually the innovators.

The ones who change the world.

I invite you to take the first step on this journey. Reach out to discuss how we can help you identify the right opportunities for AI symbiosis in your organization. Let’s explore how we can unlock your team’s superpowers and set you on the path to unprecedented growth and innovation.

The future is not something that happens to us; it’s something we create. And with AI as our symbiotic partner, the future we can create is beyond anything we’ve imagined before.

Are you ready to claim your superpowers? Let’s shape the future together.

By choosing iSolutionsAI to build AI solutions, businesses can harness the power of these tools and achieve their business objectives more effectively.

CONTACT US NOW TO START YOUR JOURNEY

Visit iSolutionsAI.com to start a conversation about how AI can help your organization start small and think big.

An Interview with Cris Ippolite from iSolutionsAI

iSolutions — Sat, 30 Mar 2024 14:23:45 GMT

The CEO discusses his philosophy on FileMaker and AI for Business

RoboReporter: Hi Cris, thanks for joining me today. Let’s dive right in. What inspired you to explore integrating AI into FileMaker solutions, and can you share some of your earliest experiments?

Cris Ippolite: It’s a pleasure to be here. My AI journey actually began as a hobby, aimed at gaining an edge in fantasy football through machine learning. This not only introduced me to AI’s potential but also showed its disruptive power, similar to my work integrating analytics and ML into NFL platforms. It made me realize businesses could greatly benefit from these technologies.

I started by exploring pre-trained AI models via APIs for tasks like summarization, sentiment analysis, and keyword extraction. Then in 2020, I got access to OpenAI’s GPT-3 API and found it could handle all those things and so much more in a single model.

RoboReporter: What are the key challenges or considerations when integrating AI into FileMaker solutions, and how are you addressing them?

Cris Ippolite: Data privacy and security are paramount. There are techniques to interact with data using language models without having it leave your servers. Our current projects involve local PII scrubbing before interacting with LLMs. We also ensure secure API calls when using cloud services.
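As an editorial illustration of the local scrubbing idea mentioned above, here is a minimal Python sketch. It is not iSolutions’ implementation: the regular expressions, the placeholder format, and the function names scrub_pii and restore_pii are all assumptions made for this example, and real PII detection needs far broader coverage than three patterns.

```python
import re

# Hypothetical patterns for illustration only; production systems need
# much broader coverage (names, addresses, account numbers, and so on).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def scrub_pii(text):
    """Replace PII matches with numbered placeholders.

    Returns the scrubbed text plus a mapping that never leaves the local
    machine, so the original values can be restored in the model's reply.
    """
    mapping = {}
    for label, pattern in PII_PATTERNS.items():
        def _substitute(match, label=label):
            token = f"[{label}_{len(mapping) + 1}]"
            mapping[token] = match.group(0)
            return token
        text = pattern.sub(_substitute, text)
    return text, mapping


def restore_pii(text, mapping):
    """Put the original values back into text returned by the model."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


note = "Contact Jane at jane.doe@example.com or 555-123-4567."
scrubbed, mapping = scrub_pii(note)
print(scrubbed)  # only this scrubbed string would be sent to the LLM
```

Only the scrubbed string is sent to the language model; the mapping stays on your own server and is applied to the response afterward.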

Ethical concerns are another consideration, particularly job loss due to AI. We subscribe to a human-centric philosophy, focusing on augmenting human capabilities rather than replacing them. As I like to say,

“We don’t believe in building robots. We build Ironman suits…to take the individuals you have and give them superpowers.”

RoboReporter: How is the integration of AI going to influence the design process of your FileMaker solutions, and what changes or enhancements do you believe we will see?

Cris Ippolite: AI-based applications will be completely different, with a far more elegant architecture. AI allows us to design applications without the rigid inputs, structured data, static output, and lack of understanding that constrain traditional systems.

It’s a fundamental transition into “understanding-based” application development instead of our historically rigid approach.

FileMaker developers will spend less time designing complex UIs, workflows, reports and visualizations.

The focus will shift to architecting solutions connecting business data to the LLM backend, enabling a “just ask” capability. Users can directly question the data using everyday language, democratizing data access, uncovering insights, and enabling faster, better-informed decisions.

Dynamic visualizations and reporting mean we should never have to create a chart or report again. Users can ask for what they want and how they want it, drastically increasing usability and the shelf life of custom FileMaker systems.

RoboReporter: What advice would you give to FileMaker developers who want to get started with AI integration, and what resources or training would you recommend?

Cris Ippolite: Start by familiarizing yourself with AI and large language model fundamentals. Incorporate consumer AI tools like ChatGPT and Claude into your daily life to understand capabilities and limitations.

Experiment and learn, starting small to mitigate risk and ensure value before committing to large-scale AI implementation.

Develop strong API skills, understanding Insert From URL, cURL, and parsing JSON with native FileMaker functions.

RoboReporter: For business owners considering AI-powered FileMaker solutions, what are the key benefits, cost implications, and potential ROI they should be aware of?

Cris Ippolite: AI is about fundamentally rethinking how we work, emphasizing efficiency and human-machine collaboration. Embedding techniques can vectorize documents, converting archives into active, usable data to differentiate your business.

Focus first on scalable tasks, increasing efficiencies to open up avenues for human workers to engage in more meaningful, knowledge-based work.

AI in back-office functions offers an opportunity to apply a first-principles approach to improving your business.

For companies familiar with the cost of custom FileMaker systems, our experience shows AI integration efforts are less costly and time-consuming than you may be used to. These fast, less expensive AI integrations focused on scaling inefficient functions naturally come with high upside ROI.

RoboReporter: How do you see the role of AI in FileMaker evolving over the next five years, and what potential future applications or trends excite you the most?

Cris Ippolite: It’s impossible to predict five years out, but by year’s end, I see long context windows making it easier to send massive amounts of data toward tasks. I already have customer use cases for 10 million token windows.

We’ll likely see a move toward a less heuristic, more “system 2” approach with the proliferation of agents and model pairing.

Perhaps Claris will leverage AI to accelerate the learning journey for new users and evaluators, helping grow the user base.

Edge models are long overdue. Historically, Claris business use cases haven’t involved devices, so AI model integration on the edge could really move adoption forward.

What I don’t see happening in five years is AGI, unless more symbolic or world view models are developed to embellish existing language models.

RoboReporter: What steps can developers and businesses take to ensure their AI-powered FileMaker solutions remain maintainable, adaptable, and aligned with business goals as technology advances?

Cris Ippolite: All the tools you need already exist. Focus on defining users, writing comprehensive prompt instructions, and leveraging prompt templates for full control over truth and accuracy in your AI integrations.

Avoid “model lock” and adopt a “right model for the job” mentality.

There are dozens of embeddings models alone, let alone LLMs. These are just API endpoints, so create adaptable integrations.
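One way to picture an adaptable integration is a thin routing layer that maps each task to whichever endpoint currently suits it. The Python sketch below is an editorial illustration under stated assumptions: ModelEndpoint, ModelRouter, and the two stand-in functions are hypothetical names, and the stand-ins return canned text instead of calling a real API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ModelEndpoint:
    """A named model wrapped as a plain prompt-in, text-out callable."""
    name: str
    call: Callable[[str], str]


class ModelRouter:
    """Maps tasks to endpoints so a model can be swapped without touching callers."""

    def __init__(self) -> None:
        self._routes: Dict[str, ModelEndpoint] = {}

    def register(self, task: str, endpoint: ModelEndpoint) -> None:
        self._routes[task] = endpoint

    def run(self, task: str, prompt: str) -> str:
        if task not in self._routes:
            raise KeyError(f"no model registered for task '{task}'")
        return self._routes[task].call(prompt)


# Stand-ins for real API calls (hypothetical; no network access here).
def fake_summarizer(prompt: str) -> str:
    return "summary: " + prompt[:20]


def fake_classifier(prompt: str) -> str:
    return "positive" if "great" in prompt.lower() else "neutral"


router = ModelRouter()
router.register("summarize", ModelEndpoint("model-a", fake_summarizer))
router.register("sentiment", ModelEndpoint("model-b", fake_classifier))

first = router.run("sentiment", "Great service, thank you")

# Swapping the model behind a task is a one-line change.
router.register("sentiment", ModelEndpoint("model-c", lambda p: "neutral"))
second = router.run("sentiment", "Great service, thank you")
print(first, second)
```

Because callers only know the task name, replacing one vendor’s model with another does not ripple through the rest of the solution.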

As your data grows and user interactions with these models increase, capture and model this data to evolve your AI integrations.

Tech will change, so don’t take your eye off the ball. Our team meets weekly to cover advancements in AI and their impacts on our deployments and internal systems. And that’s not even frequent enough.

RoboReporter: This has been incredibly insightful, Cris. Thank you for your time and expertise.

Cris Ippolite: My pleasure. Always happy to discuss the exciting possibilities AI brings to the FileMaker world.

The iSolutionsAI Approach

iSolutions — Sat, 24 Feb 2024 17:57:47 GMT

Embracing the Future with iSolutionsAI: Pioneering Bespoke AI Solutions for Your Business

In the rapidly evolving world of artificial intelligence, businesses are constantly searching for ways to stay ahead of the curve. At iSolutionsAI, our mission is rooted in the belief that the key to unlocking the full potential of AI lies in customization and deep integration with each business’s unique methodologies, processes, and proprietary data.

Our founder, Cris Ippolite, often emphasizes,

“What makes iSolutionsAI unique is that we work with our customers to capture the uniqueness of their company through their proprietary data, people and processes.”

This mantra is at the heart of everything we do, guiding our approach to transforming businesses through AI.

The iSolutionsAI Difference: MethodologyAI

What sets iSolutionsAI apart is our proprietary approach, which we’ve aptly named MethodologyAI. Unlike the one-size-fits-all solutions that dominate the market, we dive deep into understanding what makes your business unique.

“We work with our customers to capture the uniqueness of their company through their data and then bring that data into the generic AI models so that the customers now have an AI model specialized for their business.”

This personalized approach ensures that the AI integrations we develop are not only effective, but also resonate with the core values and strategies of our customer’s business.

The AI PLAYBOOK: Trusting Foundation Models for Business

Unlocking Business Potential with Proprietary Data

At the core of our strategy is the discovery, definition and utilization of a business’s proprietary data.

This data, which encapsulates the essence of your business operations and methodologies, is what we believe to be the real game-changer. Your data is what makes your business special, and we bring that into AI.

By integrating this data with secure, open-source AI models, we create solutions that are tailored to elevate each business’s productivity, efficiency, and innovation capabilities.

Empowering Human Resources: The Ironman Suit Analogy

One of our core beliefs at iSolutionsAI is the value of human resources in the AI integration process.

Far from replacing human workers, our goal is to augment their capabilities, providing them with the tools and support to achieve greater productivity and creativity.

We liken our AI solutions to an “Ironman suit” for your employees, empowering them to perform at their best. This human-centric approach ensures that our AI implementations enhance rather than replace the valuable human elements of your business.

Proactive Adoption of AI: Seizing the Moment

iSolutionsAI advocates for a proactive stance towards AI adoption. In a market flooded with generic AI solutions, the businesses that will thrive are those that recognize the value of bespoke AI implementations borne out of the secure, safe and responsible integration of their proprietary data.

The technology required for a business to properly leverage AI for differentiation and growth exists today.

Waiting on the sidelines is not an option for businesses aiming for success. Our approach is designed to ensure that businesses can leverage AI effectively and immediately, securing a competitive edge in their respective industries.

Business Leaders: If You Are Waiting on AI, It May Already Be Too Late

The Journey with iSolutionsAI: From Discovery to Implementation

The process of integrating AI into your business operations with iSolutionsAI begins with a comprehensive discovery session. This initial phase is crucial for understanding your business’s unique attributes, including your people, processes, and data.

Following this, we often propose small-scale experiments to provide proof of concept, allowing us to demonstrate the effectiveness and potential of AI solutions tailored to your needs.

As we move forward, our team works closely with yours to develop and implement AI solutions that not only meet but exceed your expectations.

This collaborative approach ensures that the final product is perfectly aligned with your business goals and operational requirements.

Harnessing the Power of Truth in AI Implementations

At iSolutionsAI, our understanding of artificial intelligence goes beyond the conventional. We recognize that while AI has the potential to transform businesses, the accuracy of its outputs is paramount.

A common challenge faced by businesses today is the phenomenon of “hallucinations” in language model responses — instances where AI generates convincing but incorrect or nonsensical text. This is often due to a lack of sufficient context or an over-reliance on the probabilistic nature of language models, which are designed to generate text based on statistical associations rather than verified facts.

Our founder, Cris Ippolite, emphasizes the importance of moving beyond these challenges by leveraging the truth inherent in a business’s own data:

“Business decision-makers should be aware of techniques available today that not only control these hallucinations but also allow their business to leverage its own truth in previously impossible ways,”

This perspective is not just about mitigating the risks associated with AI but transforming these challenges into strategic advantages.

The Role of Comprehensive Prompts and Context

iSolutionsAI employs sophisticated strategies to ensure that our AI implementations produce accurate and reliable outputs.

By providing comprehensive prompts enriched with context to the language models, we significantly reduce the likelihood of hallucinations.

This approach allows us to harness the vast potential of AI while grounding its outputs in the reality of your business’s specific context and needs.

Our methodology involves embedding the ground truth — factual and verified information — within the prompts we provide to AI models. This ground truth is derived from your business’s proprietary data, ensuring that the AI’s responses are not only relevant but also accurately aligned with your operational realities.

Whether summarizing meeting transcripts, generating reports, or automating customer service responses, our focus on truth ensures that the AI serves as a reliable extension of your business.

Beyond Hallucinations: Embracing Truth Injection Techniques

iSolutionsAI’s innovative use of Retrieval Augmented Generation (RAG) and other truth injection techniques stands as a testament to our commitment to accuracy.

By dynamically incorporating external information as context within prompts, we create a bridge between raw AI capabilities and your business’s unique data. This not only controls the narrative generated by AI but also ensures that the information it produces is grounded in reality.
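The general shape of this technique can be sketched in a few lines of Python. This is an illustration under assumptions, not our production pipeline: retrieval here uses simple keyword overlap as a stand-in for the embedding-based search a real RAG system would typically use, and the function names and sample documents are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=2):
    """Rank documents by word overlap with the question (embedding stand-in)."""
    q_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    # Keep only the best matches that share at least one word with the question.
    return [d for d in ranked[:top_k] if q_words & tokenize(d)]


def build_prompt(question, documents, top_k=2):
    """Assemble a prompt that grounds the model in the retrieved context."""
    context = retrieve(question, documents, top_k)
    lines = [
        "Answer using only the context below.",
        "If the context does not contain the answer, say you do not know.",
        "",
        "Context:",
    ]
    lines += [f"- {passage}" for passage in context]
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


docs = [
    "Our refund window is 30 days from delivery.",
    "Support hours are 9am to 5pm Pacific.",
    "The office dog is named Biscuit.",
]
prompt = build_prompt("What is the refund window?", docs, top_k=1)
print(prompt)
```

The resulting prompt carries the business’s own facts alongside the question, and the explicit instruction to admit ignorance gives the model a safe alternative to guessing.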

Our approach transforms the interface through which businesses interact with their data. By consolidating disparate data sources into a “single source of truth,” we facilitate a more agile and responsive decision-making process.

This unified data repository becomes the backbone of our AI implementations, enabling us to provide solutions that are not just technologically advanced but also deeply integrated with your business’s core values and strategic objectives.

Controlling Hallucinating LLMs with Truth

Why Choose iSolutionsAI?

Choosing iSolutionsAI means partnering with a team that has over 25 years of experience in crafting custom business solutions.

Our deep understanding of the intricacies of business operations, combined with our expertise in AI, positions us as the ideal partner for businesses looking to navigate the complexities of AI integration.

We consider ourselves to be custom-minded within the AI services industry, highlighting our commitment to providing solutions that are as unique as your business.

Your Guide in the AI Integration Journey

As we look to the future, the role of AI in business operations will only continue to grow.

The question is no longer whether to integrate AI but how to do so in a way that truly benefits your business. iSolutionsAI stands ready to guide you through this journey, leveraging our unique approach and expertise to ensure that your business not only keeps pace with technological advancements but leads the way.

We invite you to make iSolutionsAI your partner in exploring the transformative potential of AI.

With our bespoke solutions and commitment to enhancing your business’s unique strengths, we are poised to help you achieve unprecedented growth and success.

Let iSolutionsAI be your guide in taking the first steps of your AI integration journey, unlocking new possibilities and securing your place at the forefront of innovation. Reach out to us today, and let’s embark on this exciting journey together.

Predicting Apple’s A.I. Play

iSolutions — Sun, 18 Feb 2024 19:41:28 GMT

Studying the clues from various releases to predict what’s next for Apple in A.I.

Apple’s AI “breadcrumbs” timeline

Amid the flurry of headlines dominated by tech giants making bold strides in artificial intelligence, Apple appears to navigate a quieter path, seemingly absent from AI industry news.

However, a closer examination reveals a trail of breadcrumbs, hinting at the company’s strategic foray into AI that could soon reshape the landscape.

This understated approach, characterized by selective acquisitions, covert project developments, and meticulous integration of AI into its ecosystem, suggests Apple is crafting a future where AI enhances its suite of products in uniquely Apple ways.

By piecing together these clues, we get a glimpse of what might be Apple’s AI play.

Acquisitions:

In January of 2020, Apple acquired Xnor.ai, which began as a process for making machine learning algorithms highly efficient, so efficient that they could run on even the lowest tier of hardware, such as mobile devices that use only a modicum of power. Using Xnor’s algorithms, those devices could still accomplish tasks like object recognition, which in other circumstances might require a powerful processor or a connection to the cloud.

Later in 2020, Apple acquired Vilynx, which developed a self-learning AI platform that helps media companies understand content, personalize the user experience on publisher websites and mobile/OTT apps, and optimize content creation and distribution. Deep metadata tagging is the basis for automated previews, recommendations, and smart search.

Apple has also recently acquired Datakalab, a Paris-based AI startup known for its expertise in data compression and image analysis. Finalized in December 2023 but only disclosed recently, this acquisition highlights Apple’s aggressive strategy in enhancing efficient, low-power AI algorithms that can operate independently of cloud services.

Remarkably, Apple acquired as many as 32 AI startups in 2023 alone, topping the acquisition lists of tech companies globally.

What this could mean: The integration of technologies from Xnor.ai, Vilynx, and Datakalab could lead to a transformative upgrade in Siri’s capabilities, shifting towards enhanced edge computing. This would enable Siri to process data locally on devices, potentially transforming user interactions by making Siri a more integral and trusted part of the Apple ecosystem. Such advancements would not only enhance the responsiveness and capability of Siri but also underscore Apple’s commitment to privacy and security, reinforcing its competitive edge in the tech industry.

AJAX:

In July of 2023, rumors began about “Apple GPT,” an internal chatbot built on Apple’s proprietary language model framework, code-named “Ajax.”

The development of Apple GPT is part of Apple’s broader efforts in the field of artificial intelligence. The company has been relatively quiet about these efforts compared to other tech giants.

The internal name for Apple’s language model framework is “Ajax”. The origin of this name is unclear, but it could be a combination of Google’s “JAX” and the “A” from Apple.

The Apple GPT project is still under development. However, it’s reported that Apple is planning to make a major AI-related announcement in 2024, which could be the general release of Apple GPT.

Since then, this author has confirmed the existence of AJAX internally within Apple.

What this could mean: The rumored development of AJAX or “Apple GPT” could herald a new era of AI integration within Apple’s ecosystem. It could be integrated with Safari and Spotlight for smarter, more context-aware search results, and could extend to AI-driven health and fitness advice, advanced educational tools, creative and content generation applications, and new forms of interactive entertainment. Apple’s focus on privacy and the user experience could set AJAX apart in the crowded field of AI and language models.

Ferret:

In October 2023, Apple released Ferret, an open-source, multimodal Large Language Model (LLM) developed in collaboration with Columbia University. The release represents a significant pivot from Apple’s traditionally secretive approach towards a more open stance in the AI domain.

GitHub - apple/ml-ferret

Ferret distinguishes itself by integrating language understanding with image analysis, enabling it to not only comprehend text but also analyze specific regions within images to identify elements and use these in queries. This capability allows for more nuanced interactions, where Ferret can provide contextual responses based on both text and visual inputs.

What this could mean: The introduction of Ferret into Apple’s ecosystem could potentially revolutionize how users interact with Apple devices, offering enhanced image-based interactions and augmented user assistance. For instance, Siri could leverage Ferret’s capabilities to understand queries about images or perform actions based on visual content, significantly improving the user experience by providing more accurate and context-aware responses. Ferret could also enrich media and content understanding, improving organization and search within Apple’s Photos app, and even offering more personalized content recommendations across Apple’s services.

MLX:

In December 2023, Apple released MLX and MLX Data, signaling a pivotal shift towards empowering developers to create more sophisticated AI applications that are optimized for Apple Silicon.

The MLX framework, inspired by PyTorch, Jax, and ArrayFire but with the unique feature of shared memory, simplifies the process for developers to build models that work seamlessly across CPUs and GPUs without the need to transfer data.

MLX is a NumPy-like array framework designed for efficient and flexible machine learning on Apple silicon, brought to you by Apple machine learning research.

The Python API closely follows NumPy with a few exceptions. MLX also has a fully featured C++ API which closely follows the Python API.
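To make the NumPy resemblance concrete, here is a small sketch that runs the same array code against either library. The fallback to NumPy on machines without MLX is an editorial convenience for this example, not something Apple’s documentation prescribes, and the alias xp is just a name chosen here.

```python
# MLX targets Apple silicon; fall back to NumPy elsewhere so the sketch
# still runs. The same calls work unchanged with either backend.
try:
    import mlx.core as xp
    BACKEND = "mlx"
except ImportError:
    import numpy as xp
    BACKEND = "numpy"

a = xp.array([[1.0, 2.0], [3.0, 4.0]])
b = xp.array([[0.5, 0.5], [0.5, 0.5]])

product = xp.matmul(a, b)  # matrix product, same name in both libraries
total = xp.sum(a)          # sum of all elements

print(BACKEND, product.tolist(), float(total.item()))
```

One difference worth knowing: MLX evaluates lazily, so results are only computed when they are needed, for example when converting to a Python list as above.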

What this could mean: This could lead to a new era of generative AI apps on MacBooks, which may include capabilities similar to Meta’s Llama or Stable Diffusion.

HUGS:

Also in December 2023, Apple Machine Learning released “HUGS: Human Gaussian Splats,” developed in collaboration with the Max Planck Institute for Intelligent Systems. It introduces a novel approach to creating animatable human avatars and scenes from monocular videos using 3D Gaussian Splatting.

This method efficiently separates and animates humans within scenes, achieving state-of-the-art rendering quality and speed. It addresses challenges in animating 3D Gaussians, optimizing for realistic movement and enabling novel pose and view synthesis at high speeds, significantly outperforming previous methods in both training time and rendering speed.

HUGS: Human Gaussian Splats

https://mlr.cdn-apple.com/video/novel_pose_view_2_79fa4a7a4f.mp4

https://mlr.cdn-apple.com/video/novel_multihuman_scene_1_c5b2b300a.mp4

What this could mean: Apple’s “HUGS: Human Gaussian Splats” could enable more realistic and interactive 3D animations from simple video inputs. This could lead to advancements in augmented reality experiences, improved virtual assistants, and more immersive gaming and social media applications on Apple devices. The technology’s efficiency in rendering and animating could also enhance user experiences across the Apple ecosystem, making digital interactions more lifelike and engaging.

Flash Memory:

In December 2023, Apple released a research paper titled “LLM in a flash: Efficient Large Language Model Inference with Limited Memory,” which noted that flash storage is more abundant in mobile devices than the RAM traditionally used for running LLMs.

Their method cleverly bypasses the limitation using two key techniques that minimize data transfer and maximize flash memory throughput:

Windowing: Think of this as a recycling method. Instead of loading new data every time, the AI model reuses some of the data it already processed. This reduces the need for constant memory fetching, making the process faster and smoother.
Row-Column Bundling: This technique is like reading a book in larger chunks instead of one word at a time. By grouping data more efficiently, it can be read faster from the flash memory, speeding up the AI’s ability to understand and generate language.
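The windowing idea in the list above can be illustrated with a toy simulation. This is a deliberately simplified sketch, not the paper’s implementation: the neuron IDs, the sample data, and the function count_flash_loads are invented for this example, and real systems track far more than a set of IDs.

```python
from collections import deque


def count_flash_loads(active_per_token, window):
    """Count neuron loads from flash when recent neurons stay cached in RAM.

    active_per_token: one set of active neuron IDs per generated token.
    window: how many recent tokens' neurons are kept in RAM (0 = no reuse).
    """
    recent = deque(maxlen=window)  # neuron sets for the most recent tokens
    loads = 0
    for active in active_per_token:
        cached = set().union(*recent)   # everything still held in RAM
        loads += len(active - cached)   # only the missing neurons hit flash
        recent.append(active)           # oldest set drops out automatically
    return loads


# Hypothetical data: consecutive tokens tend to activate overlapping neurons.
tokens = [{1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 5, 6}, {4, 5, 6, 7}]

naive = count_flash_loads(tokens, window=0)     # reload everything each time
windowed = count_flash_loads(tokens, window=2)  # reuse the last two tokens
print(naive, windowed)
```

In this toy data the sliding window cuts flash reads from 16 to 7, because each new token only needs the handful of neurons that were not already active for the previous tokens.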

According to the paper, the combination of these methods allows running AI models up to twice the size of the device’s available memory. Compared with naive loading approaches, it yields a 4–5 times increase in inference speed on standard processors (CPUs) and a 20–25 times increase on graphics processors (GPUs). “This breakthrough is particularly crucial for deploying advanced LLMs in resource-limited environments, thereby expanding their applicability and accessibility,” write the authors.

What this could mean: Apple’s research on running Large Language Models (LLMs) efficiently on mobile devices could herald a new era of on-device AI processing. By optimizing data handling through “windowing” and “row-column bundling,” Apple has potentially developed a method to significantly speed up AI inference on standard CPUs and GPUs. This breakthrough could lead to faster and more powerful AI functionalities directly on iPhones and iPads without relying on cloud computing, enhancing user experience while maintaining Apple’s strong stance on privacy.

Siri Summarizations:

In January 2024, leaks on iOS 18 revealed Siri Summarization functionality that specifically references OpenAI.

The leaked images show specific functions referencing both AJAX and OpenAI.

Specific reference to “OpenAIGPT”

The images mention an “OpenAISettings” section and reference issues such as “SummarizationOpenAIError” and a missing OpenAI API key.

This suggests that there’s an attempt to make an API call to OpenAI’s service, likely for a feature involving text summarization.

Specific reference to “AJAXGPTonDevice”

What this could mean: This could indicate that Apple is testing or developing a feature using OpenAI’s language models for summarization purposes within its software, as suggested by the “SummarizationOpenAIError.” The code seems to be part of an internal testing process for integrating OpenAI’s language models into Apple’s ecosystem, potentially for improving Siri’s functionalities or other text-based features in iOS.

MGIE:

In February of 2024, Apple released MGIE, representing a significant advancement in instruction-based image editing.

Utilizing multimodal large language models (MLLMs), MGIE interprets natural language commands for precise pixel-level image manipulations, covering a spectrum from Photoshop-style modifications to global photo enhancements and detailed local edits.

Developed in collaboration with the University of California, Santa Barbara, and showcased at ICLR 2024, MGIE underscores Apple’s growing AI research capabilities, offering a practical tool for creative tasks across personal and professional domains.

Apple made an AI image tool that lets you make edits by describing them

What this could mean: MGIE’s release could significantly propel Apple’s AI capabilities, especially in creative and personalization applications. This move may lead to more intuitive interfaces for content creation, potentially integrating into existing Apple products to offer advanced editing features directly within the ecosystem. MGIE’s integration into Apple’s software or hardware offerings could revolutionize user interaction with devices, making advanced image editing accessible directly from iPhones, iPads, or Macs. Imagine Siri or the Photos app leveraging MGIE for editing commands, enhancing user creativity without complex software.

Keyframer:

Also in February 2024, Apple released Keyframer, a generative AI tool for animating 2D images using text descriptions, showcasing the potential of LLMs in animation.

It simplifies the animation process, allowing users to animate SVG images through text prompts without coding knowledge. While promising, Keyframer is in the prototype stage, highlighting the evolving landscape of AI in creative fields and suggesting a future where AI tools could significantly augment creative workflows.

SVG images and text descriptions fed into Keyframer are automatically converted into animation code. Image: Apple

What this could mean: Keyframer could be integrated into Apple’s software ecosystem, enhancing creative tools in applications like Final Cut Pro, iMovie, or even Pages and Keynote, by enabling easy animation creation.

Xcode AI:

In February 2024, reports emerged of an AI coding tool for Apple’s Xcode. According to Bloomberg, Apple has expanded internal testing of new generative AI features for its Xcode programming software and plans to release them to third-party developers this year.

Apple also reportedly looked at potential uses for generative AI in consumer-facing products, like automatic playlist creation in Apple Music, slideshows in Keynote, or AI chatbot-like search features for Spotlight search.

What this could mean: Apple’s development of AI coding tools could be transformative for its suite of developer services, particularly Xcode. These tools, potentially rivaling GitHub’s Copilot, would assist developers in writing code more efficiently, likely by suggesting code snippets, auto-completing lines of code, or even generating code from comments. This could greatly speed up the development process, reduce bugs, and make development more accessible to a broader range of skill levels.

iWork.ai:

In February 2024, according to BuyAIDomains, Apple became the owner of the domain “iWork.ai”, with its company name and its Apple Park business address listed in the owner records.

What this could mean: The domain purchase fuels speculation that Apple’s office suite, consisting of Pages, Keynote, and Numbers, could be getting some big artificial intelligence features soon.

ReALM:

Apple’s research paper, “ReALM: Reference Resolution As Language Modeling,” introduces a groundbreaking method to enhance how AI understands and interacts with visual and conversational contexts. This research is pivotal as it addresses the challenge of AI comprehending ambiguous references within conversations or related to elements visible on a screen, such as buttons or text. Traditional models struggled with these tasks due to their reliance on vast amounts of data and computing power, which are not always feasible in on-device applications.

Apple’s solution, ReALM, revolutionizes this by transforming reference resolution into a language modeling problem, making it more adaptable and less resource-intensive. By employing smaller, fine-tuned language models, ReALM achieves impressive performance gains in resolving references, particularly for on-screen content. This approach not only boosts the AI’s speed and efficiency but also its ability to operate independently of the cloud, thereby enhancing privacy and usability in mobile settings. The integration of this technology could significantly improve user interaction with devices, allowing more intuitive and seamless control through natural language commands.
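To make the idea of treating reference resolution as a language modeling problem more concrete, here is a minimal sketch of how on-screen elements might be flattened into text for a language model to read. The `ScreenEntity` fields, the `build_prompt` helper, and the prompt wording are all hypothetical and are not taken from the ReALM paper.

```python
from dataclasses import dataclass


@dataclass
class ScreenEntity:
    """A hypothetical on-screen element: its text and top-left position."""
    text: str
    x: int  # horizontal position in pixels
    y: int  # vertical position in pixels


def build_prompt(entities, request):
    """Render screen entities as numbered text lines, then append the request.

    Entities are sorted top-to-bottom, then left-to-right, so the text
    roughly preserves the visual layout of the screen.
    """
    ordered = sorted(entities, key=lambda e: (e.y, e.x))
    lines = [f"[{i}] {e.text}" for i, e in enumerate(ordered, start=1)]
    return (
        "Entities on screen:\n"
        + "\n".join(lines)
        + f"\nUser request: {request}\n"
        + "Which entity numbers does the request refer to?"
    )


# Illustrative data only: three elements from an imaginary restaurant page.
screen = [
    ScreenEntity("Call 555-0100", x=20, y=300),
    ScreenEntity("Joe's Pizza", x=20, y=100),
    ScreenEntity("Open until 10 pm", x=20, y=200),
]
prompt = build_prompt(screen, "call that number")
print(prompt)
```

Once the screen is plain text, an ambiguous phrase like "that number" becomes an ordinary text-completion task, which is the kind of task a small fine-tuned model could plausibly handle on-device.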

What this could mean: Apple’s research could set a new standard for AI interactions, making digital assistants like Siri more perceptive and helpful in everyday tasks. This development promises not only to enhance the functionality of current devices but also to drive innovation in new areas, including accessibility and user interface design. Apple’s focus on on-device processing and privacy-first methodologies positions ReALM as a forward-thinking solution that aligns with modern data security and efficiency needs, potentially influencing future AI applications across the tech industry.

OpenELM:

Apple has introduced OpenELM, a new language model characterized by a unique architecture that optimizes parameter allocation across its layers. This model, detailed in the paper “OpenELM: An Efficient Language Model Family with Open Training and Inference Framework,” showcases a significant improvement in accuracy, outperforming other models of similar size. OpenELM utilizes a layer-wise scaling strategy within its transformer model, enhancing its efficiency.

This approach has allowed it to achieve higher accuracy with fewer pre-training tokens, making it a compelling option for resource-efficient AI processing.
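As a rough illustration of layer-wise scaling, the sketch below gives each transformer layer its own number of attention heads and its own feed-forward width instead of repeating one configuration everywhere. It assumes a simple linear interpolation between a minimum and a maximum multiplier; the function name, parameter names, and numbers are illustrative assumptions, not OpenELM's published hyperparameters.

```python
def layerwise_config(num_layers, d_model, head_dim,
                     alpha_min, alpha_max, beta_min, beta_max):
    """Return a per-layer plan where width grows with depth.

    alpha scales the attention width (and so the number of heads);
    beta scales the feed-forward hidden size. Both are interpolated
    linearly from the first layer to the last (an assumption).
    """
    configs = []
    for i in range(num_layers):
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        alpha = alpha_min + (alpha_max - alpha_min) * t
        beta = beta_min + (beta_max - beta_min) * t
        configs.append({
            "layer": i,
            "num_heads": max(1, round(alpha * d_model / head_dim)),
            "ffn_dim": round(beta * d_model),
        })
    return configs


# Illustrative values only.
configs = layerwise_config(
    num_layers=4, d_model=1024, head_dim=64,
    alpha_min=0.5, alpha_max=1.0, beta_min=1.0, beta_max=4.0,
)
for cfg in configs:
    print(cfg)
```

The point of the design is that early layers get fewer parameters and later layers get more, so the same total budget is spent unevenly rather than uniformly.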

Apple’s OpenELM is not just about architectural innovation; it also marks a shift towards transparency in AI research. By releasing the complete framework for training and evaluation, including code and pre-training configurations, Apple supports open research and replicability, a move that can accelerate advancements across the AI community.

OpenELM’s performance is bolstered by its deployment on publicly available datasets, and its codebase is accessible for customization and improvement on platforms like GitHub and HuggingFace. This open-source approach is poised to foster a more collaborative and inclusive environment for AI research and development.

What this could mean: The implications of OpenELM’s development are vast, particularly for Apple’s ecosystem. Integrating OpenELM could enhance the capabilities of Apple’s devices, especially in handling complex AI tasks efficiently on-device without needing to rely heavily on cloud computing. This could lead to faster, more responsive applications, and services that can operate with heightened privacy and security. For users, this means more powerful, seamless interactions with their devices, potentially transforming how they engage with AI-driven features. As Apple continues to integrate these advancements, it could significantly shift the competitive landscape, emphasizing the importance of efficient, scalable, and open AI systems.

CatLIP:

Apple’s research paper “CatLIP: CLIP-level Visual Recognition Accuracy with 2.7× Faster Pre-training on Web-scale Image-Text Data” introduces a transformative approach called CatLIP, which accelerates the pre-training of visual models.

By reframing image-text pre-training as a classification task rather than a contrastive learning task, CatLIP eliminates the need for pairwise similarity computations, significantly speeding up the training process without sacrificing performance. This method enables models to achieve similar downstream accuracy to more traditional methods while using less computational resources, demonstrating its potential to handle large-scale data more efficiently.
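The difference between the two training objectives can be sketched in a few lines. The first function below is a simplified contrastive loss that has to compare every image with every caption in the batch; the second is a multi-label classification loss in which each image is scored on its own. Using binary cross-entropy over a fixed label vocabulary is an assumption made for illustration, and the function names and toy inputs are hypothetical.

```python
import math


def contrastive_loss(image_vecs, text_vecs):
    """Simplified contrastive loss (image-to-text direction only).

    Every image is compared with every caption, so the work grows with
    the square of the batch size.
    """
    n = len(image_vecs)
    total = 0.0
    for i in range(n):
        sims = [sum(a * b for a, b in zip(image_vecs[i], text_vecs[j]))
                for j in range(n)]
        log_norm = math.log(sum(math.exp(s) for s in sims))
        total += log_norm - sims[i]  # cross-entropy against the matching caption
    return total / n


def classification_loss(logits, label_sets):
    """Multi-label binary cross-entropy, summed over labels, averaged over images.

    Each image is scored independently against a fixed label vocabulary,
    so no image-to-caption similarity matrix is needed.
    """
    total = 0.0
    for row, labels in zip(logits, label_sets):
        for k, z in enumerate(row):
            p = 1.0 / (1.0 + math.exp(-z))
            total += -math.log(p) if k in labels else -math.log(1.0 - p)
    return total / len(logits)


# Toy batch: two images, two captions, and a three-label vocabulary.
images = [[1.0, 0.0], [0.0, 1.0]]
captions = [[1.0, 0.0], [0.0, 1.0]]
print(contrastive_loss(images, captions))
print(classification_loss([[2.0, -1.0, 0.5], [-0.5, 1.5, 0.0]], [{0}, {1, 2}]))
```

In the classification version, the labels would come from the captions themselves (for example, words extracted from the text), which is what lets image-text data be reused without the pairwise comparison step.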

CatLIP’s methodology is not only about speed but also about making AI training more accessible and feasible on a broader scale. By circumventing the computational burdens of contrastive learning, CatLIP allows for quicker iterations and potentially lower costs, making it suitable for applications where rapid model updates are crucial, such as mobile and edge devices.

Plus, the open-source availability of CatLIP’s training framework encourages further innovation and collaboration in the AI community, supporting a more inclusive development environment.

What this could mean: The integration of CatLIP into Apple’s ecosystem could significantly enhance the AI capabilities of its devices, particularly in improving the efficiency and responsiveness of AI-driven features. For instance, this could lead to better performance of visual recognition tasks on iPhones and iPads, even in data-intensive scenarios. It also aligns with Apple’s privacy-focused strategy by enabling more powerful on-device processing, which reduces dependency on cloud-based computations. This could be a game-changer in how future Apple devices handle AI tasks, making them faster and more efficient while maintaining user privacy.

STEER:

Apple’s innovative approach to improving voice assistant interactions is encapsulated in their new development, STEER (“Semantic Turn Extension-Expansion Recognition”), as described in their latest research paper. STEER is designed to identify and facilitate ‘steering’ — a user’s follow-up commands that aim to modify or clarify previous voice commands.

This technology addresses the frequent interruptions and repetitions in voice interactions, which often degrade user experience. By accurately detecting steering commands, STEER reduces the need for users to restart or rephrase their requests, thus smoothing the flow of interaction.

The paper introduces STEER+, an enhanced version of the model that incorporates Semantic Parse Trees (SPTs) to provide context about the intent and entities in a conversation. This addition significantly improves the system’s ability to understand and process user commands, especially in complex queries involving named entities. The use of SPTs marks a substantial advancement in making voice assistants more responsive and accurate, mirroring more natural human-to-human interactions.

What this could mean: The integration of STEER and STEER+ into Apple’s suite of products could revolutionize how users interact with their devices. For instance, future versions of Siri could become much more adept at handling follow-up questions or commands without requiring the user to repeat the entire context. This capability could extend to other Apple services and devices, enhancing the overall ecosystem’s interactivity and user-friendliness. Moreover, the improvements in conversational AI could lead to broader applications in customer service, accessibility features, and interactive learning environments, setting new industry standards for voice assistant technologies.

WWDC:

Apple’s annual Worldwide Developers Conference typically takes place in June. Apple has not yet announced the dates for the event.

A new rumor claims that Apple’s generative AI technology will be included in Siri not just locally on iPhones, but also integrated into other services being announced at Apple’s 2024 Worldwide Developers Conference.

What this could mean: Get your popcorn ready. Clarity on these rumors could be coming in June.

The Prediction:

A Unified Vision for Apple’s AI Future

The convergence of Apple’s various AI advancements suggests a future where its devices and ecosystems are not only more powerful but also more intuitive, responsive, and privacy-focused. Integrating technologies from acquisitions like Xnor.ai, Vilynx, and Datakalab with new developments such as AJAX, Ferret, and CatLIP, Apple is poised to make significant strides in on-device AI processing. This shift will enhance Siri’s capabilities, making it an indispensable, trusted assistant that processes data locally, thereby bolstering user privacy and security.

Transformative On-Device Experiences

With the integration of STEER and STEER+, Siri could handle follow-up commands more efficiently, reducing the need for users to restate their requests. This improvement would extend across Apple’s ecosystem, from iOS and iPadOS to macOS, enhancing user interaction with all Apple devices. Technologies like OpenELM and ReALM will ensure these AI models operate efficiently on-device, minimizing the reliance on cloud computing. This will lead to faster, more secure AI functionalities, enabling features like context-aware search results, AI-driven health advice, advanced educational tools, and creative applications directly on iPhones, iPads, and Macs.

The rumored development of AJAX, or “Apple GPT,” could herald a new era of AI integration within Apple’s ecosystem, potentially enhancing Safari and Spotlight for smarter, more context-aware search results, providing AI-driven health and fitness advice, advanced educational tools, creative content generation applications, and new forms of interactive entertainment. Apple’s focus on privacy and user experience could set AJAX apart in the crowded field of AI and language models.

Enhanced Software and Development Tools

Apple’s AI advancements will also significantly impact its software and developer tools. Integrating Keyframer and MGIE into applications like Final Cut Pro, iMovie, Pages, and Keynote will streamline the creative process, allowing users to generate and edit content with simple commands. In the realm of development, AI-powered tools akin to GitHub’s Copilot could be integrated into Xcode, making coding more efficient and accessible by suggesting code snippets and auto-completing lines of code. Apple’s office suite, consisting of Pages, Keynote, and Numbers, could also see big AI-driven enhancements, making these applications more powerful and user-friendly.

A New Era of Interactivity and Privacy

Looking forward, the integration of these AI technologies will not only enhance individual user experiences but also foster more secure and responsive interactions across Apple’s services. Technologies like Ferret will revolutionize image-based interactions, allowing Siri to understand and act on visual content, thereby improving the functionality of apps like Photos and enhancing media organization and search. In augmented reality and gaming, advancements such as “HUGS: Human Gaussian Splats” will create more realistic and interactive 3D animations, making digital interactions more lifelike and engaging.

The incorporation of MGIE could propel Apple’s AI capabilities, especially in creative and personalization applications, leading to more intuitive interfaces for content creation. This move may offer advanced editing features directly within the ecosystem, allowing users to edit images and videos by simply describing the changes they want.

Broader Implications for the Apple Ecosystem

Apple’s focus on efficient, scalable, and privacy-focused AI will set new standards in the tech industry. By enabling powerful on-device AI processing, Apple will ensure that its devices are not only more responsive and capable but also more secure, maintaining the company’s strong stance on user privacy. This holistic approach to integrating AI across its ecosystem will transform how users interact with technology, making it more seamless, intuitive, and personalized, thereby solidifying Apple’s position as a leader in the AI-driven future.

So…What’s Next Then?

Apple, often perceived as trailing behind in the artificial intelligence (AI) race, seems to be strategically biding its time, gearing up for a substantial leap in the field.

Despite initial impressions of their hesitant approach, recent developments and rumors suggest that Apple is actively enhancing its AI capabilities.

They are reportedly collaborating with industry giants like OpenAI and Google, and bolstering their proprietary AI model, Ajax. This implies a significant shift from a cautious engagement to a more robust involvement in AI, underpinned by substantial research efforts that hint at future innovations.

The core of Apple’s AI advancements appears to be focusing on efficiency and utility, particularly with its virtual assistant, Siri.

By developing smaller, more efficient AI models capable of running directly on devices without internet dependency, Apple is paving the way for a more responsive and capable Siri.

This on-device processing is not just a technical feat but a strategic move to ensure privacy and speed, enhancing user experience significantly. Research efforts like the EELBERT system show Apple’s commitment to creating powerful yet compact AI models that maintain high performance while occupying less space and consuming less power.

Looking ahead, Apple’s AI research is shaping an ecosystem where AI functionalities extend beyond conventional applications. With AI-powered features anticipated to be embedded in various services — from health monitoring to creative tools and potentially transforming Siri into a proactive and almost prescient assistant — the possibilities are expansive.

Apple’s vision for AI seems not only to enhance functionality but also to integrate seamlessly into daily life, ensuring that their devices are not just tools but proactive participants in managing digital and physical environments.

This approach could redefine user interactions with technology, making Apple’s AI advancements not just an upgrade but a transformative experience for its users.

Five A.I. Predictions for 2024

iSolutions — Fri, 02 Feb 2024 20:12:35 GMT

Our 5 best predictions of what to expect in AI in 2024

TL;DR:

- NVIDIA’s increased production of AI GPUs will significantly reduce wait times, enabling broader and faster AI development.
- AI will integrate deeper, logical reasoning, enhancing its ability to handle complex tasks and improve decision-making.
- Large Language Models will evolve to form the core of new computing paradigms, expanding their roles and capabilities.
- AI systems will learn and improve on their own through techniques like reinforcement learning, surpassing human expertise in specific areas.
- AI models will continuously update in real-time, maintaining relevance and effectiveness in rapidly changing environments.

Prediction 1: No More Compute Bottlenecks

In 2023, AI model trainers faced delivery wait times ranging from 36 to 52 weeks for NVIDIA’s AI-focused A100 and H100 processors. However, it’s important to highlight that NVIDIA is actively ramping up production of its AI GPUs. As a result, customers like Meta Platforms are anticipating significantly larger quantities of these chips this year.

According to Meta Platforms’ CEO, Mark Zuckerberg, the social media giant is poised to acquire 350,000 units of NVIDIA’s flagship H100 graphics cards by the end of 2024. In 2023, Meta received an estimated 150,000 H100s, as reported by market research firm Omdia. If this estimate holds true, Zuckerberg’s statement suggests that Meta expects an additional 200,000 units of this processor in 2024.

NVIDIA indicated it would build another 1.5 million H100s in 2024. That works out to roughly a quarter of a trillion operations per second per person, which means we could be processing on the order of one word per second on a hundred-billion-parameter model for everyone on earth.
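The back-of-the-envelope math behind that estimate can be reproduced in a few lines. Only the GPU count and the model size come from the text above; the per-GPU throughput, the world population, and the rule of thumb of two operations per parameter per generated token are round-number assumptions, so treat the result as an order-of-magnitude check rather than a precise figure.

```python
H100_COUNT = 1.5e6              # from the text: H100s to be built in 2024
OPS_PER_GPU = 1.0e15            # assumption: about one quadrillion ops/s per H100
WORLD_POPULATION = 8.0e9        # assumption: roughly 8 billion people
MODEL_PARAMS = 100e9            # from the text: hundred-billion-parameter model
OPS_PER_PARAM_PER_TOKEN = 2     # assumption: common inference rule of thumb

# Total new compute divided evenly across everyone on earth.
ops_per_person = H100_COUNT * OPS_PER_GPU / WORLD_POPULATION

# How many tokens (roughly, words) per second that share could generate.
tokens_per_second = ops_per_person / (OPS_PER_PARAM_PER_TOKEN * MODEL_PARAMS)

print(f"{ops_per_person:.3e} operations per second per person")
print(f"{tokens_per_second:.2f} tokens per second per person")
```

With these inputs the result is a little under 0.2 trillion operations per second per person and just under one token per second. A somewhat higher per-GPU throughput assumption moves the first number to about a quarter of a trillion, so the conclusion of roughly one word per second for everyone holds either way.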

This leap in computing capability by NVIDIA can be likened to the transformation brought about by the expansion of broadband internet. Just as the widespread availability of broadband enabled a revolution in internet usage and accessibility, NVIDIA’s advancements in GPU technology are poised to remove compute bottlenecks in AI adoption.

In the past, just as the internet’s growth was throttled by the slow rollout of physical infrastructure like cable networks, AI’s potential had been bottlenecked by the availability and capability of computing resources.

However, with NVIDIA’s announcement and the expected rollout of these powerful GPUs, compute power will no longer be a limiting factor in AI adoption but rather a force multiplier, enabling more complex and powerful AI models to be trained and deployed more efficiently.

The landscape seems poised for a significant change, where compute resources will be abundant and more accessible, thus driving the AI field forward at an unprecedented pace.

Prediction 2: Embracing “System 2” in AI

In the rapidly evolving landscape of artificial intelligence, a significant trend is emerging in 2024: the integration of “System 2” thinking into AI applications, moving beyond the quick, intuitive responses characterized by System 1.

This shift mirrors the dual-process theory in psychology, popularized by Daniel Kahneman in his book “Thinking, Fast and Slow,” which delineates two modes of thought: the fast, instinctive, and emotional System 1, and the slower, more deliberate, and logical System 2.

“I need an answer, but I want you to take your time to use a tree of thought and reflection and convert response time to an accuracy factor.”

Presently, AI, especially in the form of Large Language Models (LLMs) like ChatGPT, predominantly operates in the System 1 mode. These models quickly generate responses based on identifying patterns in vast training datasets. This rapid, pattern-based processing is efficient but sometimes prone to inaccuracies.

As we venture into 2024, there’s a growing emphasis on incorporating System 2 methodologies into AI. This involves supplementing the quick response generation of LLMs with more structured, logical reasoning capabilities.

The integration of System 2 thinking aligns with the development of neuro-symbolic AI models. These models aim to unify the compositional and causal reasoning strengths of symbolic models (akin to System 2) with the pattern-recognition capabilities of deep learning (akin to System 1).

The resultant AI systems are expected to handle tasks involving complex correlations and causal structures more effectively, addressing the limitations of purely deep learning-based approaches.
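A toy example can show the shape of this combination. In the sketch below, a cheap heuristic plays the role of System 1, an exact rule checks its answer, and a slower exhaustive search plays the role of System 2 when the quick guess fails. The task (find two numbers that add up to a target) and all function names are hypothetical stand-ins; a real system would pair a learned model with a symbolic reasoner rather than these hand-written functions.

```python
from itertools import combinations


def fast_guess(numbers):
    """'System 1' stand-in: a cheap heuristic that grabs the two largest numbers."""
    return tuple(sorted(numbers, reverse=True)[:2])


def verify(pair, target):
    """Symbolic check: an exact rule, not a pattern match."""
    return sum(pair) == target


def deliberate_search(numbers, target):
    """'System 2' stand-in: slower, exhaustive reasoning over every pair."""
    for pair in combinations(numbers, 2):
        if verify(pair, target):
            return pair
    return None


def answer(numbers, target):
    """Try the fast guess first; fall back to deliberate search if it fails."""
    guess = fast_guess(numbers)
    if verify(guess, target):
        return guess, "system 1"
    return deliberate_search(numbers, target), "system 2"


print(answer([3, 9, 4, 7], 16))
print(answer([3, 9, 4, 7], 7))
```

The first call is answered by the quick heuristic alone, while the second needs the slower search. That trade of extra response time for accuracy is the same one the quoted prompt above asks for.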

Incorporating System 2 thinking into AI can significantly enhance the technology’s application across various domains. It promises not only fast, intuitive responses but also more sophisticated reasoning and deeper understanding, particularly in complex scenarios where quick pattern-based responses fall short. This could lead to AI systems that are better at inference, problem-solving, and better-aligned decision-making.

In 2024, iSolutions will be releasing our “System 2” execution environment for business use, called iNtuition. For more information on iNtuition, converse with our dedicated GPT here:

ChatGPT - iNtuition by iSolutionsAI

Prediction 3: LLMs as the Modern Operating System

As we move into 2024, a revolutionary trend is emerging in the field of artificial intelligence — the transformation of Large Language Models (LLMs) from mere chatbot functionalities to the foundational core of a new kind of Operating System (OS).

This shift gives a glimpse into a new era in computing, analogous to the transition from viewing early computers as simple calculators to recognizing their potential as comprehensive digital platforms.

One of my favorite tweets of 2023 was from Andrej Karpathy, where he described LLMs not as a chatbot but as the kernel process of a new Operating System:

https://medium.com/media/e936185095906c2034a29520779209c9/href

The capabilities of LLMs are expanding far beyond basic chatbots. They are increasingly taking on roles akin to various elements of traditional operating systems:

- DISK = The Internet and Embeddings as Data Repositories: LLMs are utilizing the internet and embeddings much like a disk in an OS, serving as vast storage spaces for information and internal memory.
- RAM = Context Window as Active Memory: The context window in LLMs is mirroring the function of RAM in a computer, handling the immediate processing and temporary storage of data.
- SOFTWARE = Tools and Applications as Software: Various tools developed on the LLM framework are becoming analogous to software in an OS, showcasing the versatility of LLMs in performing a range of tasks from content creation to complex decision-making processes.
- PERIPHERALS = Multimodal Inputs and Outputs: LLMs’ ability to process and respond to diverse inputs, including text, audio, and vision, reflects the functionality of an OS in managing different peripherals.
- I/O = Collaboration Amongst LLMs: The interaction and integration of different LLMs resemble the network operations in an OS, emphasizing the collaborative and interconnected nature of these AI systems.

The current “single-threaded” execution of LLMs, reminiscent of the early days of computing, hints at the untapped possibilities for more sophisticated, multi-threaded operations in the future.

2024 should prove to be a transformative phase in AI and computing. The progression of LLMs from basic chatbot functions to the core of a new operating system paradigm marks a significant leap in the way we interact with and perceive AI technologies.

This shift is not just a step forward; it’s the beginning of a whole new era in computing, where the possibilities and potential applications of LLMs are vast and still largely unexplored.

Prediction 4: Self-Improvement in AI: Learning Like AlphaGo Zero

The concept of “Self-Improvement” using reinforcement learning in AI is best described by the remarkable journey of AlphaGo Zero.

The AlphaGo Zero Breakthrough

This AI system, developed by DeepMind, represents a significant leap in the field, showcasing an AI’s ability to teach itself and excel beyond human expertise.

AlphaGo Zero’s learning method was a radical departure from traditional AI training approaches. Unlike its predecessors, AlphaGo Zero didn’t learn from human games or human interaction. Instead, it learned the game of Go from scratch through a process of self-play, using a single neural network combined with a powerful search algorithm.

The unique aspect of AlphaGo Zero’s learning was its method of reinforcement learning, where it essentially became its own teacher. Starting with no knowledge of the game, it played against itself, progressively tuning and updating its neural network to predict moves and determine the eventual winner. This process of iterative self-improvement led to rapid advancements in its capabilities.
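The self-play idea can be shown at a much smaller scale. The sketch below learns a simple stick game (take 1 to 3 sticks, whoever takes the last stick wins) purely by playing against itself. It is a deliberately simplified stand-in: it uses a lookup table of averaged outcomes instead of AlphaGo Zero's neural network and search algorithm, and every name and setting in it is an assumption chosen for illustration.

```python
import random


def legal_moves(pile):
    """A player may take 1, 2, or 3 sticks, but never more than remain."""
    return [m for m in (1, 2, 3) if m <= pile]


def self_play_train(start_pile=10, episodes=5000, epsilon=0.2, seed=0):
    """Learn move values for the game purely from games played against itself."""
    rng = random.Random(seed)
    value = {}  # (pile, move) -> running average outcome for the player who moved
    count = {}  # (pile, move) -> number of times that move was tried
    for _ in range(episodes):
        pile, history = start_pile, []
        while pile > 0:
            moves = legal_moves(pile)
            if rng.random() < epsilon:
                move = rng.choice(moves)  # explore a random move
            else:
                move = max(moves, key=lambda m: value.get((pile, m), 0.0))
            history.append((pile, move))
            pile -= move
        # Whoever took the last stick wins (+1). Walking backwards through
        # the game, the outcome flips sign because the players alternate.
        outcome = 1.0
        for key in reversed(history):
            count[key] = count.get(key, 0) + 1
            old = value.get(key, 0.0)
            value[key] = old + (outcome - old) / count[key]
            outcome = -outcome
    return value


def best_move(value, pile):
    """Pick the legal move with the highest learned value."""
    return max(legal_moves(pile), key=lambda m: value.get((pile, m), 0.0))


value = self_play_train()
for pile in (5, 6, 7):
    print(pile, "sticks ->", "take", best_move(value, pile))
```

No human games are supplied at any point. The only training signal is who won each self-played game, which is the core of the approach described above. In this particular game the known winning strategy is to leave the opponent a multiple of four sticks, so that is the behavior one would hope to see emerge as training runs longer.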

One of the most astonishing aspects of AlphaGo Zero’s development was the speed at which it surpassed human-level play. In just three days, it defeated the previous version of AlphaGo, which had itself defeated a world Go champion.

After 40 days of self-training, AlphaGo Zero reached an even higher level of play, surpassing the “Master” version of AlphaGo, which was considered the world’s best player.

This is the same approach that researchers at the University of California, Berkeley applied when they built a human-sized robot that uses artificial intelligence (AI) techniques to teach itself how to walk in the physical world. UC Berkeley researchers Ilija Radosavovic and Bike Zhang wondered if “reinforcement learning,” a concept made popular by large language models (LLMs) last year, could also teach the robot how to adapt to changing needs. To test their theory, the duo started with one of the most basic functions humans can perform: walking.

Ilija Radosavovic on Twitter: "we have trained a humanoid transformer with large-scale reinforcement learning in simulation and deployed it to the real world zero-shot pic.twitter.com/WzOIMQXTaD / Twitter"


https://medium.com/media/874d1c525bc7eb10b86e6f9a39c21065/href

AlphaGo Zero’s approach to learning and self-improvement through “reinforcement learning,” now being explored for LLMs as well, carries profound implications for the future of AI.

It demonstrates that AI can develop an understanding of complex systems and strategies without external data or human expertise, relying solely on self-generated data and learning algorithms.

Prediction 5: Continuous Training

The concept of continuous training in AI can be explored through the lens of models having dynamic, real-time access to training datasets. This approach is set to revolutionize the way AI systems adapt and personalize content and experiences.

The evolving field of AI now enables models to continuously learn and adapt in real-time, enhancing their performance and relevance. This is achieved through continuous training, where AI models are retrained to adapt to changes in data before being redeployed. The trigger for a rebuild can be changes in data, model adjustments, or code modifications.

Continuous training is crucial because machine learning models can become stale over time due to data drift, where the statistical distribution of production data changes, or concept drift, where the statistical properties of the target variables change.

In practice, this means AI systems can dynamically adjust to new information, maintaining their effectiveness in rapidly changing environments. For example, an AI model used for fraud detection might need frequent retraining to adapt to evolving fraudulent techniques.
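A retraining trigger based on data drift can be as simple as comparing recent production data with the data the model was trained on. The sketch below measures how far the recent average of one feature has moved, in units of the training data's standard deviation, and flags a rebuild when it crosses a threshold. The feature (transaction amounts), the threshold of 2.0, and the function names are hypothetical; production systems typically use richer statistical tests across many features.

```python
from statistics import mean, stdev


def drift_score(reference, recent):
    """Shift of the recent mean, measured in reference standard deviations."""
    spread = stdev(reference)
    if spread == 0:
        return 0.0 if mean(recent) == mean(reference) else float("inf")
    return abs(mean(recent) - mean(reference)) / spread


def should_retrain(reference, recent, threshold=2.0):
    """Trigger a rebuild when the feature has drifted past the threshold."""
    return drift_score(reference, recent) > threshold


# Illustrative transaction amounts only.
training_amounts = [20.0, 22.0, 19.0, 21.0, 18.0, 20.0]
stable_week = [21.0, 19.0, 20.0, 20.0]
shifted_week = [35.0, 40.0, 38.0, 37.0]

print(should_retrain(training_amounts, stable_week))
print(should_retrain(training_amounts, shifted_week))
```

The stable week looks like the training data, so no rebuild is triggered, while the shifted week sits far outside the original range and would kick off retraining.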

Adapting traditional machine learning workflows to support real-time inference involves overcoming several challenges. It requires a robust infrastructure capable of handling fast-moving data streams and deploying real-time models effectively. This includes ingesting and processing user events, computing and fetching online features with minimal latency, and synchronizing the served model with online feature stores without downtime.

Wrapping Up

Let’s take a step back and marvel at the AI journey we’re embarking on as we head into 2024. It’s not just about the tech getting smarter; it’s about how these advancements are poised to redefine our everyday interactions.

We’re talking about a seismic shift here — from the sheer computing power becoming more accessible to everyone, to AI thinking more deeply and methodically like us humans. And then, there’s this whole new angle of seeing LLMs as the backbone of future operating systems. It’s like we’re giving AI a whole new playground to innovate and grow.

But what really gets me excited is the self-learning exemplified by AlphaGo Zero — an AI teaching itself to outsmart human intelligence. And let’s not forget about the customizations and personalizations — it’s like AI is getting a real-time update on what we need, even before we know we need it.

https://medium.com/media/17aecfabb33a2280e4886076782d9e59/href