Solving LLM Hallucinations in Conversational, Customer-Facing Use Cases

Or: Why “can we turn off generation?” may be the smartest question in generative AI
Not long ago, I found myself in a meeting with technology leaders at a large enterprise. We were discussing Parlant as a solution for building fluent yet tightly controlled conversational agents. The conversation was going well – until someone asked a question that caught me completely off guard:
“Can we use Parlant, but with the generation part turned off?”
At first, I honestly thought it was a misunderstanding. A generative AI agent… with no generation? It sounded like a contradiction.
But I paused. And the more I thought about it, the more sense the question made.
The high stakes of customer-facing AI
These teams aren’t playing with demos. Their AI agents are headed for production – communicating directly with millions of users every month. In that environment, even a 0.01% error rate is unacceptable. When the consequences can be compliance failures, legal risk, or brand damage, one bad interaction in ten thousand is too many.
At this scale, “pretty good” isn’t good enough. Although LLMs have come a long way, their free-form outputs still introduce uncertainty – hallucinations, unexpected tone, and factual drift.
So no, the question wasn’t ridiculous. In fact, it was key.
A shift in perspective
Later that night, I kept thinking about it. The question made more sense than I’d initially realized, because these organizations aren’t lacking in resources or expertise.
In fact, they employ full-time conversation designers: trained professionals who can design agent behaviors, craft interactions, and write responses that align precisely with brand voice and legal requirements – while still making the AI something customers actually want to interact with. In practice, that’s no easy task!
So they weren’t asking to turn off generation out of fear – they were asking because they want, and are equipped to take, hands-on control.
That’s when it hit me: we’ve been thinking about “generative AI agents” the wrong way.
They aren’t necessarily about open-ended token generation. They’re about adaptivity: responding intelligently to input, in context. Whether those responses come word-by-word from an LLM or from a curated response bank doesn’t really matter. What matters is whether they’re appropriate: compliant, contextual, clear, and useful.
The hidden key to the hallucination problem
Everyone is looking for a fix for hallucinations. Here’s a radical idea: we think it already exists.
Conversation designers.
If you already have conversation designers on your team, as many enterprises do, you’re positioned not just to mitigate hallucinated outputs but to actually eliminate them altogether.
They also bring clarity, intentionality, and an engaging voice to customer interactions. In fact, the interactions they craft can be more effective than what a foundation LLM produces on its own, because an LLM by itself often doesn’t strike the right tone for customer-facing use cases.
So instead of patching a generative system with band-aids, I realized: why not bake this into Parlant from the ground up? After all, Parlant is all about design authority and control. It’s about giving the right people the tools to shape how AI behaves in the real world. It’s a perfect match, especially for enterprise use cases that stand to gain a great deal from adaptive conversations – if only they could trust them with real customers.
From insight to product: utterance matching
That was the breakthrough moment that led us to build utterance templates.
Utterance templates let designers supply fluid, context-aware templates for agent responses: replies that feel natural but are fully reviewed, versioned, and managed. It’s a structured way to keep LLM-like adaptivity while retaining full control over what is actually said.
Under the hood, utterance templates work in a three-stage process (a rough sketch follows the list):
- The agent drafts a fluid message based on its current situational awareness (the interaction, guidelines, tool results, and so on).
- Using that draft as a query, it finds the closest matching utterance template in your utterance store.
- The engine renders the matching template (written in Jinja2 format), substituting in variables supplied by your tools.
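To make the flow concrete, here is a minimal, hypothetical sketch of the three stages. It is not Parlant’s actual API: the names UTTERANCE_STORE, match_utterance, and render_utterance are invented for illustration, the matching step uses simple string similarity in place of the semantic matching a production engine would use, and the LLM’s draft is hard-coded.

```python
# Illustrative sketch only -- not Parlant's actual API.
from difflib import SequenceMatcher
from jinja2 import Template

# A reviewed, versioned utterance store maintained by conversation designers.
UTTERANCE_STORE = [
    "Your current balance is {{ balance }}.",
    "I've scheduled your appointment for {{ date }} at {{ time }}.",
    "I'm sorry, I can't help with that request.",
]

def match_utterance(draft: str) -> str:
    """Stage 2: pick the stored template closest to the agent's fluid draft."""
    return max(
        UTTERANCE_STORE,
        key=lambda t: SequenceMatcher(None, draft.lower(), t.lower()).ratio(),
    )

def render_utterance(template: str, tool_variables: dict) -> str:
    """Stage 3: substitute tool-supplied variables into the Jinja2 template."""
    return Template(template).render(**tool_variables)

# Stage 1: in a real agent the LLM drafts this message from context
# (the interaction, guidelines, tool results); here it's hard-coded.
draft = "Sure! Your balance right now is about two hundred dollars."
tool_variables = {"balance": "$200.00"}

chosen = match_utterance(draft)
print(render_utterance(chosen, tool_variables))
# -> "Your current balance is $200.00."
```

The point of the structure is that only pre-approved templates ever reach the customer, while the draft-then-match step preserves the context-sensitivity that makes generative agents feel natural.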
We knew immediately that this would fit perfectly with Parlant’s hybrid model: a tool that lets software developers build reliable agents while business and interaction experts define how those agents behave. The folks at that particular enterprise saw it immediately, too.
Conclusion: empower the right people
The future of conversational AI is not about removing people from the loop. It’s about empowering the right people to shape, and continually improve, what the AI says and how it says it.
With Parlant, those people can be the ones who know your brand, your customers, and your responsibilities best.
So the only ridiculous thing here was my initial reaction. Turning off generation (or at least tightly controlling it) in customer-facing interactions isn’t ridiculous. It’s probably how things should be. At least in our opinion!
Disclaimer: The views and opinions expressed in this guest article are the author’s views and do not necessarily reflect the official policies or positions of Marktechpost.

Yam Marcovitz is Parlant’s tech lead and CEO at Emcie. An experienced software builder with an extensive background in mission-critical software and system architecture, Yam brings a unique perspective to developing controllable, predictable, and consistent AI systems.
