Automation for second-line support. The agent classifies the request, selects a scenario from a matrix, gathers data from billing, CRM, and the knowledge base, and drafts the response in the corporate tone. The operator reviews instead of writing from scratch.
The second line of support at a major telecom operator handles hundreds of thousands of tickets per month: tariffs, balances, subscriptions, unexpected charges, enabling and disabling services. A single request is not especially difficult. The problem is scale and context collection.
Operators were spending 15 to 25 minutes per ticket because every answer required them to read the level-1 thread, open several internal systems, gather the context, find the right script, and only then write the message.
The slide version of this problem is simple: connect AI to internal systems and get instant answers. Production reality is not. Access to internal systems is restricted, the corporate tone is strict, the knowledge base changes constantly, and a single general-purpose agent quickly turns into an unpredictable improviser.
Brand voice was a project of its own. “Your balance is 342 rubles” and “There are 342 rubles on the account” are semantically equivalent, but not equivalent for a regulated brand. The final wording rules were refined more than sixty times on live dialogues.
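One way such wording rules can be enforced is to keep the final phrasing in approved templates rather than letting the model free-write sensitive lines. A minimal illustrative sketch — the template text and function names are assumptions, not the client's actual rules:

```python
# Illustrative: approved vs. forbidden wording for a balance reply.
# The concrete templates here are assumptions for demonstration only.
BALANCE_TEMPLATES = {
    "approved": "There are {amount} rubles on the account.",
    "forbidden": "Your balance is {amount} rubles.",  # semantically equal, off-brand
}

def render_balance(amount: int) -> str:
    """Render the balance line using only the approved wording."""
    return BALANCE_TEMPLATES["approved"].format(amount=amount)

print(render_balance(342))  # There are 342 rubles on the account.
```

In a setup like this, iterating on the sixty-plus wording refinements means editing templates, not retraining or re-prompting the generator.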
| Approach | What happened | Outcome |
|---|---|---|
| One general agent for everything | It handled simple cases but drifted in complex ones and violated the required sequence | Replaced with strict routing and scenario execution |
| Separate tone editor | Added latency without improving quality | Removed, tone rules were embedded into the main agent |
| Fake tool responses in early prototypes | Looked fast in demos but collapsed as soon as real systems were connected | Replaced with production-grade tool integration |
| Static knowledge base assumptions | The agent started citing stale conditions | Replaced with continuous sync and dual indexing |
The final design separates “understand what the user asked” from “decide what to do with it”.
First, one agent classifies the topic from a dynamic list. Then a second agent narrows the request down to a subtopic and sees only the options allowed under that topic. For each topic/subtopic pair, the database stores a concrete execution plan: which data to pull, from which systems, in what order, and what kind of completion is allowed.
That plan becomes the main agent’s highest-priority instruction. As a result, the business logic is editable through the scenario matrix instead of code releases, while the answer remains grounded in live data from billing, CRM, and the knowledge base.
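The flow above can be sketched in a few lines. Everything here is illustrative: the topic names, subtopics, tool identifiers, and plan fields are assumptions standing in for the real matrix, and simple string matching stands in for the LLM classifiers:

```python
# Sketch of the two-stage classification plus scenario-matrix lookup.
# All topics, subtopics, tool names, and plan fields are illustrative.
SCENARIO_MATRIX = {
    ("billing", "unexpected charge"): {
        "tools": ["billing.get_charges", "crm.get_profile", "kb.lookup"],
        "completion": "draft_reply",   # agent drafts; operator reviews
    },
    ("tariffs", "change tariff"): {
        "tools": ["billing.get_tariff", "kb.lookup"],
        "completion": "handoff",       # must be escalated to an operator
    },
}

def classify_topic(ticket: str, topics: list[str]) -> str:
    # Stand-in for the first strict-choice classifier (an LLM in production).
    return next(t for t in topics if t in ticket)

def classify_subtopic(ticket: str, topic: str) -> str:
    # The second classifier only sees subtopics allowed under the chosen topic.
    allowed = [sub for (t, sub) in SCENARIO_MATRIX if t == topic]
    return next(s for s in allowed if s in ticket)

def plan_for(ticket: str) -> dict:
    topic = classify_topic(ticket, ["billing", "tariffs"])
    subtopic = classify_subtopic(ticket, topic)
    # The stored plan becomes the main agent's highest-priority instruction.
    return SCENARIO_MATRIX[(topic, subtopic)]

plan = plan_for("billing: customer reports an unexpected charge")
print(plan["completion"])  # draft_reply
```

The key property is that `SCENARIO_MATRIX` lives in a database, so business logic changes by editing rows, not by shipping code.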
Once the system hit live traffic, the client expanded it to new categories instead of treating it as a prototype. Operators stopped acting as manual context collectors and became reviewers of a prepared response. What used to take 15 to 25 minutes can now be assembled in seconds.
- A cascade of three agents: two strict-choice classifiers determine topic and subtopic, and the main agent executes the scenario with seven tools.
- A dual knowledge index: one store for semantic retrieval and another for exact data with tables and numbers, synchronized from the corporate knowledge base.
- Qwen 3 235B, used in strict-choice mode for classification and in a freer mode for answer generation inside the client’s corporate tone.
- An explicit rule hierarchy for off-topic filtering, operator handoff, and brand-voice boundaries.
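The dual index can be pictured as two stores behind one sync path. The sketch below is a toy version under stated assumptions: keyword overlap stands in for embedding search, and the field names and sync interface are invented for illustration:

```python
# Toy sketch of the dual knowledge index: fuzzy retrieval for prose,
# exact key lookup for tables and numbers. Interface names are assumptions.
from dataclasses import dataclass, field

@dataclass
class DualIndex:
    semantic: list[tuple[str, str]] = field(default_factory=list)  # (text, doc_id)
    exact: dict[str, str] = field(default_factory=dict)            # key -> value

    def sync(self, doc_id: str, prose: str, facts: dict[str, str]) -> None:
        """Called on every knowledge-base update: prose and facts are indexed separately."""
        self.semantic.append((prose, doc_id))
        self.exact.update(facts)

    def retrieve(self, query: str) -> str:
        # Exact keys win: numbers and conditions must never come from fuzzy matches.
        if query in self.exact:
            return self.exact[query]
        # Naive keyword overlap as a stand-in for semantic search.
        best = max(self.semantic,
                   key=lambda p: len(set(query.split()) & set(p[0].split())))
        return best[0]

index = DualIndex()
index.sync("kb-101", "Roaming is enabled in the app under Services.",
           {"roaming_daily_fee": "350 rubles/day"})
print(index.retrieve("roaming_daily_fee"))  # 350 rubles/day
```

Splitting the stores is what keeps stale or approximately-retrieved numbers out of answers while prose-style questions still benefit from fuzzy matching.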
We will tell you whether the task fits AI agents and, if it does, outline a concrete plan.
Or write directly to ilya@manaraga.ai.