How MakeMyTrip cut LLM costs by 63% with zero quality loss
MakeMyTrip's AI travel assistant Myra serves tens of thousands of conversations monthly across a complex multi-agent architecture. With Divyam.AI, they moved from static model assignments to intelligent, data-driven optimization.
- Online travel platform
- 6 weeks to production
- Millions of daily active users
- 63% cost reduction
- Zero quality loss
- 1-line code change
MakeMyTrip is India's largest online travel platform, serving millions of daily active users across flights, hotels, holidays, and more. To enhance the customer experience, MMT built Myra, an AI-powered travel assistant designed to handle holistic travel needs through natural conversation.
A sample conversation with Myra:

User: Can you suggest unique and offbeat travel destinations in Japan?
Myra: I see you're interested in unique and offbeat destinations in Japan! Are you looking for specific types of experiences, like nature, culture, or adventure?
User: maybe a mix
Myra: Japan is a treasure trove of unique destinations. Here are some recommendations:
- Days 1-2: Naoshima Island (art museums on the Seto Inland Sea)
- Days 3-4: Takayama (Edo-period streets, morning markets)
- Days 5-6: Kanazawa (Kenrokuen Garden, samurai district)
- Day 7: Yoshino (30,000 cherry trees, mountain temple)
Myra: Want me to find flights and hotels for these dates?
Meet Myra
Myra is a GenAI Trip Planning Assistant built on an agentic AI framework and a network of specialised AI agents spanning flights, accommodation, holidays, ground transport, visas, and forex. The core idea is to convert natural-language travel intent into bookable, transaction-ready outcomes using real-time availability, pricing, customer-preference data, supply data, user-generated content, personalisation, and verification systems. Each agent is powered by its own LLM, including LoRA fine-tuned models. The system handles hundreds of thousands of conversations every month.
The Challenge
A million-combination model selection problem
MakeMyTrip's AI team faced a problem that every company building multi-agent AI systems eventually hits: how do you keep up with the LLM landscape when your system depends on multiple models working together?
In most agentic stacks today, each agent is bound to a static LLM: a fixed assignment rather than a data-driven approach where the optimal model is chosen just in time. When the team wanted to experiment with different models for cost and quality improvements, the evaluation overhead quickly became the bottleneck.
The math alone was daunting. With 10 candidate models per module and six modules in the Query Planner, there were one million possible combinations to evaluate. For a multi-agent system, comprehensive manual evaluation was operationally impossible.
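The arithmetic is worth making concrete. The contrast below with a per-module, linear-time search is purely illustrative; it is not a description of Divyam.AI's actual algorithm:

```python
# With 10 candidate models for each of the 6 Query Planner modules,
# every module's choice multiplies the number of end-to-end configurations.
candidates_per_module = 10
modules = 6

# Brute-force evaluation: score every full assignment.
exhaustive_combinations = candidates_per_module ** modules
print(exhaustive_combinations)  # 1000000 (one million)

# An illustrative linear-time alternative: score candidates per module
# independently, touching only 10 * 6 = 60 model evaluations.
linear_evaluations = candidates_per_module * modules
print(linear_evaluations)  # 60
```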
And even if they could evaluate every combination, static assignment was inherently wasteful. A simple customer query ("cancel my hotel booking") was being processed by the same expensive frontier model as a complex one ("plan a 10-day family trip across Southeast Asia with budget constraints"). There was no mechanism to right-size model selection based on the actual complexity of each request.
What separated MMT's approach was conviction: their leadership had the foresight to recognise that LLM optimization is a complex and perpetual problem, one that would permanently consume their best engineers and data scientists if kept in-house. Their brightest minds belonged on what only MMT can build (Myra), not on the infrastructure layer beneath it.
The Solution
Intelligent optimization that replaces brute-force evaluation
Acting on that conviction, MMT's team turned to Divyam.AI — a platform purpose-built for the very problem they had chosen not to make their own.
Three capabilities stood out:
Linear-time search through exponential space. Divyam.AI's algorithms could navigate the exponential search space, finding optimal model assignments across all modules without brute-force evaluation.
Intelligent benchmark sampling. Reduced experimentation costs dramatically, making continuous evaluation feasible rather than a one-time expensive exercise.
Fine-grained prompt routing. Segment prompts by complexity and select models dynamically along the cost-quality frontier, with constraints on latency and volume commitments.
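As a rough sketch of what prompt-level routing along a cost-quality frontier can look like, consider the toy router below. The model names, prices, quality scores, and complexity heuristic are all invented for illustration; this is not Divyam.AI's routing logic:

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # USD, illustrative pricing
    quality: float             # benchmark score in [0, 1], illustrative

# A toy cost-quality frontier, cheapest to most capable.
FRONTIER = [
    Model("small-fast", 0.0002, 0.75),
    Model("mid-tier", 0.0010, 0.85),
    Model("frontier", 0.0150, 0.95),
]

def estimate_complexity(prompt: str) -> float:
    """Toy heuristic: longer, multi-constraint prompts score higher."""
    signals = sum(kw in prompt.lower() for kw in ("plan", "budget", "multi", "itinerary"))
    return min(1.0, len(prompt) / 400 + 0.15 * signals)

def route(prompt: str, quality_floor: float = 0.7) -> Model:
    """Pick the cheapest model whose quality clears the bar implied by
    the prompt's estimated complexity."""
    required = quality_floor + (1 - quality_floor) * estimate_complexity(prompt)
    eligible = [m for m in FRONTIER if m.quality >= required]
    return min(eligible or [FRONTIER[-1]], key=lambda m: m.cost_per_1k_tokens)

# A simple cancellation stays on the cheap model; a multi-constraint
# planning request is escalated up the frontier.
print(route("cancel my hotel booking").name)  # small-fast
print(route("plan a 10-day family trip across Southeast Asia with budget constraints").name)  # mid-tier
```

The design point is that the expensive model is reserved for prompts whose estimated difficulty actually demands it, which is the mechanism behind right-sizing described above.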
Critically, Divyam.AI could be deployed within MMT's cloud environment, keeping all customer data within MMT's security perimeter. Decision trails were fully auditable, and quality metrics were surfaced through dashboards that gave MMT's team complete visibility.
Implementation
Divyam.AI was deployed within MakeMyTrip's cloud environment using a structured, phased approach. The Divyam.AI team provided Terraform scripts and a setup guide; MMT's DevOps team stood up the environment (Azure Kubernetes Service, Key Vault, Blob Storage, an Internal Load Balancer, and MySQL). The Divyam.AI team provided support through teething issues and validated the setup.
The integration effort on MMT's side was minimal: a single-line code change to route their Query Planner modules through Divyam.AI's optimization layer. No re-architecture was required, and no months-long migration. Divyam.AI slotted into the existing system as a drop-in optimization layer.
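The case study does not spell out the exact line, but a drop-in layer of this kind typically means the application's call sites stay unchanged and only the endpoint they target moves. A hypothetical before/after sketch (the URLs and helper function below are invented for illustration):

```python
# Before: a Query Planner module talks to a fixed model endpoint.
# LLM_ENDPOINT = "https://api.example-llm-provider.com/v1"

# After: the one-line change points the same traffic at the in-VPC
# optimization gateway, which selects a model per prompt.
LLM_ENDPOINT = "https://divyam-gateway.internal.example.com/v1"  # hypothetical URL

def query_planner_call(prompt: str) -> str:
    """Unchanged application code: only LLM_ENDPOINT moved."""
    return f"POST {LLM_ENDPOINT}/chat/completions prompt={prompt!r}"

print(query_planner_call("cancel my hotel booking"))
```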
Implementation Timeline
Weeks 1-2: Infrastructure setup in MMT's cloud environment (AKS, Key Vault, Blob Storage, Internal Load Balancer, MySQL)
Weeks 3-4: Integration and benchmarking of the Query Planner modules (single-line code change on MMT's side, 6 modules onboarded)
Within six weeks of deployment, the results were clear. Divyam.AI reduced MakeMyTrip's LLM costs by 63% with zero quality loss. The system was no longer sending every query to an expensive frontier model. Instead, Divyam.AI's routing layer dynamically matched each prompt to the right-sized model based on complexity, cost constraints, and quality requirements.
The efficiency gain was not just about cost. MMT's team could now experiment with new models as they launched without manually re-evaluating every possible combination. What had been an impossible optimization problem, a million model combinations across six modules, was now handled algorithmically and continuously.
And the integration had been virtually frictionless: a single-line code change on MMT's side, deployed within MMT's cloud environment, with full auditability and dashboard visibility into every routing decision.
Before Divyam.AI:
- Fixed model assignment per agent
- 1M+ combinations to evaluate manually

With Divyam.AI:
- Dynamic prompt-level model selection
- Automated continuous optimization
- 63% less inference cost, zero quality loss
What's Next
Expanding optimization across the full Myra architecture
The initial deployment covered Phase 1: the Query Planner modules that form the core of Myra's query understanding pipeline. The natural expansion path extends Divyam.AI's optimization across the rest of the Myra architecture, including the Conversation Manager, flight and hotel specialist agents, and the response synthesis layer.
Each additional module integrated means more optimization surface and compounding cost savings. As new models enter the market, Divyam.AI's continuous evaluation ensures MMT is always running the optimal configuration without manual intervention.
See what Divyam.AI can do for your AI system
Join teams like MakeMyTrip that are cutting LLM costs without sacrificing quality.