How MakeMyTrip cut LLM costs by 63% with zero quality loss
MakeMyTrip's AI travel assistant Myra serves tens of thousands of conversations monthly across a complex multi-agent architecture. With Divyam.AI, they moved from static model assignments to intelligent, data-driven optimization.
- Online travel platform
- 6 weeks to production
- Millions of daily active users
- 63% cost reduction
- Zero quality loss
- 1-line code change
MakeMyTrip is India's largest online travel platform, serving millions of daily active users across flights, hotels, holidays, and more. To enhance the customer experience, MMT built Myra, an AI-powered travel assistant designed to handle holistic travel needs through natural conversation.
A sample conversation with Myra:

User: Can you suggest unique and offbeat travel destinations in Japan?
Myra: I see you're interested in unique and offbeat destinations in Japan! Are you looking for specific types of experiences, like nature, culture, or adventure?
User: maybe a mix
Myra: Japan is a treasure trove of unique destinations. Here are some recommendations:
- Days 1-2: Naoshima Island (art museums on the Seto Inland Sea)
- Days 3-4: Takayama (Edo-period streets, morning markets)
- Days 5-6: Kanazawa (Kenrokuen Garden, samurai district)
- Day 7: Yoshino (30,000 cherry trees, mountain temple)
Myra: Want me to find flights and hotels for these dates?
Meet Myra
Myra is a GenAI Trip Planning Assistant built on an agentic AI framework and a network of specialised AI agents spanning flights, accommodation, holidays, ground transport, visas, and forex. The core idea is to convert natural-language travel intent into bookable, transaction-ready outcomes using real-time availability, pricing, customer-preference data, supply data, user-generated content, personalisation, and verification systems. Each agent is powered by its own LLM, including LoRA fine-tuned models. The system handles hundreds of thousands of conversations every month.
The Challenge
A million-combination model selection problem
MakeMyTrip's AI team faced a problem that every company building multi-agent AI systems eventually hits: how do you keep up with the LLM landscape when your system depends on multiple models working together?
In most agentic stacks today, each agent is bound to a static LLM: a fixed assignment rather than a data-driven approach where the optimal model is chosen just in time. When the team wanted to experiment with different models for cost and quality improvements, the evaluation overhead quickly became the bottleneck.
The math alone was daunting. With 10 candidate models per module and six modules in the Query Planner, there were one million possible combinations to evaluate. For a multi-agent system, comprehensive manual evaluation was operationally impossible.
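The arithmetic is worth making concrete. The contrast below with a per-module, linear-time search is purely illustrative; it is not a description of Divyam.AI's actual algorithm:

```python
# With 10 candidate models for each of the 6 Query Planner modules,
# every module's choice multiplies the number of end-to-end configurations.
candidates_per_module = 10
modules = 6

# Brute-force evaluation: score every full assignment.
exhaustive_combinations = candidates_per_module ** modules
print(exhaustive_combinations)  # 1000000 (one million)

# An illustrative linear-time alternative: score candidates per module
# independently, touching only 10 * 6 = 60 model evaluations.
linear_evaluations = candidates_per_module * modules
print(linear_evaluations)  # 60
```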
And even if they could evaluate every combination, static assignment was inherently wasteful. A simple customer query ("cancel my hotel booking") was being processed by the same expensive frontier model as a complex one ("plan a 10-day family trip across Southeast Asia with budget constraints"). There was no mechanism to right-size model selection based on the actual complexity of each request.
What separated MMT's approach was conviction: their leadership had the foresight to recognise that LLM optimization is a complex and perpetual problem, one that would permanently consume their best engineers and data scientists if kept in-house. Their brightest minds belonged on what only MMT can build (Myra), not on the infrastructure layer beneath it.
The Solution
Intelligent optimization that replaces brute-force evaluation
Acting on that conviction, MMT's team turned to Divyam.AI — a platform purpose-built for the very problem they had chosen not to make their own.
Three capabilities stood out:
Linear-time search through exponential space. Divyam.AI's algorithms could navigate the exponential search space, finding optimal model assignments across all modules without brute-force evaluation.
Intelligent benchmark sampling. Reduced experimentation costs dramatically, making continuous evaluation feasible rather than a one-time expensive exercise.
Fine-grained prompt routing. Segment prompts by complexity and select models dynamically along the cost-quality frontier, with constraints on latency and volume commitments.
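As a rough sketch of what prompt-level routing along a cost-quality frontier can look like, consider the toy router below. The model names, prices, quality scores, and complexity heuristic are all invented for illustration; this is not Divyam.AI's routing logic:

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # USD, illustrative pricing
    quality: float             # benchmark score in [0, 1], illustrative

# A toy cost-quality frontier, cheapest to most capable.
FRONTIER = [
    Model("small-fast", 0.0002, 0.75),
    Model("mid-tier", 0.0010, 0.85),
    Model("frontier", 0.0150, 0.95),
]

def estimate_complexity(prompt: str) -> float:
    """Toy heuristic: longer, multi-constraint prompts score higher."""
    signals = sum(kw in prompt.lower() for kw in ("plan", "budget", "multi", "itinerary"))
    return min(1.0, len(prompt) / 400 + 0.15 * signals)

def route(prompt: str, quality_floor: float = 0.7) -> Model:
    """Pick the cheapest model whose quality clears the bar implied by
    the prompt's estimated complexity."""
    required = quality_floor + (1 - quality_floor) * estimate_complexity(prompt)
    eligible = [m for m in FRONTIER if m.quality >= required]
    return min(eligible or [FRONTIER[-1]], key=lambda m: m.cost_per_1k_tokens)

# A simple cancellation stays on the cheap model; a multi-constraint
# planning request is escalated up the frontier.
print(route("cancel my hotel booking").name)  # small-fast
print(route("plan a 10-day family trip across Southeast Asia with budget constraints").name)  # mid-tier
```

The design point is that the expensive model is reserved for prompts whose estimated difficulty actually demands it, which is the mechanism behind right-sizing described above.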
Critically, Divyam.AI could be deployed within MMT's cloud environment, keeping all customer data within MMT's security perimeter. Decision trails were fully auditable, and quality metrics were surfaced through dashboards that gave MMT's team complete visibility.
Implementation
Divyam.AI was deployed within MakeMyTrip's cloud environment using a structured, phased approach. The Divyam.AI team provided Terraform scripts and a setup guide; MMT's DevOps team stood up the environment (Azure Kubernetes Service, Key Vault, Blob Storage, an Internal Load Balancer, and MySQL). The Divyam.AI team provided support through teething issues and validated the setup.
The integration effort on MMT's side was minimal: a single-line code change to route their Query Planner modules through Divyam.AI's optimization layer. No re-architecture was required, and no months-long migration. Divyam.AI slotted into the existing system as a drop-in optimization layer.
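The case study does not spell out the exact line, but a drop-in layer of this kind typically means the application's call sites stay unchanged and only the endpoint they target moves. A hypothetical before/after sketch (the URLs and helper function below are invented for illustration):

```python
# Before: a Query Planner module talks to a fixed model endpoint.
# LLM_ENDPOINT = "https://api.example-llm-provider.com/v1"

# After: the one-line change points the same traffic at the in-VPC
# optimization gateway, which selects a model per prompt.
LLM_ENDPOINT = "https://divyam-gateway.internal.example.com/v1"  # hypothetical URL

def query_planner_call(prompt: str) -> str:
    """Unchanged application code: only LLM_ENDPOINT moved."""
    return f"POST {LLM_ENDPOINT}/chat/completions prompt={prompt!r}"

print(query_planner_call("cancel my hotel booking"))
```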
Implementation Timeline
Weeks 1-2: Infrastructure setup in MMT's cloud environment (AKS, Key Vault, Blob Storage, Internal Load Balancer, MySQL)
Weeks 3-4: Integration and benchmarking of the Query Planner modules (single-line code change on MMT's side, 6 modules onboarded)
Within six weeks of deployment, the results were clear. Divyam.AI reduced MakeMyTrip's LLM costs by 63% with zero quality loss. The system was no longer sending every query to an expensive frontier model. Instead, Divyam.AI's routing layer dynamically matched each prompt to the right-sized model based on complexity, cost constraints, and quality requirements.
The efficiency gain was not just about cost. MMT's team could now experiment with new models as they launched without manually re-evaluating every possible combination. What had been an impossible optimization problem, a million model combinations across six modules, was now handled algorithmically and continuously.
And the integration had been virtually frictionless: a single-line code change on MMT's side, deployed within MMT's cloud environment, with full auditability and dashboard visibility into every routing decision.
Before Divyam.AI:
- Fixed model assignment per agent
- 1M+ combinations to evaluate manually

With Divyam.AI:
- Dynamic prompt-level model selection
- Automated continuous optimization
- 63% less inference cost, zero quality loss
What's Next
Expanding optimization across the full Myra architecture
The initial deployment covered Phase 1: the Query Planner modules that form the core of Myra's query understanding pipeline. The natural expansion path extends Divyam.AI's optimization across the rest of the Myra architecture, including the Conversation Manager, flight and hotel specialist agents, and the response synthesis layer.
Each additional module integrated means more optimization surface and compounding cost savings. As new models enter the market, Divyam.AI's continuous evaluation ensures MMT is always running the optimal configuration without manual intervention.
See what Divyam.AI can do for your AI system
Join teams like MakeMyTrip that are cutting LLM costs without sacrificing quality.