Large language models have moved from novelty to infrastructure. The next frontier is not bigger models, but better agents—systems that can plan, use tools, and work within business constraints. Kimi, Moonshot AI's long-context model, offers a practical path to building these agents without requiring a PhD in machine learning.
This guide walks through building a custom AI agent for a specific business scenario: handling used car enquiries. The pattern applies to any domain where you need an AI to answer questions, check availability, and provide technical details within guardrails.
Generic chatbots fail in business contexts because they lack:
A custom agent combines these elements into a system that actually helps customers rather than frustrating them.
Building a useful agent requires thinking through five interconnected layers:
Start by listing what the agent must do:
Equally important: define what it must not do. Never negotiate final prices. Never guarantee loan approval. Never disparage competitors.
Kimi responds to system prompts that establish role, constraints, and available tools:
You are a helpful assistant for Prestige Motors used car dealership.
YOUR ROLE:
- Answer customer questions about vehicles in our inventory
- Check availability and pricing using the provided tools
- Schedule test drives when requested
- Escalate complex negotiations to human sales staff
AVAILABLE TOOLS:
- search_inventory(query): Search vehicles by make, model, year, or feature
- check_availability(vehicle_id): Check if a vehicle is still available
- get_pricing(vehicle_id): Get current listed price and financing estimate
- schedule_test_drive(vehicle_id, datetime, contact_info): Book a test drive
- flag_for_followup(reason, urgency): Alert human staff to intervene
CONSTRAINTS:
- Always verify availability before confirming anything
- Be honest about vehicle condition and history
- Do not negotiate prices below listed amounts
- Do not make promises about financing approval
Kimi supports function calling (tool use) via its API. When a user asks "Is the 2022 BMW X3 still available?", the agent should:
search_inventory to find the vehicle IDcheck_availability with that IDThe key is letting Kimi decide which tools to use based on context, rather than hardcoding a rigid flow.
Used car buying spans multiple conversations. A customer might ask about SUVs on Monday, return Wednesday with questions about a specific vehicle, and call Friday to schedule a test drive.
The agent needs session persistence—remembering previous preferences, viewed vehicles, and discussed price ranges. This requires storing conversation summaries and retrieving them when the customer returns.
Technology alone fails without operational design:
Start narrow. Launch with a limited scope—perhaps just answering availability questions for a single vehicle category. Expand capabilities as the system proves reliable.
Test edge cases aggressively. What happens when a customer asks "What's your cheapest car?" or "Do you have anything like a Tesla?" The agent needs graceful handling of ambiguous or comparative queries.
Monitor latency. Each tool call adds time. Batch queries where possible, and consider caching common responses (e.g., featured vehicles) to reduce API costs.
Version your prompts. Small wording changes significantly affect behaviour. Track prompt versions and A/B test improvements.
Kimi's 200,000+ token context window enables true multi-turn conversations with full history. Its reasoning capabilities handle the disambiguation required in natural customer queries. And its API supports the function calling pattern essential for agent tool use.
Most importantly, Kimi is accessible—available through standard API calls without requiring complex infrastructure setup. This makes it practical for mid-sized businesses to experiment with agentic AI without enterprise-scale budgets.
Custom AI agents represent a shift from chatbots that answer questions to systems that accomplish tasks. For used car sales—and any similar enquiry-heavy business—this means better customer experience, reduced staff workload on routine queries, and clearer escalation paths for complex situations.
The technology is ready. The pattern is proven. The remaining work is thoughtful implementation of the five layers: foundation model, reasoning, tools, memory, and operating model. Get those right, and you have an agent that genuinely serves your customers rather than annoying them.