China's AI Price War: Xiaomi Matches DeepSeek
DeepSeek made its 75% V4-Pro cut permanent; days later Xiaomi cut MiMo-V2.5 up to 99% to match it exactly—pushing AI inference to a quarter of Western prices.
The price war in China's artificial intelligence sector has entered a significant new phase. DeepSeek has solidified a 75% rate cut on its flagship V4-Pro model into a permanent pricing tier. Just three days later, at midnight Beijing time on May 27, Xiaomi slashed prices for its MiMo-V2.5 series by up to 99%, releasing a price sheet that mirrors DeepSeek's down to the decimal.
Per million tokens, DeepSeek V4-Pro charges $0.435 for input and $0.87 for output. Xiaomi's MiMo-V2.5-Pro matches those figures exactly. The alignment extends to their standard tiers: Xiaomi's base model costs $0.14 for input and $0.28 for output, identical to DeepSeek's V4-Flash.
With flagship rates from OpenAI and Anthropic sitting at roughly four times these prices, the implications of this shift are profound. By aligning their pricing structures within the same week, the two Chinese tech firms have effectively anchored the cost of AI inference at a fraction of Western alternatives.
DeepSeek V4-Pro Sets the Baseline
DeepSeek was the first to establish this new baseline. Just a month after launching its V4 generation, the startup reduced the per-token rate of V4-Pro by 75%, transitioning the temporary promotional rate into a permanent pricing model. Under this new structure, one million tokens cost $0.435 for input and $0.87 for output—a sharp drop from the previous rates of $1.74 and $3.48. These figures will become the official list prices when the promotional period concludes on May 31.
The company maintains that the reduction is not a mere marketing discount. Sanchit Vir Gogia, CEO of Greyhound Research, points out that V4-Pro was specifically architected to minimize long-context inference costs. The model operates at approximately one-quarter of the compute cost and one-tenth of the memory footprint of its predecessor. This fundamental structural efficiency allows DeepSeek to sustain the rate cut permanently.
Independent benchmarks support the model's value proposition. Evaluator Artificial Analysis reported that following the 75% reduction, V4-Pro achieved top-tier status globally in terms of 'intelligence per dollar,' placing it squarely on the efficiency frontier. Because V4 is open source, developers can deploy and customize it locally, and it is pre-optimized for agent workflows like Anthropic's Claude Code and OpenClaw.
Xiaomi's MiMo Replicates the Exact Price Structure
Xiaomi immediately replicated the exact figures established by DeepSeek. Effective globally at midnight Beijing time on May 27, the newly introduced rates for the MiMo-V2.5 series reflect reductions of up to 99%, while eliminating the legacy pricing structure that varied by input length. The premium MiMo-V2.5-Pro is priced at $0.435 for input and $0.87 for output, while the standard MiMo-V2.5 tier stands at $0.14 and $0.28—matching DeepSeek's V4-Pro and V4-Flash pricing down to the decimal.
Xiaomi openly acknowledged the strategy, attributing the price cuts to systematic optimization of its inference architecture. The company integrated Sliding Window Attention (SWA) with SGLang HiCache, reducing KV-cache data transfer between GPU memory, CPU memory, and SSD storage to approximately one-seventh of previous levels. Simultaneously, the framework expanded the number of cacheable tokens fivefold, driving down per-token unit costs through higher cache-hit rates and overall processing efficiency.
The pricing adjustment was accompanied by broader operational changes. Xiaomi increased the capacity of its Token Plans by five to eight times and reset all active subscriber credits. Furthermore, a 100-trillion-token incentive program for creators, which launched in late April, was fully exhausted ahead of schedule on May 26. This tactical transition signals a shift from subsidizing early adoption with free tokens to cementing long-term user retention through highly aggressive baseline pricing.
The Next Battleground Is Trust and Data
When market leaders converge on identical price points, unit cost ceases to be a competitive differentiator. The critical question now is how Western frontier laboratories will respond. With flagship pricing from OpenAI and Anthropic remaining roughly four times higher than the new Chinese baseline, justifying this premium is becoming increasingly difficult for enterprises running token-heavy workloads.
Neil Shah, vice president of Counterpoint Research, noted in Computerworld that the availability of high-performance open-weights alternatives shifts leverage directly to corporate buyers. Western AI startups, despite boasting $700-billion-class collective valuations and demanding massive capital expenditures, are finding themselves price-takers on the core product of inference. As raw inference commoditizes, a multi-model strategy—deploying premium models for high-reasoning tasks and cost-efficient alternatives for repetitive operations—is rapidly becoming the standard enterprise architecture.
However, these low rates carry structural trade-offs. Amit Jaju, senior managing director at Ankura Consulting, identifies data sovereignty as a primary risk. Routing external APIs through Chinese infrastructure means prompts, proprietary documents, and logs may cross jurisdictions, raising severe regulatory concerns. Furthermore, pasting source code or design files into these models exposes intellectual property to potential training leaks. Following an earlier knowledge-distillation dispute, industry analysts increasingly recommend self-hosting on private infrastructure or sovereign cloud services.
The price-tag skirmish has essentially run its course. As inference transitions into a utility akin to electricity or bandwidth, competition will center on trust, ecosystem integration, and data residency rather than the invoice total. Having established the cost floor, the strategic frontier has moved from what AI costs to where and how it is deployed.
- Xiaomi MiMo Open Platform - MiMo-V2.5 Series Price Adjustment Announcement
- DeepSeek API Docs - Models & Pricing
- Computerworld - DeepSeek's steep V4-Pro price cut escalates AI pricing war
- Investing.com - China's DeepSeek to make permanent 75% price cut on flagship V4-Pro AI model
- Artificial Analysis (X) - DeepSeek V4 Pro on the Pareto frontier after the permanent 75% cut