slug: deepseek-v4-preview
summary: DeepSeek released V4 on April 24, 2026 — a 1.6 trillion parameter frontier model, fully open-source under MIT license, running natively on Huawei Ascend chips, at prices 9-18x lower than OpenAI’s latest.
description: DeepSeek V4 is open-source, runs on Huawei chips, and costs far less than GPT-5.5. A new competitor emerges.
coverImage: cover.png
author: Sun Jie
date: 2026-04-24
tags: ["DeepSeek", "V4", "Open Source AI", "Huawei Ascend", "Frontier Models", "AI Pricing"]
DeepSeek V4: The Open-Source Model That Runs on Huawei Chips and Costs 18x Less Than GPT-5.5
On April 24, 2026, exactly one year after DeepSeek-R1 went viral and briefly dethroned ChatGPT on the App Store, DeepSeek released V4. While the world's attention was on OpenAI's GPT-5.5, released just a day earlier, DeepSeek published something that may matter more in the long run: a frontier-level AI model that anyone can download, modify, deploy, and commercialize, with no licensing fees, no dependence on American chips, and a fraction of the cost of comparable proprietary models.
The Numbers That Matter
DeepSeek V4 comes in two variants:
V4-Pro is the flagship. It has 1.6 trillion total parameters, though only 49 billion are activated per token thanks to its Mixture-of-Experts (MoE) architecture. That gives it the capacity of a very large model with the inference cost of a much smaller one. The context window stretches to 1 million tokens, standard across both variants rather than a premium feature.
V4-Flash is the lightweight option. 284 billion total parameters, 13 billion activated per token. It is designed for speed and cost efficiency rather than maximum capability. For most production applications, Flash is likely the practical choice.
Here is the comparison that matters most right now:
| Model | Input (Cache Miss) | Output | Open Source? |
|---|---|---|---|
| DeepSeek V4-Flash | $0.14 / 1M tokens | $0.28 / 1M tokens | Yes (MIT) |
| DeepSeek V4-Pro | $1.74 / 1M tokens | $3.48 / 1M tokens | Yes (MIT) |
| GPT-5.5 | ~$3-15 / 1M tokens | ~$15-75 / 1M tokens | No |
DeepSeek claims V4 is roughly 18x cheaper than GPT-4o and 9-18x cheaper than GPT-5.5, depending on the tier and token type. Even compared to GPT-5.4-class models, the pricing gap is substantial. And this is for a model that benchmarks competitively: V4-Pro scores around 91.2 on MMLU-Pro, and the only other model at or above 90 on that benchmark is Gemini 3.1 Pro at 90.0.
The Huawei Angle
The model that runs on Huawei Ascend chips is not an afterthought — it is the point.
Since early 2026, US export restrictions have cut off Chinese companies from Nvidia's H100 and H200 chips. Huawei's Ascend 950 series has been positioned as the domestic alternative, but running large language models on it has required significant engineering optimization. DeepSeek V4 is the first frontier-class model to run natively on Ascend hardware, meaning it was designed from the ground up for these chips rather than ported after the fact.
Here is why this matters. Chinese enterprises can now deploy frontier-level AI without relying on foreign cloud providers or foreign hardware. Alibaba, Tencent, and ByteDance all placed large orders for Huawei Ascend chips in anticipation of V4, Reuters reported in March 2026, a month before the model even shipped.
For the global AI ecosystem, the lesson is that the Nvidia dependency is not permanent. A full AI stack (chips, model, training infrastructure) now exists outside the US-controlled supply chain.
For DeepSeek, the Huawei angle validates the company's unusual origin story. Founded in July 2023 by Liang Wenfeng, a Zhejiang University graduate who co-founded the quantitative hedge fund High-Flyer, DeepSeek was funded by hedge fund algorithmic trading profits rather than venture capital. That financial independence shows in the product strategy: DeepSeek consistently releases things for free that competitors charge for.
The Architecture: Engram, DSA2, and the 384 Experts
V4 is not just a larger V3. The team introduced several technical innovations that deserve attention.
Engram is a new memory architecture published in January 2026 on arXiv (paper 2601.07372). The core idea is straightforward: separate factual knowledge lookup from reasoning computation. Static knowledge gets stored in a lookup table (offloaded to host DRAM), freeing GPU memory and compute for complex reasoning tasks. The result is O(1) constant-time knowledge retrieval instead of storing everything in transformer parameters. The Engram paper describes a "U-shaped scaling law" for lookup mechanisms — a finding that could influence how future models handle long-term knowledge.
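The separation Engram describes can be illustrated with a toy sketch. This is not the paper's implementation; the names and data structures here are hypothetical, chosen only to show the principle of splitting O(1) fact retrieval from the compute path.

```python
# Toy illustration of the Engram idea (hypothetical names, not the
# paper's implementation): static knowledge sits in a plain hash map
# with O(1) average-case lookup, so the "reasoning" path never has to
# memorize facts inside its own parameters.

# In the real system this would be a large embedding store offloaded
# to host DRAM, not a Python dict.
KNOWLEDGE = {
    "capital_of_france": "Paris",
    "boiling_point_water_c": 100,
}

def lookup(key):
    """O(1) average-case retrieval from the knowledge store."""
    return KNOWLEDGE.get(key)

def reason(query_key):
    """'Reasoning' path: fetch the fact, then compute with it.
    Compute stays on the accelerator; facts come from cheap memory."""
    fact = lookup(query_key)
    if fact is None:
        return "unknown"
    return f"retrieved: {fact}"

print(reason("capital_of_france"))      # retrieved: Paris
print(reason("boiling_point_mercury"))  # unknown
```

The design point is that the lookup table can grow without growing the model's per-token compute, which is where the paper's U-shaped scaling law for lookup mechanisms comes in.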
DSA2 (DeepSeek Sparse Attention 2) is the attention mechanism. It combines two prior DeepSeek innovations — DSA and NSA — to enable stable training at trillion-parameter scale with only about 6.7% computational overhead. At 1 million token context, V4-Pro uses only 27% of the FLOPs and 10% of the KV cache compared to V3.2. V4-Flash drops to 10% FLOPs and 7% KV cache. This is what makes million-token context practically usable rather than a benchmark trick.
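A toy sketch makes the general principle concrete. This is not DSA2 itself (which combines DSA and NSA and is far more sophisticated); it only shows how attending to k selected keys instead of all n cuts attention cost from O(n²) toward O(n·k).

```python
# Toy top-k sparse attention (illustrative only, not DSA2): each query
# attends to the k highest-scoring keys rather than all n keys, so the
# per-query work drops from O(n) to O(k) and the layer from O(n^2)
# toward O(n*k).
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def sparse_attention(q, keys, values, k):
    """Attend only to the k keys with the highest dot-product score."""
    scores = [sum(qi * ki for qi, ki in zip(q, key)) for key in keys]
    topk = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in topk])
    dim = len(values[0])
    out = [0.0] * dim
    for w, i in zip(weights, topk):
        for d in range(dim):
            out[d] += w * values[i][d]
    return out, topk

# 8 keys, but each query touches only 3 of them.
q = [1.0, 0.0]
keys = [[float(i), 0.0] for i in range(8)]
values = [[float(i)] for i in range(8)]
out, selected = sparse_attention(q, keys, values, k=3)
print(selected)  # the 3 highest-scoring key indices: [7, 6, 5]
```

At a 1M-token context the difference between O(n²) and O(n·k) is exactly why the quoted FLOP and KV-cache reductions make million-token windows affordable rather than a benchmark trick.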
Sparse MoE continues from V3. V4-Pro has 384 routed experts with only 6 activated per forward pass, an expansion from V3's 256 routed experts (8 activated per token) plus 1 shared expert. Despite 1.6T total parameters, only ~3% are active per token — which is how a 1.6T parameter model can be deployed at inference costs comparable to a 50B model.
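Top-k expert routing can be sketched in a few lines. V4's actual router is not public, so the gating here is a stand-in; the arithmetic at the end, though, follows directly from the headline numbers (49B active of 1.6T total).

```python
# Sketch of top-k expert routing in a sparse MoE layer (illustrative;
# V4's real router and load-balancing logic are not public). With 384
# experts and 6 active, only a thin slice of the network runs per token.
import random

NUM_EXPERTS = 384
TOP_K = 6

def route(gate_scores, k=TOP_K):
    """Pick the k experts with the highest gate scores for this token."""
    return sorted(range(len(gate_scores)),
                  key=lambda i: gate_scores[i], reverse=True)[:k]

random.seed(0)
scores = [random.random() for _ in range(NUM_EXPERTS)]  # stand-in gating
active = route(scores)
print(f"active experts this token: {sorted(active)}")

# Active-parameter fraction matches the headline numbers:
total_params, active_params = 1.6e12, 49e9  # V4-Pro
print(f"active fraction: {active_params / total_params:.1%}")  # ~3.1%
```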
The Open-Source Strategy
DeepSeek released V4 under the MIT License — one of the most permissive open-source licenses available. Weights are on Hugging Face, ModelScope, and GitHub. The API is live simultaneously with the model release, offering both v4-pro and v4-flash endpoints.

This is a deliberate competitive move. DeepSeek is not trying to win on quality alone. GPT-5.5 is still ahead on polished creative tasks and frontend design. Instead, DeepSeek is betting on the economics: if your application does not need the absolute best quality, or if you are cost-sensitive, or if you have regulatory concerns about vendor lock-in, V4 is the obvious choice.
Under the MIT license, commercial use is unrestricted (no revenue sharing, no licensing fees), modification is allowed (fine-tune, quantize, merge with other models), and deployment is unrestricted (run it on your own hardware, in your own data center, in any cloud).
This contrasts sharply with OpenAI, which charges premium prices and retains significant control over how its models are used. The DeepSeek approach mirrors what Llama did for smaller models — it commoditizes capability and forces competitors to justify their pricing premium.
The Competitive Landscape: Three Releases in Nine Days
V4 did not arrive in a vacuum. The week of April 16-24, 2026 will be remembered as one of the most concentrated periods in AI history:
- April 16: Anthropic shipped Claude Opus 4.7, its most capable model yet.
- April 23: OpenAI released GPT-5.5, positioning it as an "AI superapp" with workflow agents.
- April 24: DeepSeek released V4.
Three frontier model releases in nine days. The industry has entered what commentators are calling a "release or die" phase — falling behind by even a few weeks now costs you mindshare, headlines, and customers.
Anthropic's counter to GPT-5.5 was to announce app connectors for Claude: integrations with Spotify, Uber Eats, TurboTax, and dozens of other services. Rather than building a superapp itself, Anthropic is connecting to the apps people already use.
OpenAI's response to DeepSeek is less clear. GPT-5.5 is more expensive, but it has the distribution advantage — ChatGPT has millions of daily active users who will use GPT-5.5 without having to switch products. DeepSeek has to earn that switching decision.
What V4 Means for Developers
If you are building AI-powered applications today, V4 changes the calculation in three ways:
First, your inference costs just dropped significantly. At $0.28 per million output tokens for V4-Flash, a full million tokens of generated text costs less than 30 cents. For high-volume applications like content moderation, document classification, or automated customer service, the economics of V4 make AI-assisted workflows viable at scale.
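The arithmetic is simple enough to sanity-check yourself. This sketch uses only the V4-Flash list prices from the table above; the monthly volumes are made-up example numbers.

```python
# Back-of-envelope monthly cost at V4-Flash list prices (from the
# pricing table above). The traffic volumes below are illustrative.
PRICE_IN_PER_M = 0.14   # $ per 1M input tokens (cache miss)
PRICE_OUT_PER_M = 0.28  # $ per 1M output tokens

def monthly_cost(input_tokens, output_tokens):
    """Total API cost in dollars for the given token volumes."""
    return (input_tokens / 1e6) * PRICE_IN_PER_M \
         + (output_tokens / 1e6) * PRICE_OUT_PER_M

# Example: a moderation pipeline processing 500M input tokens and
# generating 50M output tokens per month.
cost = monthly_cost(500e6, 50e6)
print(f"${cost:.2f}/month")  # $84.00/month
```

Run the same volumes through a proprietary model's price sheet to see the gap the article is describing.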
Second, self-hosting is now a real option. The MIT license means you do not have to trust a third-party API with your data. Deploy V4 on your own infrastructure, control your own data, pay your own hardware costs instead of per-token fees. For enterprises with data sovereignty requirements or high volume needs, this is a significant change.
Third, fine-tuning is now accessible. With a 284B parameter model like V4-Flash, fine-tuning on a specific domain (legal documents, medical records, financial reports) is computationally feasible for organizations that could not afford to fine-tune a 1.6T parameter model. Expect to see specialized V4 variants emerging over the coming months.
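Some rough memory arithmetic shows why the Flash/Pro distinction matters for fine-tuning. This is deliberately simplified: it counts weight storage only, ignoring gradients, optimizer state, and activations, each of which adds large multiples on top for full fine-tuning.

```python
# Simplified weight-memory arithmetic for the two V4 variants
# (weights only; gradients, optimizer state, and activations are
# ignored and would multiply these figures several times over).

def weight_memory_gb(params, bytes_per_param=2):
    """Memory in GB to hold the weights at 16-bit precision."""
    return params * bytes_per_param / 1e9

flash = weight_memory_gb(284e9)   # V4-Flash: 284B parameters
pro = weight_memory_gb(1.6e12)    # V4-Pro: 1.6T parameters

print(f"V4-Flash weights: {flash:.0f} GB")  # 568 GB
print(f"V4-Pro weights:   {pro:.0f} GB")    # 3200 GB
```

Even before optimizer state, Pro's weights alone need several times the memory of a well-equipped multi-GPU node, while Flash sits within reach of a modest cluster, and parameter-efficient methods like LoRA shrink the gap further.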
The Benchmark Question
One caveat before treating V4 as definitively superior: many of the benchmark numbers circulating are from pre-release leaks and internal claims. V4 was officially released on April 24, 2026. Independent third-party evaluation is still forthcoming.
The numbers that look strongest are:
- MMLU-Pro: ~91.2 — genuinely competitive; only Gemini 3.1 Pro is in the same range
- AIME 2025: ~96.4 — top-tier reasoning performance
- SWE-bench: 80-85% (pre-release) — would rival Claude Opus 4.5 at 80.9%
- HumanEval: ~90% (leaked) — strong coding capability
The coding benchmarks (SWE-bench, HumanEval) are particularly notable because they align with DeepSeek's historical strength — DeepSeek Coder was one of the first serious open-source coding models.
But benchmark performance on pre-release claims should be treated with skepticism. The AI industry has a well-documented history of inflated benchmark claims that do not fully translate to real-world performance.
The Geopolitical Dimension
DeepSeek V4's release is not purely a technology story. It is also a geopolitical one.
US export controls on advanced semiconductors were designed to slow China's AI development. DeepSeek V4 suggests those controls are not working as intended — or at least not in the timeframe policymakers expected. A frontier-class model running natively on Huawei chips demonstrates that China's domestic AI ecosystem has reached a level of maturity that does not require American hardware.
This has implications for the ongoing US-China technology competition:
- If China can develop frontier AI on domestic chips, the leverage of export controls diminishes.
- If Chinese enterprises adopt Huawei chips + DeepSeek models as their standard stack, US companies lose access to a massive market.
- The "AI race" narrative that assumes US dominance may need updating.
None of this means DeepSeek is definitively winning — GPT-5.5 and Claude Opus 4.7 are genuinely strong models with polished ecosystems. But the competitive gap has narrowed significantly, and the assumption that Western AI is permanently ahead is no longer safe.
What to Watch
The next three months will determine whether V4's release is a significant industry event or a defining one:
- Will independent benchmarks confirm the claims? Third-party evaluation of V4-Pro and V4-Flash will either validate or complicate the benchmark claims. Watch for results from LMSYS, Artificial Analysis, and similar independent evaluators.
- Will major open-source fine-tuning projects emerge? Llama spawned hundreds of variants. Will V4 do the same? The MIT license makes it maximally attractive for fine-tuning. Watch for domain-specific V4 models (legal, medical, coding, finance).
- Will Huawei chip availability expand globally? The Ascend 950 is currently focused on the Chinese market. If Huawei begins marketing internationally, the V4 + Ascend stack becomes a viable global alternative.
- How will OpenAI and Anthropic respond on pricing? V4's economics put real pressure on proprietary model pricing. Both companies have pricing power with their existing customers, but new entrants and cost-sensitive developers now have a credible alternative.
Practical Advice for Developers
1. Evaluate V4-Flash for cost-sensitive production applications. At $0.28/M output tokens, it is worth testing for any application where you are currently paying more. The quality/cost ratio is compelling.
2. Consider self-hosting if you have data sovereignty requirements. The MIT license + Huawei chip compatibility means you can build a fully independent AI stack. For enterprises in regulated industries or countries with data residency requirements, this is now viable.
3. Do not switch away from GPT-5.5 or Claude for polished creative work. The benchmark numbers are competitive, but for high-stakes creative tasks, the polished ecosystem and proven reliability of proprietary models still matter. V4 is a strong alternative, not yet a definitive replacement.
4. Watch the fine-tuning ecosystem. In 3-6 months, expect to see specialized V4 variants for specific industries and use cases. The first fine-tuned variants will reveal where V4's capabilities are truly strongest.
DeepSeek V4 is the most significant open-source AI release since Llama 3.1. Whether it changes the industry depends on whether the benchmark performance holds up in practice. The pricing, the MIT license, and the Huawei chip compatibility make it impossible to ignore.
The AI pricing floor just dropped. Anyone charging significantly more now has to explain why.



