XAI launches Grok-4-fast: a unified reasoning and non-controversial model with 2M to toction and trained end-to-end training using reinforced learning (RL) through tools
XAI has introduced grok-4-fastThis is the successor to Grok-4’s cost optimization, fusing “reasoning” and “non-conditioning” behaviors into a set of weights that can be controlled by system prompts. This model targets high-throughput search, coding and question-and-answer 2M token context window and the native RL that decides when to browse the network, execute code or call tools.
Architectural Notes
Previous Grok released long chain “inference” and brief “unreasonable” responses across separate models. grok-4-fast Uniform weight space Reduce end-to-end latency and tokens with system prompts, which are related to real-time applications (search, assistive proxy, and interactive encoding), where switching models can all hurt latency and cost.
Search and proxy usage
Grok-4-fast end-to-end training Enhanced learning of tools And display the benefits of search-centric proxy benchmarks: Viewed 44.9%,,,,, SimpleQA 95.0%,,,,, REKA study 66.0%plus the Chinese variant scores higher (e.g. BrowseComp-ZH 51.2%). Xai also quotes private combat tests on Lmarena grok-4-fast-search
(codenamed “Menlo”) ranked first on the search stage, 1163 ELO rankingsa text variant (codenamed “tahoe”) is located #8 in the Text Arenaroughly grok-4-0709
.
Performance and efficiency delta
Grok-4 Quick Posts under Internal and Public Benchmarks Frontier score In case of cutting token use. xai report via @1 result 92.0% (Aime 2025, no tools),,,,, 93.3% (HMMT 2025, no tools),,,,, 85.7% (GPQA Diamond)and 80.0% (livecodebench Jan – May)close to or match grok-4, but use ~4% less “think” tokens average. The company regards it as “smart density”, claiming Same benchmark performance as Grok-4 reduces price by ~98% When the token count and new pricing are merged.
Deployment and Price
The model is Usually available for all users In Groke Quickly and car Mode across network and mobile devices; Auto will select Grok-4-fast for difficult queries to improve latency without losing quality, and for the first time –Free users Access Xai’s latest model layer. For developers, XAI reveals Two skus–grok-4-fast-reasoning
and grok-4-fast-non-reasoning
– and 2M context. Pricing (xai api) is $0.20/1M input token (, $0.40/1 million input token (≥128K),,,,, $0.50/1m output token (,, $1.00/1m output token (≥128K)and $0.05/1M cache input token.

5 technical points:
- Unified Model + 2M Context. Grok-4-fast uses a single weight space for “reasoning” and “non-disputed” and enters quickly, and has a window of 2,000,000 tokens on both SKUSs.
- Pricing for pricing. API pricing begins with $0.20/m input,,,,, $0.50/m outputinput with cache $ 0.05/m The higher interest rates are only over 128K context.
- Efficiency requirements. XAI Report ~40% of “thinking” tokens are missing With comparable accuracy with Grok-4, ~ 98% lower price matching Grok-4 performance On border benchmarks.
- Benchmark configuration file. Report via @1: Aime-2025 92.0%,,,,, HMMT-2025 93.3%,,,,, GPQA-Diamond 85.7%,,,,, Livecodebench (Jan – May) 80.0%.
- Proxy/Search for use. Use tools to train with RL; targeting for browsing/search workflows that contain recorded search proxy metrics and real-time search bills in documents.


Summary
GROK-4-FAST package GROK-4 level features are a timely model with 2m token windows, tools use RL, and adjust prices for high-throughput search and proxy workloads. Early public signals (LMARENA #1 in search, competitive text placement) were used with Xai’s claim of similar accuracy using about 40% of the “thinking” tokens, which translated into reduced latency and production costs.
Check Technical details. Check out ours anytime Tutorials, codes and notebooks for github pages. Also, please stay tuned for us twitter And don’t forget to join us 100K+ ml reddit And subscribe Our newsletter.

Asif Razzaq is CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, ASIF is committed to harnessing the potential of artificial intelligence to achieve social benefits. His recent effort is to launch Marktechpost, an artificial intelligence media platform that has an in-depth coverage of machine learning and deep learning news that can sound both technically, both through technical voices and be understood by a wide audience. The platform has over 2 million views per month, demonstrating its popularity among its audience.
🔥[Recommended Read] NVIDIA AI Open Source VIPE (Video Pose Engine): A powerful and universal 3D video annotation tool for spatial AI