According to 1M AI News monitoring, the AI inference infrastructure company Fireworks AI has released a preview of Fireworks Training, expanding from a pure inference platform into an end-to-end platform for training and deployment. Fireworks AI was founded by Lin Qiao, a former Meta engineer who helped build PyTorch. The company is currently valued at $4.0 billion and processes 150 trillion tokens per day.
The platform offers three tiers:
Full-parameter training scales from single-node Qwen3 8B up to the trillion-parameter Kimi K2.5 on 64 Nvidia B200 GPUs.
Fireworks AI's production inference customers Cursor (an AI programming tool), Vercel, and Genspark have all completed frontier reinforcement learning training on the platform. Vercel trained an automatic error-correction model for its code generation product v0, reaching a 93% error-free code generation rate; its CTO Malte Ubl said Sonnet 3.5 reaches only 62% by comparison, and end-to-end latency improved 40x over the closed-source models used previously. Genspark applied reinforcement learning fine-tuning to the open-source trillion-parameter model Kimi K2 to build a deep research agent, increasing tool-call volume by 33% while cutting costs by 50%. Cursor ran distributed reinforcement learning training of Composer 2 (currently ranked #1 on CursorBench) across three to four clusters worldwide, with training and production inference sharing the same GPU pool.
Fireworks AI emphasizes numerical consistency between training and inference as its core technical differentiator. MoE (mixture-of-experts) models are numerically more fragile than dense models: even small changes in hidden states can flip expert routing, and the error then amplifies in a cascade. Fireworks publishes the KL divergence between training and inference for all supported models; every value is below 0.01.
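The two claims above can be illustrated with a small numerical sketch. This is not Fireworks' actual measurement code; the logit values, drift scale, and vocabulary size are hypothetical. It computes the KL divergence between a "training engine" and an "inference engine" next-token distribution that differ only by tiny numeric drift, then shows how the same magnitude of drift can flip a top-1 expert-routing decision when two router logits are nearly tied.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a logit vector."""
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions over the same vocabulary."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

# Hypothetical logits for the same token position from a training stack and
# an inference stack; the two differ only by small floating-point drift.
rng = np.random.default_rng(0)
train_logits = rng.normal(size=32)
infer_logits = train_logits + rng.normal(scale=1e-3, size=32)

kl = kl_divergence(softmax(train_logits), softmax(infer_logits))
print(f"KL(train || infer) = {kl:.2e}")  # tiny drift -> KL far below 0.01

# Why MoE is fragile: when two router logits are nearly tied, drift of the
# same magnitude can change which expert is selected.
router_logits = np.array([1.000, 0.999, -2.0, -3.0])  # experts 0 and 1 nearly tied
top1_before = int(np.argmax(router_logits))
top1_after = int(np.argmax(router_logits + np.array([0.0, 0.002, 0.0, 0.0])))
print(top1_before, top1_after)  # the 0.002 perturbation flips the routed expert
```

Once the routed expert flips, the token passes through a different feed-forward network entirely, so a sub-0.01-scale logit difference can become a qualitatively different hidden state downstream, which is why a published per-model KL bound is a meaningful consistency guarantee.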