Alibaba releases Qwen3.5, its new-generation base model, topping the list of the world's most powerful open-source large models

On Lunar New Year’s Eve, February 16th, Alibaba open-sourced Qwen3.5-Plus, its new-generation large model, with performance comparable to Gemini 3 Pro, topping the list of the world’s most powerful open-source models.

It is reported that Qwen3.5 features a comprehensive overhaul of the underlying model architecture. The released Qwen3.5-Plus version has 397 billion total parameters, of which only 17 billion are active per token. It outperforms trillion-plus-parameter models such as Qwen3-Max, cuts deployment VRAM usage by 60%, and significantly improves inference efficiency, with peak inference throughput up to 19 times higher. The API price for Qwen3.5-Plus starts as low as 0.8 yuan per million tokens, roughly 1/18 the price of Gemini 3 Pro.
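As a practical illustration, a minimal sketch of calling such a model through Alibaba Cloud's OpenAI-compatible endpoint might look like the following; the model id `qwen3.5-plus` is an assumption inferred from the article's naming, not a confirmed identifier.

```python
# Minimal sketch: calling a Qwen model via Alibaba Cloud's
# OpenAI-compatible endpoint. The model id "qwen3.5-plus" is an
# assumption based on the article's naming, not a confirmed identifier.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],  # key issued by Alibaba Cloud Model Studio
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.chat.completions.create(
    model="qwen3.5-plus",  # assumed model id
    messages=[{"role": "user", "content": "Summarize the mixture-of-experts idea in one sentence."}],
)
print(response.choices[0].message.content)
```

Per-token pricing like the 0.8 yuan per million tokens quoted above is metered on the token counts returned in `response.usage`.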

Unlike previous generations of Qwen large language models, Qwen3.5 makes a generational leap from a pure-text model to a natively multimodal model. Qwen3 was pretrained on pure text tokens, while Qwen3.5 is pretrained on a mixture of visual and text tokens, with substantial additions of Chinese, English, multilingual, STEM, and reasoning data. Giving the model “eyes” lets it absorb denser world knowledge and reasoning logic, matching the top-tier performance of the trillion-parameter Qwen3-Max with less than 40% of its parameters. It performs strongly across benchmarks covering reasoning, programming, and agent capabilities. For example, Qwen3.5 scores 87.8 on the MMLU-Pro knowledge-reasoning test, surpassing GPT-5.2; scores 88.4 on the challenging doctoral-level GPQA test, higher than Claude 4.5; and sets a new record of 76.5 on the instruction-following benchmark IFBench. On general agent evaluations such as BFCL-V4 and search-agent benchmarks such as BrowseComp, Qwen3.5 outperforms Gemini 3 Pro and GPT-5.2.

Native multimodal training also brings a leap in Qwen3.5’s visual capabilities: across numerous authoritative evaluations, including multimodal reasoning (MathVision), general visual question answering (RealWorldQA), text recognition and document understanding (CC-OCR), spatial intelligence (RefCOCO-avg), and video understanding (MLVU), Qwen3.5 achieves the best results. On tasks such as academic problem-solving, task planning, and physical-space reasoning, Qwen3.5 outperforms the specialized Qwen3-VL model, with much stronger spatial localization and image reasoning and more detailed, precise analysis. For video understanding, Qwen3.5 accepts direct input of videos up to 2 hours long (1 million tokens of context), making it suitable for analyzing and summarizing long video content. Qwen3.5 also natively integrates visual understanding with coding: combined with image search and generation tools, it can turn a hand-drawn interface sketch directly into usable front-end code, or locate and fix a UI issue from a single screenshot, making visual programming a genuine productivity tool.
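Since the article highlights sketch-to-code, here is a minimal sketch of passing an image alongside a text prompt through the same OpenAI-compatible chat API; the message format follows the standard OpenAI vision convention, and the model id and image URL are placeholders, not confirmed by the article.

```python
# Minimal sketch: image + text input via the OpenAI-compatible chat API.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.chat.completions.create(
    model="qwen3.5-plus",  # assumed model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/ui-sketch.png"}},  # placeholder URL
            {"type": "text",
             "text": "Turn this hand-drawn UI sketch into usable HTML/CSS."},
        ],
    }],
)
print(response.choices[0].message.content)
```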

Qwen3.5’s native multimodal training runs efficiently on Alibaba Cloud’s AI infrastructure. Through a series of technical innovations, Qwen3.5’s training throughput on mixed text, image, and video data is nearly on par with that of pure-text base models, greatly lowering the barrier to native multimodal training. In addition, a carefully designed mix of FP8 and FP32 precision cuts training memory usage by about 50% when scaling to hundreds of trillions of tokens, with a 10% speedup, further reducing training cost and improving efficiency.
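The article does not name the training stack. Purely as an illustrative sketch of the FP8 technique it describes, NVIDIA Transformer Engine runs matmuls in FP8 behind an autocast context while the loss and optimizer state stay in higher precision:

```python
# Illustrative sketch of FP8 mixed-precision training with NVIDIA
# Transformer Engine. The article does not say which stack the Qwen team
# used; this only demonstrates the general FP8/FP32 split it describes.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.HYBRID)

layer = te.Linear(4096, 4096, bias=True).cuda()            # FP8-capable linear layer
optimizer = torch.optim.AdamW(layer.parameters(), lr=1e-4)  # master weights/state in FP32

x = torch.randn(8, 4096, device="cuda")
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    out = layer(x)                    # matmuls execute in FP8
loss = out.float().pow(2).mean()      # loss computed in FP32 for stability
loss.backward()
optimizer.step()
```

Keeping activations and weights in FP8 roughly halves their memory footprint versus FP16/BF16, which is the mechanism behind the ~50% memory saving claimed above.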

Qwen3.5 also marks a breakthrough in moving from agent frameworks to agent applications. It can autonomously operate smartphones and computers to complete everyday tasks, supports more mainstream apps and commands on mobile devices, and handles more complex multi-step operations on PCs, such as cross-application data management and automated workflows, significantly improving operational efficiency. The Qwen team has built an extensible asynchronous reinforcement learning framework for agents that accelerates end-to-end training by 3 to 5 times and supports pluggable agents at million-scale, as sketched below.
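The article gives no implementation details for that framework. As a loose conceptual sketch only, an asynchronous RL loop decouples rollout collection from learner updates, so slow environments never block training:

```python
# Conceptual sketch of asynchronous agent RL: actors collect rollouts
# concurrently while a learner consumes them. Purely illustrative; the
# Qwen framework's actual design is not described in the article.
import asyncio
import random

queue: asyncio.Queue = asyncio.Queue(maxsize=64)

async def actor(actor_id: int) -> None:
    """Simulate agent episodes and enqueue their trajectories."""
    while True:
        await asyncio.sleep(random.uniform(0.01, 0.05))  # stand-in for env steps
        await queue.put({"actor": actor_id, "reward": random.random()})

async def learner(num_updates: int) -> None:
    """Consume trajectories as they arrive and apply policy updates."""
    for step in range(num_updates):
        batch = [await queue.get() for _ in range(8)]     # wait for a mini-batch
        avg = sum(t["reward"] for t in batch) / len(batch)
        print(f"update {step}: avg reward {avg:.3f}")     # stand-in for a gradient step

async def main() -> None:
    actors = [asyncio.create_task(actor(i)) for i in range(32)]
    await learner(num_updates=5)
    for task in actors:
        task.cancel()

asyncio.run(main())
```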

It is reported that the Qwen3.5-Plus model is already live in the Qwen app and its desktop client. Developers can download the new models from the ModelScope community and Hugging Face, or access the API service directly via Alibaba Cloud’s Bailian platform. Alibaba will continue to open-source Qwen3.5 models of different sizes and capabilities, and the more powerful flagship model, Qwen3.5-Max, will also be released soon.
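For developers pulling the open weights, a minimal sketch of loading a checkpoint with Hugging Face transformers could look like the following; the repo id `Qwen/Qwen3.5-Plus` and the causal-LM loading path are assumptions, since the article does not specify the release layout or the multimodal model class.

```python
# Minimal sketch of loading an open-sourced Qwen3.5 checkpoint with
# Hugging Face transformers. The repo id "Qwen/Qwen3.5-Plus" is an
# assumption based on the Qwen org's naming; check the actual release.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3.5-Plus"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Hello, Qwen!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```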
