Domestic chip prices surge! Will Hong Kong semiconductor stocks continue to rise?
On April 8, Zhipu officially launched and open-sourced its new flagship model, GLM-5.1. As the world's strongest open-source model currently available, GLM-5.1 has made significant breakthroughs in handling long-horizon tasks. Biren Technology (06082.HK) completed the adaptation of its BIREN™ 166 series products on the day of the model’s release, becoming one of the first domestic GPU manufacturers to adapt GLM-5.1.

GLM-5.1 runs inference on the BR166 chip
Biren Technology carried out full-stack deep optimization targeting GLM-5.1's core features, including its 744B-parameter MoE architecture, 200K long context, and DSA sparse attention. Leveraging the high compute of its self-developed chips and the operator-level synergy of the BIRENSUPA™ software stack, it achieved precise adaptation on the two mainstream open-source inference frameworks, vLLM and SGLang, covering the model's 40B active parameters and Interleave Thinking cross-inference mode, and enabling lossless inference over the full 200K context. Additional optimizations, including MoE scheduling, sparse computation, tensor parallelism, context parallelism, and MTP, deliver low-latency, high-throughput inference.
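The "40B active parameters out of 744B" figure above comes from MoE routing: for each token, a small router picks only a few experts to run, so most of the model's weights stay idle. The sketch below illustrates the generic top-k routing idea in plain Python; all names and numbers are illustrative assumptions, not Zhipu's or Biren's actual implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw router scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_scores, top_k=2):
    """Return the indices and normalized weights of the top_k experts
    selected for one token, given its router scores over all experts.
    Only these experts' parameters are activated for this token."""
    probs = softmax(router_scores)
    # Pick the k most probable experts for this token.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Renormalize so the selected experts' weights sum to 1.
    norm = sum(probs[i] for i in top)
    return top, [probs[i] / norm for i in top]

# One token's router scores over 8 hypothetical experts: only 2 experts run.
experts, weights = route_token([0.1, 2.3, -0.5, 1.7, 0.0, 0.9, -1.2, 0.4], top_k=2)
```

Here the token is dispatched to experts 1 and 3, the two highest-scoring of eight; in a production MoE layer this routing is what the "MoE scheduling" optimization must load-balance across devices.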
The BIREN™ 166 series is an all-in-one chip for large-scale data-center computing, capable of meeting the demands of trillion-parameter models, and is widely used in large language models, multimodal AIGC, image and speech processing, and other fields. With its technical maturity and out-of-the-box usability, the BIREN™ 166 series significantly lowers the barrier for developers to deploy and apply models. Backed by full-stack capabilities, it supports the large-scale deployment of domestic SOTA (state-of-the-art) models, promoting the democratization and practical application of AI.
Official introduction of GLM-5.1
Compared with GLM-5, GLM-5.1's overall capabilities have been comprehensively enhanced, with significant breakthroughs in long-horizon task handling. Unlike most existing models that focus on minute-level interactions, GLM-5.1 can work continuously and autonomously for up to eight hours in a single task, delivering complete engineering-grade results through self-planning, execution, and iterative evolution.
I. Comprehensive SOTA Performance
GLM-5.1 is Zhipu's most intelligent flagship model to date and the strongest open-source model globally. The following chart shows the average results from three of the industry's most representative code evaluation benchmarks: SWE-bench Pro, which measures a model's professional software development capabilities; Terminal-Bench 2.0, which evaluates problem-solving via command-line operations like an engineer; and NL2Repo, which evaluates building complete code repositories from scratch. Across these three evaluations, GLM-5.1 ranks third globally, first among domestic models, and first among open-source models.

In the SWE-bench Pro benchmark, which most closely simulates real-world software development, GLM-5.1 set a new global record, surpassing GPT-5.4 and Claude Opus 4.6.

II. While you sleep for eight hours, the model works for eight hours
Over the past two years, the industry has used benchmarks to measure how intelligent a model is. The GLM team believes the next phase of evaluation should focus on "how long a model can work," specifically its performance on long-horizon tasks. Under the same assessment criteria on the METR leaderboard, GLM-5.1 is the only open-source model capable of sustaining eight hours of continuous work and, aside from Claude Opus 4.6, one of the few models globally with this capability.
The rapid adaptation of domestic computing power is the core support for deploying large models and a key engine driving the rise of the domestic AI industry. Biren Technology has now developed the capability to co-evolve with cutting-edge global algorithms and has become one of the very few domestic computing-power providers fully compatible with state-of-the-art large models. It will continue to strengthen cooperation with domestic large-model vendors so that developers and customers can embrace the world's most advanced model capabilities at the earliest opportunity, propelling domestic large models from "technological leadership" to "application leadership" and fostering an open, prosperous, and independently controllable artificial intelligence industry ecosystem.
Risk Disclaimer: The above content represents only the author's views. It does not represent any position or investment advice of Futu. Futu makes no representation or warranty.