
MetaX GPU servers, shown here with a digital brain motif, were used to train SpikingBrain 1.0, China’s brain-inspired AI model that runs without NVIDIA hardware. Image Source: ChatGPT-5
China’s New SpikingBrain AI Models Deliver Speed and Efficiency on Domestic Chips
Key Takeaways: Brain-Inspired AI Breakthrough
SpikingBrain 1.0 uses spiking neurons that fire selectively, mimicking how the human brain processes information.
Researchers trained 7B- and 76B-parameter models with less than 2% of the data used by traditional AI systems.
The 7B model produced its first output token for a 4-million-token prompt 100× faster than standard models.
Both models were trained and tested entirely on MetaX GPUs, showing China’s ability to build advanced AI without NVIDIA.
A public demo called “Shunxi” allows users to try the SpikingBrain model online.
SpikingBrain: A Brain-Inspired Alternative to Transformers
A team of researchers in Beijing has unveiled SpikingBrain 1.0, an AI system designed to more closely imitate how the human brain works. Instead of keeping all neurons “switched on,” as today’s large language models (LLMs) do, SpikingBrain’s neurons activate only when needed.
This approach, combined with hybrid attention mechanisms and Mixture-of-Experts (MoE) scaling, makes the system both faster and more efficient. The researchers argue it represents a new direction for large language models at a time when costs, data, and hardware availability are major concerns.
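To make the “fire only when needed” idea concrete, here is a minimal Python sketch of a generic integrate-and-fire style activation, where a neuron emits a spike only if its accumulated input crosses a threshold. It illustrates event-driven spiking in general; the threshold value and subtraction-based reset are assumptions for this sketch, not SpikingBrain’s published neuron model.

```python
import numpy as np

def spiking_step(membrane_potential, threshold=1.0):
    """Event-driven activation: a neuron emits a spike (1.0) only when its
    accumulated input crosses the threshold; otherwise it stays silent (0.0).
    Generic integrate-and-fire sketch, not SpikingBrain's exact scheme."""
    spikes = (membrane_potential >= threshold).astype(float)
    # Soft reset: subtract the threshold from neurons that fired.
    membrane_potential = membrane_potential - spikes * threshold
    return spikes, membrane_potential

# Toy example: only two of five neurons cross the threshold, so only two "fire".
potential = np.array([0.2, 1.3, 0.7, 2.1, 0.5])
spikes, potential = spiking_step(potential)
print(spikes)     # [0. 1. 0. 1. 0.]
print(potential)  # [0.2 0.3 0.7 1.1 0.5]
```

Because silent neurons contribute nothing downstream, work in an event-driven system scales with the number of spikes rather than the total number of neurons, which is the intuition behind the efficiency claims.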
Training With Less Data, Running at Higher Speeds
SpikingBrain was released in two versions: the smaller SpikingBrain-7B, designed for efficiency, and the larger SpikingBrain-76B, built for greater accuracy.
Both models were trained on about 150 billion tokens — less than 2% of the data typically used for state-of-the-art language models. Despite this lighter training load, they matched or outperformed many traditional systems in evaluations.
The SpikingBrain-7B emphasizes efficiency. By activating fewer parameters per token, it is especially well-suited to tasks with very long inputs. In tests, it produced its first output token for a 4-million-token prompt more than 100× faster than standard transformer models while maintaining stability for weeks. On benchmarks such as MMLU, CMMLU, ARC-C, HellaSwag, and C-Eval, the 7B model recovered about 90% of the performance of baseline transformer systems while using only a fraction of the training data.
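The “100× faster” figure refers to time-to-first-token: how long a user waits before the model starts responding to a very long prompt (see “By the Numbers” below). A minimal way to measure it for any streaming model is sketched here; `generate_stream` is a hypothetical callable that yields tokens as they are produced, not an actual SpikingBrain API.

```python
import time

def time_to_first_token(generate_stream, prompt):
    """Measure latency until the first output token arrives.
    `generate_stream` is a hypothetical function that yields tokens one by one."""
    start = time.perf_counter()
    for _first_token in generate_stream(prompt):
        return time.perf_counter() - start  # stop timing at the first token
    return None  # the model produced no output

# Usage (with any streaming generator):
# latency = time_to_first_token(my_model.stream, very_long_prompt)
```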
The larger SpikingBrain-76B uses a hybrid Mixture-of-Experts architecture, which adds complexity but consistently improves accuracy. It closed much of the gap with top transformer models and, in some cases, matched or surpassed widely used systems such as Llama2-70B, Mixtral-8×7B, and Gemma2-27B.
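For readers unfamiliar with Mixture-of-Experts, the sketch below shows generic top-k routing: a router scores each token against a set of expert networks and activates only the k highest-scoring ones, so most parameters stay idle for any given token. The dimensions and router here are made up for illustration and do not reflect SpikingBrain-76B’s actual architecture.

```python
import numpy as np

def topk_moe_route(token_embedding, router_weights, k=2):
    """Generic top-k MoE routing: score the token against every expert,
    keep only the k best, and normalize their gate weights with a softmax.
    Illustrative only, not SpikingBrain's actual router."""
    scores = router_weights @ token_embedding       # one score per expert
    top_experts = np.argsort(scores)[-k:]           # indices of the k chosen experts
    gate = np.exp(scores[top_experts] - scores[top_experts].max())
    gate = gate / gate.sum()                        # softmax over the selected experts
    return top_experts, gate

# Toy example: 8 experts, a 16-dim token embedding, only 2 experts are activated.
rng = np.random.default_rng(0)
experts, gates = topk_moe_route(rng.normal(size=16), rng.normal(size=(8, 16)), k=2)
print(experts, gates)  # two expert indices and their mixing weights
```

In a full model, the outputs of the selected experts would be combined using these gate weights, while the unchosen experts do no work for that token.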
By the Numbers: SpikingBrain’s Performance and Efficiency
150B tokens — training data used for both models, under 2% of typical state-of-the-art LLM datasets.
100× faster — time-to-first-token speedup on a 4M-token prompt with the SpikingBrain-7B.
90% benchmark recovery — the 7B model reached ~90% of baseline transformer performance on MMLU, CMMLU, ARC-C, HellaSwag, and C-Eval.
Hundreds of MetaX C550 GPUs — hardware used for training, demonstrating large-scale stability outside of the NVIDIA ecosystem.
Hardware Independence: Running on MetaX Chips
One of the most notable achievements is that researchers trained SpikingBrain entirely on hundreds of China’s MetaX C550 GPUs, not on NVIDIA hardware. To achieve this, they customized operator libraries, parallelism strategies, and memory optimizations specifically for MetaX. The team reported stable large-scale training over several weeks, underscoring that advanced large language models can now be built and deployed outside of the NVIDIA ecosystem.
This independence is especially significant given ongoing concerns about chip supply chains and national technology independence.
Public Access: Shunxi Demo
To showcase their progress, the researchers have released the smaller SpikingBrain-7B model as open source through GitHub. This gives developers and researchers the opportunity to study the system and build on its efficiency-focused design.
To highlight the capabilities of the larger system, the team also introduced Shunxi, a public demo of the flagship SpikingBrain-76B. Unlike the 7B release, Shunxi is not open-sourced, but it offers an online portal where users can experience the model’s performance first-hand. The research paper notes the demo’s availability but does not provide a permanent link.
These releases highlight not only SpikingBrain’s performance but also its ability to run entirely on domestic infrastructure.
Q&A: Understanding SpikingBrain
Q: What makes SpikingBrain different from other AI models?
A: It uses spiking neurons that activate only when needed, making it faster and more efficient, similar to how the human brain works.
Q: How much training data did it use?
A: About 150 billion tokens, less than 2% of what most large models require.
Q: How fast is it compared to traditional AI systems?
A: The smaller 7B model processed a 4M-token prompt 100× faster than standard systems.
Q: What hardware does it run on?
A: It was trained and tested entirely on China’s MetaX GPUs, without using NVIDIA chips.
Q: Can the public try SpikingBrain?
A: Yes. The smaller SpikingBrain-7B is available open-source on GitHub. For the larger SpikingBrain-76B, the team launched a public demo called Shunxi. The research paper confirms its availability but does not provide a permanent link.
What This Means: Efficiency, Independence, and Competition
The release of SpikingBrain signals three important shifts:
Efficiency over scale: Brain-inspired designs show that performance gains don’t always require bigger datasets or higher costs.
Hardware diversification: By proving MetaX GPUs can power advanced models, China is reducing reliance on Western chipmakers.
Global competition: As different countries build their own AI systems and hardware, the landscape is becoming more diverse and competitive.
These innovations promise faster, cheaper, and more energy-efficient systems — reshaping how and where large language models are built.
Editor’s Note: This article was created by Alicia Shapiro, CMO of AiNews.com, with writing, image, and idea-generation support from ChatGPT, an AI assistant. However, the final perspective and editorial choices are solely Alicia Shapiro’s. Special thanks to ChatGPT for assistance with research and editorial support in crafting this article.