Tether Releases Cross-Platform BitNet LoRA Framework; Billion-Parameter Models Can Be Fine-Tuned on Consumer-Grade Devices

Techub News reports that Tether has launched a cross-platform BitNet LoRA fine-tuning framework within QVAC Fabric, optimized for training and inference of Microsoft BitNet (1-bit LLM) models. The framework significantly reduces compute and memory requirements, enabling billion-parameter models to be trained and fine-tuned on laptops, consumer GPUs, and smartphones. This marks the first time BitNet models have been fine-tuned on mobile GPUs (including Adreno, Mali, and Apple Bionic). Tests show that a 125M-parameter model can be fine-tuned in about 10 minutes and a 1B model in roughly an hour, with fine-tuning on mobile devices scaling up to 13B parameters. The framework also supports heterogeneous hardware such as Intel, AMD, and Apple Silicon, enabling 1-bit LLM LoRA fine-tuning on non-NVIDIA devices for the first time. In terms of performance, BitNet models achieve 2 to 11 times faster inference on mobile GPUs than on CPUs, while reducing VRAM usage by up to approximately 77.8% compared to traditional 16-bit models. Tether states that this technology could break the dependence on high-end compute and cloud infrastructure, promote decentralized and localized AI training, and lay the foundation for new applications such as federated learning.
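To illustrate the general idea behind LoRA fine-tuning of a 1-bit model, the sketch below shows trainable low-rank adapters attached to a frozen BitNet-style ternary linear layer. This is a minimal illustrative example, not Tether's QVAC Fabric implementation; class names such as BitLinear and LoRALinear, and all hyperparameters, are assumptions for the sketch.

```python
# Minimal sketch: LoRA adapters over a frozen ternary (BitNet-style) layer.
# Only the small low-rank matrices are trained, which is what keeps memory
# and compute low enough for laptops and phones.
import torch
import torch.nn as nn


class BitLinear(nn.Module):
    """Frozen linear layer whose weights are quantized to {-1, 0, +1}."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        w = torch.randn(out_features, in_features)
        scale = w.abs().mean()                             # per-tensor scale
        w_ternary = torch.clamp(torch.round(w / scale), -1, 1)
        # Buffers, not parameters: the ternary weights receive no gradients.
        self.register_buffer("weight", w_ternary)
        self.register_buffer("scale", scale)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return nn.functional.linear(x, self.weight * self.scale)


class LoRALinear(nn.Module):
    """Adds a trainable low-rank update B @ A to a frozen BitLinear."""

    def __init__(self, base: BitLinear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        out_f, in_f = base.weight.shape
        self.base = base
        self.lora_a = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_f, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen 1-bit path plus a small trainable low-rank correction.
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling


# Only the LoRA matrices are handed to the optimizer.
layer = LoRALinear(BitLinear(512, 512), rank=8)
trainable = [p for p in layer.parameters() if p.requires_grad]
opt = torch.optim.AdamW(trainable, lr=1e-4)

x = torch.randn(4, 512)
loss = layer(x).pow(2).mean()        # dummy objective for the sketch
loss.backward()
opt.step()
```

Because the frozen base weights are ternary and only the rank-8 adapters carry gradients, the optimizer state and activations needed for fine-tuning stay a small fraction of what full 16-bit training would require, consistent with the memory savings described above.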
