In the wave of artificial intelligence, large models have become the cornerstone of many advanced applications thanks to their powerful learning and reasoning abilities. However, these models often require expensive hardware, especially high-performance GPUs, which puts them out of reach for many researchers and developers. JittorLLMs, launched by the Chinese company Fitten, is changing this situation and putting local deployment of large models within reach.
Project website: https://github.com/Jittor/JittorLLMs
- Low cost, high performance:
JittorLLMs is an inference library designed specifically for large models, and its core advantage is that it dramatically lowers hardware requirements. Compared to traditional frameworks, JittorLLMs cuts hardware demands by 80%, allowing large models to run in as little as 2GB of memory, even without a dedicated graphics card. This means anyone can deploy large models locally on an ordinary machine, without an expensive hardware investment.
- Extensive support and high portability:
JittorLLMs supports a variety of large models, including ChatGLM, PengCheng PanGu, BlinkDL's ChatRWKV, Meta's LLaMA/LLaMA2, MOSS, and more, with support for additional domestic large models planned. Through JTorch, the Jittor-based PyTorch-compatible interface, users can migrate models and adapt them to various heterogeneous computing devices and environments without modifying any code.
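The general mechanism behind such a drop-in compatibility layer can be illustrated in plain Python. The sketch below is hypothetical and is not JTorch's actual implementation: it registers a stand-in module under the name `torch` in `sys.modules`, so user code that does `import torch` transparently picks up the replacement backend with no code changes.

```python
import sys
import types

# Hypothetical illustration of a drop-in compatibility shim (not the real
# JTorch code): build a stand-in module and register it under the name
# "torch", so existing user code needs no modification.
backend = types.ModuleType("torch")
backend.add = lambda a, b: a + b  # stand-in for a real tensor operation

sys.modules["torch"] = backend

import torch  # resolves to the stand-in backend, not the original library

print(torch.add(2, 3))  # → 5
```

A real compatibility layer must of course mirror the full API surface and tensor semantics, but the module-registration mechanism for "no user code changes" is the same.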
- Dynamic swapping technology, reducing development difficulty:
With the dynamic swapping technology developed by the Jittor team, Jittor is the first framework to support automatic swapping of dynamic-graph variables. Without any code changes by the user, tensor data is automatically swapped between GPU memory, system memory, and hard disk, greatly reducing the difficulty of developing large models.
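To make the idea concrete, here is a toy sketch of transparent swapping, under the assumption (mine, not the source's) that an LRU policy decides what to spill. It is not Jittor's implementation: a small "hot" tier stands in for GPU memory, and least-recently-used tensors are spilled to disk and reloaded on access without the caller noticing.

```python
import os
import pickle
import tempfile
from collections import OrderedDict

class SwappingCache:
    """Toy LRU spill-to-disk cache illustrating automatic tensor swapping
    (a conceptual sketch, not Jittor's real mechanism)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.hot = OrderedDict()            # stand-in for the GPU/RAM tier
        self.spill_dir = tempfile.mkdtemp() # stand-in for the disk tier

    def _spill_path(self, name):
        return os.path.join(self.spill_dir, name + ".bin")

    def put(self, name, tensor):
        self.hot[name] = tensor
        self.hot.move_to_end(name)
        while len(self.hot) > self.capacity:
            # Evict the least recently used tensor to disk.
            victim, data = self.hot.popitem(last=False)
            with open(self._spill_path(victim), "wb") as f:
                pickle.dump(data, f)

    def get(self, name):
        if name not in self.hot:
            # Transparently reload a spilled tensor; the caller never
            # knows it had been moved to disk.
            with open(self._spill_path(name), "rb") as f:
                self.put(name, pickle.load(f))
        self.hot.move_to_end(name)
        return self.hot[name]

cache = SwappingCache(capacity=2)
cache.put("w1", [1.0] * 4)
cache.put("w2", [2.0] * 4)
cache.put("w3", [3.0] * 4)   # "w1" is spilled to disk here
print(cache.get("w1"))       # reloaded transparently → [1.0, 1.0, 1.0, 1.0]
```

The value for users is exactly this transparency: model code reads and writes tensors normally, while the framework decides which tier each one lives in.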
- Fast loading and computational performance:
Through zero-copy technology and automatic compilation optimization of meta-operators, the Jittor framework reduces large-model loading overhead by 40% and improves computational performance by more than 20%. With sufficient GPU memory, JittorLLMs outperforms similar frameworks; with insufficient GPU memory, or no GPU at all, it can still run, albeit more slowly.
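The core idea behind zero-copy loading can be sketched with standard memory mapping. This is a conceptual example of the general technique, not Jittor's actual loader: instead of `read()`-ing an entire weight file into a buffer (an extra copy of every byte), `mmap` maps the file's pages directly into the process address space, and pages are faulted in only when touched.

```python
import mmap
import os
import struct
import tempfile

# Write a tiny stand-in "weight file" of four float32 values.
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
weights = [0.5, 1.5, 2.5, 3.5]
with open(path, "wb") as f:
    f.write(struct.pack("<4f", *weights))

# Zero-copy-style load: map the file instead of reading it into a buffer.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Decode directly from the mapped pages; no intermediate full-file copy.
    loaded = struct.unpack_from("<4f", mm, 0)
    mm.close()

print(list(loaded))  # → [0.5, 1.5, 2.5, 3.5]
```

For multi-gigabyte checkpoints, avoiding that intermediate copy is precisely where large loading-time savings come from.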
JittorLLMs not only provides new possibilities for the deployment of large models but also opens up new paths for the popularization and application of artificial intelligence. With the continuous advancement of technology, we have reason to believe that artificial intelligence will become more inclusive and deeply integrated into our lives in the future.