Runnable on a consumer-grade GPU: Shanghai Artificial Intelligence Laboratory open-sources the Shusheng·Puyu (InternLM) 20B model
On September 20, Shanghai Artificial Intelligence Laboratory and partner institutions released InternLM-20B, the 20-billion-parameter version of the Shusheng·Puyu large model. It is open source, free for commercial use, and made its debut on Alibaba Cloud's ModelScope community.
The Shusheng·Puyu large language model was jointly developed by Shanghai Artificial Intelligence Laboratory and multiple institutions. The hundred-billion-parameter InternLM language model was first released in June of this year and has since undergone multiple rounds of upgrades. In July, Shanghai Artificial Intelligence Laboratory open-sourced InternLM-7B, a lightweight 7-billion-parameter version of Shusheng·Puyu, and became the first in the industry to open-source a full-chain tool system spanning data, pre-training, fine-tuning, deployment, and evaluation. The newly released InternLM-20B is a mid-scale model; compared with InternLM-7B, its comprehension, reasoning, mathematical, and programming abilities have all improved significantly.
Compared with the 7B and 13B models previously open-sourced in China, a 20B-scale model offers stronger overall capability, with particularly notable complex reasoning and reflection abilities, providing more powerful support for real-world applications. At the same time, a 20B-scale model can run inference on a single GPU, and after low-bit quantization it can run on a single consumer-grade GPU, making it more convenient to deploy in practice.
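As a rough illustration of the single-GPU claim above, the sketch below estimates the memory needed just to store the weights of a 20-billion-parameter model at different precisions. The figures are back-of-the-envelope assumptions, not official numbers: they cover weights only and ignore activations and the KV cache, which also consume memory during serving.

```python
# Back-of-the-envelope VRAM estimate for a 20B-parameter model,
# showing why low-bit quantization enables single consumer-GPU inference.
# Weights-only arithmetic; real serving needs extra memory for
# activations and the KV cache.

def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GiB at a given precision."""
    return num_params * bits_per_param / 8 / 1024**3

N = 20e9  # 20 billion parameters

fp16 = weight_memory_gib(N, 16)  # ~37 GiB: beyond typical consumer cards
int4 = weight_memory_gib(N, 4)   # ~9 GiB: fits a 24 GiB consumer GPU

print(f"fp16 weights: {fp16:.1f} GiB")
print(f"int4 weights: {int4:.1f} GiB")
```

At 16-bit precision the weights alone exceed the memory of typical consumer GPUs, while 4-bit quantization brings them within reach of a 24 GiB card, which matches the article's claim that quantization enables consumer-grade deployment.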
Compared with earlier open-source models, InternLM-20B matches the evaluation scores of Llama2-70B with less than one-third the parameter count. It also supports dozens of plugins and tens of thousands of API functions, and has the ability to interpret code and reflect on its own outputs. In addition, during the development and training of InternLM-20B, the research team performed two-stage value alignment based on SFT and RLHF, and significantly improved its safety through adversarial training with expert red teams.