Llama 2 from Meta
DukeRem
Meta releases the Llama 2 family of large language models, ranging from 7B to 70B parameters. You can read the original paper by clicking here.
Meta has released a new family of large language models named Llama 2, ranging in scale from 7 billion to 70 billion parameters. The Llama 2 models show significant improvements over the previous Llama 1 models: they are trained on more data, support a longer context length of up to 4k tokens, and deliver faster inference for the 70B model thanks to grouped-query attention.
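The idea behind grouped-query attention is that several query heads share a single key/value head, shrinking the KV cache and speeding up inference. The following is a minimal NumPy sketch of the technique (shapes and helper names are illustrative, not Meta's implementation):

```python
import numpy as np

def grouped_query_attention(q, k, v, n_kv_heads):
    """Sketch of grouped-query attention.

    q: (n_q_heads, seq, d) query tensor
    k, v: (n_kv_heads, seq, d) key/value tensors, with n_kv_heads < n_q_heads
    """
    n_q_heads = q.shape[0]
    group = n_q_heads // n_kv_heads  # query heads per shared KV head
    # Repeat each KV head so every group of query heads attends to the same keys/values
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    # Numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# 8 query heads sharing 2 KV heads: the cache stores 2 heads instead of 8
out = grouped_query_attention(
    np.random.randn(8, 5, 16),
    np.random.randn(2, 5, 16),
    np.random.randn(2, 5, 16),
    n_kv_heads=2,
)
```

With 8 query heads and 2 KV heads, the KV cache is 4x smaller than in standard multi-head attention, which is the main inference win for the 70B model.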
The Llama 2 family includes fine-tuned variants optimized for dialogue applications using Reinforcement Learning from Human Feedback (RLHF). According to human evaluations, the Llama 2-Chat models outperform most existing open models and achieve performance comparable to ChatGPT.
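The fine-tuned chat models expect prompts in a specific template with `[INST]` and `<<SYS>>` markers. A minimal sketch of the single-turn format, as described in Meta's release (the helper name is our own):

```python
def build_llama2_chat_prompt(system_msg: str, user_msg: str) -> str:
    """Build a first-turn prompt in the Llama 2-Chat template.

    The system instruction sits inside <<SYS>> tags, and the whole turn
    is wrapped in [INST] ... [/INST]; the model's reply follows [/INST].
    """
    return (
        f"<s>[INST] <<SYS>>\n{system_msg}\n<</SYS>>\n\n"
        f"{user_msg} [/INST]"
    )

prompt = build_llama2_chat_prompt(
    "You are a helpful assistant.",
    "Summarize the Llama 2 release in one sentence.",
)
```

Sending plain text without this template to a chat variant tends to degrade response quality, since the RLHF fine-tuning was done on this format.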
The Llama 2 models are supported natively in the Hugging Face ecosystem, with model integration, inference scripts, fine-tuning examples, quantization tools, and model hosting through Inference Endpoints. The provided demo lets users try the Llama 2 70B model in action.
For production use, Meta recommends deploying the 7B model on a single Nvidia A10G GPU instance, the 13B model on a single Nvidia A100 GPU instance, and the 70B model on a multi-GPU instance with 8x A100s. Hugging Face has also showcased how to fine-tune the Llama 2 7B model on a single T4 GPU using PEFT.
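These sizing recommendations follow roughly from the weight memory alone: at 2 bytes per parameter in fp16/bf16, the weights of each model size map onto the suggested hardware. A back-of-the-envelope sketch (it ignores activations and the KV cache, so real requirements are somewhat higher):

```python
def fp16_weight_gib(n_params_billion: float) -> float:
    # 2 bytes per parameter in fp16/bf16; ignores activations and KV cache
    return n_params_billion * 1e9 * 2 / 2**30

# Map each model size to the instance Meta suggests
for size, gpu in [(7, "1x A10G (24 GiB)"), (13, "1x A100 (40 GiB)"), (70, "8x A100")]:
    print(f"Llama 2 {size}B: ~{fp16_weight_gib(size):.0f} GiB of weights -> {gpu}")
```

The ~13 GiB of 7B weights fits in an A10G's 24 GiB, ~24 GiB for 13B needs a 40 GiB A100, and ~130 GiB for 70B forces the multi-GPU setup; the T4 fine-tuning recipe works only because PEFT quantizes the base weights and trains a small adapter.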
Overall, the Llama2 release, with its permissive license and performance improvements, presents an exciting open alternative for building dialogue applications.