Meta has released Llama 3.3, the latest version of its open-source multilingual large language model (LLM). Llama 3.3 delivers performance comparable to the much larger Llama 3.1 405B at a significantly lower cost.
Llama 3.3 has 70 billion parameters, which keeps computational overhead and GPU requirements modest for its capability class. It supports a 128K-token context window and uses Grouped Query Attention (GQA) for improved inference scalability and performance, as sketched below.
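To make the GQA point concrete, the minimal PyTorch sketch below (not Meta's implementation; shapes and names are illustrative) shows how several query heads share a smaller set of key/value heads, which shrinks the key/value cache that dominates memory at long context lengths.

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v):
    """Toy GQA: many query heads share a smaller set of key/value heads.

    q: (batch, n_q_heads, seq, head_dim)
    k, v: (batch, n_kv_heads, seq, head_dim), with n_kv_heads < n_q_heads
    """
    group_size = q.shape[1] // k.shape[1]
    # Repeat each KV head so every group of query heads can attend to it.
    k = k.repeat_interleave(group_size, dim=1)
    v = v.repeat_interleave(group_size, dim=1)
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v

# Toy shapes: 8 query heads sharing 2 KV heads cuts the KV cache by 4x.
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 2, 16, 64)
v = torch.randn(1, 2, 16, 64)
print(grouped_query_attention(q, k, v).shape)  # torch.Size([1, 8, 16, 64])
```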
Meta has also prioritized user safety and helpfulness by employing reinforcement learning with human feedback (RLHF) and supervised fine-tuning (SFT).
Llama 3.3 is cost-effective, with token generation costs as low as $0.01 per million tokens, and its 70-billion-parameter size requires far less GPU memory than the 405-billion-parameter Llama 3.1 (a rough comparison follows below). Meta also reports that the greenhouse gas emissions from the model's training were offset as part of its sustainability program.
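As a back-of-the-envelope illustration of the memory claim (the precisions chosen and the exclusion of KV cache and activations are simplifying assumptions, not Meta's figures), the snippet below estimates the GPU memory needed just to hold the weights of a 70B-parameter model versus a 405B-parameter one:

```python
def weight_memory_gib(params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory (GiB) to hold the model weights only."""
    return params_billion * 1e9 * bytes_per_param / 1024**3

for name, params in [("Llama 3.3 70B", 70), ("Llama 3.1 405B", 405)]:
    for precision, nbytes in [("bf16", 2), ("int4", 0.5)]:
        print(f"{name} @ {precision}: ~{weight_memory_gib(params, nbytes):.0f} GiB")
# e.g. Llama 3.3 70B @ bf16: ~130 GiB vs. Llama 3.1 405B @ bf16: ~754 GiB
```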
Llama 3.3 is released under the Llama 3.3 Community License Agreement, which carries attribution requirements and an Acceptable Use Policy. Meta also provides safety tools such as Llama Guard 3 and Prompt Guard to help users deploy the model responsibly.
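As an illustration of how Llama Guard 3 might be wired into an application, here is a hedged sketch using the Hugging Face transformers library; the checkpoint name and chat-template behavior are assumptions on our part, so consult Meta's model card for the authoritative prompt format.

```python
# Hedged sketch: screening a user prompt with Llama Guard 3 via transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-Guard-3-8B"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

chat = [{"role": "user", "content": "How do I reset my router password?"}]
input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=32)
# The classifier replies "safe" or "unsafe" plus a violated-category code.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```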
Llama 3.3 was pretrained on roughly 15 trillion tokens of publicly available data and fine-tuned on instruction data that includes synthetically generated examples, yielding strong results on multilingual reasoning and dialogue benchmarks. It officially supports eight languages, including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, and compares favorably with similarly sized open models.
Llama 3.3 is a significant advancement in open-source AI, offering cost efficiency, environmental responsibility, and robust performance.