News247Live.com — July 25, 2024

Meta Unveils Llama 3.1: The Largest Open-Source AI Model Yet

Meta has announced the release of Llama 3.1, its most powerful open-source AI model to date. With 405 billion parameters, it is Meta's largest model yet and sets a new benchmark for open-source AI. The company claims that Llama 3.1 outperforms leading models such as OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet in several key areas.



With 405 billion parameters, Llama 3.1 is the first frontier-level open-source AI model, offering what Meta describes as unmatched flexibility, control and performance.

With a context length of 128,000 tokens, Llama 3.1 can handle more complex tasks, making it suitable for long-form text summarization, advanced conversational agents and coding assistants. The new models support eight languages, broadening their utility across global applications.

The model was trained using over 16,000 of Nvidia’s high-end H100 GPUs. Meta is collaborating with over two dozen companies, including Microsoft, Amazon, Google, Nvidia and Databricks, to help developers deploy and customize Llama 3.1.

Llama 3.1 offers capabilities in general knowledge, steerability, math, tool use and multilingual translation that rival the best AI models available.

It enables the creation of synthetic data, enhancing the model’s training process and allowing for new workflows and innovations.

Distillation is a unique capability at this scale: the 405B-parameter model can be used to improve and train smaller models.
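The distillation idea above can be sketched in a few lines: a large teacher's output distribution supervises a smaller student. This is a minimal illustration, not Meta's actual training recipe; the temperature value and toy logits are assumptions for the example.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the vocabulary axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student): how far the student's softened
    next-token distribution is from the teacher's, averaged over positions."""
    p = softmax(teacher_logits, temperature)  # teacher, e.g. the 405B model
    q = softmax(student_logits, temperature)  # smaller student model
    return float(np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean())

# Toy logits over a 5-token vocabulary at 2 positions (illustrative values).
teacher = np.array([[4.0, 1.0, 0.5, 0.2, 0.1],
                    [0.3, 3.5, 0.4, 0.2, 0.1]])
close_student = teacher * 0.95        # student nearly matches the teacher
untrained_student = np.zeros_like(teacher)  # uniform predictions

# A student that tracks the teacher incurs a much smaller loss.
assert distillation_loss(teacher, close_student) < distillation_loss(teacher, untrained_student)
```

In practice the student is trained to minimize this loss (often mixed with a standard cross-entropy term) over the teacher's outputs on large text corpora.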

Llama 3.1 utilizes a decoder-only transformer model with minor adaptations to maximize training stability and efficiency.

It employs an iterative process with supervised fine-tuning and direct preference optimization, leading to high-quality synthetic data and improved performance.

The model is quantized from 16-bit (BF16) to 8-bit (FP8), reducing compute requirements and enabling deployment on a single server node.
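The quantization step can be sketched as follows. This is a simplified simulation, not Meta's implementation: NumPy has no FP8 dtype, so the example uses a per-tensor scale into the E4M3 range (max finite value 448) with uniform rounding, whereas real FP8 has non-uniform spacing.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def quantize_sim(w):
    """Per-tensor symmetric quantization into an FP8-like range.
    Returns rounded values in the scaled space plus the scale
    needed to dequantize them."""
    scale = np.abs(w).max() / FP8_E4M3_MAX
    q = np.clip(np.round(w / scale), -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q, scale

def dequantize(q, scale):
    return q * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)  # stand-in weight tensor
q, scale = quantize_sim(w)
w_hat = dequantize(q, scale)

# Round-trip error is bounded by half a quantization step.
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```

The payoff is that each weight needs half the memory of BF16, which is what lets the 405B model fit on a single server node at inference time.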

Multiple rounds of alignment including Supervised Fine-Tuning (SFT), Rejection Sampling (RS) and Direct Preference Optimization (DPO) ensure high levels of safety and helpfulness.

Meta applies techniques to filter and balance synthetic data, maintaining high quality across capabilities and ensuring that the model remains helpful and accurate.

Llama 3.1 has been evaluated on over 150 benchmark datasets, showcasing its capabilities in general knowledge, steerability, mathematical reasoning, tool usage and multilingual translation.

The model supports a longer context length of 128K tokens, enabling it to handle more complex tasks and process more information in a single interaction.

Through an iterative post-training process involving supervised fine-tuning, rejection sampling and preference optimization, Llama 3.1 delivers improved instruction-following and stronger safety measures.

The model is well-suited for applications like long-form text summarization, multilingual conversational agents and coding assistance.


Llama 3.1 exhibits emerging agentic behaviors such as integrating with search engine APIs to retrieve information from the internet and executing tasks using multiple tools.

The model can perform complex queries and even generate Python code to analyze data trends.
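The agentic pattern described above, where the model requests a tool call, the host executes it, and the result is fed back, can be sketched as a simple loop. Everything here is an assumption for illustration: the model is mocked with a stub function, and real Llama 3.1 tool calling uses its own prompt and message format, which this does not reproduce.

```python
# Tools the "model" may request; a real deployment would also expose
# a search-engine API and a code interpreter.
TOOLS = {
    "mean": lambda args: sum(args["values"]) / len(args["values"]),
}

def fake_model(messages):
    """Stand-in for the LLM: on the user turn it requests a tool call,
    and once a tool result arrives it produces a final answer."""
    if messages[-1]["role"] == "user":
        return {"tool": "mean", "args": {"values": [3, 5, 10]}}
    result = messages[-1]["content"]
    return {"answer": f"The average is {result}."}

def run_agent(question, model=fake_model):
    messages = [{"role": "user", "content": question}]
    for _ in range(5):  # cap the tool-use loop to avoid infinite cycles
        reply = model(messages)
        if "answer" in reply:
            return reply["answer"]
        # Execute the requested tool and feed the result back to the model.
        result = TOOLS[reply["tool"]](reply["args"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("agent did not produce an answer")

print(run_agent("What is the average of 3, 5 and 10?"))  # → The average is 6.0.
```

The same loop shape underlies search-API integration and code execution: only the tool table and the model's ability to choose among tools change.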

Meta’s red teaming efforts for Llama 3.1 include assessing cybersecurity and biochemical misuse scenarios. This testing helps ensure the model is safe and reliable for use.

Llama 3.1 powers Meta’s AI assistant, which is now available across multiple platforms including WhatsApp, Instagram and Facebook.

The assistant supports multiple languages such as French, German, Hindi, Italian and Spanish.

A new feature called Imagine Me allows users to generate images based on their likeness. This feature avoids creating deepfakes by capturing the user’s likeness through the phone’s camera instead of using profile photos.

Llama 3.1 will be available on Meta’s Quest headset and Ray-Ban smart glasses, providing users with real-time information and interactive experiences through augmented reality.

Collaborations with over 25 partners, including AWS, NVIDIA, Databricks, Groq, Dell, Azure, Google Cloud and Snowflake, provide services from day one, ensuring support for developers.

Integration with projects like vLLM, TensorRT, and PyTorch ensures the community is ready for production deployment.

Meta is releasing the Llama Stack on GitHub, a set of standardized interfaces for building toolchain components and agentic applications.

Developers can use Llama 3.1 for real-time and batch inference, fine-tuning and synthetic data generation.

Meta has partnered with AWS, NVIDIA and Databricks to provide cloud solutions and optimized inference capabilities.

Meta AI is expanding its reach to 22 countries, including Argentina, Chile, Colombia, Ecuador, Mexico, Peru and Cameroon. The assistant supports interactions in several languages, with plans to add more in the future.

