|||

Article originally appeared on Replicate.

Editor’s note

The big event this week was the release of Llama 3.1, Meta’s new generation of language models, including the 405 billion parameter model. This model is a peer to GPT-4, Claude 3, and Gemini 1.5, the big proprietary models from other labs.

But unlike those labs, Meta doesn’t claim to be building superintelligence, or even AGI. They think of AI as a system, and language models as one component. Mark Zuckerberg, in his letter accompanying the release, repeatedly uses the phrase AI systems”. More than most people, he understands that software doesn’t exist in a vacuum. An app” like Facebook or Instagram is actually a giant, interconnected set of social and technical systems. An AI will be like this too: not one giant end-to-end omnimodal intelligence, but a bunch of components working together.

Human intelligence is already a component in that system. Each one of us is a squishy cog in the vast machine of society. The systems look to us for guidance: Do you like this video? Would you buy this product? Does this picture contain a bus? They also inform us: Meeting in 10 minutes. Turn right at the next intersection. New message from Mom. We co-evolve with the systems we use.

Deep learning models are a new type of component. They have some of the aspects of human employees: they can perceive the world, they can make judgment calls, they can plan. But that doesn’t mean we need to package them up into a humanoid robot with a sense of self. They can be, instead, a form of distributed intelligence. We can put a little judgment here, some pattern recognition there. We can keep humans in the loop, automating away tasks instead of jobs. We can augment our own human intelligence, bit by bit.

In fact, this what humans have always done. We compose intelligences into systems to augment ourselves. Agriculture, domestication, engineering: we are already a modular intelligence. This is a different vision than general intelligence”, and it require a different type of thinking.

Instead of building an employee, we must build an ecosystem.

deepfates


A giant open-source-ish language model

The Llama 3.1 generation includes a massive 405 billion parameter model as well as updated versions of the 8B and 70B models released earlier this year.

  • 128,000 token context
  • Multilingual support
  • Can use tools and functions

This release narrows the gap between open and closed-source models. The 405B model rivals state-of-the-art closed models in many benchmarks. It particularly excels in coding and mathematical reasoning tasks.

The updated version of the Llama license allows synthetic data creation for training other AI models, with some restrictions.

try on replicate

Smaller model, big performance

Mistral AI unveils Mistral Large 2, a 123 billion parameter model under a Research License:

  • Matches Llama 3.1 405B in some tasks
  • Excels at coding and math
  • 128,000 token context

This release demonstrates that smaller, more efficient models can compete with larger ones. Its strong performance in coding and math tasks makes it particularly interesting for developers working on technical applications.

However, the restrictive research license may limit its adoption and impact in the open-source community.

post


Cool tools

Meta’s framework for building AI agents

Meta open-sources a toolkit for creating AI agents with Llama 3.1.

  • Breaks down complex tasks
  • Uses built-in and custom tools
  • Configurable safety with Llama Guard

This framework allows developers to create AI agents that can tackle multi-step problems and interact with external tools. The inclusion of Llama Guard for safety provides a starting point for responsible AI development.

github


Research radar

Scaling secrets of Llama 3.1

Meta’s research on Llama 3.1 reveals:

  • Extensive use of synthetic data
  • Novel fine-tuning approaches for specialized tasks
  • Techniques for handling long contexts
  • Built-in tool use abilities

These advancements provide valuable insights for developers working on large language models, especially for domain-specific applications and complex task handling.

post | paper

Lightweight defense against LLM exploits

Along with Llama 3.1, Meta released PromptGuard, a small classification model to detect malicious prompts:

  • Based on mDeBERTa-v3-base with multilingual capabilities
  • Classifies inputs as BENIGN, INJECTION, or JAILBREAK
  • Helps prevent prompt injection and jailbreak exploits

Ben at Taylor AI demonstrates how to integrate PromptGuard into existing workflows. Notably, the INJECTION tag can flag on benign prompts, as it’s designed to handle both user inputs and retrieved contexts.

blog post


Changelog

Search all the public models on Replicate

We’ve added a new API endpoint for searching public models on Replicate:

  • Use a simple QUERY HTTP request
  • Search by plaintext query
  • Get paginated JSON responses with model details

This new endpoint makes it easier to discover and integrate models into your projects. You can now programmatically search for models based on specific criteria, streamlining your development workflow.

changelog


Bye for now

In other news, we have a subscribe form now! You can find it, and all the back issues of this letter, at replicate.com/newsletter.

Thanks for reading. Make sure to forward this letter to seven more people, or you’ll have seven weeks of cold boots.

— deepfates

Up next Replicate Intelligence #7 Data curation, data generation, data data data Replicate Intelligence #9 Open source frontier image model, cut objects from videos, new Python web framework from Jeremy Howard
Latest posts Replicate Intelligence #12 Replicate Intelligence #11 Replicate Intelligence #10 Replicate Intelligence #9 Replicate Intelligence #8 Replicate Intelligence #7 Replicate Intelligence #6 Replicate Intelligence #5 Replicate Intelligence #4 Replicate Intelligence #3 Replicate Intelligence #2 Replicate Intelligence #1 The 3½ Tenets of Biocosmism Hypervector Redactions Rufus, your AI-powered shopping assistant Oh Turing Two scientists The Ascension of Cerebro The Hyperstition Array Crawling Chat Instructions 2 as a user The OOM Source Text Paradoxes Message from SF Instructions Another carved fragment Data Cognitive Security 101