
January roundup: our favorite models

Written by Baasit Sharief | February 07, 2024

The rapidly evolving GenAI model market can be overwhelming to keep track of, but fear not! In our monthly roundup, we present the top 5 open source GenAI models that deserve a spot on your radar. The trend is clear: small language models (SLMs) in the roughly 1 to 3 billion parameter range are on the rise, and we expect that trend to persist. Let's delve into the latest releases from established names in the OSS community:

CodeLlama-70B

  • Built by: Meta
  • Size: 70B
  • Why it's interesting: Based on Llama 2, this is the latest and most performant version of the CodeLlama family for code generation.
  • Where to use it: Three variants are available (Code Completion, Instruct, and Python); see the quick-start sketch after this list.
  • Learn more about CodeLlama-70B here.
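
To give a sense of how you might try it, here is a minimal quick-start sketch (not an official example) that loads the Instruct variant with the transformers library. The repository ID (codellama/CodeLlama-70b-Instruct-hf) and the hardware assumptions are ours: a 70B model needs a lot of GPU memory or a quantized load.

```python
# Minimal sketch (assumptions: the codellama/CodeLlama-70b-Instruct-hf repo on the
# Hugging Face Hub, the transformers and accelerate packages, and enough GPU memory
# or quantization to hold 70B weights).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-70b-Instruct-hf"  # base and Python variants have sibling repos
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```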

Mixtral-8x7B

  • Built by: Mistral AI
  • Size: 46.7B
  • Why it's interesting: A Sparse Mixture-of-Experts model (SMoE) with open weights, Mixtral-8x7B outperforms Llama 2 70B on most benchmarks with 6x faster inference.
  • Where to use it: With a large 32k-token context window, it excels in English, French, Italian, German, and Spanish, and shows particularly strong performance on code generation; a short loading example follows this list.
  • Learn more about the model here.
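
Below is a minimal sketch of how you might try the instruct-tuned checkpoint with transformers. The repo ID (mistralai/Mixtral-8x7B-Instruct-v0.1) and hardware assumptions are ours; in practice, many people quantize the model (for example, 4-bit via bitsandbytes) to fit it on a single large GPU.

```python
# Minimal sketch (assumptions: the mistralai/Mixtral-8x7B-Instruct-v0.1 repo and
# enough GPU memory; quantization is a common way to fit the 46.7B weights).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

messages = [{"role": "user", "content": "Summarize the advantages of sparse mixture-of-experts models."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
outputs = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```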

Eagle-7B by RWKV Open Source Group

  • Built by: RWKV Open Source Group
  • Size: 7B
  • Why it's interesting: Built on the novel RWKV-v5 architecture (Receptance Weighted Key Value), Eagle-7B approaches Falcon, LLaMA2, and Mistral in English evaluations. Notably, it is the most energy-efficient 7B model in terms of joules per token.
  • Where to use it: Ideal for multilingual applications, with support for 23 languages and stronger multilingual performance than other 7B models; see the usage sketch after this list.
  • Dive into the details of Eagle 7B.
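
If you want to experiment with it, the sketch below loads a transformers-compatible checkpoint. Note that the repo ID (RWKV/v5-Eagle-7B-HF) is our assumption of where the converted weights live, and the custom RWKV-v5 code requires trust_remote_code=True.

```python
# Minimal sketch (assumption: the RWKV/v5-Eagle-7B-HF repo holds the
# transformers-compatible weights; the RWKV-v5 architecture ships custom
# modeling code, hence trust_remote_code=True).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RWKV/v5-Eagle-7B-HF"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, device_map="auto", torch_dtype="auto"
)

prompt = "Translate to French: The weather is lovely today."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```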

Phi-2 by Microsoft

  • Built by: Microsoft
  • Size: 2.7B
  • Why it's interesting: As part of the Phi series, Phi-2 builds on Microsoft's earlier work on teaching small models to reason (including the earlier Phi models and Orca 2). It achieves state-of-the-art performance among base language models of comparable size and competes with or outperforms models up to 25 times larger.
  • Where to use it: A cost-effective SLM well suited to specialized fine-tuning and distillation. Notably, the license has been updated to MIT, simplifying its use and commercialization; see the quick-start sketch below.
  • Uncover the surprising power of small language models.
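
As a rough illustration of how easy a 2.7B model is to run, here is a minimal sketch using the microsoft/phi-2 checkpoint with transformers; the prompt format and generation settings are just examples.

```python
# Minimal sketch (assumption: the microsoft/phi-2 repo; trust_remote_code is only
# needed on older transformers releases). Small enough to fit on one consumer GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto", trust_remote_code=True
)

prompt = "Instruct: Explain in two sentences why distillation benefits small models.\nOutput:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```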

StableLM 2 by StabilityAI

  • Built by: StabilityAI
  • Size: 1.6B
  • Why it's interesting: Drawing inspiration from Phi-2, StableLM 2 pushes the limits of SLMs further, outperforming Falcon-40B-Instruct on MT-Bench as well as its similarly sized counterparts. It also performs strongly on multilingual benchmarks.
  • Where to use it: An alternative to Phi-2 for fine-tuning and distillation; note that commercialization requires a Stability AI membership. A short example follows this list.
  • Explore the capabilities of Stable LM 2.
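
The sketch below shows one way to try the chat-tuned variant with transformers. The repo ID (stabilityai/stablelm-2-zephyr-1_6b) and settings are our assumptions, and remember that commercial use is gated by Stability AI's membership terms.

```python
# Minimal sketch (assumption: the stabilityai/stablelm-2-zephyr-1_6b repo; the base
# model is stabilityai/stablelm-2-1_6b). The repo ships custom code, hence
# trust_remote_code=True. Check the license terms before commercial use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-2-zephyr-1_6b"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, device_map="auto", torch_dtype="auto"
)

messages = [{"role": "user", "content": "List three use cases for a 1.6B-parameter language model."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```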

Learn more

For an in-depth look at our capabilities, you can read our full launch blog post and check out the platform here.