April Model Roundup

We are back with our April LLM roundup! As the technology and research continue to evolve, models in the 1-to-3-billion-parameter range are becoming increasingly prevalent, signaling a sustained trend toward getting strong performance out of smaller, more efficient models.

Let's dive into the cutting-edge releases from prominent names in the GenAI community:

BitNet

  • Developed by: Microsoft Research
  • Parameters: 7B (Llama-7B was chosen in the paper)
  • Why it's noteworthy: BitNet stands out for its groundbreaking approach to energy-efficient AI. By constraining weights to the ternary values {-1, 0, 1}, the model replaces matrix-multiplication multiplies with additions, and the paper reports a 71.4x reduction in arithmetic-operation energy consumption compared to full-precision models; see the sketch after this list. The architecture showcases Microsoft's commitment to sustainability in AI
  • Where to explore: BitNet has not been officially open-sourced by Microsoft, and community reimplementations are still a work in progress, but its potential to revolutionize energy-efficient AI is undoubtedly worth monitoring closely
  • Learn more here
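
To make the addition-only idea concrete, here is a minimal NumPy sketch of a ternary-weight matrix-vector product. The function names and the simplified absmean quantizer are illustrative assumptions, not Microsoft's implementation:

```python
# Sketch of the BitNet idea: with ternary weights in {-1, 0, 1}, a
# matrix-vector product needs no multiplications -- each output element
# is just a signed sum of selected inputs.
import numpy as np

rng = np.random.default_rng(0)

def quantize_ternary(w: np.ndarray) -> np.ndarray:
    """Round full-precision weights to {-1, 0, 1} using a mean-|w| scale
    (a simplified take on the paper's absmean quantization)."""
    scale = np.mean(np.abs(w)) + 1e-8
    return np.clip(np.round(w / scale), -1, 1)

def ternary_matvec(w_ternary: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Multiplication-free matvec: add inputs where w=+1, subtract where w=-1."""
    out = np.zeros(w_ternary.shape[0])
    for i, row in enumerate(w_ternary):
        out[i] = x[row == 1].sum() - x[row == -1].sum()
    return out

w = rng.normal(size=(4, 8))
x = rng.normal(size=8)
wt = quantize_ternary(w)
# The addition-only path matches an ordinary matmul over the ternary weights.
assert np.allclose(ternary_matvec(wt, x), wt @ x)
```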

Gemma

  • Developed by: Google
  • Parameters: 2B, 7B
  • Why it stands out: Gemma is Google's response to the growing demand for responsible AI. Despite its small size, it matches or exceeds the performance of larger models such as Llama 2 13B, emphasizing quality over sheer scale. Google's emphasis on responsible AI is evident in Gemma's design and in the accompanying Responsible Generative AI Toolkit, which supports ethical and transparent AI development
  • Where to dive deeper: Gemma's strength in math, science, and coding benchmarks makes it a formidable contender in the GenAI landscape and a compelling option for those prioritizing both performance and ethics in AI applications
  • Learn more here

Qwen1.5

  • Developed by: Alibaba Cloud
  • Parameters: Family of Models (0.5B-72B)
  • Why it's notable: Qwen1.5 introduces an extended context window of 32,768 tokens, catering to complex, long-input language tasks (see the sketch after this list). The base models under 7 billion parameters are highly competitive with the leading small-scale models in the community, while the larger variants reach or exceed GPT-4 on certain benchmarks, underscoring Alibaba Cloud's commitment to pushing the boundaries of AI capabilities
  • Where to explore further: With its enlarged context window, Qwen1.5 opens up exciting possibilities for tackling nuanced, document-scale language tasks
  • Learn more here
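
As one way that 32,768-token window can be put to work, here is a hedged sketch of single-pass long-document summarization with a Qwen1.5 chat model via Hugging Face transformers. The Hub model ID, input file, and generation settings are our assumptions for illustration, not an official Alibaba Cloud recipe:

```python
# Feed a long document to Qwen1.5 in one prompt -- the 32k context
# means whole reports can fit without chunking.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-7B-Chat"  # assumed Hub ID; other sizes also exist
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

long_document = open("report.txt").read()  # hypothetical input file
messages = [
    {"role": "user", "content": f"Summarize the key findings:\n\n{long_document}"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Anything up to ~32k tokens fits in a single forward pass.
output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```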

Aya

  • Developed by: Cohere
  • Parameters: 12.9B
  • Why it's captivating: Aya, trained on a diverse corpus spanning 101 languages, represents a leap toward truly multilingual GenAI models. Cohere's open approach to language diversity fosters inclusivity and accessibility in AI applications, catering to a global audience with varied linguistic backgrounds; a short usage sketch follows this list
  • Where to find more information: Aya's multilingual prowess positions it as a versatile tool for cross-cultural communication and understanding, highlighting Cohere's commitment to building AI solutions that transcend linguistic barriers
  • Learn more here
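
Below is a brief sketch of prompting Aya in a few of its 101 languages through Hugging Face transformers. Aya is built on an encoder-decoder (mT5-style) backbone, so it loads as a seq2seq model; the Hub ID and prompts are our assumptions for illustration:

```python
# Multilingual prompting with Aya: one model, many input languages.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "CohereForAI/aya-101"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, device_map="auto")

prompts = [
    "Translate to English: Bonjour, comment allez-vous ?",   # French
    "Översätt till engelska: Var ligger närmaste station?",  # Swedish
    "Explica en una frase qué es un modelo de lenguaje.",    # Spanish
]
for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```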

StarCoder2

  • Developed by: The BigCode project, a collaboration between ServiceNow, Hugging Face, and NVIDIA
  • Parameters: Family of models (3B, 7B, 15B)
  • Why it's captivating: StarCoder2 shines as a collaborative effort aimed at revolutionizing AI-assisted coding. With variants of up to 15 billion parameters, it empowers developers with advanced code generation across 600+ programming languages. Its superior coding capabilities, coupled with an enhanced context window of 16,384 tokens and sliding window attention of 4,096 tokens, signify a leap forward in AI-driven coding solutions; see the sketch after this list
  • Where to find more information: StarCoder2's emergence signals a transformative shift in the coding landscape, empowering developers with advanced AI tools for streamlined code generation and development
  • Learn more here
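
To see what this looks like in practice, here is a minimal code-completion sketch using a StarCoder2 checkpoint from the Hugging Face Hub. The Hub ID and prompt are illustrative assumptions; the long context and sliding-window attention are handled inside the model:

```python
# Code completion with StarCoder2: the checkpoints are plain causal LMs,
# so completion is just generate() on a code prefix.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bigcode/starcoder2-7b"  # assumed Hub ID; 3B and 15B variants also exist
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prefix = "def fibonacci(n: int) -> int:\n"
inputs = tokenizer(prefix, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```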

From energy-efficient architectures to responsible AI frameworks, these models exemplify the multifaceted nature of GenAI research and development. As AI technology continues to advance rapidly, staying abreast of these cutting-edge models is essential for harnessing their transformative potential across domains. Stay tuned for our next roundup for the latest updates from the fast-moving world of LLMs!

Learn more

For an in-depth look at our capabilities, you can read our full launch blog post and check out the platform here.
