LLM News and Articles Weekly Digest — April 24, 2024
Latest News
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone (Source)
Phi-3 comprises a range of models, from 3.8 billion to 14 billion parameters, all demonstrating strong performance on contemporary benchmarks. The 3.8B model (phi-3-mini), in particular, claims performance rivaling the original ChatGPT (GPT-3.5), and its weights have been made publicly available. A variant with an extended 128K context length is also available.
Meta releases Llama 3: 8B, 70B, and later 400B (Announcement, Models, Try it, Run Locally)
The model has been trained on a massive 15 trillion tokens. Two models have been released, one with 8 billion parameters and one with 70 billion, each with an instruction-fine-tuned variant. It operates with an 8K context length and is not multimodal. The 70B model achieves impressive results, scoring 82% on MMLU and 81.7% on HumanEval. It employs a tokenizer with a vocabulary of 128,000 tokens and, unlike MoE (Mixture of Experts) designs, uses a dense architecture. Both models have been fine-tuned on human-annotated datasets, are openly accessible, and incorporate RoPE (Rotary Position Embeddings).
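For the "Run Locally" route, a minimal sketch with Hugging Face transformers is below. The gated checkpoint name meta-llama/Meta-Llama-3-8B-Instruct is the published one, but the dtype and hardware note are assumptions about your setup rather than Meta's official recipe.

```python
# Minimal sketch: running Llama 3 8B Instruct locally with Hugging Face
# transformers. Assumes access to the gated checkpoint has been granted
# and you are logged in via `huggingface-cli login`.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: a GPU with ~16 GB+ memory
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain RoPE in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Llama 3 ends chat turns with <|eot_id|>, so stop on either terminator.
terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")]
outputs = model.generate(input_ids, max_new_tokens=128, eos_token_id=terminators)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```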
Mixtral 8x22B Instruct v0.1 (Blog, Try it)
The Instruct model, released under the Apache 2.0 license, stands out as a premier open choice, and a comparison chart accompanying the release sparked community interest. Mistral AI’s Mixtral 8x22B leads on performance relative to efficiency: fluent in five languages (English, French, Italian, German, and Spanish) and equipped with strong math and coding capabilities, it is a sparse Mixture-of-Experts model that activates only 39 billion of its 141 billion parameters per token, keeping inference cost-efficient. With a 64K-token context window, it excels at recalling information from large documents, and it outperforms open competitors on reasoning, knowledge, and language benchmarks, particularly in French, German, Spanish, and Italian. Its adaptability is further enhanced by Mistral’s new tokenizer, which includes dedicated tokens for tool use (function calling).
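The 39B-active-of-141B figure is the signature of sparse Mixture-of-Experts routing: a learned router sends each token through only its top-scoring experts, so per-token compute tracks the active rather than the total parameter count. Here is a minimal PyTorch sketch of top-2 routing; the dimensions, expert design, and names are illustrative assumptions, not Mistral's implementation.

```python
# Minimal sketch of sparse top-2 Mixture-of-Experts routing.
# All sizes are illustrative, not Mixtral 8x22B's real dimensions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                                # x: (tokens, d_model)
        weights, chosen = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)             # renormalize over the top-k
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, k] == e                 # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

print(SparseMoE()(torch.randn(4, 512)).shape)            # torch.Size([4, 512])
```

Because only two of the eight experts run per token, compute and activation memory scale with the active parameter count rather than the full model size.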
Meta’s battle with ChatGPT begins now
Meta’s AI assistant is being rolled out across Instagram, WhatsApp, and Facebook. Meanwhile, the company’s next major AI model, Llama 3, has arrived.
Mistral seeking funding at $5B valuation
The open-source pioneer Mistral is reportedly seeking several hundred million dollars in funding, at a valuation of around $5 billion, to train more models.
Google’s New Technique Gives LLMs Infinite Context
Google researchers have introduced Infini-attention, a technique that lets LLMs process inputs of unbounded length while keeping memory bounded and per-token compute constant.
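At its core is a compressive memory: each segment's keys and values are folded into a fixed-size matrix, so the model can attend over an arbitrarily long stream without growing state. The sketch below follows the paper's ELU+1 feature map and linear update rule, but is a simplified single-head illustration under assumed shapes, not Google's implementation.

```python
# Simplified single-head sketch of Infini-attention's compressive memory:
# the memory M and normalizer z stay fixed-size no matter how many
# segments stream through. Shapes are illustrative assumptions.
import torch
import torch.nn.functional as F

def sigma(x):
    return F.elu(x) + 1.0              # ELU + 1 keeps features positive

d_k, d_v, seg_len = 64, 64, 128
M = torch.zeros(d_k, d_v)              # compressive memory
z = torch.zeros(d_k)                   # normalization term

for _ in range(1000):                  # arbitrarily many segments, constant state
    Q = torch.randn(seg_len, d_k)
    K = torch.randn(seg_len, d_k)
    V = torch.randn(seg_len, d_v)

    # Retrieve what earlier segments wrote (the full model combines this
    # with standard local attention over the current segment).
    A_mem = (sigma(Q) @ M) / (sigma(Q) @ z + 1e-6).unsqueeze(-1)

    # Fold the current segment's keys/values into the memory.
    M = M + sigma(K).T @ V
    z = z + sigma(K).sum(dim=0)

print(M.shape, z.shape)                # sizes never grow
```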
Articles
Stanford HAI Releases 2024 AI Index Report (Website)
OpenAI and Meta Reportedly Preparing New AI Models Capable of Reasoning
A Handy Compendium of Common Terms Used In The Context Of LLMs
From 7B to 8B Parameters: Understanding Weight Matrix Changes in Llama Transformer Models
Unlocking the Power of Transformers: A Journey through the Evolution of Artificial Intelligence
Groq API: Unleashing the Power of Ultra-Low Latency AI Inference
Papers and Repositories
Optimizing In-Context Learning in LLMs
This paper introduces a new approach to enhancing In-Context Learning (ICL) in large language models like Llama-2 and GPT-J. Its authors present a new optimization method that refines what they call ‘state vectors’ — compressed representations of the model’s knowledge.
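As a loose illustration of the general idea only (the authors' actual extraction and optimization of state vectors differ), one can compress demonstrations into a single hidden-state vector and inject it during a zero-shot pass. The GPT-2 stand-in model, the layer index, and the injection point below are all assumptions made for a runnable example.

```python
# Loose sketch of the "state vector" idea: cache a hidden state summarizing
# in-context demonstrations, then add it back in during a zero-shot pass.
# Model, layer, and injection scheme are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
LAYER = 6                                              # assumed mid-network layer

# 1. Extract: hidden state at the demonstrations' last token.
demos = "big -> small\nhot -> cold\nfast -> slow"
with torch.no_grad():
    out = model(**tok(demos, return_tensors="pt"), output_hidden_states=True)
state_vector = out.hidden_states[LAYER][0, -1]         # (hidden_dim,)

# 2. Inject: add the vector to that layer's output during inference.
def inject(module, inputs, output):
    return (output[0] + state_vector,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(inject)
query = tok("tall ->", return_tensors="pt")
with torch.no_grad():
    gen = model.generate(**query, max_new_tokens=3, pad_token_id=tok.eos_token_id)
handle.remove()
print(tok.decode(gen[0]))
```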
AI Gateway
AI Gateway is an interface between apps and hosted large language models. It streamlines API requests to LLM providers using a unified API. AI Gateway is fast, with a tiny footprint, and it can load-balance across multiple models, providers, and keys. It has fallbacks to ensure app resiliency and supports plug-in middleware as needed.
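As a sketch of the fallback pattern such a gateway implements (this is not AI Gateway's actual code or configuration format), the idea is to try OpenAI-compatible providers in priority order behind one interface; the base URLs, model names, and environment variable names are illustrative assumptions.

```python
# Generic sketch of provider fallback behind a unified, OpenAI-compatible
# interface. Providers, models, and env vars below are assumptions.
import os
from openai import OpenAI

PROVIDERS = [  # tried in order until one succeeds
    {"base_url": "https://api.openai.com/v1",
     "key_env": "OPENAI_API_KEY", "model": "gpt-3.5-turbo"},
    {"base_url": "https://api.groq.com/openai/v1",
     "key_env": "GROQ_API_KEY", "model": "llama3-8b-8192"},
]

def chat_with_fallback(messages):
    last_error = None
    for p in PROVIDERS:
        client = OpenAI(base_url=p["base_url"],
                        api_key=os.environ.get(p["key_env"], ""))
        try:
            resp = client.chat.completions.create(model=p["model"],
                                                  messages=messages)
            return resp.choices[0].message.content
        except Exception as err:   # provider down or key missing: fall through
            last_error = err
    raise RuntimeError("all providers failed") from last_error

print(chat_with_fallback([{"role": "user", "content": "Hello!"}]))
```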