As the generative AI sector continues to evolve, a new trend is emerging: a shift toward models tailored specifically for enterprise needs. This transition is exemplified by Snowflake, the well-known cloud computing company, which has just unveiled its latest offering: Arctic LLM, touted as an “enterprise-grade” generative AI model.
Arctic LLM, released under the Apache 2.0 license, is optimized for enterprise workloads, including tasks such as generating database code. Snowflake CEO Sridhar Ramaswamy sees this innovation as a crucial step towards unlocking the full potential of AI within the enterprise. With Arctic LLM, Snowflake aims to empower its customers to build cutting-edge AI-driven products tailored to their unique requirements.
- Efficiently Intelligent: Arctic excels at enterprise tasks such as SQL generation, coding, and instruction following, even when compared on those benchmarks to open source models trained with significantly higher compute budgets. In fact, it sets a new baseline for cost-effective training, enabling Snowflake customers to create high-quality custom models for their enterprise needs at low cost.
- Truly Open: The Apache 2.0 license provides ungated access to weights and code. In addition, we are open sourcing all of our data recipes and research insights.
Snowflake Arctic is available from Hugging Face today or via your model garden or catalog of choice, including Snowflake Cortex, Amazon Web Services (AWS), Microsoft Azure, NVIDIA API catalog, Lamini, Perplexity, Replicate and Together over the coming days.
This move by Snowflake aligns with a broader trend in the tech industry, where cloud vendors are increasingly focusing on catering to the needs of enterprise clients. The release of Arctic LLM follows in the footsteps of similar offerings from competitors like Databricks’ DBRX, signaling a growing demand for AI solutions tailored specifically for enterprise applications.
Training efficiency
To achieve this level of training efficiency, Arctic uses a unique Dense-MoE Hybrid transformer architecture. It combines a 10B dense transformer model with a residual 128×3.66B MoE MLP, resulting in 480B total parameters, of which roughly 17B are active per token, chosen using top-2 gating.
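To make the idea concrete, here is a minimal PyTorch-style sketch of a dense + residual-MoE block with top-2 gating. The layer sizes, module names, and structure below are illustrative only, not Arctic's actual implementation; the point is that every token runs through a small dense path while the router activates just 2 of many fine-grained experts, which is why active parameters stay far below total parameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DenseMoEHybridMLP(nn.Module):
    """Toy dense + residual-MoE block: every token runs through a dense MLP,
    and a top-2 router adds the output of 2 of `num_experts` small expert MLPs."""

    def __init__(self, d_model=512, d_ff=1024, num_experts=128, top_k=2):
        super().__init__()
        self.dense_mlp = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        # Many fine-grained experts rather than a few large ones.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )
        self.router = nn.Linear(d_model, num_experts)
        self.top_k = top_k

    def forward(self, x):  # x: (num_tokens, d_model)
        dense_out = self.dense_mlp(x)

        # Top-2 gating: each token activates only top_k experts.
        gate_logits = self.router(x)                               # (tokens, experts)
        gate_weights, expert_idx = torch.topk(gate_logits, self.top_k, dim=-1)
        gate_weights = F.softmax(gate_weights, dim=-1)             # renormalize over chosen experts

        moe_out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in expert_idx[:, k].unique().tolist():
                mask = expert_idx[:, k] == e
                moe_out[mask] += gate_weights[:, k][mask].unsqueeze(-1) * self.experts[e](x[mask])

        # Residual combination of the dense path and the sparse expert path.
        return x + dense_out + moe_out


tokens = torch.randn(16, 512)
print(DenseMoEHybridMLP()(tokens).shape)  # torch.Size([16, 512])
```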
1) Many-but-condensed experts with more expert choices: In late 2021, the DeepSpeed team demonstrated that MoE can be applied to auto-regressive LLMs to significantly improve model quality without increasing compute cost.
Arctic is designed to have 480B parameters spread across 128 fine-grained experts and uses top-2 gating to choose 17B active parameters.
2) Architecture and System Co-design: Training a vanilla MoE architecture with a large number of experts is very inefficient, even on the most powerful AI training hardware, due to the high all-to-all communication overhead among experts. However, this overhead can be hidden if the communication is overlapped with computation (see the sketch after these points).
3) Enterprise-Focused Data Curriculum: Excelling at enterprise metrics like Code Generation and SQL requires a vastly different data curriculum than training models for generic metrics.
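To illustrate the overlap idea from point 2, here is a conceptual sketch using torch.distributed's asynchronous all-to-all: the token exchange is launched without blocking, work that does not depend on it (for example, the dense path of the hybrid block) runs in the meantime, and the code only waits when the exchanged tokens are actually needed. This is not Arctic's training system, just a small example of the technique; it assumes a multi-GPU node and a backend that supports all_to_all (e.g. NCCL).

```python
import torch
import torch.distributed as dist
import torch.nn as nn

# Run with e.g. `torchrun --nproc_per_node=2 overlap_sketch.py` on a multi-GPU node.


def main():
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())
    device = torch.device("cuda")

    d_model, tokens_per_rank = 256, 8
    dense_mlp = nn.Sequential(nn.Linear(d_model, d_model), nn.GELU()).to(device)
    expert_mlp = nn.Sequential(nn.Linear(d_model, d_model), nn.GELU()).to(device)

    tokens_to_send = torch.randn(tokens_per_rank * dist.get_world_size(), d_model, device=device)
    recv_buffer = torch.empty_like(tokens_to_send)
    local_hidden = torch.randn(tokens_per_rank, d_model, device=device)

    # 1) Launch the expert all-to-all without blocking (async_op returns a work handle).
    handle = dist.all_to_all_single(recv_buffer, tokens_to_send, async_op=True)

    # 2) Overlap: run computation that does not depend on the exchanged tokens,
    #    e.g. the dense (non-MoE) path of the hybrid block.
    dense_out = dense_mlp(local_hidden)

    # 3) Block only once the exchanged tokens are actually needed.
    handle.wait()
    expert_out = expert_mlp(recv_buffer)

    if dist.get_rank() == 0:
        print(dense_out.shape, expert_out.shape)
    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```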
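For point 3, the actual data mix is part of Snowflake's research insights rather than something reproduced here. Purely as an illustration of what a staged curriculum looks like mechanically, the sketch below shifts sampling weight from generic web text toward code and SQL in later phases; all source names and weights are made up.

```python
import random

# Hypothetical three-phase curriculum: later phases up-weight enterprise-style
# data (code, SQL, instructions) relative to generic web text. Illustrative only.
CURRICULUM = {
    "phase_1": {"web_text": 0.70, "code": 0.20, "sql": 0.05, "instructions": 0.05},
    "phase_2": {"web_text": 0.45, "code": 0.35, "sql": 0.10, "instructions": 0.10},
    "phase_3": {"web_text": 0.25, "code": 0.40, "sql": 0.20, "instructions": 0.15},
}


def sample_source(phase: str, rng: random.Random) -> str:
    """Pick which data source the next training batch is drawn from."""
    sources, weights = zip(*CURRICULUM[phase].items())
    return rng.choices(sources, weights=weights, k=1)[0]


rng = random.Random(0)
print([sample_source("phase_3", rng) for _ in range(5)])
```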
Snowflake’s Arctic LLM boasts impressive performance metrics, outperforming competitors on tasks such as coding and SQL generation. Built on a mixture of experts (MoE) architecture, Arctic LLM combines efficiency with scalability, making it a compelling choice for enterprise users.
However, amid the excitement surrounding Arctic LLM, some questions linger about its practicality and effectiveness. With a smaller context window than many competing models, Arctic LLM may face challenges on complex tasks that require extensive contextual understanding.
Getting started with Arctic
Snowflake AI Research also recently announced and open sourced the Arctic Embed family of models that achieves SoTA in MTEB retrieval. We are eager to work with the community as we develop the next generation in the Arctic family of models. Join us at our Data Cloud Summit on June 3-6 to learn more.
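If you want to try the retrieval models alongside the LLM, here is a minimal sketch using sentence-transformers. The checkpoint name is an assumption based on the published Arctic Embed family on the Hugging Face hub; check the model card for the recommended query prompt or prefix before relying on the scores.

```python
from sentence_transformers import SentenceTransformer, util

# Assumes the medium Arctic Embed checkpoint from the Hugging Face hub.
model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m")

docs = [
    "Snowflake Arctic is an enterprise-focused LLM released under Apache 2.0.",
    "Mixture-of-experts models activate only a subset of parameters per token.",
]
query = "Which license is Arctic released under?"

doc_emb = model.encode(docs, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

# Rank documents by cosine similarity to the query.
scores = util.cos_sim(query_emb, doc_emb)[0]
print(docs[int(scores.argmax())])
```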
Here’s how we can collaborate on Arctic starting today:
- Go to Hugging Face to download Arctic directly, and use our GitHub repo for inference and fine-tuning recipes (a minimal loading sketch follows this list).
- For a serverless experience in Snowflake Cortex, Snowflake customers with a payment method on file will be able to access Snowflake Arctic for free until June 3. Daily limits apply.
- Access Arctic via your model garden or catalog of choice including Amazon Web Services (AWS), Lamini, Microsoft Azure, NVIDIA API catalog, Perplexity, Replicate and Together AI over the coming days.
- Chat with Arctic! Try a live demo now on Streamlit Community Cloud or on Hugging Face Streamlit Spaces, with an API powered by our friends at Replicate.
- Get mentorship and credits to help you build your own Arctic-powered applications during our Arctic-themed Community Hackathon.
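As a starting point for the Hugging Face route, here is a minimal loading sketch with transformers. The model id and chat template usage are assumptions based on the published instruct checkpoint; Arctic has roughly 480B total parameters, so running the full model locally requires a large multi-GPU node (or one of the hosted endpoints listed above), and the repo's inference recipes cover quantized and distributed setups in more detail.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed instruct checkpoint on the Hugging Face hub; trust_remote_code pulls in
# the custom Dense-MoE architecture code shipped with the checkpoint.
model_id = "Snowflake/snowflake-arctic-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",   # requires `accelerate`; shards the model across available GPUs
    torch_dtype="auto",
)

messages = [{"role": "user", "content": "Write a SQL query that counts orders per customer."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```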
Additionally, like all generative AI models, Arctic LLM is not immune to shortcomings such as hallucination, where it may confidently produce incorrect responses because its outputs reflect statistical likelihoods rather than verified facts.
Despite these considerations, Snowflake remains optimistic about the potential of Arctic LLM to drive innovation and value for its customers. With a commitment to providing resources and support for users looking to leverage Arctic LLM, Snowflake aims to position itself as a leader in the enterprise AI space.