Ai2’s open foundation models are now available on the Cirrascale Inference Platform. Start now and put Ai2's models to work for you.
Ai2’s fully open OLMo and Molmo models, together with the open-weights Tülu models, let enterprises build AI-powered applications out of the box. The Cirrascale Inference Platform provides scalable, secure API endpoints that simplify deployment. Together, we deliver plug-and-play API access to OLMo and Molmo, so enterprises can prototype, launch, and scale with ease. These models can also be fine-tuned or augmented with retrieval (RAG) and then deployed on the Cirrascale Inference Platform, removing operational hurdles and accelerating enterprise adoption.
OLMo 2 is Ai2’s latest family of truly open-source language models, trained on up to six trillion tokens and released with full training data, code, checkpoints, and evaluation suites. The performance holds its own: the 7B and 13B checkpoints meet or beat their Llama 3 peers, while the flagship 32B model is the first fully open LLM to surpass GPT-3.5 Turbo and GPT-4o mini on a broad suite of academic benchmarks.
Molmo is a suite of multimodal models that interpret images, diagrams, and text in a single prompt. Molmo narrows the performance gap with proprietary giants across VQA and captioning leaderboards, while lighter variants outclass models ten times their size, making them ideal for cost-sensitive inference. Molmo empowers developers to deploy state-of-the-art multimodal reasoning through the Cirrascale Inference Platform.
Tülu is Ai2’s family of open instruction-following models, released with full data, code, and post-training recipes that serve as a comprehensive guide to modern post-training techniques. Trained with Ai2’s Reinforcement Learning with Verifiable Rewards (RLVR) and other modern fine-tuning methods, the models deliver strong reasoning, coding, and multi-turn dialogue skills while staying light enough for cost-effective inference.
OLMo and Molmo are true open models, offering open weights and—uniquely for AI models—fully open data, all under permissive licensing that benefits enterprises.
A web front end lets users create an account with SSO, explore models in a sandbox chat UI, grab API keys/endpoints, add a payment method, and start building.
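Once you have an API key and endpoint from the web front end, calling a model is a standard HTTPS request. The sketch below assumes an OpenAI-style chat completions interface; the endpoint URL and model identifier shown are placeholders, not the platform's actual values — use the ones listed in your Cirrascale dashboard.

```python
import json
import os
import urllib.request

# Placeholder endpoint and model name for illustration only.
# Substitute the URL and model identifier from your Cirrascale dashboard.
API_URL = "https://inference.example.com/v1/chat/completions"
MODEL = "olmo-2-32b"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request (not yet sent)."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Summarize OLMo 2 in one sentence.",
                    os.environ.get("API_KEY", "demo-key"))
# Sending the request is left to the reader, e.g.:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Because the request follows the widely used chat completions shape, most existing OpenAI-compatible client libraries can be pointed at the endpoint by overriding their base URL.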
Leverage the Cirrascale Inference Platform’s load-balancing technology to automatically scale pipelines up or down to meet throughput demands.
Designed for 24/7 availability, with NOC monitoring, ticketing, and escalation paths that meet the uptime expectations of production workloads.
Cirrascale auto-selects and configures the best AI accelerators and hardware for your model, enabling faster innovation.
Token-based, daily true-up billing that removes guesswork and surprise invoices.
It's easy to get started: sign up, grab your API key, and start building today.