Ai2 Endpoint Platform

Scalable API endpoints for OLMo, Molmo, and Tülu

Ai2’s open foundation models are now available on the Cirrascale Inference Platform. Start now and put Ai2's models to work for you.

Why Deploy Ai2 Models with Cirrascale?

Ai2’s fully open OLMo and Molmo models, and open-weights Tülu models, empower enterprises to build AI-powered applications out of the box. The Cirrascale Inference Platform provides scalable, secure API endpoints that simplify deployment. Together, we deliver plug-and-play API access to OLMo and Molmo, enabling enterprises to prototype, launch, and scale with ease. Additionally, these models can be fine-tuned or RAG-enabled and then deployed with the Cirrascale Inference Platform, removing operational hurdles and accelerating enterprise adoption.

OLMo2

OLMo 2 is Ai2’s latest family of truly open-source language models, trained on up to six trillion tokens and released with full data, code, checkpoints, and evaluation suites. Performance holds its own: the 7B and 13B checkpoints meet or beat Llama-3 peers, while the flagship 32B model is the first fully-open LLM to surpass GPT-3.5-Turbo and GPT-4o mini on a broad suite of academic benchmarks.

Molmo

Molmo is a suite of multimodal models that interpret images, diagrams, and text in a single prompt. Molmo narrows the performance gap with proprietary giants across VQA and captioning leaderboards, while lighter variants outclass models ten times larger—ideal for cost-sensitive inference. Molmo empowers developers to deploy state-of-the-art multimodal reasoning through the Cirrascale Inference Platform.

Tülu

Tülu is Ai2’s family of open instruction-following models with fully released data, code, and post-training recipes designed to serve as a comprehensive guide for modern post-training techniques. Using Ai2’s Reinforcement Learning with Verifiable Rewards and other modern fine-tuning techniques, the models deliver strong reasoning, coding, and multi-turn dialogue skills while staying light enough for cost-effective inference.

Ai2 Endpoint Platform Core Benefits

Truly Open Models

OLMo and Molmo are true open models, offering open weights and—uniquely for AI models—fully open data, all under permissive licensing that benefits enterprises.

Ready to Use with Self-Service Onboarding

A web front end lets users create an account with SSO, explore models in a sandbox chat UI, grab API keys/endpoints, add a payment method, and start building.

Scale Performance with No Infrastructure Headaches

Leverage the Cirrascale Inference Platform’s load-balancing technology to automatically scale pipelines up or down to meet throughput demands.

Enterprise-Grade Reliability and Support

Designed for 24/7 availability with NOC monitoring, ticketing and escalation paths that meet the uptime expectations of production workloads.

Accelerator Optimization

Cirrascale auto-selects and configures the best AI accelerators and hardware for your model, enabling faster innovation.

Pay Only for What You Use

Token-based, daily true-up billing that removes guesswork and surprise invoices.

Ready to Put Ai2's Models to Work for You?

It's easy to get started. Our web front end lets users create an account with SSO, explore models in a sandbox chat UI, grab API keys/endpoints, add a payment method, and start building.