Stay free if you only need the Apache 2.0 license and the full Colang 2.0 specification language. Upgrade if you need an enterprise SLA, support, and GPU-accelerated low-latency rails. Most solo builders can start free.
Why it matters: Colang has a learning curve; it's a new domain-specific language that developers must learn on top of their existing stack.
Available from: NVIDIA Enterprise
Why it matters: Adding multiple rail layers introduces measurable latency (50-200ms per rail check depending on complexity), which compounds in real-time applications.
Why it matters: The framework is primarily focused on text-based conversations, with limited support for multimodal content filtering (images, audio, video).
Why it matters: Complex guardrail configurations can be difficult to test exhaustively, making it hard to guarantee coverage against all edge cases.
Why it matters: Enterprise support gets you help when you're stuck, which can save hours of troubleshooting on critical projects.
Colang is a domain-specific language created by NVIDIA specifically for defining conversational guardrails. It uses an event-driven model where you define flows describing how the AI should behave. The syntax is relatively simple and purpose-built — most developers can write basic guardrails within a few hours of reading the docs.
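To give a flavor of that event-driven style, here is a minimal sketch in Colang 1.0 syntax; the flow and message names are illustrative, not taken from any official example:

```colang
# User intent, defined by a few example utterances
define user ask about politics
  "who should I vote for"
  "what do you think about the election"

# Canned bot response for that intent
define bot refuse to answer politics
  "I'd rather not discuss politics. Can I help with something else?"

# Flow: when the user intent fires, the bot responds with the refusal
define flow politics rail
  user ask about politics
  bot refuse to answer politics
```

Flows like this are matched against incoming messages at runtime, so the guardrail logic lives in configuration rather than application code.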
Each rail layer adds 50-200ms depending on complexity. Input rails run before the LLM call, so they add to perceived latency. Output rails run after. Simple topic checks are fast; complex fact-checking rails that require additional LLM calls are slower. GPU acceleration reduces this significantly.
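The arithmetic above is worth making concrete. A rough back-of-the-envelope sketch in Python, using the 50-200ms per-rail figures as assumed inputs (the specific rail timings below are illustrative, not benchmarks):

```python
# Latency-budget sketch for a rails pipeline. Input rails run before the
# LLM call (they delay the first token); output rails run after (they
# delay the final answer). Each number is an assumed per-rail cost in ms.

def total_added_latency(input_rails_ms, output_rails_ms):
    """Sum the per-rail overhead across both stages of the pipeline."""
    return sum(input_rails_ms) + sum(output_rails_ms)

# Example: a fast topic check (50ms) and jailbreak detection (80ms) on
# input, plus one fact-checking rail that makes an extra LLM call (200ms)
# on output.
overhead = total_added_latency([50, 80], [200])
print(overhead)  # 330 ms of added end-to-end latency
```

Stacking even three modest rails can add a third of a second, which is why GPU acceleration and trimming unnecessary rails matter for real-time use.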
No guardrail system can prevent 100% of jailbreak attempts. NeMo Guardrails significantly reduces the attack surface through multi-layered detection, but determined adversaries with novel techniques may find bypasses. It's best used as part of a defense-in-depth strategy alongside prompt engineering and monitoring.
NeMo Guardrails works with any LLM including OpenAI, Anthropic, Google, open-source models, and NVIDIA's own models. The guardrails wrap the LLM interaction, so the underlying model is interchangeable. Some rails use a secondary LLM for evaluation, which can be any supported provider.
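Because the model is named in configuration rather than code, swapping providers is typically a small config change. A hypothetical `config.yml` fragment, with engine and model names given purely as examples:

```yaml
# The `models` section declares which LLM the rails wrap.
models:
  - type: main
    engine: openai            # e.g. an Anthropic or open-source engine instead
    model: gpt-3.5-turbo-instruct
```

The rails definitions themselves stay unchanged when the `engine` and `model` values are swapped out.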
Start with the free plan — upgrade when you need more.
Get Started Free →
Still not sure? Read our full verdict →
Last verified March 2026