Start free. Scale when you need to.

LayerScale is free to use with full inference capabilities. Upgrade to Pro when you need unlimited sessions and production monitoring.

Free

LayerScale

$0
No license key required. Use forever.
  • OpenAI and Anthropic API compatibility
  • Tool calling (both formats)
  • SSE streaming responses
  • 1 persistent session
  • Streaming data ingestion
  • Flash Queries
  • WebSocket support
  • Full GPU acceleration
  • Context length up to 32K tokens
  • KV cache quantization (FP16, Q8_0, Q4_0)
  • 2 context pool slots
Get License Key
License key required

LayerScale Pro

Contact us
30-day free trial available.
  • Everything in LayerScale
  • Unlimited concurrent sessions
  • Unlimited context length (up to 1M tokens)
  • Unlimited context pool
  • Prometheus metrics endpoint
  • Priority decode scheduler
  • Production support
Contact Sales

Feature comparison

Feature LayerScale Pro
Inference
OpenAI API (/v1/chat/completions)
Anthropic API (/v1/messages)
Tool calling
SSE streaming
GPU acceleration (CUDA, ROCm)
Stateful Inference
Persistent sessions1Unlimited
Streaming data ingestion
Flash Queries
WebSocket connections
Infrastructure
Context length32,768Unlimited
Context pool size2Unlimited
KV cache quantization
Prometheus metrics
Priority decode scheduler
Support
Documentation
Priority support
License key requiredNoYes