Crasis distills a frontier model's understanding of your task into a tiny, local ONNX specialist. Train once. Deploy everywhere. Zero API cost, zero latency, zero data leaving your device.
Crasis calls a frontier model exactly once — to understand your task and generate labeled training data. That intelligence is distilled into a tiny encoder model. After that, the frontier model is never called again.
Describe your task in plain English. A spec is not code — it's a contract. Define what a positive example looks like, what to ignore, and your quality bar.
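A spec for a hypothetical email-triage task might read like this; the task, wording, and structure below are purely illustrative, not a required Crasis format:

```text
Task: classify incoming customer emails as REFUND_REQUEST or OTHER.

Positive example: the sender explicitly asks for money back, a chargeback,
or cancellation with reimbursement.

Ignore: shipping questions, product complaints with no refund ask,
automated receipts, and internal forwards.

Quality bar: when intent is ambiguous, label OTHER. Precision matters
more than recall for this task.
```

The point of the contract framing is that everything downstream (data generation, labeling, evaluation) is judged against this text.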
Crasis calls OpenRouter with enforce_distillable_text: true — routing only to models whose licenses explicitly permit distillation. Clean provenance on every sample.
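If you wanted to reproduce the provenance constraint in your own OpenRouter call, the request body might look like the fragment below. Only the flag name `enforce_distillable_text` comes from this document; its exact placement in the payload, the model id, and the prompt are assumptions for illustration:

```json
{
  "model": "openrouter/auto",
  "enforce_distillable_text": true,
  "messages": [
    {
      "role": "user",
      "content": "Generate 50 labeled examples of refund-request emails."
    }
  ]
}
```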
BERT-class distillation on your GPU. An RTX 4060 trains a binary specialist in under 30 minutes. CPU-only training is supported — it just takes longer.
The output is a single ONNX file. It runs on laptops, Raspberry Pi, Jetson, and mobile. ONNX Runtime has bindings for every major language. No GPU required at inference time.
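Consuming that single file takes only a few lines with ONNX Runtime's Python bindings. The model filename, input tensor names, and tokenization step below are assumptions for illustration; check your exported model's actual input names:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def classify(model_path, input_ids, attention_mask):
    # onnxruntime is imported lazily so the pure-numpy helper above
    # can be used without the runtime installed.
    import onnxruntime as ort
    sess = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    (logits,) = sess.run(None, {
        "input_ids": input_ids,          # int64, shape (batch, seq_len) -- name is an assumption
        "attention_mask": attention_mask,
    })
    probs = softmax(logits)
    return probs.argmax(axis=-1), probs

# Hypothetical usage (tokenization not shown):
# labels, probs = classify("specialist.onnx", ids, mask)
```

Because the session runs on the CPU execution provider, nothing here needs a GPU or a network connection.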
When you have real labeled examples, crasis mix blends them with synthetic data at a configurable weight and retrains. The gap between synthetic and real accuracy closes fast.
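The blending step is conceptually simple. This sketch shows one way to mix real and synthetic examples at a configurable real-data weight; the sampling scheme is an assumption, not Crasis's actual `mix` implementation:

```python
import random

def mix(synthetic, real, real_weight=0.5, n=None, seed=0):
    """Draw a blended training set: each sample comes from `real`
    with probability `real_weight`, otherwise from `synthetic`."""
    rng = random.Random(seed)
    n = n or len(synthetic)
    blended = []
    for _ in range(n):
        pool = real if (real and rng.random() < real_weight) else synthetic
        blended.append(rng.choice(pool))
    return blended

# e.g. mix(synthetic_rows, real_rows, real_weight=0.3) keeps synthetic
# data dominant while anchoring the model to real examples.
```

Sampling with replacement like this lets a small real dataset carry a large effective weight without duplicating files on disk.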
These are the tasks people most commonly pay frontier models to handle. Pull, deploy, never pay for them again. Accuracy is reported as both synthetic and holdout — see SCORECARD.md for full results.
* Experimental: holdout accuracy below 75% indicates synthetic-to-real distribution shift. Use crasis mix to improve with your own data.
Frontier models are brilliant generalists. For bounded classification tasks, that's overkill.
| Metric | Frontier API | Crasis Specialist |
|---|---|---|
| Model size | 4GB+ | 4–11MB |
| Cost per query | $0.001–0.01 | $0.00 |
| Inference latency | 2–5 seconds | <3ms on CPU |
| Works offline | No | Yes |
| Data leaves device | Always | Never |
| Accuracy on narrow tasks | ~97% | See SCORECARD |
| Cost trend over time | Flat (toll) | Already free |
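To make the cost row concrete, here is a back-of-envelope calculation using the midpoint of the table's per-query price range; the query volume is an arbitrary example:

```python
price_per_query = 0.005   # midpoint of the $0.001-$0.01 range above
queries = 1_000_000

frontier_cost = queries * price_per_query
specialist_cost = 0.0     # inference is local, so marginal cost is zero

print(f"frontier: ${frontier_cost:,.0f} vs specialist: ${specialist_cost:,.0f}")
# 1M queries: $5,000 via the API, $0 locally.
```

At higher volumes the gap only widens, which is what the "toll vs. already free" row in the table is pointing at.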
Inference only. No PyTorch. No GPU. Just ONNX Runtime and a 4MB model.
Full pipeline: `pip install crasis[train]`
Want the hosted pipeline? Crasis Studio →