When to fine-tune
Fine-tuning makes sense when you need a consistent output format, a domain-specific tone, or latency improvements that prompt engineering alone cannot achieve.
Methods
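- Full fine-tuning: all weights are updated; expensive, but offers maximum flexibility.
- LoRA / QLoRA: small low-rank adapters are trained while the base weights stay frozen (QLoRA additionally quantizes the frozen base model to 4-bit); typically 10-100× cheaper than full fine-tuning, and our default recommendation for most clients.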
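To give a feel for where the LoRA savings come from, here is a minimal sketch that counts trainable parameters for a single weight matrix. The function names and the 4096 x 4096, rank-8 sizes are illustrative assumptions, not figures from a specific model or client project.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter on the same matrix.

    The adapter is two small matrices: A (rank x d_in) and B (d_out x rank).
    The original weight matrix stays frozen and is not counted.
    """
    return rank * (d_in + d_out)


def reduction_factor(d_in: int, d_out: int, rank: int) -> float:
    """How many times fewer parameters LoRA trains for this one matrix."""
    return full_finetune_params(d_in, d_out) / lora_params(d_in, d_out, rank)


if __name__ == "__main__":
    # Illustrative sizes only (assumption): one 4096 x 4096 projection, rank 8.
    print(full_finetune_params(4096, 4096))  # 16777216
    print(lora_params(4096, 4096, 8))        # 65536
    print(reduction_factor(4096, 4096, 8))   # 256.0
```

This counts trainable parameters for one matrix only. End-to-end savings in time and money are smaller than the parameter ratio suggests, because the frozen base model still runs in every forward and backward pass.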
Evaluation
We always establish a baseline eval suite before fine-tuning starts so you can measure actual improvement, not vibes.
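A baseline comparison can be as simple as the sketch below, which scores two models on the same fixed cases. It assumes exact-match scoring and models exposed as plain prompt-to-text functions; `exact_match_score`, `compare`, and the stub models are hypothetical names for illustration, and a real suite would likely use task-specific metrics.

```python
from typing import Callable, Dict, List, Tuple

EvalCase = Tuple[str, str]  # (prompt, expected output)
Model = Callable[[str], str]  # assumption: a model is a prompt -> text function


def exact_match_score(model: Model, cases: List[EvalCase]) -> float:
    """Fraction of cases where the model output equals the expected text."""
    if not cases:
        raise ValueError("eval suite is empty")
    hits = sum(
        1 for prompt, expected in cases
        if model(prompt).strip() == expected.strip()
    )
    return hits / len(cases)


def compare(baseline: Model, candidate: Model, cases: List[EvalCase]) -> Dict[str, float]:
    """Score both models on the same cases and report the difference."""
    base = exact_match_score(baseline, cases)
    cand = exact_match_score(candidate, cases)
    return {"baseline": base, "candidate": cand, "delta": cand - base}


if __name__ == "__main__":
    # Stub models standing in for the base and fine-tuned models (assumption).
    cases = [("2+2", "4"), ("capital of France", "Paris")]
    baseline = lambda p: {"2+2": "4"}.get(p, "")
    candidate = lambda p: {"2+2": "4", "capital of France": "Paris"}.get(p, "")
    print(compare(baseline, candidate, cases))
    # {'baseline': 0.5, 'candidate': 1.0, 'delta': 0.5}
```

Freezing the case list before training starts is the point of the exercise: the same cases score both models, so the delta reflects the fine-tune rather than a change in the test.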