Smallest.ai, Tenstorrent Cut Voice AI Cost 3.6x

Smallest.ai's Lightning V2 TTS model now runs on Tenstorrent hardware, delivering production‑quality audio with low‑precision compute.
The partnership achieves 3.6× lower infrastructure cost than NVIDIA L40S GPUs, supporting 550 simultaneous voice calls for about $27k.
More than 95% of the model runs in reduced‑precision arithmetic, with more than 80% in BlockFloat8, and audio quality shows no degradation.
The solution targets regulated sectors (financial services, healthcare, telecom), allowing on‑prem deployments that meet GDPR, HIPAA, and data‑sovereignty requirements.
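
To put the cost figures in per-call terms, the arithmetic can be sketched as below. This is a back-of-the-envelope illustration, not a calculation from the announcement: it assumes the ~$27k figure is the total cost of the Tenstorrent deployment and that the 3.6× ratio applies to that same total. The function and constant names are hypothetical.

```python
# Figures quoted in the summary above.
CONCURRENT_CALLS = 550
TENSTORRENT_COST_USD = 27_000  # "~$27k"; ASSUMED to be the Tenstorrent deployment total
COST_RATIO = 3.6               # claimed cost advantage over NVIDIA L40S GPUs


def cost_per_call(total_cost_usd: float, calls: int) -> float:
    """Infrastructure cost attributed to one simultaneous call."""
    if calls <= 0:
        raise ValueError("calls must be positive")
    return total_cost_usd / calls


def implied_baseline_cost(total_cost_usd: float, ratio: float) -> float:
    """Cost of an equivalent baseline deployment, ASSUMING the ratio applies to the total."""
    return total_cost_usd * ratio


if __name__ == "__main__":
    per_call = cost_per_call(TENSTORRENT_COST_USD, CONCURRENT_CALLS)
    baseline = implied_baseline_cost(TENSTORRENT_COST_USD, COST_RATIO)
    print(f"Cost per concurrent call: ${per_call:,.2f}")
    print(f"Implied L40S-based cost:  ${baseline:,.0f}")
```

Under these assumptions the figures work out to roughly $49 per concurrent call, against an implied L40S-based deployment of about $97k for the same capacity.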