Smallest.ai’s Lightning V2 TTS model now runs on Tenstorrent’s P100 accelerators, delivering 3.6× lower infrastructure cost versus NVIDIA L40S GPUs.
The partnership achieves up to 4× cost reduction and higher throughput, supporting 550 simultaneous voice calls with no audio quality loss.
Over 95% of Lightning V2 runs in low‑precision arithmetic, with more than 80% in BlockFloat8, enabling real‑time inference without audible degradation.
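To illustrate why a block floating-point format can keep audio quality while using about 8 bits per value, here is a minimal sketch of generic block floating-point quantization: every value in a block shares one exponent, and each value keeps only a small signed integer mantissa. This is not Tenstorrent's actual BlockFloat8 layout or any library API; the function names `bfp_quantize` and `bfp_dequantize`, the 7-bit mantissa, and the sample block are assumptions chosen for illustration.

```python
import math

def bfp_quantize(block, mantissa_bits=7):
    """Quantize floats to one shared exponent plus signed integer mantissas.

    Hypothetical helper: mantissa_bits=7 (plus a sign) is an assumed width,
    not a confirmed hardware parameter.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp gives max_abs = m * 2**shared_exp with 0.5 <= m < 1.
    _, shared_exp = math.frexp(max_abs)
    scale = 2.0 ** (mantissa_bits - shared_exp)
    limit = (1 << mantissa_bits) - 1
    # Round to the nearest step, then clamp so the largest value still fits.
    mantissas = [max(-limit, min(limit, round(x * scale))) for x in block]
    return shared_exp, mantissas

def bfp_dequantize(shared_exp, mantissas, mantissa_bits=7):
    """Rebuild approximate floats from the shared exponent and mantissas."""
    step = 2.0 ** (shared_exp - mantissa_bits)
    return [m * step for m in mantissas]

# Illustrative block: one exponent is stored for all four values.
block = [0.5, -0.25, 0.1, 0.003]
shared_exp, mantissas = bfp_quantize(block)
restored = bfp_dequantize(shared_exp, mantissas)
step = 2.0 ** (shared_exp - 7)
max_error = max(abs(a - b) for a, b in zip(block, restored))
```

In this sketch the reconstruction error of each value is bounded by one quantization step of the block, so values near the block's maximum keep most of their precision, while values far smaller than the maximum (such as 0.003 here) can round to zero. That trade-off is why block formats work best when values within a block have similar magnitudes.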
The solution targets regulated sectors (financial services, healthcare, telecom), allowing on‑prem deployments that meet GDPR, HIPAA, and data‑sovereignty requirements.