Extracted Insight

  • Product Overview: Google announced Gemini 3.5 Live Translate, an audio‑only model that delivers near real‑time speech‑to‑speech translation across more than 70 languages. The model automatically detects the spoken language, preserves the speaker’s intonation, pacing and pitch, and generates translated speech continuously, staying only a few seconds behind the speaker.
  • Technical Capabilities: The system processes speech as it streams, handles multilingual input without manual configuration, and is robust to noise in loud, unpredictable environments. All audio output is watermarked with SynthID, an imperceptible watermark that marks the audio as AI‑generated to help counter misinformation.
  • Roll‑out Plan: The model is rolling out today across multiple Google products:
      ◦ Public preview for developers via the Gemini Live API and Google AI Studio.
      ◦ Private preview for enterprises in Google Meet, starting this month.
      ◦ Integration into Google Translate on Android and iOS.
  • Developer Ecosystem: Platforms such as Agora, Fishjam, LiveKit, Pipecat and Vision Agents are integrating the technology to enable voice‑translation applications.
  • Enterprise Adoption – Grab: Grab is testing Gemini 3.5 Live Translate to enable multilingual communication between drivers and travelers at pickups. Grab’s users make over 10 million voice calls per month. Philipp Kandal, Chief Product Officer at Grab, highlighted the model’s ability to auto‑detect multiple languages and translate speech accurately with low latency.
  • Google Meet Enhancement: In Google Meet, the new model expands language support from the previous limit of five languages to more than 70, allowing more than 2,000 possible language combinations within a single meeting.
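
The figure of more than 2,000 combinations can be sanity‑checked with a short calculation. The sketch below assumes exactly 70 supported languages (the source only says "over 70") and assumes a "combination" means an unordered pair of distinct languages. Both are assumptions made for illustration, not details confirmed by the announcement.

```python
from math import comb

# Assumption: exactly 70 languages; the source says "over 70".
languages = 70

# Unordered pairs: language A with language B counted once.
unordered_pairs = comb(languages, 2)

# Directed pairs: A -> B and B -> A counted separately.
directed_pairs = languages * (languages - 1)

print(unordered_pairs)  # 2415
print(directed_pairs)   # 4830
```

Under these assumptions, the unordered count of 2,415 is consistent with the "more than 2,000" figure. Counting each translation direction separately would give 4,830.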