Text to Speech in Chrome and Chrome OS

Chrome and Chrome OS allow developers to produce synthesized speech. This document is an overview of the relevant code and code structure around synthesized speech.

Code structure

This section briefly outlines the flow from a speech request to the resulting speech on any platform.

Input

chrome.tts extension API
- The chrome.tts extension API allows extensions to request speech across Windows, Mac or Chrome OS, using native speech synthesis.
- Input to the extension API is first processed in the TtsExtensionApi.
- The extension passes an options object to chrome.tts.speak, which is translated into a tts_controller Utterance.
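
As a rough illustration of the request side, the sketch below builds an options object and hands it to a function shaped like chrome.tts.speak. The helper name requestSpeech and the injected speak parameter are hypothetical, introduced only so the snippet runs outside an extension; the option fields used (lang, rate, enqueue, onEvent) follow the public chrome.tts documentation.

```typescript
// Subset of the chrome.tts event and option fields used in this sketch.
interface TtsEvent {
  type: string; // e.g. "start", "end", "error"
  charIndex?: number;
  errorMessage?: string;
}

interface SpeakOptions {
  lang?: string;
  rate?: number;
  enqueue?: boolean;
  onEvent?: (event: TtsEvent) => void;
}

// Same shape as a call to chrome.tts.speak(utterance, options).
type SpeakFn = (utterance: string, options: SpeakOptions) => void;

// Hypothetical helper: the speak function is injected so the sketch can run
// without the extension runtime. Event types are appended to `log`.
function requestSpeech(speak: SpeakFn, text: string, log: string[]): SpeakOptions {
  const options: SpeakOptions = {
    lang: "en-US",
    rate: 1.0,
    enqueue: false, // interrupt speech in progress instead of queueing
    onEvent: (event) => {
      log.push(event.type);
    },
  };
  speak(text, options);
  return options;
}

// In an extension with the "tts" permission, something like:
//   requestSpeech((t, o) => chrome.tts.speak(t, o), "Hello, world.", []);
```

Injecting the speak function is only a convenience for trying the sketch; an extension would normally call chrome.tts.speak directly.
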
Web Speech API
- Chrome implements window.speechSynthesis from the Web Speech API. This allows web apps to do text-to-speech via the device's speech synthesizer.
- A WebSpeechSynthesisUtterance is created by window.speechSynthesis.
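
On the web side, a page typically constructs a SpeechSynthesisUtterance and passes it to window.speechSynthesis.speak(). A minimal sketch follows; the function name speakOnPage is hypothetical, and the globals are reached through globalThis only so that the snippet also compiles and runs (as a no-op) outside a browser.

```typescript
// Hypothetical helper: speaks `text` through the Web Speech API if the
// environment exposes it, and reports whether a request was made.
function speakOnPage(text: string, lang: string = "en-US"): boolean {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) {
    return false; // no speech synthesis available here
  }
  const utterance = new g.SpeechSynthesisUtterance(text);
  utterance.lang = lang; // BCP 47 language tag
  utterance.rate = 1.0; // default speaking rate
  g.speechSynthesis.speak(utterance);
  return true;
}
```

In page code with DOM typings available, the same calls can be written directly as new SpeechSynthesisUtterance(text) and window.speechSynthesis.speak(utterance).
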

Processing

The TtsControllerImpl (in content/) processes utterances and sends them to the correct output engine.
The TtsControllerDelegateImpl (in chrome/) provides Chrome OS-specific functionality.
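
To make the routing step concrete, here is a deliberately simplified model of a controller choosing an output engine for each utterance. It is not Chromium's implementation (which is C++): the names ToyUtterance, ToyEngine and routeUtterance are hypothetical, and the rule "use the engine that owns the requested voice, otherwise fall back to the platform engine" is an assumption made purely for illustration.

```typescript
// Toy model only; none of these types exist in Chromium.
interface ToyUtterance {
  text: string;
  voiceName?: string;
}

interface ToyEngine {
  name: string;
  voices: string[];
  speak(utterance: ToyUtterance): void;
}

// Assumed rule: pick the first engine that lists the requested voice,
// otherwise fall back to the platform engine.
function routeUtterance(
  utterance: ToyUtterance,
  engines: ToyEngine[],
  platform: ToyEngine,
): ToyEngine {
  let target = platform;
  const requested = utterance.voiceName;
  if (requested !== undefined) {
    for (const engine of engines) {
      if (engine.voices.indexOf(requested) !== -1) {
        target = engine;
        break;
      }
    }
  }
  target.speak(utterance);
  return target;
}
```
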

Output

Output may differ by system, including Mac, Windows, Android, ARC++, and Chrome OS.
- Platform APIs are in content/browser/speech, except for Chrome OS's, which are in chrome/browser/speech.
In Chrome OS:
- TtsEngineExtensionAPI forwards speech events to PATTS, the network speech engine, or (coming soon) third-party speech engines.
- PATTS is the built-in Chrome OS text-to-speech engine.
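
For orientation, a speech engine extension receives forwarded requests through the chrome.ttsEngine API: it registers an onSpeak listener and reports progress by calling the sendTtsEvent callback it is given. The sketch below shows only that event handshake. The name handleSpeak is hypothetical, no audio is synthesized, rejecting empty text is an illustrative choice rather than documented engine behavior, and the registration line is left as a comment because it needs the extension runtime.

```typescript
// Subset of the event fields an engine can report back.
interface EngineEvent {
  type: string; // "start", "end", "error", ...
  charIndex?: number;
  errorMessage?: string;
}

type SendTtsEvent = (event: EngineEvent) => void;

// Hypothetical listener body with the same parameter order as
// chrome.ttsEngine.onSpeak: (utterance, options, sendTtsEvent).
function handleSpeak(
  utterance: string,
  _options: { lang?: string; rate?: number },
  sendTtsEvent: SendTtsEvent,
): void {
  if (utterance.length === 0) {
    // Illustrative choice: report an error instead of speaking nothing.
    sendTtsEvent({ type: "error", errorMessage: "empty utterance" });
    return;
  }
  sendTtsEvent({ type: "start", charIndex: 0 });
  // A real engine would synthesize and play audio here.
  sendTtsEvent({ type: "end", charIndex: utterance.length });
}

// In an engine extension, something like:
//   chrome.ttsEngine.onSpeak.addListener(handleSpeak);
```
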

Testing

Unit tests
- TtsControllerUnittest in content/browser/speech
- TtsControllerDelegateImplUnittest in chrome/browser/speech
- ArcTtsServiceUnittest for ARC++ voices
Browser tests
- TtsApiTest tests Chrome TTS extension APIs
Fuzzer
- In content_unittests, content/browser/speech/tts_platform_fuzzer.cc (currently Windows only).