Introduction to Web SpeechSynthesis

7th July 2023

In the world of the web, there's a cool thing called Speech Synthesis, also known as Text-to-Speech (TTS). It's a technology that turns written text into human-like speech. It's like having a computer that can talk! In this blog post, we'll explore what Web Speech Synthesis is all about, how it helps people, and why it's so cool.

What is Web Speech Synthesis?

Before diving into the main topic, let's first understand what Web Speech Synthesis or Speech Synthesis in general means.

SpeechSynthesis is the controller interface for the speech service of the Web Speech API. Speech synthesis itself is a way of transforming text into speech using a computer-generated voice: in its simplest form, the computer reads typed text aloud in a human-like voice. The SpeechSynthesis interface is used to retrieve information about the available synthesis voices, start and pause speech, and issue other commands. In short, with SpeechSynthesis we can tell the browser to read a given text aloud.

So, we just came across a term called the Web Speech API. Now, let's explore what it actually refers to.

What is Web Speech API?

The Web Speech API was first introduced in 2012, and web developers use it to add audio input and output features to web apps. It is built into the browser, so we don’t need any third-party apps or plugins. The API has two major interfaces: SpeechSynthesis (text to speech) and SpeechRecognition (speech to text).

How to use Web Speech Synthesis?

Now let's see how to use the magic of Web Speech Synthesis. Using Speech Synthesis in your application is simple and straightforward. Because it is built into the browser, we don’t need anything complex to use it. The entry point is the Speech Synthesis controller, which we can access through window.speechSynthesis.
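
Not every environment exposes the controller, so it can be worth checking for it before use. The snippet below is a minimal sketch of that check, not a definitive implementation; getSynth is a hypothetical helper name and is not part of the Web Speech API.

```javascript
// Returns the SpeechSynthesis controller, or null when it is unavailable.
// `getSynth` is a hypothetical helper name, not part of the Web Speech API.
function getSynth(globalObj) {
  if (globalObj && "speechSynthesis" in globalObj) {
    return globalObj.speechSynthesis;
  }
  return null;
}

// In a browser, pass `window`; outside a browser (e.g. Node.js) this yields null.
const synthController = getSynth(typeof window !== "undefined" ? window : undefined);
if (synthController === null) {
  console.log("Speech synthesis is not supported in this environment.");
}
```

Passing the global object in as a parameter keeps the check easy to try out with a stand-in object.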

SpeechSynthesis has its own methods, properties and events. It also inherits from its parent interface, EventTarget, which is what allows it to dispatch events and register listeners.

Speech Synthesis Properties and Events

Below are the most common properties provided by Speech Synthesis:
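
The SpeechSynthesis interface exposes three read-only boolean properties: paused, pending and speaking. As a rough sketch of how they might be read together, the hypothetical helper below (describeState is an assumed name, and the "idle" label is made up for illustration) turns them into a single status string.

```javascript
// Summarises the controller's state from its three boolean properties.
// `describeState` is a hypothetical helper; `synth` is expected to be
// window.speechSynthesis (or any object with the same three properties).
function describeState(synth) {
  // `paused` is checked first because `speaking` can stay true while paused.
  if (synth.paused) return "paused";
  if (synth.speaking) return "speaking";
  if (synth.pending) return "pending"; // utterances queued but not yet started
  return "idle"; // label chosen for this example only
}

// Browser usage:
// console.log(describeState(window.speechSynthesis));
```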

Below is the event provided by Speech Synthesis:

voiceschanged
Fired when the list of SpeechSynthesisVoice objects that would be returned by the SpeechSynthesis.getVoices() method has changed. Also available via the onvoiceschanged property.

Speech Synthesis Methods
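
Because the voice list can load asynchronously, getVoices() is often paired with the voiceschanged event. The following is a minimal sketch under that assumption, not a definitive implementation: loadVoices is a hypothetical helper name, and browser behaviour varies, so some browsers may return the voices immediately and never fire the event.

```javascript
// Resolves with the available voices, waiting for `voiceschanged`
// when the list is still empty on the first call.
// `loadVoices` is a hypothetical helper; `synth` is expected to be
// window.speechSynthesis.
function loadVoices(synth) {
  return new Promise((resolve) => {
    const voices = synth.getVoices();
    if (voices.length > 0) {
      resolve(voices);
      return;
    }
    synth.addEventListener(
      "voiceschanged",
      () => resolve(synth.getVoices()),
      { once: true } // remove the listener after the first event
    );
  });
}

// Browser usage:
// loadVoices(window.speechSynthesis).then((voices) => {
//   voices.forEach((v) => console.log(v.name, v.lang));
// });
```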

Below are the most common methods provided by Speech Synthesis:
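
The controller's methods include speak(), pause(), resume(), cancel() and getVoices(). The sketch below shows one way the first four might be combined; speakText, togglePause and the makeUtterance parameter are hypothetical names introduced for illustration. Calling cancel() before speak() is a design choice for this example, not a requirement of the API.

```javascript
// Queues `text` for speaking and returns the utterance that was created.
// `speakText` and `makeUtterance` are hypothetical names; in a browser the
// factory defaults to the real SpeechSynthesisUtterance constructor.
function speakText(
  synth,
  text,
  makeUtterance = (t) => new SpeechSynthesisUtterance(t)
) {
  synth.cancel(); // drop anything still queued or being spoken
  const utterance = makeUtterance(text);
  synth.speak(utterance); // add the utterance to the queue
  return utterance;
}

// Switches between pause() and resume() based on the current state.
function togglePause(synth) {
  if (synth.paused) {
    synth.resume();
  } else if (synth.speaking) {
    synth.pause();
  }
}

// Browser usage:
// speakText(window.speechSynthesis, "Hello from the browser!");
// togglePause(window.speechSynthesis);
```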

SpeechSynthesisUtterance (Utterance)

Since the term "utterance" came up in the previous section, let's take a moment to understand what exactly an utterance means.

The SpeechSynthesisUtterance interface of the Web Speech API represents a speech request. It contains the content the speech service should read and information about how to read it (e.g. language, pitch and volume). In short, it's the message or text we want the browser to read.

It has its own properties, such as .lang, .pitch, .rate, .text, .voice and .volume.

It has its own events, such as start, end, pause and resume.
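
To make these properties and events concrete, here is a minimal sketch of configuring an utterance before it is spoken. configureUtterance and its options object are hypothetical names introduced for this example; the property and event names themselves belong to SpeechSynthesisUtterance.

```javascript
// Applies settings and event handlers to an utterance.
// `configureUtterance` is a hypothetical helper; `utterance` is expected to
// be a SpeechSynthesisUtterance (any plain object works for experimenting).
function configureUtterance(utterance, options = {}) {
  const { lang = "en-US", pitch = 1, rate = 1, volume = 1, voice = null } = options;
  utterance.lang = lang;
  utterance.pitch = pitch;   // 0 to 2, default 1
  utterance.rate = rate;     // 0.1 to 10, default 1
  utterance.volume = volume; // 0 to 1, default 1
  if (voice) utterance.voice = voice; // a SpeechSynthesisVoice from getVoices()
  utterance.onstart = () => console.log("Started speaking");
  utterance.onpause = () => console.log("Paused");
  utterance.onresume = () => console.log("Resumed");
  utterance.onend = () => console.log("Finished speaking");
  return utterance;
}

// Browser usage:
// const u = configureUtterance(new SpeechSynthesisUtterance("Hello!"), { rate: 1.2 });
// window.speechSynthesis.speak(u);
```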

Now let's look at some benefits and limitations of Speech Synthesis.

Benefits of Speech Synthesis

There are various useful applications of Speech Synthesizers and some of them are:

Limitations of Speech Synthesis

Like every technology, Speech Synthesis also has some limitations.

Example

You can see it working here: https://github.com/corevalue-technologies/js-text-reader

Now, let's examine how the application operates:

Below is a screenshot of the application:

[Screenshot: speech-synthesis]

Conclusion

Speech synthesis is an amazing technology that turns written words into spoken language. It helps people with reading difficulties, improves communication for those who can't speak, and makes digital experiences more accessible and enjoyable. As this technology continues to advance, we can expect even more exciting features and possibilities. So, get ready for a world where computers can talk and make our lives even more awesome!