Monday, June 25, 2007

The Java Speech API: A Primer on Speech Applications


You read about them in science fiction, you see them in the movies, and you have dreamed of using them: machines that talk to you, listen to you, and take your commands. Speech-based applications can make everyday tasks faster and more efficient, and maybe add a little pep to the monotony of the daily grind. The essential component of these machines is speech technology.

Even though computer-based speech technology was first created in the latter half of the 1950s, it made little progress at first because of the high cost and limited availability of equipment. Today, speech technology has become more affordable and is widely deployed in fields such as security systems, medical systems, voice response systems, dictation systems, and assistive technology for the disabled.

Most early speech work and applications were written in C, C++, and other natively compiled languages. Now that speech applications need to be integrated with Web-based applications, Java has become the language of choice in which to build them.

The Java Speech API
The Java Speech API outlines standards and guidelines for building speech applications that interoperate with each other and run on all Java-compatible platforms. As such, it provides the API and functionality that form the base for speech applications.

Features

* Converts speech to text
* Converts text to speech and delivers it in various formats
* Supports events based on the Java event queue
* Easy-to-implement API that interoperates with other Java-based applications such as applets and Swing applications
* Interacts seamlessly with the AWT event queue
* Supports annotations using JSML to improve pronunciation and naturalness in speech
* Supports grammar definitions using JSGF
* Adapts to the language of the speaker
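
To give a feel for the JSGF feature listed above: a JSGF grammar is plain text that starts with a header line, names the grammar, and declares rules. The sketch below only holds a tiny grammar in a Java string so it can be inspected without a speech engine. The grammar name hello, the rule <command>, and the class HelloGrammar are made up for illustration.

```java
// Hypothetical example class; not part of the Java Speech API.
class HelloGrammar {

    // A minimal JSGF grammar: header, grammar name, and one public rule.
    // The rule matches "hello computer" or "goodbye computer".
    static final String JSGF = String.join("\n",
            "#JSGF V1.0;",
            "grammar hello;",
            "public <command> = (hello | goodbye) computer;");

    public static void main(String[] args) {
        // Print the grammar text; a recognizer would read this same text.
        System.out.println(JSGF);
    }
}
```

In JSAPI 1.0 a recognizer would typically read such text through its loadJSGF method, which takes a Reader; that step is left out here because it needs an installed speech engine.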

Hello World
To show you how the speech API works, this article walks you through a simple program. To run the sample program, you will need a Java implementation that supports the Java Speech API. This example uses CloudGarden's TalkingJava SDK (although you could also use FreeTTS; see 'Running the demo applications' to use FreeTTS). You can download a 30-day trial of the implementation and use its setup program to install it.

There are two files you need for developing your speech applications: cgjsapi.jar and cgjsapi.dll. Before starting work with the samples, make sure that the directory containing cgjsapi.dll is on your PATH and that cgjsapi.jar is on your CLASSPATH.
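
One quick way to confirm the CLASSPATH part of this setup is to try loading a Java Speech API class by name. The following is a minimal sketch, not something shipped with the SDK: the class SpeechClasspathCheck and its isAvailable method are hypothetical helpers, and the only assumption about the API is that javax.speech.Central (the entry point used by the sample program) lives in the jar.

```java
// Hypothetical helper; not part of the Java Speech API or the TalkingJava SDK.
class SpeechClasspathCheck {

    /** Returns true if the named class can be located by this class's loader. */
    static boolean isAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, SpeechClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // javax.speech.Central is the JSAPI entry point used by the sample program.
        boolean found = isAvailable("javax.speech.Central");
        System.out.println(found
                ? "JSAPI classes found on the CLASSPATH."
                : "JSAPI classes not found; check that cgjsapi.jar is on the CLASSPATH.");
    }
}
```

This checks only the jar. A missing cgjsapi.dll would most likely show up later, when the engine is first created or allocated.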


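TellTime.java:
This simple program demonstrates the Speech API by reading out the system time.

package speechdemo;

import javax.speech.*;
import javax.speech.synthesis.*;
import java.util.*;

public class TellTime {

    public static void main(String[] args) {
        try {
            // Build the phrase to speak, for example "It's 3 45 PM".
            Calendar calendar = new GregorianCalendar();
            int hour = calendar.get(Calendar.HOUR);
            if (hour == 0) {
                hour = 12; // Calendar.HOUR is 0 for the hour after noon or midnight
            }
            String sayTime = "It's " +
                    hour + " " +
                    calendar.get(Calendar.MINUTE) + " " +
                    (calendar.get(Calendar.AM_PM) == Calendar.AM ? "AM" : "PM");

            // Passing null asks Central for the default synthesizer.
            Synthesizer synth = Central.createSynthesizer(null);
            if (synth == null) {
                System.err.println("No speech synthesizer is available.");
                return;
            }
            synth.allocate();
            synth.resume();

            // Queue the phrase as plain text (no JSML markup), with no listener.
            synth.speakPlainText(sayTime, null);

            // Wait until the queue has been spoken, then release the engine.
            synth.waitEngineState(Synthesizer.QUEUE_EMPTY);
            synth.deallocate();

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}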
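
The time phrase in TellTime can be built and checked without a speech engine. One detail worth knowing is that Calendar.HOUR uses a 12-hour clock whose value is 0 for the hour starting at noon or midnight, so a spoken-time helper may want to map 0 to 12. The sketch below is one possible way to do that. The class SpokenTime and method phrase are hypothetical, and the "oh" and "o'clock" wording is a stylistic choice, not something defined by the Speech API.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical helper that builds the text a synthesizer would be asked to speak.
class SpokenTime {

    /** Builds a phrase such as "It's 3 30 PM" from the given calendar. */
    static String phrase(Calendar calendar) {
        int hour = calendar.get(Calendar.HOUR);      // 0 to 11 on a 12-hour clock
        if (hour == 0) {
            hour = 12;                               // say "12", not "0"
        }
        int minute = calendar.get(Calendar.MINUTE);
        String minutes;
        if (minute == 0) {
            minutes = "o'clock";
        } else if (minute < 10) {
            minutes = "oh " + minute;                // e.g. "oh 5" for :05
        } else {
            minutes = String.valueOf(minute);
        }
        String amPm = calendar.get(Calendar.AM_PM) == Calendar.AM ? "AM" : "PM";
        return "It's " + hour + " " + minutes + " " + amPm;
    }

    public static void main(String[] args) {
        // Prints the phrase for the current system time.
        System.out.println(phrase(new GregorianCalendar()));
    }
}
```

For a calendar set to 15:30, phrase returns "It's 3 30 PM". The returned string could then be passed to speakPlainText in place of sayTime.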
