Tuesday, March 1, 2011

Text To Speech by Java - Simplified

Sun released the Java Speech API (JSAPI), a standard cross-platform API for text-to-speech and speech recognition.
With the Java Speech API you can incorporate speech technology into user interfaces for your applets and applications based on Java technology. This API specifies a cross-platform interface to support command and control recognizers, dictation systems and speech synthesizers.
Cross-Industry Development
The Java Speech API was developed by Sun Microsystems, Inc. in collaboration with leading speech technology companies: Apple Computer, Inc., AT&T, Dragon Systems, Inc., IBM Corporation, Novell, Inc., Philips Speech Processing, and Texas Instruments Incorporated. Sun works with third-party speech companies to encourage the availability of multiple implementations.

One of these implementations is FreeTTS.



In this post we will go through a few steps that show the simplest way to use the text-to-speech functionality.

1. Download FreeTTS library:


Download version 1.2


2. Create new Java project:

Add the following jars to the class path of this project: freetts.jar and jsapi.jar
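
A missing jar usually shows up only later, as a NullPointerException or a NoClassDefFoundError. The following is a minimal sketch of a check you could run first. The class name SpeechClasspathCheck and its helper are hypothetical; the two class names it looks up are the FreeTTS VoiceManager used below and the JSAPI Central class.

```java
public class SpeechClasspathCheck {

    /** Returns true if the named class can be located on the current class path. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, SpeechClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "com.sun.speech.freetts.VoiceManager", // expected in freetts.jar
            "javax.speech.Central"                 // expected in jsapi.jar
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

Run it with the same class path you use for your application, for example from the command line as well as from the IDE, since the two can differ.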

3. Create a new class:

Add the following logic:


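1) Declare an instance variable:

// Voice and VoiceManager both come from the com.sun.speech.freetts package
Voice speechVoice = null;


2) Initialize the voice manager:

VoiceManager voiceManager = VoiceManager.getInstance();
speechVoice = voiceManager.getVoice("kevin16");
// getVoice returns null when no voice with that name is found,
// for example when the voice jars are missing from the class path
if (speechVoice == null) {
    throw new IllegalStateException("Voice kevin16 not found");
}
speechVoice.allocate();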

3) Use the voice:

speechVoice.speak("Osama Oransa Java Architect from Egypt");
speechVoice.speak("Judy where are you?");

4) Deallocate it when you no longer need it or before you exit the program:

speechVoice.deallocate();


5) There are 3 possible values for the voice name:

alan, kevin and kevin16

* a low quality, unlimited domain, 8kHz diphone voice, called kevin
* a medium quality, unlimited domain, 16kHz diphone voice, called kevin16
* a high quality, limited domain, 16kHz cluster unit voice, called alan

To list all available voices:

Voice[] voices = voiceManager.getVoices();
for (int i = 0; i < voices.length; i++) {
    System.out.println(" " + voices[i].getName() + " (" + voices[i].getDomain() + " domain)");
}
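
Steps 1 to 5 can be put together in one small program. This is a sketch rather than a reference implementation: it needs freetts.jar and the voice jars on the class path, the class name SpeakDemo and the helper textToSpeak are hypothetical, and the commented-out freetts.voices line is a commonly used workaround that I am assuming applies to your setup only if the voices are not discovered automatically.

```java
import com.sun.speech.freetts.Voice;
import com.sun.speech.freetts.VoiceManager;

public class SpeakDemo {

    /** Joins the command-line arguments into one sentence, or falls back to a default. */
    static String textToSpeak(String[] args) {
        return args.length > 0 ? String.join(" ", args) : "Hello from Java";
    }

    public static void main(String[] args) {
        // Assumption: if getVoice() keeps returning null although the jars are present,
        // naming the Kevin voice directory explicitly may help. Note that this
        // restricts voice discovery to that directory.
        // System.setProperty("freetts.voices",
        //         "com.sun.speech.freetts.en.us.cmu_us_kal.KevinVoiceDirectory");

        VoiceManager voiceManager = VoiceManager.getInstance();

        // List every voice FreeTTS was able to discover.
        for (Voice v : voiceManager.getVoices()) {
            System.out.println(v.getName() + " (" + v.getDomain() + " domain)");
        }

        Voice speechVoice = voiceManager.getVoice("kevin16");
        if (speechVoice == null) {
            // Calling allocate() on a null voice is what produces the NullPointerException.
            System.err.println("Voice kevin16 not found; check that the voice jars are on the class path");
            return;
        }

        speechVoice.allocate();
        try {
            speechVoice.speak(textToSpeak(args));
        } finally {
            // Release the voice's resources even if speaking fails.
            speechVoice.deallocate();
        }
    }
}
```

Running it without arguments speaks the default sentence; any arguments are joined with spaces and spoken instead.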

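** To control voice properties:
To change properties such as volume, pitch and rate, call the corresponding methods on the Voice instance:

speechVoice.setVolume(0.8f); // volume ranges from 0.0 to 1.0, so a value of 2 is out of range
speechVoice.setPitch(100);   // baseline pitch in Hz
speechVoice.setRate(20);     // words per minute; 20 is very slow compared with the default of 150

The initial values of those parameters are 1.0 (volume at its loudest level), 100.0 (pitch of the voice in Hz) and 150.0 (speaking rate in words per minute).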
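
Since the volume has an upper bound of 1.0, it can help to validate the values before handing them to the Voice setters. The sketch below is plain Java with no FreeTTS dependency; the class VoiceSettings is hypothetical, and the only ranges it assumes are the ones stated above (volume between 0.0 and 1.0, positive pitch and rate).

```java
public class VoiceSettings {
    final float volume; // 0.0 (silent) to 1.0 (loudest)
    final float pitch;  // baseline pitch in Hz
    final float rate;   // speaking rate in words per minute

    VoiceSettings(float volume, float pitch, float rate) {
        if (pitch <= 0 || rate <= 0) {
            throw new IllegalArgumentException("pitch and rate must be positive");
        }
        // Clamp the volume into the valid range instead of rejecting it.
        this.volume = Math.max(0.0f, Math.min(1.0f, volume));
        this.pitch = pitch;
        this.rate = rate;
    }

    /** The initial values described in the post. */
    static VoiceSettings defaults() {
        return new VoiceSettings(1.0f, 100.0f, 150.0f);
    }

    public static void main(String[] args) {
        VoiceSettings s = new VoiceSettings(2.0f, 100.0f, 120.0f);
        System.out.println("volume=" + s.volume + " pitch=" + s.pitch + " rate=" + s.rate);
        // With FreeTTS on the class path, the values would then be applied as:
        //   speechVoice.setVolume(s.volume);
        //   speechVoice.setPitch(s.pitch);
        //   speechVoice.setRate(s.rate);
    }
}
```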

This is the simplest way to use the API, but the documentation describes another way, which is to create a JSAPI Synthesizer.

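If you don't care about the voice characteristics:

Synthesizer synth = Central.createSynthesizer(null);

If you do care, initialize it as:

// Voice here is javax.speech.synthesis.Voice, not the FreeTTS Voice class
SynthesizerModeDesc required = new SynthesizerModeDesc();
required.setLocale(new Locale("es")); // passing null as the country throws a NullPointerException
required.addVoice(new Voice(null, Voice.GENDER_FEMALE, Voice.AGE_DONT_CARE, null));
Synthesizer synth = Central.createSynthesizer(required);

// createSynthesizer returns null when no installed engine matches the request;
// the English kevin and alan voices would not match a Spanish female voice
if (synth == null) {
    throw new IllegalStateException("No matching synthesizer found");
}
synth.allocate();
synth.resume(); // leave the paused state so that queued text is spoken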


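To use it:

// text is assumed to be a text component (for example a text field) holding the input
synth.speakPlainText(text.getText(), null);
try {
    // Block this thread until the synthesizer's queue is empty.
    synth.waitEngineState(Synthesizer.QUEUE_EMPTY);
} catch (InterruptedException e2) {
    Thread.currentThread().interrupt(); // preserve the interrupt status instead of swallowing it
}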

To deallocate it:
synth.deallocate();

The Synthesizer approach also requires the following:

- Copy speech.properties to the lib directory of your Java home (%java.home%\lib).

- Add voices.txt to the class path, and keep in mind that the voice on its first line is the one that will be used.

* Voices are stored in .jar files:

- cmu_time_awb.jar stores the Alan voice directory for clock-specific speech.
- cmu_us_kal.jar stores the Kevin voice directory for generic speech.
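
If Central.createSynthesizer returns null, a misplaced speech.properties is a likely cause. The sketch below prints the places where the file is expected. It assumes, based on the JSAPI documentation for Central, that the user's home directory is searched in addition to %java.home%\lib; the class name SpeechPropertiesCheck is hypothetical.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

public class SpeechPropertiesCheck {

    /** Locations where speech.properties is expected (assumption: user.home and java.home/lib). */
    static List<File> candidateLocations() {
        return Arrays.asList(
            new File(System.getProperty("user.home"), "speech.properties"),
            new File(new File(System.getProperty("java.home"), "lib"), "speech.properties"));
    }

    public static void main(String[] args) {
        for (File f : candidateLocations()) {
            System.out.println(f.getAbsolutePath() + " -> " + (f.isFile() ? "found" : "not found"));
        }
    }
}
```

Run it with the same Java installation that runs your application, because java.home differs between installed JDKs and JREs.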

19 comments:

  1. As a beginner, I couldn't understand what you described above. Can you please explain what I have to do, step by step?

  2. Tell me which step you would like me to clarify.

  3. Hi Osama,
    please put all the code in one place.
    It is hard to follow as it is. Please.

  4. You can find the complete code plus other functionality in this open source project : http://osama-oransa.blogspot.com/2011/06/interactive4j.html

  5. Al-Salam Alaikum

    I have a problem with creating the Synthesizer.
    This line causes a NullPointerException:
    synth.allocate ();

    I think my problem is in the last steps.
    Can you please tell me in detail what these lines mean:

    """"
    These need you to do the following as well:

    -Copy speech.properties to Java Home %java.home%\lib

    -Add voices.txt to classpath and put in mind that 1st line of voice will be used

    *voices are stored in .jar files:

    - cmu_time_awb.jar stores the Alan voice directory for clock-specific speech.
    - cmu_us_kal.jar stores the Kevin voice directory for generic speech.
    """"

    Thank you Eng Osama :)

  6. Wa Alaikum Al-Salam,

    Why didn't you follow the first way?

    Anyway, the second approach needs the following:

    -Copy speech.properties to Java Home %java.home%\lib

    ==> on your machine: c:\program files\java\jdk 1...\lib\speech.properties

    -Add voices.txt to classpath and put in mind that 1st line of voice will be used

    ==>Add voices.txt to the classpath.

    *voices are stored in .jar files:

    - cmu_time_awb.jar stores the Alan voice directory for clock-specific speech.
    - cmu_us_kal.jar stores the Kevin voice directory for generic speech.

    ==> Jars must be in the classpath ...

    Good luck ..

  7. Hi Osama,
    I came across your blog looking for information about TextToSpeech for Arabic - I'm currently using the cloudgarden JSAPI implementation which apparently does not support any input in non-latin letters.
    Do you maybe have any experience with producing Arabic TTS output?

  8. Sorry Nina, I do not have any experience with Arabic TTS.

  9. Hi Osama,
    we have done a project on text to speech using Java, and now I have to present documentation on text to speech. Can you please send me the documentation if you have it? imran.08.1207@gmail.com

  10. Sorry, I don't understand what you want. You can download it from
    http://java.sun.com/products/java-media/speech/forDevelopers/jsapi-doc.pdf
    or
    http://docs.oracle.com/cd/E17802_01/products/products/java-media/speech/forDevelopers/jsapi-doc/index.html
    There is a lot of documentation at this link:
    http://java.sun.com/products/java-media/speech/reference/api/index.html

  11. S3 Osama,
    I have tried your first method.
    The program works properly in NetBeans,
    but when I tried to run it from cmd, I got a NullPointerException at this line: speechVoice.allocate();
    although the libraries and jar files are created in the dist folder normally.

  12. You need to add them to the class-path when you run using command line or add the class-path correctly to the manifest file.

  13. Thanks Osama.
    I have one important question that I hope you can answer.

    I would like to know how to get the sound produced by your code to play in the client's browser.

    I'm using JSP, so is there anything that makes it easy to fetch and play it in the browser on demand? Please, I need it ASAP.

    Cheers,

  14. I believe there are ways to do so, such as getting the audioPlayer or defaultAudioPlayer and using it, or using AudioManager with the Synthesizer. A third option, which is the easiest, is to use a Java applet.

    Replies
    1. Thanks for helping me with this precious information ^_^

      About the Synthesizer: if I write it directly in a JSP page and use it, will it deliver the sound to the client's browser directly, without needing any additional script on the client's browser?


      Thanks

  15. Nice, but what about the Arabic language?
    If you have any information about Arabic text to speech, please contact me:
    president.sayed@gmail.com
    Thanks in advance :)

  16. I haven't used the Arabic language, but it seems you would need to build your own speech voice.
