Llaisdy

Tukh

Author: Ivan A. Uemlianin
Contact: ivan@llaisdy.com

Overview: What is Tukh?

Tukh is a small village downstream of Cairo. For current purposes, however, Tukh is a set of scripts for working with the Cairo MRCP server. Tukh has two main objectives:

  • enable Cairo to be run on Linux
  • enable Cairo to use recognition resources based on HTK.

As the scripts are written in Python, they /may/ develop into a basic Python API for Cairo.

Current status

I have rewritten the launch and demo .bat scripts in Python. They work more or less identically to the .bat scripts. I haven't yet tested them on Linux.

Set up

Download

Tukh can be downloaded as a tarball or a zipfile.

Installation

No installation is necessary. The scripts will run from wherever they are unpacked.

Configuration

Some configuration is necessary, however. The file tukh/config.py contains settings that you should check against your own system. A recent config.py looks like this:

#! /usr/bin/python

cairo_home = r'C:\cairo-0.1-bin'  # raw string, so the backslash is taken literally
cairo_version = '0.1'
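
Since one objective is to run on Linux as well as Windows, a hard-coded Windows path may become awkward. The following is a minimal sketch of one way round that, assuming an environment variable called CAIRO_HOME; that variable and the get_cairo_home helper are illustrative assumptions and are not part of Tukh or Cairo.

```python
import os

# Default taken from the config.py shown above.
DEFAULT_CAIRO_HOME = r'C:\cairo-0.1-bin'


def get_cairo_home(environ=None):
    """Return the Cairo directory, preferring the (hypothetical) CAIRO_HOME variable."""
    if environ is None:
        environ = os.environ
    return environ.get('CAIRO_HOME', DEFAULT_CAIRO_HOME)


cairo_home = get_cairo_home()
cairo_version = '0.1'
```

With this in place, a Linux user could set CAIRO_HOME in the shell before running the scripts, and Windows users would see no change.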

Usage

The scripts in tukh are used in exactly the same way as the launch and demo .bat scripts of the same name in Cairo. Usage for these is specified in the Cairo Getting Started page.

Help wanted

The Windows Environment

Java or "C:\Program Files\Java\jdk1.6.0_01\bin\java"

The launch scripts will work with either java or "C:\Program Files\Java\jdk1.6.0_01\bin\java", but the demo scripts will only work with the former. The following was copied from a Command Prompt window:

C:\Documents and Settings\ivan\My Documents\iau\llaisdy_svn\tukh>python demo-spe
echsynth.py
"C:\Program Files\Java\jdk1.6.0_01\bin\java" -Xmx200m -Dlog4j.configuration=log4
j.xml org.speechforge.cairo.demo.tts.SpeechSynthClient 42046 "Congratulations! Y
ou have successfully installed the Cairo speech server. Please try the other dem
os to test out Cairo's speech recognition capabilities."
'C:\Program' is not recognized as an internal or external command,
operable program or batch file.

C:\Documents and Settings\ivan\My Documents\iau\llaisdy_svn\tukh>python demo-spe
echsynth.py
java -Xmx200m -Dlog4j.configuration=log4j.xml org.speechforge.cairo.demo.tts.Spe
echSynthClient 42046 "Congratulations! You have successfully installed the Cairo
 speech server. Please try the other demos to test out Cairo's speech recognitio
n capabilities."
2007-06-11 11:21:53,265 INFO  {main} org.speechforge.cairo.demo.tts.SpeechSynthC
lient
 looking up: rmi://192.168.0.2/ResourceServer

Is this something to do with the Command Prompt environment not being able to handle too many quotes?
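
One possible explanation, offered as an assumption and not a diagnosis: if the scripts pass a single command string to os.system, Windows hands that string to cmd.exe. When the line starts with a quote and contains more than two quote characters, cmd.exe strips the first and the last quote. The demo command has four quotes (the java path and the prompt text), which would leave C:\Program unquoted. The sketch below sidesteps the shell by passing an argument list to the standard subprocess module; build_demo_command and run_demo are hypothetical helpers, not part of Tukh.

```python
import subprocess


def build_demo_command(java, port, text):
    """Return the demo command as an argument list, so no manual quoting is needed."""
    return [
        java,
        '-Xmx200m',
        '-Dlog4j.configuration=log4j.xml',
        'org.speechforge.cairo.demo.tts.SpeechSynthClient',
        str(port),
        text,
    ]


def run_demo(java, port, text):
    """Run the demo without going through cmd.exe; returns the exit code."""
    # Each list item is passed as one argument, spaces and all.
    return subprocess.call(build_demo_command(java, port, text))
```

A call such as run_demo(r'C:\Program Files\Java\jdk1.6.0_01\bin\java', 42046, 'Congratulations!') would then be expected to behave the same way whether or not the java path contains spaces.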

start "name" on the command line

The {rserver,receiver1,transmitter1}.bat files run launch.bat using a line of the form:

start "name" command

For example, rserver.bat has:

set CLASS=org.speechforge.cairo.server.resource.ResourceServerImpl
start "rserver" launch %CLASS%

What is the significance of the start "name" part? rserver.bat (et al.) works just as well with only launch %CLASS%. What does the start "name" prefix do, and what am I missing by not using it?
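
For what it is worth, the Command Prompt's start command opens a new window and treats its first quoted argument as that window's title, so each server gets its own titled window and the calling script returns immediately. A rough Python counterpart is sketched below, assuming the standard subprocess module; new_window_kwargs and start_in_new_window are hypothetical names, and the sketch does not set a window title.

```python
import subprocess
import sys


def new_window_kwargs(platform=sys.platform):
    """Popen keyword arguments that request a separate console on Windows."""
    if platform.startswith('win'):
        # CREATE_NEW_CONSOLE is only defined on Windows; 0x10 is its documented value.
        flag = getattr(subprocess, 'CREATE_NEW_CONSOLE', 0x00000010)
        return {'creationflags': flag}
    # Other platforms have no direct equivalent; run as a plain background process.
    return {}


def start_in_new_window(args):
    """Launch args without waiting for it, roughly like `start "name" command`."""
    return subprocess.Popen(args, **new_window_kwargs())
```

On Linux the process simply runs in the background of the current terminal, which may be enough for launching rserver, receiver1 and transmitter1 from one script.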

Cairo

Querying the server

How can you check to see whether an MRCP server is running? For example, so that receiver1 could do this check, and launch a server if necessary.
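
A crude first approximation is to test whether anything is listening on the expected TCP port. The sketch below uses only the standard socket module; is_listening is a hypothetical helper. It shows that some process accepts connections, not that the process is Cairo.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
    sock.close()
    return True
```

The demo log above looks up rmi://192.168.0.2/ResourceServer with no port given, so, assuming the RMI registry is on its default port of 1099, receiver1 might call is_listening('192.168.0.2', 1099) before deciding whether to launch a server.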

Relatedly, does Cairo support the OPTIONS request (i.e., for a client to query the server's capabilities)? If not, how can a client find out what resources are available from the server?

TODO

  • use one script for all three launches (rserver, receiver, transmitter).
  • get running on Linux.
  • get running with HTK instead of Sphinx (nb: or convert HTK acoustic model to Sphinx4 engine).