Exploring Cairo

Cairo is a city in Egypt. For our current purposes however, Cairo is an MRCP server written in Java. The project homepage is here.

Why am I interested in MRCP? Trefnydd generates acoustic models (AMs) based on HTK, and I should like Trefnydd to provide MRCP speech recognition resources based on these AMs.

Before you ask, I’ve already looked into Twisted, Divmod Sine, pjsip, and jain-sip. I may return to Twisted, Sine and jain-sip; for now I concentrate on Cairo.

In Cairo’s favour:

  • It works;
  • Its mailing list is the most helpful of any of the above projects;
  • It is (AFAIK) the only open-source MRCP server.

There are some disadvantages, however:

  • Cairo is “written entirely in the Java programming language”, apart from the launch scripts which are written entirely in some Windows shell script. This means that:
    • I’m going to have to program in Java;
    • Cairo currently runs only on Windows.
  • Cairo does not support the range of standards and protocols required by MRCP:
    • instead of the Session Initiation Protocol (SIP) for session negotiation, Cairo uses Java Remote Method Invocation (RMI);
    • only Java Speech Grammar Format (JSGF) is supported, not the W3C’s Speech Recognition Grammar Specification (SRGS);
    • recognition results are given in plain text only: the required W3C standard - NLSML - is not supported.

For guidance through this acronym soup, see my MRCP page.
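
To make the grammar-format gap concrete, here is a rough sketch of the same toy yes/no grammar written in JSGF and in the XML form of SRGS. The grammar itself, the rule name `answer` and both helper functions are my own illustration, not anything taken from Cairo, and the JSGF helper only copes with a single flat rule.

```python
import re
import xml.etree.ElementTree as ET

# The same toy grammar in both formats (illustrative only, not from Cairo).
JSGF_GRAMMAR = """#JSGF V1.0;
grammar yesno;
public <answer> = yes | no;
"""

SRGS_GRAMMAR = """<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-GB" mode="voice" root="answer">
  <rule id="answer" scope="public">
    <one-of>
      <item>yes</item>
      <item>no</item>
    </one-of>
  </rule>
</grammar>
"""

SRGS_NS = "{http://www.w3.org/2001/06/grammar}"


def jsgf_alternatives(text):
    """Return the alternatives of the first public rule.

    Toy parser: assumes one flat rule of the form
    `public <name> = a | b | c;` and nothing more elaborate.
    """
    match = re.search(r"public\s+<\w+>\s*=\s*([^;]+);", text)
    if match is None:
        return []
    return [alt.strip() for alt in match.group(1).split("|")]


def srgs_alternatives(xml_text):
    """Return the <item> texts under the grammar's root rule."""
    grammar = ET.fromstring(xml_text)
    root_rule = grammar.get("root")
    for rule in grammar.iter(SRGS_NS + "rule"):
        if rule.get("id") == root_rule:
            return [item.text.strip() for item in rule.iter(SRGS_NS + "item")]
    return []


if __name__ == "__main__":
    print(jsgf_alternatives(JSGF_GRAMMAR))
    print(srgs_alternatives(SRGS_GRAMMAR))
```

Both functions recover the same two alternatives, which is the point: the formats carry the same information here, but a client that sends SRGS to a JSGF-only server will not be understood.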

Because of this lack of standards-compliance, Cairo is perhaps best described as an MRCP server simulator. But at least it works, right?

Here are my immediate priorities:

  1. Get it working on something other than Windows. If this just means replacing the .bat launch scripts with something more platform-neutral (e.g., Python), good; if not, I suppose I’ll be stuck with Windows for the time being. I’ll hide inside emacs.
  2. Explore the speechrecog resource code; see how tightly tied in it is to Sphinx 4;
  3. Design a Trefnydd/HTK plug-in;
  4. Start writing Java code.
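
For the first priority, a platform-neutral launcher might look something like the sketch below. It assumes the jars live in a `lib` directory under the install directory and that the caller knows which main class to start; both are guesses on my part, since I have not yet read what the .bat scripts actually do.

```python
import os
import subprocess
from pathlib import Path


def build_java_command(install_dir, main_class, args=()):
    """Build the `java` command line for one server process.

    Assumption: every jar needed is in <install_dir>/lib. The real
    layout may differ. os.pathsep gives ';' on Windows and ':'
    elsewhere, which is the main thing a .bat script hard-codes.
    """
    lib_dir = Path(install_dir) / "lib"
    jars = sorted(str(jar) for jar in lib_dir.glob("*.jar"))
    classpath = os.pathsep.join(jars)
    return ["java", "-cp", classpath, main_class, *args]


def launch(install_dir, main_class, args=()):
    """Run the process and return its exit code."""
    return subprocess.call(build_java_command(install_dir, main_class, args))
```

Calling `launch("/path/to/cairo", "some.package.MainClass")` would then start the named class on any platform with `java` on the PATH; the path and class name here are placeholders, not Cairo’s real ones.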

Watch this space.

4 Responses to “Exploring Cairo”

  1. absolute regularity » Blog Archive » Tûkh Says:

    […] however, Tûkh is a set of scripts for working with the Cairo MRCP server (which I blogged about already). Tûkh has two main […]

  2. brij Says:

    Hi,
    I suppose you were doing this work a year back… I am actually stuck in the same situation.
    I want to run the Cairo server and a client on my Linux machine
    with Sphinx4.
    Now, as Cairo is not yet available for Linux, what I might do is have Sphinx and Cairo both on a Windows machine, and call them from an MRCP client on Linux; what do you think?
    Also, is it possible to stream live audio from a phone [through Asterisk] over Cairo, like the GOOG-411 service in the USA?

    Please help! we might discuss some financial’s here, if you could help me.

    Regards

    Brij

  3. Ivan Says:

    Brij

    Thank you for your comment.

    The system you describe has the MRCP server (Cairo & Sphinx) on Windows, and a client on a linux box, yes? That sounds fine - your client can be anything that can connect to the server with the right protocols. It would be better of course if the server were on a linux box. Have you compared Cairo with OpenMRCP (now uniMRCP)?

    > … is it possible to stream a live audio from phone [through asterisk] over Cairo, like GOOG-411 service in USA.

    Certainly. Though whether this is the right thing to do depends on the application you have in mind.

    > Please help! we might discuss some financial’s here, if you could help me.

    I’m always ready to discuss financials. Please email me directly: my mailto is given on the Llaisdy.com front page.

    Best wishes

    Ivan

  4. brij Says:

    Hi,
    Sorry for replying so late :)
    I have worked with OpenMRCP, and in fact I have worked with the developer of OpenMRCP… A very nice chap he is.
    Anyway, I was looking at hosting Sphinx on a Linux server and calling it in real time, which is now looking a bit far off to me…
    As for interfacing options for ASRs [Windows / Linux, Java / C], I am interested in all of them; from now on let’s converse by email.
    my id is brijrajsingh@gmail.com
    thanks.