Exploring Cairo
Cairo is a city in Egypt. For our current purposes, however, Cairo is an MRCP server written in Java. The project homepage is here.
Why am I interested in MRCP? Trefnydd generates acoustic models (AMs) based on HTK, and I should like Trefnydd to provide MRCP speech recognition resources based on these AMs.
Before you ask, I’ve already looked into Twisted, Divmod Sine, pjsip, and jain-sip. I may return to Twisted, Sine and jain-sip; for now I concentrate on Cairo.
pro Cairo:
- It works;
- Its mailing list is the most helpful of any of the above projects';
- It is (AFAIK) the only open-source MRCP server.
There are some disadvantages, however:
- Cairo is “written entirely in the Java programming language”, apart from the launch scripts, which are written entirely as Windows batch (.bat) files. This means that:
  - I’m going to have to program in Java;
  - Cairo currently runs only on Windows.
- Cairo does not support the range of standards and protocols required by MRCP:
  - instead of the Session Initiation Protocol (SIP) for session negotiation, Cairo uses Java Remote Method Invocation (RMI);
  - only Java Speech Grammar Format (JSGF) is supported, not the W3C’s Speech Recognition Grammar Specification (SRGS);
  - recognition results are given in plain text only: the required W3C standard - NLSML - is not supported.
For guidance through this acronym soup, see my MRCP page.
Because of this lack of standards-compliance, Cairo is perhaps best described as an MRCP server simulator. But at least it works, right?
Here are my immediate priorities:
- Get it working on Linux. If this just means replacing the .bat launch scripts with something more platform-neutral (e.g., Python), good; if not, I suppose I’ll be stuck with Windows for the time being. I’ll hide inside emacs.
- Explore the speechrecog resource code; see how tightly it is tied to Sphinx 4;
- Design a Trefnydd/HTK plug-in;
- Start writing Java code.
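As a rough idea of what a platform-neutral replacement for the .bat launch scripts might look like, here is a minimal Python sketch. It only assembles a java command line with a classpath built from the jar files in a directory, using the platform's own path separator. The directory layout and the main class name are placeholders of mine, not Cairo's real ones, and I have not checked what the actual scripts set up.

```python
import os
from pathlib import Path


def build_classpath(lib_dir):
    """Join every .jar file in lib_dir with the platform's path separator.

    os.pathsep is ';' on Windows and ':' on Linux, which is the main
    thing a .bat script hard-codes.
    """
    jars = sorted(str(p) for p in Path(lib_dir).glob("*.jar"))
    return os.pathsep.join(jars)


def build_command(lib_dir, main_class, args=()):
    """Return the argument list for launching a Java main class.

    main_class is a placeholder: substitute whatever class the
    original launch script starts.
    """
    return ["java", "-cp", build_classpath(lib_dir), main_class, *args]
```

The resulting list can be handed straight to subprocess.call(), e.g. subprocess.call(build_command("lib", "org.example.Main")), where both arguments are again hypothetical. Returning a list rather than a single string avoids shell quoting differences between Windows and Linux.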
Watch this space.
June 11th, 2007 at 12:18 pm
[…] however, Tûkh is a set of scripts for working with the Cairo MRCP server (which I blogged about already). Tûkh has two main […]
May 2nd, 2008 at 7:04 am
Hi,
I suppose you were doing this work a year back… I am actually stuck in the same situation.
I want to run the Cairo server and a client on my Linux machine with Sphinx4.
Now, as Cairo is not yet available for Linux, what I might do is have Sphinx and Cairo both on a Windows machine, and call them from an MRCP client on Linux; what do you think?
Also, is it possible to stream live audio from a phone [through Asterisk] over Cairo, like the GOOG-411 service in the USA?
Please help! We might discuss some financials here, if you could help me.
Regards
Brij
May 2nd, 2008 at 8:49 am
Brij
Thank you for your comment.
The system you describe has the MRCP server (Cairo & Sphinx) on Windows, and a client on a Linux box, yes? That sounds fine - your client can be anything that can connect to the server with the right protocols. It would be better of course if the server were on a Linux box. Have you compared Cairo with OpenMRCP (now UniMRCP)?
> … is it possible to stream live audio from a phone [through Asterisk] over Cairo, like the GOOG-411 service in the USA?
Certainly. Though whether this is the right thing to do depends on the application you have in mind.
> Please help! We might discuss some financials here, if you could help me.
I’m always ready to discuss financials. Please email me directly: my mailto is given on the Llaisdy.com front page.
Best wishes
Ivan
May 22nd, 2008 at 8:49 am
Hi,
Sorry for dropping in so late.
I have worked with OpenMRCP, and in fact I have worked with the developer of OpenMRCP… a very nice chap he is.
Anyway, I was looking at hosting Sphinx on a Linux server and calling it in real time, which is now looking a bit far off to me…
I am interested in all the interfacing options for ASRs [Windows / Linux - Java / C]. From now on, let's converse by email.
my id is brijrajsingh@gmail.com
thanks.