Chinese input methods for Emacs
Introduction
Emacs provides 25 input methods for Chinese. Although each input method has its own describe-input-method page, these pages can be rather terse. There is also no overview or comparison of the different input methods, nor have I been able to find one on the web.
Here I have gathered together the information I’ve been able to find. I’d be pleased to hear about any errors I’ve made, and where I can find further information to correct my omissions. I’ll keep this page up-to-date.
I’m learning Mandarin Chinese, I’m interested in simplified script, and for the moment I find a pinyin-based approach to the written language easiest. For my own current requirements, chinese-tonepy is fitting the bill, but I’m interested in learning a structural input method (i.e., not based on pronunciation). See the Conclusion for further discussion.
Methods overview
Table: Chinese input methods provided by Emacs
| Method name | Method name |
|---|---|
| chinese-4corner | chinese-array30 |
| chinese-b5-quick | chinese-b5-tsangchi |
| chinese-ccdospy | |
| chinese-cns-quick | chinese-cns-tsangchi |
| chinese-ctlau | chinese-ctlaub |
| chinese-ecdict | chinese-etzy |
| chinese-punct | chinese-punct-b5 |
| chinese-py | chinese-py-b5 |
| chinese-py-punct | chinese-py-punct-b5 |
| chinese-qj | chinese-qj-b5 |
| chinese-sisheng | chinese-sw |
| chinese-tonepy | chinese-tonepy-punct |
| chinese-ziranma | chinese-zozy |
The basic idea common to all these input methods is that you type in a key sequence, bringing up a menu of options, from which you choose the character you want.
For example, in chinese-tonepy, typing in ‘hao3’ brings up a menu of numbered candidate characters in the minibuffer window.

If you want hao3 = ‘good’, choose option 2 (by typing 2 or clicking on the menu). Character 好 appears at point.
At any point in any of the input methods you can press <tab> for a full tree-list of the options available to you from that point.

punct
Some input methods have versions with or without -punct (see the table above). The -punct versions support proper Chinese punctuation characters. However, (a) although chinese-punct works, chinese-py-punct (and possibly others) doesn’t seem to; (b) versions without -punct use ASCII punctuation, which meets my needs for the moment.
In the discussion below I ignore the -punct versions.
b5
Some input methods have versions with or without -b5. The -b5 input methods generate characters in the Big Five character set. Big Five is a Taiwanese character set for traditional characters. Other input methods support GB2312, which is a character set from the People’s Republic of China, for simplified characters. For example:
| Input Method | Input | Output |
|---|---|---|
| chinese-py-b5 | guo2/1 yu3/2 | 國 語 |
| chinese-tonepy | guo2/1 yu3/7 | 国 语 |
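The difference between the two character sets can also be checked outside Emacs. The following is a minimal sketch in Python (chosen only for illustration; nothing in Emacs depends on it), using the `big5` and `gb2312` codecs from the standard library. The helper name `can_encode` is made up for this example.

```python
def can_encode(text, charset):
    """Return True if every character in text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


traditional = "國語"  # output of chinese-py-b5 in the table above
simplified = "国语"   # output of chinese-tonepy in the table above

# Try each string against each character set and report the result.
for label, text in (("traditional", traditional), ("simplified", simplified)):
    for charset in ("big5", "gb2312"):
        print(label, charset, can_encode(text, charset))
```

Each string should encode only in its own character set, which is why the -b5 and non-b5 methods cannot simply be swapped for one another.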
For the full skinny on character sets see Lunde (1999).
I presume that, apart from chinese-b5-quick and chinese-b5-tsangchi, the -b5 input methods work the same way as the non-b5 methods, but just output Big Five (i.e., traditional script) instead of GB2312 (i.e., simplified script). Consequently, in the following discussion I ignore the -b5 versions (apart from chinese-b5-quick and chinese-b5-tsangchi).
Methods in detail
For each method I’ll give the character set (if not GB2312), and, as a simple illustration of use, how to generate 你好吗 (nǐ hǎo ma = ‘are you well?’).
chinese-4corner
The describe-input-method page contains no description!
The Wikipedia provides a description [http://en.wikipedia.org/wiki/Four_corner_method]. You use four digits to describe the four corners of a character (top left, top right, bottom left, bottom right).
Quoting the Wikipedia article:
| Digit | Meaning |
|---|---|
| 1 | a horizontal stroke, |
| 2 | a vertical or diagonal stroke, |
| 3 | a dot stroke, |
| 4 | two strokes in a cross shape, |
| 5 | three or more strokes in which one stroke intersects all others, |
| 6 | a box-shape, |
| 7 | where a stroke turns a corner, |
| 8 | the shape of the Chinese character 八 and its inverted form, and |
| 9 | the shape of the Chinese character 小 and its inverted form. |
| 0 | where there is either nothing in a corner, the part in a corner is already represented by a previous corner, or where a corner has a dot stroke followed by a horizontal stroke |
Usage example:
| Char | Digits | Interpretation |
|---|---|---|
| 你 | 2729/2 | Diagonal; Corner; Vertical; 小; ‘2’? |
| 好 | 47447/1 | Cross; Corner; Cross; Cross?; option 1 |
| 吗 | ??? | ??? |
To be honest, I cheated here: the Wiktionary gives four corner codes for Chinese characters (e.g., http://en.wiktionary.org/wiki/好).
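To make the digit table easier to apply, here is a small Python sketch that spells out a four-digit code corner by corner. The shape names are abbreviated from the table quoted above; the names `CORNER_SHAPES`, `CORNERS` and `describe` are invented for this example and have nothing to do with how Emacs implements chinese-4corner.

```python
# Digit meanings, abbreviated from the table quoted above.
CORNER_SHAPES = {
    "0": "nothing, or already counted",
    "1": "horizontal stroke",
    "2": "vertical or diagonal stroke",
    "3": "dot",
    "4": "cross",
    "5": "one stroke crossing all the others",
    "6": "box",
    "7": "corner",
    "8": "八 shape",
    "9": "小 shape",
}

# Order in which the four digits are read.
CORNERS = ("top left", "top right", "bottom left", "bottom right")


def describe(code):
    """Return one 'corner: shape' string per digit of a four-digit code."""
    if len(code) != 4 or any(d not in CORNER_SHAPES for d in code):
        raise ValueError("expected exactly four digits 0-9")
    return [f"{corner}: {CORNER_SHAPES[d]}" for corner, d in zip(CORNERS, code)]


for line in describe("2729"):
    print(line)
```

Any trailing menu-option digit (such as the ‘/2’ in the table above) is not part of the four-corner code and has to be stripped before calling `describe`.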
chinese-array30
Outputs Big Five.
Some docs:
- http://www.array.com.tw/ (in Chinese)
- http://openvanilla.org/ (partly in Chinese)
- scim-array: Array 30 Input Method Engine for SCIM (in English).
Seems to be quite popular, and the MS Windows Chinese input method seems to use it. However, I can’t make head or tail of it. Also, it outputs Big Five.
chinese-b5-tsangchi
This is a Taiwanese method based on geometrical decomposition of characters. See: http://en.wikipedia.org/wiki/Cangjie_method.
| Char | Input | Interpretation |
|---|---|---|
| 你 | onf | O = 人 (LHS); N = hook (top of RHS); F = 火 (bottom of RHS) |
| 好 | vnd | V = 女 (LHS); N = hook (top of RHS); D = 木 wood (?) |
| 嗎 | rsqf | R = 口 (LHS); S = 尸 (L & top of RHS); Q = 手 (next top RHS); F = 火 (bottom of RHS) |
Again, I got these from the Wiktionary, but I can understand how they were made up. Notice that ‘ma’ is the traditional 嗎 and not the simplified 吗.
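The letter codes are easier to read once each key is replaced by the radical it stands for. Below is a rough Python sketch that does this using the standard Cangjie key names; the mapping comes from the Cangjie method in general, not from Emacs’ own table, so whether chinese-b5-tsangchi follows it in every detail is an assumption. The names `CANGJIE_KEYS` and `decode` are made up here.

```python
# Standard Cangjie key names (Z is a special key and is left out).
CANGJIE_KEYS = {
    "a": "日", "b": "月", "c": "金", "d": "木", "e": "水", "f": "火",
    "g": "土", "h": "竹", "i": "戈", "j": "十", "k": "大", "l": "中",
    "m": "一", "n": "弓", "o": "人", "p": "心", "q": "手", "r": "口",
    "s": "尸", "t": "廿", "u": "山", "v": "女", "w": "田", "x": "難",
    "y": "卜",
}


def decode(keys):
    """Translate a Cangjie key sequence into its radical names."""
    return "".join(CANGJIE_KEYS[k] for k in keys.lower())


for keys in ("onf", "vnd", "rsqf"):
    print(keys, decode(keys))
```

In this scheme N is named 弓 (‘bow’), and the hook is one of the shapes it stands in for; D is indeed 木.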
chinese-b5-quick
This looks like it should be a ‘quick’ version of b5-tsangchi. Indeed it takes fewer keystrokes to get to an end character - but I can’t find the end character I want. In other words, I don’t know how it uses the Tsangchi system.
chinese-ccdospy
From the description:
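This input method works almost the same way as chinese-py. The difference is that you type a single key for these Pinyin spelling.
| Pinyin | zh | en | eng | ang | ch | an | ao | ai | ong | sh | ing | yu(ü) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| keyseq | a | f | g | h | i | j | k | l | s | u | y | v |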
| Char | Input | Interpretation |
|---|---|---|
| 你 | ni7 | 7th option |
| 好 | hk6 | 6th option |
| 吗 | ma9 | 9th option |
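As a rough illustration of the key mapping, the Python sketch below expands an abbreviated key sequence back into full pinyin. It rests on an assumption that is not in the description: the first key is read as an initial (so a, i, u become zh, ch, sh) and any later key as a final. It only handles syllables that begin with an initial consonant, it shows v simply as ü, and it says nothing about how Emacs stores the table. The names `INITIAL_KEYS`, `FINAL_KEYS` and `expand` are hypothetical.

```python
# Keys that stand for two-letter initials (assumed: first position only).
INITIAL_KEYS = {"a": "zh", "i": "ch", "u": "sh"}

# Keys that stand for finals (assumed: any later position).
FINAL_KEYS = {
    "f": "en", "g": "eng", "h": "ang", "j": "an", "k": "ao",
    "l": "ai", "s": "ong", "y": "ing", "v": "\u00fc",
}


def expand(keys):
    """Expand an abbreviated key sequence into full pinyin (no menu digit)."""
    if not keys:
        return ""
    head, tail = keys[0], keys[1:]
    return INITIAL_KEYS.get(head, head) + "".join(FINAL_KEYS.get(k, k) for k in tail)


for keys in ("ni", "hk", "ma"):
    print(keys, "->", expand(keys))
```

The trailing digit in the table above (7, 6, 9) is the menu choice and must be removed before calling `expand`.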
chinese-cns-tsangchi
Probably the same as chinese-b5-tsangchi, but outputting a CNS (Chinese National Standards) character set instead of Big Five (possibly CNS 11643-1992). My Emacs hasn’t got the fonts for it.
chinese-cns-quick
See chinese-cns-tsangchi and chinese-b5-quick.
chinese-ctlau
An input method based on Sidney Lau’s Romanisation system. Three caveats: (a) it’s for Cantonese; (b) it generates Big5; (c) I can’t get it to work.
chinese-ctlaub
See chinese-ctlau.
chinese-ecdict
Pretty impressive. You type in the English word and the Chinese (Big5) word appears! Wow!
| Char | Input | Interpretation |
|---|---|---|
| 香蕉 | banana | banana |
| 冰箱 | refrigerator | refrigerator |
| 你 | you | you |
| 良好的 | good | good |
Impressive but:
- notice that ‘good’ doesn’t actually give us hao3 (好), it gives us liang2 hao3 de (良好的). 良好 still means ‘good’ but 的 is a connector making this adjective ready to add on to a noun.
- I don’t know how I would get a grammatical particle like 吗 (ma).
- This will be handy for emergencies but I think I’ll keep a dictionary around too.
chinese-etzy
A Zhuyin input method with Big5 output.
chinese-py
Type in pinyin, without tones.
| Char | Input | Interpretation |
|---|---|---|
| 你 | ni1 | 1st menu option |
| 好 | hao | guessed correctly |
| 吗 | ma2 | 2nd option |
chinese-qj
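I first took this for another Zhuyin input method, but the description has virtually no information. The name suggests quánjiǎo (全角, ‘full-width’), in which case it would produce full-width versions of ASCII characters rather than Hanzi.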
chinese-sisheng
This will be quite useful. This input method does not generate Hanzi, but it transforms pinyin with tone numbers into ‘proper’ pinyin with diacritics.
| Char | Input | Interpretation |
|---|---|---|
| nǐ | ni3 | 你 |
| hǎo | hao3 | 好 |
| ma | ma | 吗 |
| nǚ | nv3 | 女 |
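The same transformation can be sketched in a few lines of Python. This is not how Emacs does it: the sketch below applies the usual pinyin placement rule (the mark goes on a or e if present, on the o of ou, otherwise on the last vowel) and treats v as ü, as in the table above. Whether chinese-sisheng agrees with it in every case is an assumption, and the names `TONE_MARKS`, `VOWELS` and `to_diacritics` are made up for this example.

```python
import unicodedata

# Combining marks for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030c", "4": "\u0300"}
VOWELS = "aeiou\u00fc"


def to_diacritics(syllable):
    """Convert e.g. 'hao3' to pinyin with a tone mark; toneless input is kept."""
    syllable = syllable.replace("v", "\u00fc")  # v is the usual stand-in for ü
    if len(syllable) < 2 or syllable[-1] not in TONE_MARKS:
        return syllable
    body, mark = syllable[:-1], TONE_MARKS[syllable[-1]]
    if "a" in body:
        idx = body.index("a")
    elif "e" in body:
        idx = body.index("e")
    elif "ou" in body:
        idx = body.index("ou")  # the mark goes on the o
    else:
        positions = [i for i, ch in enumerate(body) if ch in VOWELS]
        if not positions:
            return syllable  # nothing to put the mark on
        idx = positions[-1]
    marked = body[: idx + 1] + mark + body[idx + 1 :]
    # Compose base letter and combining mark into a single character.
    return unicodedata.normalize("NFC", marked)


for s in ("ni3", "hao3", "ma", "nv3"):
    print(s, "->", to_diacritics(s))
```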
chinese-sw
Type in a pair of radicals. Quite clever and fast (and not pronunciation-based), easy to pick up. Usefulness will depend on coverage.
| Char | Input | Interpretation |
|---|---|---|
| 你 | kv4 | 亻+ 小 + 4th option |
| [好] | ??? | can’t find it |
| 吗 | fd9 | 口 + 刀 + 9th option |
chinese-tonepy
Type in pinyin, with tones.
| Char | Input | Interpretation |
|---|---|---|
| 你 | ni3 | |
| 好 | hao32 | 2nd menu option |
| 吗 | ma5 | |
N.B.: chinese-py-b5 is the same as this method, but outputs Big Five, i.e., traditional characters. A useful alternative to chinese-tonepy for when I want to write traditional script.
chinese-ziranma
A pinyin-based input method where the pinyin initials and finals are mapped onto the qwerty keyboard (see http://eyegene.ophthy.med.umich.edu/unicode/KeyboardLayouts/ZiRanMaPinYinKeyboard.html).
The basic idea is a four-keystroke pattern: initial, final, tone, quote (’). This may not be that impressive for, say, 好 (hke’), but the pattern extends to cover words, not just characters. That means that two-, three- or four-character words can be input with the same four-keystroke pattern. The example from the description is 北京电视台 (bjdt) (Běijīng diànshìtái; Beijing TV station), which is impressive for four keystrokes.
| Char | Input | Interpretation |
|---|---|---|
| 你 | n | guessed correctly |
| 好 | hke’ | |
| 吗 | mae’5 | 5th option |
chinese-zozy
A Big5 Zhuyin input method.
Conclusion
chinese-tonepy is looking the best for now for most uses (see also chinese-ccdospy, chinese-ziranma).
chinese-sisheng will be useful for writing pinyin.
I think Zhuyin is probably in my future if/when I start learning Chinese seriously, but unfortunately both of the Emacs Zhuyin methods (chinese-etzy and chinese-zozy) output Big5 (i.e., traditional script). How much is Zhuyin used in the People’s Republic?
It would be useful to know a method not based on pronunciation. I might not always know how to say what I want to write (e.g., "How do you say X?"). Emacs’ structural input methods generating GB characters are:
- chinese-4corner
- chinese-qj (maybe)
- chinese-sw
I’ll explore these, and update this page as I find out more.
References
Lunde, K. (1999). CJKV Information Processing. Sebastopol, CA: O’Reilly.
http://en.wikipedia.org/wiki/Chinese_input_methods_for_computers