How can I output unicode to the emacs Message buffer?

If I run the code

    # -*- coding: utf-8 -*-
    month = "März"
    print month.decode("utf-8")

in the OS X terminal, I get the string März just fine.

Also, my emacs (24.5 on OS X 10.10) seems to handle unicode (or at least umlauts) just fine, since I can see the umlaut in my emacs window.

Yet when I run the code above directly from within emacs I get:

    Traceback (most recent call last):
      File "unicode-umlaut.py", line 3, in <module>
        print month.decode("utf-8")
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 1: ordinal not in range(128)

What does this mean? Does it mean that even though emacs is handling a latin-1 character, the emacs Message buffer refuses to handle unicode? Is there a fix to make it possible to output non-ascii characters to the Message buffer in emacs?
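To narrow down what the traceback is complaining about, the failing step can be reproduced without emacs at all. This is only a sketch under an assumption: that when the script is launched from emacs, Python cannot detect an output encoding and falls back to the ASCII codec when printing. The names `error` and `utf8_bytes` are just illustrative.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# The decoded text, i.e. what month.decode("utf-8") yields.
month = u"M\xe4rz"

# Assumption: printing under emacs ends up encoding with the ASCII codec.
# Doing that by hand fails at index 1, where the umlaut sits.
try:
    month.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    error = exc

# Encoding explicitly as UTF-8 succeeds and gives back the file's bytes.
utf8_bytes = month.encode("utf-8")

print(error.encoding, error.start)
```

If that assumption holds, the error would come from Python's choice of output codec rather than from the emacs buffer itself, which is consistent with the same script working in the terminal.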

Update:

Byte-wise, the file looks like this (via emacs hexl-mode):

    00000000: 2320 2d2a 2d20 636f 6469 6e67 3a20 7574  # -*- coding: ut
    00000010: 662d 3820 2d2a 2d0a 6d6f 6e74 6820 3d20  f-8 -*-.month =
    00000020: 224d c3a4 727a 220a 7072 696e 7420 6d6f  "M..rz".print mo
    00000030: 6e74 682e 6465 636f 6465 2822 7574 662d  nth.decode("utf-
    00000040: 3822 290a                                8").

The c3a4 maps to a-umlaut (ä), and so the file seems to be properly coded in UTF-8.
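As a quick sanity check on that reading of the dump, the bytes can be decoded directly. A minimal sketch, where `raw` is simply the seven bytes `22 4d c3 a4 72 7a 22` copied by hand from offset 0x20 (the variable names are hypothetical):

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Bytes 22 4d c3 a4 72 7a 22, copied by hand from offset 0x20 of the dump.
raw = b"\x22\x4d\xc3\xa4\x72\x7a\x22"

# Decoding as UTF-8 turns the two bytes c3 a4 into one character, U+00E4.
text = raw.decode("utf-8")

# For comparison, Latin-1 would store the same character as one byte, e4.
latin1_bytes = u"\xe4".encode("latin-1")
utf8_bytes = u"\xe4".encode("utf-8")

print(len(raw), len(text))
```

The two-byte form is what the file contains, so the file is UTF-8 rather than Latin-1; the `u'\xe4'` in the traceback is the code point of the decoded character, not a Latin-1 byte from the file.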
