Tuesday, March 31, 2009

Another Chinese Encoding Puzzle

Someone on the Unicode list had a text in which strange escape codes had replaced accented characters. For example, the word "clichés" was printed as clich\x{5ee5}. The escape code presumably represents Unicode U+5EE5, or 廥. How could that happen? It turns out that this character has the code E973 in Big5, and that the bytes E9 73 in Latin-1 are "és". So somehow a Latin-1 text was read as Traditional Chinese in Big5, then converted to Unicode, with the non-Latin characters written out as escape sequences.

To make such a text readable, one can convert the \x{abcd} escapes to the &#xabcd; HTML format, view the text in a browser, copy and paste it into a text document, save that as Big5, and reopen it as Latin-1.
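The same repair can probably be scripted without going through a browser. What follows is only a minimal sketch in Python, assuming the text contains nothing but ASCII and \x{...} escapes, and that Python's built-in "big5" codec agrees with whatever converter mangled the text in the first place. The function names (unescape, repair, mangle) are made up for illustration.

```python
import re

# Matches escapes of the form \x{5ee5} (hex digits inside braces).
ESCAPE = re.compile(r"\\x\{([0-9a-fA-F]+)\}")

def unescape(text):
    """Turn each \\x{abcd} escape back into the character it names."""
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

def repair(text):
    """Undo the damage: escapes -> characters -> Big5 bytes -> Latin-1 text."""
    return unescape(text).encode("big5").decode("latin-1")

def mangle(text):
    """Reproduce the damage, for testing: Latin-1 bytes misread as Big5,
    then every non-ASCII character written out as an escape."""
    wrong = text.encode("latin-1").decode("big5")
    return "".join(c if ord(c) < 128 else "\\x{%04x}" % ord(c) for c in wrong)

if __name__ == "__main__":
    damaged = mangle("clichés")
    print(damaged)
    print(repair(damaged))
```

Note that repair() will raise an error if the text contains characters that are not in Big5, and mangle() will fail on an accented letter that is not followed by a byte Big5 accepts as a trail byte, so real texts may need some hand checking.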

Tuesday, March 17, 2009

iPhone 3.0 Software Adds Languages

In today's presentation regarding the next version of iPhone software, which is supposed to be released sometime this summer, Apple indicated that it will include new languages and keyboards.

Unfortunately I have not found any info yet about which ones (though I would expect at least Arabic/Hebrew, Greek, and Thai might be among them). If anyone has details, let me know.

PS As of 3/25 I have seen online photos or other reports indicating new keyboards are available for Arabic, Greek, Hebrew, Indonesian, Malay, and Thai.

PSS Apple issued the tech specs for the iPhone with 3.0 software on 6/8/2009. They confirm UI, keyboard, and predictive dictionaries for Arabic, Thai, Greek, and Hebrew.

PSSS As made available 6/18/09, 3.0 has off/on switches for two keyboards not mentioned in the tech specs, Bulgarian and Macedonian. But these keyboards do not work.

Saturday, March 14, 2009

New Scottish Gaelic Spellcheckers

Thanks to Sealgar IT, users of Scottish Gaelic (Gàidhlig) can obtain new spellchecking dictionaries for this language for use with CocoAspell and with OpenOffice 3 from this page.

Wednesday, March 11, 2009

New iPod Speaks 14 Languages

The new iPod Shuffle has a "Voiceover" feature which can speak song titles and artist names in English, Chinese, French, Italian, Portuguese, Turkish, Czech, German, Japanese, Spanish, Dutch, Greek, Polish, and Swedish.


Hopefully this will find its way into future versions of OS X (where Voiceover is currently only English, unless you purchase additional 3rd party voices).

Sunday, March 8, 2009

Montenegrin Keyboard

Montenegro became independent of Serbia in 2006. For those who might want to use Apple's Latin and Cyrillic Serbian keyboard layouts with the Montenegrin flag showing in the Finder, the files for this can be downloaded here. For information on the status of Montenegrin as a language, see this article.