Wednesday, December 31, 2008

New Lao Font for OS X

Thanks to John Durdin a new free Lao font is now available. Saysettha MX, which includes Regular, Bold, Oblique, and Bold Oblique versions, is an OpenType font designed especially for optimal display in OS X apps. For full info and links to download the font and a keyboard layout, go to the OS X Section of John's Laoscript site.

Tuesday, December 23, 2008

Typing Equations and Formulas

While not exactly a "language," the special characters and formatting needed to input mathematical and other equations and formulas can present challenges at least as difficult as a complex Unicode script. Often the only way to get satisfactory display is to compose the equation in a special editor, save it as a PDF, and paste it into your word processor. Here are some examples of such editors that can be used with OS X:

+Grapher (in Applications/Utilities, Window > Show Equation Palette)

+AppleWorks (if you have it -- Edit/Insert Equation)


+MathType (30 days free trial, then becomes "Lite" version)

+Online LaTeX (use Firefox)

+MathMagic (30 days free trial)

Thursday, November 13, 2008

Typing Urdu

Urdu is the national language of Pakistan and a close relative of Hindi, but written right-to-left with Arabic script rather than left-to-right with Devanagari. A particular challenge in creating Urdu text on the Mac is that the preferred script style is Nastaliq. Unfortunately the only available modern Nastaliq fonts are for Windows, and won't work in OS X apps, except for Mellel, OpenOffice/X11, and Leopard TextEdit.

I don't know Urdu, but I tried several Windows Nastaliq fonts, and the ones that look best to me are Alvi and Nafees. My iDisk has several Urdu keyboard layouts, of which Urdu-Qwerty is probably a good one to try.

Here is an example of a short Urdu text in Mellel, first in standard script and then in Nastaliq:

Comments/corrections welcome as always.

New iPod Song Languages

The most recent standard iPods have apparently added some language capabilities for song information display. According to their tech specs, the iPod Classic now has Vietnamese, while the iPod Nano has both Vietnamese and Thai.

Sunday, October 26, 2008

Character Palette Unicode Versions

You may have noticed that OS X's Character Palette can be out-of-date regarding the script and character names in the latest version of Unicode. That's because Apple apparently only updates this data in the named OS X releases. The last updates of Tiger, from 2007, still use the data from Unicode 4.0 of 2003. The most recent update of Leopard, from September 2008, has data from Unicode 5.0 of mid-2006.

This does not matter much in practical terms, however. Both Tiger and Leopard, with appropriate fonts installed, seem able to input and display all scripts and characters included in Unicode 5.1 of April, 2008.

Tuesday, October 21, 2008

Unicode Bug in Pages

In an earlier article I mentioned some fonts are now available for the new scripts in Unicode 5.1. If you try to use these in Pages (or Keynote or iWeb), you may find that they do not work. For some reason the text engine in these apps does not allow direct input of characters not included in the Unicode version embodied in the OS -- i.e. characters which do not have a name when you select them in the Character Palette.

Strangely, this bug only affects characters in the Unicode BMP (Basic Multilingual Plane) -- direct input of those at U+10000 and up is not a problem.
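
To make the BMP distinction concrete, here is a small Python sketch, offered only as an illustration and not as anything taken from Pages itself. The helper names and the two sample code points (U+A500 in the Vai block and U+10280 in the Lycian block, both from scripts new in Unicode 5.1) are my own choices.

```python
def plane(code_point: int) -> int:
    """Return the Unicode plane number (plane 0 is the BMP)."""
    return code_point >> 16


def is_bmp(code_point: int) -> bool:
    """True for U+0000..U+FFFF, the range the Pages bug seems to affect."""
    return plane(code_point) == 0


# Hypothetical samples: one BMP script and one supplementary-plane script.
samples = {"Vai": 0xA500, "Lycian": 0x10280}
for script, cp in samples.items():
    where = "the BMP" if is_bmp(cp) else f"plane {plane(cp)}"
    print(f"{script}: U+{cp:04X} is in {where}")
```

By this test a Vai character would fall in the range where direct input fails, while a Lycian one would not.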

A workaround is to compose in TextEdit (or another app) and copy/paste into Pages.

Another bug is the apparent inability to input ZWJ and ZWNJ in Pages. These characters are required for correct encoding of some languages using the Arabic, Tamil, and Devanagari scripts.
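
If you want to check whether these invisible characters survived in a piece of text (for instance after exporting it from an app), a short Python sketch like the following may help. It is only an illustration: the function name is made up, and the sample string is a Devanagari sequence (ka, virama, ZWJ, ssa) that I chose as an example.

```python
import unicodedata

ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER


def count_joiners(text: str) -> dict:
    """Count ZWNJ and ZWJ occurrences, keyed by their Unicode names."""
    return {unicodedata.name(ch): text.count(ch) for ch in (ZWNJ, ZWJ)}


# Hypothetical sample: Devanagari ka + virama + ZWJ + ssa.
sample = "\u0915\u094D" + ZWJ + "\u0937"
print(count_joiners(sample))
```

A count of zero where you expected a joiner suggests the character was dropped somewhere along the way.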

Sunday, October 19, 2008

Arabic Transcription Tools

I have come across a site with excellent resources for those engaged in Arabic transcription. Although it is in German you should have no difficulty seeing how to download various fonts and the Orientalist Keyboard Layout. The latter is designed to facilitate typing Arabic Latin transcription in both the English and DMG conventions.

Bug in iPhone/iPod Chinese Handwriting Input

The Chinese handwriting input on the latest iPhone/iPod Touch is a pretty impressive feature. But when using the Simplified Chinese version of this, you may find that only 4 character choices are given, so that if none is correct you have to start over. If you switch to the Traditional Chinese version, you will find a button over at the left which says 其他. Tapping this will give you additional sets of 4 choices, where you will probably find the correct one.

The lack of the 其他 button for Simplified Chinese is presumably a bug which Apple will need to fix.

Thursday, October 16, 2008

An Interesting Virtual Keyboard

While Apple's Keyboard Viewer can be quite useful, it lacks some capabilities one expects in a true virtual keyboard, since clicking the mouse on Shift, Option/Alt, and other modifier keys does not have any effect on input -- to create uppercase or accented characters you still need to use the physical keyboard. 3rd party alternatives have been very costly, but I recently found a cheaper one which may meet the needs of some people: VirtualKeyboard

Among other features, this keyboard can be varied in size and transparency, and can also be used with at least some non-US keyboard layouts.

Sunday, October 5, 2008

The Case of the Bogus Chinese

Recently in the Apple forums someone had a problem with his normal English text turning into Chinese when he applied QuickTime's Export > Text to Text > Text with Descriptors function. Examination of the output file showed the encoding was marked as 256, which means UTF-16, even though the text itself was just ASCII. So for example the two characters "th", with hexadecimal byte values 74 68, were being read as a single two-byte character U+7468, or 瑨.

From my testing it appears that when the input text is UTF-16, QT retains this in the Descriptors, but converts the text itself to ASCII. Must be a bug.
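
The misreading is easy to reproduce. The Python sketch below is just an illustration of the effect, not of what QuickTime does internally; it assumes the reader treats the bytes as big-endian UTF-16.

```python
# ASCII text stored one byte per character.
ascii_bytes = "th".encode("ascii")

# The same two bytes wrongly decoded as one big-endian UTF-16 code unit.
misread = ascii_bytes.decode("utf-16-be")

print(ascii_bytes.hex())          # the raw bytes in hex
print(f"U+{ord(misread):04X}")    # the single code point they turn into
print(misread)                    # the resulting Chinese character
```

Two ASCII letters become one Han character, which is why a whole English sentence turns into half as many Chinese characters.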

Saturday, September 20, 2008

Arabic Localization of OS X

I've just recently come across this report providing links to a source (Arab Business Machine Ltd) for Arabic localization files for OS X 10.4.10 and 10.5.2. If anyone has tried these out, I'd be grateful for any comments. It looks like iLife 08 in Arabic is available from ABM as well.

Tuesday, September 16, 2008

Babylon Translation Software Now Available for Mac

The Babylon translation software, previously only available for PCs, now has a Mac version. Unfortunately the site offers little info about what it can do and no trial download. If anyone uses this, I'd be grateful for an evaluation.

Hebrew Localization of OS X

If you are interested in having the OS X menus and dialogues displayed in Hebrew, check out this page. If anyone installs it, I'd love to have an evaluation.

Tuesday, September 9, 2008

10 New Keyboards for iPod Touch and iPhone

It looks like the Sept 9, 2008 firmware update 2.1 provides the following 10 new keyboards to the iPhone 3G and the iPod Touch: Czech, Estonian, Croatian, Hungarian, Icelandic, Lithuanian, Latvian, Romanian, Slovak, and Turkish. However no new dictionaries (thus none for the keyboards mentioned above) have been added. Arabic or Hebrew or Greek input is also still not available.

The full list can be found in the tech specs.

Saturday, August 30, 2008

Fonts for New Characters in Unicode 5.1

In an earlier article we mentioned the new scripts and other items included in Unicode 5.1. Fonts are now becoming available for some of them as indicated below.

Kayah Li, Cham, Ol Chiki, Rejang, Saurashtra, Vai: Code 2000

Carian, Lycian: Aegean

Lepcha, Lydian, Sundanese: None so far.

Phaistos Disk: Aegean, Code 2001

Dominoes: Code 2001, Unicode Symbols

Mahjong: Unicode Symbols

Sunday, August 17, 2008

Optimus Multilingual Keyboard

Someone has actually finally gotten a copy of the long-awaited Optimus keyboard, which can display the keyboard layout for any language directly on its keys via 113 OLED screens underneath them. For more information see this blog.

This report indicates that the keyboard may not actually work the way one would expect, at least on a Windows machine.

Here is another review.

Friday, July 11, 2008

MobileMe Language Options Same as .Mac

MobileMe replaced .Mac July 11 or so. I had somewhat expected, given the broader international dimension of the marketing for the iPhone 3G, that this new service would also be available in a wide range of languages. As far as I can tell, however, it is limited to the same 4 as was .Mac -- English, French, German, and Japanese.

Monday, June 30, 2008

Sinhala Font and Keyboard for Testing

Thanks to Nick Shanks, there is now an OS X Sinhala font available for testing here. A first stab at a "wijeyasekara" keyboard layout for this script can be had on my iDisk.

I'm sure the keyboard has errors, so I would be grateful for any users willing to test it and report them to me for correction.

Monday, June 9, 2008

iPhone 3G Expands Language Capabilities

According to its tech specs, Apple's iPhone 3G, which was announced June 9 and should be available July 11, has language support for English, French, German, Japanese, Dutch, Italian, Spanish, Portuguese, Danish, Finnish, Norwegian, Swedish, Korean, Simplified Chinese, Traditional Chinese, Russian, and Polish.

International keyboard and dictionary support is available for English (U.S.), English (UK), French (France), French (Canada), German, Japanese, Dutch, Italian, Spanish, Portuguese (Portugal), Portuguese (Brazil), Danish, Finnish, Norwegian, Swedish, Korean (no dictionary), Simplified Chinese, Traditional Chinese, Russian, and Polish.

I understand that a software upgrade for the iPod Touch will be available in July to give it the same language features.

From reports I have received, fonts are now present for a number of additional languages, but the software is still not capable of correct display of complex scripts like Arabic, Devanagari, Tamil, and Tibetan which can be handled by the full OS X without problem.

Thursday, April 17, 2008

Internationalized Domain Names

In recent months Internationalized Domain Names -- URLs and email addresses written in scripts other than Latin -- have been set up for testing by ICANN. You can see whether your browser is equipped to handle the IDNA protocol which these use by clicking on links at the bottom of this page. You can similarly test your email client here.

Note the difference in what appears in the browser address bar when you point Safari at the topmost (Arabic) site and the (9th down) Russian site. The former will be in the native script, while the latter will be in an ASCII translation called Punycode. This is done because Russian script and Roman script can be confusable and create security problems. Which scripts generate Punycode is determined by a "whitelist" in the Safari app. Info on this and other aspects of Safari support for IDNA can be found here.

IDNA is currently limited to the range of scripts included in Unicode 3.2 in 2002. Since then nearly 30 more have been added, and the IETF is working on an update that will accommodate Unicode 5.1 and any future version.
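
To see what the Punycode form of a label looks like, you can use the "idna" codec built into Python, which implements the original IDNA specification. This is only a sketch: the Russian word used as the label is my own example, not one of the test sites mentioned above.

```python
# Hypothetical label: the Russian word for "example".
label = "пример"

# Convert to the ASCII-compatible form used in DNS (Punycode with an xn-- prefix).
ascii_form = label.encode("idna")
print(ascii_form)

# Convert back to the native-script form a browser might display.
print(ascii_form.decode("idna"))
```

The "xn--" prefix is what marks a label as Punycode in the address bar.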

Tuesday, April 1, 2008

New Welsh Language Tools Available

Those who need to work in Welsh may be interested in Cysgliad, which has a spell and grammar checker plus dictionaries for this language. Look toward the bottom of the page for the OS X version.

Thursday, March 20, 2008

Unicode 5.1

The Unicode Standard is expected to be officially updated to version 5.1 around the end of March, and a draft summary of the changes is available. 1624 new characters will be added, bringing the total number defined to 100,507. New scripts are Kayah Li, Cham, Lepcha, Ol Chiki, Rejang, Saurashtra, Sundanese, Vai, Carian, Lycian, Lydian, plus the Phaistos disc, dominoes, and Mahjong symbols.

Unicode 5.1 will also enable the use of ideographic variation sequences. These allow standardized representation of variant glyphs in Japanese, Chinese, and Korean, according to the Unicode Ideographic Variation Database for such sequences.

Saturday, March 15, 2008

OS X 10.5 Leopard: Fix For Vietnamese VIQR IM

A reader of this blog has pointed out that the VIQR option in Leopard's new Vietnamese IM seems to be completely broken. For a possible replacement, download the layout found here.

Sunday, February 24, 2008

OS X 10.5 Leopard: Fixing the Macedonian Keyboard

Although Leopard corrected some errors in Tiger's Macedonian keyboard layout, the current version still seems to be missing two characters which should be used in certain circumstances, namely ѐЀ and ѝЍ. If you have an ISO/European keyboard, the fix is to download a replacement layout here. If you have an ANSI/US keyboard, you should download Macedonianz.keylayout from my iDisk.

If using Macedonianz, you make the extra characters by typing Option + `, followed by the base letter.
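
In Unicode terms, each of these four characters is equivalent to a base letter followed by a combining grave accent, which is roughly what the dead key mimics. The Python sketch below illustrates that relationship using NFC normalization; it is an analogy only and says nothing about how the keyboard layout file itself is built.

```python
import unicodedata

GRAVE = "\u0300"  # COMBINING GRAVE ACCENT

# Base letters: Cyrillic small/capital IE and small/capital I.
bases = ["\u0435", "\u0415", "\u0438", "\u0418"]

# NFC normalization composes base + accent into one precomposed character.
composed = [unicodedata.normalize("NFC", base + GRAVE) for base in bases]

for base, ch in zip(bases, composed):
    print(f"{base} + grave -> {ch} (U+{ord(ch):04X})")
```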

Monday, February 11, 2008

10.5.2 Update Fixes Russian-PC, Latvian, and Chinese Keyboards?

According to its release notes, the 10.5.2 update for OS X fixes the problems mentioned in this blog with the Russian PC and Latvian keyboard layouts that came with 10.5.0. But I still don't see any ё in the Russian-PC ANSI layout on my machine. Apparently it is only present on an ISO/European keyboard. On the other hand, although not mentioned, it looks like they may have fixed the bug in HaninYiTian.

Getting Your Mac to Speak Other Languages

For an excellent up-to-date reference on other voices for OS X's Text-To-Speech features, see this page at Ricky Buchanan's ATMac site.

Wednesday, February 6, 2008

If Your Chinese Characters Don't Look Quite Right....

A poster in the Apple Forums recently asked why certain Simplified Chinese characters did not look exactly the way he expected them to, giving as examples bao1 (U+5305) and fang2 (U+623F). Looking these up in the Character Palette, I discovered that they are characters where the Chinese and Japanese versions are visibly different. You can compare them here. The first column is Japanese, while the second two columns are Chinese.

So it appears that the poster's apps were using OS X's Japanese fonts instead of the Chinese ones that he wanted. Aside from switching to the correct font, one possible solution for this is to go to System Preferences/International/Languages and make sure that Simplified Chinese (简体中文) is higher on the list than Japanese (日本語).

This issue arises as a result of the unification of the Han Script in Unicode, under which slightly different versions of characters were given the same code point. Fonts produced for specific languages will nonetheless retain the different versions. For more info you can check here.

Monday, January 28, 2008

App for Advanced Korean Word Processing

If you work in Korean and need features beyond what normal Apple and Windows apps can provide, like additional fonts, vertical layout, spell checking, and support for ancient Hangul characters, I understand that the product to get is Hangul 2006 For Mac. A source for it is here. For more info on the app see this article.

Reading Non-Unicode Tibetan

Although OS X, starting with 10.5, includes support for Unicode Tibetan, it turns out that a number of important Tibetan language sites have not yet reached this point and still use custom fonts with legacy encodings. An example is Radio Free Asia. To view such sites you will need to download and install the special font they use, TCRC YoutsoWeb. You might also need to try a different browser, like Firefox or Opera, rather than Safari, and fiddle with its font preferences.

You can download tcrcyweb.ttf here or from my iDisk.

As for official Chinese sites, these seem to be only in Chinese or to use graphics instead of text. An example of the latter is China Tibet News. An example of a Chinese site in Unicode Tibetan is Tibet Information Technology Web.

Thursday, January 24, 2008

Work-Around for Mail's NBSP Bug

A poster in the Apple forums has pointed out that Mail appears to have a strange bug: there is no way to input U+00A0/NBSP "No-Break Space". When you try to do this, either via the keyboard (Alt/Opt + space) or the Character Palette or via copy/paste, only an ordinary Space (U+0020) is produced in the text.

This is a problem because French text, for example, should ideally have an NBSP before certain punctuation, especially ! and ?. Using ordinary spaces means that these marks can get separated from the text they belong to at line endings, which is very ugly.

A possible work-around is to use U+202F "Narrow No-Break Space" instead, which Mail does accept for input. Unfortunately I don't think any standard keyboard layouts have this character, so you have to create a custom layout or input it from the Character Palette or via similar means. Also it could cause problems if the other end is using software that doesn't understand Unicode or fonts that don't handle 202F correctly.

The best tool for making a custom layout is Ukelele.
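
If you would rather prepare the text outside Mail and paste it in, the substitution can also be scripted. The following Python sketch is one possible approach under my own assumptions: the function name is invented, and it only handles a space directly before ! or ?, the two marks singled out above.

```python
import re

NNBSP = "\u202F"  # NARROW NO-BREAK SPACE


def protect_punctuation(text: str) -> str:
    """Replace an ordinary or no-break space before ! or ? with U+202F."""
    return re.sub(r"[ \u00A0](?=[!?])", NNBSP, text)


sample = "Quoi ? Mais non !"
fixed = protect_punctuation(sample)
print(fixed.encode("unicode_escape").decode("ascii"))
```

Printing the escaped form makes the otherwise invisible replacement visible, so you can confirm that only the spaces before the punctuation were changed.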

Wednesday, January 16, 2008

Mac Office 2008: Still No Arabic/Hebrew Support

MS Office for Mac 2008 was released Jan. 15, and info on its language capabilities is now available here. Notable is a) the continuing lack of ability to work with Arabic/Hebrew, and b) the presence of Tamil on the list of supported languages. First reports I have indicate that Word 2008 cannot in fact do Tamil (the font is not recognized and you just see squares). One source says the other parts of Office do display it. There are also reports that both PowerPoint and Entourage can do Arabic/Hebrew.

iPhone Language Input Capabilities Still Not Expanded

I was surprised to hear that the 1.1.3 iPhone update announced at MacWorld January 15 apparently did not add Japanese or other new keyboard input capabilities to match what has been available since introduction in the iPod Touch. This seems especially odd since the iPhone User Guide (p. 21) has said for months now that these were already present.

Monday, January 14, 2008

Typing Kharosthi

Kharosthi was used to write Sanskrit and Gandhari about 2000 years ago, and is the script of the oldest Buddhist manuscripts yet found. It plays a role in Eliot Pattison's mystery "Water Touching Stone," which I am currently reading. An example of the kind of wooden tablet mentioned in that book can be seen here.

Two fonts which include Kharosthi are Alphabetum Unicode and MPH 2B Damase. A rudimentary keyboard layout can be downloaded from my iDisk.

The available fonts and OS X are not yet capable of displaying Kharosthi properly.