You may have noticed that OS X's Character Palette can be out of date regarding the script and character names in the latest version of Unicode. That's because Apple apparently only updates this data in the named OS X releases. The last updates of Tiger, from 2007, still use the data from Unicode 4.0 of 2003. The most recent update of Leopard, from September 2008, has data from Unicode 5.0 of mid-2006.
This does not matter much in practical terms, however. Both Tiger and Leopard, with appropriate fonts installed, seem able to input and display all scripts and characters included in Unicode 5.1 of April 2008.
Sunday, October 26, 2008
Tuesday, October 21, 2008
Unicode Bug in Pages
In an earlier article I mentioned some fonts are now available for the new scripts in Unicode 5.1. If you try to use these in Pages (or Keynote or iWeb), you may find that they do not work. For some reason the text engine in these apps does not allow direct input of characters not included in the Unicode version embodied in the OS -- i.e. characters which do not have a name when you select them in the Character Palette.
Strangely, this bug only affects characters in the Unicode BMP (Basic Multilingual Plane) -- direct input of those at U+10000 and up is not a problem.
A workaround is to compose in TextEdit (or another app) and copy/paste into Pages.
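For readers who want to check which side of the BMP boundary a given character falls on, here is a minimal Python sketch. The helper names `is_bmp` and `utf16_units` are my own, and the two sample characters (U+A500 from the Vai block and U+10280 from the Lycian block, both added in Unicode 5.1) are just illustrations, not a statement about which characters trigger the Pages bug.

```python
def is_bmp(ch: str) -> bool:
    """Return True if the character is in the Basic Multilingual Plane (U+0000..U+FFFF)."""
    return ord(ch) <= 0xFFFF


def utf16_units(ch: str) -> int:
    """Return how many 16-bit code units the character needs in UTF-16."""
    return len(ch.encode("utf-16-be")) // 2


# Sample characters written as escapes so no special font is needed to read the code.
samples = ["\u0041", "\ua500", "\U00010280"]

for ch in samples:
    plane = "BMP" if is_bmp(ch) else "supplementary plane"
    print(f"U+{ord(ch):04X}: {plane}, {utf16_units(ch)} UTF-16 code unit(s)")
```

Characters at U+10000 and up are stored as two UTF-16 code units (a surrogate pair), while BMP characters take one. Whether that difference is what lets them slip past the input restriction is only a guess.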
Another bug is the apparent inability to input ZWJ and ZWNJ in Pages. These characters are required for correct encoding of some languages using the Arabic, Tamil, and Devanagari scripts.
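Because ZWJ (U+200D) and ZWNJ (U+200C) are invisible, it can be hard to tell whether they survived a copy/paste. The following Python sketch lists the code points in a string and counts the joiners. The function names and the Devanagari sample sequence are my own illustration, not something taken from Pages or TextEdit.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER


def code_points(text: str) -> list:
    """Return the code points of the text in U+XXXX notation."""
    return [f"U+{ord(c):04X}" for c in text]


def count_joiners(text: str) -> int:
    """Count how many ZWJ and ZWNJ characters the text contains."""
    return text.count(ZWJ) + text.count(ZWNJ)


# Illustrative sample: KA + VIRAMA + ZWJ + SSA, which asks for a half-form of KA.
sample = "\u0915\u094d" + ZWJ + "\u0937"

print(code_points(sample))
print("joiners found:", count_joiners(sample))
print(unicodedata.name(ZWJ), "/", unicodedata.name(ZWNJ))
```

Running `count_joiners` on text pasted back out of an app is a quick way to see whether the joiners were kept or dropped.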
Sunday, October 19, 2008
Arabic Transcription Tools
I have come across a site with excellent resources for those engaged in Arabic transcription. Although it is in German, you should have no difficulty seeing how to download various fonts and the Orientalist Keyboard Layout. The latter is designed to facilitate typing Latin transcription of Arabic in both the English and DMG conventions.
Bug in iPhone/iPod Chinese Handwriting Input
The Chinese handwriting input on the latest iPhone/iPod Touch is a pretty impressive feature. But when using the Simplified Chinese version of this, you may find that only 4 character choices are given, so that if none is correct you have to start over. If you switch to the Traditional Chinese version, you will find a button over at the left which says 其他. Tapping this will give you additional sets of 4 choices, where you will probably find the correct one.
The lack of the 其他 button for Simplified Chinese is presumably a bug which Apple will need to fix.
Thursday, October 16, 2008
An Interesting Virtual Keyboard
While Apple's Keyboard Viewer can be quite useful, it lacks some capabilities one expects in a true virtual keyboard: clicking Shift, Option/Alt, and other modifier keys with the mouse has no effect on input, so to create uppercase or accented characters you still need to use the physical keyboard. Third-party alternatives have been very costly, but I recently found a cheaper one which may meet the needs of some people: VirtualKeyboard
Among other features, this keyboard can be varied in size and transparency, and can also be used with at least some non-US keyboard layouts.
Sunday, October 5, 2008
The Case of the Bogus Chinese
Recently in the Apple forums someone had a problem with his normal English text turning into Chinese when he applied QuickTime's Export > Text to Text > Text with Descriptors function. Examination of the output file showed the encoding was marked as 256, which means UTF-16, even though the text itself was just ASCII. So for example the two characters "th", with hexadecimal byte values 74 68, were being read as a single two-byte character U+7468, or 瑨.
From my testing it appears that when the input text is UTF-16, QuickTime retains this encoding in the Descriptors but converts the text itself to ASCII. This looks like a bug.
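The misreading is easy to reproduce outside QuickTime. This Python sketch takes ASCII bytes and decodes them as if they were UTF-16. I am assuming big-endian byte order, since that matches the 74 68 to 7468 example. The variable names and the longer sample string are my own.

```python
# ASCII text as it was actually stored in the file: one byte per character.
ascii_bytes = "th".encode("ascii")

# The same two bytes read as one big-endian UTF-16 code unit (assumed byte order).
misread = ascii_bytes.decode("utf-16-be")

print(ascii_bytes.hex(), "->", misread, f"(U+{ord(misread):04X})")

# A longer sample with an even number of bytes, so every pair forms one character.
longer = "this is text".encode("ascii").decode("utf-16-be")
print(longer, "has", len(longer), "characters instead of 12")
```

Each pair of ASCII letters becomes one character, nearly always a CJK ideograph, which is why ordinary English text appears to turn into Chinese.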