
Internationalization woes



Fri, 10 Mar 2006 17:22:07 -0500 comp.sys.mac.apps


John Chambers...
I've embarked on the challenge of "internationalization" of a bunch of my files,

John Chambers...
...

Well, my original message got truncated when I pasted in the displayed
text. But I suppose that illustrates the problem, too. I can cut and paste
UTF-8 text between some apps, mostly the ones that take plain text. But
most of the "Mac" apps go insane when they see anything outside the
Latin-1 range. It looks like Thunderbird also goes insane when it gets
a bit of multi-byte UTF-8 text.

Anyone here know where to learn to make this work right? I'd guess
there are Chinese and Arabic Mac users; what do you have to do to make
Mac software handle your language sanely?

see_signature...
Use Unicode fonts? Seriously, there are only two apps I use that don't
handle Arabic well, apart from old, non-Unicode-aware apps, and those are
Office 2004 and Safari.

John Chambers...
Well, that's one of the things I've been trying to learn about. I think I've
installed some fonts in addition to the ones that came with Tiger, but I
see lots of evidence that I probably haven't done it right. I've been rather
frustrated by the lack of clear instructions on the topic.

For example, one of my makefile commands uses the pstopdf command
to create the .pdf file from the output of a program that generates PS:

: make GIS_4T_D.utf8.ps
pstopdf GIS_4T_D.utf8.ps
Gothic not found, using Courier.
Palatino not found, using Courier.
Times not found, using Courier.
Times-Narrow not found, using Courier.
Times-Courier not found, using Courier.
Times-New-Roman not found, using Courier.
: open -a Preview GIS_4T_D.utf8.pdf
:

The GIS_4T_D.utf8.ps file does appear, but as you can see, pstopdf has
some unspecified problem with fonts. There's a "man pstopdf" page, but
it doesn't contain the string "font" in any capitalization. Asking Mac Help
about "unicode fonts" gets a "No pages with your search words were found"
result. A web search isn't much help, either. Funny thing is that I can see in the
Preview window showing GIS_4T_D.utf8.pdf that several fonts were used,
including Italic and sans-serif fonts, so it finds some fonts. I can look in
the GIS_4T_D.utf8.ps file via Textedit or a Terminal window and see that
it shows Chinese and Arabic text, but the Preview window shows Latin-1
gibberish instead.

So is there some place I haven't found that explains how to install unicode
fonts so that pstopdf, Preview, Acrobat and my printer can use them successfully?
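One way to narrow this down is to see which font names the PostScript file actually asks for, and compare them with the "not found" messages. The following is a rough Python sketch, assuming the generator selects fonts with the standard PostScript idiom `/FontName findfont`; a file that uses other operators (for example `selectfont`) would need a different pattern. The function name and sample text are illustrative only.

```python
import re

# Matches the common PostScript idiom:  /Times-Roman findfont
FINDFONT = re.compile(r"/([A-Za-z0-9_.-]+)\s+findfont")

def requested_fonts(ps_text):
    """Return the sorted, de-duplicated font names requested via findfont."""
    return sorted(set(FINDFONT.findall(ps_text)))

# Hypothetical PostScript fragment, not taken from the real file.
sample = (
    "/Times-Roman findfont 12 scalefont setfont\n"
    "/Palatino findfont 10 scalefont setfont\n"
    "/Times-Roman findfont 9 scalefont setfont\n"
)

if __name__ == "__main__":
    print(requested_fonts(sample))  # ['Palatino', 'Times-Roman']
```

To run it on a real file, read the file as Latin-1 (so that every byte decodes) and pass the text to `requested_fonts`.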

see_signature...
I know very little about the utilities you use on the CLI, but are you
sure the fonts are installed globally or at least are available for the
user ID you are logged in as?

Just to rehash, and you may very well know this, fonts stored in:

/System/Library/Fonts are the System's fonts.
Don't touch, is the simple message.

/Library/Fonts are global fonts, available to all users. This is the
default for most font installs. Your utils ought to see these, presuming
they are of a valid format.

~/Library/Fonts are fonts installed for a single user.

In addition, if you have Classic on your computer, fonts stored in
/System Folder/Fonts (the OS 9 default) will be available globally.
Remember that Classic can use old-style bitmaps or outdated PostScript
formats that may not work with your *nix utils.

This setup presumes that you don't have font folders on the network or
that you use font utilities such as Suitcase that allow you to store
fonts "anywhere". Both of those may be untrue.
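To check what is actually installed in each of those locations, something like the following Python sketch could help. It only lists file names in the three standard OS X font folders named above; it does not validate font formats, and it assumes no font manager has relocated anything. The function name is illustrative.

```python
from pathlib import Path

# The three standard OS X font locations described above.
FONT_DIRS = [
    Path("/System/Library/Fonts"),      # system fonts (don't touch)
    Path("/Library/Fonts"),             # global, all users
    Path.home() / "Library" / "Fonts",  # current user only
]

def list_fonts(directories):
    """Map each directory to a sorted list of the entries it contains.

    Hidden files (names starting with '.') are skipped; a directory
    that does not exist maps to an empty list.
    """
    found = {}
    for directory in directories:
        if directory.is_dir():
            found[directory] = sorted(
                p.name for p in directory.iterdir()
                if not p.name.startswith(".")
            )
        else:
            found[directory] = []
    return found

if __name__ == "__main__":
    for directory, names in list_fonts(FONT_DIRS).items():
        print(f"{directory}: {len(names)} entries")
        for name in names:
            print("   ", name)
```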



With Office it's an MS-acknowledged problem with MS' font rendering on
Mac OS X (the problem is MS', not Apple's). Office 2004 can't even open
Office 2003 Arabic documents properly. :-(
So - MS is not even MS-compatible.

With Safari I don't know what it is, but many pages will not load the
Arabic properly - i.e., they load the characters without ligatures,
making them essentially unreadable to any Arabic speaker.

Apart from that, TextEdit, Mellel, NeoOffice 1.2, Firefox - more or less
everyone - treat Arabic (Unicode) straight out of the box just dandy.

John Chambers...
[...] mostly by converting them to UTF-8. This turns out, as one might expect, to be not
all that difficult for traditional unix programs, most of which treat text as a
string of bytes, with "unknown" chars just carried along unchanged. But I've
had a lot of problems with OS X apps, and I'm looking for some good advice.
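For files in a known legacy encoding, the conversion step itself is short. Here is a minimal Python sketch, assuming the source bytes are ISO-8859-1 (the post does not say which encodings the original files used); the function names are illustrative.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8.

    Every byte value is a valid Latin-1 character, so the decode step
    cannot fail; it is up to the caller to know the input really is Latin-1.
    """
    return data.decode("latin-1").encode("utf-8")

def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

sample = b"caf\xe9"                  # "café" in Latin-1
converted = latin1_to_utf8(sample)   # b"caf\xc3\xa9"

if __name__ == "__main__":
    print(is_valid_utf8(sample), is_valid_utf8(converted))  # False True
```

Checking `is_valid_utf8` first avoids converting a file twice, which would turn each accented character into two-character gibberish.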

I found a very useful test case, in the form of a nice Chinese song, which I first
transcribed in the ABC music notation (web-search for it), and then fed to some
programs that convert it to PS and PDF to get normal music notation. By some
chance, I have accounts on linux and FreeBSD machines that are running web
servers, as is my Mac PowerBook (10.4.5). The song has a Chinese title, a Pinyin
transliteration, and an English title.

The real fun started when a friend asked if I could make an Arabic version.
I have a start, with four titles. Combining Chinese, English and Arabic is
a nice worst-case test. I can cat the file in a terminal window on OS X, linux
and FreeBSD, and the titles look fine. It even works if I ssh from one machine
to another. I can edit the file on all three systems, and when I save it, the text
in all three languages suffers no damage. I can even use TextEdit on the Mac,
and it works fine; that's where the Arabic title got added. (My wife did it; her
Arabic is a lot better than mine. ;-) I can also fetch the file on several browsers.
Mozilla, Firefox and Opera all display it fine, though Safari produces bizarrely
wrong Arabic.
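A quick way to confirm that a file survived an edit or a transfer is to check that it still decodes as UTF-8 and still contains letters from each expected script. The Python sketch below is a rough heuristic, not a real script classifier: it uses the first word of each letter's Unicode character name. The sample string is made up and is not the actual song title.

```python
import unicodedata

def scripts_present(text):
    """Return the set of first words of the Unicode names of all letters.

    For example 'LATIN', 'CJK' or 'ARABIC'. Non-letters are ignored.
    """
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts

def check_utf8(data: bytes):
    """Decode bytes as UTF-8 (raising UnicodeDecodeError if damaged)
    and report which scripts the text contains."""
    return scripts_present(data.decode("utf-8"))

# Hypothetical mixed-script sample: Latin, one Chinese character, two Arabic letters.
sample = "abc \u7eff \u0628\u0627"

if __name__ == "__main__":
    print(sorted(check_utf8(sample.encode("utf-8"))))  # ['ARABIC', 'CJK', 'LATIN']
```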

The problem is getting the PS and PDF versions on the screen or a printer.
I can show that the PS contains the correct titles. You can see them all at:

Vienna Teng fans will recognize this. The .abc and .txt files are the same file,
but the .abc file produces a MIME type that your browser probably refuses to
handle, so look at the .txt file unless you have ABC software. The titles in the
.pdf and .ps files will probably look like 8859-1 gibberish. It looks something
like this (depending on your news reader ;-):

绿å2å ̧°
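This kind of gibberish is what you get when UTF-8 bytes are displayed one byte per character, as if they were ISO-8859-1. The Python sketch below reproduces the effect for a single Chinese character and shows that the damage is reversible as long as no bytes were dropped. It illustrates the general mechanism only; it is an assumption, not a confirmed diagnosis, that pstopdf or Preview is doing exactly this.

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")

def repair(garbled: str) -> str:
    """Undo the misreading: recover the original bytes, decode as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")

original = "\u7eff"                     # the character 绿, UTF-8 bytes E7 BB BF
garbled = misread_as_latin1(original)   # three Latin-1 characters: ç » ¿

if __name__ == "__main__":
    print(original.encode("utf-8"))     # b'\xe7\xbb\xbf'
    print(len(garbled))                 # 3
    print(repair(garbled) == original)  # True
```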