Mailing List Archive



[tlug] Re: font encoding question



burlingk@example.com writes:

 > > What is insufficiently inclusive about "universal"?
 > 
 > I had read somewhere that if you tried to use Chinese, Japanese, 
 > and Korean together, that sometimes UTF-8 ran into issues.  

No, it is people who have issues, not UTF-8.  "ichi" is "ichi" no matter
which language.  The problem that people have is that they want to
visually distinguish text in Japanese from text in Chinese with
different fonts.  But it's more or less like asking for a different
charset for writing C keywords and Python keywords, and claiming that
C "for" is a different word from Python "for".  You solve it the same
way: you use markup to specify different fonts, or you do something a
little intelligent about recognizing whether text "looks like"
Japanese or "looks like" Chinese, and applying a font accordingly.
It's a little bit harder and less accurate than for programming
languages, but few editors ever get confused about the "for"s in

#include <stdio.h>   /* printf */
#include <stdlib.h>  /* exit */

/* this is a bit of code I wrote for TLUG */
int
main (int argc, char **argv) {
  int i;
  for (i = 0; i <= argc; ++i)
    printf ("There's nothing wrong with Unicode for Japanese.\n");
  exit (0);
}

and mark all three up as keywords.  You can expect that a moderately
smart routine should be able to do reasonably well at discriminating
among the Han-using languages (Hangul and kana are dead giveaways
for Korean and Japanese, respectively).
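
As a rough sketch of what such a routine might look like (an
illustration only, not taken from any editor or library), the
following assumes NUL-terminated UTF-8 input and simply counts code
points in a few Unicode blocks.  The names script_guess, decode_utf8,
and guess_script are hypothetical, and the block ranges cover only the
most common characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result labels, invented for this sketch. */
typedef enum {
  GUESS_UNKNOWN,   /* no Han, kana, or Hangul seen */
  GUESS_JAPANESE,  /* kana present */
  GUESS_KOREAN,    /* Hangul present */
  GUESS_HAN_ONLY   /* Han characters only: ambiguous, possibly Chinese */
} script_guess;

/* Decode one UTF-8 sequence at s into *cp and return its length in bytes.
   Malformed input yields U+FFFD and a length of 1.  Overlong forms and
   surrogates are not rejected; this is a sketch, not a validator. */
static size_t
decode_utf8 (const unsigned char *s, uint32_t *cp)
{
  if (s[0] < 0x80) { *cp = s[0]; return 1; }
  if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
    *cp = ((uint32_t) (s[0] & 0x1F) << 6) | (uint32_t) (s[1] & 0x3F);
    return 2;
  }
  if ((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80
      && (s[2] & 0xC0) == 0x80) {
    *cp = ((uint32_t) (s[0] & 0x0F) << 12)
          | ((uint32_t) (s[1] & 0x3F) << 6) | (uint32_t) (s[2] & 0x3F);
    return 3;
  }
  if ((s[0] & 0xF8) == 0xF0 && (s[1] & 0xC0) == 0x80
      && (s[2] & 0xC0) == 0x80 && (s[3] & 0xC0) == 0x80) {
    *cp = ((uint32_t) (s[0] & 0x07) << 18) | ((uint32_t) (s[1] & 0x3F) << 12)
          | ((uint32_t) (s[2] & 0x3F) << 6) | (uint32_t) (s[3] & 0x3F);
    return 4;
  }
  *cp = 0xFFFD;
  return 1;
}

/* Guess which Han-using language a UTF-8 string is written in. */
script_guess
guess_script (const char *text)
{
  const unsigned char *p = (const unsigned char *) text;
  size_t kana = 0, hangul = 0, han = 0;

  assert (text != NULL);
  while (*p != '\0') {
    uint32_t cp;
    p += decode_utf8 (p, &cp);
    if (cp >= 0x3040 && cp <= 0x30FF)           /* Hiragana and Katakana */
      ++kana;
    else if ((cp >= 0xAC00 && cp <= 0xD7A3)     /* Hangul Syllables */
             || (cp >= 0x1100 && cp <= 0x11FF)  /* Hangul Jamo */
             || (cp >= 0x3130 && cp <= 0x318F)) /* Hangul Compatibility Jamo */
      ++hangul;
    else if (cp >= 0x4E00 && cp <= 0x9FFF)      /* CJK Unified Ideographs */
      ++han;
  }
  if (kana > 0 && kana >= hangul) return GUESS_JAPANESE;
  if (hangul > 0) return GUESS_KOREAN;
  if (han > 0) return GUESS_HAN_ONLY;
  return GUESS_UNKNOWN;
}
```

A caller could pick a font from the result.  Text containing only Han
characters stays ambiguous under this heuristic, which is where markup
or something statistically smarter would have to take over.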

 > I could be wrong though, because it was something seen on a mailing 
 > list, and not an actual specification page. ^^;

Oh, there have been whole books written about how Unicode will be the
death of the Japanese language.  E.g., 「いま日本語があぶない!」
("The Japanese Language Is in Danger Now!").


