Mailing List Archive
[tlug] Why change a linux server's locale?
- Date: Sat, 9 Feb 2019 20:32:30 +0900
- From: "Stephen J. Turnbull" <turnbull.stephen.fw@example.com>
- Subject: [tlug] Why change a linux server's locale?
- References: <5dd19c80-2ac9-8eb1-ea46-768288dc600f@ncsa.jp>
マスターズ・イアン writes:

> During the time I've been working in Japan (over 15 years), almost
> exclusively in Linux, I've come across many attempts to "push"
> Linux into ... Japanese language encodings such as euc and
> sjis.

There are many inconveniences involved in working with modern software such as Python 3 and even glibc in Japan. For example, infozip utilities take the standard seriously, and assume that Shift-JIS-encoded file names are ISO-8859-1-encoded. But many utilities in common use produce such zipfiles. And glibc is specifically a POSIX system, and takes POSIX locales seriously. It leaks through sometimes.

> It's never been a huge problem, as far as I can remember, but I
> really wonder why it's necessary, given that:

> 1. ja_JP.UTF-8 supports Japanese completely, and

But it does not! Japanese is a language, not a coded character set. Japanese users use EUC, Shift JIS, ISO-2022-JP-some-corporate-variant, and UTF-8 quite catholically, and expect that sewer sludge to be transparent.

Japanese is also a culture where local custom is far more important than Internet or even national standards, especially where "local" is spelled c-o-r-p-o-r-a-t-e (thus the proliferation of "corporate variants" of the JIS character repertoire). Most corporate repertoires are subsets of the modern JIS (and therefore Unicode) repertoires, but they sometimes disagree on what codes are assigned to those characters. It's a huge mess, even today, though much less likely to cause substantial delays or misunderstandings than 30 years ago.

For example, my university considers itself to be the MIT of Japan, yet many of its internal pages are "enterprise software" displaying partial mojibake due to use of iframes and other malware. (They assume use of a Japan-localized version of IE -- I don't think they really support Edge yet -- which prefers automatic recognition of coded character sets to MUST and REQUIRED features of the HTTP and HTML standards. So it just works, unless you have a standard-conforming browser, when the latter is kinda a good thing in today's insecurity environment. :-þ)

> 2. Most things that go on inside a database are unaffected by the OS's locale

> So, my questions are ...

> 1. Do you know of any reason I should be worried about the fact
>    that our development server's locale is ja_JP.UTF-8 but the
>    customer's is ja_JP.sjis (which isn't even supported on Red Hat
>    Enterprise Linux)?

"Worried," no. Expect occasional annoyances and bill accordingly.

Specifically, any customers-of-customer-facing server probably is OK. As others point out, these usually are pretty good about handling text internally as Unicode and spitting it out in the client browser's preferred coded character set. (I assume you would know if that's not true!)

On the other hand, internal-use software often assumes that it knows what the character set is, and that it may be inferred from the locale. Usually this is not a problem (most people know only one language and that one not so well ;-þ, so multilingual environments are uncommon, and roundtripping is a central design principle of Unicode). Somebody just needs to transcode when shipping stuff between systems.

Likely both the dev server and the production server will work fine in their own environments. You don't say, but I guess that you aren't the admin of the customer's server, while the admins are Japanese and the corporate culture is Shift-JIS. The problem is going to be communication between devs and admins, e.g., passing around zip files and perhaps scripts where their server uses an autodetecting "jgrep" that handles Shift JIS and ISO-2022-JP while you have a vanilla GNU grep that expects ASCII-compatible ISO-8859 or UTF-8. And there are potential problems with the file system if file names need to be input from the console (again, web apps are generally more robust).

> 2. When have you found it absolutely imperative to have a Linux
>    server with sjis locale?

Never. In fact, it's likely to get in the way of your own work. The communication problems mentioned above are annoying, but less so than the infelicities of dealing with Shift JIS during daily work.
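
To make the file-name and transcoding points above concrete, here is a minimal Python 3 sketch. It is only an illustration under stated assumptions: the helper names (misread_as_latin1, recover_shift_jis, transcode) and the sample file name are hypothetical, and only the standard-library codecs are real. It shows what a tool that assumes ISO-8859-1 makes of a Shift-JIS file name, how that misreading can be undone, and one well-known case where Python's "shift_jis" and "cp932" codecs map the same bytes to different characters.

```python
# Hypothetical helpers for illustration; only the stdlib codecs are real.

def misread_as_latin1(raw: bytes) -> str:
    """Decode bytes the way a tool that assumes ISO-8859-1 would."""
    return raw.decode("iso-8859-1")

def recover_shift_jis(garbled: str) -> str:
    """Undo an ISO-8859-1 misreading and decode the bytes as Shift JIS."""
    # ISO-8859-1 maps every byte to exactly one code point, so encoding
    # the garbled string back gives the original bytes unchanged.
    return garbled.encode("iso-8859-1").decode("shift_jis")

def transcode(raw: bytes, src: str = "shift_jis", dst: str = "utf-8") -> bytes:
    """Transcode bytes between two coded character sets via Unicode."""
    return raw.decode(src).encode(dst)

# A sample (made-up) file name as a Shift-JIS system would store it.
name = "日本語.txt"
raw = name.encode("shift_jis")

garbled = misread_as_latin1(raw)       # mojibake, not the original name
restored = recover_shift_jis(garbled)  # the original name again

# Shipping text between systems: Shift JIS -> UTF-8 -> Shift JIS.
utf8_bytes = transcode(raw)
back_again = transcode(utf8_bytes, src="utf-8", dst="shift_jis")

# Variants disagree: the same two bytes decode to different characters
# under the JIS-based codec and the Microsoft code page.
jis_char = b"\x81\x60".decode("shift_jis")
ms_char = b"\x81\x60".decode("cp932")

print(repr(garbled), restored, back_again == raw, jis_char != ms_char)
```

The recovery step only works because the misreading was lossless. If an intermediate tool replaced or dropped bytes, the original name is gone, which is one reason to transcode explicitly at the boundary between systems rather than rely on each side's locale.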
- References:
- [tlug] Why change a linux server's locale?
- From: マスターズ・イアン