
Re: [tlug] Big5 Vs. Unicode Vs. Netscape 4.x Vs. deadline
- Date: Tue, 6 May 2003 18:28:15 -0700 (PDT)
- From: Jake Morrison <jake_morrison@example.com>
- Subject: Re: [tlug] Big5 Vs. Unicode Vs. Netscape 4.x Vs. deadline
Jonathan,
I used to do similar applications (web interfaces
to LDAP directories) back in the Netscape 4
days. My experience was that converting on the fly
between UTF-8 and Big5 had minimal performance impact.
I used Perl (mod_perl), with the conversion
done in C.
I would expect that Java would be able to
do this reasonably efficiently. But you never know...
I often find the Java development environment/process
to be less effective than what we were doing with
Perl/Apache years ago :-).
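
A minimal sketch of what the on-the-fly conversion could look like in Java. It is not taken from the original application: the class and method names are hypothetical, and it assumes the JVM in use ships the "Big5" charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; names are illustrative only.
class Big5Convert {
    // Assumption: the running JVM provides a charset named "Big5".
    static final Charset BIG5 = Charset.forName("Big5");

    // Decode UTF-8 bytes to a String, then re-encode as Big5.
    // Characters with no Big5 mapping are replaced by the encoder's
    // default replacement (usually '?'), so this can lose data.
    static byte[] utf8ToBig5(byte[] utf8) {
        String text = new String(utf8, StandardCharsets.UTF_8);
        return text.getBytes(BIG5);
    }

    // The reverse direction, e.g. for form input arriving as Big5.
    static byte[] big5ToUtf8(byte[] big5) {
        String text = new String(big5, BIG5);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two common Chinese characters, written as escapes so the
        // source file encoding does not matter.
        byte[] utf8 = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        byte[] big5 = utf8ToBig5(utf8);
        System.out.println("UTF-8 bytes: " + utf8.length
                + ", Big5 bytes: " + big5.length);
    }
}
```

In a servlet the same idea usually comes down to decoding the database value into a String and setting the response charset to Big5, so that the container does the encoding on output.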
Putting Big5 in the database is OK, too. It is best
if the db supports Big5 code, though, otherwise there
may be weird query results. Big5 can be a pain
to work with, as the 2nd byte of a character can fall in the
ASCII range (0x40-0x7E) and look like a "special" character which
needs quoting -- \ (0x5C) is the usual offender, and it causes
problems with SQL and anything else that treats it as an escape.
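
To illustrate the 2nd-byte problem, here is a small sketch that walks a Big5 byte string and reports trail bytes that land in the ASCII range. The class and method names are hypothetical, it assumes the JVM provides the "Big5" charset, and it treats any byte from 0x81 to 0xFE as a lead byte, which is a simplification.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names are illustrative only.
class Big5TrailBytes {
    // Assumption: the running JVM provides a charset named "Big5".
    static final Charset BIG5 = Charset.forName("Big5");

    // Returns the offsets of 2nd bytes that are below 0x80, i.e. bytes
    // that byte-oriented code would mistake for plain ASCII.
    static List<Integer> asciiTrailOffsets(byte[] big5) {
        List<Integer> offsets = new ArrayList<>();
        int i = 0;
        while (i < big5.length) {
            int b = big5[i] & 0xFF;
            if (b >= 0x81 && b <= 0xFE && i + 1 < big5.length) {
                // Lead byte: the next byte belongs to the same character.
                int trail = big5[i + 1] & 0xFF;
                if (trail < 0x80) {
                    offsets.add(i + 1);
                }
                i += 2;
            } else {
                // Single-byte (ASCII) character.
                i += 1;
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // U+529F is a common character whose Big5 trail byte is 0x5C,
        // the same value as an ASCII backslash.
        byte[] data = "\u529f".getBytes(BIG5);
        for (int off : asciiTrailOffsets(data)) {
            System.out.printf("offset %d: 0x%02X%n", off, data[off] & 0xFF);
        }
    }
}
```

Code that quotes or escapes Big5 data one byte at a time, rather than one character at a time, is what trips over these bytes.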
Jake
--- Jonathan Q <jq@example.com> wrote:
> Let me present you with a hypothetical situation.
>
> I disavow any association with it except for having
> recently hypothetically stepped into a sort of hypothetical
> rescue-kibitzer role.
>
> A company developed a database-backed intranet for a certain
> other, large company's office in a rather prosperous part of
> China. The initial development was in English and now a Chinese
> translation is being done. The programmer working on this
> created the Chinese-language entries in the database in
> Unicode.
>
> Today, she learned some interesting facts:
>
> 1) Netscape 4 doesn't support Unicode;
>
> 2) 90%+ of the customer's staff are using Netscape 4.
> Telling them to upgrade is out of the question.
>
> The site is using JBoss and Apache for Windows, along with
> some Other Company's database.
>
> Her options at this point would seem to be:
>
> 1) Write or find a servlet that will convert the Unicode
> in the database to Big5 on the fly;
>
> 2) Throw all caution to the wind and convert the entire
> database to Big5 and be done with it.
>
> Oh, and did I mention that the project due date is Friday, so
> she's expected to have it in the customer's hands on Thursday so
> they can start checking it before the weekend?
>
> No milestone versions or betas have been done at all. Like I
> said, I disavow all association with that hypothetical project.
>
> I also hypothetically advised her that she really needs to
> have a good input filter to make sure that whatever the
> customer's staff input to the database, it is converted to
> Unicode or whatever else the database ends up finally using,
> since otherwise your database will doubtless quickly fill with
> all sorts of crap.
>
> She seems a bit too young to know about ugly old browsers and
> a bit thin on knowledge of the pitfalls of multi-byte platform
> issues.
>
> So, my question to you good people (and BOFHs :-) is, "What would
> you advise her to do?" I'm sort of leaning toward solution 2, plus
> the input filter (of course), since the customer has thousands
> of employees and all of that outbound conversion could lead to
> significantly elevated server loads that they haven't planned
> on or budgeted for. On the other hand, keeping the database in
> Unicode is probably a cleaner solution.
>
>
> TIA,
> Jonathan