
Mailing List Archive
[tlug] [OT/long] Yet another JMdict front-end
- Date: Sun, 30 Jul 2006 18:21:58 -0600
- From: Matt Gushee <matt@example.com>
- Subject: [tlug] [OT/long] Yet another JMdict front-end
- User-agent: Thunderbird 1.5.0.5 (X11/20060729)
[apologies if this appears twice--I sent it earlier from the wrong account]
This really has nothing to do with Linux, but I know many of you are
interested in Japanese and Japanese dictionaries, and many of you also
are knowledgeable about Web design & development, so I thought I'd let
you know about a little project I have begun, and solicit some feedback.
To make a long story short, a couple of weeks ago I was looking for an
online Kanji dictionary, and couldn't find one I really liked. Or
rather, I couldn't find an *interface* I really liked. So I decided to
create my own. The site is intended to be fast, easy to navigate, and
aesthetically pleasing. The target audience is English-speaking learners
of Japanese (intermediate-to-advanced?), so the emphasis is on providing
easy access to phrases including the target kanji, readings, and
definitions--basically the kind of info you find in edict/jmdict. It
currently does not provide information of more scholarly interest, such
as Nelson index numbers and all that, though such info could maybe be
added later on as an "advanced option."
Another thing you should know is that my site makes heavy use of the
latest in Web-standard[*] technology. It is an AJAX application, so at a
minimum you need a browser that has JavaScript enabled and supports
XMLHttpRequest. Recent Gecko-based browsers should be fine, along
with IE 5.5+ (?) and Opera 8+. Since the whole point of the project is
to develop a nicer interface to content that is easily available
elsewhere, I don't feel obligated to create an alternative for older
browsers, but of course I provide links to other online Kanji dictionaries.
Here's the URL: <http://matt.gushee.net:8250/index.html>. That's
probably temporary, so even if you really like it, please don't post any
links to it just yet. If you have comments and don't want to clutter
this list, just send an e-mail to <matt@example.com>.
SOME ISSUES TO CONSIDER
=======================
First of all, the title. I am tentatively calling the thing "楽漢摘." I
like to think it's rather a clever pun, but if any native Japanese
speakers are reading this, I'd like to know how it sounds to you. Is it
just a wake-wakaranai gaijin joke? Please don't worry about offending
me--I will be happy to change the title if it is too weird.
Now on to more substantive issues:
Indexing approach
-----------------
There will probably be several indexes in the future, but currently I
provide one way to look up Kanji: a traditional radical/stroke-count
index. Specifically, you select the radical stroke count, then the
radical itself, then the stroke count for the whole character, then the
specific character that you want. Although it is a linear process and
thus easy to understand in principle, it has the disadvantage that
most people don't know by heart how many strokes a character has, and
the count can be very hard to work out for the more complex ones. In a printed
dictionary it's less of a problem because you can easily shift your eyes
to another part of the page; in a browser I think it will be awkward at
best.
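
To make the lookup path concrete, here is a minimal sketch of a radical/stroke-count index as a nested mapping. It is not the site's actual code: the sample entries, the function names, and the "slack" tolerance for a miscounted stroke total are all assumptions made for illustration.

```python
# Hypothetical sample data: (kanji, radical, radical stroke count, total stroke count).
SAMPLE = [
    ("休", "人", 2, 6),
    ("体", "人", 2, 7),
    ("作", "人", 2, 7),
    ("海", "水", 4, 9),
    ("泳", "水", 4, 8),
]


def build_index(entries):
    """Nest entries as radical strokes -> radical -> total strokes -> [kanji]."""
    index = {}
    for kanji, radical, rad_strokes, total in entries:
        by_radical = index.setdefault(rad_strokes, {})
        by_total = by_radical.setdefault(radical, {})
        by_total.setdefault(total, []).append(kanji)
    return index


def near_matches(index, rad_strokes, radical, total, slack=1):
    """Return kanji whose stroke count is within `slack` of the user's guess."""
    by_total = index.get(rad_strokes, {}).get(radical, {})
    found = []
    for count in sorted(by_total):
        if abs(count - total) <= slack:
            found.extend(by_total[count])
    return found


index = build_index(SAMPLE)
print(near_matches(index, 2, "人", 6))  # ['休', '体', '作']
```

A tolerance of one stroke either way is just one possible way to soften the miscounting problem; whether it helps in practice is an open question.
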
What other alternatives might work well (when you don't know the
pronunciation)? I've seen Jim Breen's "multi-radical" method and was
initially resistant to it for a couple of reasons: first, it is
non-linear, and thus is superficially more complex than the
radicals/strokes method.
Second, I have been taught (for both Chinese and Japanese) that the
radical is the "meaning" component, and that in general a character has
exactly one radical. At any rate, I believe the radical has etymological
significance, and that understanding which part of the Kanji is the
radical can contribute to an overall mastery of the language. And a
single-radical dictionary index reinforces that understanding.
But I'm thinking that a multi--can I say "component" instead of
"radical"? Then maybe I could set aside the philosophical objection.
Anyway, a well-designed multi-thing index might after all be an easier
way to look up Kanji.
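
For comparison, a multi-component lookup boils down to a set intersection. The sketch below assumes a hypothetical table mapping each kanji to its components; the table contents and the function name are made up for illustration and do not come from any real decomposition file.

```python
# Hypothetical component table; a real one would come from a decomposition file.
COMPONENTS = {
    "休": {"人", "木"},
    "体": {"人", "木", "一"},
    "林": {"木"},
    "明": {"日", "月"},
}


def lookup(selected, table=COMPONENTS):
    """Return every kanji whose component set contains all selected components."""
    wanted = set(selected)
    return sorted(kanji for kanji, parts in table.items() if wanted <= parts)


print(lookup(["木"]))        # ['休', '体', '林']
print(lookup(["人", "木"]))  # ['休', '体']
```

Each extra component the user picks can only shrink the candidate list, which is what makes the method workable even though it is non-linear.
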
Strokes/radicals index navigation
---------------------------------
If I decide to go to a multi-component index, this might not matter any
more. But for the moment, there is an issue with the index menus: in
view of the fact that the user will often not be sure how many strokes
there are in a character, I have created dynamic menus such that ...
actually it's best if you try it out. Basically, if you move your mouse
over an item in one row of the menu, the next row is *temporarily*
displayed. Thus, let's say you have chosen a given radical. There is a
row of numbers representing stroke counts of characters with that
radical; if you run your mouse along that row you can easily see what
characters exist for each stroke count.
So, do you think this is (a) useful, and (b) intuitive? It would be a
lot easier to make the menus so that the next row only changes when you
click something. But if people find the transient display a very helpful
feature, I will make it work.
Presentation of results
-----------------------
Currently when you select a Kanji, a request goes to the server, which
returns a document containing all phrases that start with that Kanji.
This document is dumped into a table with 3 columns: [Kanji] Phrase,
Reading, and Definitions. This is reasonable in some cases, but
sometimes the response document is quite large, so I think some kind of
chunking and/or filtering would be helpful. It gets worse if we want to
look up all phrases *containing* the selected character. My server-side
script can indeed do that, but sometimes it's just way too much data, so
I've disabled that behavior for the moment.
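
One way the chunking could look on the server side, as a rough sketch only: filter lazily, then hand back one page at a time. The row layout mirrors the three table columns above, but the sample rows, function names, and page size are assumptions rather than what my script currently does.

```python
from itertools import islice

# Hypothetical rows in the shape of the result table: (phrase, reading, definitions).
ROWS = [
    ("日本", "にほん", "Japan"),
    ("日曜日", "にちようび", "Sunday"),
    ("毎日", "まいにち", "every day"),
    ("本日", "ほんじつ", "today"),
]


def matches(rows, kanji, anywhere=False):
    """Yield rows whose phrase starts with (or, if anywhere=True, contains) kanji."""
    for row in rows:
        phrase = row[0]
        hit = (kanji in phrase) if anywhere else phrase.startswith(kanji)
        if hit:
            yield row


def page(rows, number, size=2):
    """Return one chunk of results; pages are numbered from 0."""
    start = number * size
    return list(islice(rows, start, start + size))


first = page(matches(ROWS, "日"), 0)
second = page(matches(ROWS, "日", anywhere=True), 1)
print([row[0] for row in first])   # ['日本', '日曜日']
print([row[0] for row in second])  # ['毎日', '本日']
```

With paging in place, the "containing" lookup might become tolerable again, since the client only ever receives one chunk per request.
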
Another issue with the result sets is that they're not sorted in any
useful way--actually I believe they are ordered according to the JMdict
entry sequence number.
So, how can I improve the processing and presentation of the results?
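
As one candidate ordering, purely a sketch: shortest phrases first, then by reading. The sample rows and the key function are hypothetical, and comparing kana by code point only approximates proper gojuon ordering.

```python
# Hypothetical rows: (phrase, reading, definitions).
ROWS = [
    ("日曜日", "にちようび", "Sunday"),
    ("日本語", "にほんご", "Japanese language"),
    ("日", "ひ", "day; sun"),
    ("日本", "にほん", "Japan"),
]


def sort_key(row):
    """Order by phrase length first, then by reading (code point order)."""
    phrase, reading, _definitions = row
    return (len(phrase), reading)


ordered = sorted(ROWS, key=sort_key)
print([row[0] for row in ordered])  # ['日', '日本', '日曜日', '日本語']
```
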
Miscellaneous technical stuff
-----------------------------
Preparing the index: my list of radicals is derived from Jim Breen's
KANJIDIC, but since his data is prepared for a multi-radical lookup
system, I can't automatically extract a radicals-and-strokes index, so I
am currently creating the index manually. That's why it's so incomplete,
of course. Does anyone know of another database somewhere that lists each
kanji by (single) radical and stroke count?
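
If such a source turns up, building the index automatically should be straightforward. The sketch below assumes a hypothetical plain-text layout with one kanji per line, followed by a radical-number field prefixed "B" and a total-stroke-count field prefixed "S". That layout, the sample lines, and the function names are assumptions for illustration, not a description of any particular file.

```python
from collections import defaultdict

# Hypothetical input format (an assumption): "<kanji> B<radical number> S<strokes>".
SAMPLE = """\
休 B9 S6
体 B9 S7
海 B85 S9
"""


def parse_line(line):
    """Split one line into (kanji, radical number, stroke count)."""
    kanji, *fields = line.split()
    radical = strokes = None
    for field in fields:
        if field.startswith("B") and field[1:].isdigit():
            radical = int(field[1:])
        elif field.startswith("S") and field[1:].isdigit():
            strokes = int(field[1:])
    return kanji, radical, strokes


def build_index(text):
    """Group kanji as radical number -> stroke count -> [kanji]."""
    index = defaultdict(lambda: defaultdict(list))
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        kanji, radical, strokes = parse_line(line)
        if radical is not None and strokes is not None:
            index[radical][strokes].append(kanji)
    return index


index = build_index(SAMPLE)
print(index[9][7])  # ['体']
```
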
Glyphs for radicals: if my understanding of the KANJIDIC documentation
is correct, there is a Kanji glyph for each radical, but some
of them exist only in JIS X 0212. If so, you either have to require the
user to have a JIS X 0212 font, use images to represent some radicals, or
use substitute glyphs from JIS X 0208. I don't think the last option is
really acceptable. E.g., 化 for 人偏??
Nice Japanese font: this is purely subjective, of course, but I find
Mincho rather ugly. I have a font family called DFKaisho which I find to
be an excellent combination of elegance and readability; my stylesheet
specifies it for some of the Kanji display elements (with "serif" as a
fallback, of course). But in the interest of a more beautiful
Kanji-browsing experience, are there other Kaisho or similar fonts that
are widely used? Let me know their names and I'll stick 'em in the
stylesheet. Or tell me to just use Mincho if that's your view. But be
advised: I am very stubborn about fonts.
[*] Using the term 'standard' to include some de facto standards as well
as official published ones.
--
Matt Gushee
: Bantam - lightweight file manager : matt.gushee.net/software/bantam/ :
: RASCL's A Simple Configuration Language : matt.gushee.net/rascl/ :