Mailing List Archive



[tlug] [OT/long] Yet another JMdict front-end



[apologies if this appears twice--I sent it earlier from the wrong account]

This really has nothing to do with Linux, but I know many of you are interested in Japanese and Japanese dictionaries, and many of you also are knowledgeable about Web design & development, so I thought I'd let you know about a little project I have begun, and solicit some feedback.

To make a long story short, a couple of weeks ago I was looking for an online Kanji dictionary, and couldn't find one I really liked. Or rather, I couldn't find an *interface* I really liked. So I decided to create my own. The site is intended to be fast, easy to navigate, and aesthetically pleasing. The target audience is English-speaking learners of Japanese (intermediate-to-advanced?), so the emphasis is on providing easy access to phrases including the target kanji, readings, and definitions--basically the kind of info you find in edict/jmdict. It currently does not provide information of more scholarly interest, such as Nelson index numbers and all that, though such info could maybe be added later on as an "advanced option."

Another thing you should know is that my site makes heavy use of the latest in Web-standard[*] technology. It is an AJAX application, so at a minimum you need a browser that has JavaScript enabled and supports XMLHttpRequest. Recent Gecko-based browsers should be fine, along with IE 5.5+ (?) and Opera 8+. Since the whole point of the project is to develop a nicer interface to content that is easily available elsewhere, I don't feel obligated to create an alternative for older browsers, but of course I provide links to other online Kanji dictionaries.

Here's the URL: <http://matt.gushee.net:8250/index.html>. That's probably temporary, so even if you really like it, please don't post any links to it just yet. If you have comments and don't want to clutter this list, just send an e-mail to <matt@example.com>.


SOME ISSUES TO CONSIDER
=======================

First of all, the title. I am tentatively calling the thing "楽漢摘." I like to think it's rather a clever pun, but if any native Japanese speakers are reading this, I'd like to know how it sounds to you. Is it just a wake-wakaranai gaijin joke? Please don't worry about offending me--I will be happy to change the title if it is too weird.

Now on to more substantive issues:

Indexing approach
-----------------

There will probably be several indexes in the future, but currently I provide one way to look up Kanji: a traditional radical/stroke-count index. Specifically, you select the radical stroke count, then the radical itself, then the stroke count for the whole character, then the specific character that you want. Although it is a linear process and thus easy to understand in principle, it has the disadvantage that people often don't know by heart how many strokes are in a character, and the count can be very hard to figure out for the more complex ones. In a printed dictionary that's less of a problem because you can easily shift your eyes to another part of the page; in a browser I think it will be awkward at best.
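To make the four-step lookup concrete, here is a rough sketch in Python of one way such an index could be organized. The function name, the nested-dictionary layout, and the sample entries are all assumptions for illustration, not the site's actual code or data.

```python
from collections import defaultdict

def build_index(entries):
    """Build index[radical_strokes][radical][total_strokes] -> list of kanji.

    entries: iterable of (kanji, radical, radical_strokes, total_strokes).
    """
    index = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for kanji, radical, radical_strokes, total_strokes in entries:
        index[radical_strokes][radical][total_strokes].append(kanji)
    return index

# Illustrative sample data only (hypothetical, hand-picked entries).
SAMPLE = [
    ("休", "人", 2, 6),
    ("体", "人", 2, 7),
    ("作", "人", 2, 7),
    ("海", "水", 4, 9),
]

index = build_index(SAMPLE)

# The four menu steps, in order:
radicals = sorted(index[2])            # radicals with 2 strokes
stroke_counts = sorted(index[2]["人"])  # total stroke counts under that radical
candidates = index[2]["人"][7]          # kanji with that radical and 7 strokes
```

Each menu row corresponds to one level of nesting, so the user has to get every level right before any candidate kanji appear.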

What alternatives might work well (when you don't know the pronunciation)? I've seen Jim Breen's "multi-radical" method and was initially resistant to it for a couple of reasons: first, it is non-linear, and thus is superficially more complex than the radicals/strokes method.

Second, I have been taught (for both Chinese and Japanese) that the radical is the "meaning" component, and that in general a character has exactly one radical. At any rate, I believe the radical has etymological significance, and that understanding which part of the Kanji is the radical can contribute to an overall mastery of the language. And a single-radical dictionary index reinforces that understanding.

But I'm thinking that a multi--can I say "component" instead of "radical"? Then maybe I could set aside the philosophical objection. Anyway, a well-designed multi-thing index might after all be an easier way to look up Kanji.
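For comparison, a multi-component lookup can be sketched as a set intersection over an inverted index. This is only a minimal illustration: the component table below is made up, the function names are hypothetical, and it is not a description of how Jim Breen's implementation works.

```python
# Hypothetical component table: kanji -> set of visible components.
COMPONENTS = {
    "休": {"人", "木"},
    "体": {"人", "木", "一"},
    "林": {"木"},
    "明": {"日", "月"},
}

def build_component_index(components):
    """Invert the table: component -> set of kanji containing it."""
    inverted = {}
    for kanji, parts in components.items():
        for part in parts:
            inverted.setdefault(part, set()).add(kanji)
    return inverted

def lookup(inverted, selected):
    """Return the kanji that contain every selected component."""
    sets = [inverted.get(part, set()) for part in selected]
    if not sets:
        return set()
    return set.intersection(*sets)

inverted = build_component_index(COMPONENTS)
matches = lookup(inverted, ["人", "木"])
```

The order of selection doesn't matter here, and each additional component only narrows the candidate set, which is what makes the approach forgiving when you can't count strokes.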

Strokes/radicals index navigation
---------------------------------

If I decide to go to a multi-component index, this might not matter any more. But for the moment, there is an issue with the index menus: in view of the fact that the user will often not be sure how many strokes there are in a character, I have created dynamic menus such that ... actually it's best if you try it out. Basically, if you move your mouse over an item in one row of the menu, the next row is *temporarily* displayed. Thus, let's say you have chosen a given radical. There is a row of numbers representing stroke counts of characters with that radical; if you run your mouse along that row you can easily see what characters exist for each stroke count.

So, do you think this is (a) useful, and (b) intuitive? It would be a lot easier to make the menus so that the next row only changes when you click something. But if people find the transient display a very helpful feature, I will make it work.

Presentation of results
-----------------------

Currently when you select a Kanji, a request goes to the server, which returns a document containing all phrases that start with that Kanji. This document is dumped into a table with 3 columns: [Kanji] Phrase, Reading, and Definitions. This is reasonable in some cases, but sometimes the response document is quite large, so I think some kind of chunking and/or filtering would be helpful. It gets worse if we want to look up all phrases *containing* the selected character. My server-side script can indeed do that, but sometimes it's just way too much data, so I've disabled that behavior for the moment.
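One possible shape for the chunking and filtering, sketched in Python. The entry fields, function names, and default page size are assumptions for illustration; the actual server-side script may look nothing like this.

```python
def filter_entries(entries, kanji, position="start"):
    """Keep phrases that start with the kanji, or that contain it anywhere."""
    if position == "start":
        return [e for e in entries if e["phrase"].startswith(kanji)]
    return [e for e in entries if kanji in e["phrase"]]

def paginate(entries, page, page_size=20):
    """Return one chunk of results plus enough metadata to request the next."""
    start = page * page_size
    return {
        "page": page,
        "total": len(entries),
        "has_more": start + page_size < len(entries),
        "items": entries[start:start + page_size],
    }

# Illustrative sample data only.
ENTRIES = [
    {"phrase": "漢字", "reading": "かんじ", "definitions": ["kanji"]},
    {"phrase": "漢文", "reading": "かんぶん", "definitions": ["Chinese classical writing"]},
    {"phrase": "悪漢", "reading": "あっかん", "definitions": ["villain"]},
]

starting = filter_entries(ENTRIES, "漢")
containing = filter_entries(ENTRIES, "漢", position="anywhere")
first_page = paginate(containing, page=0, page_size=2)
```

With something like this, the "containing" search could be switched back on, since the client would only ever receive one page at a time and could ask for more when `has_more` is true.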

Another issue with the result sets is that they're not sorted in any useful way--actually I believe they are ordered according to the JMdict entry sequence number.
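As one candidate ordering, results could be sorted by phrase length and then by reading, so that short, basic compounds come first. The sketch below assumes entries with "phrase" and "reading" fields (hypothetical names) and relies on Unicode code point order for hiragana, which only approximates proper kana dictionary order.

```python
def sort_entries(entries):
    """Sort shorter phrases first, then by reading in code point order."""
    return sorted(entries, key=lambda e: (len(e["phrase"]), e["reading"]))

# Illustrative sample data only.
UNSORTED = [
    {"phrase": "漢和辞典", "reading": "かんわじてん"},
    {"phrase": "漢文", "reading": "かんぶん"},
    {"phrase": "漢字", "reading": "かんじ"},
]

ordered = [e["phrase"] for e in sort_entries(UNSORTED)]
```

Other keys, such as a word-frequency marker where the dictionary data supplies one, could be slotted into the same tuple.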

So, how can I improve the processing and presentation of the results?

Miscellaneous technical stuff
-----------------------------

Preparing the index: my list of radicals is derived from Jim Breen's KANJIDIC, but since his data is prepared for a multi-radical lookup system, I can't automatically extract a radicals-and-strokes index, so I am currently creating the index manually. That's why it's so incomplete, of course. Does anyone know of another database somewhere that lists each kanji by (single) radical and stroke count?

Glyphs for radicals: if my understanding of the KANJIDIC documentation is correct, there is a glyph for each radical in Japanese Kanji, but some of them only exist in JIS X 0212. If so, you either have to require the user to have a JIS X 0212 font, use images to represent some radicals, or use substitute glyphs from JIS X 0208. The last option is not really acceptable, I don't think. E.g., 化 for 人偏??

Nice Japanese font: this is purely subjective, of course, but I find Mincho rather ugly. I have a font family called DFKaisho which I find to be an excellent combination of elegance and readability; my stylesheet specifies it for some of the Kanji display elements (with "serif" as a fallback, of course). But in the interest of a more beautiful Kanji-browsing experience, are there other Kaisho or similar fonts that are widely used? Let me know their names and I'll stick 'em in the stylesheet. Or tell me to just use Mincho if that's your view. But be advised: I am very stubborn about fonts.


[*] Using the term 'standard' to include some de facto standards as well
    as official published ones.

--
Matt Gushee
: Bantam - lightweight file manager : matt.gushee.net/software/bantam/ :
: RASCL's A Simple Configuration Language :     matt.gushee.net/rascl/ :

