
Mailing List Archive
Re: [tlug] korean engdic
Botond Botyanszki wrote:
>I looked at your "engdic problem". Here is a Perl script I hacked together
>in a couple of minutes that will do the same thing as your C program.
>
Grr! Death to the Perl infidels!
>Basically gjiten was designed to parse edict files, and engdic is pretty
>much incompatible unless you throw away half the content.
>Edict format is like this:
> japanese /english1/english2/
>On the other hand, engdic is Korean->English. While you can do the reverse
>conversion, you will end up with something where the lines don't start
>with Korean, i.e. the English explanation is the first field. This can
>contain spaces, hence the segfault.
>
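To illustrate the point about spaces in the first field, here is a minimal Python sketch (not the Perl script or the C program discussed in this thread). It assumes only the simplified layout quoted above, "headword /gloss1/gloss2/", and splits at the first slash rather than at the first space, so a multi-word first field is kept whole. The function name parse_edict_line is hypothetical.

```python
def parse_edict_line(line):
    """Split 'headword /gloss1/gloss2/' into (headword, [glosses]).

    Assumes the simplified EDICT-style layout quoted in this thread.
    Returns None when the line has no headword or no gloss field.
    """
    # Split at the first slash, not the first space, so that a
    # headword containing spaces stays in one piece.
    head, sep, rest = line.rstrip("\n").partition("/")
    if not sep:
        return None  # no slash-delimited gloss field at all
    headword = head.strip()
    # The trailing slash yields an empty last element; drop empties.
    glosses = [g.strip() for g in rest.split("/") if g.strip()]
    if not headword or not glosses:
        return None
    return headword, glosses
```

Returning None for malformed lines lets the caller skip or log them instead of crashing on unexpected input.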
Ah. Well, I noticed some Hangul in a few of the English bits, but this
is actually not part of the specs stated on the web page.
_Theoretically_, the only incompatibilities between Engdic and EDICT are
the unparsed Korean and English bits, which are mostly in Hangul, and
are generally too long to be put into the EDICT file. What a shame it
didn't meet those specs completely, though. Perhaps the dictionary would
have to be stripped of some of its entries and forked into an EDICT
branch if this were to ever work perfectly. Sounds like a boring task,
and thus, not for me. ;P
>Another problem is when you end up with lines like this:
>=SYNONYM word
>This can be translated with a couple more lines of Perl code, though.
>
>
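As a rough sketch of how such lines might be separated from ordinary entries, the Python snippet below treats any line starting with "=" as meta information and splits it into a tag and a value. The only evidence for this layout is the single "=SYNONYM word" example above, so the "=TAG value" shape is an assumption, and split_meta is a hypothetical name.

```python
def split_meta(line):
    """Return (tag, value) for a '=TAG value' line, else None.

    Assumption: meta lines begin with '=' and the tag is separated
    from its value by a single space, as in '=SYNONYM word'.
    """
    line = line.strip()
    if not line.startswith("="):
        return None  # ordinary dictionary entry, not meta info
    tag, _, value = line[1:].partition(" ")
    return tag, value.strip()
```

A converter could call this first on each line and either translate the result or keep it aside, rather than discarding the meta info outright.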
Yes, I couldn't be bothered to parse the meta info either, but I thought
it was a shame to dis... dis... drop it all. Discard. Ergh. Anyway,
thanks for your input - I might try your treacherous Perl script, too,
though obviously, I've no way of evaluating the end result until I study
Korean for a bit in two years. ;) Sorejya...!
-Dave Oftedal
--
http://home.no.net/david/