Mailing List Archive



Re: [tlug] korean engdic



Botond Botyanszki wrote:

>I looked at your "engdic problem". Here is a Perl script I hacked together
>in a couple of minutes that will do the same thing as your C program.
>
Grr! Death to the Perl infidels!

>Basically, gjiten was designed to parse EDICT files, and engdic is pretty
>much incompatible unless you throw away half the content.
>EDICT format is like this:
> japanese /english1/english2/
>On the other hand, engdic is Korean->English. While you can do the reverse
>conversion, you will end up with something where the lines don't start
>with Korean, i.e. the English explanation is the first field. This can
>contain spaces, hence the segfault.
>
Ah. Well, I noticed some Hangul in a few of the English bits, but this 
is actually not part of the specs stated on the web page. 
_Theoretically_, the only incompatibilities between Engdic and EDICT are 
the unparsed Korean and English bits, which are mostly in Hangul, and 
are generally too long to be put into the EDICT file. What a shame it 
didn't meet those specs completely, though. Perhaps the dictionary would 
have to be stripped of some of its entries and forked into an EDICT 
branch if this were to ever work perfectly. Sounds like a boring task, 
and thus, not for me. ;P
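
For anyone following along, here is a minimal sketch of the line layout quoted above ("japanese /english1/english2/") and of why a first field containing spaces trips up a parser that splits on the first space. It is written in Python purely for illustration; it is neither Botond's Perl script nor my C program, and the function names and sample lines are made up for the example.

```python
def parse_edict_line(line):
    """Split 'headword /gloss1/gloss2/' into (headword, [glosses]).

    Assumes the simplified layout quoted in this thread: the headword
    is everything before the first ' /', and glosses are '/'-separated.
    """
    head, sep, rest = line.strip().partition(" /")
    if not sep:
        raise ValueError("not an EDICT-style line: %r" % line)
    glosses = [g for g in rest.split("/") if g]  # drop the empty tail after the final '/'
    return head, glosses


def naive_first_field(line):
    """What a parser sees if it assumes the headword contains no spaces."""
    return line.split(" ", 1)[0]


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from EDICT or engdic.
    print(parse_edict_line("neko /cat/feline/"))
    print(parse_edict_line("to go away /example gloss/"))
    print(naive_first_field("to go away /example gloss/"))
```

Splitting on the first " /" rather than on the first space keeps a multi-word first field in one piece, which is the case a reversed engdic would produce.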

>Another problem is when you end up with lines like this:
>=SYNONYM word
>This can be translated with a couple more lines of Perl code, though.
>
Yes, I couldn't be bothered to parse the meta info either, but I thought 
it was a shame to dis... dis... drop it all. Discard. Ergh. Anyway, 
thanks for your input - I might try your treacherous Perl script, too, 
though obviously, I've no way of evaluating the end result until I study 
Korean for a bit in two years. ;) Sorejya...!
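
If someone does want to keep the meta info rather than discard it, a first step might be to separate those lines from ordinary entries. The sketch below assumes only what was quoted in this thread, namely that a meta line looks like "=SYNONYM word"; the general "=TAG value" shape and the function name are my assumptions, not something taken from the engdic specs.

```python
def split_meta(line):
    """Return (tag, value) for a '=TAG value' line, or None for ordinary lines.

    Assumption: meta lines start with '=' and the tag ends at the first space.
    """
    line = line.strip()
    if not line.startswith("="):
        return None
    tag, _, value = line[1:].partition(" ")
    return tag, value.strip()


if __name__ == "__main__":
    print(split_meta("=SYNONYM word"))
    print(split_meta("neko /cat/feline/"))
```

What to do with each tag afterwards (fold it into a gloss, or skip it) is left open here.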

-Dave Oftedal

-- 
http://home.no.net/david/


