TLUG Mailing List

Re: [tlug] oneliners, Was: Moving on from xterm

Date: Thu, 25 Aug 2016 08:34:26 +0900

From: NOKUBI Takatsugu <knok@example.com>

Subject: Re: [tlug] oneliners, Was: Moving on from xterm

References: <20160819111442.GA30780@quadratic.cynic.net> <9f9cc5f579c92c3ddf7f29865d5862c2@jp.sometwo.net> <20160822114101.GA3944@fluxcoil.net> <87h9ace7zm.wl-knok@daionet.gr.jp> <CABHGxq4gBx39m0+TPZe3LLYPFetAvoc1wfZj0_0YGz3+w2A=1w@mail.gmail.com> <87fupvdqna.wl-knok@daionet.gr.jp> <CABHGxq5=zHNYVLEK+SPg8jg3jJsD54rFsM9=-hNwZwQ2jhkEOw@mail.gmail.com>

User-agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (Gojō) APEL/10.8 EasyPG/1.0.0 Emacs/24.4 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
At Wed, 24 Aug 2016 12:11:58 +1000,
Jim Breen wrote:
> Apart from its age, IPADIC also had/has problems with release permissions
> dating back to its ICOT source. For that reason the people at NAIST built
> a replacement "NAIST DIC". (https://en.osdn.jp/projects/naist-jdic/)

Oh... This "IPADIC/ICOT license issue" was caused by me...

This problem was discussed on the debian-legal mailing list, and the
following is the summary:

https://wiki.debian.org/IpadicLicense

Debian now treats ipadic as DFSG-free. BTW, this was only discussed
within the Debian Project. Other distributions (like Fedora and
OpenSuSE) don't care about it.

I think this problem is overrated in the public mind.

> > On the other hand, Toshinori Sato said that mecab-ipadic-neologd is
> > better performance than plain ipadic on text classification task.
> > It's really hard problem...
> 
> What do you mean by "text classification task"? For getting the right yomikata (aka furigana)
> on a proper name longer sequences can be useful, but there's a lot of
> text analysis where the stuff Sato has added would cause quite some grief.
> His addition of "中居正広のミになる図書館" as an entry is a hoot.

I meant using mecab-ipadic-neologd only for word segmentation, without
using its features. Word segmentation is widely used for text
classification tasks. I didn't make that clear.
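
As a rough sketch of what "segmentation only" means here: MeCab's
default output puts one token per line as a surface form, a tab, and a
comma-separated feature list, ending with EOS. The snippet below keeps
only the surface forms and counts them as a bag of words, which is the
usual input to a text classifier. The sample output is hand-written for
illustration (it is not produced by mecab-ipadic-neologd), and the
function name `surfaces` is made up for this example.

```python
from collections import Counter

# Hand-written sample in MeCab's default output layout:
# surface<TAB>comma-separated features, terminated by an EOS line.
SAMPLE_OUTPUT = "\n".join([
    "すもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ",
    "も\t助詞,係助詞,*,*,*,*,も,モ,モ",
    "もも\t名詞,一般,*,*,*,*,もも,モモ,モモ",
    "も\t助詞,係助詞,*,*,*,*,も,モ,モ",
    "もも\t名詞,一般,*,*,*,*,もも,モモ,モモ",
    "の\t助詞,連体化,*,*,*,*,の,ノ,ノ",
    "うち\t名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ",
    "EOS",
])


def surfaces(mecab_output):
    """Return only the surface forms, discarding the feature column."""
    tokens = []
    for line in mecab_output.splitlines():
        if not line or line == "EOS":
            continue  # skip blank lines and the end-of-sentence marker
        surface, _, _features = line.partition("\t")
        tokens.append(surface)
    return tokens


tokens = surfaces(SAMPLE_OUTPUT)
bag = Counter(tokens)  # bag-of-words counts for a classifier
print(tokens)
print(bag.most_common(2))
```

With this kind of use, the dictionary only affects where the token
boundaries fall; the part-of-speech and reading fields are never looked
at.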

Toshinori said that he uses a text classification task as a
quantitative evaluation of the dictionary. I heard this from him at a
public event, but there is no presentation material, so I don't know
the details.

For general natural language processing, mecab-ipadic-neologd is not
good. I agree with you.

By the way, I made a script to convert SKKJISYO to kakasidict.
I think it is also useful for everyone.
http://www.namazu.org/gitweb/?p=dictconv.git;a=tree

The original kakasidict is also based on a very old SKKJISYO, but
SKKJISYO itself has been updated since then.
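
For readers unfamiliar with the two formats, here is a minimal sketch
of the general idea of such a conversion. It is not the dictconv
script linked above. It assumes SKK-JISYO entries of the form
"reading /candidate1/candidate2;annotation/" with ";" comment lines,
and it assumes the target is one "reading kanji" pair per line; check
the kakasi documentation before relying on that. The function name
`skk_to_pairs` and the sample entries are made up for illustration.

```python
def skk_to_pairs(lines):
    """Yield (reading, candidate) pairs from SKK-JISYO style lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith(";"):
            continue  # skip blank lines and ";;" comment/header lines
        reading, sep, rest = line.partition(" ")
        if not sep:
            continue  # no candidate list on this line
        for cand in rest.strip().strip("/").split("/"):
            cand = cand.split(";", 1)[0]  # drop any ";annotation" suffix
            if cand:
                yield reading, cand


# Illustrative input; a real SKK-JISYO file is much larger and is
# traditionally EUC-JP encoded, so decode it accordingly when reading.
sample = [
    ";; okuri-nasi entries.",
    "あい /愛/藍;indigo/",
    "かんじ /漢字/幹事/",
]

for reading, cand in skk_to_pairs(sample):
    print(f"{reading} {cand}")
```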
