TLUG Mailing List

Re: Counting hiragana in EUC

To: tlug@example.com

Subject: Re: Counting hiragana in EUC

From: Andreas Marcel Riechert <riechert@example.com>

Date: 04 Feb 2001 17:24:38 +0100

Content-Type: text/plain; charset=us-ascii

In-Reply-To: Simon Cozens's message of "Sun, 4 Feb 2001 15:04:20 +0000"

References: <20010204131550.A22942@example.com><m3n1c2fq20.fsf@example.com><20010204150420.A24742@example.com>

Reply-To: tlug@example.com

Resent-From: tlug@example.com

Resent-Message-ID: <FBpi9B.A.yUF.E_Xf6@example.com>

Resent-Sender: tlug-request@example.com

Sender: riechert@example.com

User-Agent: Gnus/5.0807 (Gnus v5.8.7) Emacs/20.7
Simon Cozens <simon@example.com> writes:
 
> Yep. This is just for the purposes of an experiment, to see whether or
> not I can segment incoming hiragana text into words.

My favourite topic! I am always happy to get a chance to talk
about it, or to learn from other minds.
While your task seems interesting, I just wonder what your
definition of the word "word" may be.

Segmenting a Japanese phrase into words (read "words" as lemmata) is certainly
very important for, e.g., an automatic dictionary-lookup routine.
For segmenting incoming hiragana text in a meaningful way, part-of-speech/
morphological segmentation or bunsetsu segmentation seems IMHO to be a
more promising approach, but I am always happy to get new creative input.
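
To make the dictionary-lookup case concrete, here is a minimal sketch of
greedy longest-match segmentation in Python. It is an illustration only,
not the method from this thread: the lexicon entries and the function
name segment are hypothetical, and a real system would use a full
dictionary plus morphological analysis.

```python
# Toy lexicon: hypothetical entries chosen only for this illustration.
LEXICON = {"わたし", "は", "ねこ", "が", "すき", "です"}


def segment(text, lexicon=LEXICON):
    """Greedy longest-match segmentation against a lexicon.

    At each position the longest lexicon entry that matches is taken.
    If nothing matches, the single character is emitted as its own token.
    """
    max_len = max(map(len, lexicon))
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible candidate first, then shrink.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += length
                break
    return tokens
```

Text that arrives as EUC-JP bytes would first be decoded, for example with
raw.decode("euc_jp"), and then passed to segment. Greedy matching picks
the wrong split whenever a longer entry overlaps a true word boundary,
which is one reason the definition of "word" matters here.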

Maybe
http://www.ipsj.or.jp/members//Journal/Eng/3806/article005.html
could be interesting for you.

HTH,

Andreas Marcel Riechert