Mailing List Archive



Re: [tlug] search for fulltext-searchengine



2008/5/19 Christian Horn <chorn@example.com>:
> Hi,
>
> I am looking for software that fulfills these requirements:
> - provide a knowledge database to users via a web interface
> - provide a search function that indexes
>   - the knowledge database contents
>   - and office documents, PDFs and text files in a directory
> - be able to do all this with kanji
>
(...)
>
> Will look into debugging this, or into other search engines.
>
> Maybe some of you have similar requirements and good ideas for
> other software to use.

I once built a Japanese-capable search function using Lucene [1], with
ChaSen [2] to analyze / tokenize the Japanese input. It was a rather
nasty hack that involved embedding Java in Perl, but it worked pretty
well for a couple of days' work. I think there's an add-on CJK analyzer
/ tokenizer for Lucene, but I don't know how well it works [3].
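
To illustrate the difference between the two approaches: as far as I
know, the Lucene CJK analyzer does no morphological analysis the way
ChaSen does, but simply splits runs of CJK characters into overlapping
two-character tokens (bigrams). Below is a minimal Python sketch of that
idea with a toy inverted index. It is not Lucene's API and not the code I
used; the function names (is_cjk, tokenize, build_index, search) and the
Unicode ranges are assumptions chosen for illustration only.

```python
from collections import defaultdict
from itertools import groupby


def is_cjk(ch):
    """Rough check for kanji, hiragana and katakana (assumed ranges, not exhaustive)."""
    code = ord(ch)
    return (
        0x4E00 <= code <= 0x9FFF      # CJK Unified Ideographs (kanji)
        or 0x3040 <= code <= 0x309F   # hiragana
        or 0x30A0 <= code <= 0x30FF   # katakana
    )


def char_class(ch):
    """Classify a character as CJK, part of a non-CJK word, or a separator."""
    if is_cjk(ch):
        return "cjk"
    if ch.isalnum():
        return "word"
    return "sep"


def tokenize(text):
    """Emit overlapping bigrams for CJK runs and lowercased words for other text."""
    tokens = []
    for kind, group in groupby(text, key=char_class):
        run = "".join(group)
        if kind == "cjk":
            if len(run) == 1:
                tokens.append(run)  # a lone CJK character is its own token
            else:
                tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
        elif kind == "word":
            tokens.append(run.lower())
        # separators (spaces, punctuation) are dropped
    return tokens


def build_index(docs):
    """Build a toy inverted index: token -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

With this sketch, tokenize("東京都") yields the bigrams 東京 and 京都, so
a query for 京都 (Kyoto) also matches a document that only mentions
東京都 (Tokyo Metropolis). That kind of false hit is what a
morphological analyzer such as ChaSen is meant to avoid, at the cost of
needing a dictionary.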

Probably not the answer you're looking for, but I thought it might be
worth mentioning.

[1] http://lucene.apache.org/
[2] http://chasen.naist.jp/hiki/ChaSen/
[3] http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/analyzers/src/java/org/apache/lucene/analysis/cjk/


Ian Barwick

