Mailing List Archive
[tlug] How to make a current running kanji compound list from the news
- Date: Fri, 22 Jul 2011 14:47:44 +0900
- From: "Stephen J. Turnbull" <stephen@example.com>
- Subject: [tlug] How to make a current running kanji compound list from the news
- References: <CA+kCxRZTuMYGzURV2Rm2k4F59oymJ7FsKiaK6bzYqb812hfycQ@example.com>
Martin G writes:

 > In the course of my studies, I came across mention of this book, which
 > lists 1000 kanji compounds useful for reading the news:
 > http://www.amazon.com/dp/0804809194/

(wget or other spider) + (FreeWAIS or Xapian or other full-text indexer) is probably overkill, but they already produce the kind of statistics needed, at least internally.

Cross-referencing Wa-Ei dictionaries should be easy to do. I'm pretty sure Jim provides several different libraries for accessing EDICT files, as well as AJAX access or similar to the WWWJDIC site.

I think there are also web spiders written in Python, which I mention because I know there are Python bindings for libxapian (and because I don't like PHP ;-). WAIS is old enough technology that I doubt there are bindings for many modern languages (besides C), but I could be wrong. Probably similar facilities are available for your choice of poison, though.
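For a sense of how little is needed if the full-text indexer really is overkill, here is a rough Python sketch of the counting and cross-referencing steps, using only the standard library. Everything in it is an assumption for illustration: the function names `count_compounds` and `parse_edict_line` are made up, the sample sentences stand in for text a spider would have fetched, and a "compound" is crudely taken to be any unbroken run of two or more kanji. That means a run like 経済政策 is counted as one item instead of being split into 経済 and 政策; proper segmentation would need a morphological analyzer. The parser assumes the usual EDICT line layout of headword, optional bracketed reading, and slash-separated glosses.

```python
import re
from collections import Counter

# A run of two or more kanji: CJK Unified Ideographs (U+4E00-U+9FFF)
# plus the iteration mark 々 (U+3005). This is a crude stand-in for
# real word segmentation.
KANJI_RUN = re.compile(r"[\u4e00-\u9fff\u3005]{2,}")


def count_compounds(texts):
    """Count kanji runs of length >= 2 across an iterable of strings."""
    counts = Counter()
    for text in texts:
        counts.update(KANJI_RUN.findall(text))
    return counts


def parse_edict_line(line):
    """Parse one EDICT-style line: 'HEADWORD [READING] /gloss/gloss/'.

    Returns (headword, reading, glosses), or None if the line does not
    match. The reading is None for entries without a bracketed reading.
    """
    m = re.match(r"^(\S+)\s+(?:\[([^\]]+)\]\s+)?/(.*)/\s*$", line)
    if m is None:
        return None
    headword, reading, glosses = m.groups()
    return headword, reading, [g for g in glosses.split("/") if g]


if __name__ == "__main__":
    # Hypothetical sample text standing in for spidered news articles.
    articles = [
        "首相は経済政策を発表した。",
        "政府の経済政策に批判が集まった。",
    ]
    for compound, n in count_compounds(articles).most_common(5):
        print(n, compound)
```

Feeding the top entries of the counter through a dictionary built from parsed EDICT lines would give the cross-referenced list; compounds with no entry are likely multi-word runs that need splitting.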