Mailing List Archive
[tlug] [OT] Calling for volunteers to mark possible dictionary entries
- Date: Wed, 27 Jul 2011 13:27:40 +1000
- From: Jim Breen <jimbreen@example.com>
- Subject: [tlug] [OT] Calling for volunteers to mark possible dictionary entries
Greetings

[This has already been posted on a couple of other lists.]

As some of you know, I am carrying out research into ways of automatically identifying neologisms and other not-yet-in-dictionary terms. As part of this I am experimenting with a Machine Learning system which I am training to recognize the sorts of words and terms that are included in dictionaries. What I need is text in which such words have been identified by people, so I can test the ML system and compare results.

I have selected a batch of 2,000 sentences from past issues of the Mainichi Shimbun and Nikkei Shimbun. Half of these contain recent JMdict additions (which the ML system doesn't know about) and the other half are randomly selected and may or may not contain unrecorded words.

What I need are volunteers to look at the sentences, see if they contain unrecorded words which could be candidates to go in a dictionary, mark any that they see, and indicate if there are none or no more to mark. I need people who are reasonably comfortable reading Japanese newspaper text.

I have put together a simple WWW system for displaying the sentences (one at a time) and enabling terms to be marked, comments added, etc. The system is at:

http://www.csse.monash.edu.au/~jwb/cgi-bin/annotate/instructions.cgi

Please help out by looking at some sentences and marking them. If people on this list did 20 or 30 sentences, the job would be done quickly.

Looking forward to lots of activity.

Cheers

Jim

--
Jim Breen
Adjunct Snr Research Fellow, Clayton School of IT, Monash University
Webmaster: Hawthorn Rowing Club, Treasurer: Japanese Studies Centre
Graduate student: Language Technology Group, University of Melbourne