Mailing List Archive
Japanese search engines
- To: tlug@example.com
- Subject: Japanese search engines
- From: Frank BENNETT <bennett@example.com>
- Date: Sat, 16 Dec 2000 19:21:02 +0900
- Content-Transfer-Encoding: 7bit
- Content-Type: text/plain; charset=iso-2022-jp
- Reply-To: tlug@example.com
- Resent-From: tlug@example.com
- Resent-Message-ID: <7H8Mv.A.YVE.SI0O6@example.com>
- Resent-Sender: tlug-request@example.com
Does anyone have any suggestions for Japanese search engines? I would also welcome pointers to sources of information.

The task at hand is to provide a full-text search interface for a mirror of Kanpo, the official gazette of the Japanese government (yes, I did finally finish the PDF site ripper I was working on!). The amount of data is large -- the archive expands at the rate of about 1.5 megabytes per day -- so scalability is VERY important.

I am most familiar with freeWAIS-sf, and the Japan-patched version is, I think, what I need for this task, but it will require some work to set up an interface to it. So before I get too far into that:

o Is there something better for very large full-text databases? (I'm leery of Namazu, despite its popularity -- I did try to look up earlier discussions of it on TLUG, but the TLUG search engine is, ah, broken, as in "Alert!: HTTP/1.1 500 Internal Server Error".)

o freeWAIS-sf will blithely accept bound-to-fail queries and queries in broken syntax, simply returning no items found. I've started working on a syntax checker. I've settled on a model that will handle the problem, but ... has anyone already created such a thing?

o Has anyone embedded nkf and kakasi or an equivalent into freeWAIS-sf-jp? This site will see a lot of traffic when it goes live, and it seems wasteful to be spawning collateral processes for every instance of the WAIS client.

o Is there a WAIS client module for Python? For that matter, an nkf module? A kakasi module? A fastcgi module?

Sorry for the number of questions, but if you're going to ask one, might as well ask 'em all.

Cheers,
Frank
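For illustration only, the kind of pre-check described in the second question might look roughly like the sketch below. It is not the model mentioned in the message, and it does not implement the real freeWAIS-sf query grammar: it assumes a simplified syntax of bare terms, double-quoted phrases, parentheses, and the binary operators "and" and "or". The name check_query is hypothetical.

```python
import re

# Assumed operator set for this sketch; the real freeWAIS-sf grammar may differ.
OPERATORS = {"and", "or"}


def check_query(query):
    """Return a list of problems found in a query string (empty list if none)."""
    problems = []
    if query.count('"') % 2:
        problems.append("unbalanced double quote")

    # Replace complete quoted phrases with a placeholder term so that
    # operators or parentheses inside a phrase are not inspected.
    stripped = re.sub(r'"[^"]*"', " PHRASE ", query)
    tokens = re.findall(r"[()]|[^\s()]+", stripped)
    if not tokens:
        return ["empty query"]

    depth = 0      # current parenthesis nesting level
    prev = None    # previous token, lowercased
    for tok in tokens:
        low = tok.lower()
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:
                problems.append("closing parenthesis without an opening one")
                depth = 0
            if prev == "(":
                problems.append("empty parentheses")
            elif prev in OPERATORS:
                problems.append("operator before closing parenthesis")
        elif low in OPERATORS:
            if prev is None or prev == "(" or prev in OPERATORS:
                problems.append("operator '%s' has no left operand" % tok)
        prev = low

    if depth > 0:
        problems.append("unclosed parenthesis")
    if prev in OPERATORS:
        problems.append("query ends with an operator")
    return problems


if __name__ == "__main__":
    for q in ['kanpo and (law or "cabinet order")', "kanpo and", "(law or order"]:
        print(repr(q), check_query(q))
```

A front end could run such a check before handing the query to the WAIS client and report the problems to the user, rather than letting a malformed query come back as "no items found".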
- Follow-Ups:
- Re: Japanese search engines
- From: Shigeo Honda <shige@example.com>