Mailing List Archive
Japanese search engines
- To: tlug@example.com
 - Subject: Japanese search engines
 - From: Frank BENNETT <bennett@example.com>
 - Date: Sat, 16 Dec 2000 19:21:02 +0900
 - Content-Transfer-Encoding: 7bit
 - Content-Type: text/plain; charset=iso-2022-jp
 - Reply-To: tlug@example.com
 - Resent-From: tlug@example.com
 - Resent-Message-ID: <7H8Mv.A.YVE.SI0O6@example.com>
 - Resent-Sender: tlug-request@example.com
 
Does anyone have any suggestions for Japanese search engines? I would also welcome pointers to sources of information.

The task at hand is to provide a full-text search interface for a mirror of Kanpo, the official gazette of the Japanese government (yes, I did finally finish the PDF site ripper I was working on!). The amount of data is large -- the archive expands at the rate of about 1.5 megabytes per day -- so scalability is VERY important.

I am most familiar with freeWAIS-sf, and the Japan-patched version is, I think, what I need for this task, but it will require some work to set up an interface to it. So before I get too far into that:

o Is there something better for very large full-text databases? (I'm leery of Namazu, despite its popularity. I did try to look up earlier discussions of it on TLUG, but the TLUG search engine is, ah, broken, as in "Alert!: HTTP/1.1 500 Internal Server Error".)

o freeWAIS-sf will blithely accept bound-to-fail queries and queries in broken syntax, simply returning no items found. I've started working on a syntax checker. I've settled on a model that will handle the problem, but ... has anyone already created such a thing?

o Has anyone embedded nkf and kakasi or an equivalent into freeWAIS-sf-jp? This site will see a lot of traffic when it goes live, and it seems wasteful to be spawning collateral processes for every instance of the WAIS client.

o Is there a WAIS client module for Python? For that matter, an nkf module? A kakasi module? A fastcgi module?

Sorry for the number of questions, but if you're going to ask one, might as well ask 'em all.

Cheers,
Frank
- Follow-Ups:
 
- Re: Japanese search engines
 
- From: Shigeo Honda <shige@example.com>