Mailing List Archive



[tlug] Database frontend in Linux



Raedwolf Summoner writes:

 > In Windows I have a database program called askSam, that some of
 > you may be familiar with.

Not if we can help it. ;-)  I've heard of it, but most of us avoid
Windows and its apps as though they were the plague.[1]  You really
should describe what you're doing in more detail.

 > I would like to get something in Linux that would be similarly or
 > more suited to my needs. I have a rather extensive collection of
 > medical information from a large number of subscriptions/
 > periodicals that I have followed over the years. Initially, I
 > scanned, OCR'ed, and tossed the results into askSam

What do you mean by "tossed the results into askSam"?  What results?
How do you "toss" them?  How do you access them?

 > (1) Is it possible to find one good database frontend in Linux for
 >     these various requirements?

You haven't specified any requirements except "like askSam", which
means nothing to me, and "imports text, HTML, email, graphics, and
PDF" which isn't much more help.

If Josh is right, and what you want is an indexing/searching
application (he calls it a "search engine"), FreeWAIS is an
implementation of the (now defunct AFAIK) Wide Area Information
Servers (WAIS) protocol, and I know a couple of people who use it to
index email and other personal documents (originally it was designed
for library use).
Other well-known indexers are Namazu and Xapian.  However, AFAIK all
of these are basically libraries.  They come with sample front ends,
but those are usually adapted to providing indexes for web sites.  I
doubt any of them can parse PDF; I know Xapian and Namazu can handle
HTML, though.

FreeWAIS is an old, industrial-strength indexer that implements KWIC
(keyword in context) searches.  I don't know how well it handles
non-8-bit character sets (ie, Japanese); you don't mention that but I
gather you are here so ....
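
To make "keyword in context" concrete, here is a minimal Python
sketch.  It is not FreeWAIS code and says nothing about how FreeWAIS
works internally; the function name kwic and the width parameter are
made up for illustration.  It only shows what a KWIC result looks
like: each hit together with a bit of text on either side.

```python
import re

def kwic(text, keyword, width=30):
    """Return (left context, match, right context) for each whole-word,
    case-insensitive occurrence of keyword in text.  Hypothetical helper,
    not part of any indexer mentioned in this thread."""
    pattern = r"\b%s\b" % re.escape(keyword)
    hits = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits

sample = "Aspirin reduces fever. Take aspirin with food."
for left, word, right in kwic(sample, "aspirin", width=10):
    # Right-align the left context so the keywords line up in a column.
    print("%10s [%s] %s" % (left, word, right))
```

A real indexer does the lookup against a prebuilt index instead of
scanning the text each time, which is what makes it usable on a large
collection.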

Namazu's main claims to fame are that it was written by a Japanese
hacker, and integrates well with Japanese-specific tools such as
Kakasi.

Xapian implements very modern indexing and search technologies and is
very fast (a lot of open source software web services use it for
free-text indexing and search, eg, bug trackers and mailing list
archives).  I know for a fact that its more advanced algorithms are
not well-tuned for Japanese, though.

None of these use powerful database engines as backends.  Rather, they
use application-specific databases for the index files.
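
As a rough idea of what such an index file holds, the core structure
is usually an inverted index: a map from each word to the documents
that contain it.  The Python sketch below is a toy version under that
assumption.  The names build_index and search and the sample documents
are invented for the example, and none of the indexers above
necessarily store their data this way on disk.

```python
import re
from collections import defaultdict

def tokenize(text):
    # Crude tokenizer: lowercase ASCII letters and digits only.
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], ()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical sample documents.
docs = {
    "note1.txt": "Aspirin reduces fever and pain.",
    "note2.txt": "Fever in children: when to call a doctor.",
    "note3.txt": "Pain management without aspirin.",
}
index = build_index(docs)
print(sorted(search(index, "aspirin pain")))
```

Note that the tokenizer here only splits on ASCII letters and digits,
so it would be useless for Japanese text; that is exactly the sort of
thing that tools like Kakasi are for.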

If you want something with more structure, then a relational database
(like Oracle, dBASE II, or FoxBase) may be useful (but again I don't
know about easy-to-use front-ends; when I need to talk to an SQL
database I use Emacs Lisp or Python).  PostgreSQL and MySQL are the
best known open source databases, and you can buy Oracle if you want
to.
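
For a feel of what "more structure" buys you, here is a small Python
sketch of talking to an SQL database.  It uses the sqlite3 module from
the Python standard library purely as a stand-in so the example runs
without a server; with PostgreSQL or MySQL you would use a different
driver, but the SQL would look much the same.  The articles table, its
columns, and the sample rows are all invented for illustration.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles ("
    " id INTEGER PRIMARY KEY,"
    " journal TEXT, year INTEGER, title TEXT, body TEXT)"
)

# Hypothetical records standing in for scanned/OCR'ed articles.
rows = [
    ("Journal A", 2003, "First article", "text of the first article"),
    ("Journal A", 2008, "Second article", "text of the second article"),
    ("Journal B", 2008, "Third article", "text of the third article"),
]
conn.executemany(
    "INSERT INTO articles (journal, year, title, body) VALUES (?, ?, ?, ?)",
    rows,
)
conn.commit()

def find_articles(conn, journal, since):
    """Titles and years of articles in a journal from a given year on."""
    cur = conn.execute(
        "SELECT title, year FROM articles"
        " WHERE journal = ? AND year >= ? ORDER BY year",
        (journal, since),
    )
    return cur.fetchall()

print(find_articles(conn, "Journal A", 2005))
```

The point is that you can ask questions by field (journal, year, and
so on), which a plain free-text indexer does not give you.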

 > (2) Does Linux have anything that resembles askSam in the way it
 >     handles information? Is "Basket" similar to askSam, and would
 >     it easily handle the volume?

If you're talking about basket.kde.org, probably not.  BasKet is about
taking plain text notes according to the web site; I don't imagine
that you can import anything else usefully.  But you could install it
and check the docs; it shouldn't take more than about 15 minutes for a
full install-try-uninstall cycle. :-)


Footnotes: 
[1]  Which is not to be confused with "shin-gata infuruenza".  See
http://turnbull.sk.tsukuba.ac.jp/Blog/Japan/JapanCowersBeforeTheBlackDeath.
P.S. to Josh: I've added dates on the main page.  Good idea, that!


