Mailing List Archive



Re: [tlug] Re: tlug] Security question with grep/e...



On Tue, Mar 23, 2004 at 04:03:45PM +1100, Jim Breen wrote:
> "Stephen J. Turnbull" <stephen@example.com> wrote:
> >> 
> >> >>>>> "Jim" == Jim Breen <Jim.Breen@example.com> writes:
> >> 
> >>     Jim> [...] the CGI program would do a system() call [...]
> >> 
> >> Since you care about the host, don't do system() calls.  There are too
> >> many ways to break the call itself, and you then become hostage to any
> >> security holes that may exist in the called programs as well.
> 
> Can you be more specific about the risks? As I understand it, doing a 
> system("foobar par1 par2"); just stokes up /bin/sh under my account (it's
> usually cgiwrap or equivalent) and runs foobar. No different from my running
> foobar myself. I'm not doing it with anything suid, etc. I don't have su
> rights on the host.
> 
> >> What's wrong with using the native regexp facility of whatever you're
> >> using to write the CGI?  Even if it's in C or C++, the POSIX regcomp/
> >> regexec facility is not rocket science to use.  That's what you'd be
> >> using with grep, anyway, AFAIK.
> 
> Two reasons:
> 
> (a) laziness. It's easier to stoke up a system call than open the file and
> do it line-by-line. Actually it's *MUCH* easier than regexec()'s
> horrible call;
> 
> (b) portability. I have actually found some of those libraries not
> so smoothly implemented. Since I have mirrors on Solaris, AIX, FreeBSD
> and almost all Linices, system("egrep ..."); seemed more likely to 
> work on them all. (iconv(), for example, has some problems on the AIX
> system, probably because of code-table differences.)

Would it not be easier just to do this in Perl anyway? Here is my
reasoning:

1) system() does not hand you egrep's output, so you have to do a whole
lot of messing about (a pipe, or a temporary file) to get it back, not to
mention parsing it. Either way this involves a fork(), which is an
expensive call, and under heavy usage it may affect the machine.

2) Charsets. Even though you are passing the string to egrep, I presume
you have to have it in a common charset, and the likelihood is that you
will get it in UTF-8, which may or may not be a good thing depending on
the charset of the data you are comparing it to. Also, a double quote may
have multiple encodings.

3) egrep is going to involve a lot of file I/O. Are your disks up to it?

However, a few ideas about putting it in Perl:

1) Charsets are sorted: you just let Perl handle the conversion (from 5.6
onwards), no matter what the OS. Perl knows about broken iconvs and
oddities on different platforms.

2) You can lose even the initial fork from Apache(?) by using mod_perl.

3) You can easily put your entire sentence list into a hash/DBM, which
could be easier to search and, depending on the size, completely memory
resident.

4) Your security problems are simplified:
    die unless ($user_str =~ m/\A[\w.]+\z/);
   Perl's regexes also know about charsets, which characters belong to
   each class, and multiple encodings for characters.

5) You lose the system() call and don't have to worry about egrep
incompatibilities.

6) You get rid of SEGVs from miscalculating the buffer size needed for
multibyte character strings, and all the other C nasties.
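
To make points 3 and 4 a bit more concrete, here is a minimal sketch, not
a tested implementation of the CGI in question. The helper names
is_safe_query and search_sentences and the sample sentences are made up
for illustration, and it assumes the user's string should be matched
literally (drop the \Q...\E if the dot is meant to act as a regex
wildcard). The check is anchored with \A and \z so that a trailing
newline is rejected as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: accept only word characters and dots.
# \A and \z anchor the whole string, so "cat\n" is rejected too.
sub is_safe_query {
    my ($user_str) = @_;
    return 0 unless defined $user_str;
    return ($user_str =~ m/\A[\w.]+\z/) ? 1 : 0;
}

# Hypothetical helper: search an in-memory sentence list with the
# validated string. No system() call, no shell, no egrep.
sub search_sentences {
    my ($user_str, @sentences) = @_;
    die "rejected query\n" unless is_safe_query($user_str);
    my $re = qr/\Q$user_str\E/;    # assumption: match the input literally
    return grep { $_ =~ $re } @sentences;
}

# Made-up sample data standing in for the real sentence list.
my @sentences = ('the cat sat', 'a dog ran', 'cat.food is cheap');
print "$_\n" for search_sentences('cat', @sentences);
```

Since the input never reaches a shell, the validation here only has to
protect the regex engine and the data, not /bin/sh.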

I am sure there are other arguments for both sides, but I think it
would actually be lazier just to do it in Perl.

Tim.

