Re: [tlug] Do you whitelist or blacklist utf-8?



Josh Glover writes:

 > > IMHO, only whitelist.
 > 
 > +1

What you mean is to blacklist any character that might be syntactic,
and to take a character off the blacklist only if you really need it.
In particular, blacklist everything in ASCII except the alphanumeric
characters and maybe the space.  Non-ASCII characters don't matter
most of the time.  "Whitelist everything in Unicode except ASCII
punctuation" isn't really a whitelist.
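
As a rough illustration of that rule, here is a minimal Python sketch.
The function name is_acceptable and the exact allowed set are
assumptions made for the example, not anything from the thread: every
ASCII character other than letters, digits, and the space is rejected,
and everything outside ASCII is let through.

```python
import string

# Assumption for this sketch: "possibly syntactic" means every ASCII
# character that is not a letter, a digit, or the space.
ALLOWED_ASCII = set(string.ascii_letters + string.digits + " ")

def is_acceptable(text: str) -> bool:
    """Return True if text contains no blacklisted ASCII character.

    Non-ASCII characters (code points above 127) are always accepted.
    """
    return all(ord(ch) > 127 or ch in ALLOWED_ASCII for ch in text)
```

Under these assumptions a Japanese search term mixed with ASCII words
passes, while anything containing "<", a quote, or a control character
such as a newline is refused.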

 > > Of course, all this is not excuse for not using pre-compiled SQL queries
 > > with placeholders, or whatever they are called in PHP.
 > 
 > +2

Indeed, this is far more important.
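
For concreteness, a placeholder query might look like the sketch
below.  It uses Python's standard sqlite3 module rather than PHP; the
users table and the find_users helper are hypothetical names made up
for the example.

```python
import sqlite3

def find_users(conn: sqlite3.Connection, name: str) -> list:
    """Look up users by exact name using a bound parameter."""
    # The "?" placeholder keeps the value out of the SQL text, so
    # quotes in `name` cannot change the structure of the query.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()

# Hypothetical in-memory table, only there to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [("alice",), ("bob",)])
```

A classic injection string such as x' OR '1'='1 is then just an
unusual name that matches no row.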

It's really not clear to me what Dave is worried about.  XSS
vulnerabilities are 100% about untrusted *ML (mostly HTML, but many
browsers can now handle SVG and even generic XML).  Filter "<" and
you're done.  No meta tags, no script tags, no a tags, no img tags, no
link tags.  Have I missed any?  It doesn't matter: there are no tags
at all here!
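
A minimal sketch of that filter in Python, assuming the untrusted text
is placed in element content.  Note that the standard library's
html.escape does somewhat more than the "<" named above: it also
escapes "&", ">", and both quote characters.  The render_comment
wrapper is a hypothetical name for the example.

```python
import html

def render_comment(untrusted: str) -> str:
    """Wrap untrusted text in a paragraph with markup neutralized."""
    # html.escape turns "<" into "&lt;", so the browser never sees a
    # tag that came from the user.
    return "<p>" + html.escape(untrusted) + "</p>"
```

Whatever tag the input tries to open arrives at the browser as plain
text, since no "<" from the input survives.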

More generally, the right thing to do is to write down a grammar for
valid input, and validate everything against it.  Refuse to process or
guess about invalid input.  (It's OK to guess, but the guess must be
formally part of your grammar!)  This is (in general) *more* than
whitelisting characters, although for terms in a search box it might
reduce to a whitelist of characters.
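
To make the grammar idea concrete, here is a small Python sketch for a
search box.  The grammar itself is invented for the example: a query
is one or more terms separated by spaces, and a term is a word of
Unicode letters and digits, optionally prefixed by "author:" or
"subject:".  Tolerating runs of several spaces is the kind of "guess"
that is written into the grammar rather than applied afterwards.

```python
import re

# Hypothetical grammar for this sketch:
#   query := term (" "+ term)*
#   term  := [("author" | "subject") ":"] word
#   word  := one or more Unicode letters or digits
WORD = r"[^\W_]+"
TERM = rf"(?:(?:author|subject):)?{WORD}"
QUERY = re.compile(rf"{TERM}(?: +{TERM})*")

def parse_query(text: str) -> list:
    """Return the list of terms, or raise ValueError on invalid input."""
    if QUERY.fullmatch(text) is None:
        # Refuse outright; do not try to repair or guess.
        raise ValueError("query does not match the grammar")
    return text.split()
```

This is more than a character whitelist: ":" is accepted only directly
after one of the two known field names, so a:b is rejected even though
each of its characters appears somewhere in a valid query.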


