Mailing List Archive
Re: [tlug] Do you whitelist or blacklist utf-8?
- Date: Tue, 22 Feb 2011 20:10:01 +0900
- From: Jean-Christian Imbeault <jc.imbeault@example.com>
- Subject: Re: [tlug] Do you whitelist or blacklist utf-8?
- References: <4D639689.1010302@example.com>
I use HTMLPurifier.

Jc

On Tue, Feb 22, 2011 at 7:57 PM, Dave M G <dave@example.com> wrote:
> TLUG,
>
> I've been going a little mental today trying to figure out how to filter
> out possible malicious characters from POST data going to my site. I
> want to block things like <,>, *. etc...
>
> The thing is that I also want to be able to allow CJK characters, and
> any other language with non-Latin characters. This is a snap to do if
> you just want to allow 0-9a-zA-Z. But once you get into Unicode land, it
> seems to be a whole other ballgame.
>
> I've got three stages I want to filter on. First I want to block
> characters on the client side with Javascript, so that the user is aware
> of what characters are permissible when entering names and whatnot. Then
> I want to block any bad characters on the server side in PHP to make
> sure no script kiddies have tried to POST anything nasty. And also, just
> for good measure, I want to ensure no nastiness is inserted into my MySQL.
>
> I'd like all three steps to be consistent with each other, so I'm trying
> to standardize a set of bad characters that I can filter for at each step.
>
> However, where I've broken down is whether or not I should blacklist bad
> characters (where I fear I might miss one), whitelist good characters
> (seems tough to get a whitelist that's utf-8 compatible), or do
> something like make comparisons on HTML entities or with regex or
> something using built in functions (PHP and Javascript differ on
> specific functions and their results).
>
> Since you guys are the go-to people for handling utf-8 text, I thought
> maybe you've encountered this before.
>
> How do you handle filtering malicious code from utf-8 text that contains
> CJK and other languages?
>
> And how do you do it in PHP and Javascript?
>
> --
> Dave M G
>
> --
> To unsubscribe from this mailing list,
> please see the instructions at http://lists.tlug.jp/list.html
>
> The TLUG mailing list is hosted by the award-winning Internet provider
> ASAHI Net.
> Visit ASAHI Net's English-language Web page: http://asahi-net.jp/en/
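
For readers following the thread, the client-side whitelist Dave describes can be sketched in JavaScript with Unicode property escapes (added to the language in ES2018, so not available when this thread was written). This is only an illustration and is not what HTMLPurifier does. The helper names `isAllowedName` and `escapeHtml`, the 64-character limit, and the choice to allow only letters, combining marks, digits, and ASCII spaces are all assumptions made for the example.

```javascript
// Hypothetical whitelist for a "name" field (assumed policy, not from the thread):
// letters (\p{L}), combining marks (\p{M}), and digits (\p{N}) from any script,
// plus ASCII space, 1 to 64 code points long.
const NAME_PATTERN = /^[\p{L}\p{M}\p{N} ]{1,64}$/u;

function isAllowedName(input) {
  // Normalize to NFC so composed and decomposed forms are checked alike.
  return NAME_PATTERN.test(input.normalize("NFC"));
}

// Characters that are significant in HTML, mapped to their entities.
const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Hypothetical output-side helper: escape text before placing it in HTML.
function escapeHtml(text) {
  return text.replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

console.log(isAllowedName("山田太郎")); // true
console.log(isAllowedName("<script>")); // false
```

A check like this only tells the user which characters are accepted. Anyone can bypass client-side code, so the server still has to validate the input again and escape it when writing it into HTML or SQL.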
- References:
- [tlug] Do you whitelist or blacklist utf-8?
- From: Dave M G