Mailing List Archive
tlug.jp Mailing List tlug archive tlug Mailing List Archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]Re: [tlug] array duplicate check [C] [ Solved ]
- Date: Fri, 20 Apr 2007 17:47:43 +0900
- From: "Stephen J. Turnbull" <stephen@example.com>
- Subject: Re: [tlug] array duplicate check [C] [ Solved ]
- References: <78d7dd350704180121g29133f9lc427b81d76fd58d0@example.com> <878xcqrpuh.fsf@example.com> <d8fcc0800704181741u64a3bb1dyf7203f28e168ba0c@example.com> <87tzvbrh3f.fsf@example.com> <462817F9.1070903@example.com>
Darren Cook writes: > > BTW, this is the first time I've actually seen an application for a > > Bloom filter. I'm surprised nobody has called me on my N + M claim. :-) > Bloom filters ( http://en.wikipedia.org/wiki/Bloom_filter ) are not > reliable (they can return false positives). If I've understood > correctly, in terms of the original question it means it might say an > element in b[] is in a[] when in fact it isn't. You are correct. The probability can be made arbitrarily low, though. We'll have to ask Nguyen whether a rare false positive matters. That means the main task does not get run, even though the data is valid. On the other hand, Bloom filters guarantee no false negatives, so the main task never gets run on invalid data. BTW, as I imagined Nguyen's application, he is expecting that there will be zero positives, but needs to check. In that case, positives of either kind are "rare", so doing a full scan on M (< N) for equality if you do observe a positive is "cheap". But of course this is probably O(M*N) in the worst case. (It may be possible to choose the Bloom filter to do better.)
- References:
- Re: [tlug] array duplicate check [C] [ Solved ]
- From: Nguyen Vu Hung
- Re: [tlug] array duplicate check [C] [ Solved ]
- From: Stephen J. Turnbull
- Re: [tlug] array duplicate check [C] [ Solved ]
- From: Josh Glover
- Re: [tlug] array duplicate check [C] [ Solved ]
- From: Stephen J. Turnbull
- Re: [tlug] array duplicate check [C] [ Solved ]
- From: Darren Cook
Home | Main Index | Thread Index
- Prev by Date: Re: [tlug] TLUG shirt design vote
- Next by Date: Re: [tlug] TLUG Shirts (quantiy and sizes)
- Previous by thread: Re: [tlug] array duplicate check [C] [ Solved ]
- Next by thread: [tlug] Re: haha
- Index(es):
Home Page Mailing List Linux and Japan TLUG Members Links