
Mailing List Archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [tlug] array duplicate check [C] [ Solved ]
Josh Glover writes:
> On 18/04/07, Stephen J. Turnbull <stephen@example.com> wrote:
> > Cost of a sort is M log M.
> Assuming you have quicksort, and assuming non worst-case behaviour.
No, assuming you have anything like a reasonable sorting algorithm,
and non-worst-case behavior. Sorting is just an M log M problem, and
you have to go out of your way to find an algorithm that takes M^2.
(Not very far, I'm sure everybody has used a bubble sort at some
time. :-) You could use a different M log M sort every day of the
week, maybe even for a whole month. And if Nguyen doesn't have
quicksort, then he can't even do #include <stdlib>, no? (OK, maybe
ISO doesn't specify quicksort, but who'd use anything else?) And
somebody stole his copy of Sedgewick.
Furthermore there are algorithms that guarantee M log M in the worst
case (with a bigger step coefficient than quicksort). Python uses one
that is stable and in-place (ie, uses a constant amount of extra
space). I don't actually know what the algorithm used is ;-), that's
advertisement from the guy who implemented it, but Tim Peters is one
of the great programmers of our day -- FWIW I trust him, even blowing
his own horn. He claims that it's almost as fast as a stable
quicksort on average, and of course worst-case behavior is better.
BTW, this is the first time I've actually seen an application for a
Bloom filter. I'm surprised nobody has called me on my N + M claim. :-)
Home |
Main Index |
Thread Index