Mailing List Archive



Re: [tlug] Compression Comparison (WAS: Tip of the Day: "ghosting" a machine with nc and dd)



On Tue, 2007-07-17 at 08:35 -0500, Daniel A. Ramaley wrote:
> Given the choice of bzip2 or gzip, i'd recommend gzip for this
> application because (with the -9 switch) it usually achieves almost as
> good of compression as bzip2 while being *much* easier on the CPU.

I can attest to this. I was recently involved in a little argument with
someone here at work about whether gzip or infozip was better. (In
short, file size about the same; speed favors gzip a bit.) So, I decided
to test bzip2 using the same file.

I ran bzip2 and gzip on a 1,949,703-byte plain text file. Here are the
file sizes and timings (from the time command). This is on my office
machine, a Dell Optiplex GX620 with a 3.0 GHz Pentium D and 2 GB RAM,
running Linux:

Compressor     Compressed              Time
               File Size (bytes)       (seconds)
----------     -----------------      ---------
bzip2           146,586                0.914
bzip2 -9        146,586                0.868
gzip            212,035                0.109
gzip -9         208,901                0.145
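
For anyone who wants to repeat this kind of comparison, here is a minimal
sketch. It is not how the numbers above were produced: it assumes Python's
standard bz2 and gzip modules rather than the command-line tools, so absolute
timings will differ. The benchmark() helper and the sample data are made up
for illustration; level 6 stands in for plain gzip, whose default level is 6.

```python
import bz2
import gzip
import time


def benchmark(data: bytes):
    """Return a list of (label, compressed_size, seconds) tuples."""
    settings = [
        ("bzip2 (level 9)", lambda d: bz2.compress(d, 9)),
        ("gzip (level 6)", lambda d: gzip.compress(d, compresslevel=6)),
        ("gzip (level 9)", lambda d: gzip.compress(d, compresslevel=9)),
    ]
    results = []
    for label, compress in settings:
        start = time.perf_counter()
        compressed = compress(data)
        elapsed = time.perf_counter() - start
        results.append((label, len(compressed), elapsed))
    return results


if __name__ == "__main__":
    # Hypothetical stand-in for a plain text file of about 1.8 MB.
    sample = b"The quick brown fox jumps over the lazy dog.\n" * 40000
    for label, size, seconds in benchmark(sample):
        print(f"{label:<16} {size:>10,} {seconds:>8.3f}")
```

Highly repetitive sample data like this compresses far better than real
text, so substitute the contents of an actual file for a meaningful result.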

These results rather surprised me, especially the comparison of bzip2
and bzip2 -9, so I decided to try it on another machine. This time it's
a Sun Fire V250 running SunOS 5.9, with 2 CPUs and multiple users. The
test file was 2,021,318 bytes.

bzip2           524,962                1.238
bzip2 -9        524,962                1.214
gzip            698,676                0.707
gzip -9         684,522                1.805
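
To put the sizes on a common scale, the compressed size can be expressed as a
fraction of the original. The sketch below only does that arithmetic on the
figures from the two tables above; the dictionary layout and the ratios()
helper are illustrative names, not part of any tool. On the Dell file, bzip2
comes out at roughly 7.5% of the original size and gzip -9 at roughly 10.7%.

```python
# Sizes in bytes, copied from the two tables above.
ORIGINAL = {"Dell": 1_949_703, "Sun": 2_021_318}
COMPRESSED = {
    "Dell": {"bzip2": 146_586, "gzip": 212_035, "gzip -9": 208_901},
    "Sun": {"bzip2": 524_962, "gzip": 698_676, "gzip -9": 684_522},
}


def ratios(machine: str) -> dict:
    """Compressed size as a fraction of the original size."""
    total = ORIGINAL[machine]
    return {name: size / total for name, size in COMPRESSED[machine].items()}


if __name__ == "__main__":
    for machine in ORIGINAL:
        for name, ratio in ratios(machine).items():
            print(f"{machine:<5} {name:<8} {ratio:6.1%}")
```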

This shows the same peculiar pattern: bzip2 -9 produces a file of exactly
the same size as plain bzip2, and it even takes slightly less time! Oh, I
see. Perhaps I should have read the fine bzip2 man page first:

-1 (or --fast) to -9 (or --best)
          Set the block size to 100 k, 200 k .. 900 k when
          compressing. Has no effect when decompressing. See
          MEMORY MANAGEMENT below. The --fast and --best aliases
          are primarily for GNU gzip compatibility. In particular,
          --fast doesn't make things significantly faster.
          And --best merely selects the default behaviour.
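
The same behaviour can be checked quickly from Python. This is a sketch that
assumes the standard bz2 module, whose default compresslevel is also 9, not
the command-line tool itself; the sample data is made up. The fourth byte of
a bzip2 stream records the block size digit, so it shows which level was
used.

```python
import bz2

# Hypothetical sample data; any bytes will do.
data = b"plain text sample line for bzip2\n" * 5000

default_out = bz2.compress(data)   # no level given: the module defaults to 9
best_out = bz2.compress(data, 9)   # explicit level 9, i.e. 900 k blocks
fast_out = bz2.compress(data, 1)   # level 1, i.e. 100 k blocks

# The default and the explicit level 9 produce identical output.
same_as_best = default_out == best_out

# Stream header is "BZh" followed by the block size digit.
header_fast = fast_out[:4]
header_best = best_out[:4]

if __name__ == "__main__":
    print(same_as_best, header_fast, header_best)
```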

That explains it.


-- 
Stuart Luppescu -=-=- slu <AT> ccsr <DOT> uchicago <DOT> edu
CCSR at U of C ,.;-*^*-;.,  ccsr.uchicago.edu
     (^_^)/    Father of 才文 and 智奈美
Thank God I'm an atheist!


