public inbox for gcc@gcc.gnu.org
* lrzip: extreme compression (but beware its slow decompression speed)
@ 2012-03-30 17:02 Jim Meyering
  2012-03-31  7:35 ` Con Kolivas
From: Jim Meyering @ 2012-03-30 17:02 UTC
  To: gcc

In case you're evaluating what compression programs to use...

This started off as a comparison of xz and lzip,
but then I added lrzip to the mix.

Sometimes it's useful to have an idea of how far from "ideal"
a compression program is.  I'm not claiming to have the answer,
but merely sharing my surprise at how far off xz and lzip are
when it comes to the size of the compressed result.

I started off by downloading the gcc-4.7.0.tar.bz2 release tarball
and decompressing it, then recompressing using bzip2, lzip, xz and lrzip:
(on a 6/12-core Fedora 17 x86_64 system with plenty of RAM)

  KiB   compression
 size   time m:ss  file name
------  --------   -----------------
514400     NA      gcc-4.7.0.tar
 80588  0:58.12    gcc-4.7.0.tar.bz2 (-9)
 59556  6:16.61    gcc-4.7.0.tar.lz (-9)
 58640  5:55.78    gcc-4.7.0.tar.xz (-9e)
 48876  2:46[*]    gcc-4.7.0.tar.lrz (-z -L8 -w2000)

[*] multi-threaded; I think it had at least 6 or 7 cores busy at one point.
This is using the latest, v0.47-590-ga9ba55f, from the upstream repo,
git://github.com/ckolivas/lrzip.git

The above shows that xz compresses both faster than lzip (by about 5%)
and better (by 916 KiB, or ~1.5%).

It also shows that lrzip compresses extremely well, saving over 9 MiB
(more than 16%) relative to xz with its -9e options.
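
The percentages quoted above can be rechecked from the table with a few
lines of shell. This is only a sketch, assuming a POSIX sh and awk; the
helper names to_seconds and pct_less are made up for illustration and are
not part of any of the tools being compared.

```shell
#!/bin/sh
# Convert an m:ss.ff duration (as printed in the table) to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ printf "%.2f\n", $1 * 60 + $2 }'
}

# Percentage by which NEW is smaller than OLD, to one decimal place.
pct_less() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (old - new) / old * 100 }'
}

# Sizes (KiB) and compression times copied from the table.
lz_kib=59556; xz_kib=58640; lrz_kib=48876
lz_sec=$(to_seconds 6:16.61)
xz_sec=$(to_seconds 5:55.78)

echo "xz vs lzip, size:  $(pct_less "$lz_kib" "$xz_kib")% smaller"
echo "xz vs lzip, time:  $(pct_less "$lz_sec" "$xz_sec")% less time"
echo "lrzip vs xz, size: $(pct_less "$xz_kib" "$lrz_kib")% smaller"
```

With the numbers from the table this prints 1.5%, 5.5% and 16.7%
respectively, in line with the rounded figures given in the text.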

----------------------------------------------------
More importantly, what about decompression speed?
Compression happens relatively rarely, once per release by whoever prepares
it, but then many people download and decompress the result.

(the following xz and lzip times are each best-of-3)

    $ env time --f=%e xz -dc gcc-4.7.0.tar.xz > /dev/null
    4.35

    $ env time --f=%e lzip -dc gcc-4.7.0.tar.lz > /dev/null
    6.06

    $ env time --f=%e bzip2 -dc gcc-4.7.0.tar.bz2 > /dev/null
    13.96

    $ ./lrzip -d -o - gcc-4.7.0.tar.lrz > /dev/null
    3:36.12 (note, that's over 3.5 *minutes* to decompress on a 12-core system)

That shows another reason to prefer xz over lzip.
xz decompresses this tarball in 28% less time than lzip.
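
To put the four decompression times on one scale, they can be restated as
multiples of the fastest one. The following is a rough sketch, assuming a
POSIX sh and awk; slowdown_table is a hypothetical helper name, and the
lrzip time 3:36.12 is entered as 216.12 seconds.

```shell
#!/bin/sh
# Read "<name> <seconds>" lines and print each time as a multiple
# of the smallest time in the input.
slowdown_table() {
    awk '
        { name[NR] = $1; secs[NR] = $2 + 0
          if (NR == 1 || secs[NR] < best) best = secs[NR] }
        END { for (i = 1; i <= NR; i++)
                  printf "%-6s %7.2f s  %5.1fx\n", name[i], secs[i], secs[i] / best }
    '
}

# Wall-clock times measured above, in seconds.
slowdown_table <<'EOF'
xz 4.35
lzip 6.06
bzip2 13.96
lrzip 216.12
EOF
```

On these measurements lzip comes out at about 1.4x the xz time, bzip2 at
about 3.2x, and lrzip (with -z) at nearly 50x.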


* Re: lrzip: extreme compression (but beware its slow decompression speed)
  2012-03-30 17:02 lrzip: extreme compression (but beware its slow decompression speed) Jim Meyering
@ 2012-03-31  7:35 ` Con Kolivas
From: Con Kolivas @ 2012-03-31  7:35 UTC
  To: gcc

Jim Meyering <jim <at> meyering.net> writes:

> 
> In case you're evaluating what compression programs to use...
> 
> This started off as a comparison of xz and lzip,
> but then I added lrzip to the mix.

> I started off by downloading the gcc-4.7.0.tar.bz2 release tarball
> and decompressing it, then recompressing using bzip2, lzip, xz and lrzip:
> (on a 6/12-core Fedora 17 x86_64 system with plenty of RAM)
> 
>   KiB   compression
>  size   time m:ss  file name
> ------  --------   -----------------
> 514400     NA      gcc-4.7.0.tar
>  80588  0:58.12    gcc-4.7.0.tar.bz2 (-9)
>  59556  6:16.61    gcc-4.7.0.tar.lz (-9)
>  58640  5:55.78    gcc-4.7.0.tar.xz (-9e)
>  48876  2:46[*]    gcc-4.7.0.tar.lrz (-z -L8 -w2000)


>     $ ./lrzip -d -o - gcc-4.7.0.tar.lrz > /dev/null
>     3:36.12 (note, that's over 3.5 *minutes* to decompress on a 12-core system)

Nice to see you trying out lrzip. The -z option is really for extreme
compression and not necessarily for regular use (think permanent archival or
distribution over slow connections). It is definitely not meant for one-time
compression where many people are expected to decompress the result and
decompression speed is crucial. The default options would be a much better
choice there; with them, lrzip basically uses a multi-threaded lzma.
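
For anyone who wants to try the default mode, here is a minimal sketch,
assuming lrzip is on the PATH. It uses only the plain "lrzip FILE"
invocation and the -d -o - form shown earlier in this thread;
lrzip_default_roundtrip is a hypothetical helper name, not something
shipped with lrzip.

```shell
#!/bin/sh
# Round-trip FILE through lrzip with its default options (no -z).
# Returns 127 if lrzip is not installed, non-zero if the data differs.
lrzip_default_roundtrip() {
    command -v lrzip >/dev/null 2>&1 || return 127
    lrzip "$1" || return 1                   # default settings; writes "$1.lrz"
    lrzip -d -o - "$1.lrz" | cmp -s - "$1"   # decompress to stdout and compare
}
```

Prefixing each lrzip call with "env time --f=%e" as in the original
measurements would give numbers comparable to the ones above.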

Regards,
Con
