public inbox for cygwin-apps@cygwin.com
* Dedup x86/x86_64 --> noarch
@ 2016-04-16 10:04 Achim Gratz
From: Achim Gratz @ 2016-04-16 10:04 UTC (permalink / raw)
  To: cygwin-apps


After a discussion on IRC about de-duping the noarch content out of
package files (where I was told this would be too difficult), I've just
tried what would happen for two of my packages, maxima and perl.  Maxima
is practically a noarch package, save for the clisp memory image.  Perl
has gobs and gobs of non-arch-specific files mixed in with quite a bit
of arch-specific stuff.  I've used hashdeep to find the dupes since
it is really fast; because it compares hashes, files are only de-duped
if they are bit-for-bit identical.

p=perl    # POSIX sh/bash assignment; in csh/tcsh use: set p=perl
# reference
( cd $p.x86/inst    ; hashdeep -c sha256 -lr * ) > $p.x86.hash
# matching files
( cd $p.x86_64/inst ; hashdeep -c sha256 -k ../../$p.x86.hash -mlr * )
# non-matching files
( cd $p.x86_64/inst ; hashdeep -c sha256 -k ../../$p.x86.hash -xlr * )
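
A rough equivalent of the comparison above can be sketched without
hashdeep, using sha256sum, sort and comm as stand-ins.  This is only an
illustration under assumptions: the demo.x86 / demo.x86_64 trees and
their files are made up, and unlike hashdeep's matching mode (which, as
far as I understand it, matches on the content hash regardless of path)
this variant only reports a file as shared when both the relative path
and the content agree.

```shell
# Build two tiny, hypothetical install trees to compare.
work=$(mktemp -d)
mkdir -p "$work/demo.x86/inst/usr/share"    "$work/demo.x86/inst/usr/bin"
mkdir -p "$work/demo.x86_64/inst/usr/share" "$work/demo.x86_64/inst/usr/bin"
printf 'shared docs\n'   > "$work/demo.x86/inst/usr/share/README"
printf 'shared docs\n'   > "$work/demo.x86_64/inst/usr/share/README"
printf 'i686 binary\n'   > "$work/demo.x86/inst/usr/bin/tool"
printf 'x86_64 binary\n' > "$work/demo.x86_64/inst/usr/bin/tool"

# Hash every regular file below a directory, using relative paths,
# and sort the "hash  path" lines bytewise so comm can consume them.
hash_tree() {
  ( cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort )
}
hash_tree "$work/demo.x86/inst"    > "$work/x86.sha256"
hash_tree "$work/demo.x86_64/inst" > "$work/x86_64.sha256"

# Lines present in both lists: same path and same content (noarch candidates).
matching=$(LC_ALL=C comm -12 "$work/x86.sha256" "$work/x86_64.sha256" \
           | sed 's/^[^ ]* *//')
# Lines only in the x86_64 list: arch-specific or otherwise differing files.
differing=$(LC_ALL=C comm -13 "$work/x86.sha256" "$work/x86_64.sha256" \
            | sed 's/^[^ ]* *//')

printf 'matching:\n%s\n'  "$matching"
printf 'differing:\n%s\n' "$differing"
rm -rf "$work"
```

Requiring the path to match as well is stricter than a pure hash
match, but it is closer to what a noarch split would need, since a
shared file has to land in the same place on both arches.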

For Maxima, there are a few files that should be identical, but aren't:
these are leakages from the build environment that I'll have to patch
out later (one of these leakages is actually a bug, affecting parts of
the documentation).

For Perl, the gzip-compressed man pages are flagged as different,
because gzip leaks the time-stamp into the compressed file (but that
could be avoided by using the -n option to gzip in cygport).  With that
fixed, the documentation packages for Perl are completely shared
between the two arches (well, duh), but even the binary packages perl,
perl-debuginfo and perl_base would share about a quarter of their
content (so they'd need to be split into something like perl_base /
perl_base-common).
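
The effect of the time-stamp leak can be reproduced with a small
sketch.  The directory and file names here are hypothetical, and the
behaviour described is that of GNU gzip: by default it records the
source file's name and modification time in the header, while -n
omits both.

```shell
# Two copies of the same man page, as if built at different times.
tmp=$(mktemp -d)
mkdir -p "$tmp/x86" "$tmp/x86_64"
printf 'man page body\n' > "$tmp/x86/foo.1"
cp "$tmp/x86/foo.1" "$tmp/x86_64/foo.1"
touch -t 201604161004.00 "$tmp/x86/foo.1"
touch -t 201604181945.00 "$tmp/x86_64/foo.1"

# Default compression: the mtime goes into the gzip header.
gzip -c  "$tmp/x86/foo.1"    > "$tmp/x86.default.gz"
gzip -c  "$tmp/x86_64/foo.1" > "$tmp/x86_64.default.gz"
# With -n: neither the name nor the time-stamp is stored.
gzip -nc "$tmp/x86/foo.1"    > "$tmp/x86.n.gz"
gzip -nc "$tmp/x86_64/foo.1" > "$tmp/x86_64.n.gz"

if cmp -s "$tmp/x86.default.gz" "$tmp/x86_64.default.gz"; then
  default_same=yes
else
  default_same=no
fi
if cmp -s "$tmp/x86.n.gz" "$tmp/x86_64.n.gz"; then
  n_same=yes
else
  n_same=no
fi

echo "default outputs identical: $default_same"
echo "gzip -n outputs identical: $n_same"
rm -rf "$tmp"
```

With GNU gzip the default outputs are expected to differ while the -n
outputs come out bit-for-bit identical, which is what the hash
comparison needs.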

Looking at the current repo content, we'd save about 30GB from the dedup
of the src and doc packages alone and probably about 20GB from dedup in
the remaining packages.
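
One way such a figure could be estimated per package is to add up the
sizes of the files that are identical in both trees.  The following is
a minimal sketch on made-up data, not the procedure used for the
numbers above; it compares files directly with cmp instead of hashing,
and the simple loop assumes file names without whitespace.

```shell
# Hypothetical unpacked trees for the two arches.
work=$(mktemp -d)
mkdir -p "$work/x86" "$work/x86_64"
printf 'shared documentation\n' > "$work/x86/README"      # 21 bytes
cp "$work/x86/README" "$work/x86_64/README"
printf 'i686 code\n'   > "$work/x86/tool"
printf 'x86_64 code\n' > "$work/x86_64/tool"

# Sum the sizes of files that exist at the same path with identical content.
saved=0
for f in $(cd "$work/x86" && find . -type f); do
  if [ -f "$work/x86_64/$f" ] && cmp -s "$work/x86/$f" "$work/x86_64/$f"; then
    size=$(wc -c < "$work/x86/$f")
    saved=$((saved + size))
  fi
done

echo "bytes stored twice that a noarch package would store once: $saved"
rm -rf "$work"
```

This counts uncompressed bytes; the saving in the repository itself
would depend on how well the shared files compress.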


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf Q+, Q and microQ:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds
