Jon Turney via Cygwin-apps wrote: > Detect filename collisions between packages > Don't check filenames under /etc/postinstall/ for collisions > Report when filename collisions exist > Add option '--collisions' to enable IMO a useful enhancement. > Notes: > > Reading file catalog from a package is moderately expensive in terms of > I/O: To extract all the filenames from a tar archive, we need to seek to > every file header, and to seek forward through a compressed file, we > must examine every intervening byte to decompress it. > > This adds a fourth(!) pass through each archive (one to checksum it, one > to extract files, another one (I added in dbfd1a64 without thinking too > deeply about it) to extract symlinks), and now one to check for filename > collisions). > > Using std::set_intersection() on values from std::map() here is probably > a mistake. It's simple to write, but the performance is not good. A faster alternative which avoids set_intersection calls in a loop is possibly to use one large data structure which maps filenames to sets of packages. Using multimap instead of the straightforward map> needs possibly less memory (not tested). But for multimap it is required that file/package name pairs are not inserted twice. I attached a small standalone POC source file using multimap. It would also detect collisions in the already installed packages. > ... > diff --git a/filemanifest.h b/filemanifest.h > new file mode 100644 > index 0000000..cc6fb81 > --- /dev/null > +++ b/filemanifest.h > @@ -0,0 +1,29 @@ > +/* > + * Copyright (c) 2022 Jon Turney > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * A copy of the GNU General Public License can be found at > + *http://www.gnu.org/ "SPDX-License-Identifier: GPL-2.0-or-later" is possibly a more brief and modern way to say this in new code. > + * > + */ > + > +/* > + FileManifest > + > + A serializable set of filenames > + each filename is associated with the package name that owns it > + > + Used when working with the install package file manifests in /etc/setup > +*/ > + > +#include > +#include This should be as std::unordered_map is not used in the code. > +#include > + > +class FileManifest: public std::map > +{ > +}; Is the new file filemanifest.h required at all? It could be reduced to the following in install.cc: #include ... typedef std::map FileManifest; // or more modern (C++11): // using FileManifest = std::map; -- Regards, Christian