public inbox for cygwin@cygwin.com
* Observations about Cygwin's md5 checksums
@ 2014-07-07  4:38 Luke Kendall
  2014-07-07 10:05 ` Marco Atzeri
  0 siblings, 1 reply; 4+ messages in thread
From: Luke Kendall @ 2014-07-07  4:38 UTC
  To: cygwin; +Cc: audit

Here are five observations about md5 checksums in Cygwin.  I share it in 
case it may be of some small interest to a few people.  Please note that 
I may be wrong; if so, I'm happy to be corrected.

1) For each package, Cygwin stores the md5sum for the components of the 
main parts of the package in the setup.ini file.  The exception is the 
setup.hint file: its md5 sum is not recorded in setup.ini.

2) In each zip file for each package, an md5.sum file is almost always 
provided.  But not always. (*)

3) These md5.sum files list all the components of the package (including 
setup.hint), but these md5 sums are not reliable: they often don't match 
the actual md5 checksum (of the file itself, or of course the md5 stored 
for it in setup.ini).(**)

4) The most common file to have the wrong md5 checksum is setup.hint

5) It's not rare for files to be mentioned in a package's md5.sum which 
are absent from the package itself.(***)

I'm curious about the purpose of having the md5.sum file in each 
package.  Is it a relic of a previous system?

The above observations are based on a few weeks of mirroring and 
automatically checking the md5 sums of what we downloaded.  The main 
site we used was aarnet.edu.au (IIRC); recently we changed to 
mirrors.kernel.org, but from my ad hoc checks there wasn't much 
difference between the two.
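
As a rough illustration of the per-package check described above, here is a minimal Python sketch. It is not the script that was actually used. It assumes each md5.sum file uses the usual md5sum layout ("<hex digest>  <name>"), and the names `md5_of` and `check_package` are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_package(pkg_dir: Path) -> dict:
    """Compare a mirrored package directory with its md5.sum file.

    Assumption: md5.sum lines look like "<hex digest>  <name>",
    optionally with a "*" before the name (md5sum binary mode).
    """
    sum_file = pkg_dir / "md5.sum"
    if not sum_file.is_file():
        return {"status": "no md5.sum", "failed": [], "missing": []}

    failed, missing = [], []
    for line in sum_file.read_text().splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        expected, name = parts[0].lower(), parts[1].lstrip("*")
        target = pkg_dir / name
        if not target.is_file():
            missing.append(name)   # listed in md5.sum but absent
        elif md5_of(target) != expected:
            failed.append(name)    # present but checksum differs
    return {"status": "checked", "failed": failed, "missing": missing}
```

Running `check_package` over every package directory of a mirror would give the three kinds of finding listed in the observations: packages with no md5.sum, files whose checksum does not match, and files that are listed but absent.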

Regards,

luke

(*)
For mirrors.kernel.org last night:

Worrying: X11/khronos-opengl-registry has no md5.sum file
Worrying: X11/xlaunch has no md5.sum file
Worrying: cygwin64-gcc/cygwin64-gcc-debuginfo has no md5.sum file
Worrying: git/git-oodiff has no md5.sum file
Worrying: git/stgit has no md5.sum file
Worrying: git/tig has no md5.sum file
Worrying: man has no md5.sum file
Worrying: python/python-paramiko has no md5.sum file

(**)
$ grep FAILED [path-omitted]/x86/cygwin-archive-incomplete.txt | wc
      55     110    1463
$ grep "^setup.hint: FAILED" [path-omitted]/x86/cygwin-archive-incomplete.txt | wc
      28      56     560

(***)
$ wc -l < [path-omitted]/x86/cygwin-archive-all-missing-files.txt
406
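
The grep/wc counts above (55 FAILED lines, 28 of them for setup.hint) could also be produced per file name. The following is a minimal Python sketch that assumes the report consists of "name: STATUS" lines, as `md5sum -c` prints them; `tally_failures` is a hypothetical helper and the sample lines are made up.

```python
from collections import Counter


def tally_failures(report_lines):
    """Count "name: FAILED" lines per file name."""
    counts = Counter()
    for line in report_lines:
        # Split on the last ": " so names containing ": " stay intact.
        name, sep, status = line.rstrip("\n").rpartition(": ")
        if sep and status == "FAILED":
            counts[name] += 1
    return counts


# Made-up sample in the assumed report format.
sample = [
    "setup.hint: FAILED",
    "foo-1.0-1.tar.bz2: OK",
    "setup.hint: FAILED",
    "bar-2.0-1-src.tar.bz2: FAILED",
]
counts = tally_failures(sample)
print(sum(counts.values()), "failures in total;", counts["setup.hint"], "for setup.hint")
```

`counts.most_common()` then ranks the file names by how often they fail, which is how one would confirm that setup.hint is the most frequent offender.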



--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


* Re: Observations about Cygwin's md5 checksums
  2014-07-07  4:38 Observations about Cygwin's md5 checksums Luke Kendall
@ 2014-07-07 10:05 ` Marco Atzeri
  2014-07-08  3:12   ` Luke Kendall
  0 siblings, 1 reply; 4+ messages in thread
From: Marco Atzeri @ 2014-07-07 10:05 UTC
  To: cygwin

On 07/07/2014 06:38, Luke Kendall wrote:
> Here are five observations about md5 checksums in Cygwin.  I share it in
> case it may be of some small interest to a few people.  Please note that
> I may be wrong; if so, I'm happy to be corrected.
>
> 1) For each package, Cygwin stores the md5sum for the components of the
> main parts of the package in the setup.ini file.  The exception is the
> setup.hint file: its md5 sum is not recorded in setup.ini.

setup.ini is built from the setup.hint files of the individual
packages.  The setup.hint files have no further use outside the
www.cygwin.com server.



> 2) In each zip file for each package, an md5.sum file is almost always
> provided.  But not always. (*)
>
> 3) These md5.sum files list all the components of the package (including
> setup.hint), but these md5 sums are not reliable: they often don't match
> the actual md5 checksum (of the file itself, or of course the md5 stored
> for it in setup.ini).(**)

During upload of new files, the creation of md5.sum is out of sync
with the directory content.  md5.sum is updated once per hour.
If a mirror syncs before md5.sum is regenerated, it still has the
old version.
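
A mirror operator could look for this situation with a simple timestamp comparison. The sketch below is only a heuristic and rests on an assumption not stated in this thread: that the mirroring tool preserved modification times. `stale_md5_sum` is a hypothetical name.

```python
from pathlib import Path


def stale_md5_sum(pkg_dir: Path) -> list:
    """Return names of files modified after md5.sum was last written.

    Heuristic only: assumes the mirror preserved modification times.
    A non-empty result suggests md5.sum predates the latest upload.
    """
    sum_file = pkg_dir / "md5.sum"
    if not sum_file.is_file():
        return []
    written = sum_file.stat().st_mtime
    return sorted(
        p.name
        for p in pkg_dir.iterdir()
        if p.is_file() and p.name != "md5.sum" and p.stat().st_mtime > written
    )
```

An empty list does not prove that md5.sum is current; it only means no file in the directory is newer than it.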

> 4) The most common file to have the wrong md5 checksum is setup.hint
>
> 5) It's not rare for files to be mentioned in a package's md5.sum which
> are absent from the package itself.(***)
>
> I'm curious about the purpose of having the md5.sum file in each
> package.  Is it a relic of a previous system?
>
> The above observations are based on a few weeks of mirroring and
> automatically checking the md5 sums of what we downloaded.  The main
> site we used was aarnet.edu.au (IIRC); recently we changed to
> mirrors.kernel.org, but from my ad hoc checks there wasn't much
> difference between the two.
>
> Regards,
>
> luke
>
> (*)
> For mirrors.kernel.org last night:
>
> Worrying: X11/khronos-opengl-registry has no md5.sum file
> Worrying: X11/xlaunch has no md5.sum file
> Worrying: cygwin64-gcc/cygwin64-gcc-debuginfo has no md5.sum file
> Worrying: git/git-oodiff has no md5.sum file
> Worrying: git/stgit has no md5.sum file
> Worrying: git/tig has no md5.sum file
> Worrying: man has no md5.sum file
> Worrying: python/python-paramiko has no md5.sum file

Some of these directories no longer exist on www.cygwin.com.

>
> (**)
> $ grep FAILED [path-omitted]/x86/cygwin-archive-incomplete.txt | wc
>       55     110    1463
> $ grep "^setup.hint: FAILED" [path-omitted]/x86/cygwin-archive-incomplete.txt | wc
>       28      56     560
>
> (***)
> $ wc -l < [path-omitted]/x86/cygwin-archive-all-missing-files.txt
> 406
>
>

Regards
Marco



* Re: Observations about Cygwin's md5 checksums
  2014-07-07 10:05 ` Marco Atzeri
@ 2014-07-08  3:12   ` Luke Kendall
  2014-07-08  9:02     ` Marco Atzeri
  0 siblings, 1 reply; 4+ messages in thread
From: Luke Kendall @ 2014-07-08  3:12 UTC
  To: cygwin; +Cc: audit

On 07/07/14 20:05, Marco Atzeri wrote:
> On 07/07/2014 06:38, Luke Kendall wrote:
>> Here are five observations about md5 checksums in Cygwin.  I share it in
>> case it may be of some small interest to a few people.  Please note that
>> I may be wrong; if so, I'm happy to be corrected.
>>
>> 1) For each package, Cygwin stores the md5sum for the components of the
>> main parts of the package in the setup.ini file.  The exception is the
>> setup.hint file: its md5 sum is not recorded in setup.ini.
>
> setup.ini is built from the setup.hint files of the individual
> packages.  The setup.hint files have no further use outside the
> www.cygwin.com server.

Thanks for the explanation, Marco.  Can I check what you mean?
I think you're saying that setup.hint has no function once setup.ini has 
been created.
(You're not saying "setup.ini shouldn't be used for other purposes.")

>> 2) In each zip file for each package, an md5.sum file is almost always
>> provided.  But not always. (*)
>>
>> 3) These md5.sum files list all the components of the package (including
>> setup.hint), but these md5 sums are not reliable: they often don't match
>> the actual md5 checksum (of the file itself, or of course the md5 stored
>> for it in setup.ini).(**)
>
> During upload of new files, the creation of md5.sum is out of sync
> with the directory content.  md5.sum is updated once per hour.
> If a mirror syncs before md5.sum is regenerated, it still has the
> old version.
>

Does that mean that if the md5.sum file were created at the same time 
that the package contents were updated, there would be no possible way 
for out-of-sync md5.sum files to be provided?  Do you think the current 
process could be improved?

I have observed that the errors in the mirrors persist for weeks or more.

Anyway, by changing our checking process to use the information in 
setup.ini instead of the md5.sum files which can be wrong (for many 
weeks), we can bypass the incorrect md5.sum files.
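
A minimal Python sketch of checking against setup.ini instead follows. It assumes that setup.ini records each archive on a line of the form "install: <path> <size> <md5>" (and likewise "source: ..."), with paths relative to the mirror root; the function names and the layout in the test data are hypothetical.

```python
import hashlib
from pathlib import Path


def parse_setup_ini(text: str) -> dict:
    """Map each archive path to (size, md5) from install:/source: lines.

    Assumption: lines look like "install: <path> <size> <md5>".
    """
    expected = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key.strip() not in ("install", "source"):
            continue
        fields = rest.split()
        if len(fields) == 3 and fields[1].isdigit():
            path, size, digest = fields
            expected[path] = (int(size), digest.lower())
    return expected


def verify_mirror(mirror_root: Path, expected: dict) -> dict:
    """Return {path: problem} for files that are missing or do not match."""
    problems = {}
    for rel_path, (size, digest) in expected.items():
        target = mirror_root / rel_path
        if not target.is_file():
            problems[rel_path] = "missing"
        elif target.stat().st_size != size:
            problems[rel_path] = "size mismatch"
        elif hashlib.md5(target.read_bytes()).hexdigest() != digest:
            problems[rel_path] = "md5 mismatch"
    return problems
```

Because setup.hint has no entry in setup.ini, this approach says nothing about setup.hint files; it covers only the archives that setup.ini lists.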

>> 4) The most common file to have the wrong md5 checksum is setup.hint
>>
>> 5) It's not rare for files to be mentioned in a package's md5.sum which
>> are absent from the package itself.(***)
>>
>> I'm curious about the purpose of having the md5.sum file in each
>> package.  Is it a relic of a previous system?
>>
>> The above observations are based on a few weeks of mirroring and
>> automatically checking the md5 sums of what we downloaded.  The main
>> site we used was aarnet.edu.au (IIRC); recently we changed to
>> mirrors.kernel.org, but from my ad hoc checks there wasn't much
>> difference between the two.
>>
>> Regards,
>>
>> luke
>>
>> (*)
>> For mirrors.kernel.org last night:
>>
>> Worrying: X11/khronos-opengl-registry has no md5.sum file
>> Worrying: X11/xlaunch has no md5.sum file
>> Worrying: cygwin64-gcc/cygwin64-gcc-debuginfo has no md5.sum file
>> Worrying: git/git-oodiff has no md5.sum file
>> Worrying: git/stgit has no md5.sum file
>> Worrying: git/tig has no md5.sum file
>> Worrying: man has no md5.sum file
>> Worrying: python/python-paramiko has no md5.sum file
>
> Some of these directories no longer exist on www.cygwin.com.

Fair enough.  But from what you explained above, it seems strange that 
no md5.sum file exists.  Is it another problem caused by not providing 
the updated md5.sum file at the same time that a package is updated?  I 
wonder if there's a race condition: files in the package are updated, 
one by one, the md5.sum file removed, the new md5 sums computed and the 
file written - and in the meantime, someone may have mirrored the 
partially-updated package?
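
If the window really is "md5.sum removed, then rewritten", one common way to close it is to write the new file under a temporary name and rename it into place. The Python sketch below illustrates that general technique only; it is not how the sourceware server works, and `write_md5_sum_atomically` is a hypothetical name.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def write_md5_sum_atomically(pkg_dir: Path) -> None:
    """Regenerate md5.sum without ever leaving it absent or half-written."""
    # Compute all checksums first, before any temporary file exists.
    lines = []
    for p in sorted(pkg_dir.iterdir()):
        if p.is_file() and p.name != "md5.sum":
            digest = hashlib.md5(p.read_bytes()).hexdigest()
            lines.append(f"{digest}  {p.name}\n")

    # Write to a temporary file in the same directory, then rename it over
    # the old md5.sum. os.replace is atomic on POSIX when both paths are
    # on the same filesystem, so readers see the old or the new file.
    fd, tmp_name = tempfile.mkstemp(dir=pkg_dir, prefix=".md5.sum.")
    try:
        with os.fdopen(fd, "w") as fh:
            fh.writelines(lines)
        os.replace(tmp_name, pkg_dir / "md5.sum")
    except BaseException:
        os.unlink(tmp_name)  # do not leave the temporary file behind
        raise
```

This only guarantees that md5.sum itself is never missing or truncated. A mirror that syncs between the upload of the package files and the regeneration of md5.sum would still pick up a mismatched pair.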

>
>>
>> (**)
>> $ grep FAILED [path-omitted]/x86/cygwin-archive-incomplete.txt | wc
>>       55     110    1463
>> $ grep "^setup.hint: FAILED" [path-omitted]/x86/cygwin-archive-incomplete.txt | wc
>>       28      56     560
>>
>> (***)
>> $ wc -l < [path-omitted]/x86/cygwin-archive-all-missing-files.txt
>> 406
>>
>>
>
> Regards
> Marco

Thanks for taking the time to explain, Marco.

regards,

luke


* Re: Observations about Cygwin's md5 checksums
  2014-07-08  3:12   ` Luke Kendall
@ 2014-07-08  9:02     ` Marco Atzeri
  0 siblings, 0 replies; 4+ messages in thread
From: Marco Atzeri @ 2014-07-08  9:02 UTC
  To: cygwin

On 08/07/2014 05:12, Luke Kendall wrote:
> On 07/07/14 20:05, Marco Atzeri wrote:
>> On 07/07/2014 06:38, Luke Kendall wrote:
>>
>> setup.ini is built from the setup.hint files of the individual
>> packages.  The setup.hint files have no further use outside the
>> www.cygwin.com server.
>
> Thanks for the explanation, Marco.  Can I check what you mean?
> I think you're saying that setup.hint has no function once setup.ini has
> been created.
> (You're not saying "setup.ini shouldn't be used for other purposes.")

I am saying that when we upload a new package we also upload the new
setup.hint.
On the main server (www.cygwin.com) setup.ini is updated using
this new data, by a program that runs every 10 or 20 minutes.
Sometimes we update only the setup.hint, when a "require" issue needs
to be solved.
The setup.hint files are not used for anything else on the main server.
Of course you can do what you want with them ;-) but the same
(and additional) information is inside setup.ini.

>> During upload of new files, the creation of md5.sum is out of sync
>> with the directory content.  md5.sum is updated once per hour.
>> If a mirror syncs before md5.sum is regenerated, it still has the
>> old version.
>>
>
> Does that mean that if the md5.sum file were created at the same time
> that the package contents were updated, there would be no possible way
> for out-of-sync md5.sum files to be provided?  Do you think the current
> process could be improved?
>
> I have observed that the errors in the mirrors persist for weeks or more.

The update of setup.ini and the update of md5.sum are done by two
different processes.
Please note that the server also hosts other software, not
only Cygwin (https://www.sourceware.org/).

> Anyway, by changing our checking process to use the information in
> setup.ini instead of the md5.sum files which can be wrong (for many
> weeks), we can bypass the incorrect md5.sum files.

I expect so, as the md5 in setup.ini is also used by setup-xxx.exe
to check for incorrect downloads.


Regards
Marco
