public inbox for cygwin@cygwin.com
From: "Wagemans, Peter" <peter.wagemans@kpn.com>
To: "cygwin@cygwin.com" <cygwin@cygwin.com>
Subject: RE: RCS file corruption.
Date: Thu, 21 Jun 2012 22:31:00 -0000
Message-ID: <B7EEF3066CD1D24081DBFF5E9F5E68D4C1A83162@EXCNLDCM06.europe.unity>
In-Reply-To: <20120621210043.M43274@ds.net>


Brian Wilson wrote:

> From what I've read in this discussion, I think the issue is that
> the '^M' characters may not be seen by RCS as an EOL.

The problem occurs in a loop that copies one character at a time to
move the entire content of the work file into the new RCS file as the
latest version. It unexpectedly gets EOF back from getc() after
exactly 65536 characters.

Using the Sysinternals tool procmon, one can see what the processes
are asking of the Windows OS. This was done for Cygwin rcs-5.8-1,
showing the ReadFile operations on the work file:

Time of Day     Process Name    Operation       Result          Detail
13:40:06.563    ci.exe          ReadFile        SUCCESS         Offset: 0, Length: 65,536, Priority: Normal
13:40:06.685    diff.exe        ReadFile        SUCCESS         Offset: 0, Length: 1,593,857
13:40:06.686    diff.exe        ReadFile        END OF FILE     Offset: 1,593,857, Length: 1
13:40:06.732    ci.exe          ReadFile        END OF FILE     Offset: 1,593,857, Length: 65,536

The RCS check-in tool ci.exe reads the start of the work file before
it starts diff.exe as a subprocess. The task of diff.exe is to figure
out the difference with the previous version of the file. To do this,
diff.exe reads the entire file from stdin; ci.exe supplies the file
descriptor of the work file to diff.exe.

After that, ci.exe wants to copy the content of the work file to the
RCS file. But once it has used up the first 64 kB (which it already has
in its buffer), ci.exe asks for the next 64 kB at offset 1,593,857,
which is the end of the file, instead of at offset 65,536. So it gets
an EOF. This truncates the content of the latest version in the RCS
file to 64 kB.

It is not clear to me why ci.exe tries to read the second 64kB at the
end of the work file. Perhaps some (library) code uses/sets an
incorrect file position; perhaps influenced by the subprocess diff.exe
reading the entire file?
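
One possible mechanism, offered only as a hypothesis and not confirmed
anywhere in this thread: a descriptor handed to a subprocess shares its
file offset with the parent, so a child that reads to EOF leaves the
parent's next read at EOF as well. The sketch below imitates that in a
single process, using os.dup() to stand in for the inherited descriptor.
It is Python rather than the RCS C code, and BUF_SIZE, FILE_SIZE and the
function names are illustrative assumptions taken from the trace above,
not from the RCS sources.

```python
import os
import tempfile

BUF_SIZE = 65536        # assumed buffer size, matching the 65,536-byte read in the trace
FILE_SIZE = 1_593_857   # work file size seen in the trace


def copy_with_shared_offset(path):
    """Return the bytes a buffered 'parent' reader manages to copy."""
    with open(path, "rb", buffering=BUF_SIZE) as parent:
        # The first read fills the parent's buffer with the first 64 kB.
        first = parent.read(1)

        # Stand-in for the subprocess: a duplicate descriptor shares the
        # same file offset as the parent's descriptor.
        child_fd = os.dup(parent.fileno())
        try:
            os.lseek(child_fd, 0, os.SEEK_SET)
            while os.read(child_fd, 1 << 20):
                pass  # read to EOF, as diff does with the whole file
        finally:
            os.close(child_fd)

        # The parent drains its buffer, then asks the OS for more data,
        # but the shared offset now sits at end of file.
        rest = parent.read()
    return first + rest


def demo():
    """Create a scratch work file and return how many bytes were copied."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"x" * FILE_SIZE)
        return len(copy_with_shared_offset(path))
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("copied", demo(), "of", FILE_SIZE, "bytes")
```

Under these assumptions the copy stops at the buffer boundary instead of
covering the whole file, which matches the symptom. Whether this is what
actually happens inside ci.exe on Cygwin would still need checking.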

A similar procmon trace for Cygwin rcs-5.7-11 shows that this
older rcs version chooses to create a memory map of the work file.
This other access method apparently avoids the problem.
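
As a rough illustration of why a memory map might be immune (an
assumption, not something the trace proves): a mapping is addressed by
position in the file, so it does not depend on the descriptor's current
file offset. The Python sketch below moves the offset to the end of the
file and then copies the content through the mapping anyway. The
function names and the 200,000-byte scratch file are made up for the
example.

```python
import mmap
import os
import tempfile


def copy_via_mmap(path):
    """Return the file content read through a read-only memory map."""
    with open(path, "rb") as work:
        with mmap.mmap(work.fileno(), 0, access=mmap.ACCESS_READ) as view:
            # Push the descriptor's offset to EOF, as a reader that
            # consumed the whole file would leave it.
            os.lseek(work.fileno(), 0, os.SEEK_END)
            # Slicing the map ignores the file offset entirely.
            return view[:]


def demo_mmap(size=200_000):
    """Create a scratch file of `size` bytes and return the copied length."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"y" * size)
        return len(copy_via_mmap(path))
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("copied", demo_mmap(), "bytes via mmap")
```

Here the full content comes back even though the offset is at EOF,
which is consistent with rcs-5.7-11 not showing the truncation.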

Regards,

Peter Wagemans



--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

