From: Ryan Johnson <ryan.johnson@cs.utoronto.ca>
To: cygwin@cygwin.com
Subject: Re: checking in >= 256k file fatally corrupts rcs file
Date: Wed, 09 Oct 2013 13:47:00 -0000
Message-ID: <52555E65.1070000@cs.utoronto.ca>
In-Reply-To: <525499E5.4090608@etr-usa.com>

On 08/10/2013 7:48 PM, Warren Young wrote:
> On 10/8/2013 04:22, Don Hatch wrote:
>>
>> Checking in a text file of size >= 256k
>> corrupts the rcs file, irretrievably losing most of the contents
>
> It's documented in the rcs NEWS file:
>
>     - Env var RCS_MEM_LIMIT controls stdio threshold.
>
>       For speed, RCS uses memory-based routines for files up to
>       256 kilobytes, and stream-based (stdio) routines otherwise.
>       You can change this threshold value by setting the environment
>       variable ‘RCS_MEM_LIMIT’ to a non-negative integer, measured in
>       kilobytes.  An empty ‘RCS_MEM_LIMIT’ value is silently ignored.
>
> So, use the new environment variable, or build up your huge diffs a 
> few steps at a time, so as to avoid spamming this buffer.
So in other words, a misguided performance optimization [1] that almost 
certainly has little measurable impact on performance [2] has introduced 
a silent data corruption bug (or tickled a latent one somewhere else). 
Lovely.
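
For anyone stuck with the affected version in the meantime, the 
workaround Warren describes could be sketched roughly as below. This is 
only an illustration: it assumes that raising RCS_MEM_LIMIT above the 
file size keeps the checkin off the code path that corrupts data (the 
thread only establishes that the breakage shows up at >= 256k), and the 
names rcs_env and checkin are made up for the example.

```python
import os
import subprocess

DEFAULT_LIMIT_KB = 256  # threshold quoted in the rcs NEWS entry


def rcs_env(file_size_bytes, base_env=None):
    """Return a copy of the environment, raising RCS_MEM_LIMIT if the
    file would otherwise reach the default 256 kB threshold."""
    env = dict(os.environ if base_env is None else base_env)
    # RCS_MEM_LIMIT is measured in kilobytes; round up past the file size.
    needed_kb = file_size_bytes // 1024 + 1
    if needed_kb > DEFAULT_LIMIT_KB:
        env["RCS_MEM_LIMIT"] = str(needed_kb)
    return env


def checkin(path, message):
    """Hypothetical wrapper: run `ci -l` with the adjusted environment.
    Assumption: a larger limit avoids the corrupting code path."""
    env = rcs_env(os.path.getsize(path))
    return subprocess.run(["ci", "-l", "-m" + message, path],
                          env=env, check=True)
```

Exporting the variable once in the shell would do the same job; the 
wrapper just makes the size check explicit. Either way, keep a copy of 
the ,v file until the result has been verified.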

The gcc devs have the right philosophy: features that break things badly 
get reverted immediately regardless of whose fault the bug is, and will 
be considered for re-inclusion once the bug has been fixed on the side. 
In this case, though, I'm not sure re-inclusion is even warranted.

[1] Modern filesystems and filesystem caching are pretty darn good at 
handling temporary files these days. Further, if you really care about 
using RAM to improve performance, 256kB is an absurdly low limit for a 
buffer size, and has been for most of the last decade.

[2] I'd be shocked if even 0.1% of checkins were large enough to have a 
noticeable latency in a modern system, and even more shocked if the 0.1% 
that are large enough to be slow were still small enough that 256kB of 
buffering made any difference in their runtime. Unless the code is 
calling fsync() after every newline or something, in which case that's 
what needs to be fixed.

$0.02
Ryan


