From: Tom Lord
To: gcc@gcc.gnu.org
Date: Tue, 10 Dec 2002 00:31:00 -0000
Message-Id: <200212100828.AAA17798@emf.net>
Subject: new batch of replies (D)

Replies in this message:

    Joseph S. Myers: There are Constraints on the Protocols We can use
    Joseph S. Myers: The Ability to Revert Patches is Important
    Joseph S. Myers: Sending Patches by Mail is Important
    Walter Landry: Some Slight Misstatements About Distributed Repositories
    Zack Weinberg: Server Security is Mission Critical

================================================================
Joseph S. Myers: There are Constraints on the Protocols We can use

    One problem there was with SVN - it may have been fixed by now,
    and a fix would be necessary for it to be usable for GCC - was
    its use of HTTP and HTTPS (for write access); these tend to be
    heavily controlled by firewalls, and the ability to tunnel over
    SSH (with just that one port needing to be open) would be
    necessary.  "Transparent" proxies may pass plain HTTP OK, but not
    the WebDAV/DeltaV extensions SVN needs.

I think Walter has mostly answered this.  I would like to add that
communication with an `arch' server consists of commands that are
analogous to a few FTP operations -- it's just a very minimal set of
filesystem primitives.  Support for new or additional protocols can
be added quickly (as Walter has demonstrated).
================================================================
Joseph S. Myers: The Ability to Revert Patches is Important

    A change set is applied.  It turns out to have problems, so it
    needs to be reverted - common enough.  Of course the version
    history and ChangeLog show both the original application and the
    reversion.  The reversion might in fact be of the original change
    set and a series of subsequent failed attempts at patching it up.
    But intermediate unrelated changes to the tree should not be
    backed out in the process.

This can be accomplished with the `delta-patch' command, which takes
the changes between any two revisions and applies them to your tree.
Once you have identified the changeset you want to back out: if it is
in revision N, you apply the changes from N to N-1 to your tree.

This is not currently recorded in the log as an official "reversion"
-- just as an ordinary change that happens to be a reversion.  That
is perhaps something that should be added prior to 1.0.

================================================================
Joseph S. Myers: Sending Patches by Mail is Important

    Patches by email (with distributed patch review by multiple
    people reading gcc-patches, including those who can't actually
    approve the patch) is the normal way GCC development works.
    Presume that most contributors will not want to deal with the
    security issues of making any local repository accessible to
    other machines, even if it's on a permanently connected machine
    and local firewalls or policy don't prevent this.  A patch for
    use with a better version control system would need to include
    some encoding for that system of renames / deletes / ... - but
    that needs to be just as human-readable as context diffs /
    unidiffs are.

It is a well-known bug in the design of the reference implementation
that changesets are not convenient for email use.  Fixing this is, in
fact, one of the most important tasks to complete before 1.0.
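To illustrate what an email-friendly changeset might look like, here
is a sketch.  The header syntax (`Renamed-file:`, `Removed-file:`)
is invented for illustration -- it is not the format that mkpatch and
dopatch actually emit -- but it shows how renames and deletes can be
encoded in plain text that is as readable as a unidiff:

```python
def format_changeset(renames, deletes, patches):
    """Render a changeset as one human-readable text block fit for
    mailing.  `renames` is a list of (old, new) path pairs, `deletes`
    a list of paths, `patches` a list of (path, diff-text) pairs.
    Hypothetical encoding, for illustration only."""
    out = []
    for old, new in renames:
        out.append("Renamed-file: %s => %s" % (old, new))
    for path in deletes:
        out.append("Removed-file: %s" % path)
    out.append("")                      # blank line before the diffs
    for path, diff in patches:
        out.append("--- %s" % path)
        out.append(diff.rstrip("\n"))
    return "\n".join(out) + "\n"
```

The whole changeset -- tree rearrangements included -- then travels
as ordinary message text that reviewers on gcc-patches can read
without any special tool.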
I am in favor of fixing this slowly and carefully: taking a few weeks
and a few people to design the new changeset format, because the
semantics of the format have many long-term, wide-ranging
implications.  You can see the notes I made when I left off on this
project on the fifthvision.net web site.  Recently, a volunteer has
prototyped a very email-friendly format and has an implementation of
`mkpatch' and `dopatch' up and crawling.

================================================================
Walter Landry: Some Slight Misstatements About Distributed Repositories

    So, for example, I can develop arch independently of whether Tom
    thinks that I am worthy enough to do so. :-)

We disagree a bit, but that's all.

    So you don't, in general, have a repository that is writeable by
    more than one person.

This is the practice that has emerged among early adopters, but
multiple writers are fully supported, and I would expect the GCC
mainline repository to (continue to) have multiple writers.  Even if
the procedures are slightly changed so that all the hard merging and
changeset vetting is done in other repositories before being moved to
the mainline, I'd predict that release-manager duties (moving in
those changes) will be shared.

================================================================
Zack Weinberg: Server Security is Mission Critical

    As Joseph pointed out, GCC development is and will be centered
    around a 'master' server.  If we wind up using a distributed
    system, individual developers will take advantage of it to do
    offline work, but the master repository will still act as a
    communication nexus between us all, and official releases will be
    cut from there.  I doubt anyone will do releases except from
    there.[1]  The security of this master server is mission-critical.
    The present situation, with CVS pserver enabled for read-only
    anonymous access, and write privilege available via CVS-over-ssh,
    has two potentially exploitable vulnerabilities that should be
    easy to address in a new system.

    _Imprimis_, the CVS pserver requires write privileges on the CVS
    repository directories, even if it is providing only read access.
    Therefore, if the 'anoncvs' user is somehow compromised -- for
    instance, by a buffer overflow bug in the pserver itself -- the
    attacker could potentially modify any of the ,v files stored in
    the repository.  [....]

You do not need a write-privileged server to provide read-only access
to an `arch' repository.

    _Secundus_, CVS-over-ssh operates by invoking 'cvs server' on the
    repository host -- running under the user ID of the invoker, who
    must have an account on the repository host.  [...]  It can't
    perform any operations that the invoking user can't.  Which means
    that the invoking user must also have OS-level write privileges
    on the repository.  Now, such users are _supposed_ to be able to
    check in changes to the repository, but they _aren't_ supposed to
    be able to modify the ,v files with a text editor.  The
    distinction is crucial.  If the account of a user with write
    privileges is compromised, and used to check in a malicious
    change, the version history is intact, the change will be easily
    detected, and we can simply back out the malice.  If the account
    of a user with write privileges is compromised and used to
    hand-edit a malicious change into a ,v file, it's quite possible
    that this will go undetected until after half the binaries on the
    planet are untrustworthy.  It is this latter scenario I would
    like to be impossible.

I think your security analysis is incorrect, for all revision control
systems.  The account used to check in changes must be able to write
the repository files.  Thus, if an attacker obtains the ability to
run arbitrary code as that user, the files may be compromised.  "Hand
editing" is immaterial here.
Still, `arch' does have a _theoretical_ advantage in this area.  Let
us assume that we have a security-enhanced OS in which we can place
very fine-grained restrictions on the set of system calls available
to particular users.  For example, we might forbid users from opening
files for writing except with the O_CREAT | O_EXCL flags, and we
might forbid them from removing files other than those whose names
match a particular regexp.  On such a system, and with those
particular restrictions, an `arch'-based server can indeed be
hardened against attacks that result in the execution of arbitrary
malicious code.  I don't know of systems that have such fine-grained
controls, but they are certainly doable -- and I wouldn't be too
surprised to hear of someone providing them.

    There are several possible ways to do that.  One way is the way
    Perforce does it: _all_ access, even local access, goes through
    p4d, and p4d can run under its own user ID and be the only user
    ID with write access to the repository.

Yes, but any compromise that can run arbitrary code under the p4d
user ID can corrupt the repository.

Another way, and perhaps a cleverer one, is OpenCM's way, where (the
SHA of) the file content is the file's identity, so a malicious
change will not even be picked up.  (Please correct me if I
misunderstand.)

    Of course, that provides no insulation against an attacker using
    a compromised account to execute "rm -fr /path/to/repository",
    but *that* problem is best solved with backups, because a disk
    failure could have the same effect and there's nothing software
    can do about that.

It also provides no help against an attacker whose arbitrary code is
careful to update the hash records.  Again, I think the fine-grained
access controls I described are your only help here, and if you find
some, you'll be pleased by the narrow and benign access needs of an
arch write-transaction server.

================================================================
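As an aside, the OpenCM-style scheme mentioned above is easy to
sketch.  This is a toy model of the general content-addressing idea,
not OpenCM's actual on-disk format: the hash of a file's bytes *is*
its name, so a hand-edited file no longer matches its own name and
the tampering is detected on every read:

```python
import hashlib

def store(repo, content):
    """Content-addressed store: name each blob by the SHA-1 of its
    bytes.  (Toy model of the idea, not OpenCM's real format.)"""
    name = hashlib.sha1(content).hexdigest()
    repo[name] = content
    return name

def fetch(repo, name):
    """Re-verify the hash on read; a hand-edited blob no longer
    matches its name, so silent tampering is caught here."""
    content = repo[name]
    if hashlib.sha1(content).hexdigest() != name:
        raise ValueError("repository file does not match its hash")
    return content
```

Note that this catches only in-place edits; as said above, it is no
help against arbitrary code that rewrites the hash records (that is,
re-stores the malicious content under its own correct name and
updates every reference to it).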