Date: Tue, 31 Dec 2019 03:09:00 -0000
From: "Eric S. Raymond"
To: Joseph Myers
Cc: Segher Boessenkool, "Richard Earnshaw (lists)", Maxim Kuvyrkov, GCC Development, Alexandre Oliva, Jeff Law, Mark Wielaard, Jakub Jelinek, frnchfrgg@free.fr
Subject: Re: Proposal for the transition timetable for the move to GIT
Message-ID: <20191231030948.GA112495@thyrsus.com>
Reply-To: esr@thyrsus.com

Joseph Myers:
> To me, that indicates that using a conversion tool that is conservative in
> its heuristics, and then selectively applying improvements to the extent
> they can be done safely with manual review in a reasonable time, is better
> than applying a conversion tool with more aggressive heuristics.

There's a more general point here, which I'm developing in my
book-in-progress.

Clean data-conversion problems can be done algorithmically without
a human in the loop. Messy data-conversion problems need judgment
amplifiers.

Maxim's scripts try to treat a messy conversion problem as though
it were a clean one. Maxim is pretty sharp, so this almost works.
Almost. But the failure mode is predictable - overinterpreting
badly-formed input leads to plausible garbage on output.

When this happens, it's the Goddess Eris's way of telling you that
there needs to be human judgment in the loop. Instead of trying to
automate it out, you should be building tools that partition the
process into things a computer does well, driven by choices a human
makes well.

This is a point that needs making because programmers thrown at messy
conversion problems tend to be more fixated on achieving full
automation than they perhaps ought to be.

Elsewhere I have written of Zeno tarpits:

http://esr.ibiblio.org/?p=6772

Subversion dump streams are not quite a Zeno tarpit - they actually
obey something that has the effect of a formal specification - but
ChangeLog parsing is.

> The issues with the reposurgeon conversion listed in Maxim's last comments
> were of the form "reposurgeon is being conservative in how it generates
> metadata from SVN information". I think that's a very good basis for
> adding on a limited set of safe improvements to authors and commit
> messages that can be done reasonably soon and then doing the final
> conversion with reposurgeon.

The flip side of this is that Joseph has been making intelligent and
realistic suggestions for how to improve reposurgeon. That is
*invaluable* - it captures knowledge that will make future comparisons
easier and better.

Software engineers (outside of a few AI specialists) don't ordinarily
think of themselves as being in the knowledge-capture business.
But it's a useful perspective to cultivate.
-- 
Eric S. Raymond