Date: Thu, 26 Dec 2019 12:10:00 -0000
From: Joseph Myers
To: Jakub Jelinek
cc: Alexandre Oliva, "Eric S. Raymond", Jeff Law, Segher Boessenkool, Mark Wielaard, Maxim Kuvyrkov, "Richard Earnshaw (lists)", gcc@gcc.gnu.org
Subject: Re: Proposal for the transition timetable for the move to GIT
In-Reply-To: <20191226111633.GJ10088@tucnak>
References: <20191216133632.GC3152@gate.crashing.org> <20191216135451.GA3142@thyrsus.com> <20191216140514.GD3152@gate.crashing.org> <20191216153649.GE3152@gate.crashing.org> <20191225120747.GA96669@thyrsus.com> <20191226111633.GJ10088@tucnak>

On Thu, 26 Dec 2019, Jakub Jelinek wrote:

> Is there some easy way (e.g. file in the conversion scripts) to correct
> spelling and other mistakes in the commit authors?

These can be corrected via reposurgeon commands in gcc.lift (see the
existing "// attribution =A set jwakely.gcc@gmail.com" command), or the
msgout/msgin mechanism used in Richard's script for commit message
improvements could also make changes to authors (don't know the exact
syntax offhand, but I believe authors are among the things that mechanism
allows to be changed in commit metadata, so the script could gain a table
of author corrections to apply).

> Or I see in git shortlog parts of date being parsed as name, e.g.
> (basically anything in git shortlog after the "..." wrapped names and before
> Aaron Conole (2): in alphabetical sorting, or after Zuxy Meng (4):.
> 00:27 -0700 Zack Weinberg (1):
> lsd.ic.unicamp.br), Jakub Jelinek (1):

Filed https://gitlab.com/esr/reposurgeon/issues/218 for these kinds of
ChangeLog entries - some changes to regular expressions should be able to
make the code handle them better (possibly by reverting to committer
identities in some more cases where the ChangeLog header line looks odd in
some way).

> Eric Botcazou (1):

I didn't include anything for this in my reduced test. I'd noted some of
the invalid attribution warnings from reposurgeon also involving bytes
0xA0 (= ISO-8859-1 NBSP). If anything is appropriate there, it might be
something like "change any 0xA0 that's preceded by an ASCII byte to ASCII
space before processing further" ("preceded by an ASCII byte" being needed
to avoid the case of 0xA0 in the middle of a UTF-8 character).

-- 
Joseph S. Myers
jsm@polyomino.org.uk
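
For illustration, the 0xA0 rule suggested above could be sketched roughly as follows. This is only a sketch in Python, not code from reposurgeon or the conversion scripts; the function name nbsp_to_space is made up for this example, and it assumes the check is made against the original preceding byte rather than an already-replaced one.

```python
def nbsp_to_space(data: bytes) -> bytes:
    """Replace 0xA0 with an ASCII space when the preceding byte is ASCII.

    Hypothetical helper, not part of reposurgeon. In valid UTF-8, 0xA0 can
    only appear as a continuation byte, so it always follows a byte >= 0x80.
    A 0xA0 that follows an ASCII byte (< 0x80) is therefore taken to be a
    stray ISO-8859-1 NBSP.
    """
    out = bytearray()
    prev = None  # original preceding byte; None at the start of the input
    for b in data:
        if b == 0xA0 and prev is not None and prev < 0x80:
            out.append(0x20)  # stray NBSP after ASCII: use a plain space
        else:
            out.append(b)     # everything else is copied unchanged
        prev = b
    return bytes(out)


# A Latin-1 NBSP between two ASCII words becomes a space.
print(nbsp_to_space(b"Eric\xa0Botcazou"))

# A UTF-8 character whose second byte is 0xA0 (U+00E0) is left alone.
print(nbsp_to_space("\u00e0".encode("utf-8")))
```

With this reading of the rule, a 0xA0 at the very start of the input, or one following a non-ASCII byte, is left untouched.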