public inbox for gcc@gcc.gnu.org
From: Roman Zhuykov <zhroma@ispras.ru>
To: Segher Boessenkool <segher@kernel.crashing.org>,
	Joseph Myers <jsm@polyomino.org.uk>
Cc: gcc@gcc.gnu.org, Alexander Monakov <amonakov@ispras.ru>,
	Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>,
	esr@thyrsus.com
Subject: Re: Test GCC conversion with reposurgeon available
Date: Wed, 25 Dec 2019 11:03:00 -0000
Message-ID: <279bf8dd-8725-c3fa-0def-130b3d128509@ispras.ru>
In-Reply-To: <20191224181444.GJ4505@gate.crashing.org>

First of all, thanks to everyone who spent time making the conversion
better and better. Here is my 2c: I have looked a little at my colleagues'
trunk history in Maxim's gcc-pretty vs gcc-reposurgeon-5b.

1) In gcc-pretty, timezone info is lost in both the author and committer
dates (the UTC time is kept correct, certainly). Examples are r278990 and
r289989. Probably git-svn causes this; the current read-only git mirror
also lacks timezone info. Not sure we need that info, but reposurgeon is
more correct here.
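
To make the point in 1) concrete: git stores each author/committer date
as epoch seconds plus a UTC offset, so two conversions can agree on the
instant and still differ in the recorded offset. Below is a minimal
sketch of that comparison. The helper name parse_git_date and the
timestamp values are made up for illustration; they are not taken from
either conversion.

```python
from datetime import datetime, timedelta, timezone


def parse_git_date(raw: str) -> datetime:
    """Parse git's raw date format '<epoch seconds> <+HHMM|-HHMM>'."""
    epoch, offset = raw.split()
    sign = -1 if offset[0] == "-" else 1
    hours, minutes = int(offset[1:3]), int(offset[3:5])
    tz = timezone(sign * timedelta(hours=hours, minutes=minutes))
    return datetime.fromtimestamp(int(epoch), tz)


# Hypothetical values: one commit as two conversions might record it.
with_tz = parse_git_date("1575374400 +0300")   # offset preserved
utc_only = parse_git_date("1575374400 +0000")  # offset dropped

# Aware datetimes compare by instant, so the UTC time still matches...
same_instant = with_tz == utc_only
# ...but the author's local offset is gone in the second form.
same_offset = with_tz.utcoffset() == utc_only.utcoffset()
print(same_instant, same_offset)
```

Feeding such a check the raw dates of the same revision from both
repositories (for example from "git log --format=%ad --date=raw") would
show whether only the offset differs.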

2) Some thoughts about the script for summarizing commit log messages:
2a) Why do r143753 and r150680 not have a "re PR..." summary instead of
"[multiple changes]"?
2b) On the contrary, r155892 has to mention two PRs; even "[multiple
changes]" would be better here, IMHO.
2c) In r130050 and r155902 we have "Rename too ... " in the summary; not
sure how to make it better.
2d) r146882 could have a better summary if we somehow organized ChangeLog
priority (gcc/ChangeLog is more important than the testsuite one).

3) About author emails, see below
24.12.2019 21:14, Segher Boessenkool wrote:
> On Tue, Dec 24, 2019 at 05:16:54PM +0000, Joseph Myers wrote:
>> On Tue, 24 Dec 2019, Segher Boessenkool wrote:
>>>> That's because that commit also edits ChangeLog entries from other
>>>> authors.  When a commit adds / edits ChangeLog entries for more than one
>>>> author (the difference between purely editing an existing entry and adding
>>>> a new one, possibly under an existing date/author header, for a
>>>> multi-author commit, is not something that can reliably be determined
>>>> automatically), the conversion falls back to using the committer identity
>>>> instead of picking one of the multiple relevant authors from the ChangeLog
>>>> files.
>>> There is only one relevant author in r270511.  It edits a few wrong path
>>> names in the previous changelog entries.  People often do similar things
>>> (like fixing the commit date :-) )
>> Distinguishing "edits a previous ChangeLog entry" from "adds a new entry
>> under a previous ChangeLog header for a change included in the commit" is
>> a human judgement.
> We are doing only one conversion here, the one of the GCC repo.  The
> heuristic works, we checked it did.
>
>>> Either never use <account>@gcc.gnu.org, or always use it, don't do the
>>> worst of both worlds?
>> The heuristics here are to use an attribution from ChangeLog for the
>> author where unambiguous, but to use the committer (always @gcc.gnu.org /
>> @gnu.org [*], so avoiding attributions at the wrong company even where
>> people were using multiple addresses simultaneously for different changes)
>> as author if in doubt.
> You never need that, and it is worse to use two different schemes than to
> choose either.
>
> I would have chosen the "<account>@gcc.gnu.org" scheme, because it is
> simple and *correct*.  Other people wanted the nicer names.  Maxim's
> conversion gets that correct.  Please copy it.
>
IMHO Segher is a bit categorical in the discussion, but I'll be glad to
see a brief description of Maxim's approach to managing emails; gcc-pretty
shows better results.
Speaking about the script that collects authors from ChangeLog files: even
if we set aside the "edits a previous ChangeLog entry" issue, it still
sometimes does not work as Joseph described:
3a) In r155892, r155893 and r259314 Alex is not counted as the only
author, for no apparent reason.
3b) In r139854, r141108 and r196252 the script successfully selected an
author, while there is actually more than one.
3c) Maybe here we can also somehow organize ChangeLog priority (again,
gcc/ChangeLog is more important than the testsuite one). There are a lot
of examples where the testsuite/ChangeLog entry has another author:
r145055, r150680, r155889, r155894, r155890, r163904, r180186 and r183325.
3d) If we fix 3b+3c we can also look at r143753, r155890 and r155895.
3e) r155891, r207422, r183627 and r234218 are examples of commits which
don't touch any ChangeLog files, for different reasons. This seems
unsolvable with the current approach.
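
To illustrate what I mean by ChangeLog priority in 3c, here is a rough
sketch. It is not the conversion script itself: the function names, the
ranking rule and the addresses are all my assumptions. The idea is to
consider only the highest-priority ChangeLog files a commit touches, use
their author if exactly one is named, and otherwise fall back to the
committer as Joseph described.

```python
from typing import Dict, Set


def changelog_rank(path: str) -> int:
    """Lower rank means higher priority (assumed ordering, illustration only)."""
    return 1 if "testsuite/" in path else 0


def pick_author(changelog_authors: Dict[str, Set[str]], committer: str) -> str:
    """Choose an author for one commit.

    changelog_authors maps each ChangeLog path touched by the commit to
    the set of authors named in the entries it adds.
    """
    if not changelog_authors:
        # No ChangeLog touched at all (case 3e): nothing to go on.
        return committer
    best = min(changelog_rank(path) for path in changelog_authors)
    candidates: Set[str] = set()
    for path, authors in changelog_authors.items():
        if changelog_rank(path) == best:
            candidates |= authors
    if len(candidates) == 1:
        return next(iter(candidates))
    # Still ambiguous: fall back to the committer identity.
    return committer


# Hypothetical commit: the testsuite entry names a different author.
touched = {
    "gcc/ChangeLog": {"alice@example.org"},
    "gcc/testsuite/ChangeLog": {"bob@example.org"},
}
print(pick_author(touched, "committer@gcc.gnu.org"))
```

With this ranking the testsuite author no longer makes the commit
ambiguous, while a commit with two authors in gcc/ChangeLog itself still
falls back to the committer. The same ranking might also help to choose
the summary line in 2d.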

--
Roman
