public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
       [not found] <4c1114a7-2377-99e4-d451-1a086857e991@linux.ibm.com>
@ 2021-06-10  5:22 ` Xionghu Luo
  2021-06-10  6:17   ` Martin Liška
  2021-06-10  9:41   ` git gcc-commit-mklog doesn't extract PR number to ChangeLog Jonathan Wakely
  0 siblings, 2 replies; 84+ messages in thread
From: Xionghu Luo @ 2021-06-10  5:22 UTC (permalink / raw)
  To: gcc

Sorry, should be sent to this mail-list.


On 2021/6/10 11:31, Xionghu Luo via Gcc-patches wrote:
> Hi,
> I noticed that the "git gcc-commit-mklog" command doesn't extract PR
> number from title to ChangeLog automatically, then the committed patch
> doesn't update the related bugzilla PR website after check in the patch?
> Martin, what's your opinion about this since you are much familar about
> this? Thanks.
> 
> 

-- 
Thanks,
Xionghu

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10  5:22 ` git gcc-commit-mklog doesn't extract PR number to ChangeLog Xionghu Luo
@ 2021-06-10  6:17   ` Martin Liška
  2021-06-10  6:25     ` Xionghu Luo
  2021-06-10  6:35     ` Tobias Burnus
  2021-06-10  9:41   ` git gcc-commit-mklog doesn't extract PR number to ChangeLog Jonathan Wakely
  1 sibling, 2 replies; 84+ messages in thread
From: Martin Liška @ 2021-06-10  6:17 UTC (permalink / raw)
  To: Xionghu Luo, gcc

On 6/10/21 7:22 AM, Xionghu Luo wrote:
> Sorry, should be sent to this mail-list.
> 
> 
> On 2021/6/10 11:31, Xionghu Luo via Gcc-patches wrote:
>> Hi,
>> I noticed that the "git gcc-commit-mklog" command doesn't extract PR
>> number from title to ChangeLog automatically, then the committed patch
>> doesn't update the related bugzilla PR website after check in the patch?

Hello.

Yes, we currently only support automatic extraction from comments from test-cases.
How does your commit title look like? Note that we require bugzilla components
being part of PR entry, which is not commonly used in git titles due to length
limitation.

>> Martin, what's your opinion about this since you are much familar about
>> this? Thanks.

Please provide few examples of existing commits?

Thanks,
Martin

>>
>>
> 


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10  6:17   ` Martin Liška
@ 2021-06-10  6:25     ` Xionghu Luo
  2021-06-10  8:07       ` Martin Liška
  2021-06-10  6:35     ` Tobias Burnus
  1 sibling, 1 reply; 84+ messages in thread
From: Xionghu Luo @ 2021-06-10  6:25 UTC (permalink / raw)
  To: Martin Liška, gcc



On 2021/6/10 14:17, Martin Liška wrote:
> On 6/10/21 7:22 AM, Xionghu Luo wrote:
>> Sorry, should be sent to this mail-list.
>>
>>
>> On 2021/6/10 11:31, Xionghu Luo via Gcc-patches wrote:
>>> Hi,
>>> I noticed that the "git gcc-commit-mklog" command doesn't extract PR
>>> number from title to ChangeLog automatically, then the committed patch
>>> doesn't update the related bugzilla PR website after check in the patch?
> 
> Hello.
> 
> Yes, we currently only support automatic extraction from comments from 
> test-cases.
> How does your commit title look like? Note that we require bugzilla 
> components
> being part of PR entry, which is not commonly used in git titles due to 
> length
> limitation.
> 
>>> Martin, what's your opinion about this since you are much familar about
>>> this? Thanks.
> 
> Please provide few examples of existing commits?

Thanks. For example:

rs6000: Support doubleword swaps removal in rot64 load store [PR100085]

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f700e4b0ee3ef53b48975cf89be26b9177e3a3f3


> 
> Thanks,
> Martin
> 
>>>
>>>
>>
> 

-- 
Thanks,
Xionghu

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10  6:17   ` Martin Liška
  2021-06-10  6:25     ` Xionghu Luo
@ 2021-06-10  6:35     ` Tobias Burnus
  2021-06-10  8:07       ` Martin Liška
  1 sibling, 1 reply; 84+ messages in thread
From: Tobias Burnus @ 2021-06-10  6:35 UTC (permalink / raw)
  To: Martin Liška, Xionghu Luo, gcc

On 10.06.21 08:17, Martin Liška wrote:

> Yes, we currently only support automatic extraction from comments from
> test-cases.
> How does your commit title look like? Note that we require bugzilla
> components
> being part of PR entry, which is not commonly used in git titles due
> to length
> limitation.

One options would be to require a 'PR <comp>/<nnnnn+>' line if there is
'PRnnnnn+' in the commit title, rejecting the commit otherwise.

Tobias

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10  6:25     ` Xionghu Luo
@ 2021-06-10  8:07       ` Martin Liška
  0 siblings, 0 replies; 84+ messages in thread
From: Martin Liška @ 2021-06-10  8:07 UTC (permalink / raw)
  To: Xionghu Luo, gcc

On 6/10/21 8:25 AM, Xionghu Luo wrote:
> 
> 
> On 2021/6/10 14:17, Martin Liška wrote:
>> On 6/10/21 7:22 AM, Xionghu Luo wrote:
>>> Sorry, should be sent to this mail-list.
>>>
>>>
>>> On 2021/6/10 11:31, Xionghu Luo via Gcc-patches wrote:
>>>> Hi,
>>>> I noticed that the "git gcc-commit-mklog" command doesn't extract PR
>>>> number from title to ChangeLog automatically, then the committed patch
>>>> doesn't update the related bugzilla PR website after check in the patch?
>>
>> Hello.
>>
>> Yes, we currently only support automatic extraction from comments from
>> test-cases.
>> How does your commit title look like? Note that we require bugzilla
>> components
>> being part of PR entry, which is not commonly used in git titles due to
>> length
>> limitation.
>>
>>>> Martin, what's your opinion about this since you are much familar about
>>>> this? Thanks.
>>
>> Please provide few examples of existing commits?
> 
> Thanks. For example:
> 
> rs6000: Support doubleword swaps removal in rot64 load store [PR100085]
> 
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f700e4b0ee3ef53b48975cf89be26b9177e3a3f3

I see. Well, note that gcc-commit-mklog is calling ./contrib/mklog.py when you're
going to write a commit message.

How can the script get [PRnnnn] from a non-existing title. Please show me your
complete workflow.

Thanks,
Martin

> 
> 
>>
>> Thanks,
>> Martin
>>
>>>>
>>>>
>>>
>>
> 


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10  6:35     ` Tobias Burnus
@ 2021-06-10  8:07       ` Martin Liška
  2021-06-10  9:44         ` Jonathan Wakely
  2021-06-10 11:51         ` [Patch] contrig/gcc-changelog: Check that PR in subject in in changelog (was:: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Tobias Burnus
  0 siblings, 2 replies; 84+ messages in thread
From: Martin Liška @ 2021-06-10  8:07 UTC (permalink / raw)
  To: Tobias Burnus, Xionghu Luo, gcc

On 6/10/21 8:35 AM, Tobias Burnus wrote:
> On 10.06.21 08:17, Martin Liška wrote:
> 
>> Yes, we currently only support automatic extraction from comments from
>> test-cases.
>> How does your commit title look like? Note that we require bugzilla
>> components
>> being part of PR entry, which is not commonly used in git titles due
>> to length
>> limitation.
> 
> One options would be to require a 'PR <comp>/<nnnnn+>' line if there is
> 'PRnnnnn+' in the commit title, rejecting the commit otherwise.

Quite interesting idea! Are you willing to prepare a patch for it?

Thanks,
Martin

> 
> Tobias
> 
> -----------------
> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10  5:22 ` git gcc-commit-mklog doesn't extract PR number to ChangeLog Xionghu Luo
  2021-06-10  6:17   ` Martin Liška
@ 2021-06-10  9:41   ` Jonathan Wakely
  1 sibling, 0 replies; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-10  9:41 UTC (permalink / raw)
  To: luoxhu; +Cc: gcc

On 2021/6/10 11:31, Xionghu Luo via Gcc-patches wrote:
> Hi,
> I noticed that the "git gcc-commit-mklog" command doesn't extract PR
> number from title to ChangeLog automatically, then the committed patch
> doesn't update the related bugzilla PR website after check in the patch?

Well then you shouldn't have checked it in like that :-)

The docs at https://gcc.gnu.org/contribute.html#patches are clear that
when you put a PR number in the email subject (which should also be
the summary line of the git commit msg) ...

"The body of the commit message should still contain the full form (PR
<component>/nnnnn) within the body of the commit message so that
Bugzilla will correctly notice the commit."



^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10  8:07       ` Martin Liška
@ 2021-06-10  9:44         ` Jonathan Wakely
  2021-06-10 10:01           ` Jonathan Wakely
  2021-06-10 11:51         ` [Patch] contrig/gcc-changelog: Check that PR in subject in in changelog (was:: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Tobias Burnus
  1 sibling, 1 reply; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-10  9:44 UTC (permalink / raw)
  To: mliska; +Cc: gcc

[-- Attachment #1: Type: text/plain, Size: 84 bytes --]

> Quite interesting idea! Are you willing to prepare a patch for it?

This works.



[-- Attachment #2: patch.txt --]
[-- Type: text/x-patch, Size: 1805 bytes --]

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index bd8c1ff7af2..58aad8b7f26 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -157,6 +157,7 @@ author_line_regex = \
 additional_author_regex = re.compile(r'^\t(?P<spaces>\ *)?(?P<name>.*  <.*>)')
 changelog_regex = re.compile(r'^(?:[fF]or +)?([a-z0-9+-/]*)ChangeLog:?')
 pr_regex = re.compile(r'\tPR (?P<component>[a-z+-]+\/)?([0-9]+)$')
+title_pr_regex = re.compile(r' \[PR ?[0-9]+]$')
 dr_regex = re.compile(r'\tDR ([0-9]+)$')
 star_prefix_regex = re.compile(r'\t\*(?P<spaces>\ *)(?P<content>.*)')
 end_of_location_regex = re.compile(r'[\[<(:]')
@@ -343,6 +344,7 @@ class GitCommit:
             self.check_for_broken_parentheses()
             self.deduce_changelog_locations()
             self.check_file_patterns()
+            self.check_pr_consistent()
             if not self.errors:
                 self.check_mentioned_files()
                 self.check_for_correct_changelog()
@@ -568,6 +570,17 @@ class GitCommit:
                     msg = 'unsupported wildcard prefix'
                     self.errors.append(Error(msg, name))
 
+    def check_pr_consistent(self):
+        """Check that a 'PR component/nnn' line is present if the first
+           line ends with a bug number like [PRnnn] or [PR nnn]."""
+        m = title_pr_regex.search(self.info.lines[0])
+        if m:
+            for line in self.changes:
+                if pr_regex.match(line):
+                    return
+            msg = 'summary contains "%s" but no PR in changelog'
+            self.errors.append(Error(msg % m.group()))
+
     def check_for_empty_description(self):
         for entry in self.changelog_entries:
             for i, line in enumerate(entry.lines):

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10  9:44         ` Jonathan Wakely
@ 2021-06-10 10:01           ` Jonathan Wakely
  2021-06-10 10:08             ` Jakub Jelinek
  0 siblings, 1 reply; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-10 10:01 UTC (permalink / raw)
  To: mliska; +Cc: gcc

[-- Attachment #1: Type: text/plain, Size: 292 bytes --]

On 10/06/21 10:44 +0100, Jonathan Wakely wrote:
>> Quite interesting idea! Are you willing to prepare a patch for it?
>
>This works.

And this works better, because it checks the PR in the title matches
one in the changelog.

I'll get something added to the tests and prep this for commit.



[-- Attachment #2: patch.txt --]
[-- Type: text/x-patch, Size: 1912 bytes --]

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index bd8c1ff7af2..15b602595c4 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -157,6 +157,7 @@ author_line_regex = \
 additional_author_regex = re.compile(r'^\t(?P<spaces>\ *)?(?P<name>.*  <.*>)')
 changelog_regex = re.compile(r'^(?:[fF]or +)?([a-z0-9+-/]*)ChangeLog:?')
 pr_regex = re.compile(r'\tPR (?P<component>[a-z+-]+\/)?([0-9]+)$')
+title_pr_regex = re.compile(r' \[PR ?([0-9]+)]$')
 dr_regex = re.compile(r'\tDR ([0-9]+)$')
 star_prefix_regex = re.compile(r'\t\*(?P<spaces>\ *)(?P<content>.*)')
 end_of_location_regex = re.compile(r'[\[<(:]')
@@ -343,6 +344,7 @@ class GitCommit:
             self.check_for_broken_parentheses()
             self.deduce_changelog_locations()
             self.check_file_patterns()
+            self.check_pr_consistent()
             if not self.errors:
                 self.check_mentioned_files()
                 self.check_for_correct_changelog()
@@ -568,6 +570,18 @@ class GitCommit:
                     msg = 'unsupported wildcard prefix'
                     self.errors.append(Error(msg, name))
 
+    def check_pr_consistent(self):
+        """Check that a 'PR component/nnn' line is present if the first
+           line ends with a bug number like [PRnnn] or [PR nnn]."""
+        title_pr = title_pr_regex.search(self.info.lines[0])
+        if title_pr:
+            for line in self.changes:
+                body_pr = pr_regex.match(line)
+                if body_pr and body_pr.group(2) == title_pr.group(1):
+                    return
+            msg = 'changelog does not contain the PR in the summary:%s'
+            self.errors.append(Error(msg % title_pr.group()))
+
     def check_for_empty_description(self):
         for entry in self.changelog_entries:
             for i, line in enumerate(entry.lines):

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 10:01           ` Jonathan Wakely
@ 2021-06-10 10:08             ` Jakub Jelinek
  2021-06-10 10:40               ` Jonathan Wakely
  0 siblings, 1 reply; 84+ messages in thread
From: Jakub Jelinek @ 2021-06-10 10:08 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: mliska, gcc

On Thu, Jun 10, 2021 at 11:01:49AM +0100, Jonathan Wakely via Gcc wrote:
> On 10/06/21 10:44 +0100, Jonathan Wakely wrote:
> > > Quite interesting idea! Are you willing to prepare a patch for it?
> > 
> > This works.
> 
> And this works better, because it checks the PR in the title matches
> one in the changelog.
> 
> I'll get something added to the tests and prep this for commit.

Note, some commits fix more than one PR.  Sometimes the subject lists
just one of them and the ChangeLog several, at other times people mention
[PRnnnnnn, PRnnnnnn] etc. in the subject.
I think checking that at least one changeLog PR line matches at least one PR
in the subject would be good enough.

Your regex will not match [PR123456, PR123457] in subject, perhaps ok
initially, and if I read it will will be happy if at least one line matches
it.

> diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
> index bd8c1ff7af2..15b602595c4 100755
> --- a/contrib/gcc-changelog/git_commit.py
> +++ b/contrib/gcc-changelog/git_commit.py
> @@ -157,6 +157,7 @@ author_line_regex = \
>  additional_author_regex = re.compile(r'^\t(?P<spaces>\ *)?(?P<name>.*  <.*>)')
>  changelog_regex = re.compile(r'^(?:[fF]or +)?([a-z0-9+-/]*)ChangeLog:?')
>  pr_regex = re.compile(r'\tPR (?P<component>[a-z+-]+\/)?([0-9]+)$')
> +title_pr_regex = re.compile(r' \[PR ?([0-9]+)]$')
>  dr_regex = re.compile(r'\tDR ([0-9]+)$')
>  star_prefix_regex = re.compile(r'\t\*(?P<spaces>\ *)(?P<content>.*)')
>  end_of_location_regex = re.compile(r'[\[<(:]')
> @@ -343,6 +344,7 @@ class GitCommit:
>              self.check_for_broken_parentheses()
>              self.deduce_changelog_locations()
>              self.check_file_patterns()
> +            self.check_pr_consistent()
>              if not self.errors:
>                  self.check_mentioned_files()
>                  self.check_for_correct_changelog()
> @@ -568,6 +570,18 @@ class GitCommit:
>                      msg = 'unsupported wildcard prefix'
>                      self.errors.append(Error(msg, name))
>  
> +    def check_pr_consistent(self):
> +        """Check that a 'PR component/nnn' line is present if the first
> +           line ends with a bug number like [PRnnn] or [PR nnn]."""
> +        title_pr = title_pr_regex.search(self.info.lines[0])
> +        if title_pr:
> +            for line in self.changes:
> +                body_pr = pr_regex.match(line)
> +                if body_pr and body_pr.group(2) == title_pr.group(1):
> +                    return
> +            msg = 'changelog does not contain the PR in the summary:%s'
> +            self.errors.append(Error(msg % title_pr.group()))
> +
>      def check_for_empty_description(self):
>          for entry in self.changelog_entries:
>              for i, line in enumerate(entry.lines):


	Jakub


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 10:08             ` Jakub Jelinek
@ 2021-06-10 10:40               ` Jonathan Wakely
  2021-06-10 14:55                 ` Martin Sebor
  0 siblings, 1 reply; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-10 10:40 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: mliska, gcc

On Thu, 10 Jun 2021 at 11:08, Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Thu, Jun 10, 2021 at 11:01:49AM +0100, Jonathan Wakely via Gcc wrote:
> > On 10/06/21 10:44 +0100, Jonathan Wakely wrote:
> > > > Quite interesting idea! Are you willing to prepare a patch for it?
> > >
> > > This works.
> >
> > And this works better, because it checks the PR in the title matches
> > one in the changelog.
> >
> > I'll get something added to the tests and prep this for commit.
>
> Note, some commits fix more than one PR.  Sometimes the subject lists
> just one of them and the ChangeLog several, at other times people mention
> [PRnnnnnn, PRnnnnnn] etc. in the subject.
> I think checking that at least one changeLog PR line matches at least one PR
> in the subject would be good enough.
>
> Your regex will not match [PR123456, PR123457] in subject, perhaps ok

Yeah, that wouldn't get matched, so no checks would be done for the
changelog body. Not ideal, but better than what we have no where
nothing is checked at all.

> initially, and if I read it will will be happy if at least one line matches
> it.

Yes, if the summary line has a single PR number, it must be present in
the changelog body. Other PR numbers can also be in the body, and they
aren't checked.

But I've hit an issue trying to test it, because the testcases in
contrib/gcc-changelog/test_patches.txt are in the form of emails, and
the Subject: line from the emails is not passed to the GitInfo
constructor, so isn't part of the message that gets checked.

Martin, Shouldn't the GitEmail class extract the Subject: from the
email header and use that as the first line passed to the GitInfo
object?


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [Patch] contrig/gcc-changelog: Check that PR in subject in in changelog (was:: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-10  8:07       ` Martin Liška
  2021-06-10  9:44         ` Jonathan Wakely
@ 2021-06-10 11:51         ` Tobias Burnus
  2021-06-10 11:54           ` [Patch] contrig/gcc-changelog: Check that PR in subject in in changelog Florian Weimer
  2021-06-10 12:45           ` Jonathan Wakely
  1 sibling, 2 replies; 84+ messages in thread
From: Tobias Burnus @ 2021-06-10 11:51 UTC (permalink / raw)
  To: Martin Liška, Xionghu Luo, gcc

[-- Attachment #1: Type: text/plain, Size: 779 bytes --]

On 10.06.21 10:07, Martin Liška wrote:
> On 6/10/21 8:35 AM, Tobias Burnus wrote:
>> One options would be to require a 'PR <comp>/<nnnnn+>' line if there is
>> 'PRnnnnn+' in the commit title, rejecting the commit otherwise.
>
> Quite interesting idea! Are you willing to prepare a patch for it?

Done.

I was thinking of 'PRs 1234 and 4566' but it soon became too special and
I did not want to reject valid cases (false positive), hence, I settled
on the attached variant. (It should detect the most common issues, thus,
I don't worry about false negatives.)

OK? Comments?

Tobias

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

[-- Attachment #2: gcc-changelog.diff --]
[-- Type: text/x-patch, Size: 10962 bytes --]

contrig/gcc-changelog: Check that PR in subject in in changelog

This patch checks that a '[PRnnnn]' and '(PRnnnn)' also appears as PR in the
changelog part of the commit message.  And it does likewise for 'PR comp/nnnn'
except that then also the component name is checked.  (Note that the reverse
is permitted, i.e. PR(s) only appearing in the changelog.)
To avoid false positives, PR numbers in the subject line are ignored,
if 'revert' appears.

contrib/ChangeLog:

	* gcc-changelog/git_commit.py (pr_regex): Add ?P<pr> for group('pr').
	(subject_pr_skip_regex, subject_pr_regex, subject_pr2_regex): New.
	(GitInfo.__init__, GitCommit.parse_changelog): Check subject PRs.
	* gcc-changelog/git_email.py (SUBJECT_PREFIX): New.
	(GitEmail.__init__): Parse 'Subject:' and pass it to GitInfo.
	* gcc-changelog/git_repository.py (parse_git_revisions): Pass first
	line to GitInfo as subject line.
	* gcc-changelog/test_email.py (test_pr_only_in_subject,
	test_wrong_pr_comp_in_subject): New.
	* gcc-changelog/test_patches.txt (0030-PR-c-92746, pr-check1.patch):
	Update to avoid triggering the new check.
	(0001-rs6000-Support-doubleword, pr-wrong-comp.patch): New.

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index bd8c1ff7af2..5513e1d32e4 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -156,7 +156,11 @@ author_line_regex = \
         re.compile(r'^(?P<datetime>\d{4}-\d{2}-\d{2})\ {2}(?P<name>.*  <.*>)')
 additional_author_regex = re.compile(r'^\t(?P<spaces>\ *)?(?P<name>.*  <.*>)')
 changelog_regex = re.compile(r'^(?:[fF]or +)?([a-z0-9+-/]*)ChangeLog:?')
-pr_regex = re.compile(r'\tPR (?P<component>[a-z+-]+\/)?([0-9]+)$')
+subject_pr_skip_regex = re.compile(r'[Rr]evert')
+subject_pr_regex = \
+        re.compile(r'(^|\W)PR\s+(?P<component>[a-zA-Z+-]+)/(?P<pr>\d{4,7})')
+subject_pr2_regex = re.compile(r'[(\[]PR\s*(?P<pr>\d{4,7})[)\]]')
+pr_regex = re.compile(r'\tPR (?P<component>[a-z+-]+\/)?(?P<pr>[0-9]+)$')
 dr_regex = re.compile(r'\tDR ([0-9]+)$')
 star_prefix_regex = re.compile(r'\t\*(?P<spaces>\ *)(?P<content>.*)')
 end_of_location_regex = re.compile(r'[\[<(:]')
@@ -279,10 +283,11 @@ class ChangeLogEntry:
 
 
 class GitInfo:
-    def __init__(self, hexsha, date, author, lines, modified_files):
+    def __init__(self, hexsha, date, author, subject, lines, modified_files):
         self.hexsha = hexsha
         self.date = date
         self.author = author
+        self.subject = subject
         self.lines = lines
         self.modified_files = modified_files
 
@@ -298,6 +303,7 @@ class GitCommit:
         self.top_level_authors = []
         self.co_authors = []
         self.top_level_prs = []
+        self.subject_prs = set()
         self.cherry_pick_commit = None
         self.revert_commit = None
         self.commit_to_info_hook = commit_to_info_hook
@@ -307,6 +313,18 @@ class GitCommit:
         if self.info.lines and self.info.lines[0] == 'Update copyright years.':
             return
 
+        # Extract PR numbers form the subject line
+        # Match either [PRnnnn] / (PRnnnn) or PR component/nnnn
+        if not subject_pr_skip_regex.search(info.subject):
+            self.subject_prs = \
+                 set([m.group('pr')
+                      for m in subject_pr2_regex.finditer(info.subject)])
+            for m in subject_pr_regex.finditer(info.subject):
+                if not m.group('component') in bug_components:
+                    self.errors.append(Error('invalid PR component in subject',
+                                             info.subject))
+                self.subject_prs.add(m.group('pr'))
+
         # Identify first if the commit is a Revert commit
         for line in self.info.lines:
             m = revert_regex.match(line)
@@ -346,6 +364,10 @@ class GitCommit:
             if not self.errors:
                 self.check_mentioned_files()
                 self.check_for_correct_changelog()
+        if self.subject_prs:
+            self.errors.append(Error("PR %s in subject but not in changelog" %
+                                     ", ".join(self.subject_prs),
+                                     self.info.subject))
 
     @property
     def success(self):
@@ -460,7 +482,8 @@ class GitCommit:
                     else:
                         author_tuple = (m.group('name'), None)
                 elif pr_regex.match(line):
-                    component = pr_regex.match(line).group('component')
+                    m = pr_regex.match(line)
+                    component = m.group('component')
                     if not component:
                         self.errors.append(Error('missing PR component', line))
                         continue
@@ -469,6 +492,8 @@ class GitCommit:
                         continue
                     else:
                         pr_line = line.lstrip()
+                    if m.group('pr') in self.subject_prs:
+                        self.subject_prs.remove(m.group('pr'))
                 elif dr_regex.match(line):
                     pr_line = line.lstrip()
 
diff --git a/contrib/gcc-changelog/git_email.py b/contrib/gcc-changelog/git_email.py
index fa62e3ad2f7..0f6460c3f0d 100755
--- a/contrib/gcc-changelog/git_email.py
+++ b/contrib/gcc-changelog/git_email.py
@@ -28,6 +28,7 @@ from unidiff import PatchSet, PatchedFile
 
 DATE_PREFIX = 'Date: '
 FROM_PREFIX = 'From: '
+SUBJECT_PREFIX = 'Subject: '
 unidiff_supports_renaming = hasattr(PatchedFile(), 'is_rename')
 
 
@@ -37,7 +38,9 @@ class GitEmail(GitCommit):
         diff = PatchSet.from_filename(filename)
         date = None
         author = None
+        subject = None
 
+        subject_last = False
         with open(self.filename, 'r') as f:
             lines = f.read().splitlines()
         lines = list(takewhile(lambda line: line != '---', lines))
@@ -46,6 +49,15 @@ class GitEmail(GitCommit):
                 date = parse(line[len(DATE_PREFIX):])
             elif line.startswith(FROM_PREFIX):
                 author = GitCommit.format_git_author(line[len(FROM_PREFIX):])
+            elif line.startswith(SUBJECT_PREFIX):
+                subject = line[len(SUBJECT_PREFIX):]
+                subject_last = True
+            elif subject_last and line.startswith(' '):
+                subject += line
+            elif line == '':
+                break
+            else:
+                subject_last = False
         header = list(takewhile(lambda line: line != '', lines))
         body = lines[len(header) + 1:]
 
@@ -67,7 +79,7 @@ class GitEmail(GitCommit):
             else:
                 t = 'M'
             modified_files.append((target if t != 'D' else source, t))
-        git_info = GitInfo(None, date, author, body, modified_files)
+        git_info = GitInfo(None, date, author, subject, body, modified_files)
         super().__init__(git_info,
                          commit_to_info_hook=lambda x: None)
 
diff --git a/contrib/gcc-changelog/git_repository.py b/contrib/gcc-changelog/git_repository.py
index 2d688826ff8..d2aed8fa14f 100755
--- a/contrib/gcc-changelog/git_repository.py
+++ b/contrib/gcc-changelog/git_repository.py
@@ -59,8 +59,9 @@ def parse_git_revisions(repo_path, revisions, ref_name=None):
 
             date = datetime.utcfromtimestamp(c.committed_date)
             author = '%s  <%s>' % (c.author.name, c.author.email)
-            git_info = GitInfo(c.hexsha, date, author,
-                               c.message.split('\n'), modified_files)
+            message = c.message.split('\n')
+            git_info = GitInfo(c.hexsha, date, author, message[0],
+                               message[1:], modified_files)
             return git_info
         except ValueError:
             return None
diff --git a/contrib/gcc-changelog/test_email.py b/contrib/gcc-changelog/test_email.py
index 6d6596370c4..aff2cf855dd 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -427,3 +427,12 @@ class TestGccChangelog(unittest.TestCase):
     def test_multi_same_file(self):
         email = self.from_patch_glob('0001-OpenMP-Fix-SIMT')
         assert email.errors[0].message == 'same file specified multiple times'
+
+    def test_pr_only_in_subject(self):
+        email = self.from_patch_glob('0001-rs6000-Support-doubleword')
+        assert (email.errors[0].message ==
+                'PR 100085 in subject but not in changelog')
+
+    def test_wrong_pr_comp_in_subject(self):
+        email = self.from_patch_glob('pr-wrong-comp.patch')
+        assert email.errors[0].message == 'invalid PR component in subject'
diff --git a/contrib/gcc-changelog/test_patches.txt b/contrib/gcc-changelog/test_patches.txt
index 39d40b88618..dbd377fba2d 100644
--- a/contrib/gcc-changelog/test_patches.txt
+++ b/contrib/gcc-changelog/test_patches.txt
@@ -1461,6 +1461,7 @@ Subject: [PATCH 0030/2034] 	PR c++/92746 - ICE with noexcept of function
 Another place that needs to specially handle Concepts TS function-style
 concepts.
 
+	PR c++/92746
 	* except.c (check_noexcept_r): Handle concept-check.
 ---
  gcc/cp/ChangeLog                            | 3 +++
@@ -1977,7 +1978,7 @@ index aac31d02b6c..56c470f6ecf 100644
 From 5194b51ed9714808d88827531e91474895b6c706 Mon Sep 17 00:00:00 2001
 From: Jason Merrill <jason@redhat.com>
 Date: Thu, 16 Jan 2020 16:55:39 -0500
-Subject: [PATCH 0121/2034] PR c++/93286 - ICE with __is_constructible and
+Subject: [PATCH 0121/2034] Revert PR c++/93286 - ICE with __is_constructible and
  variadic template.
 
 Here we had been recursing in tsubst_copy_and_build if type2 was a TREE_LIST
@@ -3406,3 +3407,44 @@ index 00000000000..21540512e23
 +
 -- 
 2.25.1
+=== 0001-rs6000-Support-doubleword ===
+From f700e4b0ee3ef53b48975cf89be26b9177e3a3f3 Mon Sep 17 00:00:00 2001
+From: Xionghu Luo <luoxhu@linux.ibm.com>
+Date: Tue, 8 Jun 2021 21:48:12 -0500
+Subject: [PATCH] rs6000: Support doubleword swaps removal in rot64 load store
+ [PR100085]
+
+gcc/testsuite/ChangeLog:
+
+	* gcc.target/powerpc/pr100085.c: New test.
+---
+diff --git a/gcc/testsuite/gcc.target/powerpc/pr100085.c b/gcc/testsuite/gcc.target/powerpc/pr100085.c
+new file mode 100644
+index 00000000000..7d8b147b127
+--- /dev/null
++++ b/gcc/testsuite/gcc.target/powerpc/pr100085.c
+@@ -0,0 +1,1 @@
++
+-- 
+2.25.1
+=== pr-wrong-comp.patch ===
+From 5194b51ed9714808d88827531e91474895b6c706 Mon Sep 17 00:00:00 2001
+From: Jason Merrill <jason@redhat.com>
+Date: Thu, 16 Jan 2020 16:55:39 -0500
+Subject: [PATCH 0121/2034] PR some/93286 - ICE with __is_constructible and
+ variadic template.
+
+gcc/testsuite/ChangeLog:
+
+	PR c++/93286
+	* gcc.target/powerpc/pr100085.c: New test.
+---
+diff --git a/gcc/testsuite/gcc.target/powerpc/pr100085.c b/gcc/testsuite/gcc.target/powerpc/pr100085.c
+new file mode 100644
+index 00000000000..7d8b147b127
+--- /dev/null
++++ b/gcc/testsuite/gcc.target/powerpc/pr100085.c
+@@ -0,0 +1,1 @@
++
+-- 
+2.25.1

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch] contrig/gcc-changelog: Check that PR in subject in in changelog
  2021-06-10 11:51         ` [Patch] contrig/gcc-changelog: Check that PR in subject in in changelog (was:: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Tobias Burnus
@ 2021-06-10 11:54           ` Florian Weimer
  2021-06-10 12:45           ` Jonathan Wakely
  1 sibling, 0 replies; 84+ messages in thread
From: Florian Weimer @ 2021-06-10 11:54 UTC (permalink / raw)
  To: Tobias Burnus; +Cc: Martin Liška, Xionghu Luo, gcc

* Tobias Burnus:

> On 10.06.21 10:07, Martin Liška wrote:
>> On 6/10/21 8:35 AM, Tobias Burnus wrote:
>>> One options would be to require a 'PR <comp>/<nnnnn+>' line if there is
>>> 'PRnnnnn+' in the commit title, rejecting the commit otherwise.
>>
>> Quite interesting idea! Are you willing to prepare a patch for it?
>
> Done.
>
> I was thinking of 'PRs 1234 and 4566' but it soon became too special and
> I did not want to reject valid cases (false positive), hence, I settled
> on the attached variant. (It should detect the most common issues, thus,
> I don't worry about false negatives.)
>
> OK? Comments?

Typo in commit subject: “contrig”

Thanks,
Florian


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch] contrig/gcc-changelog: Check that PR in subject in in changelog
  2021-06-10 11:51         ` [Patch] contrig/gcc-changelog: Check that PR in subject in in changelog (was:: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Tobias Burnus
  2021-06-10 11:54           ` [Patch] contrig/gcc-changelog: Check that PR in subject in in changelog Florian Weimer
@ 2021-06-10 12:45           ` Jonathan Wakely
  1 sibling, 0 replies; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-10 12:45 UTC (permalink / raw)
  To: tobias; +Cc: gcc

As well as the "contrig" typo that Florian noticed, the subject says
"in in" which should be "is in". And it should be CC'd to gcc-patches.

I like this more than my attempt, however ...

>--- a/contrib/gcc-changelog/git_repository.py
>+++ b/contrib/gcc-changelog/git_repository.py
>@@ -59,8 +59,9 @@ def parse_git_revisions(repo_path, revisions, ref_name=None):
>
>             date = datetime.utcfromtimestamp(c.committed_date)
>             author = '%s  <%s>' % (c.author.name, c.author.email)
>-            git_info = GitInfo(c.hexsha, date, author,
>-                               c.message.split('\n'), modified_files)
>+            message = c.message.split('\n')
>+            git_info = GitInfo(c.hexsha, date, author, message[0],
>+                               message[1:], modified_files)

Doesn't using message[1:] here mean that other checks which currently
look at all of self.info.lines will no longer check the subject line?

For example, we have:

        # Skip Update copyright years commits
        if self.info.lines and self.info.lines[0] == 'Update copyright years.':
            return

This will never match now, because you've extracted that into the
'subject' instead.

Would it be better to leave the first line in there, and just have it
duplicated in the new GitInfo.subject member? Or at least change the
check above to look at the subject, and review if other checks looking
at self.info.lines need to also check self.info.subject.


Aside: We should also have a check that the second line is blank, i.e.
the commit message is a single line subject, followed by blank,
followed by the body. And if we enforced that, then message[2:] would
be better than message[1:].


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 10:40               ` Jonathan Wakely
@ 2021-06-10 14:55                 ` Martin Sebor
  2021-06-10 15:54                   ` Tobias Burnus
  2021-06-10 15:56                   ` Jonathan Wakely
  0 siblings, 2 replies; 84+ messages in thread
From: Martin Sebor @ 2021-06-10 14:55 UTC (permalink / raw)
  To: Jonathan Wakely, Jakub Jelinek; +Cc: gcc

On 6/10/21 4:40 AM, Jonathan Wakely via Gcc wrote:
> On Thu, 10 Jun 2021 at 11:08, Jakub Jelinek <jakub@redhat.com> wrote:
>>
>> On Thu, Jun 10, 2021 at 11:01:49AM +0100, Jonathan Wakely via Gcc wrote:
>>> On 10/06/21 10:44 +0100, Jonathan Wakely wrote:
>>>>> Quite interesting idea! Are you willing to prepare a patch for it?
>>>>
>>>> This works.
>>>
>>> And this works better, because it checks the PR in the title matches
>>> one in the changelog.
>>>
>>> I'll get something added to the tests and prep this for commit.
>>
>> Note, some commits fix more than one PR.  Sometimes the subject lists
>> just one of them and the ChangeLog several, at other times people mention
>> [PRnnnnnn, PRnnnnnn] etc. in the subject.
>> I think checking that at least one changeLog PR line matches at least one PR
>> in the subject would be good enough.
>>
>> Your regex will not match [PR123456, PR123457] in subject, perhaps ok
> 
> Yeah, that wouldn't get matched, so no checks would be done for the
> changelog body. Not ideal, but better than what we have no where
> nothing is checked at all.
> 
>> initially, and if I read it will will be happy if at least one line matches
>> it.
> 
> Yes, if the summary line has a single PR number, it must be present in
> the changelog body. Other PR numbers can also be in the body, and they
> aren't checked.
> 
> But I've hit an issue trying to test it, because the testcases in
> contrib/gcc-changelog/test_patches.txt are in the form of emails, and
> the Subject: line from the emails is not passed to the GitInfo
> constructor, so isn't part of the message that gets checked.
> 
> Martin, Shouldn't the GitEmail class extract the Subject: from the
> email header and use that as the first line passed to the GitInfo
> object?
> 

I'm a little lost as to what's being changed, and, truth be told,
what exactly the current "right" format is.  Where are the PRnnnnn
strings recognized as special?

The ChangeLog description doesn't seem to cover this and I've been
assuming they're recognized anywhere in the ChangeLog message, but
I think I also noticed they don't always end up updating all
the bugs.

FWIW, in commits for multiple PRs I've been adding a Resolves
line like this:
   https://gcc.gnu.org/pipermail/gcc-cvs/2021-May/347414.html

I usually also add the PR numbers under each ChangeLog but I'm not
sure it's necessary.  It would be good to know and for the ChangeLog
convention to document how exactly this works, and if something
changes, to update the documentation.

Martin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 14:55                 ` Martin Sebor
@ 2021-06-10 15:54                   ` Tobias Burnus
  2021-06-10 16:05                     ` Jonathan Wakely
  2021-06-10 15:56                   ` Jonathan Wakely
  1 sibling, 1 reply; 84+ messages in thread
From: Tobias Burnus @ 2021-06-10 15:54 UTC (permalink / raw)
  To: Martin Sebor, Jonathan Wakely, Jakub Jelinek; +Cc: gcc

On 10.06.21 16:55, Martin Sebor via Gcc wrote:
> I'm a little lost as to what's being changed, and, truth be told,
> what exactly the current "right" format is.  Where are the PRnnnnn
> strings recognized as special?

For my version of the patch at least, which is
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572483.html :

If the first line (subject line) contains '.... [PR12345]', that commit
is now rejected unless 'PR comp/12345' also in in the 'body'/changelog
part of the patch.

(This does not apply to your commit as you don't have such number.)

That avoids issues where one has added it to the subject line (which
usually matches also the email subject) but has forgotten to include it
also in GCC's patch format. That avoids a common pitfall of not having
the PR in the generated changelog files and the commit is not added as
Bugzilla comment.

That's orthogonal to questions like: how should commits be formatted in
general; do we want to extend the requirements, what exactly should be
written in the first line of the commit etc.

Regarding:

> The ChangeLog description doesn't seem to cover this and I've been
> assuming they're recognized anywhere in the ChangeLog message, but
> I think I also noticed they don't always end up updating all
> the bugs.
>
> FWIW, in commits for multiple PRs I've been adding a Resolves
> line like this:
>   https://gcc.gnu.org/pipermail/gcc-cvs/2021-May/347414.html

The current pattern matches something like '^\tPR component/12345$'. In
your commits, you have the Bugzilla title after the PR number, which is
currently not matched. — Your commit also includes the same numbers
without tailing strings, hence, for the linked commit, the script should
have extracted the PR number.

(One can argue whether the accepted format should/could be extended but
in the ChangeLog files, it also was just <tab>PR <comp>/<number><new line>.)

> I usually also add the PR numbers under each ChangeLog but I'm not
> sure it's necessary.

You can also have the PR lines before the first 'directory/' or
'directory/ChangeLog' line, then it applies to all ChangeLog files. —
Thus, you do not need to repeat it. (With the caveat of above.)

> It would be good to know and for the ChangeLog
> convention to document how exactly this works, and if something
> changes, to update the documentation.

I think the accepted changelog convention has not really changed. And
the changelog script mostly diagnosed issues which were also wrong
before – or are obviously wrong. In my experience, it finds a lot of
valid issues and rejects only very little which should be accepted.

There is an unofficial convention to use [PR1234] in the first line of
the commit (and the email subject), but that's not enforced by the script.

Independent of the PR matching and checking issues, I think Jonathan was
thinking about extending the documentation (as I gathered from IRC
#gcc); I did not quite follow whether it is about best practice or
contained bits which should be enforced.

Tobias


-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 14:55                 ` Martin Sebor
  2021-06-10 15:54                   ` Tobias Burnus
@ 2021-06-10 15:56                   ` Jonathan Wakely
  2021-06-10 17:06                     ` Martin Sebor
  1 sibling, 1 reply; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-10 15:56 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jakub Jelinek, gcc

On Thu, 10 Jun 2021 at 15:56, Martin Sebor wrote:
>
> On 6/10/21 4:40 AM, Jonathan Wakely via Gcc wrote:
> > On Thu, 10 Jun 2021 at 11:08, Jakub Jelinek <jakub@redhat.com> wrote:
> >>
> >> On Thu, Jun 10, 2021 at 11:01:49AM +0100, Jonathan Wakely via Gcc wrote:
> >>> On 10/06/21 10:44 +0100, Jonathan Wakely wrote:
> >>>>> Quite interesting idea! Are you willing to prepare a patch for it?
> >>>>
> >>>> This works.
> >>>
> >>> And this works better, because it checks the PR in the title matches
> >>> one in the changelog.
> >>>
> >>> I'll get something added to the tests and prep this for commit.
> >>
> >> Note, some commits fix more than one PR.  Sometimes the subject lists
> >> just one of them and the ChangeLog several, at other times people mention
> >> [PRnnnnnn, PRnnnnnn] etc. in the subject.
> >> I think checking that at least one changeLog PR line matches at least one PR
> >> in the subject would be good enough.
> >>
> >> Your regex will not match [PR123456, PR123457] in subject, perhaps ok
> >
> > Yeah, that wouldn't get matched, so no checks would be done for the
> > changelog body. Not ideal, but better than what we have no where
> > nothing is checked at all.
> >
> >> initially, and if I read it will will be happy if at least one line matches
> >> it.
> >
> > Yes, if the summary line has a single PR number, it must be present in
> > the changelog body. Other PR numbers can also be in the body, and they
> > aren't checked.
> >
> > But I've hit an issue trying to test it, because the testcases in
> > contrib/gcc-changelog/test_patches.txt are in the form of emails, and
> > the Subject: line from the emails is not passed to the GitInfo
> > constructor, so isn't part of the message that gets checked.
> >
> > Martin, Shouldn't the GitEmail class extract the Subject: from the
> > email header and use that as the first line passed to the GitInfo
> > object?
> >
>
> I'm a little lost as to what's being changed,

Nothing is being changed.

If your patch is related to a bug, you're supposed to say so:
https://gcc.gnu.org/codingconventions.html#ChangeLogs

And you're also supposed to put a [PRnnnn] tag in the email subject
line (which best practices say should also be the Subject: of the
email you send to gcc-patches):
https://gcc.gnu.org/contribute.html#patches

The patches being proposed verify that if you put [PRnnnn] at the end
of the subject line, that you also put "PR component/nnnn" in the
changelog part. Because if you didn't, something is wrong and you're
failing to follow the existing policy (why would you put a [PRnnnn]
tag in the subject if it's not related to that PR, and if it is
related, why are you not putting it in the changelog part?)

> and, truth be told,
> what exactly the current "right" format is.  Where are the PRnnnnn
> strings recognized as special?
>
> The ChangeLog description doesn't seem to cover this and I've been
> assuming they're recognized anywhere in the ChangeLog message, but
> I think I also noticed they don't always end up updating all
> the bugs.

There are various reasons for that, including that non-ASCII
characters tend to break the automation.

But in general, if you follow the policy of mentioning "PR
component/nnnn" in the changelog, your git commit should add a comment
to the mentioned bugzilla PRs.

Within a bugzilla comment, any string mentioning "PR nnnn" or "Bug
nnnn" (and variations on those) will get dynamically turned into a
link to that bug, but I think that happens on the fly when rendering
the HTML, and it's separate from the automation that adds git commits
to bugzilla as comments. I think the automation to add commits to
bugzilla uses a much stricter pattern to recognise a PR mentioned in
the commit.

>
> FWIW, in commits for multiple PRs I've been adding a Resolves
> line like this:
>    https://gcc.gnu.org/pipermail/gcc-cvs/2021-May/347414.html
>
> I usually also add the PR numbers under each ChangeLog but I'm not
> sure it's necessary.  It would be good to know and for the ChangeLog
> convention to document how exactly this works, and if something
> changes, to update the documentation.

It already says you should put it in the changelog. For each changelog
piece you add it to, it will be added to that relevant ChangeLog file.
It probably makes sense to add it to all of them, unless some piece of
the commit really isn't relevant to that PR (but then it should
probably be a separate commit!)

You can also put it once, before all the changelog entries e.g.
https://gcc.gnu.org/g:91349e57bbfd010156b9128b2ad751c8843e7245
which got added to two ChangeLog files by the daily bump script:
https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=gcc/cp/ChangeLog;h=5a97fc84264e85b60adfdc50187ce7faeeb6f86f;hp=225b891700e88a58639d7ea0f10ad76ffb8d87f4;hb=c60387214593445d1514bf7852f27f4523458cda;hpb=25e5ecdf82b49977e86bfaded236fb34af2705ed
https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=gcc/testsuite/ChangeLog;h=9e31d686e1cb3b63b2141f86e5394aa1794ecdf5;hp=640fcbed0ebb9ed5e0b0ab12c52e1950284f9ee9;hb=4f625f47b4456e5c05a025fca4d072831e59126c;hpb=53cb324cb4f9475d4eabcd9f5a858c5edaacc0cf

I do want to improve the https://gcc.gnu.org/contribute.html#patches
and https://gcc.gnu.org/codingconventions.html#ChangeLogs docs though.
We shouldn't have the info split in two places, and we should update
the recommendations to reflect current practices.


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 15:54                   ` Tobias Burnus
@ 2021-06-10 16:05                     ` Jonathan Wakely
  0 siblings, 0 replies; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-10 16:05 UTC (permalink / raw)
  To: Tobias Burnus; +Cc: Martin Sebor, Jakub Jelinek, gcc

On Thu, 10 Jun 2021 at 16:55, Tobias Burnus <tobias@codesourcery.com> wrote:
> Independent of the PR matching and checking issues, I think Jonathan was
> thinking about extending the documentation (as I gathered from IRC
> #gcc);

Right.

> I did not quite follow whether it is about best practice or
> contained bits which should be enforced.

My complaints about the current docs:

We do not say you should have a one-line subject in your git commit.
All we say at https://gcc.gnu.org/codingconventions.html#ChangeLogs is
that you should have "a leading text with git commit description".
This is vague. We should follow best practices, e.g.
https://chris.beams.io/posts/git-commit/ and **require** a one-line
summary, followed by a blank line, followed by the rest of the commit
msg (including the ChangeLog entries).

We do describe a preferred format for a one-line subject, but only
when documenting the format of emails to gcc-patches, at
https://gcc.gnu.org/contribute.html#patches
This is silly, we should not be recommending different formats for git
commits and patch emails, we should have one format which works for
both (which is what many people are already doing anyway). We should
document that format properly, in the coding conventions, and then the
email guidelines are much simpler because people have already written
good commit messages with one-line subjects.

So I want to improve the docs to describe best practices. If everybody
agrees with those best practices, we could start to enforce them.


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 15:56                   ` Jonathan Wakely
@ 2021-06-10 17:06                     ` Martin Sebor
  2021-06-10 17:20                       ` Martin Sebor
  2021-06-11  9:08                       ` Jonathan Wakely
  0 siblings, 2 replies; 84+ messages in thread
From: Martin Sebor @ 2021-06-10 17:06 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Jakub Jelinek, gcc

On 6/10/21 9:56 AM, Jonathan Wakely wrote:
> On Thu, 10 Jun 2021 at 15:56, Martin Sebor wrote:
>>
>> On 6/10/21 4:40 AM, Jonathan Wakely via Gcc wrote:
>>> On Thu, 10 Jun 2021 at 11:08, Jakub Jelinek <jakub@redhat.com> wrote:
>>>>
>>>> On Thu, Jun 10, 2021 at 11:01:49AM +0100, Jonathan Wakely via Gcc wrote:
>>>>> On 10/06/21 10:44 +0100, Jonathan Wakely wrote:
>>>>>>> Quite interesting idea! Are you willing to prepare a patch for it?
>>>>>>
>>>>>> This works.
>>>>>
>>>>> And this works better, because it checks the PR in the title matches
>>>>> one in the changelog.
>>>>>
>>>>> I'll get something added to the tests and prep this for commit.
>>>>
>>>> Note, some commits fix more than one PR.  Sometimes the subject lists
>>>> just one of them and the ChangeLog several, at other times people mention
>>>> [PRnnnnnn, PRnnnnnn] etc. in the subject.
>>>> I think checking that at least one changeLog PR line matches at least one PR
>>>> in the subject would be good enough.
>>>>
>>>> Your regex will not match [PR123456, PR123457] in subject, perhaps ok
>>>
>>> Yeah, that wouldn't get matched, so no checks would be done for the
>>> changelog body. Not ideal, but better than what we have no where
>>> nothing is checked at all.
>>>
>>>> initially, and if I read it will will be happy if at least one line matches
>>>> it.
>>>
>>> Yes, if the summary line has a single PR number, it must be present in
>>> the changelog body. Other PR numbers can also be in the body, and they
>>> aren't checked.
>>>
>>> But I've hit an issue trying to test it, because the testcases in
>>> contrib/gcc-changelog/test_patches.txt are in the form of emails, and
>>> the Subject: line from the emails is not passed to the GitInfo
>>> constructor, so isn't part of the message that gets checked.
>>>
>>> Martin, Shouldn't the GitEmail class extract the Subject: from the
>>> email header and use that as the first line passed to the GitInfo
>>> object?
>>>
>>
>> I'm a little lost as to what's being changed,
> 
> Nothing is being changed.

The script is being changed and Tobias' proposal that I understand
is being discussed is:

 >> One options would be to require a 'PR <comp>/<nnnnn+>' line if there is
 >> 'PRnnnnn+' in the commit title, rejecting the commit otherwise.

That sounds like a new requirement.  I want to understand what that
will mean for me.

> 
> If your patch is related to a bug, you're supposed to say so:
> https://gcc.gnu.org/codingconventions.html#ChangeLogs
> 
> And you're also supposed to put a [PRnnnn] tag in the email subject
> line (which best practices say should also be the Subject: of the
> email you send to gcc-patches):
> https://gcc.gnu.org/contribute.html#patches
> 
> The patches being proposed verify that if you put [PRnnnn] at the end
> of the subject line, that you also put "PR component/nnnn" in the
> changelog part. Because if you didn't, something is wrong and you're
> failing to follow the existing policy (why would you put a [PRnnnn]
> tag in the subject if it's not related to that PR, and if it is
> related, why are you not putting it in the changelog part?)

By "the subject line" are you referring to what the ChangeLog calls
$git_description, and, AFAICS, consists of multiple lines?  (Based
on the Example patch on the conventions page.  In that example,
the PR87763 reference is in the middle of line 3.)

I find this confusing.  Either we're talking about the Subject line
of the email I send to gcc-patches with the patch or we're talking
about the Git description ($git_description).  If they're supposed
to be the same the ChangeLog convention should say so, and
the Example patch should be changed to show that.

> 
>> and, truth be told,
>> what exactly the current "right" format is.  Where are the PRnnnnn
>> strings recognized as special?
>>
>> The ChangeLog description doesn't seem to cover this and I've been
>> assuming they're recognized anywhere in the ChangeLog message, but
>> I think I also noticed they don't always end up updating all
>> the bugs.
> 
> There are various reasons for that, including that non-ASCII
> characters tend to break the automation.
> 
> But in general, if you follow the policy of mentioning "PR
> component/nnnn" in the changelog, your git commit should add a comment
> to the mentioned bugzilla PRs.

Again, I'm not sure what exactly you mean by "the changelog" here.
Are you referring to the whole commit message or just to what
the conventions refer to as the ChangeLog entry?  I'm guessing
the latter but my understanding of this proposal is to also require
all the affected PRnnnnn references in the first line of the commit
message (or the whole $git_description).

> 
> Within a bugzilla comment, any string mentioning "PR nnnn" or "Bug
> nnnn" (and variations on those) will get dynamically turned into a
> link to that bug, but I think that happens on the fly when rendering
> the HTML, and it's separate from the automation that adds git commits
> to bugzilla as comments. I think the automation to add commits to
> bugzilla uses a much stricter pattern to recognise a PR mentioned in
> the commit.

I'm familiar with this feature and I've been (naively) assuming
the ChangeLog/Git commit hook script worked similarly.  I.e.,
wherever I mention PRnnnnn (or the other forms) in a commit
message the script interprets it as a reference to a bug that
it then updates with the commit.

If it doesn't (as it sounds like is the case) it might be helpful
to mention that and explain how it's different. (Although I think
it would be better to make both work the same.)

I understand that PRnmmmm references under each ChangeLog entry are
added verbatim to the ChangeLog file, as you explain below. I don't
care about that.  I just want to make sure the right bugs get updated
for my commits.

> 
>>
>> FWIW, in commits for multiple PRs I've been adding a Resolves
>> line like this:
>>     https://gcc.gnu.org/pipermail/gcc-cvs/2021-May/347414.html
>>
>> I usually also add the PR numbers under each ChangeLog but I'm not
>> sure it's necessary.  It would be good to know and for the ChangeLog
>> convention to document how exactly this works, and if something
>> changes, to update the documentation.
> 
> It already says you should put it in the changelog. For each changelog
> piece you add it to, it will be added to that relevant ChangeLog file.
> It probably makes sense to add it to all of them, unless some piece of
> the commit really isn't relevant to that PR (but then it should
> probably be a separate commit!)
> 
> You can also put it once, before all the changelog entries e.g.
> https://gcc.gnu.org/g:91349e57bbfd010156b9128b2ad751c8843e7245
> which got added to two ChangeLog files by the daily bump script:
> https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=gcc/cp/ChangeLog;h=5a97fc84264e85b60adfdc50187ce7faeeb6f86f;hp=225b891700e88a58639d7ea0f10ad76ffb8d87f4;hb=c60387214593445d1514bf7852f27f4523458cda;hpb=25e5ecdf82b49977e86bfaded236fb34af2705ed
> https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=gcc/testsuite/ChangeLog;h=9e31d686e1cb3b63b2141f86e5394aa1794ecdf5;hp=640fcbed0ebb9ed5e0b0ab12c52e1950284f9ee9;hb=4f625f47b4456e5c05a025fca4d072831e59126c;hpb=53cb324cb4f9475d4eabcd9f5a858c5edaacc0cf
> 
> I do want to improve the https://gcc.gnu.org/contribute.html#patches
> and https://gcc.gnu.org/codingconventions.html#ChangeLogs docs though.
> We shouldn't have the info split in two places, and we should update
> the recommendations to reflect current practices.
> 

Great!  I'll do my best to help, at a minimum by critiquing your
improvements ;)

Martin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 17:06                     ` Martin Sebor
@ 2021-06-10 17:20                       ` Martin Sebor
  2021-06-10 17:30                         ` Jakub Jelinek
  2021-06-11  9:08                       ` Jonathan Wakely
  1 sibling, 1 reply; 84+ messages in thread
From: Martin Sebor @ 2021-06-10 17:20 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Jakub Jelinek, gcc

On 6/10/21 11:06 AM, Martin Sebor wrote:
> On 6/10/21 9:56 AM, Jonathan Wakely wrote:
>> On Thu, 10 Jun 2021 at 15:56, Martin Sebor wrote:
>>>
>>> On 6/10/21 4:40 AM, Jonathan Wakely via Gcc wrote:
>>>> On Thu, 10 Jun 2021 at 11:08, Jakub Jelinek <jakub@redhat.com> wrote:
>>>>>
>>>>> On Thu, Jun 10, 2021 at 11:01:49AM +0100, Jonathan Wakely via Gcc 
>>>>> wrote:
>>>>>> On 10/06/21 10:44 +0100, Jonathan Wakely wrote:
>>>>>>>> Quite interesting idea! Are you willing to prepare a patch for it?
>>>>>>>
>>>>>>> This works.
>>>>>>
>>>>>> And this works better, because it checks the PR in the title matches
>>>>>> one in the changelog.
>>>>>>
>>>>>> I'll get something added to the tests and prep this for commit.
>>>>>
>>>>> Note, some commits fix more than one PR.  Sometimes the subject lists
>>>>> just one of them and the ChangeLog several, at other times people 
>>>>> mention
>>>>> [PRnnnnnn, PRnnnnnn] etc. in the subject.
>>>>> I think checking that at least one changeLog PR line matches at 
>>>>> least one PR
>>>>> in the subject would be good enough.
>>>>>
>>>>> Your regex will not match [PR123456, PR123457] in subject, perhaps ok
>>>>
>>>> Yeah, that wouldn't get matched, so no checks would be done for the
>>>> changelog body. Not ideal, but better than what we have no where
>>>> nothing is checked at all.
>>>>
>>>>> initially, and if I read it will will be happy if at least one line 
>>>>> matches
>>>>> it.
>>>>
>>>> Yes, if the summary line has a single PR number, it must be present in
>>>> the changelog body. Other PR numbers can also be in the body, and they
>>>> aren't checked.
>>>>
>>>> But I've hit an issue trying to test it, because the testcases in
>>>> contrib/gcc-changelog/test_patches.txt are in the form of emails, and
>>>> the Subject: line from the emails is not passed to the GitInfo
>>>> constructor, so isn't part of the message that gets checked.
>>>>
>>>> Martin, Shouldn't the GitEmail class extract the Subject: from the
>>>> email header and use that as the first line passed to the GitInfo
>>>> object?
>>>>
>>>
>>> I'm a little lost as to what's being changed,
>>
>> Nothing is being changed.
> 
> The script is being changed and Tobias' proposal that I understand
> is being discussed is:
> 
>  >> One options would be to require a 'PR <comp>/<nnnnn+>' line if there is
>  >> 'PRnnnnn+' in the commit title, rejecting the commit otherwise.
> 
> That sounds like a new requirement.  I want to understand what that
> will mean for me.
> 
>>
>> If your patch is related to a bug, you're supposed to say so:
>> https://gcc.gnu.org/codingconventions.html#ChangeLogs
>>
>> And you're also supposed to put a [PRnnnn] tag in the email subject
>> line (which best practices say should also be the Subject: of the
>> email you send to gcc-patches):
>> https://gcc.gnu.org/contribute.html#patches
>>
>> The patches being proposed verify that if you put [PRnnnn] at the end
>> of the subject line, that you also put "PR component/nnnn" in the
>> changelog part. Because if you didn't, something is wrong and you're
>> failing to follow the existing policy (why would you put a [PRnnnn]

One other note: you mention policy above, which suggests a requirement.
My understanding is that the format of a commit message is a convention
put in place to take some of the tedium out of creating ChangeLog
entries.  If that's still correct, I would be more in favor of making
the Git commit hook smarter and more forgiving than of adding new
requirements imposing a more rigid format.

Recognizing a PRnnnnn anywhere within the commit message and "doing
the thing" rather than rejecting a commit if it doesn't have a PRnnnn
on the first line (but does somewhere else) would be a change I'd vote
for.

>> tag in the subject if it's not related to that PR, and if it is
>> related, why are you not putting it in the changelog part?)
> 
> By "the subject line" are you referring to what the ChangeLog calls
> $git_description, and, AFAICS, consists of multiple lines?  (Based
> on the Example patch on the conventions page.  In that example,
> the PR87763 reference is in the middle of line 3.)
> 
> I find this confusing.  Either we're talking about the Subject line
> of the email I send to gcc-patches with the patch or we're talking
> about the Git description ($git_description).  If they're supposed
> to be the same the ChangeLog convention should say so, and
> the Example patch should be changed to show that.
> 
>>
>>> and, truth be told,
>>> what exactly the current "right" format is.  Where are the PRnnnnn
>>> strings recognized as special?
>>>
>>> The ChangeLog description doesn't seem to cover this and I've been
>>> assuming they're recognized anywhere in the ChangeLog message, but
>>> I think I also noticed they don't always end up updating all
>>> the bugs.
>>
>> There are various reasons for that, including that non-ASCII
>> characters tend to break the automation.
>>
>> But in general, if you follow the policy of mentioning "PR
>> component/nnnn" in the changelog, your git commit should add a comment
>> to the mentioned bugzilla PRs.
> 
> Again, I'm not sure what exactly you mean by "the changelog" here.
> Are you referring to the whole commit message or just to what
> the conventions refer to as the ChangeLog entry?  I'm guessing
> the latter but my understanding of this proposal is to also require
> all the affected PRnnnnn references in the first line of the commit
> message (or the whole $git_description).
> 
>>
>> Within a bugzilla comment, any string mentioning "PR nnnn" or "Bug
>> nnnn" (and variations on those) will get dynamically turned into a
>> link to that bug, but I think that happens on the fly when rendering
>> the HTML, and it's separate from the automation that adds git commits
>> to bugzilla as comments. I think the automation to add commits to
>> bugzilla uses a much stricter pattern to recognise a PR mentioned in
>> the commit.
> 
> I'm familiar with this feature and I've been (naively) assuming
> the ChangeLog/Git commit hook script worked similarly.  I.e.,
> wherever I mention PRnnnnn (or the other forms) in a commit
> message the script interprets it as a reference to a bug that
> it then updates with the commit.
> 
> If it doesn't (as it sounds like is the case) it might be helpful
> to mention that and explain how it's different. (Although I think
> it would be better to make both work the same.)
> 
> I understand that PRnmmmm references under each ChangeLog entry are
> added verbatim to the ChangeLog file, as you explain below. I don't
> care about that.  I just want to make sure the right bugs get updated
> for my commits.
> 
>>
>>>
>>> FWIW, in commits for multiple PRs I've been adding a Resolves
>>> line like this:
>>>     https://gcc.gnu.org/pipermail/gcc-cvs/2021-May/347414.html
>>>
>>> I usually also add the PR numbers under each ChangeLog but I'm not
>>> sure it's necessary.  It would be good to know and for the ChangeLog
>>> convention to document how exactly this works, and if something
>>> changes, to update the documentation.
>>
>> It already says you should put it in the changelog. For each changelog
>> piece you add it to, it will be added to that relevant ChangeLog file.
>> It probably makes sense to add it to all of them, unless some piece of
>> the commit really isn't relevant to that PR (but then it should
>> probably be a separate commit!)
>>
>> You can also put it once, before all the changelog entries e.g.
>> https://gcc.gnu.org/g:91349e57bbfd010156b9128b2ad751c8843e7245
>> which got added to two ChangeLog files by the daily bump script:
>> https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=gcc/cp/ChangeLog;h=5a97fc84264e85b60adfdc50187ce7faeeb6f86f;hp=225b891700e88a58639d7ea0f10ad76ffb8d87f4;hb=c60387214593445d1514bf7852f27f4523458cda;hpb=25e5ecdf82b49977e86bfaded236fb34af2705ed 
>>
>> https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=gcc/testsuite/ChangeLog;h=9e31d686e1cb3b63b2141f86e5394aa1794ecdf5;hp=640fcbed0ebb9ed5e0b0ab12c52e1950284f9ee9;hb=4f625f47b4456e5c05a025fca4d072831e59126c;hpb=53cb324cb4f9475d4eabcd9f5a858c5edaacc0cf 
>>
>>
>> I do want to improve the https://gcc.gnu.org/contribute.html#patches
>> and https://gcc.gnu.org/codingconventions.html#ChangeLogs docs though.
>> We shouldn't have the info split in two places, and we should update
>> the recommendations to reflect current practices.
>>
> 
> Great!  I'll do my best to help, at a minimum by critiquing your
> improvements ;)
> 
> Martin


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 17:20                       ` Martin Sebor
@ 2021-06-10 17:30                         ` Jakub Jelinek
  2021-06-10 18:55                           ` Martin Sebor
  0 siblings, 1 reply; 84+ messages in thread
From: Jakub Jelinek @ 2021-06-10 17:30 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jonathan Wakely, gcc

On Thu, Jun 10, 2021 at 11:20:26AM -0600, Martin Sebor wrote:
> One other note: you mention policy above, which suggests a requirement.
> My understanding is that the format of a commit message is a convention
> put in place to take some of the tedium out of creating ChangeLog
> entries.  If that's still correct, I would be more in favor of making
> the Git commit hook smarter and more forgiving than of adding new
> requirements imposing a more rigid format.

It is a requirement (perhaps badly worded) which has been there since
January last year.  And it is a good idea that the PR number if any
and some category shows up in git shortlog listing too.

> Recognizing a PRnnnnn anywhere within the commit message and "doing
> the thing" rather than rejecting a commit if it doesn't have a PRnnnn
> on the first line (but does somewhere else) would be a change I'd vote
> for.

The text above the by the script recognized ChangeLog entry, except for
the first line which is treated by git specially, is free text, it could
mention various PRs but doesn't actually mean that they were fixed, just
perhaps in some way related etc.  Furthermore, for commits that fixed
PRs we want those PR numbers be in the ChangeLog files too.

	Jakub


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 17:30                         ` Jakub Jelinek
@ 2021-06-10 18:55                           ` Martin Sebor
  2021-06-10 19:09                             ` Jakub Jelinek
  0 siblings, 1 reply; 84+ messages in thread
From: Martin Sebor @ 2021-06-10 18:55 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jonathan Wakely, gcc

On 6/10/21 11:30 AM, Jakub Jelinek wrote:
> On Thu, Jun 10, 2021 at 11:20:26AM -0600, Martin Sebor wrote:
>> One other note: you mention policy above, which suggests a requirement.
>> My understanding is that the format of a commit message is a convention
>> put in place to take some of the tedium out of creating ChangeLog
>> entries.  If that's still correct, I would be more in favor of making
>> the Git commit hook smarter and more forgiving than of adding new
>> requirements imposing a more rigid format.
> 
> It is a requirement (perhaps badly worded) which has been there since
> January last year.  And it is a good idea that the PR number if any
> and some category shows up in git shortlog listing too.

Sure, but a preference for the output from a tool doesn't have to
translate into a rigid requirement on human input into another,
making it more onerous than it needs to be.

> 
>> Recognizing a PRnnnnn anywhere within the commit message and "doing
>> the thing" rather than rejecting a commit if it doesn't have a PRnnnn
>> on the first line (but does somewhere else) would be a change I'd vote
>> for.
> 
> The text above the by the script recognized ChangeLog entry, except for
> the first line which is treated by git specially, is free text, it could
> mention various PRs but doesn't actually mean that they were fixed, just
> perhaps in some way related etc.  Furthermore, for commits that fixed
> PRs we want those PR numbers be in the ChangeLog files too.

Instead of rejecting commits that don't mention all the same PRs on
the first line of the commit as in the ChangeLog entries it seems
that the Git commit script could extract the PRnnnn references from
just the ChangeLong entries (if we wanted to constrain it that way
to avoid the unrelated references and append each to the end of
the $git_description if it isn't already there.  I thought it
already did that (and more).

Martin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 18:55                           ` Martin Sebor
@ 2021-06-10 19:09                             ` Jakub Jelinek
  2021-06-10 21:16                               ` Martin Sebor
  0 siblings, 1 reply; 84+ messages in thread
From: Jakub Jelinek @ 2021-06-10 19:09 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jonathan Wakely, gcc

On Thu, Jun 10, 2021 at 12:55:43PM -0600, Martin Sebor wrote:
> Instead of rejecting commits that don't mention all the same PRs on
> the first line of the commit as in the ChangeLog entries it seems
> that the Git commit script could extract the PRnnnn references from
> just the ChangeLong entries (if we wanted to constrain it that way
> to avoid the unrelated references and append each to the end of
> the $git_description if it isn't already there.  I thought it
> already did that (and more).

Just look at the start of this thread.  Some people put
the [PRnnnnn] only in the first line of the commit.  And that is
what these changes want to diagnose, that is an error and results
in bugzilla not being updated.
The current changes won't do the other direction, i.e. if there
is a PR line in the ChangeLog entry complain about [PRnnnnn] missing
in the first line.  The fact that it isn't required by the script doesn't
mean people shouldn't do it, but if they occassionally forget, it is not
that bad as not mentioning it in ChangeLog.

	Jakub


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 19:09                             ` Jakub Jelinek
@ 2021-06-10 21:16                               ` Martin Sebor
  2021-06-10 21:28                                 ` Jakub Jelinek
  2021-06-11  9:13                                 ` Jonathan Wakely
  0 siblings, 2 replies; 84+ messages in thread
From: Martin Sebor @ 2021-06-10 21:16 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jonathan Wakely, gcc

On 6/10/21 1:09 PM, Jakub Jelinek wrote:
> On Thu, Jun 10, 2021 at 12:55:43PM -0600, Martin Sebor wrote:
>> Instead of rejecting commits that don't mention all the same PRs on
>> the first line of the commit as in the ChangeLog entries it seems
>> that the Git commit script could extract the PRnnnn references from
>> just the ChangeLong entries (if we wanted to constrain it that way
>> to avoid the unrelated references and append each to the end of
>> the $git_description if it isn't already there.  I thought it
>> already did that (and more).
> 
> Just look at the start of this thread.  Some people put
> the [PRnnnnn] only in the first line of the commit.  And that is
> what these changes want to diagnose, that is an error and results
> in bugzilla not being updated.

That's Tobias' proposal, yes:

   One options would be to require a 'PR <comp>/<nnnnn+>' line if there
   is 'PRnnnnn+' in the commit title, rejecting the commit otherwise.

I can't think of why rejecting such commits is preferable to having
the script fix them up by copying the PRnnnn strings from the ChangeLog
entries in the commit message into the first line.  So that's my
counterproposal: make the script do the tedious work for us.

> The current changes won't do the other direction, i.e. if there
> is a PR line in the ChangeLog entry complain about [PRnnnnn] missing
> in the first line.  The fact that it isn't required by the script doesn't
> mean people shouldn't do it, but if they occassionally forget, it is not
> that bad as not mentioning it in ChangeLog.

I don't see why the script users should be subjected to this tedium
when it can be done in the script itself with (presumably) only
a little more effort.  The proposed change is, IMO, a step in
the wrong direction.

Martin


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 21:16                               ` Martin Sebor
@ 2021-06-10 21:28                                 ` Jakub Jelinek
  2021-06-10 21:56                                   ` Martin Sebor
  2021-06-11  9:13                                 ` Jonathan Wakely
  1 sibling, 1 reply; 84+ messages in thread
From: Jakub Jelinek @ 2021-06-10 21:28 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jonathan Wakely, gcc

On Thu, Jun 10, 2021 at 03:16:43PM -0600, Martin Sebor wrote:
> > Just look at the start of this thread.  Some people put
> > the [PRnnnnn] only in the first line of the commit.  And that is
> > what these changes want to diagnose, that is an error and results
> > in bugzilla not being updated.
> 
> That's Tobias' proposal, yes:
> 
>   One options would be to require a 'PR <comp>/<nnnnn+>' line if there
>   is 'PRnnnnn+' in the commit title, rejecting the commit otherwise.
> 
> I can't think of why rejecting such commits is preferable to having
> the script fix them up by copying the PRnnnn strings from the ChangeLog
> entries in the commit message into the first line.  So that's my
> counterproposal: make the script do the tedious work for us.

You keep the direction wrong.  What will be rejected is [PR123456]
in the first line and no PR in the ChangeLog.  People make this mistake
often.
That isn't something the script can easily add, first of all, the component
is missing, the script would need to query bugzilla to find that out.
And, it should be user's choice where exactly in the ChangeLog the PR goes,
if it goes into every ChangeLog file, or e.g. just one etc., and that
depends on where exactly the user specified it.

	Jakub


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 21:28                                 ` Jakub Jelinek
@ 2021-06-10 21:56                                   ` Martin Sebor
  0 siblings, 0 replies; 84+ messages in thread
From: Martin Sebor @ 2021-06-10 21:56 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jonathan Wakely, gcc

On 6/10/21 3:28 PM, Jakub Jelinek wrote:
> On Thu, Jun 10, 2021 at 03:16:43PM -0600, Martin Sebor wrote:
>>> Just look at the start of this thread.  Some people put
>>> the [PRnnnnn] only in the first line of the commit.  And that is
>>> what these changes want to diagnose, that is an error and results
>>> in bugzilla not being updated.
>>
>> That's Tobias' proposal, yes:
>>
>>    One options would be to require a 'PR <comp>/<nnnnn+>' line if there
>>    is 'PRnnnnn+' in the commit title, rejecting the commit otherwise.
>>
>> I can't think of why rejecting such commits is preferable to having
>> the script fix them up by copying the PRnnnn strings from the ChangeLog
>> entries in the commit message into the first line.  So that's my
>> counterproposal: make the script do the tedious work for us.
> 
> You keep the direction wrong.  What will be rejected is [PR123456]
> in the first line and no PR in the ChangeLog.  People make this mistake
> often.

Ah!  I did say I found the terminology confusing!  Thanks for
explaining it!

> That isn't something the script can easily add, first of all, the component
> is missing, the script would need to query bugzilla to find that out.
> And, it should be user's choice where exactly in the ChangeLog the PR goes,
> if it goes into every ChangeLog file, or e.g. just one etc., and that
> depends on where exactly the user specified it.

I have a script that does that (looks up the component in Bugzilla
and puts PR component/nnnnn in each ChangeLog entry).  I used it
until Martin wrote mklog.py.

I still think it would be preferable to make mklog.py do the work
at least in the common case(*) since, as you say, people make
the mistake often, but it doesn't seem as bad as if it had been
the other way around.

By the common case I mean a single PR nnnn in the description as
in the commit that led up to this discussion.  The use case of
putting the PR nnnn in some entries and not others doesn't seem
important enough to me to worry about.  Such changes should
arguably be committed separately.

Martin


> 
> 	Jakub
> 


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 17:06                     ` Martin Sebor
  2021-06-10 17:20                       ` Martin Sebor
@ 2021-06-11  9:08                       ` Jonathan Wakely
  2021-06-11  9:35                         ` Jonathan Wakely
  1 sibling, 1 reply; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-11  9:08 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jakub Jelinek, gcc

On Thu, 10 Jun 2021 at 18:06, Martin Sebor <msebor@gmail.com> wrote:
>
> On 6/10/21 9:56 AM, Jonathan Wakely wrote:
> > On Thu, 10 Jun 2021 at 15:56, Martin Sebor wrote:
> >>
> >> On 6/10/21 4:40 AM, Jonathan Wakely via Gcc wrote:
> >>> On Thu, 10 Jun 2021 at 11:08, Jakub Jelinek <jakub@redhat.com> wrote:
> >>>>
> >>>> On Thu, Jun 10, 2021 at 11:01:49AM +0100, Jonathan Wakely via Gcc wrote:
> >>>>> On 10/06/21 10:44 +0100, Jonathan Wakely wrote:
> >>>>>>> Quite interesting idea! Are you willing to prepare a patch for it?
> >>>>>>
> >>>>>> This works.
> >>>>>
> >>>>> And this works better, because it checks the PR in the title matches
> >>>>> one in the changelog.
> >>>>>
> >>>>> I'll get something added to the tests and prep this for commit.
> >>>>
> >>>> Note, some commits fix more than one PR.  Sometimes the subject lists
> >>>> just one of them and the ChangeLog several, at other times people mention
> >>>> [PRnnnnnn, PRnnnnnn] etc. in the subject.
> >>>> I think checking that at least one changeLog PR line matches at least one PR
> >>>> in the subject would be good enough.
> >>>>
> >>>> Your regex will not match [PR123456, PR123457] in subject, perhaps ok
> >>>
> >>> Yeah, that wouldn't get matched, so no checks would be done for the
> >>> changelog body. Not ideal, but better than what we have no where
> >>> nothing is checked at all.
> >>>
> >>>> initially, and if I read it will will be happy if at least one line matches
> >>>> it.
> >>>
> >>> Yes, if the summary line has a single PR number, it must be present in
> >>> the changelog body. Other PR numbers can also be in the body, and they
> >>> aren't checked.
> >>>
> >>> But I've hit an issue trying to test it, because the testcases in
> >>> contrib/gcc-changelog/test_patches.txt are in the form of emails, and
> >>> the Subject: line from the emails is not passed to the GitInfo
> >>> constructor, so isn't part of the message that gets checked.
> >>>
> >>> Martin, Shouldn't the GitEmail class extract the Subject: from the
> >>> email header and use that as the first line passed to the GitInfo
> >>> object?
> >>>
> >>
> >> I'm a little lost as to what's being changed,
> >
> > Nothing is being changed.
>
> The script is being changed and Tobias' proposal that I understand
> is being discussed is:
>
>  >> One options would be to require a 'PR <comp>/<nnnnn+>' line if there is
>  >> 'PRnnnnn+' in the commit title, rejecting the commit otherwise.
>
> That sounds like a new requirement.  I want to understand what that
> will mean for me.

It's what the policies already recommend. It would mean you cannot say
"[PR1234]" in the first line of your git commit if you don't list a PR
in the changelog. Why would that affect you? Why would you be doing
that anyway? It was always wrong.

> >
> > If your patch is related to a bug, you're supposed to say so:
> > https://gcc.gnu.org/codingconventions.html#ChangeLogs
> >
> > And you're also supposed to put a [PRnnnn] tag in the email subject
> > line (which best practices say should also be the Subject: of the
> > email you send to gcc-patches):
> > https://gcc.gnu.org/contribute.html#patches
> >
> > The patches being proposed verify that if you put [PRnnnn] at the end
> > of the subject line, that you also put "PR component/nnnn" in the
> > changelog part. Because if you didn't, something is wrong and you're
> > failing to follow the existing policy (why would you put a [PRnnnn]
> > tag in the subject if it's not related to that PR, and if it is
> > related, why are you not putting it in the changelog part?)
>
> By "the subject line" are you referring to what the ChangeLog calls
> $git_description, and, AFAICS, consists of multiple lines?  (Based
> on the Example patch on the conventions page.  In that example,
> the PR87763 reference is in the middle of line 3.)


No, I mean the first line of the $git_description, which would also be

Please read https://chris.beams.io/posts/git-commit/ for git log best
practices. If you follow that, your git description has a single line
summary (the subject), followed by a blank line, followed by zero or
more lines of text.

The subject is the first line. Which also makes an excellent email
Subject: when you send the patch.

>
> I find this confusing.  Either we're talking about the Subject line
> of the email I send to gcc-patches with the patch or we're talking
> about the Git description ($git_description).  If they're supposed
> to be the same the ChangeLog convention should say so, and
> the Example patch should be changed to show that.

Yes, which is what I'm planning to do to the contribute.html and
codingconventions.html docs. You are complaining about exactly the
problems I'm pointing out and planning to fix.

We currently describe a useful and rich format for patch emails, but
allow any old garbage in the actual commit message. This is stupid. We
should describe and strongly recommend a useful and rich format for
the commit message itself. Then turning that into a useful email is
utterly trivial (you just add "[PATCH]" or "[committed]" at the start
of the first line of the commit message, and you have yourself a high
quality email for gcc-patches.

Many people are already doing it this way, because it produces better
git commit messages, and takes less effort when sending that as an
email. It's win win. We should document that as best practice, and
persuade everybody to do it that way (and IMHO enforce at least some
of it, but one step at a time).


> >
> >> and, truth be told,
> >> what exactly the current "right" format is.  Where are the PRnnnnn
> >> strings recognized as special?
> >>
> >> The ChangeLog description doesn't seem to cover this and I've been
> >> assuming they're recognized anywhere in the ChangeLog message, but
> >> I think I also noticed they don't always end up updating all
> >> the bugs.
> >
> > There are various reasons for that, including that non-ASCII
> > characters tend to break the automation.
> >
> > But in general, if you follow the policy of mentioning "PR
> > component/nnnn" in the changelog, your git commit should add a comment
> > to the mentioned bugzilla PRs.
>
> Again, I'm not sure what exactly you mean by "the changelog" here.
> Are you referring to the whole commit message or just to what
> the conventions refer to as the ChangeLog entry?

The latter.

>  I'm guessing
> the latter but my understanding of this proposal is to also require
> all the affected PRnnnnn references in the first line of the commit
> message (or the whole $git_description).

No, that is not the proposal, you've got it backwards.

The suggestion is that **IF** the first line mentions a PR, then the
ChangeLog entry must also mention that PR (and optionally, additional
PRs as well). That is not the same as "if the ChangeLog entry mentions
a PR, the first line must mention it too".


>
> >
> > Within a bugzilla comment, any string mentioning "PR nnnn" or "Bug
> > nnnn" (and variations on those) will get dynamically turned into a
> > link to that bug, but I think that happens on the fly when rendering
> > the HTML, and it's separate from the automation that adds git commits
> > to bugzilla as comments. I think the automation to add commits to
> > bugzilla uses a much stricter pattern to recognise a PR mentioned in
> > the commit.
>
> I'm familiar with this feature and I've been (naively) assuming
> the ChangeLog/Git commit hook script worked similarly.  I.e.,
> wherever I mention PRnnnnn (or the other forms) in a commit
> message the script interprets it as a reference to a bug that
> it then updates with the commit.
>
> If it doesn't (as it sounds like is the case) it might be helpful
> to mention that and explain how it's different. (Although I think
> it would be better to make both work the same.)
>
> I understand that PRnmmmm references under each ChangeLog entry are
> added verbatim to the ChangeLog file, as you explain below. I don't
> care about that.  I just want to make sure the right bugs get updated
> for my commits.

The references that get added to the ChangeLog file **also** update
bugzilla (modulo bugs due to non-ASCII characters, or other
unexplained bugs in the BZ updating). So if you follow the policy for
mentioning bugs in the ChangeLog entry (which has been in place for
many, many years), it Just Works.

> >
> >>
> >> FWIW, in commits for multiple PRs I've been adding a Resolves
> >> line like this:
> >>     https://gcc.gnu.org/pipermail/gcc-cvs/2021-May/347414.html
> >>
> >> I usually also add the PR numbers under each ChangeLog but I'm not
> >> sure it's necessary.  It would be good to know and for the ChangeLog
> >> convention to document how exactly this works, and if something
> >> changes, to update the documentation.
> >
> > It already says you should put it in the changelog. For each changelog
> > piece you add it to, it will be added to that relevant ChangeLog file.
> > It probably makes sense to add it to all of them, unless some piece of
> > the commit really isn't relevant to that PR (but then it should
> > probably be a separate commit!)
> >
> > You can also put it once, before all the changelog entries e.g.
> > https://gcc.gnu.org/g:91349e57bbfd010156b9128b2ad751c8843e7245
> > which got added to two ChangeLog files by the daily bump script:
> > https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=gcc/cp/ChangeLog;h=5a97fc84264e85b60adfdc50187ce7faeeb6f86f;hp=225b891700e88a58639d7ea0f10ad76ffb8d87f4;hb=c60387214593445d1514bf7852f27f4523458cda;hpb=25e5ecdf82b49977e86bfaded236fb34af2705ed
> > https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=gcc/testsuite/ChangeLog;h=9e31d686e1cb3b63b2141f86e5394aa1794ecdf5;hp=640fcbed0ebb9ed5e0b0ab12c52e1950284f9ee9;hb=4f625f47b4456e5c05a025fca4d072831e59126c;hpb=53cb324cb4f9475d4eabcd9f5a858c5edaacc0cf
> >
> > I do want to improve the https://gcc.gnu.org/contribute.html#patches
> > and https://gcc.gnu.org/codingconventions.html#ChangeLogs docs though.
> > We shouldn't have the info split in two places, and we should update
> > the recommendations to reflect current practices.
> >
>
> Great!  I'll do my best to help, at a minimum by critiquing your
> improvements ;)
>
> Martin
>


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-10 21:16                               ` Martin Sebor
  2021-06-10 21:28                                 ` Jakub Jelinek
@ 2021-06-11  9:13                                 ` Jonathan Wakely
  2021-06-11 17:02                                   ` Martin Sebor
  1 sibling, 1 reply; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-11  9:13 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jakub Jelinek, gcc

On Thu, 10 Jun 2021 at 22:16, Martin Sebor wrote:
> I don't see why the script users should be subjected to this tedium
> when it can be done in the script itself with (presumably) only
> a little more effort.  The proposed change is, IMO, a step in
> the wrong direction.

I don't see why "improve the mklog.py script to automatically follow
the policy" is in conflict with "enforce the policy".

If the script can be improved to do the right thing automatically when
you commit (it's not clear if it can be improved that way, but let's
say it can) then what's the problem with enforcing it when you push
the commit to the server? What are you being "subjected to"? If it's
automated, what is "this tedium"? You've already followed the policy,
why do you care if there are checks to make sure that everybody
follows the policy, including the people not using the script and
writing the entire commit msg by hand?

I'm not saying we are going to enforce every part of the policy
(because the policy isn't even clear right now), but if we wanted to
do it later, I don't understand your objection.


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-11  9:08                       ` Jonathan Wakely
@ 2021-06-11  9:35                         ` Jonathan Wakely
  2021-06-11 15:43                           ` Joseph Myers
  0 siblings, 1 reply; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-11  9:35 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jakub Jelinek, gcc

On Fri, 11 Jun 2021 at 10:08, Jonathan Wakely wrote:
>
> On Thu, 10 Jun 2021 at 18:06, Martin Sebor <msebor@gmail.com> wrote:
> > By "the subject line" are you referring to what the ChangeLog calls
> > $git_description, and, AFAICS, consists of multiple lines?  (Based
> > on the Example patch on the conventions page.  In that example,
> > the PR87763 reference is in the middle of line 3.)
>
>
> No, I mean the first line of the $git_description, which would also be

Sorry, the end of that sentence got lost on my clipboard.

The first line of the $git_description would also be the Subject of
your email, if you follow established Git best practices (e.g. in the
kernel, GitHub and elsewhere). For further reading:

https://git-scm.com/book/en/v2/Customizing-Git-Git-Configuration#_commit_template
https://chris.beams.io/posts/git-commit/
https://reflectoring.io/meaningful-commit-messages/#formatting

This one suggests a lot more structure, which is partially in conflict
with GCC's own guidelines, but it agrees that the commit msg should
start with a single-line description:
https://www.conventionalcommits.org/en/v1.0.0/

If you're not already doing a brief "subject" line in your git
commits, you're Doing It Wrong!

This is why I want the docs to go into more detail on the content of
the $git_description part of the git commit msg, because then reusing
that subject as your email Subject: is a no-brainer. So if you're
confused whether I mean the (first line of the) $git_description or
the email Subject: you should ask yourself why you think there is any
distinction between the two :-)

And for the avoidance of doubt ...

> > Again, I'm not sure what exactly you mean by "the changelog" here.
> > Are you referring to the whole commit message or just to what
> > the conventions refer to as the ChangeLog entry?
>
> The latter.

When I say changelog, I mean the changelog, and when I say commit
message, I mean the whole git commit message. I don't say "changelog"
to mean the whole git commit message. I'm using the terminology of the
codingconventions.html page:
"ChangeLog entries are part of git commit messages and are
automatically put into a corresponding ChangeLog file."

Using "changelog" to mean the whole commit message would be confusing
and/or ambiguous.


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-11  9:35                         ` Jonathan Wakely
@ 2021-06-11 15:43                           ` Joseph Myers
  2021-06-11 17:02                             ` Jonathan Wakely
  0 siblings, 1 reply; 84+ messages in thread
From: Joseph Myers @ 2021-06-11 15:43 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Martin Sebor, Jakub Jelinek, gcc

On Fri, 11 Jun 2021, Jonathan Wakely via Gcc wrote:

> If you're not already doing a brief "subject" line in your git
> commits, you're Doing It Wrong!

If you don't have a subject line which is more than one word, and does not 
look like a ChangeLog header line, and which is followed by a blank line, 
the commit hooks will reject the commit when you try to push it to the GCC 
repository.  (The check for a blank line after the subject is a generic 
feature of the hooks, the checks for single-word subjects and those that 
look like ChangeLog headers are part of the local commit_checker hook.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-11 15:43                           ` Joseph Myers
@ 2021-06-11 17:02                             ` Jonathan Wakely
  0 siblings, 0 replies; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-11 17:02 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Martin Sebor, Jakub Jelinek, gcc

On Fri, 11 Jun 2021 at 16:45, Joseph Myers <joseph@codesourcery.com> wrote:
>
> On Fri, 11 Jun 2021, Jonathan Wakely via Gcc wrote:
>
> > If you're not already doing a brief "subject" line in your git
> > commits, you're Doing It Wrong!
>
> If you don't have a subject line which is more than one word, and does not
> look like a ChangeLog header line, and which is followed by a blank line,
> the commit hooks will reject the commit when you try to push it to the GCC
> repository.  (The check for a blank line after the subject is a generic
> feature of the hooks, the checks for single-word subjects and those that
> look like ChangeLog headers are part of the local commit_checker hook.)

Oh good! All the more reason to document that format in
codingconventions.html then.


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-11  9:13                                 ` Jonathan Wakely
@ 2021-06-11 17:02                                   ` Martin Sebor
  2021-06-11 17:05                                     ` Jakub Jelinek
  2021-06-11 17:32                                     ` Jonathan Wakely
  0 siblings, 2 replies; 84+ messages in thread
From: Martin Sebor @ 2021-06-11 17:02 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Jakub Jelinek, gcc

On 6/11/21 3:13 AM, Jonathan Wakely wrote:
> On Thu, 10 Jun 2021 at 22:16, Martin Sebor wrote:
>> I don't see why the script users should be subjected to this tedium
>> when it can be done in the script itself with (presumably) only
>> a little more effort.  The proposed change is, IMO, a step in
>> the wrong direction.
> 
> I don't see why "improve the mklog.py script to automatically follow
> the policy" is in conflict with "enforce the policy".
> 
> If the script can be improved to do the right thing automatically when
> you commit (it's not clear if it can be improved that way, but let's
> say it can) then what's the problem with enforcing it when you push
> the commit to the server? What are you being "subjected to"? If it's
> automated, what is "this tedium"? You've already followed the policy,
> why do you care if there are checks to make sure that everybody
> follows the policy, including the people not using the script and
> writing the entire commit msg by hand?
> 
> I'm not saying we are going to enforce every part of the policy
> (because the policy isn't even clear right now), but if we wanted to
> do it later, I don't understand your objection.

My objection is to making our policies and tools more restrictive
than they need to be.  We shouldn't expect everyone to study whole
manuals just to figure out how to successfully commit a change (or
learn how to format it just the right way).  It should be easy.

Setting this discussion aside for a moment and using a different
example, the commit hook rejects commit messages that don't start
ChangeLog entries with tabs.  It also rejects commit messages that
don't list all the same test files as those changed by the commit
(and probably some others as well).  That's in my view unnecessary
when the hook could just replace the leading spaces with tabs and
automatically mention all the tests.

I see this proposal as heading in the same direction.  Rather than
making the script fix things up if we get them wrong it would reject
the commit, requiring the user to massage the ChangeLog by hand into
an unnecessarily rigid format.

Martin

PS I think I referred to mklog.py when I meant gcc-commit-mklog,
confusing the discussion and probably even myself.  Sorry about
that.  The mklog.py script is great.  It helps us avoid some of
these mistakes.  But the script isn't perfect and commit messages
for some more involved changes still need to be hand-edited.
That's when the mistakes usually creep back in.  But the fact
that we need a script to preformat a commit message for us in
the stringent format that makes the commit hook happy seems like
a pretty good indication that our policies and requirements are
too onerous for most of us to get all the details right.

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-11 17:02                                   ` Martin Sebor
@ 2021-06-11 17:05                                     ` Jakub Jelinek
  2021-06-11 17:32                                     ` Jonathan Wakely
  1 sibling, 0 replies; 84+ messages in thread
From: Jakub Jelinek @ 2021-06-11 17:05 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jonathan Wakely, gcc

On Fri, Jun 11, 2021 at 11:02:52AM -0600, Martin Sebor wrote:
> My objection is to making our policies and tools more restrictive
> than they need to be.  We shouldn't expect everyone to study whole
> manuals just to figure out how to successfully commit a change (or
> learn how to format it just the right way).  It should be easy.

Following the policies isn't that hard, and the scripts when they
reject a commit also explain why it was rejected, so it is trivial
to
git commit --amend
it and git push again.

	Jakub


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-11 17:02                                   ` Martin Sebor
  2021-06-11 17:05                                     ` Jakub Jelinek
@ 2021-06-11 17:32                                     ` Jonathan Wakely
  2021-06-11 18:01                                       ` Martin Sebor
  1 sibling, 1 reply; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-11 17:32 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jakub Jelinek, gcc

On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
> My objection is to making our policies and tools more restrictive
> than they need to be.  We shouldn't expect everyone to study whole
> manuals just to figure out how to successfully commit a change (or
> learn how to format it just the right way).  It should be easy.

I agree, to some extent. But consistency is also good. The conventions
for GNU ChangeLog formatting exist for a reason, and so do the
conventions for good Git commit messages.

> Setting this discussion aside for a moment and using a different
> example, the commit hook rejects commit messages that don't start
> ChangeLog entries with tabs.  It also rejects commit messages that
> don't list all the same test files as those changed by the commit
> (and probably some others as well).  That's in my view unnecessary
> when the hook could just replace the leading spaces with tabs and
> automatically mention all the tests.
>
> I see this proposal as heading in the same direction.  Rather than
> making the script fix things up if we get them wrong it would reject
> the commit, requiring the user to massage the ChangeLog by hand into
> an unnecessarily rigid format.

You cannot "fix things up" in a server-side receive hook, because
changing the commit message would alter the commit hash, which would
require the committer to do a rebase to proceed. That breaks the
expected behaviour and workflow of a git repo.

You can use the scripts on the client side to verify your commit
message before pushing, so you don't have to be surprised when the
server rejects it.


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-11 17:32                                     ` Jonathan Wakely
@ 2021-06-11 18:01                                       ` Martin Sebor
  2021-06-11 18:14                                         ` Jonathan Wakely
  2021-06-16  0:56                                         ` Hans-Peter Nilsson
  0 siblings, 2 replies; 84+ messages in thread
From: Martin Sebor @ 2021-06-11 18:01 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Jakub Jelinek, gcc

On 6/11/21 11:32 AM, Jonathan Wakely wrote:
> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
>> My objection is to making our policies and tools more restrictive
>> than they need to be.  We shouldn't expect everyone to study whole
>> manuals just to figure out how to successfully commit a change (or
>> learn how to format it just the right way).  It should be easy.
> 
> I agree, to some extent. But consistency is also good. The conventions
> for GNU ChangeLog formatting exist for a reason, and so do the
> conventions for good Git commit messages.
> 
>> Setting this discussion aside for a moment and using a different
>> example, the commit hook rejects commit messages that don't start
>> ChangeLog entries with tabs.  It also rejects commit messages that
>> don't list all the same test files as those changed by the commit
>> (and probably some others as well).  That's in my view unnecessary
>> when the hook could just replace the leading spaces with tabs and
>> automatically mention all the tests.
>>
>> I see this proposal as heading in the same direction.  Rather than
>> making the script fix things up if we get them wrong it would reject
>> the commit, requiring the user to massage the ChangeLog by hand into
>> an unnecessarily rigid format.
> 
> You cannot "fix things up" in a server-side receive hook, because
> changing the commit message would alter the commit hash, which would
> require the committer to do a rebase to proceed. That breaks the
> expected behaviour and workflow of a git repo.
> 
> You can use the scripts on the client side to verify your commit
> message before pushing, so you don't have to be surprised when the
> server rejects it.

That sounds like a killer argument.  Do we have shared client-side
scripts that could fix things up for us, or are we each on our own
to write them?

Martin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-11 18:01                                       ` Martin Sebor
@ 2021-06-11 18:14                                         ` Jonathan Wakely
  2021-06-16  0:56                                         ` Hans-Peter Nilsson
  1 sibling, 0 replies; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-11 18:14 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jakub Jelinek, gcc

On Fri, 11 Jun 2021 at 19:01, Martin Sebor wrote:
>
> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
> > On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
> >> My objection is to making our policies and tools more restrictive
> >> than they need to be.  We shouldn't expect everyone to study whole
> >> manuals just to figure out how to successfully commit a change (or
> >> learn how to format it just the right way).  It should be easy.
> >
> > I agree, to some extent. But consistency is also good. The conventions
> > for GNU ChangeLog formatting exist for a reason, and so do the
> > conventions for good Git commit messages.
> >
> >> Setting this discussion aside for a moment and using a different
> >> example, the commit hook rejects commit messages that don't start
> >> ChangeLog entries with tabs.  It also rejects commit messages that
> >> don't list all the same test files as those changed by the commit
> >> (and probably some others as well).  That's in my view unnecessary
> >> when the hook could just replace the leading spaces with tabs and
> >> automatically mention all the tests.
> >>
> >> I see this proposal as heading in the same direction.  Rather than
> >> making the script fix things up if we get them wrong it would reject
> >> the commit, requiring the user to massage the ChangeLog by hand into
> >> an unnecessarily rigid format.
> >
> > You cannot "fix things up" in a server-side receive hook, because
> > changing the commit message would alter the commit hash, which would
> > require the committer to do a rebase to proceed. That breaks the
> > expected behaviour and workflow of a git repo.
> >
> > You can use the scripts on the client side to verify your commit
> > message before pushing, so you don't have to be surprised when the
> > server rejects it.
>
> That sounds like a killer argument.  Do we have shared client-side
> scripts that could fix things up for us, or are we each on our own
> to write them?

The scripts used in the server-side hooks are all based on what's
under contrib/gcc-changelog, e.g. the
contrib/gcc-changelog/git_check_com
mit.py script can be used in a client-side hook to run the same checks
as the server-side hook does when it receives a push.

If you want to write utilities to automate things on the client-side,
that's where they should go.

That's also where the implementations live for the gcc-verify and
gcc-commit-mklog aliases created by the
contrib/gcc-git-customization.sh script.


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-11 18:01                                       ` Martin Sebor
  2021-06-11 18:14                                         ` Jonathan Wakely
@ 2021-06-16  0:56                                         ` Hans-Peter Nilsson
  2021-06-16  2:03                                           ` Martin Sebor
  1 sibling, 1 reply; 84+ messages in thread
From: Hans-Peter Nilsson @ 2021-06-16  0:56 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jonathan Wakely, Jakub Jelinek, gcc

On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:

> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
> > On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
> > > My objection is to making our policies and tools more restrictive
> > > than they need to be.  We shouldn't expect everyone to study whole
> > > manuals just to figure out how to successfully commit a change (or
> > > learn how to format it just the right way).  It should be easy.
> >
> > I agree, to some extent. But consistency is also good. The conventions
> > for GNU ChangeLog formatting exist for a reason, and so do the
> > conventions for good Git commit messages.
> >
> > > Setting this discussion aside for a moment and using a different
> > > example, the commit hook rejects commit messages that don't start
> > > ChangeLog entries with tabs.  It also rejects commit messages that
> > > don't list all the same test files as those changed by the commit
> > > (and probably some others as well).  That's in my view unnecessary
> > > when the hook could just replace the leading spaces with tabs and
> > > automatically mention all the tests.
> > >
> > > I see this proposal as heading in the same direction.  Rather than
> > > making the script fix things up if we get them wrong it would reject
> > > the commit, requiring the user to massage the ChangeLog by hand into
> > > an unnecessarily rigid format.
> >
> > You cannot "fix things up" in a server-side receive hook, because
> > changing the commit message would alter the commit hash, which would
> > require the committer to do a rebase to proceed. That breaks the
> > expected behaviour and workflow of a git repo.
> >
> > You can use the scripts on the client side to verify your commit
> > message before pushing, so you don't have to be surprised when the
> > server rejects it.
>
> That sounds like a killer argument.  Do we have shared client-side
> scripts that could fix things up for us, or are we each on our own
> to write them?

I hope I got your view wrong.  If not: the "scripts fixing
things up for us" direction is flawed (compared to the "scripts
rejecting bad formats"), unless offered as a non-default option;
please don't proceed.

Why?  For one, there'll always be bugs in the scripting.
Mitigate those situations: while wrongly rejecting a commit is
bad, wrongly "fixing things up" is worse, as a general rule.
Better avoid that.  (There's probably a popular "pattern name"
for what I try to describe.)

brgds, H-P

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-16  0:56                                         ` Hans-Peter Nilsson
@ 2021-06-16  2:03                                           ` Martin Sebor
  2021-06-16  3:42                                             ` Jason Merrill
                                                               ` (2 more replies)
  0 siblings, 3 replies; 84+ messages in thread
From: Martin Sebor @ 2021-06-16  2:03 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: Jonathan Wakely, Jakub Jelinek, gcc

On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
> On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
> 
>> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
>>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
>>>> My objection is to making our policies and tools more restrictive
>>>> than they need to be.  We shouldn't expect everyone to study whole
>>>> manuals just to figure out how to successfully commit a change (or
>>>> learn how to format it just the right way).  It should be easy.
>>>
>>> I agree, to some extent. But consistency is also good. The conventions
>>> for GNU ChangeLog formatting exist for a reason, and so do the
>>> conventions for good Git commit messages.
>>>
>>>> Setting this discussion aside for a moment and using a different
>>>> example, the commit hook rejects commit messages that don't start
>>>> ChangeLog entries with tabs.  It also rejects commit messages that
>>>> don't list all the same test files as those changed by the commit
>>>> (and probably some others as well).  That's in my view unnecessary
>>>> when the hook could just replace the leading spaces with tabs and
>>>> automatically mention all the tests.
>>>>
>>>> I see this proposal as heading in the same direction.  Rather than
>>>> making the script fix things up if we get them wrong it would reject
>>>> the commit, requiring the user to massage the ChangeLog by hand into
>>>> an unnecessarily rigid format.
>>>
>>> You cannot "fix things up" in a server-side receive hook, because
>>> changing the commit message would alter the commit hash, which would
>>> require the committer to do a rebase to proceed. That breaks the
>>> expected behaviour and workflow of a git repo.
>>>
>>> You can use the scripts on the client side to verify your commit
>>> message before pushing, so you don't have to be surprised when the
>>> server rejects it.
>>
>> That sounds like a killer argument.  Do we have shared client-side
>> scripts that could fix things up for us, or are we each on our own
>> to write them?
> 
> I hope I got your view wrong.  If not: the "scripts fixing
> things up for us" direction is flawed (compared to the "scripts
> rejecting bad formats"), unless offered as a non-default option;
> please don't proceed.
> 
> Why?  For one, there'll always be bugs in the scripting.
> Mitigate those situations: while wrongly rejecting a commit is
> bad, wrongly "fixing things up" is worse, as a general rule.
> Better avoid that.  (There's probably a popular "pattern name"
> for what I try to describe.)

The word that comes to mind is Technophobia.  Is it wise to trust
compilers to transform programs from their source form into
executables?  What if there are bugs in either?  What about the OS?
The whole computer, or the Internet?  Our cars?  Fortunately, there's
more to gain than to lose by trusting automation.  If there weren't
human progress would be stuck sometime in the 1700's.

But we're not talking about anything anywhere that sophisticated
here: a sed script to copy and paste a piece of text in
the description of a change from one place to another.  It's been
done a few times before with more important data than ChangeLogs.

Martin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-16  2:03                                           ` Martin Sebor
@ 2021-06-16  3:42                                             ` Jason Merrill
  2021-06-16 14:31                                               ` Martin Sebor
  2021-06-16 20:49                                               ` Jason Merrill
  2021-06-16 13:46                                             ` git gcc-commit-mklog doesn't extract PR number to ChangeLog Jonathan Wakely
  2021-06-16 17:44                                             ` Hans-Peter Nilsson
  2 siblings, 2 replies; 84+ messages in thread
From: Jason Merrill @ 2021-06-16  3:42 UTC (permalink / raw)
  To: Martin Sebor
  Cc: Hans-Peter Nilsson, Jakub Jelinek, gcc Mailing List, Jonathan Wakely

On Tue, Jun 15, 2021 at 10:04 PM Martin Sebor via Gcc <gcc@gcc.gnu.org>
wrote:

> On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
> > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
> >
> >> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
> >>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
> >>>> My objection is to making our policies and tools more restrictive
> >>>> than they need to be.  We shouldn't expect everyone to study whole
> >>>> manuals just to figure out how to successfully commit a change (or
> >>>> learn how to format it just the right way).  It should be easy.
> >>>
> >>> I agree, to some extent. But consistency is also good. The conventions
> >>> for GNU ChangeLog formatting exist for a reason, and so do the
> >>> conventions for good Git commit messages.
> >>>
> >>>> Setting this discussion aside for a moment and using a different
> >>>> example, the commit hook rejects commit messages that don't start
> >>>> ChangeLog entries with tabs.  It also rejects commit messages that
> >>>> don't list all the same test files as those changed by the commit
> >>>> (and probably some others as well).  That's in my view unnecessary
> >>>> when the hook could just replace the leading spaces with tabs and
> >>>> automatically mention all the tests.
> >>>>
> >>>> I see this proposal as heading in the same direction.  Rather than
> >>>> making the script fix things up if we get them wrong it would reject
> >>>> the commit, requiring the user to massage the ChangeLog by hand into
> >>>> an unnecessarily rigid format.
> >>>
> >>> You cannot "fix things up" in a server-side receive hook, because
> >>> changing the commit message would alter the commit hash, which would
> >>> require the committer to do a rebase to proceed. That breaks the
> >>> expected behaviour and workflow of a git repo.
> >>>
> >>> You can use the scripts on the client side to verify your commit
> >>> message before pushing, so you don't have to be surprised when the
> >>> server rejects it.
> >>
> >> That sounds like a killer argument.  Do we have shared client-side
> >> scripts that could fix things up for us, or are we each on our own
> >> to write them?
> >
> > I hope I got your view wrong.  If not: the "scripts fixing
> > things up for us" direction is flawed (compared to the "scripts
> > rejecting bad formats"), unless offered as a non-default option;
> > please don't proceed.
> >
> > Why?  For one, there'll always be bugs in the scripting.
> > Mitigate those situations: while wrongly rejecting a commit is
> > bad, wrongly "fixing things up" is worse, as a general rule.
> > Better avoid that.  (There's probably a popular "pattern name"
> > for what I try to describe.)
>
> The word that comes to mind is Technophobia.  Is it wise to trust
> compilers to transform programs from their source form into
> executables?  What if there are bugs in either?  What about the OS?
> The whole computer, or the Internet?  Our cars?  Fortunately, there's
> more to gain than to lose by trusting automation.  If there weren't
> human progress would be stuck sometime in the 1700's.
>
> But we're not talking about anything anywhere that sophisticated
> here: a sed script to copy and paste a piece of text in
> the description of a change from one place to another.  It's been
> done a few times before with more important data than ChangeLogs.
>

git gcc-commit-mklog already automates most of the process.  It could also
automate adding [PRxxxxx] to the first line.  Is that what you're asking
for?

Jason

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-16  2:03                                           ` Martin Sebor
  2021-06-16  3:42                                             ` Jason Merrill
@ 2021-06-16 13:46                                             ` Jonathan Wakely
  2021-06-16 17:44                                             ` Hans-Peter Nilsson
  2 siblings, 0 replies; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-16 13:46 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Hans-Peter Nilsson, Jakub Jelinek, gcc

On Wed, 16 Jun 2021 at 03:03, Martin Sebor <msebor@gmail.com> wrote:
>
> On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
> > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
> >
> >> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
> >>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
> >>>> My objection is to making our policies and tools more restrictive
> >>>> than they need to be.  We shouldn't expect everyone to study whole
> >>>> manuals just to figure out how to successfully commit a change (or
> >>>> learn how to format it just the right way).  It should be easy.
> >>>
> >>> I agree, to some extent. But consistency is also good. The conventions
> >>> for GNU ChangeLog formatting exist for a reason, and so do the
> >>> conventions for good Git commit messages.
> >>>
> >>>> Setting this discussion aside for a moment and using a different
> >>>> example, the commit hook rejects commit messages that don't start
> >>>> ChangeLog entries with tabs.  It also rejects commit messages that
> >>>> don't list all the same test files as those changed by the commit
> >>>> (and probably some others as well).  That's in my view unnecessary
> >>>> when the hook could just replace the leading spaces with tabs and
> >>>> automatically mention all the tests.
> >>>>
> >>>> I see this proposal as heading in the same direction.  Rather than
> >>>> making the script fix things up if we get them wrong it would reject
> >>>> the commit, requiring the user to massage the ChangeLog by hand into
> >>>> an unnecessarily rigid format.
> >>>
> >>> You cannot "fix things up" in a server-side receive hook, because
> >>> changing the commit message would alter the commit hash, which would
> >>> require the committer to do a rebase to proceed. That breaks the
> >>> expected behaviour and workflow of a git repo.
> >>>
> >>> You can use the scripts on the client side to verify your commit
> >>> message before pushing, so you don't have to be surprised when the
> >>> server rejects it.
> >>
> >> That sounds like a killer argument.  Do we have shared client-side
> >> scripts that could fix things up for us, or are we each on our own
> >> to write them?
> >
> > I hope I got your view wrong.  If not: the "scripts fixing
> > things up for us" direction is flawed (compared to the "scripts
> > rejecting bad formats"), unless offered as a non-default option;
> > please don't proceed.
> >
> > Why?  For one, there'll always be bugs in the scripting.
> > Mitigate those situations: while wrongly rejecting a commit is
> > bad, wrongly "fixing things up" is worse, as a general rule.
> > Better avoid that.  (There's probably a popular "pattern name"
> > for what I try to describe.)
>
> The word that comes to mind is Technophobia.  Is it wise to trust
> compilers to transform programs from their source form into
> executables?  What if there are bugs in either?  What about the OS?
> The whole computer, or the Internet?  Our cars?  Fortunately, there's
> more to gain than to lose by trusting automation.  If there weren't
> human progress would be stuck sometime in the 1700's.
>
> But we're not talking about anything anywhere that sophisticated
> here: a sed script to copy and paste a piece of text in
> the description of a change from one place to another.  It's been
> done a few times before with more important data than ChangeLogs.

But git commits are immutable (in practice, because we don't allow
rewriting history by force pushing) so having automated rewrites on
the server side where mistakes can't be corrected is a bad idea. Just
get your commit message right before you push (using automated tools
if you prefer) and give it a once over before pushing.


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-16  3:42                                             ` Jason Merrill
@ 2021-06-16 14:31                                               ` Martin Sebor
  2021-06-16 20:49                                               ` Jason Merrill
  1 sibling, 0 replies; 84+ messages in thread
From: Martin Sebor @ 2021-06-16 14:31 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Hans-Peter Nilsson, Jakub Jelinek, gcc Mailing List, Jonathan Wakely

On 6/15/21 9:42 PM, Jason Merrill wrote:
> On Tue, Jun 15, 2021 at 10:04 PM Martin Sebor via Gcc <gcc@gcc.gnu.org 
> <mailto:gcc@gcc.gnu.org>> wrote:
> 
>     On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
>      > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
>      >
>      >> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
>      >>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
>      >>>> My objection is to making our policies and tools more restrictive
>      >>>> than they need to be.  We shouldn't expect everyone to study whole
>      >>>> manuals just to figure out how to successfully commit a change (or
>      >>>> learn how to format it just the right way).  It should be easy.
>      >>>
>      >>> I agree, to some extent. But consistency is also good. The
>     conventions
>      >>> for GNU ChangeLog formatting exist for a reason, and so do the
>      >>> conventions for good Git commit messages.
>      >>>
>      >>>> Setting this discussion aside for a moment and using a different
>      >>>> example, the commit hook rejects commit messages that don't start
>      >>>> ChangeLog entries with tabs.  It also rejects commit messages that
>      >>>> don't list all the same test files as those changed by the commit
>      >>>> (and probably some others as well).  That's in my view unnecessary
>      >>>> when the hook could just replace the leading spaces with tabs and
>      >>>> automatically mention all the tests.
>      >>>>
>      >>>> I see this proposal as heading in the same direction.  Rather than
>      >>>> making the script fix things up if we get them wrong it would
>     reject
>      >>>> the commit, requiring the user to massage the ChangeLog by
>     hand into
>      >>>> an unnecessarily rigid format.
>      >>>
>      >>> You cannot "fix things up" in a server-side receive hook, because
>      >>> changing the commit message would alter the commit hash, which
>     would
>      >>> require the committer to do a rebase to proceed. That breaks the
>      >>> expected behaviour and workflow of a git repo.
>      >>>
>      >>> You can use the scripts on the client side to verify your commit
>      >>> message before pushing, so you don't have to be surprised when the
>      >>> server rejects it.
>      >>
>      >> That sounds like a killer argument.  Do we have shared client-side
>      >> scripts that could fix things up for us, or are we each on our own
>      >> to write them?
>      >
>      > I hope I got your view wrong.  If not: the "scripts fixing
>      > things up for us" direction is flawed (compared to the "scripts
>      > rejecting bad formats"), unless offered as a non-default option;
>      > please don't proceed.
>      >
>      > Why?  For one, there'll always be bugs in the scripting.
>      > Mitigate those situations: while wrongly rejecting a commit is
>      > bad, wrongly "fixing things up" is worse, as a general rule.
>      > Better avoid that.  (There's probably a popular "pattern name"
>      > for what I try to describe.)
> 
>     The word that comes to mind is Technophobia.  Is it wise to trust
>     compilers to transform programs from their source form into
>     executables?  What if there are bugs in either?  What about the OS?
>     The whole computer, or the Internet?  Our cars?  Fortunately, there's
>     more to gain than to lose by trusting automation.  If there weren't
>     human progress would be stuck sometime in the 1700's.
> 
>     But we're not talking about anything anywhere that sophisticated
>     here: a sed script to copy and paste a piece of text in
>     the description of a change from one place to another.  It's been
>     done a few times before with more important data than ChangeLogs.
> 
> 
> git gcc-commit-mklog already automates most of the process.  It could 
> also automate adding [PRxxxxx] to the first line.  Is that what you're 
> asking for?

My suggestion was to have the script adjust the commit message when
it found something amiss (like a PRnnnn in the subject but not in
the ChangeLog entry that prompted this thread, or leading spaces
rather than tabs, missing test files, etc.), instead of rejecting
it and making the user fix it.  Jonathan explained that (IIUC)
changing the commit message by the commit hook on the server side
would result in a different commit hash and that the idea would
only work by running a pre-commit script on the client side.
Until we have such a script rejecting these simple mistakes appears
to be the best we can do.  (Barring more radical approaches like
completely automating the ChangeLog file maintenance or getting
rid of it).

Martin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-16  2:03                                           ` Martin Sebor
  2021-06-16  3:42                                             ` Jason Merrill
  2021-06-16 13:46                                             ` git gcc-commit-mklog doesn't extract PR number to ChangeLog Jonathan Wakely
@ 2021-06-16 17:44                                             ` Hans-Peter Nilsson
  2 siblings, 0 replies; 84+ messages in thread
From: Hans-Peter Nilsson @ 2021-06-16 17:44 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jonathan Wakely, Jakub Jelinek, gcc

On Tue, 15 Jun 2021, Martin Sebor wrote:
> On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
> > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
> >
> > > On 6/11/21 11:32 AM, Jonathan Wakely wrote:
> > > > On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
> > > > > My objection is to making our policies and tools more restrictive
> > > > > than they need to be.  We shouldn't expect everyone to study whole
> > > > > manuals just to figure out how to successfully commit a change (or
> > > > > learn how to format it just the right way).  It should be easy.
> > > >
> > > > I agree, to some extent. But consistency is also good. The conventions
> > > > for GNU ChangeLog formatting exist for a reason, and so do the
> > > > conventions for good Git commit messages.
> > > >
> > > > > Setting this discussion aside for a moment and using a different
> > > > > example, the commit hook rejects commit messages that don't start
> > > > > ChangeLog entries with tabs.  It also rejects commit messages that
> > > > > don't list all the same test files as those changed by the commit
> > > > > (and probably some others as well).  That's in my view unnecessary
> > > > > when the hook could just replace the leading spaces with tabs and
> > > >
> > > > You cannot "fix things up" in a server-side receive hook, because
> > > > changing the commit message would alter the commit hash, which would
> > > > require the committer to do a rebase to proceed. That breaks the
> > > > expected behaviour and workflow of a git repo.
> > > >
> > > > You can use the scripts on the client side to verify your commit
> > > > message before pushing, so you don't have to be surprised when the
> > > > server rejects it.
> > >
> > > That sounds like a killer argument.  Do we have shared client-side
> > > scripts that could fix things up for us, or are we each on our own
> > > to write them?
> >
> > > > > automatically mention all the tests.
> > > > >
> > > > > I see this proposal as heading in the same direction.  Rather than
> > > > > making the script fix things up if we get them wrong it would reject
> > > > > the commit, requiring the user to massage the ChangeLog by hand into
> > > > > an unnecessarily rigid format.

> > I hope I got your view wrong.  If not: the "scripts fixing
> > things up for us" direction is flawed (compared to the "scripts
> > rejecting bad formats"), unless offered as a non-default option;
> > please don't proceed.
> >
> > Why?  For one, there'll always be bugs in the scripting.
> > Mitigate those situations: while wrongly rejecting a commit is
> > bad, wrongly "fixing things up" is worse, as a general rule.
> > Better avoid that.  (There's probably a popular "pattern name"
> > for what I try to describe.)
>
> The word that comes to mind is Technophobia.  Is it wise to trust
> compilers to transform programs from their source form into
> executables?  What if there are bugs in either?  What about the OS?
> The whole computer, or the Internet?  Our cars?  Fortunately, there's
> more to gain than to lose by trusting automation.  If there weren't
> human progress would be stuck sometime in the 1700's.
>
> But we're not talking about anything anywhere that sophisticated
> here: a sed script to copy and paste a piece of text in
> the description of a change from one place to another.  It's been
> done a few times before with more important data than ChangeLogs.

Your hyperbole would be funnier if it wasn't too much over the
top.  Criticizing the suggested approach for suffering from a
technical fallacy ("outsmarting the user") isn't technophobia.
Keep it in the helper scripts, don't put it in a commit hook.

brgds, H-P

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-16  3:42                                             ` Jason Merrill
  2021-06-16 14:31                                               ` Martin Sebor
@ 2021-06-16 20:49                                               ` Jason Merrill
  2021-06-16 21:45                                                 ` Martin Sebor
  1 sibling, 1 reply; 84+ messages in thread
From: Jason Merrill @ 2021-06-16 20:49 UTC (permalink / raw)
  To: Martin Sebor
  Cc: Hans-Peter Nilsson, Jakub Jelinek, gcc Mailing List, Jonathan Wakely

[-- Attachment #1: Type: text/plain, Size: 3940 bytes --]

On 6/15/21 11:42 PM, Jason Merrill wrote:
> On Tue, Jun 15, 2021 at 10:04 PM Martin Sebor via Gcc <gcc@gcc.gnu.org 
> <mailto:gcc@gcc.gnu.org>> wrote:
> 
>     On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
>      > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
>      >
>      >> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
>      >>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
>      >>>> My objection is to making our policies and tools more restrictive
>      >>>> than they need to be.  We shouldn't expect everyone to study whole
>      >>>> manuals just to figure out how to successfully commit a change (or
>      >>>> learn how to format it just the right way).  It should be easy.
>      >>>
>      >>> I agree, to some extent. But consistency is also good. The
>     conventions
>      >>> for GNU ChangeLog formatting exist for a reason, and so do the
>      >>> conventions for good Git commit messages.
>      >>>
>      >>>> Setting this discussion aside for a moment and using a different
>      >>>> example, the commit hook rejects commit messages that don't start
>      >>>> ChangeLog entries with tabs.  It also rejects commit messages that
>      >>>> don't list all the same test files as those changed by the commit
>      >>>> (and probably some others as well).  That's in my view unnecessary
>      >>>> when the hook could just replace the leading spaces with tabs and
>      >>>> automatically mention all the tests.
>      >>>>
>      >>>> I see this proposal as heading in the same direction.  Rather than
>      >>>> making the script fix things up if we get them wrong it would
>     reject
>      >>>> the commit, requiring the user to massage the ChangeLog by
>     hand into
>      >>>> an unnecessarily rigid format.
>      >>>
>      >>> You cannot "fix things up" in a server-side receive hook, because
>      >>> changing the commit message would alter the commit hash, which
>     would
>      >>> require the committer to do a rebase to proceed. That breaks the
>      >>> expected behaviour and workflow of a git repo.
>      >>>
>      >>> You can use the scripts on the client side to verify your commit
>      >>> message before pushing, so you don't have to be surprised when the
>      >>> server rejects it.
>      >>
>      >> That sounds like a killer argument.  Do we have shared client-side
>      >> scripts that could fix things up for us, or are we each on our own
>      >> to write them?
>      >
>      > I hope I got your view wrong.  If not: the "scripts fixing
>      > things up for us" direction is flawed (compared to the "scripts
>      > rejecting bad formats"), unless offered as a non-default option;
>      > please don't proceed.
>      >
>      > Why?  For one, there'll always be bugs in the scripting.
>      > Mitigate those situations: while wrongly rejecting a commit is
>      > bad, wrongly "fixing things up" is worse, as a general rule.
>      > Better avoid that.  (There's probably a popular "pattern name"
>      > for what I try to describe.)
> 
>     The word that comes to mind is Technophobia.  Is it wise to trust
>     compilers to transform programs from their source form into
>     executables?  What if there are bugs in either?  What about the OS?
>     The whole computer, or the Internet?  Our cars?  Fortunately, there's
>     more to gain than to lose by trusting automation.  If there weren't
>     human progress would be stuck sometime in the 1700's.
> 
>     But we're not talking about anything anywhere that sophisticated
>     here: a sed script to copy and paste a piece of text in
>     the description of a change from one place to another.  It's been
>     done a few times before with more important data than ChangeLogs.
> 
> 
> git gcc-commit-mklog already automates most of the process.  It could 
> also automate adding [PRxxxxx] to the first line.  Is that what you're 
> asking for?

Like, say:

[-- Attachment #2: 0001-mklog-add-subject-line-skeleton.patch --]
[-- Type: text/x-patch, Size: 2483 bytes --]

From a4e1c5eef3491e3d2e72a01e864af204f124a994 Mon Sep 17 00:00:00 2001
From: Jason Merrill <jason@redhat.com>
Date: Wed, 16 Jun 2021 16:46:38 -0400
Subject: [PATCH] mklog: add subject line skeleton
To: gcc-patches@gcc.gnu.org

contrib/ChangeLog:

	* mklog.py: Add an initial component: [PRnnnnn] line when
	we have a PR.
---
 contrib/mklog.py | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 5c93c707128..6113ae30a8e 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -39,6 +39,7 @@ import requests
 from unidiff import PatchSet
 
 pr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<pr>PR [a-z+-]+\/[0-9]+)')
+prnum_regex = re.compile(r'PR (?P<comp>[a-z+-]+)/(?P<num>[0-9]+)')
 dr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<dr>DR [0-9]+)')
 dg_regex = re.compile(r'{\s+dg-(error|warning)')
 identifier_regex = re.compile(r'^([a-zA-Z0-9_#].*)')
@@ -67,6 +68,7 @@ PATCH must be generated using diff(1)'s -up or -cp options
 script_folder = os.path.realpath(__file__)
 root = os.path.dirname(os.path.dirname(script_folder))
 
+firstpr = ''
 
 def find_changelog(path):
     folder = os.path.split(path)[0]
@@ -134,6 +136,7 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
     prs = []
     out = ''
     diff = PatchSet(data)
+    global firstpr;
 
     for file in diff:
         # skip files that can't be parsed
@@ -166,6 +169,9 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
                         # Found dg-warning/dg-error line
                         break
 
+    if prs:
+        firstpr = prs[0]
+
     if fill_pr_titles:
         out += get_pr_titles(prs)
 
@@ -308,8 +314,14 @@ if __name__ == '__main__':
             start = list(takewhile(lambda l: not l.startswith('#'), lines))
             end = lines[len(start):]
             with open(args.changelog, 'w') as f:
+                if (not start) or (start[0] == ''):
+                    # initial commit subject line 'component: [PRnnnnn]'
+                    m = prnum_regex.match(firstpr)
+                    if m:
+                        start.insert (0, m.group('comp') + ': [PR'
+                                      + m.group('num') + ']')
                 if start:
-                    # appent empty line
+                    # append empty line
                     if start[-1] != '':
                         start.append('')
                 else:
-- 
2.27.0


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-16 20:49                                               ` Jason Merrill
@ 2021-06-16 21:45                                                 ` Martin Sebor
  2021-06-16 23:45                                                   ` Jason Merrill
  0 siblings, 1 reply; 84+ messages in thread
From: Martin Sebor @ 2021-06-16 21:45 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Hans-Peter Nilsson, Jakub Jelinek, gcc Mailing List, Jonathan Wakely

On 6/16/21 2:49 PM, Jason Merrill wrote:
> On 6/15/21 11:42 PM, Jason Merrill wrote:
>> On Tue, Jun 15, 2021 at 10:04 PM Martin Sebor via Gcc <gcc@gcc.gnu.org 
>> <mailto:gcc@gcc.gnu.org>> wrote:
>>
>>     On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
>>      > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
>>      >
>>      >> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
>>      >>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
>>      >>>> My objection is to making our policies and tools more 
>> restrictive
>>      >>>> than they need to be.  We shouldn't expect everyone to study 
>> whole
>>      >>>> manuals just to figure out how to successfully commit a 
>> change (or
>>      >>>> learn how to format it just the right way).  It should be easy.
>>      >>>
>>      >>> I agree, to some extent. But consistency is also good. The
>>     conventions
>>      >>> for GNU ChangeLog formatting exist for a reason, and so do the
>>      >>> conventions for good Git commit messages.
>>      >>>
>>      >>>> Setting this discussion aside for a moment and using a 
>> different
>>      >>>> example, the commit hook rejects commit messages that don't 
>> start
>>      >>>> ChangeLog entries with tabs.  It also rejects commit 
>> messages that
>>      >>>> don't list all the same test files as those changed by the 
>> commit
>>      >>>> (and probably some others as well).  That's in my view 
>> unnecessary
>>      >>>> when the hook could just replace the leading spaces with 
>> tabs and
>>      >>>> automatically mention all the tests.
>>      >>>>
>>      >>>> I see this proposal as heading in the same direction.  
>> Rather than
>>      >>>> making the script fix things up if we get them wrong it would
>>     reject
>>      >>>> the commit, requiring the user to massage the ChangeLog by
>>     hand into
>>      >>>> an unnecessarily rigid format.
>>      >>>
>>      >>> You cannot "fix things up" in a server-side receive hook, 
>> because
>>      >>> changing the commit message would alter the commit hash, which
>>     would
>>      >>> require the committer to do a rebase to proceed. That breaks the
>>      >>> expected behaviour and workflow of a git repo.
>>      >>>
>>      >>> You can use the scripts on the client side to verify your commit
>>      >>> message before pushing, so you don't have to be surprised 
>> when the
>>      >>> server rejects it.
>>      >>
>>      >> That sounds like a killer argument.  Do we have shared 
>> client-side
>>      >> scripts that could fix things up for us, or are we each on our 
>> own
>>      >> to write them?
>>      >
>>      > I hope I got your view wrong.  If not: the "scripts fixing
>>      > things up for us" direction is flawed (compared to the "scripts
>>      > rejecting bad formats"), unless offered as a non-default option;
>>      > please don't proceed.
>>      >
>>      > Why?  For one, there'll always be bugs in the scripting.
>>      > Mitigate those situations: while wrongly rejecting a commit is
>>      > bad, wrongly "fixing things up" is worse, as a general rule.
>>      > Better avoid that.  (There's probably a popular "pattern name"
>>      > for what I try to describe.)
>>
>>     The word that comes to mind is Technophobia.  Is it wise to trust
>>     compilers to transform programs from their source form into
>>     executables?  What if there are bugs in either?  What about the OS?
>>     The whole computer, or the Internet?  Our cars?  Fortunately, there's
>>     more to gain than to lose by trusting automation.  If there weren't
>>     human progress would be stuck sometime in the 1700's.
>>
>>     But we're not talking about anything anywhere that sophisticated
>>     here: a sed script to copy and paste a piece of text in
>>     the description of a change from one place to another.  It's been
>>     done a few times before with more important data than ChangeLogs.
>>
>>
>> git gcc-commit-mklog already automates most of the process.  It could 
>> also automate adding [PRxxxxx] to the first line.  Is that what you're 
>> asking for?
> 
> Like, say:

I don't think this solves the problem Xionghu Luo was asking about:
   https://gcc.gnu.org/pipermail/gcc/2021-June/236346.html
i.e., they did have a [PRnnnn] in the one line subject but not in
their ChangeLog entries.  It also not clear if they used mklog.py
at all.  IME, mklog.py already puts in a [PRnnnn] near the top of
a patch if it finds in one of the tests.  Though it doesn't seem
to put in the ChangeLog entries.  Odd.

I was suggesting to make this (and the other things the commit
hook rejects commit for) happen automatically by running a script
at the same time as git commit.

But to be clear, I'm not asking for anything for myself.  Although
I use mklog.py I have my own script that does what I suggest that
I could go back to.  I responded to this thread because I think
these minute details could be automated for everyone's benefit.
Before moving to Git we talked about doing much more, including
automatically running a format/style checker on the patch and
(IIRC) Jeff even wanted it to do some basic tweaks on its own
(like replace spaces with tabs).

Martin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-16 21:45                                                 ` Martin Sebor
@ 2021-06-16 23:45                                                   ` Jason Merrill
  2021-06-17  0:17                                                     ` Martin Sebor
  0 siblings, 1 reply; 84+ messages in thread
From: Jason Merrill @ 2021-06-16 23:45 UTC (permalink / raw)
  To: Martin Sebor
  Cc: Hans-Peter Nilsson, Jakub Jelinek, gcc Mailing List, Jonathan Wakely

On Wed, Jun 16, 2021 at 5:46 PM Martin Sebor <msebor@gmail.com> wrote:

> On 6/16/21 2:49 PM, Jason Merrill wrote:
> > On 6/15/21 11:42 PM, Jason Merrill wrote:
> >> On Tue, Jun 15, 2021 at 10:04 PM Martin Sebor via Gcc <gcc@gcc.gnu.org
> >> <mailto:gcc@gcc.gnu.org>> wrote:
> >>
> >>     On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
> >>      > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
> >>      >
> >>      >> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
> >>      >>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
> >>      >>>> My objection is to making our policies and tools more
> >> restrictive
> >>      >>>> than they need to be.  We shouldn't expect everyone to study
> >> whole
> >>      >>>> manuals just to figure out how to successfully commit a
> >> change (or
> >>      >>>> learn how to format it just the right way).  It should be
> easy.
> >>      >>>
> >>      >>> I agree, to some extent. But consistency is also good. The
> >>     conventions
> >>      >>> for GNU ChangeLog formatting exist for a reason, and so do the
> >>      >>> conventions for good Git commit messages.
> >>      >>>
> >>      >>>> Setting this discussion aside for a moment and using a
> >> different
> >>      >>>> example, the commit hook rejects commit messages that don't
> >> start
> >>      >>>> ChangeLog entries with tabs.  It also rejects commit
> >> messages that
> >>      >>>> don't list all the same test files as those changed by the
> >> commit
> >>      >>>> (and probably some others as well).  That's in my view
> >> unnecessary
> >>      >>>> when the hook could just replace the leading spaces with
> >> tabs and
> >>      >>>> automatically mention all the tests.
> >>      >>>>
> >>      >>>> I see this proposal as heading in the same direction.
> >> Rather than
> >>      >>>> making the script fix things up if we get them wrong it would
> >>     reject
> >>      >>>> the commit, requiring the user to massage the ChangeLog by
> >>     hand into
> >>      >>>> an unnecessarily rigid format.
> >>      >>>
> >>      >>> You cannot "fix things up" in a server-side receive hook,
> >> because
> >>      >>> changing the commit message would alter the commit hash, which
> >>     would
> >>      >>> require the committer to do a rebase to proceed. That breaks
> the
> >>      >>> expected behaviour and workflow of a git repo.
> >>      >>>
> >>      >>> You can use the scripts on the client side to verify your
> commit
> >>      >>> message before pushing, so you don't have to be surprised
> >> when the
> >>      >>> server rejects it.
> >>      >>
> >>      >> That sounds like a killer argument.  Do we have shared
> >> client-side
> >>      >> scripts that could fix things up for us, or are we each on our
> >> own
> >>      >> to write them?
> >>      >
> >>      > I hope I got your view wrong.  If not: the "scripts fixing
> >>      > things up for us" direction is flawed (compared to the "scripts
> >>      > rejecting bad formats"), unless offered as a non-default option;
> >>      > please don't proceed.
> >>      >
> >>      > Why?  For one, there'll always be bugs in the scripting.
> >>      > Mitigate those situations: while wrongly rejecting a commit is
> >>      > bad, wrongly "fixing things up" is worse, as a general rule.
> >>      > Better avoid that.  (There's probably a popular "pattern name"
> >>      > for what I try to describe.)
> >>
> >>     The word that comes to mind is Technophobia.  Is it wise to trust
> >>     compilers to transform programs from their source form into
> >>     executables?  What if there are bugs in either?  What about the OS?
> >>     The whole computer, or the Internet?  Our cars?  Fortunately,
> there's
> >>     more to gain than to lose by trusting automation.  If there weren't
> >>     human progress would be stuck sometime in the 1700's.
> >>
> >>     But we're not talking about anything anywhere that sophisticated
> >>     here: a sed script to copy and paste a piece of text in
> >>     the description of a change from one place to another.  It's been
> >>     done a few times before with more important data than ChangeLogs.
> >>
> >>
> >> git gcc-commit-mklog already automates most of the process.  It could
> >> also automate adding [PRxxxxx] to the first line.  Is that what you're
> >> asking for?
> >
> > Like, say:
>
> I don't think this solves the problem Xionghu Luo was asking about:
>    https://gcc.gnu.org/pipermail/gcc/2021-June/236346.html


Indeed, their problem was not mentioning the PR in the testcase, which a
script isn't going to fix.

i.e., they did have a [PRnnnn] in the one line subject but not in
> their ChangeLog entries.  It also not clear if they used mklog.py
> at all.  IME, mklog.py already puts in a [PRnnnn] near the top of
> a patch if it finds in one of the tests.  Though it doesn't seem
> to put in the ChangeLog entries.  Odd.
>

It currently puts in

 PR comp/nnnnn

at the beginning of the ChangeLog entries; it used to put them in the
entries for each ChangeLog file, but that changed in r12-771.  My patch
also adds the [PRnnnn] to the subject line.

I was suggesting to make this (and the other things the commit
> hook rejects commit for) happen automatically by running a script
> at the same time as git commit.
>
> But to be clear, I'm not asking for anything for myself.  Although
> I use mklog.py I have my own script that does what I suggest that
> I could go back to.  I responded to this thread because I think
> these minute details could be automated for everyone's benefit.
> Before moving to Git we talked about doing much more, including
> automatically running a format/style checker on the patch and
> (IIRC) Jeff even wanted it to do some basic tweaks on its own
> (like replace spaces with tabs).
>

We could do various cleanup in the 'commit-msg' hook on the user side.  Or,
probably better, git gcc-verify could clean up some of the issues it finds.

Jason

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-16 23:45                                                   ` Jason Merrill
@ 2021-06-17  0:17                                                     ` Martin Sebor
  2021-06-17  0:40                                                       ` Jason Merrill
  2021-06-21  7:54                                                       ` [Patch, v2] contrib/mklog.py: Improve PR handling (was: " Tobias Burnus
  0 siblings, 2 replies; 84+ messages in thread
From: Martin Sebor @ 2021-06-17  0:17 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Hans-Peter Nilsson, Jakub Jelinek, gcc Mailing List, Jonathan Wakely

[-- Attachment #1: Type: text/plain, Size: 8098 bytes --]

On 6/16/21 5:45 PM, Jason Merrill wrote:
> On Wed, Jun 16, 2021 at 5:46 PM Martin Sebor <msebor@gmail.com 
> <mailto:msebor@gmail.com>> wrote:
> 
>     On 6/16/21 2:49 PM, Jason Merrill wrote:
>      > On 6/15/21 11:42 PM, Jason Merrill wrote:
>      >> On Tue, Jun 15, 2021 at 10:04 PM Martin Sebor via Gcc
>     <gcc@gcc.gnu.org <mailto:gcc@gcc.gnu.org>
>      >> <mailto:gcc@gcc.gnu.org <mailto:gcc@gcc.gnu.org>>> wrote:
>      >>
>      >>     On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
>      >>      > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
>      >>      >
>      >>      >> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
>      >>      >>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
>      >>      >>>> My objection is to making our policies and tools more
>      >> restrictive
>      >>      >>>> than they need to be.  We shouldn't expect everyone to
>     study
>      >> whole
>      >>      >>>> manuals just to figure out how to successfully commit a
>      >> change (or
>      >>      >>>> learn how to format it just the right way).  It should
>     be easy.
>      >>      >>>
>      >>      >>> I agree, to some extent. But consistency is also good. The
>      >>     conventions
>      >>      >>> for GNU ChangeLog formatting exist for a reason, and so
>     do the
>      >>      >>> conventions for good Git commit messages.
>      >>      >>>
>      >>      >>>> Setting this discussion aside for a moment and using a
>      >> different
>      >>      >>>> example, the commit hook rejects commit messages that
>     don't
>      >> start
>      >>      >>>> ChangeLog entries with tabs.  It also rejects commit
>      >> messages that
>      >>      >>>> don't list all the same test files as those changed by
>     the
>      >> commit
>      >>      >>>> (and probably some others as well).  That's in my view
>      >> unnecessary
>      >>      >>>> when the hook could just replace the leading spaces with
>      >> tabs and
>      >>      >>>> automatically mention all the tests.
>      >>      >>>>
>      >>      >>>> I see this proposal as heading in the same direction.
>      >> Rather than
>      >>      >>>> making the script fix things up if we get them wrong
>     it would
>      >>     reject
>      >>      >>>> the commit, requiring the user to massage the ChangeLog by
>      >>     hand into
>      >>      >>>> an unnecessarily rigid format.
>      >>      >>>
>      >>      >>> You cannot "fix things up" in a server-side receive hook,
>      >> because
>      >>      >>> changing the commit message would alter the commit
>     hash, which
>      >>     would
>      >>      >>> require the committer to do a rebase to proceed. That
>     breaks the
>      >>      >>> expected behaviour and workflow of a git repo.
>      >>      >>>
>      >>      >>> You can use the scripts on the client side to verify
>     your commit
>      >>      >>> message before pushing, so you don't have to be surprised
>      >> when the
>      >>      >>> server rejects it.
>      >>      >>
>      >>      >> That sounds like a killer argument.  Do we have shared
>      >> client-side
>      >>      >> scripts that could fix things up for us, or are we each
>     on our
>      >> own
>      >>      >> to write them?
>      >>      >
>      >>      > I hope I got your view wrong.  If not: the "scripts fixing
>      >>      > things up for us" direction is flawed (compared to the
>     "scripts
>      >>      > rejecting bad formats"), unless offered as a non-default
>     option;
>      >>      > please don't proceed.
>      >>      >
>      >>      > Why?  For one, there'll always be bugs in the scripting.
>      >>      > Mitigate those situations: while wrongly rejecting a
>     commit is
>      >>      > bad, wrongly "fixing things up" is worse, as a general rule.
>      >>      > Better avoid that.  (There's probably a popular "pattern
>     name"
>      >>      > for what I try to describe.)
>      >>
>      >>     The word that comes to mind is Technophobia.  Is it wise to
>     trust
>      >>     compilers to transform programs from their source form into
>      >>     executables?  What if there are bugs in either?  What about
>     the OS?
>      >>     The whole computer, or the Internet?  Our cars? 
>     Fortunately, there's
>      >>     more to gain than to lose by trusting automation.  If there
>     weren't
>      >>     human progress would be stuck sometime in the 1700's.
>      >>
>      >>     But we're not talking about anything anywhere that sophisticated
>      >>     here: a sed script to copy and paste a piece of text in
>      >>     the description of a change from one place to another.  It's
>     been
>      >>     done a few times before with more important data than
>     ChangeLogs.
>      >>
>      >>
>      >> git gcc-commit-mklog already automates most of the process.  It
>     could
>      >> also automate adding [PRxxxxx] to the first line.  Is that what
>     you're
>      >> asking for?
>      >
>      > Like, say:
> 
>     I don't think this solves the problem Xionghu Luo was asking about:
>     https://gcc.gnu.org/pipermail/gcc/2021-June/236346.html
> 
> 
> Indeed, their problem was not mentioning the PR in the testcase, which a 
> script isn't going to fix.
> 
>     i.e., they did have a [PRnnnn] in the one line subject but not in
>     their ChangeLog entries.  It also not clear if they used mklog.py
>     at all.  IME, mklog.py already puts in a [PRnnnn] near the top of
>     a patch if it finds in one of the tests.  Though it doesn't seem
>     to put in the ChangeLog entries.  Odd.
> 
> 
> It currently puts in
> 
>   PR comp/nnnnn
> 
> at the beginning of the ChangeLog entries; it used to put them in the 
> entries for each ChangeLog file, but that changed in r12-771.  My patch 
> also adds the [PRnnnn] to the subject line.

To say I'm not good at Python would be an understatement but I hacked
up the attached patch that:

1) extracts PR numbers from test names,
2) gets the component for each PR from Bugzilla,
3) adds the PR component/nnnnn to each ChangeLog

Running the hacked up script with -p on the patch for PR 100085 prints
the following:

Resolves:
PR 100085/target - Bad code for union transfer from __float128 to vector 
types

gcc/ChangeLog:

	PR 100085/target
	* config/rs6000/rs6000-p8swap.c (pattern_is_rotate64):
	(insn_is_load_p):
	(insn_is_swap_p):
	(quad_aligned_load_p):
	(const_load_sequence_p):
	(replace_swapped_aligned_load):
	(recombine_lvx_pattern):
	(recombine_stvx_pattern):

gcc/testsuite/ChangeLog:

	PR 100085/target
	* gcc.target/powerpc/float128-call.c:
	* gcc.target/powerpc/pr100085.c: New test.


Without -p it doesn't print the Resolves: line or the component name
in the ChangeLog entries.  This also mimics what my retired awk script
does.  What do you think?

>     I was suggesting to make this (and the other things the commit
>     hook rejects commit for) happen automatically by running a script
>     at the same time as git commit.
> 
>     But to be clear, I'm not asking for anything for myself.  Although
>     I use mklog.py I have my own script that does what I suggest that
>     I could go back to.  I responded to this thread because I think
>     these minute details could be automated for everyone's benefit.
>     Before moving to Git we talked about doing much more, including
>     automatically running a format/style checker on the patch and
>     (IIRC) Jeff even wanted it to do some basic tweaks on its own
>     (like replace spaces with tabs).
> 
> 
> We could do various cleanup in the 'commit-msg' hook on the user side.  
> Or, probably better, git gcc-verify could clean up some of the issues it 
> finds.

I'm not familiar with these.  Where should I look for them to learn
how to do this?

Martin

[-- Attachment #2: gcc-mklog-prs.diff --]
[-- Type: text/x-patch, Size: 3821 bytes --]

Add PRs to ChangeLog entries in output.

contrib/ChangeLog:

	* mklog.py (bugzilla_url): Add component to query.
	get_pr_titles): Remove stray debugging output.  Get
	components for PRs.  Return new PR array including components.
	(generate_changelog): Extract PRs from test names.
	Print a Resolves line before PRs.  Replace PRS with updated.
	Print all PRs in each ChangeLog entry.

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 5c93c707128..e6a418a2fb4 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -51,7 +51,7 @@ fn_regex = re.compile(r'([a-zA-Z_][^()\s]*)\s*\([^*]')
 template_and_param_regex = re.compile(r'<[^<>]*>')
 md_def_regex = re.compile(r'\(define.*\s+"(.*)"')
 bugzilla_url = 'https://gcc.gnu.org/bugzilla/rest.cgi/bug?id=%s&' \
-               'include_fields=summary'
+               'include_fields=component,summary'
 
 function_extensions = {'.c', '.cpp', '.C', '.cc', '.h', '.inc', '.def', '.md'}
 
@@ -115,17 +115,22 @@ def sort_changelog_files(changed_file):
 
 
 def get_pr_titles(prs):
+    new_prs = []
     output = ''
     for pr in prs:
         pr_id = pr.split('/')[-1]
+        if pr_id == pr:
+          pr_id = pr.split(' ')[-1]
         r = requests.get(bugzilla_url % pr_id)
         bugs = r.json()['bugs']
         if len(bugs) == 1:
-            output += '%s - %s\n' % (pr, bugs[0]['summary'])
-            print(output)
+            comp = bugs[0]['component']
+            pr_id_comp = "PR " + pr_id + "/" + comp
+            new_prs.append (pr_id_comp)
+            output += 'PR %s/%s - %s\n' % (pr_id, comp, bugs[0]['summary'])
     if output:
         output += '\n'
-    return output
+    return output, new_prs
 
 
 def generate_changelog(data, no_functions=False, fill_pr_titles=False):
@@ -147,6 +152,12 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
 
         # Extract PR entries from newly added tests
         if 'testsuite' in file.path and file.is_added_file:
+            name = os.path.basename(file.path)
+            name = os.path.splitext(name)[0]
+            if name.startswith("pr"):
+                name = name[2:]
+                name = "PR " + name
+                prs.append(name)
             # Only search first ten lines as later lines may
             # contains commented code which a note that it
             # has not been tested due to a certain PR or DR.
@@ -166,22 +177,28 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
                         # Found dg-warning/dg-error line
                         break
 
-    if fill_pr_titles:
-        out += get_pr_titles(prs)
+    # Start with a blank line for a subject and description.
+    out += '\n'
 
-    # print list of PR entries before ChangeLog entries
-    if prs:
-        if not out:
-            out += '\n'
-        for pr in prs:
-            out += '\t%s\n' % pr
-        out += '\n'
+    # Append a Resolves: line with PRs and their titles and replace
+    # PRS with an array of <pr-id>/<component> in case it's not in
+    # that format.
+    if fill_pr_titles:
+        pr_titles, new_prs = get_pr_titles(prs)
+        if new_prs:
+            prs = new_prs
+        if pr_titles:
+            out += "Resolves:\n"
+            out += pr_titles
 
     # sort ChangeLog so that 'testsuite' is at the end
     for changelog in sorted(changelog_list, key=lambda x: 'testsuite' in x):
         files = changelogs[changelog]
         out += '%s:\n' % os.path.join(changelog, 'ChangeLog')
         out += '\n'
+        # Print list of PR entries before each ChangeLog entry.
+        for pr in prs:
+            out += '\t%s\n' % pr
         # new and deleted files should be at the end
         for file in sorted(files, key=sort_changelog_files):
             assert file.path.startswith(changelog)

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17  0:17                                                     ` Martin Sebor
@ 2021-06-17  0:40                                                       ` Jason Merrill
  2021-06-17  1:01                                                         ` Martin Sebor
  2021-06-17 10:08                                                         ` Richard Earnshaw
  2021-06-21  7:54                                                       ` [Patch, v2] contrib/mklog.py: Improve PR handling (was: " Tobias Burnus
  1 sibling, 2 replies; 84+ messages in thread
From: Jason Merrill @ 2021-06-17  0:40 UTC (permalink / raw)
  To: Martin Sebor
  Cc: Hans-Peter Nilsson, Jakub Jelinek, gcc Mailing List, Jonathan Wakely

On 6/16/21 8:17 PM, Martin Sebor wrote:
> On 6/16/21 5:45 PM, Jason Merrill wrote:
>> On Wed, Jun 16, 2021 at 5:46 PM Martin Sebor <msebor@gmail.com 
>> <mailto:msebor@gmail.com>> wrote:
>>
>>     On 6/16/21 2:49 PM, Jason Merrill wrote:
>>      > On 6/15/21 11:42 PM, Jason Merrill wrote:
>>      >> On Tue, Jun 15, 2021 at 10:04 PM Martin Sebor via Gcc
>>     <gcc@gcc.gnu.org <mailto:gcc@gcc.gnu.org>
>>      >> <mailto:gcc@gcc.gnu.org <mailto:gcc@gcc.gnu.org>>> wrote:
>>      >>
>>      >>     On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
>>      >>      > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
>>      >>      >
>>      >>      >> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
>>      >>      >>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
>>      >>      >>>> My objection is to making our policies and tools more
>>      >> restrictive
>>      >>      >>>> than they need to be.  We shouldn't expect everyone to
>>     study
>>      >> whole
>>      >>      >>>> manuals just to figure out how to successfully commit a
>>      >> change (or
>>      >>      >>>> learn how to format it just the right way).  It should
>>     be easy.
>>      >>      >>>
>>      >>      >>> I agree, to some extent. But consistency is also 
>> good. The
>>      >>     conventions
>>      >>      >>> for GNU ChangeLog formatting exist for a reason, and so
>>     do the
>>      >>      >>> conventions for good Git commit messages.
>>      >>      >>>
>>      >>      >>>> Setting this discussion aside for a moment and using a
>>      >> different
>>      >>      >>>> example, the commit hook rejects commit messages that
>>     don't
>>      >> start
>>      >>      >>>> ChangeLog entries with tabs.  It also rejects commit
>>      >> messages that
>>      >>      >>>> don't list all the same test files as those changed by
>>     the
>>      >> commit
>>      >>      >>>> (and probably some others as well).  That's in my view
>>      >> unnecessary
>>      >>      >>>> when the hook could just replace the leading spaces 
>> with
>>      >> tabs and
>>      >>      >>>> automatically mention all the tests.
>>      >>      >>>>
>>      >>      >>>> I see this proposal as heading in the same direction.
>>      >> Rather than
>>      >>      >>>> making the script fix things up if we get them wrong
>>     it would
>>      >>     reject
>>      >>      >>>> the commit, requiring the user to massage the 
>> ChangeLog by
>>      >>     hand into
>>      >>      >>>> an unnecessarily rigid format.
>>      >>      >>>
>>      >>      >>> You cannot "fix things up" in a server-side receive 
>> hook,
>>      >> because
>>      >>      >>> changing the commit message would alter the commit
>>     hash, which
>>      >>     would
>>      >>      >>> require the committer to do a rebase to proceed. That
>>     breaks the
>>      >>      >>> expected behaviour and workflow of a git repo.
>>      >>      >>>
>>      >>      >>> You can use the scripts on the client side to verify
>>     your commit
>>      >>      >>> message before pushing, so you don't have to be 
>> surprised
>>      >> when the
>>      >>      >>> server rejects it.
>>      >>      >>
>>      >>      >> That sounds like a killer argument.  Do we have shared
>>      >> client-side
>>      >>      >> scripts that could fix things up for us, or are we each
>>     on our
>>      >> own
>>      >>      >> to write them?
>>      >>      >
>>      >>      > I hope I got your view wrong.  If not: the "scripts fixing
>>      >>      > things up for us" direction is flawed (compared to the
>>     "scripts
>>      >>      > rejecting bad formats"), unless offered as a non-default
>>     option;
>>      >>      > please don't proceed.
>>      >>      >
>>      >>      > Why?  For one, there'll always be bugs in the scripting.
>>      >>      > Mitigate those situations: while wrongly rejecting a
>>     commit is
>>      >>      > bad, wrongly "fixing things up" is worse, as a general 
>> rule.
>>      >>      > Better avoid that.  (There's probably a popular "pattern
>>     name"
>>      >>      > for what I try to describe.)
>>      >>
>>      >>     The word that comes to mind is Technophobia.  Is it wise to
>>     trust
>>      >>     compilers to transform programs from their source form into
>>      >>     executables?  What if there are bugs in either?  What about
>>     the OS?
>>      >>     The whole computer, or the Internet?  Our cars?     
>> Fortunately, there's
>>      >>     more to gain than to lose by trusting automation.  If there
>>     weren't
>>      >>     human progress would be stuck sometime in the 1700's.
>>      >>
>>      >>     But we're not talking about anything anywhere that 
>> sophisticated
>>      >>     here: a sed script to copy and paste a piece of text in
>>      >>     the description of a change from one place to another.  It's
>>     been
>>      >>     done a few times before with more important data than
>>     ChangeLogs.
>>      >>
>>      >>
>>      >> git gcc-commit-mklog already automates most of the process.  It
>>     could
>>      >> also automate adding [PRxxxxx] to the first line.  Is that what
>>     you're
>>      >> asking for?
>>      >
>>      > Like, say:
>>
>>     I don't think this solves the problem Xionghu Luo was asking about:
>>     https://gcc.gnu.org/pipermail/gcc/2021-June/236346.html
>>
>>
>> Indeed, their problem was not mentioning the PR in the testcase, which 
>> a script isn't going to fix.
>>
>>     i.e., they did have a [PRnnnn] in the one line subject but not in
>>     their ChangeLog entries.  It also not clear if they used mklog.py
>>     at all.  IME, mklog.py already puts in a [PRnnnn] near the top of
>>     a patch if it finds in one of the tests.  Though it doesn't seem
>>     to put in the ChangeLog entries.  Odd.
>>
>>
>> It currently puts in
>>
>>   PR comp/nnnnn
>>
>> at the beginning of the ChangeLog entries; it used to put them in the 
>> entries for each ChangeLog file, but that changed in r12-771.  My 
>> patch also adds the [PRnnnn] to the subject line.
> 
> To say I'm not good at Python would be an understatement but I hacked
> up the attached patch that:
> 
> 1) extracts PR numbers from test names,
> 2) gets the component for each PR from Bugzilla,

That seems useful for testcases like the OP's that put the PR number in 
the filename rather than a comment.  Maybe submit it as a patch?

> 3) adds the PR component/nnnnn to each ChangeLog

This would be reverting the r12-771 change, which seems both unrelated 
and undesirable.

> Running the hacked up script with -p on the patch for PR 100085 prints
> the following:
> 
> Resolves:
> PR 100085/target - Bad code for union transfer from __float128 to vector 
> types

You have the number and component switched.

>> We could do various cleanup in the 'commit-msg' hook on the user side. 
>> Or, probably better, git gcc-verify could clean up some of the issues 
>> it finds.
> 
> I'm not familiar with these.  Where should I look for them to learn
> how to do this?

We currently have no commit-msg hook; see git help hooks for a 
description, or .git/hooks/commit-msg.sample for an example hook.

git gcc-verify is an alias to run 
contrib/gcc-changelog/git_check_commit.py , which is the same checker 
that runs on the server.  I don't know how hard it would be to adjust it 
to have a fixup mode as well, for when it is run from git gcc-verify.

Jason


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17  0:40                                                       ` Jason Merrill
@ 2021-06-17  1:01                                                         ` Martin Sebor
  2021-06-17  1:46                                                           ` Jason Merrill
  2021-06-17 10:18                                                           ` Jonathan Wakely
  2021-06-17 10:08                                                         ` Richard Earnshaw
  1 sibling, 2 replies; 84+ messages in thread
From: Martin Sebor @ 2021-06-17  1:01 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Hans-Peter Nilsson, Jakub Jelinek, gcc Mailing List, Jonathan Wakely

On 6/16/21 6:40 PM, Jason Merrill wrote:
> On 6/16/21 8:17 PM, Martin Sebor wrote:
>> On 6/16/21 5:45 PM, Jason Merrill wrote:
>>> On Wed, Jun 16, 2021 at 5:46 PM Martin Sebor <msebor@gmail.com 
>>> <mailto:msebor@gmail.com>> wrote:
>>>
>>>     On 6/16/21 2:49 PM, Jason Merrill wrote:
>>>      > On 6/15/21 11:42 PM, Jason Merrill wrote:
>>>      >> On Tue, Jun 15, 2021 at 10:04 PM Martin Sebor via Gcc
>>>     <gcc@gcc.gnu.org <mailto:gcc@gcc.gnu.org>
>>>      >> <mailto:gcc@gcc.gnu.org <mailto:gcc@gcc.gnu.org>>> wrote:
>>>      >>
>>>      >>     On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
>>>      >>      > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
>>>      >>      >
>>>      >>      >> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
>>>      >>      >>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
>>>      >>      >>>> My objection is to making our policies and tools more
>>>      >> restrictive
>>>      >>      >>>> than they need to be.  We shouldn't expect everyone to
>>>     study
>>>      >> whole
>>>      >>      >>>> manuals just to figure out how to successfully 
>>> commit a
>>>      >> change (or
>>>      >>      >>>> learn how to format it just the right way).  It should
>>>     be easy.
>>>      >>      >>>
>>>      >>      >>> I agree, to some extent. But consistency is also 
>>> good. The
>>>      >>     conventions
>>>      >>      >>> for GNU ChangeLog formatting exist for a reason, and so
>>>     do the
>>>      >>      >>> conventions for good Git commit messages.
>>>      >>      >>>
>>>      >>      >>>> Setting this discussion aside for a moment and using a
>>>      >> different
>>>      >>      >>>> example, the commit hook rejects commit messages that
>>>     don't
>>>      >> start
>>>      >>      >>>> ChangeLog entries with tabs.  It also rejects commit
>>>      >> messages that
>>>      >>      >>>> don't list all the same test files as those changed by
>>>     the
>>>      >> commit
>>>      >>      >>>> (and probably some others as well).  That's in my view
>>>      >> unnecessary
>>>      >>      >>>> when the hook could just replace the leading spaces 
>>> with
>>>      >> tabs and
>>>      >>      >>>> automatically mention all the tests.
>>>      >>      >>>>
>>>      >>      >>>> I see this proposal as heading in the same direction.
>>>      >> Rather than
>>>      >>      >>>> making the script fix things up if we get them wrong
>>>     it would
>>>      >>     reject
>>>      >>      >>>> the commit, requiring the user to massage the 
>>> ChangeLog by
>>>      >>     hand into
>>>      >>      >>>> an unnecessarily rigid format.
>>>      >>      >>>
>>>      >>      >>> You cannot "fix things up" in a server-side receive 
>>> hook,
>>>      >> because
>>>      >>      >>> changing the commit message would alter the commit
>>>     hash, which
>>>      >>     would
>>>      >>      >>> require the committer to do a rebase to proceed. That
>>>     breaks the
>>>      >>      >>> expected behaviour and workflow of a git repo.
>>>      >>      >>>
>>>      >>      >>> You can use the scripts on the client side to verify
>>>     your commit
>>>      >>      >>> message before pushing, so you don't have to be 
>>> surprised
>>>      >> when the
>>>      >>      >>> server rejects it.
>>>      >>      >>
>>>      >>      >> That sounds like a killer argument.  Do we have shared
>>>      >> client-side
>>>      >>      >> scripts that could fix things up for us, or are we each
>>>     on our
>>>      >> own
>>>      >>      >> to write them?
>>>      >>      >
>>>      >>      > I hope I got your view wrong.  If not: the "scripts 
>>> fixing
>>>      >>      > things up for us" direction is flawed (compared to the
>>>     "scripts
>>>      >>      > rejecting bad formats"), unless offered as a non-default
>>>     option;
>>>      >>      > please don't proceed.
>>>      >>      >
>>>      >>      > Why?  For one, there'll always be bugs in the scripting.
>>>      >>      > Mitigate those situations: while wrongly rejecting a
>>>     commit is
>>>      >>      > bad, wrongly "fixing things up" is worse, as a general 
>>> rule.
>>>      >>      > Better avoid that.  (There's probably a popular "pattern
>>>     name"
>>>      >>      > for what I try to describe.)
>>>      >>
>>>      >>     The word that comes to mind is Technophobia.  Is it wise to
>>>     trust
>>>      >>     compilers to transform programs from their source form into
>>>      >>     executables?  What if there are bugs in either?  What about
>>>     the OS?
>>>      >>     The whole computer, or the Internet?  Our cars? 
>>> Fortunately, there's
>>>      >>     more to gain than to lose by trusting automation.  If there
>>>     weren't
>>>      >>     human progress would be stuck sometime in the 1700's.
>>>      >>
>>>      >>     But we're not talking about anything anywhere that 
>>> sophisticated
>>>      >>     here: a sed script to copy and paste a piece of text in
>>>      >>     the description of a change from one place to another.  It's
>>>     been
>>>      >>     done a few times before with more important data than
>>>     ChangeLogs.
>>>      >>
>>>      >>
>>>      >> git gcc-commit-mklog already automates most of the process.  It
>>>     could
>>>      >> also automate adding [PRxxxxx] to the first line.  Is that what
>>>     you're
>>>      >> asking for?
>>>      >
>>>      > Like, say:
>>>
>>>     I don't think this solves the problem Xionghu Luo was asking about:
>>>     https://gcc.gnu.org/pipermail/gcc/2021-June/236346.html
>>>
>>>
>>> Indeed, their problem was not mentioning the PR in the testcase, 
>>> which a script isn't going to fix.
>>>
>>>     i.e., they did have a [PRnnnn] in the one line subject but not in
>>>     their ChangeLog entries.  It also not clear if they used mklog.py
>>>     at all.  IME, mklog.py already puts in a [PRnnnn] near the top of
>>>     a patch if it finds in one of the tests.  Though it doesn't seem
>>>     to put in the ChangeLog entries.  Odd.
>>>
>>>
>>> It currently puts in
>>>
>>>   PR comp/nnnnn
>>>
>>> at the beginning of the ChangeLog entries; it used to put them in the 
>>> entries for each ChangeLog file, but that changed in r12-771.  My 
>>> patch also adds the [PRnnnn] to the subject line.
>>
>> To say I'm not good at Python would be an understatement but I hacked
>> up the attached patch that:
>>
>> 1) extracts PR numbers from test names,
>> 2) gets the component for each PR from Bugzilla,
> 
> That seems useful for testcases like the OP's that put the PR number in 
> the filename rather than a comment.  Maybe submit it as a patch?

Will do.

> 
>> 3) adds the PR component/nnnnn to each ChangeLog
> 
> This would be reverting the r12-771 change, which seems both unrelated 
> and undesirable.

Now I'm confused.  Isn't that just what caused the problem to begin
with?  (The bug not being updated with the commit because it's not
in the ChangeLog entries?)

> 
>> Running the hacked up script with -p on the patch for PR 100085 prints
>> the following:
>>
>> Resolves:
>> PR 100085/target - Bad code for union transfer from __float128 to 
>> vector types
> 
> You have the number and component switched.

Doh!

> 
>>> We could do various cleanup in the 'commit-msg' hook on the user 
>>> side. Or, probably better, git gcc-verify could clean up some of the 
>>> issues it finds.
>>
>> I'm not familiar with these.  Where should I look for them to learn
>> how to do this?
> 
> We currently have no commit-msg hook; see git help hooks for a 
> description, or .git/hooks/commit-msg.sample for an example hook.
> 
> git gcc-verify is an alias to run 
> contrib/gcc-changelog/git_check_commit.py , which is the same checker 
> that runs on the server.  I don't know how hard it would be to adjust it 
> to have a fixup mode as well, for when it is run from git gcc-verify.

Thanks, I'll see if I can find some time to look at this.

Martin

> 
> Jason
> 


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17  1:01                                                         ` Martin Sebor
@ 2021-06-17  1:46                                                           ` Jason Merrill
  2021-06-17 10:18                                                           ` Jonathan Wakely
  1 sibling, 0 replies; 84+ messages in thread
From: Jason Merrill @ 2021-06-17  1:46 UTC (permalink / raw)
  To: Martin Sebor
  Cc: Hans-Peter Nilsson, Jakub Jelinek, gcc Mailing List, Jonathan Wakely

On 6/16/21 9:01 PM, Martin Sebor wrote:
> On 6/16/21 6:40 PM, Jason Merrill wrote:
>> On 6/16/21 8:17 PM, Martin Sebor wrote:
>>> On 6/16/21 5:45 PM, Jason Merrill wrote:
>>>> On Wed, Jun 16, 2021 at 5:46 PM Martin Sebor <msebor@gmail.com 
>>>> <mailto:msebor@gmail.com>> wrote:
>>>>
>>>>     On 6/16/21 2:49 PM, Jason Merrill wrote:
>>>>      > On 6/15/21 11:42 PM, Jason Merrill wrote:
>>>>      >> On Tue, Jun 15, 2021 at 10:04 PM Martin Sebor via Gcc
>>>>     <gcc@gcc.gnu.org <mailto:gcc@gcc.gnu.org>
>>>>      >> <mailto:gcc@gcc.gnu.org <mailto:gcc@gcc.gnu.org>>> wrote:
>>>>      >>
>>>>      >>     On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
>>>>      >>      > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
>>>>      >>      >
>>>>      >>      >> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
>>>>      >>      >>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
>>>>      >>      >>>> My objection is to making our policies and tools more
>>>>      >> restrictive
>>>>      >>      >>>> than they need to be.  We shouldn't expect 
>>>> everyone to
>>>>     study
>>>>      >> whole
>>>>      >>      >>>> manuals just to figure out how to successfully 
>>>> commit a
>>>>      >> change (or
>>>>      >>      >>>> learn how to format it just the right way).  It 
>>>> should
>>>>     be easy.
>>>>      >>      >>>
>>>>      >>      >>> I agree, to some extent. But consistency is also 
>>>> good. The
>>>>      >>     conventions
>>>>      >>      >>> for GNU ChangeLog formatting exist for a reason, 
>>>> and so
>>>>     do the
>>>>      >>      >>> conventions for good Git commit messages.
>>>>      >>      >>>
>>>>      >>      >>>> Setting this discussion aside for a moment and 
>>>> using a
>>>>      >> different
>>>>      >>      >>>> example, the commit hook rejects commit messages that
>>>>     don't
>>>>      >> start
>>>>      >>      >>>> ChangeLog entries with tabs.  It also rejects commit
>>>>      >> messages that
>>>>      >>      >>>> don't list all the same test files as those 
>>>> changed by
>>>>     the
>>>>      >> commit
>>>>      >>      >>>> (and probably some others as well).  That's in my 
>>>> view
>>>>      >> unnecessary
>>>>      >>      >>>> when the hook could just replace the leading 
>>>> spaces with
>>>>      >> tabs and
>>>>      >>      >>>> automatically mention all the tests.
>>>>      >>      >>>>
>>>>      >>      >>>> I see this proposal as heading in the same direction.
>>>>      >> Rather than
>>>>      >>      >>>> making the script fix things up if we get them wrong
>>>>     it would
>>>>      >>     reject
>>>>      >>      >>>> the commit, requiring the user to massage the 
>>>> ChangeLog by
>>>>      >>     hand into
>>>>      >>      >>>> an unnecessarily rigid format.
>>>>      >>      >>>
>>>>      >>      >>> You cannot "fix things up" in a server-side receive 
>>>> hook,
>>>>      >> because
>>>>      >>      >>> changing the commit message would alter the commit
>>>>     hash, which
>>>>      >>     would
>>>>      >>      >>> require the committer to do a rebase to proceed. That
>>>>     breaks the
>>>>      >>      >>> expected behaviour and workflow of a git repo.
>>>>      >>      >>>
>>>>      >>      >>> You can use the scripts on the client side to verify
>>>>     your commit
>>>>      >>      >>> message before pushing, so you don't have to be 
>>>> surprised
>>>>      >> when the
>>>>      >>      >>> server rejects it.
>>>>      >>      >>
>>>>      >>      >> That sounds like a killer argument.  Do we have shared
>>>>      >> client-side
>>>>      >>      >> scripts that could fix things up for us, or are we each
>>>>     on our
>>>>      >> own
>>>>      >>      >> to write them?
>>>>      >>      >
>>>>      >>      > I hope I got your view wrong.  If not: the "scripts 
>>>> fixing
>>>>      >>      > things up for us" direction is flawed (compared to the
>>>>     "scripts
>>>>      >>      > rejecting bad formats"), unless offered as a non-default
>>>>     option;
>>>>      >>      > please don't proceed.
>>>>      >>      >
>>>>      >>      > Why?  For one, there'll always be bugs in the scripting.
>>>>      >>      > Mitigate those situations: while wrongly rejecting a
>>>>     commit is
>>>>      >>      > bad, wrongly "fixing things up" is worse, as a 
>>>> general rule.
>>>>      >>      > Better avoid that.  (There's probably a popular "pattern
>>>>     name"
>>>>      >>      > for what I try to describe.)
>>>>      >>
>>>>      >>     The word that comes to mind is Technophobia.  Is it wise to
>>>>     trust
>>>>      >>     compilers to transform programs from their source form into
>>>>      >>     executables?  What if there are bugs in either?  What about
>>>>     the OS?
>>>>      >>     The whole computer, or the Internet?  Our cars? 
>>>> Fortunately, there's
>>>>      >>     more to gain than to lose by trusting automation.  If there
>>>>     weren't
>>>>      >>     human progress would be stuck sometime in the 1700's.
>>>>      >>
>>>>      >>     But we're not talking about anything anywhere that 
>>>> sophisticated
>>>>      >>     here: a sed script to copy and paste a piece of text in
>>>>      >>     the description of a change from one place to another.  
>>>> It's
>>>>     been
>>>>      >>     done a few times before with more important data than
>>>>     ChangeLogs.
>>>>      >>
>>>>      >>
>>>>      >> git gcc-commit-mklog already automates most of the process.  It
>>>>     could
>>>>      >> also automate adding [PRxxxxx] to the first line.  Is that what
>>>>     you're
>>>>      >> asking for?
>>>>      >
>>>>      > Like, say:
>>>>
>>>>     I don't think this solves the problem Xionghu Luo was asking about:
>>>>     https://gcc.gnu.org/pipermail/gcc/2021-June/236346.html
>>>>
>>>>
>>>> Indeed, their problem was not mentioning the PR in the testcase, 
>>>> which a script isn't going to fix.
>>>>
>>>>     i.e., they did have a [PRnnnn] in the one line subject but not in
>>>>     their ChangeLog entries.  It also not clear if they used mklog.py
>>>>     at all.  IME, mklog.py already puts in a [PRnnnn] near the top of
>>>>     a patch if it finds in one of the tests.  Though it doesn't seem
>>>>     to put in the ChangeLog entries.  Odd.
>>>>
>>>>
>>>> It currently puts in
>>>>
>>>>   PR comp/nnnnn
>>>>
>>>> at the beginning of the ChangeLog entries; it used to put them in 
>>>> the entries for each ChangeLog file, but that changed in r12-771.  
>>>> My patch also adds the [PRnnnn] to the subject line.
>>>
>>> To say I'm not good at Python would be an understatement but I hacked
>>> up the attached patch that:
>>>
>>> 1) extracts PR numbers from test names,
>>> 2) gets the component for each PR from Bugzilla,
>>
>> That seems useful for testcases like the OP's that put the PR number 
>> in the filename rather than a comment.  Maybe submit it as a patch?
> 
> Will do.
> 
>>> 3) adds the PR component/nnnnn to each ChangeLog
>>
>> This would be reverting the r12-771 change, which seems both unrelated 
>> and undesirable.
> 
> Now I'm confused.  Isn't that just what caused the problem to begin
> with?  (The bug not being updated with the commit because it's not
> in the ChangeLog entries?)

The PR number wasn't in the ChangeLog entries because mklog didn't know 
the number, because it wasn't in a comment in the testcase; your #1 and 
#2 should fix that.

The bugzilla integration works fine with the current output of mklog 
when it can find the PR number.

Jason


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17  0:40                                                       ` Jason Merrill
  2021-06-17  1:01                                                         ` Martin Sebor
@ 2021-06-17 10:08                                                         ` Richard Earnshaw
  2021-06-17 17:12                                                           ` Joseph Myers
  1 sibling, 1 reply; 84+ messages in thread
From: Richard Earnshaw @ 2021-06-17 10:08 UTC (permalink / raw)
  To: Jason Merrill, Martin Sebor
  Cc: Jakub Jelinek, gcc Mailing List, Jonathan Wakely



On 17/06/2021 01:40, Jason Merrill via Gcc wrote:
> On 6/16/21 8:17 PM, Martin Sebor wrote:
>> On 6/16/21 5:45 PM, Jason Merrill wrote:
>>> On Wed, Jun 16, 2021 at 5:46 PM Martin Sebor <msebor@gmail.com 
>>> <mailto:msebor@gmail.com>> wrote:
>>>
>>>     On 6/16/21 2:49 PM, Jason Merrill wrote:
>>>      > On 6/15/21 11:42 PM, Jason Merrill wrote:
>>>      >> On Tue, Jun 15, 2021 at 10:04 PM Martin Sebor via Gcc
>>>     <gcc@gcc.gnu.org <mailto:gcc@gcc.gnu.org>
>>>      >> <mailto:gcc@gcc.gnu.org <mailto:gcc@gcc.gnu.org>>> wrote:
>>>      >>
>>>      >>     On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
>>>      >>      > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
>>>      >>      >
>>>      >>      >> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
>>>      >>      >>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
>>>      >>      >>>> My objection is to making our policies and tools more
>>>      >> restrictive
>>>      >>      >>>> than they need to be.  We shouldn't expect everyone to
>>>     study
>>>      >> whole
>>>      >>      >>>> manuals just to figure out how to successfully 
>>> commit a
>>>      >> change (or
>>>      >>      >>>> learn how to format it just the right way).  It should
>>>     be easy.
>>>      >>      >>>
>>>      >>      >>> I agree, to some extent. But consistency is also 
>>> good. The
>>>      >>     conventions
>>>      >>      >>> for GNU ChangeLog formatting exist for a reason, and so
>>>     do the
>>>      >>      >>> conventions for good Git commit messages.
>>>      >>      >>>
>>>      >>      >>>> Setting this discussion aside for a moment and using a
>>>      >> different
>>>      >>      >>>> example, the commit hook rejects commit messages that
>>>     don't
>>>      >> start
>>>      >>      >>>> ChangeLog entries with tabs.  It also rejects commit
>>>      >> messages that
>>>      >>      >>>> don't list all the same test files as those changed by
>>>     the
>>>      >> commit
>>>      >>      >>>> (and probably some others as well).  That's in my view
>>>      >> unnecessary
>>>      >>      >>>> when the hook could just replace the leading spaces 
>>> with
>>>      >> tabs and
>>>      >>      >>>> automatically mention all the tests.
>>>      >>      >>>>
>>>      >>      >>>> I see this proposal as heading in the same direction.
>>>      >> Rather than
>>>      >>      >>>> making the script fix things up if we get them wrong
>>>     it would
>>>      >>     reject
>>>      >>      >>>> the commit, requiring the user to massage the 
>>> ChangeLog by
>>>      >>     hand into
>>>      >>      >>>> an unnecessarily rigid format.
>>>      >>      >>>
>>>      >>      >>> You cannot "fix things up" in a server-side receive 
>>> hook,
>>>      >> because
>>>      >>      >>> changing the commit message would alter the commit
>>>     hash, which
>>>      >>     would
>>>      >>      >>> require the committer to do a rebase to proceed. That
>>>     breaks the
>>>      >>      >>> expected behaviour and workflow of a git repo.
>>>      >>      >>>
>>>      >>      >>> You can use the scripts on the client side to verify
>>>     your commit
>>>      >>      >>> message before pushing, so you don't have to be 
>>> surprised
>>>      >> when the
>>>      >>      >>> server rejects it.
>>>      >>      >>
>>>      >>      >> That sounds like a killer argument.  Do we have shared
>>>      >> client-side
>>>      >>      >> scripts that could fix things up for us, or are we each
>>>     on our
>>>      >> own
>>>      >>      >> to write them?
>>>      >>      >
>>>      >>      > I hope I got your view wrong.  If not: the "scripts 
>>> fixing
>>>      >>      > things up for us" direction is flawed (compared to the
>>>     "scripts
>>>      >>      > rejecting bad formats"), unless offered as a non-default
>>>     option;
>>>      >>      > please don't proceed.
>>>      >>      >
>>>      >>      > Why?  For one, there'll always be bugs in the scripting.
>>>      >>      > Mitigate those situations: while wrongly rejecting a
>>>     commit is
>>>      >>      > bad, wrongly "fixing things up" is worse, as a general 
>>> rule.
>>>      >>      > Better avoid that.  (There's probably a popular "pattern
>>>     name"
>>>      >>      > for what I try to describe.)
>>>      >>
>>>      >>     The word that comes to mind is Technophobia.  Is it wise to
>>>     trust
>>>      >>     compilers to transform programs from their source form into
>>>      >>     executables?  What if there are bugs in either?  What about
>>>     the OS?
>>>      >>     The whole computer, or the Internet?  Our cars? 
>>> Fortunately, there's
>>>      >>     more to gain than to lose by trusting automation.  If there
>>>     weren't
>>>      >>     human progress would be stuck sometime in the 1700's.
>>>      >>
>>>      >>     But we're not talking about anything anywhere that 
>>> sophisticated
>>>      >>     here: a sed script to copy and paste a piece of text in
>>>      >>     the description of a change from one place to another.  It's
>>>     been
>>>      >>     done a few times before with more important data than
>>>     ChangeLogs.
>>>      >>
>>>      >>
>>>      >> git gcc-commit-mklog already automates most of the process.  It
>>>     could
>>>      >> also automate adding [PRxxxxx] to the first line.  Is that what
>>>     you're
>>>      >> asking for?
>>>      >
>>>      > Like, say:
>>>
>>>     I don't think this solves the problem Xionghu Luo was asking about:
>>>     https://gcc.gnu.org/pipermail/gcc/2021-June/236346.html
>>>
>>>
>>> Indeed, their problem was not mentioning the PR in the testcase, 
>>> which a script isn't going to fix.
>>>
>>>     i.e., they did have a [PRnnnn] in the one line subject but not in
>>>     their ChangeLog entries.  It also not clear if they used mklog.py
>>>     at all.  IME, mklog.py already puts in a [PRnnnn] near the top of
>>>     a patch if it finds in one of the tests.  Though it doesn't seem
>>>     to put in the ChangeLog entries.  Odd.
>>>
>>>
>>> It currently puts in
>>>
>>>   PR comp/nnnnn
>>>
>>> at the beginning of the ChangeLog entries; it used to put them in the 
>>> entries for each ChangeLog file, but that changed in r12-771.  My 
>>> patch also adds the [PRnnnn] to the subject line.
>>
>> To say I'm not good at Python would be an understatement but I hacked
>> up the attached patch that:
>>
>> 1) extracts PR numbers from test names,
>> 2) gets the component for each PR from Bugzilla,
> 
> That seems useful for testcases like the OP's that put the PR number in 
> the filename rather than a comment.  Maybe submit it as a patch?

It seems a bit dangerous to me to rely on just extracting PR numbers 
from tests.  What if the patch is just adjusting a test to make it 
compatible with the remainder of the change?

Perhaps we should scan for such changes and ask the user to confirm if 
they are the right and then to also give the chance to 
add/remove/specify primary PRs as appropriate.  Finally, we can scrape 
bugzilla and print out the bug summaries as a way of validating that the 
correct PR number has been added.  It's far easier to spot that the 
summary doesn't match expectations than the PR number typed.

> 
>> 3) adds the PR component/nnnnn to each ChangeLog
> 
> This would be reverting the r12-771 change, which seems both unrelated 
> and undesirable.
> 
>> Running the hacked up script with -p on the patch for PR 100085 prints
>> the following:
>>
>> Resolves:
>> PR 100085/target - Bad code for union transfer from __float128 to 
>> vector types
> 
> You have the number and component switched.
> 
>>> We could do various cleanup in the 'commit-msg' hook on the user 
>>> side. Or, probably better, git gcc-verify could clean up some of the 
>>> issues it finds.
>>
>> I'm not familiar with these.  Where should I look for them to learn
>> how to do this?
> 
> We currently have no commit-msg hook; see git help hooks for a 
> description, or .git/hooks/commit-msg.sample for an example hook.
> 
> git gcc-verify is an alias to run 
> contrib/gcc-changelog/git_check_commit.py , which is the same checker 
> that runs on the server.  I don't know how hard it would be to adjust it 
> to have a fixup mode as well, for when it is run from git gcc-verify.
> 
> Jason
> 

R.

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17  1:01                                                         ` Martin Sebor
  2021-06-17  1:46                                                           ` Jason Merrill
@ 2021-06-17 10:18                                                           ` Jonathan Wakely
  2021-06-17 14:55                                                             ` Martin Sebor
  1 sibling, 1 reply; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-17 10:18 UTC (permalink / raw)
  To: Martin Sebor
  Cc: Jason Merrill, Hans-Peter Nilsson, Jakub Jelinek, gcc Mailing List

On Thu, 17 Jun 2021 at 02:01, Martin Sebor wrote:
>
> On 6/16/21 6:40 PM, Jason Merrill wrote:
> > On 6/16/21 8:17 PM, Martin Sebor wrote:
> >> 3) adds the PR component/nnnnn to each ChangeLog
> >
> > This would be reverting the r12-771 change, which seems both unrelated
> > and undesirable.
>
> Now I'm confused.  Isn't that just what caused the problem to begin
> with?  (The bug not being updated with the commit because it's not
> in the ChangeLog entries?)


No.

The original problem is that the PR wasn't in the body of the commit AT ALL.

The current mklog.py will do this:

    PR cmpt/nnnn

gcc/ChangeLog:

    * blah.c (blah): Blah blah.

That will cause updates to bugzilla, it works fine. You do not need to
change it to:

gcc/ChangeLog:

    PR cmpt/nnnn
    * blah.c (blah): Blah blah.

That would also work, but isn't necessary (and would be reverting the
r12-771 change).

Either of the forms above will update bugzilla. The original problem
was that neither of those forms was used.


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17 10:18                                                           ` Jonathan Wakely
@ 2021-06-17 14:55                                                             ` Martin Sebor
  2021-06-17 15:11                                                               ` Michael Matz
  0 siblings, 1 reply; 84+ messages in thread
From: Martin Sebor @ 2021-06-17 14:55 UTC (permalink / raw)
  To: Jonathan Wakely
  Cc: Jason Merrill, Hans-Peter Nilsson, Jakub Jelinek, gcc Mailing List

On 6/17/21 4:18 AM, Jonathan Wakely wrote:
> On Thu, 17 Jun 2021 at 02:01, Martin Sebor wrote:
>>
>> On 6/16/21 6:40 PM, Jason Merrill wrote:
>>> On 6/16/21 8:17 PM, Martin Sebor wrote:
>>>> 3) adds the PR component/nnnnn to each ChangeLog
>>>
>>> This would be reverting the r12-771 change, which seems both unrelated
>>> and undesirable.
>>
>> Now I'm confused.  Isn't that just what caused the problem to begin
>> with?  (The bug not being updated with the commit because it's not
>> in the ChangeLog entries?)
> 
> 
> No.
> 
> The original problem is that the PR wasn't in the body of the commit AT ALL.

But I see [PR100085] right there at the end of the summary line:

   https://gcc.gnu.org/pipermail/gcc/2021-June/236346.html

and here:

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f700e4b0ee3ef53b48975cf89be26b9177e3a3f3

> 
> The current mklog.py will do this:
> 
>      PR cmpt/nnnn
> 
> gcc/ChangeLog:
> 
>      * blah.c (blah): Blah blah.
> 
> That will cause updates to bugzilla, it works fine. You do not need to
> change it to:
> 
> gcc/ChangeLog:
> 
>      PR cmpt/nnnn
>      * blah.c (blah): Blah blah.
> 
> That would also work, but isn't necessary (and would be reverting the
> r12-771 change).
> 
> Either of the forms above will update bugzilla. The original problem
> was that neither of those forms was used.

You mean the sole reason why Bugzilla wasn't updated was that
the component was missing from the PR?  And the proposed change
is to reject commits without the component bit?  Why?

Martin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17 14:55                                                             ` Martin Sebor
@ 2021-06-17 15:11                                                               ` Michael Matz
  2021-06-17 15:33                                                                 ` Martin Sebor
  0 siblings, 1 reply; 84+ messages in thread
From: Michael Matz @ 2021-06-17 15:11 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jonathan Wakely, Jakub Jelinek, gcc Mailing List

Hello,

On Thu, 17 Jun 2021, Martin Sebor via Gcc wrote:

> > The original problem is that the PR wasn't _in the body_ of the commit 
> 
> But I see [PR100085] right there at the end of the _summary line_:

Emphasis mine.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17 15:11                                                               ` Michael Matz
@ 2021-06-17 15:33                                                                 ` Martin Sebor
  2021-06-17 16:31                                                                   ` Jakub Jelinek
  2021-06-17 16:32                                                                   ` Jonathan Wakely
  0 siblings, 2 replies; 84+ messages in thread
From: Martin Sebor @ 2021-06-17 15:33 UTC (permalink / raw)
  To: Michael Matz; +Cc: Jonathan Wakely, Jakub Jelinek, gcc Mailing List

On 6/17/21 9:11 AM, Michael Matz wrote:
> Hello,
> 
> On Thu, 17 Jun 2021, Martin Sebor via Gcc wrote:
> 
>>> The original problem is that the PR wasn't _in the body_ of the commit
>>
>> But I see [PR100085] right there at the end of the _summary line_:
> 
> Emphasis mine.

Let me make sure I understand: we ask users to put PR numbers
on the first line but the script doesn't consider the first line
a part of the "body" and so it doesn't use the PRs mentioned on
it to update Bugzilla?

If I got that right, why on earth not?  That makes no sense to me.
Are we worried about unrelated PRnnnn mentioned on the first line
triggering updates to unrelated bugs?

Martin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17 15:33                                                                 ` Martin Sebor
@ 2021-06-17 16:31                                                                   ` Jakub Jelinek
  2021-06-17 16:32                                                                   ` Jonathan Wakely
  1 sibling, 0 replies; 84+ messages in thread
From: Jakub Jelinek @ 2021-06-17 16:31 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Michael Matz, Jonathan Wakely, gcc Mailing List

On Thu, Jun 17, 2021 at 09:33:06AM -0600, Martin Sebor wrote:
> On 6/17/21 9:11 AM, Michael Matz wrote:
> > Hello,
> > 
> > On Thu, 17 Jun 2021, Martin Sebor via Gcc wrote:
> > 
> > > > The original problem is that the PR wasn't _in the body_ of the commit
> > > 
> > > But I see [PR100085] right there at the end of the _summary line_:
> > 
> > Emphasis mine.
> 
> Let me make sure I understand: we ask users to put PR numbers
> on the first line but the script doesn't consider the first line
> a part of the "body" and so it doesn't use the PRs mentioned on
> it to update Bugzilla?

No.  The only thing that is linked to bugzilla is what appears in the
ChangeLog entry (intentionally so, not putting the PR into the ChangeLog
part means one doesn't see that in the ChangeLog).
And we ask users to put the PR numbers both on the first line (== subject
of mail) and in the ChangeLog entry.

The reason this thread started was that several people made the mistake
where they only mentioned PR in the first line and nowhere else and bugzilla
wasn't updated.  This is now fixed, such commits are rejected and so they
won't make the same mistake again.

	Jakub


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17 15:33                                                                 ` Martin Sebor
  2021-06-17 16:31                                                                   ` Jakub Jelinek
@ 2021-06-17 16:32                                                                   ` Jonathan Wakely
  2021-06-17 18:00                                                                     ` Martin Sebor
  1 sibling, 1 reply; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-17 16:32 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Michael Matz, Jakub Jelinek, gcc Mailing List

On Thu, 17 Jun 2021 at 16:33, Martin Sebor <msebor@gmail.com> wrote:
>
> On 6/17/21 9:11 AM, Michael Matz wrote:
> > Hello,
> >
> > On Thu, 17 Jun 2021, Martin Sebor via Gcc wrote:
> >
> >>> The original problem is that the PR wasn't _in the body_ of the commit
> >>
> >> But I see [PR100085] right there at the end of the _summary line_:
> >
> > Emphasis mine

Right. How are we still discussing this part of the original issue?!

>
> Let me make sure I understand: we ask users to put PR numbers
> on the first line but the script doesn't consider the first line
> a part of the "body" and so it doesn't use the PRs mentioned on
> it to update Bugzilla?

AFAIK Bugzilla does not process a "PRnnnnn" string anywhere in the
commit log. It only processes the full "PR component/nnnn" form. This
is stated in the docs:

"The body of the commit message should still contain the full form (PR
<component>/nnnnn) within the body of the commit message so that
Bugzilla will correctly notice the commit."

We want it on the subject line for **our** benefit, not bugzilla's. So
that the outout of 'git log --oneline' shows useful context.


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17 10:08                                                         ` Richard Earnshaw
@ 2021-06-17 17:12                                                           ` Joseph Myers
  2021-06-17 17:21                                                             ` Jason Merrill
  2021-06-17 17:21                                                             ` Jakub Jelinek
  0 siblings, 2 replies; 84+ messages in thread
From: Joseph Myers @ 2021-06-17 17:12 UTC (permalink / raw)
  To: Richard Earnshaw
  Cc: Jason Merrill, Martin Sebor, Jakub Jelinek, gcc Mailing List,
	Jonathan Wakely

On Thu, 17 Jun 2021, Richard Earnshaw via Gcc wrote:

> It seems a bit dangerous to me to rely on just extracting PR numbers from
> tests.  What if the patch is just adjusting a test to make it compatible with
> the remainder of the change?

Also, that a test is added for a PR, or a commit is relevant to a PR, is a 
weaker property than the commit *resolving* the PR.  The fact that a 
commit *resolves* a PR (allows it to be marked as resolved, or the 
regression markers to be updated if it's resolved in master but the fix 
still needs to be backported) needs to be explicitly affirmed by the 
committer (possibly based on a question asked by a script) rather than 
assumed by default based on the PR being mentioned somewhere.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17 17:12                                                           ` Joseph Myers
@ 2021-06-17 17:21                                                             ` Jason Merrill
  2021-06-17 17:21                                                             ` Jakub Jelinek
  1 sibling, 0 replies; 84+ messages in thread
From: Jason Merrill @ 2021-06-17 17:21 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Richard Earnshaw, Martin Sebor, Jakub Jelinek, gcc Mailing List,
	Jonathan Wakely

On Thu, Jun 17, 2021 at 1:14 PM Joseph Myers <joseph@codesourcery.com>
wrote:

> On Thu, 17 Jun 2021, Richard Earnshaw via Gcc wrote:
>
> > It seems a bit dangerous to me to rely on just extracting PR numbers from
> > tests.  What if the patch is just adjusting a test to make it compatible
> with
> > the remainder of the change?
>
> Also, that a test is added for a PR, or a commit is relevant to a PR, is a
> weaker property than the commit *resolving* the PR.  The fact that a
> commit *resolves* a PR (allows it to be marked as resolved, or the
> regression markers to be updated if it's resolved in master but the fix
> still needs to be backported) needs to be explicitly affirmed by the
> committer (possibly based on a question asked by a script) rather than
> assumed by default based on the PR being mentioned somewhere.
>

Indeed, we would need to be cautious about adding such automation.  But
there isn't any now; the only danger is of an unrelated patch showing up in
a comment on a PR.  Which we already get when I mess up the PR number in my
test (oops).

Jason

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17 17:12                                                           ` Joseph Myers
  2021-06-17 17:21                                                             ` Jason Merrill
@ 2021-06-17 17:21                                                             ` Jakub Jelinek
  2021-06-18  9:32                                                               ` Richard Earnshaw
  1 sibling, 1 reply; 84+ messages in thread
From: Jakub Jelinek @ 2021-06-17 17:21 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Richard Earnshaw, gcc Mailing List, Jonathan Wakely

On Thu, Jun 17, 2021 at 05:12:52PM +0000, Joseph Myers wrote:
> On Thu, 17 Jun 2021, Richard Earnshaw via Gcc wrote:
> 
> > It seems a bit dangerous to me to rely on just extracting PR numbers from
> > tests.  What if the patch is just adjusting a test to make it compatible with
> > the remainder of the change?
> 
> Also, that a test is added for a PR, or a commit is relevant to a PR, is a 
> weaker property than the commit *resolving* the PR.  The fact that a 
> commit *resolves* a PR (allows it to be marked as resolved, or the 
> regression markers to be updated if it's resolved in master but the fix 
> still needs to be backported) needs to be explicitly affirmed by the 
> committer (possibly based on a question asked by a script) rather than 
> assumed by default based on the PR being mentioned somewhere.

mklog as is doesn't fill in the details (descriptions of the changes
to each function etc.), nor is realiable in many cases, and with Jason's
recent change just fills in the first and last part of the first line
but not the important middle part.
So, the developer has to hand edit it anyway and that I'd consider also
be the right time when the verification whether the PR being mentioned
is the right one etc.  So no need to add a question asked by the script
at another point.

	Jakub


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17 16:32                                                                   ` Jonathan Wakely
@ 2021-06-17 18:00                                                                     ` Martin Sebor
  0 siblings, 0 replies; 84+ messages in thread
From: Martin Sebor @ 2021-06-17 18:00 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Michael Matz, Jakub Jelinek, gcc Mailing List

On 6/17/21 10:32 AM, Jonathan Wakely wrote:
> On Thu, 17 Jun 2021 at 16:33, Martin Sebor <msebor@gmail.com> wrote:
>>
>> On 6/17/21 9:11 AM, Michael Matz wrote:
>>> Hello,
>>>
>>> On Thu, 17 Jun 2021, Martin Sebor via Gcc wrote:
>>>
>>>>> The original problem is that the PR wasn't _in the body_ of the commit
>>>>
>>>> But I see [PR100085] right there at the end of the _summary line_:
>>>
>>> Emphasis mine
> 
> Right. How are we still discussing this part of the original issue?!

Because no has explained *why* there had to be an issue to begin
with: the rationale for ignoring the piece of data that was in
the commit message and that could have been used to update Bugzilla.
I have just gone through the whole discussion again in case I missed
it but I couldn't find an answer.

>>
>> Let me make sure I understand: we ask users to put PR numbers
>> on the first line but the script doesn't consider the first line
>> a part of the "body" and so it doesn't use the PRs mentioned on
>> it to update Bugzilla?
> 
> AFAIK Bugzilla does not process a "PRnnnnn" string anywhere in the
> commit log. It only processes the full "PR component/nnnn" form. This
> is stated in the docs:
> 
> "The body of the commit message should still contain the full form (PR
> <component>/nnnnn) within the body of the commit message so that
> Bugzilla will correctly notice the commit."
> 
> We want it on the subject line for **our** benefit, not bugzilla's. So
> that the outout of 'git log --oneline' shows useful context.

But *why* do we not want to use the short PR in the subject to
update Bugzilla?

It sounds like our tooling (since r12-771) massages ChangeLogs
based on the PR component/nnnn in the commit description.  There
also is a way to determine the component for a PR nnnn by querying
Bugzilla (the -p option to mklog.py).  So why are we required to
duplicate the PR in two places in the commit message and why do
we need to provide the component?  In other words, why can
the original problem not be solved by making our tooling
smarter instead of rejecting such commits?

Martin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog
  2021-06-17 17:21                                                             ` Jakub Jelinek
@ 2021-06-18  9:32                                                               ` Richard Earnshaw
  2021-06-18 11:05                                                                 ` [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Tobias Burnus
  0 siblings, 1 reply; 84+ messages in thread
From: Richard Earnshaw @ 2021-06-18  9:32 UTC (permalink / raw)
  To: Jakub Jelinek, Joseph Myers; +Cc: gcc Mailing List, Jonathan Wakely

On 17/06/2021 18:21, Jakub Jelinek wrote:
> On Thu, Jun 17, 2021 at 05:12:52PM +0000, Joseph Myers wrote:
>> On Thu, 17 Jun 2021, Richard Earnshaw via Gcc wrote:
>>
>>> It seems a bit dangerous to me to rely on just extracting PR numbers from
>>> tests.  What if the patch is just adjusting a test to make it compatible with
>>> the remainder of the change?
>>
>> Also, that a test is added for a PR, or a commit is relevant to a PR, is a 
>> weaker property than the commit *resolving* the PR.  The fact that a 
>> commit *resolves* a PR (allows it to be marked as resolved, or the 
>> regression markers to be updated if it's resolved in master but the fix 
>> still needs to be backported) needs to be explicitly affirmed by the 
>> committer (possibly based on a question asked by a script) rather than 
>> assumed by default based on the PR being mentioned somewhere.
> 
> mklog as is doesn't fill in the details (descriptions of the changes
> to each function etc.), nor is realiable in many cases, and with Jason's
> recent change just fills in the first and last part of the first line
> but not the important middle part.
> So, the developer has to hand edit it anyway and that I'd consider also
> be the right time when the verification whether the PR being mentioned
> is the right one etc.  So no need to add a question asked by the script
> at another point.
> 
> 	Jakub
> 


That misses my point.  If we use a tool to help doing this we can make
the tool also scrape the entry out of bugzilla and print the summary
line as a stronger visual check that the number has been typed
correctly.  We get many bug attributions wrong simply because two digits
have been transposed: visually checking the summary line is a far
stronger check that the correct number has been entered.

R.

^ permalink raw reply	[flat|nested] 84+ messages in thread

* [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-18  9:32                                                               ` Richard Earnshaw
@ 2021-06-18 11:05                                                                 ` Tobias Burnus
  2021-06-18 11:10                                                                   ` Jonathan Wakely
  2021-06-18 14:41                                                                   ` [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Jason Merrill
  0 siblings, 2 replies; 84+ messages in thread
From: Tobias Burnus @ 2021-06-18 11:05 UTC (permalink / raw)
  To: Richard Earnshaw, Jakub Jelinek, Joseph Myers, Martin Liška
  Cc: gcc Mailing List, Jonathan Wakely

[-- Attachment #1: Type: text/plain, Size: 2357 bytes --]

On 18.06.21 11:32, Richard Earnshaw via Gcc wrote:
> On 17/06/2021 18:21, Jakub Jelinek wrote:
>> mklog as is doesn't fill in the details (descriptions of the changes
>> to each function etc.), nor is realiable in many cases, and with Jason's
>> recent change just fills in the first and last part of the first line
>> but not the important middle part.
>> So, the developer has to hand edit it anyway and that I'd consider also
>> be the right time when the verification whether the PR being mentioned
>> is the right one etc.  So no need to add a question asked by the script
>> at another point.
> That misses my point.  If we use a tool to help doing this we can make
> the tool also scrape the entry out of bugzilla and print the summary
> line as a stronger visual check that the number has been typed
> correctly.  We get many bug attributions wrong simply because two digits
> have been transposed: visually checking the summary line is a far
> stronger check that the correct number has been entered.

I want to point out that 'mklog -p' already outputs the bug-summary lines
at the top of the generated changelog, by fetching them from Bugzilla

This patch extends this by:
* Being able to specify the PR numbers on the command line in addition
   (currently, they are only extracted from the testsuite patches)
* For -p, updating the format and correcting/filling-in the component
* Avoiding to print the fetched PR lines summary lines multiple times.

The new '-b' option takes PR numbers, but treats them in the same way
as the PRs extracted from the examples: They are simply output, i.e.
'5' is printed as '\t5\n', which is not that useful.

But with -p added, it becomes rather nice. For instance:

   git diff |./contrib/mklog.py -b foo/12394 -b 100123 -p

nows prints:

PR c++/12394 - internal compiler error: in write_type, at cp/mangle.c:1517
PR fortran/100123 - -ftree-fre gives incorrect result in subroutine with array declared as length 1

         PR c++/12394
         PR fortran/100123

gcc/ChangeLog:


I think that works rather nice :-)

Martin (and all): What do you think? Patch attached.

Tobias

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

[-- Attachment #2: mklog-p.diff --]
[-- Type: text/x-patch, Size: 3216 bytes --]

contrib/mklog.py: Improve PR handling

contrib/ChangeLog:

	* mklog.py (bugzilla_url): Fetch also component.
	(get_pr_titles): Update PR string with correct format and component.
	(generate_changelog): Take additional PRs.
	(__main__): Add -b/--pr-numbers argument.

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 1f59055e723..d95f056c9cc 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -52,7 +52,7 @@ fn_regex = re.compile(r'([a-zA-Z_][^()\s]*)\s*\([^*]')
 template_and_param_regex = re.compile(r'<[^<>]*>')
 md_def_regex = re.compile(r'\(define.*\s+"(.*)"')
 bugzilla_url = 'https://gcc.gnu.org/bugzilla/rest.cgi/bug?id=%s&' \
-               'include_fields=summary'
+               'include_fields=summary,component'
 
 function_extensions = {'.c', '.cpp', '.C', '.cc', '.h', '.inc', '.def', '.md'}
 
@@ -119,19 +119,20 @@ def sort_changelog_files(changed_file):
 
 def get_pr_titles(prs):
     output = ''
-    for pr in prs:
+    for idx, pr in enumerate(prs):
         pr_id = pr.split('/')[-1]
         r = requests.get(bugzilla_url % pr_id)
         bugs = r.json()['bugs']
         if len(bugs) == 1:
-            output += '%s - %s\n' % (pr, bugs[0]['summary'])
-            print(output)
+            prs[idx] = 'PR %s/%s' % (bugs[0]['component'], pr_id)
+            output += '%s - %s\n' % (prs[idx], bugs[0]['summary'])
     if output:
         output += '\n'
     return output
 
 
-def generate_changelog(data, no_functions=False, fill_pr_titles=False):
+def generate_changelog(data, no_functions=False, fill_pr_titles=False,
+                       additional_prs=None):
     changelogs = {}
     changelog_list = []
     prs = []
@@ -139,6 +140,8 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
     diff = PatchSet(data)
     global firstpr
 
+    if additional_prs:
+        prs = [pr for pr in additional_prs if pr not in prs]
     for file in diff:
         # skip files that can't be parsed
         if file.path == '/dev/null':
@@ -286,6 +289,8 @@ if __name__ == '__main__':
     parser = argparse.ArgumentParser(description=help_message)
     parser.add_argument('input', nargs='?',
                         help='Patch file (or missing, read standard input)')
+    parser.add_argument('-b', '--pr-numbers', action='append',
+                        help='Add the specified PRs (comma separated)')
     parser.add_argument('-s', '--no-functions', action='store_true',
                         help='Do not generate function names in ChangeLogs')
     parser.add_argument('-p', '--fill-up-bug-titles', action='store_true',
@@ -308,8 +313,11 @@ if __name__ == '__main__':
     if args.update_copyright:
         update_copyright(data)
     else:
+        pr_numbers = args.pr_numbers
+        if pr_numbers:
+            pr_numbers = [b for i in args.pr_numbers for b in i.split(',')]
         output = generate_changelog(data, args.no_functions,
-                                    args.fill_up_bug_titles)
+                                    args.fill_up_bug_titles, pr_numbers)
         if args.changelog:
             lines = open(args.changelog).read().split('\n')
             start = list(takewhile(lambda l: not l.startswith('#'), lines))

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-18 11:05                                                                 ` [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Tobias Burnus
@ 2021-06-18 11:10                                                                   ` Jonathan Wakely
  2021-06-18 11:24                                                                     ` Jakub Jelinek
  2021-06-18 11:25                                                                     ` Tobias Burnus
  2021-06-18 14:41                                                                   ` [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Jason Merrill
  1 sibling, 2 replies; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-18 11:10 UTC (permalink / raw)
  To: Tobias Burnus
  Cc: Richard Earnshaw, Jakub Jelinek, Joseph Myers, Martin Liška,
	gcc Mailing List

On Fri, 18 Jun 2021 at 12:05, Tobias Burnus wrote:
> But with -p added, it becomes rather nice. For instance:
>
>    git diff |./contrib/mklog.py -b foo/12394 -b 100123 -p
>
> nows prints:
>
> PR c++/12394 - internal compiler error: in write_type, at cp/mangle.c:1517
> PR fortran/100123 - -ftree-fre gives incorrect result in subroutine with array declared as length 1
>
>          PR c++/12394
>          PR fortran/100123

Now that we put these PR cmpt/nnnn lines before the xxx/ChangeLog
entries, is there any reason to indent them with a TAB?

If we didn't require the TAB before them, then the lines added by -p
would serve the same purpose, and we wouldn't need to repeat them

So instead of requiring "^\tPR .*/\d+$" for the PR entry, require
"^\t?PR ([^/]+)/\d+"

That would allow the existing convention, but also allow these to be
used in their place:

PR c++/12394 - internal compiler error: in write_type, at cp/mangle.c:1517
PR fortran/100123 - -ftree-fre gives incorrect result in subroutine
with array declared as length 1


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-18 11:10                                                                   ` Jonathan Wakely
@ 2021-06-18 11:24                                                                     ` Jakub Jelinek
  2021-06-18 11:25                                                                     ` Tobias Burnus
  1 sibling, 0 replies; 84+ messages in thread
From: Jakub Jelinek @ 2021-06-18 11:24 UTC (permalink / raw)
  To: Jonathan Wakely
  Cc: Tobias Burnus, Richard Earnshaw, Joseph Myers, Martin Liška,
	gcc Mailing List

On Fri, Jun 18, 2021 at 12:10:33PM +0100, Jonathan Wakely wrote:
> On Fri, 18 Jun 2021 at 12:05, Tobias Burnus wrote:
> > But with -p added, it becomes rather nice. For instance:
> >
> >    git diff |./contrib/mklog.py -b foo/12394 -b 100123 -p
> >
> > nows prints:
> >
> > PR c++/12394 - internal compiler error: in write_type, at cp/mangle.c:1517
> > PR fortran/100123 - -ftree-fre gives incorrect result in subroutine with array declared as length 1
> >
> >          PR c++/12394
> >          PR fortran/100123
> 
> Now that we put these PR cmpt/nnnn lines before the xxx/ChangeLog
> entries, is there any reason to indent them with a TAB?

We want those PR lines in the ChangeLogs to be indented by the tab,
only the date line is not indented.

I agree it would be nice to allow again optional space hyphen space description
after the PR component/nnnnnn and it is also pitty we don't allow anymore
what used to be common in the gcc/cp ChangeLog for C++ papers.
That used to be either written after the PR, or without it:
	PR c++/88337 - Implement P1327R1: Allow dynamic_cast in constexpr.
	PR c++/91353 - P1331R2: Allow trivial default init in constexpr contexts.
	Implement P1814R0, CTAD for alias templates.
	P1381R1 - Reference capture of structured bindings
not sure if we want to allow arbitrary text there, but perhaps allow
"^\tP[0-9]+R[0-9]+ - .*"
in addition to that optional
"^\tPR ([^/]+)/\d+ - .*"
?

	Jakub


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-18 11:10                                                                   ` Jonathan Wakely
  2021-06-18 11:24                                                                     ` Jakub Jelinek
@ 2021-06-18 11:25                                                                     ` Tobias Burnus
  2021-06-18 11:40                                                                       ` Jonathan Wakely
  2021-06-18 16:40                                                                       ` [Patch] contrib/mklog.py: Improve PR handling Martin Sebor
  1 sibling, 2 replies; 84+ messages in thread
From: Tobias Burnus @ 2021-06-18 11:25 UTC (permalink / raw)
  To: Jonathan Wakely
  Cc: Richard Earnshaw, Jakub Jelinek, Joseph Myers, Martin Liška,
	gcc Mailing List

On 18.06.21 13:10, Jonathan Wakely wrote:

> On Fri, 18 Jun 2021 at 12:05, Tobias Burnus wrote:
>> PR c++/12394 - internal compiler error: in write_type, at cp/mangle.c:1517
>> PR fortran/100123 - -ftree-fre gives incorrect result in subroutine with array declared as length 1
>>           PR c++/12394
>>           PR fortran/100123
> Now that we put these PR cmpt/nnnn lines before the xxx/ChangeLog
> entries, is there any reason to indent them with a TAB?

My understanding with the PR before vs. after the xxx/ChangeLog is
that if you put them before, they apply to all changelog entries
and if you put them afterward, it only only applies to the those
ChangeLog files for which they have been mentioned.

(But I also might have misread the code/were misguided by the
var names.)

And when manually creating the changelog from scratch, they still
often end up after the ChangeLog line.

But otherwise, I am not attached to the tab. Just one thing:
It might be slightly inconsistent to require the tab after the
xxx/ChangeLog line but not before that line. And I think a
tab should be used after the xxx/ChangeLog line.

> If we didn't require the TAB before them, then the lines added by -p
> would serve the same purpose, and we wouldn't need to repeat them
>
> So instead of requiring "^\tPR .*/\d+$" for the PR entry, require
> "^\t?PR ([^/]+)/\d+"
I think we should require either ($|\s) after the PR number but
otherwise, it works for me.

I am sure that Martin Sebor will like this change as he currently
writes changelogs (see in this thread or) like the following one.

In that sense, there would be a precedent in terms of actual usage.

----------
Teach compute_objsize about placement new [PR100876].

Resolves:
PR c++/100876 - -Wmismatched-new-delete should understand placement new when it's not inlined

gcc/ChangeLog:

        PR c++/100876
----------

Tobias

PS: My feeling is that you like the proposed mklog patch :-)

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-18 11:25                                                                     ` Tobias Burnus
@ 2021-06-18 11:40                                                                       ` Jonathan Wakely
  2021-06-21  7:28                                                                         ` Martin Liška
  2021-06-18 16:40                                                                       ` [Patch] contrib/mklog.py: Improve PR handling Martin Sebor
  1 sibling, 1 reply; 84+ messages in thread
From: Jonathan Wakely @ 2021-06-18 11:40 UTC (permalink / raw)
  To: Tobias Burnus
  Cc: Richard Earnshaw, Jakub Jelinek, Joseph Myers, Martin Liška,
	gcc Mailing List

On Fri, 18 Jun 2021 at 12:26, Tobias Burnus <tobias@codesourcery.com> wrote:
>
> On 18.06.21 13:10, Jonathan Wakely wrote:
>
> > On Fri, 18 Jun 2021 at 12:05, Tobias Burnus wrote:
> >> PR c++/12394 - internal compiler error: in write_type, at cp/mangle.c:1517
> >> PR fortran/100123 - -ftree-fre gives incorrect result in subroutine with array declared as length 1
> >>           PR c++/12394
> >>           PR fortran/100123
> > Now that we put these PR cmpt/nnnn lines before the xxx/ChangeLog
> > entries, is there any reason to indent them with a TAB?
>
> My understanding with the PR before vs. after the xxx/ChangeLog is
> that if you put them before, they apply to all changelog entries
> and if you put them afterward, it only only applies to the those
> ChangeLog files for which they have been mentioned.

That's my understanding too.

> (But I also might have misread the code/were misguided by the
> var names.)
>
> And when manually creating the changelog from scratch, they still
> often end up after the ChangeLog line.
>
> But otherwise, I am not attached to the tab. Just one thing:
> It might be slightly inconsistent to require the tab after the
> xxx/ChangeLog line but not before that line. And I think a
> tab should be used after the xxx/ChangeLog line.

I agree it should be used after the xxx/ChangeLog line, but I am OK
with the inconsistency. *Everything* after the xxx/ChangeLog line
(except the next yyy/ChangeLog line) is indented.

The current output of mklog.py just looks weird:

        PR cmpt/123456

xxx/ChangeLog:

       * file.c (func): Description.

The changelog entry is indented because that's what the actual
ChangeLog files use, but also because it makes it distinct from the
"xxx/ChangeLog" heading before it. But there is no "heading" line
before the PR number that it needs to stand out from. The PR line just
seems to be floating in the middle of the page, looking a bit lost.

I think what matters is that the script that generates the ChangeLog
files can identify the "PR cmpt/123456" part and add it to the
ChangeLog file. If it can identify that without a leading TAB (and
insert one for the text added to the ChangeLog file) then I think we
shouldn't require the TAB. And if it can still identify it even with
text following the PR number, then I think we should be allowed to
have text following the PR number.

> PS: My feeling is that you like the proposed mklog patch :-)

Yes, I think it's a nice improvement, sorry for not saying that explicitly.


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-18 11:05                                                                 ` [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Tobias Burnus
  2021-06-18 11:10                                                                   ` Jonathan Wakely
@ 2021-06-18 14:41                                                                   ` Jason Merrill
  2021-06-18 16:47                                                                     ` [Patch] contrib/mklog.py: Improve PR handling Martin Sebor
  2021-06-21  6:42                                                                     ` [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Tobias Burnus
  1 sibling, 2 replies; 84+ messages in thread
From: Jason Merrill @ 2021-06-18 14:41 UTC (permalink / raw)
  To: Tobias Burnus
  Cc: Richard Earnshaw, Jakub Jelinek, Joseph Myers, Martin Liška,
	gcc Mailing List, Jonathan Wakely

On Fri, Jun 18, 2021 at 7:06 AM Tobias Burnus <tobias@codesourcery.com>
wrote:

> On 18.06.21 11:32, Richard Earnshaw via Gcc wrote:
> > On 17/06/2021 18:21, Jakub Jelinek wrote:
> >> mklog as is doesn't fill in the details (descriptions of the changes
> >> to each function etc.), nor is realiable in many cases, and with Jason's
> >> recent change just fills in the first and last part of the first line
> >> but not the important middle part.
> >> So, the developer has to hand edit it anyway and that I'd consider also
> >> be the right time when the verification whether the PR being mentioned
> >> is the right one etc.  So no need to add a question asked by the script
> >> at another point.
> > That misses my point.  If we use a tool to help doing this we can make
> > the tool also scrape the entry out of bugzilla and print the summary
> > line as a stronger visual check that the number has been typed
> > correctly.  We get many bug attributions wrong simply because two digits
> > have been transposed: visually checking the summary line is a far
> > stronger check that the correct number has been entered.
>
> I want to point out that 'mklog -p' already outputs the bug-summary lines
> at the top of the generated changelog, by fetching them from Bugzilla
>
> This patch extends this by:
> * Being able to specify the PR numbers on the command line in addition
>    (currently, they are only extracted from the testsuite patches)
>

This bit seems unnecessary to me, since we want the commit to include tests
that identify the PR.

Martin Sebor's patch to extract the PR number from the testcase filename,
as an alternative to a comment, should be enough.

Jason

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch] contrib/mklog.py: Improve PR handling
  2021-06-18 11:25                                                                     ` Tobias Burnus
  2021-06-18 11:40                                                                       ` Jonathan Wakely
@ 2021-06-18 16:40                                                                       ` Martin Sebor
  1 sibling, 0 replies; 84+ messages in thread
From: Martin Sebor @ 2021-06-18 16:40 UTC (permalink / raw)
  To: Tobias Burnus, Jonathan Wakely
  Cc: Jakub Jelinek, gcc Mailing List, Richard Earnshaw, Joseph Myers

On 6/18/21 5:25 AM, Tobias Burnus wrote:
> On 18.06.21 13:10, Jonathan Wakely wrote:
> 
>> On Fri, 18 Jun 2021 at 12:05, Tobias Burnus wrote:
>>> PR c++/12394 - internal compiler error: in write_type, at 
>>> cp/mangle.c:1517
>>> PR fortran/100123 - -ftree-fre gives incorrect result in subroutine 
>>> with array declared as length 1
>>>           PR c++/12394
>>>           PR fortran/100123
>> Now that we put these PR cmpt/nnnn lines before the xxx/ChangeLog
>> entries, is there any reason to indent them with a TAB?
> 
> My understanding with the PR before vs. after the xxx/ChangeLog is
> that if you put them before, they apply to all changelog entries
> and if you put them afterward, it only only applies to the those
> ChangeLog files for which they have been mentioned.
> 
> (But I also might have misread the code/were misguided by the
> var names.)
> 
> And when manually creating the changelog from scratch, they still
> often end up after the ChangeLog line.
> 
> But otherwise, I am not attached to the tab. Just one thing:
> It might be slightly inconsistent to require the tab after the
> xxx/ChangeLog line but not before that line. And I think a
> tab should be used after the xxx/ChangeLog line.
> 
>> If we didn't require the TAB before them, then the lines added by -p
>> would serve the same purpose, and we wouldn't need to repeat them
>>
>> So instead of requiring "^\tPR .*/\d+$" for the PR entry, require
>> "^\t?PR ([^/]+)/\d+"
> I think we should require either ($|\s) after the PR number but
> otherwise, it works for me.
> 
> I am sure that Martin Sebor will like this change as he currently
> writes changelogs (see in this thread or) like the following one.

I do like adding the bug title and extracting the component from
Bugzilla.  IIRC, Martin Liska implemented -p in mklog.py on my
request (although I just discovered the option recently). I was
going to propose something similar to your solution:
   https://gcc.gnu.org/pipermail/gcc/2021-June/236460.html
but I'm happy to let you run with it.

Thanks
Martin

> 
> In that sense, there would be a precedent in terms of actual usage.
> 
> ----------
> Teach compute_objsize about placement new [PR100876].
> 
> Resolves:
> PR c++/100876 - -Wmismatched-new-delete should understand placement new 
> when it's not inlined
> 
> gcc/ChangeLog:
> 
>         PR c++/100876
> ----------
> 
> Tobias
> 
> PS: My feeling is that you like the proposed mklog patch :-)
> 
> -----------------
> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, 
> Frank Thürauf


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch] contrib/mklog.py: Improve PR handling
  2021-06-18 14:41                                                                   ` [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Jason Merrill
@ 2021-06-18 16:47                                                                     ` Martin Sebor
  2021-06-18 16:59                                                                       ` Iain Sandoe
  2021-06-21  6:42                                                                     ` [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Tobias Burnus
  1 sibling, 1 reply; 84+ messages in thread
From: Martin Sebor @ 2021-06-18 16:47 UTC (permalink / raw)
  To: Jason Merrill, Tobias Burnus
  Cc: Jakub Jelinek, gcc Mailing List, Richard Earnshaw,
	Jonathan Wakely, Joseph Myers

On 6/18/21 8:41 AM, Jason Merrill via Gcc wrote:
> On Fri, Jun 18, 2021 at 7:06 AM Tobias Burnus <tobias@codesourcery.com>
> wrote:
> 
>> On 18.06.21 11:32, Richard Earnshaw via Gcc wrote:
>>> On 17/06/2021 18:21, Jakub Jelinek wrote:
>>>> mklog as is doesn't fill in the details (descriptions of the changes
>>>> to each function etc.), nor is realiable in many cases, and with Jason's
>>>> recent change just fills in the first and last part of the first line
>>>> but not the important middle part.
>>>> So, the developer has to hand edit it anyway and that I'd consider also
>>>> be the right time when the verification whether the PR being mentioned
>>>> is the right one etc.  So no need to add a question asked by the script
>>>> at another point.
>>> That misses my point.  If we use a tool to help doing this we can make
>>> the tool also scrape the entry out of bugzilla and print the summary
>>> line as a stronger visual check that the number has been typed
>>> correctly.  We get many bug attributions wrong simply because two digits
>>> have been transposed: visually checking the summary line is a far
>>> stronger check that the correct number has been entered.
>>
>> I want to point out that 'mklog -p' already outputs the bug-summary lines
>> at the top of the generated changelog, by fetching them from Bugzilla
>>
>> This patch extends this by:
>> * Being able to specify the PR numbers on the command line in addition
>>     (currently, they are only extracted from the testsuite patches)
>>
> 
> This bit seems unnecessary to me, since we want the commit to include tests
> that identify the PR.
> 
> Martin Sebor's patch to extract the PR number from the testcase filename,
> as an alternative to a comment, should be enough.

Just to prevent a deadlock: Tobias, please feel free to reuse any
parts of my patch if you wish, or just go with what you have if you
prefer.  I can submit whatever's left after you're done.

Martin

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch] contrib/mklog.py: Improve PR handling
  2021-06-18 16:47                                                                     ` [Patch] contrib/mklog.py: Improve PR handling Martin Sebor
@ 2021-06-18 16:59                                                                       ` Iain Sandoe
  0 siblings, 0 replies; 84+ messages in thread
From: Iain Sandoe @ 2021-06-18 16:59 UTC (permalink / raw)
  To: Tobias Burnus
  Cc: Jason Merrill, Jakub Jelinek, gcc Mailing List, Jonathan Wakely,
	Richard Earnshaw, Martin Sebor, Joseph Myers



> On 18 Jun 2021, at 17:47, Martin Sebor via Gcc <gcc@gcc.gnu.org> wrote:
> 
> On 6/18/21 8:41 AM, Jason Merrill via Gcc wrote:
>> On Fri, Jun 18, 2021 at 7:06 AM Tobias Burnus <tobias@codesourcery.com>
>> wrote:
>>> On 18.06.21 11:32, Richard Earnshaw via Gcc wrote:
>>>> On 17/06/2021 18:21, Jakub Jelinek wrote:
>>>>> mklog as is doesn't fill in the details (descriptions of the changes
>>>>> to each function etc.), nor is realiable in many cases, and with Jason's
>>>>> recent change just fills in the first and last part of the first line
>>>>> but not the important middle part.
>>>>> So, the developer has to hand edit it anyway and that I'd consider also
>>>>> be the right time when the verification whether the PR being mentioned
>>>>> is the right one etc.  So no need to add a question asked by the script
>>>>> at another point.
>>>> That misses my point.  If we use a tool to help doing this we can make
>>>> the tool also scrape the entry out of bugzilla and print the summary
>>>> line as a stronger visual check that the number has been typed
>>>> correctly.  We get many bug attributions wrong simply because two digits
>>>> have been transposed: visually checking the summary line is a far
>>>> stronger check that the correct number has been entered.
>>> 
>>> I want to point out that 'mklog -p' already outputs the bug-summary lines
>>> at the top of the generated changelog, by fetching them from Bugzilla
>>> 
>>> This patch extends this by:
>>> * Being able to specify the PR numbers on the command line in addition
>>>    (currently, they are only extracted from the testsuite patches)
>>> 
>> This bit seems unnecessary to me, since we want the commit to include tests
>> that identify the PR.
>> Martin Sebor's patch to extract the PR number from the testcase filename,
>> as an alternative to a comment, should be enough.
> 
> Just to prevent a deadlock: Tobias, please feel free to reuse any
> parts of my patch if you wish, or just go with what you have if you
> prefer.  I can submit whatever's left after you're done.

Hopefully, these improvements can still be used with ‘ git commit-mklog … ‘ so that
there’s some what to separate out e.g. ‘-s’ (to the commit) from any arguments used
for the mklog.

I’m totally in favour of automating this kind of thing (our current system is a huge
improvement over what we had) - but we need to cater for a variety of work-flows.



^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-18 14:41                                                                   ` [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Jason Merrill
  2021-06-18 16:47                                                                     ` [Patch] contrib/mklog.py: Improve PR handling Martin Sebor
@ 2021-06-21  6:42                                                                     ` Tobias Burnus
  2021-06-21  7:26                                                                       ` Martin Liška
  1 sibling, 1 reply; 84+ messages in thread
From: Tobias Burnus @ 2021-06-21  6:42 UTC (permalink / raw)
  To: Jason Merrill
  Cc: Richard Earnshaw, Jakub Jelinek, Joseph Myers, Martin Liška,
	gcc Mailing List, Jonathan Wakely

On 18.06.21 16:41, Jason Merrill wrote:

>
>     * Being able to specify the PR numbers on the command line in addition
>        (currently, they are only extracted from the testsuite patches)
>
>
> This bit seems unnecessary to me, since we want the commit to include
> tests that identify the PR.
I full hearty disagree!
> Martin Sebor's patch to extract the PR number from the testcase
> filename, as an alternative to a comment, should be enough.

Regarding the PR from the filename, are we only talking about
new files – or about modifications to old files?

For new files, together with -p it seems to be useful; for
old files, I think there will be more spurious/false PRs than
useful/right PRs.

Looking through the last few commits, in all of the following
cases the -b would be useful — and I do regard the commits
messages are sensible:

* cc9c94d43dcfa98436152af9c00f011e9dab25f6 PR libstdc++/100387
   added testsuite/25_algorithms/minmax/constrained.cc (etc.)
   but file neither has a pr... name nor contains a PR line
   (BTW: The fix is not primarily for the PR but fixes it as,
    a side effect; hence, not having a PR line makes sense)
* 870b674f72d4894b94efa61764fd87ecec29ffde
   'ranger' - removes a datastructure (no testcase but also
   does not make sense here)
* 17a4bee01c3b29c5ccdd39f34384521e5d44135b
   Fixes an ICE for one testcase (but of course does not add
   a new testcase or modify an existing one.)
* 76e990fd211cbb20bf74ce074eb8b2d7b096d3b7
   fix a bootstrap issue (found with the GCC 11 backport).

Those are all commits between last Friday and today. Hence, I do
believe it makes sense to be able to specify the PR on the
command line — besides obtaining it from file names.

Tobias

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-21  6:42                                                                     ` [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Tobias Burnus
@ 2021-06-21  7:26                                                                       ` Martin Liška
  2021-06-21  8:02                                                                         ` Iain Sandoe
  0 siblings, 1 reply; 84+ messages in thread
From: Martin Liška @ 2021-06-21  7:26 UTC (permalink / raw)
  To: Tobias Burnus, Jason Merrill
  Cc: Richard Earnshaw, Jakub Jelinek, Joseph Myers, gcc Mailing List,
	Jonathan Wakely

On 6/21/21 8:42 AM, Tobias Burnus wrote:
> On 18.06.21 16:41, Jason Merrill wrote:
> 
>>
>>     * Being able to specify the PR numbers on the command line in addition
>>        (currently, they are only extracted from the testsuite patches)
>>
>>
>> This bit seems unnecessary to me, since we want the commit to include
>> tests that identify the PR.
> I full hearty disagree!

Hello.

I do support Tobias' patch and I really think -b argument is handy. Question is,
if one can use it from 'git commit-mklog' hook? I'm asking because I don't use mklog.py
directly.

Martin

>> Martin Sebor's patch to extract the PR number from the testcase
>> filename, as an alternative to a comment, should be enough.
> 
> Regarding the PR from the filename, are we only talking about
> new files – or about modifications to old files?
> 
> For new files, together with -p it seems to be useful; for
> old files, I think there will be more spurious/false PRs than
> useful/right PRs.
> 
> Looking through the last few commits, in all of the following
> cases the -b would be useful — and I do regard the commits
> messages are sensible:
> 
> * cc9c94d43dcfa98436152af9c00f011e9dab25f6 PR libstdc++/100387
>    added testsuite/25_algorithms/minmax/constrained.cc (etc.)
>    but file neither has a pr... name nor contains a PR line
>    (BTW: The fix is not primarily for the PR but fixes it as,
>     a side effect; hence, not having a PR line makes sense)
> * 870b674f72d4894b94efa61764fd87ecec29ffde
>    'ranger' - removes a datastructure (no testcase but also
>    does not make sense here)
> * 17a4bee01c3b29c5ccdd39f34384521e5d44135b
>    Fixes an ICE for one testcase (but of course does not add
>    a new testcase or modify an existing one.)
> * 76e990fd211cbb20bf74ce074eb8b2d7b096d3b7
>    fix a bootstrap issue (found with the GCC 11 backport).
> 
> Those are all commits between last Friday and today. Hence, I do
> believe it makes sense to be able to specify the PR on the
> command line — besides obtaining it from file names.
> 
> Tobias
> 
> -----------------
> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-18 11:40                                                                       ` Jonathan Wakely
@ 2021-06-21  7:28                                                                         ` Martin Liška
  0 siblings, 0 replies; 84+ messages in thread
From: Martin Liška @ 2021-06-21  7:28 UTC (permalink / raw)
  To: Jonathan Wakely, Tobias Burnus
  Cc: Richard Earnshaw, Jakub Jelinek, Joseph Myers, gcc Mailing List

On 6/18/21 1:40 PM, Jonathan Wakely wrote:
> On Fri, 18 Jun 2021 at 12:26, Tobias Burnus <tobias@codesourcery.com> wrote:
>>
>> On 18.06.21 13:10, Jonathan Wakely wrote:
>>
>>> On Fri, 18 Jun 2021 at 12:05, Tobias Burnus wrote:
>>>> PR c++/12394 - internal compiler error: in write_type, at cp/mangle.c:1517
>>>> PR fortran/100123 - -ftree-fre gives incorrect result in subroutine with array declared as length 1
>>>>            PR c++/12394
>>>>            PR fortran/100123
>>> Now that we put these PR cmpt/nnnn lines before the xxx/ChangeLog
>>> entries, is there any reason to indent them with a TAB?
>>
>> My understanding with the PR before vs. after the xxx/ChangeLog is
>> that if you put them before, they apply to all changelog entries
>> and if you put them afterward, it only only applies to the those
>> ChangeLog files for which they have been mentioned.
> 
> That's my understanding too.
> 
>> (But I also might have misread the code/were misguided by the
>> var names.)
>>
>> And when manually creating the changelog from scratch, they still
>> often end up after the ChangeLog line.
>>
>> But otherwise, I am not attached to the tab. Just one thing:
>> It might be slightly inconsistent to require the tab after the
>> xxx/ChangeLog line but not before that line. And I think a
>> tab should be used after the xxx/ChangeLog line.
> 
> I agree it should be used after the xxx/ChangeLog line, but I am OK
> with the inconsistency. *Everything* after the xxx/ChangeLog line
> (except the next yyy/ChangeLog line) is indented.
> 
> The current output of mklog.py just looks weird:
> 
>          PR cmpt/123456
> 
> xxx/ChangeLog:
> 
>         * file.c (func): Description.

I agree we should support the relaxed format like:
PR c++/12394 - internal compiler error: in write_type, at cp/mangle.c:1517

What's tricky in gcc-changelog script that one needs to split pre-ChangeLog part
and ChangeLog parts. That's make it more complicated, but doable.

Martin

> 
> The changelog entry is indented because that's what the actual
> ChangeLog files use, but also because it makes it distinct from the
> "xxx/ChangeLog" heading before it. But there is no "heading" line
> before the PR number that it needs to stand out from. The PR line just
> seems to be floating in the middle of the page, looking a bit lost.
> 
> I think what matters is that the script that generates the ChangeLog
> files can identify the "PR cmpt/123456" part and add it to the
> ChangeLog file. If it can identify that without a leading TAB (and
> insert one for the text added to the ChangeLog file) then I think we
> shouldn't require the TAB. And if it can still identify it even with
> text following the PR number, then I think we should be allowed to
> have text following the PR number.
> 
>> PS: My feeling is that you like the proposed mklog patch :-)
> 
> Yes, I think it's a nice improvement, sorry for not saying that explicitly.
> 


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [Patch, v2] contrib/mklog.py: Improve PR handling (was: Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-17  0:17                                                     ` Martin Sebor
  2021-06-17  0:40                                                       ` Jason Merrill
@ 2021-06-21  7:54                                                       ` Tobias Burnus
  2021-06-21  8:09                                                         ` Martin Liška
  1 sibling, 1 reply; 84+ messages in thread
From: Tobias Burnus @ 2021-06-21  7:54 UTC (permalink / raw)
  To: Martin Sebor, Jason Merrill, Martin Liška
  Cc: Jakub Jelinek, gcc Mailing List, Jonathan Wakely, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2417 bytes --]

On 17.06.21 02:17, Martin Sebor via Gcc wrote:
> @@ -147,6 +152,12 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
>
>           # Extract PR entries from newly added tests
>           if 'testsuite' in file.path and file.is_added_file:
> +            name = os.path.basename(file.path)
> +            name = os.path.splitext(name)[0]
> +            if name.startswith("pr"):
> +                name = name[2:]
> +                name = "PR " + name
> +                prs.append(name)

I think you need a regular expression to extract the PR – as it will both match
too much and to little. We have file names such as:
* libstdc++-pr91488.C (other prefix)
* PR37039.f90  (capitalized PR)
* pr98218-1.C (suffix with '-')
* pr40724_1.f (suffix with '_')
* pr101023a.C (suffix with a letter)

But otherwise, I like that idea.

  * * *

Changes in my patch compared to v1:
- (From Martin's patch:) Extract the PR from new-files file
   name (using pattern matching), but only take the PR if the
   PR wasn't found in the file as PR comment.
   (The latter happens, e.g., with b376b1ef389.)
- Avoid printing the same PR multiple times as summary line
   (duplicates occur due to 'PR 134' vs. 'PR comp/123' vs.
   'PR othercomp/123') — This does not avoid all issues but at least
   some. If this becomes a real world issue, we can try harder.

OK to commit this one? — Comments?

  * * *

I did leave out other changes as they seem to be less clear cut,
and which can be still be handled as follow up. Like:
- Adding 'Resolves:' (as in some cases it only resolves part of
   the PR)
- ... other changes/patches I missed. (This thread has too many
   emails.) In particular, if
   ^PR <comp>/<pr> - ....
   is accepted by gcc-commit/, then there is no need to list the
   PRs individually later on. But currently, it is still required.

  * * *

Cross ref:
* v1 of my patch was at
   https://gcc.gnu.org/pipermail/gcc/2021-June/236498.html
* Discussion of the -b option is at
   https://gcc.gnu.org/pipermail/gcc/2021-June/236519.html
* Martin S's patch (partially quoted above) is at
   https://gcc.gnu.org/pipermail/gcc/2021-June/236460.html

Tobias

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

[-- Attachment #2: mklog-p-v2.diff --]
[-- Type: text/x-patch, Size: 5558 bytes --]

contrib/mklog.py: Improve PR handling

Co-authored-by: Martin Sebor <msebor@redhat.com>

contrib/ChangeLog:

	* mklog.py (bugzilla_url): Fetch also component.
	(pr_filename_regex): New.
	(get_pr_titles): Update PR string with correct format and component.
	(generate_changelog): Take additional PRs; extract PR from the
	filename.
	(__main__): Add -b/--pr-numbers argument.

 contrib/mklog.py | 41 ++++++++++++++++++++++++++++++++---------
 1 file changed, 32 insertions(+), 9 deletions(-)

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 1f59055e723..bba6c1a0e1a 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -42,6 +42,7 @@ pr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<pr>PR [a-z+-]+\/[0-9]+)')
 prnum_regex = re.compile(r'PR (?P<comp>[a-z+-]+)/(?P<num>[0-9]+)')
 dr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<dr>DR [0-9]+)')
 dg_regex = re.compile(r'{\s+dg-(error|warning)')
+pr_filename_regex = re.compile(r'(^|[\W_])[Pp][Rr](?P<pr>\d{4,})')
 identifier_regex = re.compile(r'^([a-zA-Z0-9_#].*)')
 comment_regex = re.compile(r'^\/\*')
 struct_regex = re.compile(r'^(class|struct|union|enum)\s+'
@@ -52,7 +53,7 @@ fn_regex = re.compile(r'([a-zA-Z_][^()\s]*)\s*\([^*]')
 template_and_param_regex = re.compile(r'<[^<>]*>')
 md_def_regex = re.compile(r'\(define.*\s+"(.*)"')
 bugzilla_url = 'https://gcc.gnu.org/bugzilla/rest.cgi/bug?id=%s&' \
-               'include_fields=summary'
+               'include_fields=summary,component'
 
 function_extensions = {'.c', '.cpp', '.C', '.cc', '.h', '.inc', '.def', '.md'}
 
@@ -118,20 +119,23 @@ def sort_changelog_files(changed_file):
 
 
 def get_pr_titles(prs):
-    output = ''
-    for pr in prs:
+    output = []
+    for idx, pr in enumerate(prs):
         pr_id = pr.split('/')[-1]
         r = requests.get(bugzilla_url % pr_id)
         bugs = r.json()['bugs']
         if len(bugs) == 1:
-            output += '%s - %s\n' % (pr, bugs[0]['summary'])
-            print(output)
+            prs[idx] = 'PR %s/%s' % (bugs[0]['component'], pr_id)
+            out = '%s - %s\n' % (prs[idx], bugs[0]['summary'])
+            if out not in output:
+                output.append(out)
     if output:
-        output += '\n'
-    return output
+        output.append('')
+    return '\n'.join(output)
 
 
-def generate_changelog(data, no_functions=False, fill_pr_titles=False):
+def generate_changelog(data, no_functions=False, fill_pr_titles=False,
+                       additional_prs=None):
     changelogs = {}
     changelog_list = []
     prs = []
@@ -139,6 +143,8 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
     diff = PatchSet(data)
     global firstpr
 
+    if additional_prs:
+        prs = [pr for pr in additional_prs if pr not in prs]
     for file in diff:
         # skip files that can't be parsed
         if file.path == '/dev/null':
@@ -154,21 +160,33 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
             # Only search first ten lines as later lines may
             # contains commented code which a note that it
             # has not been tested due to a certain PR or DR.
+            this_file_prs = []
             for line in list(file)[0][0:10]:
                 m = pr_regex.search(line.value)
                 if m:
                     pr = m.group('pr')
                     if pr not in prs:
                         prs.append(pr)
+                        this_file_prs.append(pr.split('/')[-1])
                 else:
                     m = dr_regex.search(line.value)
                     if m:
                         dr = m.group('dr')
                         if dr not in prs:
                             prs.append(dr)
+                            this_file_prs.append(dr.split('/')[-1])
                     elif dg_regex.search(line.value):
                         # Found dg-warning/dg-error line
                         break
+            # PR number in the file name
+            fname = os.path.basename(file.path)
+            fname = os.path.splitext(fname)[0]
+            m = pr_filename_regex.search(fname)
+            if m:
+                pr = m.group('pr')
+                pr2 = "PR " + pr
+                if pr not in this_file_prs and pr2 not in prs:
+                    prs.append(pr2)
 
     if prs:
         firstpr = prs[0]
@@ -286,6 +304,8 @@ if __name__ == '__main__':
     parser = argparse.ArgumentParser(description=help_message)
     parser.add_argument('input', nargs='?',
                         help='Patch file (or missing, read standard input)')
+    parser.add_argument('-b', '--pr-numbers', action='append',
+                        help='Add the specified PRs (comma separated)')
     parser.add_argument('-s', '--no-functions', action='store_true',
                         help='Do not generate function names in ChangeLogs')
     parser.add_argument('-p', '--fill-up-bug-titles', action='store_true',
@@ -308,8 +328,11 @@ if __name__ == '__main__':
     if args.update_copyright:
         update_copyright(data)
     else:
+        pr_numbers = args.pr_numbers
+        if pr_numbers:
+            pr_numbers = [b for i in args.pr_numbers for b in i.split(',')]
         output = generate_changelog(data, args.no_functions,
-                                    args.fill_up_bug_titles)
+                                    args.fill_up_bug_titles, pr_numbers)
         if args.changelog:
             lines = open(args.changelog).read().split('\n')
             start = list(takewhile(lambda l: not l.startswith('#'), lines))

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-21  7:26                                                                       ` Martin Liška
@ 2021-06-21  8:02                                                                         ` Iain Sandoe
  0 siblings, 0 replies; 84+ messages in thread
From: Iain Sandoe @ 2021-06-21  8:02 UTC (permalink / raw)
  To: Martin Liska
  Cc: Tobias Burnus, Jason Merrill, Jakub Jelinek, gcc Mailing List,
	Jonathan Wakely, Richard Earnshaw, Joseph Myers



> On 21 Jun 2021, at 08:26, Martin Liška <mliska@suse.cz> wrote:
> 
> On 6/21/21 8:42 AM, Tobias Burnus wrote:
>> On 18.06.21 16:41, Jason Merrill wrote:
>>> 
>>>     * Being able to specify the PR numbers on the command line in addition
>>>        (currently, they are only extracted from the testsuite patches)
>>> 
>>> 
>>> This bit seems unnecessary to me, since we want the commit to include
>>> tests that identify the PR.
>> I full hearty disagree!
> 
> Hello.
> 
> I do support Tobias' patch and I really think -b argument is handy. Question is,
> if one can use it from 'git commit-mklog' hook? I'm asking because I don't use mklog.py
> directly.

+1 …

and probably including options to the commit .. 
so for something like

git commit-mklog -s -b xxxxxx

or even (if this makes sense )

git commit-mklog —amend -b xxxxx

perhaps we need a way to indicate which sub-process the options are for?

Iain

> 
> Martin
> 
>>> Martin Sebor's patch to extract the PR number from the testcase
>>> filename, as an alternative to a comment, should be enough.
>> Regarding the PR from the filename, are we only talking about
>> new files – or about modifications to old files?
>> For new files, together with -p it seems to be useful; for
>> old files, I think there will be more spurious/false PRs than
>> useful/right PRs.
>> Looking through the last few commits, in all of the following
>> cases the -b would be useful — and I do regard the commits
>> messages are sensible:
>> * cc9c94d43dcfa98436152af9c00f011e9dab25f6 PR libstdc++/100387
>>   added testsuite/25_algorithms/minmax/constrained.cc (etc.)
>>   but file neither has a pr... name nor contains a PR line
>>   (BTW: The fix is not primarily for the PR but fixes it as,
>>    a side effect; hence, not having a PR line makes sense)
>> * 870b674f72d4894b94efa61764fd87ecec29ffde
>>   'ranger' - removes a datastructure (no testcase but also
>>   does not make sense here)
>> * 17a4bee01c3b29c5ccdd39f34384521e5d44135b
>>   Fixes an ICE for one testcase (but of course does not add
>>   a new testcase or modify an existing one.)
>> * 76e990fd211cbb20bf74ce074eb8b2d7b096d3b7
>>   fix a bootstrap issue (found with the GCC 11 backport).
>> Those are all commits between last Friday and today. Hence, I do
>> believe it makes sense to be able to specify the PR on the
>> command line — besides obtaining it from file names.
>> Tobias
>> -----------------
>> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf
> 


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch, v2] contrib/mklog.py: Improve PR handling (was: Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-21  7:54                                                       ` [Patch, v2] contrib/mklog.py: Improve PR handling (was: " Tobias Burnus
@ 2021-06-21  8:09                                                         ` Martin Liška
  2021-06-21  8:37                                                           ` Tobias Burnus
  0 siblings, 1 reply; 84+ messages in thread
From: Martin Liška @ 2021-06-21  8:09 UTC (permalink / raw)
  To: Tobias Burnus, Martin Sebor, Jason Merrill
  Cc: Jakub Jelinek, gcc Mailing List, Jonathan Wakely, gcc-patches

Hi.

I see the following warnings:

$ pytest test_mklog.py
FAILED test_mklog.py::TestMklog::test_sorting - AssertionError: assert '\n\tPR 50209...New test.\n\n' == 'gcc/ChangeLo...New test.\n\n'

$ flake8 mklog.py
mklog.py:187:23: Q000 Remove bad quotes

and ...

> contrib/mklog.py: Improve PR handling
> 
> Co-authored-by: Martin Sebor <msebor@redhat.com>
> 
> contrib/ChangeLog:
> 
> 	* mklog.py (bugzilla_url): Fetch also component.
> 	(pr_filename_regex): New.
> 	(get_pr_titles): Update PR string with correct format and component.
> 	(generate_changelog): Take additional PRs; extract PR from the
> 	filename.
> 	(__main__): Add -b/--pr-numbers argument.
> 
>  contrib/mklog.py | 41 ++++++++++++++++++++++++++++++++---------
>  1 file changed, 32 insertions(+), 9 deletions(-)
> 
> diff --git a/contrib/mklog.py b/contrib/mklog.py
> index 1f59055e723..bba6c1a0e1a 100755
> --- a/contrib/mklog.py
> +++ b/contrib/mklog.py
> @@ -42,6 +42,7 @@ pr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<pr>PR [a-z+-]+\/[0-9]+)')
>  prnum_regex = re.compile(r'PR (?P<comp>[a-z+-]+)/(?P<num>[0-9]+)')
>  dr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<dr>DR [0-9]+)')
>  dg_regex = re.compile(r'{\s+dg-(error|warning)')
> +pr_filename_regex = re.compile(r'(^|[\W_])[Pp][Rr](?P<pr>\d{4,})')
>  identifier_regex = re.compile(r'^([a-zA-Z0-9_#].*)')
>  comment_regex = re.compile(r'^\/\*')
>  struct_regex = re.compile(r'^(class|struct|union|enum)\s+'
> @@ -52,7 +53,7 @@ fn_regex = re.compile(r'([a-zA-Z_][^()\s]*)\s*\([^*]')
>  template_and_param_regex = re.compile(r'<[^<>]*>')
>  md_def_regex = re.compile(r'\(define.*\s+"(.*)"')
>  bugzilla_url = 'https://gcc.gnu.org/bugzilla/rest.cgi/bug?id=%s&' \
> -               'include_fields=summary'
> +               'include_fields=summary,component'
>  
>  function_extensions = {'.c', '.cpp', '.C', '.cc', '.h', '.inc', '.def', '.md'}
>  
> @@ -118,20 +119,23 @@ def sort_changelog_files(changed_file):
>  
>  
>  def get_pr_titles(prs):
> -    output = ''
> -    for pr in prs:
> +    output = []
> +    for idx, pr in enumerate(prs):
>          pr_id = pr.split('/')[-1]
>          r = requests.get(bugzilla_url % pr_id)
>          bugs = r.json()['bugs']
>          if len(bugs) == 1:
> -            output += '%s - %s\n' % (pr, bugs[0]['summary'])
> -            print(output)
> +            prs[idx] = 'PR %s/%s' % (bugs[0]['component'], pr_id)
> +            out = '%s - %s\n' % (prs[idx], bugs[0]['summary'])
> +            if out not in output:
> +                output.append(out)
>      if output:
> -        output += '\n'
> -    return output
> +        output.append('')
> +    return '\n'.join(output)
>  
>  
> -def generate_changelog(data, no_functions=False, fill_pr_titles=False):
> +def generate_changelog(data, no_functions=False, fill_pr_titles=False,
> +                       additional_prs=None):
>      changelogs = {}
>      changelog_list = []
>      prs = []
> @@ -139,6 +143,8 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
>      diff = PatchSet(data)
>      global firstpr
>  
> +    if additional_prs:
> +        prs = [pr for pr in additional_prs if pr not in prs]
>      for file in diff:
>          # skip files that can't be parsed
>          if file.path == '/dev/null':
> @@ -154,21 +160,33 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
>              # Only search first ten lines as later lines may
>              # contains commented code which a note that it
>              # has not been tested due to a certain PR or DR.
> +            this_file_prs = []
>              for line in list(file)[0][0:10]:
>                  m = pr_regex.search(line.value)
>                  if m:
>                      pr = m.group('pr')
>                      if pr not in prs:
>                          prs.append(pr)
> +                        this_file_prs.append(pr.split('/')[-1])
>                  else:
>                      m = dr_regex.search(line.value)
>                      if m:
>                          dr = m.group('dr')
>                          if dr not in prs:
>                              prs.append(dr)
> +                            this_file_prs.append(dr.split('/')[-1])
>                      elif dg_regex.search(line.value):
>                          # Found dg-warning/dg-error line
>                          break
> +            # PR number in the file name
> +            fname = os.path.basename(file.path)

This is a dead code.

> +            fname = os.path.splitext(fname)[0]
> +            m = pr_filename_regex.search(fname)
> +            if m:
> +                pr = m.group('pr')
> +                pr2 = "PR " + pr

Bad quotes here.

> +                if pr not in this_file_prs and pr2 not in prs:
> +                    prs.append(pr2)
>  
>      if prs:
>          firstpr = prs[0]
> @@ -286,6 +304,8 @@ if __name__ == '__main__':
>      parser = argparse.ArgumentParser(description=help_message)
>      parser.add_argument('input', nargs='?',
>                          help='Patch file (or missing, read standard input)')
> +    parser.add_argument('-b', '--pr-numbers', action='append',
> +                        help='Add the specified PRs (comma separated)')

Do we really want to support '-b 1 -b 2' and also -b '1,2' formats? Seems to me quite
complicated.

Cheers,
Martin

>      parser.add_argument('-s', '--no-functions', action='store_true',
>                          help='Do not generate function names in ChangeLogs')
>      parser.add_argument('-p', '--fill-up-bug-titles', action='store_true',
> @@ -308,8 +328,11 @@ if __name__ == '__main__':
>      if args.update_copyright:
>          update_copyright(data)
>      else:
> +        pr_numbers = args.pr_numbers
> +        if pr_numbers:
> +            pr_numbers = [b for i in args.pr_numbers for b in i.split(',')]
>          output = generate_changelog(data, args.no_functions,
> -                                    args.fill_up_bug_titles)
> +                                    args.fill_up_bug_titles, pr_numbers)
>          if args.changelog:
>              lines = open(args.changelog).read().split('\n')
>              start = list(takewhile(lambda l: not l.startswith('#'), lines))



On 6/21/21 9:54 AM, Tobias Burnus wrote:
> On 17.06.21 02:17, Martin Sebor via Gcc wrote:
>> @@ -147,6 +152,12 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
>>
>>           # Extract PR entries from newly added tests
>>           if 'testsuite' in file.path and file.is_added_file:
>> +            name = os.path.basename(file.path)
>> +            name = os.path.splitext(name)[0]
>> +            if name.startswith("pr"):
>> +                name = name[2:]
>> +                name = "PR " + name
>> +                prs.append(name)
> 
> I think you need a regular expression to extract the PR – as it will both match
> too much and to little. We have file names such as:
> * libstdc++-pr91488.C (other prefix)
> * PR37039.f90  (capitalized PR)
> * pr98218-1.C (suffix with '-')
> * pr40724_1.f (suffix with '_')
> * pr101023a.C (suffix with a letter)
> 
> But otherwise, I like that idea.
> 
>   * * *
> 
> Changes in my patch compared to v1:
> - (From Martin's patch:) Extract the PR from new-files file
>    name (using pattern matching), but only take the PR if the
>    PR wasn't found in the file as PR comment.
>    (The latter happens, e.g., with b376b1ef389.)
> - Avoid printing the same PR multiple times as summary line
>    (duplicates occur due to 'PR 134' vs. 'PR comp/123' vs.
>    'PR othercomp/123') — This does not avoid all issues but at least
>    some. If this becomes a real world issue, we can try harder.
> 
> OK to commit this one? — Comments?
> 
>   * * *
> 
> I did leave out other changes as they seem to be less clear cut,
> and which can be still be handled as follow up. Like:
> - Adding 'Resolves:' (as in some cases it only resolves part of
>    the PR)
> - ... other changes/patches I missed. (This thread has too many
>    emails.) In particular, if
>    ^PR <comp>/<pr> - ....
>    is accepted by gcc-commit/, then there is no need to list the
>    PRs individually later on. But currently, it is still required.
> 
>   * * *
> 
> Cross ref:
> * v1 of my patch was at
>    https://gcc.gnu.org/pipermail/gcc/2021-June/236498.html
> * Discussion of the -b option is at
>    https://gcc.gnu.org/pipermail/gcc/2021-June/236519.html
> * Martin S's patch (partially quoted above) is at
>    https://gcc.gnu.org/pipermail/gcc/2021-June/236460.html
> 
> Tobias
> 
> -----------------
> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch, v2] contrib/mklog.py: Improve PR handling (was: Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-21  8:09                                                         ` Martin Liška
@ 2021-06-21  8:37                                                           ` Tobias Burnus
  2021-06-21 12:53                                                             ` Martin Liška
  0 siblings, 1 reply; 84+ messages in thread
From: Tobias Burnus @ 2021-06-21  8:37 UTC (permalink / raw)
  To: Martin Liška, Martin Sebor, Jason Merrill
  Cc: Jakub Jelinek, gcc Mailing List, Jonathan Wakely, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1548 bytes --]

On 21.06.21 10:09, Martin Liška wrote:

> $ pytest test_mklog.py
> FAILED test_mklog.py::TestMklog::test_sorting - AssertionError: assert
> '\n\tPR 50209...New test.\n\n' == 'gcc/ChangeLo...New test.\n\n'
Aha, missed that there is indeed a testsuite - nice!
> $ flake8 mklog.py
> mklog.py:187:23: Q000 Remove bad quotes
I have now filled:
https://bugs.launchpad.net/ubuntu/+source/python-pytest-flake8/+bug/1933075

>> +            # PR number in the file name
>> +            fname = os.path.basename(file.path)
>
> This is a dead code.
>
>> + fname = os.path.splitext(fname)[0]
>> +            m = pr_filename_regex.search(fname)
It does not look like dead code to me.
>> + parser.add_argument('-b', '--pr-numbers', action='append',
>> +                        help='Add the specified PRs (comma separated)')
>
> Do we really want to support '-b 1 -b 2' and also -b '1,2' formats?
> Seems to me quite
> complicated.

I don't have a strong opinion. I started with '-b 123,245', believing
that the syntax is fine. But then I realized that without '-p'
specifying multiple '-b' looks better by having multiple '-b' if 'PR
<component>/'  (needed for -p as the string is than taken as is). Thus,
I ended up supporting either variant.

But I also happily drop the ',' support.

Change: One quote change, one test_mklog update.

Tobias

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

[-- Attachment #2: mklog-p-v3.diff --]
[-- Type: text/x-patch, Size: 5950 bytes --]

contrib/mklog.py: Improve PR handling

Co-authored-by: Martin Sebor <msebor@redhat.com>

contrib/ChangeLog:

	* mklog.py (bugzilla_url): Fetch also component.
	(pr_filename_regex): New.
	(get_pr_titles): Update PR string with correct format and component.
	(generate_changelog): Take additional PRs; extract PR from the
	filename.
	(__main__): Add -b/--pr-numbers argument.
	* test_mklog.py (EXPECTED4): Update to expect a PR for the new file.

 contrib/mklog.py      | 41 ++++++++++++++++++++++++++++++++---------
 contrib/test_mklog.py |  3 +++
 2 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 1f59055e723..e49d14d0859 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -42,6 +42,7 @@ pr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<pr>PR [a-z+-]+\/[0-9]+)')
 prnum_regex = re.compile(r'PR (?P<comp>[a-z+-]+)/(?P<num>[0-9]+)')
 dr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<dr>DR [0-9]+)')
 dg_regex = re.compile(r'{\s+dg-(error|warning)')
+pr_filename_regex = re.compile(r'(^|[\W_])[Pp][Rr](?P<pr>\d{4,})')
 identifier_regex = re.compile(r'^([a-zA-Z0-9_#].*)')
 comment_regex = re.compile(r'^\/\*')
 struct_regex = re.compile(r'^(class|struct|union|enum)\s+'
@@ -52,7 +53,7 @@ fn_regex = re.compile(r'([a-zA-Z_][^()\s]*)\s*\([^*]')
 template_and_param_regex = re.compile(r'<[^<>]*>')
 md_def_regex = re.compile(r'\(define.*\s+"(.*)"')
 bugzilla_url = 'https://gcc.gnu.org/bugzilla/rest.cgi/bug?id=%s&' \
-               'include_fields=summary'
+               'include_fields=summary,component'
 
 function_extensions = {'.c', '.cpp', '.C', '.cc', '.h', '.inc', '.def', '.md'}
 
@@ -118,20 +119,23 @@ def sort_changelog_files(changed_file):
 
 
 def get_pr_titles(prs):
-    output = ''
-    for pr in prs:
+    output = []
+    for idx, pr in enumerate(prs):
         pr_id = pr.split('/')[-1]
         r = requests.get(bugzilla_url % pr_id)
         bugs = r.json()['bugs']
         if len(bugs) == 1:
-            output += '%s - %s\n' % (pr, bugs[0]['summary'])
-            print(output)
+            prs[idx] = 'PR %s/%s' % (bugs[0]['component'], pr_id)
+            out = '%s - %s\n' % (prs[idx], bugs[0]['summary'])
+            if out not in output:
+                output.append(out)
     if output:
-        output += '\n'
-    return output
+        output.append('')
+    return '\n'.join(output)
 
 
-def generate_changelog(data, no_functions=False, fill_pr_titles=False):
+def generate_changelog(data, no_functions=False, fill_pr_titles=False,
+                       additional_prs=None):
     changelogs = {}
     changelog_list = []
     prs = []
@@ -139,6 +143,8 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
     diff = PatchSet(data)
     global firstpr
 
+    if additional_prs:
+        prs = [pr for pr in additional_prs if pr not in prs]
     for file in diff:
         # skip files that can't be parsed
         if file.path == '/dev/null':
@@ -154,21 +160,33 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
             # Only search first ten lines as later lines may
             # contains commented code which a note that it
             # has not been tested due to a certain PR or DR.
+            this_file_prs = []
             for line in list(file)[0][0:10]:
                 m = pr_regex.search(line.value)
                 if m:
                     pr = m.group('pr')
                     if pr not in prs:
                         prs.append(pr)
+                        this_file_prs.append(pr.split('/')[-1])
                 else:
                     m = dr_regex.search(line.value)
                     if m:
                         dr = m.group('dr')
                         if dr not in prs:
                             prs.append(dr)
+                            this_file_prs.append(dr.split('/')[-1])
                     elif dg_regex.search(line.value):
                         # Found dg-warning/dg-error line
                         break
+            # PR number in the file name
+            fname = os.path.basename(file.path)
+            fname = os.path.splitext(fname)[0]
+            m = pr_filename_regex.search(fname)
+            if m:
+                pr = m.group('pr')
+                pr2 = 'PR ' + pr
+                if pr not in this_file_prs and pr2 not in prs:
+                    prs.append(pr2)
 
     if prs:
         firstpr = prs[0]
@@ -286,6 +304,8 @@ if __name__ == '__main__':
     parser = argparse.ArgumentParser(description=help_message)
     parser.add_argument('input', nargs='?',
                         help='Patch file (or missing, read standard input)')
+    parser.add_argument('-b', '--pr-numbers', action='append',
+                        help='Add the specified PRs (comma separated)')
     parser.add_argument('-s', '--no-functions', action='store_true',
                         help='Do not generate function names in ChangeLogs')
     parser.add_argument('-p', '--fill-up-bug-titles', action='store_true',
@@ -308,8 +328,11 @@ if __name__ == '__main__':
     if args.update_copyright:
         update_copyright(data)
     else:
+        pr_numbers = args.pr_numbers
+        if pr_numbers:
+            pr_numbers = [b for i in args.pr_numbers for b in i.split(',')]
         output = generate_changelog(data, args.no_functions,
-                                    args.fill_up_bug_titles)
+                                    args.fill_up_bug_titles, pr_numbers)
         if args.changelog:
             lines = open(args.changelog).read().split('\n')
             start = list(takewhile(lambda l: not l.startswith('#'), lines))
diff --git a/contrib/test_mklog.py b/contrib/test_mklog.py
index a0670dac119..f5e9ecd577c 100755
--- a/contrib/test_mklog.py
+++ b/contrib/test_mklog.py
@@ -240,6 +240,9 @@ index 4ad78c1f77b..6687b368038 100644
 '''
 
 EXPECTED4 = '''\
+
+	PR 50209
+
 gcc/ChangeLog:
 
 	* ipa-icf.c:

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch, v2] contrib/mklog.py: Improve PR handling (was: Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-21  8:37                                                           ` Tobias Burnus
@ 2021-06-21 12:53                                                             ` Martin Liška
  2021-06-21 13:26                                                               ` Tobias Burnus
  0 siblings, 1 reply; 84+ messages in thread
From: Martin Liška @ 2021-06-21 12:53 UTC (permalink / raw)
  To: Tobias Burnus, Martin Sebor, Jason Merrill
  Cc: Jakub Jelinek, gcc Mailing List, Jonathan Wakely, gcc-patches

On 6/21/21 10:37 AM, Tobias Burnus wrote:
> On 21.06.21 10:09, Martin Liška wrote:
> 
>> $ pytest test_mklog.py
>> FAILED test_mklog.py::TestMklog::test_sorting - AssertionError: assert
>> '\n\tPR 50209...New test.\n\n' == 'gcc/ChangeLo...New test.\n\n'
> Aha, missed that there is indeed a testsuite - nice!
>> $ flake8 mklog.py
>> mklog.py:187:23: Q000 Remove bad quotes
> I have now filled:
> https://bugs.launchpad.net/ubuntu/+source/python-pytest-flake8/+bug/1933075
> 
>>> +            # PR number in the file name
>>> +            fname = os.path.basename(file.path)
>>
>> This is a dead code.
>>
>>> + fname = os.path.splitext(fname)[0]
>>> +            m = pr_filename_regex.search(fname)
> It does not look like dead code to me.

Hello.

The code is weird as os.path.basename returns:

In [5]: os.path.basename('/tmp/a/b/c.txt')
Out[5]: 'c.txt'

why do you need os.path.splitext(fname) call?


>>> + parser.add_argument('-b', '--pr-numbers', action='append',
>>> +                        help='Add the specified PRs (comma separated)')
>>
>> Do we really want to support '-b 1 -b 2' and also -b '1,2' formats?
>> Seems to me quite
>> complicated.
> 
> I don't have a strong opinion. I started with '-b 123,245', believing
> that the syntax is fine. But then I realized that without '-p'
> specifying multiple '-b' looks better by having multiple '-b' if 'PR
> <component>/'  (needed for -p as the string is than taken as is). Thus,
> I ended up supporting either variant.

I would start with -b 1,2,3,4 syntax. It will be likely easier for git alias integration.

Martin

> 
> But I also happily drop the ',' support.
> 
> Change: One quote change, one test_mklog update.
> 
> Tobias
> 
> -----------------
> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Patch, v2] contrib/mklog.py: Improve PR handling (was: Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog)
  2021-06-21 12:53                                                             ` Martin Liška
@ 2021-06-21 13:26                                                               ` Tobias Burnus
  2021-06-22  7:30                                                                 ` [RFC][PATCH] contrib: add git-commit-mklog wrapper Martin Liška
  0 siblings, 1 reply; 84+ messages in thread
From: Tobias Burnus @ 2021-06-21 13:26 UTC (permalink / raw)
  To: Martin Liška, Martin Sebor, Jason Merrill
  Cc: Jakub Jelinek, gcc Mailing List, Jonathan Wakely, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1417 bytes --]

Now committed as r12-1700-gedf0c3ffb59d75c11e05bc722432dc554e275c72 / as
attached.

(We had some follow-up discussion on IRC.)

On 21.06.21 14:53, Martin Liška wrote:
>>>> +            # PR number in the file name
>>>> +            fname = os.path.basename(file.path)
>>>
>>> This is a dead code.
>>>
>>>> + fname = os.path.splitext(fname)[0]
>>>> +            m = pr_filename_regex.search(fname)
(Meant was the 'splitext' line - it is dead code as the re.search globs
all digits after 'pr' and then stops, ignoring the rest, including file
extensions. – I first thought it referred to the basename line, which
confused me.)
>>>> + parser.add_argument('-b', '--pr-numbers', action='append',
>>>> +                        help='Add the specified PRs (comma
>>>> separated)')
>>>
>>> Do we really want to support '-b 1 -b 2' and also -b '1,2' formats?
>>> Seems to me quite
>>> complicated.
> [...]
> I would start with -b 1,2,3,4 syntax. It will be likely easier for git
> alias integration.

Done so.

I note that argparse permits '-d dir1 -d dir2 -b bug1 -b bug2' which
then only keeps the dir2 as directory and bug2 as bug without printing
an error (or warning) for ignoring dir1 and bug1.

Tobias

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

[-- Attachment #2: committed.diff --]
[-- Type: text/x-patch, Size: 6041 bytes --]

commit edf0c3ffb59d75c11e05bc722432dc554e275c72
Author: Tobias Burnus <tobias@codesourcery.com>
Date:   Mon Jun 21 15:17:22 2021 +0200

    contrib/mklog.py: Improve PR handling
    
    Co-authored-by: Martin Sebor <msebor@redhat.com>
    
    contrib/ChangeLog:
    
            * mklog.py (bugzilla_url): Fetch also component.
            (pr_filename_regex): New.
            (get_pr_titles): Update PR string with correct format and component.
            (generate_changelog): Take additional PRs; extract PR from the
            filename.
            (__main__): Add -b/--pr-numbers argument.
            * test_mklog.py (EXPECTED4): Update to expect a PR for the new file.
---
 contrib/mklog.py      | 38 +++++++++++++++++++++++++++++---------
 contrib/test_mklog.py |  3 +++
 2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 1f59055e723..0b434f67971 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -42,6 +42,7 @@ pr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<pr>PR [a-z+-]+\/[0-9]+)')
 prnum_regex = re.compile(r'PR (?P<comp>[a-z+-]+)/(?P<num>[0-9]+)')
 dr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<dr>DR [0-9]+)')
 dg_regex = re.compile(r'{\s+dg-(error|warning)')
+pr_filename_regex = re.compile(r'(^|[\W_])[Pp][Rr](?P<pr>\d{4,})')
 identifier_regex = re.compile(r'^([a-zA-Z0-9_#].*)')
 comment_regex = re.compile(r'^\/\*')
 struct_regex = re.compile(r'^(class|struct|union|enum)\s+'
@@ -52,7 +53,7 @@ fn_regex = re.compile(r'([a-zA-Z_][^()\s]*)\s*\([^*]')
 template_and_param_regex = re.compile(r'<[^<>]*>')
 md_def_regex = re.compile(r'\(define.*\s+"(.*)"')
 bugzilla_url = 'https://gcc.gnu.org/bugzilla/rest.cgi/bug?id=%s&' \
-               'include_fields=summary'
+               'include_fields=summary,component'
 
 function_extensions = {'.c', '.cpp', '.C', '.cc', '.h', '.inc', '.def', '.md'}
 
@@ -118,20 +119,23 @@ def sort_changelog_files(changed_file):
 
 
 def get_pr_titles(prs):
-    output = ''
-    for pr in prs:
+    output = []
+    for idx, pr in enumerate(prs):
         pr_id = pr.split('/')[-1]
         r = requests.get(bugzilla_url % pr_id)
         bugs = r.json()['bugs']
         if len(bugs) == 1:
-            output += '%s - %s\n' % (pr, bugs[0]['summary'])
-            print(output)
+            prs[idx] = 'PR %s/%s' % (bugs[0]['component'], pr_id)
+            out = '%s - %s\n' % (prs[idx], bugs[0]['summary'])
+            if out not in output:
+                output.append(out)
     if output:
-        output += '\n'
-    return output
+        output.append('')
+    return '\n'.join(output)
 
 
-def generate_changelog(data, no_functions=False, fill_pr_titles=False):
+def generate_changelog(data, no_functions=False, fill_pr_titles=False,
+                       additional_prs=None):
     changelogs = {}
     changelog_list = []
     prs = []
@@ -139,6 +143,8 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
     diff = PatchSet(data)
     global firstpr
 
+    if additional_prs:
+        prs = [pr for pr in additional_prs if pr not in prs]
     for file in diff:
         # skip files that can't be parsed
         if file.path == '/dev/null':
@@ -154,21 +160,32 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
             # Only search first ten lines as later lines may
             # contains commented code which a note that it
             # has not been tested due to a certain PR or DR.
+            this_file_prs = []
             for line in list(file)[0][0:10]:
                 m = pr_regex.search(line.value)
                 if m:
                     pr = m.group('pr')
                     if pr not in prs:
                         prs.append(pr)
+                        this_file_prs.append(pr.split('/')[-1])
                 else:
                     m = dr_regex.search(line.value)
                     if m:
                         dr = m.group('dr')
                         if dr not in prs:
                             prs.append(dr)
+                            this_file_prs.append(dr.split('/')[-1])
                     elif dg_regex.search(line.value):
                         # Found dg-warning/dg-error line
                         break
+            # PR number in the file name
+            fname = os.path.basename(file.path)
+            m = pr_filename_regex.search(fname)
+            if m:
+                pr = m.group('pr')
+                pr2 = 'PR ' + pr
+                if pr not in this_file_prs and pr2 not in prs:
+                    prs.append(pr2)
 
     if prs:
         firstpr = prs[0]
@@ -286,6 +303,9 @@ if __name__ == '__main__':
     parser = argparse.ArgumentParser(description=help_message)
     parser.add_argument('input', nargs='?',
                         help='Patch file (or missing, read standard input)')
+    parser.add_argument('-b', '--pr-numbers', action='store',
+                        type=lambda arg: arg.split(','), nargs="?",
+                        help='Add the specified PRs (comma separated)')
     parser.add_argument('-s', '--no-functions', action='store_true',
                         help='Do not generate function names in ChangeLogs')
     parser.add_argument('-p', '--fill-up-bug-titles', action='store_true',
@@ -309,7 +329,7 @@ if __name__ == '__main__':
         update_copyright(data)
     else:
         output = generate_changelog(data, args.no_functions,
-                                    args.fill_up_bug_titles)
+                                    args.fill_up_bug_titles, args.pr_numbers)
         if args.changelog:
             lines = open(args.changelog).read().split('\n')
             start = list(takewhile(lambda l: not l.startswith('#'), lines))
diff --git a/contrib/test_mklog.py b/contrib/test_mklog.py
index a0670dac119..f5e9ecd577c 100755
--- a/contrib/test_mklog.py
+++ b/contrib/test_mklog.py
@@ -240,6 +240,9 @@ index 4ad78c1f77b..6687b368038 100644
 '''
 
 EXPECTED4 = '''\
+
+	PR 50209
+
 gcc/ChangeLog:
 
 	* ipa-icf.c:

^ permalink raw reply	[flat|nested] 84+ messages in thread

* [RFC][PATCH] contrib: add git-commit-mklog wrapper
  2021-06-21 13:26                                                               ` Tobias Burnus
@ 2021-06-22  7:30                                                                 ` Martin Liška
  2021-06-22  8:23                                                                   ` Tobias Burnus
  2021-06-22 18:40                                                                   ` Jason Merrill
  0 siblings, 2 replies; 84+ messages in thread
From: Martin Liška @ 2021-06-22  7:30 UTC (permalink / raw)
  To: Tobias Burnus, Martin Sebor, Jason Merrill
  Cc: Jakub Jelinek, gcc Mailing List, Jonathan Wakely, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 260 bytes --]

Hello.

There's a patch candidate that comes up with a wrapper for 'git commit-mklog' alias.
Using my patch, one can do:

$ git commit-mklog -a -b 12345,4444

Thoughts?
Can one do that without the wrapper script and passing data through env. variable?

Martin

[-- Attachment #2: 0001-contrib-add-git-commit-mklog-wrapper.patch --]
[-- Type: text/x-patch, Size: 3796 bytes --]

From 6b63718e2836c1a5a63e476ea981ba65084ba867 Mon Sep 17 00:00:00 2001
From: Martin Liska <mliska@suse.cz>
Date: Tue, 22 Jun 2021 09:19:45 +0200
Subject: [PATCH] contrib: add git-commit-mklog wrapper

contrib/ChangeLog:

	* gcc-git-customization.sh: Use the new wrapper.
	* git-commit-mklog.py: New file.
	* prepare-commit-msg: Support GCC_MKLOG_ARGS.
---
 contrib/gcc-git-customization.sh |  2 +-
 contrib/git-commit-mklog.py      | 44 ++++++++++++++++++++++++++++++++
 contrib/prepare-commit-msg       |  2 +-
 3 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100755 contrib/git-commit-mklog.py

diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index e7e66623786..6f8f23deebf 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -28,7 +28,7 @@ git config alias.gcc-undescr \!"f() { o=\$(git config --get gcc-config.upstream)
 git config alias.gcc-verify '!f() { "`git rev-parse --show-toplevel`/contrib/gcc-changelog/git_check_commit.py" $@; } ; f'
 git config alias.gcc-backport '!f() { "`git rev-parse --show-toplevel`/contrib/git-backport.py" $@; } ; f'
 git config alias.gcc-mklog '!f() { "`git rev-parse --show-toplevel`/contrib/mklog.py" $@; } ; f'
-git config alias.gcc-commit-mklog '!f() { GCC_FORCE_MKLOG=1 git commit "$@"; }; f'
+git config alias.gcc-commit-mklog '!f() { "`git rev-parse --show-toplevel`/contrib/git-commit-mklog.py" $@; }; f'
 
 # Make diff on MD files use "(define" as a function marker.
 # Use this in conjunction with a .gitattributes file containing
diff --git a/contrib/git-commit-mklog.py b/contrib/git-commit-mklog.py
new file mode 100755
index 00000000000..f1ef0dfa86b
--- /dev/null
+++ b/contrib/git-commit-mklog.py
@@ -0,0 +1,44 @@
+#!/usr/bin/env python3
+
+# Copyright (C) 2020 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING.  If not, write to
+# the Free Software Foundation, 51 Franklin Street, Fifth Floor,
+# Boston, MA 02110-1301, USA.
+#
+# The script is wrapper for git commit-mklog alias where it parses
+# -b/--pr-numbers argument and passes it via environment variable
+# to mklog.py script.
+
+import argparse
+import os
+import subprocess
+
+if __name__ == '__main__':
+    children_args = []
+    myenv = os.environ.copy()
+
+    parser = argparse.ArgumentParser(description='git-commit-mklog wrapped')
+    parser.add_argument('-b', '--pr-numbers', action='store',
+                        type=lambda arg: arg.split(','), nargs='?',
+                        help='Add the specified PRs (comma separated)')
+    args, unknown_args = parser.parse_known_args()
+
+    myenv['GCC_FORCE_MKLOG'] = '1'
+    if args.pr_numbers:
+        myenv['GCC_MKLOG_ARGS'] = f'-b {",".join(args.pr_numbers)}'
+    commit_args = ' '.join(unknown_args)
+    subprocess.run(f'git commit {commit_args}', shell=True, env=myenv)
diff --git a/contrib/prepare-commit-msg b/contrib/prepare-commit-msg
index 969847df6f4..5da878458cd 100755
--- a/contrib/prepare-commit-msg
+++ b/contrib/prepare-commit-msg
@@ -78,4 +78,4 @@ else
     tee="cat"
 fi
 
-git $cmd | $tee | git gcc-mklog -c "$COMMIT_MSG_FILE"
+git $cmd | $tee | git gcc-mklog $GCC_MKLOG_ARGS -c "$COMMIT_MSG_FILE"
-- 
2.32.0


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [RFC][PATCH] contrib: add git-commit-mklog wrapper
  2021-06-22  7:30                                                                 ` [RFC][PATCH] contrib: add git-commit-mklog wrapper Martin Liška
@ 2021-06-22  8:23                                                                   ` Tobias Burnus
  2021-06-22  8:31                                                                     ` Martin Liška
  2021-06-22 18:40                                                                   ` Jason Merrill
  1 sibling, 1 reply; 84+ messages in thread
From: Tobias Burnus @ 2021-06-22  8:23 UTC (permalink / raw)
  To: Martin Liška, Martin Sebor, Jason Merrill
  Cc: Jakub Jelinek, gcc Mailing List, Jonathan Wakely, gcc-patches

Hello,

On 22.06.21 09:30, Martin Liška wrote:
> There's a patch candidate that comes up with a wrapper for 'git
> commit-mklog' alias.
> Using my patch, one can do:
> $ git commit-mklog -a -b 12345,
> Thoughts?

What about '-p' – to fetch the data from GCC Bugzilla? I do note that
'git commit ' supports '-p, --patch' which may or may not be an issue.

Tobias

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [RFC][PATCH] contrib: add git-commit-mklog wrapper
  2021-06-22  8:23                                                                   ` Tobias Burnus
@ 2021-06-22  8:31                                                                     ` Martin Liška
  0 siblings, 0 replies; 84+ messages in thread
From: Martin Liška @ 2021-06-22  8:31 UTC (permalink / raw)
  To: Tobias Burnus, Martin Sebor, Jason Merrill
  Cc: Jakub Jelinek, gcc Mailing List, Jonathan Wakely, gcc-patches

On 6/22/21 10:23 AM, Tobias Burnus wrote:
> Hello,
> 
> On 22.06.21 09:30, Martin Liška wrote:
>> There's a patch candidate that comes up with a wrapper for 'git
>> commit-mklog' alias.
>> Using my patch, one can do:
>> $ git commit-mklog -a -b 12345,
>> Thoughts?
> 
> What about '-p' – to fetch the data from GCC Bugzilla?

Sure, that needs to be supported as well.

> I do note that
> 'git commit ' supports '-p, --patch' which may or may not be an issue.

People likely do not use it with commit-mklog alias.

Martin

> 
> Tobias
> 
> -----------------
> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [RFC][PATCH] contrib: add git-commit-mklog wrapper
  2021-06-22  7:30                                                                 ` [RFC][PATCH] contrib: add git-commit-mklog wrapper Martin Liška
  2021-06-22  8:23                                                                   ` Tobias Burnus
@ 2021-06-22 18:40                                                                   ` Jason Merrill
  2021-06-23  7:40                                                                     ` Martin Liška
  1 sibling, 1 reply; 84+ messages in thread
From: Jason Merrill @ 2021-06-22 18:40 UTC (permalink / raw)
  To: Martin Liška, Tobias Burnus, Martin Sebor
  Cc: Jakub Jelinek, gcc Mailing List, Jonathan Wakely, gcc-patches

On 6/22/21 3:30 AM, Martin Liška wrote:
> Hello.
> 
> There's a patch candidate that comes up with a wrapper for 'git 
> commit-mklog' alias.
> Using my patch, one can do:
> 
> $ git commit-mklog -a -b 12345,4444
> 
> Thoughts?

Looks good to me.

> Can one do that without the wrapper script and passing data through env. 
> variable?

The hook seems like the way to adjust the commit message, and we can't 
affect its command line arguments, so we're left with environment 
variables or a file somewhere for communicating to it.

Jason


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [RFC][PATCH] contrib: add git-commit-mklog wrapper
  2021-06-22 18:40                                                                   ` Jason Merrill
@ 2021-06-23  7:40                                                                     ` Martin Liška
  0 siblings, 0 replies; 84+ messages in thread
From: Martin Liška @ 2021-06-23  7:40 UTC (permalink / raw)
  To: Jason Merrill, Tobias Burnus, Martin Sebor
  Cc: Jakub Jelinek, gcc Mailing List, Jonathan Wakely, gcc-patches

On 6/22/21 8:40 PM, Jason Merrill wrote:
> On 6/22/21 3:30 AM, Martin Liška wrote:
>> Hello.
>>
>> There's a patch candidate that comes up with a wrapper for 'git commit-mklog' alias.
>> Using my patch, one can do:
>>
>> $ git commit-mklog -a -b 12345,4444
>>
>> Thoughts?
> 
> Looks good to me.

Cool, I've just pushed that (and started using it). Plus I've added support for -p argument
as noted by Tobias.

> 
>> Can one do that without the wrapper script and passing data through env. variable?
> 
> The hook seems like the way to adjust the commit message, and we can't affect its command line arguments, so we're left with environment variables or a file somewhere for communicating to it.

Yes, that's what I though.

Martin

> 
> Jason
> 


^ permalink raw reply	[flat|nested] 84+ messages in thread

end of thread, other threads:[~2021-06-23  7:40 UTC | newest]

Thread overview: 84+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <4c1114a7-2377-99e4-d451-1a086857e991@linux.ibm.com>
2021-06-10  5:22 ` git gcc-commit-mklog doesn't extract PR number to ChangeLog Xionghu Luo
2021-06-10  6:17   ` Martin Liška
2021-06-10  6:25     ` Xionghu Luo
2021-06-10  8:07       ` Martin Liška
2021-06-10  6:35     ` Tobias Burnus
2021-06-10  8:07       ` Martin Liška
2021-06-10  9:44         ` Jonathan Wakely
2021-06-10 10:01           ` Jonathan Wakely
2021-06-10 10:08             ` Jakub Jelinek
2021-06-10 10:40               ` Jonathan Wakely
2021-06-10 14:55                 ` Martin Sebor
2021-06-10 15:54                   ` Tobias Burnus
2021-06-10 16:05                     ` Jonathan Wakely
2021-06-10 15:56                   ` Jonathan Wakely
2021-06-10 17:06                     ` Martin Sebor
2021-06-10 17:20                       ` Martin Sebor
2021-06-10 17:30                         ` Jakub Jelinek
2021-06-10 18:55                           ` Martin Sebor
2021-06-10 19:09                             ` Jakub Jelinek
2021-06-10 21:16                               ` Martin Sebor
2021-06-10 21:28                                 ` Jakub Jelinek
2021-06-10 21:56                                   ` Martin Sebor
2021-06-11  9:13                                 ` Jonathan Wakely
2021-06-11 17:02                                   ` Martin Sebor
2021-06-11 17:05                                     ` Jakub Jelinek
2021-06-11 17:32                                     ` Jonathan Wakely
2021-06-11 18:01                                       ` Martin Sebor
2021-06-11 18:14                                         ` Jonathan Wakely
2021-06-16  0:56                                         ` Hans-Peter Nilsson
2021-06-16  2:03                                           ` Martin Sebor
2021-06-16  3:42                                             ` Jason Merrill
2021-06-16 14:31                                               ` Martin Sebor
2021-06-16 20:49                                               ` Jason Merrill
2021-06-16 21:45                                                 ` Martin Sebor
2021-06-16 23:45                                                   ` Jason Merrill
2021-06-17  0:17                                                     ` Martin Sebor
2021-06-17  0:40                                                       ` Jason Merrill
2021-06-17  1:01                                                         ` Martin Sebor
2021-06-17  1:46                                                           ` Jason Merrill
2021-06-17 10:18                                                           ` Jonathan Wakely
2021-06-17 14:55                                                             ` Martin Sebor
2021-06-17 15:11                                                               ` Michael Matz
2021-06-17 15:33                                                                 ` Martin Sebor
2021-06-17 16:31                                                                   ` Jakub Jelinek
2021-06-17 16:32                                                                   ` Jonathan Wakely
2021-06-17 18:00                                                                     ` Martin Sebor
2021-06-17 10:08                                                         ` Richard Earnshaw
2021-06-17 17:12                                                           ` Joseph Myers
2021-06-17 17:21                                                             ` Jason Merrill
2021-06-17 17:21                                                             ` Jakub Jelinek
2021-06-18  9:32                                                               ` Richard Earnshaw
2021-06-18 11:05                                                                 ` [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Tobias Burnus
2021-06-18 11:10                                                                   ` Jonathan Wakely
2021-06-18 11:24                                                                     ` Jakub Jelinek
2021-06-18 11:25                                                                     ` Tobias Burnus
2021-06-18 11:40                                                                       ` Jonathan Wakely
2021-06-21  7:28                                                                         ` Martin Liška
2021-06-18 16:40                                                                       ` [Patch] contrib/mklog.py: Improve PR handling Martin Sebor
2021-06-18 14:41                                                                   ` [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Jason Merrill
2021-06-18 16:47                                                                     ` [Patch] contrib/mklog.py: Improve PR handling Martin Sebor
2021-06-18 16:59                                                                       ` Iain Sandoe
2021-06-21  6:42                                                                     ` [Patch] contrib/mklog.py: Improve PR handling (was: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Tobias Burnus
2021-06-21  7:26                                                                       ` Martin Liška
2021-06-21  8:02                                                                         ` Iain Sandoe
2021-06-21  7:54                                                       ` [Patch, v2] contrib/mklog.py: Improve PR handling (was: " Tobias Burnus
2021-06-21  8:09                                                         ` Martin Liška
2021-06-21  8:37                                                           ` Tobias Burnus
2021-06-21 12:53                                                             ` Martin Liška
2021-06-21 13:26                                                               ` Tobias Burnus
2021-06-22  7:30                                                                 ` [RFC][PATCH] contrib: add git-commit-mklog wrapper Martin Liška
2021-06-22  8:23                                                                   ` Tobias Burnus
2021-06-22  8:31                                                                     ` Martin Liška
2021-06-22 18:40                                                                   ` Jason Merrill
2021-06-23  7:40                                                                     ` Martin Liška
2021-06-16 13:46                                             ` git gcc-commit-mklog doesn't extract PR number to ChangeLog Jonathan Wakely
2021-06-16 17:44                                             ` Hans-Peter Nilsson
2021-06-11  9:08                       ` Jonathan Wakely
2021-06-11  9:35                         ` Jonathan Wakely
2021-06-11 15:43                           ` Joseph Myers
2021-06-11 17:02                             ` Jonathan Wakely
2021-06-10 11:51         ` [Patch] contrig/gcc-changelog: Check that PR in subject in in changelog (was:: git gcc-commit-mklog doesn't extract PR number to ChangeLog) Tobias Burnus
2021-06-10 11:54           ` [Patch] contrig/gcc-changelog: Check that PR in subject in in changelog Florian Weimer
2021-06-10 12:45           ` Jonathan Wakely
2021-06-10  9:41   ` git gcc-commit-mklog doesn't extract PR number to ChangeLog Jonathan Wakely

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).