public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed
From: Jakub Jelinek <jakub@gcc.gnu.org>
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/vendors/redhat/heads/gcc-8-branch)] Sync gcc-changelog scripts.
Date: Fri, 23 Apr 2021 10:36:53 +0000 (GMT)	[thread overview]
Message-ID: <20210423103653.74133393C842@sourceware.org> (raw)

https://gcc.gnu.org/g:a1fa2adf7c32da94902ba47e7325d126dd9580f2

commit a1fa2adf7c32da94902ba47e7325d126dd9580f2
Author: Martin Liska <mliska@suse.cz>
Date:   Thu Jan 7 11:30:29 2021 +0100

    Sync gcc-changelog scripts.
    
    contrib/ChangeLog:
    
            * gcc-changelog/git_commit.py: Sync from master.
            * gcc-changelog/git_email.py: Likewise.
            * gcc-changelog/git_repository.py: Likewise.
            * gcc-changelog/test_email.py: Likewise.
            * gcc-changelog/test_patches.txt: Likewise.

Diff:
---
 contrib/gcc-changelog/git_commit.py     | 57 +++++++++++++++++------
 contrib/gcc-changelog/git_email.py      |  6 +--
 contrib/gcc-changelog/git_repository.py |  6 +--
 contrib/gcc-changelog/test_email.py     | 24 +++++++++-
 contrib/gcc-changelog/test_patches.txt  | 81 ++++++++++++++++++++++++++++++++-
 5 files changed, 152 insertions(+), 22 deletions(-)

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index 5f856660bb3..ee1973371be 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -16,10 +16,12 @@
 # along with GCC; see the file COPYING3.  If not see
 # <http://www.gnu.org/licenses/>.  */
 
+import difflib
 import os
 import re
 
 changelog_locations = {
+    'c++tools',
     'config',
     'contrib',
     'contrib/header-tools',
@@ -50,6 +52,7 @@ changelog_locations = {
     'libatomic',
     'libbacktrace',
     'libcc1',
+    'libcody',
     'libcpp',
     'libcpp/po',
     'libdecnumber',
@@ -138,7 +141,8 @@ ignored_prefixes = {
 
 wildcard_prefixes = {
     'gcc/testsuite/',
-    'libstdc++-v3/doc/html/'
+    'libstdc++-v3/doc/html/',
+    'libstdc++-v3/testsuite/'
     }
 
 misc_files = {
@@ -158,11 +162,11 @@ end_of_location_regex = re.compile(r'[\[<(:]')
 item_empty_regex = re.compile(r'\t(\* \S+ )?\(\S+\):\s*$')
 item_parenthesis_regex = re.compile(r'\t(\*|\(\S+\):)')
 revert_regex = re.compile(r'This reverts commit (?P<hash>\w+).$')
+cherry_pick_regex = re.compile(r'cherry picked from commit (?P<hash>\w+)')
 
 LINE_LIMIT = 100
 TAB_WIDTH = 8
 CO_AUTHORED_BY_PREFIX = 'co-authored-by: '
-CHERRY_PICK_PREFIX = '(cherry picked from commit '
 
 REVIEW_PREFIXES = ('reviewed-by: ', 'reviewed-on: ', 'signed-off-by: ',
                    'acked-by: ', 'tested-by: ', 'reported-by: ',
@@ -170,6 +174,24 @@ REVIEW_PREFIXES = ('reviewed-by: ', 'reviewed-on: ', 'signed-off-by: ',
 DATE_FORMAT = '%Y-%m-%d'
 
 
+def decode_path(path):
+    # When core.quotepath is true (default value), utf8 chars are encoded like:
+    # "b/ko\304\215ka.txt"
+    #
+    # The upstream bug is fixed:
+    # https://github.com/gitpython-developers/GitPython/issues/1099
+    #
+    # but we still need a workaround for older versions of the library.
+    # Please take a look at the explanation of the transformation:
+    # https://stackoverflow.com/questions/990169/how-do-convert-unicode-escape-sequences-to-unicode-characters-in-a-python-string
+
+    if path.startswith('"') and path.endswith('"'):
+        return (path.strip('"').encode('utf8').decode('unicode-escape')
+                .encode('latin-1').decode('utf8'))
+    else:
+        return path
+
+
 class Error:
     def __init__(self, message, line=None):
         self.message = message
@@ -272,6 +294,10 @@ class GitCommit:
         self.revert_commit = None
         self.commit_to_info_hook = commit_to_info_hook
 
+        # Skip Update copyright years commits
+        if self.info.lines and self.info.lines[0] == 'Update copyright years.':
+            return
+
         # Identify first if the commit is a Revert commit
         for line in self.info.lines:
             m = revert_regex.match(line)
@@ -422,14 +448,16 @@ class GitCommit:
                     continue
                 elif lowered_line.startswith(REVIEW_PREFIXES):
                     continue
-                elif line.startswith(CHERRY_PICK_PREFIX):
-                    commit = line[len(CHERRY_PICK_PREFIX):].rstrip(')')
-                    if self.cherry_pick_commit:
-                        self.errors.append(Error('multiple cherry pick lines',
-                                                 line))
-                    else:
-                        self.cherry_pick_commit = commit
-                    continue
+                else:
+                    m = cherry_pick_regex.search(line)
+                    if m:
+                        commit = m.group('hash')
+                        if self.cherry_pick_commit:
+                            msg = 'multiple cherry pick lines'
+                            self.errors.append(Error(msg, line))
+                        else:
+                            self.cherry_pick_commit = commit
+                        continue
 
                 # ChangeLog name will be deduced later
                 if not last_entry:
@@ -490,7 +518,7 @@ class GitCommit:
         for entry in self.changelog_entries:
             for pattern in entry.file_patterns:
                 name = os.path.join(entry.folder, pattern)
-                if name not in wildcard_prefixes:
+                if not [name.startswith(pr) for pr in wildcard_prefixes]:
                     msg = 'unsupported wildcard prefix'
                     self.errors.append(Error(msg, name))
 
@@ -558,7 +586,7 @@ class GitCommit:
         mentioned_patterns = []
         used_patterns = set()
         for entry in self.changelog_entries:
-            if not entry.files:
+            if not entry.files and not entry.file_patterns:
                 msg = 'no files mentioned for ChangeLog in directory'
                 self.errors.append(Error(msg, entry.folder))
             assert not entry.folder.endswith('/')
@@ -573,6 +601,9 @@ class GitCommit:
         changed_files = set(cand)
         for file in sorted(mentioned_files - changed_files):
             msg = 'unchanged file mentioned in a ChangeLog'
+            candidates = difflib.get_close_matches(file, changed_files, 1)
+            if candidates:
+                msg += f' (did you mean "{candidates[0]}"?)'
             self.errors.append(Error(msg, file))
         for file in sorted(changed_files - mentioned_files):
             if not self.in_ignored_location(file):
@@ -614,7 +645,7 @@ class GitCommit:
 
         for pattern in mentioned_patterns:
             if pattern not in used_patterns:
-                error = 'pattern doesn''t match any changed files'
+                error = "pattern doesn't match any changed files"
                 self.errors.append(Error(error, pattern))
 
     def check_for_correct_changelog(self):
diff --git a/contrib/gcc-changelog/git_email.py b/contrib/gcc-changelog/git_email.py
index 5b53ca4a6a9..00ad00458f4 100755
--- a/contrib/gcc-changelog/git_email.py
+++ b/contrib/gcc-changelog/git_email.py
@@ -22,7 +22,7 @@ from itertools import takewhile
 
 from dateutil.parser import parse
 
-from git_commit import GitCommit, GitInfo
+from git_commit import GitCommit, GitInfo, decode_path
 
 from unidiff import PatchSet, PatchedFile
 
@@ -52,8 +52,8 @@ class GitEmail(GitCommit):
         modified_files = []
         for f in diff:
             # Strip "a/" and "b/" prefixes
-            source = f.source_file[2:]
-            target = f.target_file[2:]
+            source = decode_path(f.source_file)[2:]
+            target = decode_path(f.target_file)[2:]
 
             if f.is_added_file:
                 t = 'A'
diff --git a/contrib/gcc-changelog/git_repository.py b/contrib/gcc-changelog/git_repository.py
index 8edcff91ad6..a0e293d756d 100755
--- a/contrib/gcc-changelog/git_repository.py
+++ b/contrib/gcc-changelog/git_repository.py
@@ -26,7 +26,7 @@ except ImportError:
     print('  Debian, Ubuntu: python3-git')
     exit(1)
 
-from git_commit import GitCommit, GitInfo
+from git_commit import GitCommit, GitInfo, decode_path
 
 
 def parse_git_revisions(repo_path, revisions, strict=True):
@@ -51,11 +51,11 @@ def parse_git_revisions(repo_path, revisions, strict=True):
                     # Consider that renamed files are two operations:
                     # the deletion of the original name
                     # and the addition of the new one.
-                    modified_files.append((file.a_path, 'D'))
+                    modified_files.append((decode_path(file.a_path), 'D'))
                     t = 'A'
                 else:
                     t = 'M'
-                modified_files.append((file.b_path, t))
+                modified_files.append((decode_path(file.b_path), t))
 
             date = datetime.utcfromtimestamp(c.committed_date)
             author = '%s  <%s>' % (c.author.name, c.author.email)
diff --git a/contrib/gcc-changelog/test_email.py b/contrib/gcc-changelog/test_email.py
index e38c3e52158..5db56caef9e 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -113,7 +113,9 @@ class TestGccChangelog(unittest.TestCase):
         email = self.from_patch_glob('0096')
         assert email.errors
         err = email.errors[0]
-        assert err.message == 'unchanged file mentioned in a ChangeLog'
+        assert err.message == 'unchanged file mentioned in a ChangeLog (did ' \
+            'you mean "gcc/testsuite/gcc.target/aarch64/' \
+            'advsimd-intrinsics/vdot-3-1.c"?)'
         assert err.line == 'gcc/testsuite/gcc.target/aarch64/' \
                            'advsimd-intrinsics/vdot-compile-3-1.c'
 
@@ -333,7 +335,7 @@ class TestGccChangelog(unittest.TestCase):
         assert not email.errors
         email = self.from_patch_glob('0002-libstdc-Fake-test-change-1.patch')
         assert len(email.errors) == 1
-        msg = 'pattern doesn''t match any changed files'
+        msg = "pattern doesn't match any changed files"
         assert email.errors[0].message == msg
         assert email.errors[0].line == 'libstdc++-v3/doc/html/'
         email = self.from_patch_glob('0003-libstdc-Fake-test-change-2.patch')
@@ -355,6 +357,8 @@ class TestGccChangelog(unittest.TestCase):
     def test_backport(self):
         email = self.from_patch_glob('0001-asan-fix-RTX-emission.patch')
         assert not email.errors
+        expected_hash = '8cff672cb9a132d3d3158c2edfc9a64b55292b80'
+        assert email.cherry_pick_commit == expected_hash
         assert len(email.changelog_entries) == 1
         entry = list(email.to_changelog_entries())[0][1]
         assert entry.startswith('2020-06-11  Martin Liska  <mliska@suse.cz>')
@@ -384,3 +388,19 @@ class TestGccChangelog(unittest.TestCase):
         email = self.from_patch_glob('0001-lto-fix-LTO-debug')
         assert not email.errors
         assert len(email.changelog_entries) == 1
+
+    def test_wildcard_in_subdir(self):
+        email = self.from_patch_glob('0001-Wildcard-subdirs.patch')
+        assert len(email.changelog_entries) == 1
+        err = email.errors[0]
+        assert err.message == "pattern doesn't match any changed files"
+        assert err.line == 'libstdc++-v3/testsuite/28_regex_not-existing/'
+
+    def test_unicode_chars_in_filename(self):
+        email = self.from_patch_glob('0001-Add-horse.patch')
+        assert not email.errors
+
+    def test_bad_unicode_chars_in_filename(self):
+        email = self.from_patch_glob('0001-Add-horse2.patch')
+        assert not email.errors
+        assert email.changelog_entries[0].files == ['koníček.txt']
diff --git a/contrib/gcc-changelog/test_patches.txt b/contrib/gcc-changelog/test_patches.txt
index 37f49c851ec..ffd13682d5c 100644
--- a/contrib/gcc-changelog/test_patches.txt
+++ b/contrib/gcc-changelog/test_patches.txt
@@ -3145,7 +3145,7 @@ gcc/ChangeLog:
 	by using Pmode instead of ptr_mode.
 
 Co-Authored-By: Jakub Jelinek <jakub@redhat.com>
-(cherry picked from commit 8cff672cb9a132d3d3158c2edfc9a64b55292b80)
+(cherry picked from commit 8cff672cb9a132d3d3158c2edfc9a64b55292b80 (only part))
 ---
  gcc/asan.c | 1 +
  1 file changed, 1 insertion(+)
@@ -3320,3 +3320,82 @@ index 7c9d492f6a4..37e73348cb7 100644
 -- 
 2.25.1
 
+=== 0001-Wildcard-subdirs.patch ===
+From b798205595426c53eb362065f6ed6c320dcc161d Mon Sep 17 00:00:00 2001
+From: Martin Liska <mliska@suse.cz>
+Date: Mon, 30 Nov 2020 13:27:51 +0100
+Subject: [PATCH] Fix it.
+
+libstdc++-v3/ChangeLog:
+
+	* testsuite/28_regex/*: Fix them all.
+	* testsuite/28_regex_not-existing/*: Fix them all.
+---
+ contrib/gcc-changelog/git_commit.py          | 1 +
+ libstdc++-v3/testsuite/28_regex/init-list.cc | 1 +
+ 2 files changed, 2 insertions(+)
+
+diff --git a/libstdc++-v3/testsuite/28_regex/init-list.cc b/libstdc++-v3/testsuite/28_regex/init-list.cc
+index f51453f019a..d10ecf483f4 100644
+--- a/libstdc++-v3/testsuite/28_regex/init-list.cc
++++ b/libstdc++-v3/testsuite/28_regex/init-list.cc
+@@ -1 +1,2 @@
+ 
++
+--
+2.29.2
+
+=== 0001-Add-horse.patch ===
+From 2884248d07e4e2c922e137365253e2e521c425b0 Mon Sep 17 00:00:00 2001
+From: Martin Liska <mliska@suse.cz>
+Date: Mon, 21 Dec 2020 10:14:46 +0100
+Subject: [PATCH] Add horse.
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+ChangeLog:
+
+	* koníček.txt: New file.
+---
+ koníček.txt | 1 +
+ 1 file changed, 1 insertion(+)
+ create mode 100644 koníček.txt
+
+diff --git a/koníček.txt b/koníček.txt
+new file mode 100644
+index 00000000000..56c67f58752
+--- /dev/null
++++ b/koníček.txt
+@@ -0,0 +1 @@
++I'm a horse.
+-- 
+2.29.2
+=== 0001-Add-horse2.patch ===
+From 2884248d07e4e2c922e137365253e2e521c425b0 Mon Sep 17 00:00:00 2001
+From: Martin Liska <mliska@suse.cz>
+Date: Mon, 21 Dec 2020 10:14:46 +0100
+Subject: [PATCH] Add horse.
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+ChangeLog:
+
+	* koníček.txt: New file.
+---
+ "kon\303\255\304\215ek.txt" | 1 +
+ 1 file changed, 1 insertion(+)
+ create mode 100644 "kon\303\255\304\215ek.txt"
+
+diff --git "a/kon\303\255\304\215ek.txt" "b/kon\303\255\304\215ek.txt"
+new file mode 100644
+index 00000000000..56c67f58752
+--- /dev/null
++++ "b/kon\303\255\304\215ek.txt"
+@@ -0,0 +1 @@
++I'm a horse.
+-- 
+2.29.2
+
+


             reply	other threads:[~2021-04-23 10:36 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-23 10:36 Jakub Jelinek [this message]
  -- strict thread matches above, loose matches on Subject: below --
2021-04-23 10:49 Jakub Jelinek
2021-04-23 10:18 Jakub Jelinek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210423103653.74133393C842@sourceware.org \
    --to=jakub@gcc.gnu.org \
    --cc=gcc-cvs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).