From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ot1-x334.google.com (mail-ot1-x334.google.com [IPv6:2607:f8b0:4864:20::334]) by sourceware.org (Postfix) with ESMTPS id 4EE293857801 for ; Thu, 17 Jun 2021 00:17:07 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 4EE293857801 Received: by mail-ot1-x334.google.com with SMTP id p5-20020a9d45450000b029043ee61dce6bso4372749oti.8 for ; Wed, 16 Jun 2021 17:17:07 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language; bh=S/9RcHadrl22L4rB8E1XFZUgshYPgToz8kJTSQGMY1I=; b=p5GJW62Svr74FFhc41TiZTXx4UsjmUXsx7+clFHPlmBz+iYCBuhPnR6LF4DFJcOhDK 8z4Im04O5SgcV3p25xPgQdGsPgpn64VKP9aT30cjNTq6tbvXfRko0pTt4d0LpTB1Anwb sK/FPGBhfJq/IEB112FAKtM5fvCHW/cUzT6f4pEV4lSiPnSR/D//m7rGLQXmY277AMfx qsKLdeJBKOwm2YgtNbcdL6ekjJpyryDN8mjdPLp3/s3OEyk+ABTSwxp3MrYwZNE6B5if umKt3EVAdxao3fio1eYvxaCVzhvxpQyFvclMGQm3yFaIa2YD/gTj+LT4cVnFpN+Q76Ex hiew== X-Gm-Message-State: AOAM531TlT9OJN5oCFcg9AZUVgzkS9llq9jVQ/Pgq/F0ZrPcU107TYl6 GwCqEuFRWxmRQBlI6wgm7gw= X-Google-Smtp-Source: ABdhPJylcxIEF1PrY+hYhdz8x3QNlGg/pZ9qvBEe3jVRuI1LcWpxPy868oBgkL+LeM2VihIhUINpGw== X-Received: by 2002:a05:6830:18d3:: with SMTP id v19mr2035964ote.356.1623889026683; Wed, 16 Jun 2021 17:17:06 -0700 (PDT) Received: from [192.168.0.41] (97-118-105-195.hlrn.qwest.net. [97.118.105.195]) by smtp.gmail.com with ESMTPSA id d12sm874787otf.65.2021.06.16.17.17.05 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 16 Jun 2021 17:17:05 -0700 (PDT) Subject: Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog To: Jason Merrill Cc: Hans-Peter Nilsson , Jakub Jelinek , gcc Mailing List , Jonathan Wakely References: <1230cb99-ed83-3e4e-8362-94f03ee021bc@gmail.com> <3228435b-aba0-6157-3266-c0f025822829@gmail.com> <5f89ddc0-aed4-2c20-0979-dfafb29046ee@gmail.com> <20210610173005.GI7746@tucnak> <20210610190941.GJ7746@tucnak> <58b63929-01f5-038c-931c-9ff8349d9f95@gmail.com> From: Martin Sebor Message-ID: <71b4a023-efb2-6c6a-9ced-93cce7c96540@gmail.com> Date: Wed, 16 Jun 2021 18:17:04 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/mixed; boundary="------------7D73202AB71BEE9612137AE8" Content-Language: en-US X-Spam-Status: No, score=-8.9 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Jun 2021 00:17:10 -0000 This is a multi-part message in MIME format. --------------7D73202AB71BEE9612137AE8 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit On 6/16/21 5:45 PM, Jason Merrill wrote: > On Wed, Jun 16, 2021 at 5:46 PM Martin Sebor > wrote: > > On 6/16/21 2:49 PM, Jason Merrill wrote: > > On 6/15/21 11:42 PM, Jason Merrill wrote: > >> On Tue, Jun 15, 2021 at 10:04 PM Martin Sebor via Gcc > > >> >> wrote: > >> > >>     On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote: > >>      > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote: > >>      > > >>      >> On 6/11/21 11:32 AM, Jonathan Wakely wrote: > >>      >>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote: > >>      >>>> My objection is to making our policies and tools more > >> restrictive > >>      >>>> than they need to be.  We shouldn't expect everyone to > study > >> whole > >>      >>>> manuals just to figure out how to successfully commit a > >> change (or > >>      >>>> learn how to format it just the right way).  It should > be easy. > >>      >>> > >>      >>> I agree, to some extent. But consistency is also good. The > >>     conventions > >>      >>> for GNU ChangeLog formatting exist for a reason, and so > do the > >>      >>> conventions for good Git commit messages. > >>      >>> > >>      >>>> Setting this discussion aside for a moment and using a > >> different > >>      >>>> example, the commit hook rejects commit messages that > don't > >> start > >>      >>>> ChangeLog entries with tabs.  It also rejects commit > >> messages that > >>      >>>> don't list all the same test files as those changed by > the > >> commit > >>      >>>> (and probably some others as well).  That's in my view > >> unnecessary > >>      >>>> when the hook could just replace the leading spaces with > >> tabs and > >>      >>>> automatically mention all the tests. > >>      >>>> > >>      >>>> I see this proposal as heading in the same direction. > >> Rather than > >>      >>>> making the script fix things up if we get them wrong > it would > >>     reject > >>      >>>> the commit, requiring the user to massage the ChangeLog by > >>     hand into > >>      >>>> an unnecessarily rigid format. > >>      >>> > >>      >>> You cannot "fix things up" in a server-side receive hook, > >> because > >>      >>> changing the commit message would alter the commit > hash, which > >>     would > >>      >>> require the committer to do a rebase to proceed. That > breaks the > >>      >>> expected behaviour and workflow of a git repo. > >>      >>> > >>      >>> You can use the scripts on the client side to verify > your commit > >>      >>> message before pushing, so you don't have to be surprised > >> when the > >>      >>> server rejects it. > >>      >> > >>      >> That sounds like a killer argument.  Do we have shared > >> client-side > >>      >> scripts that could fix things up for us, or are we each > on our > >> own > >>      >> to write them? > >>      > > >>      > I hope I got your view wrong.  If not: the "scripts fixing > >>      > things up for us" direction is flawed (compared to the > "scripts > >>      > rejecting bad formats"), unless offered as a non-default > option; > >>      > please don't proceed. > >>      > > >>      > Why?  For one, there'll always be bugs in the scripting. > >>      > Mitigate those situations: while wrongly rejecting a > commit is > >>      > bad, wrongly "fixing things up" is worse, as a general rule. > >>      > Better avoid that.  (There's probably a popular "pattern > name" > >>      > for what I try to describe.) > >> > >>     The word that comes to mind is Technophobia.  Is it wise to > trust > >>     compilers to transform programs from their source form into > >>     executables?  What if there are bugs in either?  What about > the OS? > >>     The whole computer, or the Internet?  Our cars? > Fortunately, there's > >>     more to gain than to lose by trusting automation.  If there > weren't > >>     human progress would be stuck sometime in the 1700's. > >> > >>     But we're not talking about anything anywhere that sophisticated > >>     here: a sed script to copy and paste a piece of text in > >>     the description of a change from one place to another.  It's > been > >>     done a few times before with more important data than > ChangeLogs. > >> > >> > >> git gcc-commit-mklog already automates most of the process.  It > could > >> also automate adding [PRxxxxx] to the first line.  Is that what > you're > >> asking for? > > > > Like, say: > > I don't think this solves the problem Xionghu Luo was asking about: > https://gcc.gnu.org/pipermail/gcc/2021-June/236346.html > > > Indeed, their problem was not mentioning the PR in the testcase, which a > script isn't going to fix. > > i.e., they did have a [PRnnnn] in the one line subject but not in > their ChangeLog entries.  It also not clear if they used mklog.py > at all.  IME, mklog.py already puts in a [PRnnnn] near the top of > a patch if it finds in one of the tests.  Though it doesn't seem > to put in the ChangeLog entries.  Odd. > > > It currently puts in > >  PR comp/nnnnn > > at the beginning of the ChangeLog entries; it used to put them in the > entries for each ChangeLog file, but that changed in r12-771.  My patch > also adds the [PRnnnn] to the subject line. To say I'm not good at Python would be an understatement but I hacked up the attached patch that: 1) extracts PR numbers from test names, 2) gets the component for each PR from Bugzilla, 3) adds the PR component/nnnnn to each ChangeLog Running the hacked up script with -p on the patch for PR 100085 prints the following: Resolves: PR 100085/target - Bad code for union transfer from __float128 to vector types gcc/ChangeLog: PR 100085/target * config/rs6000/rs6000-p8swap.c (pattern_is_rotate64): (insn_is_load_p): (insn_is_swap_p): (quad_aligned_load_p): (const_load_sequence_p): (replace_swapped_aligned_load): (recombine_lvx_pattern): (recombine_stvx_pattern): gcc/testsuite/ChangeLog: PR 100085/target * gcc.target/powerpc/float128-call.c: * gcc.target/powerpc/pr100085.c: New test. Without -p it doesn't print the Resolves: line or the component name in the ChangeLog entries. This also mimics what my retired awk script does. What do you think? > I was suggesting to make this (and the other things the commit > hook rejects commit for) happen automatically by running a script > at the same time as git commit. > > But to be clear, I'm not asking for anything for myself.  Although > I use mklog.py I have my own script that does what I suggest that > I could go back to.  I responded to this thread because I think > these minute details could be automated for everyone's benefit. > Before moving to Git we talked about doing much more, including > automatically running a format/style checker on the patch and > (IIRC) Jeff even wanted it to do some basic tweaks on its own > (like replace spaces with tabs). > > > We could do various cleanup in the 'commit-msg' hook on the user side. > Or, probably better, git gcc-verify could clean up some of the issues it > finds. I'm not familiar with these. Where should I look for them to learn how to do this? Martin --------------7D73202AB71BEE9612137AE8 Content-Type: text/x-patch; charset=UTF-8; name="gcc-mklog-prs.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="gcc-mklog-prs.diff" Add PRs to ChangeLog entries in output. contrib/ChangeLog: * mklog.py (bugzilla_url): Add component to query. get_pr_titles): Remove stray debugging output. Get components for PRs. Return new PR array including components. (generate_changelog): Extract PRs from test names. Print a Resolves line before PRs. Replace PRS with updated. Print all PRs in each ChangeLog entry. diff --git a/contrib/mklog.py b/contrib/mklog.py index 5c93c707128..e6a418a2fb4 100755 --- a/contrib/mklog.py +++ b/contrib/mklog.py @@ -51,7 +51,7 @@ fn_regex = re.compile(r'([a-zA-Z_][^()\s]*)\s*\([^*]') template_and_param_regex = re.compile(r'<[^<>]*>') md_def_regex = re.compile(r'\(define.*\s+"(.*)"') bugzilla_url = 'https://gcc.gnu.org/bugzilla/rest.cgi/bug?id=%s&' \ - 'include_fields=summary' + 'include_fields=component,summary' function_extensions = {'.c', '.cpp', '.C', '.cc', '.h', '.inc', '.def', '.md'} @@ -115,17 +115,22 @@ def sort_changelog_files(changed_file): def get_pr_titles(prs): + new_prs = [] output = '' for pr in prs: pr_id = pr.split('/')[-1] + if pr_id == pr: + pr_id = pr.split(' ')[-1] r = requests.get(bugzilla_url % pr_id) bugs = r.json()['bugs'] if len(bugs) == 1: - output += '%s - %s\n' % (pr, bugs[0]['summary']) - print(output) + comp = bugs[0]['component'] + pr_id_comp = "PR " + pr_id + "/" + comp + new_prs.append (pr_id_comp) + output += 'PR %s/%s - %s\n' % (pr_id, comp, bugs[0]['summary']) if output: output += '\n' - return output + return output, new_prs def generate_changelog(data, no_functions=False, fill_pr_titles=False): @@ -147,6 +152,12 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False): # Extract PR entries from newly added tests if 'testsuite' in file.path and file.is_added_file: + name = os.path.basename(file.path) + name = os.path.splitext(name)[0] + if name.startswith("pr"): + name = name[2:] + name = "PR " + name + prs.append(name) # Only search first ten lines as later lines may # contains commented code which a note that it # has not been tested due to a certain PR or DR. @@ -166,22 +177,28 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False): # Found dg-warning/dg-error line break - if fill_pr_titles: - out += get_pr_titles(prs) + # Start with a blank line for a subject and description. + out += '\n' - # print list of PR entries before ChangeLog entries - if prs: - if not out: - out += '\n' - for pr in prs: - out += '\t%s\n' % pr - out += '\n' + # Append a Resolves: line with PRs and their titles and replace + # PRS with an array of / in case it's not in + # that format. + if fill_pr_titles: + pr_titles, new_prs = get_pr_titles(prs) + if new_prs: + prs = new_prs + if pr_titles: + out += "Resolves:\n" + out += pr_titles # sort ChangeLog so that 'testsuite' is at the end for changelog in sorted(changelog_list, key=lambda x: 'testsuite' in x): files = changelogs[changelog] out += '%s:\n' % os.path.join(changelog, 'ChangeLog') out += '\n' + # Print list of PR entries before each ChangeLog entry. + for pr in prs: + out += '\t%s\n' % pr # new and deleted files should be at the end for file in sorted(files, key=sort_changelog_files): assert file.path.startswith(changelog) --------------7D73202AB71BEE9612137AE8--