public inbox for gcc@gcc.gnu.org
From: Mikael Morin <morin-mikael@orange.fr>
To: Jakub Jelinek <jakub@redhat.com>
Cc: "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>,
	Jonathan Wakely <jwakely.gcc@gmail.com>,
	gcc@gcc.gnu.org, Frank Eigler <fche@redhat.com>
Subject: Re: gcc git locked out for hours second day in a row
Date: Sun, 23 Jun 2024 22:22:26 +0200
Message-ID: <3ac289bf-a264-4726-97a0-5d2e47f2cd55@orange.fr>
In-Reply-To: <Zmm3TbjL1tRWuc0e@tucnak>

Hello,

On 12/06/2024 at 16:57, Jakub Jelinek wrote:
> On Wed, Jun 12, 2024 at 04:53:38PM +0200, Mikael Morin wrote:
>>> Perhaps you could create a mirror version of the repo and do some experiments locally on that to identify where the bottle-neck is coming from?
>>>
>> Not sure where to start for that. Are the hooks published somewhere?
> 
> Yes: https://github.com/AdaCore/git-hooks/tree/master
> 
> Note, we use some tweaks on top of that, but that is mostly for the
> release branches and trunk, so it would be interesting to just try
> to reproduce that with the stock AdaCore git hooks.
> 
I have finally taken some time to investigate this hook slowness, and 
here are my findings.

My tests were run with the commit-extra-checker and
commit-email-formatter configs disabled, and hooks.update-hook set to a
minimal script (either "true" or "sleep 1").  With that config, I could
not reproduce the slowness when pushing to refs/users/mikael/*.  The
push finishes in less than a minute.

However, when I push to a normal tag, an email count check comes into
play, and I can reproduce some slowness (details below).  This email
count check shouldn't happen on the gcc repository in my use case (as
email checks don't apply to user references), but the slowness could
well show up in cases other than the email count check, depending on
the configuration: the problem comes from the size of the list of new
commits and is not restricted to email counting.

Anyway, even with email count check triggering, each tag takes less than 
2 minutes to be rejected in my test.  With 330 tags to process, that 
would make an upper bound of 11 hours before rejecting the push in my 
test (I killed it after a few minutes).  On the other hand, with the 
information you gave upthread, the hook on the gcc repository seemed to 
be still processing the first tag after a few hours (assuming they are 
processed in alphabetical order, which seems to be the case).  So this 
still doesn't explain what was happening on the gcc repository.
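
For reference, the 11 hour upper bound is plain arithmetic on the two
figures quoted above (330 tags, less than 2 minutes each).  A minimal
sketch, where the variable names are only illustrative:

```python
# Back-of-the-envelope estimate using the figures quoted in the message.
TAGS = 330             # references in the push
MINUTES_PER_TAG = 2    # observed upper bound for one tag to be rejected

total_minutes = TAGS * MINUTES_PER_TAG
total_hours = total_minutes / 60

print(f"{total_minutes} minutes = {total_hours:g} hours")
```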

Regarding the email count check slowness I mentioned above, I traced it 
back to the updates.AbstractUpdate class, whose (procedural) 
new_commits_for_ref attribute is a list of "new" commits, containing 
both really new commits and commits newly on the branch to be updated, 
but already known to the repository.  When a tag or branch is created,
every commit counts as new on that reference, so a plain "new on the
branch" list would be huge; to limit it, the parent commits of the
oldest "repository-new" commit are not picked up.  But in my test the
list still amounts to a little less than 80,000 commits, basically what
happened on trunk in the last 8 years.  Anything that walks such a big
list is bound to be slow.
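
To illustrate the difference between the two kinds of "new" commits,
here is a toy model of the behaviour described above.  It is not the
git-hooks code: the history, the commit names, the dates and the
function names are all made up, and "oldest" is assumed to mean the
smallest commit date.

```python
def reachable(parents, tip):
    """Return the set of commits reachable from tip (tip included)."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def commits_for_new_ref(parents, dates, known, tip):
    """Model the list walked when a reference is created at tip.

    Returns (new_to_repo, walked): the commits the repository has never
    seen, and the larger set obtained by dropping only the ancestors of
    the oldest such commit.
    """
    on_ref = reachable(parents, tip)
    new_to_repo = on_ref - known
    if not new_to_repo:
        return set(), set()
    oldest = min(new_to_repo, key=dates.get)
    cutoff = reachable(parents, oldest) - {oldest}
    return new_to_repo, on_ref - cutoff


# Hypothetical history: trunk t0..t4, plus a side branch b1, m1, b2
# where m1 merges trunk commit t3 into the branch.
PARENTS = {
    "t0": [], "t1": ["t0"], "t2": ["t1"], "t3": ["t2"], "t4": ["t3"],
    "b1": ["t0"], "m1": ["b1", "t3"], "b2": ["m1"],
}
DATES = {"t0": 0, "b1": 1, "t1": 2, "t2": 3, "t3": 4, "m1": 5, "t4": 6, "b2": 7}

known = reachable(PARENTS, "t4")  # what the repository already has
new_to_repo, walked = commits_for_new_ref(PARENTS, DATES, known, "b2")

print(sorted(new_to_repo))  # commits that are really new
print(sorted(walked))       # commits the model walks for the new ref
```

In this model the trunk commits that were merged after the oldest new
commit end up in the walked list even though the repository already
knows them, which is the effect that inflates the list in my test.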

To sum up:
- The hooks support checking "new on the branch" commits in addition to 
"new on the repository" commits, and that is a feature, not a bug.
- In my use case, that means that the hooks process 80,000 commits, even 
if only 330 of them are new on the repository.
- As the hook is called on a per-reference basis, the same commits would 
be processed over and over again for every reference in my use case, so 
the best approach would be to push them one by one, in order.
- I still don't know why it took hours (without even finishing) to 
process just one tag the other day on the gcc repository.
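
Pushing the references one by one could be scripted along these lines.
This is only a sketch: the remote name and the example references are
placeholders, the helper names are made up, and it assumes that sorting
by creation date matches the order of the history.  It only prints the
commands instead of running them.

```python
import subprocess


def refs_oldest_first(pattern="refs/tags"):
    """List local references matching pattern, oldest first.

    Must be run inside a git repository.
    """
    out = subprocess.run(
        ["git", "for-each-ref", "--sort=creatordate",
         "--format=%(refname)", pattern],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.split()


def push_commands(remote, refs):
    """Build one 'git push' command per reference, keeping the order."""
    return [["git", "push", remote, f"{ref}:{ref}"] for ref in refs]


# Dry run with placeholder references; in a real repository the list
# would come from refs_oldest_first().
example_refs = ["refs/tags/demo-1", "refs/tags/demo-2"]
for cmd in push_commands("origin", example_refs):
    print(" ".join(cmd))
```

Each command could then be passed to subprocess.run(cmd, check=True),
so that the sequence stops at the first rejected push.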

Mikael



