From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id AFBD53858C60; Thu, 16 Feb 2023 16:34:33 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org AFBD53858C60
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1676565273;
	bh=lvSJ4oNcQZjmEBWpSaUzuR+mmwQlkYCZmtX8Z0526Mw=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=bxEZQ6womwDbktKu2pOs3GxcARvAFQ+zvPNjpmprZ//ciutYbwZ/RP6c2IESEDKBz
	 9aU5R1umSG24ewH3TrF0Cbo9E1Koa+qxCpUoj3OaAE7IBn92anbNzbQ9cbVJ4gJS43
	 +sT+San7v4z2V8M+TXzuz/Y7zO1/BRIIZQmVm46o=
From: "jakub at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/108803] [10/11/12/13 Regression] wrong code for 128bit
 rotate on aarch64-unknown-linux-gnu with -Og
Date: Thu, 16 Feb 2023 16:34:32 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: jakub at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 10.5
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: cc
Message-ID: <bug-108803-4-mTlNFqfQHx@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-108803-4@http.gcc.gnu.org/bugzilla/>
References: <bug-108803-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D108803

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |rsandifo at gcc dot gnu.org
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I'd say this is a bug in expand_doubleword_shift_condmove or so.
aarch64 when !TARGET_SIMD is a !SHIFT_COUNT_TRUNCATED target and shift_mask=
 is
0.
And HAVE_conditional_move is non-zero.
Now, if the shift count is constant at expansion time, we just select one of
the
expand_superword_shift or expand_subword_shift depending on the exact value=
 and
the shift count ought to be in both cases in the [0, BITS_PER_WORD - 1] ran=
ge.
Similarly, if !HAVE_conditional_move and shift count is non-constant, we do=
 the
same except that we select one at runtime, so again at runtime the chosen s=
hift
count should be [0, BITS_PER_WORD - 1].
But for expand_doubleword_shift_condmove with shift_mask 0, op1 is [0, 2 *
BITS_PER_WORD - 1] and we pass op1, superword_op1 where the latter is op1 -
BITS_PER_WORD.
So, in expand_doubleword_shift_condmove subword_op1 is in [0, 2 * BITS_PER_=
WORD
- 1] range and superword_op1 is in [-BITS_PER_WORD, BITS_PER_WORD - 1] rang=
e.=20
And the
routine just emits expand_superword_shift and expand_subword_shift and sele=
cts
using conditional move one of those.  But that means one of the two shifts =
is
necessarily with out of range count, either subword_op1 is [BITS_PER_WORD, =
2 *
BITS_PER_WORD - 1] i.e. too large, or superword_op1 is in [-BITS_PER_WORD, =
-1]
range (i.e. negative).
Don't we need to mask those counts in that case (both)?
Now, in the testcase __builtin_add_overflow_p is actually evaluated to cons=
tant
0 only during the expansion (or later?) - in this particular case I wonder =
why
we haven't optimized it earlier because for any unsigned addends the additi=
on
is in [0, 2 * UINT_MAX - 2] range and so fits well into signed __int128 ran=
ge,
something to be looked at for GCC 14.
But I fear it is exactly the only during RTL discovered constant that we la=
ter
on propagate into the op1 - BITS_PER_WORD and thus do one of the shifts with
count -64.

CCing Richard as the author of that code from 2004.=