From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id AFBD53858C60; Thu, 16 Feb 2023 16:34:33 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org AFBD53858C60 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1676565273; bh=lvSJ4oNcQZjmEBWpSaUzuR+mmwQlkYCZmtX8Z0526Mw=; h=From:To:Subject:Date:In-Reply-To:References:From; b=bxEZQ6womwDbktKu2pOs3GxcARvAFQ+zvPNjpmprZ//ciutYbwZ/RP6c2IESEDKBz 9aU5R1umSG24ewH3TrF0Cbo9E1Koa+qxCpUoj3OaAE7IBn92anbNzbQ9cbVJ4gJS43 +sT+San7v4z2V8M+TXzuz/Y7zO1/BRIIZQmVm46o= From: "jakub at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug target/108803] [10/11/12/13 Regression] wrong code for 128bit rotate on aarch64-unknown-linux-gnu with -Og Date: Thu, 16 Feb 2023 16:34:32 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: target X-Bugzilla-Version: 13.0 X-Bugzilla-Keywords: wrong-code X-Bugzilla-Severity: normal X-Bugzilla-Who: jakub at gcc dot gnu.org X-Bugzilla-Status: UNCONFIRMED X-Bugzilla-Resolution: X-Bugzilla-Priority: P2 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: 10.5 X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 List-Id: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D108803 Jakub Jelinek changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |jakub at gcc dot gnu.org, | |rsandifo at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- I'd say this is a bug in expand_doubleword_shift_condmove or so. aarch64 when !TARGET_SIMD is a !SHIFT_COUNT_TRUNCATED target and shift_mask= is 0. And HAVE_conditional_move is non-zero. Now, if the shift count is constant at expansion time, we just select one of the expand_superword_shift or expand_subword_shift depending on the exact value= and the shift count ought to be in both cases in the [0, BITS_PER_WORD - 1] ran= ge. Similarly, if !HAVE_conditional_move and shift count is non-constant, we do= the same except that we select one at runtime, so again at runtime the chosen s= hift count should be [0, BITS_PER_WORD - 1]. But for expand_doubleword_shift_condmove with shift_mask 0, op1 is [0, 2 * BITS_PER_WORD - 1] and we pass op1, superword_op1 where the latter is op1 - BITS_PER_WORD. So, in expand_doubleword_shift_condmove subword_op1 is in [0, 2 * BITS_PER_= WORD - 1] range and superword_op1 is in [-BITS_PER_WORD, BITS_PER_WORD - 1] rang= e.=20 And the routine just emits expand_superword_shift and expand_subword_shift and sele= cts using conditional move one of those. But that means one of the two shifts = is necessarily with out of range count, either subword_op1 is [BITS_PER_WORD, = 2 * BITS_PER_WORD - 1] i.e. too large, or superword_op1 is in [-BITS_PER_WORD, = -1] range (i.e. negative). Don't we need to mask those counts in that case (both)? Now, in the testcase __builtin_add_overflow_p is actually evaluated to cons= tant 0 only during the expansion (or later?) - in this particular case I wonder = why we haven't optimized it earlier because for any unsigned addends the additi= on is in [0, 2 * UINT_MAX - 2] range and so fits well into signed __int128 ran= ge, something to be looked at for GCC 14. But I fear it is exactly the only during RTL discovered constant that we la= ter on propagate into the op1 - BITS_PER_WORD and thus do one of the shifts with count -64. CCing Richard as the author of that code from 2004.=