From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 373CE3858C00; Fri, 15 Sep 2023 06:42:34 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 373CE3858C00
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1694760154;
	bh=0DQ4iekPg63xEnB3stMqa7mkLRHTNQZWGoeWrr98XaE=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=HQ/NHeBr8L8HpHJvetfSiLtBzPLYpplVNhpE4h7D8n7gdWhKrsolo928UAE3t4oO4
	 vST5FH5yYX9Yj/5v+RBj2eNAia+ougLBgYdk9DsJxLVc7GalgnKnN1sGje4tQEFX+g
	 ljmzSQishk+Toy+d11tzuMtV0a0d1PDRDhOk/H9k=
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/111401] Middle-end: Missed optimization of MASK_LEN_FOLD_LEFT_PLUS
Date: Fri, 15 Sep 2023 06:42:33 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111401

--- Comment #7 from Richard Biener ---
(In reply to Robin Dapp from comment #6)
> Created attachment 55902 [details]
> Tentative
>
> You're referring to the case where we have init = -0.0, the condition is
> false and we end up wrongly doing -0.0 + 0.0 = 0.0?
> I suppose -0.0 the proper neutral element for PLUS (and WIDEN_SUM?) when
> honoring signed zeros?  And 0.0 for MINUS?  Doesn't that also depend on the
> rounding mode?
Yes, if the rounding mode isn't known there isn't a working neutral element.

> neutral_op_for_reduction could return a -0 for PLUS if we honor it for that
> type.  Or is that too intrusive?

I suppose that could work, but we need to check that we're not using this for
the initial value.

> Guess I should add a test case for that as well.
>
> Another thing is that swapping operands is not as easy with COND_ADD because
> the addition would be in the else.  I'd punt for that case for now.
>
> Next problem - might be a mistake on my side.  For avx512 we create a
> COND_ADD but the respective MASK_FOLD_LEFT_PLUS is not available, causing us
> to create numerous vec_extracts as fallback that increase the cost until we
> don't vectorize anymore.

Yeah, but then a fold-left reduction wasn't necessary in the first place?
We should avoid that (it's slow even when the target supports it) when
possible.

> Therefore I added a
> vectorized_internal_fn_supported_p (IFN_FOLD_LEFT_PLUS, TREE_TYPE (lhs)).
> SLP paths and ncopies != 1 are excluded as well.  Not really happy with how
> the patch looks now but at least the testsuites on aarch and x86 pass.
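
For reference, the signed-zero case discussed above can be reproduced with a
small scalar model.  This is only a sketch, not vectorizer code: the function
names masked_fold_left_plus, scalar_cond_sum and sign_preserved_p are made up
for illustration, and it assumes the default round-to-nearest mode and a
build that honors signed zeros (no -ffast-math / -fno-signed-zeros).  It
models inactive lanes as being replaced by an "else" value before the
in-order add, and compares that against the original conditional loop.

```c
#include <math.h>     /* signbit */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar model of a masked in-order (fold-left) PLUS
   reduction: inactive elements are replaced by ELSE_VAL and then added,
   strictly left to right.  */
static double
masked_fold_left_plus (double init, const double *v, const bool *mask,
                       size_t n, double else_val)
{
  double acc = init;
  for (size_t i = 0; i < n; i++)
    acc += mask[i] ? v[i] : else_val;
  return acc;
}

/* Reference semantics: the original scalar loop, which simply skips
   elements whose condition is false.  */
static double
scalar_cond_sum (double init, const double *v, const bool *mask, size_t n)
{
  double acc = init;
  for (size_t i = 0; i < n; i++)
    if (mask[i])
      acc += v[i];
  return acc;
}

/* Return true if, with init = -0.0 and every condition false, the masked
   model yields a zero of the same sign as the reference loop.
   Assumes round-to-nearest; other rounding modes are not modelled.  */
static bool
sign_preserved_p (double else_val)
{
  const double v[4] = { 1.0, 2.0, 3.0, 4.0 };
  const bool mask[4] = { false, false, false, false };
  double want = scalar_cond_sum (-0.0, v, mask, 4);
  double got = masked_fold_left_plus (-0.0, v, mask, 4, else_val);
  return !!signbit (want) == !!signbit (got);
}
```

With else_val = 0.0 the model computes -0.0 + 0.0 and loses the sign of the
initial value, while else_val = -0.0 keeps it.  Under round-to-nearest only;
as said above, with an unknown rounding mode neither choice works in general.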