From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 8F5EA3858CDA for ; Tue, 10 Jan 2023 11:32:07 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8F5EA3858CDA Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 329D54B3; Tue, 10 Jan 2023 03:32:49 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.99.50]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id AFDEB3F587; Tue, 10 Jan 2023 03:32:06 -0800 (PST) From: Richard Sandiford To: Richard Biener Mail-Followup-To: Richard Biener ,gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] tree-optimization/108314 - avoid BIT_NOT optimization for extract-last References: <20230110094614.0E81A1358A@imap2.suse-dmz.suse.de> Date: Tue, 10 Jan 2023 11:32:05 +0000 In-Reply-To: (Richard Biener's message of "Tue, 10 Jan 2023 11:05:34 +0000 (UTC)") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-38.0 required=5.0 tests=BAYES_00,GIT_PATCH_0,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Richard Biener writes: > On Tue, 10 Jan 2023, Richard Sandiford wrote: > >> Richard Biener writes: >> > The extract-last reduction internal function expects the then and >> > else clause as vector and scalar and thus we cannot perform optimization >> > of the inversion of the condition by swapping the then/else clauses. >> > >> > Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK? >> >> Sorry for not having found the time to look at the PR yet. >> Like you say in the trail, it seems kind-of familiar. >> >> I think we should instead prevent the else in: >> >> scalar_cond_masked_key cond (cond_expr, ncopies); >> if (loop_vinfo->scalar_cond_masked_set.contains (cond)) >> masks = &LOOP_VINFO_MASKS (loop_vinfo); >> else >> { >> >> for EXTRACT_LAST. We've lost as soon as swap_cond_operands gets >> set to true. > > But we're not getting there - the above is guarded with > > if (reduction_type == EXTRACT_LAST_REDUCTION) > masks = &LOOP_VINFO_MASKS (loop_vinfo); > else > { > > instead we run into > > if (masked) > vec_compare = vec_cond_lhs; > else > { > vec_cond_rhs = vec_oprnds1[i]; > if (bitop1 == NOP_EXPR) > { > ... > else > { > ... > else if (bitop2 == BIT_NOT_EXPR > { > /* Instead of doing ~x ? y : z do x ? z : y. */ > vec_compare = new_temp; > std::swap (vec_then_clause, vec_else_clause); > > so we could instead reject vectorizing for EQ_EXPR but then > applying the negation to the condition allows this to be > vectorized just fine (which is what the patch does)? Ah, OK. I wasn't sure which of the paths we were going down to get here. So yeah, I agree the patch is OK. Sorry for the noise. Richard > Richard. > >> Thanks, >> Richard >> >> > Thanks, >> > Richard. >> > >> > PR tree-optimization/108314 >> > * tree-vect-stmts.cc (vectorizable_condition): Do not >> > perform BIT_NOT_EXPR optimization for EXTRACT_LAST_REDUCTION. >> > >> > * gcc.dg/vect/pr108314.c: New testcase. >> > --- >> > gcc/testsuite/gcc.dg/vect/pr108314.c | 16 ++++++++++++++++ >> > gcc/tree-vect-stmts.cc | 13 +++++++++---- >> > 2 files changed, 25 insertions(+), 4 deletions(-) >> > create mode 100644 gcc/testsuite/gcc.dg/vect/pr108314.c >> > >> > diff --git a/gcc/testsuite/gcc.dg/vect/pr108314.c b/gcc/testsuite/gcc.dg/vect/pr108314.c >> > new file mode 100644 >> > index 00000000000..07260e06915 >> > --- /dev/null >> > +++ b/gcc/testsuite/gcc.dg/vect/pr108314.c >> > @@ -0,0 +1,16 @@ >> > +/* { dg-do compile } */ >> > +/* { dg-additional-options "-march=armv9-a" { target aarch64-*-* } } */ >> > + >> > +int x, y, z; >> > + >> > +void f(void) >> > +{ >> > + int t = 4; >> > + for (; x; x++) >> > + { >> > + if (y) >> > + continue; >> > + t = 0; >> > + } >> > + z = t; >> > +} >> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc >> > index 6ddd41fb473..eb4ca1f184e 100644 >> > --- a/gcc/tree-vect-stmts.cc >> > +++ b/gcc/tree-vect-stmts.cc >> > @@ -10677,7 +10677,8 @@ vectorizable_condition (vec_info *vinfo, >> > vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); >> > if (bitop2 == NOP_EXPR) >> > vec_compare = new_temp; >> > - else if (bitop2 == BIT_NOT_EXPR) >> > + else if (bitop2 == BIT_NOT_EXPR >> > + && reduction_type != EXTRACT_LAST_REDUCTION) >> > { >> > /* Instead of doing ~x ? y : z do x ? z : y. */ >> > vec_compare = new_temp; >> > @@ -10686,9 +10687,13 @@ vectorizable_condition (vec_info *vinfo, >> > else >> > { >> > vec_compare = make_ssa_name (vec_cmp_type); >> > - new_stmt >> > - = gimple_build_assign (vec_compare, bitop2, >> > - vec_cond_lhs, new_temp); >> > + if (bitop2 == BIT_NOT_EXPR) >> > + new_stmt >> > + = gimple_build_assign (vec_compare, bitop2, new_temp); >> > + else >> > + new_stmt >> > + = gimple_build_assign (vec_compare, bitop2, >> > + vec_cond_lhs, new_temp); >> > vect_finish_stmt_generation (vinfo, stmt_info, >> > new_stmt, gsi); >> > } >>