Date: Thu, 6 Aug 2020 13:11:58 +0200 (CEST)
From: Marc Glisse
To: Richard Biener
cc: Christophe Lyon, GCC Patches
Subject: Re: VEC_COND_EXPR optimizations v2

On Thu, 6 Aug 2020, Richard Biener wrote:

> On Thu, Aug 6, 2020 at 10:17 AM Christophe Lyon wrote:
>>
>> Hi,
>>
>>
>> On Wed, 5 Aug 2020 at 16:24, Richard Biener via Gcc-patches wrote:
>>>
>>> On Wed, Aug 5, 2020 at 3:33 PM Marc Glisse wrote:
>>>>
>>>> New version that passed bootstrap+regtest during the night.
>>>>
>>>> When vector comparisons were forced to use vec_cond_expr, we lost a number of
>>>> optimizations (my fault for not adding enough testcases to prevent that).
>>>> This patch tries to unwrap vec_cond_expr a bit so some optimizations can
>>>> still happen.
>>>>
>>>> I wasn't planning to add all those transformations together, but adding one
>>>> caused a regression, whose fix introduced a second regression, etc.
>>>>
>>>> Restricting to constant folding would not be sufficient, we also need at
>>>> least things like X|0 or X&X. The transformations are quite conservative
>>>> with :s and folding only if everything simplifies, we may want to relax
>>>> this later. And of course we are going to miss things like a?b:c + a?c:b
>>>> -> b+c.
>>>>
>>>> In terms of number of operations, some transformations turning 2
>>>> VEC_COND_EXPR into VEC_COND_EXPR + BIT_IOR_EXPR + BIT_NOT_EXPR might not look
>>>> like a gain... I expect the bit_not disappears in most cases, and
>>>> VEC_COND_EXPR looks more costly than a simpler BIT_IOR_EXPR.
>>>>
>>>> I am a bit confused that with avx512 we get types like "vector(4)
>>>> " with :2 and not :1 (is it a hack so true is 1 and not
>>>> -1?), but that doesn't matter for this patch.
>>>
>>> OK.
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> 2020-08-05 Marc Glisse
>>>>
>>>> PR tree-optimization/95906
>>>> PR target/70314
>>>> * match.pd ((c ? a : b) op d, (c ? a : b) op (c ? d : e),
>>>> (v ? w : 0) ? a : b, c1 ? c2 ? a : b : b): New transformations.
>>>> (op (c ? a : b)): Update to match the new transformations.
>>>>
>>>> * gcc.dg/tree-ssa/andnot-2.c: New file.
>>>> * gcc.dg/tree-ssa/pr95906.c: Likewise.
>>>> * gcc.target/i386/pr70314.c: Likewise.
>>>>
>>
>> I think this patch is causing several ICEs on arm-none-linux-gnueabihf
>> --with-cpu cortex-a9 --with-fpu neon-fp16:
>> Executed from: gcc.c-torture/compile/compile.exp
>> gcc.c-torture/compile/20160205-1.c -O3 -fomit-frame-pointer
>> -funroll-loops -fpeel-loops -ftracer -finline-functions (internal
>> compiler error)
>> gcc.c-torture/compile/20160205-1.c -O3 -g (internal compiler error)
>> Executed from: gcc.dg/dg.exp
>> gcc.dg/pr87746.c (internal compiler error)
>> Executed from: gcc.dg/tree-ssa/tree-ssa.exp
>> gcc.dg/tree-ssa/ifc-cd.c (internal compiler error)
>> Executed from: gcc.dg/vect/vect.exp
>> gcc.dg/vect/pr59591-1.c (internal compiler error)
>> gcc.dg/vect/pr59591-1.c -flto -ffat-lto-objects (internal compiler error)
>> gcc.dg/vect/pr86927.c (internal compiler error)
>> gcc.dg/vect/pr86927.c -flto -ffat-lto-objects (internal compiler error)
>> gcc.dg/vect/slp-cond-5.c (internal compiler error)
>> gcc.dg/vect/slp-cond-5.c -flto -ffat-lto-objects (internal compiler error)
>> gcc.dg/vect/vect-23.c (internal compiler error)
>> gcc.dg/vect/vect-23.c -flto -ffat-lto-objects (internal compiler error)
>> gcc.dg/vect/vect-24.c (internal compiler error)
>> gcc.dg/vect/vect-24.c -flto -ffat-lto-objects (internal compiler error)
>> gcc.dg/vect/vect-cond-reduc-6.c (internal compiler error)
>> gcc.dg/vect/vect-cond-reduc-6.c -flto -ffat-lto-objects (internal
>> compiler error)
>>
>> Backtrace for gcc.c-torture/compile/20160205-1.c -O3
>> -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer
>> -finline-functions
>> during RTL pass: expand
>> /gcc/testsuite/gcc.c-torture/compile/20160205-1.c:2:5: internal
>> compiler error: in do_store_flag, at expr.c:12259
>> 0x8feb26 do_store_flag
>> /gcc/expr.c:12259
>> 0x900201 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
>> expand_modifier)
>> /gcc/expr.c:9617
>> 0x908cd0 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
>> expand_modifier, rtx_def**, bool)
>> /gcc/expr.c:10159
>> 0x91174e expand_expr
>> /gcc/expr.h:282
>> 0x91174e expand_operands(tree_node*, tree_node*, rtx_def*, rtx_def**,
>> rtx_def**, expand_modifier)
>> /gcc/expr.c:8065
>> 0x8ff543 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
>> expand_modifier)
>> /gcc/expr.c:9950
>> 0x908cd0 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
>> expand_modifier, rtx_def**, bool)
>> /gcc/expr.c:10159
>> 0x91174e expand_expr
>> /gcc/expr.h:282
>> 0x91174e expand_operands(tree_node*, tree_node*, rtx_def*, rtx_def**,
>> rtx_def**, expand_modifier)
>> /gcc/expr.c:8065
>> 0x8ff543 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
>> expand_modifier)
>> /gcc/expr.c:9950
>> 0x908cd0 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
>> expand_modifier, rtx_def**, bool)
>> /gcc/expr.c:10159
>> 0x91174e expand_expr
>> /gcc/expr.h:282
>> 0x91174e expand_operands(tree_node*, tree_node*, rtx_def*, rtx_def**,
>> rtx_def**, expand_modifier)
>> /gcc/expr.c:8065
>> 0x8ff543 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
>> expand_modifier)
>> /gcc/expr.c:9950
>> 0x908cd0 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
>> expand_modifier, rtx_def**, bool)
>> /gcc/expr.c:10159
>> 0x91174e expand_expr
>> /gcc/expr.h:282
>> 0x91174e expand_operands(tree_node*, tree_node*, rtx_def*, rtx_def**,
>> rtx_def**, expand_modifier)
>> /gcc/expr.c:8065
>> 0x8ff543 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
>> expand_modifier)
>> /gcc/expr.c:9950
>> 0x908cd0 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
>> expand_modifier, rtx_def**, bool)
>> /gcc/expr.c:10159
>> 0x91174e expand_expr
>> /gcc/expr.h:282
>
> Hmm, I guess we might need to verify that the VEC_COND_EXPRs
> can be RTL expanded, at least if the folding triggers after vector
> lowering (but needing to lower a previously expandable VEC_COND_EXPR
> would be similarly bad). So we may need to handle VEC_COND_EXPRs
> like VEC_PERMs and thus need to check target support. Ick.

Maybe.
I'd like to see the gimple that arm fails to expand, to tell whether that's really a limitation in the hardware or just some simple missing case in the target or the expansion code. Is it that we had (a