From: Marc Glisse
Date: Thu, 6 Aug 2020 20:07:57 +0200 (CEST)
To: Christophe Lyon
cc: Richard Biener, GCC Patches
Subject: Re: VEC_COND_EXPR optimizations v2

On Thu, 6 Aug 2020, Christophe Lyon wrote:

>> Was I on the right track configuring with
>> --target=arm-none-linux-gnueabihf --with-cpu=cortex-a9
>> --with-fpu=neon-fp16
>> then compiling without any special option?
>
> Maybe you also need --with-float=hard, I don't remember if it's
> implied by the 'hf' target suffix

Thanks!
That's what I was missing to reproduce the issue. Now I can reproduce it
with just

typedef unsigned int vec __attribute__((vector_size(16)));
typedef int vi __attribute__((vector_size(16)));
vi f(vec a,vec b){
    return a==5 | b==7;
}

with -fdisable-tree-forwprop1 -fdisable-tree-forwprop2
-fdisable-tree-forwprop3 -O1

  _1 = a_5(D) == { 5, 5, 5, 5 };
  _3 = b_6(D) == { 7, 7, 7, 7 };
  _9 = _1 | _3;
  _7 = .VCOND (_9, { 0, 0, 0, 0 }, { -1, -1, -1, -1 }, { 0, 0, 0, 0 }, 107);

we fail to expand the equality comparison (expand_vec_cmp_expr_p returns
false), while with -fdisable-tree-forwprop4 we do manage to expand

  _2 = .VCONDU (a_5(D), { 5, 5, 5, 5 }, { -1, -1, -1, -1 }, { 0, 0, 0, 0 }, 112);

It doesn't make much sense to me that we can expand the more complicated
form and not the simpler form of the same operation (both compare a to 5
and produce a vector of -1 or 0 of the same size), especially when the
target has an instruction (vceq) that does just what we want.

Introducing boolean vectors was fine, but I think they should be real
types that we can operate on, not be forced to appear only as the first
argument of a vcond.

I can think of 2 natural ways to improve things: either implement vector
comparisons in the ARM backend (possibly by forwarding to their existing
code for vcond), or, in the generic expansion code, try using vcond if the
direct comparison opcode is not provided.

We can temporarily revert my patch, but I would like it to be temporary.
Since aarch64 seems to handle the same code just fine, maybe someone who
knows arm could copy the relevant code over?

Does my message make sense? Do people have comments?

-- 
Marc Glisse
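
To illustrate the point above that the direct comparison and the vcond form compute the same thing, here is a minimal source-level sketch. It only shows the lane-wise semantics and says nothing about how either form is expanded on arm. The names cmp_direct, cmp_select, same_lanes and self_check are made up for this illustration and do not come from GCC or from the patch.

```c
#include <assert.h>

typedef unsigned int vec __attribute__((vector_size(16)));
typedef int vi __attribute__((vector_size(16)));

/* Direct form: the vector comparison itself yields -1 or 0 in each lane. */
vi cmp_direct(vec a)
{
    return a == 5;
}

/* Select form, written lane by lane: pick -1 or 0 depending on the
   comparison, which is what the VCONDU statement expresses. */
vi cmp_select(vec a)
{
    vi r = { 0, 0, 0, 0 };
    for (int i = 0; i < 4; i++)
        r[i] = (a[i] == 5) ? -1 : 0;
    return r;
}

/* Returns 1 when every lane of x equals the matching lane of y. */
int same_lanes(vi x, vi y)
{
    for (int i = 0; i < 4; i++)
        if (x[i] != y[i])
            return 0;
    return 1;
}

/* Sanity check on one sample input (hypothetical helper). */
void self_check(void)
{
    vec a = { 5, 1, 5, 0 };
    assert(same_lanes(cmp_direct(a), cmp_select(a)));
}
```

Calling self_check() from a small main is enough to confirm that both functions agree on that input; other inputs can be compared the same way with same_lanes.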