From: Ramana Radhakrishnan
To: Kyrill Tkachov
Cc: GCC Patches, Ramana Radhakrishnan, Richard Earnshaw
Reply-To: ramrad01@arm.com
Date: Thu, 23 Apr 2015 15:01:00 -0000
Subject: Re: [PATCH][ARM] Rewrite vc NEON patterns to use RTL operations rather than UNSPECs
In-Reply-To: <54D20CB2.4070200@arm.com>
References: <54D20CB2.4070200@arm.com>

On Wed, Feb 4, 2015 at 12:12 PM, Kyrill Tkachov wrote:
> Hi all,
>
> This patch improves the vc patterns in neon.md to use proper RTL
> operations rather than UNSPECS.
> It is done in a similar way to the analogous aarch64 operations i.e. vceq is
> expressed as
> (neg (eq (...) (...)))
> since we want to write all 1s to the result element when 'eq' holds and 0s
> otherwise.
>
> The catch is that the floating-point comparisons can only be expanded to the
> RTL codes when -funsafe-math-optimizations is given and they must continue
> to use the UNSPECS otherwise.
> For this I've created a define_expand that generates
> the correct RTL depending on -funsafe-math-optimizations and two
> define_insns to match the result: one using the RTL codes and one using
> UNSPECs.
>
> I've also compressed some of the patterns together using iterators for the
> [eq gt ge le lt] cases.
> NOTE: for le and lt before this patch we would never generate 'vclt.
> dm, dn, dp' instructions, only 'vclt. dm, dn, #0'.
> With this patch we can now generate 'vclt. dm, dn, dp' assembly.
> According to the ARM ARM this is just a pseudo-instruction that maps to
> vcgt with the operands swapped around.
> I've confirmed that gas supports this code.
>
> The vcage and vcagt patterns are rewritten to use the form:
> (neg
>   (
>     (abs (...))
>     (abs (...))))
>
> and condensed together using iterators as well.
>
> Bootstrapped and tested on arm-none-linux-gnueabihf, made sure that the
> advanced-simd-intrinsics testsuite is passing
> (it did catch some bugs during development of this patch) and tried out
> other NEON intrinsics codebases.
>
> The test gcc.target/arm/neon/pr51534.c now generates 'vclt. dn, dm,
> #0' instructions where appropriate instead of the previous vmov of #0 into a
> temp and then a 'vcgt. dn, temp, dm'.
> I think that is correct behaviour since the test was trying to make sure
> that we didn't generate a .u-typed comparison with #0, which is what
> the PR was talking about (from what I can gather).
>
> What do people think of this approach?
> I'm proposing this for next stage1, of course.
>

This is OK - thanks.

Ramana

> Thanks,
> Kyrill
>
>
> 2015-02-04 Kyrylo Tkachov
>
> * config/arm/iterators.md (GTGE, GTUGEU, COMPARISONS): New code
> iterators.
> (cmp_op, cmp_type): New code attributes.
> (NEON_VCMP, NEON_VACMP): New int iterators.
> (cmp_op_unsp): New int attribute.
> * config/arm/neon.md (neon_vc): New define_expand.
> (neon_vceq): Delete.
> (neon_vc_insn): New pattern.
> (neon_vc_insn_unspec): Likewise.
> (neon_vcgeu): Delete.
> (neon_vcle): Likewise.
> (neon_vclt): Likewise.
> (neon_vcage): Likewise.
> (neon_vcagt): Likewise.
> (neon_vca): New define_expand.
> (neon_vca_insn): New pattern.
> (neon_vca_insn_unspec): Likewise.
>
> 2015-02-04 Kyrylo Tkachov
>
> * gcc.target/arm/neon/pr51534.c: Update vcg* scan-assembly patterns
> to look for vcl* where appropriate.
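
For readers following the thread, the per-lane semantics described in the quoted message can be sketched in plain C. This is only an illustrative model under stated assumptions, not the neon.md patterns themselves: the helper names (lane_vceq, lane_vcgt, lane_vclt, lane_vcage, check_lane_models) are hypothetical, a lane is modelled as a 32-bit value, and the absolute value is computed by hand rather than by any NEON operation. It shows why (neg (eq ...)) gives an all-ones lane, how a register-form vclt can be read as vcgt with the operands swapped, and the (neg (ge (abs ...) (abs ...))) shape used for vcage.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical one-lane models; names are illustrative, not GCC's. */

/* (neg (eq a b)): the comparison yields 1 or 0, and negating 1 in
   two's complement gives a lane with every bit set. */
static uint32_t lane_vceq(int32_t a, int32_t b)
{
    return (uint32_t)0 - (uint32_t)(a == b);
}

/* (neg (gt a b)) for a signed lane. */
static uint32_t lane_vcgt(int32_t a, int32_t b)
{
    return (uint32_t)0 - (uint32_t)(a > b);
}

/* Register-form vclt read as vcgt with the operands swapped:
   a < b holds exactly when b > a. */
static uint32_t lane_vclt(int32_t a, int32_t b)
{
    return lane_vcgt(b, a);
}

/* Hand-written absolute value so the sketch needs no libm. */
static float lane_abs(float x)
{
    return x < 0.0f ? -x : x;
}

/* (neg (ge (abs a) (abs b))) for the vcage form. */
static uint32_t lane_vcage(float a, float b)
{
    return (uint32_t)0 - (uint32_t)(lane_abs(a) >= lane_abs(b));
}

/* Small self-check of the models above. */
static void check_lane_models(void)
{
    assert(lane_vceq(3, 3) == 0xFFFFFFFFu);
    assert(lane_vceq(3, 4) == 0u);
    assert(lane_vclt(-1, 2) == 0xFFFFFFFFu);
    assert(lane_vclt(2, -1) == 0u);
    assert(lane_vcage(-5.0f, 3.0f) == 0xFFFFFFFFu);
    assert(lane_vcage(1.0f, -2.0f) == 0u);
}
```

Calling check_lane_models() exercises each helper once per outcome. The model says nothing about the -funsafe-math-optimizations condition mentioned in the quoted message, which governs when GCC may use the RTL codes for the floating-point comparisons.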