Subject: Re: [PATCH][1/2] Fix PR68553
To: Richard Biener, gcc-patches@gcc.gnu.org
From: Alan Lawrence
Message-ID: <5661B211.8000402@arm.com>
Date: Fri, 04 Dec 2015 15:32:00 -0000

On 27/11/15 08:30, Richard Biener wrote:
>
> This is part 1 of a fix for PR68553 which shows that some targets
> cannot can_vec_perm_p on an identity permutation.  I chose to fix
> this in the vectorizer by detecting the identity itself but with
> the current structure of vect_transform_slp_perm_load this is
> somewhat awkward.
> Thus the following no-op patch simplifies it
> greatly (from the times it was restricted to do interleaving-kind
> of permutes).  It turned out to not be 100% no-op as we now can
> handle non-adjacent source operands so I split it out from the
> actual fix.
>
> The two adjusted testcases no longer fail to vectorize because
> of "need three vectors" but unadjusted would fail because there
> are simply not enough scalar iterations in the loop.  I adjusted
> that and now we vectorize it just fine (running into PR68559
> which I filed).
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
>
> Richard.
>
> 2015-11-27  Richard Biener
>
>     PR tree-optimization/68553
>     * tree-vect-slp.c (vect_get_mask_element): Remove.
>     (vect_transform_slp_perm_load): Implement in a simpler way.
>
>     * gcc.dg/vect/pr45752.c: Adjust.
>     * gcc.dg/vect/slp-perm-4.c: Likewise.

On aarch64 and ARM targets, this causes

PASS->FAIL: gcc.dg/vect/O3-pr36098.c scan-tree-dump-times vect "vectorizing stmts using SLP" 0

That is, we now vectorize using SLP, when previously we did not.

On aarch64 (and I expect ARM too), previously we used a VEC_LOAD_LANES, without
unrolling, but now we unroll * 4, and vectorize using 3 loads and permutes:

../gcc/gcc/testsuite/gcc.dg/vect/O3-pr36098.c:15:2: note: add new stmt: vect__31.15_94 = VEC_PERM_EXPR ;
../gcc/gcc/testsuite/gcc.dg/vect/O3-pr36098.c:15:2: note: add new stmt: vect__31.16_95 = VEC_PERM_EXPR ;
../gcc/gcc/testsuite/gcc.dg/vect/O3-pr36098.c:15:2: note: add new stmt: vect__31.17_96 = VEC_PERM_EXPR

which *is* a valid vectorization strategy...

--Alan
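
For reference, a rough scalar model of what a two-input permute such as
VEC_PERM_EXPR computes, plus the kind of identity check the quoted message
mentions. This is only a sketch: the names model_vec_perm and
model_perm_is_identity are made up for illustration and are not functions in
tree-vect-slp.c, and the element type and lane count are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model (not GCC code): lane I of OUT is lane SEL[I] of the
   concatenation V0 || V1, where each input has NELT lanes.  Selector
   values are taken modulo 2 * NELT.  */
static void
model_vec_perm (const int *v0, const int *v1, const unsigned *sel,
                int *out, size_t nelt)
{
  for (size_t i = 0; i < nelt; i++)
    {
      size_t idx = sel[i] % (2 * nelt);
      out[i] = idx < nelt ? v0[idx] : v1[idx - nelt];
    }
}

/* Hypothetical check (not GCC code): true if SEL simply returns the
   first input unchanged, so no permute instruction is needed at all.  */
static bool
model_perm_is_identity (const unsigned *sel, size_t nelt)
{
  for (size_t i = 0; i < nelt; i++)
    if (sel[i] != i)
      return false;
  return true;
}
```

With a selector like {0, 5, 2, 7} on 4-lane inputs, lanes 0 and 2 come from
the first vector and lanes 1 and 3 from the second; a selector of
{0, 1, 2, 3} is the identity case and could be skipped without asking the
target whether it supports the permute.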