From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 31800 invoked by alias); 2 Mar 2015 14:24:39 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 31681 invoked by uid 48); 2 Mar 2015 14:24:36 -0000
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug testsuite/63175] [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
Date: Mon, 02 Mar 2015 14:24:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: testsuite
X-Bugzilla-Version: 4.9.1
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 4.9.3
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-03/txt/msg00139.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63175

--- Comment #16 from Richard Biener ---
(In reply to Richard Biener from comment #15)
> Btw, first of all unaligned stores are not supported according to the targets
> vectorization hook, thus you'd need to peel the loop to make the store
> aligned which for some reason doesn't happen.

Quite obvious - the loop iterates 8 times but the vectorization factor is 8
as well, so if we peel off an iteration to align the destination the
vectorized loop will never be entered. Why is the loop bound to
i != 16 / sizeof *s?

> But when peeled you certainly will see
> byte/short/word stores at least.

Like when I increase the iteration count I get for copy_short_0_1:

.L.copy_Type_0_1:
        addis 6,2,.LANCHOR0@toc@ha
        addis 7,2,.LANCHOR1@toc@ha
        addi 6,6,.LANCHOR0@toc@l
        addi 7,7,.LANCHOR1@toc@l
        li 8,7
        addi 9,6,2
        mr 10,7
        mtctr 8
        .p2align 4,,15
.L2:
        addi 10,10,2
        lhz 8,-2(10)
        addi 9,9,2
        sth 8,-2(9)
        bdnz .L2
        addi 8,7,14
        addi 7,7,29
        neg 5,8
        lvx 1,0,8
        lvx 0,0,7
        li 7,16
        lvsr 13,0,5
        addi 8,10,14
        addi 9,9,14
        addi 10,10,16
        vperm 0,1,0,13
        stvx 0,6,7
        .p2align 4,,15
.L3:
        lhzu 7,2(8)
        cmpld 7,10,8
        sthu 7,2(9)
        bne+ 7,.L3
        blr

the cost model should probably reject this, but it does not:

t.c:36:1: note: Cost model analysis:
  Vector inside of loop cost: 3
  Vector prologue cost: 17
  Vector epilogue cost: 2
  Scalar iteration cost: 2
  Scalar outside cost: 0
  Vector outside cost: 19
  prologue iterations: 7
  epilogue iterations: 1
  Calculated minimum iters for profitability: 10
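
For reference, the two pieces of arithmetic above can be sketched in C. This is a rough illustration only, not the vectorizer's actual code: the function names and parameter names are hypothetical, and the break-even formula is an assumption about how the figures in the dump combine. It does reproduce the reported minimum of 10 from those figures, and it shows why an 8-iteration loop with a vectorization factor of 8 never enters the vector body once one iteration is peeled.

```c
/* Hypothetical helper: how many times the vector body runs once
   `peeled` scalar iterations have been split off for alignment. */
int vector_iterations(int niters, int vf, int peeled)
{
    if (peeled > niters)
        return 0;
    return (niters - peeled) / vf;
}

/* Hypothetical helper: estimated smallest scalar iteration count at
   which the vectorized loop beats the scalar one.  The formula is an
   assumed break-even computation, not copied from GCC.  Returns -1 if
   one vector iteration is no cheaper than `vf` scalar iterations. */
int min_profitable_iters(int vec_inside, int vec_outside,
                         int scalar_iter, int scalar_outside,
                         int vf, int peel_prologue, int peel_epilogue)
{
    /* Cost saved each time the vector body replaces vf scalar iterations. */
    int gain = scalar_iter * vf - vec_inside;
    if (gain <= 0)
        return -1;

    /* Extra one-time cost of the vector version, scaled by vf. */
    int overhead = (vec_outside - scalar_outside) * vf;

    int n = (overhead
             - vec_inside * peel_prologue
             - vec_inside * peel_epilogue) / gain;

    /* Integer division rounds down; bump if the scalar loop is still
       no more expensive at n iterations. */
    if (scalar_iter * vf * n <= vec_inside * n + overhead)
        n++;

    /* Fewer than vf iterations cannot fill a single vector iteration. */
    if (n < vf)
        n = vf;
    return n;
}
```

With the numbers from the dump, min_profitable_iters(3, 19, 2, 0, 8, 7, 1) gives 10, while vector_iterations(8, 8, 1) gives 0 for the original loop bound.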