[Bug testsuite/63175] [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1

public inbox for gcc-bugs@sourceware.org
From: "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug testsuite/63175] [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
Date: Tue, 03 Mar 2015 09:21:00 -0000
Message-ID: <bug-63175-4-gY2YwqMm84@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-63175-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63175

--- Comment #25 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 3 Mar 2015, msebor at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63175
> 
> --- Comment #24 from Martin Sebor <msebor at gcc dot gnu.org> ---
> (In reply to Richard Biener from comment #16)
> > Why is the loop bound to i != 16 / sizeof *s?
> 
> The upper bound is intended to make the copied sequence fit into one vector
> register, irrespective of the size of the array element.
> 
> The vector load and store instructions tolerate unaligned accesses and there
> are permute instructions that combine the contents of two vector registers into
> a single one to compensate for unaligned reads or writes.  I'm not sure it
> makes sense to expect unaligned copies involving a single vector register's
> worth of data to be vectorized (as done in my proposed tests for char and
> short), but I would expect larger unaligned copies (i.e., multiples of 16
> bytes) to benefit from it.  In my experiments I've seen no evidence of GCC
> attempting to vectorize such copies but I need to do some more research to
> understand why.
> 
> (In reply to comment #23)
> 
> The test uses -maltivec and that's what I've been using as well.  But I 
> see in the Power ISA book that lxvw4x and stxvw4x are classified as VSX 
> instructions, so perhaps they shouldn't be emitted without -mvsx.  
> Although 5.0 doesn't emit them even with -mvsx.

5.0 doesn't consider stxvw4x without -mvsx.  With -mvsx it does, but
then the vectorizer cost model says the vectorization is not profitable:

t.c:10:10: note: Cost model analysis:
  Vector inside of basic block cost: 29
  Vector prologue cost: 0
  Vector epilogue cost: 0
  Scalar cost of basic block: 8
t.c:10:10: note: not vectorized: vectorization is not profitable.

I'll see if that cost calculation is sensible.  We have 2 aligned
vector loads (cost 2), one permute (cost 3), one vector stmt (cost 1),
and one unaligned store (unknown misalignment), which hits

rs6000_builtin_vectorization_cost (type_of_cost=unaligned_store, 
    vectype=<vector_type 0x7ffff6a39888>, misalign=-1)
    at /space/rguenther/src/svn/trunk2/gcc/config/rs6000/rs6000.c:4376
4376      switch (type_of_cost)
...
4455                        case -1:
4456                          /* Unknown misalignment.  */
4457                        case 4:
4458                        case 12:
4459                          /* Word aligned.  */
4460                          return 23;

A cost of 23 (!?), even for a misalign of 4?
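
As a sanity check on the dump, the breakdown above can be summed in a
few lines of C.  This is only a sketch: the struct, the function names
and the simplified profitability test are illustrative assumptions, not
GCC's actual code.  The per-statement numbers are the ones quoted in
this comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One entry per kind of vector statement in the basic block.  */
struct stmt_cost
{
  const char *kind;
  int count;
  int cost_each;
};

/* Counts and costs as quoted in this comment.  */
const struct stmt_cost vector_stmts[] = {
  { "aligned vector load", 2, 1 },
  { "permute", 1, 3 },
  { "vector stmt", 1, 1 },
  { "unaligned store, misalign -1", 1, 23 },
};

/* Sum count * cost_each over all entries.  */
int
total_cost (const struct stmt_cost *stmts, size_t n)
{
  int sum = 0;
  assert (stmts != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    sum += stmts[i].count * stmts[i].cost_each;
  return sum;
}

/* "Vector inside of basic block cost" for the statements above.  */
int
vector_inside_cost (void)
{
  return total_cost (vector_stmts,
                     sizeof vector_stmts / sizeof vector_stmts[0]);
}

/* Assumed, simplified check: vectorize only if the total vector cost
   is strictly below the scalar cost.  */
bool
looks_profitable (int inside, int prologue, int epilogue, int scalar)
{
  return inside + prologue + epilogue < scalar;
}
```

With the numbers from the dump, vector_inside_cost () gives
2 + 3 + 1 + 23 = 29, and looks_profitable (29, 0, 0, 8) is false, which
matches the "not profitable" note.  The store alone accounts for 23 of
the 29.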

Well - there you have it.  For the testcase

#define T int
extern const T a [];
T b[8];

void g (void)
{
  const T *p = a + 1;
  T *q = b + 1;

  *q++ = *p++;
  *q++ = *p++;
  *q++ = *p++;
  *q++ = *p++;
}

Possibly 4.8 had the cost model turned off for the testsuite, or it
had bugs and misrepresented the case.  But a cost of 23 clearly looks
excessive to me here: the scalar store of one of the 4 elements has
cost 1, so the unaligned vector store is nearly 6 times more expensive
than doing the 4 scalar stores.  Nobody would design an instruction
with such a severe penalty.
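
The "nearly 6 times" figure is just 23 / (4 * 1).  A tiny sketch of
that comparison, with hypothetical function names and the costs quoted
above as inputs:

```c
/* Cost of storing n elements one at a time.  */
int
scalar_store_cost (int n, int cost_each)
{
  return n * cost_each;
}

/* How many times more expensive one vector store is than the
   n scalar stores it replaces.  */
double
vector_to_scalar_ratio (int vector_cost, int n, int cost_each)
{
  return (double) vector_cost / scalar_store_cost (n, cost_each);
}
```

vector_to_scalar_ratio (23, 4, 1) evaluates to 5.75.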

With -fvect-cost-model=unlimited GCC 5 produces

.L.g:
        addis 9,2,.LC0@toc@ha           # gpr load fusion, type long
        ld 9,.LC0@toc@l(9)
        addis 8,2,.LANCHOR0@toc@ha
        addi 8,8,.LANCHOR0@toc@l
        addi 10,9,12
        neg 7,9
        rldicr 10,10,0,59
        rldicr 9,9,0,59
        lvsr 13,0,7
        lxvw4x 33,0,9
        lxvw4x 32,0,10
        li 9,4
        vperm 0,1,0,13
        stxvw4x 32,8,9
        blr
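
The lvsr / lxvw4x / vperm part of this sequence appears to be the usual
realignment idiom: load the two aligned 16-byte blocks that straddle
the unaligned source address, then select the 16 bytes that are
actually wanted.  A portable C sketch of the idea follows.  It is not
what GCC emits: realign_load16 is a hypothetical name, the selection is
done with memcpy instead of a permute, and it assumes that both aligned
blocks around p are readable and that pointers convert to and from
uintptr_t in the obvious way.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: emulate an unaligned 16-byte load with two
   aligned 16-byte loads plus a byte select.  Assumes both aligned
   blocks that overlap [p, p + 16) are readable.  */
void
realign_load16 (const unsigned char *p, unsigned char out[16])
{
  uintptr_t addr = (uintptr_t) p;
  uintptr_t lo = addr & ~(uintptr_t) 15;         /* block holding p[0] */
  uintptr_t hi = (addr + 15) & ~(uintptr_t) 15;  /* block holding p[15] */
  unsigned char both[32];

  memcpy (both, (const unsigned char *) lo, 16);       /* first aligned load */
  memcpy (both + 16, (const unsigned char *) hi, 16);  /* second aligned load */

  /* The "permute": take 16 bytes starting at the misalignment offset.  */
  memcpy (out, both + (addr & 15), 16);
}
```

When p is already 16-byte aligned, lo and hi name the same block and
the offset is 0, so the helper degenerates to a plain aligned load.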

Ah, GCC 4.8 had the cost model disabled by default (at least for
basic-block vectorization), so you need to enable it via
-fvect-cost-model, at which point it rejects vectorizing the above
with the same reasoning.

So there is no regression, and if vectorization is in fact profitable
then the backend needs to adjust its cost model.
