From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-474536-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 6654 invoked by alias); 23 Jan 2015 12:10:10 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 6499 invoked by uid 48); 23 Jan 2015 12:09:59 -0000
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/64745] New: Generic vectorization missed opportunities
Date: Fri, 23 Jan 2015 12:10:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 5.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status keywords bug_severity priority component assigned_to reporter cf_gcctarget
Message-ID: <bug-64745-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-01/txt/msg02530.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64745

            Bug ID: 64745
           Summary: Generic vectorization missed opportunities
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
            Target: x86_64-*-*, i?86-*-*

unsigned short a[2], b[2];
void foo (void)
{
  int i;
  for (i = 0; i < 2; ++i)
    a[i] = b[i];
}

unsigned char x[4], y[4];
void bar (void)
{
  int i;
  for (i = 0; i < 4; ++i)
    x[i] = y[i];
}

Vectorizing this on i?86 (without SSE) fails for the first testcase at -O3
because we unroll the loop and SLP refuses to handle the "unaligned" load.
For the second testcase we loop-vectorize it but apply versioning for alignment.

The alignment checks in the vectorizer do not account for non-vector modes.

If we fix that, the first loop still fails to SLP vectorize because of a bogus
cost calculation:

t.c:6:13: note: Cost model analysis:
  Vector inside of basic block cost: 4
  Vector prologue cost: 0
  Vector epilogue cost: 0
  Scalar cost of basic block: 4
t.c:6:13: note: not vectorized: vectorization is not profitable.

because of the unaligned load/store cost:

t.c:6:13: note: vect_model_load_cost: unaligned supported by hardware.
t.c:6:13: note: vect_model_load_cost: inside_cost = 2, prologue_cost = 0 .
...
t.c:6:13: note: vect_model_store_cost: unaligned supported by hardware.
t.c:6:13: note: vect_model_store_cost: inside_cost = 2, prologue_cost = 0 .

That's a backend bug: ix86_builtin_vectorization_cost doesn't consider
vector types with !VECTOR_MODE_P modes.  OTOH, for SLP vectorization, if
the costs are equal we can assume fewer stmts will be used, so we could
just vectorize anyway in that case.
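
To make the tie case concrete, here is a minimal sketch of the two possible
profitability decisions, using the numbers from the dump above.  This is a
simplified model, not the vectorizer's actual code; struct bb_cost and both
function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the costs printed in the dump.  */
struct bb_cost
{
  int vec_inside;
  int vec_prologue;
  int vec_epilogue;
  int scalar;
};

static int
total_vector_cost (const struct bb_cost *c)
{
  assert (c != NULL);
  return c->vec_inside + c->vec_prologue + c->vec_epilogue;
}

/* Strict rule: vector code must be cheaper than scalar code.  */
bool
profitable_strict (const struct bb_cost *c)
{
  return total_vector_cost (c) < c->scalar;
}

/* Relaxed rule for SLP: on a tie, assume fewer stmts and vectorize.  */
bool
profitable_allow_tie (const struct bb_cost *c)
{
  return total_vector_cost (c) <= c->scalar;
}
```

With the costs 4/0/0 against a scalar cost of 4, the strict rule rejects the
basic block while the relaxed rule accepts it.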


The real issue of course is that generic vectorization is not attempted
if a vector ISA is available, so we fail to vectorize the above cases,
where SLP vectorization would take care of combining small loads and stores.
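
For illustration, combining the small loads and stores would amount to
something like the hand-written C below.  This is a sketch, not compiler
output: foo_word and bar_word are hypothetical names, and it assumes a
16-bit unsigned short (checked by the static assertions).

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

unsigned short a[2], b[2];
unsigned char x[4], y[4];

static_assert (sizeof a == sizeof (uint32_t), "a[] must fill one 32-bit word");
static_assert (sizeof x == sizeof (uint32_t), "x[] must fill one 32-bit word");

/* Same effect as foo (): one 32-bit load and one 32-bit store.  */
void
foo_word (void)
{
  uint32_t tmp;
  memcpy (&tmp, b, sizeof tmp);
  memcpy (a, &tmp, sizeof tmp);
}

/* Same effect as bar (): four byte copies become one 32-bit copy.  */
void
bar_word (void)
{
  uint32_t tmp;
  memcpy (&tmp, y, sizeof tmp);
  memcpy (x, &tmp, sizeof tmp);
}
```

memcpy through a uint32_t temporary is used here to keep the sketch free of
aliasing and alignment assumptions; compilers commonly turn each pair of
calls into a single word-sized move.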

So we'd need to support HImode, SImode (and DImode on x86_64) vectorization
sizes, which probably comes at too big a cost to consider, though
basic-block vectorization (knowing the size of the loads) could try anyway.
But that needs some re-org of the analysis.
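
A rough sketch of the size-to-mode choice that basic-block vectorization
could make once it knows the size of a load/store group.  The enum and
pick_word_mode are hypothetical and only model the idea; they are not
existing GCC interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the integer modes named above.  */
enum word_mode_kind { MODE_NONE, MODE_HI, MODE_SI, MODE_DI };

/* Pick an integer mode covering NELEMS elements of ELEM_BYTES bytes each.
   HAVE_64BIT_REGS models "DImode on x86_64" only.  */
enum word_mode_kind
pick_word_mode (size_t elem_bytes, size_t nelems, bool have_64bit_regs)
{
  assert (elem_bytes != 0);
  switch (elem_bytes * nelems)
    {
    case 2:
      return MODE_HI;
    case 4:
      return MODE_SI;
    case 8:
      return have_64bit_regs ? MODE_DI : MODE_NONE;
    default:
      return MODE_NONE;
    }
}
```

Under this model both testcases (2 x 2 bytes and 4 x 1 byte) map to the
SImode case.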