public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/54013] New: Loop with control flow not vectorized
@ 2012-07-18 11:09 rguenth at gcc dot gnu.org
  2015-06-15 16:42 ` [Bug tree-optimization/54013] " alalaw01 at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-07-18 11:09 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013

             Bug #: 54013
           Summary: Loop with control flow not vectorized
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: rguenth@gcc.gnu.org
            Blocks: 53947


ICC manages to vectorize the following loop, which occurs in the Polyhedron
benchmark mp_prop_design.

int foo (float x, float *tab)
{
  int i;
  for (i = 2; i < 45; ++i)
    if (x < tab[i])
      break;
  return i - 1;
}



* [Bug tree-optimization/54013] Loop with control flow not vectorized
From: alalaw01 at gcc dot gnu.org @ 2015-06-15 16:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013

alalaw01 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-06-15
                 CC|                            |alalaw01 at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from alalaw01 at gcc dot gnu.org ---
Indeed it does (confirmed).

So there are a few tricks here, but they are not Intel-specific, and do not
even appear to require new tree codes. The loop body can be vectorized by
computing the (x < tab[i]) predicate across the vector, and then using a
reduction opcode (a bitwise-or reduction would be most natural, but others
work) to convert it to a scalar that controls the jump out of the loop, i.e.
we exit if *any* of the lanes in the vector would exit:

int foo (float x, float *tab)
{
  int i;
  for (i = 2; i < 45; i += 4)
    {
      v4sf v_tab = ...load from tab...
      /* Unsigned, so that an all-ones (-1) lane is the maximum value.  */
      unsigned v4si v_exit_cond = vec_cond_expr ({x,x,x,x} < v_tab, -1, 0);
      if (reduc_max_expr (v_exit_cond)) break;
    }
  ...
}
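
As a rough illustration of the control flow above (not what GCC would emit),
the following portable C sketch emulates each 4-lane vector iteration with an
inner loop and an OR reduction. The names foo_scalar, foo_blocked and LANES
are hypothetical, and the sketch assumes tab has at least 45 readable
elements. The exit index is recovered here by simply rescanning the block in
a scalar epilogue, which is only one possible choice.

```c
#define LANES 4 /* assumed vector width, matching the v4sf pseudocode */

/* Scalar reference: the loop from the original report.  */
int foo_scalar (float x, const float *tab)
{
  int i;
  for (i = 2; i < 45; ++i)
    if (x < tab[i])
      break;
  return i - 1;
}

/* Blocked version: each outer iteration stands in for one vector
   iteration.  Assumes tab[2..44] are all readable.  */
int foo_blocked (float x, const float *tab)
{
  int i = 2;

  /* "Vector" body: runs only while a full block of LANES elements
     remains inside the iteration space.  */
  for (; i + LANES <= 45; i += LANES)
    {
      int any = 0;
      for (int l = 0; l < LANES; ++l)
        any |= (x < tab[i + l]) ? -1 : 0; /* per-lane mask, OR-reduced */
      if (any)
        break; /* some lane would have exited */
    }

  /* Scalar epilogue: rescans the block that triggered the exit, or
     handles the leftover iterations that do not fill a block.  */
  for (; i < 45; ++i)
    if (x < tab[i])
      break;
  return i - 1;
}
```

With tab[k] = k for all k, foo_blocked (10.5f, tab) and foo_scalar (10.5f, tab)
both stop at i = 11 and return 10.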

The epilogue must then work out the value of i at exit (possibly a separate
epilogue for the "break" vs the other exit). I see two schemes:

(1) use vec_pack_trunc_expr, or similar, to narrow v_exit_cond down to a
scalar, where we can find the first set bit, and use this as an index to add to
the value still in i.

(2) compute a vector of the values i would have had if each element had been
the one that exited:

v4si v_i_on_exit = vec_cond_expr (v_exit_cond,
    {i, i+1, i+2, i+3}, /* Maybe available as induction variable?  */
    {INT_MAX, INT_MAX, INT_MAX, INT_MAX});

and then take a reduc_min_expr to look for the *first* value of i that exits.
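
The two epilogue schemes can be compared with a small scalar emulation. This
is only a sketch: the function names are hypothetical, the lane mask is
modelled as an array of four ints holding 0 or -1, and the first-set-bit
search is written as a plain loop rather than any particular target
instruction.

```c
#include <assert.h>
#include <limits.h>

/* Scheme (1): narrow the four lane masks to a 4-bit scalar.  */
static unsigned pack_mask (const int mask[4])
{
  unsigned bits = 0;
  for (int l = 0; l < 4; ++l)
    if (mask[l])
      bits |= 1u << l;
  return bits;
}

/* Index of the lowest set bit; bits must be nonzero.  */
static int first_set_lane (unsigned bits)
{
  int lane = 0;
  assert (bits != 0);
  while ((bits & 1u) == 0)
    {
      bits >>= 1;
      ++lane;
    }
  return lane;
}

/* Scheme (1): add the index of the first exiting lane to i.
   At least one lane of mask must be set.  */
int exit_index_from_bits (int i, const int mask[4])
{
  return i + first_set_lane (pack_mask (mask));
}

/* Scheme (2): select i + lane for exiting lanes and INT_MAX for the
   others, then take the minimum.  Returns INT_MAX if no lane exits.  */
int exit_index_by_min (int i, const int mask[4])
{
  int min = INT_MAX;
  for (int l = 0; l < 4; ++l)
    {
      int candidate = mask[l] ? i + l : INT_MAX; /* vec_cond_expr */
      if (candidate < min)                       /* reduc_min_expr */
        min = candidate;
    }
  return min;
}
```

Both functions agree whenever at least one lane is set; for i = 6 and a mask
of {0, 0, -1, -1} each yields 8.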

(There is one more issue: we need to speculate the reads of
tab[i+1...i+3], as the vector load will probably read all the lanes before we
know whether earlier iterations should have exited. So we'd need some kind of
check against that, or a guarantee that the reads are safe, e.g. if tab[] were
a global with known bounds. Similar, complicated conditions apply to
everything else in the loop, too!)



* [Bug tree-optimization/54013] Loop with control flow not vectorized
From: pinskia at gcc dot gnu.org @ 2024-06-05 17:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think for SVE(2?) this could be vectorized using first-faulting loads.


* [Bug tree-optimization/54013] Loop with control flow not vectorized
From: tnfchris at gcc dot gnu.org @ 2024-06-05 17:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013

Tamar Christina <tnfchris at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |115130

--- Comment #4 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
Since there's only one source here, alignment peeling should be enough to
vectorize it.

Our pending patches should support it. I will add it to the verification list.
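
For readers unfamiliar with alignment peeling, here is a rough sketch of the
idea, not the vectorizer's actual implementation. A few scalar iterations run
first until &tab[i] is aligned to the vector size; from then on each vector
load covers one aligned 16-byte block, which cannot straddle a page boundary,
so a block whose first element is readable cannot fault. The names VEC_BYTES,
peel_count and foo_peeled are hypothetical, and the sketch assumes tab is at
least float-aligned. Because plain C may not read past the end of the array,
the emulated lanes are clamped to i + l < 45; a real vector load would read
the whole aligned block.

```c
#include <stdint.h>

#define VEC_BYTES 16 /* assumed vector size: four floats */
#define LANES 4

/* Number of scalar iterations to peel so that &tab[start + peel] is
   VEC_BYTES-aligned.  Assumes tab is aligned to sizeof (float).  */
int peel_count (const float *tab, int start)
{
  uintptr_t misalign = (uintptr_t) (tab + start) % VEC_BYTES;
  if (misalign == 0)
    return 0;
  return (int) ((VEC_BYTES - misalign) / sizeof (float));
}

int foo_peeled (float x, const float *tab)
{
  int i = 2;
  int peel_end = i + peel_count (tab, i);
  if (peel_end > 45)
    peel_end = 45;

  /* Peeled scalar iterations, up to the first aligned element.  */
  for (; i < peel_end; ++i)
    if (x < tab[i])
      return i - 1;

  /* Aligned blocks; each outer iteration stands in for one vector load.  */
  while (i < 45)
    {
      int any = 0;
      for (int l = 0; l < LANES && i + l < 45; ++l)
        any |= (x < tab[i + l]);
      if (any)
        break;
      i += LANES;
    }
  if (i > 45)
    i = 45; /* the last block may step past the loop bound */

  /* Scalar epilogue: locate the exact exit within the block.  */
  for (; i < 45; ++i)
    if (x < tab[i])
      break;
  return i - 1;
}
```

Whatever the alignment of tab, the result should match the scalar loop; with
tab[k] = k, foo_peeled (10.5f, tab) returns 10.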


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115130
[Bug 115130] [meta-bug] early break vectorization

