public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int"
@ 2011-03-09 18:49 vincenzo.innocente at cern dot ch
  2011-03-10  0:21 ` [Bug tree-optimization/48052] " paolo.carlini at oracle dot com
                   ` (17 more replies)
  0 siblings, 18 replies; 19+ messages in thread
From: vincenzo.innocente at cern dot ch @ 2011-03-09 18:49 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

           Summary: loop not vectorized if index is "unsigned int"
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: vincenzo.innocente@cern.ch


Is there any reason why "unsigned int" is not suitable as a loop index for
auto-vectorization?
Example:

cat simpleLoop.cc
#include<cstddef>

void loop1( double const * __restrict__ x_in,  double * __restrict__ x_out,
double const * __restrict__ c, int N) { 
   for(int i=0; i!=N; ++i)
       x_out[i] = c[i]*x_in[i];
}



void loop2( double const * __restrict__ x_in,  double * __restrict__ x_out,
double const * __restrict__ c, unsigned int N) {
   for(unsigned int i=0; i!=N; ++i)
       x_out[i] = c[i]*x_in[i];
}

void loop21( double const * __restrict__ x_in,  double * __restrict__ x_out,
double const * __restrict__ c, size_t N) {
   for(size_t i=0; i!=N; ++i)
       x_out[i] = c[i]*x_in[i];
}

void loop21( double const * __restrict__ x_in,  double * __restrict__ x_out,
double const * __restrict__ c, unsigned long long N) {
   for(unsigned long long  i=0; i!=N; ++i)
       x_out[i] = c[i]*x_in[i];
}


void loop3( double const * __restrict__ x_in,  double * __restrict__ x_out,
double const * __restrict__ c, size_t N) {
   double const * end = x_in+N;
   for(; x_in!=end; ++x_in, ++x_out, ++c)
       (*x_out) = (*c) * (*x_in);
}

result:

g++ -v -O2 -ftree-vectorize -ftree-vectorizer-verbose=2 -c simpleLoop.cc
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure --enable-gold=yes --enable-lto --with-fpmath=avx
Thread model: posix
gcc version 4.6.0 20110205 (experimental) (GCC) 
COLLECT_GCC_OPTIONS='-v' '-O2' '-ftree-vectorize' '-ftree-vectorizer-verbose=2'
'-c' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
 /usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/cc1plus -quiet -v
-D_GNU_SOURCE simpleLoop.cc -quiet -dumpbase simpleLoop.cc -mtune=generic
-march=x86-64 -auxbase simpleLoop -O2 -version -ftree-vectorize
-ftree-vectorizer-verbose=2 -o /tmp/innocent/ccUB9xBg.s
GNU C++ (GCC) version 4.6.0 20110205 (experimental) (x86_64-unknown-linux-gnu)
    compiled by GNU C version 4.6.0 20110205 (experimental), GMP version 4.3.2,
MPFR version 2.4.2, MPC version 0.8.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
ignoring nonexistent directory
"/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../../../x86_64-unknown-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:

/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../../../include/c++/4.6.0

/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../../../include/c++/4.6.0/x86_64-unknown-linux-gnu

/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../../../include/c++/4.6.0/backward
 /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.6.0/include
 /usr/local/include
 /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.6.0/include-fixed
 /usr/include
End of search list.
GNU C++ (GCC) version 4.6.0 20110205 (experimental) (x86_64-unknown-linux-gnu)
    compiled by GNU C version 4.6.0 20110205 (experimental), GMP version 4.3.2,
MPFR version 2.4.2, MPC version 0.8.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 0d52c927b640361d99f7371685058a2b

simpleLoop.cc:4: note: LOOP VECTORIZED.
simpleLoop.cc:3: note: vectorized 1 loops in function.

simpleLoop.cc:11: note: not vectorized: data ref analysis failed D.2386_13 =
*D.2385_12;

simpleLoop.cc:10: note: vectorized 0 loops in function.

simpleLoop.cc:16: note: LOOP VECTORIZED.
simpleLoop.cc:15: note: vectorized 1 loops in function.

simpleLoop.cc:21: note: LOOP VECTORIZED.
simpleLoop.cc:20: note: vectorized 1 loops in function.

simpleLoop.cc:28: note: LOOP VECTORIZED.
simpleLoop.cc:26: note: vectorized 1 loops in function.



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
@ 2011-03-10  0:21 ` paolo.carlini at oracle dot com
  2011-03-10  9:46 ` rguenth at gcc dot gnu.org
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: paolo.carlini at oracle dot com @ 2011-03-10  0:21 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

Paolo Carlini <paolo.carlini at oracle dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #1 from Paolo Carlini <paolo.carlini at oracle dot com> 2011-03-10 00:21:19 UTC ---
Richard, is this issue known? Seems indeed rather weird to me.



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
  2011-03-10  0:21 ` [Bug tree-optimization/48052] " paolo.carlini at oracle dot com
@ 2011-03-10  9:46 ` rguenth at gcc dot gnu.org
  2011-03-10 10:22 ` paolo.carlini at oracle dot com
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: rguenth at gcc dot gnu.org @ 2011-03-10  9:46 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2011.03.10 09:46:16
                 CC|                            |spop at gcc dot gnu.org
     Ever Confirmed|0                           |1

--- Comment #2 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-03-10 09:46:17 UTC ---
This is a known issue involving POINTER_PLUS_EXPR semantics, the way the C frontend
and fold handle pointer-based array accesses, and ultimately SCEV analysis.
The issue is that we end up with

  *(c + (((long unsigned int)i) * 8))

with that 'long unsigned int' being sizetype.  At the point of SCEV
analysis we do not factor in the fact that i does not wrap around and
that because of this the evolution is
{ c, +, 8 }

With signed integers we simply exploit undefined behavior.

So yes, it's a known problem (but I always fail to remember a testcase
where it matters ;)).

In the very end my plan was to fix this all with no-undefined-overflow
branch, but maybe Sebastian can think of a way to use number-of-iteration
analysis in SCEV?  (Ugh, that's a chicken-and-egg problem, no?)
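
To make the wrap-around concern above concrete, here is a minimal sketch (not GCC code). It assumes a 64-bit sizetype, as on x86_64, and the helper names `byte_offset` and `offset_step` are hypothetical. It shows that the widened offset `((long unsigned int)i) * 8` advances by 8 on each iteration only as long as the 32-bit index does not wrap:

```cpp
#include <cassert>
#include <cstdint>

// Byte offset of element i in an array of double, modelled on the expression
// in the comment above: the 32-bit index is widened to a 64-bit type
// (assumed sizetype) before being scaled by 8.
std::uint64_t byte_offset(std::uint32_t i) {
    return static_cast<std::uint64_t>(i) * 8u;
}

// Change in byte offset between iteration i and the next one, where the
// increment happens in 32-bit unsigned arithmetic (so it may wrap to 0).
std::int64_t offset_step(std::uint32_t i) {
    std::uint32_t next = i + 1u;  // wraps to 0 when i == 0xFFFFFFFF
    return static_cast<std::int64_t>(byte_offset(next)) -
           static_cast<std::int64_t>(byte_offset(i));
}
```

For any index below 0xFFFFFFFF, `offset_step` returns 8, which matches the evolution { c, +, 8 }. At 0xFFFFFFFF the index wraps and the offset jumps back to 0, which is why the analysis cannot assume the simple evolution without knowing that i does not wrap.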



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
  2011-03-10  0:21 ` [Bug tree-optimization/48052] " paolo.carlini at oracle dot com
  2011-03-10  9:46 ` rguenth at gcc dot gnu.org
@ 2011-03-10 10:22 ` paolo.carlini at oracle dot com
  2011-03-10 10:54 ` vincenzo.innocente at cern dot ch
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: paolo.carlini at oracle dot com @ 2011-03-10 10:22 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

--- Comment #3 from Paolo Carlini <paolo.carlini at oracle dot com> 2011-03-10 10:22:48 UTC ---
Thanks for the analysis. I knew about the difference between signed and
unsigned; it makes sense. Without knowing the internals of the optimization in detail,
the puzzling bit is that types wider than unsigned int already work fine. The
problem seems fixable, somehow ;)



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (2 preceding siblings ...)
  2011-03-10 10:22 ` paolo.carlini at oracle dot com
@ 2011-03-10 10:54 ` vincenzo.innocente at cern dot ch
  2011-03-10 11:22 ` paolo.carlini at oracle dot com
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: vincenzo.innocente at cern dot ch @ 2011-03-10 10:54 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

--- Comment #4 from vincenzo Innocente <vincenzo.innocente at cern dot ch> 2011-03-10 10:54:07 UTC ---
  Thanks for the fast reaction.
I would like to point out that, at least on x86_64, the only type that does not
work is "unsigned int".
"unsigned long long" (the same width as size_t) seems to work (see the 3rd, 4th
and 5th loops in my example).

vincenzo



--
It is good to follow one's inclination, provided it leads uphill.
A.G.
http://www.flickr.com/photos/vin60/1320965757/



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (3 preceding siblings ...)
  2011-03-10 10:54 ` vincenzo.innocente at cern dot ch
@ 2011-03-10 11:22 ` paolo.carlini at oracle dot com
  2011-03-10 11:31 ` paolo.carlini at oracle dot com
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: paolo.carlini at oracle dot com @ 2011-03-10 11:22 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

Paolo Carlini <paolo.carlini at oracle dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |paolo.carlini at oracle dot
                   |                            |com

--- Comment #5 from Paolo Carlini <paolo.carlini at oracle dot com> 2011-03-10 11:22:26 UTC ---
Vincenzo, if I understand correctly (maybe Sebastian can also tell us more), the
issue seems to be that, at some stage, the logic is fully general only for the
widest unsigned type (*) and doesn't cope with smaller types. Thus, if my theory
is correct, unsigned char, unsigned short, etc., should all cause problems. On
the other hand, on x86_64, unsigned long, unsigned long long and size_t are all
the same size, and all work (**).

(*) I don't consider int128; I don't think it is relevant for loop optimization.
(**) On x86, however, unsigned int (aka unsigned long) appears to work, hum.



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (4 preceding siblings ...)
  2011-03-10 11:22 ` paolo.carlini at oracle dot com
@ 2011-03-10 11:31 ` paolo.carlini at oracle dot com
  2011-03-11 10:16 ` vincenzo.innocente at cern dot ch
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: paolo.carlini at oracle dot com @ 2011-03-10 11:31 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

--- Comment #6 from Paolo Carlini <paolo.carlini at oracle dot com> 2011-03-10 11:30:58 UTC ---
Well, on x86, in terms of addressing, unsigned int (aka long) *is* the widest
type; morally, unsigned long long doesn't count.



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (5 preceding siblings ...)
  2011-03-10 11:31 ` paolo.carlini at oracle dot com
@ 2011-03-11 10:16 ` vincenzo.innocente at cern dot ch
  2011-03-11 10:26 ` rguenther at suse dot de
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: vincenzo.innocente at cern dot ch @ 2011-03-11 10:16 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

--- Comment #7 from vincenzo Innocente <vincenzo.innocente at cern dot ch> 2011-03-11 10:16:37 UTC ---
What's the probability of having this fixed?
We depend on a third-party matrix library
that is fully templated and uses "unsigned int" everywhere.
I made a test with
sed -i 's/unsigned int/unsigned long long/g'
and it MAKES a difference (up to a factor of 2 in speed).
This modification (although trivial) changes the type of the templated vector and
matrix classes and the signatures of functions,
and also affects user code.
It is neither transparent nor backward compatible.
I think we cannot afford the change in production: it would be much easier to
change the compiler version!

    Thanks for any effort dedicated to solve this issue,

         Vincenzo
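
One possible source-level workaround that leaves the function signature untouched is sketched below. It rests on an assumption taken from the original report, namely that the size_t-indexed loop21 was vectorized by the same compiler. Whether this particular variant vectorizes has not been verified here, and the name `loop2_sizet_index` is hypothetical. The idea is to keep `unsigned int N` in the interface and widen it once, outside the loop:

```cpp
#include <cassert>
#include <cstddef>

// Same interface as loop2 from the report (unsigned int N), but the
// induction variable has the width of size_t, as in loop21.
// __restrict__ is the GCC extension already used in the report.
void loop2_sizet_index(double const * __restrict__ x_in,
                       double * __restrict__ x_out,
                       double const * __restrict__ c, unsigned int N) {
    std::size_t const n = N;            // widen once, outside the loop
    for (std::size_t i = 0; i != n; ++i)
        x_out[i] = c[i] * x_in[i];
}
```

Callers and template parameters keep using "unsigned int"; only the loop body changes. This avoids the signature changes caused by the sed replacement, at the cost of touching every affected loop by hand.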


* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (6 preceding siblings ...)
  2011-03-11 10:16 ` vincenzo.innocente at cern dot ch
@ 2011-03-11 10:26 ` rguenther at suse dot de
  2011-03-14 10:08 ` vincenzo.innocente at cern dot ch
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: rguenther at suse dot de @ 2011-03-11 10:26 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> 2011-03-11 10:26:47 UTC ---
On Fri, 11 Mar 2011, vincenzo.innocente at cern dot ch wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052
> 
> --- Comment #7 from vincenzo Innocente <vincenzo.innocente at cern dot ch> 2011-03-11 10:16:37 UTC ---
> What's the probability of having this fixed?

A fix for this is quite involved (and honestly it's been on my TODO list for
at least two years, but I chickened out repeatedly because of all the
issues).  I'm not sure if a SCEV-local fix is possible; Sebastian
will probably comment on this.

Richard.



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (7 preceding siblings ...)
  2011-03-11 10:26 ` rguenther at suse dot de
@ 2011-03-14 10:08 ` vincenzo.innocente at cern dot ch
  2015-05-04 19:33 ` az.zaafrani at gmail dot com
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: vincenzo.innocente at cern dot ch @ 2011-03-14 10:08 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

--- Comment #9 from vincenzo Innocente <vincenzo.innocente at cern dot ch> 2011-03-14 10:08:29 UTC ---
It is interesting to note that when the size is fixed at compile time (as in
these trivial and template examples),
vectorization also works for unsigned int:

void loop10( double const * __restrict__ x_in,  double * __restrict__ x_out,
double const * __restrict__ c) {
   for(unsigned int i=0; i!=10; ++i)
       x_out[i] = c[i]*x_in[i];
}


template<typename T, unsigned int N>
void loopTu( T const * __restrict__ x_in,  T * __restrict__ x_out, T const *
__restrict__ c) {
   for(unsigned int i=0; i!=N; ++i)
       x_out[i] = c[i]*x_in[i];
}

template<typename T, unsigned long long N>
void loopTull( T const * __restrict__ x_in,  T * __restrict__ x_out, T const *
__restrict__ c) {
   for(unsigned long long i=0; i!=N; ++i)
       x_out[i] = c[i]*x_in[i];
}


void go(double const * __restrict__ x_in,  double * __restrict__ x_out, double
const * __restrict__ c) {

  loopTu<double,10>(x_in, x_out, c);

  loopTull<double,10>(x_in, x_out, c);

}



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (8 preceding siblings ...)
  2011-03-14 10:08 ` vincenzo.innocente at cern dot ch
@ 2015-05-04 19:33 ` az.zaafrani at gmail dot com
  2015-05-06  6:57 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: az.zaafrani at gmail dot com @ 2015-05-04 19:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

zaafrani <az.zaafrani at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |az.zaafrani at gmail dot com

--- Comment #10 from zaafrani <az.zaafrani at gmail dot com> ---
Created attachment 35459
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35459&action=edit
patch

This is an old thread, but we are still running into similar issues: code is not
being vectorized on 64-bit targets because SCEV cannot optimally
analyze the overflow condition.
While the original test case shown here seems to work now, it does not work if
the start value is not a constant and the loop index variable is of unsigned
type. For example:

void loop2( double const * __restrict__ x_in,  double * __restrict__ x_out,
double const * __restrict__ c, unsigned int N, unsigned int start) {
   for(unsigned int i=start; i!=N; ++i)
       x_out[i] = c[i]*x_in[i];
}

Here is our unit test:

int foo(int* A, int* B,  unsigned start, unsigned BS)
{
  int s = 0;  /* initialized so that the sum is well defined */
  for (unsigned k = start;  k < start+BS; k++)
    s += A[k] * B[k];

  return s;
}

Our unit test case is extracted from a matrix multiply of a two-dimensional
array in which all loops are blocked by hand by a factor of BS.  Even though it
is slightly modified, the above loop corresponds to the innermost loop of the
blocked matrix multiply.

We worked on a patch to solve the problem (see attachment).



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (9 preceding siblings ...)
  2015-05-04 19:33 ` az.zaafrani at gmail dot com
@ 2015-05-06  6:57 ` rguenth at gcc dot gnu.org
  2015-05-07 20:43 ` az.zaafrani at gmail dot com
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-05-06  6:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
That's an interesting idea - your argument is that if niter analysis was able
to compute an expression for the number of iterations and the cast we are
looking at is a widening of a BIV, then it is ok to assume the BIV does not wrap.

Unfortunately this breaks down (possibly not in practice, due to your
exclusion of constant initial BIV values) for cases like


  for (unsigned i = 3; i != 2; i+=7)
    ;

where niter analysis can still compute the number of iterations (I've made
the numbers up, so maybe that loop will never terminate...).

Still the idea is interesting as we might be able to record whether BIVs
overflow or not.
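
As a rough illustration of the example loop above (a sketch only, not a description of GCC's niter code), the trip count of `for (unsigned i = start; i != bound; i += step)` can be computed in closed form when step is odd, using the multiplicative inverse of step modulo 2^32. The sketch assumes a 32-bit unsigned, and the helper names `inverse_mod_2_32`, `iterations` and `wrap_count` are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Multiplicative inverse of an odd value modulo 2^32, by Newton iteration.
// For odd a, a*a == 1 (mod 8), so x = a is correct to 3 bits; every step
// doubles the number of correct low bits (3, 6, 12, 24, 48).
std::uint32_t inverse_mod_2_32(std::uint32_t a) {
    std::uint32_t x = a;
    for (int k = 0; k < 4; ++k)
        x *= 2u - a * x;
    return x;
}

// Trip count of: for (uint32_t i = start; i != bound; i += step)
// Assumes step is odd, so the loop always terminates.
std::uint32_t iterations(std::uint32_t start, std::uint32_t bound,
                         std::uint32_t step) {
    return (bound - start) * inverse_mod_2_32(step);
}

// How many times i wraps past 2^32 before the loop exits, obtained by
// redoing the final value computation in 64 bits.
std::uint64_t wrap_count(std::uint32_t start, std::uint32_t bound,
                         std::uint32_t step) {
    std::uint64_t last = static_cast<std::uint64_t>(start) +
        static_cast<std::uint64_t>(step) * iterations(start, bound, step);
    return last >> 32;
}
```

For start 3, bound 2 and step 7, `iterations` gives 1227133513 and `wrap_count` gives 2. So the loop does terminate and its trip count is computable, yet the induction variable wraps twice on the way, which is exactly the situation in which a known iteration count does not imply a non-wrapping BIV.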



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (10 preceding siblings ...)
  2015-05-06  6:57 ` rguenth at gcc dot gnu.org
@ 2015-05-07 20:43 ` az.zaafrani at gmail dot com
  2015-05-22 16:19 ` hiraditya at msn dot com
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: az.zaafrani at gmail dot com @ 2015-05-07 20:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

--- Comment #12 from zaafrani <az.zaafrani at gmail dot com> ---
Thank you for the feedback.

We excluded constant start values because that case is already
working. To our knowledge, it is only when the start value is unknown and
the loop index is of unsigned type that we fail to recognize
non-overflow in situations where it is possible to deduce it. For most
other cases, the current analysis done in scev_probably_wraps_p seems to
be working fine. We also added the assumption that the step equals 1 so that
we can make a correct decision about non-overflow. So, basically, we'd
rather catch a few simple cases and make them work than try to
generalize the scope and not be able to prove much.
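
The step-1 restriction can be illustrated with a small sketch (an informal check, not the patch itself; the function name `widened_index_stays_affine` is hypothetical). For a loop of the form `for (k = start; k < bound; k++)`, every value of k seen in the body is strictly below bound, so the increment cannot pass the maximum unsigned value and the widened index advances by exactly 1 per iteration:

```cpp
#include <cassert>
#include <cstdint>

// Walks the loop "for (k = start; k < start + bs; ++k)" in 32-bit unsigned
// arithmetic and checks that the index widened to 64 bits equals
// start + (number of iterations so far), i.e. that k never wrapped.
bool widened_index_stays_affine(std::uint32_t start, std::uint32_t bs) {
    std::uint32_t bound = start + bs;   // may itself wrap; then bound < start
                                        // and the loop body never runs
    std::uint64_t expected = start;     // 64-bit model of the index
    for (std::uint32_t k = start; k < bound; ++k, ++expected) {
        if (static_cast<std::uint64_t>(k) != expected)
            return false;
    }
    return true;
}
```

The check holds even for start values close to the maximum, because the exit test stops the loop before k can wrap. With a step other than 1, or with a `!=` exit test, this argument no longer applies directly.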



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (11 preceding siblings ...)
  2015-05-07 20:43 ` az.zaafrani at gmail dot com
@ 2015-05-22 16:19 ` hiraditya at msn dot com
  2015-06-02 10:19 ` amker at gcc dot gnu.org
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: hiraditya at msn dot com @ 2015-05-22 16:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

--- Comment #13 from AK <hiraditya at msn dot com> ---
We have an updated patch that works for both cases.
https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01991.html



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (12 preceding siblings ...)
  2015-05-22 16:19 ` hiraditya at msn dot com
@ 2015-06-02 10:19 ` amker at gcc dot gnu.org
  2015-06-09  8:22 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: amker at gcc dot gnu.org @ 2015-06-02 10:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

--- Comment #14 from amker at gcc dot gnu.org ---
Author: amker
Date: Tue Jun  2 10:19:18 2015
New Revision: 224020

URL: https://gcc.gnu.org/viewcvs?rev=224020&root=gcc&view=rev
Log:

        PR tree-optimization/48052
        * cfgloop.h (struct control_iv): New.
        (struct loop): New field control_ivs.
        * tree-ssa-loop-niter.c : Include "stor-layout.h".
        (number_of_iterations_lt): Set no_overflow information.
        (number_of_iterations_exit): Init control iv in niter struct.
        (record_control_iv): New.
        (estimate_numbers_of_iterations_loop): Call record_control_iv.
        (loop_exits_before_overflow): New.  Interface factored out of
        scev_probably_wraps_p.
        (scev_probably_wraps_p): Factor loop niter related code into
        loop_exits_before_overflow.
        (free_numbers_of_iterations_estimates_loop): Free control ivs.
        * tree-ssa-loop-niter.h (free_loop_control_ivs): New.

        gcc/testsuite/ChangeLog
        PR tree-optimization/48052
        * gcc.dg/tree-ssa/scev-8.c: New.
        * gcc.dg/tree-ssa/scev-9.c: New.
        * gcc.dg/tree-ssa/scev-10.c: New.
        * gcc.dg/vect/pr48052.c: New.


Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/scev-10.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/scev-8.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/scev-9.c
    trunk/gcc/testsuite/gcc.dg/vect/pr48052.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfgloop.h
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-loop-niter.c
    trunk/gcc/tree-ssa-loop-niter.h



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (13 preceding siblings ...)
  2015-06-02 10:19 ` amker at gcc dot gnu.org
@ 2015-06-09  8:22 ` rguenth at gcc dot gnu.org
  2015-06-23 13:23 ` evstupac at gmail dot com
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-06-09  8:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052
Bug 48052 depends on bug 66396, which changed state.

Bug 66396 Summary: [6 regression] FAIL: gcc.dg/graphite/run-id-pr47593.c execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66396

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED



* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (14 preceding siblings ...)
  2015-06-09  8:22 ` rguenth at gcc dot gnu.org
@ 2015-06-23 13:23 ` evstupac at gmail dot com
  2015-06-23 13:52 ` amker at gcc dot gnu.org
  2015-06-24  2:34 ` amker at gcc dot gnu.org
  17 siblings, 0 replies; 19+ messages in thread
From: evstupac at gmail dot com @ 2015-06-23 13:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

Stupachenko Evgeny <evstupac at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |evstupac at gmail dot com

--- Comment #15 from Stupachenko Evgeny <evstupac at gmail dot com> ---
The commit caused regressions on some benchmarks. Test to reproduce:
(compilation flags: -Ofast)

int foo (int flag, char *a)
{
  short i, j;
  short l = 0;
  if (flag == 1)
    l = 3;

  for (i = 0; i < 4; i++)
    {
      for (j = l - 1; j > 0; j--)
        a[j] = a[j - 1];
      a[0] = i;
    }
}

Here the value of l is between 0 and 3 (it is either 0 or 3), and therefore the
value of the innermost loop bound (l - 1) is between -1 and 2.

After the commit, the innermost loop is replaced with a memmove call. This is
obviously not optimal, as the amount of memory to move is at most 2 bytes.
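
To make the equivalence concrete, here is a minimal sketch of the inner loop
next to the memmove form it can be rewritten into. The helper names
shift_loop and shift_memmove are hypothetical and used only for illustration;
the exact call the compiler emits is an assumption, not taken from a dump.

```c
#include <string.h>

/* Inner loop from the test case above, isolated in a helper
   (hypothetical name).  Shifts a[0 .. l-2] up by one element.  */
static void shift_loop (char *a, short l)
{
  short j;
  for (j = l - 1; j > 0; j--)
    a[j] = a[j - 1];
}

/* Assumed equivalent form: a single memmove of (l - 1) bytes,
   guarded so that nothing is copied when l - 1 <= 0.  */
static void shift_memmove (char *a, short l)
{
  if (l - 1 > 0)
    memmove (a + 1, a, (size_t) (l - 1));
}
```

With l equal to 3 both helpers copy exactly two bytes, and with l equal to 0
neither touches the array, which is why a library call is heavy for this case.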


* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (15 preceding siblings ...)
  2015-06-23 13:23 ` evstupac at gmail dot com
@ 2015-06-23 13:52 ` amker at gcc dot gnu.org
  2015-06-24  2:34 ` amker at gcc dot gnu.org
  17 siblings, 0 replies; 19+ messages in thread
From: amker at gcc dot gnu.org @ 2015-06-23 13:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

amker at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amker at gcc dot gnu.org

--- Comment #16 from amker at gcc dot gnu.org ---
(In reply to Stupachenko Evgeny from comment #15)

Hi, thank you for reporting this.  I shall have a look.


* [Bug tree-optimization/48052] loop not vectorized if index is "unsigned int"
  2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
                   ` (16 preceding siblings ...)
  2015-06-23 13:52 ` amker at gcc dot gnu.org
@ 2015-06-24  2:34 ` amker at gcc dot gnu.org
  17 siblings, 0 replies; 19+ messages in thread
From: amker at gcc dot gnu.org @ 2015-06-24  2:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052

--- Comment #17 from amker at gcc dot gnu.org ---
(In reply to amker from comment #16)
> (In reply to Stupachenko Evgeny from comment #15)
> Hi, thank you for reporting this.  I shall have a look.

This is a latent optimization issue in loop-niter/loop-dist, revealed because
more SCEVs are recognized now.
I filed PR66646 to track it.

Thanks,
bin


end of thread, other threads:[~2015-06-24  2:34 UTC | newest]

Thread overview: 19+ messages
2011-03-09 18:49 [Bug tree-optimization/48052] New: loop not vectorized if index is "unsigned int" vincenzo.innocente at cern dot ch
2011-03-10  0:21 ` [Bug tree-optimization/48052] " paolo.carlini at oracle dot com
2011-03-10  9:46 ` rguenth at gcc dot gnu.org
2011-03-10 10:22 ` paolo.carlini at oracle dot com
2011-03-10 10:54 ` vincenzo.innocente at cern dot ch
2011-03-10 11:22 ` paolo.carlini at oracle dot com
2011-03-10 11:31 ` paolo.carlini at oracle dot com
2011-03-11 10:16 ` vincenzo.innocente at cern dot ch
2011-03-11 10:26 ` rguenther at suse dot de
2011-03-14 10:08 ` vincenzo.innocente at cern dot ch
2015-05-04 19:33 ` az.zaafrani at gmail dot com
2015-05-06  6:57 ` rguenth at gcc dot gnu.org
2015-05-07 20:43 ` az.zaafrani at gmail dot com
2015-05-22 16:19 ` hiraditya at msn dot com
2015-06-02 10:19 ` amker at gcc dot gnu.org
2015-06-09  8:22 ` rguenth at gcc dot gnu.org
2015-06-23 13:23 ` evstupac at gmail dot com
2015-06-23 13:52 ` amker at gcc dot gnu.org
2015-06-24  2:34 ` amker at gcc dot gnu.org
