public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis
@ 2024-04-08  9:58 tnfchris at gcc dot gnu.org
  2024-04-08 12:02 ` [Bug tree-optimization/114635] " rguenth at gcc dot gnu.org
                   ` (18 more replies)
  0 siblings, 19 replies; 20+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2024-04-08  9:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

            Bug ID: 114635
           Summary: OpenMP reductions fail dependency analysis
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tnfchris at gcc dot gnu.org
  Target Milestone: ---

The following testcase reduced from an HPC workload:

#include <math.h>

#define RESTRICT restrict

void work(int n, float *RESTRICT x, float *RESTRICT y,
          float *RESTRICT z, float *RESTRICT mass,
          float x0, float y0, float z0,
          float *RESTRICT ax, float *RESTRICT ay,
          float *RESTRICT az) {
  float lax = 0.0f, lay = 0.0f, laz = 0.0f;

#if _OPENMP >= 201307
#pragma omp simd reduction(+:lax,lay,laz)
#endif
  for (int i = 0; i < n; ++i) {
    float dx = x[i] - x0;
    float dy = y[i] - y0;
    float dz = z[i] - z0;
    float r2 = dx + dy + dz;

    if (r2 == 0.0f)
      continue;

    float f = (1.0f / (r2 * sqrtf(r2))) * mass[i];

    lax += f * dx;
    lay += f * dy;
    laz += f * dz; 
  }

  *ax += lax;
  *ay += lay;
  *az += laz;
}

when compiled with -Ofast -march=armv9-a -fopenmp-simd vectorizes as expected
but when the pragma is in effect, e.g.  -Ofast -march=armv9-a -fopenmp then the
main loop fails to vectorize with:

(compute_affine_dependence
  ref_a: D.5962[_33], stmt_a: _69 = D.5962[_33];
  ref_b: D.5962[_33], stmt_b: D.5962[_33] = _ifc__147;
) -> dependence analysis failed
/app/example.c:16:17: missed:  bad data dependence.
/app/example.c:16:17: note:  ***** Analysis  failed with vector mode VNx4SF

This doesn't seem to happen with just 2 reductions, but with 3 dependency
analysis seems to fail.

I don't know much about openmp but my understanding is that this pragma is
intended for architectures that don't have masking support and works by
splitting the loop and removing the reductions from the main loop creating
openmp "workers" whom each work on one thread.

the reduction values are turned into local arrays and these threads then write
into their own slots into these arrays.

The reduction itself is then done as a final post step.

It looks like the only thing we can vectorize is the post step.

I wonder, since the compiler is the one introducing these local arrays, can we
not mark them safe from inter dependencies?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
@ 2024-04-08 12:02 ` rguenth at gcc dot gnu.org
  2024-04-08 12:19 ` rguenth at gcc dot gnu.org
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-04-08 12:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Note even -fopenmp-simd would likely fail if you remove the #if _OPENMP check.
-fopenmp-simd doesn't define _OPENMP, only -fopenmp does.

It seems to work on x86_64 but we get a "scalar" epilog somehow.

As of dependence analysis I can only guess that somehow PTA info is
missing and the SIMD decls are addressable for you?  Note that I
see a masked load and for calls restrict doesn't work.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
  2024-04-08 12:02 ` [Bug tree-optimization/114635] " rguenth at gcc dot gnu.org
@ 2024-04-08 12:19 ` rguenth at gcc dot gnu.org
  2024-04-08 12:26 ` rguenth at gcc dot gnu.org
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-04-08 12:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |openmp
             Blocks|                            |53947
                 CC|                            |rguenth at gcc dot gnu.org,
                   |                            |rsandifo at gcc dot gnu.org

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
We have

(compute_affine_dependence
  ref_a: D.5604[_33], stmt_a: _65 = D.5604[_33];
  ref_b: D.5604[_33], stmt_b: D.5604[_33] = _ifc__144;
) -> dependence analysis failed

but of course safelen should apply here, but we likely hit

  if (max_vf != MAX_VECTORIZATION_FACTOR
      && maybe_lt (max_vf, min_vf))
    return opt_result::failure_at (vect_location, "bad data dependence.\n");

we have max_vf == 16 but min_vf is {4, 4} with E_VNx4SFmode.  Maybe
safelen (and thus max_vf) needs to be poly-int?  Maybe somehow safelen
needs to be ensured by additional masking with SVE?


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
  2024-04-08 12:02 ` [Bug tree-optimization/114635] " rguenth at gcc dot gnu.org
  2024-04-08 12:19 ` rguenth at gcc dot gnu.org
@ 2024-04-08 12:26 ` rguenth at gcc dot gnu.org
  2024-04-08 12:32 ` jakub at gcc dot gnu.org
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-04-08 12:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
As for the dependence analysis itself - the issue is that the D.5606[_33]
style accesses have no access functions - I'm not sure how they evolve,
but I see their domain is even [0, POLY_INT_CST [15, 16] ].  I guess
we don't expect to have dependence analysis handle this but do dependence
fully within 'safelen' here.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2024-04-08 12:26 ` rguenth at gcc dot gnu.org
@ 2024-04-08 12:32 ` jakub at gcc dot gnu.org
  2024-04-08 12:35 ` jakub at gcc dot gnu.org
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-08 12:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The OpenMP safelen clause argument is a scalar integer, so using poly_int for
something that must be an int doesn't make sense.
Though, the above testcase actually doesn't use safelen clause, so safelen is
there effectively infinity.
The issue is how we represent the clauses like reduction on simd.
The design is that they are represented as initially large magic "omp simd
array" arrays with constant number of iterations derived from the maximum
possible vectorization factor, and then the vectorizer either shrinks those two
the actual vectorization factor (if different from the old constant) or to [1]
array if vectorization wasn't possible and we've in fact used vectorization
factor of 1.

Now, with SVE/RISCV vectors the actual vectorization factor is a poly_int
rather than constant.  One possibility would be to use VLA arrays in those
cases, but then it will be hard to undo that later, or allow both shrinking and
growing the arrays and even turning them into VLA-like ones.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2024-04-08 12:32 ` jakub at gcc dot gnu.org
@ 2024-04-08 12:35 ` jakub at gcc dot gnu.org
  2024-04-08 14:55 ` tnfchris at gcc dot gnu.org
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-08 12:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #3)
> As for the dependence analysis itself - the issue is that the D.5606[_33]
> style accesses have no access functions - I'm not sure how they evolve,
> but I see their domain is even [0, POLY_INT_CST [15, 16] ].  I guess
> we don't expect to have dependence analysis handle this but do dependence
> fully within 'safelen' here.

The magic arrays are indexed in the loop using .GOMP_SIMD_LANE (...) index,
which should map to the logical iteration modulo vectorization factor after the
vectorization.  After the loop it can be indexed using .GOMP_SIMD_LAST_LANE
(...).

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2024-04-08 12:35 ` jakub at gcc dot gnu.org
@ 2024-04-08 14:55 ` tnfchris at gcc dot gnu.org
  2024-04-08 15:36 ` rguenther at suse dot de
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2024-04-08 14:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #6 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #4)
> Now, with SVE/RISCV vectors the actual vectorization factor is a poly_int
> rather than constant.  One possibility would be to use VLA arrays in those
> cases, but then it will be hard to undo that later, or allow both shrinking
> and growing the arrays and even turning them into VLA-like ones.

I think they are already VLAs of some kind going into vect, unless I've
misunderstood the declaration:

  float D.28295[0:POLY_INT_CST [15, 16]];
  float D.28293[0:POLY_INT_CST [15, 16]];
  float D.28290[0:POLY_INT_CST [15, 16]];

it looks like during vectorization they are lowered though.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2024-04-08 14:55 ` tnfchris at gcc dot gnu.org
@ 2024-04-08 15:36 ` rguenther at suse dot de
  2024-04-10  6:53 ` kugan at gcc dot gnu.org
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenther at suse dot de @ 2024-04-08 15:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #7 from rguenther at suse dot de <rguenther at suse dot de> ---
> Am 08.04.2024 um 16:55 schrieb tnfchris at gcc dot gnu.org <gcc-bugzilla@gcc.gnu.org>:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635
> 
> --- Comment #6 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
> (In reply to Jakub Jelinek from comment #4)
>> Now, with SVE/RISCV vectors the actual vectorization factor is a poly_int
>> rather than constant.  One possibility would be to use VLA arrays in those
>> cases, but then it will be hard to undo that later, or allow both shrinking
>> and growing the arrays and even turning them into VLA-like ones.
> 
> I think they are already VLAs of some kind going into vect, unless I've
> misunderstood the declaration:
> 
>  float D.28295[0:POLY_INT_CST [15, 16]];
>  float D.28293[0:POLY_INT_CST [15, 16]];
>  float D.28290[0:POLY_INT_CST [15, 16]];
> 
> it looks like during vectorization they are lowered though.

Maybe it’s about upper vs lower bound when setting loop->safelen from
‚unlimited‘

> --
> You are receiving this mail because:
> You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2024-04-08 15:36 ` rguenther at suse dot de
@ 2024-04-10  6:53 ` kugan at gcc dot gnu.org
  2024-04-15  7:44 ` kugan at gcc dot gnu.org
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: kugan at gcc dot gnu.org @ 2024-04-10  6:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

kugan at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kugan at gcc dot gnu.org

--- Comment #8 from kugan at gcc dot gnu.org ---
*** Bug 114653 has been marked as a duplicate of this bug. ***

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2024-04-10  6:53 ` kugan at gcc dot gnu.org
@ 2024-04-15  7:44 ` kugan at gcc dot gnu.org
  2024-04-15  7:45 ` kugan at gcc dot gnu.org
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: kugan at gcc dot gnu.org @ 2024-04-15  7:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #9 from kugan at gcc dot gnu.org ---
Looking at the options, looks to me that making loop->safelen a poly_in is the
way to go. (In reply to Jakub Jelinek from comment #4)
> The OpenMP safelen clause argument is a scalar integer, so using poly_int
> for something that must be an int doesn't make sense.
> Though, the above testcase actually doesn't use safelen clause, so safelen
> is there effectively infinity.
Thanks. I was looking at this to see if there is a way to handle this
differently. Looks to me that making loop->safelen a poly_int is the way to
handle at least the case when omp safelen clause is not provided. I am
interested in looking into this. Any suggestions? Here is a completely untested
diff that makes loop->safelen a poly_int.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2024-04-15  7:44 ` kugan at gcc dot gnu.org
@ 2024-04-15  7:45 ` kugan at gcc dot gnu.org
  2024-04-15  7:49 ` jakub at gcc dot gnu.org
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: kugan at gcc dot gnu.org @ 2024-04-15  7:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #10 from kugan at gcc dot gnu.org ---
Created attachment 57946
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57946&action=edit
patch

patch to make loop->safelen a poly_int

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2024-04-15  7:45 ` kugan at gcc dot gnu.org
@ 2024-04-15  7:49 ` jakub at gcc dot gnu.org
  2024-04-15  7:57 ` kugan at gcc dot gnu.org
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-15  7:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to kugan from comment #9)
> Looking at the options, looks to me that making loop->safelen a poly_in is
> the way to go. (In reply to Jakub Jelinek from comment #4)
> > The OpenMP safelen clause argument is a scalar integer, so using poly_int
> > for something that must be an int doesn't make sense.
> > Though, the above testcase actually doesn't use safelen clause, so safelen
> > is there effectively infinity.
> Thanks. I was looking at this to see if there is a way to handle this
> differently. Looks to me that making loop->safelen a poly_int is the way to
> handle at least the case when omp safelen clause is not provided.

Why?
Then it just is INT_MAX value, which is a magic value that says that it is
infinity.
No need to say it is a poly_int infinity.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2024-04-15  7:49 ` jakub at gcc dot gnu.org
@ 2024-04-15  7:57 ` kugan at gcc dot gnu.org
  2024-04-15  8:00 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: kugan at gcc dot gnu.org @ 2024-04-15  7:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #12 from kugan at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #11)
> (In reply to kugan from comment #9)
> > Looking at the options, looks to me that making loop->safelen a poly_in is
> > the way to go. (In reply to Jakub Jelinek from comment #4)
> > > The OpenMP safelen clause argument is a scalar integer, so using poly_int
> > > for something that must be an int doesn't make sense.
> > > Though, the above testcase actually doesn't use safelen clause, so safelen
> > > is there effectively infinity.
> > Thanks. I was looking at this to see if there is a way to handle this
> > differently. Looks to me that making loop->safelen a poly_int is the way to
> > handle at least the case when omp safelen clause is not provided.
> 
> Why?
> Then it just is INT_MAX value, which is a magic value that says that it is
> infinity.
> No need to say it is a poly_int infinity.

For this test case, omp_max_vf gets [16, 16] from the backend. This then
becomes 16. If we keep it as poly_int, it would pass maybe_lt (max_vf, min_vf))
after applying safelen?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2024-04-15  7:57 ` kugan at gcc dot gnu.org
@ 2024-04-15  8:00 ` jakub at gcc dot gnu.org
  2024-04-15  8:06 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-15  8:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to kugan from comment #12)
> > Why?
> > Then it just is INT_MAX value, which is a magic value that says that it is
> > infinity.
> > No need to say it is a poly_int infinity.
> 
> For this test case, omp_max_vf gets [16, 16] from the backend. This then
> becomes 16. If we keep it as poly_int, it would pass maybe_lt (max_vf,
> min_vf)) after applying safelen?

No.  You should just special case loop->safelen == INT_MAX to mean infinity in
the comparisons where it currently doesn't work as infinity.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (12 preceding siblings ...)
  2024-04-15  8:00 ` jakub at gcc dot gnu.org
@ 2024-04-15  8:06 ` rguenth at gcc dot gnu.org
  2024-04-15  8:08 ` rguenther at suse dot de
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-04-15  8:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think

  if (safelen)
    {
      poly_uint64 val;
      safelen = OMP_CLAUSE_SAFELEN_EXPR (safelen);
      if (!poly_int_tree_p (safelen, &val))
        safelen_int = 0;
      else
        safelen_int = MIN (constant_lower_bound (val), INT_MAX);

should simply become

        safelen_int = constant_upper_bound_with_limit (val, INT_MAX);

?  Usually targets do have a limit on the actual length but I see
constant_upper_bound_with_limit doesn't query such.  But it would
be a more appropriate way to say there might be an actual target limit here?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (13 preceding siblings ...)
  2024-04-15  8:06 ` rguenth at gcc dot gnu.org
@ 2024-04-15  8:08 ` rguenther at suse dot de
  2024-04-15  8:14 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenther at suse dot de @ 2024-04-15  8:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #15 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 15 Apr 2024, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635
> 
> --- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> (In reply to kugan from comment #12)
> > > Why?
> > > Then it just is INT_MAX value, which is a magic value that says that it is
> > > infinity.
> > > No need to say it is a poly_int infinity.
> > 
> > For this test case, omp_max_vf gets [16, 16] from the backend. This then
> > becomes 16. If we keep it as poly_int, it would pass maybe_lt (max_vf,
> > min_vf)) after applying safelen?
> 
> No.  You should just special case loop->safelen == INT_MAX to mean infinity in
> the comparisons where it currently doesn't work as infinity.

But then an actual safelen(INT_MAX) would need to be adjusted.

Maybe using a poly-int safelen internally is cleaner.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (14 preceding siblings ...)
  2024-04-15  8:08 ` rguenther at suse dot de
@ 2024-04-15  8:14 ` jakub at gcc dot gnu.org
  2024-04-15  8:18 ` rguenther at suse dot de
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-15  8:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #16 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #14)
> I think
> 
>   if (safelen)
>     {
>       poly_uint64 val;
>       safelen = OMP_CLAUSE_SAFELEN_EXPR (safelen);
>       if (!poly_int_tree_p (safelen, &val))
>         safelen_int = 0;
>       else
>         safelen_int = MIN (constant_lower_bound (val), INT_MAX);
> 
> should simply become
> 
>         safelen_int = constant_upper_bound_with_limit (val, INT_MAX);
> 
> ?  Usually targets do have a limit on the actual length but I see
> constant_upper_bound_with_limit doesn't query such.  But it would
> be a more appropriate way to say there might be an actual target limit here?

OMP_CLAUSE_SAFELEN_EXPR is always an INTEGER_CST, the FEs verify that and error
if it is not.  So, I must say I don't really understand parts of the
r8-5649-g9d2f08ab97be
changes.  I can understand the intent to make max_vf a poly_int, but don't
understand why safelen should be, what would it mean and when it would be set
that way?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (15 preceding siblings ...)
  2024-04-15  8:14 ` jakub at gcc dot gnu.org
@ 2024-04-15  8:18 ` rguenther at suse dot de
  2024-04-15  9:06 ` kugan at gcc dot gnu.org
  2024-05-14 15:24 ` rsandifo at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: rguenther at suse dot de @ 2024-04-15  8:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #17 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 15 Apr 2024, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635
> 
> --- Comment #16 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> (In reply to Richard Biener from comment #14)
> > I think
> > 
> >   if (safelen)
> >     {
> >       poly_uint64 val;
> >       safelen = OMP_CLAUSE_SAFELEN_EXPR (safelen);
> >       if (!poly_int_tree_p (safelen, &val))
> >         safelen_int = 0;
> >       else
> >         safelen_int = MIN (constant_lower_bound (val), INT_MAX);
> > 
> > should simply become
> > 
> >         safelen_int = constant_upper_bound_with_limit (val, INT_MAX);
> > 
> > ?  Usually targets do have a limit on the actual length but I see
> > constant_upper_bound_with_limit doesn't query such.  But it would
> > be a more appropriate way to say there might be an actual target limit here?
> 
> OMP_CLAUSE_SAFELEN_EXPR is always an INTEGER_CST, the FEs verify that and error
> if it is not.  So, I must say I don't really understand parts of the
> r8-5649-g9d2f08ab97be
> changes.  I can understand the intent to make max_vf a poly_int, but don't
> understand why safelen should be, what would it mean and when it would be set
> that way?

It would be only to "better" encode "infinity".  But I see loop->safelen
is 'int' but only [0, MAX_INT] is specified so we'd conveniently have
say -1 to encode "infinity".  We'd have to special case that value
anyway?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (16 preceding siblings ...)
  2024-04-15  8:18 ` rguenther at suse dot de
@ 2024-04-15  9:06 ` kugan at gcc dot gnu.org
  2024-05-14 15:24 ` rsandifo at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: kugan at gcc dot gnu.org @ 2024-04-15  9:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #18 from kugan at gcc dot gnu.org ---
Also, can we set INT_MAX when there is no explicit safelen specified in OMP.
Something like:

--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -6975,14 +6975,11 @@ lower_rec_input_clauses (tree clauses, gimple_seq
*ilist, gimple_seq *dlist,
     {
       tree c = omp_find_clause (gimple_omp_for_clauses (ctx->stmt),
                                OMP_CLAUSE_SAFELEN);
-      poly_uint64 safe_len;
-      if (c == NULL_TREE
-         || (poly_int_tree_p (OMP_CLAUSE_SAFELEN_EXPR (c), &safe_len)
-             && maybe_gt (safe_len, sctx.max_vf)))
+      if (c == NULL_TREE)
        {
          c = build_omp_clause (UNKNOWN_LOCATION, OMP_CLAUSE_SAFELEN);
          OMP_CLAUSE_SAFELEN_EXPR (c) = build_int_cst (integer_type_node,
-                                                      sctx.max_vf);
+                                                      INT_MAX);
          OMP_CLAUSE_CHAIN (c) = gimple_omp_for_clauses (ctx->stmt);
          gimple_omp_for_set_clauses (ctx->stmt, c);
        }

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug tree-optimization/114635] OpenMP reductions fail dependency analysis
  2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
                   ` (17 preceding siblings ...)
  2024-04-15  9:06 ` kugan at gcc dot gnu.org
@ 2024-05-14 15:24 ` rsandifo at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2024-05-14 15:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #19 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #14)
> Usually targets do have a limit on the actual length but I see
> constant_upper_bound_with_limit doesn't query such.  But it would
> be a more appropriate way to say there might be an actual target limit here?
The discussion has moved on, but FWIW: this was a deliberate choice.
The thinking at the time was that VLA code should be truly “agnostic”
and not hard-code an upper limit.  Hard-coding a limit would be hard-coding
an assumption that the architectural maximum would never increase in future.

(The main counterargument was that any uses of the .B form of TBL would
break down for >256-byte vectors.  We hardly use such TBLs for autovec
though, and could easily choose not to use them at all.)

That decision is 8 or 9 years old at this point, so it might seem overly
dogmatic now.  Even so, I think we should have a strong reason to change tack.
It shouldn't just be about trying to avoid poly_ints :)

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2024-05-14 15:24 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-08  9:58 [Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis tnfchris at gcc dot gnu.org
2024-04-08 12:02 ` [Bug tree-optimization/114635] " rguenth at gcc dot gnu.org
2024-04-08 12:19 ` rguenth at gcc dot gnu.org
2024-04-08 12:26 ` rguenth at gcc dot gnu.org
2024-04-08 12:32 ` jakub at gcc dot gnu.org
2024-04-08 12:35 ` jakub at gcc dot gnu.org
2024-04-08 14:55 ` tnfchris at gcc dot gnu.org
2024-04-08 15:36 ` rguenther at suse dot de
2024-04-10  6:53 ` kugan at gcc dot gnu.org
2024-04-15  7:44 ` kugan at gcc dot gnu.org
2024-04-15  7:45 ` kugan at gcc dot gnu.org
2024-04-15  7:49 ` jakub at gcc dot gnu.org
2024-04-15  7:57 ` kugan at gcc dot gnu.org
2024-04-15  8:00 ` jakub at gcc dot gnu.org
2024-04-15  8:06 ` rguenth at gcc dot gnu.org
2024-04-15  8:08 ` rguenther at suse dot de
2024-04-15  8:14 ` jakub at gcc dot gnu.org
2024-04-15  8:18 ` rguenther at suse dot de
2024-04-15  9:06 ` kugan at gcc dot gnu.org
2024-05-14 15:24 ` rsandifo at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).