public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/65468] Optimize static schedule with chunk_size one
  2015-03-19 10:23 [Bug tree-optimization/65468] New: Optimize static schedule with chunk_size one vries at gcc dot gnu.org
@ 2015-03-19 10:19 ` vries at gcc dot gnu.org
  2015-08-24 13:14 ` vries at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: vries at gcc dot gnu.org @ 2015-03-19 10:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65468

--- Comment #1 from vries at gcc dot gnu.org ---
With the patch submitted for gomp-4_0-branch at
https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01881.html applied, the generated
code reduces to a single, simple loop:
...
bar._omp_fn.0 (struct .omp_data_s.0 & restrict .omp_data_i)
{
  int i;
  int a;
  int _3;
  int * _10;
  unsigned int pretmp_14;
  unsigned int _16;
  unsigned int _17;
  unsigned int _19;
  unsigned int prephitmp_22;

  <bb 2>:
  _3 = __builtin_omp_get_num_threads ();
  i_4 = __builtin_omp_get_thread_num ();
  if (i_4 <= 9)
    goto <bb 3>;
  else
    goto <bb 6>;

  <bb 3>:
  # a_5 = PHI <0(2)>
  # i_2 = PHI <i_4(2)>

  <bb 4>:
  # a_18 = PHI <a_5(3), a_7(4)>
  # i_21 = PHI <i_2(3), i_15(4)>
  a_7 = a_18 + i_21;
  _19 = (unsigned int) _3;
  _17 = (unsigned int) i_21;
  _16 = _17 + _19;
  i_15 = (int) _16;
  if (i_15 <= 9)
    goto <bb 4>;
  else
    goto <bb 5>;

  <bb 5>:
  pretmp_14 = (unsigned int) a_7;

  <bb 6>:
  # prephitmp_22 = PHI <pretmp_14(5), 0(2)>
  _10 = &.omp_data_i_9(D)->a;
  __atomic_fetch_add_4 (_10, prephitmp_22, 0); [tail call]
  return;

}
...



* [Bug tree-optimization/65468] New: Optimize static schedule with chunk_size one
From: vries at gcc dot gnu.org @ 2015-03-19 10:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65468

            Bug ID: 65468
           Summary: Optimize static schedule with chunk_size one
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org

Consider test.c:
...
extern void abort ();

int
bar ()
{
  int a = 0, i;

#pragma omp parallel for num_threads (3) reduction (+:a) schedule(static, 1)
  for (i = 0; i < 10; i++)
    a += i;

  return a;
}

int
main (void)
{
  int res;
  res = bar ();
  if (res != 45)
    abort ();
  return 0;
}
...


So we create 3 threads, and the iterations are handed out round-robin, one
iteration per chunk:
threadnr | iterations
---------------------
0        | 0 3 6 9
1        | 1 4 7
2        | 2 5 8


The code is generated using expand_omp_for_static_chunk, which results in the
following code for -O2 -fopenmp (optimized dump):
...
bar._omp_fn.0 (struct .omp_data_s.0 & restrict .omp_data_i)
{
  int i;
  int a;
  int _6;
  int _11;
  int * _17;
  int _21;
  unsigned int _23;
  int _25;
  int _26;
  unsigned int _27;
  int _29;
  unsigned int _31;
  unsigned int _32;
  int _33;
  unsigned int _34;
  unsigned int pretmp_35;
  unsigned int prephitmp_36;

  <bb 2>:
  _6 = __builtin_omp_get_num_threads ();
  i_7 = __builtin_omp_get_thread_num ();
  _25 = i_7 + 1;
  _26 = MIN_EXPR <_25, 10>;
  if (i_7 <= 9)
    goto <bb 3>;
  else
    goto <bb 8>;

  <bb 3>:
  # a_3 = PHI <0(2)>
  # i_24 = PHI <i_7(2)>
  # _21 = PHI <_26(2)>

  <bb 4>:
  # a_12 = PHI <a_3(3), a_13(6)>
  # i_5 = PHI <i_24(3), i_22(6)>
  # _29 = PHI <_21(3), _11(6)>

  <bb 5>:
  # a_1 = PHI <a_12(4), a_13(5)>
  # i_4 = PHI <i_5(4), i_14(5)>
  a_13 = a_1 + i_4;
  i_14 = i_4 + 1;
  if (i_14 < _29)
    goto <bb 5>;
  else
    goto <bb 6>;

  <bb 6>:
  _32 = (unsigned int) i_5;
  _31 = (unsigned int) _6;
  _23 = _31 + _32;
  i_22 = (int) _23;
  _27 = _23;
  _34 = _27 + 1;
  _33 = (int) _34;
  _11 = MIN_EXPR <_33, 10>;
  if (i_22 <= 9)
    goto <bb 4>;
  else
    goto <bb 7>;

  <bb 7>:
  pretmp_35 = (unsigned int) a_13;

  <bb 8>:
  # prephitmp_36 = PHI <pretmp_35(7), 0(2)>
  _17 = &.omp_data_i_16(D)->a;
  __atomic_fetch_add_4 (_17, prephitmp_36, 0); [tail call]
  return;

}
...

The code contains a loop nest with two loops: the inner loop handles a single
chunk, and the outer loop iterates over the chunks assigned to the thread.

For a chunk size of one, the inner loop executes its body exactly once per
chunk. However, the compiler does not manage to optimize the inner loop
away.



* [Bug tree-optimization/65468] Optimize static schedule with chunk_size one
From: vries at gcc dot gnu.org @ 2015-08-24 13:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65468

--- Comment #3 from vries at gcc dot gnu.org ---
Author: vries
Date: Mon Aug 24 13:14:17 2015
New Revision: 227124

URL: https://gcc.gnu.org/viewcvs?rev=227124&root=gcc&view=rev
Log:
Optimize expand_omp_for_static_chunk for chunk_size one

2015-08-24  Tom de Vries  <tom@codesourcery.com>

        PR tree-optimization/65468
        * omp-low.c (expand_omp_for_static_chunk): Remove inner loop if
        chunk_size is one.

        * gcc.dg/gomp/static-chunk-size-one.c: New test.

        * testsuite/libgomp.c/static-chunk-size-one.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/gomp/static-chunk-size-one.c
    trunk/libgomp/testsuite/libgomp.c/static-chunk-size-one.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/omp-low.c
    trunk/gcc/testsuite/ChangeLog
    trunk/libgomp/ChangeLog



* [Bug tree-optimization/65468] Optimize static schedule with chunk_size one
From: vries at gcc dot gnu.org @ 2015-08-24 15:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65468

--- Comment #4 from vries at gcc dot gnu.org ---
Author: vries
Date: Mon Aug 24 15:01:44 2015
New Revision: 227130

URL: https://gcc.gnu.org/viewcvs?rev=227130&root=gcc&view=rev
Log:
Add libgomp.oacc-c-c++-common/vector-loop.c

2015-08-24  Tom de Vries  <tom@codesourcery.com>

        PR tree-optimization/65468
        * testsuite/libgomp.oacc-c-c++-common/vector-loop.c: New test.

Added:
    trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-loop.c
Modified:
    trunk/libgomp/ChangeLog



* [Bug tree-optimization/65468] Optimize static schedule with chunk_size one
From: vries at gcc dot gnu.org @ 2015-08-24 15:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65468

vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from vries at gcc dot gnu.org ---
Patch committed, along with OpenMP and OpenACC test cases. Marking as
RESOLVED FIXED.



