public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/65468] Optimize static schedule with chunk_size one
From: vries at gcc dot gnu.org @ 2015-03-19 10:19 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65468
--- Comment #1 from vries at gcc dot gnu.org ---
Using the patch submitted for gomp-4_0-branch at
https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01881.html, we get a simple loop:
...
bar._omp_fn.0 (struct .omp_data_s.0 & restrict .omp_data_i)
{
  int i;
  int a;
  int _3;
  int * _10;
  unsigned int pretmp_14;
  unsigned int _16;
  unsigned int _17;
  unsigned int _19;
  unsigned int prephitmp_22;

  <bb 2>:
  _3 = __builtin_omp_get_num_threads ();
  i_4 = __builtin_omp_get_thread_num ();
  if (i_4 <= 9)
    goto <bb 3>;
  else
    goto <bb 6>;

  <bb 3>:
  # a_5 = PHI <0(2)>
  # i_2 = PHI <i_4(2)>

  <bb 4>:
  # a_18 = PHI <a_5(3), a_7(4)>
  # i_21 = PHI <i_2(3), i_15(4)>
  a_7 = a_18 + i_21;
  _19 = (unsigned int) _3;
  _17 = (unsigned int) i_21;
  _16 = _17 + _19;
  i_15 = (int) _16;
  if (i_15 <= 9)
    goto <bb 4>;
  else
    goto <bb 5>;

  <bb 5>:
  pretmp_14 = (unsigned int) a_7;

  <bb 6>:
  # prephitmp_22 = PHI <pretmp_14(5), 0(2)>
  _10 = &.omp_data_i_9(D)->a;
  __atomic_fetch_add_4 (_10, prephitmp_22, 0); [tail call]
  return;
}
...
* [Bug tree-optimization/65468] New: Optimize static schedule with chunk_size one
From: vries at gcc dot gnu.org @ 2015-03-19 10:23 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65468
Bug ID: 65468
Summary: Optimize static schedule with chunk_size one
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Consider test.c:
...
extern void abort ();

int
bar ()
{
  int a = 0, i;

  #pragma omp parallel for num_threads (3) reduction (+:a) schedule(static, 1)
  for (i = 0; i < 10; i++)
    a += i;

  return a;
}

int
main (void)
{
  int res;
  res = bar ();
  if (res != 45)
    abort ();

  return 0;
}
...
So, we create 3 threads, and the schedule will be:
threadnr | iterations
---------------------
0        | 0 3 6 9
1        | 1 4 7
2        | 2 5 8
The code is generated by expand_omp_for_static_chunk, which results in the
following code for -O2 -fopenmp (optimized dump):
...
bar._omp_fn.0 (struct .omp_data_s.0 & restrict .omp_data_i)
{
  int i;
  int a;
  int _6;
  int _11;
  int * _17;
  int _21;
  unsigned int _23;
  int _25;
  int _26;
  unsigned int _27;
  int _29;
  unsigned int _31;
  unsigned int _32;
  int _33;
  unsigned int _34;
  unsigned int pretmp_35;
  unsigned int prephitmp_36;

  <bb 2>:
  _6 = __builtin_omp_get_num_threads ();
  i_7 = __builtin_omp_get_thread_num ();
  _25 = i_7 + 1;
  _26 = MIN_EXPR <_25, 10>;
  if (i_7 <= 9)
    goto <bb 3>;
  else
    goto <bb 8>;

  <bb 3>:
  # a_3 = PHI <0(2)>
  # i_24 = PHI <i_7(2)>
  # _21 = PHI <_26(2)>

  <bb 4>:
  # a_12 = PHI <a_3(3), a_13(6)>
  # i_5 = PHI <i_24(3), i_22(6)>
  # _29 = PHI <_21(3), _11(6)>

  <bb 5>:
  # a_1 = PHI <a_12(4), a_13(5)>
  # i_4 = PHI <i_5(4), i_14(5)>
  a_13 = a_1 + i_4;
  i_14 = i_4 + 1;
  if (i_14 < _29)
    goto <bb 5>;
  else
    goto <bb 6>;

  <bb 6>:
  _32 = (unsigned int) i_5;
  _31 = (unsigned int) _6;
  _23 = _31 + _32;
  i_22 = (int) _23;
  _27 = _23;
  _34 = _27 + 1;
  _33 = (int) _34;
  _11 = MIN_EXPR <_33, 10>;
  if (i_22 <= 9)
    goto <bb 4>;
  else
    goto <bb 7>;

  <bb 7>:
  pretmp_35 = (unsigned int) a_13;

  <bb 8>:
  # prephitmp_36 = PHI <pretmp_35(7), 0(2)>
  _17 = &.omp_data_i_16(D)->a;
  __atomic_fetch_add_4 (_17, prephitmp_36, 0); [tail call]
  return;
}
...
The code contains a loop nest with two loops. The inner loop handles a single
chunk; the outer loop iterates over the chunks assigned to the thread.
For a chunk size of one, we know that the inner loop always executes its body
exactly once, but the compiler doesn't manage to optimize the inner loop
away.
* [Bug tree-optimization/65468] Optimize static schedule with chunk_size one
From: vries at gcc dot gnu.org @ 2015-08-24 13:14 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65468
--- Comment #3 from vries at gcc dot gnu.org ---
Author: vries
Date: Mon Aug 24 13:14:17 2015
New Revision: 227124
URL: https://gcc.gnu.org/viewcvs?rev=227124&root=gcc&view=rev
Log:
Optimize expand_omp_for_static_chunk for chunk_size one
2015-08-24 Tom de Vries <tom@codesourcery.com>
PR tree-optimization/65468
* omp-low.c (expand_omp_for_static_chunk): Remove inner loop if
chunk_size is one.
* gcc.dg/gomp/static-chunk-size-one.c: New test.
* testsuite/libgomp.c/static-chunk-size-one.c: New test.
Added:
trunk/gcc/testsuite/gcc.dg/gomp/static-chunk-size-one.c
trunk/libgomp/testsuite/libgomp.c/static-chunk-size-one.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/omp-low.c
trunk/gcc/testsuite/ChangeLog
trunk/libgomp/ChangeLog
* [Bug tree-optimization/65468] Optimize static schedule with chunk_size one
From: vries at gcc dot gnu.org @ 2015-08-24 15:02 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65468
--- Comment #4 from vries at gcc dot gnu.org ---
Author: vries
Date: Mon Aug 24 15:01:44 2015
New Revision: 227130
URL: https://gcc.gnu.org/viewcvs?rev=227130&root=gcc&view=rev
Log:
Add libgomp.oacc-c-c++-common/vector-loop.c
2015-08-24 Tom de Vries <tom@codesourcery.com>
PR tree-optimization/65468
* testsuite/libgomp.oacc-c-c++-common/vector-loop.c: New test.
Added:
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-loop.c
Modified:
trunk/libgomp/ChangeLog
* [Bug tree-optimization/65468] Optimize static schedule with chunk_size one
From: vries at gcc dot gnu.org @ 2015-08-24 15:05 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65468
vries at gcc dot gnu.org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
--- Comment #5 from vries at gcc dot gnu.org ---
Patch and OpenMP and OpenACC test-cases committed; marking resolved-fixed.
Thread overview: 5+ messages
2015-03-19 10:23 [Bug tree-optimization/65468] New: Optimize static schedule with chunk_size one vries at gcc dot gnu.org
2015-03-19 10:19 ` [Bug tree-optimization/65468] " vries at gcc dot gnu.org
2015-08-24 13:14 ` vries at gcc dot gnu.org
2015-08-24 15:02 ` vries at gcc dot gnu.org
2015-08-24 15:05 ` vries at gcc dot gnu.org