public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/32825] New: Reduction with nonzero start (arbitrary also) causes an extra add to happen
@ 2007-07-19 17:11 pinskia at gcc dot gnu dot org
2007-07-19 18:15 ` [Bug tree-optimization/32825] " dorit at gcc dot gnu dot org
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2007-07-19 17:11 UTC (permalink / raw)
To: gcc-bugs
Testcase (Compile at -O2 -maltivec -ftree-vectorize):
int a[16*100];
int f(int e)
{
  int i;
  for(i = 0;i<16*100;i++)
    e += a[i];
  return e;
}
--------- Cut -----
Currently we get:
  ivtmp.42 = (long unsigned int) &a;
  vect_var_.36 = { 0, 0, 0, 0 };
<bb 3>:
  vect_var_.36 = MEM[index: ivtmp.42] + vect_var_.36;
  ivtmp.42 = ivtmp.42 + 16;
  if (ivtmp.42 != (long unsigned int) (&a + 6400))
    goto <bb 3>;
  else
    goto <bb 4>;
<bb 4>:
  vect_var_.39 = vect_var_.36 v>> 64;
  vect_var_.47 = vect_var_.39 + vect_var_.36;
  vect_var_.48 = vect_var_.47 v>> 32;
  stmp_var_.38 = BIT_FIELD_REF <vect_var_.48 + vect_var_.47, 32, 96>;
  return stmp_var_.38 + e;
The last add is unnecessary: we can get rid of it by initializing
vect_var_.36 to {e, 0, 0, 0} instead of {0, 0, 0, 0}.
Note that this also happens with a constant nonzero start, that is:
int a[16*100];
int f(int e)
{
  int i;
  e = 1;
  for(i = 0;i<16*100;i++)
    e += a[i];
  return e;
}
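The two code-generation choices described above can be sketched in plain C, with a four-element array standing in for the vector accumulator. This is only an illustration of the idea, not what the vectorizer emits: the function names, the LANES constant, and the simple lane-summing helper (used in place of the shift-and-add epilogue in the dump) are all assumptions made for the example, and n is assumed to be a multiple of LANES.

```c
#include <stddef.h>

#define LANES 4  /* assumed vector width: four 32-bit ints, as in the dump */

/* Horizontal sum of the lanes; stands in for the shift-and-add epilogue. */
static int reduce_lanes(const int acc[LANES])
{
    int sum = 0;
    for (int l = 0; l < LANES; l++)
        sum += acc[l];
    return sum;
}

/* Current behaviour: start from {0, 0, 0, 0} and add e after the loop. */
int sum_adjust_in_epilog(const int *a, size_t n, int e)
{
    int acc[LANES] = { 0, 0, 0, 0 };
    for (size_t i = 0; i + LANES <= n; i += LANES)
        for (int l = 0; l < LANES; l++)
            acc[l] += a[i + l];
    return reduce_lanes(acc) + e;  /* the extra scalar add */
}

/* Proposed behaviour: seed lane 0 with e so no add follows the reduction. */
int sum_init_with_start(const int *a, size_t n, int e)
{
    int acc[LANES] = { e, 0, 0, 0 };
    for (size_t i = 0; i + LANES <= n; i += LANES)
        for (int l = 0; l < LANES; l++)
            acc[l] += a[i + l];
    return reduce_lanes(acc);
}
```

Both functions return the same value for any input that does not overflow; the only difference is whether e enters the computation before the loop or after it.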
--
Summary: Reduction with nonzero start (arbitrary also) causes an
extra add to happen
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: pinskia at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32825
* [Bug tree-optimization/32825] Reduction with nonzero start (arbitrary also) causes an extra add to happen
From: dorit at gcc dot gnu dot org @ 2007-07-19 18:15 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from dorit at gcc dot gnu dot org 2007-07-19 18:15 -------
...
> Though the last add is extra and does not need to be done, we can get rid of it
> by having vect_var_.36 being set initially to {e, 0, 0, 0} .
The problem is that initializing a vector to {e, 0, 0, 0} is often (much?) more
expensive than initializing a vector to {0, 0, 0, 0} and then adding e to the
final scalar result. We actually had both options in the vectorizer for a while
(guarded by a hard-coded ADJUST_IN_EPILOG #define); however, we didn't know how
to choose between the two options cost-wise, so we arbitrarily chose one.
Now that we're starting to build a cost model, we may try to evaluate which of
the two options to generate.
* [Bug tree-optimization/32825] Reduction with nonzero start (arbitrary also) causes an extra add to happen
From: pinskia at gcc dot gnu dot org @ 2007-07-19 18:32 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from pinskia at gcc dot gnu dot org 2007-07-19 18:32 -------
> The problem is that often initializing a vector to {e, 0, 0, 0} is (much?) more
On SPU, it is not:
cwd $2,0($sp)
shufb $5,$3,$5,$2
vs:
ori $7,$3,0
il $5,0
...
a $8,$9,$7
Also, it increases register pressure by elongating the live range of the incoming argument.
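The trade-off argued in comments #1 and #2 could be framed as a simple cost comparison. The following is a hypothetical sketch only: the struct, its fields, and the function are invented for illustration and are not GCC's cost-model interface, and the cost units are arbitrary.

```c
#include <stdbool.h>

/* Hypothetical per-target costs in arbitrary units (not GCC's cost hooks). */
struct reduc_init_costs {
    int zero_vector;    /* build {0, 0, 0, 0} */
    int insert_scalar;  /* extra cost to place e in lane 0: {e, 0, 0, 0} */
    int scalar_add;     /* final "result + e" in the epilogue */
    int live_range;     /* penalty for keeping e live across the loop */
};

/* Returns true when adjusting in the epilogue is estimated to be
   no more expensive than seeding the initial vector with e. */
bool prefer_adjust_in_epilog(const struct reduc_init_costs *c)
{
    int epilog_cost = c->zero_vector + c->scalar_add + c->live_range;
    int init_cost   = c->zero_vector + c->insert_scalar;
    return epilog_cost <= init_cost;
}
```

On a target where inserting a scalar into a vector is cheap (as the SPU sequence above suggests), insert_scalar would be small and the function would favor the {e, 0, 0, 0} initialization; on a target where it is expensive, it would favor the epilogue add.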
* [Bug tree-optimization/32825] Reduction with nonzero start (arbitrary also) causes an extra add to happen
From: dorit at gcc dot gnu dot org @ 2007-07-24 13:06 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from dorit at gcc dot gnu dot org 2007-07-24 13:05 -------
(In reply to comment #1)
> ... We actually had both options in the vectorizer for a while
> (guarded by ADJUST_IN_EPILOG hard-coded #define), however we didn't know how to
> choose between the two options (cost wise), so we just arbitrarily chose one.
> Now that we're starting to build a cost model we may try to evaluate which of
> the two options to generate.
For the record, this was the patch that removed the second option:
2007-04-18 Dorit Nuzman <dorit@il.ibm.com>
* tree-vect-transform.c (get_initial_def_for_reduction): Clean away
the unused code for reduction without adjust-in-epilog to simplify the
function.