public inbox for gcc-prs@sourceware.org
* Re: optimization/6883: Fails to optimize temporary objects.
From: Richard Guenther @ 2003-05-11 12:06 UTC
To: nobody; +Cc: gcc-prs
The following reply was made to PR optimization/6883; it has been noted by GNATS.
From: Richard Guenther <rguenth@tat.physik.uni-tuebingen.de>
To: Dara Hazeghi <dhazeghi@yahoo.com>
Cc: gcc-gnats@gcc.gnu.org, <rguenth@tat.physik.uni-tuebingen.de>
Subject: Re: optimization/6883: Fails to optimize temporary objects.
Date: Sun, 11 May 2003 14:00:37 +0200 (CEST)
On Sat, 10 May 2003, Dara Hazeghi wrote:
> http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=6883
>
> Hello,
>
> this bug was reported against gcc 3.1. Would it be possible to test
> your testcase against a more current version of gcc (i.e. 3.2.3 or 3.3
> prerelease) and report back on the results? Thanks,
Some more information, following up on my last reply: the Intel compiler
(Intel(R) C++ Compiler for 32-bit applications, Version 7.1 Build
20030424Z) produces the following with -O2.
Iterators with temporaries:
..B1.18: # Preds ..B1.18 ..B1.17
lea 1(%esi), %edi #65.24
cmpl $254, %edi #66.16
fldl -8(%edx,%esi,8) #65.21
faddl 8(%edx,%esi,8) #65.28
fmull (%eax,%esi,8) #65.32
fstpl (%ecx,%esi,8) #65.9
movl %edi, %esi #65.24
jle ..B1.18 # Prob 97% #66.16
Iterators without temporaries:
..B1.21: # Preds ..B1.21 ..B1.20 # Infreq
fldl -8(%edx,%esi,8) #74.17
faddl 8(%edx,%esi,8) #74.25
fmull (%eax,%esi,8) #74.34
fstpl (%ecx,%esi,8) #74.9
incl %esi #75.16
cmpl $254, %esi #75.16
jle ..B1.21 # Prob 97% #75.16
No iterators, just int:
..B1.24: # Preds ..B1.24 ..B1.23 # Infreq
lea 1(%edi), %esi #83.24
fldl -8(%eax,%edi,8) #83.17
faddl 8(%eax,%edi,8) #83.24
fmull (%edx,%edi,8) #83.32
fstpl (%ecx,%edi,8) #83.9
addl $2, %edi #83.24
cmpl $254, %edi #84.16
fldl -8(%eax,%esi,8) #83.17
faddl 8(%eax,%esi,8) #83.24
fmull (%edx,%esi,8) #83.32
fstpl (%ecx,%esi,8) #83.9
jle ..B1.24 # Prob 97% #84.16
It even detects that the loop runs an even number of times and unrolls it
by a factor of two ;)
This is what I would like the code to be optimized to; with icc, at least,
all three versions look nearly the same.
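For illustration, the two-way unrolling seen in the int version corresponds
roughly to the source-level transformation below. This is only a sketch: the
function names stencil_rolled and stencil_unrolled2 and the use of std::vector
are assumptions made for this example and are not part of the test case. It
relies on the trip count (254 iterations, i = 1..254) being even.

```cpp
#include <cassert>
#include <vector>

// Rolled form: one stencil update per iteration, i = 1..254.
void stencil_rolled(std::vector<double>& a,
                    const std::vector<double>& b,
                    const std::vector<double>& c)
{
    assert(a.size() >= 256 && b.size() >= 256 && c.size() >= 256);
    int i = 1;
    do {
        a[i] = (b[i - 1] + b[i + 1]) * c[i];
    } while (++i <= 254);
}

// Unrolled by two: each pass handles i and i+1. Because 254 is even,
// no remainder iteration is needed after the loop.
void stencil_unrolled2(std::vector<double>& a,
                       const std::vector<double>& b,
                       const std::vector<double>& c)
{
    assert(a.size() >= 256 && b.size() >= 256 && c.size() >= 256);
    int i = 1;
    do {
        a[i]     = (b[i - 1] + b[i + 1]) * c[i];
        a[i + 1] = (b[i]     + b[i + 2]) * c[i + 1];
        i += 2;
    } while (i <= 254);
}
```

Both functions perform the same floating-point operations on the same
elements, so they should produce identical results as long as a does not
alias b or c.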
Richard.
* Re: optimization/6883: Fails to optimize temporary objects.
From: Richard Guenther @ 2003-05-11 11:16 UTC
To: nobody; +Cc: gcc-prs
The following reply was made to PR optimization/6883; it has been noted by GNATS.
From: Richard Guenther <rguenth@tat.physik.uni-tuebingen.de>
To: Dara Hazeghi <dhazeghi@yahoo.com>
Cc: gcc-gnats@gcc.gnu.org, <rguenth@tat.physik.uni-tuebingen.de>
Subject: Re: optimization/6883: Fails to optimize temporary objects.
Date: Sun, 11 May 2003 13:13:57 +0200 (CEST)
On Sat, 10 May 2003, Dara Hazeghi wrote:
> http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=6883
>
> Hello,
>
> this bug was reported against gcc 3.1. Would it be possible to test
> your testcase against a more current version of gcc (i.e. 3.2.3 or 3.3
> prerelease) and report back on the results? Thanks,
The problem is the same with g++-3.3 (GCC) 3.3 20030505 (prerelease).
The first loop body now compiles to
.L6:
movl $1, -104(%ebp)
movl -80(%ebp), %eax # <variable>.m_i, <anonymous>
movl %esi, -100(%ebp)
movl $1, -120(%ebp)
leal -1(%eax), %edx
movl %esi, -116(%ebp)
leal 0(,%eax,8), %ecx
incl %eax
fldl (%ebx,%eax,8)
cmpl %esi, %eax
faddl (%ebx,%edx,8)
movl %edx, -96(%ebp) # it.m_i
movl -140(%ebp), %edx
fmull (%ecx,%edi)
movl %eax, -112(%ebp) # it.m_i
movl %eax, -80(%ebp) # <variable>.m_i
fstpl (%ecx,%edx)
jle .L6
while the second, manually optimized version results in much better code:
.L23:
movl -128(%ebp), %eax # <variable>.m_i, <anonymous>
leal 1(%eax), %edx
leal 0(,%eax,8), %ecx
fldl (%ebx,%edx,8)
cmpl %esi, %edx
faddl -8(%ebx,%eax,8)
movl -140(%ebp), %eax
movl %edx, -128(%ebp) # <variable>.m_i
fmull (%ecx,%edi)
fstpl (%ecx,%eax)
jle .L23
Of course this is still not optimal, as the loop iterator itself can be
optimized away and turned into a completely int-driven loop. Adding an
operator()(int) to the Array class and using
    int i = 1;
    do {
        a(i) = (b(i-1)+b(i+1))*c(i);
    } while (++i <= 254);
to iterate gives
.L37:
fldl 8(%ebx,%edx)
incl %eax # i
faddl -8(%ebx,%edx)
fmull (%edi,%edx)
fstpl (%esi,%edx)
addl $8, %edx
cmpl $254, %eax # i
jle .L37
which seems nearly optimal here (one might use a byte for the loop
counter here, or unify it with the array offset %edx).
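For reference, a minimal sketch of that int-driven variant. The vector-backed
storage, the bounds assert and the helper name run_int_loop are assumptions
made for this example; only operator()(int) and the loop itself come from the
description above.

```cpp
#include <cassert>
#include <vector>

class Array {
public:
    // Zero-initialized storage for `size` doubles (std::vector is an
    // assumption of this sketch, chosen to keep ownership simple).
    explicit Array(int size) : m_v(size, 0.0) {}

    // Plain int indexing: no Iterator object is involved at all.
    double& operator()(int i) {
        assert(i >= 0 && i < static_cast<int>(m_v.size()));
        return m_v[i];
    }

private:
    std::vector<double> m_v;
};

// Hypothetical helper wrapping the int-driven loop, i = 1..254.
void run_int_loop(Array& a, Array& b, Array& c)
{
    int i = 1;
    do {
        a(i) = (b(i - 1) + b(i + 1)) * c(i);
    } while (++i <= 254);
}
```

With this form the only loop state is the int i, so nothing has to be
spilled to the stack to represent i-1 or i+1.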
With g++-3.4 (GCC) 3.4 20030505 (experimental), the code for the third and
the second case is essentially the same as with 3.3 (for the second case it
is able to hoist one more movl out of the loop). For the first, inadequately
optimized version it does slightly better than 3.3, probably due to the new
loop optimizer:
.L6:
movl -80(%ebp), %eax # <variable>.m_i, <anonymous>
leal 0(,%eax,8), %edx #, tmp88
leal -1(%eax), %ecx #, tmp98
movl %ecx, -112(%ebp) # tmp98, it.m_i
incl %eax # tmp116
cmpl $254, %eax #, tmp116
fldl (%ebx,%eax,8) #* <anonymous>
faddl (%ebx,%ecx,8) #* <anonymous>
movl %eax, -144(%ebp) # tmp116, it.m_i
movl %eax, -80(%ebp) # tmp116, <variable>.m_i
fmull (%edx,%edi) #
fstpl (%edx,%esi) #
jle .L6 #,
but this is still far from optimal and from the manually optimized
version, i.e. it still doesn't avoid creating the temporary iterator object
to hold i-1/i+1.
So after all, while it's getting better, it's still nowhere near
satisfactory.
Richard.
* Re: optimization/6883: Fails to optimize temporary objects.
From: Dara Hazeghi @ 2003-05-10 23:36 UTC
To: nobody; +Cc: gcc-prs
The following reply was made to PR optimization/6883; it has been noted by GNATS.
From: Dara Hazeghi <dhazeghi@yahoo.com>
To: gcc-gnats@gcc.gnu.org, rguenth@tat.physik.uni-tuebingen.de
Cc:
Subject: Re: optimization/6883: Fails to optimize temporary objects.
Date: Sat, 10 May 2003 16:29:36 -0700
http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=6883
Hello,
this bug was reported against gcc 3.1. Would it be possible to test
your testcase against a more current version of gcc (i.e. 3.2.3 or 3.3
prerelease) and report back on the results? Thanks,
Dara
* optimization/6883: Fails to optimize temporary objects.
From: rguenth @ 2002-05-31 3:46 UTC
To: gcc-gnats
>Number: 6883
>Category: optimization
>Synopsis: Fails to optimize temporary objects.
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: unassigned
>State: open
>Class: pessimizes-code
>Submitter-Id: net
>Arrival-Date: Fri May 31 03:06:02 PDT 2002
>Closed-Date:
>Last-Modified:
>Originator: Richard Guenther <rguenth@tat.physik.uni-tuebingen.de>
>Release: gcc version 3.1.1 20020516 (prerelease)
>Organization:
>Environment:
Linux bellatrix 2.4.18-rc4-TATUP #2 Tue Mar 19 16:30:05 CET 2002 i686 unknown
>Description:
Temporary objects are not optimized away, which is a serious problem for medium-complex iterators. See attachment for an example.
The first loop body, which uses temporary objects, compiles (ix86, -O3 -fno-exceptions) to
movl -88(%ebp), %ecx
movl -80(%ebp), %eax
movl -140(%ebp), %esi
movl %ecx, -104(%ebp)
movl -56(%ebp), %ebx
leal 0(,%eax,8), %edx
movl %ecx, -120(%ebp)
movl -40(%ebp), %edi
movl -140(%ebp), %ecx
movl %esi, -100(%ebp)
leal -1(%eax), %esi
incl %eax
movl %eax, -112(%ebp)
addl %edx, %edi
movl %esi, -96(%ebp)
movl %ecx, -116(%ebp)
fldl (%ebx,%eax,8)
movl -72(%ebp), %eax
faddl (%ebx,%esi,8)
addl %eax, %edx
fmull (%edx)
fstpl (%edi)
which fails to optimize away the temporary objects it creates. It should be optimized to code equivalent to that of the second implementation, which performs this optimization manually:
movl -128(%ebp), %edi
movl -56(%ebp), %eax
movl -40(%ebp), %esi
movl -72(%ebp), %ecx
leal 0(,%edi,8), %ebx
addl %ebx, %esi
fldl 8(%eax,%edi,8)
addl %ecx, %ebx
faddl -8(%eax,%edi,8)
fmull (%ebx)
fstpl (%esi)
This problem slows down code that uses iterators like this by about a factor of 3 compared to equivalent C code, which is a serious problem.
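To make the cost concrete, here is a hedged sketch that counts how many
temporary Iterator objects each loop form asks for at the source level. The
static counter s_temporaries and the functions count_first_form and
count_second_form are hypothetical names used only for this illustration; the
classes are simplified from the attachment and use std::vector for storage.
The counter is bumped once per operator+/operator- call, so the numbers do not
depend on whether the compiler elides the returned copy.

```cpp
#include <cassert>
#include <vector>

class Iterator {
public:
    Iterator(int i0, int i1) : m_i0(i0), m_i1(i1), m_i(i0) {}

    // Each call builds a temporary carrying all three ints (m_i0, m_i1, m_i).
    Iterator operator+(int d) const {
        ++s_temporaries;
        Iterator it(*this);
        it.m_i += d;
        return it;
    }
    Iterator operator-(int d) const {
        ++s_temporaries;
        Iterator it(*this);
        it.m_i -= d;
        return it;
    }
    bool operator++() { return ++m_i <= m_i1; }
    int getIndex() const { return m_i; }

    static int s_temporaries;  // hypothetical instrumentation counter

private:
    const int m_i0, m_i1;
    int m_i;
};
int Iterator::s_temporaries = 0;

class Array {
public:
    explicit Array(int size) : m_v(size, 0.0) {}
    double& operator()(const Iterator& i) { return at(i.getIndex()); }
    double& operator()(const Iterator& i, int d) { return at(i.getIndex() + d); }

private:
    double& at(int k) {
        assert(k >= 0 && k < static_cast<int>(m_v.size()));
        return m_v[k];
    }
    std::vector<double> m_v;
};

// First form: b(i-1) and b(i+1) each request a temporary Iterator.
int count_first_form() {
    Array a(256), b(256), c(256);
    Iterator::s_temporaries = 0;
    Iterator i(1, 254);
    do {
        a(i) = (b(i - 1) + b(i + 1)) * c(i);
    } while (++i);
    return Iterator::s_temporaries;
}

// Second form: the offset is passed as a plain int, no temporaries.
int count_second_form() {
    Array a(256), b(256), c(256);
    Iterator::s_temporaries = 0;
    Iterator i(1, 254);
    do {
        a(i) = (b(i, -1) + b(i, +1)) * c(i);
    } while (++i);
    return Iterator::s_temporaries;
}
```

Over the 254 iterations the first form requests two temporaries per
iteration and the second form none. An optimizer that scalarizes these small
objects could in principle reduce both forms to the same code.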
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
----gnatsweb-attachment----
Content-Type: text/plain; name="iterator-gnats.cpp"
Content-Disposition: inline; filename="iterator-gnats.cpp"
class Iterator {
public:
    Iterator(int i0, int i1)
        : m_i0(i0), m_i1(i1)
    {
        m_i = i0;
    }
    Iterator operator+(int i) const {
        Iterator it(*this);
        it.m_i += i;
        return it;
    }
    Iterator operator-(int i) const {
        Iterator it(*this);
        it.m_i -= i;
        return it;
    }
    bool operator++()
    {
        m_i++;
        if (m_i > m_i1)
            return false;
        return true;
    }
    int getIndex() const { return m_i; }
private:
    const int m_i0, m_i1;
    int m_i;
};
class Array {
public:
    // Allocate an array of `size` zero-initialized doubles.
    Array(int size)
        : m_v(new double[size]()) {}
    ~Array() { delete[] m_v; }
    double& operator()(const Iterator& i) { return m_v[i.getIndex()]; }
    double& operator()(const Iterator& i, int d) { return m_v[i.getIndex()+d]; }
private:
    double * const m_v;
};
int main(int argc, char **argv)
{
    Array a(256);
    Array b(256);
    Array c(256);
    // temporary objects not optimized
    {
        Iterator i(1, 254);
        do {
            a(i) = (b(i-1)+b(i+1))*c(i);
        } while (++i);
    }
    // optimized ok
    {
        Iterator i(1, 254);
        do {
            a(i) = (b(i,-1)+b(i,+1))*c(i);
        } while (++i);
    }
}