* Re: Useless code generated by gcc 2.95.2????
@ 1999-11-03 14:22 Mike Stump
1999-11-03 16:06 ` Martin v. Loewis
1999-11-30 23:37 ` Mike Stump
0 siblings, 2 replies; 18+ messages in thread
From: Mike Stump @ 1999-11-03 14:22 UTC (permalink / raw)
To: deweese, martin; +Cc: gcc
LL5 is dead and should be removed. Region 3 is near trivial and it
might be able to be removed, but it is in some sense, correct. LL11
is used to protect LL3. Technically operator delete can't throw, and
if we knew that and used that information then we could get rid of
LL11. Last time I worked on the code, it didn't have that type of
information at its disposal (meaning LL11 could not be deleted).
I don't think some of the extra code is caused by a bug, but rather
the addition of a feature. LL5 non-removal however is a bug, that I
suspect someone introduced. A binary search of CVS could tell who
broke it and how they broke it. That would then prompt them into
fixing it (maybe).
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Useless code generated by gcc 2.95.2????
1999-11-03 14:22 Useless code generated by gcc 2.95.2???? Mike Stump
@ 1999-11-03 16:06 ` Martin v. Loewis
1999-11-06 7:36 ` Thomas E Deweese
1999-11-30 23:37 ` Martin v. Loewis
1999-11-30 23:37 ` Mike Stump
1 sibling, 2 replies; 18+ messages in thread
From: Martin v. Loewis @ 1999-11-03 16:06 UTC (permalink / raw)
To: mrs; +Cc: deweese, gcc
> I don't think some of the extra code is caused by a bug, but rather
> the addition of a feature. LL5 non-removal however is a bug, that I
> suspect someone introduced. A binary search of CVS could tell who
> broke it and how they broke it. That would then prompt them into
> fixing it (maybe).
Indeed. As we've found later on, the mainline *does* remove the
additional __builtin_delete call.
I'm not sure whether this really is a regression over egcs 1.1 - in
egcs 1.1, the processing of the implicit call to the deallocator was
quite different. So it may not be the case that egcs 2.9x (with x>3,
x<6) ever did the right thing. FWIW, the change activating the current
front-end behavior was
1998-10-22 Martin von Löwis <loewis@informatik.hu-berlin.de>
* init.c (build_new_1): Delay cleanup until end of full expression.
So if the binary search yields the source of the problem, I agree that
the relevant patch should be considered for 2.95.3 (as 2.95 now
produces a significant increase in code size).
Regards,
Martin
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Useless code generated by gcc 2.95.2????
1999-11-03 16:06 ` Martin v. Loewis
@ 1999-11-06 7:36 ` Thomas E Deweese
1999-11-30 23:37 ` Thomas E Deweese
1999-11-30 23:37 ` Martin v. Loewis
1 sibling, 1 reply; 18+ messages in thread
From: Thomas E Deweese @ 1999-11-06 7:36 UTC (permalink / raw)
To: Martin v. Loewis; +Cc: mrs, deweese, gcc
What follows is a summary of discussions between Martin and
myself which have uncovered a cleaner way to handle calls to new.
I'm also willing to try my hand at implementing the change but
I could definitely use a few pointers.
>>>>> "ML" == Martin v Loewis <martin@mira.isdn.cs.tu-berlin.de> writes:
>> I don't think some of the extra code is caused by a bug, but rather
>> the addition of a feature. LL5 non-removal however is a bug, that
>> I suspect someone introduced. A binary search of CVS could tell
>> who broke it and how they broke it. That would then prompt them
>> into fixing it (maybe).
ML> Indeed. As we've found later on, the mainline *does* remove the
ML> additional __builtin_delete call.
ML> I'm not sure whether this really is a regression over egcs 1.1 -
ML> in egcs 1.1, the processing of the implicit call to the
ML> deallocator was quite different. So it may not be the case that
ML> egcs 2.9x (with x>3, x<6) ever did the right thing. FWIW, the
ML> change activating the current front-end behavior was
ML> 1998-10-22 Martin von Löwis <loewis@informatik.hu-berlin.de>
ML> * init.c (build_new_1): Delay cleanup until end of full
ML> expression.
Martin and I have discussed the reasons behind the change
introduced above. It concerns the lifetime of temporaries passed to a
constructor invoked via new, where the result of the new expression is
itself an argument to a function call. Here is the example we used:
f(new Foo(A(), B(), ...), ...);
In particular, the lifetime of the temporaries A() and B() must extend
until 'f' returns (which wasn't true in egcs 1.1). This turns out to be
very tricky to accomplish given the requirement that any memory
allocated by new must be deallocated if Foo::Foo, A::A, or B::B throws
an exception. Consider the simplest expansion of this code:
    {
        Foo *tmp = ::new(sizeof(Foo), ...);
    +-  +-
    |   |   A a;
    |   1   B b;
    2   |   Foo::Foo(tmp, a, b, ...);
    |   +-
    |   f(tmp, ...)
    +-----}
You need to catch and delete 'tmp' in region 1, but the
lifetime of a and b needs to be region 2. This is not trivial (possible?)
to communicate to the back end, and it is impossible for normal C++ code
to generate such a situation.
The patch referenced above introduced a sentry variable which
prevented the error handler from deleting the allocated memory after
region 1 completed, although the error handler was in effect for all
of region 2.
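The sentry approach can be approximated in ordinary C++. What follows is only
a hand-written sketch, not the code that init.c generates: the names trace,
need_delete, sentry_expansion, run_sentry and demo_sentry are invented for the
illustration, and f is assumed to take ownership of the object. One handler
covers all of region 2, and the flag restricts the deallocation to failures
that occur in region 1.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>
#include <string>

// All names below are hypothetical and exist only for this illustration.
static std::string trace;  // records the order of events

struct A {
    A() { trace += "A "; }
    ~A() { trace += "~A "; }
};

struct Foo {
    Foo(const A&, bool fail) {
        if (fail) throw std::runtime_error("Foo::Foo failed");
        trace += "Foo ";
    }
};

// Assumption: f takes ownership of the object it receives.
void f(Foo* p, bool fail) {
    trace += "f ";
    delete p;
    if (fail) throw std::runtime_error("f failed");
}

// Hand-written analogue of: f(new Foo(A(), ...), ...)
void sentry_expansion(bool ctor_fails, bool f_fails) {
    void* raw = ::operator new(sizeof(Foo));
    bool need_delete = true;  // the sentry
    try {                     // region 2: the temporary and the call to f
        A a;
        Foo* tmp = ::new (raw) Foo(a, ctor_fails);  // region 1 ends here
        need_delete = false;  // storage now belongs to the constructed object
        f(tmp, f_fails);
    } catch (...) {
        if (need_delete) {    // true only when the constructor threw
            ::operator delete(raw);
            trace += "freed ";
        }
        throw;
    }
}

// Runs one scenario and returns the recorded trace.
std::string run_sentry(bool ctor_fails, bool f_fails) {
    trace.clear();
    try {
        sentry_expansion(ctor_fails, f_fails);
    } catch (const std::runtime_error&) {
        trace += "caught ";
    }
    return trace;
}

void demo_sentry() {
    assert(run_sentry(false, false) == "A Foo f ~A ");
    assert(run_sentry(true, false) == "A ~A freed caught ");
    assert(run_sentry(false, true) == "A Foo f ~A caught ");
}
```

Calling demo_sentry() from main exercises the three paths. Note that when f
throws, the handler runs but frees nothing, which is the behaviour the sentry
exists to guarantee.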
Fortunately, the C++ standard allows some freedom in
rewriting this simple version of the code. In particular, you are free
to reorder the call to new relative to the construction of the arguments
for the constructor [see 5.3.4 paragraph 21 in the final C++ Spec
(paragraph 22 in CD2)]:
Whether the allocation function is called before
evaluating the constructor arguments or after evaluating the
constructor arguments but before entering the constructor is
unspecified. It is also unspecified whether the arguments to a
constructor are evaluated if the allocation function returns
the null pointer or exits using an exception.
Given this we can safely rewrite the above code as:
    {
        A a;
        B b;
        Foo *tmp = ::new(sizeof(Foo), ...);
        try {
            Foo::Foo(tmp, a, b);
        } catch(...) {
            ::delete(tmp, ...);
            throw;    // re-throw so that f is never called with freed memory
        }
        f(tmp, ...);
    }
In this case if the construction of 'a', or 'b' throws an
error you don't need to catch the error and free the memory since you
haven't allocated it yet!!!
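As a compilable approximation of the reordered expansion, the sketch below
builds the argument first, allocates second, and guards only the constructor
call. It is an illustration under assumptions rather than the proposed
front-end change: Arg, Widget, consume, events, reordered_expansion,
run_reordered and demo_reordered are all made-up names, and consume is assumed
to take ownership of the object.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>
#include <string>

// All names below are hypothetical and exist only for this illustration.
static std::string events;  // records the order of events

struct Arg {
    explicit Arg(bool fail) {
        if (fail) throw std::runtime_error("Arg failed");
        events += "Arg ";
    }
    ~Arg() { events += "~Arg "; }
};

struct Widget {
    Widget(const Arg&, bool fail) {
        if (fail) throw std::runtime_error("Widget failed");
        events += "Widget ";
    }
};

// Assumption: consume takes ownership of the object it receives.
void consume(Widget* w) {
    events += "consume ";
    delete w;
}

// Hand-written analogue of: consume(new Widget(Arg(...), ...))
void reordered_expansion(bool arg_fails, bool ctor_fails) {
    Arg a(arg_fails);  // arguments first: nothing to free if this throws
    void* raw = ::operator new(sizeof(Widget));
    events += "alloc ";
    Widget* tmp = nullptr;
    try {              // only the constructor call needs a handler
        tmp = ::new (raw) Widget(a, ctor_fails);
    } catch (...) {
        ::operator delete(raw);
        events += "freed ";
        throw;
    }
    consume(tmp);
}                      // a is destroyed here, after consume has returned

// Runs one scenario and returns the recorded events.
std::string run_reordered(bool arg_fails, bool ctor_fails) {
    events.clear();
    try {
        reordered_expansion(arg_fails, ctor_fails);
    } catch (const std::runtime_error&) {
        events += "caught ";
    }
    return events;
}

void demo_reordered() {
    assert(run_reordered(false, false) == "Arg alloc Widget consume ~Arg ");
    assert(run_reordered(true, false) == "caught ");
    assert(run_reordered(false, true) == "Arg alloc freed ~Arg caught ");
}
```

No flag is needed in this form: the handler is active only while the
constructor runs, so it can free the storage unconditionally, and the argument
still outlives the call to consume.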
This should (hopefully) lead to much less convoluted code in
all cases. However, now that I have a possible solution I would
appreciate pointers to any places where similar sorts of rewriting are
done, as well as where in the C++ front end such rewriting should be
done.
Any help and/or comments are appreciated.
--
Thomas DeWeese
deweese@kodak.com
"The only difference between theory and practice is
that in theory there isn't any." -- unknown
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Useless code generated by gcc 2.95.2????
1999-11-03 15:22 ` Joern Rennecke
1999-11-03 15:40 ` Richard Henderson
@ 1999-11-30 23:37 ` Joern Rennecke
1 sibling, 0 replies; 18+ messages in thread
From: Joern Rennecke @ 1999-11-30 23:37 UTC (permalink / raw)
To: Richard Henderson; +Cc: deweese, martin, gcc
> We make no attempt to remove dead code in the -O0 case.
> Which IMO is a mistake, but not one I've gotten around
> to correcting just yet.
If the code was written by the user, it should not be deleted at -O0.
Setting breakpoints and looking at the compiled code shouldn't yield
any surprises at -O0.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Useless code generated by gcc 2.95.2????
1999-11-03 4:47 ` Thomas E Deweese
1999-11-03 10:32 ` Richard Henderson
@ 1999-11-30 23:37 ` Thomas E Deweese
1 sibling, 0 replies; 18+ messages in thread
From: Thomas E Deweese @ 1999-11-30 23:37 UTC (permalink / raw)
To: Martin v. Loewis; +Cc: gcc
>>>>> "ML" == Martin v Loewis <martin@mira.isdn.cs.tu-berlin.de> writes:
>> Based on my observations I have generated a small test case:
ML> I have not tried to understand in detail why this code is
ML> emitted. It could be either a duplication of exception handlers,
ML> or it could cover some case that does not appear in your
ML> example.
This was part of the reason I posted the note. Flow
analysis obviously gets complicated around exception handling. While
the code looked dead I know I didn't/don't know enough to be sure.
ML> Please understand that I'm not terribly worried about this code. I
ML> compiled your code with optimization (-O2); the assembler code
ML> generated for that is attached. Long and obfuscated machine code
ML> in non-optimizing mode is not strictly a bug - incorrect code
ML> would be a bug. Also, dead code in optimizing mode could be a bug.
I can understand your interest in fixing 'bigger things',
given the speed with which C++ has evolved in the recent past.
However, emitting significant dead code is pretty bad, even in the
debug case.
With respect to your last sentence, with a small change to
the example (one I didn't think to try until after I sent the message) the
dead code remains at all optimization levels. I'm fairly sure that
this happens because the compiler can no longer determine that no
error can be thrown (via inlining).
ML> If you want to understand in detail why this code is generated,
ML> you may try using the "-da" option. This dumps a number of
ML> internal structures (RTL) into files. Those structures are later
ML> used to generate assembler output.
Thanks for the info, I may do this.
I'll let you know if I come up with anything...
------------------------------------------------------------------------
Updated example that leaves the duplicate code at all '-O' levels:
------------------------------------------------------------------------
#include <stdlib.h>
struct Foo {
Foo(); // Note: Body no longer provided inline (force 'call').
};
Foo *fu8=NULL;
void reset(void) {
fu8 = new Foo();
}
------------------------------------------------
Asm Output compiled: g++ -O3 -S Foo.cc
Note: LL3 & LL5
------------------------------------------------
reset__Fv:
.LLFB1:
!#PROLOGUE# 0
save %sp, -112, %sp
.LLCFI0:
!#PROLOGUE# 1
call __builtin_new, 0
mov 1, %o0
mov %o0, %l0
.LLEHB3:
call __3Foo, 0
mov 1, %l1
sethi %hi(fu8), %o1
st %o0, [%o1+%lo(fu8)]
.LLEHE3:
b .LL5
mov 0, %l1
.LL4:
call __throw, 0
nop
.LL5:
cmp %l1, 0
be .LL14
nop
call __builtin_delete, 0
mov %l0, %o0
b,a .LL14
.LLEHB11:
.LL3:
cmp %l1, 0
be .LL4
nop
call __builtin_delete, 0
mov %l0, %o0
b,a .LL4
.LLEHE11:
.LL11:
call terminate__Fv, 0
nop
.LL14:
ret
restore
--
Thomas DeWeese
deweese@kodak.com
"The only difference between theory and practice is
that in theory there isn't any." -- unknown
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Useless code generated by gcc 2.95.2????
1999-11-03 10:32 ` Richard Henderson
1999-11-03 15:22 ` Joern Rennecke
@ 1999-11-30 23:37 ` Richard Henderson
1 sibling, 0 replies; 18+ messages in thread
From: Richard Henderson @ 1999-11-30 23:37 UTC (permalink / raw)
To: Thomas E Deweese; +Cc: Martin v. Loewis, gcc
On Wed, Nov 03, 1999 at 07:47:15AM -0500, Thomas E Deweese wrote:
> However, emitting significant dead code is pretty bad, even in the
> debug case.
We make no attempt to remove dead code in the -O0 case.
Which IMO is a mistake, but not one I've gotten around
to correcting just yet.
r~
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Useless code generated by gcc 2.95.2????
1999-11-03 15:40 ` Richard Henderson
@ 1999-11-30 23:37 ` Richard Henderson
0 siblings, 0 replies; 18+ messages in thread
From: Richard Henderson @ 1999-11-30 23:37 UTC (permalink / raw)
To: Joern Rennecke; +Cc: deweese, martin, gcc
On Wed, Nov 03, 1999 at 11:20:40PM +0000, Joern Rennecke wrote:
> If the code was written by the user, it should not be deleted at -O0.
> setting breakpoints and looking at the compiled code shouldn't yield
> any surprises at -O0.
Oh please.
We can quite safely provide a single unreachable nop
(instead of multiple kilobytes of code) on which to
set such worthless breakpoints.
r~
^ permalink raw reply [flat|nested] 18+ messages in thread
* Useless code generated by gcc 2.95.2????
1999-11-01 13:43 Thomas E Deweese
1999-11-02 23:46 ` Martin v. Loewis
@ 1999-11-30 23:37 ` Thomas E Deweese
1 sibling, 0 replies; 18+ messages in thread
From: Thomas E Deweese @ 1999-11-30 23:37 UTC (permalink / raw)
To: gcc
Hi all,
We are looking to upgrade from egcs 1.0.2 to gcc 2.95.2, and
we noticed that object file sizes have gone up significantly (~50%)
between the two releases (On Sparc). So, I thought I would take a
look at some of the generated asm. I noticed what looked to be dead
code that was generated. Based on my observations I have generated a
small test case:
------------------------------------------------------------------------
#include <stdlib.h>
template <class T>
class Foo {
public:
Foo() { }
};
Foo<unsigned char> *fu8=NULL;
void reset(void) {
fu8 = new Foo<unsigned char>();
}
------------------------------------------------------------------------
I have included complete asm and verbose compile dumps at the end of
the message, but I have brought the part that most interests me up
here. My basic question is: How can it ever reach '.LL5' with %l1
not equal to zero?
It looks to me like the deletes following '.LL5' and '.LL3'
are duplicates and that the delete at '.LL5' can never be reached (and
if it were to be reached the result would be a double delete).
Here is my thinking:
1) The only way to reach '.LL5' without having cleared %l1 is
for an error to be thrown in the call to the constructor.
2) This (if I understand how error handling is done) would lead
to a jump to '.LL3' which _would_ delete the 'Foo', then
jump to the call to __throw at '.LL4'.
3) If that were to ever return (which I don't think it will)
then the delete following '.LL5' would also be called (a
double delete of the contents of %l0).
Is there some unseen way for the system to branch to
'.LLEHE3'? (through the error handler?)
Am I missing something, or is that truly dead code?
I don't pretend to be an expert on the internals of g++, asm,
or gcc's error handling mechanisms, but that duplicate code just looks
suspicious to me. Any help on this would be greatly appreciated.
I doubt this would fully account for my 50% code growth but
it's a start.
------------------------------------------------------------------------
reset__Fv:
.LLFB1:
!#PROLOGUE# 0
save %sp, -112, %sp
.LLCFI0:
!#PROLOGUE# 1
sethi %hi(fu8), %o0
or %o0, %lo(fu8), %l2
mov 1, %o0
call __builtin_new, 0
nop
mov %o0, %l0
mov 1, %l1
.LLEHB3:
mov %l0, %o0
call __t3Foo1ZUc, 0
nop
mov 0, %l1
st %o0, [%l2]
.LLEHE3:
b .LL5
nop
.LL4:
call __throw, 0
nop
.LL5:
cmp %l1, 0
be .LL7
nop
mov %l0, %o0
call __builtin_delete, 0
nop
b .LL7
nop
.LL6:
.LL7:
b .LL10
nop
.LLEHB11:
.LL3:
cmp %l1, 0
be .LL9
nop
mov %l0, %o0
call __builtin_delete, 0
nop
b .LL9
nop
.LL8:
.LL9:
b .LL4
nop
------------------------------------------------------------------------
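One way to read the excerpt above is as the following C++ control flow. This
is an interpretation of the listing, not compiler output: Obj, reset_like,
demo_reset and the two counters are invented names, flag stands in for %l1,
and the comments map each step to the label it is assumed to correspond to.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>

// All names below are hypothetical and exist only for this illustration.
struct Obj {
    explicit Obj(bool fail) {
        if (fail) throw std::runtime_error("constructor failed");
    }
};

static int normal_path_deletes = 0;  // counts the delete after .LL5
static int handler_deletes = 0;      // counts the delete after .LL3

Obj* reset_like(bool ctor_fails) {
    void* raw = ::operator new(sizeof(Obj));  // call __builtin_new
    bool flag = true;                         // mov 1, %l1
    Obj* result = nullptr;
    try {                                     // .LLEHB3 ... .LLEHE3
        result = ::new (raw) Obj(ctor_fails); // call to the constructor
        flag = false;                         // mov 0, %l1
    } catch (...) {                           // .LL3: handler for the region
        if (flag) {
            ::operator delete(raw);
            ++handler_deletes;
        }
        throw;                                // .LL4: call __throw
    }
    if (flag) {                               // .LL5: flag is always clear here
        ::operator delete(raw);
        ++normal_path_deletes;
    }
    return result;
}

void demo_reset() {
    normal_path_deletes = 0;
    handler_deletes = 0;

    Obj* p = reset_like(false);               // normal path
    delete p;

    bool threw = false;
    try {
        reset_like(true);                     // exceptional path
    } catch (const std::runtime_error&) {
        threw = true;
    }

    assert(threw);
    assert(handler_deletes == 1);
    assert(normal_path_deletes == 0);         // the .LL5 delete never runs
}
```

Under this reading, the only fall-through into the second check happens after
the flag has been cleared, so its delete can never execute.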
I compiled my example with the following line:
g++ -fverbose-asm --verbose Foo.cc -o Foo-295-3.s -S
Reading specs from /freeware/gnu/gcc-2.95.2/sun/lib/gcc-lib/sparc-sun-solaris2.5.1/2.95.2/specs
gcc version 2.95.2 19991024 (release)
/freeware/gnu/gcc-2.95.2/sun/lib/gcc-lib/sparc-sun-solaris2.5.1/2.95.2/cpp -lang-c++ -v -D__GNUC__=2 -D__GNUG__=2 -D__GNUC_MINOR__=95 -D__cplusplus -Dsparc -Dsun -Dunix -D__svr4__ -D__SVR4 -D__sparc__ -D__sun__ -D__unix__ -D__svr4__ -D__SVR4 -D__sparc -D__sun -D__unix -Asystem(unix) -Asystem(svr4) -D__EXCEPTIONS -D__GCC_NEW_VARARGS__ -Acpu(sparc) -Amachine(sparc) Foo.cc /home/deweese/tmp/ccIwWesd.ii
GNU CPP version 2.95.2 19991024 (release) (sparc)
[...]
/freeware/gnu/gcc-2.95.2/sun/lib/gcc-lib/sparc-sun-solaris2.5.1/2.95.2/cc1plus /home/deweese/tmp/ccIwWesd.ii -quiet -dumpbase Foo.cc -version -fverbose-asm -o Foo-295-3.s
GNU C++ version 2.95.2 19991024 (release) (sparc-sun-solaris2.5.1) compiled by GNU C version 2.95.2 19991024 (release).
------------------------------------------------------------------------
The generated assembly:
.file "Foo.cc"
! GNU C++ version 2.95.2 19991024 (release) (sparc-sun-solaris2.5.1) compiled by GNU C version 2.95.2 19991024 (release).
! options passed: -fverbose-asm
! options enabled: -fpeephole -ffunction-cse -fkeep-static-consts
! -fpcc-struct-return -fsched-interblock -fsched-spec -fexceptions -fcommon
! -fverbose-asm -fgnu-linker -fargument-alias -fident -mepilogue -mapp-regs
gcc2_compiled.:
.global fu8
.section ".data"
.align 4
.type fu8,#object
.size fu8,4
fu8:
.uaword 0
.global __throw
.section ".text"
.align 4
.global reset__Fv
.type reset__Fv,#function
.proc 020
reset__Fv:
.LLFB1:
!#PROLOGUE# 0
save %sp, -112, %sp
.LLCFI0:
!#PROLOGUE# 1
sethi %hi(fu8), %o0
or %o0, %lo(fu8), %l2
mov 1, %o0
call __builtin_new, 0
nop
mov %o0, %l0
mov 1, %l1
.LLEHB3:
mov %l0, %o0
call __t3Foo1ZUc, 0
nop
mov 0, %l1
st %o0, [%l2]
.LLEHE3:
b .LL5
nop
.LL4:
call __throw, 0
nop
.LL5:
cmp %l1, 0
be .LL7
nop
mov %l0, %o0
call __builtin_delete, 0
nop
b .LL7
nop
.LL6:
.LL7:
b .LL10
nop
.LLEHB11:
.LL3:
cmp %l1, 0
be .LL9
nop
mov %l0, %o0
call __builtin_delete, 0
nop
b .LL9
nop
.LL8:
.LL9:
b .LL4
nop
.LLEHE11:
b .LL11
nop
.LL12:
call __throw, 0
nop
.LL13:
.LL11:
call terminate__Fv, 0
nop
.LL10:
b .LL14
nop
b .LL2
nop
.LL14:
.LL2:
ret
restore
.LLFE1:
.LLfe1:
.size reset__Fv,.LLfe1-reset__Fv
.section ".gcc_except_table",#alloc,#write
.align 4
__EXCEPTION_TABLE__:
.uaword .LLEHB3
.uaword .LLEHE3
.uaword .LL3
.uaword .LLEHB11
.uaword .LLEHE11
.uaword .LL11
.LLRTH1:
.uaword -1
.uaword -1
.section ".eh_frame",#alloc,#write
__FRAME_BEGIN__:
.uaword .LLECIE1-.LLSCIE1
.LLSCIE1:
.uaword 0x0
.byte 0x1
.asciz "eh"
.uaword __EXCEPTION_TABLE__
.byte 0x1
.byte 0x7c
.byte 0x65
.byte 0xc
.byte 0xe
.byte 0x0
.byte 0x9
.byte 0x65
.byte 0xf
.align 4
.LLECIE1:
.uaword .LLEFDE1-.LLSFDE1
.LLSFDE1:
.uaword .LLSFDE1-__FRAME_BEGIN__
.uaword .LLFB1
.uaword .LLFE1-.LLFB1
.byte 0x4
.uaword .LLCFI0-.LLFB1
.byte 0xd
.byte 0x1e
.byte 0x2d
.byte 0x9
.byte 0x65
.byte 0x1f
.align 4
.LLEFDE1:
.ident "GCC: (GNU) 2.95.2 19991024 (release)"
--
Thomas DeWeese
deweese@kodak.com
"The only difference between theory and practice is
that in theory there isn't any." -- unknown
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Useless code generated by gcc 2.95.2????
1999-11-02 23:46 ` Martin v. Loewis
1999-11-03 4:47 ` Thomas E Deweese
@ 1999-11-30 23:37 ` Martin v. Loewis
1 sibling, 0 replies; 18+ messages in thread
From: Martin v. Loewis @ 1999-11-30 23:37 UTC (permalink / raw)
To: deweese; +Cc: gcc
> We are looking to upgrade from egcs 1.0.2 to gcc 2.95.2, and
> we noticed that object file sizes have gone up significantly (~50%)
> between the two releases (On Sparc). So, I thought I would take a
> look at some of the generated asm. I noticed what looked to be dead
> code that was generated. Based on my observations I have generated a
> small test case:
Thanks for your bug report. My (shallow) analysis agrees with yours;
it indeed appears that this is dead code. Notice how the code
following LL5 is similar to the one following LL3.
I have not tried to understand in detail why this code is emitted. It
could be either a duplication of exception handlers, or it could
cover some case that does not appear in your example.
Please understand that I'm not terribly worried about this code. I
compiled your code with optimization (-O2); the assembler code
generated for that is attached. Long and obfuscated machine code in
non-optimizing mode is not strictly a bug - incorrect code would be a
bug. Also, dead code in optimizing mode could be a bug.
If you want to understand in detail why this code is generated, you
may try using the "-da" option. This dumps a number of internal
structures (RTL) into files. Those structures are later used to
generate assembler output.
Hope this helps,
Martin
reset__Fv:
.LLFB1:
!#PROLOGUE# 0
save %sp, -112, %sp
.LLCFI0:
!#PROLOGUE# 1
mov 1, %o0
call __builtin_new, 0
sethi %hi(fu8), %l0
st %o0, [%l0+%lo(fu8)]
ret
restore
.LLFE1:
.LLfe1:
.size reset__Fv,.LLfe1-reset__Fv
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Useless code generated by gcc 2.95.2????
1999-11-02 23:46 ` Martin v. Loewis
@ 1999-11-03 4:47 ` Thomas E Deweese
1999-11-03 10:32 ` Richard Henderson
1999-11-30 23:37 ` Thomas E Deweese
1999-11-30 23:37 ` Martin v. Loewis
1 sibling, 2 replies; 18+ messages in thread
From: Thomas E Deweese @ 1999-11-03 4:47 UTC (permalink / raw)
To: Martin v. Loewis; +Cc: gcc
>>>>> "ML" == Martin v Loewis <martin@mira.isdn.cs.tu-berlin.de> writes:
>> Based on my observations I have generated a small test case:
ML> I have not tried to understand in detail why this code is
ML> emitted. It could be either a duplication of exception handlers,
ML> or it could be cover some case that does not appear in your
ML> example.
This was part of the reason I posted the note. Flow
analysis obviously gets complicated around exception handling. While
the code looked dead I know I didn't/don't know enough to be sure.
ML> Please understand that I'm not terribly worried about this code. I
ML> compiled your code with optimization (-O2); the assembler code
ML> generated for that is attached. Long and obfuscated machine code
ML> in non-optimizing mode is not strictly a bug - incorrect code
ML> would be a bug. Also, dead code in optimizing mode could be a bug.
I can understand your interest in fixing 'bigger things',
given the speed with which C++ has evolved in the recent past.
However, emitting significant dead code is pretty bad, even in the
debug case.
With respect to your last sentence, with a small change to
the example (one I didn't think to try until after I sent message) the
dead code remains at all optimization levels. I'm fairly sure that
this happens because the compiler can no longer determine that no
error can be thrown (via inlining).
ML> If you want to understand in detail why this code is generated,
ML> you may try using the "-da" option. This dumps a number of
ML> internal structures (RTL) into files. Those structures are later
ML> used to generate assembler output.
Thanks for the info, I may do this.
I'll let you know if I come up with anything...
------------------------------------------------------------------------
Updated example that leaves the duplicate code at all '-O' levels:
------------------------------------------------------------------------
#include <stdlib.h>
struct Foo {
Foo(); // Note: Body no longer provided inline (force 'call').
};
Foo *fu8=NULL;
void reset(void) {
fu8 = new Foo();
}
------------------------------------------------
Asm Output compiled: g++ -O3 -S Foo.cc
Note: LL3 & LL5
------------------------------------------------
reset__Fv:
.LLFB1:
!#PROLOGUE# 0
save %sp, -112, %sp
.LLCFI0:
!#PROLOGUE# 1
call __builtin_new, 0
mov 1, %o0
mov %o0, %l0
.LLEHB3:
call __3Foo, 0
mov 1, %l1
sethi %hi(fu8), %o1
st %o0, [%o1+%lo(fu8)]
.LLEHE3:
b .LL5
mov 0, %l1
.LL4:
call __throw, 0
nop
.LL5:
cmp %l1, 0
be .LL14
nop
call __builtin_delete, 0
mov %l0, %o0
b,a .LL14
.LLEHB11:
.LL3:
cmp %l1, 0
be .LL4
nop
call __builtin_delete, 0
mov %l0, %o0
b,a .LL4
.LLEHE11:
.LL11:
call terminate__Fv, 0
nop
.LL14:
ret
restore
--
Thomas DeWeese
deweese@kodak.com
"The only difference between theory and practice is
that in theory there isn't any." -- unknown
* Re: Useless code generated by gcc 2.95.2????
1999-11-01 13:43 Thomas E Deweese
@ 1999-11-02 23:46 ` Martin v. Loewis
1999-11-03 4:47 ` Thomas E Deweese
1999-11-30 23:37 ` Martin v. Loewis
1999-11-30 23:37 ` Thomas E Deweese
1 sibling, 2 replies; 18+ messages in thread
From: Martin v. Loewis @ 1999-11-02 23:46 UTC (permalink / raw)
To: deweese; +Cc: gcc
> We are looking to upgrade from egcs 1.0.2 to gcc 2.95.2, and
> we noticed that object file sizes have gone up significantly (~50%)
> between the two releases (On Sparc). So, I thought I would take a
> look at some of the generated asm. I noticed what looked to be dead
> code that was generated. Based on my observations I have generated a
> small test case:
Thanks for your bug report. My (shallow) analysis agrees with yours;
it indeed appears that this is dead code. Notice how the code
following LL5 is similar to the one following LL3.
I have not tried to understand in detail why this code is emitted. It
could be either a duplication of exception handlers, or it could
cover some case that does not appear in your example.
Please understand that I'm not terribly worried about this code. I
compiled your code with optimization (-O2); the assembler code
generated for that is attached. Long and obfuscated machine code in
non-optimizing mode is not strictly a bug - incorrect code would be a
bug. Also, dead code in optimizing mode could be a bug.
If you want to understand in detail why this code is generated, you
may try using the "-da" option. This dumps a number of internal
structures (RTL) into files. Those structures are later used to
generate assembler output.
Hope this helps,
Martin
reset__Fv:
.LLFB1:
!#PROLOGUE# 0
save %sp, -112, %sp
.LLCFI0:
!#PROLOGUE# 1
mov 1, %o0
call __builtin_new, 0
sethi %hi(fu8), %l0
st %o0, [%l0+%lo(fu8)]
ret
restore
.LLFE1:
.LLfe1:
.size reset__Fv,.LLfe1-reset__Fv
* Useless code generated by gcc 2.95.2????
@ 1999-11-01 13:43 Thomas E Deweese
1999-11-02 23:46 ` Martin v. Loewis
1999-11-30 23:37 ` Thomas E Deweese
0 siblings, 2 replies; 18+ messages in thread
From: Thomas E Deweese @ 1999-11-01 13:43 UTC (permalink / raw)
To: gcc
Hi all,
We are looking to upgrade from egcs 1.0.2 to gcc 2.95.2, and
we noticed that object file sizes have gone up significantly (~50%)
between the two releases (On Sparc). So, I thought I would take a
look at some of the generated asm. I noticed what looked to be dead
code that was generated. Based on my observations I have generated a
small test case:
------------------------------------------------------------------------
#include <stdlib.h>
template <class T>
class Foo {
public:
Foo() { }
};
Foo<unsigned char> *fu8=NULL;
void reset(void) {
fu8 = new Foo<unsigned char>();
}
------------------------------------------------------------------------
I have included complete asm and verbose compile dumps at the end of
the message, but I have brought the part that most interests me up
here. My basic question is: How can it ever reach '.LL5' with %l1
not equal to zero?
It looks to me like the deletes following '.LL5' and '.LL3'
are duplicates and that the delete at '.LL5' can never be reached (and
if it were to be reached the result would be a double delete).
Here is my thinking:
1) The only way to reach '.LL5' without having cleared %l1 is
for an error to be thrown in the call to the constructor.
2) This (if I understand how error handling is done) would lead
to a jump to '.LL3' which _would_ delete the 'Foo', then
jump to the call to __throw at '.LL4'.
3) If that were to ever return (which I don't think it will)
then the delete following '.LL5' would also be called (a
double delete of the contents of %l0).
Is there some unseen way for the system to branch to
'.LLEHE3'? (through the error handler?)
Am I missing something, or is that truly dead code?
I don't pretend to be an expert on the internals of g++, asm,
or gcc's error handling mechanisms, but that duplicate code just looks
suspicious to me. Any help on this would be greatly appreciated.
I doubt this would fully account for my 50% code growth but
it's a start.
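The control flow described in the three points above can be modelled
in C++ roughly as follows. This is a hand-written sketch, not what the
compiler emits internally: the mapping of statements to asm labels in
the comments is one reading of the listing, and the counters, the fail
switch and the function name model_new_foo are hypothetical additions
so that the behaviour can be checked.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>

// Counts calls so the "delete exactly once" claim can be verified.
struct Counters {
    int news;
    int deletes;
};
static Counters g = {0, 0};

struct Foo {
    static bool fail;  // hypothetical switch: make the constructor throw
    Foo() {
        if (fail) throw std::runtime_error("ctor failed");
    }
    static void *operator new(std::size_t n) {
        ++g.news;
        return ::operator new(n);
    }
    static void operator delete(void *p) {
        ++g.deletes;
        ::operator delete(p);
    }
};
bool Foo::fail = false;

// Model of `new Foo()`: allocate, construct, free the storage only if
// the constructor throws.
Foo *model_new_foo() {
    void *raw = Foo::operator new(sizeof(Foo));  // call __builtin_new
    bool need_cleanup = true;                    // mov 1, %l1
    try {                                        // .LLEHB3 ... .LLEHE3
        Foo *p = ::new (raw) Foo();              // call the constructor
        need_cleanup = false;                    // mov 0, %l1
        return p;                                // normal exit
    } catch (...) {                              // handler at .LL3
        if (need_cleanup) {
            Foo::operator delete(raw);           // call __builtin_delete
        }
        throw;                                   // .LL4: call __throw
    }
}
```

In this model the storage is freed on exactly one path (the handler),
and the normal exit never needs a second test of the flag, which is
why the test-and-delete after '.LL5' looks redundant.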
------------------------------------------------------------------------
reset__Fv:
.LLFB1:
!#PROLOGUE# 0
save %sp, -112, %sp
.LLCFI0:
!#PROLOGUE# 1
sethi %hi(fu8), %o0
or %o0, %lo(fu8), %l2
mov 1, %o0
call __builtin_new, 0
nop
mov %o0, %l0
mov 1, %l1
.LLEHB3:
mov %l0, %o0
call __t3Foo1ZUc, 0
nop
mov 0, %l1
st %o0, [%l2]
.LLEHE3:
b .LL5
nop
.LL4:
call __throw, 0
nop
.LL5:
cmp %l1, 0
be .LL7
nop
mov %l0, %o0
call __builtin_delete, 0
nop
b .LL7
nop
.LL6:
.LL7:
b .LL10
nop
.LLEHB11:
.LL3:
cmp %l1, 0
be .LL9
nop
mov %l0, %o0
call __builtin_delete, 0
nop
b .LL9
nop
.LL8:
.LL9:
b .LL4
nop
------------------------------------------------------------------------
I compiled my example with the following line:
g++ -fverbose-asm --verbose Foo.cc -o Foo-295-3.s -S
Reading specs from /freeware/gnu/gcc-2.95.2/sun/lib/gcc-lib/sparc-sun-solaris2.5.1/2.95.2/specs
gcc version 2.95.2 19991024 (release)
/freeware/gnu/gcc-2.95.2/sun/lib/gcc-lib/sparc-sun-solaris2.5.1/2.95.2/cpp -lang-c++ -v -D__GNUC__=2 -D__GNUG__=2 -D__GNUC_MINOR__=95 -D__cplusplus -Dsparc -Dsun -Dunix -D__svr4__ -D__SVR4 -D__sparc__ -D__sun__ -D__unix__ -D__svr4__ -D__SVR4 -D__sparc -D__sun -D__unix -Asystem(unix) -Asystem(svr4) -D__EXCEPTIONS -D__GCC_NEW_VARARGS__ -Acpu(sparc) -Amachine(sparc) Foo.cc /home/deweese/tmp/ccIwWesd.ii
GNU CPP version 2.95.2 19991024 (release) (sparc)
[...]
/freeware/gnu/gcc-2.95.2/sun/lib/gcc-lib/sparc-sun-solaris2.5.1/2.95.2/cc1plus /home/deweese/tmp/ccIwWesd.ii -quiet -dumpbase Foo.cc -version -fverbose-asm -o Foo-295-3.s
GNU C++ version 2.95.2 19991024 (release) (sparc-sun-solaris2.5.1) compiled by GNU C version 2.95.2 19991024 (release).
------------------------------------------------------------------------
The generated assembly:
.file "Foo.cc"
! GNU C++ version 2.95.2 19991024 (release) (sparc-sun-solaris2.5.1) compiled by GNU C version 2.95.2 19991024 (release).
! options passed: -fverbose-asm
! options enabled: -fpeephole -ffunction-cse -fkeep-static-consts
! -fpcc-struct-return -fsched-interblock -fsched-spec -fexceptions -fcommon
! -fverbose-asm -fgnu-linker -fargument-alias -fident -mepilogue -mapp-regs
gcc2_compiled.:
.global fu8
.section ".data"
.align 4
.type fu8,#object
.size fu8,4
fu8:
.uaword 0
.global __throw
.section ".text"
.align 4
.global reset__Fv
.type reset__Fv,#function
.proc 020
reset__Fv:
.LLFB1:
!#PROLOGUE# 0
save %sp, -112, %sp
.LLCFI0:
!#PROLOGUE# 1
sethi %hi(fu8), %o0
or %o0, %lo(fu8), %l2
mov 1, %o0
call __builtin_new, 0
nop
mov %o0, %l0
mov 1, %l1
.LLEHB3:
mov %l0, %o0
call __t3Foo1ZUc, 0
nop
mov 0, %l1
st %o0, [%l2]
.LLEHE3:
b .LL5
nop
.LL4:
call __throw, 0
nop
.LL5:
cmp %l1, 0
be .LL7
nop
mov %l0, %o0
call __builtin_delete, 0
nop
b .LL7
nop
.LL6:
.LL7:
b .LL10
nop
.LLEHB11:
.LL3:
cmp %l1, 0
be .LL9
nop
mov %l0, %o0
call __builtin_delete, 0
nop
b .LL9
nop
.LL8:
.LL9:
b .LL4
nop
.LLEHE11:
b .LL11
nop
.LL12:
call __throw, 0
nop
.LL13:
.LL11:
call terminate__Fv, 0
nop
.LL10:
b .LL14
nop
b .LL2
nop
.LL14:
.LL2:
ret
restore
.LLFE1:
.LLfe1:
.size reset__Fv,.LLfe1-reset__Fv
.section ".gcc_except_table",#alloc,#write
.align 4
__EXCEPTION_TABLE__:
.uaword .LLEHB3
.uaword .LLEHE3
.uaword .LL3
.uaword .LLEHB11
.uaword .LLEHE11
.uaword .LL11
.LLRTH1:
.uaword -1
.uaword -1
.section ".eh_frame",#alloc,#write
__FRAME_BEGIN__:
.uaword .LLECIE1-.LLSCIE1
.LLSCIE1:
.uaword 0x0
.byte 0x1
.asciz "eh"
.uaword __EXCEPTION_TABLE__
.byte 0x1
.byte 0x7c
.byte 0x65
.byte 0xc
.byte 0xe
.byte 0x0
.byte 0x9
.byte 0x65
.byte 0xf
.align 4
.LLECIE1:
.uaword .LLEFDE1-.LLSFDE1
.LLSFDE1:
.uaword .LLSFDE1-__FRAME_BEGIN__
.uaword .LLFB1
.uaword .LLFE1-.LLFB1
.byte 0x4
.uaword .LLCFI0-.LLFB1
.byte 0xd
.byte 0x1e
.byte 0x2d
.byte 0x9
.byte 0x65
.byte 0x1f
.align 4
.LLEFDE1:
.ident "GCC: (GNU) 2.95.2 19991024 (release)"
--
Thomas DeWeese
deweese@kodak.com
"The only difference between theory and practice is
that in theory there isn't any." -- unknown
Thread overview: 18+ messages
1999-11-03 14:22 Useless code generated by gcc 2.95.2???? Mike Stump
1999-11-03 16:06 ` Martin v. Loewis
1999-11-06 7:36 ` Thomas E Deweese
1999-11-30 23:37 ` Thomas E Deweese
1999-11-30 23:37 ` Martin v. Loewis
1999-11-30 23:37 ` Mike Stump
-- strict thread matches above, loose matches on Subject: below --
1999-11-01 13:43 Thomas E Deweese
1999-11-02 23:46 ` Martin v. Loewis
1999-11-03 4:47 ` Thomas E Deweese
1999-11-03 10:32 ` Richard Henderson
1999-11-03 15:22 ` Joern Rennecke
1999-11-03 15:40 ` Richard Henderson
1999-11-30 23:37 ` Richard Henderson
1999-11-30 23:37 ` Joern Rennecke
1999-11-30 23:37 ` Richard Henderson
1999-11-30 23:37 ` Thomas E Deweese
1999-11-30 23:37 ` Martin v. Loewis
1999-11-30 23:37 ` Thomas E Deweese