public inbox for gcc@gcc.gnu.org
* Re: Useless code generated by gcc 2.95.2????
@ 1999-11-03 14:22 Mike Stump
  1999-11-03 16:06 ` Martin v. Loewis
  1999-11-30 23:37 ` Mike Stump
  0 siblings, 2 replies; 18+ messages in thread
From: Mike Stump @ 1999-11-03 14:22 UTC (permalink / raw)
  To: deweese, martin; +Cc: gcc

LL5 is dead and should be removed.  Region 3 is near trivial and it
might be able to be removed, but it is, in some sense, correct.  LL11
is used to protect LL3.  Technically operator delete can't throw, and
if we knew that and used that information then we could get rid of
LL11.  Last time I worked on the code, it didn't have that type of
information at its disposal (meaning LL11 could not be deleted).

I don't think some of the extra code is caused by a bug, but rather
by the addition of a feature.  The non-removal of LL5, however, is a bug
that I suspect someone introduced.  A binary search of cvs could tell who
broke it and how they broke it.  That would then prompt them into
fixing it (maybe).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Useless code generated by gcc 2.95.2????
  1999-11-03 14:22 Useless code generated by gcc 2.95.2???? Mike Stump
@ 1999-11-03 16:06 ` Martin v. Loewis
  1999-11-06  7:36   ` Thomas E Deweese
  1999-11-30 23:37   ` Martin v. Loewis
  1999-11-30 23:37 ` Mike Stump
  1 sibling, 2 replies; 18+ messages in thread
From: Martin v. Loewis @ 1999-11-03 16:06 UTC (permalink / raw)
  To: mrs; +Cc: deweese, gcc

> I don't think some of the extra code is caused by a bug, but rather
> the addition of a feature.  LL5 non-removal however is a bug, that I
> suspect someone introduced.  A binary search of cvs could tell who
> broke it and how they broke it.  That would then prompt them into
> fixing it (maybe).

Indeed. As we've found later on, the mainline *does* remove the
additional __builtin_delete call.

I'm not sure whether this really is a regression over egcs 1.1 - in
egcs 1.1, the processing of the implicit call to the deallocator was
quite different. So it may not be the case that egcs 2.9x (with x>3,
x<6) ever did the right thing. FWIW, the change activating the current
front-end behavior was

1998-10-22  Martin von Löwis  <loewis@informatik.hu-berlin.de>

	* init.c (build_new_1): Delay cleanup until end of full expression.

So if the binary search yields the source of the problem, I agree that
the relevant patch should be considered for 2.95.3 (as 2.95 now
produces a significant increase in code size).

Regards,
Martin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Useless code generated by gcc 2.95.2????
  1999-11-03 16:06 ` Martin v. Loewis
@ 1999-11-06  7:36   ` Thomas E Deweese
  1999-11-30 23:37     ` Thomas E Deweese
  1999-11-30 23:37   ` Martin v. Loewis
  1 sibling, 1 reply; 18+ messages in thread
From: Thomas E Deweese @ 1999-11-06  7:36 UTC (permalink / raw)
  To: Martin v. Loewis; +Cc: mrs, deweese, gcc

	What follows is a summary of discussions between Martin and
myself which have uncovered a cleaner way to handle calls to new.

	I'm also willing to try my hand at implementing the change but
I could definitely use a few pointers.

>>>>> "ML" == Martin v Loewis <martin@mira.isdn.cs.tu-berlin.de> writes:

>> I don't think some of the extra code is caused by a bug, but rather
>> the addition of a feature.  LL5 non-removal however is a bug, that
>> I suspect someone introduced.  A binary search of cvs could tell
>> who broke it and how they broke it.  That would then prompt them
>> into fixing it (maybe).

ML> Indeed. As we've found later on, the mainline *does* remove the
ML> additional __builtin_delete call.

ML> I'm not sure whether this really is a regression over egcs 1.1 -
ML> in egcs 1.1, the processing of the implicit call to the
ML> deallocator was quite different. So it may not be the case that
ML> egcs 2.9x (with x>3, x<6) ever did the right thing. FWIW, the
ML> change activating the current front-end behavior was

ML> 1998-10-22 Martin von Löwis <loewis@informatik.hu-berlin.de>

ML> * init.c (build_new_1): Delay cleanup until end of full
ML> 	  expression.


	Martin and I have discussed the reasons behind the change
introduced above.  It concerns the lifespan of temporaries in a
constructor call made via new whose result is a parameter to a
function call.  Here is the example we used:


f(new Foo(A(), B(), ...), ...);

	In particular the lifespan of the temporaries A() and B() must
extend until 'f' returns (which wasn't true in egcs 1.1).  This turns
out to be very tricky to accomplish given the requirement that any
memory allocated by new must be deallocated if Foo::Foo, A::A, or B::B
throws an error.  This is because in the simplest expansion of this code:

         {
           Foo *tmp = ::new(sizeof(Foo), ...);
   +- +-
   |  |    A a;
   |  1    B b;
   2  |    Foo::Foo(tmp, a, b, ...);
   |  +-
   |       f(tmp, ...);
   +-----}

	You need to catch and delete 'tmp' in region 1 but the
lifespan of a & b needs to be '2'.  This is not trivial (possible?) to
communicate to the back end, and it is impossible for normal C++ code
to generate such a situation.

	The patch referenced above introduced a sentry variable which
prevented the error handler from deleting the allocated memory after
region 1 completed, although the error handler was in effect for all
of region 2.

	Fortunately, the C++ standard allows some freedom in
rewriting this simple version of the code.  In particular, you are free
to reorder the call to new relative to the construction of the
arguments for the constructor [see 5.3.4 paragraph 21 in the final C++
Spec (paragraph 22 in CD2)]:

	    Whether the allocation function is called before
	evaluating the constructor arguments or after evaluating the
	constructor arguments but before entering the constructor is
	unspecified. It is also unspecified whether the arguments to a
	constructor are evaluated if the allocation function returns
	the null pointer or exits using an exception.

Given this we can safely rewrite the above code as:

{
  A a;
  B b;

  Foo *tmp = ::new(sizeof(Foo), ...);
  try {
    Foo::Foo(tmp, a, b, ...);
  } catch(...) {
    ::delete(tmp, ...);
    throw;    // propagate the error after freeing the memory
  }

  f(tmp, ...);
}

	In this case, if the construction of 'a' or 'b' throws an
error, you don't need to catch the error and free the memory, since you
haven't allocated it yet!

	This should (hopefully) lead to much less convoluted code in
all cases.  However, now that I have a possible solution I would
appreciate pointers to any places where similar sorts of rewriting are
done, as well as where in the C++ front end such rewriting should be
done.

	Any help and/or comments are appreciated.

-- 
							Thomas DeWeese
deweese@kodak.com
			"The only difference between theory and practice is
			 that in theory there isn't any." -- unknown

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Useless code generated by gcc 2.95.2????
  1999-11-03 15:22       ` Joern Rennecke
  1999-11-03 15:40         ` Richard Henderson
@ 1999-11-30 23:37         ` Joern Rennecke
  1 sibling, 0 replies; 18+ messages in thread
From: Joern Rennecke @ 1999-11-30 23:37 UTC (permalink / raw)
  To: Richard Henderson; +Cc: deweese, martin, gcc

> We make no attempt to remove dead code in the -O0 case.
> Which IMO is a mistake, but not one I've gotten around
> to correcting just yet.

If the code was written by the user, it should not be deleted at -O0.
Setting breakpoints and looking at the compiled code shouldn't yield
any surprises at -O0.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Useless code generated by gcc 2.95.2????
  1999-11-03  4:47   ` Thomas E Deweese
  1999-11-03 10:32     ` Richard Henderson
@ 1999-11-30 23:37     ` Thomas E Deweese
  1 sibling, 0 replies; 18+ messages in thread
From: Thomas E Deweese @ 1999-11-30 23:37 UTC (permalink / raw)
  To: Martin v. Loewis; +Cc: gcc

>>>>> "ML" == Martin v Loewis <martin@mira.isdn.cs.tu-berlin.de> writes:

>> Based on my observations I have generated a small test case:

ML> I have not tried to understand in detail why this code is
ML> emitted. It could be either a duplication of exception handlers,
ML> or it could cover some case that does not appear in your
ML> example.

	This was part of the reason I posted the note.  Flow
analysis obviously gets complicated around exception handling.  While
the code looked dead, I didn't (and don't) know enough to be sure.

ML> Please understand that I'm not terribly worried about this code. I
ML> compiled your code with optimization (-O2); the assembler code
ML> generated for that is attached. Long and obfuscated machine code
ML> in non-optimizing mode is not strictly a bug - incorrect code
ML> would be a bug. Also, dead code in optimizing mode could be a bug.

	I can understand your interest in fixing 'bigger things',
given the speed with which C++ has evolved in the recent past.
However, emitting significant dead code is pretty bad, even in the
debug case.

	With respect to your last sentence, with a small change to
the example (one I didn't think to try until after I sent the message)
the dead code remains at all optimization levels.  I'm fairly sure that
this happens because the compiler can no longer determine (via
inlining) that no error can be thrown.

ML> If you want to understand in detail why this code is generated,
ML> you may try using the "-da" option. This dumps a number of
ML> internal structures (RTL) into files. Those structures are later
ML> used to generate assembler output.

	Thanks for the info, I may do this. 
	I'll let you know if I come up with anything...

------------------------------------------------------------------------

Updated example that leaves the duplicate code at all '-O' levels:

------------------------------------------------------------------------

#include <stdlib.h>
struct Foo {
    Foo();      // Note: Body no longer provided inline (force 'call').
};

Foo *fu8=NULL;

void reset(void) {
  fu8 = new Foo();
}


------------------------------------------------

Asm Output compiled: g++ -O3 -S Foo.cc
Note: LL3 & LL5

------------------------------------------------

reset__Fv:
.LLFB1:
	!#PROLOGUE# 0
	save	%sp, -112, %sp
.LLCFI0:
	!#PROLOGUE# 1
	call	__builtin_new, 0
	mov	1, %o0
	mov	%o0, %l0
.LLEHB3:
	call	__3Foo, 0
	mov	1, %l1
	sethi	%hi(fu8), %o1
	st	%o0, [%o1+%lo(fu8)]
.LLEHE3:
	b	.LL5
	mov	0, %l1
.LL4:
	call	__throw, 0
	 nop
.LL5:
	cmp	%l1, 0
	be	.LL14
	nop
	call	__builtin_delete, 0
	mov	%l0, %o0
	b,a	.LL14
.LLEHB11:
.LL3:
	cmp	%l1, 0
	be	.LL4
	nop
	call	__builtin_delete, 0
	mov	%l0, %o0
	b,a	.LL4
.LLEHE11:
.LL11:
	call	terminate__Fv, 0
	 nop
.LL14:
	ret
	restore

-- 
							Thomas DeWeese
deweese@kodak.com
			"The only difference between theory and practice is
			 that in theory there isn't any." -- unknown

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Useless code generated by gcc 2.95.2????
  1999-11-03 10:32     ` Richard Henderson
  1999-11-03 15:22       ` Joern Rennecke
@ 1999-11-30 23:37       ` Richard Henderson
  1 sibling, 0 replies; 18+ messages in thread
From: Richard Henderson @ 1999-11-30 23:37 UTC (permalink / raw)
  To: Thomas E Deweese; +Cc: Martin v. Loewis, gcc

On Wed, Nov 03, 1999 at 07:47:15AM -0500, Thomas E Deweese wrote:
> However, emitting significant dead code is pretty bad, even in the
> debug case.

We make no attempt to remove dead code in the -O0 case.
Which IMO is a mistake, but not one I've gotten around
to correcting just yet.


r~

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Useless code generated by gcc 2.95.2????
  1999-11-03 15:40         ` Richard Henderson
@ 1999-11-30 23:37           ` Richard Henderson
  0 siblings, 0 replies; 18+ messages in thread
From: Richard Henderson @ 1999-11-30 23:37 UTC (permalink / raw)
  To: Joern Rennecke; +Cc: deweese, martin, gcc

On Wed, Nov 03, 1999 at 11:20:40PM +0000, Joern Rennecke wrote:
> If the code was written by the user, it should not be deleted at -O0.
> Setting breakpoints and looking at the compiled code shouldn't yield
> any surprises at -O0.

Oh please. 

We can quite safely provide a single unreachable nop
(instead of multiple kilobytes of code) on which to
set such worthless breakpoints.



r~

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Useless code generated by gcc 2.95.2????
  1999-11-01 13:43 Thomas E Deweese
  1999-11-02 23:46 ` Martin v. Loewis
@ 1999-11-30 23:37 ` Thomas E Deweese
  1 sibling, 0 replies; 18+ messages in thread
From: Thomas E Deweese @ 1999-11-30 23:37 UTC (permalink / raw)
  To: gcc

Hi all,

	We are looking to upgrade from egcs 1.0.2 to gcc 2.95.2, and
we noticed that object file sizes have gone up significantly (~50%)
between the two releases (On Sparc).  So, I thought I would take a
look at some of the generated asm. I noticed what looked to be dead
code that was generated. Based on my observations I have generated a
small test case:

------------------------------------------------------------------------

#include <stdlib.h>
template <class T>
class Foo {
  public:
    Foo() { }
};

Foo<unsigned char> *fu8=NULL;

void reset(void) {
  fu8 = new Foo<unsigned char>();
}

------------------------------------------------------------------------

I have included the complete asm and verbose compile dumps at the end of
the message, but I have pulled out the part that most interests me up
here.  My basic question is: How can it ever reach '.LL5' with %l1
not equal to zero?

	It looks to me like the deletes following '.LL5' and '.LL3'
are duplicates and that the delete at '.LL5' can never be reached (and
if it were to be reached the result would be a double delete).

Here is my thinking:
	1) The only way to reach '.LL5' without having cleared %l1 is
	   for an error to be thrown in the call to the constructor.  

	2) This (if I understand how error handling is done) would lead
	   to a jump to '.LL3' which _would_ delete the 'Foo', then
	   jump to the call to __throw at '.LL4'.

	3) If that were to ever return (which I don't think it will)
	   then the delete following '.LL5' would also be called (a
	   double delete of the contents of %l0).

	Is there some unseen way for the system to branch to
'.LLEHE3' (through the error handler, perhaps)?
	Am I missing something, or is that truly dead code?

	I don't pretend to be an expert on the internals of g++, asm,
or gcc's error handling mechanisms, but that duplicate code just looks
suspicious to me.  Any help on this would be greatly appreciated.

	I doubt this would fully account for my 50% code growth but
it's a start.

------------------------------------------------------------------------

reset__Fv:
.LLFB1:
	!#PROLOGUE# 0
	save	%sp, -112, %sp
.LLCFI0:
	!#PROLOGUE# 1
	sethi	%hi(fu8), %o0
	or	%o0, %lo(fu8), %l2
	mov	1, %o0
	call	__builtin_new, 0
	 nop
	mov	%o0, %l0
	mov	1, %l1
.LLEHB3:
	mov	%l0, %o0
	call	__t3Foo1ZUc, 0
	 nop
	mov	0, %l1
	st	%o0, [%l2]
.LLEHE3:
	b	.LL5
	 nop
.LL4:
	call	__throw, 0
	 nop
.LL5:
	cmp	%l1, 0
	be	.LL7
	nop
	mov	%l0, %o0
	call	__builtin_delete, 0
	 nop
	b	.LL7
	 nop
.LL6:
.LL7:
	b	.LL10
	 nop
.LLEHB11:
.LL3:
	cmp	%l1, 0
	be	.LL9
	nop
	mov	%l0, %o0
	call	__builtin_delete, 0
	 nop
	b	.LL9
	 nop
.LL8:
.LL9:
	b	.LL4
	 nop

------------------------------------------------------------------------

I compiled my example with the following line:

g++ -fverbose-asm --verbose Foo.cc -o Foo-295-3.s -S 

Reading specs from /freeware/gnu/gcc-2.95.2/sun/lib/gcc-lib/sparc-sun-solaris2.5.1/2.95.2/specs
gcc version 2.95.2 19991024 (release)
 /freeware/gnu/gcc-2.95.2/sun/lib/gcc-lib/sparc-sun-solaris2.5.1/2.95.2/cpp -lang-c++ -v -D__GNUC__=2 -D__GNUG__=2 -D__GNUC_MINOR__=95 -D__cplusplus -Dsparc -Dsun -Dunix -D__svr4__ -D__SVR4 -D__sparc__ -D__sun__ -D__unix__ -D__svr4__ -D__SVR4 -D__sparc -D__sun -D__unix -Asystem(unix) -Asystem(svr4) -D__EXCEPTIONS -D__GCC_NEW_VARARGS__ -Acpu(sparc) -Amachine(sparc) Foo.cc /home/deweese/tmp/ccIwWesd.ii
GNU CPP version 2.95.2 19991024 (release) (sparc)
[...]
/freeware/gnu/gcc-2.95.2/sun/lib/gcc-lib/sparc-sun-solaris2.5.1/2.95.2/cc1plus /home/deweese/tmp/ccIwWesd.ii -quiet -dumpbase Foo.cc -version -fverbose-asm -o Foo-295-3.s
GNU C++ version 2.95.2 19991024 (release) (sparc-sun-solaris2.5.1) compiled by GNU C version 2.95.2 19991024 (release).

------------------------------------------------------------------------

The generated assembly:

	.file	"Foo.cc"
! GNU C++ version 2.95.2 19991024 (release) (sparc-sun-solaris2.5.1) compiled by GNU C version 2.95.2 19991024 (release).
! options passed:  -fverbose-asm
! options enabled:  -fpeephole -ffunction-cse -fkeep-static-consts
! -fpcc-struct-return -fsched-interblock -fsched-spec -fexceptions -fcommon
! -fverbose-asm -fgnu-linker -fargument-alias -fident -mepilogue -mapp-regs

gcc2_compiled.:
	.global fu8
.section	".data"
	.align 4
	.type	 fu8,#object
	.size	 fu8,4
fu8:
	.uaword	0
	.global __throw
.section	".text"
	.align 4
	.global reset__Fv
	.type	 reset__Fv,#function
	.proc	020
reset__Fv:
.LLFB1:
	!#PROLOGUE# 0
	save	%sp, -112, %sp
.LLCFI0:
	!#PROLOGUE# 1
	sethi	%hi(fu8), %o0
	or	%o0, %lo(fu8), %l2
	mov	1, %o0
	call	__builtin_new, 0
	 nop
	mov	%o0, %l0
	mov	1, %l1
.LLEHB3:
	mov	%l0, %o0
	call	__t3Foo1ZUc, 0
	 nop
	mov	0, %l1
	st	%o0, [%l2]
.LLEHE3:
	b	.LL5
	 nop
.LL4:
	call	__throw, 0
	 nop
.LL5:
	cmp	%l1, 0
	be	.LL7
	nop
	mov	%l0, %o0
	call	__builtin_delete, 0
	 nop
	b	.LL7
	 nop
.LL6:
.LL7:
	b	.LL10
	 nop
.LLEHB11:
.LL3:
	cmp	%l1, 0
	be	.LL9
	nop
	mov	%l0, %o0
	call	__builtin_delete, 0
	 nop
	b	.LL9
	 nop
.LL8:
.LL9:
	b	.LL4
	 nop
.LLEHE11:
	b	.LL11
	 nop
.LL12:
	call	__throw, 0
	 nop
.LL13:
.LL11:
	call	terminate__Fv, 0
	 nop
.LL10:
	b	.LL14
	 nop
	b	.LL2
	 nop
.LL14:
.LL2:
	ret
	restore
.LLFE1:
.LLfe1:
	.size	 reset__Fv,.LLfe1-reset__Fv
.section	".gcc_except_table",#alloc,#write
	.align 4
__EXCEPTION_TABLE__:
	.uaword	.LLEHB3
	.uaword	.LLEHE3
	.uaword	.LL3

	.uaword	.LLEHB11
	.uaword	.LLEHE11
	.uaword	.LL11

.LLRTH1:
	.uaword	-1
	.uaword	-1


.section	".eh_frame",#alloc,#write
__FRAME_BEGIN__:
	.uaword	.LLECIE1-.LLSCIE1
.LLSCIE1:
	.uaword	0x0
	.byte	0x1
	.asciz	"eh"

	.uaword	__EXCEPTION_TABLE__
	.byte	0x1
	.byte	0x7c
	.byte	0x65
	.byte	0xc
	.byte	0xe
	.byte	0x0
	.byte	0x9
	.byte	0x65
	.byte	0xf
	.align 4
.LLECIE1:
	.uaword	.LLEFDE1-.LLSFDE1
.LLSFDE1:
	.uaword	.LLSFDE1-__FRAME_BEGIN__
	.uaword	.LLFB1
	.uaword	.LLFE1-.LLFB1
	.byte	0x4
	.uaword	.LLCFI0-.LLFB1
	.byte	0xd
	.byte	0x1e
	.byte	0x2d
	.byte	0x9
	.byte	0x65
	.byte	0x1f
	.align 4
.LLEFDE1:
	.ident	"GCC: (GNU) 2.95.2 19991024 (release)"

-- 
							Thomas DeWeese
deweese@kodak.com
			"The only difference between theory and practice is
			 that in theory there isn't any." -- unknown

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Useless code generated by gcc 2.95.2????
  1999-11-02 23:46 ` Martin v. Loewis
  1999-11-03  4:47   ` Thomas E Deweese
@ 1999-11-30 23:37   ` Martin v. Loewis
  1 sibling, 0 replies; 18+ messages in thread
From: Martin v. Loewis @ 1999-11-30 23:37 UTC (permalink / raw)
  To: deweese; +Cc: gcc

> 	We are looking to upgrade from egcs 1.0.2 to gcc 2.95.2, and
> we noticed that object file sizes have gone up significantly (~50%)
> between the two releases (On Sparc).  So, I thought I would take a
> look at some of the generated asm. I noticed what looked to be dead
> code that was generated. Based on my observations I have generated a
> small test case:


Thanks for your bug report. My (shallow) analysis agrees with yours;
it indeed appears that this is dead code. Notice how the code
following LL5 is similar to the one following LL3.

I have not tried to understand in detail why this code is emitted. It
could be either a duplication of exception handlers, or it could
cover some case that does not appear in your example.

Please understand that I'm not terribly worried about this code. I
compiled your code with optimization (-O2); the assembler code
generated for that is attached. Long and obfuscated machine code in
non-optimizing mode is not strictly a bug - incorrect code would be a
bug. Also, dead code in optimizing mode could be a bug.

If you want to understand in detail why this code is generated, you
may try using the "-da" option. This dumps a number of internal
structures (RTL) into files. Those structures are later used to
generate assembler output.

Hope this helps,
Martin

reset__Fv:
.LLFB1:
	!#PROLOGUE# 0
	save	%sp, -112, %sp
.LLCFI0:
	!#PROLOGUE# 1
	mov	1, %o0
	call	__builtin_new, 0
	sethi	%hi(fu8), %l0
	st	%o0, [%l0+%lo(fu8)]
	ret
	restore
.LLFE1:
.LLfe1:
	.size	 reset__Fv,.LLfe1-reset__Fv

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Useless code generated by gcc 2.95.2????
  1999-11-02 23:46 ` Martin v. Loewis
@ 1999-11-03  4:47   ` Thomas E Deweese
  1999-11-03 10:32     ` Richard Henderson
  1999-11-30 23:37     ` Thomas E Deweese
  1999-11-30 23:37   ` Martin v. Loewis
  1 sibling, 2 replies; 18+ messages in thread
From: Thomas E Deweese @ 1999-11-03  4:47 UTC (permalink / raw)
  To: Martin v. Loewis; +Cc: gcc

>>>>> "ML" == Martin v Loewis <martin@mira.isdn.cs.tu-berlin.de> writes:

>> Based on my observations I have generated a small test case:

ML> I have not tried to understand in detail why this code is
ML> emitted. It could be either a duplication of exception handlers,
ML> or it could be cover some case that does not appear in your
ML> example.

	This was part of the reason I posted the note.  Flow
analysis obviously gets complicated around exception handling.  While
the code looked dead I know I didn't/don't know enough to be sure.

ML> Please understand that I'm not terribly worried about this code. I
ML> compiled your code with optimization (-O2); the assembler code
ML> generated for that is attached. Long and obfuscated machine code
ML> in non-optimizing mode is not strictly a bug - incorrect code
ML> would be a bug. Also, dead code in optimizing mode could be a bug.

	I can understand your interest in fixing 'bigger things',
given the speed with which C++ has evolved in the recent past.
However, emitting significant dead code is pretty bad, even in the
debug case.

	With respect to your last sentence, with a small change to
the example (one I didn't think to try until after I sent the message),
the dead code remains at all optimization levels.  I'm fairly sure that
this happens because the compiler can no longer determine (via
inlining) that no error can be thrown.
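
To illustrate why a constructor that might throw forces a cleanup path into the new-expression, here is a minimal sketch. `Widget`, its counters and `make_widget` are hypothetical names invented for this example, and the constructor throws on demand only to make the failure path observable. With an out-of-line constructor the compiler has to assume this can happen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical class: counts calls to its own operator new/delete.
struct Widget {
    static int allocs;
    static int frees;
    static bool fail;   // when true, the constructor throws

    Widget() { if (fail) throw std::runtime_error("constructor failed"); }

    static void* operator new(std::size_t n) {
        void* p = std::malloc(n);
        if (!p) throw std::bad_alloc();
        ++allocs;
        return p;
    }
    static void operator delete(void* p) noexcept {
        ++frees;
        std::free(p);
    }
};

int Widget::allocs = 0;
int Widget::frees = 0;
bool Widget::fail = false;

// Returns the new object, or nullptr if the constructor threw.
// In the failing case the new-expression itself releases the storage
// before the exception reaches the catch clause.
Widget* make_widget(bool should_fail) {
    Widget::fail = should_fail;
    try {
        return new Widget();
    } catch (const std::runtime_error&) {
        return nullptr;
    }
}
```

Calling `make_widget(true)` should leave `Widget::allocs` and `Widget::frees` each one higher than before, which is the single delete the handler at '.LL3' is there to perform.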

ML> If you want to understand in detail why this code is generated,
ML> you may try using the "-da" option. This dumps a number of
ML> internal structures (RTL) into files. Those structures are later
ML> used to generate assembler output.

	Thanks for the info, I may do this. 
	I'll let you know if I come up with anything...

------------------------------------------------------------------------

Updated example that leaves the duplicate code at all '-O' levels:

------------------------------------------------------------------------

#include <stdlib.h>
struct Foo {
    Foo();      // Note: Body no longer provided inline (force 'call').
};

Foo *fu8=NULL;

void reset(void) {
  fu8 = new Foo();
}


------------------------------------------------

Asm output, compiled with: g++ -O3 -S Foo.cc
Note: LL3 & LL5

------------------------------------------------

reset__Fv:
.LLFB1:
	!#PROLOGUE# 0
	save	%sp, -112, %sp
.LLCFI0:
	!#PROLOGUE# 1
	call	__builtin_new, 0
	mov	1, %o0
	mov	%o0, %l0
.LLEHB3:
	call	__3Foo, 0
	mov	1, %l1
	sethi	%hi(fu8), %o1
	st	%o0, [%o1+%lo(fu8)]
.LLEHE3:
	b	.LL5
	mov	0, %l1
.LL4:
	call	__throw, 0
	 nop
.LL5:
	cmp	%l1, 0
	be	.LL14
	nop
	call	__builtin_delete, 0
	mov	%l0, %o0
	b,a	.LL14
.LLEHB11:
.LL3:
	cmp	%l1, 0
	be	.LL4
	nop
	call	__builtin_delete, 0
	mov	%l0, %o0
	b,a	.LL4
.LLEHE11:
.LL11:
	call	terminate__Fv, 0
	 nop
.LL14:
	ret
	restore

-- 
							Thomas DeWeese
deweese@kodak.com
			"The only difference between theory and practice is
			 that in theory there isn't any." -- unknown

* Re: Useless code generated by gcc 2.95.2????
  1999-11-01 13:43 Thomas E Deweese
@ 1999-11-02 23:46 ` Martin v. Loewis
  1999-11-03  4:47   ` Thomas E Deweese
  1999-11-30 23:37   ` Martin v. Loewis
  1999-11-30 23:37 ` Thomas E Deweese
  1 sibling, 2 replies; 18+ messages in thread
From: Martin v. Loewis @ 1999-11-02 23:46 UTC (permalink / raw)
  To: deweese; +Cc: gcc

> 	We are looking to upgrade from egcs 1.0.2 to gcc 2.95.2, and
> we noticed that object file sizes have gone up significantly (~50%)
> between the two releases (On Sparc).  So, I thought I would take a
> look at some of the generated asm. I noticed what looked to be dead
> code that was generated. Based on my observations I have generated a
> small test case:


Thanks for your bug report. My (shallow) analysis agrees with yours;
it indeed appears that this is dead code. Notice how the code
following LL5 is similar to the one following LL3.

I have not tried to understand in detail why this code is emitted. It
could be either a duplication of exception handlers, or it could
cover some case that does not appear in your example.

Please understand that I'm not terribly worried about this code. I
compiled your code with optimization (-O2); the assembler code
generated for that is attached. Long and obfuscated machine code in
non-optimizing mode is not strictly a bug - incorrect code would be a
bug. Also, dead code in optimizing mode could be a bug.

If you want to understand in detail why this code is generated, you
may try using the "-da" option. This dumps a number of internal
structures (RTL) into files. Those structures are later used to
generate assembler output.

Hope this helps,
Martin

reset__Fv:
.LLFB1:
	!#PROLOGUE# 0
	save	%sp, -112, %sp
.LLCFI0:
	!#PROLOGUE# 1
	mov	1, %o0
	call	__builtin_new, 0
	sethi	%hi(fu8), %l0
	st	%o0, [%l0+%lo(fu8)]
	ret
	restore
.LLFE1:
.LLfe1:
	.size	 reset__Fv,.LLfe1-reset__Fv

* Useless code generated by gcc 2.95.2????
@ 1999-11-01 13:43 Thomas E Deweese
  1999-11-02 23:46 ` Martin v. Loewis
  1999-11-30 23:37 ` Thomas E Deweese
  0 siblings, 2 replies; 18+ messages in thread
From: Thomas E Deweese @ 1999-11-01 13:43 UTC (permalink / raw)
  To: gcc

Hi all,

	We are looking to upgrade from egcs 1.0.2 to gcc 2.95.2, and
we noticed that object file sizes have gone up significantly (~50%)
between the two releases (On Sparc).  So, I thought I would take a
look at some of the generated asm. I noticed what looked to be dead
code that was generated. Based on my observations I have generated a
small test case:

------------------------------------------------------------------------

#include <stdlib.h>
template <class T>
class Foo {
  public:
    Foo() { }
};

Foo<unsigned char> *fu8=NULL;

void reset(void) {
  fu8 = new Foo<unsigned char>();
}

------------------------------------------------------------------------

I have included complete asm and verbose compile dumps at the end of
the message, but I have brought the part that most interests me up
here.  My basic question is: how can it ever reach '.LL5' with %l1
not equal to zero?

	It looks to me like the deletes following '.LL5' and '.LL3'
are duplicates and that the delete at '.LL5' can never be reached (and
if it were to be reached the result would be a double delete).

Here is my thinking:
	1) The only way to reach '.LL5' without having cleared %l1 is
	   for an error to be thrown in the call to the constructor.  

	2) This (if I understand how error handling is done) would lead
	   to a jump to '.LL3', which _would_ delete the 'Foo', then
	   jump to the call to __throw at '.LL4'.

	3) If that were to ever return (which I don't think it will)
	   then the delete following '.LL5' would also be called (a
	   double delete of the contents of %l0).
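
The three steps above can be modelled in C++ as a rough sketch. `Trace` and `run_reset` are hypothetical names, the counters stand in for the real calls to __builtin_delete, and the comments map each statement to the label it imitates. This is only a model of the control flow as I read it, not what the compiler does internally.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical record of which cleanup sites ran.
struct Trace {
    int deletes_at_LL3;
    int deletes_at_LL5;
    bool rethrown;
};

Trace run_reset(bool ctor_throws) {
    Trace t = {0, 0, false};
    bool l1 = true;                       // mov 1, %l1
    try {
        try {                             // region 3: .LLEHB3 .. .LLEHE3
            if (ctor_throws)
                throw std::runtime_error("constructor failed");
            l1 = false;                   // mov 0, %l1 (normal path only)
        } catch (...) {                   // .LL3: handler for region 3
            if (l1) ++t.deletes_at_LL3;   // cmp %l1, 0 ; call __builtin_delete
            throw;                        // .LL4: call __throw
        }
        // .LL5: only reached by the branch that follows "mov 0, %l1"
        if (l1) ++t.deletes_at_LL5;
    } catch (...) {
        t.rethrown = true;                // the exception leaves reset()
    }
    return t;
}
```

In this model `deletes_at_LL5` stays at zero whether or not the constructor throws, which is the reason the delete after '.LL5' looks dead.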

	Is there some unseen way for the system to branch to
'.LLEHE3' (through the error handler)?
	Am I missing something, or is that truly dead code?
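
On the '.LLEHE3' question: in the __EXCEPTION_TABLE__ at the end of this message, that label appears only as the end address of a region, with '.LL3' as the handler. The sketch below is a simplified model of a range-table lookup, not the actual runtime code. `EhEntry`, `find_handler` and `reset_table` are hypothetical names, the numeric addresses are invented, and treating each region as half-open is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One (start, end, handler) triple, mirroring the three .uaword
// entries per region in the emitted table.
struct EhEntry {
    std::uintptr_t start;
    std::uintptr_t end;
    std::uintptr_t handler;
};

// Invented addresses standing in for the assembler labels.
// .LLEHB11 and .LL3 are adjacent labels, so they share an address.
const std::uintptr_t LLEHB3 = 0x100, LLEHE3 = 0x120, LL3 = 0x200;
const std::uintptr_t LLEHB11 = 0x200, LLEHE11 = 0x240, LL11 = 0x260;

std::vector<EhEntry> reset_table() {
    std::vector<EhEntry> t;
    t.push_back(EhEntry{LLEHB3, LLEHE3, LL3});
    t.push_back(EhEntry{LLEHB11, LLEHE11, LL11});
    return t;
}

// Returns the handler of the first region covering pc, or 0 if none.
// Assumption: regions are treated as [start, end).
std::uintptr_t find_handler(const std::vector<EhEntry>& table,
                            std::uintptr_t pc) {
    for (const EhEntry& e : table)
        if (e.start <= pc && pc < e.end)
            return e.handler;
    return 0;
}
```

Under this model a throw from inside region 3 lands at '.LL3' and a throw from inside region 11 lands at '.LL11'; '.LLEHE3' only bounds a range and is never a destination.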

	I don't pretend to be an expert on the internals of g++, asm,
or gcc's error handling mechanisms, but that duplicate code just looks
suspicious to me.  Any help on this would be greatly appreciated.

	I doubt this would fully account for my 50% code growth but
it's a start.

------------------------------------------------------------------------

reset__Fv:
.LLFB1:
	!#PROLOGUE# 0
	save	%sp, -112, %sp
.LLCFI0:
	!#PROLOGUE# 1
	sethi	%hi(fu8), %o0
	or	%o0, %lo(fu8), %l2
	mov	1, %o0
	call	__builtin_new, 0
	 nop
	mov	%o0, %l0
	mov	1, %l1
.LLEHB3:
	mov	%l0, %o0
	call	__t3Foo1ZUc, 0
	 nop
	mov	0, %l1
	st	%o0, [%l2]
.LLEHE3:
	b	.LL5
	 nop
.LL4:
	call	__throw, 0
	 nop
.LL5:
	cmp	%l1, 0
	be	.LL7
	nop
	mov	%l0, %o0
	call	__builtin_delete, 0
	 nop
	b	.LL7
	 nop
.LL6:
.LL7:
	b	.LL10
	 nop
.LLEHB11:
.LL3:
	cmp	%l1, 0
	be	.LL9
	nop
	mov	%l0, %o0
	call	__builtin_delete, 0
	 nop
	b	.LL9
	 nop
.LL8:
.LL9:
	b	.LL4
	 nop

------------------------------------------------------------------------

I compiled my example with the following line:

g++ -fverbose-asm --verbose Foo.cc -o Foo-295-3.s -S 

Reading specs from /freeware/gnu/gcc-2.95.2/sun/lib/gcc-lib/sparc-sun-solaris2.5.1/2.95.2/specs
gcc version 2.95.2 19991024 (release)
 /freeware/gnu/gcc-2.95.2/sun/lib/gcc-lib/sparc-sun-solaris2.5.1/2.95.2/cpp -lang-c++ -v -D__GNUC__=2 -D__GNUG__=2 -D__GNUC_MINOR__=95 -D__cplusplus -Dsparc -Dsun -Dunix -D__svr4__ -D__SVR4 -D__sparc__ -D__sun__ -D__unix__ -D__svr4__ -D__SVR4 -D__sparc -D__sun -D__unix -Asystem(unix) -Asystem(svr4) -D__EXCEPTIONS -D__GCC_NEW_VARARGS__ -Acpu(sparc) -Amachine(sparc) Foo.cc /home/deweese/tmp/ccIwWesd.ii
GNU CPP version 2.95.2 19991024 (release) (sparc)
[...]
/freeware/gnu/gcc-2.95.2/sun/lib/gcc-lib/sparc-sun-solaris2.5.1/2.95.2/cc1plus /home/deweese/tmp/ccIwWesd.ii -quiet -dumpbase Foo.cc -version -fverbose-asm -o Foo-295-3.s
GNU C++ version 2.95.2 19991024 (release) (sparc-sun-solaris2.5.1) compiled by GNU C version 2.95.2 19991024 (release).

------------------------------------------------------------------------

The generated assembly:

	.file	"Foo.cc"
! GNU C++ version 2.95.2 19991024 (release) (sparc-sun-solaris2.5.1) compiled by GNU C version 2.95.2 19991024 (release).
! options passed:  -fverbose-asm
! options enabled:  -fpeephole -ffunction-cse -fkeep-static-consts
! -fpcc-struct-return -fsched-interblock -fsched-spec -fexceptions -fcommon
! -fverbose-asm -fgnu-linker -fargument-alias -fident -mepilogue -mapp-regs

gcc2_compiled.:
	.global fu8
.section	".data"
	.align 4
	.type	 fu8,#object
	.size	 fu8,4
fu8:
	.uaword	0
	.global __throw
.section	".text"
	.align 4
	.global reset__Fv
	.type	 reset__Fv,#function
	.proc	020
reset__Fv:
.LLFB1:
	!#PROLOGUE# 0
	save	%sp, -112, %sp
.LLCFI0:
	!#PROLOGUE# 1
	sethi	%hi(fu8), %o0
	or	%o0, %lo(fu8), %l2
	mov	1, %o0
	call	__builtin_new, 0
	 nop
	mov	%o0, %l0
	mov	1, %l1
.LLEHB3:
	mov	%l0, %o0
	call	__t3Foo1ZUc, 0
	 nop
	mov	0, %l1
	st	%o0, [%l2]
.LLEHE3:
	b	.LL5
	 nop
.LL4:
	call	__throw, 0
	 nop
.LL5:
	cmp	%l1, 0
	be	.LL7
	nop
	mov	%l0, %o0
	call	__builtin_delete, 0
	 nop
	b	.LL7
	 nop
.LL6:
.LL7:
	b	.LL10
	 nop
.LLEHB11:
.LL3:
	cmp	%l1, 0
	be	.LL9
	nop
	mov	%l0, %o0
	call	__builtin_delete, 0
	 nop
	b	.LL9
	 nop
.LL8:
.LL9:
	b	.LL4
	 nop
.LLEHE11:
	b	.LL11
	 nop
.LL12:
	call	__throw, 0
	 nop
.LL13:
.LL11:
	call	terminate__Fv, 0
	 nop
.LL10:
	b	.LL14
	 nop
	b	.LL2
	 nop
.LL14:
.LL2:
	ret
	restore
.LLFE1:
.LLfe1:
	.size	 reset__Fv,.LLfe1-reset__Fv
.section	".gcc_except_table",#alloc,#write
	.align 4
__EXCEPTION_TABLE__:
	.uaword	.LLEHB3
	.uaword	.LLEHE3
	.uaword	.LL3

	.uaword	.LLEHB11
	.uaword	.LLEHE11
	.uaword	.LL11

.LLRTH1:
	.uaword	-1
	.uaword	-1


.section	".eh_frame",#alloc,#write
__FRAME_BEGIN__:
	.uaword	.LLECIE1-.LLSCIE1
.LLSCIE1:
	.uaword	0x0
	.byte	0x1
	.asciz	"eh"

	.uaword	__EXCEPTION_TABLE__
	.byte	0x1
	.byte	0x7c
	.byte	0x65
	.byte	0xc
	.byte	0xe
	.byte	0x0
	.byte	0x9
	.byte	0x65
	.byte	0xf
	.align 4
.LLECIE1:
	.uaword	.LLEFDE1-.LLSFDE1
.LLSFDE1:
	.uaword	.LLSFDE1-__FRAME_BEGIN__
	.uaword	.LLFB1
	.uaword	.LLFE1-.LLFB1
	.byte	0x4
	.uaword	.LLCFI0-.LLFB1
	.byte	0xd
	.byte	0x1e
	.byte	0x2d
	.byte	0x9
	.byte	0x65
	.byte	0x1f
	.align 4
.LLEFDE1:
	.ident	"GCC: (GNU) 2.95.2 19991024 (release)"

-- 
							Thomas DeWeese
deweese@kodak.com
			"The only difference between theory and practice is
			 that in theory there isn't any." -- unknown

Thread overview: 18+ messages
1999-11-03 14:22 Useless code generated by gcc 2.95.2???? Mike Stump
1999-11-03 16:06 ` Martin v. Loewis
1999-11-06  7:36   ` Thomas E Deweese
1999-11-30 23:37     ` Thomas E Deweese
1999-11-30 23:37   ` Martin v. Loewis
1999-11-30 23:37 ` Mike Stump
  -- strict thread matches above, loose matches on Subject: below --
1999-11-01 13:43 Thomas E Deweese
1999-11-02 23:46 ` Martin v. Loewis
1999-11-03  4:47   ` Thomas E Deweese
1999-11-03 10:32     ` Richard Henderson
1999-11-03 15:22       ` Joern Rennecke
1999-11-03 15:40         ` Richard Henderson
1999-11-30 23:37           ` Richard Henderson
1999-11-30 23:37         ` Joern Rennecke
1999-11-30 23:37       ` Richard Henderson
1999-11-30 23:37     ` Thomas E Deweese
1999-11-30 23:37   ` Martin v. Loewis
1999-11-30 23:37 ` Thomas E Deweese
