public inbox for gcc@gcc.gnu.org
* cannot bootstrap neither gcc-2.8.0-971206 nor egcs-971207 on sparc-sun-sunos4.1.3
@ 1997-12-10 12:02 Alexandre Oliva
  1997-12-10 23:09 ` Manfred.Hollstein
  0 siblings, 1 reply; 25+ messages in thread
From: Alexandre Oliva @ 1997-12-10 12:02 UTC (permalink / raw)
  To: gcc2, egcs

Hi there!

I've been unable to bootstrap the latest snapshots of gcc and egcs on
sparc-sun-sunos4.1.3, configured --with-gnu-as --enable-shared, using
GNU as and GNU ld from binutils 2.8.1 as assembler and linker, with
BOOT_CFLAGS="-O4 -g", using egcs-1.0 as the stage1 compiler.  I don't
know whether this is a bug in egcs-1.0 or in the current snapshots of
both packages.  The error I get is exactly the same for both builds:

stage1/xgcc -Bstage1/  -DIN_GCC    -O4 -g  -DHAVE_CONFIG_H  -o genattr \
 genattr.o rtl.o ` case "obstack.o" in ?*) echo obstack.o ;; esac ` ` case "stage1/xgcc -Bstage1/"@"" in "cc"@?*) echo  ;; esac ` ` case "" in ?*) echo  ;; esac `
./genattr /n/temp1/gcctest/bin/../src/ss/gcc/config/sparc/sparc.md > tmp-attr.h
/bin/sh: 9857 Memory fault - core dumped
make[2]: *** [stamp-attr] Error 139
make[2]: Leaving directory `/tmp_mnt/n/temp1/tmp/gcctest/src/atibaia/ss/gcc'
make[1]: *** [bootstrap] Error 2
make[1]: Leaving directory `/tmp_mnt/n/temp1/tmp/gcctest/src/atibaia/ss/gcc'
make: *** [bootstrap] Error 2

genattr crashes in the initialization code, with a stack trace like
this (I removed the buggy program before I decided to post this
message, so I quote from memory):

??? (invalid address)
memcpy
__main
main

-- 
Alexandre Oliva
mailto:oliva@dcc.unicamp.br mailto:aoliva@acm.org
http://www.dcc.unicamp.br/~oliva
Universidade Estadual de Campinas, SP, Brasil


* Re: cannot bootstrap neither gcc-2.8.0-971206 nor egcs-971207 on sparc-sun-sunos4.1.3
  1997-12-10 12:02 cannot bootstrap neither gcc-2.8.0-971206 nor egcs-971207 on sparc-sun-sunos4.1.3 Alexandre Oliva
@ 1997-12-10 23:09 ` Manfred.Hollstein
  1997-12-11  3:32   ` Paul Eggert
  1997-12-22 12:06   ` Jeffrey A Law
  0 siblings, 2 replies; 25+ messages in thread
From: Manfred.Hollstein @ 1997-12-10 23:09 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc2, egcs

On Wed, 10 December 1997, 16:17:27, oliva@dcc.unicamp.br wrote:

 > Hi there!
 > 
 > I've been unable to bootstrap the latest snapshots of gcc and egcs on
 > sparc-sun-sunos4.1.3, configured --with-gnu-as --enable-shared, using
 > GNU as and GNU ld from binutils 2.8.1 as assembler and linker, with
 > BOOT_CFLAGS="-O4 -g", using egcs-1.0 as the stage1 compiler.  I don't
 > know whether this is a bug in egcs-1.0 or in the current snapshots of
 > both packages.  The error I get is exactly the same for both builds:
 > 
 > stage1/xgcc -Bstage1/  -DIN_GCC    -O4 -g  -DHAVE_CONFIG_H  -o genattr \
 >  genattr.o rtl.o ` case "obstack.o" in ?*) echo obstack.o ;; esac ` ` case "stage1/xgcc -Bstage1/"@"" in "cc"@?*) echo  ;; esac ` ` case "" in ?*) echo  ;; esac `
 > ./genattr /n/temp1/gcctest/bin/../src/ss/gcc/config/sparc/sparc.md > tmp-attr.h
 > /bin/sh: 9857 Memory fault - core dumped
 > make[2]: *** [stamp-attr] Error 139
 > make[2]: Leaving directory `/tmp_mnt/n/temp1/tmp/gcctest/src/atibaia/ss/gcc'
 > make[1]: *** [bootstrap] Error 2
 > make[1]: Leaving directory `/tmp_mnt/n/temp1/tmp/gcctest/src/atibaia/ss/gcc'
 > make: *** [bootstrap] Error 2
 > 
 > genattr crashes in the initialization code, with a stack trace like
 > this (I removed the buggy program before I decided to post this
 > message, so I quote from memory):
 > 
 > ??? (invalid address)
 > memcpy
 > __main
 > main
 > 

Same here:

$ gdb genattr core 
Core was generated by `genattr'.
Program terminated with signal 11, Segmentation fault.
#0  0xef7f1d08 in ?? ()
Breakpoint 1 at 0x8130
(gdb) bt
#0  0xef7f1d08 in ?? ()
#1  0xef7f1c28 in ?? ()
#2  0xef7f0084 in ?? ()
#3  0x8180 in memset ()
#4  0x6368 in __do_global_ctors ()
#5  0x6390 in __main ()
#6  0x309c in main ()
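
For readers unfamiliar with frames #4 to #6: on targets without
init-section support, GCC makes main call __main, which in turn runs
the global constructors before any user code.  The following is only a
simplified model of that sequence, not the libgcc source; the table
layout, the names and the reverse call order are assumptions of this
sketch.  It shows why a bad table entry, or a miscompiled walker,
would fault before main's own code is reached:

```c
#include <stddef.h>

typedef void (*ctor_fn)(void);

/* Hypothetical stand-in for the constructor table the linker builds. */
struct ctor_table {
    size_t count;
    ctor_fn entries[4];
};

int ctor_log[4];      /* records which constructors ran, in order */
int ctor_log_len = 0;

void ctor_a(void) { ctor_log[ctor_log_len++] = 1; }
void ctor_b(void) { ctor_log[ctor_log_len++] = 2; }

struct ctor_table demo_table = { 2, { ctor_a, ctor_b } };

/* Model of __do_global_ctors: call every entry, last one first.
   A corrupt entry here would fault before main's own code runs. */
void run_global_ctors(const struct ctor_table *table)
{
    size_t i;
    for (i = table->count; i > 0; i--)
        table->entries[i - 1]();
}

/* Model of a main() into which the compiler inserted a call to __main. */
int demo_main(void)
{
    run_global_ctors(&demo_table);   /* plays the role of __main() */
    return 0;                        /* user code would start here */
}
```

Calling demo_main() runs ctor_b and then ctor_a before returning.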

Also interesting, cpp built by the stage1 compiler:

$ ./cpp -v -dM
ld.so: unidentifiable procedure reference at 0x1e2a4

Looks like gcc/egcs on SunOS are seriously broken!

Manfred


* Re: cannot bootstrap neither gcc-2.8.0-971206 nor egcs-971207 on sparc-sun-sunos4.1.3
  1997-12-11  3:32   ` Paul Eggert
@ 1997-12-11  1:51     ` Manfred.Hollstein
  0 siblings, 0 replies; 25+ messages in thread
From: Manfred.Hollstein @ 1997-12-11  1:51 UTC (permalink / raw)
  To: eggert; +Cc: oliva, gcc2, egcs

On Thu, 11 December 1997, 01:30:37, eggert@twinsun.com wrote:

 >    Date: Thu, 11 Dec 97 08:07:31 +0100
 >    From: Manfred.Hollstein@ks.sel.alcatel.de
 > 
 >    Looks like gcc/egcs on SunOS are seriously broken!
 > 
 > I don't think it's a problem with SunOS in general.  I'm not observing
 > the bug on my sparc-sun-sunos4.1.4 build with gcc 2.8 1997-12-10.  I'm
 > using SunOS 4.1.4 with all patches recommended by Sun, and compiling
 > with -O; I tried -O4 and could still build stamp-attr without problem.
 > 
 > The problem could be specific to SunOS 4.1.3; or it could be a Sun bug
 > fixed by one of the Sun patches I've installed; or it could be
 > something else.
 > 
 > Here's how to find out which patches Sun recommends for SunOS 4.1.x.
 > 
 > for SunOS:	see:
 > 4.1.3		ftp://sunsolve.sun.com/pub/patches/Solaris1.1.PatchReport
 > 4.1.3_U1	ftp://sunsolve.sun.com/pub/patches/Solaris1.1.1.PatchReport
 > 4.1.4		ftp://sunsolve.sun.com/pub/patches/Solaris1.1.2.PatchReport

Thanks for the info; I'll have to talk to our local IT about that.

But, the problem did occur only with the latest egcs-971207 snapshot;
up to egcs-971201 and egcs-1.0 everything was OK, even with -O9!


* Re: cannot bootstrap neither gcc-2.8.0-971206 nor egcs-971207 on sparc-sun-sunos4.1.3
  1997-12-10 23:09 ` Manfred.Hollstein
@ 1997-12-11  3:32   ` Paul Eggert
  1997-12-11  1:51     ` Manfred.Hollstein
  1997-12-22 12:06   ` Jeffrey A Law
  1 sibling, 1 reply; 25+ messages in thread
From: Paul Eggert @ 1997-12-11  3:32 UTC (permalink / raw)
  To: Manfred.Hollstein; +Cc: oliva, gcc2, egcs

   Date: Thu, 11 Dec 97 08:07:31 +0100
   From: Manfred.Hollstein@ks.sel.alcatel.de

   Looks like gcc/egcs on SunOS are seriously broken!

I don't think it's a problem with SunOS in general.  I'm not observing
the bug on my sparc-sun-sunos4.1.4 build with gcc 2.8 1997-12-10.  I'm
using SunOS 4.1.4 with all patches recommended by Sun, and compiling
with -O; I tried -O4 and could still build stamp-attr without problem.

The problem could be specific to SunOS 4.1.3; or it could be a Sun bug
fixed by one of the Sun patches I've installed; or it could be
something else.

Here's how to find out which patches Sun recommends for SunOS 4.1.x.

for SunOS:	see:
4.1.3		ftp://sunsolve.sun.com/pub/patches/Solaris1.1.PatchReport
4.1.3_U1	ftp://sunsolve.sun.com/pub/patches/Solaris1.1.1.PatchReport
4.1.4		ftp://sunsolve.sun.com/pub/patches/Solaris1.1.2.PatchReport


* Re: cannot bootstrap neither gcc-2.8.0-971206 nor egcs-971207 on sparc-sun-sunos4.1.3
  1997-12-10 23:09 ` Manfred.Hollstein
  1997-12-11  3:32   ` Paul Eggert
@ 1997-12-22 12:06   ` Jeffrey A Law
  1997-12-22 22:04     ` Alexandre Oliva
  1997-12-26  8:11     ` New problems with gcc-2.8.0 based code [was: Re: cannot bootstrap neither gcc-2.8.0-971206 nor egcs-971207 on sparc-sun-sunos4.1.3 ] Manfred Hollstein
  1 sibling, 2 replies; 25+ messages in thread
From: Jeffrey A Law @ 1997-12-22 12:06 UTC (permalink / raw)
  To: Manfred.Hollstein; +Cc: Alexandre Oliva, gcc2, egcs

  In message < 9712110707.AA07283@lts.sel.alcatel.de >you write:
  > Same here:
  > 
  > $ gdb genattr core 
  > Core was generated by `genattr'.
  > Program terminated with signal 11, Segmentation fault.
  > #0  0xef7f1d08 in ?? ()
  > Breakpoint 1 at 0x8130
  > (gdb) bt
  > #0  0xef7f1d08 in ?? ()
  > #1  0xef7f1c28 in ?? ()
  > #2  0xef7f0084 in ?? ()
  > #3  0x8180 in memset ()
  > #4  0x6368 in __do_global_ctors ()
  > #5  0x6390 in __main ()
  > #6  0x309c in main ()
  > 
  > Also interesting, cpp built by the stage1 compiler:
  > 
  > $ ./cpp -v -dM
  > ld.so: unidentifiable procedure reference at 0x1e2a4
  > 
  > Looks like gcc/egcs on SunOS are seriously broken!
Are we still having this problem?

jeff


* Re: cannot bootstrap neither gcc-2.8.0-971206 nor egcs-971207 on sparc-sun-sunos4.1.3
  1997-12-22 12:06   ` Jeffrey A Law
@ 1997-12-22 22:04     ` Alexandre Oliva
  1997-12-26  8:11     ` New problems with gcc-2.8.0 based code [was: Re: cannot bootstrap neither gcc-2.8.0-971206 nor egcs-971207 on sparc-sun-sunos4.1.3 ] Manfred Hollstein
  1 sibling, 0 replies; 25+ messages in thread
From: Alexandre Oliva @ 1997-12-22 22:04 UTC (permalink / raw)
  To: law; +Cc: Manfred.Hollstein, gcc2, egcs

Jeffrey A Law writes:

> Are we still having this problem?

No, I was able to bootstrap the latest snapshots of both gcc-2.8.0 and
egcs on SunOS 4.1.3, with exactly the same configuration that had
previously failed.

-- 
Alexandre Oliva
mailto:oliva@dcc.unicamp.br mailto:aoliva@acm.org
http://www.dcc.unicamp.br/~oliva
Universidade Estadual de Campinas, SP, Brasil


* New problems with gcc-2.8.0 based code [was: Re: cannot bootstrap neither gcc-2.8.0-971206 nor egcs-971207 on sparc-sun-sunos4.1.3 ]
  1997-12-22 12:06   ` Jeffrey A Law
  1997-12-22 22:04     ` Alexandre Oliva
@ 1997-12-26  8:11     ` Manfred Hollstein
  1997-12-27 12:13       ` New problems with gcc-2.8.0 based code - NOW FIXED! Manfred Hollstein
  1 sibling, 1 reply; 25+ messages in thread
From: Manfred Hollstein @ 1997-12-26  8:11 UTC (permalink / raw)
  To: law; +Cc: Manfred.Hollstein, oliva, gcc2, egcs

On Mon, 22 December 1997, 13:09:14, law@cygnus.com wrote:

 > 
 >   In message < 9712110707.AA07283@lts.sel.alcatel.de >you write:
 >   > Same here:
 >   > 
 >   > $ gdb genattr core 
 >   > Core was generated by `genattr'.
 >   > Program terminated with signal 11, Segmentation fault.
 >   > #0  0xef7f1d08 in ?? ()
 >   > Breakpoint 1 at 0x8130
 >   > (gdb) bt
 >   > #0  0xef7f1d08 in ?? ()
 >   > #1  0xef7f1c28 in ?? ()
 >   > #2  0xef7f0084 in ?? ()
 >   > #3  0x8180 in memset ()
 >   > #4  0x6368 in __do_global_ctors ()
 >   > #5  0x6390 in __main ()
 >   > #6  0x309c in main ()
 >   > 
 >   > Also interesting, cpp built by the stage1 compiler:
 >   > 
 >   > $ ./cpp -v -dM
 >   > ld.so: unidentifiable procedure reference at 0x1e2a4
 >   > 
 >   > Looks like gcc/egcs on SunOS are seriously broken!
 > Are we still having this problem?
 > 
 > jeff

Sorry, I can't say, as I currently don't have access to the machines
at work.  But, as Alexandre already pointed out, this problem seems to
have gone away.

There are other problems since the merge with gcc-2.8.0, though.

Up to egcs-1.0 (incl. egcs-971201) I've been able to build my Linux
kernel with `-O6 -march=pentium -mcpu=pentium -fomit-frame-pointer
-malign-loops=0 -malign-jumps=0 -malign-functions=0' _and_
`-funroll-all-loops'!

Newer   snapshots and  gcc-2.8.0-971213 don't  allow  me   to do that;
`-funroll-all-loops' causes   `isapnp'  and  `clock'  to   fail   with
`segmentation  violations'.  Omitting   `-funroll-all-loops' helps for
the most current 2.1.7x kernels, but 2.0.33 still fails if compiled by
gcc-2.8.0 and egcs-971215!

Did anybody else see similar symptoms?

Looks - at least to me - like the loop unrolling stuff (and perhaps
other things as well) hasn't really been improved by the merge with
gcc-2.8.0 :-(

Manfred


* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
  1997-12-26  8:11     ` New problems with gcc-2.8.0 based code [was: Re: cannot bootstrap neither gcc-2.8.0-971206 nor egcs-971207 on sparc-sun-sunos4.1.3 ] Manfred Hollstein
@ 1997-12-27 12:13       ` Manfred Hollstein
  1997-12-28 23:23         ` Richard Stallman
  0 siblings, 1 reply; 25+ messages in thread
From: Manfred Hollstein @ 1997-12-27 12:13 UTC (permalink / raw)
  To: manfred; +Cc: law, Manfred.Hollstein, oliva, gcc2, egcs

On Fri, 26 December 1997, 15:11:14, manfred@s-direktnet.de wrote:

 > There are other problems since the merge with gcc-2.8.0, though.
 > 
 > Up to egcs-1.0  (incl. egcs-971201) I've  been able to  built my Linux
 > kernel  with  `-O6 -march=pentium  -mcpu=pentium  -fomit-frame-pointer
 > -malign-loops=0      -malign-jumps=0      -malign-functions=0'   _and_
 > `-funroll-all-loops'!
 > 
 > Newer   snapshots and  gcc-2.8.0-971213 don't  allow  me   to do that;
 > `-funroll-all-loops' causes   `isapnp'  and  `clock'  to   fail   with
 > `segmentation  violations'.  Omitting   `-funroll-all-loops' helps for
 > the most current 2.1.7x kernels, but 2.0.33 still fails if compiled by
 > gcc-2.8.0 and egcs-971215!
 > 
 > Did anybody else see similar symptoms?
 > 
 > Looks - at least  for me - like  the loop unrolling stuff (and perhaps
 > other as well) hasn't really been improved by the merge with gcc-2.8.0
 > :-(

Just compiled my 2.1.76 kernel once again, this time using egcs-971225,
to see if the `-funroll-all-loops' problem still persists.

IT'S FIXED!

I'm writing this e-mail on the freshly compiled kernel and haven't had
any problems so far.  I'll only have to reboot later on to send this
mail out into the world, as ISDN is still broken in the most recent
kernels :-(

Keep up the good work, cheers 1998

Manfred


* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
  1997-12-27 12:13       ` New problems with gcc-2.8.0 based code - NOW FIXED! Manfred Hollstein
@ 1997-12-28 23:23         ` Richard Stallman
  1997-12-29  7:48           ` Jeffrey A Law
  1997-12-29 11:08           ` Manfred Hollstein
  0 siblings, 2 replies; 25+ messages in thread
From: Richard Stallman @ 1997-12-28 23:23 UTC (permalink / raw)
  To: manfred; +Cc: manfred, law, Manfred.Hollstein, oliva, gcc2, egcs

    Just compiled my 2.1.76 kernel once  again this time using egcs-971225
    to see if the `-funroll-all-loops' still persists.

    IT'S FIXED!

Is it fixed in the latest GCC snapshot?  If not, the job isn't done
yet.  Can someone identify what change deals with this, and get it
installed in GCC?



* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
  1997-12-28 23:23         ` Richard Stallman
@ 1997-12-29  7:48           ` Jeffrey A Law
  1997-12-29 11:17             ` Manfred Hollstein
  1997-12-29 11:08           ` Manfred Hollstein
  1 sibling, 1 reply; 25+ messages in thread
From: Jeffrey A Law @ 1997-12-29  7:48 UTC (permalink / raw)
  To: rms; +Cc: manfred, Manfred.Hollstein, oliva, gcc2, egcs

  In message < 199712290718.AAA13426@wijiji.santafe.edu >you write:
  >     Just compiled my 2.1.76 kernel once  again this time using egcs-971225
  >     to see if the `-funroll-all-loops' still persists.
  > 
  >     IT'S FIXED!
  > 
  > Is it fixed in the latest GCC snapshot?  If not, the job isn't done
  > yet.  Can someone identify what change deals with this, and get it
  > installed in GCC?
Manfred -- What was the failure mode?  Mis-compiled code, compiler
abort, etc?

The only unrolling bug we've fixed recently was a problem with
find_splittable_givs trying to split givs with a dest_reg that was
created by loop (which could cause either a segfault in unroll, or
incorrect code).

We fixed this in egcs back in late Nov.  I can forward that fix to
gcc2 if the maintainers want to look at it.
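
For anyone not familiar with the terminology: a biv is a basic
induction variable and a giv is a general induction variable, i.e. a
linear function of a biv.  The sketch below is a hand-written
illustration of what splitting a giv across unrolled copies means, not
what unroll.c actually emits; the function names and the unroll factor
of 2 are assumptions made for the example.

```c
/* Rolled loop: i is the biv, addr is a giv (base + 1*i). */
long sum_rolled(const int *base, int n)
{
    long sum = 0;
    int i;
    for (i = 0; i < n; i++) {
        const int *addr = base + i;
        sum += *addr;
    }
    return sum;
}

/* Hand-unrolled by 2: the giv is split so that each copy of the
   body has its own address value instead of sharing one. */
long sum_unrolled(const int *base, int n)
{
    long sum = 0;
    int i = 0;
    for (; i + 1 < n; i += 2) {
        const int *addr0 = base + i;   /* giv for the first copy  */
        const int *addr1 = addr0 + 1;  /* giv for the second copy */
        sum += *addr0;
        sum += *addr1;
    }
    for (; i < n; i++)                 /* leftover iteration, if any */
        sum += base[i];
    return sum;
}
```

Both functions must return the same result for any input; a bug in the
splitting step shows up as a difference between them (or as a compiler
crash while the split is being made).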

jeff




* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
  1997-12-29 11:17             ` Manfred Hollstein
@ 1997-12-29 10:20               ` Jeffrey A Law
  1997-12-30  8:57                 ` Manfred Hollstein
  0 siblings, 1 reply; 25+ messages in thread
From: Jeffrey A Law @ 1997-12-29 10:20 UTC (permalink / raw)
  To: Manfred Hollstein; +Cc: rms, Manfred.Hollstein, oliva, gcc2, egcs

  In message < 199712291737.SAA07160@saturn.s-direktnet.de >you write:
  > Well, it wasn't a failure of the compiler! The kernel and all the modules
  > could be built successfully, but when booting this kernel some system
  > calls issued by the two programs `isapnp' and `clock' reproducably fail,
  > i.e. cause `segmentation violations' of the two programs.
OK.  That points to either a code generation bug or a bug in the linux
source.

  > I then recompiled isapnp and clock with gcc-2.8.0 and now the situation
  > became even worse: no more SIGSEGV's but kernel oops's!
:(


  > Perhaps I should look at the source of the particular programs and find
  > out using strace what's really going on.
Might be helpful.  Though I suspect it's a kernel module that's being
mis-compiled or is incorrectly written, so the strace might tell us
what syscall is failing.

jeff



* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
  1997-12-28 23:23         ` Richard Stallman
  1997-12-29  7:48           ` Jeffrey A Law
@ 1997-12-29 11:08           ` Manfred Hollstein
  1 sibling, 0 replies; 25+ messages in thread
From: Manfred Hollstein @ 1997-12-29 11:08 UTC (permalink / raw)
  To: rms; +Cc: law, Manfred.Hollstein, oliva, gcc2, egcs

On Mon, 29 December 1997, 00:18:31, rms@santafe.edu wrote:

 >     Just compiled my 2.1.76 kernel once  again this time using egcs-971225
 >     to see if the `-funroll-all-loops' still persists.
 > 
 >     IT'S FIXED!
 > 
 > Is it fixed in the latest GCC snapshot?  If not, the job isn't done
 > yet.  Can someone identify what change deals with this, and get it
 > installed in GCC?
 > 

I just tried the latest gcc2 snapshot (971225), and it's not fixed. Only
egcs-971225 works correctly. I have no idea though, whether it's actually
a gcc or a Linux bug (the fact that linux-2.1.76 works ok with
`-funroll-all-loops' while 2.0.33 doesn't indicates perhaps that Linux is
to blame).

Manfred


* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
  1997-12-29  7:48           ` Jeffrey A Law
@ 1997-12-29 11:17             ` Manfred Hollstein
  1997-12-29 10:20               ` Jeffrey A Law
  0 siblings, 1 reply; 25+ messages in thread
From: Manfred Hollstein @ 1997-12-29 11:17 UTC (permalink / raw)
  To: law; +Cc: rms, Manfred.Hollstein, oliva, gcc2, egcs

On Mon, 29 December 1997, 08:46:13, law@hurl.cygnus.com wrote:

 > 
 >   In message < 199712290718.AAA13426@wijiji.santafe.edu >you write:
 >   >     Just compiled my 2.1.76 kernel once  again this time using egcs-971225
 >   >     to see if the `-funroll-all-loops' still persists.
 >   > 
 >   >     IT'S FIXED!
 >   > 
 >   > Is it fixed in the latest GCC snapshot?  If not, the job isn't done
 >   > yet.  Can someone identify what change deals with this, and get it
 >   > installed in GCC?
 > Manfred -- What was the failure mode?  Mis-compiled code, compiler
 > abort, etc?
 > 
 > The only unrolling bug we've fixed recently was a problem with
 > find_splittable_givs trying to split givs with a dest_reg that was
 > created by loop.
 > (which could cause either a segfault in unroll, or incorrect code).
 > 
 > We fixed this in egcs back in late Nov.  I can forward that fix to
 > gcc2 if the maintainers want to look at it.
 > 

Well, it wasn't a failure of the compiler! The kernel and all the modules
could be built successfully, but when booting this kernel some system
calls issued by the two programs `isapnp' and `clock' reproducibly fail,
i.e. cause `segmentation violations' in the two programs.

I then recompiled isapnp and clock with gcc-2.8.0 and now the situation
became even worse: no more SIGSEGV's but kernel oops's!

Perhaps I should look at the source of the particular programs and find
out using strace what's really going on.

Manfred


* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
  1997-12-29 10:20               ` Jeffrey A Law
@ 1997-12-30  8:57                 ` Manfred Hollstein
  1997-12-30  9:47                   ` Andi Kleen
                                     ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Manfred Hollstein @ 1997-12-30  8:57 UTC (permalink / raw)
  To: law; +Cc: rms, Manfred.Hollstein, oliva, gcc2, egcs

On Mon, 29 December 1997, 11:16:37, law@hurl.cygnus.com wrote:

 >   In message < 199712291737.SAA07160@saturn.s-direktnet.de >you write:
 >   > Well, it wasn't a failure of the compiler! The kernel and all the modules
 >   > could be built successfully, but when booting this kernel some system
 >   > calls issued by the two programs `isapnp' and `clock' reproducably fail,
 >   > i.e. cause `segmentation violations' of the two programs.
 > OK.  That points to either a code generation bug or a bug in the linux
 > source.
 > 
 >   > I then recompiled isapnp and clock with gcc-2.8.0 and now the situation
 >   > became even worse: no more SIGSEGV's but kernel oops's!
 > :(
 > 
 > 
 >   > Perhaps I should look at the source of the particular programs and find
 >   > out using strace what's really going on.
 > Might be helpful.  Though I suspect it's a kernel module that's being
 > mis-compiled or is incorrectly written, so the strace might tell us
 > what syscall is failing.
 > 
 > jeff
 > 

I'm very sorry  for bothering you  with my kernel logs  :-( But as the
replies from RMS and likewise important GNU people indicate, it really
looks  as if we   reached a  situation  where  something needs to   be
clarified between the Linux and the GNU/gcc folks.

I just compiled Linux-2.1.76 twice:

  1. Using egcs-971225
  2. Using gcc-2.8.0-971225

Both compilations used the following flags and finished successfully:

  -march=pentium -mcpu=pentium -malign-loops=0 -malign-jumps=0 \
  -malign-functions=0 -DCPU=586 -fomit-frame-pointer -funroll-all-loops

Booting kernel 1. results in the following (from /var/log/kern.log):

  Dec 30 14:10:51 saturn kernel: klogd 1.3-3, log source = /proc/kmsg started.
  Dec 30 14:10:51 saturn kernel: Loaded 4218 symbols from /System.map.
  Dec 30 14:10:51 saturn kernel: Symbols match kernel version 2.1.76.
  Dec 30 14:10:51 saturn kernel: Error seeking in /dev/kmem 
  Dec 30 14:10:51 saturn kernel: Error adding kernel module table entry. 
  Dec 30 14:10:51 saturn kernel: Linux version 2.1.76 (manfred@saturn) (gcc version egcs-2.91.03 971225 (gcc-2.8.0)) #1 Sun Dec 28 12:20:54 MET 1997 
  Dec 30 14:10:51 saturn kernel: Console: 15 point font, 600 scans 
  Dec 30 14:10:51 saturn kernel: Console: colour VGA+ 100x40, 1 virtual console (max 63) 
  Dec 30 14:10:51 saturn kernel: PCI: BIOS32 Service Directory structure at 0xc00f99e0 
  Dec 30 14:10:51 saturn kernel: PCI: BIOS32 Service Directory entry at 0xf0400 
  Dec 30 14:10:51 saturn kernel: PCI: PCI BIOS revision 2.10 entry at 0xf0430 
  Dec 30 14:10:51 saturn kernel: Probing PCI hardware. 
  Dec 30 14:10:51 saturn kernel: Calibrating delay loop... 79.87 BogoMIPS 
  Dec 30 14:10:51 saturn kernel: Memory: 63132k/65536k available (952k kernel code, 392k reserved, 1028k data, 32k init) 
  Dec 30 14:10:51 saturn kernel: Swansea University Computer Society NET3.039 for Linux 2.1 
  Dec 30 14:10:51 saturn kernel: NET3: Unix domain sockets 0.16 for Linux NET3.038. 
  Dec 30 14:10:51 saturn kernel: Swansea University Computer Society TCP/IP for NET3.037 
  Dec 30 14:10:51 saturn kernel: IP Protocols: ICMP, UDP, TCP 
  Dec 30 14:10:51 saturn kernel: CPU: Intel Pentium 75+ stepping 0c 
  Dec 30 14:10:51 saturn kernel: Checking 386/387 coupling... Ok, fpu using exception 16 error reporting. 
  Dec 30 14:10:51 saturn kernel: Checking 'hlt' instruction... Ok. 
  Dec 30 14:10:51 saturn kernel: Intel Pentium with F0 0F bug - workaround enabled. 
  Dec 30 14:10:51 saturn kernel: POSIX conformance testing by UNIFIX 
  Dec 30 14:10:51 saturn kernel: Starting kswapd v 1.23  
  Dec 30 14:10:51 saturn kernel: Serial driver version 4.24 with no serial options enabled 
  Dec 30 14:10:51 saturn kernel: ttyS00 at 0x03f8 (irq = 4) is a 16550A 
  Dec 30 14:10:51 saturn kernel: ttyS01 at 0x02f8 (irq = 3) is a 16550A 
  Dec 30 14:10:51 saturn kernel: Uniform CD-ROM driver revision 2.0 
  Dec 30 14:10:51 saturn kernel: aic7xxx: <Adaptec AHA-294X SCSI host adapter> at PCI 12 
  Dec 30 14:10:51 saturn kernel: aic7xxx: BIOS enabled, IO Port 0xd800, IO Mem 0xe3800000, IRQ 11, Revision B 
  Dec 30 14:10:51 saturn kernel: aic7xxx: Single Channel, SCSI ID 7, 16/255 SCBs, QFull 16, QMask 0x1f 
  Dec 30 14:10:51 saturn kernel: scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 4.1/3.2 
  Dec 30 14:10:51 saturn kernel: scsi : 1 host. 
  Dec 30 14:10:51 saturn kernel: scsi0: Scanning channel A for devices. 
  Dec 30 14:10:51 saturn kernel:   Vendor: IBM       Model: DORS-32160        Rev: S82C 
  Dec 30 14:10:51 saturn kernel:   Type:   Direct-Access                      ANSI SCSI revision: 02 
  Dec 30 14:10:51 saturn kernel: Detected scsi disk sda at scsi0, channel 0, id 0, lun 0 
  Dec 30 14:10:51 saturn kernel:   Vendor: IBM       Model: DORS-32160        Rev: WA0A 
  Dec 30 14:10:51 saturn kernel:   Type:   Direct-Access                      ANSI SCSI revision: 02 
  Dec 30 14:10:51 saturn kernel: Detected scsi disk sda at scsi0, channel 0, id 1, lun 0 
  Dec 30 14:10:51 saturn kernel:   Vendor: PIONEER   Model: CD-ROM DR-124X    Rev: 1.00 
  Dec 30 14:10:51 saturn kernel:   Type:   CD-ROM                             ANSI SCSI revision: 02 
  Dec 30 14:10:51 saturn kernel: scsi : detected 2 SCSI disks total. 
  Dec 30 14:10:51 saturn kernel: SCSI device sda: hdwr sector= 512 bytes. Sectors= 4226725 [2063 MB] [2.1 GB] 
  Dec 30 14:10:51 saturn kernel: SCSI device sda: hdwr sector= 512 bytes. Sectors= 4226725 [2063 MB] [2.1 GB] 
  Dec 30 14:10:51 saturn kernel: Partition check: 
  Dec 30 14:10:51 saturn kernel:  sda: sda1 sda2 sda3 < sda5 > 
  Dec 30 14:10:51 saturn kernel:  sdb: sdb1 < sdb5 sdb6 sdb7 sdb8 > 
  Dec 30 14:10:51 saturn kernel: VFS: Mounted root (ext2 filesystem) readonly. 
  Dec 30 14:10:51 saturn kernel: Freeing unused kernel memory: 32k freed 
  Dec 30 14:10:51 saturn kernel: Adding Swap: 130748k swap-space (priority -1) 
  Dec 30 14:10:51 saturn kernel: Soundblaster audio driver Copyright (C) by Hannu Savolainen 1993-1996 

and so on .. nothing really interesting happens.
BUT booting kernel 2. proceeds comparably only until the swap space is
added:

[Same messages deleted]
  Dec 30 13:58:50 saturn kernel: Freeing unused kernel memory: 32k freed 
  Dec 30 13:58:50 saturn kernel: Adding Swap: 130748k swap-space (priority -1) 
  Dec 30 13:58:50 saturn kernel: kmem_free: Either bad obj addr or double free (objp=c3e7b800, name=size-2048) 
  Dec 30 13:58:50 saturn kernel: kmem_free: Either bad obj addr or double free (objp=c3e7b000, name=size-2048) 
  Dec 30 13:58:50 saturn kernel: kmem_free: Either bad obj addr or double free (objp=c3e7a800, name=size-2048) 
  Dec 30 13:58:50 saturn kernel: kmem_free: Either bad obj addr or double free (objp=c3e7a000, name=size-2048) 
  Dec 30 13:58:50 saturn kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000018 
  Dec 30 13:58:50 saturn kernel: current->tss.cr3 = 00101000, ^_r3 = 00101000 
  Dec 30 13:58:50 saturn kernel: *pde = 00000000 
  Dec 30 13:58:50 saturn kernel: Oops: 0000 
  Dec 30 13:58:50 saturn kernel: CPU:    0 
  Dec 30 13:58:50 saturn kernel: EIP:    0010:[<c01275d9>] 
  Dec 30 13:58:50 saturn kernel: EFLAGS: 00010002 
  Dec 30 13:58:50 saturn kernel: eax: 00000100   ebx: c02f52c0   ecx: 00000000   edx: c0221000 
  Dec 30 13:58:50 saturn kernel: esi: c3e79800   edi: 00000202   ebp: c3e79800   esp: c3e25f88 
  Dec 30 13:58:50 saturn kernel: ds: 0018   es: 0018   ss: 0018 
  Dec 30 13:58:50 saturn kernel: Process modutils (pid: 25, process nr: 8, stackpage=c3e25000) 
  Dec 30 13:58:50 saturn kernel: Stack: c3e24000 c3e887c0 00000400 00000021 c011b96d c009cda0 c011b99d c3e79800  
  Dec 30 13:58:50 saturn kernel:        c3e24000 ffffffff fffffffc 00000000 c011c14c 00000000 c0109f1a 00000000  
  Dec 30 13:58:50 saturn kernel:        00000000 400d2c44 ffffffff fffffffc 00000000 00000001 0000002b 0000002b  
  Dec 30 13:58:50 saturn kernel: Call Trace: [<c011b96d>] [<c011b99d>] [<c011c14c>] [<c0109f1a>]  
  Dec 30 13:58:50 saturn kernel: Code: 2b 69 18 89 e8 31 d2 f7 73 08 8b 51 04 8d 14 82 89 54 24 10  
  Dec 30 13:58:50 saturn kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000018 
  Dec 30 13:58:50 saturn kernel: current->tss.cr3 = 00101000, ^_r3 = 00101000 
  Dec 30 13:58:50 saturn kernel: *pde = 00000000 
  Dec 30 13:58:50 saturn kernel: Oops: 0000 
  Dec 30 13:58:50 saturn kernel: CPU:    0 
  Dec 30 13:58:50 saturn kernel: EIP:    0010:[<c01275d9>] 
  Dec 30 13:58:50 saturn kernel: EFLAGS: 00010002 
  Dec 30 13:58:50 saturn kernel: eax: 00000100   ebx: c02f52c0   ecx: 00000000   edx: c0221000 
  Dec 30 13:58:50 saturn kernel: esi: c3e79000   edi: 00000202   ebp: c3e79000   esp: c3e1ff88 
  Dec 30 13:58:50 saturn kernel: ds: 0018   es: 0018   ss: 0018 
  Dec 30 13:58:50 saturn kernel: Process modutils (pid: 26, process nr: 9, stackpage=c3e1f000) 
  Dec 30 13:58:50 saturn kernel: Stack: c3e1e000 c3e898e0 00000400 00000021 c011b96d c009cfa0 c011b99d c3e79000  
  Dec 30 13:58:50 saturn kernel:        c3e1e000 ffffffff fffffffc 00000000 c011c14c 00000000 c0109f1a 00000000  
  Dec 30 13:58:50 saturn kernel:        00000000 400d2c44 ffffffff fffffffc 00000000 00000001 0000002b 0000002b  
  Dec 30 13:58:50 saturn kernel: Call Trace: [<c011b96d>] [<c011b99d>] [<c011c14c>] [<c0109f1a>]  
  Dec 30 13:58:50 saturn kernel: Code: 2b 69 18 89 e8 31 d2 f7 73 08 8b 51 04 8d 14 82 89 54 24 10  
  Dec 30 13:58:50 saturn kernel: kmem_free: Either bad obj addr or double free (objp=c3c57000, name=size-2048) 
  Dec 30 13:58:50 saturn kernel: kmem_free: Either bad obj addr or double free (objp=c3c56800, name=size-2048) 
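
The first bytes of the `Code:' line can be decoded by hand.  Assuming
the dump starts at the faulting EIP (an assumption, not something the
log states), 0x2b is the opcode for `SUB r32, r/m32', 0x69 is its
ModRM byte and 0x18 an 8-bit displacement.  The small decoder below is
only a sketch with made-up helper names; it shows how the ModRM byte
splits into fields and how the effective address is formed.

```c
#include <stdint.h>

/* The three fields of an x86 ModRM byte. */
struct modrm {
    unsigned mod;  /* addressing form; 1 means [register + disp8]   */
    unsigned reg;  /* register operand; 5 is ebp                     */
    unsigned rm;   /* base register for the memory form; 1 is ecx   */
};

struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m;
    m.mod = (byte >> 6) & 3;
    m.reg = (byte >> 3) & 7;
    m.rm  = byte & 7;
    return m;
}

/* Effective address for the mod == 1 form:
   base register value plus the sign-extended 8-bit displacement. */
uint32_t effective_address_disp8(uint32_t base_reg_value, uint8_t disp8)
{
    return base_reg_value + (uint32_t)(int32_t)(int8_t)disp8;
}
```

decode_modrm(0x69) gives mod 1, reg 5 (ebp) and rm 1 (ecx), i.e. the
instruction subtracts the word at ecx+0x18 from ebp.  With ecx equal to
00000000 as in the register dump, the address is 0x18, which matches
the reported fault address.  That would be consistent with a field at
offset 0x18 being read through a NULL pointer, though which structure
that is cannot be told from the log alone.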

And so on - a lot of kernel Oops's follow.  I really don't know who's
to blame?!?!  Looking at Linux's `mm/slab.c', where the `kmem_free:...'
messages come from, doesn't really help, as there is _NO_ loop! BUT,
omitting `-funroll-all-loops' during compilation of the kernel results
in a working kernel without such messages?!?!

To summarize:

   - Kernel 2.1.76 built by egcs-971225 _and_ `-funroll-all-loops'
     works OK!
   - Kernel 2.1.76 built by egcs-971207 and -971215 and
     `-funroll-all-loops' DOESN'T work (for egcs-971207 we did
     the gcc-2.8.0 MERGE)!
   - Kernel 2.1.76 built by egcs-1.0 (and all previous snapshots!)
     and `-funroll-all-loops' works OK!
   - Kernel 2.1.76 built by gcc-2.8.0-971211, -971213 and -971225
     and `-funroll-all-loops' DOESN'T work!

Interestingly,    kernel    2.0.33    built   by  egcs-971225    _and_
`-funroll-all-loops' also doesn't work!

So, perhaps it's actually Linux or its applications that are to blame?!?
Does anybody know an e-mail address to which we can forward these
questions?

Nevertheless, cheers 1998

Manfred


* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
  1997-12-30  8:57                 ` Manfred Hollstein
@ 1997-12-30  9:47                   ` Andi Kleen
  1998-01-01 10:02                     ` Manfred Hollstein
  1997-12-30 11:33                   ` Philip Blundell
  1997-12-30 13:17                   ` Paul Koning
  2 siblings, 1 reply; 25+ messages in thread
From: Andi Kleen @ 1997-12-30  9:47 UTC (permalink / raw)
  To: Manfred Hollstein; +Cc: law, rms, Manfred.Hollstein, oliva, gcc2, egcs

Manfred Hollstein <manfred@s-direktnet.de> writes:
> 
> To summarize:
> 
>    - Kernel 2.1.76 built by egcs-971225 _and_ `-funroll-all-loops'
>      works OK!
>    - Kernel 2.1.76 built by egcs-971207 and -971215 and
>      `-funroll-all-loops' DOESN'T work (for egcs-971207 we did
>      the gcc-2.8.0 MERGE)!
>    - Kernel 2.1.76 built by egcs-1.0 (and all previous snapshots!)
>      and `-funroll-all-loops' works OK!
>    - Kernel 2.1.76 built by gcc-2.8.0-971211, -971213 and -971225
>      and `-funroll-all-loops' DOESN'T work!

I remember a bugfix between egcs 1.0 and 1.0.1 so that it treats 
__asm__ statements without output operands always as volatile. The Linux
kernel depends on this behaviour. If gcc 2.8.0 doesn't have this bugfix
it could be the cause for your problems.

-Andi


* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
  1997-12-30  8:57                 ` Manfred Hollstein
  1997-12-30  9:47                   ` Andi Kleen
@ 1997-12-30 11:33                   ` Philip Blundell
  1997-12-30 13:17                   ` Paul Koning
  2 siblings, 0 replies; 25+ messages in thread
From: Philip Blundell @ 1997-12-30 11:33 UTC (permalink / raw)
  To: Manfred Hollstein; +Cc: law, rms, Manfred.Hollstein, oliva, gcc2, egcs

>  Dec 30 13:58:50 saturn kernel: Call Trace: [<c011b96d>] [<c011b99d>] [<c011c14c>] [<c0109f1a>]

Can you run this through the ksymoops decoder to get a meaningful backtrace 
(with function names)?

>And so on - a lot of kernel Oopses follow.  I really don't know who's
>to blame?!?!  Looking at Linux's `mm/slab.c', where the `kmem_free:...'
>messages come from, doesn't really help, as there is _NO_ loop! BUT,

More likely it's the function that calls the SLAB code that is to blame.  If 
you find out what that is and compile it with -S -funroll-all-loops, that 
might be the way forward.

>Does anybody know an e-mail address to which we can forward these
>questions?

linux-kernel@vger.rutgers.edu is probably your best bet.

p.




* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
  1997-12-30  8:57                 ` Manfred Hollstein
  1997-12-30  9:47                   ` Andi Kleen
  1997-12-30 11:33                   ` Philip Blundell
@ 1997-12-30 13:17                   ` Paul Koning
  2 siblings, 0 replies; 25+ messages in thread
From: Paul Koning @ 1997-12-30 13:17 UTC (permalink / raw)
  To: manfred; +Cc: law, rms, gcc2, egcs

GCC 2.7.2.1 had a bug that showed up in some backends, and EGCS
exposed it with more backends, where asm statements that should have
been treated as unmovable were actually moved.  (Specifically, those
with no output arguments; the documentation quite rightly says that
those are always treated as if they were marked "volatile" even when
not so marked.)

That bug was fixed in egcs just a few weeks ago.  If gcc 2.8.x doesn't
have the corresponding fix, that could account for the problem.  As I
recall from the earlier discussion, one symptom of the bug was that
Linux would crash or otherwise misbehave, but only if you turned on
"enough" optimization.

The fix was only a few lines; you might try the experiment of applying
it to gcc 2.8 if the relevant module is reasonably similar.
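
For code that has to build with compilers both with and without the
fix, one defensive option is to spell out the qualifier.  The sketch
below is only an illustration, assuming GCC-style extended asm; the
names barrier_explicit, flag and publish_and_read are hypothetical and
come from neither the kernel nor the patch under discussion.

```c
/* Hypothetical kernel-style helper: an asm with no outputs whose whole
   purpose is a side effect (here, stopping the compiler from caching
   memory values across it).  Writing __volatile__ explicitly means the
   code does not depend on whether the compiler has the fix. */
#define barrier_explicit() __asm__ __volatile__ ("" : : : "memory")

static int flag;

/* Set the flag, then read it back on the far side of the barrier.  The
   "memory" clobber tells the compiler that the asm may read or write
   memory, so the store is emitted before it and the load after it. */
static int publish_and_read(int value)
{
    flag = value;
    barrier_explicit();
    return flag;
}
```

With the explicit qualifier the statement gets the same treatment
whether or not the compiler applies the implicit rule.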

	paul


* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
  1997-12-30  9:47                   ` Andi Kleen
@ 1998-01-01 10:02                     ` Manfred Hollstein
  0 siblings, 0 replies; 25+ messages in thread
From: Manfred Hollstein @ 1998-01-01 10:02 UTC (permalink / raw)
  To: gcc2; +Cc: law, rms, Manfred.Hollstein, oliva, egcs

On 30 December 1997, 18:50:24, ak@muc.de wrote:

 > I remember a bugfix between egcs 1.0 and 1.0.1 so that it treats 
 > __asm__ statements without output operands always as volatile. The Linux
 > kernel depends on this behaviour. If gcc 2.8.0 doesn't have this bugfix
 > it could be the cause for your problems.
 > 

After the discussion about __volatile__ __asm__ statements I just
added Jeffrey Law's patch to gcc-2.8.0-971225:

Mon Dec 15 08:48:24 1997  Jeffrey A Law  (law@cygnus.com)

	* stmt.c (expand_asm_operands): If an ASM has no outputs, then treat
	it as volatile.

diff --context --recursive --show-c-function -x *.o -x *.info* -x *.html* -x *.elc -x *.dvi -x *.orig -x *~ -x version.el gcc-2.8.0.orig/gcc/stmt.c gcc-2.8.0/gcc/stmt.c
*** gcc-2.8.0.orig/gcc/stmt.c	Tue Dec  9 01:07:38 1997
--- gcc-2.8.0/gcc/stmt.c	Wed Dec 31 23:42:54 1997
*************** expand_asm_operands (string, outputs, in
*** 1421,1426 ****
--- 1421,1430 ----
    /* The insn we have emitted.  */
    rtx insn;
  
+   /* An ASM with no outputs needs to be treated as volatile.  */
+   if (noutputs == 0)
+     vol = 1;
+ 
    if (output_bytecode)
      {
        error ("`asm' is invalid when generating bytecode");

rebuilt and installed the compiler. I then used this compiler to build
linux-2.1.76 once again, and I'm still getting the same boot failures
(the _same_ Oops 16 times between boot and reboot):

 Unable to handle kernel NULL pointer dereference at virtual address 00000018 
 current->tss.cr3 = 00101000, ^_r3 = 00101000 
 *pde = 00000000 
 Oops: 0000 
 CPU:    0 
 EIP:    0010:[<kfree+12b/1cd>] 
 EFLAGS: 00010002 
 eax: 00000100   ebx: c02f52c0   ecx: 00000000   edx: c0221000 
 esi: c3e79800   edi: 00000202   ebp: c3e79800   esp: c3e25f88 
 ds: 0018   es: 0018   ss: 0018 
 Process modutils (pid: 25, process nr: 8, stackpage=c3e25000) 
 Stack: c3e24000 c3e887c0 00000400 00000021 c011b96d c009cda0 c011b99d c3e79800  
        c3e24000 ffffffff fffffffc 00000000 c011c14c 00000000 c0109f1a 00000000  
        00000000 400d2c44 ffffffff fffffffc 00000000 00000001 0000002b 0000002b  
 Call Trace: [<do_exit+1f7/29b>] [<do_exit+227/29b>] [<sys_waitpid>] [<system_call+3a/40>]
 Code: c01275d9 <kfree+12b/1cd>  2b 69 18       	subl   0x18(%ecx),%ebp
 Code: c01275dc <kfree+12e/1cd>  89 e8          	movl   %ebp,%eax
 Code: c01275de <kfree+130/1cd>  31 d2          	xorl   %edx,%edx
 Code: c01275e0 <kfree+132/1cd>  f7 73 08       	divl   0x8(%ebx),%eax
 Code: c01275e9 <kfree+13b/1cd>  8b 51 04       	movl   0x4(%ecx),%edx
 Code: c01275ec <kfree+13e/1cd>  8d 14 82       	leal   (%edx,%eax,4),%edx
 Code: c01275ef <kfree+141/1cd>  89 54 24 10    	movl   %edx,0x10(%esp,1)


Another problem has been shown by several test programs from
libg++-2.8.0b6.5.  As an example, try to compile the following
snippet:

/* File: t002.cc
 * Compiled by gcc-2.8.0-971225 using
 *		optimization
 *	  _and_ -funroll-all-loops
 *	  _and_	-g
 * results in an `Internal compiler error'.  */

#include <String.h>
#include <Regex.h>
#include <iostream.h>

int main (void)
{
	String z = "This string\thas\nfive words";
	String w[10];
	int nw = split (z, w, 10, RXwhite);

	for (int i = 0; i < nw; ++i)
	{
		cout << "z[" << i << "] = \"" << w[i] << "\"" << endl;
	}

	return 0;
}

$ gcc -v
Reading specs from /tools/gnu/lib/gcc-lib/i586-linux-gnulibc1/2.8.0/specs
gcc version 2.8.0
$ uname -a
Linux saturn 2.0.33 #7 Mon Dec 29 13:09:29 MET 1997 i586 unknown
$ gcc -O -funroll-all-loops -g -S t002.cc
t002.cc: In function `int main()':
t002.cc:17: Internal compiler error.
t002.cc:17: Please submit a full bug report to `bug-g++@prep.ai.mit.edu'.

Omitting either `-funroll-all-loops' or `-g' works, though.

If you need the expanded .ii file send me an e-mail.

And you guys from the gcc2 mailing list, please add my e-mail address
to the To: or Cc: fields explicitly, as I still couldn't get onto the
mailing list :-(

Manfred


* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
       [not found] ` <199801030231.DAA25241@mail.macqel.be>
@ 1998-01-05 10:13   ` Paul Koning
  0 siblings, 0 replies; 25+ messages in thread
From: Paul Koning @ 1998-01-05 10:13 UTC (permalink / raw)
  To: phdm; +Cc: Philip.Blundell, rms, kenner, gcc2, egcs

>>>>> "Philippe" == Philippe De Muyter <phdm@macqel.be> writes:

 >> Perhaps in the short term we should arrange for asms with no
 >> outputs to be automatically marked volatile, and to issue a
 >> warning (something like "nonvolatile `asm' has no outputs" would
 >> do) to point out to people that they are on shaky ground.  Also,
 >> we should fix the manual to be unambiguous.  Then,
 Philippe> Perhaps `automatically' could depend on a
 Philippe> -foutputless-asm-is-volatile flag, just like g++ has a flag
 Philippe> to ask for the old `for'-scope rule.

All those things are possible.  Any of them could be the best choice
if there are cases where it is really *useful* or *important* for
outputless asm to be treated as not volatile.

But I don't think there are any such cases.  I haven't seen anyone
mention any.  (If they were mentioned in a message to gcc2 only, my
apologies, I haven't yet succeeded in subscribing to that so if it
wasn't sent to egcs or cc'ed to me I haven't seen it.)  Richard did
mention some cases where treating outputless asm as non-volatile is
*harmless*, but that's a different matter.

The current state is that there is an explicit sentence in the
documentation that says outputless asm is volatile.  (The example
directly above it muddles this by saying "volatile" explicitly.  But
it's hard to argue that this means the documentation intended the
opposite, because then how would you explain the existence of the
sentence that says outputless means volatile?)

Furthermore, the fix to make the compiler do that is one line long.
And it has been strongly argued and never refuted that doing this will
make code more reliable.

So, with all that it really baffles me how much energy is being spent
proposing alternatives that will create more work, more confusing
documentation, more switches in the user interface, all to enable a
capability (non-volatile outputless asm) that serves no purpose.

	paul


* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
  1997-12-30 14:41 ` Paul Koning
@ 1997-12-30 23:05   ` Richard Stallman
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Stallman @ 1997-12-30 23:05 UTC (permalink / raw)
  To: kenner, pkoning; +Cc: gcc2, egcs

When we consider whether to support a certain optimization, we need to
keep in mind why we want optimizations in the first place.

An optimization is worth having if it gives users a substantial
speedup, substantially often.  In that case, it may be worth some
sacrifice to have the optimization.  But if the optimization happens
only rarely and provides small benefit, then it is not worth paying
any price for--especially not if the users pay the price.

Perhaps there are some cases of asm without outputs where it is safe
to permit moving the asm.  But it is clear that the benefits to be had
by such optimization are little and rare.  So given the choice between
permitting this optimization in rare cases, and any other benefit that
really matters, the latter wins hands down.

Therefore it is certain that GCC should do what the manual now says:
an asm with no outputs is treated as volatile.


* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
       [not found] <9712302324.AA01798@vlsi1.ultra.nyu.edu>
@ 1997-12-30 16:39 ` Paul Koning
  0 siblings, 0 replies; 25+ messages in thread
From: Paul Koning @ 1997-12-30 16:39 UTC (permalink / raw)
  To: kenner; +Cc: gcc2, egcs

>>>>> "Richard" == Richard Kenner <kenner@vlsi1.ultra.nyu.edu> writes:

 Paul> The alternatives are (a) to allow the compiler to do
 Paul> something it's not documented to do and that will cause
 Paul> subtle bugs, or (b) require the compiler to do something
 Paul> that it is documented to do, does no harm in any case, and
 Paul> avoids hairy bugs.  I think the right choice is pretty
 Paul> clear.

 Richard> First of all, there's no question that, at least in the
 Richard> short term, we have to make this change since there's code
 Richard> out there that depends on this behavior.

If I understand right, what you're saying is that the near-term change
should be to have input-only asms default to volatile.  Right?  Sounds
good.

 Richard> But the issue is what should the documented behavior *be*.
 Richard> In other words, how do we disambiguate the documentation.

Um, so are you saying then that you want the compiler to change to
match the documentation, and then later change both documentation and
compiler to behave once again the way they do in 2.7?

 Richard> There are *lots* of ways to write undefined C that can cause
 Richard> "subtle bugs" (things like "a[i++] * 2 + a[i]" are good
 Richard> examples of that).  We don't decide on the semantics of a
 Richard> language based on the fact that a programmer can use it
 Richard> improperly!

Actually, in many programming languages the likelihood of subtle bugs
IS a language design consideration.  Look at the Algol-68 report for
examples.  On the other hand, it's clear that C is an exception to
this.  But yes, if a feature has significant benefits, then the fact
that it can also create subtle bugs is usually not enough reason to
exclude it.
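
For reference, the quoted expression "a[i++] * 2 + a[i]" is undefined
because i is modified and also read elsewhere in the same expression
with nothing fixing the order.  A minimal sketch of a well-defined
version follows; it assumes the intended meaning was "old element
doubled, plus the next element", and the name twice_then_next is
hypothetical.

```c
/* Well-defined rewrite of the undefined expression a[i++] * 2 + a[i]:
   the read of the old element, the increment, and the read of the new
   element are separate statements, so their order is fixed. */
static int twice_then_next(const int *a, int *i)
{
    int first = a[*i];   /* element at the old index */
    ++*i;                /* advance the index exactly once */
    return first * 2 + a[*i];
}
```

If the other reading was intended (both accesses using the old index),
the increment simply moves after the return value is computed.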

On the other hand, when considering a design decision where the issue
isn't forced by some standards committee, my inclination would be to
reduce the number of subtle-bug-inducing features rather than
increasing it, ALL OTHER THINGS BEING EQUAL.  And no one has made an
argument so far that there is any BENEFIT to doing it the other way.
So, on one side of the balance there is a benefit (arguably small, but
a benefit nonetheless) and on the other side of the balance there is
nothing.  So the balance tips, doesn't it?

	paul


* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
       [not found] <9712302109.AA01369@vlsi1.ultra.nyu.edu>
@ 1997-12-30 14:41 ` Paul Koning
  1997-12-30 23:05   ` Richard Stallman
  0 siblings, 1 reply; 25+ messages in thread
From: Paul Koning @ 1997-12-30 14:41 UTC (permalink / raw)
  To: kenner; +Cc: gcc2, egcs

>>>>> "Richard" == Richard Kenner <kenner@vlsi1.ultra.nyu.edu> writes:

 Paul> You mentioned the issue of an asm with inputs only that
 Paul> should still be treated as non-volatile.  I have a hard time
 Paul> conceiving of such a beast.  Can you show a real world
 Paul> example?

 Richard> Lighting some light. In such cases, you don't care whether
 Richard> or not it was moved around a few statements in the basic
 Richard> block.  Indeed, I'd say that *most* occurrences of asm would
 Richard> be in this category.

Hm.  I suppose that's possible, IF you're just trying to turn it on.
It doesn't work if you want it to blink, because the compiler can move
the on/off asm statement right around your blink delay unless you use
the volatile rule.
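
A rough sketch of the blink case, under the assumption of GCC-style
extended asm.  Everything named here (led_state, led_set, blink_delay,
blink) is hypothetical, and the empty asm template only stands in for
whatever instruction would really drive the light; the LED itself is
simulated with a variable so the example can be run anywhere.

```c
/* Simulated LED register; on real hardware this would be a device address. */
static volatile int led_state;
static int transitions;

/* Hypothetical helper: the empty asm stands in for the real "set LED"
   instruction.  It has an input operand but no outputs, so under the
   documented rule it is treated as volatile and is not moved. */
static void led_set(int on)
{
    __asm__ ("" : : "r" (on) : "memory");
    if (led_state != on)
        transitions++;
    led_state = on;
}

/* Hypothetical busy-wait delay; the volatile counter keeps the loop alive. */
static void blink_delay(void)
{
    volatile int n;
    for (n = 0; n < 1000; n++)
        ;
}

/* Blink the LED `times' times and return the number of state changes. */
static int blink(int times)
{
    int i;
    transitions = 0;
    led_state = 0;
    for (i = 0; i < times; i++) {
        led_set(1);
        blink_delay();
        led_set(0);
        blink_delay();
    }
    return transitions;
}
```

If the on and off asms were allowed to drift across the delay, the
light could end up on for no visible time at all, which is the failure
described above.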

As for "most occurrences", I beg to differ.  I can't think of any
outputless asm statement I've ever written or seen anywhere that IS in
the "movable" category.  The example you gave is the first I've seen
that comes even close.

And in that example it would be perfectly harmless to use the rule as
documented (i.e., the asm is volatile).  Is there any example of an
input-only asm statement that *should not* be treated as volatile,
i.e., it actually hurts to make it non-movable automatically?  If such
a thing realistically exists it might make sense to invent a
"novolatile" keyword; without meaningful examples, the right answer is
to do what's documented.

Considering Joe Buck's analogy: would you consider it acceptable for a
compiler to permute the order of void function invocations unless
explicitly instructed not to do so?  I don't think so.  An asm
statement is also a function invocation, it just has a somewhat
different syntax.

	paul


* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
       [not found] <9712301949.AA01218@vlsi1.ultra.nyu.edu>
@ 1997-12-30 13:52 ` Paul Koning
  1997-12-30 12:27   ` Linus Torvalds
  0 siblings, 1 reply; 25+ messages in thread
From: Paul Koning @ 1997-12-30 13:52 UTC (permalink / raw)
  To: kenner; +Cc: gcc2, egcs, torvalds

>>>>> "Richard" == Richard Kenner <kenner@vlsi1.ultra.nyu.edu> writes:

 Paul> GCC 2.7.2.1 had a bug that showed up in some backends, and
 Paul> EGCS exposed it with more backends, where asm statements
 Paul> that should have been treated as unmovable were actually
 Paul> moved.  (Specifically, those with no output arguments; the
 Paul> documentation quite rightly says that those are always
 Paul> treated as if they were marked "volatile" even when not so
 Paul> marked.)

 Richard> Actually, the documentation is quite unclear since the
 Richard> example of "volatile" is one where there are no outputs and
 Richard> hence the "volatile" was unneeded according to a literal
 Richard> reading of the section.

I'd say the documentation is quite clear; the words "An instruction
without output operands will not be deleted or moved significantly,
regardless" are not particularly ambiguous.  The only thing that's at
all strange is that the example directly above it has no outputs and
yet is tagged "volatile".  Of course, that would explain the presence
of the word "regardless".

 Richard> I think the clear intent of the manual was to say that an
 Richard> asm without *operands* (i.e., an "old style") asm would
 Richard> always be treated as volatile, but that for others, you had
 Richard> to specify "volatile" to have it be treated that way.

I disagree (see below).  More importantly, it's very clear that the
writers of Linux code interpreted the documentation the same as I did.

 Richard> That's what the code does and what makes the most sense.

 Richard> If we say that asms with only inputs are *always* volatile,
 Richard> we have no way to write one that isn't volatile since there
 Richard> is no "novolatile" keyword!

 Richard> So my inclination is to treat this as a documentation error
 Richard> and a Linux bug, though we may want 2.8 to treat it this way to
 Richard> give the Linux folks time to change things.

Well, it's not just 2.8.  The problem is there already in 2.7.2.1 at
least for some code sequences and for some backends.  (I tripped over
it in some R4000 code with 2.7.2.1.)  What has happened since then is
that the optimizers keep getting better, so more and more code that
was written in conformance to the documentation is now broken in
subtle and hard-to-debug ways.

More to the point, I believe the documented behavior is the most
logical, for the following reason:

The asm statement, just like any other statement or any function,
performs some operation that produces some effect in the system.  That
effect may consist of explicit results, or side effects, or both.

If an asm has outputs, it may be that these explicit outputs are its
only effect on the system state, i.e., it has no side effects.  If so,
the optimizer has all the information it needs to do the right thing.
However, an asm with explicit outputs may also have side effects, in
which case the optimizer should not move it around.  "volatile" warns
the optimizer of this.

It seems to be a common guideline that one tries to avoid combining
side effects and explicit outputs, so defaulting explicit output asm
statements to "not volatile" is logical.

On the other hand, an asm that has no explicit output clearly must
have side effects, otherwise it wouldn't have been put there.  So for
such an asm, "volatile" is the right treatment.
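
The three cases can be sketched as follows, assuming GCC-style
extended asm.  The function names are hypothetical and the templates
are left empty so that the sketch builds on any target; in real code
the string would hold actual instructions.

```c
/* An asm whose only effect is its explicit output.  The empty template
   emits no instruction; the "0" constraint ties the input to the same
   register as output operand 0, so `out' simply receives `in'.  Without
   "volatile" the optimizer is free to move, merge or delete it. */
static int pure_copy(int in)
{
    int out;
    __asm__ ("" : "=r" (out) : "0" (in));
    return out;
}

/* An asm with an output that also has a side effect the compiler cannot
   see (imagine it reads and clears a device status register).  Here the
   author has to say "volatile" explicitly. */
static int copy_with_side_effect(int in)
{
    int out;
    __asm__ __volatile__ ("" : "=r" (out) : "0" (in));
    return out;
}

/* An asm with no outputs at all: it can only matter through side
   effects, so the documented rule treats it as volatile by itself. */
static void side_effect_only(int in)
{
    __asm__ ("" : : "r" (in));
}
```

Only the second case needs the keyword from the programmer; the third
gets the same treatment from the rule itself.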

You mentioned the issue of an asm with inputs only that should still
be treated as non-volatile.  I have a hard time conceiving of such a
beast.  Can you show a real world example?

Whether an asm has inputs or not doesn't enter into the above
analysis, which is why I believe it is right to base the treatment
only on whether there are output operands.


* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
       [not found] <199712302041.MAA25118@atrus.synopsys.com>
@ 1997-12-30 13:45 ` Toon Moene
  0 siblings, 0 replies; 25+ messages in thread
From: Toon Moene @ 1997-12-30 13:45 UTC (permalink / raw)
  To: Joe Buck; +Cc: egcs

Joe, you must be kiddin'

>  Imagine a C-like language where functions are pure
>  (side-effect-free) by default, and one had to say "impure"
>  to turn this off.  Clearly it would be reasonable to make
>  void functions "impure" by default, and it would seem
>  strange to be swayed by the argument that a pure void
>  function cannot then be written.  A pure void function
>  is necessarily a no-op, same as a nonvolatile asm
>  instruction with no outputs.

You are describing a Fortran-95 PURE FUNCTION :-)
[ IMPURE functions, obviously, are SUBROUTINEs ... ]

Cheers,
Toon.


* Re: New problems with gcc-2.8.0 based code - NOW FIXED!
  1997-12-30 13:52 ` Paul Koning
@ 1997-12-30 12:27   ` Linus Torvalds
  0 siblings, 0 replies; 25+ messages in thread
From: Linus Torvalds @ 1997-12-30 12:27 UTC (permalink / raw)
  To: Paul Koning; +Cc: kenner, gcc2, egcs

Paul Koning writes:
> 
> On the other hand, an asm that has no explicit output clearly must
> have side effects, otherwise it wouldn't have been put there.  So for
> such an asm, "volatile" is the right treatment.

I agree 100%.

Any asm without any outputs by very definition has to have some side
effect to be useful, and as such the compiler should consider it volatile
by default. That makes gcc not only conform with the documentation, it is
also the only interpretation that makes any sense at all. 

The lack of a "novolatile" keyword is a non-issue, as anybody who would
ever want to use it seems rather misguided.

		Linus



end of thread, other threads:[~1998-01-05 10:13 UTC | newest]

Thread overview: 25+ messages
1997-12-10 12:02 cannot bootstrap neither gcc-2.8.0-971206 nor egcs-971207 on sparc-sun-sunos4.1.3 Alexandre Oliva
1997-12-10 23:09 ` Manfred.Hollstein
1997-12-11  3:32   ` Paul Eggert
1997-12-11  1:51     ` Manfred.Hollstein
1997-12-22 12:06   ` Jeffrey A Law
1997-12-22 22:04     ` Alexandre Oliva
1997-12-26  8:11     ` New problems with gcc-2.8.0 based code [was: Re: cannot bootstrap neither gcc-2.8.0-971206 nor egcs-971207 on sparc-sun-sunos4.1.3 ] Manfred Hollstein
1997-12-27 12:13       ` New problems with gcc-2.8.0 based code - NOW FIXED! Manfred Hollstein
1997-12-28 23:23         ` Richard Stallman
1997-12-29  7:48           ` Jeffrey A Law
1997-12-29 11:17             ` Manfred Hollstein
1997-12-29 10:20               ` Jeffrey A Law
1997-12-30  8:57                 ` Manfred Hollstein
1997-12-30  9:47                   ` Andi Kleen
1998-01-01 10:02                     ` Manfred Hollstein
1997-12-30 11:33                   ` Philip Blundell
1997-12-30 13:17                   ` Paul Koning
1997-12-29 11:08           ` Manfred Hollstein
     [not found] <199712302041.MAA25118@atrus.synopsys.com>
1997-12-30 13:45 ` Toon Moene
     [not found] <9712301949.AA01218@vlsi1.ultra.nyu.edu>
1997-12-30 13:52 ` Paul Koning
1997-12-30 12:27   ` Linus Torvalds
     [not found] <9712302109.AA01369@vlsi1.ultra.nyu.edu>
1997-12-30 14:41 ` Paul Koning
1997-12-30 23:05   ` Richard Stallman
     [not found] <9712302324.AA01798@vlsi1.ultra.nyu.edu>
1997-12-30 16:39 ` Paul Koning
     [not found] <E0xnN2N-0005UN-00@paddington.london.uk.eu.org>
     [not found] ` <199801030231.DAA25241@mail.macqel.be>
1998-01-05 10:13   ` Paul Koning
