public inbox for gcc-help@gcc.gnu.org
* m68k: Simple loop compiles into boundless recursion with -O2
@ 2021-01-13 16:01 Fredrik Noring
  2021-01-13 16:09 ` Alexander Monakov
  2021-01-13 16:23 ` AW: " Stefan Franke
  0 siblings, 2 replies; 17+ messages in thread
From: Fredrik Noring @ 2021-01-13 16:01 UTC (permalink / raw)
  To: gcc-help

Hi,

Compiler used is GCC m68k-elf version 10.2.0. A variant of the classic
memset

	void *memset2(void *s, int c, unsigned int n)
	{
		char *b = s;
		for (unsigned int i = 0; i < n; i++)
			b[i] = c;
		return s;
	}

compiles into boundless recursion with O2 optimisation and the m68k-elf
target. This will, of course, exhaust the stack and crash badly.

The commands

	m68k-elf-gcc -O2 -march=68000 -c -o memset2.o memset2.c
	m68k-elf-objdump -d memset2.o

produce

	00000000 <memset2>:
	   0:	2f02           	movel %d2,%sp@-
	   2:	242f 0008      	movel %sp@(8),%d2
	   6:	202f 0010      	movel %sp@(16),%d0
	   a:	6718           	beqs 24 <memset2+0x24>
	   c:	2f00           	movel %d0,%sp@-
	   e:	102f 0013      	moveb %sp@(19),%d0
	  12:	4880           	extw %d0
	  14:	3040           	moveaw %d0,%a0
	  16:	2f08           	movel %a0,%sp@-
	  18:	2f02           	movel %d2,%sp@-
	  1a:	4eb9 0000 0000 	jsr 0 <memset2>   /* <<<--- recursion */
	  20:	4fef 000c      	lea %sp@(12),%sp
	  24:	2002           	movel %d2,%d0
	  26:	241f           	movel %sp@+,%d2
	  28:	4e75           	rts

O1 optimisation is more reasonable, as it instead produces

	00000000 <memset2>:
	   0:	2f02           	movel %d2,%sp@-
	   2:	202f 0008      	movel %sp@(8),%d0
	   6:	242f 000c      	movel %sp@(12),%d2
	   a:	4aaf 0010      	tstl %sp@(16)
	   e:	670e           	beqs 1e <memset2+0x1e>
	  10:	2040           	moveal %d0,%a0
	  12:	222f 0010      	movel %sp@(16),%d1
	  16:	d280           	addl %d0,%d1
	  18:	10c2           	moveb %d2,%a0@+
	  1a:	b288           	cmpl %a0,%d1
	  1c:	66fa           	bnes 18 <memset2+0x18>
	  1e:	241f           	movel %sp@+,%d2
	  20:	4e75           	rts

The machine code with O2 looks like a plain compiler bug to me.

What to do?

Fredrik

* Re: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-13 16:01 m68k: Simple loop compiles into boundless recursion with -O2 Fredrik Noring
@ 2021-01-13 16:09 ` Alexander Monakov
  2021-01-13 16:30   ` Fredrik Noring
  2021-01-13 16:23 ` AW: " Stefan Franke
  1 sibling, 1 reply; 17+ messages in thread
From: Alexander Monakov @ 2021-01-13 16:09 UTC (permalink / raw)
  To: Fredrik Noring; +Cc: gcc-help

On Wed, 13 Jan 2021, Fredrik Noring wrote:

> Hi,
> 
> Compiler used is GCC m68k-elf version 10.2.0. A variant of the classic
> memset
> 
> 	void *memset2(void *s, int c, unsigned int n)
> 	{
> 		char *b = s;
> 		for (unsigned int i = 0; i < n; i++)
> 			b[i] = c;
> 		return s;
> 	}
> 
> compiles into boundless recursion with O2 optimisation and the m68k-elf
> target. This will, of course, exhaust the stack and crash badly.
> 
> The commands
> 
> 	m68k-elf-gcc -O2 -march=68000 -c -o memset2.o memset2.c
> 	m68k-elf-objdump -d memset2.o

Please invoke objdump with -dr instead to see the relocations.

> 
> produce
> 
> 	00000000 <memset2>:
> 	   0:	2f02           	movel %d2,%sp@-
> 	   2:	242f 0008      	movel %sp@(8),%d2
> 	   6:	202f 0010      	movel %sp@(16),%d0
> 	   a:	6718           	beqs 24 <memset2+0x24>
> 	   c:	2f00           	movel %d0,%sp@-
> 	   e:	102f 0013      	moveb %sp@(19),%d0
> 	  12:	4880           	extw %d0
> 	  14:	3040           	moveaw %d0,%a0
> 	  16:	2f08           	movel %a0,%sp@-
> 	  18:	2f02           	movel %d2,%sp@-
> 	  1a:	4eb9 0000 0000 	jsr 0 <memset2>   /* <<<--- recursion */

The relocation associated with this instruction should point to memset.
Most likely the compiler is optimizing your memset2 function to call
the standard function 'memset'.

When implementing memset itself you need to pass -ffreestanding to GCC,
which will disable this optimization.
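
To make the transformation concrete, here is a rough C rendering of what
the -O2 output quoted above appears to amount to. It is a sketch read off
the disassembly, not compiler output: the name memset2_as_compiled is made
up for illustration, and the (char) cast assumes that plain char is signed,
as it is on m68k.

```c
#include <string.h>

/*
 * Hypothetical C equivalent of the -O2 machine code: the byte-store
 * loop has been replaced by a guarded call to the library memset.
 */
void *memset2_as_compiled(void *s, int c, unsigned int n)
{
	if (n != 0)			/* beqs: skip the call when n == 0 */
		memset(s, (char)c, n);	/* moveb/extw/moveaw: c narrowed to char, then widened */
	return s;			/* movel %d2,%d0 */
}
```

If the function being compiled is the project's own memset, such a call
would target the very function it sits in, which is where genuine recursion
would come from.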

Alexander

* AW: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-13 16:01 m68k: Simple loop compiles into boundless recursion with -O2 Fredrik Noring
  2021-01-13 16:09 ` Alexander Monakov
@ 2021-01-13 16:23 ` Stefan Franke
  2021-01-13 17:27   ` Fredrik Noring
  1 sibling, 1 reply; 17+ messages in thread
From: Stefan Franke @ 2021-01-13 16:23 UTC (permalink / raw)
  To: 'Fredrik Noring', gcc-help



> -----Original Message-----
> From: Gcc-help <gcc-help-bounces@gcc.gnu.org> On Behalf Of Fredrik
> Noring
> Sent: Wednesday, 13 January 2021 17:02
> To: gcc-help@gcc.gnu.org
> Subject: m68k: Simple loop compiles into boundless recursion with -O2
> 
> Hi,
> 
> Compiler used is GCC m68k-elf version 10.2.0. A variant of the classic memset
> 
> 	void *memset2(void *s, int c, unsigned int n)
> 	{
> 		char *b = s;
> 		for (unsigned int i = 0; i < n; i++)
> 			b[i] = c;
> 		return s;
> 	}
> 
> compiles into boundless recursion with O2 optimisation and the m68k-elf
> target. This will, of course, exhaust the stack and crash badly.
> 
> The commands
> 
> 	m68k-elf-gcc -O2 -march=68000 -c -o memset2.o memset2.c
> 	m68k-elf-objdump -d memset2.o
> 
> produce
> 
> 	00000000 <memset2>:
> 	   0:	2f02           	movel %d2,%sp@-
> 	   2:	242f 0008      	movel %sp@(8),%d2
> 	   6:	202f 0010      	movel %sp@(16),%d0
> 	   a:	6718           	beqs 24 <memset2+0x24>
> 	   c:	2f00           	movel %d0,%sp@-
> 	   e:	102f 0013      	moveb %sp@(19),%d0
> 	  12:	4880           	extw %d0
> 	  14:	3040           	moveaw %d0,%a0
> 	  16:	2f08           	movel %a0,%sp@-
> 	  18:	2f02           	movel %d2,%sp@-
> 	  1a:	4eb9 0000 0000 	jsr 0 <memset2>   /* <<<--- recursion */
> 	  20:	4fef 000c      	lea %sp@(12),%sp
> 	  24:	2002           	movel %d2,%d0
> 	  26:	241f           	movel %sp@+,%d2
> 	  28:	4e75           	rts
> 
> O1 optimisation is more reasonable, as it instead produces
> 
> 	00000000 <memset2>:
> 	   0:	2f02           	movel %d2,%sp@-
> 	   2:	202f 0008      	movel %sp@(8),%d0
> 	   6:	242f 000c      	movel %sp@(12),%d2
> 	   a:	4aaf 0010      	tstl %sp@(16)
> 	   e:	670e           	beqs 1e <memset2+0x1e>
> 	  10:	2040           	moveal %d0,%a0
> 	  12:	222f 0010      	movel %sp@(16),%d1
> 	  16:	d280           	addl %d0,%d1
> 	  18:	10c2           	moveb %d2,%a0@+
> 	  1a:	b288           	cmpl %a0,%d1
> 	  1c:	66fa           	bnes 18 <memset2+0x18>
> 	  1e:	241f           	movel %sp@+,%d2
> 	  20:	4e75           	rts
> 
> The machine code with O2 looks like a plain compiler bug to me.
> 
> What to do?
> 
> Fredrik

I guess that the label here

> 1a:	4eb9 0000 0000 	jsr 0 <memset2>   /* <<<--- recursion

is printed incorrectly and instead a call to memset is done.

objdump is known to use the first label it finds for an offset, and also often uses the wrong section... You should look at the assembly or run objdump with `-dr` to also print relocations.

You may also use the compiler explorer here https://franke.ms/cex/  to view the results for some different m68k gcc versions. (note that the % for registers is omitted and labels have an underscore).

Here is the asm output of gcc-10.2.0-elf:

_memset2:
        move.l d2,-(sp)
        move.l 8(sp),d2
        move.l 16(sp),d0
        jeq .L4
        move.l d0,-(sp)
        move.b 19(sp),d0
        ext.w d0
        move.w d0,a0
        move.l a0,-(sp)
        move.l d2,-(sp)
        jsr _memset
        lea (12,sp),sp
.L4:
        move.l d2,d0
        move.l (sp)+,d2
        rts


/cheers

Stefan



* Re: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-13 16:09 ` Alexander Monakov
@ 2021-01-13 16:30   ` Fredrik Noring
  2021-01-13 19:20     ` Segher Boessenkool
  2021-01-15  2:23     ` Liu Hao
  0 siblings, 2 replies; 17+ messages in thread
From: Fredrik Noring @ 2021-01-13 16:30 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-help

Many thanks, Alexander,

> Please invoke objdump with -dr instead to see the relocations.

Indeed:

  1a:	4eb9 0000 0000 	jsr 0 <memset2>
			1c: R_68K_32	memset

> The relocation associated with this instruction should point to memset.
> Most likely the compiler is optimizing your memset2 function to call
> the standard function 'memset'.
> 
> When implementing memset itself you need to pass -ffreestanding to GCC,
> which will disable this optimization.

Yes, I had -nostdlib but -ffreestanding is apparently needed as well. Thanks
again.

Fredrik

* Re: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-13 16:23 ` AW: " Stefan Franke
@ 2021-01-13 17:27   ` Fredrik Noring
  0 siblings, 0 replies; 17+ messages in thread
From: Fredrik Noring @ 2021-01-13 17:27 UTC (permalink / raw)
  To: stefan; +Cc: gcc-help

Thanks Stefan,

> I guess that the label here
> 
> > 1a:	4eb9 0000 0000 	jsr 0 <memset2>   /* <<<--- recursion
> 
> is printed incorrectly and instead a call to memset is done.

I think so too!

> objdump is known to use the first label it finds for an offset, and also
> often uses the wrong section... You should look at the assembly or run
> objdump with `-dr` to also print relocations.

Indeed.

> You may also use the compiler explorer here https://franke.ms/cex/  to
> view the results for some different m68k gcc versions. (note that the %
> for registers is omitted and labels have an underscore).

Neat, so it's printing Motorola (MRI) rather than MIT syntax too. :)

Incidentally, I recently contributed a (small) fix to m68k/Binutils, and
was wondering if anyone would be interested in improving its disassembler:

https://sourceware.org/pipermail/binutils/2021-January/114809.html

Perhaps your Amiga menu could be supplemented with Atari, eventually? ;)

A few weeks ago I began TOS/libc <https://github.com/frno7/toslibc> and
it's quite functional already. TOS is rather small, after all, compared
to the Freemint project for instance. I provisionally use the restricted
Vlink to produce the program from the final object file produced by GCC
but I plan to write my own free TOS-linker, or, if possible, extend GCC.

Fredrik

* Re: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-13 16:30   ` Fredrik Noring
@ 2021-01-13 19:20     ` Segher Boessenkool
  2021-01-13 19:53       ` Fredrik Noring
  2021-01-15  2:23     ` Liu Hao
  1 sibling, 1 reply; 17+ messages in thread
From: Segher Boessenkool @ 2021-01-13 19:20 UTC (permalink / raw)
  To: Fredrik Noring; +Cc: Alexander Monakov, gcc-help

Hi!

On Wed, Jan 13, 2021 at 05:30:35PM +0100, Fredrik Noring wrote:
> Many thanks, Alexander,
> 
> > Please invoke objdump with -dr instead to see the relocations.
> 
> Indeed:
> 
>   1a:	4eb9 0000 0000 	jsr 0 <memset2>
> 			1c: R_68K_32	memset

Yup, always always always use -dr :-)

> > The relocation associated with this instruction should point to memset.
> > Most likely the compiler is optimizing your memset2 function to call
> > the standard function 'memset'.
> > 
> > When implementing memset itself you need to pass -ffreestanding to GCC,
> > which will disable this optimization.
> 
> Yes, I had -nostdlib but -ffreestanding is apparently needed as well. Thanks
> again.

You probably want -ffreestanding anyway for your situation: -nostdlib
does not mean "there is no standard library", it just means "do not link
with it".
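
A small sketch of how code can tell the two modes apart: the C standard
defines __STDC_HOSTED__ as 1 for a hosted implementation and 0 for a
freestanding one, and GCC sets it to 0 under -ffreestanding. The function
name below is only illustrative.

```c
/*
 * Returns 1 when compiled as a hosted program (the default) and 0 when
 * compiled with -ffreestanding. -nostdlib alone does not change this
 * value, because it only affects linking.
 */
int compiled_hosted(void)
{
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 0
	return 0;	/* freestanding: library names are not treated as built-ins */
#else
	return 1;	/* hosted: the compiler may assume a standard library exists */
#endif
}
```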

For the archives, no one has said it yet: to just disable the
optimisation transforming the loop into a memset, you can use
-fno-tree-loop-distribute-patterns .


Segher

* Re: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-13 19:20     ` Segher Boessenkool
@ 2021-01-13 19:53       ` Fredrik Noring
  2021-01-13 21:46         ` Segher Boessenkool
  0 siblings, 1 reply; 17+ messages in thread
From: Fredrik Noring @ 2021-01-13 19:53 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Alexander Monakov, gcc-help

Hi Segher,

> Yup, always always always use -dr :-)

Noted!

> You probably want -ffreestanding anyway for your situation: -nostdlib
> does not mean "there is no standard library", it just means "do not link
> with it".

Ah, yes.

> For the archives, no one has said it yet: to just disable the
> optimisation transforming the loop into a memset, you can use
> -fno-tree-loop-distribute-patterns .

It's a new 32-bit libc (for Atari TOS), and maximum GCC optimisations are
welcome, although I haven't yet figured out exactly where the boundary is
between the GCC built-ins and this libc implementation. There is obviously
some overlap between the two, and I wouldn't want the libc to stand in the
way of GCC's optimisations.

Other than that things work quite well so far, compiling programs, etc.

It was surprisingly simple and straightforward, but then of course this
vintage 1985 operating system isn't complicated either. :)

[ The name TOS/libc is provisional, it will be renamed to Fuji/libc shortly. ]

Fredrik

* Re: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-13 19:53       ` Fredrik Noring
@ 2021-01-13 21:46         ` Segher Boessenkool
  2021-01-13 21:54           ` AW: " Stefan Franke
  2021-01-14 14:54           ` Fredrik Noring
  0 siblings, 2 replies; 17+ messages in thread
From: Segher Boessenkool @ 2021-01-13 21:46 UTC (permalink / raw)
  To: Fredrik Noring; +Cc: Alexander Monakov, gcc-help

On Wed, Jan 13, 2021 at 08:53:13PM +0100, Fredrik Noring wrote:
> > For the archives, no one has said it yet: to just disable the
> > optimisation transforming the loop into a memset, you can use
> > -fno-tree-loop-distribute-patterns .
> 
> It's a new 32-bit libc (for Atari TOS), and maximum GCC optimisations are
> welcome, although I haven't yet figured out exactly where the boundary is
> between the GCC built-ins and this libc implementation. There is obviously
> some overlap between the two, and I wouldn't want the libc to stand in the
> way of GCC's optimisations.

You would use this flag only for compiling the memset (etc.) routines
themselves.  Maybe using a pragma or similar.
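
One possible shape for the pragma route, as a sketch: GCC provides
#pragma GCC optimize together with push_options and pop_options to scope
options to part of a file. The pragma behaves as if each following function
carried the optimize attribute, so any caveats about that attribute apply
here as well. The option string is assumed to be accepted in this form.

```c
#pragma GCC push_options
#pragma GCC optimize ("no-tree-loop-distribute-patterns")

/* Only functions between push_options and pop_options are affected. */
void *memset2(void *s, int c, unsigned int n)
{
	char *b = s;
	for (unsigned int i = 0; i < n; i++)
		b[i] = c;
	return s;
}

#pragma GCC pop_options
```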


Segher

* AW: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-13 21:46         ` Segher Boessenkool
@ 2021-01-13 21:54           ` Stefan Franke
  2021-01-14 14:54           ` Fredrik Noring
  1 sibling, 0 replies; 17+ messages in thread
From: Stefan Franke @ 2021-01-13 21:54 UTC (permalink / raw)
  To: gcc-help



> -----Original Message-----
> From: Gcc-help <gcc-help-bounces@gcc.gnu.org> On Behalf Of Segher
> Boessenkool
> Sent: Wednesday, 13 January 2021 22:46
> To: Fredrik Noring <noring@nocrew.org>
> Cc: gcc-help@gcc.gnu.org; Alexander Monakov <amonakov@ispras.ru>
> Subject: Re: m68k: Simple loop compiles into boundless recursion with -O2
> 
> On Wed, Jan 13, 2021 at 08:53:13PM +0100, Fredrik Noring wrote:
> > > For the archives, no one has said it yet: to just disable the
> > > optimisation transforming the loop into a memset, you can use
> > > -fno-tree-loop-distribute-patterns .
> >
> > It's a new 32-bit libc (for Atari TOS), and maximum GCC optimisations
> > are welcome, although I haven't yet figured out exactly where the
> > boundary is between the GCC built-ins and this libc implementation.
> > There is obviously some overlap between the two, and I wouldn't want
> > the libc to stand in the way of GCC's optimisations.
> 
> You would use this flag only for compiling the memset (etc.) routines
> themselves.  Maybe using a pragma or similar.
> 
> 
> Segher

e.g. this one:

__attribute__((optimize("no-tree-loop-distribute-patterns"))) 
void *memset2(void *s, int c, unsigned int n)
{
...



Stefan


* Re: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-13 21:46         ` Segher Boessenkool
  2021-01-13 21:54           ` AW: " Stefan Franke
@ 2021-01-14 14:54           ` Fredrik Noring
  2021-01-14 15:05             ` Alexander Monakov
  1 sibling, 1 reply; 17+ messages in thread
From: Fredrik Noring @ 2021-01-14 14:54 UTC (permalink / raw)
  To: Segher Boessenkool, Stefan Franke; +Cc: Alexander Monakov, gcc-help

[ Stefan Franke, there are SMTP issues, see below. ]

Segher Boessenkool wrote:
> You would use this flag only for compiling the memset (etc.) routines
> themselves.  Maybe using a pragma or similar.

This was helpful. I've now replaced -ffreestanding with this flag. With GNU
Make, one can easily amend the general compilation recipe with extra CFLAGS
for specific files, like so:

memcpy.o memset.o: CFLAGS += -fno-tree-loop-distribute-patterns

$(OBJS): %.o : %.c
	$(CC) $(CFLAGS) -c -o $@ $<

Stefan Franke wrote:
> e.g. this one:
> 
> __attribute__((optimize("no-tree-loop-distribute-patterns"))) 
> void *memset2(void *s, int c, unsigned int n)
> {
> ...

This works, too, and I prefer it because the __attribute__ is attached
to the code itself.

Stefan, I received a bounce for my reply to you yesterday:

<stefan@franke.ms>: host serveronline.org[78.46.86.77] said: 451-sending server
    is not yet permitted to send mail for senders domain 451 server:
    ste-pvt-msa1.bahnhof.se[213.80.101.70=ste-pvt-msa1.bahnhof.se] tried to
    send for domain: nocrew.org (in reply to RCPT TO command)

The GNU help mailing list accepted it, though:

https://gcc.gnu.org/pipermail/gcc-help/2021-January/139797.html

Similarly, your reply

https://gcc.gnu.org/pipermail/gcc-help/2021-January/139801.html

wasn't delivered to me.

Fredrik

* Re: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-14 14:54           ` Fredrik Noring
@ 2021-01-14 15:05             ` Alexander Monakov
  2021-01-14 15:09               ` Alexander Monakov
  0 siblings, 1 reply; 17+ messages in thread
From: Alexander Monakov @ 2021-01-14 15:05 UTC (permalink / raw)
  To: Fredrik Noring; +Cc: Segher Boessenkool, Stefan Franke, gcc-help

On Thu, 14 Jan 2021, Fredrik Noring wrote:

> > __attribute__((optimize("no-tree-loop-distribute-patterns"))) 
> > void *memset2(void *s, int c, unsigned int n)
> > {
> > ...
> 
> This works, too, and I prefer it because the __attribute__ is attached
> to the code itself.

Just keep in mind that this attribute is currently not intended for use
apart from debugging/testing:

https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-optimize-function-attribute

  The optimize attribute should be used for debugging purposes only. It is not
  suitable in production code.

Alexander

* Re: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-14 15:05             ` Alexander Monakov
@ 2021-01-14 15:09               ` Alexander Monakov
  2021-01-14 15:15                 ` AW: " Stefan Franke
  0 siblings, 1 reply; 17+ messages in thread
From: Alexander Monakov @ 2021-01-14 15:09 UTC (permalink / raw)
  To: Fredrik Noring; +Cc: gcc-help, Stefan Franke, Segher Boessenkool

On Thu, 14 Jan 2021, Alexander Monakov via Gcc-help wrote:

> On Thu, 14 Jan 2021, Fredrik Noring wrote:
> 
> > > __attribute__((optimize("no-tree-loop-distribute-patterns"))) 
> > > void *memset2(void *s, int c, unsigned int n)
> > > {
> > > ...
> > 
> > This works, too, and I prefer it because the __attribute__ is attached
> > to the code itself.
> 
> Just keep in mind that this attribute is currently not intended for use
> apart from debugging/testing:
> 
> https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-optimize-function-attribute
> 
>   The optimize attribute should be used for debugging purposes only. It is not
>   suitable in production code.

(part of the reason for that being that the interaction of the attribute with
options passed on the command line is poorly defined: currently adding the
attribute results in other -O and -f options dropped for the function, IIRC;
this should have gone out with the previous email, sorry about sending that
prematurely)

> Alexander
> 

* AW: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-14 15:09               ` Alexander Monakov
@ 2021-01-14 15:15                 ` Stefan Franke
  2021-01-14 15:33                   ` Alexander Monakov
  0 siblings, 1 reply; 17+ messages in thread
From: Stefan Franke @ 2021-01-14 15:15 UTC (permalink / raw)
  To: gcc-help

> -----Original Message-----
> From: Gcc-help <gcc-help-bounces@gcc.gnu.org> On Behalf Of Alexander
> Monakov via Gcc-help
> Sent: Thursday, 14 January 2021 16:10
> To: Fredrik Noring <noring@nocrew.org>
> Cc: gcc-help@gcc.gnu.org; Segher Boessenkool
> <segher@kernel.crashing.org>; Stefan Franke <s.franke@bebbosoft.de>
> Subject: Re: m68k: Simple loop compiles into boundless recursion with -O2
> 
> On Thu, 14 Jan 2021, Alexander Monakov via Gcc-help wrote:
> 
> > On Thu, 14 Jan 2021, Fredrik Noring wrote:
> >
> > > > __attribute__((optimize("no-tree-loop-distribute-patterns")))
> > > > void *memset2(void *s, int c, unsigned int n) { ...
> > >
> > > This works, too, and I prefer it because the __attribute__ is
> > > attached to the code itself.
> >
> > Just keep in mind that this attribute is currently not intended for
> > use apart from debugging/testing:
> >
> > https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-optimize-function-attribute
> >
> >   The optimize attribute should be used for debugging purposes only. It is not
> >   suitable in production code.
> 
> (part of the reason for that being that the interaction of the attribute with
> options passed on the command line is poorly defined: currently adding the
> attribute results in other -O and -f options dropped for the function, IIRC;
> this should have gone out with the previous email, sorry about sending that
> prematurely)

Sorry, but I can't reproduce that 'dropping' of other options. I played
around here: https://franke.ms/cex/z/EnK1GY
Could you please provide a reproducible example?

Stefan


* Re: AW: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-14 15:15                 ` AW: " Stefan Franke
@ 2021-01-14 15:33                   ` Alexander Monakov
  2021-01-14 15:56                     ` AW: " Stefan Franke
  0 siblings, 1 reply; 17+ messages in thread
From: Alexander Monakov @ 2021-01-14 15:33 UTC (permalink / raw)
  To: stefan; +Cc: gcc-help

On Thu, 14 Jan 2021, Stefan Franke wrote:

> Sorry, but I can't reproduce that 'dropping' of other options. I played
> around here: https://franke.ms/cex/z/EnK1GY
> Could you please provide a reproducible example?

It is easily reproducible on x86: https://godbolt.org/z/Mx1KMe

Kernel people hit this issue recently when attempting to use the attribute:
https://lore.kernel.org/lkml/alpine.LSU.2.21.2004151445520.11688@wotan.suse.de/

Alexander

* AW: AW: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-14 15:33                   ` Alexander Monakov
@ 2021-01-14 15:56                     ` Stefan Franke
  0 siblings, 0 replies; 17+ messages in thread
From: Stefan Franke @ 2021-01-14 15:56 UTC (permalink / raw)
  To: gcc-help



> -----Original Message-----
> From: Alexander Monakov <amonakov@ispras.ru>
> Sent: Thursday, 14 January 2021 16:33
> To: stefan@franke.ms
> Cc: gcc-help@gcc.gnu.org
> Subject: Re: AW: m68k: Simple loop compiles into boundless recursion with -O2
> 
> On Thu, 14 Jan 2021, Stefan Franke wrote:
> 
> > Sorry, but I can't reproduce that 'dropping' of other options. I
> > played around here: https://franke.ms/cex/z/EnK1GY Could you please
> > provide a reproducible example?
> 
> It is easily reproducible on x86: https://godbolt.org/z/Mx1KMe
> 
> Kernel people hit this issue recently when attempting to use the attribute:
> https://lore.kernel.org/lkml/alpine.LSU.2.21.2004151445520.11688@wotan.suse.de/
> 
> Alexander

Changing -Os to -O2 or another -O* option still works...
... but: it kills the -fno-omit-frame-pointer

So you need to be careful when using option attributes, as with every
tweak.
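
For anyone wanting to check this on their own toolchain, a minimal test
file might look like the following. The function names are invented for
illustration. Compiling it with something like -O2 -fno-omit-frame-pointer -S
and comparing the two prologues should show whether the attributed function
still sets up a frame pointer; the outcome may differ between GCC versions
and targets.

```c
/* Same loop twice; only the first function carries the attribute. */
__attribute__((optimize("no-tree-loop-distribute-patterns")))
void *fill_attr(void *s, int c, unsigned int n)
{
	char *b = s;
	for (unsigned int i = 0; i < n; i++)
		b[i] = c;
	return s;
}

/* Control: compiled with the command-line options only. */
void *fill_plain(void *s, int c, unsigned int n)
{
	char *b = s;
	for (unsigned int i = 0; i < n; i++)
		b[i] = c;
	return s;
}
```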

Stefan


* Re: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-13 16:30   ` Fredrik Noring
  2021-01-13 19:20     ` Segher Boessenkool
@ 2021-01-15  2:23     ` Liu Hao
  2021-01-15  6:23       ` Fredrik Noring
  1 sibling, 1 reply; 17+ messages in thread
From: Liu Hao @ 2021-01-15  2:23 UTC (permalink / raw)
  To: Fredrik Noring, Alexander Monakov; +Cc: gcc-help


On 2021/1/14 at 12:30 AM, Fredrik Noring wrote:
> Many thanks, Alexander,
> 
>> Please invoke objdump with -dr instead to see the relocations.
> 
> Indeed:
> 
>   1a:	4eb9 0000 0000 	jsr 0 <memset2>
> 			1c: R_68K_32	memset
> 
>> The relocation associated with this instruction should point to memset.
>> Most likely the compiler is optimizing your memset2 function to call
>> the standard function 'memset'.
>>
>> When implementing memset itself you need to pass -ffreestanding to GCC,
>> which will disable this optimization.
> 

I used to run into the same issue around CRT code on x86. Use of `-ffreestanding` disables a number
of optimizations; for example, the compiler cannot optimize

    int data[4];
    memset(&data, 0, sizeof(data));

to a series of store operations, but leaves it as a function call, which is rather overkill.
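
One possible way around this, offered as a sketch rather than a
recommendation: GCC keeps the __builtin_-prefixed forms available even when
-ffreestanding (which implies -fno-builtin) is in effect, so a small
fixed-size clear can be written against __builtin_memset. Whether it is
expanded into stores is still up to the compiler, which may fall back to
calling memset. The helper name is hypothetical.

```c
/*
 * Clears a 4-element int array. The size is a compile-time constant,
 * which gives the compiler the chance to expand the builtin inline.
 */
void clear4(int *data)
{
	__builtin_memset(data, 0, 4 * sizeof(*data));
}
```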


The issue in the original post can be resolved by writing through a pointer to `volatile char` like
this:

    void *memset2(void *s, int c, unsigned int n)
    {
        volatile char *b = s;
        for (unsigned int i = 0; i < n; i++)
            b[i] = c;
        return s;
    }




-- 
Best regards,
LH_Mouse


* Re: m68k: Simple loop compiles into boundless recursion with -O2
  2021-01-15  2:23     ` Liu Hao
@ 2021-01-15  6:23       ` Fredrik Noring
  0 siblings, 0 replies; 17+ messages in thread
From: Fredrik Noring @ 2021-01-15  6:23 UTC (permalink / raw)
  To: Liu Hao; +Cc: Alexander Monakov, gcc-help

Hi Hao,

> I used to run into the same issue around CRT code on x86. Use of
> `-ffreestanding` disables a number of optimizations, for example, the
> compiler cannot optimize
> 
>     int data[4];
>     memset(&data, 0, sizeof(data));
> 
> to a series of store operations, but leaves it as a function call, which
> is rather overkill.

With an array initialiser GCC (having -ffreestanding) will optimise

	void g(int *data);
	void f()
	{
		int data[N] = { };
		g(data);
	}

to stores, for N < 15, as observed at <http://franke.ms/cex/z/WEse7e>.
Structures have the nice property in GCC that one can do assignments like

	struct s { int data[N]; } d;
	void f()
	{
		d = (struct s) { };
	}

and once again GCC will emit stores for N < 15 <http://franke.ms/cex/z/KM4ced>.

> The issue in the original post can be resolved by writing through a
> pointer to `volatile char` like this:
> 
>     void *memset2(void *s, int c, unsigned int n)
>     {
>         volatile char *b = s;
>         for (unsigned int i = 0; i < n; i++)
>             b[i] = c;
>         return s;
>     }

The GCC optimiser will also back off when given a small amount of manual loop unrolling,

	void *memset2(void *s, int c, unsigned int n)
	{
		unsigned int i = 0;
		char *b = s;
	
		while (i + 1 < n) {
			b[i++] = c;
			b[i++] = c;
		}
	
		if (i < n)
			b[i] = c;
	
		return s;
	}

and I suppose I will implement something like this in assembly, eventually,
for speed reasons. Storing 32-bit longs rather than bytes is faster still,
but one complication is that the 68000 has address alignment restrictions
for 16-bit and 32-bit load/stores. And then the 68000 has the MOVEM.L
instruction, if one wants to max out on larger sizes. :)
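
As a rough C sketch of the 32-bit idea (before any hand-written assembly),
the following aligns the pointer with byte stores, fills the bulk with
aligned 32-bit stores, and finishes the tail bytewise. Assumptions: the
name memset_words and the word32 typedef are invented here, may_alias is a
GCC extension used to sidestep strict-aliasing trouble, and 4-byte alignment
is stricter than the 68000 needs, since it only faults on odd addresses for
word and long accesses. GCC may still recognise these loops as memset, so
the flags discussed earlier in the thread remain relevant.

```c
#include <stdint.h>

/* 32-bit store type that is allowed to alias any object (GCC extension). */
typedef uint32_t __attribute__((may_alias)) word32;

void *memset_words(void *s, int c, unsigned int n)
{
	unsigned char *b = s;
	/* Replicate the fill byte into all four bytes of a 32-bit word. */
	uint32_t fill = (uint32_t)(unsigned char)c * UINT32_C(0x01010101);

	/* Byte stores until the pointer is 4-byte aligned, or n runs out. */
	while (n > 0 && ((uintptr_t)b & 3) != 0) {
		*b++ = (unsigned char)c;
		n--;
	}

	/* Aligned 32-bit stores for the bulk. */
	for (; n >= 4; n -= 4, b += 4)
		*(word32 *)b = fill;

	/* Remaining 0 to 3 tail bytes. */
	while (n > 0) {
		*b++ = (unsigned char)c;
		n--;
	}

	return s;
}
```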

Fredrik

Thread overview: 17+ messages
2021-01-13 16:01 m68k: Simple loop compiles into boundless recursion with -O2 Fredrik Noring
2021-01-13 16:09 ` Alexander Monakov
2021-01-13 16:30   ` Fredrik Noring
2021-01-13 19:20     ` Segher Boessenkool
2021-01-13 19:53       ` Fredrik Noring
2021-01-13 21:46         ` Segher Boessenkool
2021-01-13 21:54           ` AW: " Stefan Franke
2021-01-14 14:54           ` Fredrik Noring
2021-01-14 15:05             ` Alexander Monakov
2021-01-14 15:09               ` Alexander Monakov
2021-01-14 15:15                 ` AW: " Stefan Franke
2021-01-14 15:33                   ` Alexander Monakov
2021-01-14 15:56                     ` AW: " Stefan Franke
2021-01-15  2:23     ` Liu Hao
2021-01-15  6:23       ` Fredrik Noring
2021-01-13 16:23 ` AW: " Stefan Franke
2021-01-13 17:27   ` Fredrik Noring