public inbox for gcc-help@gcc.gnu.org
* Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
@ 2019-10-16 13:19 Josef Wolf
  2019-10-16 13:30 ` Matthias Pfaller
  2019-10-16 18:18 ` Martin Sebor
  0 siblings, 2 replies; 26+ messages in thread
From: Josef Wolf @ 2019-10-16 13:19 UTC (permalink / raw)
  To: gcc-help

Hello all,

I experience target crashing when cross compiling for ARM with
-ftree-loop-distribute-patterns, which is enabled by the -O3 flag.

The crash happens in the startup code, before main() is called. This startup
code looks like this:

 extern unsigned long _sidata; /* Set by the linker */
 extern unsigned long _sdata;  /* Set by the linker */
 extern unsigned long _sbss; /* Set by the linker */
 extern unsigned long _ebss;  /* Set by the linker */
 
  void Reet_Handler (void)
  {
    unsigned long *src = &_sidata
    unsigned long *src = &_sdata
  
    /* Copy data segment into RAM */
    if (src != dst) {
      while (dst < &_edata)
        *(dst++) = *(src++);
    }
  
    /* Zero BSS segment */
    dst = &_sbss;
    while (dst < &_ebss)
      *(dst++) = 0;
  
    main();
  }


With -ftree-loop-distribute-patterns those two loops are replaced by calls to
memcpy() and memset().
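
As a rough illustration of that transformation (a sketch only, not the
compiler's actual output), the BSS loop and the library call that replaces
it do the same work. The names zero_loop and zero_call are made up for this
example, and both operate on an ordinary buffer rather than on the linker
symbols:

```c
#include <stddef.h>
#include <string.h>

/* Word-by-word loop, shaped like the one in the startup code. */
static void zero_loop(unsigned long *dst, unsigned long *end)
{
    while (dst < end)
        *(dst++) = 0;
}

/* Approximately what loop distribution turns the loop into:
   one memset() call covering the same byte range. */
static void zero_call(unsigned long *dst, unsigned long *end)
{
    if (dst < end)
        memset(dst, 0, (size_t)(end - dst) * sizeof *dst);
}
```

Both functions leave the buffer in the same state; the second one depends
on a working memset() being linked in.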

The memcpy function finishes just fine. But the memset function doesn't seem
to finish.  It looks like this:

  void memset (void *s, int c, size_t n)
  {
    int i;
    for (i=0; i<n; i++)
      ((char *)s)[i] = c;
  }

Any ideas why this function is crashing? I can't see anything suspicious here.

Thanks

-- 
Josef Wolf
jw@raven.inka.de


* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-16 13:19 Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns Josef Wolf
@ 2019-10-16 13:30 ` Matthias Pfaller
  2019-10-17  8:10   ` Josef Wolf
  2019-10-16 18:18 ` Martin Sebor
  1 sibling, 1 reply; 26+ messages in thread
From: Matthias Pfaller @ 2019-10-16 13:30 UTC (permalink / raw)
  To: gcc-help


On 10/16/19 3:17 PM, Josef Wolf wrote:
> Hello all,
> 
> I experience target crashing when cross compiling for ARM with
> -ftree-loop-distribute-patterns, which is enabled by the -O3 flag.
> 
> The crash happens in the startup code, before main() is called. This startup
> code looks like this:
> 
>  extern unsigned long _sidata; /* Set by the linker */
>  extern unsigned long _sdata;  /* Set by the linker */
>  extern unsigned long _sbss; /* Set by the linker */
>  extern unsigned long _ebss;  /* Set by the linker */
>  
>   void Reet_Handler (void)
>   {
>     unsigned long *src = &_sidata
>     unsigned long *src = &_sdata
>   
>     /* Copy data segment into RAM */
>     if (src != dst) {
>       while (dst < &_edata)
>         *(dst++) = *(src++);
>     }
>   
>     /* Zero BSS segment */
>     dst = &_sbss;
>     while (dst < &_ebss)
>       *(dst++) = 0;
>   
>     main();
>   }
> 
> 
> With -ftree-loop-distribute-patterns those two loops are replaced by calls to
> memcpy() and memset().
> 
> The memcpy function finishes just fine. But the memset function doesn't seem
> to finish.  It looks like this:
> 
>   void memset (void *s, int c, size_t n)
>   {
>     int i;
>     for (i=0; i<n; i++)
>       ((char *)s)[i] = c;
>   }
> 
> Any ideas why this function is crashing? I can't see anything suspicious here.
> 
> Thanks

Is your stack part of the BSS? In that case the call to memset will
cause a crash.

Matthias
-- 
Matthias Pfaller                          Software Entwicklung
marco Systemanalyse und Entwicklung GmbH  Tel   +49 8131 5161 41
Hans-Böckler-Str. 2, D 85221 Dachau       Fax   +49 8131 5161 66
http://www.marco.de/                      Email leo@marco.de
Geschäftsführer Martin Reuter             HRB 171775 Amtsgericht München




* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-16 13:19 Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns Josef Wolf
  2019-10-16 13:30 ` Matthias Pfaller
@ 2019-10-16 18:18 ` Martin Sebor
  2019-10-17 11:40   ` Josef Wolf
  2019-10-18 13:10   ` Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns Josef Wolf
  1 sibling, 2 replies; 26+ messages in thread
From: Martin Sebor @ 2019-10-16 18:18 UTC (permalink / raw)
  To: Josef Wolf, gcc-help

On 10/16/19 7:17 AM, Josef Wolf wrote:
> Hello all,
> 
> I experience target crashing when cross compiling for ARM with
> -ftree-loop-distribute-patterns, which is enabled by the -O3 flag.
> 
> The crash happens in the startup code, before main() is called. This startup
> code looks like this:
> 
>   extern unsigned long _sidata; /* Set by the linker */
>   extern unsigned long _sdata;  /* Set by the linker */
>   extern unsigned long _sbss; /* Set by the linker */
>   extern unsigned long _ebss;  /* Set by the linker */
>   
>    void Reet_Handler (void)
>    {
>      unsigned long *src = &_sidata
>      unsigned long *src = &_sdata
>    
>      /* Copy data segment into RAM */
>      if (src != dst) {
>        while (dst < &_edata)
>          *(dst++) = *(src++);
>      }
>    
>      /* Zero BSS segment */
>      dst = &_sbss;
>      while (dst < &_ebss)
>        *(dst++) = 0;
>    
>      main();
>    }
> 
> 
> With -ftree-loop-distribute-patterns those two loops are replaced by calls to
> memcpy() and memset().
> 
> The memcpy function finishes just fine. But the memset function doesn't seem
> to finish.  It looks like this:
> 
>    void memset (void *s, int c, size_t n)
>    {
>      int i;
>      for (i=0; i<n; i++)
>        ((char *)s)[i] = c;
>    }

This is probably not the cause of the crash but it's worth keeping
in mind.  The standard memset function returns void* and (unless
disabled) recent versions of GCC will issue a warning:

conflicting types for built-in function 'memset'; expected 'void *(void *, int,  unsigned int)' [-Wbuiltin-declaration-mismatch]

GCC expects a conforming memset and memcpy implementation even in
freestanding/embedded environments so defining these functions in
a different way could cause trouble.

> 
> Any ideas why this function is crashing? I can't see anything suspicious here.

I doubt it's the cause of the crash either but only addresses of
bytes of the same object can be used in relational expressions
(i.e., the two less-than controlling expressions).  Using addresses
of unrelated objects is undefined.  Concerns about invalidating
code like the above prevent compilers from implementing useful
optimizations.

GCC doesn't issue a warning for this bug yet but it might in
the future.   To avoid the undefined behavior and future
warnings, convert the addresses to uintptr_t first and compare
those instead.
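
A minimal sketch of that suggestion, assuming the bounds are passed in as
pointers (zero_words is a hypothetical helper name, not something from the
startup code in this thread). Only the loop condition changes: the
comparison is done on integers instead of on pointers.

```c
#include <stdint.h>

/* Zero the words in [dst, end); the bounds are compared as integers
   (uintptr_t), so no relational operator is applied to pointers. */
static void zero_words(unsigned long *dst, unsigned long *end)
{
    while ((uintptr_t)dst < (uintptr_t)end)
        *(dst++) = 0;
}
```

In startup code the call would presumably look like
zero_words(&_sbss, &_ebss).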

Martin


* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-16 13:30 ` Matthias Pfaller
@ 2019-10-17  8:10   ` Josef Wolf
  0 siblings, 0 replies; 26+ messages in thread
From: Josef Wolf @ 2019-10-17  8:10 UTC (permalink / raw)
  To: gcc-help

Thanks for your help, Matthias!

On Wed, Oct 16, 2019 at 03:30:42PM +0200, Matthias Pfaller wrote:
>
> Is your stack part of the BSS? In that case the call to memset will
> cause a crash.

This would be an explanation. But the stack doesn't seem to be part of BSS:

_sbss      = 0x20003758 ;; Start of bss
_ebss      = 0x20012b48 ;; End of BSS
_susrstack = 0x20012b48 ;; Start of user stack (not used in this application)
_eusrstack = 0x20013b48 ;; End of user stack (not used in this application)
SP         = 0x20017d20 ;; Value of stack pointer when memset() is entered
_estack    = 0x20018000 ;; Initial stack pointer

So I don't see anything that might be wrong here.
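
The check above can be written out as a small sketch. The helper name
addr_in_range is made up for this example; the addresses used with it are
the ones from the table:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr lies in the half-open range [start, end). */
static bool addr_in_range(uint32_t addr, uint32_t start, uint32_t end)
{
    return addr >= start && addr < end;
}
```

With SP = 0x20017d20, _sbss = 0x20003758 and _ebss = 0x20012b48, the
function reports that the stack pointer is outside BSS.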

-- 
Josef Wolf
jw@raven.inka.de


* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-16 18:18 ` Martin Sebor
@ 2019-10-17 11:40   ` Josef Wolf
  2019-10-17 12:37     ` Matthias Pfaller
  2019-10-18  9:10     ` Propagating addresses from linker to the runtime (was: Re: Crash when cross compiling for ARM with GCC-8-2-0 and) -ftree-loop-distribute-patterns Josef Wolf
  2019-10-18 13:10   ` Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns Josef Wolf
  1 sibling, 2 replies; 26+ messages in thread
From: Josef Wolf @ 2019-10-17 11:40 UTC (permalink / raw)
  To: gcc-help

Thanks for your help, Martin!

On Wed, Oct 16, 2019 at 12:18:15PM -0600, Martin Sebor wrote:
> On 10/16/19 7:17 AM, Josef Wolf wrote:
> >Hello all,
> >
> >I experience target crashing when cross compiling for ARM with
> >-ftree-loop-distribute-patterns, which is enabled by the -O3 flag.
> >
> >The crash happens in the startup code, before main() is called. This startup
> >code looks like this:
> >
> >  extern unsigned long _sidata; /* Set by the linker */
> >  extern unsigned long _sdata;  /* Set by the linker */
> >  extern unsigned long _sbss; /* Set by the linker */
> >  extern unsigned long _ebss;  /* Set by the linker */
> >   void Reet_Handler (void)
> >   {
> >     unsigned long *src = &_sidata
> >     unsigned long *src = &_sdata
> >     /* Copy data segment into RAM */
> >     if (src != dst) {
> >       while (dst < &_edata)
> >         *(dst++) = *(src++);
> >     }
> >     /* Zero BSS segment */
> >     dst = &_sbss;
> >     while (dst < &_ebss)
> >       *(dst++) = 0;
> >     main();
> >   }
> >
> >
> >With -ftree-loop-distribute-patterns those two loops are replaced by calls to
> >memcpy() and memset().
> >
> >The memcpy function finishes just fine. But the memset function doesn't seem
> >to finish.  It looks like this:
> >
> >   void memset (void *s, int c, size_t n)
> >   {
> >     int i;
> >     for (i=0; i<n; i++)
> >       ((char *)s)[i] = c;
> >   }
> 
> This is probably not the cause of the crash but it's worth keeping
> in mind.  The standard memset function returns void* and (unless
> disabled) recent versions of GCC will issue a warning:

Ooops!

This was actually my fault. Since the computer doesn't have network, I
had typed the code by hand into the mail.

Although I double-checked multiple times, I managed to introduce several
typos :-///

Sorry for the confusion!


The code of Reset_Handler() and memset() actually looks like this:

    void Reset_Handler (void)
    {
      unsigned long *src = &_sidata;
      unsigned long *dst = &_sdata;

      /* Copy data segment into RAM */
      if (src != dst) {
        while (dst < &_edata)
          *(dst++) = *(src++);
      }

      /* Zero BSS segment */
      dst = &_sbss;
      while (dst < &_ebss)
        *(dst++) = 0;
 
      main();
    }

    void *memset (void *s, int c, size_t n)
    {
         int i;
         for (i=0; i<n; i++)
/* B */   ((char *)s)[i] = c;
   
/* B */ return s;
    }

> >Any ideas why this function is crashing? I can't see anything suspicious here.
> 
> I doubt it's the cause of the crash either but only addresses of
> bytes of the same object can be used in relational expressions
> (i.e., the two less-than controlling expressions).

Hmm, you are talking about the two loops in Reset_Handler(), right?

> Using addresses of unrelated objects is undefined.

Hmmm... I am not an expert on this topic. But I tend to think the BSS segment
is an object, which in turn is an array of uint8_t and/or uint32_t.
Taking the address one past the last element of an array for comparison is
a perfectly valid operation, AFAIK.

So what would be the proper way to communicate the dimensions of the BSS
segment from the linker to the runtime of the compiled program?

The memset() function is called with the right parameters. And it seems to
work when I single-step it on instruction level (that is, "stepi" command in
gdb). But it crashes if I set breakpoints at the two instructions marked above
and use the "cont" command or the "step" command in gdb.

> Concerns about invalidating
> code like the above prevent compilers from implementing useful
> optimizations.

Hmmm...

> GCC doesn't issue a warning for this bug yet but it might in
> the future.   To avoid the undefined behavior and future
> warnings, convert the addresses to uintptr_t first and compare
> those instead.

Changing the code to use uintptr_t did not change anything. Still crashing.

BTW: The official documentation of gnu-ld contains an example with almost
     the same code, but using char* instead:
      https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_chapter/ld_toc.html#TOC21
     Maybe this needs an update?

-- 
Josef Wolf
jw@raven.inka.de


* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-17 11:40   ` Josef Wolf
@ 2019-10-17 12:37     ` Matthias Pfaller
  2019-10-17 14:10       ` Josef Wolf
  2019-10-18  9:10     ` Propagating addresses from linker to the runtime (was: Re: Crash when cross compiling for ARM with GCC-8-2-0 and) -ftree-loop-distribute-patterns Josef Wolf
  1 sibling, 1 reply; 26+ messages in thread
From: Matthias Pfaller @ 2019-10-17 12:37 UTC (permalink / raw)
  To: gcc-help


On 10/17/19 1:31 PM, Josef Wolf wrote:
> Thanks for your help, Martin!
> 
> On Wed, Oct 16, 2019 at 12:18:15PM -0600, Martin Sebor wrote:
>> On 10/16/19 7:17 AM, Josef Wolf wrote:
>>> Hello all,
>>>
>>> I experience target crashing when cross compiling for ARM with
>>> -ftree-loop-distribute-patterns, which is enabled by the -O3 flag.
>>>
>>> The crash happens in the startup code, before main() is called. This startup
>>> code looks like this:
>>>
>>>  extern unsigned long _sidata; /* Set by the linker */
>>>  extern unsigned long _sdata;  /* Set by the linker */
>>>  extern unsigned long _sbss; /* Set by the linker */
>>>  extern unsigned long _ebss;  /* Set by the linker */
>>>   void Reet_Handler (void)
>>>   {
>>>     unsigned long *src = &_sidata
>>>     unsigned long *src = &_sdata
>>>     /* Copy data segment into RAM */
>>>     if (src != dst) {
>>>       while (dst < &_edata)
>>>         *(dst++) = *(src++);
>>>     }
>>>     /* Zero BSS segment */
>>>     dst = &_sbss;
>>>     while (dst < &_ebss)
>>>       *(dst++) = 0;
>>>     main();
>>>   }
>>>
>>>
>>> With -ftree-loop-distribute-patterns those two loops are replaced by calls to
>>> memcpy() and memset().
>>>
>>> The memcpy function finishes just fine. But the memset function doesn't seem
>>> to finish.  It looks like this:
>>>
>>>   void memset (void *s, int c, size_t n)
>>>   {
>>>     int i;
>>>     for (i=0; i<n; i++)
>>>       ((char *)s)[i] = c;
>>>   }
>>
>> This is probably not the cause of the crash but it's worth keeping
>> in mind.  The standard memset function returns void* and (unless
>> disabled) recent versions of GCC will issue a warning:
> 
> Ooops!
> 
> This was actually my fault. Since the computer doesn't have network, I
> had typed the code by hand into the mail.
> 
> Although I double-checked multiple times, I managed to introduce several
> typos :-///
> 
> Sorry for the confusion!
> 
> 
> The code of Reset_Handler() and memset() actually looks like this:
> 
>     void Reset_Handler (void)
>     {
>       unsigned long *src = &_sidata;
>       unsigned long *dst = &_sdata;
> 
>       /* Copy data segment into RAM */
>       if (src != dst) {
>         while (dst < &_edata)
>           *(dst++) = *(src++);
>       }
> 
>       /* Zero BSS segment */
>       dst = &_sbss;
>       while (dst < &_ebss)
>         *(dst++) = 0;
>  
>       main();
>     }
> 
>     void *memset (void *s, int c, size_t n)
>     {
>          int i;
>          for (i=0; i<n; i++)
> /* B */   ((char *)s)[i] = c;
>    
> /* B */ return s;
>     }
> 
>>> Any ideas why this function is crashing? I can't see anything suspicious here.
>>
>> I doubt it's the cause of the crash either but only addresses of
>> bytes of the same object can be used in relational expressions
>> (i.e., the two less-than controlling expressions).
> 
> Hmm, you are talking about the two loops in Reset_Handler(), right?
> 
>> Using addresses of unrelated objects is undefined.
> 
> Hmmm... I am not an expert on this topic. But I tend to think the BSS segment
> is an object, which in turn is an array of uint8_t and/or uint32_t.
> Taking the address one past the last element of an array for comparison is
> a perfectly valid operation, AFAIK.
> 
> So what would be the proper way to communicate the dimensions of the BSS
> segment from the linker to the runtime of the compiled program?
> 
> The memset() function is called with the right parameters. And it seems to
> work when I single-step it on instruction level (that is, "stepi" command in
> gdb). But it crashes if I set breakpoints at the two instructions marked above
> and use the "cont" command or the "step" command in gdb.
> 

Have a look at "arm-eabi-objdump -S -d main.elf". Sometimes this is
quite revealing.

Are you using openocd or something similar for debugging? You are
compiling for a cortex-m0/3/4? Are you single stepping through the
complete startup sequence or do you set a breakpoint at the top of memset
(i.e. are break points working at all)?

Interrupts are still disabled?

Why is the stack pointer so low at this point of execution? Using
0x20018000-0x20017d20 == 0x2e0 bytes of stack seems a little excessive
for just one call.

I usually start toggling output lines when I'm stuck like this...

Matthias
-- 
Matthias Pfaller                          Software Entwicklung
marco Systemanalyse und Entwicklung GmbH  Tel   +49 8131 5161 41
Hans-Böckler-Str. 2, D 85221 Dachau       Fax   +49 8131 5161 66
http://www.marco.de/                      Email leo@marco.de
Geschäftsführer Martin Reuter             HRB 171775 Amtsgericht München




* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-17 12:37     ` Matthias Pfaller
@ 2019-10-17 14:10       ` Josef Wolf
  2019-10-17 14:55         ` Richard Earnshaw (lists)
  0 siblings, 1 reply; 26+ messages in thread
From: Josef Wolf @ 2019-10-17 14:10 UTC (permalink / raw)
  To: gcc-help

Thanks for your help, Matthias!

On Thu, Oct 17, 2019 at 02:37:11PM +0200, Matthias Pfaller wrote:

> Have a look at "arm-eabi-objdump -S -d main.elf". Sometimes this is
> quite revealing.

Yeah.

> Are you using openocd or something similar for debugging?

Yes. Openocd with gdb.

> You are compiling for a cortex-m0/3/4?

Cortex-m3

> Are you single stepping through the complete startup sequence or do you set a
> breakpoint at the top of memset (i.e. are break points working at all)?

Breakpoints are working. But there is only a limited set of hardware
breakpoints (four, AFAIR).

> Interrupts are still disabled?

There are no interrupt sources enabled yet. But I wonder why the CPU is not
starting up with disabled IRQs? I am new to the ARM architecture, but every
other architecture I know of would come out from reset with disabled
interrupts... I'd expect BASEPRI and PRIMASK to be set to sane values before
the first instruction is executed?

Anyway, explicitly calling __set_PRIMASK(1) also did not help, although
PRIMASK is still set when the processor crashes.

> Why is the stack pointer so low at this point of execution? Using
> 0x20018000-0x20017d20 == 0x2e0 bytes of stack seems a little excessive
> for just one call.

Ah!... Looks like you've spotted the problem! Actually, the SP is decremented
on every cycle of the loop:

  (gdb) disass
  Dump of assembler code for function memset:
     0x08001008 <+0>:    push {r4, lr}
     0x0800100a <+2>:    mov  r4, r0
     0x0800100c <+4>:    cbz  r2, 0x8001014 <memset+12>
  => 0x0800100e <+6>:    uxtb r1, r1
     0x08001010 <+8>:    bl   0x8001008 <memset>
     0x08001014 <+12>:   mov  r0, r4
     0x08001016 <+14>:   pop  {r4, pc}
  End of assembler dump.

This looks REALLY suspicious to me. Every cycle of the loop in memset() is
pushing something onto the stack?!?
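
For a rough sense of scale, the stack usage can be worked out from the
numbers already posted. This is a back-of-the-envelope sketch: it assumes
every nested call costs exactly the 8 bytes of "push {r4, lr}" and ignores
whatever the reset handler itself put on the stack. The function names are
made up for the example.

```c
#include <stdint.h>

/* "push {r4, lr}" stores two 32-bit registers per call. */
enum { FRAME_BYTES = 2 * 4 };

/* The stack grows downwards, so usage is initial SP minus current SP. */
static uint32_t stack_bytes_used(uint32_t initial_sp, uint32_t current_sp)
{
    return initial_sp - current_sp;
}

/* Upper bound on how many such frames fit into the stack used so far. */
static uint32_t nested_calls(uint32_t initial_sp, uint32_t current_sp)
{
    return stack_bytes_used(initial_sp, current_sp) / FRAME_BYTES;
}
```

With _estack = 0x20018000 and SP = 0x20017d20 this gives 0x2e0 (736) bytes,
or at most 92 frames of 8 bytes each.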

Without the  -ftree-loop-distribute-patterns option, the memset() function
looks entirely different:

         cbz    r2, <memset+18>
         add    r2, r0
         subs   r2, #1
         uxtb   r1, r1
         subs   r3, r0, #1
  <+10>: strb.w r1, [r3, #1]!
         cmp    r3, r2
         bne.n  <memset+10>
  <+18>: bx     lr

> I usually start toggling output lines when I'm stuck like this...

?


-- 
Josef Wolf
jw@raven.inka.de


* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-17 14:10       ` Josef Wolf
@ 2019-10-17 14:55         ` Richard Earnshaw (lists)
  2019-10-18  9:00           ` Josef Wolf
  0 siblings, 1 reply; 26+ messages in thread
From: Richard Earnshaw (lists) @ 2019-10-17 14:55 UTC (permalink / raw)
  To: Josef Wolf, gcc-help

On 17/10/2019 15:04, Josef Wolf wrote:
> Thanks for your help, Matthias!
> 
> On Thu, Oct 17, 2019 at 02:37:11PM +0200, Matthias Pfaller wrote:
> 
>> Have a look at "arm-eabi-objdump -S -d main.elf". Sometimes this is
>> quite revealing.
> 
> Yeah.
> 
>> Are you using openocd or something similar for debugging?
> 
> Yes. Openocd with gdb.
> 
>> You are compiling for a cortex-m0/3/4?
> 
> Cortex-m3
> 
>> Are you single stepping through the complete startup sequence or do you set a
>> breakpoint at the top of memset (i.e. are break points working at all)?
> 
> Breakpoints are working. But there is only a limited set of hardware
> breakpoints (four, AFAIR).
> 
>> Interrupts are still disabled?
> 
> There are no interrupt sources enabled yet. But I wonder why the CPU is not
> starting up with disabled IRQs? I am new to the ARM architecture, but every
> other architecture I know of would come out from reset with disabled
> interrupts... I'd expect BASEPRI and PRIMASK to be set to sane values before
> the first instruction is executed?
> 
> Anyway, explicitly calling __set_PRIMASK(1) also did not help, although
> PRIMASK is still set when the processor crashes.
> 
>> Why is the stack pointer so low at this point of execution? Using
>> 0x20018000-0x20017d20 == 0x2e0 bytes of stack seems a little excessive
>> for just one call.
> 
> Ah!... Looks like you've spotted the problem! Actually, the SP is decremented
> on every cycle of the loop:
> 
>    (gdb) disass
>    Dump of assembler code for function memset:
>       0x08001008 <+0>:    push {r4, lr}
>       0x0800100a <+2>:    mov  r4, r0
>       0x0800100c <+4>:    cbz  r2, 0x8001014 <memset+12>
>    => 0x0800100e <+6>:    uxtb r1, r1
>       0x08001010 <+8>:    bl   0x8001008 <memset>
>       0x08001014 <+12>:   mov  r0, r4
>       0x08001016 <+14>:   pop  {r4, pc}
>    End of assembler dump.
> 
> This looks REALLY suspicious to me. Every cycle of the loop in memset() is
> pushing something onto the stack?!?
> 
> Without the  -ftree-loop-distribute-patterns option, the memset() function
> looks entirely different:
> 
>           cbz    r2, <memset+18>
>           add    r2, r0
>           subs   r2, #1
>           uxtb   r1, r1
>           subs   r3, r0, #1
>    <+10>: strb.w r1, [r3, #1]!
>           cmp    r3, r2
>           bne.n  <memset+10>
>    <+18>: bx     lr
> 
>> I usually start toggling output lines when I'm stuck like this...
> 
> ?
> 
> 

The compiler has spotted that you've written something that acts like 
memset and optimized it into a function call to memset.  So now you're 
recursing to oblivion.  Try adding -fno-builtin-memset to your compile 
options.

R.


* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-17 14:55         ` Richard Earnshaw (lists)
@ 2019-10-18  9:00           ` Josef Wolf
  2019-10-18 10:26             ` Richard Earnshaw (lists)
  0 siblings, 1 reply; 26+ messages in thread
From: Josef Wolf @ 2019-10-18  9:00 UTC (permalink / raw)
  To: gcc-help

Thanks for your help, Richard!

On Thu, Oct 17, 2019 at 03:55:31PM +0100, Richard Earnshaw (lists) wrote:
> On 17/10/2019 15:04, Josef Wolf wrote:
> >On Thu, Oct 17, 2019 at 02:37:11PM +0200, Matthias Pfaller wrote:
> >>Why is the stack pointer so low at this point of execution? Using
> >>0x20018000-0x20017d20 == 0x2e0 bytes of stack seems a little excessive
> >>for just one call.
> >
> >Ah!... Looks like you've spotted the problem! Actually, the SP is decremented
> >on every cycle of the loop:
> >
> >   (gdb) disass
> >   Dump of assembler code for function memset:
> >      0x08001008 <+0>:    push {r4, lr}
> >      0x0800100a <+2>:    mov  r4, r0
> >      0x0800100c <+4>:    cbz  r2, 0x8001014 <memset+12>
> >   => 0x0800100e <+6>:    uxtb r1, r1
> >      0x08001010 <+8>:    bl   0x8001008 <memset>
> >      0x08001014 <+12>:   mov  r0, r4
> >      0x08001016 <+14>:   pop  {r4, pc}
> >   End of assembler dump.
> >
> >This looks REALLY suspicious to me. Every cycle of the loop in memset() is
> >pushing something onto the stack?!?
> >
> >Without the  -ftree-loop-distribute-patterns option, the memset() function
> >looks entirely different:
> >
> >          cbz    r2, <memset+18>
> >          add    r2, r0
> >          subs   r2, #1
> >          uxtb   r1, r1
> >          subs   r3, r0, #1
> >   <+10>: strb.w r1, [r3, #1]!
> >          cmp    r3, r2
> >          bne.n  <memset+10>
> >   <+18>: bx     lr
> 
> The compiler has spotted that you've written something that acts like memset
> and optimized it into a function call to memset.  So now you're recursing to
> oblivion.  Try adding -fno-builtin-memset to your compile options.

This sounds reasonable, and I was actually thinking it would solve the
problem.

Unfortunately, -fno-builtin-memset doesn't have any effect. The same code is
generated.
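
One approach that is sometimes used for hand-written memset()
implementations, and that has not been confirmed as the fix in this thread
so far, is to switch the loop-to-libcall transformation off for that one
function with GCC's optimize attribute. The sketch below assumes GCC; the
macro name NO_LOOP_TO_LIBCALL and the function name my_memset are made up
for this example (in firmware the function would be called memset). The GCC
manual describes the optimize attribute as intended for debugging, so
passing -fno-tree-loop-distribute-patterns for the whole file may be the
more robust choice.

```c
#include <stddef.h>

/* Assumption: GCC accepts -f options as strings in the optimize
   attribute. Other compilers get an empty macro. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_LOOP_TO_LIBCALL \
    __attribute__((__optimize__("-fno-tree-loop-distribute-patterns")))
#else
#define NO_LOOP_TO_LIBCALL
#endif

/* Byte-wise fill that should stay a loop instead of becoming a call
   to memset(), which would otherwise recurse into itself. */
NO_LOOP_TO_LIBCALL
void *my_memset(void *s, int c, size_t n)
{
    unsigned char *p = s;
    size_t i;

    for (i = 0; i < n; i++)
        p[i] = (unsigned char)c;

    return s;
}
```

Whether the attribute took effect can be checked the same way as above:
disassemble the function and look for a "bl memset" inside it.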

-- 
Josef Wolf
jw@raven.inka.de


* Propagating addresses from linker to the runtime (was: Re: Crash when cross compiling for ARM with GCC-8-2-0 and) -ftree-loop-distribute-patterns
  2019-10-17 11:40   ` Josef Wolf
  2019-10-17 12:37     ` Matthias Pfaller
@ 2019-10-18  9:10     ` Josef Wolf
  2019-10-18  9:15       ` Propagating addresses from linker to the runtime Florian Weimer
  1 sibling, 1 reply; 26+ messages in thread
From: Josef Wolf @ 2019-10-18  9:10 UTC (permalink / raw)
  To: gcc-help

On Thu, Oct 17, 2019 at 01:31:57PM +0200, Josef Wolf wrote:
> On Wed, Oct 16, 2019 at 12:18:15PM -0600, Martin Sebor wrote:
> [ ... ]
> The code of Reset_Handler() and memset() actually looks like this:
> 
>     void Reset_Handler (void)
>     {
>       unsigned long *src = &_sidata;
>       unsigned long *dst = &_sdata;
> 
>       /* Copy data segment into RAM */
>       if (src != dst) {
>         while (dst < &_edata)
>           *(dst++) = *(src++);
>       }
> 
>       /* Zero BSS segment */
>       dst = &_sbss;
>       while (dst < &_ebss)
>         *(dst++) = 0;
>  
>       main();
>     }
> 
> > I doubt it's the cause of the crash either but only addresses of
> > bytes of the same object can be used in relational expressions
> > (i.e., the two less-than controlling expressions).
> 
> Hmm, you are talking about the two loops in Reset_Handler(), right?
> 
> > Using addresses of unrelated objects is undefined.
> 
> Hmmm... I am not an expert on this topic. But I tend to think the BSS segment
> is an object, which in turn is an array of uint8_t and/or uint32_t.
> Taking the address one past the last element of an array for comparison is
> a perfectly valid operation, AFAIK.
> 
> So what would be the proper way to communicate the dimensions of the BSS
> segment from the linker to the runtime of the compiled program?

I would like to repeat this question:

Strictly speaking, the symbols defined by the linker (_sidata, _sdata, _edata,
_sbss and _ebss) are unrelated when seen from the perspective of the
compiler. Therefore, it is not allowed by the standard to use their addresses
for comparison.

So what would be the proper way to pass this information from the linker to the
compiler?

-- 
Josef Wolf
jw@raven.inka.de


* Re: Propagating addresses from linker to the runtime
  2019-10-18  9:10     ` Propagating addresses from linker to the runtime (was: Re: Crash when cross compiling for ARM with GCC-8-2-0 and) -ftree-loop-distribute-patterns Josef Wolf
@ 2019-10-18  9:15       ` Florian Weimer
  2019-10-18  9:50         ` Josef Wolf
  0 siblings, 1 reply; 26+ messages in thread
From: Florian Weimer @ 2019-10-18  9:15 UTC (permalink / raw)
  To: Josef Wolf; +Cc: gcc-help

* Josef Wolf:

> Strictly speaking, the symbols defined by the linker (_sidata, _sdata, _edata,
> _sbss and _ebss) are unrelated when seen from the perspective of the
> compiler. Therefore, it is not allowed by the standard to use their addresses
> for comparison.
>
> So what would be the proper way to pass this information from the linker to the
> compiler?

In glibc, we use this:

/* Perform vtable pointer validation.  If validation fails, terminate
   the process.  */
static inline const struct _IO_jump_t *
IO_validate_vtable (const struct _IO_jump_t *vtable)
{
  /* Fast path: The vtable pointer is within the __libc_IO_vtables
     section.  */
  uintptr_t section_length = __stop___libc_IO_vtables - __start___libc_IO_vtables;
  uintptr_t ptr = (uintptr_t) vtable;
  uintptr_t offset = ptr - (uintptr_t) __start___libc_IO_vtables;
  if (__glibc_unlikely (offset >= section_length))
    /* The vtable pointer is not in the expected section.  Use the
       slow path, which will terminate the process if necessary.  */
    _IO_vtable_check ();
  return vtable;
}

I do not know how effective this is.
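
The range check in that function can be reduced to a small standalone
sketch. The helper name in_section is made up for this example, and start
and stop stand in for the section bounds. The subtraction is done on
uintptr_t, so an address below start wraps around to a very large offset
and fails the single comparison:

```c
#include <stdint.h>

/* True if p lies in [start, stop). Only integer arithmetic and one
   unsigned comparison are used; no relational operator is applied
   to pointers. */
static int in_section(const void *p, const char *start, const char *stop)
{
    uintptr_t length = (uintptr_t)stop - (uintptr_t)start;
    uintptr_t offset = (uintptr_t)p - (uintptr_t)start;

    return offset < length;
}
```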

In C++, you can use std::less, which was enhanced to cover your use
case.

Thanks,
Florian


* Re: Propagating addresses from linker to the runtime
  2019-10-18  9:15       ` Propagating addresses from linker to the runtime Florian Weimer
@ 2019-10-18  9:50         ` Josef Wolf
  2019-10-18 10:47           ` Florian Weimer
  0 siblings, 1 reply; 26+ messages in thread
From: Josef Wolf @ 2019-10-18  9:50 UTC (permalink / raw)
  To: gcc-help

Thanks for the reply, Florian!

On Fri, Oct 18, 2019 at 11:15:22AM +0200, Florian Weimer wrote:
> > So what would be the proper way to pass this information from the linker to the
> > compiler?
> 
> In glibc, we use this:
> 
> [ ... ]
>   uintptr_t section_length = __stop___libc_IO_vtables - __start___libc_IO_vtables;

  #define symbol_set_declare(set) \
    extern char const __start_##set[] __symbol_set_attribute; \
    extern char const __stop_##set[] __symbol_set_attribute;

Due to symbol_set_declare, those symbols expand to two unrelated
symbols. Using unrelated symbols for pointer arithmetic again violates the
standard.

Thus, the issue that Martin mentioned applies here, too.

To get this conforming, the linker would need to export a symbol to the start
of the section and the length of the section, IMHO.
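
A sketch of what the consuming side could look like under that scheme. The
helper name zero_region is made up for this example, and a linker-provided
length symbol is an assumption, not something established in this thread.
The destination is volatile-qualified so that the stores stay individual
stores.

```c
#include <stddef.h>

/* Zero `length` bytes beginning at `start`. Only one address enters
   the function; the end of the region is never formed as a pointer
   to a second symbol. */
static void zero_region(volatile unsigned char *start, size_t length)
{
    size_t i;

    for (i = 0; i < length; i++)
        start[i] = 0;
}
```

The startup code would then pass the section start and the exported length
instead of comparing against an end symbol.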

-- 
Josef Wolf
jw@raven.inka.de


* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-18  9:00           ` Josef Wolf
@ 2019-10-18 10:26             ` Richard Earnshaw (lists)
  2019-10-18 12:10               ` Josef Wolf
  2019-10-18 12:50               ` Josef Wolf
  0 siblings, 2 replies; 26+ messages in thread
From: Richard Earnshaw (lists) @ 2019-10-18 10:26 UTC (permalink / raw)
  To: Josef Wolf, gcc-help

On 18/10/2019 09:53, Josef Wolf wrote:
> Thanks for your help, Richard!
> 
> On Thu, Oct 17, 2019 at 03:55:31PM +0100, Richard Earnshaw (lists) wrote:
>> On 17/10/2019 15:04, Josef Wolf wrote:
>>> On Thu, Oct 17, 2019 at 02:37:11PM +0200, Matthias Pfaller wrote:
>>>> Why is the stack pointer so low at this point of execution? Using
>>>> 0x20018000-0x20017d20 == 0x2e0 bytes of stack seems a little excessive
>>>> for just one call.
>>>
>>> Ah!... Looks like you've spotted the problem! Actually, the SP is decremented
>>> on every cycle of the loop:
>>>
>>>    (gdb) disass
>>>    Dump of assembler code for function memset:
>>>       0x08001008 <+0>:    push {r4, lr}
>>>       0x0800100a <+2>:    mov  r4, r0
>>>       0x0800100c <+4>:    cbz  r2, 0x8001014 <memset+12>
>>>    => 0x0800100e <+6>:    uxtb r1, r1
>>>       0x08001010 <+8>:    bl   0x8001008 <memset>
>>>       0x08001014 <+12>:   mov  r0, r4
>>>       0x08001016 <+14>:   pop  {r4, pc}
>>>    End of assembler dump.
>>>
>>> This looks REALLY suspicious to me. Every cycle of the loop in memset() is
>>> pushing something onto the stack?!?
>>>
>>> Without the  -ftree-loop-distribute-patterns option, the memset() function
>>> looks entirely different:
>>>
>>>           cbz    r2, <memset+18>
>>>           add    r2, r0
>>>           subs   r2, #1
>>>           uxtb   r1, r1
>>>           subs   r3, r0, #1
>>>    <+10>: strb.w r1, [r3, #1]!
>>>           cmp    r3, r2
>>>           bne.n  <memset+10>
>>>    <+18>: bx     lr
>>
>> The compiler has spotted that you've written something that acts like memset
>> and optimized it into a function call to memset.  So now you're recursing to
>> oblivion.  Try adding -fno-builtin-memset to your compile options.
> 
> This sounds reasonable, and I was actually thinking it would solve the
> problem.
> 
> Unfortunately, -fno-builtin-memset doesn't have any effect. The same code is
> generated.
> 

Ah, yes.  Looking at some libc sources it puts an explicit optimization 
attribute onto the memset (and similar mem... functions) to disable 
-ftree-loop-distribute-patterns for such functions.  So something like

void *
__attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
memset (void *s, int c, size_t n)
{
   int i;
   for (i=0; i<n; i++)
     ((char *)s)[i] = c;

   return s;
}

Though, of course, it's wrapped up in a macro to make it look a bit 
prettier ;-)

This is just one of those gotchas that you have to be aware of when 
implementing the standard library.
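
As a rough sketch of what such a macro wrapper might look like: the names
NO_LOOP_TO_LIBCALL and my_memset below are made up for illustration (a libc
uses its own macro name, and a bare-metal build would call the function
memset). The attribute is applied only when building with GCC; with other
compilers the macro is assumed to expand to nothing.

```c
#include <stddef.h>

/* Hypothetical macro name.  The optimize attribute is GCC-specific, so it
   is left out for any other compiler.  */
#if defined(__GNUC__) && !defined(__clang__)
# define NO_LOOP_TO_LIBCALL \
    __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define NO_LOOP_TO_LIBCALL
#endif

/* Called my_memset here so the sketch can be built next to a hosted libc;
   on the bare-metal target this would be the real memset.  */
NO_LOOP_TO_LIBCALL
void *
my_memset (void *s, int c, size_t n)
{
  unsigned char *p = s;
  size_t i;   /* same type as n, avoids a signed/unsigned comparison */

  /* Plain byte loop.  With the attribute in place GCC should not turn
     this back into a call to memset.  */
  for (i = 0; i < n; i++)
    p[i] = (unsigned char) c;

  return s;
}
```

The same macro would then go on memcpy, memmove and any other function whose
body is a loop that the compiler could recognize as a library call.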

R.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Propagating addresses from linker to the runtime
  2019-10-18  9:50         ` Josef Wolf
@ 2019-10-18 10:47           ` Florian Weimer
  2019-10-18 12:51             ` Segher Boessenkool
  0 siblings, 1 reply; 26+ messages in thread
From: Florian Weimer @ 2019-10-18 10:47 UTC (permalink / raw)
  To: Josef Wolf; +Cc: gcc-help

* Josef Wolf:

> Thanks for the reply, Florian!
>
> On Fri, Oct 18, 2019 at 11:15:22AM +0200, Florian Weimer wrote:
>> > So what would be the proper way to pass this information from the linker to the
>> > compiler?
>> 
>> In glibc, we use this:
>> 
>> [ ... ]
>>   uintptr_t section_length = __stop___libc_IO_vtables - __start___libc_IO_vtables;
>
>   #define symbol_set_declare(set) \
>     extern char const __start_##set[] __symbol_set_attribute; \
>     extern char const __stop_##set[] __symbol_set_attribute;
>
> Due to symbol_set_declare, those symbols expand to two unrelated
> symbols. Using unrelated symbols for pointer arithmetic again violates the
> standard.

Ah.  So we need more uintptr_t casts here.
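
A minimal sketch of what "more uintptr_t casts" could mean: convert both
addresses to integers first and only then subtract or compare.  The helper
names span_bytes and in_span are hypothetical, and the sketch assumes a flat
address space in which the pointer-to-uintptr_t conversion preserves address
order, which is implementation-defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Length of [start, stop) in bytes, computed on integers rather than by
   subtracting two pointers.  */
size_t
span_bytes (const void *start, const void *stop)
{
  return (size_t) ((uintptr_t) stop - (uintptr_t) start);
}

/* Nonzero if p lies inside [start, stop).  A p below start wraps around to
   a huge unsigned offset and is rejected by the same comparison.  */
int
in_span (const void *p, const void *start, const void *stop)
{
  uintptr_t offset = (uintptr_t) p - (uintptr_t) start;

  return offset < (uintptr_t) stop - (uintptr_t) start;
}
```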

> Thus, the issue that Martin mentioned applies here, too.
>
> To get this conforming, the linker would need to export a symbol to the start
> of the section and the length of the section, IMHO.

Right, but we do not have that today. 8-(

An explicit size also helps targets where there are restrictions on
alignment for global symbols, and the section size is actually measured
in bytes.  (Which of course leads to the old question whether an object
can have a size which is not a multiple of its alignment.)

Thanks,
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-18 10:26             ` Richard Earnshaw (lists)
@ 2019-10-18 12:10               ` Josef Wolf
  2019-10-18 13:07                 ` Segher Boessenkool
  2019-10-18 12:50               ` Josef Wolf
  1 sibling, 1 reply; 26+ messages in thread
From: Josef Wolf @ 2019-10-18 12:10 UTC (permalink / raw)
  To: gcc-help

On Fri, Oct 18, 2019 at 11:25:53AM +0100, Richard Earnshaw (lists) wrote:
> 
> Ah, yes.  Looking at some libc sources it puts an explicit optimization
> attribute onto the memset (and similar mem... functions) to disable
> -ftree-loop-distribute-patterns for such functions.  So something like
> 
> void *
> __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
> memset (void *s, int c, size_t n)
> {
>   int i;
>   for (i=0; i<n; i++)
>     ((char *)s)[i] = c;
> 
>   return s;
> }
> 
> Though, of course, it's wrapped up in a macro to make it look a bit prettier
> ;-)

Thanks, Richard! This finally fixes the crashes!

But Umm... Honestly, this solution looks pretty weird to me.

When the compiler decides to replace code by a call to some function,
shouldn't it make sure not to replace _all_ occurrences of such code (and thus
the final implementation of it) also?

To me, this looks as if the compiler failed to implement the base case of the
recursion. Instead, the user is expected to specify obscure attributes.

What if the next -fsuper-duper-optimization flag again starts doing such
replacements?

> This is just one of those gotchas that you have to be aware of when
> implementing the standard library.

I am not implementing the standard library. I am just trying to get rid of
it. I need just a couple of functions from the stdlib, and I was happy with
simple/lightweight re-implementations of those functions for decades.

-- 
Josef Wolf
jw@raven.inka.de

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-18 10:26             ` Richard Earnshaw (lists)
  2019-10-18 12:10               ` Josef Wolf
@ 2019-10-18 12:50               ` Josef Wolf
  2019-10-18 14:04                 ` Richard Earnshaw (lists)
  1 sibling, 1 reply; 26+ messages in thread
From: Josef Wolf @ 2019-10-18 12:50 UTC (permalink / raw)
  To: gcc-help

On Fri, Oct 18, 2019 at 11:25:53AM +0100, Richard Earnshaw (lists) wrote:
>
> void *
> __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
> memset (void *s, int c, size_t n)
> {
>   int i;
>   for (i=0; i<n; i++)
>     ((char *)s)[i] = c;
> 
>   return s;
> }

Wouldn't

   void *memset (void *s, int c, size_t n)
   {
           return __builtin_memset (s, c, n);
   }

be a cleaner solution to this?

Unfortunately, this compiles to a jump to itself. No matter whether I use the
-fno-builtin-memset flag or not.

-- 
Josef Wolf
jw@raven.inka.de

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Propagating addresses from linker to the runtime
  2019-10-18 10:47           ` Florian Weimer
@ 2019-10-18 12:51             ` Segher Boessenkool
  2019-10-18 12:56               ` Florian Weimer
  2019-10-18 13:30               ` Josef Wolf
  0 siblings, 2 replies; 26+ messages in thread
From: Segher Boessenkool @ 2019-10-18 12:51 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Josef Wolf, gcc-help

On Fri, Oct 18, 2019 at 12:47:49PM +0200, Florian Weimer wrote:
> >   #define symbol_set_declare(set) \
> >     extern char const __start_##set[] __symbol_set_attribute; \
> >     extern char const __stop_##set[] __symbol_set_attribute;
> >
> > Due to symbol_set_declare, those symbols expand to two unrelated
> > symbols. Using unrelated symbols for pointer arithmetic again violates the
> > standard.
> 
> Ah.  So we need more uintptr_t casts here.

Using reserved names (like those starting with two underscores) is UB
already.  And those particular names actually clash with names GNU LD
already creates!

> > Thus, the issue that Martin mentioned applies here, too.
> >
> > To get this conforming, the linker would need to export a symbol to the start
> > of the section and the length of the section, IMHO.
> 
> Right, but we do not have that today. 8-(

We have had it for over 25 years:

Thu Aug 18 15:37:45 1994  Ian Lance Taylor  (ian@sanguine.cygnus.com)

	Make the ELF linker handle orphaned sections reasonably.  Also,
	define __start_SECNAME and __stop_SECNAME around sections whose
	names can be represented in C, for the benefit of symbol sets in
	glibc.

You just need to have linker scripts that do not sabotage this :-)

> An explicit size also helps targets where there are restrictions on
> alignment for global symbols, and the section size is actually measured
> in bytes.  (Which of course leads to the old question whether an object
> can have a size which is not a multiple of its alignment.)

It can.  This is common, even, for example a .rodata.str1.8 section has
alignment 8, but its size can be anything.


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Propagating addresses from linker to the runtime
  2019-10-18 12:51             ` Segher Boessenkool
@ 2019-10-18 12:56               ` Florian Weimer
  2019-10-18 14:14                 ` Segher Boessenkool
  2019-10-18 13:30               ` Josef Wolf
  1 sibling, 1 reply; 26+ messages in thread
From: Florian Weimer @ 2019-10-18 12:56 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Josef Wolf, gcc-help

* Segher Boessenkool:

> On Fri, Oct 18, 2019 at 12:47:49PM +0200, Florian Weimer wrote:
>> >   #define symbol_set_declare(set) \
>> >     extern char const __start_##set[] __symbol_set_attribute; \
>> >     extern char const __stop_##set[] __symbol_set_attribute;
>> >
>> > Due to symbol_set_declare, those symbols expand to two unrelated
>> > symbols. Using unrelated symbols for pointer arithmetic again violates the
>> > standard.
>> 
>> Ah.  So we need more uintptr_t casts here.
>
> Using reserved names (like those starting with two underscores) is UB
> already.  And those particular names actually clash with names GNU LD
> already creates!

Well, those names are created by BFD ld for us as well.

But that doesn't mean that GNU C will do anything sensible with them.
You do not need C language extensions for that.  Some of them are
non-controversial, others less so.

>> > Thus, the issue that Martin mentioned applies here, too.
>> >
>> > To get this conforming, the linker would need to export a symbol to the start
>> > of the section and the length of the section, IMHO.
>> 
>> Right, but we do not have that today. 8-(
>
> We have had it for over 25 years:
>
> Thu Aug 18 15:37:45 1994  Ian Lance Taylor  (ian@sanguine.cygnus.com)
>
> 	Make the ELF linker handle orphaned sections reasonably.  Also,
> 	define __start_SECNAME and __stop_SECNAME around sections whose
> 	names can be represented in C, for the benefit of symbol sets in
> 	glibc.
>
> You just need to have linker scripts that do not sabotage this :-)

But it doesn't work on s390x if the section has a size that is not a
multiple of 2 because all global symbols must have alignment 2 or more
on that architecture (of course __alignof (symbol) may still say 1
because *that* has to reflect C semantics).

Thanks,
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-18 12:10               ` Josef Wolf
@ 2019-10-18 13:07                 ` Segher Boessenkool
  2019-10-18 13:40                   ` Josef Wolf
  0 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2019-10-18 13:07 UTC (permalink / raw)
  To: Josef Wolf, gcc-help

On Fri, Oct 18, 2019 at 02:07:37PM +0200, Josef Wolf wrote:
> But Umm... Honestly, this solution looks pretty weird to me.
> 
> When the compiler decides to replace code by a call to some function,
> shouldn't it make sure not to replace _all_ occurrences of such code (and thus
> the final implementation of it) also?

That would be ideal of course, but how can the compiler know?

> > This is just one of those gotchas that you have to be aware of when
> > implementing the standard library.
> 
> I am not implementing the standard library. I am just trying to get rid of
> it. I need just a couple of functions from the stdlib, and I was happy with
> simple/lightweight re-implementations of those functions for decades.

memset is a reserved name.  If you use it you are implementing part of
the standard library (except when using -ffreestanding, but GCC has an
exception there).


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-16 18:18 ` Martin Sebor
  2019-10-17 11:40   ` Josef Wolf
@ 2019-10-18 13:10   ` Josef Wolf
  1 sibling, 0 replies; 26+ messages in thread
From: Josef Wolf @ 2019-10-18 13:10 UTC (permalink / raw)
  To: gcc-help

On Wed, Oct 16, 2019 at 12:18:15PM -0600, Martin Sebor wrote:
[ ... ]
> >  extern unsigned long _sidata; /* Set by the linker */
> >  extern unsigned long _sdata;  /* Set by the linker */
> >  extern unsigned long _edata;  /* Set by the linker */
> >  extern unsigned long _sbss;   /* Set by the linker */
> >  extern unsigned long _ebss;   /* Set by the linker */
> >  
> >   void Reset_Handler (void)
> >   {
> >     unsigned long *src = &_sidata;
> >     unsigned long *dst = &_sdata;
> >  
> >     /* Copy data segment into RAM */
> >     if (src != dst) {
> >       while (dst < &_edata)
> >         *(dst++) = *(src++);
> >     }
> >  
> >     /* Zero BSS segment */
> >     dst = &_sbss;
> >     while (dst < &_ebss)
> >       *(dst++) = 0;
> >  
> >     main();
> >   }

> GCC doesn't issue a warning for this bug yet but it might in
> the future.   To avoid the undefined behavior and future
> warnings, convert the addresses to uintptr_t first and compare
> those instead.

I still have a hard time fully understanding this one.

_sbss and friends are symbols defined by the linker. They are never ever used
to store any data. They are only used to get their addresses. Therefore, it
should not matter which data type they have, as long as alignment requirements
are met.

I am using "unsigned long" here because this is convenient for
copying/deleting the sections. Why would a pointer to "unsigned long" be
insufficient to point to an array of "unsigned long"?

Given the fact that there is no way to communicate the length of a section
from the linker to the runtime, this is the cleanest code I can think of.

What am I missing here?

-- 
Josef Wolf
jw@raven.inka.de

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Propagating addresses from linker to the runtime
  2019-10-18 12:51             ` Segher Boessenkool
  2019-10-18 12:56               ` Florian Weimer
@ 2019-10-18 13:30               ` Josef Wolf
  2019-10-18 14:20                 ` Segher Boessenkool
  1 sibling, 1 reply; 26+ messages in thread
From: Josef Wolf @ 2019-10-18 13:30 UTC (permalink / raw)
  To: gcc-help

On Fri, Oct 18, 2019 at 07:51:38AM -0500, Segher Boessenkool wrote:
[ ... ]
> Using reserved names (like those starting with two underscores) is UB
> already.

What does "UB" mean?

> > > To get this conforming, the linker would need to export a symbol to the start
> > > of the section and the length of the section, IMHO.
> > 
> > Right, but we do not have that today. 8-(
> 
> We have had it for over 25 years:
> 
> Thu Aug 18 15:37:45 1994  Ian Lance Taylor  (ian@sanguine.cygnus.com)
> 
> 	Make the ELF linker handle orphaned sections reasonably.  Also,
> 	define __start_SECNAME and __stop_SECNAME around sections whose
> 	names can be represented in C, for the benefit of symbol sets in
> 	glibc.

???

I was talking about exporting start+length instead of start+stop.

Having start+length would allow a conforming implementation, because pointer
arithmetic on unrelated objects would no longer be needed:

    unsigned long *dst = &_s_bss;
    for (n = 0; n < _l_bss / sizeof (unsigned long); n++)
        dst[n] = 0;
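
Spelled out as a self-contained sketch, the start+length variant might look
like the code below.  The function name zero_region is hypothetical, and the
sketch assumes that the start address is suitably aligned and that the length
is a multiple of sizeof (unsigned long).  How the length would reach the
program (for example through a linker-script symbol such as the _l_bss above)
is left open here.

```c
#include <assert.h>
#include <stddef.h>

/* Zero len_bytes bytes beginning at start.  Only one base pointer and an
   element count are used, so no two unrelated objects are ever compared
   or subtracted.  */
void
zero_region (unsigned long *start, size_t len_bytes)
{
  size_t n;

  /* Assumption of this sketch: the region is a whole number of words.  */
  assert (len_bytes % sizeof (unsigned long) == 0);

  for (n = 0; n < len_bytes / sizeof (unsigned long); n++)
    start[n] = 0;
}
```

Note that at -O3 this loop is still a candidate for being turned into a call
to memset, so the issue with -ftree-loop-distribute-patterns discussed in the
other subthread applies to it in the same way.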


-- 
Josef Wolf
jw@raven.inka.de

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-18 13:07                 ` Segher Boessenkool
@ 2019-10-18 13:40                   ` Josef Wolf
  0 siblings, 0 replies; 26+ messages in thread
From: Josef Wolf @ 2019-10-18 13:40 UTC (permalink / raw)
  To: gcc-help

On Fri, Oct 18, 2019 at 08:07:34AM -0500, Segher Boessenkool wrote:
> memset is a reserved name.  If you use it you are implementing part of
> the standard library (except when using -ffreestanding, but GCC has an
> exception there).

Right. memset() is reserved by the stdlib.

I find it somewhat surprising to get into trouble with it when I explicitly
specify the -nostdlib option.

-- 
Josef Wolf
jw@raven.inka.de

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Crash when cross compiling for ARM with GCC-8-2-0 and -ftree-loop-distribute-patterns
  2019-10-18 12:50               ` Josef Wolf
@ 2019-10-18 14:04                 ` Richard Earnshaw (lists)
  0 siblings, 0 replies; 26+ messages in thread
From: Richard Earnshaw (lists) @ 2019-10-18 14:04 UTC (permalink / raw)
  To: Josef Wolf, gcc-help

On 18/10/2019 13:41, Josef Wolf wrote:
> On Fri, Oct 18, 2019 at 11:25:53AM +0100, Richard Earnshaw (lists) wrote:
>>
>> void *
>> __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
>> memset (void *s, int c, size_t n)
>> {
>>    int i;
>>    for (i=0; i<n; i++)
>>      ((char *)s)[i] = c;
>>
>>    return s;
>> }
> 
> Wouldn't
> 
>     void *memset (void *s, int c, size_t n)
>     {
>             return __builtin_memset (s, c, n);
>     }
> 
> be a cleaner solution to this?
> 
> Unfortunately, this compiles to a jump to itself. No matter whether I use the
> -fno-builtin-memset flag or not.
> 

On most targets __builtin_memset will only compile to in-lined code if 
the size is known (and sufficiently small), it's intended for cases 
where you probably don't want a loop, but do want to make use of known 
size and alignment.  It's not expected to be an all-singing all-dancing 
memset for this specific CPU.

Writing a good memset can be hard (writing memcpy is even harder) and 
compilers rarely do as well as the best assembly code when trying to 
handle all the important cases; but they can do better in the limited 
conditions where the size and alignment are statically known since many 
hard-to-predict branches can be entirely eliminated.

So in most cases, you *want* the compiler to call memset if the 
operation cannot really be optimized.

R.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Propagating addresses from linker to the runtime
  2019-10-18 12:56               ` Florian Weimer
@ 2019-10-18 14:14                 ` Segher Boessenkool
  2019-10-18 14:34                   ` Florian Weimer
  0 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2019-10-18 14:14 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Josef Wolf, gcc-help

On Fri, Oct 18, 2019 at 02:56:03PM +0200, Florian Weimer wrote:
> > You just need to have linker scripts that do not sabotage this :-)
> 
> But it doesn't work on s390x if the section has a size that is not a
> multiple of 2 because all global symbols must have alignment 2 or more
> on that architecture (of course __alignof (symbol) may still say 1
> because *that* has to reflect C semantics).

How does the output section not get enough alignment?  Is that the linker's
fault, or are the command-line flags wrong, or is it the linker script?

Does the *input* section have enough alignment?


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Propagating addresses from linker to the runtime
  2019-10-18 13:30               ` Josef Wolf
@ 2019-10-18 14:20                 ` Segher Boessenkool
  0 siblings, 0 replies; 26+ messages in thread
From: Segher Boessenkool @ 2019-10-18 14:20 UTC (permalink / raw)
  To: Josef Wolf, gcc-help

On Fri, Oct 18, 2019 at 03:20:41PM +0200, Josef Wolf wrote:
> On Fri, Oct 18, 2019 at 07:51:38AM -0500, Segher Boessenkool wrote:
> [ ... ]
> > Using reserved names (like those starting with two underscores) is UB
> > already.
> 
> What does "UB" mean?

Undefined behaviour.  I should have spelled that out, sorry.

> I was talking about exporting start+length instead of start+stop.
> 
> Having start+length would allow a conforming implementation, because pointer
> arithmetic on unrelated objects would no longer be needed:
> 
>     unsigned long *dst = &_s_bss;
>     for (n = 0; n < _l_bss / sizeof (unsigned long); n++)
>         dst[n] = 0;

Using anything like these symbols is not conforming *already*.

You can spell out exactly what implementation behaviour your program expects
in your documentation, that is much more realistic than saying "conforming".


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Propagating addresses from linker to the runtime
  2019-10-18 14:14                 ` Segher Boessenkool
@ 2019-10-18 14:34                   ` Florian Weimer
  0 siblings, 0 replies; 26+ messages in thread
From: Florian Weimer @ 2019-10-18 14:34 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Josef Wolf, gcc-help

* Segher Boessenkool:

> On Fri, Oct 18, 2019 at 02:56:03PM +0200, Florian Weimer wrote:
>> > You just need to have linker scripts that do not sabotage this :-)
>> 
>> But it doesn't work on s390x if the section has a size that is not a
>> multiple of 2 because all global symbols must have alignment 2 or more
>> on that architecture (of course __alignof (symbol) may still say 1
>> because *that* has to reflect C semantics).
>
> How does the output section not get enough alignment?  Is that the linker's
> fault, or are the command-line flags wrong, or is it the linker script?
>
> Does the *input* section have enough alignment?

Sorry, I don't understand.  The problem is that the difference between
two global symbols is always a multiple of two.  If your section length
is odd, you can't express the section length as a difference between two
symbols.  It's not something that stems from section properties, it
comes from restrictions that relocations have.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 26+ messages in thread
