Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
@ 2015-02-18 19:24 Jeff Prothero
  2015-02-18 19:29 ` Jakub Jelinek
                   ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Jeff Prothero @ 2015-02-18 19:24 UTC (permalink / raw)
  To: gcc

Starting with gcc 4.9, -O2 implicitly invokes

    -fisolate-erroneous-paths-dereference:

which

    https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

documents as

    Detect paths that trigger erroneous or undefined behavior due to
    dereferencing a null pointer. Isolate those paths from the main control
    flow and turn the statement with erroneous or undefined behavior into a
    trap. This flag is enabled by default at -O2 and higher.

This results in a sizable number of previously working embedded programs mysteriously
crashing when recompiled under gcc 4.9.  The problem is that embedded
programs will often have ram starting at address zero (think hardware-defined
interrupt vectors, say) which gets initialized by code which the
-fisolate-erroneous-paths-deference logic can recognize as reading and/or
writing address zero.

What happens then is that the previously running program compiles without
any warnings, but then typically locks up mysteriously (often disabling the
remote debug link) due to the trap not being gracefully handled by the
embedded runtime.

Granted, such code is out-of-spec wrt to C standards.

None the less, the problem is quite painful to track down and
unexpected.

Is there any good reason the 

    -fisolate-erroneous-paths-dereference

logic could not issue a compiletime warning or error, instead of just
silently generating code virtually certain to crash at runtime?

Such a warning/error would save a lot of engineers significant amounts
of time, energy and frustration tracking down this problem.

I would like to think that the spirit of gcc is about helping engineers
efficiently correct nonstandard pain, rather than inflicting maximal
pain upon engineers violating C standards.  :-)

-Jeff

BTW, I'd also be curious to know what is regarded as engineering best
practice for writing a value to address zero when this is architecturally
required by the hardware platform at hand.  Obviously one can do various
things to obscure the process sufficiently that the current gcc implementation
won't detect it and complain, but as gcc gets smarter about optimization
those are at risk of failing in a future release.  It would be nice to have
a guaranteed-to-work future-proof idiom for doing this. Do we have one, short
of retreating to assembly code?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-18 19:24 Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference Jeff Prothero
@ 2015-02-18 19:29 ` Jakub Jelinek
  2015-02-19 20:56   ` Sandra Loosemore
  2015-02-18 19:30 ` Andrew Pinski
  2015-02-20  9:30 ` Andrew Haley
  2 siblings, 1 reply; 33+ messages in thread
From: Jakub Jelinek @ 2015-02-18 19:29 UTC (permalink / raw)
  To: Jeff Prothero; +Cc: gcc

On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
> Starting with gcc 4.9, -O2 implicitly invokes
> 
>     -fisolate-erroneous-paths-dereference:
> 
> which
> 
>     https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
> 
> documents as
> 
>     Detect paths that trigger erroneous or undefined behavior due to
>     dereferencing a null pointer. Isolate those paths from the main control
>     flow and turn the statement with erroneous or undefined behavior into a
>     trap. This flag is enabled by default at -O2 and higher.
> 
> This results in a sizable number of previously working embedded programs mysteriously
> crashing when recompiled under gcc 4.9.  The problem is that embedded
> programs will often have ram starting at address zero (think hardware-defined
> interrupt vectors, say) which gets initialized by code which the
> -fisolate-erroneous-paths-deference logic can recognize as reading and/or
> writing address zero.

If you have some pages mapped at address 0, you really should compile your
code with -fno-delete-null-pointer-checks, otherwise you can run into tons
of other issues.
Also, there is -fsanitize=undefined that allows discovery of such invalid
calls at runtime, though admittedly it isn't supported for all targets.

	Jakub

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-18 19:24 Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference Jeff Prothero
  2015-02-18 19:29 ` Jakub Jelinek
@ 2015-02-18 19:30 ` Andrew Pinski
  2015-02-20  9:30 ` Andrew Haley
  2 siblings, 0 replies; 33+ messages in thread
From: Andrew Pinski @ 2015-02-18 19:30 UTC (permalink / raw)
  To: Jeff Prothero; +Cc: GCC Mailing List

On Wed, Feb 18, 2015 at 11:21 AM, Jeff Prothero <jprother@altera.com> wrote:
>
> Starting with gcc 4.9, -O2 implicitly invokes
>
>     -fisolate-erroneous-paths-dereference:
>
> which
>
>     https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>
> documents as
>
>     Detect paths that trigger erroneous or undefined behavior due to
>     dereferencing a null pointer. Isolate those paths from the main control
>     flow and turn the statement with erroneous or undefined behavior into a
>     trap. This flag is enabled by default at -O2 and higher.
>
> This results in a sizable number of previously working embedded programs mysteriously
> crashing when recompiled under gcc 4.9.  The problem is that embedded
> programs will often have ram starting at address zero (think hardware-defined
> interrupt vectors, say) which gets initialized by code which the
> -fisolate-erroneous-paths-deference logic can recognize as reading and/or
> writing address zero.

You should have used -fno-delete-null-pointer-checks which has been
doing this optimization for a long time now, just it got better with
-fisolate-erroneous-paths-dereference pass.

Thanks,
Andrew Pinski



>
> What happens then is that the previously running program compiles without
> any warnings, but then typically locks up mysteriously (often disabling the
> remote debug link) due to the trap not being gracefully handled by the
> embedded runtime.
>
> Granted, such code is out-of-spec wrt to C standards.
>
> None the less, the problem is quite painful to track down and
> unexpected.
>
> Is there any good reason the
>
>     -fisolate-erroneous-paths-dereference
>
> logic could not issue a compiletime warning or error, instead of just
> silently generating code virtually certain to crash at runtime?
>
> Such a warning/error would save a lot of engineers significant amounts
> of time, energy and frustration tracking down this problem.
>
> I would like to think that the spirit of gcc is about helping engineers
> efficiently correct nonstandard pain, rather than inflicting maximal
> pain upon engineers violating C standards.  :-)
>
> -Jeff
>
> BTW, I'd also be curious to know what is regarded as engineering best
> practice for writing a value to address zero when this is architecturally
> required by the hardware platform at hand.  Obviously one can do various
> things to obscure the process sufficiently that the current gcc implementation
> won't detect it and complain, but as gcc gets smarter about optimization
> those are at risk of failing in a future release.  It would be nice to have
> a guaranteed-to-work future-proof idiom for doing this. Do we have one, short
> of retreating to assembly code?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-18 19:29 ` Jakub Jelinek
@ 2015-02-19 20:56   ` Sandra Loosemore
  2015-02-19 21:16     ` Daniel Gutson
                       ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Sandra Loosemore @ 2015-02-19 20:56 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jeff Prothero, gcc

Jakub Jelinek wrote:
>
> On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
>> Starting with gcc 4.9, -O2 implicitly invokes
>>
>>     -fisolate-erroneous-paths-dereference:
>>
>> which
>>
>>     https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>>
>> documents as
>>
>>     Detect paths that trigger erroneous or undefined behavior due to
>>     dereferencing a null pointer. Isolate those paths from the main control
>>     flow and turn the statement with erroneous or undefined behavior into a
>>     trap. This flag is enabled by default at -O2 and higher.
>>
>> This results in a sizable number of previously working embedded programs mysteriously
>> crashing when recompiled under gcc 4.9.  The problem is that embedded
>> programs will often have ram starting at address zero (think hardware-defined
>> interrupt vectors, say) which gets initialized by code which the
>> -fisolate-erroneous-paths-deference logic can recognize as reading and/or
>> writing address zero.
>
> If you have some pages mapped at address 0, you really should compile your
> code with -fno-delete-null-pointer-checks, otherwise you can run into tons
> of other issues.

Hmmmm,  Passing the additional option in user code would be one thing, 
but what about library code?  E.g., using memcpy (either explicitly or 
implicitly for a structure copy)?

It looks to me like cr16 and avr are currently the only architectures 
that disable flag_delete_null_pointer_checks entirely, but I am sure 
that this issue affects other embedded targets besides nios2, too.  E.g. 
scanning Mentor's ARM board support library, I see a whole pile of 
devices that have memory mapped at address zero (TI Stellaris/Tiva, 
Energy Micro EFM32Gxxx,  Atmel AT91SAMxxx, ....).  Plus our simulator 
BSPs assume a flat address space starting at address 0.

I can see both sides of the issue here....  On the one hand, you get 
better code for 99.99% of situations by enabling 
-fdelete-null-pointer-checks, but if it makes GCC unusable in the other 
.01% case, that is a problem for the users for whom that case is 
critical.  :-S  So I think  Jeff's request here is something that 
deserves an answer:

>> BTW, I'd also be curious to know what is regarded as engineering best
>> practice for writing a value to address zero when this is architecturally
>> required by the hardware platform at hand.  Obviously one can do various
>> things to obscure the process sufficiently that the current gcc implementation
>> won't detect it and complain, but as gcc gets smarter about optimization
>> those are at risk of failing in a future release.  It would be nice to have
>> a guaranteed-to-work future-proof idiom for doing this. Do we have one, short
>> of retreating to assembly code?

-Sandra

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-19 20:56   ` Sandra Loosemore
@ 2015-02-19 21:16     ` Daniel Gutson
  2015-02-19 22:10       ` Jakub Jelinek
  2015-02-19 21:23     ` Joel Sherrill
  2015-02-20 11:06     ` Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference Florian Weimer
  2 siblings, 1 reply; 33+ messages in thread
From: Daniel Gutson @ 2015-02-19 21:16 UTC (permalink / raw)
  To: Sandra Loosemore; +Cc: Jakub Jelinek, Jeff Prothero, gcc Mailing List

(Hi Sandra, so long!)

On Thu, Feb 19, 2015 at 5:56 PM, Sandra Loosemore
<sandra@codesourcery.com> wrote:
> Jakub Jelinek wrote:
>>
>>
>> On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
>>>
>>> Starting with gcc 4.9, -O2 implicitly invokes
>>>
>>>     -fisolate-erroneous-paths-dereference:
>>>
>>> which
>>>
>>>     https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>>>
>>> documents as
>>>
>>>     Detect paths that trigger erroneous or undefined behavior due to
>>>     dereferencing a null pointer. Isolate those paths from the main
>>> control
>>>     flow and turn the statement with erroneous or undefined behavior into
>>> a
>>>     trap. This flag is enabled by default at -O2 and higher.
>>>
>>> This results in a sizable number of previously working embedded programs
>>> mysteriously
>>> crashing when recompiled under gcc 4.9.  The problem is that embedded
>>> programs will often have ram starting at address zero (think
>>> hardware-defined
>>> interrupt vectors, say) which gets initialized by code which the
>>> -fisolate-erroneous-paths-deference logic can recognize as reading and/or
>>> writing address zero.
>>
>>
>> If you have some pages mapped at address 0, you really should compile your
>> code with -fno-delete-null-pointer-checks, otherwise you can run into tons
>> of other issues.
>
>
> Hmmmm,  Passing the additional option in user code would be one thing, but
> what about library code?  E.g., using memcpy (either explicitly or
> implicitly for a structure copy)?
>
> It looks to me like cr16 and avr are currently the only architectures that
> disable flag_delete_null_pointer_checks entirely, but I am sure that this
> issue affects other embedded targets besides nios2, too.  E.g. scanning
> Mentor's ARM board support library, I see a whole pile of devices that have
> memory mapped at address zero (TI Stellaris/Tiva, Energy Micro EFM32Gxxx,
> Atmel AT91SAMxxx, ....).  Plus our simulator BSPs assume a flat address
> space starting at address 0.
>
> I can see both sides of the issue here....  On the one hand, you get better
> code for 99.99% of situations by enabling -fdelete-null-pointer-checks, but
> if it makes GCC unusable in the other .01% case, that is a problem for the
> users for whom that case is critical.  :-S  So I think  Jeff's request here
> is something that deserves an answer:

what about then two warnings (disabled by default), one intended to
tell the user each time the compiler removes a conditional
(-fdelete-null-pointer-checks)
and another intended to tell the user each time the compiler adds a trap due to
dereference an address 0?

E.g.
   -Wnull-pointer-check-deleted
   -Wnull-dereference-considered-erroneous

or alike

 ?

  Daniel.

>
>
>>> BTW, I'd also be curious to know what is regarded as engineering best
>>> practice for writing a value to address zero when this is architecturally
>>> required by the hardware platform at hand.  Obviously one can do various
>>> things to obscure the process sufficiently that the current gcc
>>> implementation
>>> won't detect it and complain, but as gcc gets smarter about optimization
>>> those are at risk of failing in a future release.  It would be nice to
>>> have
>>> a guaranteed-to-work future-proof idiom for doing this. Do we have one,
>>> short
>>> of retreating to assembly code?
>
>
> -Sandra
>



-- 

Daniel F. Gutson
Chief Technology Officer, SPD

San Lorenzo 47, 3rd Floor, Office 5
Córdoba, Argentina

Phone:   +54 351 4217888 / +54 351 4218211
Skype:    dgutson
LinkedIn: http://ar.linkedin.com/in/danielgutson

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-19 20:56   ` Sandra Loosemore
  2015-02-19 21:16     ` Daniel Gutson
@ 2015-02-19 21:23     ` Joel Sherrill
  2015-02-19 21:56       ` Chris Johns
  2015-02-20 11:06     ` Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference Florian Weimer
  2 siblings, 1 reply; 33+ messages in thread
From: Joel Sherrill @ 2015-02-19 21:23 UTC (permalink / raw)
  To: Sandra Loosemore, Jakub Jelinek; +Cc: Jeff Prothero, gcc


On 2/19/2015 2:56 PM, Sandra Loosemore wrote:
> Jakub Jelinek wrote:
>> On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
>>> Starting with gcc 4.9, -O2 implicitly invokes
>>>
>>>     -fisolate-erroneous-paths-dereference:
>>>
>>> which
>>>
>>>     https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>>>
>>> documents as
>>>
>>>     Detect paths that trigger erroneous or undefined behavior due to
>>>     dereferencing a null pointer. Isolate those paths from the main control
>>>     flow and turn the statement with erroneous or undefined behavior into a
>>>     trap. This flag is enabled by default at -O2 and higher.
>>>
>>> This results in a sizable number of previously working embedded programs mysteriously
>>> crashing when recompiled under gcc 4.9.  The problem is that embedded
>>> programs will often have ram starting at address zero (think hardware-defined
>>> interrupt vectors, say) which gets initialized by code which the
>>> -fisolate-erroneous-paths-deference logic can recognize as reading and/or
>>> writing address zero.
>> If you have some pages mapped at address 0, you really should compile your
>> code with -fno-delete-null-pointer-checks, otherwise you can run into tons
>> of other issues.
> Hmmmm,  Passing the additional option in user code would be one thing, 
> but what about library code?  E.g., using memcpy (either explicitly or 
> implicitly for a structure copy)?
>
> It looks to me like cr16 and avr are currently the only architectures 
> that disable flag_delete_null_pointer_checks entirely, but I am sure 
> that this issue affects other embedded targets besides nios2, too.  E.g. 
> scanning Mentor's ARM board support library, I see a whole pile of 
> devices that have memory mapped at address zero (TI Stellaris/Tiva, 
> Energy Micro EFM32Gxxx,  Atmel AT91SAMxxx, ....).  Plus our simulator 
> BSPs assume a flat address space starting at address 0.
I forwarded this to the RTEMS list and was promptly pointed to a patch
on a Coldfire BSP where someone worked around this behavior.

We are discussing how to deal with this. It is likely OK in user code but
horrible in BSP and driver code. We don't have a solution ourselves. We
just recognize it impacts a number of targets.

> I can see both sides of the issue here....  On the one hand, you get 
> better code for 99.99% of situations by enabling 
> -fdelete-null-pointer-checks, but if it makes GCC unusable in the other 
> .01% case, that is a problem for the users for whom that case is 
> critical.  :-S  So I think  Jeff's request here is something that 
> deserves an answer:
Which side is benefiting? The embedded side or self-hosted side?
What is the actual benefit?

Can the option be disabled/enabled on a per target basis?  I honestly
could see disabling it for *-rtems* and on purely embedded
architectures.
>>> BTW, I'd also be curious to know what is regarded as engineering best
>>> practice for writing a value to address zero when this is architecturally
>>> required by the hardware platform at hand.  Obviously one can do various
>>> things to obscure the process sufficiently that the current gcc implementation
>>> won't detect it and complain, but as gcc gets smarter about optimization
>>> those are at risk of failing in a future release.  It would be nice to have
>>> a guaranteed-to-work future-proof idiom for doing this. Do we have one, short
>>> of retreating to assembly code?
Or GCC specific code.

I considered a special memcpy clone for our BSPs which was not inlined
and used a pragma to turn this optimization off. Then it would just
be a matter of using it in the right places. But knowing all the right
places
is a hard problem by itself. :(
> -Sandra
>

-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherrill@OARcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-19 21:23     ` Joel Sherrill
@ 2015-02-19 21:56       ` Chris Johns
  2015-02-20 17:30         ` Jeff Law
  0 siblings, 1 reply; 33+ messages in thread
From: Chris Johns @ 2015-02-19 21:56 UTC (permalink / raw)
  To: Joel Sherrill, Sandra Loosemore, Jakub Jelinek; +Cc: Jeff Prothero, gcc

On 20/02/2015 8:23 am, Joel Sherrill wrote:
>
> On 2/19/2015 2:56 PM, Sandra Loosemore wrote:
>> Jakub Jelinek wrote:
>>> On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
>>>> Starting with gcc 4.9, -O2 implicitly invokes
>>>>
>>>>      -fisolate-erroneous-paths-dereference:
>>>>
>>>> which
>>>>
>>>>      https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>>>>
>>>> documents as
>>>>
>>>>      Detect paths that trigger erroneous or undefined behavior due to
>>>>      dereferencing a null pointer. Isolate those paths from the main control
>>>>      flow and turn the statement with erroneous or undefined behavior into a
>>>>      trap. This flag is enabled by default at -O2 and higher.
>>>>
>>>> This results in a sizable number of previously working embedded programs mysteriously
>>>> crashing when recompiled under gcc 4.9.  The problem is that embedded
>>>> programs will often have ram starting at address zero (think hardware-defined
>>>> interrupt vectors, say) which gets initialized by code which the
>>>> -fisolate-erroneous-paths-deference logic can recognize as reading and/or
>>>> writing address zero.
>>> If you have some pages mapped at address 0, you really should compile your
>>> code with -fno-delete-null-pointer-checks, otherwise you can run into tons
>>> of other issues.
>> Hmmmm,  Passing the additional option in user code would be one thing,
>> but what about library code?  E.g., using memcpy (either explicitly or
>> implicitly for a structure copy)?
>>
>> It looks to me like cr16 and avr are currently the only architectures
>> that disable flag_delete_null_pointer_checks entirely, but I am sure
>> that this issue affects other embedded targets besides nios2, too.  E.g.
>> scanning Mentor's ARM board support library, I see a whole pile of
>> devices that have memory mapped at address zero (TI Stellaris/Tiva,
>> Energy Micro EFM32Gxxx,  Atmel AT91SAMxxx, ....).  Plus our simulator
>> BSPs assume a flat address space starting at address 0.
> I forwarded this to the RTEMS list and was promptly pointed to a patch
> on a Coldfire BSP where someone worked around this behavior.
>
> We are discussing how to deal with this. It is likely OK in user code but
> horrible in BSP and driver code. We don't have a solution ourselves. We
> just recognize it impacts a number of targets.
>

My main concern is not knowing the trap has been added to the code. If I 
could build an application and audit it somehow then I can manage it. We 
have a similar issue with the possible use of FP registers being used in 
general code (ISR save/restore trade off).

Can the ELF be annotated in some GCC specific way that makes it to the 
final executable to flag this is happening ? We can then create tools to 
audit the executables.

Chris

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-19 21:16     ` Daniel Gutson
@ 2015-02-19 22:10       ` Jakub Jelinek
  2015-02-19 22:26         ` Sandra Loosemore
  0 siblings, 1 reply; 33+ messages in thread
From: Jakub Jelinek @ 2015-02-19 22:10 UTC (permalink / raw)
  To: Daniel Gutson; +Cc: Sandra Loosemore, Jeff Prothero, gcc Mailing List

On Thu, Feb 19, 2015 at 06:16:05PM -0300, Daniel Gutson wrote:
> what about then two warnings (disabled by default), one intended to
> tell the user each time the compiler removes a conditional
> (-fdelete-null-pointer-checks)
> and another intended to tell the user each time the compiler adds a trap due to
> dereference an address 0?
> 
> E.g.
>    -Wnull-pointer-check-deleted
>    -Wnull-dereference-considered-erroneous
> 
> or alike

That would be extremely difficult.  The -fdelete-null-pointer-checks option
is used in many places, like the path isolation, value range propagation,
alias oracle, number of iteration analysis etc.  E.g. in case of value
range propagation, it is really hard to warn if something has been optimized
some way because of it, because you really don't know the reason why after
all the propagation some SSA_NAME got certain range, to warn you'd
essentially have to do all of VRP twice, once with
-fdelete-null-pointer-checks and once without, and then compare that when
actually performing optimizations.  But some optimizations are also done far
later than directly in the VRP pass.

If you have hw where NULL is mapped and you know your code violates the
C/C++ standards by placing objects at that address, simply do use
the option that is designed for that purpose.

	Jakub

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-19 22:10       ` Jakub Jelinek
@ 2015-02-19 22:26         ` Sandra Loosemore
  0 siblings, 0 replies; 33+ messages in thread
From: Sandra Loosemore @ 2015-02-19 22:26 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Daniel Gutson, Jeff Prothero, gcc Mailing List

[-- Attachment #1: Type: text/plain, Size: 929 bytes --]

On 02/19/2015 03:09 PM, Jakub Jelinek wrote:
>
> If you have hw where NULL is mapped and you know your code violates the
> C/C++ standards by placing objects at that address, simply do use
> the option that is designed for that purpose.

As I pointed out earlier, though, that won't help you if your program 
(perhaps implicitly) uses library code that's been built without the 
magic option.

I thought that declaring an object explicitly placed at address 0 as 
"volatile" ought to be sufficient to prevent writes to it from being 
optimized away and replaced with traps, but alas, that is not the case. 
  Also, a structure copy to such a volatile object is implicitly 
invoking the regular memcpy function without complaining about casting 
away volatile-ness, as an explicit call would do.  I've attached a 
couple quick test cases and .s output from arm-none-eabi-gcc -O2, using 
a mainline build from last night.

-Sandra

[-- Attachment #2: test.c --]
[-- Type: text/x-csrc, Size: 261 bytes --]

static int * const x0 = (int *) 0x00000000;
static volatile int * const xv = (int *) 0x00000000;
static int * const xn = (int *) 0x10000000;

void f0 (int *data)
{
  *x0 = *data;
}

void fv (int *data)
{
  *xv = *data;
}

void fn (int *data)
{
  *xn = *data;
}

[-- Attachment #3: test.s --]
[-- Type: text/plain, Size: 1193 bytes --]

	.cpu arm7tdmi
	.fpu softvfp
	.eabi_attribute 20, 1
	.eabi_attribute 21, 1
	.eabi_attribute 23, 3
	.eabi_attribute 24, 1
	.eabi_attribute 25, 1
	.eabi_attribute 26, 1
	.eabi_attribute 30, 2
	.eabi_attribute 34, 0
	.eabi_attribute 18, 4
	.arm
	.syntax divided
	.file	"test.c"
	.text
	.align	2
	.global	f0
	.type	f0, %function
f0:
	@ Function supports interworking.
	@ Volatile: function does not return.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	mov	r3, #0
	str	r3, [r3]
	.inst	0xe7f000f0
	.size	f0, .-f0
	.align	2
	.global	fv
	.type	fv, %function
fv:
	@ Function supports interworking.
	@ Volatile: function does not return.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	mov	r3, #0
	str	r3, [r3]
	.inst	0xe7f000f0
	.size	fv, .-fv
	.align	2
	.global	fn
	.type	fn, %function
fn:
	@ Function supports interworking.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	mov	r3, #268435456
	ldr	r2, [r0]
	str	r2, [r3]
	bx	lr
	.size	fn, .-fn
	.ident	"GCC: (GNU) 5.0.0 20150219 (experimental)"

[-- Attachment #4: test2.c --]
[-- Type: text/x-csrc, Size: 370 bytes --]

struct big {
  int a, b;
  char c [100];
};

static struct big * const x0 = (struct big *) 0x00000000;
static volatile struct big * const xv = (struct big *) 0x00000000;
static struct big * const xn = (struct big *) 0x10000000;

void f0 (struct big *data)
{
  *x0 = *data;
}

void fv (struct big *data)
{
  *xv = *data;
}

void fn (struct big *data)
{
  *xn = *data;
}


[-- Attachment #5: test2.s --]
[-- Type: text/plain, Size: 1231 bytes --]

	.cpu arm7tdmi
	.fpu softvfp
	.eabi_attribute 20, 1
	.eabi_attribute 21, 1
	.eabi_attribute 23, 3
	.eabi_attribute 24, 1
	.eabi_attribute 25, 1
	.eabi_attribute 26, 1
	.eabi_attribute 30, 2
	.eabi_attribute 34, 0
	.eabi_attribute 18, 4
	.arm
	.syntax divided
	.file	"test2.c"
	.text
	.align	2
	.global	f0
	.type	f0, %function
f0:
	@ Function supports interworking.
	@ Volatile: function does not return.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	mov	r1, r0
	stmfd	sp!, {r4, lr}
	mov	r2, #108
	mov	r0, #0
	bl	memcpy
	.inst	0xe7f000f0
	.size	f0, .-f0
	.align	2
	.global	fv
	.type	fv, %function
fv:
	@ Function supports interworking.
	@ Volatile: function does not return.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	mov	r1, r0
	stmfd	sp!, {r4, lr}
	mov	r2, #108
	mov	r0, #0
	bl	memcpy
	.inst	0xe7f000f0
	.size	fv, .-fv
	.align	2
	.global	fn
	.type	fn, %function
fn:
	@ Function supports interworking.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	stmfd	sp!, {r4, lr}
	mov	r1, r0
	mov	r2, #108
	mov	r0, #268435456
	bl	memcpy
	ldmfd	sp!, {r4, lr}
	bx	lr
	.size	fn, .-fn
	.ident	"GCC: (GNU) 5.0.0 20150219 (experimental)"

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-18 19:24 Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference Jeff Prothero
  2015-02-18 19:29 ` Jakub Jelinek
  2015-02-18 19:30 ` Andrew Pinski
@ 2015-02-20  9:30 ` Andrew Haley
  2015-02-20 11:45   ` Florian Weimer
  2 siblings, 1 reply; 33+ messages in thread
From: Andrew Haley @ 2015-02-20  9:30 UTC (permalink / raw)
  To: Jeff Prothero, gcc

On 18/02/15 19:21, Jeff Prothero wrote:

> BTW, I'd also be curious to know what is regarded as engineering
> best practice for writing a value to address zero when this is
> architecturally required by the hardware platform at hand.
> Obviously one can do various things to obscure the process
> sufficiently that the current gcc implementation won't detect it and
> complain, but as gcc gets smarter about optimization those are at
> risk of failing in a future release.  It would be nice to have a
> guaranteed-to-work future-proof idiom for doing this. Do we have
> one, short of retreating to assembly code?

I doubt that such a thing is ever going to be safe.  The idea that a
null pointer points to nothing is so hard-baked into the design of C
that you can't get away from it.  Also, almost every C programmer and
especially library writer "knows" that a null pointer points to
nothing.  The only way to initialize an interrupt vector table at
address zero is to go outside the language.  While it may be possible
to use some proprietary GCC options to get around this, it's never
going to be a good idea because some GCC author (or indeed library
author) may make the "mistake" of assuming that null pointer point to
nothing.  Using C to initialize anything at address zero is so
dangerous that we shouldn't tell people that they can use
such-and-such a GCC option to support it; we should warn them never to
do it, and provide as many warnings as we can.

IMO, engineering best practice is to use assembly code.

Andrew.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-19 20:56   ` Sandra Loosemore
  2015-02-19 21:16     ` Daniel Gutson
  2015-02-19 21:23     ` Joel Sherrill
@ 2015-02-20 11:06     ` Florian Weimer
  2015-02-20 11:43       ` Jonathan Wakely
                         ` (3 more replies)
  2 siblings, 4 replies; 33+ messages in thread
From: Florian Weimer @ 2015-02-20 11:06 UTC (permalink / raw)
  To: Sandra Loosemore, Jakub Jelinek; +Cc: Jeff Prothero, gcc

On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
> Hmmmm,  Passing the additional option in user code would be one thing,
> but what about library code?  E.g., using memcpy (either explicitly or
> implicitly for a structure copy)?

The memcpy problem isn't restricted to embedded architectures.

  size_t size;
  const unsigned char *source;
  std::vector<char> vec;
  â€¦
  vec.resize(size);
  memcpy(vec.data(), source, size);

std::vector<T>::data() can return a null pointer if the vector is empty,
which means that this code is invalid for empty inputs.

I think the C standard is wrong here.  We should extend it, as a QoI
matter, and support null pointers for variable-length inputs and outputs
if the size is 0.  But I suspect this is still a minority view.

-- 
Florian Weimer / Red Hat Product Security

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20 11:06     ` Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference Florian Weimer
@ 2015-02-20 11:43       ` Jonathan Wakely
  2015-02-20 12:05         ` Florian Weimer
  2015-02-20 17:01         ` Jeff Law
  2015-02-20 12:11       ` Jakub Jelinek
                         ` (2 subsequent siblings)
  3 siblings, 2 replies; 33+ messages in thread
From: Jonathan Wakely @ 2015-02-20 11:43 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Sandra Loosemore, Jakub Jelinek, Jeff Prothero, gcc

On 20 February 2015 at 11:06, Florian Weimer wrote:
> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
>> Hmmmm,  Passing the additional option in user code would be one thing,
>> but what about library code?  E.g., using memcpy (either explicitly or
>> implicitly for a structure copy)?
>
> The memcpy problem isn't restricted to embedded architectures.
>
>   size_t size;
>   const unsigned char *source;
>   std::vector<char> vec;
>   …
>   vec.resize(size);
>   memcpy(vec.data(), source, size);
>
> std::vector<T>::data() can return a null pointer if the vector is empty,
> which means that this code is invalid for empty inputs.
>
> I think the C standard is wrong here.  We should extend it, as a QoI
> matter, and support null pointers for variable-length inputs and outputs
> if the size is 0.  But I suspect this is still a minority view.

I'm inclined to agree.

Most developers aren't aware of the preconditions on memcpy, but GCC
optimizes aggressively based on those preconditions, so we have a
large and potentially dangerous gap between what developers expect and
what actually happens.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20  9:30 ` Andrew Haley
@ 2015-02-20 11:45   ` Florian Weimer
  2015-02-20 17:01     ` Jeff Law
  0 siblings, 1 reply; 33+ messages in thread
From: Florian Weimer @ 2015-02-20 11:45 UTC (permalink / raw)
  To: Andrew Haley, Jeff Prothero, gcc

On 02/20/2015 10:30 AM, Andrew Haley wrote:
> I doubt that such a thing is ever going to be safe.  The idea that a
> null pointer points to nothing is so hard-baked into the design of C
> that you can't get away from it.  Also, almost every C programmer and
> especially library writer "knows" that a null pointer points to
> nothing.

NULL pointer dereferences (or NULL pointer with small offsets) were
common programming idioms in the DOS days because the interrupt vector
table was located at this address.  Quite a few systems once had a
readable page zero, and (manual, I assume) optimizations for list
traversal (p != NULL && p->next != NULL â†’ p->next != NULL) were commonly
used on these systems.

I think the treatment of pointers not as addresses, but something that
has type information and provenience associated with it, came much
later, when most of the design was already settled.

-- 
Florian Weimer / Red Hat Product Security

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20 11:43       ` Jonathan Wakely
@ 2015-02-20 12:05         ` Florian Weimer
  2015-02-20 17:01         ` Jeff Law
  1 sibling, 0 replies; 33+ messages in thread
From: Florian Weimer @ 2015-02-20 12:05 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Sandra Loosemore, Jakub Jelinek, Jeff Prothero, gcc

On 02/20/2015 12:43 PM, Jonathan Wakely wrote:
> On 20 February 2015 at 11:06, Florian Weimer wrote:
>> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
>>> Hmmmm,  Passing the additional option in user code would be one thing,
>>> but what about library code?  E.g., using memcpy (either explicitly or
>>> implicitly for a structure copy)?
>>
>> The memcpy problem isn't restricted to embedded architectures.
>>
>>   size_t size;
>>   const unsigned char *source;
>>   std::vector<char> vec;
>>   â€¦
>>   vec.resize(size);
>>   memcpy(vec.data(), source, size);
>>
>> std::vector<T>::data() can return a null pointer if the vector is empty,
>> which means that this code is invalid for empty inputs.
>>
>> I think the C standard is wrong here.  We should extend it, as a QoI
>> matter, and support null pointers for variable-length inputs and outputs
>> if the size is 0.  But I suspect this is still a minority view.
> 
> I'm inclined to agree.
> 
> Most developers aren't aware of the preconditions on memcpy, but GCC
> optimizes aggressively based on those preconditions, so we have a
> large and potentially dangerous gap between what developers expect and
> what actually happens.

Maybe we can add, as a compromise, an always-inline wrapper like this?

void *memcpy(void *dst, const *void src, size_t size)
{
  if (__builtin_constant_p(size > 0) && size > 0) {
    // Or whatever else is needed as non-null assertions.
    *(char *)dst;
    *(const char *)src;
  }
  return memcpy_real(dst, src, size); // Without non-null assertion.
}

Then we'll still get the non-NULL optimization for the common positive
size case.

-- 
Florian Weimer / Red Hat Product Security

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20 11:06     ` Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference Florian Weimer
  2015-02-20 11:43       ` Jonathan Wakely
@ 2015-02-20 12:11       ` Jakub Jelinek
  2015-02-20 17:03         ` Jeff Law
  2015-02-20 12:13       ` Andrew Haley
  2015-02-20 17:03       ` Jeff Law
  3 siblings, 1 reply; 33+ messages in thread
From: Jakub Jelinek @ 2015-02-20 12:11 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Sandra Loosemore, Jeff Prothero, gcc

On Fri, Feb 20, 2015 at 12:06:28PM +0100, Florian Weimer wrote:
> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
> > Hmmmm,  Passing the additional option in user code would be one thing,
> > but what about library code?  E.g., using memcpy (either explicitly or
> > implicitly for a structure copy)?
> 
> The memcpy problem isn't restricted to embedded architectures.
> 
>   size_t size;
>   const unsigned char *source;
>   std::vector<char> vec;
>   â€¦
>   vec.resize(size);
>   memcpy(vec.data(), source, size);
> 
> std::vector<T>::data() can return a null pointer if the vector is empty,
> which means that this code is invalid for empty inputs.
> 
> I think the C standard is wrong here.  We should extend it, as a QoI
> matter, and support null pointers for variable-length inputs and outputs
> if the size is 0.  But I suspect this is still a minority view.

I disagree.  If you want a function that will have that different property,
don't call it memcpy.

	Jakub

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20 11:06     ` Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference Florian Weimer
  2015-02-20 11:43       ` Jonathan Wakely
  2015-02-20 12:11       ` Jakub Jelinek
@ 2015-02-20 12:13       ` Andrew Haley
  2015-02-20 17:03       ` Jeff Law
  3 siblings, 0 replies; 33+ messages in thread
From: Andrew Haley @ 2015-02-20 12:13 UTC (permalink / raw)
  To: Florian Weimer, Sandra Loosemore, Jakub Jelinek; +Cc: Jeff Prothero, gcc

On 02/20/2015 11:06 AM, Florian Weimer wrote:
> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
>> Hmmmm,  Passing the additional option in user code would be one thing,
>> but what about library code?  E.g., using memcpy (either explicitly or
>> implicitly for a structure copy)?
> 
> The memcpy problem isn't restricted to embedded architectures.
> 
>   size_t size;
>   const unsigned char *source;
>   std::vector<char> vec;
>   â€¦
>   vec.resize(size);
>   memcpy(vec.data(), source, size);
> 
> std::vector<T>::data() can return a null pointer if the vector is empty,
> which means that this code is invalid for empty inputs.

Sure, but if that's a bug then it's a bug in the definition of
memcpy(), not in the definition of the properties of a null pointer.
If the size is zero then it really shouldn't matter if the destination
address is null.

Andrew.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20 11:43       ` Jonathan Wakely
  2015-02-20 12:05         ` Florian Weimer
@ 2015-02-20 17:01         ` Jeff Law
  2015-02-20 17:09           ` Florian Weimer
  2015-02-20 18:01           ` Paul_Koning
  1 sibling, 2 replies; 33+ messages in thread
From: Jeff Law @ 2015-02-20 17:01 UTC (permalink / raw)
  To: Jonathan Wakely, Florian Weimer
  Cc: Sandra Loosemore, Jakub Jelinek, Jeff Prothero, gcc

On 02/20/15 04:43, Jonathan Wakely wrote:
> On 20 February 2015 at 11:06, Florian Weimer wrote:
>> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
>>> Hmmmm,  Passing the additional option in user code would be one thing,
>>> but what about library code?  E.g., using memcpy (either explicitly or
>>> implicitly for a structure copy)?
>>
>> The memcpy problem isn't restricted to embedded architectures.
>>
>>    size_t size;
>>    const unsigned char *source;
>>    std::vector<char> vec;
>>    â€¦
>>    vec.resize(size);
>>    memcpy(vec.data(), source, size);
>>
>> std::vector<T>::data() can return a null pointer if the vector is empty,
>> which means that this code is invalid for empty inputs.
>>
>> I think the C standard is wrong here.  We should extend it, as a QoI
>> matter, and support null pointers for variable-length inputs and outputs
>> if the size is 0.  But I suspect this is still a minority view.
>
> I'm inclined to agree.
>
> Most developers aren't aware of the preconditions on memcpy, but GCC
> optimizes aggressively based on those preconditions, so we have a
> large and potentially dangerous gap between what developers expect and
> what actually happens.
But that's always true -- this isn't any different than aliasing, 
arithmetic overflow, etc.  The standards define the contract between 
the compiler/library implementors and the developers.  Once the contract 
is broken, all bets are off.


jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20 11:45   ` Florian Weimer
@ 2015-02-20 17:01     ` Jeff Law
  2015-02-20 18:07       ` Paul_Koning
  0 siblings, 1 reply; 33+ messages in thread
From: Jeff Law @ 2015-02-20 17:01 UTC (permalink / raw)
  To: Florian Weimer, Andrew Haley, Jeff Prothero, gcc

On 02/20/15 04:45, Florian Weimer wrote:
> On 02/20/2015 10:30 AM, Andrew Haley wrote:
>> I doubt that such a thing is ever going to be safe.  The idea that a
>> null pointer points to nothing is so hard-baked into the design of C
>> that you can't get away from it.  Also, almost every C programmer and
>> especially library writer "knows" that a null pointer points to
>> nothing.
>
> NULL pointer dereferences (or NULL pointer with small offsets) were
> common programming idioms in the DOS days because the interrupt vector
> table was located at this address.  Quite a few systems once had a
> readable page zero, and (manual, I assume) optimizations for list
> traversal (p != NULL && p->next != NULL â†’ p->next != NULL) were commonly
> used on these systems.
True, but thankfully this isn't blessed anymore.


>
> I think the treatment of pointers not as addresses, but something that
> has type information and provenience associated with it, came much
> later, when most of the design was already settled.
We still have targets where page0 is special.  The H8 for example comes 
to mind.  Folks regularly place data into page0 and mark it as special 
so the compiler emits more efficient sequences to access that data.

Regardless, the right thing to do is to disable elimination of NULL 
pointer checks on targets where page 0 is mapped and thus a reference to 
*0 may not fault.  In my mind this is an attribute of both the processor 
(see H8 above) and/or the target OS.

On those targets the C-runtime had better also ensure that its headers 
aren't decorated with non-null attributes, particularly for the mem* and 
str* functions.

jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20 12:11       ` Jakub Jelinek
@ 2015-02-20 17:03         ` Jeff Law
  2015-03-03 19:57           ` Martin Sebor
  0 siblings, 1 reply; 33+ messages in thread
From: Jeff Law @ 2015-02-20 17:03 UTC (permalink / raw)
  To: Jakub Jelinek, Florian Weimer; +Cc: Sandra Loosemore, Jeff Prothero, gcc

On 02/20/15 05:10, Jakub Jelinek wrote:
> On Fri, Feb 20, 2015 at 12:06:28PM +0100, Florian Weimer wrote:
>> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
>>> Hmmmm,  Passing the additional option in user code would be one thing,
>>> but what about library code?  E.g., using memcpy (either explicitly or
>>> implicitly for a structure copy)?
>>
>> The memcpy problem isn't restricted to embedded architectures.
>>
>>    size_t size;
>>    const unsigned char *source;
>>    std::vector<char> vec;
>>    â€¦
>>    vec.resize(size);
>>    memcpy(vec.data(), source, size);
>>
>> std::vector<T>::data() can return a null pointer if the vector is empty,
>> which means that this code is invalid for empty inputs.
>>
>> I think the C standard is wrong here.  We should extend it, as a QoI
>> matter, and support null pointers for variable-length inputs and outputs
>> if the size is 0.  But I suspect this is still a minority view.
>
> I disagree.  If you want a function that will have that different property,
> don't call it memcpy.
Right.  If someone wants to take it up with the Austin group, that's 
fine. But until/unless the Austin group blesses, I don't think we should 
extend as a QoI matter.

jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20 11:06     ` Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference Florian Weimer
                         ` (2 preceding siblings ...)
  2015-02-20 12:13       ` Andrew Haley
@ 2015-02-20 17:03       ` Jeff Law
  3 siblings, 0 replies; 33+ messages in thread
From: Jeff Law @ 2015-02-20 17:03 UTC (permalink / raw)
  To: Florian Weimer, Sandra Loosemore, Jakub Jelinek; +Cc: Jeff Prothero, gcc

On 02/20/15 04:06, Florian Weimer wrote:
> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
>> Hmmmm,  Passing the additional option in user code would be one thing,
>> but what about library code?  E.g., using memcpy (either explicitly or
>> implicitly for a structure copy)?
>
> The memcpy problem isn't restricted to embedded architectures.
>
>    size_t size;
>    const unsigned char *source;
>    std::vector<char> vec;
>    â€¦
>    vec.resize(size);
>    memcpy(vec.data(), source, size);
>
> std::vector<T>::data() can return a null pointer if the vector is empty,
> which means that this code is invalid for empty inputs.
>
> I think the C standard is wrong here.  We should extend it, as a QoI
> matter, and support null pointers for variable-length inputs and outputs
> if the size is 0.  But I suspect this is still a minority view.
And it's these kind of uses that scare me the most.

jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20 17:01         ` Jeff Law
@ 2015-02-20 17:09           ` Florian Weimer
  2015-02-20 17:32             ` Jeff Law
  2015-02-20 18:01           ` Paul_Koning
  1 sibling, 1 reply; 33+ messages in thread
From: Florian Weimer @ 2015-02-20 17:09 UTC (permalink / raw)
  To: Jeff Law, Jonathan Wakely
  Cc: Sandra Loosemore, Jakub Jelinek, Jeff Prothero, gcc

On 02/20/2015 06:01 PM, Jeff Law wrote:

> But that's always true -- this isn't any different than aliasing,
> arithmetic overflow, etc.  The standards define the contract between the
> compiler/library implementors and the developers.  Once the contract is
> broken, all bets are off.

What I don't like about this case (std::vector<T>::data() returning
nullptr vs memcpy/memcmp/qsort non-null assertions) is that it is
internally non-composing in a totally non-obvious way.  data() is
explicitly intended to cover interoperability with these older C
functions, and it fails.

But you are right about overflows.  I think we should give up and just
enable -fwrapv by default in Fedora and downstream.  This issue has been
explicitly documented since 2002 at least (explicitly with
security-related checks in mind), and programmers still write overflow
checks which are only correct with -fwrapv, and it passes code review.
I fear that's not going to change, ever.

-- 
Florian Weimer / Red Hat Product Security

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-19 21:56       ` Chris Johns
@ 2015-02-20 17:30         ` Jeff Law
  2015-02-26 16:23           ` David Malcolm
  0 siblings, 1 reply; 33+ messages in thread
From: Jeff Law @ 2015-02-20 17:30 UTC (permalink / raw)
  To: Chris Johns, Joel Sherrill, Sandra Loosemore, Jakub Jelinek
  Cc: Jeff Prothero, gcc

On 02/19/15 14:56, Chris Johns wrote:
> On 20/02/2015 8:23 am, Joel Sherrill wrote:
>>
>> On 2/19/2015 2:56 PM, Sandra Loosemore wrote:
>>> Jakub Jelinek wrote:
>>>> On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
>>>>> Starting with gcc 4.9, -O2 implicitly invokes
>>>>>
>>>>>      -fisolate-erroneous-paths-dereference:
>>>>>
>>>>> which
>>>>>
>>>>>      https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>>>>>
>>>>> documents as
>>>>>
>>>>>      Detect paths that trigger erroneous or undefined behavior due to
>>>>>      dereferencing a null pointer. Isolate those paths from the
>>>>> main control
>>>>>      flow and turn the statement with erroneous or undefined
>>>>> behavior into a
>>>>>      trap. This flag is enabled by default at -O2 and higher.
>>>>>
>>>>> This results in a sizable number of previously working embedded
>>>>> programs mysteriously
>>>>> crashing when recompiled under gcc 4.9.  The problem is that embedded
>>>>> programs will often have ram starting at address zero (think
>>>>> hardware-defined
>>>>> interrupt vectors, say) which gets initialized by code which the
>>>>> -fisolate-erroneous-paths-deference logic can recognize as reading
>>>>> and/or
>>>>> writing address zero.
>>>> If you have some pages mapped at address 0, you really should
>>>> compile your
>>>> code with -fno-delete-null-pointer-checks, otherwise you can run
>>>> into tons
>>>> of other issues.
>>> Hmmmm,  Passing the additional option in user code would be one thing,
>>> but what about library code?  E.g., using memcpy (either explicitly or
>>> implicitly for a structure copy)?
>>>
>>> It looks to me like cr16 and avr are currently the only architectures
>>> that disable flag_delete_null_pointer_checks entirely, but I am sure
>>> that this issue affects other embedded targets besides nios2, too.  E.g.
>>> scanning Mentor's ARM board support library, I see a whole pile of
>>> devices that have memory mapped at address zero (TI Stellaris/Tiva,
>>> Energy Micro EFM32Gxxx,  Atmel AT91SAMxxx, ....).  Plus our simulator
>>> BSPs assume a flat address space starting at address 0.
>> I forwarded this to the RTEMS list and was promptly pointed to a patch
>> on a Coldfire BSP where someone worked around this behavior.
>>
>> We are discussing how to deal with this. It is likely OK in user code but
>> horrible in BSP and driver code. We don't have a solution ourselves. We
>> just recognize it impacts a number of targets.
>>
>
> My main concern is not knowing the trap has been added to the code. If I
> could build an application and audit it somehow then I can manage it. We
> have a similar issue with the possible use of FP registers being used in
> general code (ISR save/restore trade off).
>
> Can the ELF be annotated in some GCC specific way that makes it to the
> final executable to flag this is happening ? We can then create tools to
> audit the executables.
Not really, for a variety of reasons.  However, the compiler can do 
better for warning about some of these kinds of things -- but we 
certainly can't guarantee we catch all of them as there are cases where 
the point where we determine a property (such as non-nullness) may be 
very different from the point where we exploit that property.

I did propose some patches to improve the warnings back in the 4.9 time 
frame, but they never got reviewed.  See BZ 16351.   We'll have to 
revisit them during the next open development period.

Jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20 17:09           ` Florian Weimer
@ 2015-02-20 17:32             ` Jeff Law
  0 siblings, 0 replies; 33+ messages in thread
From: Jeff Law @ 2015-02-20 17:32 UTC (permalink / raw)
  To: Florian Weimer, Jonathan Wakely
  Cc: Sandra Loosemore, Jakub Jelinek, Jeff Prothero, gcc

On 02/20/15 10:09, Florian Weimer wrote:
> On 02/20/2015 06:01 PM, Jeff Law wrote:
>
>> But that's always true -- this isn't any different than aliasing,
>> arithmetic overflow, etc.  The standards define the contract between the
>> compiler/library implementors and the developers.  Once the contract is
>> broken, all bets are off.
>
> What I don't like about this case (std::vector<T>::data() returning
> nullptr vs memcpy/memcmp/qsort non-null assertions) is that it is
> internally non-composing in a totally non-obvious way.  data() is
> explicitly intended to cover interoperability with these older C
> functions, and it fails.
And that's precisely why I consider this class of issues the most 
problematical.

Jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20 17:01         ` Jeff Law
  2015-02-20 17:09           ` Florian Weimer
@ 2015-02-20 18:01           ` Paul_Koning
  1 sibling, 0 replies; 33+ messages in thread
From: Paul_Koning @ 2015-02-20 18:01 UTC (permalink / raw)
  To: law; +Cc: gcc

> On Feb 20, 2015, at 12:01 PM, Jeff Law <law@redhat.com> wrote:
> 
> On 02/20/15 04:43, Jonathan Wakely wrote:
>>> ...
>> 
>> I'm inclined to agree.
>> 
>> Most developers aren't aware of the preconditions on memcpy, but GCC
>> optimizes aggressively based on those preconditions, so we have a
>> large and potentially dangerous gap between what developers expect and
>> what actually happens.
> But that's always true -- this isn't any different than aliasing, arithmetic overflow, etc.  The standards define the contract between the compiler/library implementors and the developers.  Once the contract is broken, all bets are off.

True.  The unfortunate problem with C, and even more so with C++, is that the contract is so large and complex that few, if any, are skilled enough language lawyers to know what exactly it says.  For that matter, the contract (the standard) is large and complex enough that it has bugs and ambiguities, so the contract is not in fact precisely defined.

There’s a nice paper that drives this home: http://people.csail.mit.edu/akcheung/papers/apsys12.pdf

For example, while most people know about the “no overlaps” rule of memcpy, stuff like aliasing are far more obscure. Or the exact meaning (if there is one) of “volatile”.

It also doesn’t help that a large fraction of the contract is unenforced.  You only find out about it when a new version of the compiler starts using a particular rule to make an optimization that suddenly breaks 10 year old code. I remember some heated debates between Linux folks and compiler builders when such things strike the Linux kernel.

	paul

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20 17:01     ` Jeff Law
@ 2015-02-20 18:07       ` Paul_Koning
  0 siblings, 0 replies; 33+ messages in thread
From: Paul_Koning @ 2015-02-20 18:07 UTC (permalink / raw)
  To: law; +Cc: fweimer, aph, jprother, gcc


> On Feb 20, 2015, at 12:01 PM, Jeff Law <law@redhat.com> wrote:
> 
> ...
> Regardless, the right thing to do is to disable elimination of NULL pointer checks on targets where page 0 is mapped and thus a reference to *0 may not fault.  In my mind this is an attribute of both the processor (see H8 above) and/or the target OS.
> 
> On those targets the C-runtime had better also ensure that its headers aren't decorated with non-null attributes, particularly for the mem* and str* functions.

pdp11 is an example of such a target, independent of OS (with only 8 pages, clearly no one is going to unmap page 0).  Fortunately there one is unlikely to find C program data structures at 0 (instead, vectors if kernel, stack limit if user mode).  So no fault is correct there, but no null (0 bits) pointer is also — for practical purposes — a valid assumption.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20 17:30         ` Jeff Law
@ 2015-02-26 16:23           ` David Malcolm
  2015-02-27 20:55             ` [RFC/patch for stage1] Embed compiler dumps into generated .o files (was Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference) David Malcolm
  0 siblings, 1 reply; 33+ messages in thread
From: David Malcolm @ 2015-02-26 16:23 UTC (permalink / raw)
  To: Jeff Law
  Cc: Chris Johns, Joel Sherrill, Sandra Loosemore, Jakub Jelinek,
	Jeff Prothero, gcc

On Fri, 2015-02-20 at 10:29 -0700, Jeff Law wrote:
> On 02/19/15 14:56, Chris Johns wrote:
> > On 20/02/2015 8:23 am, Joel Sherrill wrote:
> >>
> >> On 2/19/2015 2:56 PM, Sandra Loosemore wrote:
> >>> Jakub Jelinek wrote:
> >>>> On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
> >>>>> Starting with gcc 4.9, -O2 implicitly invokes
> >>>>>
> >>>>>      -fisolate-erroneous-paths-dereference:
> >>>>>
> >>>>> which
> >>>>>
> >>>>>      https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
> >>>>>
> >>>>> documents as
> >>>>>
> >>>>>      Detect paths that trigger erroneous or undefined behavior due to
> >>>>>      dereferencing a null pointer. Isolate those paths from the
> >>>>> main control
> >>>>>      flow and turn the statement with erroneous or undefined
> >>>>> behavior into a
> >>>>>      trap. This flag is enabled by default at -O2 and higher.
> >>>>>
> >>>>> This results in a sizable number of previously working embedded
> >>>>> programs mysteriously
> >>>>> crashing when recompiled under gcc 4.9.  The problem is that embedded
> >>>>> programs will often have ram starting at address zero (think
> >>>>> hardware-defined
> >>>>> interrupt vectors, say) which gets initialized by code which the
> >>>>> -fisolate-erroneous-paths-deference logic can recognize as reading
> >>>>> and/or
> >>>>> writing address zero.
> >>>> If you have some pages mapped at address 0, you really should
> >>>> compile your
> >>>> code with -fno-delete-null-pointer-checks, otherwise you can run
> >>>> into tons
> >>>> of other issues.
> >>> Hmmmm,  Passing the additional option in user code would be one thing,
> >>> but what about library code?  E.g., using memcpy (either explicitly or
> >>> implicitly for a structure copy)?
> >>>
> >>> It looks to me like cr16 and avr are currently the only architectures
> >>> that disable flag_delete_null_pointer_checks entirely, but I am sure
> >>> that this issue affects other embedded targets besides nios2, too.  E.g.
> >>> scanning Mentor's ARM board support library, I see a whole pile of
> >>> devices that have memory mapped at address zero (TI Stellaris/Tiva,
> >>> Energy Micro EFM32Gxxx,  Atmel AT91SAMxxx, ....).  Plus our simulator
> >>> BSPs assume a flat address space starting at address 0.
> >> I forwarded this to the RTEMS list and was promptly pointed to a patch
> >> on a Coldfire BSP where someone worked around this behavior.
> >>
> >> We are discussing how to deal with this. It is likely OK in user code but
> >> horrible in BSP and driver code. We don't have a solution ourselves. We
> >> just recognize it impacts a number of targets.
> >>
> >
> > My main concern is not knowing the trap has been added to the code. If I
> > could build an application and audit it somehow then I can manage it. We
> > have a similar issue with the possible use of FP registers being used in
> > general code (ISR save/restore trade off).
> >
> > Can the ELF be annotated in some GCC specific way that makes it to the
> > final executable to flag this is happening ? We can then create tools to
> > audit the executables.
> Not really, for a variety of reasons.

Is information on this reaching the pass-specific dumpfile?  I don't see
any explicit dumping in gimple-ssa-isolate-paths.c, but I guess that
insert_trap_and_remove_trailing_statements could log itself to the
dumpfile, or use the statistics framework (which itself also reaches a
dumpfile).

Assuming the info is reaching a dumpfile, could gcc have an option to
write its dumpfiles into a special ELF section in the .s, rather than to
disk?

Then (given a suitable new option to e.g. eu-readelf) you'd be able to
read the dumpfiles from a .o file, and (handwaving about linkage) from
an execuable or library.

Not that I'm volunteering...

> However, the compiler can do 
> better for warning about some of these kinds of things -- but we 
> certainly can't guarantee we catch all of them as there are cases where 
> the point where we determine a property (such as non-nullness) may be 
> very different from the point where we exploit that property.
> 
> I did propose some patches to improve the warnings back in the 4.9 time 
> frame, but they never got reviewed.  See BZ 16351.   We'll have to 
> revisit them during the next open development period.
> 
> Jeff


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [RFC/patch for stage1] Embed compiler dumps into generated .o files (was Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference)
  2015-02-26 16:23           ` David Malcolm
@ 2015-02-27 20:55             ` David Malcolm
  0 siblings, 0 replies; 33+ messages in thread
From: David Malcolm @ 2015-02-27 20:55 UTC (permalink / raw)
  To: Jeff Law
  Cc: Chris Johns, Joel Sherrill, Sandra Loosemore, Jakub Jelinek,
	Jeff Prothero, gcc, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5863 bytes --]

On Thu, 2015-02-26 at 11:17 -0500, David Malcolm wrote:
> On Fri, 2015-02-20 at 10:29 -0700, Jeff Law wrote:
> > On 02/19/15 14:56, Chris Johns wrote:
> > > On 20/02/2015 8:23 am, Joel Sherrill wrote:
> > >>
> > >> On 2/19/2015 2:56 PM, Sandra Loosemore wrote:
> > >>> Jakub Jelinek wrote:
> > >>>> On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
> > >>>>> Starting with gcc 4.9, -O2 implicitly invokes
> > >>>>>
> > >>>>>      -fisolate-erroneous-paths-dereference:
> > >>>>>
> > >>>>> which
> > >>>>>
> > >>>>>      https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
> > >>>>>
> > >>>>> documents as
> > >>>>>
> > >>>>>      Detect paths that trigger erroneous or undefined behavior due to
> > >>>>>      dereferencing a null pointer. Isolate those paths from the
> > >>>>> main control
> > >>>>>      flow and turn the statement with erroneous or undefined
> > >>>>> behavior into a
> > >>>>>      trap. This flag is enabled by default at -O2 and higher.
> > >>>>>
> > >>>>> This results in a sizable number of previously working embedded
> > >>>>> programs mysteriously
> > >>>>> crashing when recompiled under gcc 4.9.  The problem is that embedded
> > >>>>> programs will often have ram starting at address zero (think
> > >>>>> hardware-defined
> > >>>>> interrupt vectors, say) which gets initialized by code which the
> > >>>>> -fisolate-erroneous-paths-deference logic can recognize as reading
> > >>>>> and/or
> > >>>>> writing address zero.
> > >>>> If you have some pages mapped at address 0, you really should
> > >>>> compile your
> > >>>> code with -fno-delete-null-pointer-checks, otherwise you can run
> > >>>> into tons
> > >>>> of other issues.
> > >>> Hmmmm,  Passing the additional option in user code would be one thing,
> > >>> but what about library code?  E.g., using memcpy (either explicitly or
> > >>> implicitly for a structure copy)?
> > >>>
> > >>> It looks to me like cr16 and avr are currently the only architectures
> > >>> that disable flag_delete_null_pointer_checks entirely, but I am sure
> > >>> that this issue affects other embedded targets besides nios2, too.  E.g.
> > >>> scanning Mentor's ARM board support library, I see a whole pile of
> > >>> devices that have memory mapped at address zero (TI Stellaris/Tiva,
> > >>> Energy Micro EFM32Gxxx,  Atmel AT91SAMxxx, ....).  Plus our simulator
> > >>> BSPs assume a flat address space starting at address 0.
> > >> I forwarded this to the RTEMS list and was promptly pointed to a patch
> > >> on a Coldfire BSP where someone worked around this behavior.
> > >>
> > >> We are discussing how to deal with this. It is likely OK in user code but
> > >> horrible in BSP and driver code. We don't have a solution ourselves. We
> > >> just recognize it impacts a number of targets.
> > >>
> > >
> > > My main concern is not knowing the trap has been added to the code. If I
> > > could build an application and audit it somehow then I can manage it. We
> > > have a similar issue with the possible use of FP registers being used in
> > > general code (ISR save/restore trade off).
> > >
> > > Can the ELF be annotated in some GCC specific way that makes it to the
> > > final executable to flag this is happening ? We can then create tools to
> > > audit the executables.
> > Not really, for a variety of reasons.
> 
> Is information on this reaching the pass-specific dumpfile?  I don't see
> any explicit dumping in gimple-ssa-isolate-paths.c, but I guess that
> insert_trap_and_remove_trailing_statements could log itself to the
> dumpfile, or use the statistics framework (which itself also reaches a
> dumpfile).
> 
> Assuming the info is reaching a dumpfile, could gcc have an option to
> write its dumpfiles into a special ELF section in the .s, rather than to
> disk?
> 
> Then (given a suitable new option to e.g. eu-readelf) you'd be able to
> read the dumpfiles from a .o file, and (handwaving about linkage) from
> an execuable or library.
> 
> Not that I'm volunteering...

Perhaps foolishly I had a go at prototyping this; attached is a
proof-of-concept patch (albeit with FIXMEs and no ChangeLog or testsuite
coverage).

When writing out the final asm, each dumpfile is read, and embedded into
its own section.  Manual review of the built .o file shows that the
dumpfiles make it into them, e.g.:

$ eu-readelf -x .note.GNU-dump.tree-switchconv smoketest.o|head

Hex dump of section [28] '.note.GNU-dump.tree-switchconv', 2698 bytes at offset 0xf021:
  0x00000000 0a3b3b20 46756e63 74696f6e 20746573 .;; Function tes
  0x00000010 745f7068 69202874 6573745f 7068692c t_phi (test_phi,
  0x00000020 2066756e 63646566 5f6e6f3d 302c2064  funcdef_no=0, d
  0x00000030 65636c5f 7569643d 31383332 2c206367 ecl_uid=1832, cg
  0x00000040 72617068 5f756964 3d302c20 73796d62 raph_uid=0, symb
  0x00000050 6f6c5f6f 72646572 3d30290a 0a746573 ol_order=0)..tes
  0x00000060 745f7068 69202869 6e742069 2c20696e t_phi (int i, in
  0x00000070 74206a29 0a7b0a20 20696e74 206b3b0a t j).{.  int k;.

(again, handwaving over what happens at link time).

There could be an analogue here with debuginfo: the sections would be
big and something you'd generally want to strip, but potentially very
useful (e.g. for finding every place across a large number of compiles
where a particular optimization happened).

> > However, the compiler can do 
> > better for warning about some of these kinds of things -- but we 
> > certainly can't guarantee we catch all of them as there are cases where 
> > the point where we determine a property (such as non-nullness) may be 
> > very different from the point where we exploit that property.
> > 
> > I did propose some patches to improve the warnings back in the 4.9 time 
> > frame, but they never got reviewed.  See BZ 16351.   We'll have to 
> > revisit them during the next open development period.
> > 
> > Jeff
> 


[-- Attachment #2: dumpfile-hack.patch --]
[-- Type: text/x-patch, Size: 3793 bytes --]

commit a464b717cc87ba64456ef52d7829c478e6e58e91
Author: David Malcolm <dmalcolm@redhat.com>
Date:   Fri Feb 27 15:52:40 2015 -0500

    Prototype of embedding dumpfiles within .s/.o

diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c
index 743344e5..1456aca 100644
--- a/gcc/dumpfile.c
+++ b/gcc/dumpfile.c
@@ -302,6 +302,15 @@ get_dump_file_name (struct dump_file_info *dfi) const
   return concat (dump_base_name, dump_id, dfi->suffix, NULL);
 }
 
+void
+gcc::dump_manager::for_each_dumpfile (void (*cb) (struct dump_file_info *,
+						  void *user_data),
+				      void *user_data)
+{
+  for (unsigned i = 0; i < m_extra_dump_files_in_use; i++)
+    (*cb) (&m_extra_dump_files[i], user_data);
+}
+
 /* For a given DFI, open an alternate dump filename (which could also
    be a standard stream such as stdout/stderr). If the alternate dump
    file cannot be opened, return NULL.  */
diff --git a/gcc/dumpfile.h b/gcc/dumpfile.h
index 40b6473..ebb7eba 100644
--- a/gcc/dumpfile.h
+++ b/gcc/dumpfile.h
@@ -220,6 +220,11 @@ public:
   const char *
   dump_flag_name (int phase) const;
 
+  void
+  for_each_dumpfile (void (*cb) (struct dump_file_info *,
+				 void *user_data),
+		     void *user_data);
+
 private:
 
   int
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 99cf180..10bfb24 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -114,6 +114,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "optabs.h"
 #include "tree-chkp.h"
 #include "omp-low.h"
+#include "dumpfile.h"
 
 #if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO)
 #include "dbxout.h"
@@ -580,6 +581,84 @@ emit_debug_global_declarations (tree *vec, int len)
   timevar_pop (TV_SYMOUT);
 }
 
+/* FIXME: this is just a copy of jit-playback.c's
+   playback::context::read_dump_file with the error-handling
+   hacked out.  */
+
+static char *
+read_dump_file (const char *path)
+{
+  char *result = NULL;
+  size_t total_sz = 0;
+  char buf[4096];
+  size_t sz;
+  FILE *f_in;
+
+  f_in = fopen (path, "r");
+  if (!f_in)
+    return NULL;
+
+  while ( (sz = fread (buf, 1, sizeof (buf), f_in)) )
+    {
+      size_t old_total_sz = total_sz;
+      total_sz += sz;
+      result = reinterpret_cast <char *> (xrealloc (result, total_sz + 1));
+      memcpy (result + old_total_sz, buf, sz);
+    }
+
+  if (!feof (f_in))
+    {
+      free (result);
+      fclose (f_in);
+      return NULL;
+    }
+
+  fclose (f_in);
+
+  if (result)
+    {
+      result[total_sz] = '\0';
+      return result;
+    }
+  else
+    return xstrdup ("");
+}
+
+/* FIXME: should have a leading comment.  */
+
+static void
+embed_dumpfiles_within_asm ()
+{
+  class callback
+  {
+  public:
+    static void fn (struct dump_file_info *dfi,
+		    void */*user_data*/)
+    {
+      char *filename;
+      char *content;
+
+      filename = g->get_dumps ()->get_dump_file_name (dfi);
+      content = read_dump_file (filename);
+
+      /* FIXME: what about error handling?  */
+      if (content)
+	{
+	  unsigned int flags = SECTION_DEBUG;
+	  /* FIXME: name of section?  */
+	  char *section_name = concat (".note.GNU-dump.", dfi->swtch, NULL);
+	  switch_to_section (get_section (section_name, flags, NULL));
+	  assemble_string (content, strlen (content));
+	  free (section_name);
+	}
+      free (filename);
+      free (content);
+    }
+  };
+
+  g->get_dumps ()->for_each_dumpfile (callback::fn, NULL);
+}
+
 /* Compile an entire translation unit.  Write a file of assembly
    output and various debugging dumps.  */
 
@@ -711,6 +790,9 @@ compile_file (void)
       targetm.asm_out.output_ident (ident_str);
     }
 
+  /* FIXME: should only be enabled if an option has been set.  */
+  embed_dumpfiles_within_asm ();
+
   /* Auto profile finalization. */
   if (flag_auto_profile)
     end_auto_profile ();

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-20 17:03         ` Jeff Law
@ 2015-03-03 19:57           ` Martin Sebor
  2015-03-03 23:38             ` Jeff Law
  0 siblings, 1 reply; 33+ messages in thread
From: Martin Sebor @ 2015-03-03 19:57 UTC (permalink / raw)
  To: Jeff Law, Jakub Jelinek, Florian Weimer
  Cc: Sandra Loosemore, Jeff Prothero, gcc

On 02/20/2015 10:01 AM, Jeff Law wrote:
> On 02/20/15 05:10, Jakub Jelinek wrote:
>> On Fri, Feb 20, 2015 at 12:06:28PM +0100, Florian Weimer wrote:
>>> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
>>>> Hmmmm,  Passing the additional option in user code would be one thing,
>>>> but what about library code?  E.g., using memcpy (either explicitly or
>>>> implicitly for a structure copy)?
>>>
>>> The memcpy problem isn't restricted to embedded architectures.
>>>
>>>    size_t size;
>>>    const unsigned char *source;
>>>    std::vector<char> vec;
>>>    â€¦
>>>    vec.resize(size);
>>>    memcpy(vec.data(), source, size);
>>>
>>> std::vector<T>::data() can return a null pointer if the vector is empty,
>>> which means that this code is invalid for empty inputs.
>>>
>>> I think the C standard is wrong here.  We should extend it, as a QoI
>>> matter, and support null pointers for variable-length inputs and outputs
>>> if the size is 0.  But I suspect this is still a minority view.
>>
>> I disagree.  If you want a function that will have that different
>> property,
>> don't call it memcpy.
> Right.  If someone wants to take it up with the Austin group, that's
> fine. But until/unless the Austin group blesses, I don't think we should
> extend as a QoI matter.

As a data point(*) it might be interesting to note that GCC itself
relies on memcpy providing stronger guarantees than the C standard
requires it to by emitting calls to the function for large structure
self-assignments (which are strictly conforming, as discussed in bug
65029).

Martin

[*] IMO, one in favor of tightening up the memcpy specification
to require implementations to provide the expected semantics.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-03-03 19:57           ` Martin Sebor
@ 2015-03-03 23:38             ` Jeff Law
  2015-03-04 12:36               ` Richard Biener
  0 siblings, 1 reply; 33+ messages in thread
From: Jeff Law @ 2015-03-03 23:38 UTC (permalink / raw)
  To: Martin Sebor, Jakub Jelinek, Florian Weimer
  Cc: Sandra Loosemore, Jeff Prothero, gcc

On 03/03/15 12:57, Martin Sebor wrote:
>
> As a data point(*) it might be interesting to note that GCC itself
> relies on memcpy providing stronger guarantees than the C standard
> requires it to by emitting calls to the function for large structure
> self-assignments (which are strictly conforming, as discussed in bug
> 65029).
Right.  I actually spent quite a bit of time struggling with this a 
while back in a different context.   The only case I could come up with 
where GCC would generate an overlapping memcpy was self assignment, but 
even that was bad and while we ultimately punted, I've always considered 
it a wart.


  [*] IMO, one in favor of tightening up the memcpy specification
> to require implementations to provide the expected semantics.
That works for me :-)

The things done in glibc's memcpy are a bit on the absurd side and the 
pain caused by the changes over time is almost impossible to overstate. 
  If the Austin group tightens memcpy to require fewer surprises I think 
most developers would ultimately be happy with the result -- a few would 
complain about the performance impacts for specific workloads, but I 
suspect they'd be in the minority.


jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-03-03 23:38             ` Jeff Law
@ 2015-03-04 12:36               ` Richard Biener
  0 siblings, 0 replies; 33+ messages in thread
From: Richard Biener @ 2015-03-04 12:36 UTC (permalink / raw)
  To: Jeff Law
  Cc: Martin Sebor, Jakub Jelinek, Florian Weimer, Sandra Loosemore,
	Jeff Prothero, GCC Development

On Wed, Mar 4, 2015 at 12:38 AM, Jeff Law <law@redhat.com> wrote:
> On 03/03/15 12:57, Martin Sebor wrote:
>>
>>
>> As a data point(*) it might be interesting to note that GCC itself
>> relies on memcpy providing stronger guarantees than the C standard
>> requires it to by emitting calls to the function for large structure
>> self-assignments (which are strictly conforming, as discussed in bug
>> 65029).
>
> Right.  I actually spent quite a bit of time struggling with this a while
> back in a different context.   The only case I could come up with where GCC
> would generate an overlapping memcpy was self assignment, but even that was
> bad and while we ultimately punted, I've always considered it a wart.

?

struct A { int large[100]; };
void foo (struct A *x, struct A *y)
{
  *x = *y;
}

call it as foo (&a, &a);  (on x86 you need -mstringop-strategy=libcall,
even at -O0, to emit a memcpy call)

The self-assignment doesn't have to be visible to the compiler - so
to fix this we'd have to assume pointer equality everywhere and
either emit a conditional call to memcpy or always emit a call to
memmove.

Richard.

>
>  [*] IMO, one in favor of tightening up the memcpy specification
>>
>> to require implementations to provide the expected semantics.
>
> That works for me :-)
>
> The things done in glibc's memcpy are a bit on the absurd side and the pain
> caused by the changes over time is almost impossible to overstate.  If the
> Austin group tightens memcpy to require fewer surprises I think most
> developers would ultimately be happy with the result -- a few would complain
> about the performance impacts for specific workloads, but I suspect they'd
> be in the minority.
>
>
> jeff
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
  2015-02-27 22:13 Manuel López-Ibáñez
@ 2015-03-03  8:41 ` Chris Johns
  0 siblings, 0 replies; 33+ messages in thread
From: Chris Johns @ 2015-03-03  8:41 UTC (permalink / raw)
  To: Manuel López-Ibáñez, gcc Mailing List,
	Joel Sherrill, Sandra Loosemore, Jakub Jelinek, David Malcolm,
	Jeff Prothero

On 28/02/2015 9:12 am, Manuel LÃ³pez-IbÃ¡Ã±ez wrote:
> On 02/19/15 14:56, Chris Johns wrote:
>>
>> My main concern is not knowing the trap has been added to the code. If I
>> could build an application and audit it somehow then I can manage it. We
>> have a similar issue with the possible use of FP registers being used in
>> general code (ISR save/restore trade off).
>>
>> Can the ELF be annotated in some GCC specific way that makes it to the
>> final executable to flag this is happening ? We can then create tools to
>> audit the executables.
>
> Simply ignore me if I'm misunderstanding the issue: Couldn't GCC
> generate, instead of a trap, a call to a noinline noreturn cold weak
> function __gcc_is_a_trap that by default calls the trap? Then, audit
> tools can inspect the code and see if such a function call appears and
> even override it with something else.

Yes it could and this is a nice idea.

>
> Chris, wouldn't that be enough for your purposes?
>

I think it does because we can scan an executable for the call locations 
and audit them.

Chris

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
@ 2015-02-27 22:13 Manuel López-Ibáñez
  2015-03-03  8:41 ` Chris Johns
  0 siblings, 1 reply; 33+ messages in thread
From: Manuel López-Ibáñez @ 2015-02-27 22:13 UTC (permalink / raw)
  To: gcc Mailing List, Joel Sherrill, Sandra Loosemore, Jakub Jelinek,
	Chris Johns, David Malcolm, Jeff Prothero

On 02/19/15 14:56, Chris Johns wrote:
>
> My main concern is not knowing the trap has been added to the code. If I
> could build an application and audit it somehow then I can manage it. We
> have a similar issue with the possible use of FP registers being used in
> general code (ISR save/restore trade off).
>
> Can the ELF be annotated in some GCC specific way that makes it to the
> final executable to flag this is happening ? We can then create tools to
> audit the executables.

Simply ignore me if I'm misunderstanding the issue: Couldn't GCC
generate, instead of a trap, a call to a noinline noreturn cold weak
function __gcc_is_a_trap that by default calls the trap? Then, audit
tools can inspect the code and see if such a function call appears and
even override it with something else.

Chris, wouldn't that be enough for your purposes?

Cheers,

Manuel.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
@ 2015-02-20  1:04 Jeff Prothero
  0 siblings, 0 replies; 33+ messages in thread
From: Jeff Prothero @ 2015-02-20  1:04 UTC (permalink / raw)
  To: gcc


(Thanks to everyone for the helpful feedback!)

Daniel Gutson wrote:

> what about then two warnings (disabled by default), one intended to
> tell the user each time the compiler removes a conditional
>   (-fdelete-null-pointer-checks)
> and another intended to tell the user each time the compiler adds a
> trap due to dereference an address 0?
>
> E.g.
>   -Wnull-pointer-check-deleted
>   -Wnull-dereference-considered-erroneous

I very much like the idea of such warnings.

I'm not clear why one would not warn by default when detecting
non-standards-conformant code and producing code guaranteed not
to do what the programmer intended.  But presumably most sane
engineers these days compile with -Wall. :-)

-Jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2015-03-04 12:36 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-18 19:24 Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference Jeff Prothero
2015-02-18 19:29 ` Jakub Jelinek
2015-02-19 20:56   ` Sandra Loosemore
2015-02-19 21:16     ` Daniel Gutson
2015-02-19 22:10       ` Jakub Jelinek
2015-02-19 22:26         ` Sandra Loosemore
2015-02-19 21:23     ` Joel Sherrill
2015-02-19 21:56       ` Chris Johns
2015-02-20 17:30         ` Jeff Law
2015-02-26 16:23           ` David Malcolm
2015-02-27 20:55             ` [RFC/patch for stage1] Embed compiler dumps into generated .o files (was Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference) David Malcolm
2015-02-20 11:06     ` Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference Florian Weimer
2015-02-20 11:43       ` Jonathan Wakely
2015-02-20 12:05         ` Florian Weimer
2015-02-20 17:01         ` Jeff Law
2015-02-20 17:09           ` Florian Weimer
2015-02-20 17:32             ` Jeff Law
2015-02-20 18:01           ` Paul_Koning
2015-02-20 12:11       ` Jakub Jelinek
2015-02-20 17:03         ` Jeff Law
2015-03-03 19:57           ` Martin Sebor
2015-03-03 23:38             ` Jeff Law
2015-03-04 12:36               ` Richard Biener
2015-02-20 12:13       ` Andrew Haley
2015-02-20 17:03       ` Jeff Law
2015-02-18 19:30 ` Andrew Pinski
2015-02-20  9:30 ` Andrew Haley
2015-02-20 11:45   ` Florian Weimer
2015-02-20 17:01     ` Jeff Law
2015-02-20 18:07       ` Paul_Koning
2015-02-20  1:04 Jeff Prothero
2015-02-27 22:13 Manuel López-Ibáñez
2015-03-03  8:41 ` Chris Johns

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).