public inbox for gcc-help@gcc.gnu.org
* GCC optimization re-order statements
@ 2021-06-11  9:37 Jonny Grant
  2021-06-11  9:44 ` Xi Ruoyao
  2021-06-11 11:56 ` David Brown
  0 siblings, 2 replies; 10+ messages in thread
From: Jonny Grant @ 2021-06-11  9:37 UTC (permalink / raw)
  To: gcc-help

Hello
This isn't real code, it's just an example to ask this question:

Would GCC optimizer ever re-order these statements and cause a NULL ptr de-reference SIGSEGV? I recall reading a Chris Lattner paper indicating it could happen.

void f(int * p)
{
    if(!p)
    {
        return;
    }

    printf("%d\n", *p);
}

In which case, a lot of production code would face issues and would have to be changed to:

void f(int * p)
{
    if(p)
    {
        printf("%d\n", *p);
    }
}

Jonny

* Re: GCC optimization re-order statements
  2021-06-11  9:37 GCC optimization re-order statements Jonny Grant
@ 2021-06-11  9:44 ` Xi Ruoyao
  2021-06-11 20:50   ` Jonny Grant
  2021-06-11 11:56 ` David Brown
  1 sibling, 1 reply; 10+ messages in thread
From: Xi Ruoyao @ 2021-06-11  9:44 UTC (permalink / raw)
  To: Jonny Grant; +Cc: gcc-help

On Fri, 2021-06-11 at 10:37 +0100, Jonny Grant wrote:
> Hello
> This isn't real code, it's just an example to ask this question:
> 
> Would GCC optimizer ever re-order these statements and cause a NULL ptr
> de-reference SIGSEGV? I recall reading a Chris Lattner paper indicating
> it could happen.
> 
> void f(int * p)
> {
>     if(!p)
>     {
>         return;
>     }
> 
>     printf("%d\n", *p);
> }

No, it won't.

What Chris's paper describes is something like:

void f(int * p)
{
    int x = *p;

    if(!p)
    {
         return;
    }

    printf("%d\n", x);
}

If p is NULL, this is undefined behaviour (UB), and the compiler can do anything.
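
A minimal sketch of the distinction, assuming nothing beyond standard C. The function names are hypothetical (not from the thread), and the functions return the value instead of printing it so the behaviour is easy to check. Moving the load after the null test keeps every call well-defined; loading first is only valid when the caller guarantees a non-null pointer.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only. */

/* Well-defined for every argument: *p is evaluated only after p is
 * known to be non-null. */
int load_checked(const int *p, int fallback)
{
    if (!p)
    {
        return fallback;
    }

    return *p;
}

/* Undefined behaviour when p is NULL: the load comes before the test,
 * so the compiler may assume p != NULL and drop the check entirely.
 * Callers must never pass NULL. */
int load_unchecked_first(const int *p, int fallback)
{
    int x = *p;

    if (!p)
    {
        return fallback;
    }

    return x;
}
```

Only load_checked may be called with NULL; for non-null arguments the two functions return the same value.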
-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University


* Re: GCC optimization re-order statements
  2021-06-11  9:37 GCC optimization re-order statements Jonny Grant
  2021-06-11  9:44 ` Xi Ruoyao
@ 2021-06-11 11:56 ` David Brown
  2021-06-11 20:52   ` Jonny Grant
  2021-06-12  1:45   ` Segher Boessenkool
  1 sibling, 2 replies; 10+ messages in thread
From: David Brown @ 2021-06-11 11:56 UTC (permalink / raw)
  To: Jonny Grant, gcc-help

On 11/06/2021 11:37, Jonny Grant wrote:
> Hello
> This isn't real code, it's just an example to ask this question:
> 
> Would GCC optimizer ever re-order these statements and cause a NULL ptr de-reference SIGSEGV? I recall reading a Chris Lattner paper indicating it could happen.
> 
> void f(int * p)
> {
>     if(!p)
>     {
>         return;
>     }
> 
>     printf("%d\n", *p);
> }
> 
> In which case, a lot of production code would face issues and would have to be changed to:
> 
> void f(int * p)
> {
>     if(p)
>     {
>         printf("%d\n", *p);
>     }
> }

These two functions have identical semantics.

> 
> Jonny
> 

The optimiser never (barring bugs in the compiler, but those are rare!)
rearranges code in a way that changes side-effects in well-defined code.
If code hits undefined behaviour, the compiler can do anything.

Thus if you write:

void f2(int *p) {
	int x  = *p;
	if (p) printf("%d\n", x);
}

the compiler can reason that either p is null, or it is not null.  If it
is not null, the "if (p)" conditional is always true and can be skipped.
 If it /is/ null, then dereferencing it to read *p is undefined
behaviour - and the compiler can assume the programmer doesn't care what
happens.

So the code can be optimised to:

void f2(int *p) {
	printf("%d\n", *p);
}


If the compiler knows that on the target in question, it is perfectly
safe (i.e., no signals or anything else) to dereference a null pointer,
then in your original "f" the compiler could read *p before it tests p
for null, but it could not do the printf before the test.  (Sometimes
for embedded systems the compiler knows that reading via a null pointer
is safe.)
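
As a hedged illustration of the statement that the early-return and nested-if forms of f have identical semantics, the sketch below writes both forms side by side. The names format_early_return and format_nested are made up for this example, and snprintf into a caller-supplied buffer stands in for printf so that the two results can be compared.

```c
#include <stdio.h>

/* Early-return form, mirroring the first version of f. Returns the
 * number of characters formatted, or 0 when p is null. */
int format_early_return(char *buf, size_t n, const int *p)
{
    if (!p)
    {
        return 0;
    }

    return snprintf(buf, n, "%d\n", *p);
}

/* Nested-if form, mirroring the second version of f. The dereference
 * is still guarded by the same test, so the behaviour is the same. */
int format_nested(char *buf, size_t n, const int *p)
{
    int written = 0;

    if (p)
    {
        written = snprintf(buf, n, "%d\n", *p);
    }

    return written;
}
```

In both forms the dereference is reached only when p is non-null, so neither gives the optimiser any undefined behaviour to exploit.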

* Re: GCC optimization re-order statements
  2021-06-11  9:44 ` Xi Ruoyao
@ 2021-06-11 20:50   ` Jonny Grant
  0 siblings, 0 replies; 10+ messages in thread
From: Jonny Grant @ 2021-06-11 20:50 UTC (permalink / raw)
  To: gcc-help



On 11/06/2021 10:44, Xi Ruoyao wrote:
> On Fri, 2021-06-11 at 10:37 +0100, Jonny Grant wrote:
>> Hello
>> This isn't real code, it's just an example to ask this question:
>>
>> Would GCC optimizer ever re-order these statements and cause a NULL ptr
>> de-reference SIGSEGV? I recall reading a Chris Lattner paper indicating
>> it could happen.
>>
>> void f(int * p)
>> {
>>     if(!p)
>>     {
>>         return;
>>     }
>>
>>     printf("%d\n", *p);
>> }
> 
> No, it won't.
> 
> What Chris's paper describes is something like:
> 
> void f(int * p)
> {
>     int x = *p;
> 
>     if(!p)
>     {
>          return;
>     }
> 
>     printf("%d\n", x);
> }
> 
> If p is NULL, this is undefined behaviour (UB), and the compiler can do anything.
> 

Many thanks for your reply Xi.
Yes, I see, the compiler wouldn't change the order.
Jonny

* Re: GCC optimization re-order statements
  2021-06-11 11:56 ` David Brown
@ 2021-06-11 20:52   ` Jonny Grant
  2021-06-12  1:45   ` Segher Boessenkool
  1 sibling, 0 replies; 10+ messages in thread
From: Jonny Grant @ 2021-06-11 20:52 UTC (permalink / raw)
  To: David Brown, gcc-help



On 11/06/2021 12:56, David Brown wrote:
> On 11/06/2021 11:37, Jonny Grant wrote:
>> Hello
>> This isn't real code, it's just an example to ask this question:
>>
>> Would GCC optimizer ever re-order these statements and cause a NULL ptr de-reference SIGSEGV? I recall reading a Chris Lattner paper indicating it could happen.
>>
>> void f(int * p)
>> {
>>     if(!p)
>>     {
>>         return;
>>     }
>>
>>     printf("%d\n", *p);
>> }
>>
>> In which case, a lot of production code would face issues and would have to be changed to:
>>
>> void f(int * p)
>> {
>>     if(p)
>>     {
>>         printf("%d\n", *p);
>>     }
>> }
> 
> These two functions have identical semantics.
> 
>>
>> Jonny
>>
> 
> The optimiser never (barring bugs in the compiler, but those are rare!)
> rearranges code in a way that changes side-effects in well-defined code.
> If code hits undefined behaviour, the compiler can do anything.
> 
> Thus if you write:
> 
> void f2(int *p) {
> 	int x  = *p;
> 	if (p) printf("%d\n", x);
> }
> 
> the compiler can reason that either p is null, or it is not null.  If it
> is not null, the "if (p)" conditional is always true and can be skipped.
>  If it /is/ null, then dereferencing it to read *p is undefined
> behaviour - and the compiler can assume the programmer doesn't care what
> happens.
> 
> So the code can be optimised to:
> 
> void f2(int *p) {
> 	printf("%d\n", *p);
> }

OK, I see: because the dereference came first, it was safe to remove the check after it. But with code that checks the pointer first, the check would still be retained. That is good :)

> 
> If the compiler knows that on the target in question, it is perfectly
> safe (i.e., no signals or anything else) to dereference a null pointer,
> then in your original "f" the compiler could read *p before it tests p
> for null, but it could not do the printf before the test.  (Sometimes
> for embedded systems the compiler knows that reading via a null pointer
> is safe.)
Yes, exactly: on SH2A, reading from address 0x0 didn't cause a problem, which makes that kind of bug difficult to detect.
So I used to set a hardware address exception on reads from 0x0 to catch them.

* Re: GCC optimization re-order statements
  2021-06-11 11:56 ` David Brown
  2021-06-11 20:52   ` Jonny Grant
@ 2021-06-12  1:45   ` Segher Boessenkool
  2021-06-12 11:46     ` David Brown
  1 sibling, 1 reply; 10+ messages in thread
From: Segher Boessenkool @ 2021-06-12  1:45 UTC (permalink / raw)
  To: David Brown; +Cc: Jonny Grant, gcc-help

On Fri, Jun 11, 2021 at 01:56:40PM +0200, David Brown wrote:
> If the compiler knows that on the target in question, it is perfectly
> safe (i.e., no signals or anything else) to dereference a null pointer,
> then in your original "f" the compiler could read *p before it tests p
> for null, but it could not do the printf before the test.  (Sometimes
> for embedded systems the compiler knows that reading via a null pointer
> is safe.)

But note that this is undefined behaviour in standard C (a null pointer
is required to compare unequal to any pointer to any object or function
(6.3.2.3/3), and the indirection operator on any operand that does not
point to a function or object is undefined behaviour (6.5.3.2/4)).
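
The first of those two rules can be sketched in a few lines. This is only an illustration; the helper names are hypothetical and not part of any library. It shows that the address of an object never compares equal to a null pointer, and that the integer constant 0 converted to a pointer type yields a null pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* True when p is a null pointer. */
bool is_null(const void *p)
{
    return p == NULL;
}

/* The address of an object never compares equal to a null pointer
 * (the 6.3.2.3/3 rule cited above), so this returns false. */
bool object_address_is_null(void)
{
    static int object;

    return is_null(&object);
}

/* The integer constant 0 is a null pointer constant; converting it to
 * a pointer type gives a null pointer, so this returns true. */
bool zero_constant_is_null(void)
{
    int *p = 0;

    return is_null(p);
}
```

Nothing here dereferences a null pointer; doing so is the part that 6.5.3.2/4 leaves undefined.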

It can be extremely useful to support this, of course :-)


Segher

* Re: GCC optimization re-order statements
  2021-06-12  1:45   ` Segher Boessenkool
@ 2021-06-12 11:46     ` David Brown
  2021-06-12 14:03       ` Segher Boessenkool
  0 siblings, 1 reply; 10+ messages in thread
From: David Brown @ 2021-06-12 11:46 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Jonny Grant, gcc-help



On 12/06/2021 03:45, Segher Boessenkool wrote:
> On Fri, Jun 11, 2021 at 01:56:40PM +0200, David Brown wrote:
>> If the compiler knows that on the target in question, it is perfectly
>> safe (i.e., no signals or anything else) to dereference a null pointer,
>> then in your original "f" the compiler could read *p before it tests p
>> for null, but it could not do the printf before the test.  (Sometimes
>> for embedded systems the compiler knows that reading via a null pointer
>> is safe.)
> 
> But note that this is undefined behaviour in standard C (a null pointer
> is required to compare unequal to any pointer to any object or function
> (6.3.2.3/3), and the indirection operator on any operand that does not
> point to a function or object is undefined behaviour (6.5.3.2/4)).
> 
> It can be extremely useful to support this, of course :-)
> 

It is undefined behaviour in the standard, but it is supported in some
compilers.  (To be honest, I can't remember if it is supported in any
gcc targets.)  In some microcontrollers for small embedded systems,
address 0 is part of the ram - and if you only have 1 KB ram, you don't
want to waste a byte as inaccessible.  On some microcontrollers I have
used, address 0 is part of the hardware registers for controlling pins,
peripherals, etc.  And often it is part of the flash and you might want
to read that for doing CRC checks or other integrity checks.  Of course,
in many of these cases you would (or could) use volatile accesses, and
the compiler will never mess with those.
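
A rough sketch of the volatile-access idea mentioned above, with several assumptions: the helper name is hypothetical, a plain byte sum stands in for a real CRC, and the start address is passed in as a parameter rather than hard-coded as 0, because whether a literal address 0 is usable depends on the target and toolchain.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sum `len` bytes starting at `start`.
 * A real integrity check would use a CRC; a plain sum keeps the
 * sketch short. Each byte is read through a volatile lvalue, so the
 * compiler has to perform every load as written. */
uint32_t sum_bytes_volatile(const volatile uint8_t *start, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < len; i++)
    {
        sum += start[i];
    }

    return sum;
}
```

On a microcontroller whose flash begins at address 0, `start` would be that base address; on a host the function can be exercised with an ordinary array.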

There are also a few systems where reading address 0 is guaranteed to
return 0, or guaranteed to cause a fault.  Compilers might have modified
behaviour to take that into account (since the standards don't impose
any requirements about how accessing address 0 works).


* Re: GCC optimization re-order statements
  2021-06-12 11:46     ` David Brown
@ 2021-06-12 14:03       ` Segher Boessenkool
  2021-06-13 10:29         ` Jonny Grant
  0 siblings, 1 reply; 10+ messages in thread
From: Segher Boessenkool @ 2021-06-12 14:03 UTC (permalink / raw)
  To: David Brown; +Cc: Jonny Grant, gcc-help

On Sat, Jun 12, 2021 at 01:46:14PM +0200, David Brown wrote:
> On 12/06/2021 03:45, Segher Boessenkool wrote:
> > On Fri, Jun 11, 2021 at 01:56:40PM +0200, David Brown wrote:
> >> If the compiler knows that on the target in question, it is perfectly
> >> safe (i.e., no signals or anything else) to dereference a null pointer,
> >> then in your original "f" the compiler could read *p before it tests p
> >> for null, but it could not do the printf before the test.  (Sometimes
> >> for embedded systems the compiler knows that reading via a null pointer
> >> is safe.)
> > 
> > But note that this is undefined behaviour in standard C (a null pointer
> > is required to compare unequal to any pointer to any object or function
> > (6.3.2.3/3), and the indirection operator on any operand that does not
> > point to a function or object is undefined behaviour (6.5.3.2/4)).
> > 
> > It can be extremely useful to support this, of course :-)
> 
> It is undefined behaviour in the standard, but it is supported in some
> compilers.  (To be honest, I can't remember if it is supported in any
> gcc targets.)

GCC has the target hook TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID:
  bool TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID (addr_space_t as)
  Define this to modify the default handling of address 0 for the
  address space.  Return true if 0 should be considered a valid address.

Only x86 implements this currently, for allowing addresses %fs:0 and
%gs:0 .

> In some microcontrollers for small embedded systems,
> address 0 is part of the ram - and if you only have 1 KB ram, you don't
> want to waste a byte as inaccessible.  On some microcontrollers I have
> used, address 0 is part of the hardware registers for controlling pins,
> peripherals, etc.  And often it is part of the flash and you might want
> to read that for doing CRC checks or other integrity checks.  Of course,
> in many of these cases you would (or could) use volatile accesses, and
> the compiler will never mess with those.

Yeah, and some have address spaces (or address extensions, or whatever
it is called on that arch) on the hardware as well.  Address zero in a
non-default address space should not be considered a null pointer then
usually.

> There are also a few systems where reading address 0 is guaranteed to
> return 0, or guaranteed to cause a fault.  Compilers might have modified
> behaviour to take that into account (since the standards don't impose
> any requirements about how accessing address 0 works).

Yup, the standard is very careful not to :-)  Converting the integer
zero to a pointer gives a null pointer, that is how far it goes :-)


Segher

* Re: GCC optimization re-order statements
  2021-06-12 14:03       ` Segher Boessenkool
@ 2021-06-13 10:29         ` Jonny Grant
  2021-06-13 12:59           ` Xi Ruoyao
  0 siblings, 1 reply; 10+ messages in thread
From: Jonny Grant @ 2021-06-13 10:29 UTC (permalink / raw)
  To: Segher Boessenkool, David Brown; +Cc: gcc-help



On 12/06/2021 15:03, Segher Boessenkool wrote:
> On Sat, Jun 12, 2021 at 01:46:14PM +0200, David Brown wrote:
>> On 12/06/2021 03:45, Segher Boessenkool wrote:
>>> On Fri, Jun 11, 2021 at 01:56:40PM +0200, David Brown wrote:
>>>> If the compiler knows that on the target in question, it is perfectly
>>>> safe (i.e., no signals or anything else) to dereference a null pointer,
>>>> then in your original "f" the compiler could read *p before it tests p
>>>> for null, but it could not do the printf before the test.  (Sometimes
>>>> for embedded systems the compiler knows that reading via a null pointer
>>>> is safe.)
>>>
>>> But note that this is undefined behaviour in standard C (a null pointer
>>> is required to compare unequal to any pointer to any object or function
>>> (6.3.2.3/3), and the indirection operator on any operand that does not
>>> point to a function or object is undefined behaviour (6.5.3.2/4)).
>>>
>>> It can be extremely useful to support this, of course :-)
>>
>> It is undefined behaviour in the standard, but it is supported in some
>> compilers.  (To be honest, I can't remember if it is supported in any
>> gcc targets.)
> 
> GCC has the target hook TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID:
>   bool TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID (addr_space_t as)
>   Define this to modify the default handling of address 0 for the
>   address space.  Return true if 0 should be considered a valid address.
> 
> Only x86 implements this currently, for allowing addresses %fs:0 and
> %gs:0 .
> 
>> In some microcontrollers for small embedded systems,
>> address 0 is part of the ram - and if you only have 1 KB ram, you don't
>> want to waste a byte as inaccessible.  On some microcontrollers I have
>> used, address 0 is part of the hardware registers for controlling pins,
>> peripherals, etc.  And often it is part of the flash and you might want
>> to read that for doing CRC checks or other integrity checks.  Of course,
>> in many of these cases you would (or could) use volatile accesses, and
>> the compiler will never mess with those.
> 
> Yeah, and some have address spaces (or address extensions, or whatever
> it is called on that arch) on the hardware as well.  Address zero in a
> non-default address space should not be considered a null pointer then
> usually.
> 
>> There are also a few systems where reading address 0 is guaranteed to
>> return 0, or guaranteed to cause a fault.  Compilers might have modified
>> behaviour to take that into account (since the standards don't impose
>> any requirements about how accessing address 0 works).
> 
> Yup, the standard is very careful not to :-)  Converting the integer
> zero to a pointer gives a null pointer, that is how far it goes :-)
> 
> 
> Segher
> 

If address 0x0 is allowed, would GCC ever change the optimization to remove NULL checks?
How does this configuration affect the compiled machine code?

On SH2A, we have the interrupt vector at address 0x0, with the first 4 bytes the initial program counter address (in this ROM space)


Kind regards
Jonny

* Re: GCC optimization re-order statements
  2021-06-13 10:29         ` Jonny Grant
@ 2021-06-13 12:59           ` Xi Ruoyao
  0 siblings, 0 replies; 10+ messages in thread
From: Xi Ruoyao @ 2021-06-13 12:59 UTC (permalink / raw)
  To: Jonny Grant, Segher Boessenkool, David Brown; +Cc: gcc-help

On Sun, 2021-06-13 at 11:29 +0100, Jonny Grant wrote:
> If address 0x0 is allowed, would GCC ever change the optimization to
> remove NULL checks?
> How does this configuration affect the compiled machine code?
> 
> On SH2A, we have the interrupt vector at address 0x0, with the first 4
> bytes the initial program counter address (in this ROM space)

Use -fno-delete-null-pointer-checks for the source file containing code
that needs to access address 0.
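
A hedged sketch of the kind of code this applies to, based on the SH2A layout described in the question (vector table at address 0, initial program counter in the first 4 bytes). The file name in the comment, the function name, and the enum constant are hypothetical; only the flag itself comes from this thread.

```c
/* Assumed build line for the file holding this code (the file name
 * is hypothetical):
 *
 *   gcc -O2 -fno-delete-null-pointer-checks -c vectors.c
 */
#include <stdint.h>

/* Hypothetical layout: entry 0 of the table holds the initial
 * program counter, as described for SH2A above. */
enum { VECTOR_INITIAL_PC = 0 };

/* Read the initial program counter from a vector table. On the
 * target, the caller would pass the table's base address. */
uint32_t read_initial_pc(const volatile uint32_t *vector_table)
{
    return vector_table[VECTOR_INITIAL_PC];
}
```

Taking the table address as a parameter keeps the function testable on a host with an ordinary array; on the target, the flag matters for the translation unit in which the address-0 pointer is formed and used.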
-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University

