public inbox for newlib@sourceware.org
* Strict aliasing and malloc
@ 2018-07-03 18:07 narwhal x
  2018-07-03 18:26 ` Brian Inglis
  0 siblings, 1 reply; 8+ messages in thread
From: narwhal x @ 2018-07-03 18:07 UTC (permalink / raw)
  To: newlib

Hello,

I have a question regarding newlib and -fstrict-aliasing, which gcc
enables at -O2.

The aliasing rules in the ISO C standard, which gcc exploits at -O2
(this might be specific to gcc, but could be the case with any
compiler with aliasing optimizations), mean that you can only access
an object through a pointer to a compatible type (or a character
type). A special case is malloc, which should return memory with no
declared type *.

I did not, however, find the -fno-strict-aliasing flag in any
configuration or makefile. (If I just overlooked it, and the flag is
mandatory, that would answer my question.)

My question:
In newlib/libc/stdlib/mallocr.c on line 2212 you have the statement:
"top = (mchunkptr)brk;"
Here top is of type "mchunkptr" and brk is a "char *". The standard
says that you cannot just alias an incompatible type and dereference
it (unless it's a malloc'ed variable, as it would change its type
when written to, but how do you inform the compiler?)

As an example, see 4.2.1 (p. 63) in
https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf

So is this allowed? Or am I missing something?

* When I asked in the gcc IRC channel, they mentioned that the special
case for malloc is obtained by making sure libc is linked in as a
separately compiled library and no LTO is performed.


My main reason for asking is just wanting to know how a malloc
implementation should deal with these restrictions stated by the ISO C
standard, and improve my understanding of the (sometimes confusing)
aliasing rules.

Thanks

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Strict aliasing and malloc
  2018-07-03 18:07 Strict aliasing and malloc narwhal x
@ 2018-07-03 18:26 ` Brian Inglis
  0 siblings, 0 replies; 8+ messages in thread
From: Brian Inglis @ 2018-07-03 18:26 UTC (permalink / raw)
  To: newlib

On 2018-07-03 09:16, narwhal x wrote:
> Hello,
> 
> I have a question regarding newlib and the -fstrict-aliasing implied
> by turning on O2.
> 
> The strict aliasing implied by the ISO standard and enabled in gcc
> with O2 (This might be specific to gcc, but could be the case with any
> compiler with aliasing optimizations) makes it so you can only cast a
> pointer to a compatible type, and a special case is malloc, which
> should return an "undeclared type" *.
> 
> I however did not find the -fno-strict-aliasing flag in any
> configuration or makefile (If I just overlooked it, and the flag is
> mandatory that would answer my question)
> 
> My question:
> In newlib/libc/stdlib/mallocr.c on line 2212 you have the statement:
> "top = (mchunkptr)brk;"
> Here top is of type "mchunkptr" and brk is a "char *". The standard
> says that you can not just alias a incompatible type and dereference
> it (unless it's a malloc'ed variable, as it would change it's type
> when written to, but how do you inform the compiler?)
> 
> As an example, see 4.2.1 (p. 63) in
> https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf
> 
> So is this allowed? Or am I missing something.
> 
> * After asking in the gcc IRC, they mentioned that the way they go
> about having the special case for malloc is making sure the libc
> library is linked from a library and no LTO is performed.
> 
> 
> My main reason for asking is just wanting to know how a malloc
> implementation should deal with these restrictions stated by the ISO C
> standard, and improve my understanding of the (sometimes confusing)
> aliasing rules.

Pointer types char * and void * can be converted to other data pointer types,
and character types can alias other types. You should not, however, alias
objects via casts or conversions of pointers to objects stored as incompatible
types. The compiler does not track possible aliasing of incompatible types, so
optimization could reorder or eliminate the stores, and the underlying storage
of the object of incompatible type may not be updated when you expect it to be.
Roughly IMHO HTH YMMV ;^>
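
As a small illustration of the permitted direction (a character type aliasing
another type), here is a minimal sketch. The function name byte_sum is made up
for this example and does not come from newlib.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inspect the object representation of any object through unsigned char.
 * Reading through a character type is allowed by the aliasing rules,
 * whatever the type of the underlying object is. */
unsigned byte_sum(const void *obj, size_t n)
{
    assert(obj != NULL);
    const unsigned char *p = obj;   /* void * converts without a cast */
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}
```

On a platform with 8-bit bytes, calling byte_sum(&v, sizeof v) for a uint32_t v
holding 0x01020304 gives 10 on any byte order, since the four bytes are 1, 2, 3
and 4 in some order. The reverse direction (reading a declared char array
through a uint32_t *) is the case the aliasing rules do not cover.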

Implementations of malloc use char * internally and convert those to char ** and
int * to maintain their internal housekeeping data at the start of the block,
often using unions. They return a pointer to universally aligned storage
following that block prefix, which often results in malloc overhead of one or
more universally aligned blocks per allocation. Reducing space overhead takes
more work: see e.g.
https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/stdlib/mallocr.c;h=ecc445f3d36365a4840e31c737db5018ddba42e9;hb=8e732f7f7f684f22b283f39a5d407375b3b0b3af
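
The block-prefix layout described above can be sketched roughly as follows.
This is a toy, not newlib's mallocr.c: header_t, toy_alloc and toy_size are
hypothetical names, there is no free, and the backing store is taken from
malloc (an assumption made so that the storage has no declared type and the
header writes stay within the standard's rules).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Housekeeping prefix. The union pads the header to the strictest
 * alignment, so the payload that follows it is universally aligned. */
typedef union header {
    size_t size;          /* requested payload size in bytes */
    max_align_t align;    /* never accessed; present only for alignment */
} header_t;

static unsigned char *region;              /* backing store from malloc */
static size_t region_size, region_used;

void *toy_alloc(size_t n)
{
    const size_t a = _Alignof(max_align_t);
    if (region == NULL) {
        region_size = 4096;
        region = malloc(region_size);
        if (region == NULL)
            return NULL;
    }
    if (n > region_size)                   /* also keeps the rounding below from overflowing */
        return NULL;
    size_t payload = (n + a - 1) / a * a;  /* round up to the alignment */
    if (sizeof(header_t) + payload > region_size - region_used)
        return NULL;
    header_t *h = (header_t *)(region + region_used);
    h->size = n;                           /* store into allocated storage */
    region_used += sizeof(header_t) + payload;
    return h + 1;                          /* payload starts after the prefix */
}

size_t toy_size(const void *p)
{
    assert(p != NULL);
    const header_t *h = (const header_t *)p - 1;  /* step back to the prefix */
    return h->size;
}
```

Every allocation here costs one universally aligned header in addition to the
rounded-up payload, which is the space overhead mentioned above.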
-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada


* Re: Strict aliasing and malloc
  2018-07-04 12:20   ` Narwhal
  2018-07-04 15:34     ` Richard Damon
@ 2018-07-04 17:09     ` Brian Inglis
  1 sibling, 0 replies; 8+ messages in thread
From: Brian Inglis @ 2018-07-04 17:09 UTC (permalink / raw)
  To: newlib

On 2018-07-04 03:34, Narwhal wrote:
> On 07/04/2018 08:15 AM, Brian Inglis wrote:
>> On 2018-07-03 12:37, narwhal x wrote:
>>> On Tue, Jul 3, 2018 at 8:07 PM, Brian Inglis
>>> <Brian.Inglis@systematicsw.ab.ca> wrote:
>>>> On 2018-07-03 09:16, narwhal x wrote:
>>>>> I have a question regarding newlib and the -fstrict-aliasing implied
>>>>> by turning on O2.
>>>>>
>>>>> The strict aliasing implied by the ISO standard and enabled in gcc
>>>>> with O2 (This might be specific to gcc, but could be the case with any
>>>>> compiler with aliasing optimizations) makes it so you can only cast a
>>>>> pointer to a compatible type, and a special case is malloc, which
>>>>> should return an "undeclared type" *.
>>>>>
>>>>> I however did not find the -fno-strict-aliasing flag in any
>>>>> configuration or makefile (If I just overlooked it, and the flag is
>>>>> mandatory that would answer my question)
>>>>>
>>>>> My question:
>>>>> In newlib/libc/stdlib/mallocr.c on line 2212 you have the statement:
>>>>> "top = (mchunkptr)brk;"
>>>>> Here top is of type "mchunkptr" and brk is a "char *". The standard
>>>>> says that you can not just alias a incompatible type and dereference
>>>>> it (unless it's a malloc'ed variable, as it would change it's type
>>>>> when written to, but how do you inform the compiler?)
>>>>>
>>>>> As an example, see 4.2.1 (p. 63) in
>>>>> https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf
>>>>>
>>>>> So is this allowed? Or am I missing something.
>>>>>
>>>>> * After asking in the gcc IRC, they mentioned that the way they go
>>>>> about having the special case for malloc is making sure the libc
>>>>> library is linked from a library and no LTO is performed.
>>>>>
>>>>> My main reason for asking is just wanting to know how a malloc
>>>>> implementation should deal with these restrictions stated by the ISO C
>>>>> standard, and improve my understanding of the (sometimes confusing)
>>>>> aliasing rules.
>>>> Pointer types char * and void * can be converted to other data pointer types,
>>>> and character types can alias other types, but you should not alias objects via
>>>> casts or conversions of pointers to objects stored as incompatible types,
>>>> because optimization could eliminate the stores, so the underlying storage of
>>>> the object of incompatible type may not be updated, and the compiler would not
>>>> know that because the type is different, as the compiler does not track
>>>> possible
>>>> aliasing of incompatible types. Roughly IMHO HTH YMMV ;^>
>>>>
>>>> Implementations of malloc use char * internally and convert those to char **
>>>> and
>>>> int * to maintain their internal housekeeping data at the start of the block,
>>>> often using unions, returning a pointer to universally aligned storage
>>>> following
>>>> that block prefix, often resulting in malloc overhead of one or more
>>>> universally
>>>> aligned blocks per allocation; reducing space overhead takes more work: see
>>>> e.g.
>>>> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/stdlib/mallocr.c;h=ecc445f3d36365a4840e31c737db5018ddba42e9;hb=8e732f7f7f684f22b283f39a5d407375b3b0b3af
>>>>
>>> Forgive me if I misunderstand, but doesn't your recap regarding strict
>>> aliasing agree with my understanding that this is an aliasing
>>> violation?
>>> Because you mention (correctly I think)
>>>> ... you should not alias objects via
>>>> casts or conversions of pointers to objects stored as incompatible types ...
>>> And in the case I mentioned (one of many) in mallocr.c on line 2212
>>> you have the statement:
>>>>> "top = (mchunkptr)brk;"
>>> Here brk is declared as: char *brk and is returned by sbrk (in my
>>> case) which takes memory from the heap declared somewhere in a
>>> linkerscript (or similar) AFAIK. But top is a mchunk *, which is a
>>> struct.
>> Here the char * is converted to a mchunk * and that is okay, works, both will be
>> checked for aliasing; the inverse conversion is also allowed; no object is
>> accessed using the pointer here.
>>
>>> So this is not a compatible type right (so the ARE incompatible)? You
>>> could cast from mchunk * TO char * and dereference it according to the
>>> standard, but not the other way around.
>>> Also if you look at the document I linked in my initial mail
>>>>> As an example, see 4.2.1 (p. 63) in
>>>>> https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf
>>> Isn't this exactly what is done in mallocr.c ? And they state
>>> specifically that this can only be done when strict aliasing is NOT
>>> uphold. And that seems to be in accordance to the standard.
>>> I get that this is how the space is managed internally, I also know a
>>> lot of embedded applications and networking stacks do this casting
>>> from a char * to a struct *, but these also had to disable strict
>>> aliasing to avoid bugs.
>>> So am I missing something? If I am talking nonsense or
>>> misunderstanding something please let me know.
>>> I know it works basically always, but isn't this technically undefined
>>> behavior without -fno-strict-aliasing?
>> I think you may be missing that the issues are when using casts to pointer types
>> to access type-punned union members in a struct, or other objects, that are not
>> compatible types.
>>
>> At the minimum at a low level, the objects should be in the same memory type or
>> register set for the compiler to be able to consider them possibly aliased,
>> although the spec is stricter, more general, and abstract, to apply on the
>> abstract machine, for which the compiler is required to provide an
>> implementation on a real machine, where the properties conform to the abstract
>> model.
>>
> Thanks for the replies so far, sorry for being a nuance, but I really
> want to understand this fully.
> 
> So you state:
>> Here the char * is  converted to a mchunk * and that is okay, works, both will be
>> checked for aliasing; the inverse conversion is also allowed; no object is
>> accessed using the pointer here.
> 
> So I agree that this is not the undefined point, as nothing is
> dereferenced. But let me give the comparison between the example from
> the document, which they state is not in accordance to the aliasing
> rules, and and trimmed down version of the part in mallocr.c which I
> think is the same. Could you point me to the difference? (Or argue
> against the statement in the document)
> 
> Simplified mallocr.c example:
> char *brk;
> brk = (char*)(sbrk(sbrk_size)); // system sbrk (aligned)
> top = (mchunkptr)brk;
> top->size = top_size;  // access the struct, violation?

Here sbrk returns a void *, and casts from void * to other object pointer types
have been unnecessary and inadvisable in C for over a decade. I'd write this as
top = brk = sbrk(sbrk_size); but some compilers might warn about this, and some
project builds like to treat warnings as errors to enforce a clean compile, or
else require disabling compiler warnings, which to me encourages unsafe cruft.
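
A minimal sketch of the void * point, using malloc rather than sbrk so that it
stays portable (that substitution, and the names node, node_new and node_free,
are assumptions for illustration only):

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

struct node *node_new(int value)
{
    /* malloc returns void *, which converts to struct node * implicitly;
     * no cast is needed in C. */
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    /* Allocated storage has no declared type, so these stores are fine
     * under the aliasing rules. */
    n->value = value;
    n->next = NULL;
    return n;
}

void node_free(struct node *n)
{
    assert(n != NULL);
    free(n);
}
```

Note that the implicit conversion applies only to void *. Assigning a char *
directly to a pointer to a struct type still needs a cast or draws a
diagnostic.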

> Document example (see first email, 4.2.1 (p. 63))
> unsigned char c[sizeof(float)];  // (aligned)
> float *fp = (float *)c; // example uses float, but should hold for other types
> *fp=1.0; // access, violation "DEFACTO: defined behaviour iff
> -no-strict-aliasing"
> 
> Also quoting Joseph Myers regarding using unsigned char arrays to hold
> values of other types:
> " No, this is not safe (if it's visible to the compiler that the
> memory in question has unsigned char as its declared type)."
> http://www.cl.cam.ac.uk/~pes20/cerberus/notes50-survey-discussion.html
> <http://www.cl.cam.ac.uk/%7Epes20/cerberus/notes50-survey-discussion.html>[11/15]
> https://gcc.gnu.org/ml/gcc/2015-04/msg00325.html
> <https://gcc.gnu.org/ml/gcc/2015-04/msg00325.html>

Using char arrays for other types is unsafe because arithmetic types often have
strict alignment requirements, e.g. natural alignment on addresses which are
multiples of the object size, or requirements not to cross a cache line or a
page boundary, and that assumption is written into the standard. Where an
architecture requires no alignment, any such mismatch may still result in
performance from poor to bad. Compiler and library implementers therefore have
to be aware of and work around all restrictions, including implementing
functions in assembler where the compiler won't do what is required.

See the first link above, last question, last point about memcpy: to be
conforming, use memcpy. But memcpy may itself be written in C for the many
library targets where assembler versions are not available, so the library
implementer has to be aware of all the pitfalls.
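
For the unsigned char array example quoted above, the memcpy route could look
like the sketch below. The helper names store_float and load_float are made up
for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Copy the bytes of a float into a byte buffer. No float lvalue ever
 * refers to the buffer, so no aliasing or alignment question arises. */
void store_float(unsigned char *buf, float value)
{
    assert(buf != NULL);
    memcpy(buf, &value, sizeof value);
}

/* Copy the bytes back out into a real float object and return it. */
float load_float(const unsigned char *buf)
{
    float value;
    assert(buf != NULL);
    memcpy(&value, buf, sizeof value);
    return value;
}
```

With unsigned char c[sizeof(float)], calling store_float(c, 1.0f) and then
load_float(c) returns 1.0f. Compilers commonly turn such fixed-size memcpy
calls into plain loads and stores, so this need not cost a function call.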

An old book, The Standard C Library, by P.J. Plauger, 1992, explained the design
and implementation in ANSI Standard C. He followed it up with The Draft
Standard C++ Library in 1995, then one on the STL. He was also on the standards
committees, and his companies Whitesmiths, Intermetrics, and Dinkumware have
been providing libraries to MS, IBM, and embedded companies for decades.

> As sbrk returns a pointer to a char array and the compiler can see
> this, shouldn't it cause the same issue?
> 
> Thanks for bearing with me
> ** trying to figure out how to correctly reply to the mailinglist **

Read books and articles about library and compiler implementations that are not
just code listings, and read FAQs and discussions on groups like comp.lang.c,
where these questions would be better asked and answered.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada


* Re: Strict aliasing and malloc
  2018-07-04 12:20   ` Narwhal
@ 2018-07-04 15:34     ` Richard Damon
  2018-07-04 17:09     ` Brian Inglis
  1 sibling, 0 replies; 8+ messages in thread
From: Richard Damon @ 2018-07-04 15:34 UTC (permalink / raw)
  To: newlib

On 7/4/18 2:34 AM, Narwhal wrote:
> On 07/04/2018 08:15 AM, Brian Inglis wrote:
>> On 2018-07-03 12:37, narwhal x wrote:
>>> On Tue, Jul 3, 2018 at 8:07 PM, Brian Inglis
>>> <Brian.Inglis@systematicsw.ab.ca> wrote:
>>>> On 2018-07-03 09:16, narwhal x wrote:
>>>>> I have a question regarding newlib and the -fstrict-aliasing implied
>>>>> by turning on O2.
>>>>>
>>>>> The strict aliasing implied by the ISO standard and enabled in gcc
>>>>> with O2 (This might be specific to gcc, but could be the case with
>>>>> any
>>>>> compiler with aliasing optimizations) makes it so you can only cast a
>>>>> pointer to a compatible type, and a special case is malloc, which
>>>>> should return an "undeclared type" *.
>>>>>
>>>>> I however did not find the -fno-strict-aliasing flag in any
>>>>> configuration or makefile (If I just overlooked it, and the flag is
>>>>> mandatory that would answer my question)
>>>>>
>>>>> My question:
>>>>> In newlib/libc/stdlib/mallocr.c on line 2212 you have the statement:
>>>>> "top = (mchunkptr)brk;"
>>>>> Here top is of type "mchunkptr" and brk is a "char *". The standard
>>>>> says that you can not just alias a incompatible type and dereference
>>>>> it (unless it's a malloc'ed variable, as it would change it's type
>>>>> when written to, but how do you inform the compiler?)
>>>>>
>>>>> As an example, see 4.2.1 (p. 63) in
>>>>> https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf
>>>>>
>>>>> So is this allowed? Or am I missing something.
>>>>>
>>>>> * After asking in the gcc IRC, they mentioned that the way they go
>>>>> about having the special case for malloc is making sure the libc
>>>>> library is linked from a library and no LTO is performed.
>>>>>
>>>>> My main reason for asking is just wanting to know how a malloc
>>>>> implementation should deal with these restrictions stated by the
>>>>> ISO C
>>>>> standard, and improve my understanding of the (sometimes confusing)
>>>>> aliasing rules.
>>>> Pointer types char * and void * can be converted to other data
>>>> pointer types,
>>>> and character types can alias other types, but you should not alias
>>>> objects via
>>>> casts or conversions of pointers to objects stored as incompatible
>>>> types,
>>>> because optimization could eliminate the stores, so the underlying
>>>> storage of
>>>> the object of incompatible type may not be updated, and the
>>>> compiler would not
>>>> know that because the type is different, as the compiler does not
>>>> track possible
>>>> aliasing of incompatible types. Roughly IMHO HTH YMMV ;^>
>>>>
>>>> Implementations of malloc use char * internally and convert those
>>>> to char ** and
>>>> int * to maintain their internal housekeeping data at the start of
>>>> the block,
>>>> often using unions, returning a pointer to universally aligned
>>>> storage following
>>>> that block prefix, often resulting in malloc overhead of one or
>>>> more universally
>>>> aligned blocks per allocation; reducing space overhead takes more
>>>> work: see e.g.
>>>> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/stdlib/mallocr.c;h=ecc445f3d36365a4840e31c737db5018ddba42e9;hb=8e732f7f7f684f22b283f39a5d407375b3b0b3af
>>>>
>>> Forgive me if I misunderstand, but doesn't your recap regarding strict
>>> aliasing agree with my understanding that this is an aliasing
>>> violation?
>>> Because you mention (correctly I think)
>>>> ... you should not alias objects via
>>>> casts or conversions of pointers to objects stored as incompatible
>>>> types ...
>>> And in the case I mentioned (one of many) in mallocr.c on line 2212
>>> you have the statement:
>>>>> "top = (mchunkptr)brk;"
>>> Here brk is declared as: char *brk and is returned by sbrk (in my
>>> case) which takes memory from the heap declared somewhere in a
>>> linkerscript (or similar) AFAIK. But top is a mchunk *, which is a
>>> struct.
>> Here the char * is converted to a mchunk * and that is okay, works,
>> both will be
>> checked for aliasing; the inverse conversion is also allowed; no
>> object is
>> accessed using the pointer here.
>>
>>> So this is not a compatible type right (so the ARE incompatible)? You
>>> could cast from mchunk * TO char * and dereference it according to the
>>> standard, but not the other way around.
>>> Also if you look at the document I linked in my initial mail
>>>>> As an example, see 4.2.1 (p. 63) in
>>>>> https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf
>>> Isn't this exactly what is done in mallocr.c ? And they state
>>> specifically that this can only be done when strict aliasing is NOT
>>> uphold. And that seems to be in accordance to the standard.
>>> I get that this is how the space is managed internally, I also know a
>>> lot of embedded applications and networking stacks do this casting
>>> from a char * to a struct *, but these also had to disable strict
>>> aliasing to avoid bugs.
>>> So am I missing something? If I am talking nonsense or
>>> misunderstanding something please let me know.
>>> I know it works basically always, but isn't this technically undefined
>>> behavior without -fno-strict-aliasing?
>> I think you may be missing that the issues are when using casts to
>> pointer types
>> to access type-punned union members in a struct, or other objects,
>> that are not
>> compatible types.
>>
>> At the minimum at a low level, the objects should be in the same
>> memory type or
>> register set for the compiler to be able to consider them possibly
>> aliased,
>> although the spec is stricter, more general, and abstract, to apply
>> on the
>> abstract machine, for which the compiler is required to provide an
>> implementation on a real machine, where the properties conform to the
>> abstract
>> model.
>>
> Thanks for the replies so far, sorry for being a nuance, but I really
> want to understand this fully.
>
> So you state:
>> Here the char * is converted to a mchunk * and that is okay, works,
>> both will be checked for aliasing; the inverse conversion is also
>> allowed; no object is accessed using the pointer here.
>
> So I agree that this is not the undefined point, as nothing is
> dereferenced. But let me give the comparison between the example from
> the document, which they state is not in accordance to the aliasing
> rules, and and trimmed down version of the part in mallocr.c which I
> think is the same. Could you point me to the difference? (Or argue
> against the statement in the document)
>
> Simplified mallocr.c example:
> char *brk;
> brk = (char*)(sbrk(sbrk_size)); // system sbrk (aligned)
> top = (mchunkptr)brk;
> top->size = top_size;  // access the struct, violation?
>
> Document example (see first email, 4.2.1 (p. 63))
> unsigned char c[sizeof(float)];  // (aligned)
> float *fp = (float *)c; // example uses float, but should hold for
> other types
> *fp=1.0; // access, violation "DEFACTO: defined behaviour iff
> -no-strict-aliasing"
>
> Also quoting Joseph Myers regarding using unsigned char arrays to hold
> values of other types:
> " No, this is not safe (if it's visible to the compiler that the
> memory in question has unsigned char as its declared type)."
> http://www.cl.cam.ac.uk/~pes20/cerberus/notes50-survey-discussion.html
> <http://www.cl.cam.ac.uk/%7Epes20/cerberus/notes50-survey-discussion.html>[11/15]
> https://gcc.gnu.org/ml/gcc/2015-04/msg00325.html
> <https://gcc.gnu.org/ml/gcc/2015-04/msg00325.html>
>
> As sbrk returns a pointer to a char array and the compiler can see
> this, shouldn't it cause the same issue?
>
> Thanks for bearing with me
> ** trying to figure out how to correctly reply to the mailinglist **
>
One of the reasons malloc (and sbrk) are part of the implementation
library is that they may not be able to fully obey the rules. They
may need to use some implementation-dependent 'trick' to get around some
undefined behaviors.

This says that it is not unexpected that they may have requirements on
what options they need to use (or not use) in order to work. Thus hiding
the 'type' of the raw memory is commonly needed.
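
One example of such an implementation-dependent trick is the may_alias type
attribute, a GCC extension that Clang also accepts. It is not ISO C, and the
sketch below is not what newlib does; struct chunk, raw_heap and chunk_at are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* GCC/Clang extension: accesses through this type are exempt from
 * type-based alias analysis, much like accesses through char. */
struct __attribute__((__may_alias__)) chunk {
    size_t prev_size;
    size_t size;
};

/* Raw memory whose declared type is visible to the compiler. */
static _Alignas(max_align_t) unsigned char raw_heap[256];

struct chunk *chunk_at(size_t offset, size_t size)
{
    assert(offset % _Alignof(struct chunk) == 0);
    assert(offset + sizeof(struct chunk) <= sizeof raw_heap);
    struct chunk *c = (struct chunk *)(raw_heap + offset);
    c->prev_size = 0;
    c->size = size;
    return c;
}
```

The other common route is a build option: compiling the allocator with
-fno-strict-aliasing, which is the kind of requirement on options mentioned
above.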

-- 
Richard Damon


* Re: Strict aliasing and malloc
  2018-07-04  9:24 ` Brian Inglis
  2018-07-04  9:34   ` narwhal x
@ 2018-07-04 12:20   ` Narwhal
  2018-07-04 15:34     ` Richard Damon
  2018-07-04 17:09     ` Brian Inglis
  1 sibling, 2 replies; 8+ messages in thread
From: Narwhal @ 2018-07-04 12:20 UTC (permalink / raw)
  To: newlib

On 07/04/2018 08:15 AM, Brian Inglis wrote:
> On 2018-07-03 12:37, narwhal x wrote:
>> On Tue, Jul 3, 2018 at 8:07 PM, Brian Inglis
>> <Brian.Inglis@systematicsw.ab.ca> wrote:
>>> On 2018-07-03 09:16, narwhal x wrote:
>>>> I have a question regarding newlib and the -fstrict-aliasing implied
>>>> by turning on O2.
>>>>
>>>> The strict aliasing implied by the ISO standard and enabled in gcc
>>>> with O2 (This might be specific to gcc, but could be the case with any
>>>> compiler with aliasing optimizations) makes it so you can only cast a
>>>> pointer to a compatible type, and a special case is malloc, which
>>>> should return an "undeclared type" *.
>>>>
>>>> I however did not find the -fno-strict-aliasing flag in any
>>>> configuration or makefile (If I just overlooked it, and the flag is
>>>> mandatory that would answer my question)
>>>>
>>>> My question:
>>>> In newlib/libc/stdlib/mallocr.c on line 2212 you have the statement:
>>>> "top = (mchunkptr)brk;"
>>>> Here top is of type "mchunkptr" and brk is a "char *". The standard
>>>> says that you can not just alias a incompatible type and dereference
>>>> it (unless it's a malloc'ed variable, as it would change it's type
>>>> when written to, but how do you inform the compiler?)
>>>>
>>>> As an example, see 4.2.1 (p. 63) in
>>>> https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf
>>>>
>>>> So is this allowed? Or am I missing something.
>>>>
>>>> * After asking in the gcc IRC, they mentioned that the way they go
>>>> about having the special case for malloc is making sure the libc
>>>> library is linked from a library and no LTO is performed.
>>>>
>>>> My main reason for asking is just wanting to know how a malloc
>>>> implementation should deal with these restrictions stated by the ISO C
>>>> standard, and improve my understanding of the (sometimes confusing)
>>>> aliasing rules.
>>> Pointer types char * and void * can be converted to other data pointer types,
>>> and character types can alias other types, but you should not alias objects via
>>> casts or conversions of pointers to objects stored as incompatible types,
>>> because optimization could eliminate the stores, so the underlying storage of
>>> the object of incompatible type may not be updated, and the compiler would not
>>> know that because the type is different, as the compiler does not track possible
>>> aliasing of incompatible types. Roughly IMHO HTH YMMV ;^>
>>>
>>> Implementations of malloc use char * internally and convert those to char ** and
>>> int * to maintain their internal housekeeping data at the start of the block,
>>> often using unions, returning a pointer to universally aligned storage following
>>> that block prefix, often resulting in malloc overhead of one or more universally
>>> aligned blocks per allocation; reducing space overhead takes more work: see e.g.
>>> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/stdlib/mallocr.c;h=ecc445f3d36365a4840e31c737db5018ddba42e9;hb=8e732f7f7f684f22b283f39a5d407375b3b0b3af
>> Forgive me if I misunderstand, but doesn't your recap regarding strict
>> aliasing agree with my understanding that this is an aliasing
>> violation?
>> Because you mention (correctly I think)
>>> ... you should not alias objects via
>>> casts or conversions of pointers to objects stored as incompatible types ...
>> And in the case I mentioned (one of many) in mallocr.c on line 2212
>> you have the statement:
>>>> "top = (mchunkptr)brk;"
>> Here brk is declared as: char *brk and is returned by sbrk (in my
>> case) which takes memory from the heap declared somewhere in a
>> linkerscript (or similar) AFAIK. But top is a mchunk *, which is a
>> struct.
> Here the char * is converted to a mchunk * and that is okay, works, both will be
> checked for aliasing; the inverse conversion is also allowed; no object is
> accessed using the pointer here.
>
>> So this is not a compatible type right (so the ARE incompatible)? You
>> could cast from mchunk * TO char * and dereference it according to the
>> standard, but not the other way around.
>> Also if you look at the document I linked in my initial mail
>>>> As an example, see 4.2.1 (p. 63) in
>>>> https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf
>> Isn't this exactly what is done in mallocr.c ? And they state
>> specifically that this can only be done when strict aliasing is NOT
>> uphold. And that seems to be in accordance to the standard.
>> I get that this is how the space is managed internally, I also know a
>> lot of embedded applications and networking stacks do this casting
>> from a char * to a struct *, but these also had to disable strict
>> aliasing to avoid bugs.
>> So am I missing something? If I am talking nonsense or
>> misunderstanding something please let me know.
>> I know it works basically always, but isn't this technically undefined
>> behavior without -fno-strict-aliasing?
> I think you may be missing that the issues are when using casts to pointer types
> to access type-punned union members in a struct, or other objects, that are not
> compatible types.
>
> At the minimum at a low level, the objects should be in the same memory type or
> register set for the compiler to be able to consider them possibly aliased,
> although the spec is stricter, more general, and abstract, to apply on the
> abstract machine, for which the compiler is required to provide an
> implementation on a real machine, where the properties conform to the abstract
> model.
>
Thanks for the replies so far, sorry for being a nuisance, but I really
want to understand this fully.

So you state:
> Here the char * is converted to a mchunk * and that is okay, works, both will be
> checked for aliasing; the inverse conversion is also allowed; no object is
> accessed using the pointer here.

So I agree that this is not the undefined point, as nothing is
dereferenced. But let me compare the example from the document, which
they state is not in accordance with the aliasing rules, with a
trimmed-down version of the part in mallocr.c which I think is the
same. Could you point me to the difference? (Or argue against the
statement in the document.)

Simplified mallocr.c example:
char *brk;
brk = (char*)(sbrk(sbrk_size)); // system sbrk (aligned)
top = (mchunkptr)brk;
top->size = top_size;  // access the struct, violation?

Document example (see first email, 4.2.1 (p. 63))
unsigned char c[sizeof(float)];  // (aligned)
float *fp = (float *)c; // example uses float, but should hold for other types
*fp=1.0; // access, violation "DEFACTO: defined behaviour iff -no-strict-aliasing"

Also quoting Joseph Myers regarding using unsigned char arrays to hold
values of other types:
" No, this is not safe (if it's visible to the compiler that the
memory in question has unsigned char as its declared type)."
http://www.cl.cam.ac.uk/~pes20/cerberus/notes50-survey-discussion.html [11/15]
https://gcc.gnu.org/ml/gcc/2015-04/msg00325.html

As sbrk returns a pointer to a char array and the compiler can see
this, shouldn't it cause the same issue?

Thanks for bearing with me
** trying to figure out how to correctly reply to the mailinglist **


* Re: Strict aliasing and malloc
  2018-07-04  9:24 ` Brian Inglis
@ 2018-07-04  9:34   ` narwhal x
  2018-07-04 12:20   ` Narwhal
  1 sibling, 0 replies; 8+ messages in thread
From: narwhal x @ 2018-07-04  9:34 UTC (permalink / raw)
  To: Brian.Inglis; +Cc: newlib

On Wed, Jul 4, 2018 at 8:15 AM, Brian Inglis
<Brian.Inglis@systematicsw.ab.ca> wrote:
> On 2018-07-03 12:37, narwhal x wrote:
>> On Tue, Jul 3, 2018 at 8:07 PM, Brian Inglis
>> <Brian.Inglis@systematicsw.ab.ca> wrote:
>>> On 2018-07-03 09:16, narwhal x wrote:
>>>> I have a question regarding newlib and the -fstrict-aliasing implied
>>>> by turning on O2.
>>>>
>>>> The strict aliasing implied by the ISO standard and enabled in gcc
>>>> with O2 (This might be specific to gcc, but could be the case with any
>>>> compiler with aliasing optimizations) makes it so you can only cast a
>>>> pointer to a compatible type, and a special case is malloc, which
>>>> should return an "undeclared type" *.
>>>>
>>>> I however did not find the -fno-strict-aliasing flag in any
>>>> configuration or makefile (If I just overlooked it, and the flag is
>>>> mandatory that would answer my question)
>>>>
>>>> My question:
>>>> In newlib/libc/stdlib/mallocr.c on line 2212 you have the statement:
>>>> "top = (mchunkptr)brk;"
>>>> Here top is of type "mchunkptr" and brk is a "char *". The standard
>>>> says that you can not just alias a incompatible type and dereference
>>>> it (unless it's a malloc'ed variable, as it would change it's type
>>>> when written to, but how do you inform the compiler?)
>>>>
>>>> As an example, see 4.2.1 (p. 63) in
>>>> https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf
>>>>
>>>> So is this allowed? Or am I missing something.
>>>>
>>>> * After asking in the gcc IRC, they mentioned that the way they go
>>>> about having the special case for malloc is making sure the libc
>>>> library is linked from a library and no LTO is performed.
>>>>
>>>> My main reason for asking is just wanting to know how a malloc
>>>> implementation should deal with these restrictions stated by the ISO C
>>>> standard, and improve my understanding of the (sometimes confusing)
>>>> aliasing rules.
>>>
>>> Pointer types char * and void * can be converted to other data pointer types,
>>> and character types can alias other types, but you should not alias objects via
>>> casts or conversions of pointers to objects stored as incompatible types,
>>> because optimization could eliminate the stores, so the underlying storage of
>>> the object of incompatible type may not be updated, and the compiler would not
>>> know that because the type is different, as the compiler does not track possible
>>> aliasing of incompatible types. Roughly IMHO HTH YMMV ;^>
>>>
>>> Implementations of malloc use char * internally and convert those to char ** and
>>> int * to maintain their internal housekeeping data at the start of the block,
>>> often using unions, returning a pointer to universally aligned storage following
>>> that block prefix, often resulting in malloc overhead of one or more universally
>>> aligned blocks per allocation; reducing space overhead takes more work: see e.g.
>>> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/stdlib/mallocr.c;h=ecc445f3d36365a4840e31c737db5018ddba42e9;hb=8e732f7f7f684f22b283f39a5d407375b3b0b3af
>>
>> Forgive me if I misunderstand, but doesn't your recap regarding strict
>> aliasing agree with my understanding that this is an aliasing
>> violation?
>> Because you mention (correctly I think)
>>> ... you should not alias objects via
>>> casts or conversions of pointers to objects stored as incompatible types ...
>>
>> And in the case I mentioned (one of many) in mallocr.c on line 2212
>> you have the statement:
>>>> "top = (mchunkptr)brk;"
>>
>> Here brk is declared as: char *brk and is returned by sbrk (in my
>> case) which takes memory from the heap declared somewhere in a
>> linkerscript (or similar) AFAIK. But top is a mchunk *, which is a
>> struct.
>
> Here the char * is converted to a mchunk * and that is okay, works, both will be
> checked for aliasing; the inverse conversion is also allowed; no object is
> accessed using the pointer here.
>
>> So this is not a compatible type right (so they ARE incompatible)? You
>> could cast from mchunk * TO char * and dereference it according to the
>> standard, but not the other way around.
>> Also if you look at the document I linked in my initial mail
>>>> As an example, see 4.2.1 (p. 63) in
>>>> https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf
>>
>> Isn't this exactly what is done in mallocr.c ? And they state
>> specifically that this can only be done when strict aliasing is NOT
>> upheld. And that seems to be in accordance with the standard.
>> I get that this is how the space is managed internally, I also know a
>> lot of embedded applications and networking stacks do this casting
>> from a char * to a struct *, but these also had to disable strict
>> aliasing to avoid bugs.
>> So am I missing something? If I am talking nonsense or
>> misunderstanding something please let me know.
>> I know it works basically always, but isn't this technically undefined
>> behavior without -fno-strict-aliasing?
>
> I think you may be missing that the issues are when using casts to pointer types
> to access type-punned union members in a struct, or other objects, that are not
> compatible types.
>
> At the minimum at a low level, the objects should be in the same memory type or
> register set for the compiler to be able to consider them possibly aliased,
> although the spec is stricter, more general, and abstract, to apply on the
> abstract machine, for which the compiler is required to provide an
> implementation on a real machine, where the properties conform to the abstract
> model.
>
> --
> Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada


Thanks for the replies so far, sorry for being a nuisance, but I really
want to understand this fully.

So you state:
> Here the char * is converted to a mchunk * and that is okay, works, both will be
> checked for aliasing; the inverse conversion is also allowed; no object is
> accessed using the pointer here.

So I agree that this is not the undefined point, as nothing is
dereferenced. But let me give the comparison between the example from
the document, which they state is not in accordance with the aliasing
rules, and a trimmed-down version of the part in mallocr.c which I
think is the same. Could you point me to the difference? (Or argue
against the statement in the document)

Simplified mallocr.c example:
char *brk;
brk = (char*)(sbrk(sbrk_size)); // system sbrk (aligned)
top = (mchunkptr)brk;
top->size = top_size;  // access the struct, violation?

Document example (see first email, 4.2.1 (p. 63))
unsigned char c[sizeof(float)];  // (aligned)
float *fp = (float *)c; // example uses float, but should hold for other types
*fp=1.0; // access, violation:
         // "DEFACTO: defined behaviour iff -no-strict-aliasing"
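
For comparison, a minimal sketch of a variant that I believe is
well-defined regardless of the aliasing setting: copy the bytes in and
out with memcpy instead of writing through a float *. The helper names
(store_float, load_float, roundtrip) are made up for illustration and
come neither from the document nor from newlib.

```c
#include <string.h>

/* Hypothetical helper: place a float's bytes into a char buffer.
   The buffer is only ever accessed as bytes, never as a float lvalue. */
void store_float(unsigned char *buf, float value)
{
    memcpy(buf, &value, sizeof value);
}

/* Hypothetical helper: read the bytes back into a real float object. */
float load_float(const unsigned char *buf)
{
    float value;
    memcpy(&value, buf, sizeof value);
    return value;
}

/* Store and reload through an unsigned char array, as in the document
   example, but without the (float *) cast. */
float roundtrip(float value)
{
    unsigned char c[sizeof(float)];
    store_float(c, value);
    return load_float(c);
}
```

Calling roundtrip(1.0f) should hand back the value that was stored,
since the bytes are copied unchanged in both directions.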

Also quoting Joseph Myers regarding using unsigned char arrays to hold
values of other types:
" No, this is not safe (if it's visible to the compiler that the
memory in question has unsigned char as its declared type)."
http://www.cl.cam.ac.uk/~pes20/cerberus/notes50-survey-discussion.html [11/15]
https://gcc.gnu.org/ml/gcc/2015-04/msg00325.html

As sbrk returns a pointer to a char array and the compiler can see
this, shouldn't it cause the same issue?
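
For contrast, as far as I understand the standard (C11 6.5p6), storage
obtained from malloc has no declared type, so a store through a struct
pointer gives it that effective type and a later read through the same
type is fine. A minimal sketch of that case; struct header and
write_header are hypothetical names, not newlib code:

```c
#include <stdlib.h>

/* Hypothetical bookkeeping struct, standing in for an allocator header. */
struct header {
    size_t size;
};

/* Write and read a header in malloc'ed storage. The storage has no
   declared type, so the store below sets its effective type. */
size_t write_header(size_t n)
{
    void *raw = malloc(sizeof(struct header));
    if (raw == NULL)
        return 0;

    struct header *h = raw;   /* conversion only, no access yet */
    h->size = n;              /* store: effective type becomes struct header */
    size_t out = h->size;     /* read through the same type */

    free(raw);
    return out;
}
```

What I do not see is how storage that comes from sbrk gets the same
treatment inside the malloc implementation itself.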

Thanks for bearing with me
** trying to figure out how to correctly reply to the mailinglist **


* Re: Strict aliasing and malloc
  2018-07-04  6:15 narwhal x
@ 2018-07-04  9:24 ` Brian Inglis
  2018-07-04  9:34   ` narwhal x
  2018-07-04 12:20   ` Narwhal
  0 siblings, 2 replies; 8+ messages in thread
From: Brian Inglis @ 2018-07-04  9:24 UTC (permalink / raw)
  To: newlib

On 2018-07-03 12:37, narwhal x wrote:
> On Tue, Jul 3, 2018 at 8:07 PM, Brian Inglis
> <Brian.Inglis@systematicsw.ab.ca> wrote:
>> On 2018-07-03 09:16, narwhal x wrote:
>>> I have a question regarding newlib and the -fstrict-aliasing implied
>>> by turning on O2.
>>>
>>> The strict aliasing implied by the ISO standard and enabled in gcc
>>> with O2 (This might be specific to gcc, but could be the case with any
>>> compiler with aliasing optimizations) makes it so you can only cast a
>>> pointer to a compatible type, and a special case is malloc, which
>>> should return an "undeclared type" *.
>>>
>>> I however did not find the -fno-strict-aliasing flag in any
>>> configuration or makefile (If I just overlooked it, and the flag is
>>> mandatory that would answer my question)
>>>
>>> My question:
>>> In newlib/libc/stdlib/mallocr.c on line 2212 you have the statement:
>>> "top = (mchunkptr)brk;"
>>> Here top is of type "mchunkptr" and brk is a "char *". The standard
>>> says that you can not just alias a incompatible type and dereference
>>> it (unless it's a malloc'ed variable, as it would change it's type
>>> when written to, but how do you inform the compiler?)
>>>
>>> As an example, see 4.2.1 (p. 63) in
>>> https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf
>>>
>>> So is this allowed? Or am I missing something.
>>>
>>> * After asking in the gcc IRC, they mentioned that the way they go
>>> about having the special case for malloc is making sure the libc
>>> library is linked from a library and no LTO is performed.
>>>
>>> My main reason for asking is just wanting to know how a malloc
>>> implementation should deal with these restrictions stated by the ISO C
>>> standard, and improve my understanding of the (sometimes confusing)
>>> aliasing rules.
>>
>> Pointer types char * and void * can be converted to other data pointer types,
>> and character types can alias other types, but you should not alias objects via
>> casts or conversions of pointers to objects stored as incompatible types,
>> because optimization could eliminate the stores, so the underlying storage of
>> the object of incompatible type may not be updated, and the compiler would not
>> know that because the type is different, as the compiler does not track possible
>> aliasing of incompatible types. Roughly IMHO HTH YMMV ;^>
>>
>> Implementations of malloc use char * internally and convert those to char ** and
>> int * to maintain their internal housekeeping data at the start of the block,
>> often using unions, returning a pointer to universally aligned storage following
>> that block prefix, often resulting in malloc overhead of one or more universally
>> aligned blocks per allocation; reducing space overhead takes more work: see e.g.
>> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/stdlib/mallocr.c;h=ecc445f3d36365a4840e31c737db5018ddba42e9;hb=8e732f7f7f684f22b283f39a5d407375b3b0b3af
> 
> Forgive me if I misunderstand, but doesn't your recap regarding strict
> aliasing agree with my understanding that this is an aliasing
> violation?
> Because you mention (correctly I think)
>> ... you should not alias objects via
>> casts or conversions of pointers to objects stored as incompatible types ...
> 
> And in the case I mentioned (one of many) in mallocr.c on line 2212
> you have the statement:
>>> "top = (mchunkptr)brk;"
> 
> Here brk is declared as: char *brk and is returned by sbrk (in my
> case) which takes memory from the heap declared somewhere in a
> linkerscript (or similar) AFAIK. But top is a mchunk *, which is a
> struct.

Here the char * is converted to a mchunk * and that is okay, works, both will be
checked for aliasing; the inverse conversion is also allowed; no object is
accessed using the pointer here.
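
A minimal sketch of that distinction, assuming a hypothetical struct
chunk as a stand-in for the real chunk type and malloc-provided storage
so that the alignment is suitable; only the pointer value is converted,
and nothing is read or written through the struct pointer:

```c
#include <stdlib.h>

/* Hypothetical stand-in for the allocator's chunk header. */
struct chunk {
    size_t prev_size;
    size_t size;
};

/* Convert char * to struct chunk * and back. No object is accessed
   through the struct pointer, so no aliasing question arises here. */
char *convert_roundtrip(char *brk)
{
    struct chunk *top = (struct chunk *)brk;   /* conversion only */
    return (char *)top;                        /* inverse conversion */
}

/* Returns 1 if the round trip yields the original address. */
int conversion_preserves_address(void)
{
    char *brk = malloc(sizeof(struct chunk));  /* suitably aligned storage */
    if (brk == NULL)
        return 0;

    int same = (convert_roundtrip(brk) == brk);
    free(brk);
    return same;
}
```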

> So this is not a compatible type right (so they ARE incompatible)? You
> could cast from mchunk * TO char * and dereference it according to the
> standard, but not the other way around.
> Also if you look at the document I linked in my initial mail
>>> As an example, see 4.2.1 (p. 63) in
>>> https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf
> 
> Isn't this exactly what is done in mallocr.c ? And they state
> specifically that this can only be done when strict aliasing is NOT
> upheld. And that seems to be in accordance with the standard.
> I get that this is how the space is managed internally, I also know a
> lot of embedded applications and networking stacks do this casting
> from a char * to a struct *, but these also had to disable strict
> aliasing to avoid bugs.
> So am I missing something? If I am talking nonsense or
> misunderstanding something please let me know.
> I know it works basically always, but isn't this technically undefined
> behavior without -fno-strict-aliasing?

I think you may be missing that the issues are when using casts to pointer types
to access type-punned union members in a struct, or other objects, that are not
compatible types.

At the minimum at a low level, the objects should be in the same memory type or
register set for the compiler to be able to consider them possibly aliased,
although the spec is stricter, more general, and abstract, to apply on the
abstract machine, for which the compiler is required to provide an
implementation on a real machine, where the properties conform to the abstract
model.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada


* Re: Strict aliasing and malloc
@ 2018-07-04  6:15 narwhal x
  2018-07-04  9:24 ` Brian Inglis
  0 siblings, 1 reply; 8+ messages in thread
From: narwhal x @ 2018-07-04  6:15 UTC (permalink / raw)
  To: newlib

On Tue, Jul 3, 2018 at 8:07 PM, Brian Inglis
<Brian.Inglis@systematicsw.ab.ca> wrote:
> On 2018-07-03 09:16, narwhal x wrote:
>> Hello,
>>
>> I have a question regarding newlib and the -fstrict-aliasing implied
>> by turning on O2.
>>
>> The strict aliasing implied by the ISO standard and enabled in gcc
>> with O2 (This might be specific to gcc, but could be the case with any
>> compiler with aliasing optimizations) makes it so you can only cast a
>> pointer to a compatible type, and a special case is malloc, which
>> should return an "undeclared type" *.
>>
>> I however did not find the -fno-strict-aliasing flag in any
>> configuration or makefile (If I just overlooked it, and the flag is
>> mandatory that would answer my question)
>>
>> My question:
>> In newlib/libc/stdlib/mallocr.c on line 2212 you have the statement:
>> "top = (mchunkptr)brk;"
>> Here top is of type "mchunkptr" and brk is a "char *". The standard
>> says that you can not just alias a incompatible type and dereference
>> it (unless it's a malloc'ed variable, as it would change it's type
>> when written to, but how do you inform the compiler?)
>>
>> As an example, see 4.2.1 (p. 63) in
>> https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf
>>
>> So is this allowed? Or am I missing something.
>>
>> * After asking in the gcc IRC, they mentioned that the way they go
>> about having the special case for malloc is making sure the libc
>> library is linked from a library and no LTO is performed.
>>
>>
>> My main reason for asking is just wanting to know how a malloc
>> implementation should deal with these restrictions stated by the ISO C
>> standard, and improve my understanding of the (sometimes confusing)
>> aliasing rules.
>
> Pointer types char * and void * can be converted to other data pointer types,
> and character types can alias other types, but you should not alias objects via
> casts or conversions of pointers to objects stored as incompatible types,
> because optimization could eliminate the stores, so the underlying storage of
> the object of incompatible type may not be updated, and the compiler would not
> know that because the type is different, as the compiler does not track possible
> aliasing of incompatible types. Roughly IMHO HTH YMMV ;^>
>
> Implementations of malloc use char * internally and convert those to char ** and
> int * to maintain their internal housekeeping data at the start of the block,
> often using unions, returning a pointer to universally aligned storage following
> that block prefix, often resulting in malloc overhead of one or more universally
> aligned blocks per allocation; reducing space overhead takes more work: see e.g.
> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/stdlib/mallocr.c;h=ecc445f3d36365a4840e31c737db5018ddba42e9;hb=8e732f7f7f684f22b283f39a5d407375b3b0b3af
> --
> Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

Forgive me if I misunderstand, but doesn't your recap regarding strict
aliasing agree with my understanding that this is an aliasing
violation?
Because you mention (correctly I think)
> ... you should not alias objects via
> casts or conversions of pointers to objects stored as incompatible types ...

And in the case I mentioned (one of many) in mallocr.c on line 2212
you have the statement:
>> "top = (mchunkptr)brk;"

Here brk is declared as: char *brk and is returned by sbrk (in my
case) which takes memory from the heap declared somewhere in a
linkerscript (or similar) AFAIK. But top is a mchunk *, which is a
struct.
So this is not a compatible type right (so they ARE incompatible)? You
could cast from mchunk * TO char * and dereference it according to the
standard, but not the other way around.
Also if you look at the document I linked in my initial mail
>> As an example, see 4.2.1 (p. 63) in
>> https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf

Isn't this exactly what is done in mallocr.c ? And they state
specifically that this can only be done when strict aliasing is NOT
upheld. And that seems to be in accordance with the standard.
I get that this is how the space is managed internally, I also know a
lot of embedded applications and networking stacks do this casting
from a char * to a struct *, but these also had to disable strict
aliasing to avoid bugs.
So am I missing something? If I am talking nonsense or
misunderstanding something please let me know.
I know it works basically always, but isn't this technically undefined
behavior without -fno-strict-aliasing?
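
To illustrate the direction I believe is allowed (struct pointer to
character pointer, then dereference), here is a minimal sketch. The
struct chunk layout and the byte_sum name are hypothetical, chosen only
for the example:

```c
#include <stddef.h>

/* Hypothetical stand-in for the allocator's chunk header. */
struct chunk {
    size_t prev_size;
    size_t size;
};

/* Read every byte of a struct through unsigned char *. Character types
   may alias any object, so these reads do not break the aliasing rules. */
unsigned long byte_sum(const struct chunk *c)
{
    const unsigned char *p = (const unsigned char *)c;
    unsigned long sum = 0;

    for (size_t i = 0; i < sizeof *c; i++)
        sum += p[i];
    return sum;
}
```

The reverse (taking a declared char array and accessing it through a
struct chunk *) is the case I am asking about.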

Thanks


end of thread

Thread overview: 8+ messages
2018-07-03 18:07 Strict aliasing and malloc narwhal x
2018-07-03 18:26 ` Brian Inglis
2018-07-04  6:15 narwhal x
2018-07-04  9:24 ` Brian Inglis
2018-07-04  9:34   ` narwhal x
2018-07-04 12:20   ` Narwhal
2018-07-04 15:34     ` Richard Damon
2018-07-04 17:09     ` Brian Inglis
