public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: Alejandro Colomar <alx.manpages@gmail.com>
To: Zack Weinberg <zack@owlfolio.org>,
	GNU libc development <libc-alpha@sourceware.org>,
	gcc@gcc.gnu.org
Cc: "Alejandro Colomar" <alx@kernel.org>,
	'linux-man' <linux-man@vger.kernel.org>,
	"Bastien Roucariès" <rouca@debian.org>,
	"Eric Blake" <eblake@redhat.com>,
	"Stefan Puiu" <stefan.puiu@gmail.com>,
	"Igor Sysoev" <igor@sysoev.ru>
Subject: Re: [PATCH v2] socket: Implement sockaddr_storage with an anonymous union
Date: Fri, 20 Jan 2023 20:25:12 +0100	[thread overview]
Message-ID: <0e6c7856-549b-5014-fb37-bc5925660ffe@gmail.com> (raw)
In-Reply-To: <d77b529d-e54d-4919-87a4-d90fd816ba8b@app.fastmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 3975 bytes --]

[CC += GCC]  // pun not intended :P

Hi Zack,

On 1/20/23 19:04, Zack Weinberg wrote:
> On Fri, Jan 20, 2023, at 8:40 AM, Alejandro Colomar wrote:
>> The historical design of `sockaddr_storage` makes it impossible to use
>> without breaking strict aliasing rules.  Not only this type is unusable,
>> but even the use of other `sockaddr_*` structures directly (when the
>> programmer only cares about a single address family) is also incorrect,
>> since at some point the structure will be accessed as a `sockaddr`, and
>> that breaks strict aliasing rules too.
>>
>> So, the only way for a programmer to not invoke Undefined Behavior is to
>> declare a union that includes `sockaddr` and any `sockaddr_*` structures
>> that are of interest, which allows later accessing as either the correct
>> structure or plain `sockaddr` for the sa_family.
> 
> ...
> 
>>      struct new_sockaddr_storage  nss;
>>
>>      // ... (initialize oss and nss)
>>
>>      inet_sockaddr2str(&nss.sa);  // correct (and has no casts)
> 
> I think we need to move slowly here and be _sure_ that any proposed change
> does in fact reduce the amount of UB.

Sure, I'm just sending the patch to polish the idea around some concrete code. 
While I checked that it compiles, I didn't add any tests about it or anything, 
to see that it's usable (and Joseph's email showed me that it's far from being 
finished).  I expect that this'll take some time.


>  This construct, in particular, might
> not actually be correct in practice: see https://godbolt.org/z/rn51cracn for
> a case where, if I'm reading it right, the compiler assumes that a write
> through a `struct fancy *` cannot alter the values accessible through a
> `struct simple *` even though both pointers point into the same union.
> (Test case provided by <https://stackoverflow.com/users/363751/supercat>;

I @supercat around?  Maybe you want to open a question on SO (or Codidact) and 
we can discuss it there; it would be interesting.  :)

About the program, I have doubts.  It's more or less what I asked on Codidact 
yesterday[1].  I can't find anything in the standard to support GCC's behavior, 
and am inclined to think that it's a compiler bug.  @Lundin's answer[2] seems 
reasonable.  But still, my understanding until yesterday seemed to be in line 
with the compiler behavior that you showed: the compiler sees a pointer to a 
different type, and assumes things.

I believe it's fine according to the common initial sequence rule in 
C11::6.5.2.3/6 <https://port70.net/%7Ensz/c/c11/n1570.html#6.5.2.3p6>
in combination with the pointer conversion rule in C11::6.3.2.3/7 
<https://port70.net/%7Ensz/c/c11/n1570.html#6.3.2.3p7>.

The test you showed in godbolt shows two behaviors: sometimes it prints 1/3 
(this is correct; it happens for high values of -O) and sometimes it prints 3/3 
(this is invalid; it happens for low values of -O).  BTW, I'm a bit surprised 
that GCC optimizes out the local in -O0 or even -Og.

What I guess that GCC is doing in the invalid case is that in main() it sees 
that myThing is modified in vtest(), and then the structure is only used through 
a pointer to a different member, which according to 6.5.2.3/6 is only allowed 
for reading but not writing, so it assumes that loading the structure again at 
printf() will be fine, so it optimizes out the local.  However, GCC didn't take 
into account that the pointer can later be casted back to the right member, and 
that way it's allowed to modify it, according to 6.3.2.3/7.

I had dropped GCC from the CC in the email to not spam them too much, but since 
this concerns them, I'll keep them in the loop from now on.



[1]:  <https://software.codidact.com/posts/287748>

[2]:  <https://software.codidact.com/posts/287748/287750#answer-287750>

> I don't know any other identifier for them.)
> 
> zw

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

       reply	other threads:[~2023-01-20 19:25 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20230120134043.10247-1-alx@kernel.org>
     [not found] ` <d77b529d-e54d-4919-87a4-d90fd816ba8b@app.fastmail.com>
2023-01-20 19:25   ` Alejandro Colomar [this message]
2023-01-21  2:38     ` Alejandro Colomar
2023-01-21  3:17       ` Alejandro Colomar
2023-01-21 13:30         ` Bastien Roucariès
2023-01-21 14:30           ` Alejandro Colomar
2023-01-22 14:12             ` Bastien Roucariès
     [not found] ` <5187043.CeDsVVrsAm@portable-bastien>
2023-01-20 20:38   ` Alejandro Colomar
2023-01-20 20:46     ` Bastien Roucariès
2023-01-20 20:51       ` Alejandro Colomar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0e6c7856-549b-5014-fb37-bc5925660ffe@gmail.com \
    --to=alx.manpages@gmail.com \
    --cc=alx@kernel.org \
    --cc=eblake@redhat.com \
    --cc=gcc@gcc.gnu.org \
    --cc=igor@sysoev.ru \
    --cc=libc-alpha@sourceware.org \
    --cc=linux-man@vger.kernel.org \
    --cc=rouca@debian.org \
    --cc=stefan.puiu@gmail.com \
    --cc=zack@owlfolio.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).