From: Alejandro Colomar <alx.manpages@gmail.com>
To: Stefan Puiu <stefan.puiu@gmail.com>
Cc: GNU C Library <libc-alpha@sourceware.org>,
linux-man <linux-man@vger.kernel.org>,
gcc@gcc.gnu.org, Igor Sysoev <igor@sysoev.ru>
Subject: Re: struct sockaddr_storage
Date: Fri, 20 Jan 2023 13:39:51 +0100
Message-ID: <61bbb556-ff9b-ebdc-5566-bc1ae533c0aa@gmail.com>
In-Reply-To: <CACKs7VAXOXLw5Zm0wqVt8dDwam_=w8aeAu5wNpXcTRSqObimyQ@mail.gmail.com>
Hi Stefan,
On 1/20/23 11:06, Stefan Puiu wrote:
> Hi Alex,
>
> On Thu, Jan 19, 2023 at 4:14 PM Alejandro Colomar
> <alx.manpages@gmail.com> wrote:
>>
>> Hi!
>>
>> I just received a report about struct sockaddr_storage in the man pages. It
>> reminded me of some concern I've always had about it: it doesn't seem to be a
>> usable type.
>>
>> It has some alignment promises that make it "just work" most of the time, but
>> it's still a UB mine, according to ISO C.
>>
>> According to strict aliasing rules, if you declare a variable of type 'struct
>> sockaddr_storage', that's what you get, and trying to access it later as some
>> other sockaddr_* is simply not legal.  The compiler may assume those accesses
>> can't happen, and optimize as it pleases.
>
> Can you detail the "is not legal" part?
I mean that it's Undefined Behavior contraband.
> How about the APIs like
> connect() etc that use pointers to struct sockaddr, where the
> underlying type is different, why would that be legal while using
> sockaddr_storage isn't?
That's also bad. However, it can be fixed by fixing `sockaddr_storage` and
telling everyone to use it instead of using whatever other `sockaddr_*`. You
need a union for the underlying storage, so that the library functions can
access both as `sockaddr` and as `sockaddr_*`.
The problem isn't really in the implementation of connect(2), but in the type.
The implementation of connect(2) would be fine if we just fixed the type.  See
this example:
struct my_sockaddr_storage {
    union {
        sa_family_t          ss_family;
        struct sockaddr      sa;
        struct sockaddr_in   sin;
        struct sockaddr_in6  sin6;
        struct sockaddr_un   sun;
    };
};
const char *inet_sockaddr2str(const struct sockaddr *sa);

void
foo(void)
{
    struct my_sockaddr_storage  mss;
    struct sockaddr_storage     ss;

    // initialize mss and ss

    inet_sockaddr2str(&mss.sa);  // correct
    inet_sockaddr2str((struct sockaddr *)&ss);  // UB
}
/* This function is correct, as long as the accessed object has the
 * type we're using.  That's only possible through a `union`, since
 * we're accessing it with 2 different types: `sockaddr` for the
 * `sa_family` and then the appropriate subtype for the address
 * itself.
 */
const char *
inet_sockaddr2str(const struct sockaddr *sa)
{
    /* const-qualified, since sa points to const data */
    const struct sockaddr_in   *sin;
    const struct sockaddr_in6  *sin6;

    static char  buf[INET_ADDRSTRLENMAX];

    switch (sa->sa_family) {
    case AF_INET:
        sin = (const struct sockaddr_in *) sa;
        inet_ntop(AF_INET, &sin->sin_addr, buf, NITEMS(buf));
        return buf;
    case AF_INET6:
        sin6 = (const struct sockaddr_in6 *) sa;
        inet_ntop(AF_INET6, &sin6->sin6_addr, buf, NITEMS(buf));
        return buf;
    default:
        errno = EAFNOSUPPORT;
        return NULL;
    }
}
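
For reference, here is a minimal self-contained sketch of the same idea that
should compile on a POSIX system.  The names `my_inet_sockaddr`,
`my_inet_sockaddr2str`, and `my_inet_loopback4` are hypothetical (they exist in
no libc), and the caller-supplied buffer is a choice made for this sketch, not
something taken from the example above.  Every member is accessed through the
union itself, so no pointer cast is needed:

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical union: every view we intend to use is a member. */
union my_inet_sockaddr {
    struct sockaddr      sa;
    struct sockaddr_in   sin;
    struct sockaddr_in6  sin6;
};

/* Formats the address into buf; returns buf, or NULL with errno set. */
const char *
my_inet_sockaddr2str(const union my_inet_sockaddr *u, char *buf,
                     socklen_t size)
{
    switch (u->sa.sa_family) {
    case AF_INET:
        return inet_ntop(AF_INET, &u->sin.sin_addr, buf, size);
    case AF_INET6:
        return inet_ntop(AF_INET6, &u->sin6.sin6_addr, buf, size);
    default:
        errno = EAFNOSUPPORT;
        return NULL;
    }
}

/* Fills u with the IPv4 loopback address and the given port. */
void
my_inet_loopback4(union my_inet_sockaddr *u, in_port_t port)
{
    memset(u, 0, sizeof(*u));
    u->sin.sin_family = AF_INET;
    u->sin.sin_port = htons(port);
    u->sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
}
```

A caller would declare a `union my_inet_sockaddr`, fill it in, and pass a
buffer of at least INET6_ADDRSTRLEN bytes to `my_inet_sockaddr2str()`.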
BTW, you need a union _even if_ you only care about a single address family.
That is, if you only care about Unix sockets, you can't declare your variable of
type sockaddr_un, because the libc functions and syscalls still need to access
it as a sockaddr to see which family it has.
> Will code break in practice?
Well, it depends on how much compilers advance.  Here's an interesting experiment:
<https://software.codidact.com/posts/287748/287750#answer-287750>
I wouldn't rely on Undefined Behavior not causing nasal demons. When you get
them, you can only kill them with garlic.
>
>>
>> That means that one needs to declare a union with all possible sockaddr_* types
>> that are of interest, so that access as any of them is later allowed by the
>> compiler (of course, the user still needs to access the correct one, but that's
>> of course).
>>
>> In that union, one could add a member that is of type sockaddr_storage for
>> getting a more consistent structure size (for example, if some members are
>> conditional on preprocessor stuff), but I don't see much value in that.
>> Especially, given this comment that Igor Sysoev wrote in NGINX Unit's source code:
>>
>> * struct sockaddr_storage is:
>> * 128 bytes on Linux, FreeBSD, MacOSX, NetBSD;
>> * 256 bytes on Solaris, OpenBSD, and HP-UX;
>> * 1288 bytes on AIX.
>> *
>> * struct sockaddr_storage is too large on some platforms
>> * or less than real maximum struct sockaddr_un length.
>>
>> Which makes it even more useless as a type.
>
> I'm not sure using struct sockaddr_storage for storing sockaddr_un's
> (UNIX domain socket addresses, right?) is that common a usage. I've
> used it in the past to store either a sockaddr_in or a sockaddr_in6,
> and I think that would be a more common scenario. The comment above
> probably makes sense for nginx, but different projects have different
> needs.
>
> As for the size, I guess it might matter if you want to port your code
> to AIX, Solaris, OpenBSD etc. I don't think all software is meant to
> be portable, though (or portable to those platforms). Maybe a warning
> is in order that, for portable code, developers should check its size
> on the other platforms targeted.
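
A size check along those lines can be done at compile time.  This is only a
sketch, assuming a C11 compiler and POSIX headers; which address types to
check is an assumption here and depends on what the program actually stores:

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fail the build if sockaddr_storage cannot hold a type we store in it. */
static_assert(sizeof(struct sockaddr_storage) >= sizeof(struct sockaddr_in),
              "sockaddr_storage too small for sockaddr_in");
static_assert(sizeof(struct sockaddr_storage) >= sizeof(struct sockaddr_in6),
              "sockaddr_storage too small for sockaddr_in6");
static_assert(sizeof(struct sockaddr_storage) >= sizeof(struct sockaddr_un),
              "sockaddr_storage too small for sockaddr_un");
```

This only catches the size problem on the platform being compiled; it does
nothing about the aliasing problem.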
The size thing is just an added problem. The deep problem is that you need to
use a union that contains all types that you care about _plus_ plain sockaddr,
because the structure will be accessed at least as a sockaddr, plus one of the
different specialized structures. So even for only sockaddr_un, you need at
least the following:
union my_unix_sockaddr {
    struct sockaddr     sa;
    struct sockaddr_un  sun;
};
Not doing that will necessarily result in invoking Undefined Behavior at some point.
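
As a rough sketch of how such a union might be filled in (the helper
`my_unix_sockaddr_init` is a hypothetical name, and the path-length policy is
an assumption of this sketch):

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

union my_unix_sockaddr {
    struct sockaddr     sa;
    struct sockaddr_un  sun;
};

/* Fills u with a pathname address.
 * Returns 0 on success, or -1 if path (plus its NUL) does not fit.
 */
int
my_unix_sockaddr_init(union my_unix_sockaddr *u, const char *path)
{
    size_t  len = strlen(path);

    if (len >= sizeof(u->sun.sun_path))
        return -1;

    memset(u, 0, sizeof(*u));
    u->sun.sun_family = AF_UNIX;
    memcpy(u->sun.sun_path, path, len + 1);  /* copies the NUL too */
    return 0;
}
```

The address can then be passed on without a cast, for example as
`bind(sfd, &u.sa, sizeof(u.sun))`, where `sfd` stands for an already created
AF_UNIX socket.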
>
> Just my 2 cents, as always,
> Stefan.
The good thing is that fixing sockaddr_storage and telling everybody to use it
always fixes the problem, so I'm preparing a patch for glibc.
Cheers,
Alex
>
>>
>>
>> Should we warn about uses of this type? Should we recommend against using it in
>> the manual page, since there's no legitimate uses of it?
>>
>> Cheers,
>>
>> Alex
>>
>> --
>> <http://www.alejandro-colomar.es/>
--
<http://www.alejandro-colomar.es/>