From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 6 Feb 2023 06:20:20 -0500
From: Rich Felker
To: Xi Ruoyao
Cc: Alejandro Colomar, linux-man@vger.kernel.org, Alejandro Colomar,
	GCC, glibc, Bastien Roucariès, Stefan Puiu, Igor Sysoev,
	Andrew Clayton, Richard Biener, Zack Weinberg, Florian Weimer,
	Joseph Myers, Jakub Jelinek, Eric Blake
Subject: Re: [PATCH] sockaddr.3type: BUGS: Document that libc should be fixed using a union
Message-ID: <20230206112019.GH3298@brightrain.aerifal.cx>
References: <20230205152835.17413-1-alx@kernel.org> <0a9306fa37edeb4a989b2929de67fee8606a3d8a.camel@xry111.site>
In-Reply-To: <0a9306fa37edeb4a989b2929de67fee8606a3d8a.camel@xry111.site>

On Mon, Feb 06, 2023 at 02:02:23PM +0800, Xi Ruoyao wrote:
> On Sun, 2023-02-05 at 16:31 +0100, Alejandro Colomar via Libc-alpha wrote:
>
> > The only correct way to use different types in an API is
> > through a union.
>
> I don't think this statement is true (in general).  Technically we can
> write something like this:
>
> struct sockaddr { ... };
> struct sockaddr_in { ... };
> struct sockaddr_in6 { ... };
>
> int bind(int fd, const struct sockaddr *addr, socklen_t addrlen)
> {
>     if (addrlen < sizeof(struct sockaddr)) {
>         errno = EINVAL;
>         return -1;
>     }
>
>     /* cannot use "addr->sa_family" directly: that would be UB */
>     sa_family_t sa_family;
>     memcpy(&sa_family, addr, sizeof(sa_family));
>
>     switch (sa_family) {
>     case AF_INET:
>         return _do_bind_in(fd, (struct sockaddr_in *)addr, addrlen);
>     case AF_INET6:
>         return _do_bind_in6(fd, (struct sockaddr_in6 *)addr, addrlen);
>     /* more cases follow here */
>     default:
>         errno = EINVAL;
>         return -1;
>     }
> }
>
> In this way we can use sockaddr_{in,in6,...} for bind() safely, as long
> as we can distinguish the "real" type of addr using the leading byte
> sequence (and the caller uses it carefully).
>
> But obviously sockaddr_storage can't be distinguished here, so casting a
> struct sockaddr_storage * to struct sockaddr * and passing it to bind()
> will still be wrong (unless we make sockaddr_storage a union or add
> [[gnu::may_alias]]).

If you wanted to make this work, you could just memcpy the
sockaddr_storage into a local object of the right declared type and
access that instead. But this is only relevant for a userspace
implementation of bind(), not for one that just marshals the address
across some syscall boundary to a kernel.

Rich
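A minimal sketch of the memcpy approach mentioned above, for a userspace consumer of the address. The helper name addr_port is hypothetical and not part of any libc; it takes a const void * so that a caller can pass a struct sockaddr_storage * without a cast. The bytes are copied into local objects of the right declared type, and the passed pointer is never dereferenced through a struct type.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: return the port (host byte order) stored in a
 * socket address, or -1 with errno set. */
static int addr_port(const void *addr, socklen_t addrlen)
{
    struct sockaddr sa;

    assert(addr != NULL);
    if ((size_t)addrlen < sizeof(sa)) {
        errno = EINVAL;
        return -1;
    }

    /* Copy the generic header into a local object, then read the
     * family from that object rather than from *addr. */
    memcpy(&sa, addr, sizeof(sa));

    switch (sa.sa_family) {
    case AF_INET: {
        struct sockaddr_in in;
        if ((size_t)addrlen < sizeof(in)) {
            errno = EINVAL;
            return -1;
        }
        memcpy(&in, addr, sizeof(in));
        return ntohs(in.sin_port);
    }
    case AF_INET6: {
        struct sockaddr_in6 in6;
        if ((size_t)addrlen < sizeof(in6)) {
            errno = EINVAL;
            return -1;
        }
        memcpy(&in6, addr, sizeof(in6));
        return ntohs(in6.sin6_port);
    }
    default:
        errno = EAFNOSUPPORT;
        return -1;
    }
}
```

The copies cost a few dozen bytes per call, which is the price of not depending on a union or on [[gnu::may_alias]] in the header definitions.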