Date: Mon, 3 May 2021 12:30:45 +0200
From: Corinna Vinschen
To: cygwin-developers@cygwin.com
Subject: Re: The unreliability of AF_UNIX datagram sockets

Hi Ken,

On May  1 17:41, Ken Brown wrote:
> I've been thinking about the overall design of using mqueues instead of
> pipes, and I just want to make sure I'm on the right track.  Here are my
> thoughts:
>
> 1. Each socket needs to create its own mqueue that it uses only for reading.
> For writing, it opens its peer's mqueue.  So each socket holds two mqueue
> descriptors, one for reading and one for writing.

Sounds right to me.

> 2. A STREAM socket S that wants to connect to a listening socket T sends a
> message to T containing S's mqueue name.  (Probably it's sufficient for S to
> send its unique ID, from which the mqueue name will be constructed.)  T then
> creates a socket T1, which sends its mqueue name (or ID) to S, and S and T1
> are then connected.  In the async case, maybe S uses mq_notify to set up the
> thread that waits for a connection.

Sounds good as well.  Maybe it's better to look at this from the listener
side in the first place, because that's the more tricky side, but that's
just a POV thingy.

> 3. In fhandler_socket_unix::dup, the child will need to open any mqueues
> that the parent holds open.  Maybe an internal _mq_dup function would be
> useful here.

Makes sense.

> 4.
> I'm not sure what needs to be done after fork/exec.  After an exec, all

Same here, see below.

> mqueue descriptors are automatically closed according to Kerrisk, but I
> don't see where this is done in the Cygwin code.  Or is it somehow automatic
> as a consequence of the mqueue implementation (which I haven't studied in
> detail)?

Yes, that's automatic.  The handles are duped; the addresses are either
on the heap or in an mmap, and those are duplicated automatically during
fork.  The file descriptor for the mmap'ed file gets closed right during
mq_open, so it's not inherited at all, and memory isn't inherited by an
exec'ed child.  But, see below (Note 2).

> On the other hand, why does Cygwin's mq_open accept O_CLOEXEC if
> this is the case?

The mq code doesn't handle incoming O_CLOEXEC explicitly, it just lets
open flags slip through.  I don't know what Linux' idea here is, but for
our implementation O_CLOEXEC has no meaning because the open flags other
than O_NONBLOCK are only used in the open(2) call for the mapped file,
and that uses O_CLOEXEC anyway.

> And after a fork, something might need to be done to make sure that the
> child can set the blocking mode of its inherited mqueue descriptors
> independently of the parent.  If I understand the mqueue documentation
> correctly, this isn't normally the case.  In the terminology of Kerrisk, the
> mqueue descriptor that the child inherits from the parent refers to the same
> mqueue description as the parent's descriptor, and the blocking mode is part
> of the description.  But again, this might be Linux terminology that doesn't
> apply to Cygwin.

Doesn't apply to Cygwin.  The structure representing the mqd_t, mq_info,
is used to keep track of the O_NONBLOCK flag, not the mqueue header.  So
the flag is local only.

> That's all I have for the moment, but I'm sure there will be more questions
> when I actually start coding.

Certainly.  As for the above "see below"s...
I encountered a couple of problems over the weekend myself (during
soccer viewing, which I don't care for at all), all of which either need
fixing or have to be implemented first.

1. As you noticed, the socket descriptors are inherited by exec'ed
   children, but the mqueue isn't.  So we need at least some kind of
   fixup_after_exec for mqueues used as part of AF_UNIX sockets.

2. While none of the mqueue structures are propagated to child
   processes, the handles to the synchronization objects accidentally
   are.

3. Notes 1 and 2 can only be implemented if we introduce a new
   superstructure keeping track of all mqd_t/mq_info structure pointers
   in an application.  Oh well.  Bummer, I was SOO happy that the
   posix_ipc stuff didn't need it yet...

4. As stated in the code comment leading the mqueue implementation, I
   used Stevens' code as the basis.  What I didn't realize so far is
   that Stevens simplified the implementation in some ways.  The code
   works for real POSIX mqueues, but needs some more fixing before it
   can be used for AF_UNIX at all.

5. I hacked a bit on an mq-only mmap call, which is supposed to allow
   creating/opening of named shared memory areas, but that's a tricky
   extension to the mmap scenario.  I have a gut feeling that it's
   better to avoid using mmap at all and use Windows section mapping
   directly in mq_open/mq_close, especially if we have to implement
   fixup_after_exec semantics anyway.

6. Ultimately, AF_UNIX sockets should not run file-backed at all,
   anyway.  Given that sockets can't be bound multiple times, there's
   no persistence requirement for the mqueue.

7. ...?

Not sure if I forgot something here, but the above problems are quite
enough to spend some time on already...


Corinna
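
For illustration only, here is a rough sketch of the kind of bookkeeping
that note 3 describes: a per-process table that remembers every open
queue, so that a later fixup pass has something to walk.  Every name in
it (mq_table, mq_slot, mq_table_add, mq_table_remove, mq_table_next, the
two size constants) is hypothetical and not taken from the Cygwin
sources, and the void pointer merely stands in for an mq_info pointer.
How such a table would itself reach an exec'ed child is not addressed
here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MQ_TABLE_SLOTS   16  /* arbitrary capacity for this sketch */
#define MQ_TABLE_NAMELEN 64  /* arbitrary name limit for this sketch */

/* One tracked queue.  "info" is a stand-in for an mq_info pointer. */
struct mq_slot
{
  int used;
  char name[MQ_TABLE_NAMELEN];
  void *info;
};

/* Hypothetical "superstructure": all queues open in this process.
   A zero-filled table is an empty table. */
struct mq_table
{
  struct mq_slot slot[MQ_TABLE_SLOTS];
};

/* Record a newly opened queue in the first free slot.  Returns the
   slot index, or -1 if the table is full or the name does not fit. */
int
mq_table_add (struct mq_table *t, const char *name, void *info)
{
  assert (t != NULL && name != NULL);
  if (strlen (name) >= MQ_TABLE_NAMELEN)
    return -1;
  for (int i = 0; i < MQ_TABLE_SLOTS; i++)
    if (!t->slot[i].used)
      {
        t->slot[i].used = 1;
        strcpy (t->slot[i].name, name);
        t->slot[i].info = info;
        return i;
      }
  return -1;
}

/* Forget a queue when it is closed.  Returns 0 on success, or -1 if
   the index is out of range or the slot is not in use. */
int
mq_table_remove (struct mq_table *t, int idx)
{
  assert (t != NULL);
  if (idx < 0 || idx >= MQ_TABLE_SLOTS || !t->slot[idx].used)
    return -1;
  memset (&t->slot[idx], 0, sizeof t->slot[idx]);
  return 0;
}

/* Return the index of the first used slot at or after "from", or -1
   if there is none.  Lets a fixup pass iterate over all open queues. */
int
mq_table_next (const struct mq_table *t, int from)
{
  assert (t != NULL);
  for (int i = from < 0 ? 0 : from; i < MQ_TABLE_SLOTS; i++)
    if (t->slot[i].used)
      return i;
  return -1;
}
```

A fixup pass in this sketch would start with mq_table_next (t, 0) and
keep calling mq_table_next (t, i + 1) until it returns -1, reopening
each queue by its stored name.  The fixed-size array is only there to
keep the example short.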