From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: AF_UNIX/SOCK_DGRAM is dropping messages
To: sten.kristian.ivarsson@gmail.com, cygwin@cygwin.com
From: Ken Brown
Message-ID: <2e64e918-b28b-753e-8337-c757cc62b9bb@cornell.edu>
Date: Wed, 14 Apr 2021 17:58:08 -0400
In-Reply-To: <000701d73151$9c259660$d470c320$@gmail.com>
List-Id: General Cygwin discussions and problem reports

On 4/14/2021 1:14 PM, sten.kristian.ivarsson@gmail.com wrote:
>>>> Hi Ken
>>>>
>>>>>>>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to
>>>>>>>>>>>>> drop messages or at least they are not received in the same
>>>>>>>>>>>>> order they are sent
>>>>>>>>>
>>>>>>>>> [snip]
>>>>>>>>>
>>>>>>>>>> Thanks for the test case. I can confirm the problem. I'm not
>>>>>>>>>> familiar enough with the current AF_UNIX implementation to
>>>>>>>>>> debug this easily. I'd rather spend my time on the new
>>>>>>>>>> implementation (on the topic/af_unix branch). It turns out
>>>>>>>>>> that your test case fails there too, but in a completely
>>>>>>>>>> different way, due to a bug in sendto for datagrams. I'll see
>>>>>>>>>> if I can fix that bug and then try again.
>>>>>>>>>>
>>>>>>>>>> Ken
>>>>>>>>>
>>>>>>>>> Ok, too bad it wasn't our own code base but good that the "mystery"
>>>>>>>>> is verified
>>>>>>>>>
>>>>>>>>> I finally succeeded in building topic/af_unix (after finding out what
>>>>>>>>> version of zlib was needed), but not with -D__WITH_AF_UNIX added to
>>>>>>>>> CXXFLAGS though and thus I haven’t tested it yet
>>>>>>>>>
>>>>>>>>> Is it sufficient to add the define to the "main" Makefile or do
>>>>>>>>> you have to add it to all the Makefiles ?
>>>>>>>>> I guess I can find out though
>>>>>>>>
>>>>>>>> I do it on the configure line, like this:
>>>>>>>>
>>>>>>>> ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...
>>>>>>>>
>>>>>>>>> Is topic/af_unix fairly up to date with master branch ?
>>>>>>>>
>>>>>>>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
>>>>>>>> I'll do that again right now.
>>>>>>>>
>>>>>>>>> Either way, I'll be glad to help out testing topic/af_unix
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>
>>>>>>> I've now pushed a fix for that sendto bug, and your test case runs
>>>>>>> without error on the topic/af_unix branch.
>>>>>>
>>>>>> It seems like the test-case does work now with topic/af_unix in
>>>>>> blocking mode, but when using non-blocking (with MSG_DONTWAIT)
>>>>>> there are some issues I think
>>>>>>
>>>>>> 1. When the queue is empty with non-blocking recv(), errno is set
>>>>>> to EPIPE but I think it should be EAGAIN (or maybe the pipe is
>>>>>> getting broken for real for some reason ?)
>>>>>>
>>>>>> 2. When using non-blocking recv() and no message is written at all,
>>>>>> it seems like recv() blocks forever
>>>>>>
>>>>>> 3. Using non-blocking recv() where the "client" sends fewer than
>>>>>> "count" messages, sometimes recv() blocks forever (as well)
>>>>>>
>>>>>>
>>>>>> My naïve analysis of this is that for the first issue (if any) the
>>>>>> wrong errno is set and for the second issue it blocks if no
>>>>>> sendto() is done after the first recv(), i.e. nothing kicks the
>>>>>> "reader thread" in the butt to realise the queue is empty. It is
>>>>>> not super clear though what POSIX says about creating blocking
>>>>>> descriptors and then using non-blocking flags with recv(), but
>>>>>> this works in Linux anyway
>>>>>
>>>>> The explanation is actually much simpler. In the recv code where a
>>>>> bound datagram socket waits for a remote socket to connect to the
>>>>> pipe, I simply forgot to handle MSG_DONTWAIT.
>>>>> I've pushed a fix. Please retest.
>>>>>
>>>>> I should add that in all my work so far on the topic/af_unix branch,
>>>>> I've thought mainly about stream sockets. So there may still be
>>>>> things remaining to be implemented for the datagram case.
>>>>
>>>> I finally got some time to test topic/af_unix in our "real"
>>>> cygwin-application (casual) and unfortunately very few of our
>>>> unittests pass
>>>>
>>>> The symptoms are that there's unexpected eternal blocking, sometimes
>>>> there's unexpected EADDRNOTAVAIL, sometimes it looks like some memory
>>>> corruption (and core-dumps)
>>>>
>>>> Of course the memory corruption etc could be our own fault and the
>>>> core-dumps might be because of uncaught exceptions
>>>>
>>>> Needless to say, all unittests pass on Linux, but of course
>>>> cygwin-topic/af_unix could act according to POSIX-standard and the
>>>> behaviour could be due to our own misinterpretation of how POSIX works
>>>
>>> More likely it's due to bugs in the topic/af_unix branch. This is
>>> still very much a work in progress.
>>>
>>>> I will try to narrow down the quite complex logic and reproduce the
>>>> problems
>>>
>>> That would be ideal.
>>>
>>>> If you for some reason want to try it with casual, I'd be glad to help
>>>> you out (it should be easier now than last time (but there might be
>>>> some documentation missing for Cygwin still))
>>>>
>>>> https://bitbucket.org/casualcore/
>>>
>>> I'm going on vacation in a few days, but I might do this when I get back.
>>>
>>> Thanks for your testing.
>>
>> By the way, if your code is using datagram sockets, then there are very serious
>> problems with our implementation (even aside from the performance issue
>> that we've already discussed). For example, I don't know of any reasonable
>> way for select to test whether such a socket is ready for writing. We'll need to
> > If you by that mean if we're using SOCK_DGRAM, the answer is yes > > I tried SOCK_STREAM (and SOCK_SEQPACKET I think) for CYGWIN 3.2.0 but that didn't work at all > > As far as I understand, both all types on pretty much all implementations preserves message ordering though > > I haven't tried SOCK_STREAM and/or SOCK_SEQPACKET with the topic/af_unix-branch. Is that worth a try ? SOCK_STREAM is definitely worth a try. The implementation of that should be much more reliable than the implementation of SOCK_DGRAM at the moment. We don't implement SOCK_SEQPACKET. Ken