From: Ken Brown
To: sten.kristian.ivarsson@gmail.com, cygwin@cygwin.com
Subject: Re: AF_UNIX/SOCK_DGRAM is dropping messages
Date: Thu, 8 Apr 2021 17:02:10 -0400
Message-ID: <3e7e2393-b704-0675-f82c-5f070747ada4@cornell.edu>

On 4/8/2021 3:47 PM, sten.kristian.ivarsson@gmail.com wrote:
>>>>>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to
>>>>>>>>>>> drop messages or at least they are not received in the same
>>>>>>>>>>> order they are sent
>>>>>>>
>>>>>>> [snip]
>>>>>>>
>>>>>>>> Thanks for the test case. I can confirm the problem. I'm not
>>>>>>>> familiar enough with the current AF_UNIX implementation to debug
>>>>>>>> this easily. I'd rather spend my time on the new implementation
>>>>>>>> (on the topic/af_unix branch). It turns out that your test case
>>>>>>>> fails there too, but in a completely different way, due to a bug
>>>>>>>> in sendto for datagrams. I'll see if I can fix that bug and then
>>>>>>>> try again.
>>>>>>>>
>>>>>>>> Ken
>>>>>>>
>>>>>>> Ok, too bad it wasn't our own code base but good that the "mystery"
>>>>>>> is verified
>>>>>>>
>>>>>>> I finally succeeded in building topic/af_unix (after finding out
>>>>>>> what version of zlib was needed), but not with -D__WITH_AF_UNIX
>>>>>>> added to CXXFLAGS, and thus I haven't tested it yet
>>>>>>>
>>>>>>> Is it sufficient to add the define to the "main" Makefile or do
>>>>>>> you have to add it to all the Makefiles? I guess I can find out
>>>>>>> though
>>>>>>
>>>>>> I do it on the configure line, like this:
>>>>>>
>>>>>> ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...
>>>>>>
>>>>>>> Is topic/af_unix fairly up to date with master branch?
>>>>>>
>>>>>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
>>>>>> I'll do that again right now.
>>>>>>
>>>>>>> Either way, I'll be glad to help out testing topic/af_unix
>>>>>>
>>>>>> Thanks!
>>>>>
>>>>> I've now pushed a fix for that sendto bug, and your test case runs
>>>>> without error on the topic/af_unix branch.
>>>>
>>>> It seems like the test case does work now with topic/af_unix in
>>>> blocking mode, but when using non-blocking (with MSG_DONTWAIT) there
>>>> are some issues I think
>>>>
>>>> 1. When the queue is empty with non-blocking recv(), errno is set to
>>>> EPIPE but I think it should be EAGAIN (or maybe the pipe is getting
>>>> broken for real for some reason?)
>>>>
>>>> 2. When using non-blocking recv() and no message is written at all,
>>>> it seems like recv() blocks forever
>>>>
>>>> 3. Using non-blocking recv() where the "client" sends fewer than
>>>> "count" messages, sometimes recv() blocks forever (as well)
>>>>
>>>>
>>>> My naïve analysis of this is that for the first issue (if any) the
>>>> wrong errno is set and for the second issue it blocks if no sendto()
>>>> is done after the first recv(), i.e. nothing kicks the "reader thread"
>>>> in the butt to realise the queue is empty. It is not super clear
>>>> though what POSIX says about creating blocking descriptors and then
>>>> using non-blocking flags with recv(), but this works in Linux
>>>> anyway
>>>
>>> The explanation is actually much simpler. In the recv code where a
>>> bound datagram socket waits for a remote socket to connect to the
>>> pipe, I simply forgot to handle MSG_DONTWAIT. I've pushed a fix.
>>> Please retest.
>>
>> I tested it and now it seems like we get EAGAIN when there's no msg on
>> the queue, but it seems like the client is blocked as well and that it
>> cannot write any more messages until the previous one is consumed by
>> the server, so the af_unix.cpp test client ends prematurely
>>
>> If sendto() is used with MSG_DONTWAIT as well, it gets EAGAIN, but
>> the socket itself is not a non-blocking socket; only the recv() is
>> done in a non-blocking fashion
>>
>> As I said earlier, it's a bit fuzzy (or at least for me) what POSIX
>> means by non/blocking descriptors combined with non/blocking
>> operations, but as far as I understand, it should be possible to use
>> blocking sendto() and messages should be written (as long as some
>> buffer is not filled) at the same time someone is doing non-blocking
>> recv()
>>
>> What is your take on this?
>
> I was thinking of this again and came to the conclusion that the fix
> semantically probably works ok
>
> It was just me that didn't realise that only one message can be on the
> queue simultaneously even in blocking mode
>
> The problem is not functional but merely a performance hog, which I
> guess you have already realised; you mentioned it in a previous message
> but I thought it was about some other issue
>
>
> So, I guess the fix works ok (I haven't done any more tests than with
> the sample program), but from a throughput perspective it would be a
> good idea to let more messages be written to the queue before the first
> is consumed (I guess you already have some thoughts about this?)

I have some thoughts, but nothing definitive yet. I'll keep thinking.

Ken
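
For anyone wanting to probe the two behaviours discussed in this thread (EAGAIN from a non-blocking recv() on an empty queue, and several datagrams being queued before the first is consumed), here is a minimal sketch. It is not the af_unix.cpp test case from the thread; it is written in Python for brevity, and the function name probe_dgram_queue and the socket file name server.sock are made up for illustration. It assumes a platform where the socket module exposes MSG_DONTWAIT. On an implementation that holds only one datagram at a time, the second blocking sendto() would presumably block, so the script would hang rather than fail.

```python
import errno
import os
import socket
import tempfile


def probe_dgram_queue(count=3):
    """Return (errno seen by recv on an empty queue, datagrams received)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "server.sock")  # hypothetical socket path
        server = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        try:
            server.bind(path)

            # Non-blocking recv() on an empty queue of a blocking socket:
            # this should fail immediately instead of blocking.
            try:
                server.recv(64, socket.MSG_DONTWAIT)
                empty_errno = 0
            except OSError as exc:
                empty_errno = exc.errno

            # Blocking sendto() calls with nothing consumed in between.
            for i in range(count):
                client.sendto(b"msg %d" % i, path)

            # Drain with non-blocking recv() until the queue is empty again.
            received = []
            while True:
                try:
                    received.append(server.recv(64, socket.MSG_DONTWAIT))
                except BlockingIOError:
                    break
            return empty_errno, received
        finally:
            client.close()
            server.close()


if __name__ == "__main__":
    empty_errno, received = probe_dgram_queue()
    print(errno.errorcode.get(empty_errno, empty_errno), received)
```

The socket itself is left in blocking mode on purpose; only the recv() calls pass MSG_DONTWAIT, which matches the combination questioned above.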