Subject: RE: AF_UNIX/SOCK_DGRAM is dropping messages
Date: Thu, 8 Apr 2021 21:47:28 +0200
To: Ken Brown
List-Id: General Cygwin discussions and problem reports

> > >>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to
> > >>>>>>>> drop messages or at least they are not received in the same
> > >>>>>>>> order they are sent
> > >>>>
> > >>>> [snip]
> > >>>>
> > >>>>> Thanks for the test case. I can confirm the problem.
> > >>>>> I'm not familiar enough with the current AF_UNIX implementation to
> > >>>>> debug this easily. I'd rather spend my time on the new implementation
> > >>>>> (on the topic/af_unix branch). It turns out that your test case
> > >>>>> fails there too, but in a completely different way, due to a bug
> > >>>>> in sendto for datagrams. I'll see if I can fix that bug and then
> > >>>>> try again.
> > >>>>>
> > >>>>> Ken
> > >>>>
> > >>>> Ok, too bad it wasn't our own code base, but good that the "mystery"
> > >>>> is verified
> > >>>>
> > >>>> I finally succeeded in building topic/af_unix (after finding out what
> > >>>> version of zlib was needed), but not with -D__WITH_AF_UNIX added to
> > >>>> CXXFLAGS, and thus I haven’t tested it yet
> > >>>>
> > >>>> Is it sufficient to add the define to the "main" Makefile or do
> > >>>> you have to add it to all the Makefiles? I guess I can find out
> > >>>> though
> > >>>
> > >>> I do it on the configure line, like this:
> > >>>
> > >>> ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...
> > >>>
> > >>>> Is topic/af_unix fairly up to date with master branch ?
> > >>>
> > >>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
> > >>> I'll do that again right now.
> > >>>
> > >>>> Either way, I'll be glad to help out testing topic/af_unix
> > >>>
> > >>> Thanks!
> > >>
> > >> I've now pushed a fix for that sendto bug, and your test case runs
> > >> without error on the topic/af_unix branch.
> > >
> > > It seems like the test case does work now with topic/af_unix in
> > > blocking mode, but when using non-blocking (with MSG_DONTWAIT) there
> > > are some issues, I think
> > >
> > > 1. When the queue is empty with non-blocking recv(), errno is set to
> > > EPIPE but I think it should be EAGAIN (or maybe the pipe is getting
> > > broken for real for some reason ?)
> > >
> > > 2. When using non-blocking recv() and no message is written at all,
> > > it seems like recv() blocks forever
> > >
> > > 3. Using non-blocking recv() where the "client" sends fewer than
> > > "count" messages, sometimes recv() blocks forever (as well)
> > >
> > >
> > > My naïve analysis of this is that for the first issue (if any) the
> > > wrong errno is set and for the second issue it blocks if no sendto()
> > > is done after the first recv(), i.e. nothing kicks the "reader thread"
> > > in the butt to realise the queue is empty. It is not super clear
> > > though what POSIX says about creating blocking descriptors and then
> > > using non-blocking flags with recv(), but this works in Linux anyway
> >
> > The explanation is actually much simpler. In the recv code where a
> > bound datagram socket waits for a remote socket to connect to the
> > pipe, I simply forgot to handle MSG_DONTWAIT. I've pushed a fix. Please
> > retest.
>
> I tested it and now it seems like we get EAGAIN when there's no msg on the
> queue, but it seems like the client is blocked as well and that it cannot
> write any more messages until it is consumed by the server, so the
> af_unix.cpp test client ends prematurely
>
> If using sendto() with MSG_DONTWAIT as well, that gets an EAGAIN, but
> the socket itself is not a non-blocking socket; it is just the recv()
> that is done in a non-blocking fashion
>
> As I said earlier, it's a bit fuzzy (or at least for me) what POSIX means by
> non/blocking descriptors combined with non/blocking operations, but as far
> as I understand, it should be possible to use blocking sendto() and messages
> should be written (as long as some buffer is not filled) at the same time
> someone is doing non-blocking recv()
>
> What is your take on this ?
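
For reference, the Linux behaviour referred to in the quoted text (a blocking descriptor combined with a per-call non-blocking recv()) can be sketched roughly as below. This is only a minimal sketch, not the af_unix.cpp test case from this thread: it assumes a socketpair() instead of bound sockets with sendto(), and the names ProbeResult and probe_empty_recv are made up for illustration. On Linux the expectation is that recv() returns -1 with EAGAIN (or EWOULDBLOCK) and that the descriptor itself stays blocking.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

// Outcome of one recv(MSG_DONTWAIT) call on an empty datagram socket.
// (Hypothetical helper type, not part of any library.)
struct ProbeResult {
    ssize_t recv_ret;        // return value of recv()
    int err;                 // errno captured right after recv()
    bool fd_is_nonblocking;  // true if O_NONBLOCK is set on the descriptor
};

// Assumption: a connected AF_UNIX/SOCK_DGRAM pair stands in for the bound
// server and client sockets used in the real test case.
ProbeResult probe_empty_recv() {
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_DGRAM, 0, sv);
    assert(rc == 0);  // the sketch simply aborts if the pair cannot be created
    (void)rc;

    char buf[64];
    ProbeResult r{};
    errno = 0;
    // Nothing has been sent, so a non-blocking receive must not wait.
    r.recv_ret = recv(sv[0], buf, sizeof buf, MSG_DONTWAIT);
    r.err = errno;

    // MSG_DONTWAIT applies to this call only; the descriptor flags are untouched.
    int flags = fcntl(sv[0], F_GETFL);
    r.fd_is_nonblocking = (flags != -1) && (flags & O_NONBLOCK) != 0;

    close(sv[0]);
    close(sv[1]);
    return r;
}
```

Calling probe_empty_recv() from a small main() and printing the three fields is enough to compare Linux with a Cygwin build of topic/af_unix.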
I was thinking of this again and came to the conclusion that the fix
semantically probably works ok

It was just me that didn't realise that only one message can be on the
queue simultaneously, even in blocking mode

The problem is not functional but merely a performance hog, which I guess
you have already realised; you mentioned it in a previous message, but I
thought it was about some other issue

So, I guess the fix works ok (I haven't done any more tests than with the
sample program), but from a throughput aspect it would be a good idea to
let more messages be written to the queue before the first is consumed
or so (I guess you already have some thoughts about this?)

Keep up the good work,
Kristian

> > I should add that in all my work so far on the topic/af_unix branch,
> > I've thought mainly about stream sockets. So there may still be
> > things remaining to be implemented for the datagram case.
> >
> > Ken
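
The throughput point above (several datagrams queued before the first one is consumed) can be illustrated with a small sketch. It describes what Linux does, not what the topic/af_unix branch currently does, and it rests on assumptions: a socketpair() replaces the bound sockets of the real test case, the helper name queue_then_drain is hypothetical, and the message count must stay small, because nothing reads until all sends are done and a blocking send() would otherwise wait once the receive queue is full.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <vector>
#include <sys/socket.h>
#include <unistd.h>

// Sends `count` small datagrams with blocking send() before anything is read,
// then drains the socket with recv(MSG_DONTWAIT) until the queue is empty.
// Returns the payloads in the order they were received.
std::vector<std::string> queue_then_drain(int count) {
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_DGRAM, 0, sv);
    assert(rc == 0);  // the sketch simply aborts if the pair cannot be created
    (void)rc;

    for (int i = 0; i < count; ++i) {
        std::string msg = "msg " + std::to_string(i);
        // Blocking send: returns at once as long as the queue has room.
        ssize_t n = send(sv[1], msg.data(), msg.size(), 0);
        assert(n == static_cast<ssize_t>(msg.size()));
        (void)n;
    }

    std::vector<std::string> received;
    char buf[64];
    for (;;) {
        ssize_t n = recv(sv[0], buf, sizeof buf, MSG_DONTWAIT);
        if (n < 0) {
            // Expected to be EAGAIN/EWOULDBLOCK once the queue is empty;
            // any other error also ends the loop in this sketch.
            break;
        }
        received.emplace_back(buf, static_cast<std::size_t>(n));
    }

    close(sv[0]);
    close(sv[1]);
    return received;
}
```

With a queue that holds only one message at a time, the second send() in this helper would block forever, since the reader does not start until the sender has finished; that is the performance limitation discussed above taken to its extreme.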