public inbox for cygwin-developers@cygwin.com
 help / color / mirror / Atom feed
* Re: python > 3.5: Issue with unix domain sockets
       [not found]   ` <af597ace-e986-35a0-9983-99256c440791@maxrnd.com>
@ 2021-05-04  9:45     ` Mark Geisert
  2021-05-04 12:27       ` Corinna Vinschen
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Geisert @ 2021-05-04  9:45 UTC (permalink / raw)
  To: cygwin-developers

Mark Geisert wrote:
> Ken Brown via Cygwin wrote:
>> On 5/3/2021 8:57 AM, Maximilian.Blenk--- via Cygwin wrote:
>>> Incorrect Behavior:
>>> Server:
>>> $ python3.7 server.py
>>> starting up on ./uds_socket
>>> waiting for a connection
>>> Traceback (most recent call last):
>>>    File "server.py", line 27, in <module>
>>>      connection, client_address = sock.accept()
>>>    File "/usr/lib/python3.7/socket.py", line 214, in accept
>>>      sock = socket(self.family, self.type, self.proto, fileno=fd)
>>>    File "/usr/lib/python3.7/socket.py", line 151, in __init__
>>>      _socket.socket.__init__(self, family, type, proto, fileno)
>>> SystemError: <slot wrapper '__init__' of '_socket.socket' objects> returned 
>>> NULL without setting an error
>>>
>>> Client:
>>> $ python3.7 client.py
>>> connecting to ./uds_socket
>>> sending b'This is the message.  It will be repeated.'
>>> closing socket
>>> Traceback (most recent call last):
>>>    File "client.py", line 27, in <module>
>>>      data = sock.recv(16)
>>> ConnectionResetError: [Errno 104] Connection reset by peer
>>
>> I wonder if this has the same cause as the problem reported here:
>>
>>    https://cygwin.com/pipermail/cygwin/2021-February/247884.html
>>
>> Mark, can you check that?

This issue is indeed related to the Python patch released to Python 3.{6,7,8} but 
not Python 3.5 or earlier.  I'm discussing here because the situation involves 
Python internals doing socket operations and Cygwin's AF_UNIX support is shaky in 
some aspects (that Ken's work will likely fix to the relief of everyone!).

The purpose of the Python patch is to disable the normal peer handshake that 
starts each AF_UNIX connection.  So whenever a Python app obtains an AF_UNIX 
socket, either from socket() or accept(), the internal routine that inits 
Python-level state was patched to call setsockopt() to turn off the handshake.

But it turns out that fhandler_socket_local::accept4() sets the socket's 
connect_state to "connected", on line fhandler_socket_local.cc:1086.  Then when 
the wrapping Python patch calls setsockopt() we end up in 
::af_local_set_no_getpeereid(), which is good, but the socket is marked 
"connected" so the result is an EALREADY error that rather clumsily knocks out 
both the server and client apps.

Assuming the connect_state check is needed (seems good for sanity check if nothing 
else) then I think I need to adjust when the Python patch is invoked.  Possibly 
distinguishing between Python-level accept()'s listening socket and returned 
socket.  I think that's right, assuming the Cygwin parts are operating correctly.

Does this sound like the right way to go?
Thanks for any comments either way.

..mark

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: python > 3.5: Issue with unix domain sockets
  2021-05-04  9:45     ` python > 3.5: Issue with unix domain sockets Mark Geisert
@ 2021-05-04 12:27       ` Corinna Vinschen
  2021-05-05  5:04         ` Mark Geisert
  0 siblings, 1 reply; 7+ messages in thread
From: Corinna Vinschen @ 2021-05-04 12:27 UTC (permalink / raw)
  To: cygwin-developers

On May  4 02:45, Mark Geisert wrote:
> Mark Geisert wrote:
> > Ken Brown via Cygwin wrote:
> > > On 5/3/2021 8:57 AM, Maximilian.Blenk--- via Cygwin wrote:
> > > > Incorrect Behavior:
> > > > Server:
> > > > $ python3.7 server.py
> > > > starting up on ./uds_socket
> > > > waiting for a connection
> > > > Traceback (most recent call last):
> > > >    File "server.py", line 27, in <module>
> > > >      connection, client_address = sock.accept()
> > > >    File "/usr/lib/python3.7/socket.py", line 214, in accept
> > > >      sock = socket(self.family, self.type, self.proto, fileno=fd)
> > > >    File "/usr/lib/python3.7/socket.py", line 151, in __init__
> > > >      _socket.socket.__init__(self, family, type, proto, fileno)
> > > > SystemError: <slot wrapper '__init__' of '_socket.socket'
> > > > objects> returned NULL without setting an error
> > > > 
> > > > Client:
> > > > $ python3.7 client.py
> > > > connecting to ./uds_socket
> > > > sending b'This is the message.  It will be repeated.'
> > > > closing socket
> > > > Traceback (most recent call last):
> > > >    File "client.py", line 27, in <module>
> > > >      data = sock.recv(16)
> > > > ConnectionResetError: [Errno 104] Connection reset by peer
> > > 
> > > I wonder if this has the same cause as the problem reported here:
> > > 
> > >    https://cygwin.com/pipermail/cygwin/2021-February/247884.html
> > > 
> > > Mark, can you check that?
> 
> This issue is indeed related to the Python patch released to Python
> 3.{6,7,8} but not Python 3.5 or earlier.  I'm discussing here because the
> situation involves Python internals doing socket operations and Cygwin's
> AF_UNIX support is shaky in some aspects (that Ken's work will likely fix to
> the relief of everyone!).
> 
> The purpose of the Python patch is to disable the normal peer handshake that
> starts each AF_UNIX connection.  So whenever a Python app obtains an AF_UNIX
> socket, either from socket() or accept(), the internal routine that inits
> Python-level state was patched to call setsockopt() to turn off the
> handshake.
> 
> But it turns out that fhandler_socket_local::accept4() sets the socket's
> connect_state to "connected", on line fhandler_socket_local.cc:1086.  Then
> when the wrapping Python patch calls setsockopt() we end up in
> ::af_local_set_no_getpeereid(), which is good, but the socket is marked
> "connected" so the result is an EALREADY error that rather clumsily knocks
> out both the server and client apps.

You're supposed to call the special setsockopt(SO_PEERCRED) on the
accepting socket.  The no_getpeereid property is inherited by the
accepted socket.


Corinna

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: python > 3.5: Issue with unix domain sockets
  2021-05-04 12:27       ` Corinna Vinschen
@ 2021-05-05  5:04         ` Mark Geisert
  2021-05-06  8:35           ` Corinna Vinschen
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Geisert @ 2021-05-05  5:04 UTC (permalink / raw)
  To: cygwin-developers

Corinna Vinschen wrote:
> On May  4 02:45, Mark Geisert wrote:
>>[blah blah...]
> You're supposed to call the special setsockopt(SO_PEERCRED) on the
> accepting socket.  The no_getpeereid property is inherited by the
> accepted socket.

Ah, of course.  Well, I couldn't figure out a way to do the setsockopt() call some 
times but not others, because Python can't reach into the Cygwin DLL.  Nor should it.

Since this Python patch is supposed to be a temporary workaround, I took the tack 
that it should just ignore an error return from setsockopt(SO_PEERCRED).  In this 
fashion the handshake will be turned off when it can be, and when it can't (on the 
accepted socket) the attempt will error but that error will be ignored.

Thanks for the sanity check!

..mark

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: python > 3.5: Issue with unix domain sockets
  2021-05-05  5:04         ` Mark Geisert
@ 2021-05-06  8:35           ` Corinna Vinschen
  2021-05-06  9:24             ` Mark Geisert
  0 siblings, 1 reply; 7+ messages in thread
From: Corinna Vinschen @ 2021-05-06  8:35 UTC (permalink / raw)
  To: cygwin-developers

On May  4 22:04, Mark Geisert wrote:
> Corinna Vinschen wrote:
> > On May  4 02:45, Mark Geisert wrote:
> > > [blah blah...]
> > You're supposed to call the special setsockopt(SO_PEERCRED) on the
> > accepting socket.  The no_getpeereid property is inherited by the
> > accepted socket.
> 
> Ah, of course.  Well, I couldn't figure out a way to do the setsockopt()
> call some times but not others, because Python can't reach into the Cygwin
> DLL.  Nor should it.
> 
> Since this Python patch is supposed to be a temporary workaround, I took the
> tack that it should just ignore an error return from
> setsockopt(SO_PEERCRED).  In this fashion the handshake will be turned off
> when it can be, and when it can't (on the accepted socket) the attempt will
> error but that error will be ignored.

Sorry Mark, but I don't understand how you concluded that ignoring the
error from setsockopt(SO_PEERCRED) is the solution I pointed out above.

The right thing to do is to call setsockopt(SO_PEERCRED) (and check its
return value) before calling listen() on the accepting socket.  The
no_getpeereid property gets propagated to the accepted sockets and thus
there's no need to call it again for these sockets.


Corinna

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: python > 3.5: Issue with unix domain sockets
  2021-05-06  8:35           ` Corinna Vinschen
@ 2021-05-06  9:24             ` Mark Geisert
  2021-05-06 10:30               ` Corinna Vinschen
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Geisert @ 2021-05-06  9:24 UTC (permalink / raw)
  To: cygwin-developers

Corinna Vinschen wrote:
> On May  4 22:04, Mark Geisert wrote:
>> Corinna Vinschen wrote:
>>> On May  4 02:45, Mark Geisert wrote:
>>>> [blah blah...]
>>> You're supposed to call the special setsockopt(SO_PEERCRED) on the
>>> accepting socket.  The no_getpeereid property is inherited by the
>>> accepted socket.
>>
>> Ah, of course.  Well, I couldn't figure out a way to do the setsockopt()
>> call some times but not others, because Python can't reach into the Cygwin
>> DLL.  Nor should it.
>>
>> Since this Python patch is supposed to be a temporary workaround, I took the
>> tack that it should just ignore an error return from
>> setsockopt(SO_PEERCRED).  In this fashion the handshake will be turned off
>> when it can be, and when it can't (on the accepted socket) the attempt will
>> error but that error will be ignored.
> 
> Sorry Mark, but I don't understand how you concluded that ignoring the
> error from setsockopt(SO_PEERCRED) is the solution I pointed out above.
> 
> The right thing to do is to call setsockopt(SO_PEERCRED) (and check its
> return value) before calling listen() on the accepting socket.  The
> no_getpeereid property gets propagated to the accepted sockets and thus
> there's no need to call it again for these sockets.

The strategy requires more precise language than I used, sorry.  I do understand 
what you're saying most recently, and have since my "of course" statement.

Inside the Python runtime environment, I only get visibility to the socket Cygwin 
has provided at one point, which is where Python inits its own context for that 
socket.  I can't tell whether the socket is the result of a socket() call or from 
an accept() call.  I can't even tell whether it's connected.  I can tell that it's 
AF_UNIX and can tell whether it's SOCK_DGRAM or SOCK_STREAM.

So in this constrained situation, I decided to just call setsockopt(SO_PEERCRED) 
and ignore any errors.  (Case 1:) If it doesn't error, the socket came from 
socket() and now has the peer handshake turned off.  (Case 2:) If it does error, 
the socket came from accept(), i.e. it's an accepted socket, and though we 
couldn't turn off handshake here, it's already turned off due to inheritance from 
the accepting socket.  The accepting socket has it turned off already because it 
was earlier Case 1 (i.e. both ends of the connection get treated, as long as they 
are both Python apps).

I know that in general it's poor practice to ignore errors but given the 
constraints here and that the possible cases are (I believe) well known, this is a 
solution that works to solve the most recent issue and the original issue.  Now 
that I think about it, the patch could be tightened up by only ignoring EALREADY 
rather than ignoring all errors.  Would you be okay with that?
Thank you for reviewing,

..mark

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: python > 3.5: Issue with unix domain sockets
  2021-05-06  9:24             ` Mark Geisert
@ 2021-05-06 10:30               ` Corinna Vinschen
  2021-05-06 12:18                 ` Marco Atzeri
  0 siblings, 1 reply; 7+ messages in thread
From: Corinna Vinschen @ 2021-05-06 10:30 UTC (permalink / raw)
  To: cygwin-developers

On May  6 02:24, Mark Geisert wrote:
> Corinna Vinschen wrote:
> > On May  4 22:04, Mark Geisert wrote:
> > > Corinna Vinschen wrote:
> > > > On May  4 02:45, Mark Geisert wrote:
> > > > > [blah blah...]
> > > > You're supposed to call the special setsockopt(SO_PEERCRED) on the
> > > > accepting socket.  The no_getpeereid property is inherited by the
> > > > accepted socket.
> > > 
> > > Ah, of course.  Well, I couldn't figure out a way to do the setsockopt()
> > > call some times but not others, because Python can't reach into the Cygwin
> > > DLL.  Nor should it.
> > > 
> > > Since this Python patch is supposed to be a temporary workaround, I took the
> > > tack that it should just ignore an error return from
> > > setsockopt(SO_PEERCRED).  In this fashion the handshake will be turned off
> > > when it can be, and when it can't (on the accepted socket) the attempt will
> > > error but that error will be ignored.
> > 
> > Sorry Mark, but I don't understand how you concluded that ignoring the
> > error from setsockopt(SO_PEERCRED) is the solution I pointed out above.
> > 
> > The right thing to do is to call setsockopt(SO_PEERCRED) (and check its
> > return value) before calling listen() on the accepting socket.  The
> > no_getpeereid property gets propagated to the accepted sockets and thus
> > there's no need to call it again for these sockets.
> 
> The strategy requires more precise language than I used, sorry.  I do
> understand what you're saying most recently, and have since my "of course"
> statement.
> 
> Inside the Python runtime environment, I only get visibility to the socket
> Cygwin has provided at one point, which is where Python inits its own
> context for that socket.  I can't tell whether the socket is the result of a
> socket() call or from an accept() call.  I can't even tell whether it's
> connected.  I can tell that it's AF_UNIX and can tell whether it's
> SOCK_DGRAM or SOCK_STREAM.

Ah, ok.  Sounds like a shitty interface to me, but if that's what you
have to work with...

> So in this constrained situation, I decided to just call
> setsockopt(SO_PEERCRED) and ignore any errors.  (Case 1:) If it doesn't
> error, the socket came from socket() and now has the peer handshake turned
> off.  (Case 2:) If it does error, the socket came from accept(), i.e. it's
> an accepted socket, and though we couldn't turn off handshake here, it's
> already turned off due to inheritance from the accepting socket.  The
> accepting socket has it turned off already because it was earlier Case 1
> (i.e. both ends of the connection get treated, as long as they are both
> Python apps).
> 
> I know that in general it's poor practice to ignore errors but given the
> constraints here and that the possible cases are (I believe) well known,
> this is a solution that works to solve the most recent issue and the
> original issue.  Now that I think about it, the patch could be tightened up
> by only ignoring EALREADY rather than ignoring all errors.  Would you be
> okay with that?

Sounds good to me.

Thanks for explaining the situation!


Corinna

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: python > 3.5: Issue with unix domain sockets
  2021-05-06 10:30               ` Corinna Vinschen
@ 2021-05-06 12:18                 ` Marco Atzeri
  0 siblings, 0 replies; 7+ messages in thread
From: Marco Atzeri @ 2021-05-06 12:18 UTC (permalink / raw)
  To: cygwin-developers

On 06.05.2021 12:30, Corinna Vinschen wrote:
> On May  6 02:24, Mark Geisert wrote:
>> Corinna Vinschen wrote:
>>> On May  4 22:04, Mark Geisert wrote:
>>>> Corinna Vinschen wrote:
>>>>> On May  4 02:45, Mark Geisert wrote:
>>>>>> [blah blah...]
>>>>> You're supposed to call the special setsockopt(SO_PEERCRED) on the
>>>>> accepting socket.  The no_getpeereid property is inherited by the
>>>>> accepted socket.
>>>>
>>>> Ah, of course.  Well, I couldn't figure out a way to do the setsockopt()
>>>> call some times but not others, because Python can't reach into the Cygwin
>>>> DLL.  Nor should it.
>>>>
>>>> Since this Python patch is supposed to be a temporary workaround, I took the
>>>> tack that it should just ignore an error return from
>>>> setsockopt(SO_PEERCRED).  In this fashion the handshake will be turned off
>>>> when it can be, and when it can't (on the accepted socket) the attempt will
>>>> error but that error will be ignored.
>>>
>>> Sorry Mark, but I don't understand how you concluded that ignoring the
>>> error from setsockopt(SO_PEERCRED) is the solution I pointed out above.
>>>
>>> The right thing to do is to call setsockopt(SO_PEERCRED) (and check its
>>> return value) before calling listen() on the accepting socket.  The
>>> no_getpeereid property gets propagated to the accepted sockets and thus
>>> there's no need to call it again for these sockets.
>>
>> The strategy requires more precise language than I used, sorry.  I do
>> understand what you're saying most recently, and have since my "of course"
>> statement.
>>
>> Inside the Python runtime environment, I only get visibility to the socket
>> Cygwin has provided at one point, which is where Python inits its own
>> context for that socket.  I can't tell whether the socket is the result of a
>> socket() call or from an accept() call.  I can't even tell whether it's
>> connected.  I can tell that it's AF_UNIX and can tell whether it's
>> SOCK_DGRAM or SOCK_STREAM.
> 
> Ah, ok.  Sounds like a shitty interface to me, but if that's what you
> have to work with...
> 
>> So in this constrained situation, I decided to just call
>> setsockopt(SO_PEERCRED) and ignore any errors.  (Case 1:) If it doesn't
>> error, the socket came from socket() and now has the peer handshake turned
>> off.  (Case 2:) If it does error, the socket came from accept(), i.e. it's
>> an accepted socket, and though we couldn't turn off handshake here, it's
>> already turned off due to inheritance from the accepting socket.  The
>> accepting socket has it turned off already because it was earlier Case 1
>> (i.e. both ends of the connection get treated, as long as they are both
>> Python apps).
>>
>> I know that in general it's poor practice to ignore errors but given the
>> constraints here and that the possible cases are (I believe) well known,
>> this is a solution that works to solve the most recent issue and the
>> original issue.  Now that I think about it, the patch could be tightened up
>> by only ignoring EALREADY rather than ignoring all errors.  Would you be
>> okay with that?
> 
> Sounds good to me.
> 
> Thanks for explaining the situation!
> 
> 
> Corinna
> 

as general info with the current implementation
all the tests on

  /usr/lib/python3.8/test/test_asyncio

pass except one.
It is a huge improvement versus previous python packages

   $ mv test_subprocess.py test_subprocess.py-bk
   $ pytest
   ....

   Results (102.08s):
       1939 passed
         31 skipped


test_subprocess.py is still frozing

.......................................................................Traceback 
(most recent call last):
   File "test_subprocess.py-bk", line 725, in <module>
     unittest.main()
   File "/usr/lib/python3.8/unittest/main.py", line 101, in __init__
     self.runTests()
   File "/usr/lib/python3.8/unittest/main.py", line 271, in runTests
     self.result = testRunner.run(self.test)
   File "/usr/lib/python3.8/unittest/runner.py", line 176, in run
     test(result)
   File "/usr/lib/python3.8/unittest/suite.py", line 84, in __call__
     return self.run(*args, **kwds)
   File "/usr/lib/python3.8/unittest/suite.py", line 122, in run
     test(result)
   File "/usr/lib/python3.8/unittest/suite.py", line 84, in __call__
     return self.run(*args, **kwds)
   File "/usr/lib/python3.8/unittest/suite.py", line 122, in run
     test(result)
   File "/usr/lib/python3.8/unittest/case.py", line 736, in __call__
     return self.run(*args, **kwds)
   File "/usr/lib/python3.8/unittest/case.py", line 676, in run
     self._callTestMethod(testMethod)
   File "/usr/lib/python3.8/unittest/case.py", line 633, in _callTestMethod
     method()
   File "test_subprocess.py-bk", line 175, in test_kill
     returncode = self.loop.run_until_complete(proc.wait())
   File "/usr/lib/python3.8/asyncio/base_events.py", line 603, in 
run_until_complete
     self.run_forever()
   File "/usr/lib/python3.8/asyncio/base_events.py", line 570, in 
run_forever
     self._run_once()
   File "/usr/lib/python3.8/asyncio/base_events.py", line 1823, in _run_once
     event_list = self._selector.select(timeout)
   File "/usr/lib/python3.8/selectors.py", line 415, in select
     fd_event_list = self._selector.poll(timeout)
KeyboardInterrupt


Regards
Marco

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-05-06 12:18 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1620046759893.5340@bmw.de>
     [not found] ` <2cde4128-6a3d-7431-6608-a2184d23964a@cornell.edu>
     [not found]   ` <af597ace-e986-35a0-9983-99256c440791@maxrnd.com>
2021-05-04  9:45     ` python > 3.5: Issue with unix domain sockets Mark Geisert
2021-05-04 12:27       ` Corinna Vinschen
2021-05-05  5:04         ` Mark Geisert
2021-05-06  8:35           ` Corinna Vinschen
2021-05-06  9:24             ` Mark Geisert
2021-05-06 10:30               ` Corinna Vinschen
2021-05-06 12:18                 ` Marco Atzeri

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).