public inbox for cygwin@cygwin.com
* python3 3.9.18-1 hanging
@ 2024-01-30  9:22 Andrew Murray
  2024-01-30 10:18 ` Marco Atzeri
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Murray @ 2024-01-30  9:22 UTC
  To: cygwin

Hi,

Over in Pillow’s GitHub Actions, the Cygwin job has started hanging when running pytest - https://github.com/python-pillow/Pillow/actions/runs/7690866578/job/20955296663

Interestingly, this is not something specific to Pillow. Cygwin has also been hanging for NumPy’s GitHub Actions, https://github.com/numpy/numpy/issues/25708, and GitPython, https://github.com/gitpython-developers/GitPython/pull/1814.

The conclusion that these Python developers have come to is that the problem started when Cygwin’s python3 package updated to 3.9.18-1. For the moment, they have downgraded to 3.9.16-1.

However, I’m letting you know here about the problem, in case you would like to find a more permanent solution.

Thanks


* Re: python3 3.9.18-1 hanging
  2024-01-30  9:22 python3 3.9.18-1 hanging Andrew Murray
@ 2024-01-30 10:18 ` Marco Atzeri
  2024-01-30 12:13   ` ggl329
                     ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Marco Atzeri @ 2024-01-30 10:18 UTC
  To: cygwin

On 30/01/2024 10:22, Andrew Murray via Cygwin wrote:
> Hi,
> 
> Over in Pillow’s GitHub Actions, the Cygwin job has started hanging when running pytest - https://github.com/python-pillow/Pillow/actions/runs/7690866578/job/20955296663
> 
> Interestingly, this is not something specific to Pillow. Cygwin has also been hanging for NumPy’s GitHub Actions, https://github.com/numpy/numpy/issues/25708, and GitPython, https://github.com/gitpython-developers/GitPython/pull/1814.
> 
> The conclusion that these Python developers have come to is that the problem started when Cygwin’s python3 package updated to 3.9.18-1. For the moment, they have downgraded to 3.9.16-1.
> 
> However, I’m letting you know here about the problem, in case you would like to find a more permanent solution.
> 
> Thanks
> 

Thanks,
I will look into it.

Of course, the problem showed up as soon as I moved the package out of test.

:-(




* Re: python3 3.9.18-1 hanging
  2024-01-30 10:18 ` Marco Atzeri
@ 2024-01-30 12:13   ` ggl329
  2024-01-30 13:23     ` Marco Atzeri
  2024-01-31 16:36   ` Eliah Kagan
  2024-02-20  2:46   ` Takashi Yano
  2 siblings, 1 reply; 9+ messages in thread
From: ggl329 @ 2024-01-30 12:13 UTC
  To: cygwin

Hi,

> On 30/01/2024 10:22, Andrew Murray via Cygwin wrote:
>> Interestingly, this is not something specific to Pillow. Cygwin has also been hanging for NumPy’s GitHub Actions, https://github.com/numpy/numpy/issues/25708, and GitPython, https://github.com/gitpython-developers/GitPython/pull/1814.

In my environment, python-numpy stops working too.

$ python
Python 3.9.16 (main, Mar  8 2023, 22:47:22)
[GCC 11.3.0] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> help(numpy)
### It hangs here.


After downgrading libopenblas from 0.3.26-1 to 0.3.25-1, it works again.
Python 3.9.18 behaves the same way.

This may be related to the issue.

-- 
   ggl329


* Re: python3 3.9.18-1 hanging
  2024-01-30 12:13   ` ggl329
@ 2024-01-30 13:23     ` Marco Atzeri
  0 siblings, 0 replies; 9+ messages in thread
From: Marco Atzeri @ 2024-01-30 13:23 UTC
  To: cygwin

On 30/01/2024 13:13, ggl329 via Cygwin wrote:
> Hi,
> 
>> On 30/01/2024 10:22, Andrew Murray via Cygwin wrote:
>>> Interestingly, this is not something specific to Pillow. Cygwin has 
>>> also been hanging for NumPy’s GitHub Actions, 
>>> https://github.com/numpy/numpy/issues/25708, and GitPython, 
>>> https://github.com/gitpython-developers/GitPython/pull/1814.
> 
> In my environment, python-numpy stops working too.
> 
> $ python
> Python 3.9.16 (main, Mar  8 2023, 22:47:22)
> [GCC 11.3.0] on cygwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy
> >>> help(numpy)
> ### stop here.
> 
> 
> By downgrading libopenblas from 0.3.26-1 to 0.3.25-1, it works again.
> Python 3.9.18 behaves same.
> 
> This may be related to the issue.
> 

This is probably a different issue.
Let me check in the coming days.

Regards
Marco



* Re: python3 3.9.18-1 hanging
  2024-01-30 10:18 ` Marco Atzeri
  2024-01-30 12:13   ` ggl329
@ 2024-01-31 16:36   ` Eliah Kagan
  2024-01-31 17:17     ` Marco Atzeri
  2024-02-20  2:46   ` Takashi Yano
  2 siblings, 1 reply; 9+ messages in thread
From: Eliah Kagan @ 2024-01-31 16:36 UTC
  To: cygwin

Hi,

The fastest way to produce the problem described in
https://cygwin.com/pipermail/cygwin/2024-January/255267.html and 
https://cygwin.com/pipermail/cygwin/2024-January/255273.html seems to be 
to run `pip install ...` on a version of `pip` that uses its vendored 
`rich` dependency to draw progress bars. (The hang reliably occurs at 0% 
on the *second* progress bar, and `--progress-bar off` avoids it.) 
Examining what `pip` is doing *may* be sufficient to investigate this.
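
As a minimal sketch of that workaround, assuming `pip` is driven from a Python wrapper: the helper name `pip_install_argv` is hypothetical, while `--progress-bar off` is the pip option mentioned above. It only sidesteps the progress-bar code path and does not address the underlying hang.

```python
import sys


def pip_install_argv(*packages):
    """Build a pip command line with progress bars disabled (hypothetical helper)."""
    return [sys.executable, "-m", "pip", "install", "--progress-bar", "off", *packages]


# Usage (not executed here; "example-package" is a placeholder name):
#   import subprocess
#   subprocess.run(pip_install_argv("example-package"), check=True)
```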

However, I was able to make a *fairly* simple script that reliably 
produces it, at least on my machine (and on GitHub Actions runners). It 
seems to me that this script may give some insight. In case it's useful:

import hashlib
import threading
import time
t1 = threading.Thread(target=lambda: print("hello"))
t2 = threading.Thread(target=lambda: print("goodbye"))
t1.start()
time.sleep(1)
print("in between")
t2.start()
t1.join()
t2.join()

The interesting thing here is that the `hashlib` import is required. 
Even though that import is not used, the script does not trigger the 
problem if it is removed.

As discussed at
https://github.com/gitpython-developers/GitPython/pull/1814, this script 
is motivated by code in GitPython that produces the hang when unit tests 
are run. The script hangs when attempting to execute `t2.start()`. The 
effect appears specific to Python 3.9.18 on Cygwin. Running that script 
with Python 3.9.16 on Cygwin, or on either Python 3.9.16 or Python 
3.9.18 on either Ubuntu 22.04 LTS or macOS 13, does not produce the 
problem. (I don't have native Windows builds of those versions to test 
with at this time.)
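
A comparison like this across interpreters can be automated without letting a job hang indefinitely. The following is a minimal sketch, not something taken from the reports: `run_with_timeout` and the 10-second default are illustrative assumptions, and the script under test is assumed to be the reproducer above saved to a file.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s=10.0):
    """Run argv and classify the outcome as 'ok', 'failed', or 'hang'.

    Hypothetical helper. On timeout, subprocess.run() kills the child and
    waits for it, so a process that cannot be killed may still block here.
    """
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return "hang"
    return "ok" if result.returncode == 0 else "failed"


# Usage (not executed here), with the reproducer saved as simple.py:
#   print(run_with_timeout(["/usr/bin/python3.9", "simple.py"]))
```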

`t1` can be joined before `t2` is started, and the problem still 
reliably occurs. If that is done, then the sleep can be omitted and the 
problem sometimes occurs. Running the statements in a REPL also produces 
the problem without requiring a sleep (presumably the delay of entering 
them is sufficient). The child threads and main thread don't have to 
print to produce the problem; I included that to make it clearer what's 
going on. I have not tested non-blocking delays.
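
The first variation described above, joining `t1` before `t2` is started, might look like the sketch below. It is reconstructed from the description rather than copied from the script that was actually run; the function name `run_variant`, the `delay_s` parameter, and the position of the sleep relative to the join are assumptions.

```python
import hashlib  # unused, but reported as required to trigger the hang
import threading
import time


def run_variant(delay_s=1.0):
    """Join t1 before starting t2; return True if both threads finished."""
    t1 = threading.Thread(target=lambda: print("hello"))
    t2 = threading.Thread(target=lambda: print("goodbye"))
    t1.start()
    t1.join()            # t1 has completely finished at this point
    time.sleep(delay_s)  # reportedly optional in this variant (use 0 to omit)
    print("in between")
    t2.start()           # the reported hang point with Python 3.9.18 on Cygwin
    t2.join()
    return not (t1.is_alive() or t2.is_alive())


if __name__ == "__main__":
    run_variant()
```

On an unaffected interpreter this should simply run to completion.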

I named that `simple.py` and ran it in various ways to verify that it 
triggers the problem, but I think the most important ways to report are:

/usr/bin/python3.9 simple.py

And:

strace -o strace.out /usr/bin/python3.9 simple.py

By the time I killed the process in the strace run, `strace.out` had 
grown to 1819328 lines, most of which were:

--- Process 25112 (pid: 20768), exception c0000005 at 0000000000000000

(This is the same pattern Daniel Abrahamsson reported when running
`pip install` with strace.)
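
To summarize a log like this without reading it by eye, something along the following lines could be used. It is only a sketch: the regular expression is inferred from the single line format shown above, and `summarize_exceptions` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the one example line above (an assumption).
EXC_RE = re.compile(
    r"--- Process \d+ \(pid: \d+\), exception ([0-9a-fA-F]+) at ([0-9a-fA-F]+)"
)


def summarize_exceptions(lines):
    """Count exception lines, keyed by (exception code, address)."""
    counts = Counter()
    for line in lines:
        match = EXC_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Usage (not executed here):
#   with open("strace.out", errors="replace") as f:
#       print(summarize_exceptions(f).most_common(5))
```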

I made a copy of the first 6610 lines as `truncated.out`, but even that 
is 828 KiB, so I've posted it here rather than attaching it:

https://gist.github.com/EliahKagan/04143302056426d72c7a617d65890dda

The last 8 lines of `truncated.out` are identical, and the original 
`strace.out` continued that way.

(Although the strace output shows that this was run from a directory 
related to GitPython, this was not done with any virtual environment 
activated, nothing from GitPython was imported or otherwise used, and 
neither GitPython nor its distinctive dependencies gitdb and smmap were 
installed in the global environment.)

That GitHub Gist also includes `simple.py` for convenience, and 
`cygcheck.out` in case that would somehow be useful.

-Eliah


* Re: python3 3.9.18-1 hanging
  2024-01-31 16:36   ` Eliah Kagan
@ 2024-01-31 17:17     ` Marco Atzeri
  0 siblings, 0 replies; 9+ messages in thread
From: Marco Atzeri @ 2024-01-31 17:17 UTC
  To: cygwin

On 31/01/2024 17:36, Eliah Kagan via Cygwin wrote:
> Hi,
> 

> 
> However, I was able to make a *fairly* simple script that reliably 
> produces it, at least on my machine (and on GitHub Actions runners). It 
> seems to me that this script may give some insight. In case it's useful:
> 
> import hashlib
> import threading
> import time
> t1 = threading.Thread(target=lambda: print("hello"))
> t2 = threading.Thread(target=lambda: print("goodbye"))
> t1.start()
> time.sleep(1)
> print("in between")
> t2.start()
> t1.join()
> t2.join()
> 
> The interesting thing here is that the `hashlib` import is required. 
> Even though that import is not used, the script does not trigger the 
> problem if it is removed.
> 
..

> 
> -Eliah
> 

Thanks, Eliah, for the detailed investigation.

I will look into it, likely during the weekend.

Regards
Marco



* Re: python3 3.9.18-1 hanging
  2024-01-30 10:18 ` Marco Atzeri
  2024-01-30 12:13   ` ggl329
  2024-01-31 16:36   ` Eliah Kagan
@ 2024-02-20  2:46   ` Takashi Yano
  2024-02-20  3:46     ` Marco Atzeri
  2 siblings, 1 reply; 9+ messages in thread
From: Takashi Yano @ 2024-02-20  2:46 UTC
  To: cygwin

On Tue, 30 Jan 2024 11:18:37 +0100
Marco Atzeri wrote:
> On 30/01/2024 10:22, Andrew Murray via Cygwin wrote:
> > Hi,
> > 
> > Over in Pillow’s GitHub Actions, the Cygwin job has started hanging when running pytest - https://github.com/python-pillow/Pillow/actions/runs/7690866578/job/20955296663
> > 
> > Interestingly, this is not something specific to Pillow. Cygwin has also been hanging for NumPy’s GitHub Actions, https://github.com/numpy/numpy/issues/25708, and GitPython, https://github.com/gitpython-developers/GitPython/pull/1814.
> > 
> > The conclusion that these Python developers have come to is that the problem started when Cygwin’s python3 package updated to 3.9.18-1. For the moment, they have downgraded to 3.9.16-1.
> > 
> > However, I’m letting you know here about the problem, in case you would like to find a more permanent solution.
> > 
> > Thanks
> > 
> 
> thanks,
> I will look
> 
> of course as soon I removed from test the problem raised

I probably hit the same issue while building pango1.0 1.51.2-1
with python39 3.9.18-2 (Test) installed.

Several python3.9 processes are stalled and some of them cannot
be terminated even with kill -9.

Attaching gdb to them shows a SIGSEGV in a thread whose stack
is corrupted. The program counter is 0, as shown below. Could it be
a call through a null function pointer?

Thread 12 "python3.9" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 19184.0x18ac]
0x0000000000000000 in ?? ()

Another thread seems to be stopped in exit_thread() (winsup/cygwin/sigproc.cc).
That may be a consequence of the SIGSEGV.

Downgrading it to 3.9.16-1 solves the issue.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: python3 3.9.18-1 hanging
  2024-02-20  2:46   ` Takashi Yano
@ 2024-02-20  3:46     ` Marco Atzeri
  2024-02-24  6:42       ` Marco Atzeri
  0 siblings, 1 reply; 9+ messages in thread
From: Marco Atzeri @ 2024-02-20  3:46 UTC
  To: cygwin

On 20/02/2024 03:46, Takashi Yano via Cygwin wrote:
> On Tue, 30 Jan 2024 11:18:37 +0100
> Marco Atzeri wrote:
>> On 30/01/2024 10:22, Andrew Murray via Cygwin wrote:
>>> Hi,
>>>
>>> Over in Pillow’s GitHub Actions, the Cygwin job has started hanging when running pytest - https://github.com/python-pillow/Pillow/actions/runs/7690866578/job/20955296663
>>>
>>> Interestingly, this is not something specific to Pillow. Cygwin has also been hanging for NumPy’s GitHub Actions, https://github.com/numpy/numpy/issues/25708, and GitPython, https://github.com/gitpython-developers/GitPython/pull/1814.
>>>
>>> The conclusion that these Python developers have come to is that the problem started when Cygwin’s python3 package updated to 3.9.18-1. For the moment, they have downgraded to 3.9.16-1.
>>>
>>> However, I’m letting you know here about the problem, in case you would like to find a more permanent solution.
>>>
>>> Thanks
>>>
>>
>> thanks,
>> I will look
>>
>> of course as soon I removed from test the problem raised
> 
> I hit probably the same issue while building pango1.0 1.51.2-1
> with python39 3.9.18-2 (Test) installed.
> 
> Several python3.9 processes are stalled and some of them cannot
> be terminated even with kill -9.
> 
> Attaching gdb to them shows SIGSEGV in the thread whose stack
> is corrupted. The program counter is 0 as follows. Null function
> pointer dereference?
> 
> Thread 12 "python3.9" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 19184.0x18ac]
> 0x0000000000000000 in ?? ()
> 
> Another thread seems to stop in exit_thread() (winsup/cygwin/sigproc.cc).
> It may be due to SEGV.
> 
> Downgrading it to 3.9.16-1 solves the issue.
> 

The problem seems to have been introduced somewhere between 3.9.16 and 3.9.17.

That is also the upstream transition where several patches are
no longer used, so it seems the new internal code path has some
hidden consequence for Cygwin.




* Re: python3 3.9.18-1 hanging
  2024-02-20  3:46     ` Marco Atzeri
@ 2024-02-24  6:42       ` Marco Atzeri
  0 siblings, 0 replies; 9+ messages in thread
From: Marco Atzeri @ 2024-02-24  6:42 UTC
  To: cygwin

On 20/02/2024 04:46, Marco Atzeri wrote:
> On 20/02/2024 03:46, Takashi Yano via Cygwin wrote:

>>
>> I hit probably the same issue while building pango1.0 1.51.2-1
>> with python39 3.9.18-2 (Test) installed.
>>
>> Several python3.9 processes are stalled and some of them cannot
>> be terminated even with kill -9.
>>
>> Attaching gdb to them shows SIGSEGV in the thread whose stack
>> is corrupted. The program counter is 0 as follows. Null function
>> pointer dereference?
>>
>> Thread 12 "python3.9" received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 19184.0x18ac]
>> 0x0000000000000000 in ?? ()
>>
>> Another thread seems to stop in exit_thread() (winsup/cygwin/sigproc.cc).
>> It may be due to SEGV.
>>
>> Downgrading it to 3.9.16-1 solves the issue.
>>
> 
> The problem seems to have been introduced somewhere between 3.9.16 and 3.9.17.
> 
> That is also the upstream transition where several patches are
> no longer used, so it seems the new internal code path has some
> hidden consequence for Cygwin.
> 

To avoid any more "me too" test failures, I have removed python 3.9.18-1 and -2
completely.

Regards
Marco





end of thread, other threads:[~2024-02-24  6:43 UTC | newest]

Thread overview: 9+ messages
2024-01-30  9:22 python3 3.9.18-1 hanging Andrew Murray
2024-01-30 10:18 ` Marco Atzeri
2024-01-30 12:13   ` ggl329
2024-01-30 13:23     ` Marco Atzeri
2024-01-31 16:36   ` Eliah Kagan
2024-01-31 17:17     ` Marco Atzeri
2024-02-20  2:46   ` Takashi Yano
2024-02-20  3:46     ` Marco Atzeri
2024-02-24  6:42       ` Marco Atzeri
