[ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1

public inbox for cygwin@cygwin.com

* [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
@ 2015-06-20 21:15 Corinna Vinschen
  2015-06-21 18:47 ` Ken Brown
  0 siblings, 1 reply; 29+ messages in thread
From: Corinna Vinschen @ 2015-06-20 21:15 UTC (permalink / raw)
  To: cygwin

Hi Cygwin friends and users,


I released a TEST version of Cygwin.  The version number is 2.1.0-0.1.

This test release is mostly for interested *developers*.

The important news which needs some testing is the implementation of
sigaltstack(2) and the underlying implementation of running a signal
handler on the alternate signal stack.

Implementation details:

- The alternate signal stack installed via sigaltstack is only valid
  for the current thread.  Each thread must call its own sigaltstack.
  On pthread_create, the alternate signal stack setting of the calling
  thread is *not* propagated to the newly created thread.  This follows
  current Linux semantics.

- The alternate signal stack is a minimal stack.  Certain data structures
  used by Cygwin (_cygtls area) and Windows (on 32 bit: exception records)
  are not copied over to the alternate signal stack.  The stack settings
  in the Thread Environment Block (TEB) do not reflect the current
  alternate stack while running the signal handler.  The TEB will still
  point to the original thread stack.  This seems to work nicely in my
  testing, but there may be Windows functions which stop working in this
  scenario.

- The 32 bit version stores the original stack register content at the
  base of the alternate stack.  If you screw this up while running
  the signal handler, your thread is doomed on return to the caller.

- The 64 bit version stores the information in callee-saved registers
  r12 and r13 per MS-ABI.

I'd be grateful if curious developers would give this new sigaltstack
implementation a whirl and report back if it's working for them as
desired/expected.  And if not, simple reproducers in plain C are most
welcome in this case.  Discussing aspects of this implementation may be
best handled on the cygwin-developers mailing list or the
#cygwin-developers IRC channel on Freenode.


All changes in this release so far:
===================================

What's new:
-----------

- First cut of an implementation to allow signal handlers running on an
  alternate signal stack.
  
- New API sigaltstack, plus definitions for SA_ONSTACK, SS_ONSTACK, SS_DISABLE,
  MINSIGSTKSZ, SIGSTKSZ.

- New API: sethostname.


Bug Fixes
---------

- Enable non-SA_RESTART behaviour on threads other than main thread.
  Addresses: https://cygwin.com/ml/cygwin/2015-06/msg00260.html

- Try to handle concurrent close on socket more gracefully
  Addresses: https://cygwin.com/ml/cygwin/2015-06/msg00235.html


To install 32 bit Cygwin use https://cygwin.com/setup-x86.exe
To install 64 bit Cygwin use https://cygwin.com/setup-x86_64.exe


Have fun,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-20 21:15 [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1 Corinna Vinschen
@ 2015-06-21 18:47 ` Ken Brown
  2015-06-22 11:08   ` Corinna Vinschen
  0 siblings, 1 reply; 29+ messages in thread
From: Ken Brown @ 2015-06-21 18:47 UTC (permalink / raw)
  To: cygwin

On 6/20/2015 4:55 PM, Corinna Vinschen wrote:
> - First cut of an implementation to allow signal handlers running on an
>    alternate signal stack.
>
> - New API sigaltstack, plus definitions for SA_ONSTACK, SS_ONSTACK, SS_DISABLE,
>    MINSIGSTKSZ, SIGSTKSZ.

I must be doing something wrong.  Shouldn't including signal.h make the new API 
available?

$ uname -a
CYGWIN_NT-6.1-WOW fiona 2.1.0(0.287/5/3) 2015-06-20 21:44 i686 Cygwin

$ cygcheck -cd cygwin-devel
Cygwin Package Information
Package              Version
cygwin-devel         2.1.0-0.1

$ cat test.c
#include <signal.h>
int
main()
{
   int foo = SIGSTKSZ;
   return 0;
}

$ gcc test.c
test.c: In function ‘main’:
test.c:6:13: error: ‘SIGSTKSZ’ undeclared (first use in this function)
    int foo = SIGSTKSZ;
              ^

Ken


* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-21 18:47 ` Ken Brown
@ 2015-06-22 11:08   ` Corinna Vinschen
  2015-06-26 11:12     ` Corinna Vinschen
  0 siblings, 1 reply; 29+ messages in thread
From: Corinna Vinschen @ 2015-06-22 11:08 UTC (permalink / raw)
  To: cygwin

Hi Ken,

On Jun 21 14:47, Ken Brown wrote:
> On 6/20/2015 4:55 PM, Corinna Vinschen wrote:
> >- First cut of an implementation to allow signal handlers running on an
> >   alternate signal stack.
> >
> >- New API sigaltstack, plus definitions for SA_ONSTACK, SS_ONSTACK, SS_DISABLE,
> >   MINSIGSTKSZ, SIGSTKSZ.
> 
> I must be doing something wrong.  Shouldn't including signal.h make the new
> API available?
> 
> $ uname -a
> CYGWIN_NT-6.1-WOW fiona 2.1.0(0.287/5/3) 2015-06-20 21:44 i686 Cygwin
> 
> $ cygcheck -cd cygwin-devel
> Cygwin Package Information
> Package              Version
> cygwin-devel         2.1.0-0.1
> 
> $ cat test.c
> #include <signal.h>
> int
> main()
> {
>   int foo = SIGSTKSZ;
>   return 0;
> }
> 
> $ gcc test.c
> test.c: In function ‘main’:
> test.c:6:13: error: ‘SIGSTKSZ’ undeclared (first use in this function)
>    int foo = SIGSTKSZ;
>              ^

You're not doing anything wrong.  The relevant definitions in
sys/signal.h were originally only available for RTEMS.  I just
made them available for all platforms.  The problem was that the
original code failed to include sys/cdefs.h, which is required to
get the macros guarding the definitions.  I fixed that in the git
repo.

I also made a bigger change to the code setting up the alternate stack
when calling the signal handler function.  It turned out that my code
moving to the new stack failed to save all potentially clobbered
volatile registers on both platforms.

I'll upload new snapshots and 2.1.0-0.2 test releases shortly.


Thanks,
Corinna


* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-22 11:08   ` Corinna Vinschen
@ 2015-06-26 11:12     ` Corinna Vinschen
  2015-06-26 12:02       ` Ken Brown
  0 siblings, 1 reply; 29+ messages in thread
From: Corinna Vinschen @ 2015-06-26 11:12 UTC (permalink / raw)
  To: cygwin

Hi Ken,

On Jun 22 13:08, Corinna Vinschen wrote:
> On Jun 21 14:47, Ken Brown wrote:
> > On 6/20/2015 4:55 PM, Corinna Vinschen wrote:
> > >- First cut of an implementation to allow signal handlers running on an
> > >   alternate signal stack.
> > >
> > >- New API sigaltstack, plus definitions for SA_ONSTACK, SS_ONSTACK, SS_DISABLE,
> > >   MINSIGSTKSZ, SIGSTKSZ.
> > 
> > I must be doing something wrong.  Shouldn't including signal.h make the new
> > API available?
> > 
> > $ uname -a
> > CYGWIN_NT-6.1-WOW fiona 2.1.0(0.287/5/3) 2015-06-20 21:44 i686 Cygwin
> > 
> > $ cygcheck -cd cygwin-devel
> > Cygwin Package Information
> > Package              Version
> > cygwin-devel         2.1.0-0.1
> > 
> > $ cat test.c
> > #include <signal.h>
> > int
> > main()
> > {
> >   int foo = SIGSTKSZ;
> >   return 0;
> > }
> > 
> > $ gcc test.c
> > test.c: In function ‘main’:
> > test.c:6:13: error: ‘SIGSTKSZ’ undeclared (first use in this function)
> >    int foo = SIGSTKSZ;
> >              ^
> 
> You're not doing anything wrong.  The relevant definitions in
> sys/signal.h were originally only available for RTEMS.  I just
> made them available for all platforms.  The problem was that the
> original code failed to include sys/cdefs.h, which is required to
> get the macros guarding the definitions.  I fixed that in the git
> repo.
> 
> I also made a bigger change to the code setting up the alternate stack
> when calling the signal handler function.  It turned out that my code
> moving to the new stack failed to save all potentially clobbered
> volatile registers on both platforms.
> 
> I'll upload new snapshots and 2.1.0-0.2 test releases shortly.

Did you have a chance to test this a bit in the meantime?


Thanks,
Corinna


* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-26 11:12     ` Corinna Vinschen
@ 2015-06-26 12:02       ` Ken Brown
  2015-06-26 14:14         ` Corinna Vinschen
  0 siblings, 1 reply; 29+ messages in thread
From: Ken Brown @ 2015-06-26 12:02 UTC (permalink / raw)
  To: cygwin

Hi Corinna,

On 6/26/2015 7:12 AM, Corinna Vinschen wrote:
> Hi Ken,
>
> On Jun 22 13:08, Corinna Vinschen wrote:
>> On Jun 21 14:47, Ken Brown wrote:
>>> On 6/20/2015 4:55 PM, Corinna Vinschen wrote:
>>>> - First cut of an implementation to allow signal handlers running on an
>>>>    alternate signal stack.
>>>>
>>>> - New API sigaltstack, plus definitions for SA_ONSTACK, SS_ONSTACK, SS_DISABLE,
>>>>    MINSIGSTKSZ, SIGSTKSZ.
>>>
>>> I must be doing something wrong.  Shouldn't including signal.h make the new
>>> API available?
>>>
>>> $ uname -a
>>> CYGWIN_NT-6.1-WOW fiona 2.1.0(0.287/5/3) 2015-06-20 21:44 i686 Cygwin
>>>
>>> $ cygcheck -cd cygwin-devel
>>> Cygwin Package Information
>>> Package              Version
>>> cygwin-devel         2.1.0-0.1
>>>
>>> $ cat test.c
>>> #include <signal.h>
>>> int
>>> main()
>>> {
>>>    int foo = SIGSTKSZ;
>>>    return 0;
>>> }
>>>
>>> $ gcc test.c
>>> test.c: In function ‘main’:
>>> test.c:6:13: error: ‘SIGSTKSZ’ undeclared (first use in this function)
>>>     int foo = SIGSTKSZ;
>>>               ^
>>
>> You're not doing anything wrong.  The relevant definitions in
>> sys/signal.h were originally only available for RTEMS.  I just
>> made them available for all platforms.  The problem was that the
>> original code failed to include sys/cdefs.h, which is required to
>> get the macros guarding the definitions.  I fixed that in the git
>> repo.
>>
>> I also made a bigger change to the code setting up the alternate stack
>> when calling the signal handler function.  It turned out that my code
>> moving to the new stack failed to save all potentially clobbered
>> volatile registers on both platforms.
>>
>> I'll upload new snapshots and 2.1.0-0.2 test releases shortly.
>
> did you have a chance to test this a bit, in the meantime?

Yes, but I don't have anything definitive to report yet.  I tried to test a
facility in emacs that uses the alternate stack to recover from stack overflow
(of the main stack) under some circumstances.  The configure script did detect
the alternate stack.

I then made the stack overflow by defining an elisp function that did an
infinite recursion.  emacs still crashed, but the "segmentation fault" message
was printed twice instead of once.  I haven't had a chance yet to investigate
further and try to see what's going on.  What I hope is that the alternate stack
functioned correctly but the code was still not able to recover for some reason.
I've appended below the signal handler in case you want to see if you think it
ought to work on Cygwin.

Ken

The signal handler:

/* Attempt to recover from SIGSEGV caused by C stack overflow.  */
static void
handle_sigsegv (int sig, siginfo_t *siginfo, void *arg)
{
   /* Hard GC error may lead to stack overflow caused by
      too nested calls to mark_object.  No way to survive.  */
   if (!gc_in_progress)
     {
       struct rlimit rlim;

       if (!getrlimit (RLIMIT_STACK, &rlim))
	{
	  enum { STACK_DANGER_ZONE = 16 * 1024 };
	  char *beg, *end, *addr;

	  beg = stack_bottom;
	  end = stack_bottom + stack_direction * rlim.rlim_cur;
	  if (beg > end)
	    addr = beg, beg = end, end = addr;
	  addr = (char *) siginfo->si_addr;
	  /* If we're somewhere on stack and too close to
	     one of its boundaries, most likely this is it.  */
	  if (beg < addr && addr < end
	      && (addr - beg < STACK_DANGER_ZONE
		  || end - addr < STACK_DANGER_ZONE))
	    siglongjmp (return_to_command_loop, 1);
	}
     }

   /* Otherwise we can't do anything with this.  */
   deliver_fatal_thread_signal (sig);
}

The code to set up the signal handler on the alternate stack:

static bool
init_sigsegv (void)
{
   struct sigaction sa;
   stack_t ss;

   stack_direction = ((char *) &ss < stack_bottom) ? -1 : 1;

   ss.ss_sp = sigsegv_stack;
   ss.ss_size = sizeof (sigsegv_stack);
   ss.ss_flags = 0;
   if (sigaltstack (&ss, NULL) < 0)
     return 0;

   sigfillset (&sa.sa_mask);
   sa.sa_sigaction = handle_sigsegv;
   sa.sa_flags = SA_SIGINFO | SA_ONSTACK | emacs_sigaction_flags ();
   return sigaction (SIGSEGV, &sa, NULL) < 0 ? 0 : 1;
}


* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-26 12:02       ` Ken Brown
@ 2015-06-26 14:14         ` Corinna Vinschen
  2015-06-26 14:34           ` Ken Brown
  0 siblings, 1 reply; 29+ messages in thread
From: Corinna Vinschen @ 2015-06-26 14:14 UTC (permalink / raw)
  To: cygwin

Hi Ken,

On Jun 26 08:02, Ken Brown wrote:
> On 6/26/2015 7:12 AM, Corinna Vinschen wrote:
> >On Jun 22 13:08, Corinna Vinschen wrote:
> >>On Jun 21 14:47, Ken Brown wrote:
> >>>On 6/20/2015 4:55 PM, Corinna Vinschen wrote:
> >>>>- First cut of an implementation to allow signal handlers running on an
> >>>>   alternate signal stack.
> >>>>
> >>>>- New API sigaltstack, plus definitions for SA_ONSTACK, SS_ONSTACK, SS_DISABLE,
> >>>>   MINSIGSTKSZ, SIGSTKSZ.
> >>>[...]
> >>[...]
> >did you have a chance to test this a bit, in the meantime?
> 
> Yes, but I don't have anything definitive to report yet.  I tried to test a
> facility in emacs that uses the alternate stack to recover from stack
> overflow (of the main stack) under some circumstances.  The configure script
> did detect the alternate stack.
> 
> I then made the stack overflow by defining an elisp function that did an
> infinite recursion.  emacs still crashed, but the "segmentation fault"
> message was printed twice instead of once.  I haven't had a chance yet to
> investigate further and try to see what's going on.  What I hope is that the
> alternate stack functioned correctly but the code was still not able to
> recover for some reason.  I've appended below the signal handler in case you
> want to see if you think it ought to work on Cygwin.

Thank you.  I'll try to test this in the next couple of days.  One hint
and one question:

> The signal handler:
> 
> /* Attempt to recover from SIGSEGV caused by C stack overflow.  */
> static void
> handle_sigsegv (int sig, siginfo_t *siginfo, void *arg)
> {
>   /* Hard GC error may lead to stack overflow caused by
>      too nested calls to mark_object.  No way to survive.  */
>   if (!gc_in_progress)
>     {
>       struct rlimit rlim;
> 
>       if (!getrlimit (RLIMIT_STACK, &rlim))

This getrlimit probably won't work as desired.  I just had a quick look
at how this request is handled.  It will return the size of the alternate
stack while running the signal handler, rather than the size of the
initial thread's stack as required by POSIX.  This definitely needs
fixing.

> 	{
> 	  enum { STACK_DANGER_ZONE = 16 * 1024 };
> 	  char *beg, *end, *addr;
> 
> 	  beg = stack_bottom;
> 	  end = stack_bottom + stack_direction * rlim.rlim_cur;
> 	  if (beg > end)
> 	    addr = beg, beg = end, end = addr;
> 	  addr = (char *) siginfo->si_addr;
> 	  /* If we're somewhere on stack and too close to
> 	     one of its boundaries, most likely this is it.  */
> 	  if (beg < addr && addr < end
> 	      && (addr - beg < STACK_DANGER_ZONE
> 		  || end - addr < STACK_DANGER_ZONE))
> 	    siglongjmp (return_to_command_loop, 1);
> 	}
>     }
> 
>   /* Otherwise we can't do anything with this.  */
>   deliver_fatal_thread_signal (sig);
> }
> 
> The code to set up the signal handler on the alternate stack:
> 
> static bool
> init_sigsegv (void)
> {
>   struct sigaction sa;
>   stack_t ss;
> 
>   stack_direction = ((char *) &ss < stack_bottom) ? -1 : 1;
> 
>   ss.ss_sp = sigsegv_stack;
>   ss.ss_size = sizeof (sigsegv_stack);
                 ^^^^^^^^^^^^^^^^^^^^^^^

What's that size in bytes?

>   ss.ss_flags = 0;
>   if (sigaltstack (&ss, NULL) < 0)
>     return 0;
> 
>   sigfillset (&sa.sa_mask);
>   sa.sa_sigaction = handle_sigsegv;
>   sa.sa_flags = SA_SIGINFO | SA_ONSTACK | emacs_sigaction_flags ();
>   return sigaction (SIGSEGV, &sa, NULL) < 0 ? 0 : 1;
> }


Thanks,
Corinna


* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-26 14:14         ` Corinna Vinschen
@ 2015-06-26 14:34           ` Ken Brown
  2015-06-26 15:36             ` Corinna Vinschen
  0 siblings, 1 reply; 29+ messages in thread
From: Ken Brown @ 2015-06-26 14:34 UTC (permalink / raw)
  To: cygwin

Hi Corinna,

On 6/26/2015 10:14 AM, Corinna Vinschen wrote:
> Hi Ken,
>
> On Jun 26 08:02, Ken Brown wrote:
>> On 6/26/2015 7:12 AM, Corinna Vinschen wrote:
>>> On Jun 22 13:08, Corinna Vinschen wrote:
>>>> On Jun 21 14:47, Ken Brown wrote:
>>>>> On 6/20/2015 4:55 PM, Corinna Vinschen wrote:
>>>>>> - First cut of an implementation to allow signal handlers running on an
>>>>>>    alternate signal stack.
>>>>>>
>>>>>> - New API sigaltstack, plus definitions for SA_ONSTACK, SS_ONSTACK, SS_DISABLE,
>>>>>>    MINSIGSTKSZ, SIGSTKSZ.
>>>>> [...]
>>>> [...]
>>> did you have a chance to test this a bit, in the meantime?
>>
>> Yes, but I don't have anything definitive to report yet.  I tried to test a
>> facility in emacs that uses the alternate stack to recover from stack
>> overflow (of the main stack) under some circumstances.  The configure script
>> did detect the alternate stack.
>>
>> I then made the stack overflow by defining an elisp function that did an
>> infinite recursion.  emacs still crashed, but the "segmentation fault"
>> message was printed twice instead of once.  I haven't had a chance yet to
>> investigate further and try to see what's going on.  What I hope is that the
>> alternate stack functioned correctly but the code was still not able to
>> recover for some reason.  I've appended below the signal handler in case you
>> want to see if you think it ought to work on Cygwin.
>
> Thank you.  I'll try to test this in the next couple of days.  One hint
> and one question:
>
>> The signal handler:
>>
>> /* Attempt to recover from SIGSEGV caused by C stack overflow.  */
>> static void
>> handle_sigsegv (int sig, siginfo_t *siginfo, void *arg)
>> {
>>    /* Hard GC error may lead to stack overflow caused by
>>       too nested calls to mark_object.  No way to survive.  */
>>    if (!gc_in_progress)
>>      {
>>        struct rlimit rlim;
>>
>>        if (!getrlimit (RLIMIT_STACK, &rlim))
>
> This getrlimit probably won't work as desired.  I just had a quick look
> how this request is handled.  It will return the size of the alternate
> stack while running the signal handler, rather than the size of the
> initial thread's stack as required by POSIX.  This definitely needs
> fixing.
>
>> 	{
>> 	  enum { STACK_DANGER_ZONE = 16 * 1024 };
>> 	  char *beg, *end, *addr;
>>
>> 	  beg = stack_bottom;
>> 	  end = stack_bottom + stack_direction * rlim.rlim_cur;
>> 	  if (beg > end)
>> 	    addr = beg, beg = end, end = addr;
>> 	  addr = (char *) siginfo->si_addr;
>> 	  /* If we're somewhere on stack and too close to
>> 	     one of its boundaries, most likely this is it.  */
>> 	  if (beg < addr && addr < end
>> 	      && (addr - beg < STACK_DANGER_ZONE
>> 		  || end - addr < STACK_DANGER_ZONE))
>> 	    siglongjmp (return_to_command_loop, 1);
>> 	}
>>      }
>>
>>    /* Otherwise we can't do anything with this.  */
>>    deliver_fatal_thread_signal (sig);
>> }
>>
>> The code to set up the signal handler on the alternate stack:
>>
>> static bool
>> init_sigsegv (void)
>> {
>>    struct sigaction sa;
>>    stack_t ss;
>>
>>    stack_direction = ((char *) &ss < stack_bottom) ? -1 : 1;
>>
>>    ss.ss_sp = sigsegv_stack;
>>    ss.ss_size = sizeof (sigsegv_stack);
>                   ^^^^^^^^^^^^^^^^^^^^^^^
>
> What's that size in bytes?

SIGSTKSZ

>>    ss.ss_flags = 0;
>>    if (sigaltstack (&ss, NULL) < 0)
>>      return 0;
>>
>>    sigfillset (&sa.sa_mask);
>>    sa.sa_sigaction = handle_sigsegv;
>>    sa.sa_flags = SA_SIGINFO | SA_ONSTACK | emacs_sigaction_flags ();
>>    return sigaction (SIGSEGV, &sa, NULL) < 0 ? 0 : 1;

Ken


* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-26 14:34           ` Ken Brown
@ 2015-06-26 15:36             ` Corinna Vinschen
  2015-06-26 16:55               ` Ken Brown
  0 siblings, 1 reply; 29+ messages in thread
From: Corinna Vinschen @ 2015-06-26 15:36 UTC (permalink / raw)
  To: cygwin

Hi Ken,

On Jun 26 10:33, Ken Brown wrote:
> On 6/26/2015 10:14 AM, Corinna Vinschen wrote:
> >On Jun 26 08:02, Ken Brown wrote:
> >>On 6/26/2015 7:12 AM, Corinna Vinschen wrote:
> >Thank you.  I'll try to test this in the next couple of days.  One hint
> >and one question:
> >
> >>The signal handler:
> >>
> >>/* Attempt to recover from SIGSEGV caused by C stack overflow.  */
> >>static void
> >>handle_sigsegv (int sig, siginfo_t *siginfo, void *arg)
> >>{
> >>   /* Hard GC error may lead to stack overflow caused by
> >>      too nested calls to mark_object.  No way to survive.  */
> >>   if (!gc_in_progress)
> >>     {
> >>       struct rlimit rlim;
> >>
> >>       if (!getrlimit (RLIMIT_STACK, &rlim))
> >
> >This getrlimit probably won't work as desired.  I just had a quick look
> >how this request is handled.  It will return the size of the alternate
> >stack while running the signal handler, rather than the size of the
> >initial thread's stack as required by POSIX.  This definitely needs
> >fixing.
> >
> >>	{
> >>	  enum { STACK_DANGER_ZONE = 16 * 1024 };
> >>	  char *beg, *end, *addr;
> >>
> >>	  beg = stack_bottom;
> >>	  end = stack_bottom + stack_direction * rlim.rlim_cur;
> >>	  if (beg > end)
> >>	    addr = beg, beg = end, end = addr;
> >>	  addr = (char *) siginfo->si_addr;
> >>	  /* If we're somewhere on stack and too close to
> >>	     one of its boundaries, most likely this is it.  */
> >>	  if (beg < addr && addr < end
> >>	      && (addr - beg < STACK_DANGER_ZONE
> >>		  || end - addr < STACK_DANGER_ZONE))
> >>	    siglongjmp (return_to_command_loop, 1);
> >>	}
> >>     }
> >>
> >>   /* Otherwise we can't do anything with this.  */
> >>   deliver_fatal_thread_signal (sig);
> >>}
> >>
> >>The code to set up the signal handler on the alternate stack:
> >>
> >>static bool
> >>init_sigsegv (void)
> >>{
> >>   struct sigaction sa;
> >>   stack_t ss;
> >>
> >>   stack_direction = ((char *) &ss < stack_bottom) ? -1 : 1;
> >>
> >>   ss.ss_sp = sigsegv_stack;
> >>   ss.ss_size = sizeof (sigsegv_stack);
> >                  ^^^^^^^^^^^^^^^^^^^^^^^
> >
> >What's that size in bytes?
> 
> SIGSTKSZ

Thanks.  Another question:  How does emacs compute stack_bottom?


Corinna


* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-26 15:36             ` Corinna Vinschen
@ 2015-06-26 16:55               ` Ken Brown
  2015-06-26 20:10                 ` Corinna Vinschen
  2015-06-26 20:26                 ` Corinna Vinschen
  0 siblings, 2 replies; 29+ messages in thread
From: Ken Brown @ 2015-06-26 16:55 UTC (permalink / raw)
  To: cygwin

Hi Corinna,

On 6/26/2015 11:36 AM, Corinna Vinschen wrote:
> Hi Ken,
>
> On Jun 26 10:33, Ken Brown wrote:
>> On 6/26/2015 10:14 AM, Corinna Vinschen wrote:
>>> On Jun 26 08:02, Ken Brown wrote:
>>>> On 6/26/2015 7:12 AM, Corinna Vinschen wrote:
>>> Thank you.  I'll try to test this in the next couple of days.  One hint
>>> and one question:
>>>
>>>> The signal handler:
>>>>
>>>> /* Attempt to recover from SIGSEGV caused by C stack overflow.  */
>>>> static void
>>>> handle_sigsegv (int sig, siginfo_t *siginfo, void *arg)
>>>> {
>>>>    /* Hard GC error may lead to stack overflow caused by
>>>>       too nested calls to mark_object.  No way to survive.  */
>>>>    if (!gc_in_progress)
>>>>      {
>>>>        struct rlimit rlim;
>>>>
>>>>        if (!getrlimit (RLIMIT_STACK, &rlim))
>>>
>>> This getrlimit probably won't work as desired.  I just had a quick look
>>> how this request is handled.  It will return the size of the alternate
>>> stack while running the signal handler, rather than the size of the
>>> initial thread's stack as required by POSIX.  This definitely needs
>>> fixing.
>>>
>>>> 	{
>>>> 	  enum { STACK_DANGER_ZONE = 16 * 1024 };
>>>> 	  char *beg, *end, *addr;
>>>>
>>>> 	  beg = stack_bottom;
>>>> 	  end = stack_bottom + stack_direction * rlim.rlim_cur;
>>>> 	  if (beg > end)
>>>> 	    addr = beg, beg = end, end = addr;
>>>> 	  addr = (char *) siginfo->si_addr;
>>>> 	  /* If we're somewhere on stack and too close to
>>>> 	     one of its boundaries, most likely this is it.  */
>>>> 	  if (beg < addr && addr < end
>>>> 	      && (addr - beg < STACK_DANGER_ZONE
>>>> 		  || end - addr < STACK_DANGER_ZONE))
>>>> 	    siglongjmp (return_to_command_loop, 1);
>>>> 	}
>>>>      }
>>>>
>>>>    /* Otherwise we can't do anything with this.  */
>>>>    deliver_fatal_thread_signal (sig);
>>>> }
>>>>
>>>> The code to set up the signal handler on the alternate stack:
>>>>
>>>> static bool
>>>> init_sigsegv (void)
>>>> {
>>>>    struct sigaction sa;
>>>>    stack_t ss;
>>>>
>>>>    stack_direction = ((char *) &ss < stack_bottom) ? -1 : 1;
>>>>
>>>>    ss.ss_sp = sigsegv_stack;
>>>>    ss.ss_size = sizeof (sigsegv_stack);
>>>                   ^^^^^^^^^^^^^^^^^^^^^^^
>>>
>>> What's that size in bytes?
>>
>> SIGSTKSZ
>
> Thanks.  Another question:  How does emacs compute stack_bottom?

Very near the beginning of main() it does the following:

   char stack_bottom_variable;
[...]
   /* Record (approximately) where the stack begins.  */
   stack_bottom = &stack_bottom_variable;

Ken


* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-26 16:55               ` Ken Brown
@ 2015-06-26 20:10                 ` Corinna Vinschen
  2015-06-26 20:26                 ` Corinna Vinschen
  1 sibling, 0 replies; 29+ messages in thread
From: Corinna Vinschen @ 2015-06-26 20:10 UTC (permalink / raw)
  To: cygwin

Hi Ken,

On Jun 26 12:55, Ken Brown wrote:
> Hi Corinna,
> 
> On 6/26/2015 11:36 AM, Corinna Vinschen wrote:
> >Thanks.  Another question:  How does emacs compute stack_bottom?
> 
> Very near the beginning of main() it does the following:
> 
>   char stack_bottom_variable;
> [...]
>   /* Record (approximately) where the stack begins.  */
>   stack_bottom = &stack_bottom_variable;

Thank you.

I created an STC with your code snippets and it now works for me
(attached for reference).

First problem was the return value of getrlimit(RLIMIT_STACK).

Second problem is emacs.  The check for an offset of the offending
address in si_addr being less than 16K (STACK_DANGER_ZONE) is
non-portable, to put it mildly.  This might work on 32 bit Cygwin (I
didn't test that), but the value is too low for 64 bit Cygwin.  With
STACK_DANGER_ZONE == 32K the handler works as desired on 64 bit Cygwin.
Part of the reason is probably the _cygtls area of 12K reserved on each
thread's stack, which moves the address of &stack_bottom_variable to a
pretty low value right from the start.  Another is the size of the guard
page area on the main thread (16K).
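
To make the effect of the zone size concrete, the boundary test from handle_sigsegv can be pulled out into a pure function and fed sample numbers.  This is a sketch only: near_stack_boundary is a hypothetical name, the addresses are plain integers rather than real stack pointers, and the 20K offset used in the note after the code is an invented value chosen to fall between the two zone sizes, not a measurement from Cygwin.

```c
#include <stdint.h>

/* Same predicate as in the emacs handler, with the zone size passed
   as a parameter: ADDR must lie strictly inside (BEG, END) and within
   ZONE bytes of either boundary.  */
static int
near_stack_boundary (uintptr_t beg, uintptr_t end, uintptr_t addr,
                     uintptr_t zone)
{
  return beg < addr && addr < end
         && (addr - beg < zone || end - addr < zone);
}
```

With a hypothetical 2 MB range and a fault address 20K above its lower end, a 16K zone rejects the address while a 32K zone accepts it, so in the handler the former would fall through to the fatal path and the latter would take the siglongjmp path.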

I had a brief email exchange with a colleague of mine.  Ben allowed me to
quote him, so here are the important snippets of his replies:

- Rlimits are an old way of doing a job and they were to a certain
  extent tied up in the pre-thread world of Unix processes.  Rlimits
  have never been fully implemented on Linux in a way that reproduces
  the Unix way in the pre-thread era.  Rlimits have become a bit of a
  historical legacy and are there for POSIX compliance and code
  compatibility.  The POSIX language was designed to be vague enough that
  all implementations could be made to conform.

- Rather than making the system implementation conform to some
  unspecified behavior, I think it might be a wise idea to fix emacs
  instead. Looking at the code fragment you posted below(*), I’m not
  entirely convinced that the code would operate as intended on modern
  Linux or Unix. Given that, it may be better to make an implementation
  which does something like the current behavior was intended to do or
  better yet just remove it as a likely latent bug.

(*) Emacs' handle_sigsegv function.

Of course, for testing purposes this is still nice to have, so thank you
for this test, I really appreciate it.

As for getrlimit(RLIMIT_STACK), I changed that in git as outlined in my
previous mail.  On second thought, I also changed the values of
MINSIGSTKSZ and SIGSTKSZ.  Instead of 2K and 8K, they are now defined
as 32K and 64K.  The reason is that we then have enough space on the
alternate stack to install a _cygtls area, should the need arise.

I created new developer snapshots on https://cygwin.com/snapshots/
Please give them a try.

Remember to tweak STACK_DANGER_ZONE.  You'll have to rebuild emacs
anyway due to the change to [MIN]SIGSTKSZ.

Thanks,
Corinna

[-- Attachment #1.2: sigalt.c --]
[-- Type: text/plain, Size: 1833 bytes --]

#include <alloca.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <setjmp.h>
#include <sys/time.h>
#include <sys/resource.h>

int stack_direction;
char *stack_bottom;

sigjmp_buf return_to_command_loop;

/* Attempt to recover from SIGSEGV caused by C stack overflow.  */
static void
handle_sigsegv (int sig, siginfo_t *siginfo, void *arg)
{
  struct rlimit rlim;

  if (!getrlimit (RLIMIT_STACK, &rlim))
    {
      enum { STACK_DANGER_ZONE = 32 * 1024 };
      char *beg, *end, *addr;

      beg = stack_bottom;
      end = stack_bottom + stack_direction * rlim.rlim_cur;
      if (beg > end)
	addr = beg, beg = end, end = addr;
      addr = (char *) siginfo->si_addr;
      /* If we're somewhere on stack and too close to
	 one of its boundaries, most likely this is it.  */
      if (beg < addr && addr < end
	  && (addr - beg < STACK_DANGER_ZONE
	      || end - addr < STACK_DANGER_ZONE))
	siglongjmp (return_to_command_loop, 1);
    }
  /* Otherwise we can't do anything with this.  */
  abort ();
}

static int
init_sigsegv (void)
{
  struct sigaction sa;
  stack_t ss;

  stack_direction = ((char *) &ss < stack_bottom) ? -1 : 1;

  ss.ss_sp = malloc (SIGSTKSZ);
  ss.ss_size = SIGSTKSZ;
  ss.ss_flags = 0;
  if (sigaltstack (&ss, NULL) < 0)
    return 0;

  sigfillset (&sa.sa_mask);
  sa.sa_sigaction = handle_sigsegv;
  sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
  return sigaction (SIGSEGV, &sa, NULL) < 0 ? 0 : 1;
}

void foo ()
{
  int buf[512];
  foo ();
}

int
main ()
{
  char stack_bottom_variable;
  /* Record (approximately) where the stack begins.  */
  stack_bottom = &stack_bottom_variable;

  init_sigsegv ();
  if (!sigsetjmp (return_to_command_loop, 1))
    {
      printf ("command loop before crash\n");
      foo ();
    }
  else
    printf ("command loop after crash\n");
  return 0;
}

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-26 16:55               ` Ken Brown
  2015-06-26 20:10                 ` Corinna Vinschen
@ 2015-06-26 20:26                 ` Corinna Vinschen
  2015-06-26 22:28                   ` Ken Brown
  1 sibling, 1 reply; 29+ messages in thread
From: Corinna Vinschen @ 2015-06-26 20:26 UTC (permalink / raw)
  To: cygwin; +Cc: Ben Woodard

[-- Attachment #1.1: Type: text/plain, Size: 3208 bytes --]

[CC Ben, please keep him on the CC in replies.  Thank you]

Hi Ken,

On Jun 26 12:55, Ken Brown wrote:
> Hi Corinna,
> 
> On 6/26/2015 11:36 AM, Corinna Vinschen wrote:
> >Thanks.  Another question:  How does emacs compute stack_bottom?
> 
> Very near the beginning of main() it does the following:
> 
>   char stack_bottom_variable;
> [...]
>   /* Record (approximately) where the stack begins.  */
>   stack_bottom = &stack_bottom_variable;

Thank you.

I created an STC with your code snippets and it now works for me
(attached for reference).

First problem was the return value of getrlimit(RLIMIT_STACK).

Second problem is emacs.  The check for an offset of the offending
address in si_addr being less than 16K (STACK_DANGER_ZONE) is
non-portable, putting it mildly.  This might work on 32 bit Cygwin (I
didn't test that), but the value is too low for 64 bit Cygwin.  With
STACK_DANGER_ZONE == 32K the handler works as desired on 64 bit Cygwin.
Part of the reason is probably the _cygtls area of 12K reserved on each
thread's stack, which moves the address of &stack_bottom_variable to a
pretty low value right from the start.  Another is the size of the guard
page area on the main thread (16K).
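
To make the effect of the threshold concrete, the boundary check from
handle_sigsegv can be pulled out into a small pure function.  This is
only a sketch: near_stack_boundary and the addresses used with it are
made up for illustration and are not part of emacs or Cygwin.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the test in handle_sigsegv: true if
   addr lies strictly inside (beg, end) and within `zone' bytes of
   either end of that range.  */
static bool
near_stack_boundary (uintptr_t beg, uintptr_t end, uintptr_t addr,
		     size_t zone)
{
  return beg < addr && addr < end
	 && (addr - beg < zone || end - addr < zone);
}
```

With an assumed 1 MB stack at 0x100000..0x200000 and a fault 20K above
the low end, the check fails for a 16K zone and succeeds for a 32K zone,
which is the kind of difference described above.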

I had a brief email exchange with a colleague of mine.  Ben allowed me to
quote him, so here are the important snippets of his replies:

- Rlimits are an old way of doing a job and they were to a certain
  extent tied up in the pre-thread world of unix processes.  rlimits
  have never been fully implemented on linux with a way that reproduces
  the unix way in the pre-thread era. rlimits have become a bit of a
  historical legacy and are there for posix compliance and code
  compatibility. The posix language was designed to be vague enough that
  all implementations could be made to conform.

- Rather than making the system implementation conform to some
  unspecified behavior, I think it might be a wise idea to fix emacs
  instead. Looking at the code fragment you posted below(*), I’m not
  entirely convinced that the code would operate as intended on modern
  Linux or Unix. Given that, it may be better to make an implementation
  which does something like the current behavior was intended to do or
  better yet just remove it as a likely latent bug.

(*) Emacs' handle_sigsegv function.

Of course, for testing purposes this is still nice to have, so thank you
for this test, I really appreciate it.

As for getrlimit(RLIMIT_STACK), I changed that as outlined in my former
mail in git.  On second thought, I also changed the values of
MINSIGSTKSZ and SIGSTKSZ.  Instead of 2K and 8K, they are now defined
as 32K and 64K.  The reason is that we then have enough space on the
alternate stack to install a _cygtls area, should the need arise.

I created new developer snapshots on https://cygwin.com/snapshots/
Please give them a try.

Remember to tweak STACK_DANGER_ZONE.  You'll have to rebuild emacs
anyway due to the change to [MIN]SIGSTKSZ.

Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #1.2: sigalt.c --]
[-- Type: text/plain, Size: 1833 bytes --]

#include <alloca.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <setjmp.h>
#include <sys/time.h>
#include <sys/resource.h>

int stack_direction;
char *stack_bottom;

sigjmp_buf return_to_command_loop;

/* Attempt to recover from SIGSEGV caused by C stack overflow.  */
static void
handle_sigsegv (int sig, siginfo_t *siginfo, void *arg)
{
  struct rlimit rlim;

  if (!getrlimit (RLIMIT_STACK, &rlim))
    {
      enum { STACK_DANGER_ZONE = 32 * 1024 };
      char *beg, *end, *addr;

      beg = stack_bottom;
      end = stack_bottom + stack_direction * rlim.rlim_cur;
      if (beg > end)
	addr = beg, beg = end, end = addr;
      addr = (char *) siginfo->si_addr;
      /* If we're somewhere on stack and too close to
	 one of its boundaries, most likely this is it.  */
      if (beg < addr && addr < end
	  && (addr - beg < STACK_DANGER_ZONE
	      || end - addr < STACK_DANGER_ZONE))
	siglongjmp (return_to_command_loop, 1);
    }
  /* Otherwise we can't do anything with this.  */
  abort ();
}

static int
init_sigsegv (void)
{
  struct sigaction sa;
  stack_t ss;

  stack_direction = ((char *) &ss < stack_bottom) ? -1 : 1;

  ss.ss_sp = malloc (SIGSTKSZ);
  ss.ss_size = SIGSTKSZ;
  ss.ss_flags = 0;
  if (sigaltstack (&ss, NULL) < 0)
    return 0;

  sigfillset (&sa.sa_mask);
  sa.sa_sigaction = handle_sigsegv;
  sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
  return sigaction (SIGSEGV, &sa, NULL) < 0 ? 0 : 1;
}

void foo ()
{
  int buf[512];
  foo ();
}

int
main ()
{
  char stack_bottom_variable;
  /* Record (approximately) where the stack begins.  */
  stack_bottom = &stack_bottom_variable;

  init_sigsegv ();
  if (!sigsetjmp (return_to_command_loop, 1))
    {
      printf ("command loop before crash\n");
      foo ();
    }
  else
    printf ("command loop after crash\n");
  return 0;
}

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-26 20:26                 ` Corinna Vinschen
@ 2015-06-26 22:28                   ` Ken Brown
  2015-06-27 14:53                     ` Corinna Vinschen
  0 siblings, 1 reply; 29+ messages in thread
From: Ken Brown @ 2015-06-26 22:28 UTC (permalink / raw)
  To: cygwin, Ben Woodard

On 6/26/2015 4:05 PM, Corinna Vinschen wrote:
>
> [CC Ben, please keep him on the CC in replies.  Thank you]
>
>
> Hi Ken,
>
> On Jun 26 12:55, Ken Brown wrote:
>> Hi Corinna,
>>
>> On 6/26/2015 11:36 AM, Corinna Vinschen wrote:
>>> Thanks.  Another question:  How does emacs compute stack_bottom?
>>
>> Very near the beginning of main() it does the following:
>>
>>    char stack_bottom_variable;
>> [...]
>>    /* Record (approximately) where the stack begins.  */
>>    stack_bottom = &stack_bottom_variable;
>
> Thank you.
>
> I created an STC with your code snippets and it now works for me
> (attached for reference).
>
> First problem was the return value of getrlimit(RLIMIT_STACK).
>
> Second problem is emacs.  The check for an offset of the offending
> address in si_addr being less than 16K (STACK_DANGER_ZONE) is
> non-portable, putting it mildly.  This might work on 32 bit Cygwin (I
> didn't test that), but the value is too low for 64 bit Cygwin.  With
> STACK_DANGER_ZONE == 32K the handler works as desired on 64 bit Cygwin.
> Part of the reason is probably the _cygtls area of 12K reserved on each
> thread's stack, which moves the address of &stack_bottom_variable to a
> pretty low value right from the start.  Another is the size of the guard
> page area on the main thread (16K).
>
> I had a brief email exchange with a colleague of mine.  Ben allowed me to
> quote him, so here are the important snippets of his replies:
>
> - Rlimits are an old way of doing a job and they were to a certain
>    extent tied up in the pre-thread world of unix processes.  rlimits
>    have never been fully implemented on linux with a way that reproduces
>    the unix way in the pre-thread era. rlimits have become a bit of a
>    historical legacy and are there for posix compliance and code
>    compatibility. The posix language was designed to be vague enough that
>    all implementations could be made to conform.
>
> - Rather than making the system implementation conform to some
>    unspecified behavior, I think it might be a wise idea to fix emacs
>    instead. Looking at the code fragment you posted below(*), I’m not
>    entirely convinced that the code would operate as intended on modern
>    Linux or Unix. Given that, it may be better to make an implementation
>    which does something like the current behavior was intended to do or
>    better yet just remove it as a likely latent bug.
>
> (*) Emacs' handle_sigsegv function.
>
> Of course, for testing purposes this is still nice to have, so thank you
> for this test, I really appreciate it.
>
> As for getrlimit(RLIMIT_STACK), I changed that as outlined in my former
> mail in git.  On second thought, I also changed the values of
> MINSIGSTKSZ and SIGSTKSZ.  Instead of 2K and 8K, they are now defined
> as 32K and 64K.  The reason is that we then have enough space on the
> alternate stack to install a _cygtls area, should the need arise.
>
> I created new developer snapshots on https://cygwin.com/snapshots/
> Please give them a try.
>
> Remember to tweak STACK_DANGER_ZONE.  You'll have to rebuild emacs
> anyway due to the change to [MIN]SIGSTKSZ.

Hi Corinna and Ben,

It works now, in the sense that emacs doesn't crash, and it produces the 
message "Re-entering top level after C stack overflow".  I tested both 
32-bit and 64-bit Cygwin.  My test consisted of evaluating the following 
in the emacs *scratch* buffer:

(setq max-specpdl-size 83200000
       max-lisp-eval-depth 640000)
(defun foo () (foo))
(foo)

(The 'setq' is to override emacs's built-in protection against 
too-deeply nested lisp function calls.)

On the other hand, emacs doesn't really make a full recovery.  For 
example, if I try to call a subprocess (e.g., 'C-x d' to list a 
directory), I get a fork error:

Debugger entered--Lisp error: (file-error "Doing vfork" "Resource 
temporarily unavailable")
   call-process("ls" nil nil nil "--dired")
   dired-insert-directory("/home/kbrown/src/emacs/32build/" "-al" nil nil t)
   dired-readin-insert()
   dired-readin()
   dired-internal-noselect("~/src/emacs/32build/" nil)
   dired-noselect("~/src/emacs/32build/" nil)
   dired("~/src/emacs/32build/" nil)
   funcall-interactively(dired "~/src/emacs/32build/" nil)
   call-interactively(dired nil nil)
   command-execute(dired)

In view of what Ben said, I don't really care about this from the emacs 
point of view.  I mention it only in case it's useful to you for testing 
the alternate stack.

Ken

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-26 22:28                   ` Ken Brown
@ 2015-06-27 14:53                     ` Corinna Vinschen
  2015-06-30 19:55                       ` Corinna Vinschen
  2015-07-14 22:19                       ` Eric Blake
  0 siblings, 2 replies; 29+ messages in thread
From: Corinna Vinschen @ 2015-06-27 14:53 UTC (permalink / raw)
  To: cygwin; +Cc: Ben Woodard

[-- Attachment #1: Type: text/plain, Size: 1998 bytes --]

On Jun 26 18:28, Ken Brown wrote:
> On 6/26/2015 4:05 PM, Corinna Vinschen wrote:
> >As for getrlimit(RLIMIT_STACK), I changed that as outlined in my former
> >mail in git.  On second thought, I also changed the values of
> >MINSIGSTKSZ and SIGSTKSZ.  Instead of 2K and 8K, they are now defined
> >as 32K and 64K.  The reason is that we then have enough space on the
> >alternate stack to install a _cygtls area, should the need arise.
> >
> >I created new developer snapshots on https://cygwin.com/snapshots/
> >Please give them a try.
> >
> >Remember to tweak STACK_DANGER_ZONE.  You'll have to rebuild emacs
> >anyway due to the change to [MIN]SIGSTKSZ.
> 
> Hi Corinna and Ben,
> 
> It works now, in the sense that emacs doesn't crash, and it produces the
> message "Re-entering top level after C stack overflow".  I tested both
> 32-bit and 64-bit Cygwin.  My test consisted of evaluating the following in
> the emacs *scratch* buffer:
> 
> (setq max-specpdl-size 83200000
>       max-lisp-eval-depth 640000)
> (defun foo () (foo))
> (foo)
> 
> (The 'setq' is to override emacs's built-in protection against too-deeply
> nested lisp function calls.)
> 
> On the other hand, emacs doesn't really make a full recovery.  For example,
> if I try to call a subprocess (e.g., 'C-x d' to list a directory), I get a
> fork error:
> 
> Debugger entered--Lisp error: (file-error "Doing vfork" "Resource
> temporarily unavailable")

The problem is probably that there are still resources in use which
didn't get free'd.  I'll check next week if I can do anything about it.
Ideally with a simpler testcase than emacs :}

> In view of what Ben said, I don't really care about this from the emacs
> point of view.  I mention it only in case it's useful to you for testing the
> alternate stack.

Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-27 14:53                     ` Corinna Vinschen
@ 2015-06-30 19:55                       ` Corinna Vinschen
  2015-06-30 20:13                         ` Ken Brown
  2015-07-14 22:19                       ` Eric Blake
  1 sibling, 1 reply; 29+ messages in thread
From: Corinna Vinschen @ 2015-06-30 19:55 UTC (permalink / raw)
  To: cygwin, Ben Woodard

[-- Attachment #1: Type: text/plain, Size: 2745 bytes --]

On Jun 27 16:52, Corinna Vinschen wrote:
> On Jun 26 18:28, Ken Brown wrote:
> > On 6/26/2015 4:05 PM, Corinna Vinschen wrote:
> > >As for getrlimit(RLIMIT_STACK), I changed that as outlined in my former
> > >mail in git.  On second thought, I also changed the values of
> > >MINSIGSTKSZ and SIGSTKSZ.  Instead of 2K and 8K, they are now defined
> > >as 32K and 64K.  The reason is that we then have enough space on the
> > >alternate stack to install a _cygtls area, should the need arise.
> > >
> > >I created new developer snapshots on https://cygwin.com/snapshots/
> > >Please give them a try.
> > >
> > >Remember to tweak STACK_DANGER_ZONE.  You'll have to rebuild emacs
> > >anyway due to the change to [MIN]SIGSTKSZ.
> > 
> > Hi Corinna and Ben,
> > 
> > It works now, in the sense that emacs doesn't crash, and it produces the
> > message "Re-entering top level after C stack overflow".  I tested both
> > 32-bit and 64-bit Cygwin.  My test consisted of evaluating the following in
> > the emacs *scratch* buffer:
> > 
> > (setq max-specpdl-size 83200000
> >       max-lisp-eval-depth 640000)
> > (defun foo () (foo))
> > (foo)
> > 
> > (The 'setq' is to override emacs's built-in protection against too-deeply
> > nested lisp function calls.)
> > 
> > On the other hand, emacs doesn't really make a full recovery.  For example,
> > if I try to call a subprocess (e.g., 'C-x d' to list a directory), I get a
> > fork error:
> > 
> > Debugger entered--Lisp error: (file-error "Doing vfork" "Resource
> > temporarily unavailable")
> 
> The problem is probably that there are still resources in use which
> didn't get free'd.  I'll check next week if I can do anything about it.
> Ideally with a simpler testcase than emacs :}

Just FYI, I don't know yet what happens exactly, but this has nothing
to do with the alternate stack.  The child process fails with a status
code 0xC00000FD, STATUS_STACK_OVERFLOW.  Which is kind of weird, given
that the stack overflow has been averted by calling siglongjmp.

I have a hunch.  The stack state in the parent is such that TEB::StackLimit
points into the topmost guard area which, when poked into, triggers the
stack overflow exception.  When forking, Cygwin performs exactly this:
It pokes into the stack to push the guard page out of the way, thus
causing the stack memory to be committed, which in turn allows copying
the stack content from parent to child.

Ok, I'm not sure if I can debug this soon, but at least it's not
related to sigaltstack handling nor is it a regression.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-30 19:55                       ` Corinna Vinschen
@ 2015-06-30 20:13                         ` Ken Brown
  2015-07-01 10:47                           ` Corinna Vinschen
  0 siblings, 1 reply; 29+ messages in thread
From: Ken Brown @ 2015-06-30 20:13 UTC (permalink / raw)
  To: cygwin, Ben Woodard

On 6/30/2015 3:55 PM, Corinna Vinschen wrote:
> On Jun 27 16:52, Corinna Vinschen wrote:
>> On Jun 26 18:28, Ken Brown wrote:
>>> On 6/26/2015 4:05 PM, Corinna Vinschen wrote:
>>>> As for getrlimit(RLIMIT_STACK), I changed that as outlined in my former
>>>> mail in git.  On second thought, I also changed the values of
>>>> MINSIGSTKSZ and SIGSTKSZ.  Instead of 2K and 8K, they are now defined
>>>> as 32K and 64K.  The reason is that we then have enough space on the
>>>> alternate stack to install a _cygtls area, should the need arise.
>>>>
>>>> I created new developer snapshots on https://cygwin.com/snapshots/
>>>> Please give them a try.
>>>>
>>>> Remember to tweak STACK_DANGER_ZONE.  You'll have to rebuild emacs
>>>> anyway due to the change to [MIN]SIGSTKSZ.
>>>
>>> Hi Corinna and Ben,
>>>
>>> It works now, in the sense that emacs doesn't crash, and it produces the
>>> message "Re-entering top level after C stack overflow".  I tested both
>>> 32-bit and 64-bit Cygwin.  My test consisted of evaluating the following in
>>> the emacs *scratch* buffer:
>>>
>>> (setq max-specpdl-size 83200000
>>>        max-lisp-eval-depth 640000)
>>> (defun foo () (foo))
>>> (foo)
>>>
>>> (The 'setq' is to override emacs's built-in protection against too-deeply
>>> nested lisp function calls.)
>>>
>>> On the other hand, emacs doesn't really make a full recovery.  For example,
>>> if I try to call a subprocess (e.g., 'C-x d' to list a directory), I get a
>>> fork error:
>>>
>>> Debugger entered--Lisp error: (file-error "Doing vfork" "Resource
>>> temporarily unavailable")
>>
>> The problem is probably that there are still resources in use which
>> didn't get free'd.  I'll check next week if I can do anything about it.
>> Ideally with a simpler testcase than emacs :}
>
> Just FYI, I don't know yet what happens exactly, but this has nothing
> to do with the alternate stack.  The child process fails with a status
> code 0xC00000FD, STATUS_STACK_OVERFLOW.  Which is kind of weird, given
> that the stack overflow has been averted by calling siglongjmp.
>
> I have a hunch.  The stack state in the parent is such that TEB::StackLimit
> points into the topmost guard area which, when poked into, triggers the
> stack overflow exception.  When forking, Cygwin performs exactly this:
> It pokes into the stack to push the guard page out of the way, thus
> causing the stack memory to be committed, which in turn allows copying
> the stack content from parent to child.
>
> Ok, I'm not sure if I can debug this soon, but at least it's not
> related to sigaltstack handling nor is it a regression.

Thanks for the info, that's good to know.  Just out of curiosity, were you able 
to modify your testcase for this, or did you test with emacs?

Ken


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-30 20:13                         ` Ken Brown
@ 2015-07-01 10:47                           ` Corinna Vinschen
  2015-07-01 13:57                             ` Corinna Vinschen
  0 siblings, 1 reply; 29+ messages in thread
From: Corinna Vinschen @ 2015-07-01 10:47 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 3145 bytes --]

On Jun 30 16:13, Ken Brown wrote:
> On 6/30/2015 3:55 PM, Corinna Vinschen wrote:
> >On Jun 27 16:52, Corinna Vinschen wrote:
> >>On Jun 26 18:28, Ken Brown wrote:
> >>>On 6/26/2015 4:05 PM, Corinna Vinschen wrote:
> >>>>As for getrlimit(RLIMIT_STACK), I changed that as outlined in my former
> >>>>mail in git.  On second thought, I also changed the values of
> >>>>MINSIGSTKSZ and SIGSTKSZ.  Instead of 2K and 8K, they are now defined
> >>>>as 32K and 64K.  The reason is that we then have enough space on the
> >>>>alternate stack to install a _cygtls area, should the need arise.
> >>>>
> >>>>I created new developer snapshots on https://cygwin.com/snapshots/
> >>>>Please give them a try.
> >>>>
> >>>>Remember to tweak STACK_DANGER_ZONE.  You'll have to rebuild emacs
> >>>>anyway due to the change to [MIN]SIGSTKSZ.
> >>>
> >>>Hi Corinna and Ben,
> >>>
> >>>It works now, in the sense that emacs doesn't crash, and it produces the
> >>>message "Re-entering top level after C stack overflow".  I tested both
> >>>32-bit and 64-bit Cygwin.  My test consisted of evaluating the following in
> >>>the emacs *scratch* buffer:
> >>>
> >>>(setq max-specpdl-size 83200000
> >>>       max-lisp-eval-depth 640000)
> >>>(defun foo () (foo))
> >>>(foo)
> >>>
> >>>(The 'setq' is to override emacs's built-in protection against too-deeply
> >>>nested lisp function calls.)
> >>>
> >>>On the other hand, emacs doesn't really make a full recovery.  For example,
> >>>if I try to call a subprocess (e.g., 'C-x d' to list a directory), I get a
> >>>fork error:
> >>>
> >>>Debugger entered--Lisp error: (file-error "Doing vfork" "Resource
> >>>temporarily unavailable")
> >>
> >>The problem is probably that there are still resources in use which
> >>didn't get free'd.  I'll check next week if I can do anything about it.
> >>Ideally with a simpler testcase than emacs :}
> >
> >Just FYI, I don't know yet what happens exactly, but this has nothing
> >to do with the alternate stack.  The child process fails with a status
> >code 0xC00000FD, STATUS_STACK_OVERFLOW.  Which is kind of weird, given
> >that the stack overflow has been averted by calling siglongjmp.
> >
> >I have a hunch.  The stack state in the parent is such that TEB::StackLimit
> >points into the topmost guard area which, when poked into, triggers the
> >stack overflow exception.  When forking, Cygwin performs exactly this:
> >It pokes into the stack to push the guard page out of the way, thus
> >causing the stack memory to be committed, which in turn allows copying
> >the stack content from parent to child.
> >
> >Ok, I'm not sure if I can debug this soon, but at least it's not
> >related to sigaltstack handling nor is it a regression.
> 
> Thanks for the info, that's good to know.  Just out of curiosity, were you
> able to modify your testcase for this, or did you test with emacs?

I just added a fork call to my testcase right after the last printf.
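
For reference, a reduced variant of such a test could look roughly like
the sketch below.  It is not the attached sigalt.c: the names
fork_after_overflow, on_sigsegv and recurse are made up, the handler
jumps back unconditionally instead of checking si_addr, and it assumes
a finite stack limit so that the recursion actually faults.

```c
#define _XOPEN_SOURCE 700
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static sigjmp_buf recover_point;

/* Unconditional recovery: every fault is treated as a stack overflow.  */
static void
on_sigsegv (int sig)
{
  (void) sig;
  siglongjmp (recover_point, 1);
}

/* Called through a volatile pointer, with work left after the call,
   so the compiler cannot turn the recursion into a loop.  */
static unsigned (*volatile recurse_ptr) (unsigned);

static unsigned
recurse (unsigned depth)
{
  volatile char pad[2048];
  pad[depth % sizeof pad] = (char) depth;
  return recurse_ptr (depth + 1) + pad[depth % sizeof pad];
}

/* Overflow the stack, recover on the alternate stack, then fork.
   Returns 0 if the child exits cleanly, -1 otherwise.  */
static int
fork_after_overflow (void)
{
  struct sigaction sa;
  stack_t ss;
  pid_t pid;
  int status;

  ss.ss_sp = malloc (SIGSTKSZ);
  if (ss.ss_sp == NULL)
    return -1;
  ss.ss_size = SIGSTKSZ;
  ss.ss_flags = 0;
  if (sigaltstack (&ss, NULL) < 0)
    return -1;

  memset (&sa, 0, sizeof sa);
  sigemptyset (&sa.sa_mask);
  sa.sa_handler = on_sigsegv;
  sa.sa_flags = SA_ONSTACK;
  /* Some systems report a stack overflow as SIGBUS, so catch both.  */
  if (sigaction (SIGSEGV, &sa, NULL) < 0 || sigaction (SIGBUS, &sa, NULL) < 0)
    return -1;

  recurse_ptr = recurse;
  if (!sigsetjmp (recover_point, 1))
    (void) recurse_ptr (0);	/* overflows; does not return normally */

  /* Back on the normal stack: this is the step that failed.  */
  pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (0);
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return (WIFEXITED (status) && WEXITSTATUS (status) == 0) ? 0 : -1;
}
```

A main() that returns fork_after_overflow () is enough to drive it; a
nonzero result means the fork or the child failed after the recovered
overflow.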


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-07-01 10:47                           ` Corinna Vinschen
@ 2015-07-01 13:57                             ` Corinna Vinschen
  2015-07-01 20:12                               ` Ken Brown
  0 siblings, 1 reply; 29+ messages in thread
From: Corinna Vinschen @ 2015-07-01 13:57 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 2286 bytes --]

On Jul  1 12:47, Corinna Vinschen wrote:
> On Jun 30 16:13, Ken Brown wrote:
> > On 6/30/2015 3:55 PM, Corinna Vinschen wrote:
> > >On Jun 27 16:52, Corinna Vinschen wrote:
> > >>On Jun 26 18:28, Ken Brown wrote:
> > >>>On the other hand, emacs doesn't really make a full recovery.  For example,
> > >>>if I try to call a subprocess (e.g., 'C-x d' to list a directory), I get a
> > >>>fork error:
> > >>>
> > >>>Debugger entered--Lisp error: (file-error "Doing vfork" "Resource
> > >>>temporarily unavailable")
> > >> [...]
> > >Just FYI, I don't know yet what happens exactly, but this has nothing
> > >to do with the alternate stack.  The child process fails with a status
> > >code 0xC00000FD, STATUS_STACK_OVERFLOW.  Which is kind of weird, given
> > >that the stack overflow has been averted by calling siglongjmp.
> > >
> > >I have a hunch.  The stack state in the parent is such that TEB::StackLimit
> > >points into the topmost guard area which, when poked into, triggers the
> > >stack overflow exception.  When forking, Cygwin performs exactly this:
> > >It pokes into the stack to push the guard page out of the way, thus
> > >causing the stack memory to be committed, which in turn allows copying
> > >the stack content from parent to child.
> > >
> > >Ok, I'm not sure if I can debug this soon, but at least it's not
> > >related to sigaltstack handling nor is it a regression.
> > 
> > Thanks for the info, that's good to know.  Just out of curiosity, were you
> > able to modify your testcase for this, or did you test with emacs?
> 
> I just added a fork call to my testcase right after the last printf.

My hunch was correct, apparently.  I changed the way the stack info
is set up for the child so only the actually used part of the stack is
prepared for the stack copy in the child.  This not only avoids the
stack overflow in the child, it should shave a few nanoseconds from
the time a fork takes ;)

I uploaded new developer snapshots to https://cygwin.com/snapshots/ and
I'm just building and uploading a new test release.

Please give it another try.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-07-01 13:57                             ` Corinna Vinschen
@ 2015-07-01 20:12                               ` Ken Brown
  2015-07-02  2:10                                 ` Ken Brown
  0 siblings, 1 reply; 29+ messages in thread
From: Ken Brown @ 2015-07-01 20:12 UTC (permalink / raw)
  To: cygwin

On 7/1/2015 9:57 AM, Corinna Vinschen wrote:
> On Jul  1 12:47, Corinna Vinschen wrote:
>> On Jun 30 16:13, Ken Brown wrote:
>>> On 6/30/2015 3:55 PM, Corinna Vinschen wrote:
>>>> On Jun 27 16:52, Corinna Vinschen wrote:
>>>>> On Jun 26 18:28, Ken Brown wrote:
>>>>>> On the other hand, emacs doesn't really make a full recovery.  For example,
>>>>>> if I try to call a subprocess (e.g., 'C-x d' to list a directory), I get a
>>>>>> fork error:
>>>>>>
>>>>>> Debugger entered--Lisp error: (file-error "Doing vfork" "Resource
>>>>>> temporarily unavailable")
>>>>> [...]
>>>> Just FYI, I don't know yet what happens exactly, but this has nothing
>>>> to do with the alternate stack.  The child process fails with a status
>>>> code 0xC00000FD, STATUS_STACK_OVERFLOW.  Which is kind of weird, given
>>>> that the stack overflow has been averted by calling siglongjmp.
>>>>
>>>> I have a hunch.  The stack state in the parent is such that TEB::StackLimit
>>>> points into the topmost guard area which, when poked into, triggers the
>>>> stack overflow exception.  When forking, Cygwin performs exactly this:
>>>> It pokes into the stack to push the guard page out of the way, thus
>>>> causing the stack memory to be committed, which in turn allows copying
>>>> the stack content from parent to child.
>>>>
>>>> Ok, I'm not sure if I can debug this soon, but at least it's not
>>>> related to sigaltstack handling nor is it a regression.
>>>
>>> Thanks for the info, that's good to know.  Just out of curiosity, were you
>>> able to modify your testcase for this, or did you test with emacs?
>>
>> I just added a fork call to my testcase right after the last printf.
>
> My hunch was correct, apparently.  I changed the way the stack info
> is set up for the child so only the actually used part of the stack is
> prepared for the stack copy in the child.  This not only avoids the
> stack overflow in the child, it should shave a few nanoseconds from
> the time a fork takes ;)
>
> I uploaded new developer snapshots to https://cygwin.com/snapshots/ and
> I'm just building and uploading a new test release.
>
> Please give it another try.

That fixes it.  Thanks!

Ken



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-07-01 20:12                               ` Ken Brown
@ 2015-07-02  2:10                                 ` Ken Brown
  2015-07-02 12:13                                   ` Corinna Vinschen
  0 siblings, 1 reply; 29+ messages in thread
From: Ken Brown @ 2015-07-02  2:10 UTC (permalink / raw)
  To: cygwin

On 7/1/2015 4:12 PM, Ken Brown wrote:
> On 7/1/2015 9:57 AM, Corinna Vinschen wrote:
>> On Jul  1 12:47, Corinna Vinschen wrote:
>>> On Jun 30 16:13, Ken Brown wrote:
>>>> On 6/30/2015 3:55 PM, Corinna Vinschen wrote:
>>>>> On Jun 27 16:52, Corinna Vinschen wrote:
>>>>>> On Jun 26 18:28, Ken Brown wrote:
>>>>>>> On the other hand, emacs doesn't really make a full recovery.  For example,
>>>>>>> if I try to call a subprocess (e.g., 'C-x d' to list a directory), I get a
>>>>>>> fork error:
>>>>>>>
>>>>>>> Debugger entered--Lisp error: (file-error "Doing vfork" "Resource
>>>>>>> temporarily unavailable")
>>>>>> [...]
>>>>> Just FYI, I don't know yet what happens exactly, but this has nothing
>>>>> to do with the alternate stack.  The child process fails with a status
>>>>> code 0xC00000FD, STATUS_STACK_OVERFLOW.  Which is kind of weird, given
>>>>> that the stack overflow has been averted by calling siglongjmp.
>>>>>
>>>>> I have a hunch.  The stack state in the parent is such that TEB::StackLimit
>>>>> points into the topmost guard area which, when poked into, triggers the
>>>>> stack overflow exception.  When forking, Cygwin performs exactly this:
>>>>> It pokes into the stack to push the guard page out of the way, thus
>>>>> causing the stack memory to be committed, which in turn allows copying
>>>>> the stack content from parent to child.
>>>>>
>>>>> Ok, I'm not sure if I can debug this soon, but at least it's not
>>>>> related to sigaltstack handling nor is it a regression.
>>>>
>>>> Thanks for the info, that's good to know.  Just out of curiosity, were you
>>>> able to modify your testcase for this, or did you test with emacs?
>>>
>>> I just added a fork call to my testcase right after the last printf.
>>
>> My hunch was correct, apparently.  I changed the way the stack info
>> is set up for the child so only the actually used part of the stack is
>> prepared for the stack copy in the child.  This not only avoids the
>> stack overflow in the child, it should shave a few nanoseconds from
>> the time a fork takes ;)
>>
>> I uploaded new developer snapshots to https://cygwin.com/snapshots/ and
>> I'm just building and uploading a new test release.
>>
>> Please give it another try.
>
> That fixes it.  Thanks!

I may have spoken too soon.  As I repeat the experiment on a different computer, 
with a build from a slightly different snapshot of the emacs trunk, emacs 
crashes when I type 'C-x d' with the following stack dump:

Stack trace:
Frame        Function    Args
00100A3E240  00180071CC3 (00000829630, 000008296D0, 00000000000, 0000082CE00)
00030000002  001800732BE (00000000000, 00000000002, 00100A48C80, 00000000002)
00000000000  00000006B40 (00000000002, 00100A48C80, 00000000002, 00100A48768)
00000000000  21000000003 (00000000002, 00100A48C80, 00000000002, 00100A48768)
End of stack trace

$ addr2line 00180071CC3 -e /usr/lib/debug/usr/bin/cygwin1.dbg
/usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exception.h:175

$ addr2line 001800732BE -e /usr/lib/debug/usr/bin/cygwin1.dbg
/usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exceptions.cc:1639

The next two days are pretty busy for me, but I'll try to provide further 
information as soon as I have a chance.

Ken


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-07-02  2:10                                 ` Ken Brown
@ 2015-07-02 12:13                                   ` Corinna Vinschen
  2015-07-02 12:20                                     ` Corinna Vinschen
  0 siblings, 1 reply; 29+ messages in thread
From: Corinna Vinschen @ 2015-07-02 12:13 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 4248 bytes --]

On Jul  1 22:10, Ken Brown wrote:
> On 7/1/2015 4:12 PM, Ken Brown wrote:
> >On 7/1/2015 9:57 AM, Corinna Vinschen wrote:
> >>On Jul  1 12:47, Corinna Vinschen wrote:
> >>>On Jun 30 16:13, Ken Brown wrote:
> >>>>On 6/30/2015 3:55 PM, Corinna Vinschen wrote:
> >>>>>On Jun 27 16:52, Corinna Vinschen wrote:
> >>>>>>On Jun 26 18:28, Ken Brown wrote:
> >>>>>>>On the other hand, emacs doesn't really make a full recovery.  For example,
> >>>>>>>if I try to call a subprocess (e.g., 'C-x d' to list a directory), I get a
> >>>>>>>fork error:
> >>>>>>>
> >>>>>>>Debugger entered--Lisp error: (file-error "Doing vfork" "Resource
> >>>>>>>temporarily unavailable")
> >>>>>>[...]
> >>>>>Just FYI, I don't know yet what happens exactly, but this has nothing
> >>>>>to do with the alternate stack.  The child process fails with a status
> >>>>>code 0xC00000FD, STATUS_STACK_OVERFLOW.  Which is kind of weird, given
> >>>>>that the stack overflow has been averted by calling siglongjmp.
> >>>>>
> >>>>>I have a hunch.  The stack state in the parent is such that TEB::StackLimit
> >>>>>points into the topmost guard area which, when poked into, triggers the
> >>>>>stack overflow exception.  When forking, Cygwin performs exactly this:
> >>>>>It pokes into the stack to push the guard page out of the way, thus
> >>>>>causing the stack memory to be committed, which in turn allows copying
> >>>>>the stack content from parent to child.
> >>>>>
> >>>>>Ok, I'm not sure if I can debug this soon, but at least it's not
> >>>>>related to sigaltstack handling nor is it a regression.
> >>>>
> >>>>Thanks for the info, that's good to know.  Just out of curiosity, were you
> >>>>able to modify your testcase for this, or did you test with emacs?
> >>>
> >>>I just added a fork call to my testcase right after the last printf.
> >>
> >>My hunch was correct, apparently.  I changed the way the stack info
> >>is set up for the child so only the actually used part of the stack is
> >>prepared for the stack copy in the child.  This not only avoids the
> >>stack overflow in the child, it should shave a few nanoseconds from
> >>the time a fork takes ;)
> >>
> >>I uploaded new developer snapshots to https://cygwin.com/snapshots/ and
> >>I'm just building and uploading a new test release.
> >>
> >>Please give it another try.
> >
> >That fixes it.  Thanks!
> 
> I may have spoken too soon.  As I repeat the experiment on a different
> computer, with a build from a slightly different snapshot of the emacs
> trunk, emacs crashes when I type 'C-x d' with the following stack dump:
> 
> Stack trace:
> Frame        Function    Args
> 00100A3E240  00180071CC3 (00000829630, 000008296D0, 00000000000, 0000082CE00)
> 00030000002  001800732BE (00000000000, 00000000002, 00100A48C80, 00000000002)
> 00000000000  00000006B40 (00000000002, 00100A48C80, 00000000002, 00100A48768)
> 00000000000  21000000003 (00000000002, 00100A48C80, 00000000002, 00100A48768)
> End of stack trace
> 
> $ addr2line 00180071CC3 -e /usr/lib/debug/usr/bin/cygwin1.dbg
> /usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exception.h:175
> 
> $ addr2line 001800732BE -e /usr/lib/debug/usr/bin/cygwin1.dbg
> /usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exceptions.cc:1639

That points to a crash while setting up the alternate stack.  This is
always a possibility because, in contrast to the kernel signal handler
in a real POSIX system, the Cygwin exception handler is still running on
the stack which triggered the crash up to the point where we call the
signal handler function.  Depending on how the stack overflow occurred,
this additional stack usage may be enough to kill the process for good.

Out of curiosity, can you add this to the init_sigsegv() function:

  #include <windows.h>
  [...]
  init_sigsegv (void)
  {
    ULONG guarantee = 65536;  /* SetThreadStackGuarantee takes a PULONG */
    [...]
    SetThreadStackGuarantee (&guarantee);
    [...]
  }

And see if that "fixes" the crash?

> The next two days are pretty busy for me, but I'll try to provide further
> information as soon as I have a chance.

Thanks a lot,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-07-02 12:13                                   ` Corinna Vinschen
@ 2015-07-02 12:20                                     ` Corinna Vinschen
  2015-07-02 19:25                                       ` Ken Brown
  0 siblings, 1 reply; 29+ messages in thread
From: Corinna Vinschen @ 2015-07-02 12:20 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 1999 bytes --]

On Jul  2 14:13, Corinna Vinschen wrote:
> On Jul  1 22:10, Ken Brown wrote:
> > I may have spoken too soon.  As I repeat the experiment on a different
> > computer, with a build from a slightly different snapshot of the emacs
> > trunk, emacs crashes when I type 'C-x d' with the following stack dump:
> > 
> > Stack trace:
> > Frame        Function    Args
> > 00100A3E240  00180071CC3 (00000829630, 000008296D0, 00000000000, 0000082CE00)
> > 00030000002  001800732BE (00000000000, 00000000002, 00100A48C80, 00000000002)
> > 00000000000  00000006B40 (00000000002, 00100A48C80, 00000000002, 00100A48768)
> > 00000000000  21000000003 (00000000002, 00100A48C80, 00000000002, 00100A48768)
> > End of stack trace
> > 
> > $ addr2line 00180071CC3 -e /usr/lib/debug/usr/bin/cygwin1.dbg
> > /usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exception.h:175
> > 
> > $ addr2line 001800732BE -e /usr/lib/debug/usr/bin/cygwin1.dbg
> > /usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exceptions.cc:1639
> 
> That points to a crash while setting up the alternate stack.  This is
> always a possibility because, in contrast to the kernel signal handler
> in a real POSIX system, the Cygwin exception handler is still running on
> the stack which triggered the crash up to the point where we call the
> signal handler function.  Depending on how the stack overflow occurred,
> this additional stack usage may be enough to kill the process for good.
> 
> Out of curiosity, can you add this to the init_sigsegv() function:
> 
>   #include <windows.h>
>   [...]
>   init_sigsegv (void)
>   {
>     [...]
>     SetThreadStackGuarantee (65536);

Of course this only works "per thread", so if init_sigsegv is called
for the main thread, only the main thread gets this treatment.  For
testing this should be enough, though.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-07-02 12:20                                     ` Corinna Vinschen
@ 2015-07-02 19:25                                       ` Ken Brown
  2015-07-03 10:47                                         ` Corinna Vinschen
  0 siblings, 1 reply; 29+ messages in thread
From: Ken Brown @ 2015-07-02 19:25 UTC (permalink / raw)
  To: cygwin

On 7/2/2015 8:20 AM, Corinna Vinschen wrote:
> On Jul  2 14:13, Corinna Vinschen wrote:
>> On Jul  1 22:10, Ken Brown wrote:
>>> I may have spoken too soon.  As I repeat the experiment on a different
>>> computer, with a build from a slightly different snapshot of the emacs
>>> trunk, emacs crashes when I type 'C-x d' with the following stack dump:
>>>
>>> Stack trace:
>>> Frame        Function    Args
>>> 00100A3E240  00180071CC3 (00000829630, 000008296D0, 00000000000, 0000082CE00)
>>> 00030000002  001800732BE (00000000000, 00000000002, 00100A48C80, 00000000002)
>>> 00000000000  00000006B40 (00000000002, 00100A48C80, 00000000002, 00100A48768)
>>> 00000000000  21000000003 (00000000002, 00100A48C80, 00000000002, 00100A48768)
>>> End of stack trace
>>>
>>> $ addr2line 00180071CC3 -e /usr/lib/debug/usr/bin/cygwin1.dbg
>>> /usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exception.h:175
>>>
>>> $ addr2line 001800732BE -e /usr/lib/debug/usr/bin/cygwin1.dbg
>>> /usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exceptions.cc:1639
>>
>> That points to a crash while setting up the alternate stack.  This is
>> always a possibility because, in contrast to the kernel signal handler
>> in a real POSIX system, the Cygwin exception handler is still running on
>> the stack which triggered the crash up to the point where we call the
>> signal handler function.  Depending on how the stack overflow occurred,
>> this additional stack usage may be enough to kill the process for good.
>>
>> Out of curiosity, can you add this to the init_sigsegv() function:
>>
>>    #include <windows.h>
>>    [...]
>>    init_sigsegv (void)
>>    {
>>      [...]
>>      SetThreadStackGuarantee (65536);
>
> Of course this only works "per thread", so if init_sigsegv is called
> for the main thread, only the main thread gets this treatment.  For
> testing this should be enough, though.

That didn't make any difference.  But I do have a little more information.  I 
tried running emacs under gdb with a breakpoint at handle_sigsegv.  The 
breakpoint is hit when I deliberately trigger the stack overflow.  Then I 
continue, emacs says it has recovered from the stack overflow, and I type
'C-x d'.  At this point there's a second SIGSEGV and handle_sigsegv is called
again.  But this time garbage collection is in progress, and handle_sigsegv
just gives up.

I don't know what caused the second SIGSEGV but I'll try to figure that out when 
I next have a chance to look at this.  I also don't know why the stack dump 
pointed to a crash while setting up the alternate stack, since the fatal crash 
actually seems to have happened later.  But maybe the stack was just completely 
messed up after the second SIGSEGV and the stack dump can't be trusted.

More later.

Ken


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-07-02 19:25                                       ` Ken Brown
@ 2015-07-03 10:47                                         ` Corinna Vinschen
  2015-07-03 10:50                                           ` Corinna Vinschen
  2015-07-03 13:09                                           ` Ken Brown
  0 siblings, 2 replies; 29+ messages in thread
From: Corinna Vinschen @ 2015-07-03 10:47 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 3896 bytes --]

On Jul  2 15:25, Ken Brown wrote:
> On 7/2/2015 8:20 AM, Corinna Vinschen wrote:
> >On Jul  2 14:13, Corinna Vinschen wrote:
> >>On Jul  1 22:10, Ken Brown wrote:
> >>>I may have spoken too soon.  As I repeat the experiment on a different
> >>>computer, with a build from a slightly different snapshot of the emacs
> >>>trunk, emacs crashes when I type 'C-x d' with the following stack dump:
> >>>
> >>>Stack trace:
> >>>Frame        Function    Args
> >>>00100A3E240  00180071CC3 (00000829630, 000008296D0, 00000000000, 0000082CE00)
> >>>00030000002  001800732BE (00000000000, 00000000002, 00100A48C80, 00000000002)
> >>>00000000000  00000006B40 (00000000002, 00100A48C80, 00000000002, 00100A48768)
> >>>00000000000  21000000003 (00000000002, 00100A48C80, 00000000002, 00100A48768)
> >>>End of stack trace
> >>>
> >>>$ addr2line 00180071CC3 -e /usr/lib/debug/usr/bin/cygwin1.dbg
> >>>/usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exception.h:175
> >>>
> >>>$ addr2line 001800732BE -e /usr/lib/debug/usr/bin/cygwin1.dbg
> >>>/usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exceptions.cc:1639
> >>
> >>That points to a crash while setting up the alternate stack.  This is
> >>always a possibility because, in contrast to the kernel signal handler
> >>in a real POSIX system, the Cygwin exception handler is still running on
> >>the stack which triggered the crash up to the point where we call the
> >>signal handler function.  Depending on how the stack overflow occurred,
> >>this additional stack usage may be enough to kill the process for good.
> >>
> >>Out of curiosity, can you add this to the init_sigsegv() function:
> >>
> >>   #include <windows.h>
> >>   [...]
> >>   init_sigsegv (void)
> >>   {
> >>     [...]
> >>     SetThreadStackGuarantee (65536);
> >
> >Of course this only works "per thread", so if init_sigsegv is called
> >for the main thread, only the main thread gets this treatment.  For
> >testing this should be enough, though.
> 
> That didn't make any difference.

It should have.  If you don't also tweak STACK_DANGER_ZONE accordingly,
handle_sigsegv should fail to call siglongjmp.  Either way, I tested
it locally as well, and it doesn't work.
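
To make the STACK_DANGER_ZONE remark concrete: a handler like handle_sigsegv has to decide whether a fault is a stack overflow at all, and a common way to do that is to check how close the faulting address is to the end of the stack.  The sketch below shows only that idea.  The function name, the parameters and the numbers in the usage note are assumptions made for illustration and are not taken from the emacs sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: treat a fault as a stack overflow only if the
   faulting address lies within danger_zone bytes of stack_limit
   (the lowest usable stack address, for a stack growing downwards). */
bool looks_like_stack_overflow(uintptr_t fault_addr,
                               uintptr_t stack_limit,
                               uintptr_t danger_zone)
{
    uintptr_t distance = fault_addr >= stack_limit
                             ? fault_addr - stack_limit
                             : stack_limit - fault_addr;

    return distance < danger_zone;
}
```

With an illustrative 16K zone, a fault 4K away from the limit counts as an overflow, but a fault 64K away does not.  If a larger guaranteed region moves the fault further from the stack end, the zone has to grow with it, otherwise the handler would not take the siglongjmp path.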

In the meantime I found that there's another problem.  Assuming you
longjmp out of handle_sigsegv, the stack will still be "broken".
It doesn't have the usual guard pages anymore, and the next time
you have a stack overflow, NTDLL will simply terminate the process.

I create a wrapper function which resets the stack so it has valid guard
pages again and then the stack overflow can be handled repeatedly.

While I was at it, I found that the setup for pthread stacks is not
quite right, either, so right now I'm hacking on this stuff to make
it behave as expected in the usual cases.

> But I do have a little more information.
> I tried running emacs under gdb with a breakpoint at handle_sigsegv.  The
> breakpoint is hit when I deliberately trigger the stack overflow.  Then I
> continue, emacs says it has recovered from the stack overflow, and I type
> 'C-x d'.  At this point there's a second SIGSEGV and handle_sigsegv is
> called again.  But this time garbage collection is in progress, and
> handle_sigsegv just gives up.

Sounds right to me.

> I don't know what caused the second SIGSEGV but I'll try to figure that out
> when I next have a chance to look at this.  I also don't know why the stack
> dump pointed to a crash while setting up the alternate stack, since the
> fatal crash actually seems to have happened later.  But maybe the stack was
> just completely messed up after the second SIGSEGV and the stack dump can't
> be trusted.
> 
> More later.

Thanks!


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-07-03 10:47                                         ` Corinna Vinschen
@ 2015-07-03 10:50                                           ` Corinna Vinschen
  2015-07-03 13:09                                           ` Ken Brown
  1 sibling, 0 replies; 29+ messages in thread
From: Corinna Vinschen @ 2015-07-03 10:50 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 703 bytes --]

On Jul  3 12:47, Corinna Vinschen wrote:
> In the meantime I found that there's another problem.  Assuming you
> longjmp out of handle_sigsegv, the stack will still be "broken".
> It doesn't have the usual guard pages anymore, and the next time
> you have a stack overflow, NTDLL will simply terminate the process.
> 
> I create a wrapper function which resets the stack so it has valid guard
> pages again and then the stack overflow can be handled repeatedly.

s/create/created/

I didn't push it yet, though.  Still hacking.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-07-03 10:47                                         ` Corinna Vinschen
  2015-07-03 10:50                                           ` Corinna Vinschen
@ 2015-07-03 13:09                                           ` Ken Brown
  2015-07-14 19:10                                             ` Ken Brown
  1 sibling, 1 reply; 29+ messages in thread
From: Ken Brown @ 2015-07-03 13:09 UTC (permalink / raw)
  To: cygwin

On 7/3/2015 6:47 AM, Corinna Vinschen wrote:
> On Jul  2 15:25, Ken Brown wrote:
>> On 7/2/2015 8:20 AM, Corinna Vinschen wrote:
>>> On Jul  2 14:13, Corinna Vinschen wrote:
>>>> On Jul  1 22:10, Ken Brown wrote:
>>>>> I may have spoken too soon.  As I repeat the experiment on a different
>>>>> computer, with a build from a slightly different snapshot of the emacs
>>>>> trunk, emacs crashes when I type 'C-x d' with the following stack dump:
>>>>>
>>>>> Stack trace:
>>>>> Frame        Function    Args
>>>>> 00100A3E240  00180071CC3 (00000829630, 000008296D0, 00000000000, 0000082CE00)
>>>>> 00030000002  001800732BE (00000000000, 00000000002, 00100A48C80, 00000000002)
>>>>> 00000000000  00000006B40 (00000000002, 00100A48C80, 00000000002, 00100A48768)
>>>>> 00000000000  21000000003 (00000000002, 00100A48C80, 00000000002, 00100A48768)
>>>>> End of stack trace
>>>>>
>>>>> $ addr2line 00180071CC3 -e /usr/lib/debug/usr/bin/cygwin1.dbg
>>>>> /usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exception.h:175
>>>>>
>>>>> $ addr2line 001800732BE -e /usr/lib/debug/usr/bin/cygwin1.dbg
>>>>> /usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exceptions.cc:1639
>>>>
>>>> That points to a crash while setting up the alternate stack.  This is
>>>> always a possibility because, in contrast to the kernel signal handler
>>>> in a real POSIX system, the Cygwin exception handler is still running on
>>>> the stack which triggered the crash up to the point where we call the
>>>> signal handler function.  Depending on how the stack overflow occurred,
>>>> this additional stack usage may be enough to kill the process for good.
>>>>
>>>> Out of curiosity, can you add this to the init_sigsegv() function:
>>>>
>>>>    #include <windows.h>
>>>>    [...]
>>>>    init_sigsegv (void)
>>>>    {
>>>>      [...]
>>>>      SetThreadStackGuarantee (65536);
>>>
>>> Of course this only works "per thread", so if init_sigsegv is called
>>> for the main thread, only the main thread gets this treatment.  For
>>> testing this should be enough, though.
>>
>> That didn't make any difference.
>
> It should have.  If you don't also tweak STACK_DANGER_ZONE accordingly,
> handle_sigsegv should fail to call siglongjmp.  Either way, I tested
> it locally as well, and it doesn't work.
>
> In the meantime I found that there's another problem.  Assuming you
> longjmp out of handle_sigsegv, the stack will still be "broken".
> It doesn't have the usual guard pages anymore, and the next time
> you have a stack overflow, NTDLL will simply terminate the process.
>
> I create a wrapper function which resets the stack so it has valid guard
> pages again and then the stack overflow can be handled repeatedly.
>
> While I was at it, I found that the setup for pthread stacks is not
> quite right, either, so right now I'm hacking on this stuff to make
> it behave as expected in the usual cases.
>
>> But I do have a little more information.
>> I tried running emacs under gdb with a breakpoint at handle_sigsegv.  The
>> breakpoint is hit when I deliberately trigger the stack overflow.  Then I
>> continue, emacs says it has recovered from the stack overflow, and I type
>> 'C-x d'.  At this point there's a second SIGSEGV and handle_sigsegv is
>> called again.  But this time garbage collection is in progress, and
>> handle_sigsegv just gives up.
>
> Sounds right to me.
>
>> I don't know what caused the second SIGSEGV but I'll try to figure that out
>> when I next have a chance to look at this.  I also don't know why the stack
>> dump pointed to a crash while setting up the alternate stack, since the
>> fatal crash actually seems to have happened later.  But maybe the stack was
>> just completely messed up after the second SIGSEGV and the stack dump can't
>> be trusted.

I think I found the cause of that second SIGSEGV, and, if I'm right, it has 
nothing to do with Cygwin.  I think the problem was that in my testing, I forgot 
to reset max-specpdl-size and max-lisp-eval-depth to reasonable values after the 
recovery from stack overflow.  If I do that, then I can no longer reproduce the 
crash.

For the record, here's my complete elisp test case:

(setq max-specpdl-size 83200000
      max-lisp-eval-depth 640000)
(defun foo () (foo))
(foo)
;; The stack has now overflowed, and emacs has recovered.
(setq max-specpdl-size 1300
      max-lisp-eval-depth 800)
;; Can now continue working.

Ken


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-07-03 13:09                                           ` Ken Brown
@ 2015-07-14 19:10                                             ` Ken Brown
  2015-07-14 19:16                                               ` Corinna Vinschen
  0 siblings, 1 reply; 29+ messages in thread
From: Ken Brown @ 2015-07-14 19:10 UTC (permalink / raw)
  To: cygwin

On 7/3/2015 9:09 AM, Ken Brown wrote:
> On 7/3/2015 6:47 AM, Corinna Vinschen wrote:
>> On Jul  2 15:25, Ken Brown wrote:
>>> On 7/2/2015 8:20 AM, Corinna Vinschen wrote:
>>>> On Jul  2 14:13, Corinna Vinschen wrote:
>>>>> On Jul  1 22:10, Ken Brown wrote:
>>>>>> I may have spoken too soon.  As I repeat the experiment on a different
>>>>>> computer, with a build from a slightly different snapshot of the emacs
>>>>>> trunk, emacs crashes when I type 'C-x d' with the following stack dump:
>>>>>>
>>>>>> Stack trace:
>>>>>> Frame        Function    Args
>>>>>> 00100A3E240  00180071CC3 (00000829630, 000008296D0, 00000000000, 0000082CE00)
>>>>>> 00030000002  001800732BE (00000000000, 00000000002, 00100A48C80, 00000000002)
>>>>>> 00000000000  00000006B40 (00000000002, 00100A48C80, 00000000002, 00100A48768)
>>>>>> 00000000000  21000000003 (00000000002, 00100A48C80, 00000000002, 00100A48768)
>>>>>> End of stack trace
>>>>>>
>>>>>> $ addr2line 00180071CC3 -e /usr/lib/debug/usr/bin/cygwin1.dbg
>>>>>> /usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exception.h:175
>>>>>>
>>>>>> $ addr2line 001800732BE -e /usr/lib/debug/usr/bin/cygwin1.dbg
>>>>>> /usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exceptions.cc:1639
>>>>>
>>>>> That points to a crash while setting up the alternate stack.  This is
>>>>> always a possibility because, in contrast to the kernel signal handler
>>>>> in a real POSIX system, the Cygwin exception handler is still running on
>>>>> the stack which triggered the crash up to the point where we call the
>>>>> signal handler function.  Depending on how the stack overflow occurred,
>>>>> this additional stack usage may be enough to kill the process for good.
>>>>>
>>>>> Out of curiosity, can you add this to the init_sigsegv() function:
>>>>>
>>>>>    #include <windows.h>
>>>>>    [...]
>>>>>    init_sigsegv (void)
>>>>>    {
>>>>>      [...]
>>>>>      SetThreadStackGuarantee (65536);
>>>>
>>>> Of course this only works "per thread", so if init_sigsegv is called
>>>> for the main thread, only the main thread gets this treatment.  For
>>>> testing this should be enough, though.
>>>
>>> That didn't make any difference.
>>
>> It should have.  If you don't also tweak STACK_DANGER_ZONE accordingly,
>> handle_sigsegv should fail to call siglongjmp.  Either way, I tested
>> it locally as well, and it doesn't work.
>>
>> In the meantime I found that there's another problem.  Assuming you
>> longjmp out of handle_sigsegv, the stack will still be "broken".
>> It doesn't have the usual guard pages anymore, and the next time
>> you have a stack overflow, NTDLL will simply terminate the process.
>>
>> I create a wrapper function which resets the stack so it has valid guard
>> pages again and then the stack overflow can be handled repeatedly.
>>
>> While I was at it, I found that the setup for pthread stacks is not
>> quite right, either, so right now I'm hacking on this stuff to make
>> it behave as expected in the usual cases.
>>
>>> But I do have a little more information.
>>> I tried running emacs under gdb with a breakpoint at handle_sigsegv.  The
>>> breakpoint is hit when I deliberately trigger the stack overflow.  Then I
>>> continue, emacs says it has recovered from the stack overflow, and I type
>>> 'C-x d'.  At this point there's a second SIGSEGV and handle_sigsegv is
>>> called again.  But this time garbage collection is in progress, and
>>> handle_sigsegv just gives up.
>>
>> Sounds right to me.
>>
>>> I don't know what caused the second SIGSEGV but I'll try to figure that out
>>> when I next have a chance to look at this.  I also don't know why the stack
>>> dump pointed to a crash while setting up the alternate stack, since the
>>> fatal crash actually seems to have happened later.  But maybe the stack was
>>> just completely messed up after the second SIGSEGV and the stack dump can't
>>> be trusted.
>
> I think I found the cause of that second SIGSEGV, and, if I'm right, it
> has nothing to do with Cygwin.  I think the problem was that in my
> testing, I forgot to reset max-specpdl-size and max-lisp-eval-depth to
> reasonable values after the recovery from stack overflow.  If I do that,
> then I can no longer reproduce the crash.

Just for the sake of the archives, it turned out that I could reproduce 
that second crash after all.  But it was an emacs bug, which has now 
been fixed:

   http://debbugs.gnu.org/cgi/bugreport.cgi?bug=20996

So there are no loose ends; everything I know how to test involving the 
alternate stack works.

Ken


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-07-14 19:10                                             ` Ken Brown
@ 2015-07-14 19:16                                               ` Corinna Vinschen
  0 siblings, 0 replies; 29+ messages in thread
From: Corinna Vinschen @ 2015-07-14 19:16 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 1023 bytes --]

Hi Ken,

On Jul 14 15:09, Ken Brown wrote:
> On 7/3/2015 9:09 AM, Ken Brown wrote:
> >I think I found the cause of that second SIGSEGV, and, if I'm right, it
> >has nothing to do with Cygwin.  I think the problem was that in my
> >testing, I forgot to reset max-specpdl-size and max-lisp-eval-depth to
> >reasonable values after the recovery from stack overflow.  If I do that,
> >then I can no longer reproduce the crash.
> 
> Just for the sake of the archives, it turned out that I could reproduce that
> second crash after all.  But it was an emacs bug, which has now been fixed:
> 
>   http://debbugs.gnu.org/cgi/bugreport.cgi?bug=20996
> 
> So there are no loose ends; everything I know how to test involving the
> alternate stack works.

Nice to know, thanks a lot.  I'm planning to release 2.1.0 tomorrow if
nothing gets in the way.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-06-27 14:53                     ` Corinna Vinschen
  2015-06-30 19:55                       ` Corinna Vinschen
@ 2015-07-14 22:19                       ` Eric Blake
  2015-07-15  2:21                         ` Ken Brown
  1 sibling, 1 reply; 29+ messages in thread
From: Eric Blake @ 2015-07-14 22:19 UTC (permalink / raw)
  To: cygwin, Ben Woodard

[-- Attachment #1: Type: text/plain, Size: 1283 bytes --]

On 06/27/2015 08:52 AM, Corinna Vinschen wrote:
> The problem is probably that there are still resources in use which
> didn't get free'd.  I'll check next week if I can do anything about it.
> Ideally with a simpler testcase than emacs :}

Is libsigsegv an appropriate testcase?  There are several other
applications that then optionally use libsigsegv for stack overflow
protection (such as m4 and awk), as well as for user-managed page
faulting for garbage collection purposes (such as guile).  In fact, I'm
a bit surprised that emacs rolls its own protection instead of taking
advantage of libsigsegv - it might be worth suggesting that to upstream
emacs.

Upstream libsigsegv comes with a fairly decent testsuite for low-level
testing of what types of SIGSEGV handling are possible (regardless of
whether that recovery is done by native Windows hacks, as in the current
build, or via sigaltstack, which is what I hope will happen when it is
reconfigured against the new Cygwin).  And if those tests aren't enough,
another fairly simple test case is to run (32-bit) m4, linked against a
rebuilt libsigsegv, like this:

 echo 'define(a,a(a))a' | m4

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

* Re: [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1
  2015-07-14 22:19                       ` Eric Blake
@ 2015-07-15  2:21                         ` Ken Brown
  0 siblings, 0 replies; 29+ messages in thread
From: Ken Brown @ 2015-07-15  2:21 UTC (permalink / raw)
  To: cygwin

On 7/14/2015 6:18 PM, Eric Blake wrote:
> In fact, I'm
> a bit surprised that emacs rolls its own protection instead of taking
> advantage of libsigsegv - it might be worth suggesting that to upstream
> emacs.

Good idea.  I'll do that.

Ken


Thread overview: 29+ messages
2015-06-20 21:15 [ANNOUNCEMENT] TEST RELEASE: Cygwin 2.1.0-0.1 Corinna Vinschen
2015-06-21 18:47 ` Ken Brown
2015-06-22 11:08   ` Corinna Vinschen
2015-06-26 11:12     ` Corinna Vinschen
2015-06-26 12:02       ` Ken Brown
2015-06-26 14:14         ` Corinna Vinschen
2015-06-26 14:34           ` Ken Brown
2015-06-26 15:36             ` Corinna Vinschen
2015-06-26 16:55               ` Ken Brown
2015-06-26 20:10                 ` Corinna Vinschen
2015-06-26 20:26                 ` Corinna Vinschen
2015-06-26 22:28                   ` Ken Brown
2015-06-27 14:53                     ` Corinna Vinschen
2015-06-30 19:55                       ` Corinna Vinschen
2015-06-30 20:13                         ` Ken Brown
2015-07-01 10:47                           ` Corinna Vinschen
2015-07-01 13:57                             ` Corinna Vinschen
2015-07-01 20:12                               ` Ken Brown
2015-07-02  2:10                                 ` Ken Brown
2015-07-02 12:13                                   ` Corinna Vinschen
2015-07-02 12:20                                     ` Corinna Vinschen
2015-07-02 19:25                                       ` Ken Brown
2015-07-03 10:47                                         ` Corinna Vinschen
2015-07-03 10:50                                           ` Corinna Vinschen
2015-07-03 13:09                                           ` Ken Brown
2015-07-14 19:10                                             ` Ken Brown
2015-07-14 19:16                                               ` Corinna Vinschen
2015-07-14 22:19                       ` Eric Blake
2015-07-15  2:21                         ` Ken Brown
