Re:1.5.19: changes have broken Qt3

public inbox for cygwin@cygwin.com

* Re:1.5.19: changes have broken Qt3
@ 2006-05-23 17:10 Ralf Habacker
  2006-05-23 17:31 ` 1.5.19: " Brian Dessent
                   ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Ralf Habacker @ 2006-05-23 17:10 UTC (permalink / raw)
  To: cygwin

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,

> It would appear that changes to the cygwin1.dll since 1.5.18-1 (and
> before the 20051207 snapshot) have broken Qt3. The relevant threads
> until now:

It looks as though this problem is not limited to Qt3, because the
following simple test applications, taken from the pthread-win32 package
ftp://sources.redhat.com/pub/pthreads-win32/pthreads-snap-2005-03-08.tar.gz
also result in a seg fault.

C:\cygwin\home\Habacker\src\pthreads-snap-2005-03-08\tests>strace
mutex1b | grep  C0000005
strace: error creating process mutex1b, (error 2)

C:\cygwin\home\Habacker\src\pthreads-snap-2005-03-08\tests>strace
mutex1n | grep  C0000005
--- Process 4872, exception C0000005 at 610B1005
  155   78759 [main] mutex1n 4872 _cygtls::handle_exceptions: In
cygwin_except_handler exc 0xC0000005 at 0x610B1005 sp 0x22CC00

C:\cygwin\home\Habacker\src\pthreads-snap-2005-03-08\tests>strace
mutex1r | grep  C0000005
--- Process 4960, exception C0000005 at 610B1005
  153   40208 [main] mutex1r 4960 _cygtls::handle_exceptions: In
cygwin_except_handler exc 0xC0000005 at 0x610B1005 sp 0x22CC00

C:\cygwin\home\Habacker\src\pthreads-snap-2005-03-08\tests>strace mutex2
| grep  C0000005

C:\cygwin\home\Habacker\src\pthreads-snap-2005-03-08\tests>strace mutex6
| grep  C0000005
Assertion failed: (pthread_mutex_lock(&mutex) == 0), file mutex6.c, line 59

C:\cygwin\home\Habacker\src\pthreads-snap-2005-03-08\tests>strace
mutex6n | grep  C0000005
--- Process 4820, exception C0000005 at 610B1005
  174   17751 [main] mutex6n 4820 _cygtls::handle_exceptions: In
cygwin_except_handler exc 0xC0000005 at 0x610B1005 sp 0x22CBF0

C:\cygwin\home\Habacker\src\pthreads-snap-2005-03-08\tests>strace
mutex6r | grep  C0000005
--- Process 5676, exception C0000005 at 610B1005
  182   12533 [main] mutex6r 5676 _cygtls::handle_exceptions: In
cygwin_except_handler exc 0xC0000005 at 0x610B1005 sp 0x22CBF0

C:\cygwin\home\Habacker\src\pthreads-snap-2005-03-08\tests>strace
mutex6s | grep  C0000005
Assertion failed: (pthread_mutex_lock(&mutex) == 0), file mutex6s.c, line 59

C:\cygwin\home\Habacker\src\pthreads-snap-2005-03-08\tests>strace
mutex7r | grep  C0000005
--- Process 6240, exception C0000005 at 610B1005
  174   28939 [main] mutex7r 6240 _cygtls::handle_exceptions: In
cygwin_except_handler exc 0xC0000005 at 0x610B1005 sp 0x22CBF0


I traced the problem to the function pthread_mutexattr_init, address
0x610b1005, where a null pointer (%eax below) causes these seg faults:

0x610b1003 <pthread_mutexattr_init+259>:   mov    (%edi),%eax
0x610b1005 <pthread_mutexattr_init+261>:   cmpl   $0xdf0df049,0x4(%eax)
0x610b100c <pthread_mutexattr_init+268>:   je     0x610b1070 <pthread_mutexattr_init+368>

The related source code is in the thread.cc inline function
verifyable_object_isvalid():
  ...
  if ((*object)->magic != magic)
    return INVALID_OBJECT;

The problem is that if *object is zero, the access to the magic member
fails.

The following patch catches this zero pointer, although I'm not sure
whether this zero pointer indicates a major fault condition in the
threading code.


Changelog:
2006-05-23  Ralf Habacker  <ralf.habacker@freenet.de>

	* thread.cc (verifyable_object_isvalid): catch zero pointer.


Index: thread.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/thread.cc,v
retrieving revision 1.198
diff -u -b -B -r1.198 thread.cc
--- thread.cc   22 Mar 2006 20:38:26 -0000      1.198
+++ thread.cc   23 May 2006 13:16:57 -0000
@@ -118,6 +118,9 @@
 {
   verifyable_object **object = (verifyable_object **) objectptr;
+  if (*object == NULL)
+    return INVALID_OBJECT;
+
   myfault efault;
   if (efault.faulted ())
     return INVALID_OBJECT;
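
The check being patched can be sketched in isolation. This is a minimal
stand-in, not Cygwin's code: `demo_object`, `DEMO_MAGIC` and
`demo_isvalid` are hypothetical names, and the only details taken from
the report are the pointer-to-pointer cast and a magic word that follows
one pointer-sized field (which would match the `0x4(%eax)` access on
32-bit x86). It shows where the added NULL test sits and deliberately
leaves out the `myfault` handling.

```c
#include <assert.h>
#include <stddef.h>

enum demo_state { DEMO_VALID_OBJECT, DEMO_INVALID_OBJECT };

#define DEMO_MAGIC 0xdf0df049u  /* constant seen in the cmpl instruction */

/* Hypothetical stand-in: one pointer-sized slot, then the magic word. */
struct demo_object
{
  void *reserved;
  unsigned magic;
};

/* objectptr is the address of the caller's handle, i.e. a pointer to a
   pointer, mirroring the cast at the top of the patched function. */
enum demo_state
demo_isvalid (void const *objectptr, unsigned magic)
{
  struct demo_object *const *object = (struct demo_object *const *) objectptr;

  assert (object != NULL);      /* the handle's own address must be usable */
  if (*object == NULL)          /* the guard proposed by the patch */
    return DEMO_INVALID_OBJECT;
  if ((*object)->magic != magic)
    return DEMO_INVALID_OBJECT;
  return DEMO_VALID_OBJECT;
}
```

Note that the guard only covers a handle that is exactly NULL; a handle
holding any other invalid address would still fault at the magic
comparison.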


After some more investigation I found that there are additional cases
where seg faults happen because the object pointer is non-null but
still not valid. This needs more research.

BTW: Using the pthread test applications makes it much easier to check
the threading API. For example, there is an unhandled case in
semaphore::_timedwait where abstime=NULL results in a seg fault.

If this were my project I would add such unit test cases as far as
possible. Because pthread-win32 is also hosted on sources.redhat.com, it
may be easier to relicense its test applications to cygwin than those
from other external sources.

If desired, I can help add some of these test applications.

Regards
 Ralf





-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEc0HwoHh+5t8EXncRAkXvAJ93oYvQOcPWc0jvLpoFj4lBFUDVxACcCdAc
HXNcHvvAF+is8L//ADQMGi0=
=wQG1
-----END PGP SIGNATURE-----

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/


* Re: 1.5.19: changes have broken Qt3
  2006-05-23 17:10 Re:1.5.19: changes have broken Qt3 Ralf Habacker
@ 2006-05-23 17:31 ` Brian Dessent
  2006-05-23 17:33 ` Dave Korn
  2006-05-24 15:38 ` Ralf Habacker
  2 siblings, 0 replies; 36+ messages in thread
From: Brian Dessent @ 2006-05-23 17:31 UTC (permalink / raw)
  To: cygwin

Ralf Habacker wrote:

> C:\cygwin\home\Habacker\src\pthreads-snap-2005-03-08\tests>strace
> mutex1n | grep  C0000005
> - --- Process 4872, exception C0000005 at 610B1005
>   155   78759 [main] mutex1n 4872 _cygtls::handle_exceptions: In
> cygwin_except_handler exc 0xC0000005 at 0x610B1005 sp 0x22CC00
>
> [...]
> 
> Index: thread.cc
> ===================================================================
> RCS file: /cvs/src/src/winsup/cygwin/thread.cc,v
> retrieving revision 1.198
> diff -u -b -B -r1.198 thread.cc
> --- thread.cc   22 Mar 2006 20:38:26 -0000      1.198
> +++ thread.cc   23 May 2006 13:16:57 -0000
> @@ -118,6 +118,9 @@
>  {
>    verifyable_object **object = (verifyable_object **) objectptr;
> +  if (*object == NULL)
> +    return INVALID_OBJECT;
> +
>    myfault efault;
>    if (efault.faulted ())
>      return INVALID_OBJECT;

Um, this can't be right.  The entire purpose of that "myfault efault"
line right there is to install a handler that catches any fault that
occurs until efault's destructor runs and return INVALID_OBJECT.  So
checking for NULL is not necessary; the c0000005 exception is caught and
handled gracefully.  Whatever testcase failure you are seeing is not
because of this.
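
The mechanism described here can be sketched with POSIX signals. This is
only an analogy under stated assumptions: Cygwin's `myfault` class is not
shown in this thread, and `probe_magic`, `demo_obj` and the use of
`sigsetjmp`/`siglongjmp` are illustrative stand-ins, not Cygwin code.
Dereferencing an invalid pointer is undefined behaviour in ISO C, so the
sketch relies on a platform that delivers a catchable SIGSEGV or SIGBUS.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

enum probe_result { PROBE_VALID, PROBE_INVALID };

struct demo_obj { void *reserved; unsigned magic; };

static sigjmp_buf probe_env;

static void
probe_on_fault (int sig)
{
  (void) sig;
  siglongjmp (probe_env, 1);    /* unwind back into probe_magic */
}

enum probe_result
probe_magic (struct demo_obj *const *slot, unsigned magic)
{
  struct sigaction sa, old_segv, old_bus;
  volatile enum probe_result result = PROBE_INVALID;

  assert (slot != NULL);
  memset (&sa, 0, sizeof sa);
  sa.sa_handler = probe_on_fault;
  sigemptyset (&sa.sa_mask);
  sigaction (SIGSEGV, &sa, &old_segv);  /* install the temporary handler */
  sigaction (SIGBUS, &sa, &old_bus);

  if (sigsetjmp (probe_env, 1) == 0)
    {
      /* volatile read so the compiler cannot assume the handle's value */
      struct demo_obj *obj = *(struct demo_obj *const volatile *) slot;
      if (obj->magic == magic)  /* faults here if obj is NULL or garbage */
        result = PROBE_VALID;
    }
  /* otherwise a fault occurred and result stays PROBE_INVALID */

  sigaction (SIGSEGV, &old_segv, NULL); /* remove the temporary handler */
  sigaction (SIGBUS, &old_bus, NULL);
  return result;
}
```

With a handler like this in place, a NULL handle produces a fault that is
caught and reported as "invalid" rather than terminating the process,
which is why an explicit NULL test changes the cost but not the result.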

Brian


* RE: 1.5.19: changes have broken Qt3
  2006-05-23 17:10 Re:1.5.19: changes have broken Qt3 Ralf Habacker
  2006-05-23 17:31 ` 1.5.19: " Brian Dessent
@ 2006-05-23 17:33 ` Dave Korn
  2006-05-23 18:16   ` Ralf Habacker
  2006-05-24 15:38 ` Ralf Habacker
  2 siblings, 1 reply; 36+ messages in thread
From: Dave Korn @ 2006-05-23 17:33 UTC (permalink / raw)
  To: cygwin

On 23 May 2006 18:10, Ralf Habacker wrote:

  Oh no, not this old saw again!

> C:\cygwin\home\Habacker\src\pthreads-snap-2005-03-08\tests>strace
> mutex1n | grep  C0000005
> - --- Process 4872, exception C0000005 at 610B1005
>   155   78759 [main] mutex1n 4872 _cygtls::handle_exceptions: In
> cygwin_except_handler exc 0xC0000005 at 0x610B1005 sp 0x22CC00

[ snip many more ].

> I identified the problem to the function pthread_mutexattr_init address
> 0x610b1005 in which a null pointer (%eax below) causes this seg faults

  Yes, but it's wrapped in an exception handler.  That's why you get to see an
strace error message, rather than having the process exit.  Names like
"cygwin_except_handler" and "_cygtls::handle_exceptions" should have given you
some clues about this!

> The following patch catch this zero pointer, although I'm not sure if
> this  zero pointer indicates a major fault conditions in the threading stuff

  If you don't know whether or not it's a bug, you shouldn't be trying to fix
it.  You should *understand* the code first, and think about patching it
second.

  And you know, if you think you've found a bug, and you think you've got some
testcases, and you think you've developed a patch, well, surely you actually
TRIED it out and saw that the testcases were still failing?

> 	* thread.cc (verifyable_object_isvalid): catch zero pointer.

  If you had even googled the list archive, you would have seen this being
suggested before.  See, e.g.

http://article.gmane.org/gmane.os.cygwin.patches/2976

> After some more investigation I found that there are additional cases
> where seg faults happens because of object pointer not being null and
> not be valid. This needs more research.

  Yep.  Start by looking up what efault.faulted is all about!

    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....


* Re: 1.5.19: changes have broken Qt3
  2006-05-23 17:33 ` Dave Korn
@ 2006-05-23 18:16   ` Ralf Habacker
  2006-05-23 18:24     ` Christopher Faylor
  0 siblings, 1 reply; 36+ messages in thread
From: Ralf Habacker @ 2006-05-23 18:16 UTC (permalink / raw)
  To: cygwin

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dave Korn wrote:
> On 23 May 2006 18:10, Ralf Habacker wrote:
> 
>   Oh no, not this old saw again!
> 
>> C:\cygwin\home\Habacker\src\pthreads-snap-2005-03-08\tests>strace
>> mutex1n | grep  C0000005
>> - --- Process 4872, exception C0000005 at 610B1005
>>   155   78759 [main] mutex1n 4872 _cygtls::handle_exceptions: In
>> cygwin_except_handler exc 0xC0000005 at 0x610B1005 sp 0x22CC00
> 
> [ snip many more ].
> 
>> I identified the problem to the function pthread_mutexattr_init address
>> 0x610b1005 in which a null pointer (%eax below) causes this seg fault
> 
>   Yes, but it's wrapped in an exception handler.  That's why you get to see an
> strace error message, rather than having the process exit.  Names like
> "cygwin_except_handler" and "_cygtls::handle_exceptions" should have given you
> some clues about this

Hmm, I have learned to fix obvious problems rather than leave them to
an exception handler, which has several disadvantages.

1. It costs additional runtime. In the mentioned designer I count 1653
internal exceptions, which are caused by the null pointer issue.

2. 70% of my strace output (1000 lines) is polluted by these internal
exception messages, which could be avoided by changing only a few lines
of code. Don't you think that is worth the effort?

Are there any other problems with this simple patch?

Regards
Ralf


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEc1DEoHh+5t8EXncRArlMAJ9epXvjle2JEMHKawUbLTndwtMRMwCfY2+D
v0dV1EFxuvjvUJKzhfDZJTE=
=hHTj
-----END PGP SIGNATURE-----


* Re: 1.5.19: changes have broken Qt3
  2006-05-23 18:16   ` Ralf Habacker
@ 2006-05-23 18:24     ` Christopher Faylor
  2006-05-23 19:23       ` Ralf Habacker
  0 siblings, 1 reply; 36+ messages in thread
From: Christopher Faylor @ 2006-05-23 18:24 UTC (permalink / raw)
  To: cygwin

On Tue, May 23, 2006 at 08:13:24PM +0200, Ralf Habacker wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>Dave Korn wrote:
>> On 23 May 2006 18:10, Ralf Habacker wrote:
>> 
>>   Oh no, not this old saw again!
>> 
>>> C:\cygwin\home\Habacker\src\pthreads-snap-2005-03-08\tests>strace
>>> mutex1n | grep  C0000005
>>> - --- Process 4872, exception C0000005 at 610B1005
>>>   155   78759 [main] mutex1n 4872 _cygtls::handle_exceptions: In
>>> cygwin_except_handler exc 0xC0000005 at 0x610B1005 sp 0x22CC00
>> 
>> [ snip many more ].
>> 
>>> I identified the problem to the function pthread_mutexattr_init address
>>> 0x610b1005 in which a null pointer (%eax below) causes this seg fault
>> 
>>   Yes, but it's wrapped in an exception handler.  That's why you get to see an
>> strace error message, rather than having the process exit.  Names like
>> "cygwin_except_handler" and "_cygtls::handle_exceptions" should have given you
>> some clues about this
>
>Hmmh, I have learned to fix obviously problems instead let it handle by
>an exception handler, which has several disadvantages.
>
>1. It costs additional runtime. In the mentioned designer i count 1653
>internal exceptions, which are caused by the null pointer issue.
>
>2. 70% of my strace output (1000 lines) are polluted by this internal
>exceptions messages, which could be avoided by changing only on lines of
>code. Do you think that this is the effort worth ?
>
>Are there more problems with this simple patch ?

The obvious problem is that you have provided an analysis of the common
code path.  If the standard code path does not usually involve a NULL
pointer then your patch introduces a statistically unneeded test, i.e.,
in your test case there are NNN internal exceptions but you haven't
stated that your test case is the standard way that these functions
would be used.  If the standard code path involves non-NULL pointers
then adding your test would mean a net slowdown.

Also, since the current code is supposed to deal with the problem
without your patch, your patch can't be considered as anything other
than a band-aid until the reason for the problem is actually understood.

cgf


* Re: 1.5.19: changes have broken Qt3
  2006-05-23 18:24     ` Christopher Faylor
@ 2006-05-23 19:23       ` Ralf Habacker
  2006-05-23 20:10         ` Christopher Faylor
  2006-05-24  8:50         ` Brian Dessent
  0 siblings, 2 replies; 36+ messages in thread
From: Ralf Habacker @ 2006-05-23 19:23 UTC (permalink / raw)
  To: cygwin

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Christopher Faylor wrote:
> On Tue, May 23, 2006 at 08:13:24PM +0200, Ralf Habacker wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Dave Korn wrote:
>>> On 23 May 2006 18:10, Ralf Habacker wrote:
>>>
>>>   Oh no, not this old saw again!
>>>
>>>> C:\cygwin\home\Habacker\src\pthreads-snap-2005-03-08\tests>strace
>>>> mutex1n | grep  C0000005
>>>> - --- Process 4872, exception C0000005 at 610B1005
>>>>   155   78759 [main] mutex1n 4872 _cygtls::handle_exceptions: In
>>>> cygwin_except_handler exc 0xC0000005 at 0x610B1005 sp 0x22CC00
>>> [ snip many more ].
>>>
>>>> I identified the problem to the function pthread_mutexattr_init address
>>>> 0x610b1005 in which a null pointer (%eax below) causes this seg fault
>>>   Yes, but it's wrapped in an exception handler.  That's why you get to see an
>>> strace error message, rather than having the process exit.  Names like
>>> "cygwin_except_handler" and "_cygtls::handle_exceptions" should have given you
>>> some clues about this
>> Hmmh, I have learned to fix obviously problems instead let it handle by
>> an exception handler, which has several disadvantages.
>>
>> 1. It costs additional runtime. In the mentioned designer i count 1653
>> internal exceptions, which are caused by the null pointer issue.
>>
>> 2. 70% of my strace output (1000 lines) are polluted by this internal
>> exceptions messages, which could be avoided by changing only on lines of
>> code. Do you think that this is the effort worth ?
>>
>> Are there more problems with this simple patch ?
> 
> The obvious problem is that you have provided an analysis of the common
> code path.  If the standard code path does not usually involve a NULL
> pointer then your patch introduces a statistically unneeded test, i.e.,
> in your test case there are NNN internal exceptions but you haven't
> stated that your test case is the standard way that these functions
> would be used.  

Okay, here is one of the test cases. Can anyone confirm that this is
standard usage?

#include <assert.h>
#include <pthread.h>

pthread_mutex_t mutex = NULL;
pthread_mutexattr_t mxAttr;

int
main()
{
  assert(pthread_mutexattr_init(&mxAttr) == 0);
  assert(pthread_mutexattr_settype(&mxAttr, PTHREAD_MUTEX_ERRORCHECK) == 0);
  assert(mutex == NULL);
  assert(pthread_mutex_init(&mutex, &mxAttr) == 0);
  assert(mutex != NULL);
  assert(pthread_mutex_lock(&mutex) == 0);
  assert(pthread_mutex_unlock(&mutex) == 0);
  assert(pthread_mutex_destroy(&mutex) == 0);
  assert(mutex == NULL);
  return 0;
}

Running this testcase results in an internal exception in
pthread_mutexattr_init()

Program received signal SIGSEGV, Segmentation fault.
0x610b1005 in pthread_mutexattr_init (attr=0x404040) at
../../../../src/winsup/cygwin/thread.cc:129
129       if ((*object)->magic != magic)
1: x/i $eip  0x610b1005 <pthread_mutexattr_init+261>:   cmpl
$0xdf0df049,0x4(%eax)

the variable object is located in eax, which is zero.

(gdb) p $eax
$1 = 0

There are two threads

(gdb) info thread
  2 thread 5772.0x1abc  0x7c91eb94 in ntdll!LdrAccessResource () from
ntdll.dll
* 1 thread 5772.0xc50  0x610b1005 in pthread_mutexattr_init
(attr=0x404040) at ../../../../src/winsup/cygwin/thread.cc:129

and the backtrace says that pthread_mutexattr_init() is called by _sigfe

(gdb) bt
#0  0x610b1005 in pthread_mutexattr_init (attr=0x404040) at
../../../../src/winsup/cygwin/thread.cc:129
#1  0x61090d68 in _sigfe () at ../../../../src/winsup/cygwin/cygserver.h:82
#2  0x00401050 in mainCRTStartup ()
(gdb)

but this backtrace says only that this function seems to be called from
the signal thread.


> If the standard code path involves non-NULL pointers
> then adding your test would mean a net slowdown.

> Also, since the current code is supposed to deal with the problem
> without your patch, your patch can't be considered as anything other
> than a band-aid until the reason for the problem is actually understood.
> 
You're right; I hope the above helps with this.

Ralf



-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEc2B8oHh+5t8EXncRAlVrAJoDPcvJb/ynI6T+m4jtiUwxTlweQwCgoD6k
nEtFHGFWiE3j0SaMBUgCVRE=
=gRS1
-----END PGP SIGNATURE-----


* Re: 1.5.19: changes have broken Qt3
  2006-05-23 19:23       ` Ralf Habacker
@ 2006-05-23 20:10         ` Christopher Faylor
  2006-05-24  8:56           ` Ralf Habacker
  2006-05-24  8:50         ` Brian Dessent
  1 sibling, 1 reply; 36+ messages in thread
From: Christopher Faylor @ 2006-05-23 20:10 UTC (permalink / raw)
  To: cygwin

On Tue, May 23, 2006 at 09:20:28PM +0200, Ralf Habacker wrote:
>You're right; I hope the above helps with this.

Ralf,
You have the test case.  You have the source code.  You've already
provided a patch.  What's stopping you from determining the cause of
the problem now that you understand that this situation is already
supposed to be handled?  I appreciate that you have tracked this down
but I don't understand why you now seem to have given up at this point.

cgf


* Re: 1.5.19: changes have broken Qt3
  2006-05-23 19:23       ` Ralf Habacker
  2006-05-23 20:10         ` Christopher Faylor
@ 2006-05-24  8:50         ` Brian Dessent
  2006-05-24  9:01           ` Ralf Habacker
  2006-05-24 10:06           ` 1.5.19: changes have broken Qt3 clayne
  1 sibling, 2 replies; 36+ messages in thread
From: Brian Dessent @ 2006-05-24  8:50 UTC (permalink / raw)
  To: cygwin

Ralf Habacker wrote:

> Running this testcase results in an internal exception in
> pthread_mutexattr_init()
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x610b1005 in pthread_mutexattr_init (attr=0x404040) at
> ../../../../src/winsup/cygwin/thread.cc:129
> 129       if ((*object)->magic != magic)

Sigh.  We've been through this ad nauseam in the archives.  This is how
it's supposed to work, there's nothing wrong here.  Gdb doesn't know any
better though, and reports it as a SIGSEGV, when it is not.  Did you not
notice that when you run the program outside of the debugger it does not
fault?  If you use a recent Cygwin snapshot and a gdb built from CVS you
see no such fault, because this defect in gdb has been fixed.

Brian


* Re: 1.5.19: changes have broken Qt3
  2006-05-23 20:10         ` Christopher Faylor
@ 2006-05-24  8:56           ` Ralf Habacker
  0 siblings, 0 replies; 36+ messages in thread
From: Ralf Habacker @ 2006-05-24  8:56 UTC (permalink / raw)
  To: cygwin

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Christopher Faylor wrote:
> On Tue, May 23, 2006 at 09:20:28PM +0200, Ralf Habacker wrote:
>> You're right; I hope the above helps with this.
> 
> Ralf,
> You have the test case.  You have the source code.  You've already
> provided a patch.  What's stopping you from determining the cause of
> the problem now that you understand that this situation is already
> supposed to be handled?  I appreciate that you have tracked this down
> but I don't understand why you now seem to have given up at this point.

Have I said this? I only reported the current state in the hope that
someone had been bitten by the same issue and had some additional hints,
which seems not to be the case.

Anyway, the problem is that in

extern "C" int
pthread_mutexattr_init (pthread_mutexattr_t *attr)
{
  if (pthread_mutexattr::is_good_object (attr))
    return EBUSY;

  *attr = new pthread_mutexattr ();

pthread_mutexattr::is_good_object() is called, but attr does not yet
contain a valid object (it is created later), and the function aborts,
which should not happen.
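
The control flow in question can be sketched with a hypothetical
pointer-typed attribute handle. `demo_attr_t`, `demo_attr_init`,
`demo_attr_is_good` and `demo_attr_destroy` are illustrative names, not
Cygwin's, and an explicit NULL test stands in for the fault-handler-based
validity check. The sketch therefore only shows the outcomes one would
expect: a zeroed, not yet initialised handle gets a fresh object, and
only an already initialised handle yields EBUSY.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define DEMO_ATTR_MAGIC 0x5a5a0001u     /* arbitrary value for this sketch */

struct demo_attr { unsigned magic; int type; };
typedef struct demo_attr *demo_attr_t;  /* the handle is itself a pointer */

int
demo_attr_is_good (demo_attr_t const *attr)
{
  /* Simplification: only NULL counts as "not yet initialised" here. */
  return *attr != NULL && (*attr)->magic == DEMO_ATTR_MAGIC;
}

int
demo_attr_init (demo_attr_t *attr)
{
  assert (attr != NULL);
  if (demo_attr_is_good (attr))
    return EBUSY;               /* already initialised */

  *attr = malloc (sizeof **attr);
  if (*attr == NULL)
    return ENOMEM;
  (*attr)->magic = DEMO_ATTR_MAGIC;
  (*attr)->type = 0;
  return 0;
}

int
demo_attr_destroy (demo_attr_t *attr)
{
  if (!demo_attr_is_good (attr))
    return EINVAL;
  (*attr)->magic = 0;           /* invalidate before releasing */
  free (*attr);
  *attr = NULL;
  return 0;
}
```

In this model, calling `demo_attr_init` on a zeroed handle returns 0 and
leaves the handle non-NULL; a second call on the same handle returns
EBUSY.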

Is there a replacement for the removed check_valid_pointer() function,
which was used before the call to pthread_mutexattr::is_good_object()
was introduced
(http://www.cygwin.com/ml/cygwin-patches/2002-q4/msg00204.html), and
which could be added to pthread_mutexattr_init?

BTW: A similar problem exists in

pthread_condattr_init ()
pthread_rwlockattr_init ()
pthread_attr_init ()


Regards
 Ralf

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEdB/IoHh+5t8EXncRAsD/AJ9n/X9jLNaq0qoU2nFmpJpFLkks9QCeMJDM
a16WqHXFx0EjPu7HA+ORhKI=
=gRfG
-----END PGP SIGNATURE-----


* Re: 1.5.19: changes have broken Qt3
  2006-05-24  8:50         ` Brian Dessent
@ 2006-05-24  9:01           ` Ralf Habacker
  2006-05-24  9:23             ` Brian Dessent
  2006-05-24 10:06           ` 1.5.19: changes have broken Qt3 clayne
  1 sibling, 1 reply; 36+ messages in thread
From: Ralf Habacker @ 2006-05-24  9:01 UTC (permalink / raw)
  To: cygwin

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Brian Dessent wrote:
> Ralf Habacker wrote:
> 
>> Running this testcase results in an internal exception in
>> pthread_mutexattr_init()
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x610b1005 in pthread_mutexattr_init (attr=0x404040) at
>> ../../../../src/winsup/cygwin/thread.cc:129
>> 129       if ((*object)->magic != magic)
> 
> Sigh.  We've been through this ad nauseam in the archives.  This is how
> it's supposed to work, there's nothing wrong here.  

But in the case of pthread_mutexattr_init() this exception results in an
abort of pthread_mutexattr_init(), which should not happen. See my other
mail in this thread.



> Gdb doesn't know any
> better though, and reports it as a SIGSEGV, when it is not.  Did you not
> notice that when you run the program outside of the debugger it does not
> fault?  

There is no segfault, but it does not work as expected, e.g.
pthread_mutexattr_init() does not fill the pthread_mutexattr_t struct
given as parameter.

> If you use a recent Cygwin snapshot and a gdb built from CVS you
> see no such fault, because this defect in gdb has been fixed.
> 

Ralf

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEdCDRoHh+5t8EXncRAkcSAJ0TpZMnh5qhSQKY8nrb688Pq4bxogCfaTG5
9LDqWxCYGtlpmm9LBrKZcac=
=2syh
-----END PGP SIGNATURE-----


* Re: 1.5.19: changes have broken Qt3
  2006-05-24  9:01           ` Ralf Habacker
@ 2006-05-24  9:23             ` Brian Dessent
  2006-05-24 12:21               ` Ralf Habacker
  2006-05-24 16:13               ` Ralf Habacker
  0 siblings, 2 replies; 36+ messages in thread
From: Brian Dessent @ 2006-05-24  9:23 UTC (permalink / raw)
  To: cygwin

Ralf Habacker wrote:

> There is no segfault, but it does not work as expected e.g.
> pthread_mutexattr_init() does not fill the pthread_mutexattr_t struct
> given as parameter.

How does it not work?  The testcase runs fine for me with no assertion
failures, neither from a prompt nor in (CVS) gdb.  Even when I modify it
as follows:

--- pthread_mutexattr_init.c    2006-05-24 02:05:52.523968000 -0700
+++ pthread_mutexattr_init_2.c  2006-05-24 02:11:27.299406200 -0700
@@ -9,6 +9,9 @@ main()
 {
   assert(pthread_mutexattr_init(&mxAttr) == 0);
   assert(pthread_mutexattr_settype(&mxAttr, PTHREAD_MUTEX_ERRORCHECK) == 0);
+  int t;
+  pthread_mutexattr_gettype(&mxAttr, &t);
+  assert(t == PTHREAD_MUTEX_ERRORCHECK);
   assert(mutex == NULL);
   assert(pthread_mutex_init(&mutex, &mxAttr) == 0);
   assert(mutex != NULL);

...it still runs without failure.

BTW the whole "myfault" thing was devised specifically to kill the
IsBadReadPtr() junk that was used before, so asking for that back is
probably never going to happen.  And with good reason too, because when
you call IsBadReadPtr it does exactly what "myfault" does, it installs a
temporary fault handler, tries to read the memory, and then removes that
temporary fault handler.  Except that if you call IsBadReadPtr a bunch
of times it has to do this setup/teardown every time, instead of just
doing it once for the entire lexical scope of the function, as with
myfault.
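
The difference in set-up cost can be illustrated with a rough POSIX
analogy. Nothing below is Cygwin or Windows code: `readable_per_call`
imitates the per-call style attributed to IsBadReadPtr,
`all_readable_scoped` imitates a single guard covering a whole function,
and `guard_installs` is a hypothetical counter added only to make the
number of handler set-ups visible. The sketch assumes that an invalid
read raises a catchable SIGSEGV.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf guard_env;
static struct sigaction guard_old;
static volatile int guard_sink;  /* destination for the probing reads */
int guard_installs = 0;          /* counts handler set-ups */

static void
guard_on_fault (int sig)
{
  (void) sig;
  siglongjmp (guard_env, 1);
}

static void
guard_begin (void)
{
  struct sigaction sa;
  memset (&sa, 0, sizeof sa);
  sa.sa_handler = guard_on_fault;
  sigemptyset (&sa.sa_mask);
  sigaction (SIGSEGV, &sa, &guard_old);
  guard_installs++;
}

static void
guard_end (void)
{
  sigaction (SIGSEGV, &guard_old, NULL);
}

/* Per-call style: every probe pays for its own set-up and teardown. */
int
readable_per_call (const int *p)
{
  volatile int ok = 0;
  guard_begin ();
  if (sigsetjmp (guard_env, 1) == 0)
    {
      guard_sink = *(const volatile int *) p;
      ok = 1;
    }
  guard_end ();
  return ok;
}

/* Per-scope style: one set-up covers every access in the function. */
int
all_readable_scoped (const int *const *ptrs, size_t n)
{
  volatile int ok = 0;
  size_t i;
  guard_begin ();
  if (sigsetjmp (guard_env, 1) == 0)
    {
      for (i = 0; i < n; i++)
        guard_sink = *(const volatile int *) ptrs[i];
      ok = 1;
    }
  guard_end ();
  return ok;
}
```

Probing three pointers with `readable_per_call` installs the handler
three times, while one call to `all_readable_scoped` over the same three
pointers installs it once.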

And yes, it used to be that gdb was too dumb to recognise that these
faults in IsBadReadPtr were not actual faults, and it would print them
as spurious SIGSEGVs, just as it currently does for "myfault"s.  Then it
was patched to ignore faults in kernel32.dll.  Now that the handler is
in cygwin1.dll, it had to be taught to ignore faults there too, and if
you use a CVS GDB, it does.

Brian


* Re: 1.5.19: changes have broken Qt3
  2006-05-24  8:50         ` Brian Dessent
  2006-05-24  9:01           ` Ralf Habacker
@ 2006-05-24 10:06           ` clayne
  2006-05-24 11:00             ` Dave Korn
  2006-05-24 12:18             ` Brian Dessent
  1 sibling, 2 replies; 36+ messages in thread
From: clayne @ 2006-05-24 10:06 UTC (permalink / raw)
  To: cygwin

On Wed, May 24, 2006 at 01:49:53AM -0700, Brian Dessent wrote:
> Sigh.  We've been through this ad nauseam in the archives.  This is how
> it's supposed to work, there's nothing wrong here.  Gdb doesn't know any
> better though, and reports it as a SIGSEGV, when it is not.  Did you not
> notice that when you run the program outside of the debugger it does not
> fault?  If you use a recent Cygwin snapshot and a gdb built from CVS you
> see no such fault, because this defect in gdb has been fixed.
> 
> Brian

Actually, is this really a fault in gdb? Cygwin is throwing a SIGSEGV signal,
correct? GDB does what it's told, stops on SIGSEGV by default.

-cl


* RE: 1.5.19: changes have broken Qt3
  2006-05-24 10:06           ` 1.5.19: changes have broken Qt3 clayne
@ 2006-05-24 11:00             ` Dave Korn
  2006-05-24 12:10               ` clayne
  2006-05-24 12:18             ` Brian Dessent
  1 sibling, 1 reply; 36+ messages in thread
From: Dave Korn @ 2006-05-24 11:00 UTC (permalink / raw)
  To: cygwin

On 24 May 2006 11:05, clayne@anodized wrote:

> On Wed, May 24, 2006 at 01:49:53AM -0700, Brian Dessent wrote:
>> Sigh.  We've been through this ad nauseam in the archives.  This is how
>> it's supposed to work, there's nothing wrong here.  Gdb doesn't know any
>> better though, and reports it as a SIGSEGV, when it is not.  Did you not
>> notice that when you run the program outside of the debugger it does not
>> fault?  If you use a recent Cygwin snapshot and a gdb built from CVS you
>> see no such fault, because this defect in gdb has been fixed.
>> 
>> Brian
> 
> Actually, is this really a fault in gdb? Cygwin is throwing a SIGSEGV
> signal, correct? GDB does what it's told, stops on SIGSEGV by default.
> 
> -cl

  But it doesn't interact properly with cygwin's exception handling -> signal
mechanism, and the task dies, when it should just run on.

  Anyone who's doing any serious debugging on Cygwin very seriously wants to
build their own gdb and insight from current CVS.  It's much improved of late.

    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....



* Re: 1.5.19: changes have broken Qt3
  2006-05-24 11:00             ` Dave Korn
@ 2006-05-24 12:10               ` clayne
  2006-05-24 12:28                 ` Dave Korn
  0 siblings, 1 reply; 36+ messages in thread
From: clayne @ 2006-05-24 12:10 UTC (permalink / raw)
  To: cygwin

On Wed, May 24, 2006 at 11:40:58AM +0100, Dave Korn wrote:
> > Actually, is this really a fault in gdb? Cygwin is throwing a SIGSEGV
> > signal, correct? GDB does what it's told, stops on SIGSEGV by default.
> > 
> > -cl
> 
>   But it doesn't interact properly with cygwin's exception handling -> signal
> mechanism, and the task dies, when it should just run on.
> 
>   Anyone who's doing any serious debugging on Cygwin very seriously wants to
> build their own gdb and insight from current CVS.  It's much improved of late.

Right, or bless their sanity - as it won't last long. But I'm just trying to
argue that it's no shortcoming of gdb that it catches SIGSEGV signals which
are being artificially generated by cygwin.

What's the design rationale for the whole 'check for uninitialized space and
segfault if uninitialized' approach when it comes to statically initialized
pthreads objects in the first place, btw? Why not just have pthread_mutex_t
(for example) actually be a pthread_mutex_t instead of a typedef'd pointer to
the real pthread_mutex_t? Why dynamically allocate space for it at all via
"check bunk memory -> throw fault -> alloc real memory for real
pthread_mutex_t", as opposed to "initialize the mutex -> if bunk space,
segfault as usual"?

-cl


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 1.5.19: changes have broken Qt3
  2006-05-24 10:06           ` 1.5.19: changes have broken Qt3 clayne
  2006-05-24 11:00             ` Dave Korn
@ 2006-05-24 12:18             ` Brian Dessent
  1 sibling, 0 replies; 36+ messages in thread
From: Brian Dessent @ 2006-05-24 12:18 UTC (permalink / raw)
  To: cygwin

clayne@anodized.com wrote:

> Actually, is this really a fault in gdb? Cygwin is throwing a SIGSEGV signal,
> correct? GDB does what it's told, stops on SIGSEGV by default.

Not really.  In cases where it is checking parameters or otherwise
expects to dereference an invalid pointer, Cygwin installs a temporary
fault handler that intercepts any fault and returns the correct error
code.  If you run such code outside of gdb you get no indication of a
fault at all, just like a standard try/except block -- unlike an actual
segmentation violation where the program is terminated.  So yes, it is a
defect that gdb treats these as actual SIGSEGVs when they are actually
just part of how Cygwin works internally, and this misperception has
caused countless messages posted to this list insisting that there is
some kind of problem in Cygwin where there is none.
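
The "temporary fault handler" idea can be sketched with plain POSIX signals.
This is only an analogue under stated assumptions, not Cygwin's code: Cygwin
uses Windows structured exception handling inside cygwin1.dll, whereas the
sketch below uses sigaction() and sigsetjmp(). The names checked_read,
probe_handler and probe_env are hypothetical, and the single static jump
buffer means the sketch is not thread-safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Jump target for the temporary handler (hypothetical name; not thread-safe). */
static sigjmp_buf probe_env;

static void probe_handler(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);   /* resume in checked_read() instead of dying */
}

/* Copies *p to *out. Returns 0 on success, or EFAULT if reading *p faulted. */
int checked_read(const volatile int *p, int *out)
{
    struct sigaction sa, old_segv, old_bus;
    int result = 0;

    assert(out != NULL);

    /* Install the temporary fault handler. */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = probe_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    if (sigsetjmp(probe_env, 1) == 0)
        *out = *p;              /* the access that may fault */
    else
        result = EFAULT;        /* reached only via probe_handler() */

    /* Remove the temporary handler again. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return result;
}
```

Run normally, the caller of checked_read() only ever sees a return code. A
debugger attached to the process still gets first look at the fault before
the handler runs, which is the situation discussed in this thread.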

Brian


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 1.5.19: changes have broken Qt3
  2006-05-24  9:23             ` Brian Dessent
@ 2006-05-24 12:21               ` Ralf Habacker
  2006-05-24 12:31                 ` Dave Korn
  2006-05-27 19:47                 ` Brian Dessent
  2006-05-24 16:13               ` Ralf Habacker
  1 sibling, 2 replies; 36+ messages in thread
From: Ralf Habacker @ 2006-05-24 12:21 UTC (permalink / raw)
  To: cygwin

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Brian Dessent wrote:
> Ralf Habacker wrote:
> 
>> There is no segfault, but it does not work as expected e.g.
>> pthread_mutexattr_init() does not fill the pthread_mutexattr_t struct
>> given as parameter.
> 
> How does it not work?  The testcase runs fine for me with no assertion
> failures, neither from a prompt nor in (CVS) gdb.  Even when I modify it
> as follows:
> 
> --- pthread_mutexattr_init.c    2006-05-24 02:05:52.523968000 -0700
> +++ pthread_mutexattr_init_2.c  2006-05-24 02:11:27.299406200 -0700
> @@ -9,6 +9,9 @@ main()
>  {
>    assert(pthread_mutexattr_init(&mxAttr) == 0);
>    assert(pthread_mutexattr_settype(&mxAttr, PTHREAD_MUTEX_ERRORCHECK)
> == 0);
> +  int t;
> +  pthread_mutexattr_gettype(&mxAttr, &t);
> +  assert(t == PTHREAD_MUTEX_ERRORCHECK);
>    assert(mutex == NULL);
>    assert(pthread_mutex_init(&mutex, &mxAttr) == 0);
>    assert(mutex != NULL);
> 
> ...it still runs without failure.
> 
> BTW the whole "myfault" thing was devised specifically to kill the
> IsBadReadPtr() junk that was used before, so asking for that back is
> probably never going to happen.  And with good reason too, because when
> you call IsBadReadPtr it does exactly what "myfault" does: it installs a
> temporary fault handler, tries to read the memory, and then removes that
> temporary fault handler.  Except that if you call IsBadReadPtr a bunch
> of times it has to do this setup/teardown every time, instead of just
> doing it once for the entire lexical scope of the function, as with
> myfault.

Thanks for this info; it helps me understand the new exception handling
in cygwin. I was bitten last year by some thread-related problems while
porting qt3 to cygwin and spent some time investigating this stuff,
which has changed a lot in the meantime.

> And yes, it used to be that gdb was too dumb to recognise that these
> faults in IsBadReadPtr were not actual faults, and it would print them
> as spurious SIGSEGVs, just as it currently does for "myfault"s.  Then it
> was patched to ignore faults in kernel32.dll.  Now that the handler is
> in cygwin1.dll, it had to be taught to ignore faults there too, and if
> you use a CVS GDB, it does.
> 

You said that the testcase runs, yes, but have you tried to debug the
cygwin dll with this exception handling? Please start the above-mentioned
testcase in gdb and enter

b main
r
b pthread_mutexattr::pthread_mutexattr()
c

This breakpoint is never reached (at least in released gdb) and makes it
hard to debug cygwin's threading stuff, probably impossible in this area.

This means that to be able to debug the cygwin dll in this area I have to
recompile a special cygwin version with something like the change below:

/* FIXME: write and test process shared mutex's.  */
extern "C" int
pthread_mutexattr_init (pthread_mutexattr_t *attr)

old:
  if (pthread_mutexattr::is_good_object (attr))
    return EBUSY;

new:
  if (attr && pthread_mutexattr::is_good_object (attr))
    return EBUSY;

BTW: This is not meant to hurt anyone or to discredit anyone's work.
We all try our best to make cygwin as good as possible. It is only that
in this area things could be done better :-)

Regards

Ralf

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEdE8ooHh+5t8EXncRAhnRAKCfbhfNKawy70+t18zk56M3WHzuLACeJR1C
2WLX0BBt5N7efXQWuav0tNk=
=xZn9
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: 1.5.19: changes have broken Qt3
  2006-05-24 12:10               ` clayne
@ 2006-05-24 12:28                 ` Dave Korn
  0 siblings, 0 replies; 36+ messages in thread
From: Dave Korn @ 2006-05-24 12:28 UTC (permalink / raw)
  To: cygwin

On 24 May 2006 13:07, clayne@anodized.com wrote:

> On Wed, May 24, 2006 at 11:40:58AM +0100, Dave Korn wrote:
>>> Actually, is this really a fault in gdb? Cygwin is throwing a SIGSEGV
>>> signal, correct? GDB does what it's told, stops on SIGSEGV by default.
>>> 
>>> -cl
>> 
>>   But it doesn't interact properly with cygwin's exception handling ->
>> signal mechanism, and the task dies, when it should just run on.
>> 
>>   Anyone who's doing any serious debugging on Cygwin very seriously wants
>> to build their own gdb and insight from current CVS.  It's much improved
>> of late. 
> 
> Right, or bless their sanity - as it won't last long. But I'm just trying to
> argue that it's no shortcoming of gdb that it catches SIGSEGV signals
> which are being artificially generated by cygwin.

  It's not a SIGSEGV.  It's actually a protection fault, which is one variety
of exception.  Cygwin's job is to catch the faulting access using a structured
exception handler and translate it into a signal - if, and only if, the
exception does in fact represent an event that should be reported to userland
as a signal.  If, as in this case, the exception is part of cygwin's internal
access checking and should not be reported as a signal to userland, cygwin's
job is to catch the faulting access and /not/ translate it into a signal.

  Now bear in mind that we're not talking about any old version of gdb here:
we're very specifically discussing the cygwin-targeted version of gdb.  That
means it should understand about cygwin's signal handling mechanism, and it
should know that some exceptions will be translated into signals and others
will not, and it should leave cygwin to handle that and only report a SIGSEGV
(or any other kind of signal) when cygwin decides to turn a particular
exception into a signal, and not when it doesn't.  It's an old
hangover/win32-ism that it intercepts all the windows exceptions and attempts
to interpret them to the user as SIGs of some kind; might make sense if it was
attempting to debug a windows native (msvcrt-based) application, but does not
make sense for a cygwin app these days; long long ago, it was a reasonable
design compromise when gdb was first being targeted at wintel platforms.

> What's the design rationale for the whole 'check for uninitialized space
> and segfault if uninitialized' approach when it comes to statically
> initialized pthreads objects in the first place, btw? Why not just have
> pthread_mutex_t (for example) actually be a pthread_mutex_t instead of a
> typedef'd pointer to the real pthread_mutex_t? Why dynamically allocate
> space for it at all via "check bunk memory -> throw fault -> alloc real
> memory for real pthread_mutex_t", as opposed to "initialize the mutex ->
> if bunk space, segfault as usual"?

  In order to comply with the complex way the POSIX spec allows you to have
*either* static linktime initialisation *or* dynamic runtime initialisation of
all these objects.
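
The two initialisation styles that POSIX permits look like this in
application code. This is a minimal sketch using only standard pthread
calls; the function names use_static_mutex, use_dynamic_mutex and check_both
are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Static, link-time initialisation: no init call runs before first use. */
static pthread_mutex_t static_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Locks and unlocks the statically initialised mutex. Returns 0 on success. */
int use_static_mutex(void)
{
    int rc = pthread_mutex_lock(&static_mutex);
    if (rc != 0)
        return rc;
    return pthread_mutex_unlock(&static_mutex);
}

/* Dynamic, run-time initialisation: explicit init and destroy calls. */
int use_dynamic_mutex(void)
{
    pthread_mutex_t m;
    int rc, rc_destroy;

    rc = pthread_mutex_init(&m, NULL);   /* NULL selects default attributes */
    if (rc != 0)
        return rc;
    rc = pthread_mutex_lock(&m);
    if (rc == 0)
        rc = pthread_mutex_unlock(&m);
    rc_destroy = pthread_mutex_destroy(&m);
    return rc != 0 ? rc : rc_destroy;
}

/* Runs both variants, in the assert-based style of the thread's testcases. */
void check_both(void)
{
    int rc_static = use_static_mutex();
    int rc_dynamic = use_dynamic_mutex();

    assert(rc_static == 0);
    assert(rc_dynamic == 0);
}
```

In the static case the library first sees the object inside
pthread_mutex_lock(), so an implementation that keeps the real mutex behind
a pointer has to recognise the initialiser value there and create the object
at that point.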

    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....


^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: 1.5.19: changes have broken Qt3
  2006-05-24 12:21               ` Ralf Habacker
@ 2006-05-24 12:31                 ` Dave Korn
  2006-05-27 20:02                   ` Ralf Habacker
  2006-05-27 19:47                 ` Brian Dessent
  1 sibling, 1 reply; 36+ messages in thread
From: Dave Korn @ 2006-05-24 12:31 UTC (permalink / raw)
  To: cygwin

On 24 May 2006 13:19, Ralf Habacker wrote:

> This breakpoint is never reached (at least in released gdb) and makes it
> hard to debug cygwin's threading stuff, probably impossible in this area.

  How many times do you have to be told?  The last released gdb is known to
not cope with this.  IT IS A KNOWN BUG.  IT HAS BEEN FIXED.

  No, nobody has yet been able to travel backward in time and fix it in
earlier versions of gdb from before the bug was fixed.  Sorry, but this is not
the fault of the cygwin project.  Please report this to the
bug-reality@${DEITY}.god mailing list!

> This means that to be able to debug the cygwin dll in this area I have to
> recompile a special cygwin version with something like the change below:

  No, in order to debug the cygwin dll you have to use UP TO DATE gdb.

    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 1.5.19: changes have broken Qt3
  2006-05-23 17:10 Re:1.5.19: changes have broken Qt3 Ralf Habacker
  2006-05-23 17:31 ` 1.5.19: " Brian Dessent
  2006-05-23 17:33 ` Dave Korn
@ 2006-05-24 15:38 ` Ralf Habacker
  2 siblings, 0 replies; 36+ messages in thread
From: Ralf Habacker @ 2006-05-24 15:38 UTC (permalink / raw)
  To: cygwin

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ralf Habacker wrote:
> Hi all,
>  
> If this were my project I would add such unit test cases as far as
> possible. Because pthread-win32 is also hosted on sources.redhat.com it
> may be possible to relicense the test applications to cygwin more easily
> than other external sources.
> 

No need for this, the related pthread functions are already in the cvs
dir. See src/winsup/testsuite

Running
/home/Habacker/src/extern/cygwin.com/src/winsup/testsuite/winsup.api/cygload.exp
...
FAIL: cygload (execute)
Running
/home/Habacker/src/extern/cygwin.com/src/winsup/testsuite/winsup.api/winsup.exp
...
FAIL: msgtest.c (execute)
FAIL: resethand.c (execute)
FAIL: semtest.c (execute)
FAIL: shmtest.c (execute)

                === winsup Summary ===

# of expected passes            270
# of unexpected failures        5
# of expected failures          8


Regards
 Ralf
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEdH0GoHh+5t8EXncRAoesAJ4gHbZ2OKdciNcj/9sChnAkKAP7RwCeM/XW
t2kzO62zKpUx4KoNtareNVQ=
=5g6q
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 1.5.19: changes have broken Qt3
  2006-05-24  9:23             ` Brian Dessent
  2006-05-24 12:21               ` Ralf Habacker
@ 2006-05-24 16:13               ` Ralf Habacker
  2006-05-24 18:06                 ` Cygwin, gdb and SEH [was RE: 1.5.19: changes have broken Qt3] Dave Korn
  1 sibling, 1 reply; 36+ messages in thread
From: Ralf Habacker @ 2006-05-24 16:13 UTC (permalink / raw)
  To: cygwin

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Brian Dessent wrote:
> Ralf Habacker wrote:
> 
> And yes, it used to be that gdb was too dumb to recognise that these
> faults in IsBadReadPtr were not actual faults, and it would print them
> as spurious SIGSEGVs, just as it currently does for "myfault"s.  Then it
> was patched to ignore faults in kernel32.dll.  Now that the handler is
> in cygwin1.dll, it had to be taught to ignore faults there too, and if
> you use a CVS GDB, it does.

These kinds of exceptions are handled completely in cygwin itself. Is
there no way to limit these exceptions to the cygwin library itself and
to hide them from everything else?

The way these exceptions are handled looks to me like a specific
implementation detail, which will worry users more than it helps them
find problems in an application.

And what about debugging cygwin itself? It seems to me that it should be
possible to disable this exception handling code.

You may say that this has to be done by gdb, but what about strace? Do I
need to run strace through gdb to avoid such exception messages? Or
will strace be patched too, to hide such messages?

Remember the previously listed examples, where those messages make up
about 70% of the whole output of a straced application.
Because these exception addresses are located in the cygwin dll, they will
produce many, many needless support requests to the cygwin mailing list
(as I experienced) and will burden the support people, who could instead
give support for the real problems, and will eat up the developers'
time.

And what about the use of other windows debuggers? Do they also have
such specific exception-hiding support? If not, will there be a
manual on how to disable these internal messages?

In summary, I don't think that patching gdb is the best solution. It
would be better to limit these exceptions to the cygwin dll and to hide
these messages from everything else.

Maybe an option in the CYGWIN environment variable, or another one, to
enable such messages for debugging purposes would be useful.

Regards
Ralf

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEdIYToHh+5t8EXncRAjT8AJ4tnrVX6EDj5rynw8MPgd5TXAWeBwCfXVrU
wogfOq23tMiXfHoUTKorKR8=
=J6R+
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Cygwin, gdb and SEH [was RE: 1.5.19: changes have broken Qt3]
  2006-05-24 16:13               ` Ralf Habacker
@ 2006-05-24 18:06                 ` Dave Korn
  2006-05-25  3:05                   ` clayne
  0 siblings, 1 reply; 36+ messages in thread
From: Dave Korn @ 2006-05-24 18:06 UTC (permalink / raw)
  To: cygwin

On 24 May 2006 17:13, Ralf Habacker wrote:

> Brian Dessent wrote:
>> Ralf Habacker wrote:
>> 
>> And yes, it used to be that gdb was too dumb to recognise that these
>> faults in IsBadReadPtr were not actual faults, and it would print them
>> as spurious SIGSEGVs, just as it currently does for "myfault"s.  Then it
>> was patched to ignore faults in kernel32.dll.  Now that the handler is
>> in cygwin1.dll, it had to be taught to ignore faults there too, and if
>> you use a CVS GDB, it does.
> 
> These kinds of exceptions are handled completely in cygwin itself. Is
> there no way to limit these exceptions to the cygwin library itself and
> to hide them from everything else?

  <smacks forehead>  I'm going to have to start shouting now, because you
clearly can't hear or aren't listening to anything that's being said. 

  YES, THERE IS A WAY!

  WHAT IS MORE YOU HAVE ALREADY HAD IT EXPLAINED TO YOU A DOZEN TIMES IN THIS
THREAD! 

  THE WAY IS TO USE AN UP-TO-DATE GDB!

  <takes deep breath>

  You have everything back to front.  The problem is not for cygwin to hide
these exceptions from gdb; the problem is for gdb not to jump in ahead of
cygwin and intercept them.  *That* is why fixing gdb is the right thing.

> The way these exceptions are handled looks to me like a specific
> implementation detail, which will worry users more than it helps them
> find problems in an application.

  Yes, that is why GDB has been patched.  However, there is no way on earth
that an old out-of-date gdb could know about some trick that was only
introduced into the cygwin source years later.  We aim for backward
compatibility, forward is trickier.

> You may say that this has to be done by gdb, but what about strace? Do I
> need to run strace through gdb to avoid such exception messages? Or
> will strace be patched too, to hide such messages?

  Learn to use "grep -v" or RTFM about the --mask option to strace.  The fact
that *you* do not want to see these messages does not mean they are not useful
for others.

> Remember the previously listed examples, where those messages make up
> about 70% of the whole output of a straced application.

  If you attempt to misuse strace as a tool for debugging your applications,
you will run into this kind of problem, and it will be your fault.  RTFM:

http://cygwin.com/cygwin-ug-net/using-utils.html#strace

quite clearly states "This program is mainly useful for debugging the Cygwin
DLL itself."

> Because these exception addresses are located in the cygwin dll, they will
> produce many, many needless support requests to the cygwin mailing list
> (as I experienced)

  See, this is the *real* problem: you read an obscure internal debugging
message emitted by what is effectively a kernel debugging tool, but then you
just guessed at what it signified, instead of finding out.  The leap to the
false conclusion was yours.

> and will burden the support people, who could instead

  LOL!  What support people?  There are none.

> give support for the real problems, and will eat up the developers'
> time.

  You appear to be under the same misunderstanding as the guy from yesterday
who thought cygwin might have a use for "market research".  There is no
company, no support team, and any developers whose employers might assign them
to work on anything related to cygwin are under no obligation to work on other
people's problems.

> And what about the use of other windows debuggers? Do they also have
> such specific exception-hiding support? If not, will there be a
> manual on how to disable these internal messages?

  Cygwin does not support the use of "other" debuggers.  Cygwin is based
around the GNU toolchain.  We do not attempt compatibility with MSVC or
WinDbg.

> In summary, I don't think that patching gdb is the best solution.

  However your conclusions are based on a faulty understanding of the
situation:

> It
> would be better to limit these exceptions to the cygwin dll and to hide
> these messages from everything else.

  The issue is not "limiting" the exceptions, which are already and always
have been innately limited by their very nature; nothing further up the stack
than the SEH handler frame will ever know the slightest thing about them.  It
is only when a *DEBUGGER* is controlling the flow of the program that it is
even *possible* for anything to get in there ahead of the standard win32
exception handling mechanism that cygwin is already using to do what you ask.

  I recommend you do not post to this thread again until you have read 

1)  The cygwin user guide 
2)  The cygwin FAQ
3)  The MSDN documentation about __try ... __except
4)  Matt Pietrek's classic article from MSJ '97 about the internals of SEH.

because you're just repeating yourself now.

    cheers,
      DaveK
--
Can't think of a witty .sigline today....


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Cygwin, gdb and SEH [was RE: 1.5.19: changes have broken Qt3]
  2006-05-24 18:06                 ` Cygwin, gdb and SEH [was RE: 1.5.19: changes have broken Qt3] Dave Korn
@ 2006-05-25  3:05                   ` clayne
  2006-05-25  9:23                     ` Dave Korn
  2006-05-25 15:25                     ` mwoehlke
  0 siblings, 2 replies; 36+ messages in thread
From: clayne @ 2006-05-25  3:05 UTC (permalink / raw)
  To: cygwin

On Wed, May 24, 2006 at 07:06:32PM +0100, Dave Korn wrote:
>   YES, THERE IS A WAY!
> 
>   WHAT IS MORE YOU HAVE ALREADY HAD IT EXPLAINED TO YOU A DOZEN TIMES IN THIS
> THREAD! 
> 
>   THE WAY IS TO USE AN UP-TO-DATE GDB!

BTW:

Myself, I had just updated to CVS gdb. Currently it looks like SIGINT is busted
(well, at least initiating via ctrl-c) and performance under gdb is crap
(probably because I'm trying to debug something with millions of objects, each
with its own mutex).

-cl


^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: Cygwin, gdb and SEH [was RE: 1.5.19: changes have broken Qt3]
  2006-05-25  3:05                   ` clayne
@ 2006-05-25  9:23                     ` Dave Korn
  2006-05-25 15:25                     ` mwoehlke
  1 sibling, 0 replies; 36+ messages in thread
From: Dave Korn @ 2006-05-25  9:23 UTC (permalink / raw)
  To: cygwin

On 25 May 2006 04:05, clayne@anodized.com wrote:

> On Wed, May 24, 2006 at 07:06:32PM +0100, Dave Korn wrote:
>>   YES, THERE IS A WAY!
>> 
>>   WHAT IS MORE YOU HAVE ALREADY HAD IT EXPLAINED TO YOU A DOZEN TIMES IN
>> THIS THREAD! 
>> 
>>   THE WAY IS TO USE AN UP-TO-DATE GDB!
> 
> BTW:
> 
> Myself, I had just updated to CVS gdb. Currently it looks like SIGINT is
> busted (well, at least initiating via ctrl-c)

  PPAST?  I just wrote a main() with two printfs surrounding a sleep(5) call.
I ran it under gdb 6.5.50.20060523-cvs in bash in a dos console and when it
was in the sleep I pressed ctrl-c and got a SIGINT and control back.

    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Cygwin, gdb and SEH [was RE: 1.5.19: changes have broken Qt3]
  2006-05-25  3:05                   ` clayne
  2006-05-25  9:23                     ` Dave Korn
@ 2006-05-25 15:25                     ` mwoehlke
  1 sibling, 0 replies; 36+ messages in thread
From: mwoehlke @ 2006-05-25 15:25 UTC (permalink / raw)
  To: cygwin

clayne@<PCYMTNUYEAAYNUYRWTBS*> wrote:
> BTW:
> 
> Myself, I had just updated to CVS gdb. Currently it looks like SIGINT is busted
> (well, at least initiating via ctrl-c) and performance under gdb is crap
> (probably because I'm trying to debug something with millions of objects, each
> with its own mutex).

Hmm, build problem maybe? Dave has the right idea. As for speed, do you 
know if you built with optimizations or with debug symbols? That might 
make a difference in speed...

(* Please Configure Your Mailer To Not Use Your E-mail Address As Your 
Name Unless You Really *Want* To Be Spammed)

-- 
Matthew
...Ruthlessly beating Windows with a hammer until it looks like POSIX.



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 1.5.19: changes have broken Qt3
  2006-05-24 12:21               ` Ralf Habacker
  2006-05-24 12:31                 ` Dave Korn
@ 2006-05-27 19:47                 ` Brian Dessent
  2006-05-27 21:04                   ` Ralf Habacker
  1 sibling, 1 reply; 36+ messages in thread
From: Brian Dessent @ 2006-05-27 19:47 UTC (permalink / raw)
  To: Ralf Habacker; +Cc: cygwin

[ I realized that a couple of points in this thread were never addressed -- we
sort of got side tracked on the GDB issue.  I just want to reply to these points
and try to convince you that this bug you see does not exist.  People have a
tendency to point to the archives and say "lookee, it's broken" if the thread
does not come to a result. ]

Ralf Habacker wrote:

> You said that the testcase runs, yes, but have you tried to debug the
> cygwin dll with this exception handling? Please start the above-mentioned
> testcase in gdb and enter
> 
> b main
> r
> b pthread_mutexattr::pthread_mutexattr()
> c
> 
> This breakpoint is never reached (at least in released gdb) and makes it
> hard to debug cygwin's threading stuff, probably impossible in this area.

The breakpoint does not fire, correct.  But that is because pthread_mutexattr's
constructor is empty (other than the initialization list):

pthread_mutexattr::pthread_mutexattr ()
  : verifyable_object (PTHREAD_MUTEXATTR_MAGIC),
    pshared (PTHREAD_PROCESS_PRIVATE),
    mutextype (PTHREAD_MUTEX_ERRORCHECK)
{
}

If instead you set a breakpoint for the desired line that calls the constructor
(in this case, thread.cc:3027) then it does fire.  And if you use a CVS GDB you
get no spurious faults either.

> This means that to be able to debug the cygwin dll in this area I have to
> recompile a special cygwin version with something like the change below:
> 
> /* FIXME: write and test process shared mutex's.  */
> extern "C" int
> pthread_mutexattr_init (pthread_mutexattr_t *attr)
> 
> old:
>   if (pthread_mutexattr::is_good_object (attr))
>     return EBUSY;
> 
> new:
>   if (attr && pthread_mutexattr::is_good_object (attr))
>     return EBUSY;

This is totally useless.  For "if (attr)" to be false, the function would
have had to be called as pthread_mutexattr_init (NULL) rather than
pthread_mutexattr_init (&some_as_yet_uninitialized_variable).  Furthermore, if
attr really were NULL, then the next line:

  *attr = new pthread_mutexattr ();

would cause a NULL dereference which would not be caught, causing the program to
crash and burn.  The function must always be passed a valid pointer; the thing
it points to might be uninitialized though.

Let's walk through the complete series of events that happens in the testcase
below:

pthread_mutexattr_t mxAttr;
assert(pthread_mutexattr_init(&mxAttr) == 0);

This is the thing that you claim is broken, however if you run this testcase
from a regular prompt (outside GDB) it does not assert, and in fact the
mutexattr is correctly initialized.  (And if you do run it in a recent GDB it
does not assert nor fault either.)

Let's look at the entire chain of code involved here:

extern "C" int
pthread_mutexattr_init (pthread_mutexattr_t *attr)
{
  if (pthread_mutexattr::is_good_object (attr))
    return EBUSY;

  *attr = new pthread_mutexattr ();
  if (!pthread_mutexattr::is_good_object (attr))
    {
      delete (*attr);
      *attr = NULL;
      return ENOMEM;
    }
  return 0;
}

inline bool
pthread_mutexattr::is_good_object (pthread_mutexattr_t const * attr)
{
  if (verifyable_object_isvalid (attr, PTHREAD_MUTEXATTR_MAGIC) !=
                                         VALID_OBJECT)
    return false;
  return true;
}

static inline verifyable_object_state
verifyable_object_isvalid (void const * objectptr, long magic,
                           void *static_ptr1, void *static_ptr2,
                           void *static_ptr3)
{
  verifyable_object **object = (verifyable_object **) objectptr;

  myfault efault;
  if (efault.faulted ())
    return INVALID_OBJECT;

  if ((static_ptr1 && *object == static_ptr1) ||
      (static_ptr2 && *object == static_ptr2) ||
      (static_ptr3 && *object == static_ptr3))
    return VALID_STATIC_OBJECT;
  if ((*object)->magic != magic)
    return INVALID_OBJECT;
  return VALID_OBJECT;
}

So, the call chain will look like this:

pthread_mutexattr_init(&mxAttr)  ->  
 pthread_mutexattr::is_good_object (&mxAttr)  -> 
  verifyable_object_isvalid (&mxAttr, PTHREAD_MUTEXATTR_MAGIC, NULL,NULL,NULL)

Of course, these last two functions will be expanded inline, so this will all
occur in the context of pthread_mutexattr_init.  We are at the point in
verifyable_object_isvalid of:

  if ((*object)->magic != magic)

Here, object is &mxAttr, so *object is mxAttr.  But mxAttr is not yet
initialized, so dereferencing it as mxAttr->magic causes a fault.  This causes
verifyable_object_isvalid to return INVALID_OBJECT through the "if
(efault.faulted ())" branch.

Consequently, the if() condition in pthread_mutexattr::is_good_object is true,
the function returns false, the if() condition at the beginning of
pthread_mutexattr_init is false, and execution continues to the line "*attr =
new pthread_mutexattr ()", and finally mxAttr is initialized just as we desire.

I hope that I have shown that even though a fault occurs, execution
continues normally and the mutexattr IS initialized correctly.
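
The handle-plus-magic pattern walked through above can be modelled in a few
lines. This is a simplified sketch under stated assumptions and not the
Cygwin source: the names attr_object, attr_handle, ATTR_MAGIC, attr_init and
attr_destroy are hypothetical, and the myfault guard is replaced by a plain
NULL check, so only a zero-initialised handle is recognised as "not yet
initialised" (a handle full of stack garbage would need the fault guard).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define ATTR_MAGIC 0x5A17L          /* hypothetical magic value */

/* The real object lives on the heap and is tagged with a magic number. */
typedef struct attr_object {
    long magic;
    int  type;
} attr_object;

/* What the application holds is only a pointer to the real object. */
typedef attr_object *attr_handle;

/* Returns 1 if *h points at a live, correctly tagged object, else 0. */
static int is_good_object(const attr_handle *h)
{
    if (*h == NULL)                 /* stand-in for the fault guard */
        return 0;
    return (*h)->magic == ATTR_MAGIC;
}

/* Mirrors the control flow of the init function discussed above. */
int attr_init(attr_handle *h)
{
    assert(h != NULL);              /* caller must pass a valid pointer */
    if (is_good_object(h))
        return EBUSY;               /* already initialised */
    *h = malloc(sizeof **h);
    if (*h == NULL)
        return ENOMEM;
    (*h)->magic = ATTR_MAGIC;
    (*h)->type = 0;
    return 0;
}

/* Invalidates the tag, frees the object and resets the handle. */
int attr_destroy(attr_handle *h)
{
    assert(h != NULL);
    if (!is_good_object(h))
        return EINVAL;
    (*h)->magic = 0;
    free(*h);
    *h = NULL;
    return 0;
}
```

Calling attr_init() twice on the same handle returns 0 and then EBUSY, which
is the behaviour the is_good_object() check at the top of
pthread_mutexattr_init is there to provide.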

Brian


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: 1.5.19: changes have broken Qt3
  2006-05-24 12:31                 ` Dave Korn
@ 2006-05-27 20:02                   ` Ralf Habacker
  0 siblings, 0 replies; 36+ messages in thread
From: Ralf Habacker @ 2006-05-27 20:02 UTC (permalink / raw)
  To: cygwin

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dave Korn wrote:
> On 24 May 2006 13:19, Ralf Habacker wrote:
> 
>> This breakpoint is never reached (at least in released gdb) and makes it
>> hard to debug cygwin's threading stuff, probably impossible in this area.
> 
>   How many times do you have to be told?  The last released gdb is known to
> not cope with this.  IT IS A KNOWN BUG.  IT HAS BEEN FIXED.
> 
I have downloaded and compiled the gdb weekly snapshot (20060522), but there
are still problems with tracing after an internal exception has occurred. I
am using testcase mutex1n.c from cvs path
src/winsup/testsuite/winsup.api/pthread and set a breakpoint on
pthread_mutexattr_init:


Breakpoint 1, pthread_mutexattr_init (attr=0x404040) at /netrel/src/cygwin-snapshot-20060522-1/winsup/cygwin/cygtls.h:253
1: x/i $eip  0x610b0f07 <pthread_mutexattr_init+7>:     lea    0xffffff08(%ebp),%esi
(gdb) ni

<snip>

0x610b1005      129     in /netrel/src/cygwin-snapshot-20060522-1/winsup/cygwin/thread.cc
1: x/i $eip  0x610b1005 <pthread_mutexattr_init+261>:   cmpl   $0xdf0df049,0x4(%eax)
(gdb)

Here the internal exception occurs, and gdb is out of sync until the
application ends or a later breakpoint is hit.

0x7c91eaf0 in ntdll!LdrDisableThreadCalloutsForDll () from /c/WINDOWS/system32/ntdll.dll
1: x/i $eip  0x7c91eaf0 <ntdll!LdrDisableThreadCalloutsForDll+4>:    mov    (%esp),%ebx
(gdb)
0x7c91eaf3 in ntdll!LdrDisableThreadCalloutsForDll () from /c/WINDOWS/system32/ntdll.dll
1: x/i $eip  0x7c91eaf3 <ntdll!LdrDisableThreadCalloutsForDll+7>:    push   %ecx
(gdb)
0x7c91eaf4 in ntdll!LdrDisableThreadCalloutsForDll () from /c/WINDOWS/system32/ntdll.dll
1: x/i $eip  0x7c91eaf4 <ntdll!LdrDisableThreadCalloutsForDll+8>:    push   %ebx
(gdb)
0x7c91eaf5 in ntdll!LdrDisableThreadCalloutsForDll () from /c/WINDOWS/system32/ntdll.dll
1: x/i $eip  0x7c91eaf5 <ntdll!LdrDisableThreadCalloutsForDll+9>:    call   0x7c9477c1 <ntdll!LdrFindCreateProcessManifest+424>
(gdb)

Program exited normally.
(gdb)

Regards
Ralf

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEeKSToHh+5t8EXncRAgxiAJsHfqsBSME06zaSaMD/kgrQH4GJAgCeMqUp
wSedYnMrgNRpkpXRuny/2YE=
=1zgp
-----END PGP SIGNATURE-----

* Re: 1.5.19: changes have broken Qt3
  2006-05-27 19:47                 ` Brian Dessent
@ 2006-05-27 21:04                   ` Ralf Habacker
  2006-05-27 23:51                     ` Brian Dessent
  0 siblings, 1 reply; 36+ messages in thread
From: Ralf Habacker @ 2006-05-27 21:04 UTC (permalink / raw)
  To: Brian Dessent; +Cc: cygwin

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Brian Dessent wrote:
> [ I realized that a couple of points in this thread were never addressed -- we
> sort of got side tracked on the GDB issue.  I just want to reply to these points
> and try to convince you that this bug you see does not exist.  People have a
> tendency to point to the archives and say "lookee, it's broken" if the thread
> does not come to a result. ]
> 
> Ralf Habacker wrote:
> 
>> You said that the testcase runs, yes, but do you have tried to debug the
>> cygwin dll with this exception handling. Please start the above
>> mentioned testcase in gdb and enter
>>
>> b main
>> r
>> b pthread_mutexattr::pthread_mutexattr()
>> c
>>
>> This breakpoint is never reached (at least in released gdb) and makes it
>> hard to debug cygwin's threading stuff, probably impossible in this area.
> 
> The breakpoint does not fire, correct.  But that is because pthread_mutexattr's
> constructor is empty (other than the initialization list):
> 
> pthread_mutexattr::pthread_mutexattr ():verifyable_object
> (PTHREAD_MUTEXATTR_MAGIC),
> pshared (PTHREAD_PROCESS_PRIVATE), mutextype (PTHREAD_MUTEX_ERRORCHECK)
> {
> }
> 
> If instead you set a breakpoint for the desired line that calls the
> constructor (in this case, thread.cc:3027) then it does fire.

If you look at the assembly level, you will find two constructors, and
the breakpoint is set on the wrong one:
0x610ad520 <_ZN17pthread_mutexattrC2Ev+0>:      push   %ebp
0x610ad560 <_ZN17pthread_mutexattrC1Ev+0>:      push   %ebp

(gdb) b pthread_mutexattr::pthread_mutexattr()
Breakpoint 3 at 0x610ad52c: -> _ZN17pthread_mutexattrC2Ev

but it is the other one that gets called:

0x610b0f7b <pthread_mutexattr_init+123>:        mov    %eax,(%esp)
0x610b0f7e <pthread_mutexattr_init+126>:        mov    %eax,%ebx
0x610b0f80 <pthread_mutexattr_init+128>:        call   0x610ad560 <_ZN17pthread_mutexattrC1Ev>
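
For context: in the Itanium C++ ABI name mangling used here, C1 marks the
complete-object constructor and C2 the base-object constructor.  Both
symbols demangle to the same source-level name, which would explain why a
breakpoint given as pthread_mutexattr::pthread_mutexattr() can end up on only
one of them.  A small sketch that shows this with abi::__cxa_demangle from
<cxxabi.h> (GCC/Clang only; the helper names demangle and check_ctor_symbols
are made up for this example):

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: demangle an Itanium-ABI symbol name, or return the
// input unchanged if demangling fails.
inline std::string demangle(const char *mangled) {
  int status = 0;
  char *out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || out == nullptr)
    return mangled;
  std::string result(out);
  std::free(out);  // __cxa_demangle returns a malloc-allocated buffer
  return result;
}

// The two constructor symbols seen in the disassembly.
inline void check_ctor_symbols() {
  const std::string c1 = demangle("_ZN17pthread_mutexattrC1Ev");  // complete object
  const std::string c2 = demangle("_ZN17pthread_mutexattrC2Ev");  // base object
  assert(c1 == c2);  // two distinct symbols, one source-level name
}
```

Because the name alone is ambiguous, breaking on the address
(b *0x610ad560) or on the calling source line sidesteps the question of which
symbol the debugger picks.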


> And if you use a CVS GDB you  get no spurious faults either.
> 
>> This means to be able to debug the cygwin dll in this area I have to
>> recompile a special cygwin version with something like below mentioned.:
>
>> /* FIXME: write and test process shared mutex's.  */
>> extern "C" int
>> pthread_mutexattr_init (pthread_mutexattr_t *attr)
>>
>> old:
>>   if (pthread_mutexattr::is_good_object (attr))
>>     return EBUSY;
>>
>> new:
>>   if (attr && pthread_mutexattr::is_good_object (attr))
>>     return EBUSY;
> 
> This is totally useless.  In order for "if (attr)" to be false, the function
> would have had to been called as pthread_mutexattr_init (NULL) rather than
> pthread_mutexattr_init (&some_as_yet_uninitialized_variable).  Furthermore, if
> attr really were false, then the next line:
> 
>   *attr = new pthread_mutexattr ();
> 
> would cause a NULL dereference which would not be caught, causing the program to
> crash and burn.  The function must always be passed a valid pointer; the thing
> it points to might be uninitialized though.
> 
> Let's walk through the complete series of events that happens in the testcase
> below:
> 
> pthread_mutexattr_t mxAttr;
> assert(pthread_mutexattr_init(&mxAttr) == 0);
> 
> This is the thing that you claim is broken, however if you run this testcase
> from a regular prompt (outside GDB) it does not assert, and in fact the
> mutexattr is correctly initialized.  (And if you do run it in a recent GDB it
> does not assert nor fault either.)
> 
> Let's look at the entire chain of code involved here:
> 
> extern "C" int
> pthread_mutexattr_init (pthread_mutexattr_t *attr)
> {
>   if (pthread_mutexattr::is_good_object (attr))
>     return EBUSY;
> 
>   *attr = new pthread_mutexattr ();
>   if (!pthread_mutexattr::is_good_object (attr))
>     {
>       delete (*attr);
>       *attr = NULL;
>       return ENOMEM;
>     }
>   return 0;
> }
> 
> inline bool
> pthread_mutexattr::is_good_object (pthread_mutexattr_t const * attr)
> {
>   if (verifyable_object_isvalid (attr, PTHREAD_MUTEXATTR_MAGIC) !=
>                                          VALID_OBJECT)
>     return false;
>   return true;
> }
> 
> static inline verifyable_object_state
> verifyable_object_isvalid (void const * objectptr, long magic, void
>                            *static_ptr1, void *static_ptr2, void *static_ptr3)
> {
>   verifyable_object **object = (verifyable_object **) objectptr;
> 
>   myfault efault;
>   if (efault.faulted ())
>     return INVALID_OBJECT;
> 
>   if ((static_ptr1 && *object == static_ptr1) ||
>       (static_ptr2 && *object == static_ptr2) ||
>       (static_ptr3 && *object == static_ptr3))
>     return VALID_STATIC_OBJECT;
>   if ((*object)->magic != magic)
>     return INVALID_OBJECT;
>   return VALID_OBJECT;
> }
> 
> So, the call chain will look like this:
> 
> pthread_mutexattr_init(&mxAttr)  ->  
>  pthread_mutexattr::is_good_object (&mxAttr)  -> 
>   verifyable_object_isvalid (&mxAttr, PTHREAD_MUTEXATTR_MAGIC, NULL,NULL,NULL)
> 
> Of course, these last two functions will be expanded inline, so this will all
> occur in the context of pthread_mutexattr_init.  We are at the point in
> verifyable_object_isvalid of:
> 
>   if ((*object)->magic != magic)
> 
> Here, object is &mxAttr, so *object is mxAttr.  But mxAttr is not yet
> initialized, so dereferencing it as mxAttr->magic causes a fault.  This causes
> verifyable_object_isvalid to return INVALID_OBJECT through the "if
> (efault.faulted ())" branch.
> 
> Consequently, the if() condition in pthread_mutexattr::is_good_object is true,
> the function returns false, the if() condition at the beginning of
> pthread_mutexattr_init is false, and execution continues to the line "*attr =
> new pthread_mutexattr ()", and finally mxAttr is initialized just as we desire.
> 
> I hope that I have shown that even though a fault occurs that execution
> continues normally and the mutexattr IS initialized correctly.

Thanks for taking the time to write up this detailed information. Much of
the confusion came from the problems gdb caused. It would be good to
have a gdb update. :-)
Using the testcases in cvs src/winsup/testsuite/ I have verified for myself
that this stuff seems to work for those simple cases (although I still
have problems with threading stuff related to qt3, more below).

There is only one case where I still believe that there may be a problem.
If a pthread_mutexattr_t is constructed on the stack and the magic class
member happens to be exactly the predefined value, pthread_mutexattr_init()
will return EBUSY although there is no good object; the match is purely
coincidental.

For this reason, the former code of pthread_mutexattr_init() used
check_valid_pointer() to check only the pointer, not the magic class
member. You and others may come to the conclusion that this can be
neglected, but it should be noted.

I still have the problem for which this thread was originally opened:
qt3 designer from the qt3-bin packages (and uic) crashes.
The strace I initially performed shows recursive internal exceptions
until the stack overflows, which account for 70% of the log in
http://cygwin.com/ml/cygwin/2006-05/msg00644.html.

If anyone wants to try: it is reproducible using gdb /usr/lib/qt3/bin/designer

I found out early on that my initial patch prevents the segfault in
verifyable_object_isvalid() when performing ((*object)->magic != magic) and
lets designer run.

The only workaround I currently have is to disable the first
is_good_object() call in pthread_mutexattr_init().

My current conclusion is that there are rare conditions in the pthread
and/or exception code that corrupt the stack, and that more work is
required to track down the problem.

Maybe I will be able to find out a bit more in the near future.

Regards
Ralf

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEeLXToHh+5t8EXncRApU9AJ41mBDC8gfCCy2Cvjz1sZoGUoop+wCgi0ny
V9l1sNE39vtWvqBHIjyMwHY=
=QQi3
-----END PGP SIGNATURE-----

* Re: 1.5.19: changes have broken Qt3
  2006-05-27 21:04                   ` Ralf Habacker
@ 2006-05-27 23:51                     ` Brian Dessent
  2006-05-28 13:22                       ` Ralf Habacker
  0 siblings, 1 reply; 36+ messages in thread
From: Brian Dessent @ 2006-05-27 23:51 UTC (permalink / raw)
  To: cygwin

Ralf Habacker wrote:

> There is only one case where I still believe that there may be a problem.
> If a pthread_mutexattr_t is constructed on the stack and the magic class
> membere is be exactly the predefined value, pthread_mutexattr_init()
> will return EBUSY, although there is no good object, it is only by random.

I believe this has been hashed out on the list before as well.  I think
the conclusion is that the app needs to check the return values so that
it can cope with this case.  I don't have a link to the thread handy.
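
As a rough sketch of what checking the return values could look like on the
application side: the helper names make_errorcheck_attr and
demo_checked_init are hypothetical, while the pthread calls are standard
POSIX.  Value-initializing the handle before the call is an assumption drawn
from the code quoted earlier in this thread (a null handle cannot point at
memory holding the magic value); it is not a documented guarantee.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical helper: initialize a mutex attribute object and report every
// failure to the caller instead of assuming success.
inline int make_errorcheck_attr(pthread_mutexattr_t *attr) {
  int rc = pthread_mutexattr_init(attr);
  if (rc != 0)
    return rc;  // e.g. EBUSY from the implementation discussed here, or ENOMEM

  rc = pthread_mutexattr_settype(attr, PTHREAD_MUTEX_ERRORCHECK);
  if (rc != 0) {
    pthread_mutexattr_destroy(attr);  // do not leak a half-configured object
    return rc;
  }
  return 0;
}

inline void demo_checked_init() {
  // Assumption: value-initializing the handle avoids stale stack contents
  // that could accidentally look like a live object.
  pthread_mutexattr_t attr = pthread_mutexattr_t();
  int rc = make_errorcheck_attr(&attr);
  assert(rc == 0);
  rc = pthread_mutexattr_destroy(&attr);
  assert(rc == 0);
}
```

An application that gets a nonzero result can then decide whether to retry
with a cleared handle or report the error, rather than using an attribute
object that was never set up.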

> My currently conclusion is that there are rare conditions in the pthread
> and/or exception stuff, which corrupts the stack and that there is more
> work required to find out the problem.

This I don't doubt, but I think it will just require someone digging
in.  If you can whittle down the qt3 stack overflow crash to a testcase,
then there's a good chance someone reading will give it more attention.

Brian

* Re: 1.5.19: changes have broken Qt3
  2006-05-27 23:51                     ` Brian Dessent
@ 2006-05-28 13:22                       ` Ralf Habacker
  2006-05-28 19:04                         ` Ralf Habacker
                                           ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Ralf Habacker @ 2006-05-28 13:22 UTC (permalink / raw)
  To: cygwin

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Brian Dessent wrote:
> Ralf Habacker wrote:
> 
>> There is only one case where I still believe that there may be a problem.
>> If a pthread_mutexattr_t is constructed on the stack and the magic class
>> membere is be exactly the predefined value, pthread_mutexattr_init()
>> will return EBUSY, although there is no good object, it is only by random.
> 
> I believe this has been hashed out on the list before as well.  I think
> the conclusion is that the app needs to check the return values so that
> it can cope with this case.  I don't have a link to the thread handy.
> 
>> My currently conclusion is that there are rare conditions in the pthread
>> and/or exception stuff, which corrupts the stack and that there is more
>> work required to find out the problem.
> 
> This I don't doubt, but I think it will just require someone digging
> in.  If you can whittle down the qt3 stack overflow crash to a testcase,
> then there's a good chance someone reading will give it more attention.

I just downloaded Cygwin snapshot 2005-06-27 and got designer and uic
running without any problem, so it looks like there is no need to dig
deeper into this. I will follow up if this problem occurs again.

Regards
Ralf

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEeYPUoHh+5t8EXncRAp+gAKCCzeyZYaEuwvRiaDL2FB/hYGPAjQCeJSH2
/BqF8kRfoLwrWOBQMLA21wg=
=PPbt
-----END PGP SIGNATURE-----

* Re: 1.5.19: changes have broken Qt3
  2006-05-28 13:22                       ` Ralf Habacker
@ 2006-05-28 19:04                         ` Ralf Habacker
  2006-05-28 19:05                         ` clayne
  2006-05-28 22:45                         ` Yaakov S (Cygwin Ports)
  2 siblings, 0 replies; 36+ messages in thread
From: Ralf Habacker @ 2006-05-28 19:04 UTC (permalink / raw)
  To: cygwin

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ralf Habacker wrote:
> Brian Dessent wrote:
>>> Ralf Habacker wrote:
>>>
>>>> There is only one case where I still believe that there may be a problem.
>>>> If a pthread_mutexattr_t is constructed on the stack and the magic class
>>>> membere is be exactly the predefined value, pthread_mutexattr_init()
>>>> will return EBUSY, although there is no good object, it is only by random.
>>> I believe this has been hashed out on the list before as well.  I think
>>> the conclusion is that the app needs to check the return values so that
>>> it can cope with this case.  I don't have a link to the thread handy.
>>>
>>>> My currently conclusion is that there are rare conditions in the pthread
>>>> and/or exception stuff, which corrupts the stack and that there is more
>>>> work required to find out the problem.
>>> This I don't doubt, but I think it will just require someone digging
>>> in.  If you can whittle down the qt3 stack overflow crash to a testcase,
>>> then there's a good chance someone reading will give it more attention.
> 
> I just downloaded cywin snapshot 2005-06-27 and got running designer and
>  uic without any problem, so it looks like there is no need to deep
> more into this stuff. I will follow the next time if this problems takes
> places again.

Many thanks to Chris and Gary. :-)

Ralf
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEeeqQoHh+5t8EXncRAmMNAJ4s2vN/ZW74vbwursRd00KxFqMNjgCdGwaV
FKdBdtiuyZoHAHkgn10zWw8=
=tVok
-----END PGP SIGNATURE-----

* Re: 1.5.19: changes have broken Qt3
  2006-05-28 13:22                       ` Ralf Habacker
  2006-05-28 19:04                         ` Ralf Habacker
@ 2006-05-28 19:05                         ` clayne
  2006-05-28 19:06                           ` clayne
  2006-05-28 20:34                           ` Christopher Faylor
  2006-05-28 22:45                         ` Yaakov S (Cygwin Ports)
  2 siblings, 2 replies; 36+ messages in thread
From: clayne @ 2006-05-28 19:05 UTC (permalink / raw)
  To: cygwin

On Sun, May 28, 2006 at 01:04:52PM +0200, Ralf Habacker wrote:
> I just downloaded cywin snapshot 2005-06-27 and got running designer and
>  uic without any problem, so it looks like there is no need to deep
> more into this stuff. I will follow the next time if this problems takes
> places again.
> 
> Regards
> Ralf

That's just a tad old there, Ralf.
Tried any of the more recent snapshots at http://www.cygin.com/snapshots/ ?

-cl

* Re: 1.5.19: changes have broken Qt3
  2006-05-28 19:05                         ` clayne
@ 2006-05-28 19:06                           ` clayne
  2006-05-28 20:34                           ` Christopher Faylor
  1 sibling, 0 replies; 36+ messages in thread
From: clayne @ 2006-05-28 19:06 UTC (permalink / raw)
  To: cygwin

> That's just a tad old there, Ralf.
> Tried any of the more recent snapshots at http://www.cygin.com/snapshots/ ?

Mistype.

http://www.cygwin.com/snapshots/

* Re: 1.5.19: changes have broken Qt3
  2006-05-28 19:05                         ` clayne
  2006-05-28 19:06                           ` clayne
@ 2006-05-28 20:34                           ` Christopher Faylor
  1 sibling, 0 replies; 36+ messages in thread
From: Christopher Faylor @ 2006-05-28 20:34 UTC (permalink / raw)
  To: cygwin

On Sun, May 28, 2006 at 12:04:38PM -0700, clayne@anodized.com wrote:
>On Sun, May 28, 2006 at 01:04:52PM +0200, Ralf Habacker wrote:
>> I just downloaded cywin snapshot 2005-06-27 and got running designer and
>>  uic without any problem, so it looks like there is no need to deep
>> more into this stuff. I will follow the next time if this problems takes
>> places again.
>
>That's just a tad old there, Ralf.
>Tried any of the more recent snapshots at http://www.cygin.com/snapshots/ ?

Since Ralf couldn't actually download a Cygwin snapshot from 2005-06-27,
this was obviously a typo.  He meant yesterday's snapshot.

He's been around long enough to know where to get snapshots if he needs
them.

cgf

* Re: 1.5.19: changes have broken Qt3
  2006-05-28 13:22                       ` Ralf Habacker
  2006-05-28 19:04                         ` Ralf Habacker
  2006-05-28 19:05                         ` clayne
@ 2006-05-28 22:45                         ` Yaakov S (Cygwin Ports)
  2006-05-28 23:25                           ` Yaakov S (Cygwin Ports)
  2 siblings, 1 reply; 36+ messages in thread
From: Yaakov S (Cygwin Ports) @ 2006-05-28 22:45 UTC (permalink / raw)
  To: cygwin

Ralf Habacker wrote:
> I just downloaded cywin snapshot 2005-06-27 and got running designer and
>  uic without any problem, so it looks like there is no need to deep
> more into this stuff. I will follow the next time if this problems takes
> places again.

I can confirm that the 2006-Jun-27 snapshot (which is what he meant)
fixes the longstanding issues WRT qt3 and company.  Thanks to all who
helped figure this out, and I look forward to restarting work on qt3,
qt4, and KDE 3.5 in the near future.


Yaakov

* Re: 1.5.19: changes have broken Qt3
  2006-05-28 22:45                         ` Yaakov S (Cygwin Ports)
@ 2006-05-28 23:25                           ` Yaakov S (Cygwin Ports)
  0 siblings, 0 replies; 36+ messages in thread
From: Yaakov S (Cygwin Ports) @ 2006-05-28 23:25 UTC (permalink / raw)
  To: cygwin

Yaakov S (Cygwin Ports) wrote:
> I can confirm that the 2006-Jun-27 snapshot (which is what he meant)
> fixes the longstanding issues WRT qt3 and company.  Thanks to all who
> helped figure this out, and I look forward to restarting work on qt3,
> qt4, and KDE 3.5 in the near future.

Sigh.  That's 2006-MAY-27.


Yaakov

* 1.5.19: changes have broken Qt3
@ 2006-02-14  3:36 Yaakov S (Cygwin Ports)
  0 siblings, 0 replies; 36+ messages in thread
From: Yaakov S (Cygwin Ports) @ 2006-02-14  3:36 UTC (permalink / raw)
  To: cygwin

It would appear that changes to the cygwin1.dll since 1.5.18-1 (and 
before the 20051207 snapshot) have broken Qt3.  The relevant threads 
until now:

http://www.cygwin.com/ml/cygwin-xfree/2005-12/msg00026.html
http://www.cygwin.com/ml/cygwin-xfree/2006-02/msg00005.html

The bottom line is that applications dependent on Qt threads, which 
worked with 1.5.18, now crash silently.  This also breaks building qt3 
from source, as uic is similarly affected.

Interestingly, running designer-qt3 (an alias for 
/usr/lib/qt3/bin/designer) on 1.5.19-4 uniquely raises the following error:

      17 [main] designer-qt3 1628 _cygtls::handle_exceptions: Error while dumping state (probably corrupted stack)
Segmentation fault (core dumped)

Using the 20060212 snapshot dll gives similar results; in the case of 
designer:

      12 [main] designer-qt3 808 _cygtls::handle_exceptions: Error while dumping state (probably corrupted stack)
  594884 [main] designer-qt3 808 _cygtls::handle_exceptions: Error while dumping state (probably corrupted stack)
 2157249 [main] designer-qt3 808 _cygtls::handle_exceptions: Error while dumping state (probably corrupted stack)
Segmentation fault (core dumped)

Running scribus, which exits silently on 1.5.19-4, with the 
aforementioned snapshot also gives a similar error.

Note that non-threaded Qt3 apps (e.g. qtconfig) do work with all three 
mentioned versions of cygwin1.dll.

I would like very much to get to the bottom of this.  If one of the 
Cygwin developers could please let me know what other information I can 
provide to help solve this, I'd be glad to provide it.

Yaakov

end of thread, other threads:[~2006-05-28 23:05 UTC | newest]

Thread overview: 36+ messages
2006-05-23 17:10 Re:1.5.19: changes have broken Qt3 Ralf Habacker
2006-05-23 17:31 ` 1.5.19: " Brian Dessent
2006-05-23 17:33 ` Dave Korn
2006-05-23 18:16   ` Ralf Habacker
2006-05-23 18:24     ` Christopher Faylor
2006-05-23 19:23       ` Ralf Habacker
2006-05-23 20:10         ` Christopher Faylor
2006-05-24  8:56           ` Ralf Habacker
2006-05-24  8:50         ` Brian Dessent
2006-05-24  9:01           ` Ralf Habacker
2006-05-24  9:23             ` Brian Dessent
2006-05-24 12:21               ` Ralf Habacker
2006-05-24 12:31                 ` Dave Korn
2006-05-27 20:02                   ` Ralf Habacker
2006-05-27 19:47                 ` Brian Dessent
2006-05-27 21:04                   ` Ralf Habacker
2006-05-27 23:51                     ` Brian Dessent
2006-05-28 13:22                       ` Ralf Habacker
2006-05-28 19:04                         ` Ralf Habacker
2006-05-28 19:05                         ` clayne
2006-05-28 19:06                           ` clayne
2006-05-28 20:34                           ` Christopher Faylor
2006-05-28 22:45                         ` Yaakov S (Cygwin Ports)
2006-05-28 23:25                           ` Yaakov S (Cygwin Ports)
2006-05-24 16:13               ` Ralf Habacker
2006-05-24 18:06                 ` Cygwin, gdb and SEH [was RE: 1.5.19: changes have broken Qt3] Dave Korn
2006-05-25  3:05                   ` clayne
2006-05-25  9:23                     ` Dave Korn
2006-05-25 15:25                     ` mwoehlke
2006-05-24 10:06           ` 1.5.19: changes have broken Qt3 clayne
2006-05-24 11:00             ` Dave Korn
2006-05-24 12:10               ` clayne
2006-05-24 12:28                 ` Dave Korn
2006-05-24 12:18             ` Brian Dessent
2006-05-24 15:38 ` Ralf Habacker
  -- strict thread matches above, loose matches on Subject: below --
2006-02-14  3:36 Yaakov S (Cygwin Ports)
