public inbox for pthreads-win32@sourceware.org
* RE: Bug report with source code attached
@ 2000-03-01  7:38 Bossom, John
  2000-03-01  9:54 ` Dave Baggett
From: Bossom, John @ 2000-03-01  7:38 UTC
  To: 'Dave Baggett', pthreads-win32

1) you didn't include the thread code, so we really don't
   know if all the threads are still consuming resources
   (and running!) or not. If they are you are chewing up
   all the system resources... there should be sufficient
   checks in both your code (and pthreads-win32) to deal
   with out of memory/resource errors (i.e. checking for
   failures of malloc, etc.)

2) the stack size should not be set at all when creating a
   pthread_attr... it should have been initialized to zero.
   WinNT will default the stack size of each thread to the
   size of the stack for the main process.. the default for
   that is 1Meg... don't worry... it isn't "committed" memory...
   it'll only acquire actual memory as needed, a page at a time.

   Seems there may be a bug here... Ross, you may want to
   verify that we are not initializing the stack size intentionally...
   we should let it default to the system defaults and not
   arbitrarily place limits implicitly.
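
A minimal sketch of the stack-size point above, assuming a POSIX-conforming pthreads implementation. The helper names `default_attr_stacksize` and `run_with_stacksize` are hypothetical and not part of pthreads-win32. The first reads back whatever stack size a freshly initialised attribute object reports (the value is implementation-specific), and the second shows how a caller could request an explicit size instead of depending on the library's default.

```cpp
#include <pthread.h>
#include <cstddef>

// Thread body that does nothing; only used to exercise thread creation.
void *stack_nop(void *) { return NULL; }

// Hypothetical helper: report the stack size stored in a freshly
// initialised attribute object, or 0 if any call fails.
size_t default_attr_stacksize()
{
    pthread_attr_t attr;
    size_t size = 0;
    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;
    pthread_attr_destroy(&attr);
    return size;
}

// Hypothetical helper: create one thread with an explicitly requested
// stack size, join it, and return 0 or the first error code seen.
int run_with_stacksize(size_t bytes)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_attr_setstacksize(&attr, bytes);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, stack_nop, NULL);
    pthread_attr_destroy(&attr);   // safe once pthread_create has returned
    if (rc == 0)
        rc = pthread_join(tid, NULL);
    return rc;
}
```

Calling `run_with_stacksize(256 * 1024)` would request the 256K figure suggested in the original report. Whether that is a sensible value depends on the application.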



* Re: Bug report with source code attached
  2000-03-01  7:38 Bug report with source code attached Bossom, John
@ 2000-03-01  9:54 ` Dave Baggett
From: Dave Baggett @ 2000-03-01  9:54 UTC
  To: pthreads-win32

John Bossom wrote:

> 1) you didn't include the thread code, so we really don't
>    know if all the threads are still consuming resources
>    (and running!) or not. If they are you are chewing up
>    all the system resources... there should be sufficient
>    checks in both your code (and pthreads-win32) to deal
>    with out of memory/resource errors (i.e. checking for
>    failures of malloc, etc.)

I think I may have been confusing.

That wasn't a *piece* of the code that causes the crash --
that code *itself* will cause a crash. In other words, there 
is no more code to show you as far as eliciting the bug
is concerned. Of course, in my real app the threads actually
do things, but that seems to have no bearing on this crash bug.
I watered down my test case to the code I sent, and that's all
you need to cause a crash. In other words, just paste that code
into "test.cpp", compile it, and run it. Within about 15
minutes the program should crash.

I know it looks pretty unlikely, but that's what happens
on my machine (NT SP5, 1GB RAM). I'd certainly be 
interested to learn that nobody else can duplicate this crash
if that's the case -- that means I need to reinstall NT,
I suppose.

Dave


* Re: Bug report with source code attached
  2000-03-02  7:46 Bossom, John
@ 2000-03-02 16:15 ` Dave Baggett
From: Dave Baggett @ 2000-03-02 16:15 UTC
  To: pthreads-win32

John Bossom wrote:

>Secondly, since your threads are SO short, they could actually
>run to completion BEFORE you actually call pthread_detach.

I agree, this is certainly a bug in my test program. I rewrote it to
incorporate this change (and also Dave's fix of adding a
pthread_attr_destroy so as to not leak memory) and ... well, it
takes much, much longer to crash now, but it does still crash.
I've included the code below. As I said earlier, it doesn't really
matter which clause of the if statement you use -- it will crash
either way.

>[...] Select "Process" and find your process name.
>Next add appropriate counters such as Private Bytes, etc. to monitor
>your resource consumption.

OK, I did this (and also watched it under the task manager -- you can turn
on columns for things like handles).  If I run it without -l (i.e., I run it so
that the second, simpler clause of the if is taken), it does seem to leak
handles -- very, very slowly. After 2.5 hours it was up to 450 handles. This
number does steadily rise as the program runs. I don't think the crash
is caused by running out of handles -- if you write a simple test program
that creates joinable threads and never joins them, you'll see that you
can use up tens of thousands of handles before the program will die.

It seems to leak handles in -l mode as well (and at a faster rate), but I
haven't let it run long enough to be too sure.
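
One way to separate the detach path from thread creation itself is a variant that keeps threads joinable and joins each one before creating the next. This is only a sketch of that idea, assuming a POSIX-conforming pthreads implementation. `create_and_join` is a hypothetical helper name and is not taken from the test program in this thread.

```cpp
#include <pthread.h>
#include <cstddef>

// Thread body that returns immediately, like the NOP routines in the test.
void *join_nop(void *) { return NULL; }

// Hypothetical helper: run `iterations` create/join cycles with joinable
// threads. Returns 0 on success or the first non-zero error code.
int create_and_join(int iterations)
{
    for (int i = 0; i < iterations; ++i) {
        pthread_t tid;
        int rc = pthread_create(&tid, NULL, join_nop, NULL);
        if (rc != 0)
            return rc;
        // Joining gives the library a well-defined point at which to
        // release the thread's resources, with no detach race involved.
        rc = pthread_join(tid, NULL);
        if (rc != 0)
            return rc;
    }
    return 0;
}
```

If the handle count stays flat with this variant but climbs with the detached variants, that would point at the detach cleanup path rather than at thread creation.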

Dave

//
// Call with -l argument to cause a crash.
// Compiled with MSVC6 using these args:
//   -nologo -D WIN32 -D _WINDOWS -ML -MTd -GX -Od -G6 -W3 -Zi
//
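#include "pthread.h"
#include <assert.h>
#include <string.h>   // strcmp

void *NOP1(void *p) { return NULL; }

// Detach outside of assert() so the call is not compiled out under NDEBUG.
void *NOP2(void *p)
{
    int rc = pthread_detach(pthread_self());
    assert(rc == 0);
    return NULL;
}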

int
main(int argc, char **argv)
{
    bool lose = (argc == 2 && !strcmp(argv[1], "-l"));
    for (;;) {
        int retval;
        pthread_t tid;

        if (lose) {
            pthread_attr_t attr;
            retval = pthread_attr_init(&attr);
            assert(retval==0);          // success
            retval = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
            assert(retval==0);          // success
            retval = pthread_create(&tid, &attr, NOP1, NULL);
            assert(retval==0);
            retval = pthread_attr_destroy(&attr);
            assert(retval==0);
        }
        else {
            // actually, this loses too!
            retval = pthread_create(&tid, NULL, NOP2, NULL);
            assert(retval==0);          // success
        }
    }
    return 0;
}



* RE: Bug report with source code attached
@ 2000-03-02  7:46 Bossom, John
  2000-03-02 16:15 ` Dave Baggett
From: Bossom, John @ 2000-03-02  7:46 UTC
  To: 'Dave Baggett', pthreads-win32

Sorry, I didn't actually see a thread main line routine in your
code. However, it does seem to be an out of resource type error,
such as stack, memory, thread IDs, etc. I personally haven't
seen the actual released code base... I submitted the code
that had been used for the bulk of the implementation. I am
not personally active in maintaining the library...
I simply provide advice, etc. if it's called for.

So, one thing to check is to walk through when you return from
your thread routine to determine that the library does, indeed,
release ALL thread resources when the thread terminates (this
is what is supposed to happen when a detached thread terminates)

Secondly, since your threads are SO short, they could actually
run to completion BEFORE you actually call pthread_detach.
You might consider calling pthread_detach from within the actual
thread to avoid this situation. Some investigation is required
to determine what will happen on calling pthread_detach on
a thread that has actually terminated. I would surmise that since
the thread had finished, there is no chance for the library to
actually clean up its resources; hence you are leaking all the
thread's resources (memory, and actual thread IDs). Please try
moving the pthread_detach to inside your thread as in

	pthread_detach( pthread_self() );

Third, the library shouldn't be "crashing" when an out of resource
error occurs. Run the program from within the debugger and put a
breakpoint in main. Use the Performance Monitor (Start... Programs...
Administrative Tools...). Select "Process" and find your process name.
Next add appropriate counters such as Private Bytes, etc. to monitor
your resource consumption.
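
As a sketch of what handling an out-of-resource condition could look like in the caller, the following checks the return code of `pthread_create` and stops cleanly instead of asserting. It assumes a POSIX-conforming pthreads implementation. `spawn_detached` and `detached_nop` are hypothetical names chosen for illustration.

```cpp
#include <pthread.h>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Thread body that returns immediately.
void *detached_nop(void *) { return NULL; }

// Hypothetical helper: try to start `count` detached threads. Returns the
// number actually started and stores the last return code in *last_error
// (0 if every call succeeded). Stops at the first failure.
int spawn_detached(int count, int *last_error)
{
    pthread_attr_t attr;
    int started = 0;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) {
        *last_error = rc;
        return 0;
    }
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    while (rc == 0 && started < count) {
        pthread_t tid;
        rc = pthread_create(&tid, &attr, detached_nop, NULL);
        if (rc == 0) {
            ++started;
        } else {
            // pthread functions return the error code directly;
            // they do not set errno.
            fprintf(stderr, "pthread_create failed after %d threads: %s\n",
                    started, strerror(rc));
        }
    }
    pthread_attr_destroy(&attr);
    *last_error = rc;
    return started;
}
```

A caller can compare the returned count with the requested one and decide whether to back off, retry later, or report the error.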

I hope this helps,

John.



* Re: Bug report with source code attached
@ 2000-03-02  0:32 reentrant
From: reentrant @ 2000-03-02  0:32 UTC
  To: Dave Baggett, pthreads-win32

For the first case, what happens if you add a pthread_attr_destroy after the
last assert?

I haven't looked at the pthread source to verify, but some pthread init
functions allocate memory which must be deallocated by calling the associated
destroy function.
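
A minimal sketch of the init/destroy pairing described above, written as a small C++ scope guard so the destroy call cannot be forgotten. It assumes a POSIX-conforming pthreads implementation. `ScopedAttr`, `scoped_nop`, and `create_detached_once` are hypothetical names and are not part of pthreads-win32.

```cpp
#include <pthread.h>
#include <cstddef>

// Hypothetical RAII wrapper: pairs pthread_attr_init with
// pthread_attr_destroy so the attribute object is released on scope exit.
class ScopedAttr {
public:
    ScopedAttr() : status_(pthread_attr_init(&attr_)) {}
    ~ScopedAttr() { if (status_ == 0) pthread_attr_destroy(&attr_); }
    int status() const { return status_; }   // result of pthread_attr_init
    pthread_attr_t *get() { return &attr_; }
private:
    ScopedAttr(const ScopedAttr &);            // non-copyable
    ScopedAttr &operator=(const ScopedAttr &);
    pthread_attr_t attr_;
    int status_;
};

// Thread body that returns immediately.
void *scoped_nop(void *) { return NULL; }

// Create one detached thread; the attribute object is destroyed when
// `attr` goes out of scope, on every return path.
int create_detached_once()
{
    ScopedAttr attr;
    if (attr.status() != 0)
        return attr.status();
    int rc = pthread_attr_setdetachstate(attr.get(), PTHREAD_CREATE_DETACHED);
    if (rc != 0)
        return rc;
    pthread_t tid;
    return pthread_create(&tid, attr.get(), scoped_nop, NULL);
}
```

In plain C the same effect is achieved by calling `pthread_attr_destroy(&attr)` after `pthread_create` returns.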

Dave



* Re: Bug report with source code attached
  2000-02-29 23:59 Dave Baggett
@ 2000-03-01  1:10 ` Dave Baggett
From: Dave Baggett @ 2000-03-01  1:10 UTC
  To: pthreads-win32

In an earlier message, I wrote:

>If I run it without "-l", it seems to be able to run for hours without
>crashing [...]

I take this back. It will crash no matter whether you specify -l or not.
It seems I just got lucky running it without. :)

So it looks like I don't have a workaround after all.

Dave Baggett
dmb@itasoftware.com



* Bug report with source code attached
@ 2000-02-29 23:59 Dave Baggett
  2000-03-01  1:10 ` Dave Baggett
From: Dave Baggett @ 2000-02-29 23:59 UTC
  To: pthreads-win32

I have a sizeable pthreads-based server application that works well under
Linux and NT (using pthreads-win32). Under NT, however, it crashes
periodically. After several days of debugging, I have isolated the source of
the crashes to a small bit of code (included below) which looks to me to be
perfectly innocent. However, I may not understand pthreads semantics
adequately  -- it wouldn't be the first time --  so my code might be wrong.

If I compile the program below and run it with a "-l" argument ("l" as in
"lose"), I get:

  The instruction at "0x7800d557" referenced memory at "0x00000170".
  The memory could not be "written".

This happens after a delay that varies between 0 and 15 minutes.  The
instruction address never varies, but the referenced memory location
does. The debugger shows that it's dying in the malloc call in
pthread_attr_init. I.e., the heap is somehow getting corrupted.

If I run it without "-l", it seems to be able to run for hours without
crashing. Am I correct in assuming that the two modes of operation should
be equivalent (aside from the fact that I might be leaking memory by
not freeing the attrs)?

This happens with the 1999-11-02 snapshot as well as the 1999-09-17
snapshot.

I see nothing in the win32-pthreads source that looks like it could cause this.
However, I did notice that pthread_attr_init() returns you an attr that sets
the stacksize to 1K, which doesn't seem good. Explicitly setting the stacksize
won't fix the crash problem, but it still seems like you ought to default the
stack to 256K or something reasonable.

Apologies in advance if I'm just doing something stupid. :)

Dave Baggett
dmb@itasoftware.com
-------------------------------------------------------------------------------
//
// Call with -l argument to cause a crash.
// Compiled with MSVC6 using these args:
//   -nologo -D WIN32 -D _WINDOWS -ML -MTd -GX -Od -G6 -W3 -Zi
//
#include "pthread.h"
#include <assert.h>
#include <string.h>   // strcmp

void *NOP(void *p) { return NULL; }

int
main(int argc, char **argv)
{
    bool lose = (argc == 2 && !strcmp(argv[1], "-l"));
    for (;;) {
        int retval;
        pthread_t tid;

        if (lose) {
            pthread_attr_t attr;
            retval = pthread_attr_init(&attr);
            assert(retval==0);          // success
            retval = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
            assert(retval==0);          // success
            retval = pthread_create(&tid, &attr, NOP, NULL);
            assert(retval==0);
        }
        else {
            retval = pthread_create(&tid, NULL, NOP, NULL);
            assert(retval==0);          // success
            retval = pthread_detach(tid);
            assert(retval==0);
        }
    }
    return 0;
}


