public inbox for glibc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libc/30558] New: SIGEV_THREAD is badly implemented
@ 2023-06-15 19:36 stsp at users dot sourceforge.net
  2023-06-15 21:04 ` [Bug libc/30558] " adhemerval.zanella at linaro dot org
                   ` (30 more replies)
  0 siblings, 31 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-15 19:36 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

            Bug ID: 30558
           Summary: SIGEV_THREAD is badly implemented
           Product: glibc
           Version: 2.37
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: libc
          Assignee: unassigned at sourceware dot org
          Reporter: stsp at users dot sourceforge.net
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

SIGEV_THREAD seems to currently create
and destroy the thread per every timer
tick. In gdb I have the flood of

[Thread 0x7fff698b56c0 (LWP 1304252) exited]
[New Thread 0x7fff698b56c0 (LWP 1304253)]
[Thread 0x7fff698b56c0 (LWP 1304253) exited]
[New Thread 0x7fff698b56c0 (LWP 1304254)]
[Thread 0x7fff698b56c0 (LWP 1304254) exited]

messages, and under freebsd gdb simply asserts out.

While definitely not a bug, I think it would
be very good both for debugability and performance
to have just 1 thread per SIGEV_THREAD timer,
that gets destroyed only together with the timer.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
@ 2023-06-15 21:04 ` adhemerval.zanella at linaro dot org
  2023-06-16  2:23 ` stsp at users dot sourceforge.net
                   ` (29 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2023-06-15 21:04 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

Adhemerval Zanella <adhemerval.zanella at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |adhemerval.zanella at linaro dot o
                   |                            |rg

--- Comment #1 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
FreeBSD also implements SIGEV_THREAD in the same ways as glibc: with a detached
helper thread that spawns thread when kernel signal when thee time has expired
(check
https://github.com/freebsd/freebsd-src/blob/main/lib/librt/sigev_thread.c).

You can check with FreeBSD own tools:

--
$ cat <<EOF >>timer.c
#include <time.h>
#include <signal.h>
#include <unistd.h>

static void
handler(union sigval sv)
{
}

int
main(void)
{
  timer_t timerid;
  timer_create (CLOCK_REALTIME,
                &(struct sigevent) {
                  .sigev_notify = SIGEV_THREAD,
                  .sigev_notify_function = handler,
                },
                &timerid);

  timer_settime (timerid,
                 0,
                 &(struct itimerspec) {
                   { 0, 150000000 },
                   { 0, 300000000 }
                 },
                 NULL);

  sleep (1);

  timer_delete(timerid);

  return 0;
}
EOF
$ cc -Wall timer.c -o timer -lrt -lpthread
$ truss ./sigevthread 2>&1 | grep 'thread'
rtprio_thread(RTP_LOOKUP,100647,0x7fffffffda38)  = 0 (0x0)
<new thread 101341>
<new thread 101342>
<new thread 101343>
<thread 101343 exited>
<new thread 101344>
<thread 101344 exited>
<new thread 101345>
<thread 101345 exited>
<new thread 101346>
<thread 101346 exited>
<new thread 101347>
<thread 101347 exited>
<new thread 101348>
<thread 101348 exited>
<thread 101342 exited>
<thread 101341 exited>
--

One possibility is to remove the background thread, create the thread prior to
timer_create, and move the sigwaitinfo loop logic to the thread itself.  It
would require changing how timer_delete works, it needs now to signal the
thread instead to issue the syscall itself. Btw this is how musl has
implemented it.

I am not sure which is the best option, but it might worth experimenting to
remove the helper thread.  It would allow removing the
__timer_active_sigev_thread list and simplify the code, but I am not sure about
the performance implications.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
  2023-06-15 21:04 ` [Bug libc/30558] " adhemerval.zanella at linaro dot org
@ 2023-06-16  2:23 ` stsp at users dot sourceforge.net
  2023-06-16  6:44 ` stsp at users dot sourceforge.net
                   ` (28 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-16  2:23 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #2 from Stas Sergeev <stsp at users dot sourceforge.net> ---
(In reply to Adhemerval Zanella from comment #1)
> One possibility is to remove the background thread, create the thread prior
> to timer_create,

You mean, at timer_create()?
Or do you mean 1 thread per all timers?

> and move the sigwaitinfo loop logic to the thread itself. 
> It would require changing how timer_delete works, it needs now to signal the
> thread instead to issue the syscall itself. Btw this is how musl has
> implemented it.
> 
> I am not sure which is the best option, but it might worth experimenting to
> remove the helper thread.  It would allow removing the
> __timer_active_sigev_thread list and simplify the code, but I am not sure
> about the performance implications.

I would guess the performance effect may
be non-linear. For example on a low timer
rate maybe currently you save a syscall or
2 so things are fast. On a higher rate you
may get increased loadavg because of the
periodic thread re-creation, and on a yet
higher rate this may start manifesting in
a degraded performance of the entire system,
not just 1 prog.
I can try to play around this and measure
things out, but eg pid reuse problem also
bothers me a lot, i.e. I don't want my prog
to loop over the pid space indefinitely. Who
knows what system-wide problems will that
create. I've yet to see the program that
constantly runs over the pid space, so may
I assume SIGEV_THREAD just rarely used so
no one cared before?

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
  2023-06-15 21:04 ` [Bug libc/30558] " adhemerval.zanella at linaro dot org
  2023-06-16  2:23 ` stsp at users dot sourceforge.net
@ 2023-06-16  6:44 ` stsp at users dot sourceforge.net
  2023-06-16  7:29 ` stsp at users dot sourceforge.net
                   ` (27 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-16  6:44 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #3 from Stas Sergeev <stsp at users dot sourceforge.net> ---
(In reply to Adhemerval Zanella from comment #1)
> FreeBSD also implements SIGEV_THREAD in the same ways as glibc: with a
> detached helper thread that spawns thread when kernel signal when thee time
> has expired (check
> https://github.com/freebsd/freebsd-src/blob/main/lib/librt/sigev_thread.c).

Hmm, looking into sigev_service_loop()
from this example, I don't even see the
synchronization. Am I right that after
sigwaitinfo() it just spawns the detached
thread and goes for another sigwaitinfo()?
Does this mean multiple threads can be
spawned concurrently if timer handler is
lagging?
If this is so, does glibc do the same?

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (2 preceding siblings ...)
  2023-06-16  6:44 ` stsp at users dot sourceforge.net
@ 2023-06-16  7:29 ` stsp at users dot sourceforge.net
  2023-06-16  7:51 ` stsp at users dot sourceforge.net
                   ` (26 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-16  7:29 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #4 from Stas Sergeev <stsp at users dot sourceforge.net> ---
Yes, I used your example to check with
glibc, and it appears it indeed spawns
any amount of threads concurrently if
the handler sleeps.
But in this case I think that impl is
not even posix-conforming because in
that scheme you can't use timer_getoverrun()
consistently. The man page of timer_getoverrun()
explicitly says that the timer doesn't
queue more than 1 signal, so that
timer_getoverrun() can show the overrun
count between signal queuing and dequeuing.
To me that still looks crazy because
after signal dequeuing another overrun
could occur. But at least that scheme
can be made working, because linux AFAICS
"cures" that by adding such lost overruns
upon the next signal delivery.

But if you spawn multiple threads
concurrently, how are you going to use
timer_getoverrun() at all?

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (3 preceding siblings ...)
  2023-06-16  7:29 ` stsp at users dot sourceforge.net
@ 2023-06-16  7:51 ` stsp at users dot sourceforge.net
  2023-06-16 11:44 ` stsp at users dot sourceforge.net
                   ` (25 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-16  7:51 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #5 from Stas Sergeev <stsp at users dot sourceforge.net> ---
And looking into SIGEV_THREAD_ID I
wonder how to get tid from pthread_t
if glibc doesn't provide
pthread_getthreadid_np() function.

So it seems, SIGEV_SIGNAL is the only
viable option where you can at least
use timer_getoverrun() safely?
I might be really missing something
obvious here.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (4 preceding siblings ...)
  2023-06-16  7:51 ` stsp at users dot sourceforge.net
@ 2023-06-16 11:44 ` stsp at users dot sourceforge.net
  2023-06-19 17:41 ` adhemerval.zanella at linaro dot org
                   ` (24 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-16 11:44 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #6 from Stas Sergeev <stsp at users dot sourceforge.net> ---
If I could make a wishes here,
I'd say please implement SIGEV_THREAD
on top of timerfd API.
That way you avoid messing around
signals, and can get a consistent
timer_getoverrun().

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (5 preceding siblings ...)
  2023-06-16 11:44 ` stsp at users dot sourceforge.net
@ 2023-06-19 17:41 ` adhemerval.zanella at linaro dot org
  2023-06-19 18:54 ` stsp at users dot sourceforge.net
                   ` (23 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2023-06-19 17:41 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #7 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to Stas Sergeev from comment #6)
> If I could make a wishes here,
> I'd say please implement SIGEV_THREAD
> on top of timerfd API.
> That way you avoid messing around
> signals, and can get a consistent
> timer_getoverrun().

I don't think this a promising approach for POSIX timers, besides adding an
extra file descriptor under the hoods (with might interfere with
RLIMIT_NOFILE), the timerfd inherits over fork which would require keeping
track of the user file descriptor, and adding an atfork handler to close them.
We would also need to synthesize a sigval argument, which most likely would
require even extra state that kernel already provides us for free.

I created an RFC implementation to avoid the thread creation per timer trigger
[1]: timer_create now creates a thread per SIGEV_THREAD invocation, and each
thread will issue the timer callback without creating and destroying a new
thread per invocation.

The glibc allows the created thread to be cancellable or call pthread_exit, and
it is not fully specified how the POSIX timer should act in this case.  On the
current implementation, the next timer invocation will create a new thread, so
only timer_delete will unregister the kernel timer.  My RFC changes the
semantic that once the callback issues pthread_exit or the thread is cancelled,
no more timers are triggered. I think I will need to implement something like
musl that setjmp/longjmp to avoid the abort the cancellation process, although
I am not sure if is the correct or expected semantic.

Performance-wise it seems that it does use fewer CPU cycles:

--
#include <time.h>
#include <signal.h>
#include <assert.h>
#include <unistd.h>

static void
tf (union sigval arg)
{
}

int main (int argc, char *argv[])
{
  const struct itimerspec itval = {
    { 0, 150000000 },
    { 0, 150000000 },
  };

  struct sigevent sigev = {
    .sigev_notify = SIGEV_THREAD,
    .sigev_notify_function = tf,
    .sigev_notify_attributes = NULL,
    .sigev_value.sival_ptr = NULL,
  };

  enum { count = 128 };
  timer_t timers[count];

  for (int i = 0; i < count; i++)
    {
      assert (timer_create (CLOCK_REALTIME, &sigev, &timers[i]) == 0);
      assert (timer_settime (timers[i], 0, &itval, NULL) == 0);
    }

  sleep (5);

  for (int i = 0; i < count; i++)
    assert (timer_delete (timers[i]) == 0);

  sleep (2);

  return 0;
}
--

Master:

$ perf stat ./timer-bench

 Performance counter stats for './timer-bench':

            121,50 msec task-clock                #    0,017 CPUs utilized
               187      context-switches          #    1,539 K/sec
                 0      cpu-migrations            #    0,000 /sec
               187      page-faults               #    1,539 K/sec
       302.445.622      cycles                    #    2,489 GHz               
      (72,86%)
         3.089.604      stalled-cycles-frontend   #    1,02% frontend cycles
idle     (81,64%)
        47.543.776      stalled-cycles-backend    #   15,72% backend cycles
idle      (83,82%)
       287.475.386      instructions              #    0,95  insn per cycle
                                                  #    0,17  stalled cycles per
insn  (83,43%)
        48.578.487      branches                  #  399,835 M/sec             
      (72,20%)
           459.578      branch-misses             #    0,95% of all branches   
      (67,88%)

       7,001998495 seconds time elapsed

       0,013313000 seconds user
       0,150883000 seconds sys

VS my RFC branch:

x86_64-linux-gnu$ perf stat ./elf/ld.so --library-path . ../timer-bench

 Performance counter stats for './elf/ld.so --library-path . ../timer-bench':

             68,61 msec task-clock                #    0,010 CPUs utilized
             4.662      context-switches          #   67,949 K/sec
               161      cpu-migrations            #    2,347 K/sec
               326      page-faults               #    4,751 K/sec
       127.514.086      cycles                    #    1,859 GHz
         3.714.619      stalled-cycles-frontend   #    2,91% frontend cycles
idle
        10.407.556      stalled-cycles-backend    #    8,16% backend cycles
idle
        65.273.326      instructions              #    0,51  insn per cycle
                                                  #    0,16  stalled cycles per
insn
        13.918.731      branches                  #  202,866 M/sec             
      (0,46%)
     <not counted>      branch-misses                                          
      (0,00%)

       7,006109877 seconds time elapsed

       0,000000000 seconds user
       0,079625000 seconds sys

Although the extra threads might generate more context-switches/page-faults
since each one will eventually call sigwaitinfo. I have not analyzed latency,
although I would expect to also show better results.


[1]
https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/bz30558-posix_timer

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (6 preceding siblings ...)
  2023-06-19 17:41 ` adhemerval.zanella at linaro dot org
@ 2023-06-19 18:54 ` stsp at users dot sourceforge.net
  2023-06-19 19:33 ` adhemerval.zanella at linaro dot org
                   ` (22 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-19 18:54 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #8 from Stas Sergeev <stsp at users dot sourceforge.net> ---
Great news, thanks!.
So within this RFC may I use the
timer_getoverrun() safely? I suppose
so, because there seem to be no more
reentrancy into the handler ever possible,
and also no one will call sigwaitinfo()
until the handler is finished. That
seems to be enough for timer_getoverrun()
to work reliably.

Also I think the signal-related problems
are not available as internally SIGEV_THREAD_ID
is used, so I suppose every thread
receives only the signal from its own
timer. So indeed such impl seems to mostly
match what can be done with timerfd.

> The glibc allows the created thread to be cancellable or call pthread_exit,
> and it is not fully specified how the POSIX timer should act in this case.

Maybe you can assign the thread
destructor with pthread_key_create(),
which would then re-create a thread?
But there is more to it actually...
https://pubs.opengroup.org/onlinepubs/007904975/functions/timer_create.html
Quote:

If a timer is created which has evp->sigev_sigev_notify set to SIGEV_THREAD and
the attribute pointed to by evp->sigev_notify_attributes has a thread stack
address specified by a call to either pthread_attr_setstack() or
pthread_attr_setstackaddr(), the memory dedicated as a thread stack cannot be
recovered. The reason for this is that the threads created in response to a
timer expiration are created detached, or in an unspecified way if the thread
attribute's detachstate is PTHREAD_CREATE_JOINABLE.

From here we see that posix actually
enforces the current glibc behavior,
except when PTHREAD_CREATE_JOINABLE
is specified via evp->sigev_notify_attributes.
One way to do the job, retain the full
compatibility and posix compliance, is
to actually implement a proper way of
handling PTHREAD_CREATE_JOINABLE case.
Which would be only 1 joinable thread.

Of course posix specifies timer_create()
with multiple breakages, so what you do
(ignore posix) is also perfectly reasonable
in my eyes.

Whatever way you choice, it would be
good to write that posix are moro...
to open a DR to The OpenGroup.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (7 preceding siblings ...)
  2023-06-19 18:54 ` stsp at users dot sourceforge.net
@ 2023-06-19 19:33 ` adhemerval.zanella at linaro dot org
  2023-06-19 19:48 ` stsp at users dot sourceforge.net
                   ` (21 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2023-06-19 19:33 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #9 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to Stas Sergeev from comment #8)
> Great news, thanks!.
> So within this RFC may I use the
> timer_getoverrun() safely? I suppose
> so, because there seem to be no more
> reentrancy into the handler ever possible,
> and also no one will call sigwaitinfo()
> until the handler is finished. That
> seems to be enough for timer_getoverrun()
> to work reliably.

I don't see any reentrancy issue with the current approach, the background
thread that issues sigwaitinfo is created with only SIGSETXID not masked.  So
each sigwaitinfo will get one SIGTIMER from the kernel.

My RFC, on the other hand, requires installing the SIGCANCEL handler.  It works
because pthread_cancel does not actively sends the signal if the asynchronous
mode is not set (I might revise if this work for my proposed fix for BZ#12683).

> 
> Also I think the signal-related problems
> are not available as internally SIGEV_THREAD_ID
> is used, so I suppose every thread
> receives only the signal from its own
> timer. So indeed such impl seems to mostly
> match what can be done with timerfd.
> 
> > The glibc allows the created thread to be cancellable or call pthread_exit,
> > and it is not fully specified how the POSIX timer should act in this case.
> 
> Maybe you can assign the thread
> destructor with pthread_key_create(),
> which would then re-create a thread?

It would mean maintaining a list of SIGEV_THREAD timer_t so it updates the
relation of timer_t -> kernel timer_t on each thread recreation. It would not
be possible to use the pthread_t as the key for timer_* operation, so we will
always need to consult the data structure in such cases. I still prefer to musl
solution to intercept pthread_exit/pthread_cancel and do not act upon it.


> But there is more to it actually...
> https://pubs.opengroup.org/onlinepubs/007904975/functions/timer_create.html
> Quote:
> 
> If a timer is created which has evp->sigev_sigev_notify set to SIGEV_THREAD
> and the attribute pointed to by evp->sigev_notify_attributes has a thread
> stack address specified by a call to either pthread_attr_setstack() or
> pthread_attr_setstackaddr(), the memory dedicated as a thread stack cannot
> be recovered. The reason for this is that the threads created in response to
> a timer expiration are created detached, or in an unspecified way if the
> thread attribute's detachstate is PTHREAD_CREATE_JOINABLE.
> 
> From here we see that posix actually
> enforces the current glibc behavior,
> except when PTHREAD_CREATE_JOINABLE
> is specified via evp->sigev_notify_attributes.
> One way to do the job, retain the full
> compatibility and posix compliance, is
> to actually implement a proper way of
> handling PTHREAD_CREATE_JOINABLE case.
> Which would be only 1 joinable thread.
> 
> Of course posix specifies timer_create()
> with multiple breakages, so what you do
> (ignore posix) is also perfectly reasonable
> in my eyes.
> 
> Whatever way you choice, it would be
> good to write that posix are moro...
> to open a DR to The OpenGroup.

In fact, POSIX defines a semantic that is not what glibc provides since calling
pthread_cancel on a detached thread is not really specified.  One option is
bring this on patch to state that trying to cancel the POSIX timer thread is
UB, so we should not really support it.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (8 preceding siblings ...)
  2023-06-19 19:33 ` adhemerval.zanella at linaro dot org
@ 2023-06-19 19:48 ` stsp at users dot sourceforge.net
  2023-06-19 20:14 ` adhemerval.zanella at linaro dot org
                   ` (20 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-19 19:48 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #10 from Stas Sergeev <stsp at users dot sourceforge.net> ---
> I don't see any reentrancy issue with the current approach

But there is.
Its trivial to verify with your
own test-case posted here. You just
need to insert sleeps into the handler.

Which is why I made the above quote
from posix. It says explicitly that
a detached thread is created in response
to every timer expiration.
So it explicitly allows reentrancy and
deliberately breaks timer_getoverrun(),
and glibc was following that very precisely.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (9 preceding siblings ...)
  2023-06-19 19:48 ` stsp at users dot sourceforge.net
@ 2023-06-19 20:14 ` adhemerval.zanella at linaro dot org
  2023-06-19 20:26 ` stsp at users dot sourceforge.net
                   ` (19 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2023-06-19 20:14 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #11 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to Stas Sergeev from comment #10)
> > I don't see any reentrancy issue with the current approach
> 
> But there is.
> Its trivial to verify with your
> own test-case posted here. You just
> need to insert sleeps into the handler.
> 
> Which is why I made the above quote
> from posix. It says explicitly that
> a detached thread is created in response
> to every timer expiration.
> So it explicitly allows reentrancy and
> deliberately breaks timer_getoverrun(),
> and glibc was following that very precisely.

Ah right, indeed current approach will trigger the callback function with
provided arguments on each timer expiration while the new scheme will only
issue the callback once the previous timer callback returns.  Afaiu, POSIX does
not specify an semantic regarding it; it only specify that you need an overrun
facility that you can query. And since it is the kernel that keep track of the
overrun, the reentrancy issue is not really related on the overrun itself, but
rather on the sigval argument.

My understanding from the quote you borght is not that it is specified that
concurrent threads are allowed or not, but rather that if the application
allocates some memory to use as the thread stack it can not be reliably free
since the thread is not joinable (it will eventually leak once timer_delete is
called). 

And I don't think allowing the SIGEV_THREAD thread as joinable is also a good
way forward. Besides breaking the standard, it means that timer_delete is not
the only point where you cancel the timer (worse, the kernel will continue to
trigger the timer since in this case there is no timer_delete syscall on the
timer). We can add the timer_delete on thread exit / join, but the idea of the
per-thread timerid is that is only visible for the POSIX timer threads (pthread
code does not handle it).

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (10 preceding siblings ...)
  2023-06-19 20:14 ` adhemerval.zanella at linaro dot org
@ 2023-06-19 20:26 ` stsp at users dot sourceforge.net
  2023-06-19 21:15 ` adhemerval.zanella at linaro dot org
                   ` (18 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-19 20:26 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #12 from Stas Sergeev <stsp at users dot sourceforge.net> ---
> And since it is the kernel that keep track of the overrun, the reentrancy
> issue is not really related on the overrun itself, but rather on the sigval
> argument.

Overruns are measured between signal queue
and dequeue. Linux fixes that scheme by
actually measuring between 2 subsequent
dequeues AFAIKS, though doesn't matter.
What matters is that, because of reentrancy,
the handler will query the overruns after
another signal was already delivered, so
the result would be a complete garbage.

> My understanding from the quote you borght is not that it is specified
> that concurrent threads are allowed or not

Let me quote again then:
[quote]
the threads created in response to a timer expiration are created detached, or
in an unspecified way if the thread attribute's detachstate is
PTHREAD_CREATE_JOINABLE.
[/quote]

Three things are being said here:
1. Threads are created in a response to every timer expiration
2. They are detached
3. The user can specify PTHREAD_CREATE_JOINABLE, in which
case the behavior is implementation-specific.

I propose to use 3 as a base for implementing
PTHREAD_CREATE_JOINABLE case first, then see
what remains.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (11 preceding siblings ...)
  2023-06-19 20:26 ` stsp at users dot sourceforge.net
@ 2023-06-19 21:15 ` adhemerval.zanella at linaro dot org
  2023-06-19 21:21 ` adhemerval.zanella at linaro dot org
                   ` (17 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2023-06-19 21:15 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #13 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to Stas Sergeev from comment #12)
> > And since it is the kernel that keep track of the overrun, the reentrancy
> > issue is not really related on the overrun itself, but rather on the sigval
> > argument.
> 
> Overruns are measured between signal queue
> and dequeue. Linux fixes that scheme by
> actually measuring between 2 subsequent
> dequeues AFAIKS, though doesn't matter.
> What matters is that, because of reentrancy,
> the handler will query the overruns after
> another signal was already delivered, so
> the result would be a complete garbage.

Ok, I now see what you mean here. Indeed afaik Linux reset the overrun time on
each sigtimedwait syscall.  

> 
> > My understanding from the quote you borght is not that it is specified
> > that concurrent threads are allowed or not
> 
> Let me quote again then:
> [quote]
> the threads created in response to a timer expiration are created detached,
> or in an unspecified way if the thread attribute's detachstate is
> PTHREAD_CREATE_JOINABLE.
> [/quote]
> 
> Three things are being said here:
> 1. Threads are created in a response to every timer expiration

Reading the SIGEV_THREAD description [1], it is not clear to me that a new
thread should be created for *every* timer expiration from the same timer. In
fact, POSIX also advises against it on the timer_settime application because of
the concurrent stack usage issue [2].  So, the current scheme has the
additional drawback that a user-specific stack is really not safe.

> 2. They are detached
> 3. The user can specify PTHREAD_CREATE_JOINABLE, in which
> case the behavior is implementation-specific.
> 
> I propose to use 3 as a base for implementing
> PTHREAD_CREATE_JOINABLE case first, then see
> what remains.

But POSIX explicitly states that the thread is *not* joinable ("In neither case
is it valid to call pthread_join()"), and as I said it makes even tricker to
handle the per-timer thread (what happens to the timer after the thread is
joined? Should pthread_join issue timer_delete? What happens if there thread if
cancelled or issue pthread_exit?).

[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html
[2]
https://pubs.opengroup.org/onlinepubs/9699919799/functions/timer_settime.html

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (12 preceding siblings ...)
  2023-06-19 21:15 ` adhemerval.zanella at linaro dot org
@ 2023-06-19 21:21 ` adhemerval.zanella at linaro dot org
  2023-06-19 21:58 ` stsp at users dot sourceforge.net
                   ` (16 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2023-06-19 21:21 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #14 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to Adhemerval Zanella from comment #13)
> But POSIX explicitly states that the thread is *not* joinable ("In neither
> case is it valid to call pthread_join()"), and as I said it makes even
> tricker to handle the per-timer thread (what happens to the timer after the
> thread is joined? Should pthread_join issue timer_delete? What happens if
> there thread if cancelled or issue pthread_exit?).
> 
> [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html
> [2]
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/timer_settime.html

In fact, from the Austin issue [1] the next POSIX 2018 defined that providing a
pthread_attr_t with joinable state is undefined-behavior [2]; so I am even more
inclined to *not* move to make it implementation-defined.

[1] https://austingroupbugs.net/view.php?id=633#c5937
[2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (13 preceding siblings ...)
  2023-06-19 21:21 ` adhemerval.zanella at linaro dot org
@ 2023-06-19 21:58 ` stsp at users dot sourceforge.net
  2023-06-19 22:51 ` adhemerval.zanella at linaro dot org
                   ` (15 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-19 21:58 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #15 from Stas Sergeev <stsp at users dot sourceforge.net> ---
> Indeed afaik Linux reset the overrun time on each sigtimedwait syscall.

Its not linux, its posix who defined
timer_getoverrun() in an entirely crazy
way of measuring underruns between signal
queue and dequeue.
Linux is at least making this more or
less viable by actually measuring between
dequeues only.
Linux itself clearly says how it see it
implemented: timerfd, where read() just
resets the overrun count. Linux is written
by skilled programmers, while posix is
written by a mediocre doc-writers.

> In fact, POSIX also advises against it on the timer_settime application
> because of the concurrent stack usage issue [2].

Quote:

[EINVAL]
    The it_interval member of value is not zero and the timer was created with
notification by creation of a new thread (sigev_sigev_notify was SIGEV_THREAD)
and a fixed stack address has been set in the thread attribute pointed to by
sigev_notify_attributes.

So concurrent stack usage is explicitly
disallowed, which means a detached thread
per timer tick, as was said in the prev
quote that I provided.

> But POSIX explicitly states that the thread is *not* joinable
> ("In neither case is it valid to call pthread_join()")

I don't think it means not joinable.
I think it means only that the _user_
should never call pthread_join() on
such thread. Presumably he could do
that by exporting its pthread_self()
through the global variable.
I think the thread itself can be joinable,
and as long as the user doesn't call
pthread_join() on it, things are good
and glibc itself *can* (and should)
call pthread_join() internally on a
timer destruction.

> In fact, from the Austin issue [1] the next POSIX 2018 defined that providing
> a pthread_attr_t with joinable state is undefined-behavior [2];

This only confirms that posix is
written by idiots. But such an obvious
fact probably doesn't need so many
confirmations. :)

> so I am even more inclined to *not* move to make it implementation-defined.

I am definitely not "insisting" in
any way. I am only pointing out that
any other approach is even more problematic.
But at the end, any improvement is better
than the current mess.

What do you think about opening a DR to
The OpenGroup? If it would be me, I wouldn't
resist saying who they are, so its better
be you. :) Though I don't think timer_create()
API can be vindicated by a DR or 2.
getoverrun is also completely broken, and
so on. Why I still have to use it, is only
because of freebsd. On linux my app uses
timerfd, of course and for sure. I am trying
to avoid the brain-damaged posix APIs on
linux, but there comes freebsd...

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (14 preceding siblings ...)
  2023-06-19 21:58 ` stsp at users dot sourceforge.net
@ 2023-06-19 22:51 ` adhemerval.zanella at linaro dot org
  2023-06-20  4:14 ` stsp at users dot sourceforge.net
                   ` (14 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2023-06-19 22:51 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #16 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to Stas Sergeev from comment #15)
> > Indeed afaik Linux reset the overrun time on each sigtimedwait syscall.
> 
> Its not linux, its posix who defined
> timer_getoverrun() in an entirely crazy
> way of measuring underruns between signal
> queue and dequeue.
> Linux is at least making this more or
> less viable by actually measuring between
> dequeues only.
> Linux itself clearly says how it see it
> implemented: timerfd, where read() just
> resets the overrun count. Linux is written
> by skilled programmers, while posix is
> written by a mediocre doc-writers.
> 
> > In fact, POSIX also advises against it on the timer_settime application
> > because of the concurrent stack usage issue [2].
> 
> Quote:
> 
> [EINVAL]
>     The it_interval member of value is not zero and the timer was created
> with notification by creation of a new thread (sigev_sigev_notify was
> SIGEV_THREAD) and a fixed stack address has been set in the thread attribute
> pointed to by sigev_notify_attributes.

This is defined as 'may fail', which does not bind the implementation to the
required semantic of one thread per time expiration.

> 
> So concurrent stack usage is explicitly
> disallowed, which means a detached thread
> per timer tick, as was said in the prev
> quote that I provided.
> 
> > But POSIX explicitly states that the thread is *not* joinable
> > ("In neither case is it valid to call pthread_join()")
> 
> I don't think it means not joinable.
> I think it means only that the _user_
> should never call pthread_join() on
> such thread. Presumably he could do
> that by exporting its pthread_self()
> through the global variable.
> I think the thread itself can be joinable,
> and as long as the user doesn't call
> pthread_join() on it, things are good
> and glibc itself *can* (and should)
> call pthread_join() internally on a
> timer destruction.

This wording is a user-visible API, so not being joinable it means that the
caller can not call any pthread interface meant for joinable threads (I think
POSIX is pretty clear in this regard).  As I said, making the thread joinable
implies a lot of caveats for the API, and that's why POSIX is explicit about
it.

And glibc does not need to join the thread, if it is created in detached mode
the thread itself will be responsible for handling any allocated resources and
an additional synchronization with any global state.

> 
> > In fact, from the Austin issue [1] the next POSIX 2018 defined that providing
> > a pthread_attr_t with joinable state is undefined-behavior [2];
> 
> This only confirms that posix is
> written by idiots. But such an obvious
> fact probably doesn't need so many
> confirmations. :)

Afaik POSIX only defines API after a de-facto implementation exactly to avoid
adding non-tested ones. But I would like to avoid bad-mouthing other groups in
bug reports.

> 
> > so I am even more inclined to *not* move to make it implementation-defined.
> 
> I am definitely not "insisting" in
> any way. I am only pointing out that
> any other approach is even more problematic.
> But at the end, any improvement is better
> than the current mess.
> 
> What do you think about opening a DR to
> The OpenGroup? If it would be me, I wouldn't
> resist saying who they are, so its better
> be you. :) Though I don't think timer_create()
> API can be vindicated by a DR or 2.
> getoverrun is also completely broken, and
> so on. Why I still have to use it, is only
> because of freebsd. On linux my app uses
> timerfd, of course and for sure. I am trying
> to avoid the brain-damaged posix APIs on
> linux, but there comes freebsd...

A DR for what exactly? I think the current definition does allow one-thread per
timer without extra thread per timer expiration (musl implements it and Rich is
*very careful* about following standards) so I don't see which additional
clarification we need on this topic.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (15 preceding siblings ...)
  2023-06-19 22:51 ` adhemerval.zanella at linaro dot org
@ 2023-06-20  4:14 ` stsp at users dot sourceforge.net
  2023-06-20 12:21 ` adhemerval.zanella at linaro dot org
                   ` (13 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-20  4:14 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #17 from Stas Sergeev <stsp at users dot sourceforge.net> ---
> This is defined as 'may fail', which does not bind the implementation
> to the required semantic of one thread per time expiration.

Still it is enough to make your other
claim invalid:

"Reading the SIGEV_THREAD description [1], it is not clear to me that a new
thread should be created for *every* timer expiration from the same timer.
In fact, POSIX also advises against it on the timer_settime application because
of the concurrent stack usage issue [2]."

The earlier quote provided by me
is very explicit on a thread per tick,
and timer_settime doesn't advice against
that because it even defines the failure
per multiple stack usage. This means it
foresee 1 detached thread per tick, rather
than advicing against.
So we have 1 explicit quote about that,
and we have 1 that implicitly supports
that. I'd say "posix is very clear on this".

> This wording is a user-visible API, so not being joinable

There is no such wording.
The wording it:
"In neither case is it valid to call pthread_join()"
and you conclude it means not joinable.
But it doesn't mean so.

> And glibc does not need to join the thread

But we were discussing the user-provided
PTHREAD_CREATE_JOINABLE attribute.

> Afaik POSIX only defines API after a de-facto implementation
> exactly to avoid adding non-tested ones.

Maybe that was modeled after some very
old bsd impl, no idea. Linux would definitely
not start from such an impl. It is visibly
even trying to fix timer_getoverrun()'s
definition, and then re-implements the whole
thing in a timerfd.

> A DR for what exactly?

Exactly for proposing 1 detached thread
per tick in the quote I pointed. And there
were no quotes that say otherwise.
Also they should explicitly disallow
calling pthread_exit() and other things
that musl disallow.
Also they should realize that 1 thread
per tick breaks timer_getoverrun(), and
probably re-spec timer_getoverrun()
completely.
Or if they fail to do the above, then
at least they should keep supporting
passing a pthread attribute, rather
than to declare that an UB (although
this is not needed if they fix the rest).
There are the reasons for a DR here.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (16 preceding siblings ...)
  2023-06-20  4:14 ` stsp at users dot sourceforge.net
@ 2023-06-20 12:21 ` adhemerval.zanella at linaro dot org
  2023-06-20 12:49 ` stsp at users dot sourceforge.net
                   ` (12 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2023-06-20 12:21 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #18 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to Stas Sergeev from comment #17)
> > This is defined as 'may fail', which does not bind the implementation
> > to the required semantic of one thread per time expiration.
> 
> Still it is enough to make your other
> claim invalid:
> 
> "Reading the SIGEV_THREAD description [1], it is not clear to me that a new
> thread should be created for *every* timer expiration from the same timer.
> In fact, POSIX also advises against it on the timer_settime application
> because of the concurrent stack usage issue [2]."
> 
> The earlier quote provided by me
> is very explicit on a thread per tick,
> and timer_settime doesn't advice against
> that because it even defines the failure
> per multiple stack usage. This means it
> foresee 1 detached thread per tick, rather
> than advicing against.
> So we have 1 explicit quote about that,
> and we have 1 that implicitly supports
> that. I'd say "posix is very clear on this".

I don't think so, the earlier quote (which you linked from the previous POSIX
version by the way, although it has not changed on POSIX 2017) allows both
implementations.  The timer_settime quote also only allows the implementation
to return an error if the resulting timer will trigger a UB (by using the same
stack on multiple threads), if the implementation is not subject to such issue
it can ignore the 'may fail'.

> 
> > This wording is a user-visible API, so not being joinable
> 
> There is no such wording.
> The wording it:
> "In neither case is it valid to call pthread_join()"
> and you conclude it means not joinable.
> But it doesn't mean so.

Not being joinable means that calling pthread_join() is UB. Some
implementations *might* try to catch such issues (as glibc, since it keeps the
pthread_t stack in a cache for some time), but again it is error-prone and
fragile.

> 
> > And glibc does not need to join the thread
> 
> But we were discussing the user-provided
> PTHREAD_CREATE_JOINABLE attribute.

Which is overridden by an explicit __pthread_attr_setdetachstate
(PTHREAD_CREATE_DETACHED) on timer_create. Again, not doing this results in a
tricky implementation and I think we should not move toward it.

> 
> > Afaik POSIX only defines API after a de-facto implementation
> > exactly to avoid adding non-tested ones.
> 
> Maybe that was modeled after some very
> old bsd impl, no idea. Linux would definitely
> not start from such an impl. It is visibly
> even trying to fix timer_getoverrun()'s
> definition, and then re-implements the whole
> thing in a timerfd.
> 
> > A DR for what exactly?
> 
> Exactly for proposing 1 detached thread
> per tick in the quote I pointed. And there
> were no quotes that say otherwise.
> Also they should explicitly disallow
> calling pthread_exit() and other things
> that musl disallow.
> Also they should realize that 1 thread
> per tick breaks timer_getoverrun(), and
> probably re-spec timer_getoverrun()
> completely.
> Or if they fail to do the above, then
> at least they should keep supporting
> passing a pthread attribute, rather
> than to declare that an UB (although
> this is not needed if they fix the rest).
> There are the reasons for a DR here.

Feel free to propose it, I think it would make some implementation
non-conformant and thus will most likely be either rejected or raised to amend
by some commenter. But again, the implementation with one thread per timer
should be conformant and improve current support for timer_overrun; so I see no
strong reason to ask for DR to tie POSIX timer SIGEV_THREAD to a specific
implementation that might not be the best approach.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (17 preceding siblings ...)
  2023-06-20 12:21 ` adhemerval.zanella at linaro dot org
@ 2023-06-20 12:49 ` stsp at users dot sourceforge.net
  2023-06-20 13:01 ` adhemerval.zanella at linaro dot org
                   ` (11 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-20 12:49 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #19 from Stas Sergeev <stsp at users dot sourceforge.net> ---
> I don't think so, the earlier quote (which you linked from the previous POSIX
> version by the way, although it has not changed on POSIX 2017) allows both
> implementations.

This:

[quote]
the threads created in response to a timer expiration are created detached, or
in an unspecified way if the thread attribute's detachstate is
PTHREAD_CREATE_JOINABLE.
[/quote]

I don't understand how it allows something
else. If there are "threads created in response to a timer expiration",
then obviously that means every timer
expiration creates a thread. There can't
be "threads created in response to a timer expiration"
created only on *some* expirations but not all...

> if the implementation is not subject to such issue it can ignore the 'may fail'.

Sure but that still defeats your claim that
it advises against 1 thread per tick. It
doesn't advice against, for sure. I see
nothing that advises against or suggests
otherwise, so could you please provide a
quote that supports your point?

Note: I am perfectly fine if you just tell
"there is no such quote, but my impl is better
so lets use it". Its perfectly fine, but
saying posix allows it, is probably not fine
unless it can be confirmed by a quote.

> Not being joinable means that calling pthread_join() is UB.

Hmm, ok. My definition differs, i.e. not
being joinable means detached. Thanks for
clarifying yours, I'll use it across this
thread.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (18 preceding siblings ...)
  2023-06-20 12:49 ` stsp at users dot sourceforge.net
@ 2023-06-20 13:01 ` adhemerval.zanella at linaro dot org
  2023-06-20 13:13 ` stsp at users dot sourceforge.net
                   ` (10 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2023-06-20 13:01 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #20 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to Stas Sergeev from comment #19)
> > I don't think so, the earlier quote (which you linked from the previous POSIX
> > version by the way, although it has not changed on POSIX 2017) allows both
> > implementations.
> 
> This:
> 
> [quote]
> the threads created in response to a timer expiration are created detached,
> or in an unspecified way if the thread attribute's detachstate is
> PTHREAD_CREATE_JOINABLE.
> [/quote]
> 
> I don't understand how it allows something
> else. If there are "threads created in response to a timer expiration",
> then obviously that means every timer
> expiration creates a thread. There can't
> be "threads created in response to a timer expiration"
> created only on *some* expirations but not all...

Because afaiu this sentence only specifies that implementation *might* create
one more thread per timer expiration, it does not specify that the thread
*must* create one thread per expiration.

> 
> > if the implementation is not subject to such issue it can ignore the 'may fail'.
> 
> Sure but that still defeats your claim that
> it advises against 1 thread per tick. It
> doesn't advice against, for sure. I see
> nothing that advises against or suggests
> otherwise, so could you please provide a
> quote that supports your point?

I don't think, as I explained earlier. On POSIX, the 'may fail' is usually used
to allow different implementations to signal different errors that are bounded
by the implementation itself (which seems to be this case).

> 
> Note: I am perfectly fine if you just tell
> "there is no such quote, but my impl is better
> so lets use it". Its perfectly fine, but
> saying posix allows it, is probably not fine
> unless it can be confirmed by a quote.
> 
> > Not being joinable means that calling pthread_join() is UB.
> 
> Hmm, ok. My definition differs, i.e. not
> being joinable means detached. Thanks for
> clarifying yours, I'll use it across this
> thread.

Calling pthread_join() on detached threads is also UB, glibc allows it in some
cases but the support is brittle and error-prone. So afaik non-being joinable
does mean the thread as either been created in detached mode or has called
pthread_detach(), and calling pthread_join() is UB in both cases.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (19 preceding siblings ...)
  2023-06-20 13:01 ` adhemerval.zanella at linaro dot org
@ 2023-06-20 13:13 ` stsp at users dot sourceforge.net
  2023-06-21  3:19 ` stsp at users dot sourceforge.net
                   ` (9 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-20 13:13 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #21 from Stas Sergeev <stsp at users dot sourceforge.net> ---
Ok so this means your current patch
needs to be applied, which is fine with
me no matter what posix says or not says.
So when would it be applied then?

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (20 preceding siblings ...)
  2023-06-20 13:13 ` stsp at users dot sourceforge.net
@ 2023-06-21  3:19 ` stsp at users dot sourceforge.net
  2023-06-21 14:32 ` adhemerval.zanella at linaro dot org
                   ` (8 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-21  3:19 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #22 from Stas Sergeev <stsp at users dot sourceforge.net> ---
Also please consider adding a check,
either a macro or a runtime check, so
that I can select between your impl
and timerfd one (as old impl is completely
broken and should never be selected)

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (21 preceding siblings ...)
  2023-06-21  3:19 ` stsp at users dot sourceforge.net
@ 2023-06-21 14:32 ` adhemerval.zanella at linaro dot org
  2023-06-21 14:41 ` stsp at users dot sourceforge.net
                   ` (7 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2023-06-21 14:32 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #23 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to Stas Sergeev from comment #21)
> Ok so this means your current patch
> needs to be applied, which is fine with
> me no matter what posix says or not says.
> So when would it be applied then?

I need to correctly handle pthread_exit on callback and think better how to
handle pthread_cancel (since it requires not now mask the signal).  musl reset
the thread state in this case (run dtor, reset tls, reinit internal state), so
I am considering do the same. It avoid the caller to miss any timer due failing
in thread creation.

> Also please consider adding a check,
> either a macro or a runtime check, so
> that I can select between your impl
> and timerfd one (as old impl is completely
> broken and should never be selected)

I do not plan to provide multiple implementations nor implement on top of
timerfd (as I put on comment #7).

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (22 preceding siblings ...)
  2023-06-21 14:32 ` adhemerval.zanella at linaro dot org
@ 2023-06-21 14:41 ` stsp at users dot sourceforge.net
  2023-06-21 14:43 ` adhemerval.zanella at linaro dot org
                   ` (6 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-21 14:41 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #24 from Stas Sergeev <stsp at users dot sourceforge.net> ---
(In reply to Adhemerval Zanella from comment #23)
> I do not plan to provide multiple implementations nor implement on top of
> timerfd (as I put on comment #7).

I didn't ask for anything like this.
Just provide a macro or a runtime check.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (23 preceding siblings ...)
  2023-06-21 14:41 ` stsp at users dot sourceforge.net
@ 2023-06-21 14:43 ` adhemerval.zanella at linaro dot org
  2023-06-21 14:52 ` stsp at users dot sourceforge.net
                   ` (5 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2023-06-21 14:43 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #25 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to Stas Sergeev from comment #24)
> (In reply to Adhemerval Zanella from comment #23)
> > I do not plan to provide multiple implementations nor implement on top of
> > timerfd (as I put on comment #7).
> 
> I didn't ask for anything like this.
> Just provide a macro or a runtime check.

There is no plan to provide a user-visible macro that bleeds implementation
details, and I am not sure what you mean by 'runtime check'.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (24 preceding siblings ...)
  2023-06-21 14:43 ` adhemerval.zanella at linaro dot org
@ 2023-06-21 14:52 ` stsp at users dot sourceforge.net
  2023-06-21 15:07 ` adhemerval.zanella at linaro dot org
                   ` (4 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-21 14:52 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #26 from Stas Sergeev <stsp at users dot sourceforge.net> ---
But if you don't provide any checking
method, then I won't be able to select
between your code and timerfd impl, so
I will have to keep using the timerfd
code instead of yours.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (25 preceding siblings ...)
  2023-06-21 14:52 ` stsp at users dot sourceforge.net
@ 2023-06-21 15:07 ` adhemerval.zanella at linaro dot org
  2023-06-22  2:57 ` bugdal at aerifal dot cx
                   ` (3 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2023-06-21 15:07 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #27 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to Stas Sergeev from comment #26)
> But if you don't provide any checking
> method, then I won't be able to select
> between your code and timerfd impl, so
> I will have to keep using the timerfd
> code instead of yours.

Usually, if the semantic changes, a new symbol is provided to avoid
compatibility issues (I am not sure if it were the case). But for
implementation detail that changes over the release, glibc does not provide
additional user-visible defines that are not defined by standards (such any C
TS/extension). Bleeding out implementation details has added a lo of
maintainability burden.

You can always check the release to tie, but non-visible user changes might be
backported.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (26 preceding siblings ...)
  2023-06-21 15:07 ` adhemerval.zanella at linaro dot org
@ 2023-06-22  2:57 ` bugdal at aerifal dot cx
  2023-06-22  5:23 ` stsp at users dot sourceforge.net
                   ` (2 subsequent siblings)
  30 siblings, 0 replies; 32+ messages in thread
From: bugdal at aerifal dot cx @ 2023-06-22  2:57 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

Rich Felker <bugdal at aerifal dot cx> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugdal at aerifal dot cx

--- Comment #28 from Rich Felker <bugdal at aerifal dot cx> ---
FWIW, I think I do read the specification as requiring a "new thread" for the
timer expiration event, but this does not conflict with an implementation
choice to reuse the memory and kernel task from a previous thread *after its
lifetime has ended* to provide the "new thread". This is what we do in musl.

This does of course preclude delivering a second timer event while the previous
one is still running (yielding overruns). I view that a as a very positive QoI
property since it prevents cascading resources usage leading to exhaustion and
failure under load, and lets the application handle the overruns in some
reasonable way instead. But if you did want to allow concurrent expiration
functions on the same timer, the timer thread could attempt to first create a
real new thread, and only fallback to running the function in its own context
when there are insufficient resources to make a new thread or some
implementation-chosen concurrency limit is reached.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (27 preceding siblings ...)
  2023-06-22  2:57 ` bugdal at aerifal dot cx
@ 2023-06-22  5:23 ` stsp at users dot sourceforge.net
  2023-06-23 18:34 ` adhemerval.zanella at linaro dot org
  2023-06-24 17:03 ` crrodriguez at opensuse dot org
  30 siblings, 0 replies; 32+ messages in thread
From: stsp at users dot sourceforge.net @ 2023-06-22  5:23 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #29 from Stas Sergeev <stsp at users dot sourceforge.net> ---
Just for the record, I changed my mind
and agreed with Adhemerval's reading of
this:

[quote]
the threads created in response to a timer expiration are created detached
[/quote]

This only says *how* the threads, created
in response to a timer expiration, are created.
It doesn't say they should be created, just
*iff* they are created, they are created detached.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (28 preceding siblings ...)
  2023-06-22  5:23 ` stsp at users dot sourceforge.net
@ 2023-06-23 18:34 ` adhemerval.zanella at linaro dot org
  2023-06-24 17:03 ` crrodriguez at opensuse dot org
  30 siblings, 0 replies; 32+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2023-06-23 18:34 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

--- Comment #30 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
I have updated my branch [1] to handle pthread_exit (it now resets the thread
state to the initial one, include DTORs, TLS, and signal mask); and trying to
cancel the helper thread is UB (it works on deferred mode because it does not
set the signal, but async more does not work because SIGCANCEL is always
blocked now).

[1]
https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/bz30558-posix_timer

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Bug libc/30558] SIGEV_THREAD is badly implemented
  2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
                   ` (29 preceding siblings ...)
  2023-06-23 18:34 ` adhemerval.zanella at linaro dot org
@ 2023-06-24 17:03 ` crrodriguez at opensuse dot org
  30 siblings, 0 replies; 32+ messages in thread
From: crrodriguez at opensuse dot org @ 2023-06-24 17:03 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30558

Cristian Rodríguez <crrodriguez at opensuse dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crrodriguez at opensuse dot org

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2023-06-24 17:03 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-15 19:36 [Bug libc/30558] New: SIGEV_THREAD is badly implemented stsp at users dot sourceforge.net
2023-06-15 21:04 ` [Bug libc/30558] " adhemerval.zanella at linaro dot org
2023-06-16  2:23 ` stsp at users dot sourceforge.net
2023-06-16  6:44 ` stsp at users dot sourceforge.net
2023-06-16  7:29 ` stsp at users dot sourceforge.net
2023-06-16  7:51 ` stsp at users dot sourceforge.net
2023-06-16 11:44 ` stsp at users dot sourceforge.net
2023-06-19 17:41 ` adhemerval.zanella at linaro dot org
2023-06-19 18:54 ` stsp at users dot sourceforge.net
2023-06-19 19:33 ` adhemerval.zanella at linaro dot org
2023-06-19 19:48 ` stsp at users dot sourceforge.net
2023-06-19 20:14 ` adhemerval.zanella at linaro dot org
2023-06-19 20:26 ` stsp at users dot sourceforge.net
2023-06-19 21:15 ` adhemerval.zanella at linaro dot org
2023-06-19 21:21 ` adhemerval.zanella at linaro dot org
2023-06-19 21:58 ` stsp at users dot sourceforge.net
2023-06-19 22:51 ` adhemerval.zanella at linaro dot org
2023-06-20  4:14 ` stsp at users dot sourceforge.net
2023-06-20 12:21 ` adhemerval.zanella at linaro dot org
2023-06-20 12:49 ` stsp at users dot sourceforge.net
2023-06-20 13:01 ` adhemerval.zanella at linaro dot org
2023-06-20 13:13 ` stsp at users dot sourceforge.net
2023-06-21  3:19 ` stsp at users dot sourceforge.net
2023-06-21 14:32 ` adhemerval.zanella at linaro dot org
2023-06-21 14:41 ` stsp at users dot sourceforge.net
2023-06-21 14:43 ` adhemerval.zanella at linaro dot org
2023-06-21 14:52 ` stsp at users dot sourceforge.net
2023-06-21 15:07 ` adhemerval.zanella at linaro dot org
2023-06-22  2:57 ` bugdal at aerifal dot cx
2023-06-22  5:23 ` stsp at users dot sourceforge.net
2023-06-23 18:34 ` adhemerval.zanella at linaro dot org
2023-06-24 17:03 ` crrodriguez at opensuse dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).