public inbox for glibc-bugs@sourceware.org
* [Bug libc/14829] New: sched_* functions wrongly alter thread scheduling, rather than process
@ 2012-11-11 19:31 bugdal at aerifal dot cx
  2013-06-14 16:36 ` [Bug libc/14829] " carlos at redhat dot com
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: bugdal at aerifal dot cx @ 2012-11-11 19:31 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=14829

             Bug #: 14829
           Summary: sched_* functions wrongly alter thread scheduling,
                    rather than process
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: unassigned@sourceware.org
        ReportedBy: bugdal@aerifal.cx
                CC: drepper.fsp@gmail.com
    Classification: Unclassified


Per POSIX (XSH 2.8.4 Process Scheduling, subheading Scheduling Policies):

"For threads with system scheduling contention scope, the process scheduling
attributes shall have no effect on the scheduling attributes or behavior either
of the thread or an underlying kernel scheduling entity dedicated to that
thread."

Linux does not seem to support process-level scheduling attributes at all;
glibc/NPTL accepts only system scheduling contention scope and returns ENOTSUP
when an attempt is made to create a thread with process contention scope. As
such, my reading of the above text is that the sched_* functions should be
mostly/entirely no-ops. Even if this weren't the case, it's certainly incorrect
for them to modify the scheduling attributes of the single thread whose TID
happens to be equal to the PID.
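
The ENOTSUP behavior mentioned above can be probed with a minimal sketch like
the following. It assumes that on glibc/NPTL the rejection already surfaces
from pthread_attr_setscope(), before any thread is created; the helper name
try_process_scope is made up for this illustration.

```c
#include <errno.h>
#include <pthread.h>

/* Try to select process contention scope on a fresh attribute object.
   Returns 0 if the scope was accepted, otherwise the error number
   reported by the library (expected to be ENOTSUP on glibc/NPTL). */
int try_process_scope(void)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_PROCESS);
    pthread_attr_destroy(&attr);
    return rc;
}
```

An implementation that did support process contention scope would return 0
here instead.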

I'm unsure what the best way to resolve this issue against application
expectations is, but the current behavior is non-conforming and does the wrong
thing in multi-threaded applications trying to use these functions for their
standard purposes.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.



* [Bug libc/14829] sched_* functions wrongly alter thread scheduling, rather than process
From: carlos at redhat dot com @ 2013-06-14 16:36 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=14829

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carlos at redhat dot com

--- Comment #1 from Carlos O'Donell <carlos at redhat dot com> ---
I agree that this is a mess in Linux right now.

I agree that most of the sched_* functions should actually do nothing in glibc,
principally because process level contention scheduling is not supported.

At present these functions change the scheduling of the thread group leader,
that is, the thread whose tid == tgid and whose ID is returned from the call to
getpid(). I agree that this is wrong according to POSIX. Thus the functions do
not operate as required by POSIX for threaded applications.

These functions do operate correctly for non-threaded applications.
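
A rough way to observe this on a Linux system is sketched below. A second
thread asks for the "process" (addressed by getpid()) to be moved to
SCHED_BATCH, a policy that normally needs no privileges, and then compares the
policy of the thread group leader with its own. The struct and function names
are invented for the illustration, and the fallback definition of SCHED_BATCH
assumes the Linux value 3.

```c
#include <pthread.h>
#include <sched.h>
#include <unistd.h>

#ifndef SCHED_BATCH
#define SCHED_BATCH 3   /* Linux value; <sched.h> exposes it only under _GNU_SOURCE */
#endif

struct sched_probe {
    int set_rc;          /* result of sched_setscheduler(getpid(), ...) */
    int leader_policy;   /* policy of the task whose tid == pid, afterwards */
    int worker_policy;   /* policy of the calling, non-leader thread, afterwards */
};

static void *probe_worker(void *arg)
{
    struct sched_probe *r = arg;
    struct sched_param sp = { .sched_priority = 0 };

    /* Address the "process" by its pid from a thread that is not the leader. */
    r->set_rc = sched_setscheduler(getpid(), SCHED_BATCH, &sp);
    r->leader_policy = sched_getscheduler(getpid());
    r->worker_policy = sched_getscheduler(0);   /* 0 means the calling thread on Linux */
    return NULL;
}

/* Run the probe in a second thread; returns 0 if the thread ran and was joined. */
int run_sched_probe(struct sched_probe *out)
{
    pthread_t t;
    int rc = pthread_create(&t, NULL, probe_worker, out);
    if (rc != 0)
        return rc;
    return pthread_join(t, NULL);
}
```

If the call succeeds, leader_policy is expected to read back as SCHED_BATCH
while worker_policy keeps whatever policy the worker inherited at creation,
which is the per-thread effect described above.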

One solution is to version the interfaces, and have all of the new versions do
nothing if threads are active.

When Linux gets proper process scheduling support we can stop doing nothing and
do the right sequence of operations to set the pid's (or tgid's) scheduling
parameters.

Linux needs to be enhanced to do the following:
- getpid returns pid
- gettid returns tid != pid

All processes would start off with a unique tgid that can be used as the
target of these calls and that is returned via getpid, and that can't be
confused with the thread leader.

That would allow these calls to identify the tgid uniquely.


* [Bug libc/14829] sched_* functions wrongly alter thread scheduling, rather than process
From: bugdal at aerifal dot cx @ 2013-06-16  4:46 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=14829

--- Comment #2 from Rich Felker <bugdal at aerifal dot cx> ---
Carlos, there are major issues with your proposed solution. I know this one
applies to musl and I believe it also applies to glibc: the set_tid_address
syscall, used when initializing the main thread's TLS, returns the tid, which
is also the pid; to avoid an additional wasteful getpid syscall, this value is
used for filling in both the tid and pid fields in the TCB. If the kernel were
changed so that the main thread's tid were different from its pid, existing
code would experience subtle but serious breakage. I think such a change would
be contrary to kernel policy.
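
The invariant that existing code relies on can be checked with a small sketch.
It deliberately uses gettid rather than set_tid_address, since calling the
latter would also redirect the kernel's clear_child_tid pointer; the helper
names are made up for the illustration.

```c
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread ID of the caller, via the raw syscall so that no
   particular libc wrapper is assumed. */
pid_t raw_gettid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Returns 1 when the calling thread's tid equals the process's pid,
   which is what current kernels guarantee for the initial thread. */
int caller_is_thread_group_leader(void)
{
    return raw_gettid() == getpid();
}
```

Called from the initial thread of a process, the check is expected to return
1; from any other thread it returns 0.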

I also disagree with the claim "These functions do operate correctly for
non-threaded applications." The POSIX requirements for sched_* are essentially
that they do nothing, even for single-threaded applications:

"For threads with system scheduling contention scope, the process scheduling
attributes shall have no effect on the scheduling attributes or behavior either
of the thread or an underlying kernel scheduling entity dedicated to that
thread."

Since Linux only supports system scheduling contention scope, not process
scheduling contention scope, the above text applies even to the main thread of
a single-threaded process.

Even if the process scheduling contention scope were supported, I don't think
the semantics would match the current Linux sched_* syscalls.

If it's desirable to support the Process Scheduling option of POSIX, I think
Linux needs to add new syscalls (perhaps named posix_sched_*), not just adjust
the target scope of the current ones based on whether the argument is a pid or
a tid. Short of that, I'm not entirely sure what the appropriate course of
action for glibc is. I think a good first step would be to redefine
_POSIX_PRIORITY_SCHEDULING to -1, reflecting that glibc does not support the
Process Scheduling option even at compile-time. With this macro defined as -1,
the implementation is not required to provide sched_* functions, and if these
functions do happen to exist as they do in glibc, the implementation is under
no obligation to endow them with semantics matching the POSIX semantics. This
would at least fix the conformance issue and make conforming applications aware
that they cannot use these functions.
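
For reference, this is roughly how a conforming application would consult the
option before using sched_*. It is a sketch following the usual POSIX rules for
option macros (undefined or -1: unsupported; 0: ask sysconf(); greater than 0:
always supported); the function names are invented here.

```c
#include <unistd.h>

/* Compile-time view of the option: -1 unsupported, 0 ask at run time,
   1 always supported. An undefined macro counts as unsupported. */
int priority_scheduling_compile_time(void)
{
#if !defined(_POSIX_PRIORITY_SCHEDULING)
    return -1;
#elif _POSIX_PRIORITY_SCHEDULING > 0
    return 1;
#else
    return _POSIX_PRIORITY_SCHEDULING;   /* 0 or -1 */
#endif
}

/* Run-time view of the option, or -1 if it cannot even be queried. */
long priority_scheduling_run_time(void)
{
#ifdef _SC_PRIORITY_SCHEDULING
    return sysconf(_SC_PRIORITY_SCHEDULING);
#else
    return -1;
#endif
}

/* 1 if a conforming application may rely on sched_*, 0 otherwise. */
int may_use_sched_functions(void)
{
    int ct = priority_scheduling_compile_time();
    if (ct < 0)
        return 0;
    if (ct > 0)
        return 1;
    return priority_scheduling_run_time() > 0;
}
```

With the macro redefined to -1 as suggested, may_use_sched_functions() would
return 0 without consulting sysconf() at all.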


* [Bug libc/14829] sched_* functions wrongly alter thread scheduling, rather than process
From: carlos at redhat dot com @ 2013-06-17 19:34 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=14829

--- Comment #3 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Rich Felker from comment #2)
> Carlos, there are major issues with your proposed solution. I know this one
> applies to musl and I believe it also applies to glibc: the set_tid_address
> syscall, used when initializing the main thread's TLS, returns the tid,
> which is also the pid; to avoid an additional wasteful getpid syscall, this
> value is used for filling in both the tid and pid fields in the TCB. If the
> kernel were changed so that the main thread's tid were different from its
> pid, existing code would experience subtle but serious breakage. I think
> such a change would be contrary to kernel policy.

The only uses which would break are AFAICT:

* tgkill
- Which could be enhanced to accept a tid, find the tgid, and then kill the
associated group.

* rt_tgsigqueueinfo
- Same problem.

They can be solved because the expected semantics is that you call these for a
thread group, and if you instead pass a tid, it is a one-way translation from
thread to thread group. Even with cgroups I don't think this invariant is
broken.
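
To make the current tgkill contract concrete, here is a small sketch that
sends signal 0 (an existence and permission check only) through the raw
syscall. The wrapper name tgkill_probe is made up for the illustration.

```c
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Send signal 0 to thread `tid`, asserting that it belongs to thread
   group `tgid`. Returns 0 on success, or -1 with errno set. */
int tgkill_probe(pid_t tgid, pid_t tid)
{
    return (int)syscall(SYS_tgkill, tgid, tid, 0);
}
```

With the caller's own pid and tid the probe succeeds; passing any other
positive value as tgid for the same tid fails with ESRCH, because the kernel
requires tgid to match the target's thread group.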

Did I miss anything?

> I also disagree with the claim "These functions do operate correctly for
> non-threaded applications." The POSIX requirements for sched_* are
> essentially that they do nothing, even for single-threaded applications:
> 
> "For threads with system scheduling contention scope, the process scheduling
> attributes shall have no effect on the scheduling attributes or behavior
> either of the thread or an underlying kernel scheduling entity dedicated to
> that thread."
> 
> Since Linux only supports system scheduling contention scope, not process
> scheduling contention scope, the above text applies even to the main thread
> of a single-threaded process.

Sorry, let me clarify what I meant to say. With a relatively minor fix we could
support PTHREAD_SCOPE_PROCESS for a single-threaded application. The case of
inter-thread scheduling degenerates to unspecified, so this works as long as we
do nothing for PTHREAD_SCOPE_SYSTEM and do exactly what we do now for
PTHREAD_SCOPE_PROCESS.

Does that make sense?

> Even if the process scheduling contention scope were supported, I don't
> think the semantics would match the current Linux sched_* syscalls.

It would certainly involve kernel work. At present Linux doesn't support
process scheduling contention scope.

> If it's desirable to support the Process Scheduling option of POSIX, I think
> Linux needs to add new syscalls (perhaps named posix_sched_*), not just
> adjust the target scope of the current ones based on whether the argument is
> a pid or a tid. Short of that, I'm not entirely sure what the appropriate
> course of action for glibc is. I think a good first step would be to
> redefine _POSIX_PRIORITY_SCHEDULING to -1, reflecting that glibc does not
> support the Process Scheduling option even at compile-time. With this macro
> defined as -1, the implementation is not required to provide sched_*
> functions, and if these functions do happen to exist as they do in glibc,
> the implementation is under no obligation to endow them with semantics
> matching the POSIX semantics. This would at least fix the conformance issue
> and make conforming applications aware that they cannot use these functions.

Agreed that that is the best way forward.


* [Bug libc/14829] sched_* functions wrongly alter thread scheduling, rather than process
From: bugdal at aerifal dot cx @ 2013-06-17 20:02 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=14829

--- Comment #4 from Rich Felker <bugdal at aerifal dot cx> ---
On Mon, Jun 17, 2013 at 07:34:51PM +0000, carlos at redhat dot com wrote:
> The only uses which would break are AFAICT:
> 
> * tgkill
> - Which could be enhanced to accept a tid, find the tgid, and then kill the
> associated group.
> 
> * rt_tgsigqueueinfo
> - Same problem.
> 
> They can be solved because the expected semantics is that you call these for a
> thread group, and if you instead pass a tid, it is a one-way translation from
> thread to thread group. Even with cgroups I don't think this invariant is
> broken.
> 
> Did I miss anything?

Yes. getpid() uses the cached pid from the TCB. So if set_tid_address
returned the tid, then getpid() would return the wrong pid. On the
other hand, if set_tid_address returned the pid, and all syscalls that
accept tids would also accept a pid and treat it as the tid of the
initial thread in that process, then things might not be so bad, but I
think there might be risks involved in threads having two distinct
identifiers, especially when the tid is used to mark the owner of a
synchronization object. I'm not sure that there would be problems with
this approach, but I still think it sounds risky.


* [Bug libc/14829] sched_* functions wrongly alter thread scheduling, rather than process
From: carlos at redhat dot com @ 2013-06-17 20:44 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=14829

--- Comment #5 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Rich Felker from comment #4)
> On Mon, Jun 17, 2013 at 07:34:51PM +0000, carlos at redhat dot com wrote:
> > The only uses which would break are AFAICT:
> > 
> > * tgkill
> > - Which could be enhanced to accept a tid, find the tgid, and then kill the
> > associated group.
> > 
> > * rt_tgsigqueueinfo
> > - Same problem.
> > 
> > They can be solved because the expected semantics is that you call these for a
> > thread group, and if you instead pass a tid, it is a one-way translation from
> > thread to thread group. Even with cgroups I don't think this invariant is
> > broken.
> > 
> > Did I miss anything?
> 
> Yes. getpid() uses the cached pid from the TCB. So if set_tid_address
> returned the tid, then getpid() would return the wrong pid. 

It's not the wrong pid, it's the identifier of the schedulable object which is
the kernel task associated with the process. It's as correct as you can get.

The kernel has always treated pid and tid as the same kind of identifier for
schedulable objects, so yes, any syscall that accepts a pid will also accept a
tid. The Linux kernel makes no distinction between the two.

The exception is tgkill and rt_tgsigqueueinfo, both of which assert that the id of
the thread group leader matches the id passed into the function as the tgid.
This is why these two functions need slightly looser semantics.

And this specific code:
kernel/signal.c:
~~~
static int
do_send_specific(pid_t tgid, pid_t pid, int sig, struct siginfo *info)
{
        struct task_struct *p;
        int error = -ESRCH;

        rcu_read_lock();
        p = find_task_by_vpid(pid);
        if (p && (tgid <= 0 || task_tgid_vnr(p) == tgid)) {
                error = check_kill_permission(sig, info, p);
~~~
Should use:
~~~
static int
do_send_specific(pid_t tgid, pid_t pid, int sig, struct siginfo *info)
{
        struct task_struct *p, *pt;
        int error = -ESRCH;

        rcu_read_lock();
        p = find_task_by_vpid(pid);
        pt = find_task_by_vpid(tgid);
        if (p && (tgid <= 0 || (pt && task_tgid_vnr(p) == task_tgid_vnr(pt)))) {
                error = check_kill_permission(sig, info, p);
~~~
Thus allowing any schedulable object to be passed in as the tgid,
and that object to be used to find the thread group leader for doing
the work.
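
The difference between the two checks can be modelled in user space. This is
only a toy sketch: struct toy_task and the toy_* functions are invented
stand-ins for task_struct, find_task_by_vpid() and task_tgid_vnr(), and the
relaxed variant also guards against the tgid lookup finding nothing.

```c
#include <stddef.h>

/* Toy stand-in for a kernel task: its own id and its thread-group id. */
struct toy_task {
    int pid;    /* per-thread id (what userspace calls the tid) */
    int tgid;   /* id of the thread group leader */
};

/* Linear lookup standing in for find_task_by_vpid(); NULL if absent. */
static const struct toy_task *toy_find(const struct toy_task *tab, size_t n, int id)
{
    for (size_t i = 0; i < n; i++)
        if (tab[i].pid == id)
            return &tab[i];
    return NULL;
}

/* Current rule: `tgid` must literally be the target's thread-group id. */
int toy_check_strict(const struct toy_task *tab, size_t n, int tgid, int pid)
{
    const struct toy_task *p = toy_find(tab, n, pid);
    return p && (tgid <= 0 || p->tgid == tgid);
}

/* Proposed rule: `tgid` may name any task in the target's thread group. */
int toy_check_relaxed(const struct toy_task *tab, size_t n, int tgid, int pid)
{
    const struct toy_task *p = toy_find(tab, n, pid);
    const struct toy_task *pt = toy_find(tab, n, tgid);
    return p && (tgid <= 0 || (pt && p->tgid == pt->tgid));
}
```

For a group with leader 100 and member 101, the strict rule rejects
(tgid=101, pid=101) while the relaxed rule accepts it; both reject a tgid that
belongs to a different group.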

> On the other hand, if set_tid_address returned the pid, and all syscalls that
> accept tids would also accept a pid and treat it as the tid of the
> initial thread in that process, then things might not be so bad, but I
> think there might be risks involved in threads having two distinct
> identifiers, especially when the tid is used to mark the owner of a
> synchronization object. 

All kernel functions already do accept pid and tid interchangeably.

You can already have several different identifiers for the same task.

e.g.

include/linux/pid.h
~~~
enum pid_type
{
        PIDTYPE_PID,
        PIDTYPE_PGID,
        PIDTYPE_SID,
        PIDTYPE_MAX
};
~~~

> I'm not sure that there would be problems with
> this approach, but I still think it sounds risky.

AFAIK it would work, modulo the fix.

The return of getpid() would be the tid, and it would work.

However, the more I write out the design the uglier it gets.

It's probably better to just add another PIDTYPE_* value to support process
scheduling and a new syscall.


* [Bug libc/14829] sched_* functions wrongly alter thread scheduling, rather than process
From: bugdal at aerifal dot cx @ 2013-06-17 21:02 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=14829

--- Comment #6 from Rich Felker <bugdal at aerifal dot cx> ---
On Mon, Jun 17, 2013 at 08:44:43PM +0000, carlos at redhat dot com wrote:
> > > Did I miss anything?
> > 
> > Yes. getpid() uses the cached pid from the TCB. So if set_tid_address
> > returned the tid, then getpid() would return the wrong pid. 
> 
> It's not the wrong pid, it's the identifier of the schedulable object which is
> the kernel task associated with the process. It's as correct as you can get.

No. If fork() returns a value X to a parent process, and the child
process sees its own pid as value Y via getpid(), then the
implementation is non-conforming.
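
The requirement can be stated as a small check. This is a sketch only: the
function name is made up, and a pipe is used so the child can report its own
view of its pid back to the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if fork()'s return value in the parent equals getpid() in the
   child, 0 if they differ, and -1 on any system-call failure. */
int fork_pid_matches_child_getpid(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {
        /* Child: report its own pid, then leave without running the
           parent's atexit handlers or flushing its stdio buffers. */
        pid_t self = getpid();
        close(fds[0]);
        ssize_t w = write(fds[1], &self, sizeof self);
        _exit(w == (ssize_t)sizeof self ? 0 : 1);
    }

    /* Parent: read the child's report and reap it. */
    close(fds[1]);
    pid_t seen = 0;
    ssize_t r = read(fds[0], &seen, sizeof seen);
    close(fds[0]);

    int status;
    if (waitpid(child, &status, 0) != child || r != (ssize_t)sizeof seen)
        return -1;
    return seen == child;
}
```

A libc that cached a stale or mismatched pid in the TCB would make this
return 0.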

> All kernel functions already do accept pid and tid interchangeably.
> 
> You can already have several different identifiers for the same task.

There is nothing wrong with the kernel _accepting_ different
identifiers for the same task. However standard functions which
_return_ the pid must return consistent values. The change you
proposed for the kernel would break this requirement in existing
static binaries and when using old shared libc versions from before
the kernel change. As far as I know, this goes against kernel
interface stability policy.


* [Bug libc/14829] sched_* functions wrongly alter thread scheduling, rather than process
From: fweimer at redhat dot com @ 2014-06-14 11:11 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=14829

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |security-


* [Bug libc/14829] sched_* functions wrongly alter thread scheduling, rather than process
From: fweimer at redhat dot com @ 2014-06-17 12:47 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=14829

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fweimer at redhat dot com

--- Comment #7 from Florian Weimer <fweimer at redhat dot com> ---
I hate to ask this, but could we use the SETXID mechanism to paper over this
missing kernel feature?


* [Bug libc/14829] sched_* functions wrongly alter thread scheduling, rather than process
From: bugdal at aerifal dot cx @ 2014-06-17 13:54 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=14829

--- Comment #8 from Rich Felker <bugdal at aerifal dot cx> ---
No. For at least a couple reasons:

1. SETXID is for a process acting on itself. sched_* can target any PID. Having
glibc accept SETXID signals from other processes would impose some serious
compatibility issues: the sender and receiver would have to have matching ideas
of the protocol. The locking to get it right would also be difficult or
impossible, and there may be permissions issues about sending the signals.

2. Changing the scheduling parameters of a process is not the same thing as
changing the parameters for each thread of the process. Per POSIX, the sched_*
functions are specified to be almost complete NOPs, aside from setting and
retrieving values and checking permissions, etc., on systems that do not
support a process scheduling scope (Linux is such a system). So making them
change all threads would just replace one wrong behavior with another.
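
As a purely hypothetical sketch of the "store and retrieve, but do not apply"
reading in point 2, restricted to the calling process: the model_* names and
the static storage are invented for this illustration and are not glibc
interfaces. Validation borrows the system's priority ranges and assumes, as on
Linux, that they are non-negative; permission checks are omitted.

```c
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical process-level attributes: recorded, never applied to threads. */
static int proc_policy = SCHED_OTHER;
static struct sched_param proc_param;   /* zero-initialized */

/* Validate and record the process-level policy for the calling process.
   Returns the previously recorded policy, or -1 with errno set. */
int model_sched_setscheduler(pid_t pid, int policy, const struct sched_param *param)
{
    if (pid != 0 && pid != getpid()) {
        errno = ESRCH;              /* the model knows only its own process */
        return -1;
    }
    if (param == NULL) {
        errno = EINVAL;
        return -1;
    }

    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);
    if (lo < 0 || hi < 0 ||
        param->sched_priority < lo || param->sched_priority > hi) {
        errno = EINVAL;
        return -1;
    }

    int prev = proc_policy;
    proc_policy = policy;
    proc_param = *param;
    return prev;
}

/* Return the recorded process-level policy, or -1 with errno set. */
int model_sched_getscheduler(pid_t pid)
{
    if (pid != 0 && pid != getpid()) {
        errno = ESRCH;
        return -1;
    }
    return proc_policy;
}
```

In this model a recorded SCHED_FIFO request is visible through
model_sched_getscheduler() while the real per-thread policy reported by
sched_getscheduler(0) stays untouched.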

