public inbox for libc-alpha@sourceware.org
From: Daniel Colascione <dancol@dancol.org>
To: Zack Weinberg <zackw@panix.com>
Cc: GNU C Library <libc-alpha@sourceware.org>
Subject: Re: [RFC] Toward Shareable POSIX Signals
Date: Mon, 12 Mar 2018 19:47:00 -0000	[thread overview]
Message-ID: <958db2bb-18a8-b46c-e098-5c388268e620@dancol.org> (raw)
In-Reply-To: <CAKCAbMiK3sOq2xKGDw96abhPYx99NGjXb1Fh86G_vsWo6VD73A@mail.gmail.com>

On 03/12/2018 08:17 AM, Zack Weinberg wrote:
> On Sun, Mar 11, 2018 at 2:56 PM, Daniel Colascione <dancol@dancol.org> wrote:
>> On 03/11/2018 11:07 AM, Zack Weinberg wrote:
>>
>>> However, along with most of the other posters in
>>> this thread, I don't like the proposed solution -- and not just
>>> because I don't like signals (although, indeed, I do not like signals)
>>> but because I think the basic mechanism you suggest, chained handlers,
>>> is inherently unreliable and will cause more problems than it solves.
>>
>> Why? You're ignoring the present reality that people _already_ use chained
>> signal handlers. They're not going to stop. When libc maintainers reject
>> widespread use cases as illegitimate, all they're doing is forestalling any
>> sort of improvement.
> 
> The C library has to be extremely conservative about adding new APIs,
> because we are, to first order, stuck with anything we add _forever_.

> In particular, we will _not_ accept "people are doing X now" as a
> valid argument for codifying X as part of the C library,

It's very curious to see an argument that suggests what users actually 
do shouldn't matter.

"People are ruining the lawn", the groundskeeper explained to the 
administrator. "Here's a proposal for some concrete walkways. Should be 
done by next week."

"No", the administrator replied. "We wouldn't want to encourage walking. 
It's dangerous." He paused a moment, then continued wistfully. "Flying 
is much better and safer. How many birds with broken ankles do you see?"

"...", the groundskeeper exclaimed internally.

> especially
> not when we can come up with an alternative with fewer problems.  Yes,
> the alternative might mean that the people doing X now have to do
> something else instead.  But switching from libsigchain (for instance)
> to libsigchain-codified-in-the-C-library is _also_ a code change for
> the people doing X now.  It might be a _smaller_ code change than what
> they would have to do to adopt the alternative, but we don't care
> about that.  We care, instead, about whether the alternative makes it
> easier in the long run to write reliable code.

You're missing the point. There is no universe where people stop using 
traps for null pointer checks. You are not going to convince people who 
care about performance to waste cycles doing in software what hardware 
will provide for free. Nobody will ever use a ptrace monitor to 
implement mandatory parts of a language specification either, especially 
not while sigaction still exists. POSIX already gave the world 
sigaction. That's _never_ going away.

There are two possible future worlds:

1) People continue to use awful sigaction-based hacks to exercise the 
signal mechanism, then screw it up and hurt users.

2) libc provides a mechanism that's at least as good as sigaction.

There is no world where people spawn an entire process (which can be 
killed, SIGSTOPed, ptraced, OOM-killed, placed in a background cgroup, 
or damaged in myriad other ways) and talk some complicated protocol with 
it when all they want to do is load a library, run some code, unload the 
library, and move on. That is a perfectly reasonable thing to want to 
do, and it shouldn't require any interaction with the system process 
table, raising RLIMIT_NPROC, or reaping children.

Windows has provided a suitable API similar to the one I propose for 20 
years without catastrophe. All the concerns you've raised about memory 
corruption also exist on that system, yet programs continue to run 
reliably. POSIX should be at least as good as Windows, don't you think?

Yes, yes, I know, existing practice isn't evidence in favor of any 
particular proposal. It is, though.

>> A C runtime needs to consider realistic proposals to address real
>> problems of real software --- not hold out instead for some
>> idealized alternative family of APIs that will never materialize.
> 
> This is unfair.  Some of the alternatives we have suggested already
> exist, and others are proposals at least as concrete as yours is.

Not for arbitrating access to shared signals, unless I missed something. 
The closest thing is userfaultfd, and I explained why it's not really 
suitable. It could be *made* suitable, but it'd be a massive effort, and 
I doubt other systems would adopt it. I want an API that could 
conceivably work on Darwin and Cygwin and QNX too.

>> I propose an API that ensures
>> that the right thing happens as long as everyone follows the rules, and
>> that's far better than the lawless waste that exists today.
> 
> Speaking only for myself here, "the right thing happens as long as
> everyone follows the rules" is _not good enough_ for an API codified
> as part of the C library. 

Practically the entire existing interface surface of libc doesn't meet 
this bar. printf can corrupt memory with %n. People can get confused and 
misuse malloc and free --- and in fact, do all the time. longjmp 
introduces all sorts of odd side effects.

Practically anything you do in this language at this level more 
complicated than sysconf(3) is going to be unsafe if you don't follow 
the rules, and that's a good thing! A sharp razor cuts well.

In the right hands, dangerous tools are incredibly useful. I'm not 
proposing an API for people who don't know what they're doing. Sure, 
there's an argument that C shouldn't exist. But I don't expect to 
encounter this argument when trying to improve the C standard library.

> It needs to be "the right thing happens for
> everyone who follows the rules, _even if_ other code in the same
> process is breaking the rules in ways that we reasonably anticipate
> will happen."

At this level, we're talking about individual bytes. There is no 
absolute safety possible within a single process, nor should there be. 
Safety is the job of higher level components to provide, and it's the 
job of lower-level, slower-moving components to provide a suitable 
foundation for these components. In the area of signal arbitration, this 
foundation is lacking.

> For instance: Chained handlers for SIGCHLD are not good enough,
> because we reasonably anticipate that some handlers will -- not out of
> malice, just out of lack of foresight -- swallow notifications that
> were properly intended for other handlers.

What if someone were to call _exit from one of these handlers? What if 
someone accidentally infloops? Lots of things can go wrong.

> pdfork, on the other hand,
> _is_ good enough, because the holder of a process handle is the only
> code to receive a notification for that process, regardless of what
> other code waiting for unrelated processes might be doing.

What if some other component close(2)s the pdfork file descriptor? After 
all, it's not unheard of for people to erroneously retry close(2) and 
cause collateral damage. I don't see how file descriptors get a pass 
from the "safe even if people don't follow the rules" criterion.

>>> I also think you haven't gone deep enough into the root cause of the
>>> problem you're trying to solve.  You set out to make it possible to
>>> have more than one signal handler per process for each signal, but
>>> _why_ is that an undesirable limitation?  In most cases, it's because
>>> _signals are too coarse_.  When you get a SIGCHLD or a SIGIO or a
>>> SIGSEGV, you don't know which of many possible child processes / file
>>> descriptors / memory addresses is relevant.
>>
>> This claim is technically incorrect. The siginfo structure passed to
>> the sigaction handler (and, in my proposal, to registered handlers)
>> provides the necessary specificity.
> 
> You misunderstand me.  The problem is not that the handler(s) don't
> have enough information to figure out whether the specific event is
> relevant to them; the problem is that the specific event is not
> delivered directly to the specific handler that cares about it, and
> nobody else.

The most flexible and succinct way to determine which handler should 
exclusively claim a particular fault is to ask each handler in turn. 
Requiring table registration would be both brittle and inefficient, 
since the tables would likely duplicate code-lookup structures that 
language environments already provide.

> This is why I'm sort-of OK with chained handlers for events that
> really are broadcast in nature, such as SIGPWR and SIGTERM.  However,
> having thought about it some more, I don't want it to be expressed in
> the API as chaining, because chaining implies an order, and that's a
> problem in itself.  I want it to be expressed as _independent_
> handlers, and by "handlers" I mean "file descriptors" to the maximum
> extent possible, e.g. open("/dev/power_failure_notify", O_RDONLY)
> gives you a file descriptor that will become readable at the same time
> SIGPWR is fired.
> 
>> You still need some kind of catch-all mechanism in case no specific
>> handler is applicable.
> 
> NO WE DON'T.

YES WE DO.

In-process crash reporting is another one of those things that might be 
ugly, but that is never going away. Besides, what if one of your specific 
handlers *can't* handle a particular fault? Should it just infloop until 
the process dies? Raise a different signal?

There must be a way for a handler to throw its hands up in the air and 
say, "I don't know what to do with this signal. It's not mine. Do 
whatever would happen if I weren't here at all," because this situation 
_will_ arise, and this approach is the least disruptive option.

If my runtime SIGSEGVs trying to dereference NULL, I know what to do. If 
I catch it trying to dereference 0xDEADBEEF, I probably want to treat it 
like a crash. I want to see the process die with SIGSEGV, not see it 
_exit(1) because the except_table pointer for that PC didn't know what 
else to do.

> Catch-alls are bad, OK.  They suffer intrinsically from the same
> problem you are trying to solve -- "what if two pieces of code want
> to be the catch-all?"

They chain to each other. One eventually invokes the SIG_DFL handler and 
the process dies.

> I don't even like your SA_LOW_PRIORITY, because, again, what if two
> pieces of code want to be the last to receive the notification?  You
> can't honor both requests, so you mustn't even offer the possibility
> in the first place.

I can accept the argument that for both SA_LOW_PRIORITY and for 
asynchronous signals, all handlers get called no matter what. We still 
need a way for synchronous signals to exclusively claim the right to 
handle a particular delivery of a particular signal to a particular 
thread, because that's necessary for correctness. I can budge on other 
signals.

> Instead, what I ideally want is for us to decompose all coarse events
> until there is one and only one handler for each specific event, and
> then figure out some way to map all of the specific events into file
> descriptor notifications that can be fielded via select() or epoll().
> If an event is legitimately a broadcast event, like SIGPWR, then we
> make it possible for there to be multiple _independent_ -- not
> chained; no ordering -- listeners.
> 
>> I would prefer process handle file descriptors. Linux upstream has
>> specifically rejected process handle file descriptors on several
>> occasions.
> 
> This is news to me; could you please dig up pointers to specific
> objections by people with veto authority?  I thought it had just been
> neglected.

I can't find the message now. I hope I'm not just imagining it. But I 
distinctly recall reading Linus (I think it was Linus) arguing that a 
file descriptor that would keep a zombie process alive was a terrible, 
bad, no-good thing because it would allow anyone to consume all the 
process table entries on the system.

To be clear, my first preferred option is a facility that represents 
processes as file descriptors --- preferably _arbitrary_ processes, not 
just direct children as with pdfork. The argument I'm recalling (and 
perhaps mis-recollecting) is, IMHO, bogus.

> ...
>> ALRM, VTALRM, PROF --- as a completely separate matter, additional
>> arbitration for coordinating timer deadlines would be useful.
> 
> Yeah.  I haven't had to do anything complicated with timers in C
> myself, so I'm not sure what would be ideal as a C API, but
> timer_create seems more like the Right Thing than setitimer does.
> 
> There is an additional headache in that SIGALRM or SIGVTALRM + a
> non-SA_RESTART handler are still sometimes the only way to impose a
> timeout on a blocking system call.  Abstractly, all such system calls
> need to grow extended versions that take timeouts

Do they? It's more elegant for timeout and logic to be orthogonal than 
for each system call to gain a parameter for every bit of behavior you 
might want to associate with that call. There's parsimony in having one way to 
arrange a timeout for any system call. Something like SO_RCVTIMEO seems 
better, IMHO.

> but that's a large
> and mostly independent project, and there's still an issue with
> blocking system calls happening inside a library you don't control.

SIGALRM isn't a great way to impose a constraint on a library you don't 
control anyway. All you can do is make a call fail with EINTR. The 
library you don't control will probably retry on EINTR. You could 
longjmp out of your SIGALRM handler, but that'll probably break some 
invariants in that library you don't control.

>> I understand your motivation for ensuring all such handlers are
>> called for these broadcast signals. I think API uniformity matters
>> more than ensuring that all handlers are called, especially since
>> I'm certain that we need multi-handler support for synchronous
>> signals as well as asynchronous ones, and synchronous signals need
>> to be cancelable.
> 
> To me it's exactly the other way around: if we can't ensure that all
> handlers are called, then the design problem has not yet been solved;
> if the API needs to be non-uniform in order to fit the design
> requirements, then so be it.

I would accept an API that imposed a run-all requirement on most signals 
and that limited exclusive signal claims to synchronously-delivered 
trapping signals.

I agree with part of your earlier message that the signal mechanism 
basically conflates three very distinct APIs: 1) process management 
system calls, 2) process-wide notifications, and 3) poor man's SEH. It's 
#3 that most concerns me.

> I don't understand what you mean by "synchronous signals need to be
> cancelable."

A handler for a synchronous signal needs to be able to exclusively claim 
a particular signal and prevent both other handlers and any catch-all 
handlers (they _will_ exist) from running and misinterpreting that signal.

>>> ILL, ABRT, FPE, SEGV, BUS, SYS, TRAP, IOT, EMT, STKFLT -- Synchronous
>>> signals arising from processor faults deserve a specialized mechanism
>>> all their own.  The notion I currently like, at the kernel level, is
>>> just-in-time instantiation of a ptrace monitor
>>
>> That approach doesn't solve the arbitration issue and would make
>> performance significantly worse than present. Not every instance of
>> one of these synchronous signals is a crash. Spawning a process to
>> handle them is far too expensive (and unreliable!) for something
>> like a Java runtime's null pointer checks.
> 
> As discussed elsethread, I currently agree with Rich Felker that
> Java's null pointer checks are better implemented with explicit tests
> emitted by the JIT; not by taking a fault and then fixing up
> afterward.  Same for persistent object stores and incremental GC; use
> compiler-generated write barriers, not page faults.

All major runtime authors disagree with you. Just look at the code. I 
spoke to some colleagues last week, and they indicate that it's a win on 
code size too.

As I mentioned above, you don't have to accept the technical merit of 
trapping. All that's necessary to see the necessity of my proposal is 
understanding that people will use sigaction if no alternative presents 
itself and that sigaction is not library-safe.

> I _could_ be convinced otherwise, but what it would take is a
> head-to-head performance comparison between a JIT that relies on page
> faults and a JIT that relies on explicit tests and implements
> state-of-the-art elimination of unnecessary tests, all else held
> equal, on a real application.

That's an impossible bar. I could insert artificial NULL checks in ART, 
but there's no way I could convince you I'd done all I could do to 
eliminate extraneous checks. No matter how many checks I eliminated, as 
long as I demonstrated an adverse impact on speed and code size, you 
could claim that there was some algorithmic fruit left unpicked. I'm not 
doing that. The universal use of traps by high-performance managed code 
runtime authors should be evidence enough.

> In the absence of that comparison, for synchronous faults I'm really
> only interested in making crash recovery more reliable. 

That is a deeply disappointing stance.

> That needs to
> happen from outside the corrupted address space, and it's OK if it
> takes a slow path.
> 
> You're right that "instantiate a ptrace monitor just-in-time" still
> has an arbitration problem _at the kernel level_. 

You also haven't addressed the reliability issue. Reliability is 
essential when we're talking about code implementing a mandatory aspect 
of a language specification.

> I imagine the
> arbitration - via exception tables or whatever - happening _inside_
> the monitor.  The C library might provide a "shell" ptrace monitor
> that could be extended with application-specific modules.

Think of all the problems we have with NSS and PAM. Now add page faults. 
Is that a good world?

> Note also that we wouldn't spawn a fresh instance of the monitor for
> every fault.  Once it's running, it would stay running and remain
> attached to the process.  If the process was already being ptraced by
> a full debugger, the monitor would not be involved.

So the debugger, for correctness, would _also_ have to implement the 
trap-arbitration protocol? Would strace? rr? That's an unreasonable 
demand, especially given that such a complicated mechanism would bring 
no engineering benefit.

> (This gets tricky
> when you want to debug the monitor, but not worse than when you want
> to debug a debugger.)

Debugging a debugger is no harder than debugging other programs, in my 
experience.

>> While I would approve of adding SEH, I don't think it's a realistic option
>> at the moment.
> 
> Agreed that it is too much of a coordination challenge to add SEH; also,
> since it relies on dynamic information on the stack, it's not safe in
> the face of adversarial memory corruption.

Nothing is safe in the face of memory corruption. Demanding perfect 
safety in an unsafe world is tantamount to blocking all progress.

>> An except_table approach might solve part of the problem, but you'd need to
>> provide a dynamic registration facility for the sake of JIT systems.
> 
> Yeah.  But you don't want the JIT-generated code to be able to access
> the registrar.  Here, perhaps the right thing is for the JIT to invoke
> its sandboxed untrusted-code subprocess already under its own ptrace
> monitoring.

So now we're talking about _multiple_ fragile external processes. That's 
unacceptable.

>> Also, think of how a table lookup would work at a mechanical level. The
>> kernel would still push a SIGSEGV frame onto some stack and transfer control
>> flow to the handler.
> 
> No. The kernel would wake up the monitor process sleeping in ptrace
> (or perhaps select() on the process handle) or instantiate one if it
> doesn't already exist.

It can't guarantee successful instantiation.

> "For code using the new API, we NEVER need to interrupt normal control
> flow and push a signal frame" is also on my list of constraints that
> must be satisfied for the design to be complete and acceptable.

That is an utterly unrealistic stance. You _have_ to interrupt control 
flow. That's the whole point. And you have to be able to respond to that 
interruption from _inside_ the process whose thread is being interrupted 
for reasons I've already discussed. At this point, you have 
async-signal-safety concerns whether or not the precise mechanism is 
frame-pushing or message delivery, so what's the point of bothering with 
the charade of sending a message?

The ability to handle synchronous signals in-process is one of the 
constraints that any acceptable and complete system must have. 
Fortunately, we have sigaction, for which this thread has taught me to 
be increasingly thankful.

