From: Janne Blomqvist
To: "N.M. Maclaren"
Cc: Fortran List, GCC Patches
Date: Mon, 09 May 2011 02:31:00 -0000
Subject: Re: [Patch, libfortran] Thread safety and simplification of error printing

On Sun, May 8, 2011 at 16:42, N.M. Maclaren wrote:
> On May 8 2011, Janne Blomqvist wrote:
>>>
>>> the error printing functionality (in io/unix.c) st_printf and
>>> st_vprintf are not thread-safe as they use a static buffer. ...
>>
>> While this patch makes error printing thread-safe, it's no longer
>> async-signal-safe as the stderr lock might lead to a deadlock. So I'm
>> retracting this patch and thinking some more about this problem.
>
> It's theoretically insoluble, given the constraints you are working
> under.  Sorry.
> It is possible to do reasonably well, but there will
> always be likely scenarios where all you can do is to say "Aargh!
> I give up."

Well, I realize perfection is impossible, so I'm settling for merely
improving the status quo!

> Both I and the VMS people adopted the ratchet design.  You have N
> levels of error recovery, each level allocates all of the resources
> it needs before startup, and any exception during level K increases
> the level to K+1 and calls the level K+1 error handler.  When you
> have an exception at level N, you just die.

To some extent we have a crude version of this, in that when we're
entering many of the fatal error handling functions we do a recursion
check and if that fails, die. Also, in a recent patch of mine
(http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00584.html) the fatal
signal handler function has been reworked to hopefully deal better
with other signal(s) being delivered before it's done; that code is
modeled after an example in the glibc manual, and I'm a bit unsure if
the recursion check thingy really works or we just end up in an
infinite recursion (that is, do we need to re-set to the default
handler before re-raising? I have a vague memory that the signal
handler for SIGXXX must finish before starting the handler for another
SIGXXX pending signal, which would make the current version safe).

> That imposes the constraint that all diagnostics have a fixed upper
> bound on the resources they need (not just buffer space, but that's
> the main one).  It's a real bummer when the system has some critical
> resources that you can't reserve, so you have to treat an allocation
> failure as an exception, but buffer space is not one such.
>
> That also tackles the thread problem, not very satisfactorily.  If a
> resource needs to be locked, you can try to get it for a bit, and
> then raise a higher exception if you can't.
> And, typically, one or
> more of the highest levels are for closing down the process, and
> simply suspend any subsequent threads that call them (i.e. just leave
> them waiting for a lock they won't get).

I think in our case the situation is a bit easier in that we're not
trying to recover from a serious failure, merely print some diagnostic
information without getting stuck in a deadlock.

-- 
Janne Blomqvist
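
For illustration only, here is a rough single-threaded sketch of the
ratchet design described in the quoted text. It is not code from
libgfortran or from the patch under discussion. All names
(ratchet_report, ratchet_level, the handler_* functions, the level count
and the buffer size) are hypothetical, and an "exception" at a level is
modeled simply as the handler returning a negative value. Each level has
its own statically reserved buffer, so no allocation happens once an
error is being reported, and output goes through write(), which POSIX
lists as async-signal-safe.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

enum { RATCHET_LEVELS = 3, RATCHET_BUFSZ = 128 };   /* hypothetical N and size */

/* A handler fills BUF and returns the number of bytes to print,
   or -1 if it hit an "exception" (assumed failure model).  */
typedef int (*ratchet_handler) (char *buf, size_t len, const char *msg);

/* All buffer space is reserved up front, one buffer per level.  */
static char ratchet_buf[RATCHET_LEVELS][RATCHET_BUFSZ];

/* Current level; it only ever moves upwards.  */
static int ratchet_level = 0;

/* Level 0: print the whole message; fail if it does not fit.  */
static int
handler_full (char *buf, size_t len, const char *msg)
{
  size_t n = strlen (msg);
  if (n + 1 > len)
    return -1;
  memcpy (buf, msg, n + 1);
  return (int) n;
}

/* Level 1: print as much of the message as fits.  */
static int
handler_truncated (char *buf, size_t len, const char *msg)
{
  size_t n = strlen (msg);
  if (n > len - 1)
    n = len - 1;
  memcpy (buf, msg, n);
  buf[n] = '\0';
  return (int) n;
}

/* Level 2: ignore the message and print a fixed text.  */
static int
handler_fixed (char *buf, size_t len, const char *msg)
{
  static const char text[] = "fatal error\n";
  (void) msg;
  if (sizeof text > len)
    return -1;
  memcpy (buf, text, sizeof text);
  return (int) (sizeof text - 1);
}

/* Run the handler for the current level.  A failure at level K raises
   the level to K+1 and tries that handler instead.  Returns the level
   that succeeded, or -1 once the last level has failed too.  */
static int
ratchet_report (const ratchet_handler handlers[RATCHET_LEVELS],
                const char *msg)
{
  while (ratchet_level < RATCHET_LEVELS)
    {
      int k = ratchet_level;
      int n = handlers[k] (ratchet_buf[k], RATCHET_BUFSZ, msg);
      if (n >= 0)
        {
          ssize_t w = write (STDERR_FILENO, ratchet_buf[k], (size_t) n);
          (void) w;               /* nothing sensible to do on a short write */
          return k;
        }
      ratchet_level = k + 1;      /* ratchet up; never go back down */
    }
  return -1;
}
```

A caller would pass an array such as { handler_full, handler_truncated,
handler_fixed } and, when ratchet_report returns -1, give up with
something like _exit (2). The sketch leaves out the locking and
signal re-entry questions raised in this thread; a real version would
at least need ratchet_level to be safe to update from a signal handler.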