From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-291453-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 1843 invoked by alias); 8 May 2011 18:57:07 -0000
Received: (qmail 1824 invoked by uid 22791); 8 May 2011 18:57:06 -0000
X-SWARE-Spam-Status: No, hits=-2.5 required=5.0	tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,RFC_ABUSE_POST
X-Spam-Check-By: sourceware.org
Received: from mail-pv0-f175.google.com (HELO mail-pv0-f175.google.com) (74.125.83.175)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Sun, 08 May 2011 18:56:51 +0000
Received: by pvc30 with SMTP id 30so2435768pvc.20        for <multiple recipients>; Sun, 08 May 2011 11:56:51 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.68.11.228 with SMTP id t4mr8946341pbb.294.1304881010904; Sun, 08 May 2011 11:56:50 -0700 (PDT)
Received: by 10.68.42.197 with HTTP; Sun, 8 May 2011 11:56:50 -0700 (PDT)
In-Reply-To: <Prayer.1.3.3.1105081442370.28066@hermes-2.csi.cam.ac.uk>
References: <BANLkTim5TtOYWtxnzhHLNwKqE-mpwG9pJg@mail.gmail.com>	<BANLkTinQQNAb4h78JM_ZvByUPCEx60pa3g@mail.gmail.com>	<Prayer.1.3.3.1105081442370.28066@hermes-2.csi.cam.ac.uk>
Date: Mon, 09 May 2011 02:31:00 -0000
Message-ID: <BANLkTi=iW_tg46dp8z8JnHd=PnKsVrrOcg@mail.gmail.com>
Subject: Re: [Patch, libfortran] Thread safety and simplification of error printing
From: Janne Blomqvist <blomqvist.janne@gmail.com>
To: "N.M. Maclaren" <nmm1@cam.ac.uk>
Cc: Fortran List <fortran@gcc.gnu.org>, GCC Patches <gcc-patches@gcc.gnu.org>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2011-05/txt/msg00624.txt.bz2

On Sun, May 8, 2011 at 16:42, N.M. Maclaren <nmm1@cam.ac.uk> wrote:
> On May 8 2011, Janne Blomqvist wrote:
>>>
>>> the error printing functionality (in io/unix.c) st_printf and
>>> st_vprintf are not thread-safe as they use a static buffer. ...
>>
>> While this patch makes error printing thread-safe, it's no longer
>> async-signal-safe as the stderr lock might lead to a deadlock. So I'm
>> retracting this patch and thinking some more about this problem.
>
> It's theoretically insoluble, given the constraints you are working
> under.  Sorry.  It is possible to do reasonably well, but there will
> always be likely scenarios where all you can do is to say "Aargh!
> I give up."

Well, I realize perfection is impossible, so I'm settling for merely
improving the status quo!

> Both I and the VMS people adopted the ratchet design.  You have N
> levels of error recovery, each level allocates all of the resources
> it needs before startup, and any exception during level K increases
> the level to K+1 and calls the level K+1 error handler.  When you
> have an exception at level N, you just die.
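
For concreteness, the ratchet scheme quoted above might be sketched in C
roughly as follows. This is only an illustration under assumptions: the
names MAX_LEVEL, ratchet_handlers, ratchet_step and ratchet_raise are
hypothetical and come neither from libgfortran nor from VMS, and each
handler is assumed to use only resources reserved before startup.

```c
#include <stdlib.h>
#include <unistd.h>

#define MAX_LEVEL 3              /* N: hypothetical number of levels */

typedef void (*ratchet_handler)(void);

/* Handlers for levels 1..N; each is assumed to touch only resources
   that were reserved before startup. */
ratchet_handler ratchet_handlers[MAX_LEVEL + 1];

/* Current level. It only ever increases, hence "ratchet". */
int ratchet_level = 0;

/* Move to the next level and run its handler. Returns the new level,
   or -1 if we were already at level N and nothing is left to try. */
int ratchet_step(void)
{
    if (ratchet_level >= MAX_LEVEL)
        return -1;
    int level = ++ratchet_level;
    if (ratchet_handlers[level] != NULL)
        ratchet_handlers[level]();
    return level;
}

/* Entry point for any exception, including one raised from inside
   a handler: ratchet up, or die once level N has failed too. */
void ratchet_raise(void)
{
    if (ratchet_step() < 0)
        _exit(EXIT_FAILURE);
}
```

A handler that itself fails would call ratchet_raise() again, which is
what moves the process from level K to level K+1. This sketch is not
thread-safe; a real version would need an atomic level counter.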

To some extent we have a crude version of this, in that when we're
entering many of the fatal error handling functions we do a recursion
check and if that fails, die. Also, in a recent patch of mine
(http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00584.html ) the fatal
signal handler function has been reworked to hopefully deal better
with other signal(s) being delivered before it's done; that code is
modeled after an example in the glibc manual, and I'm a bit unsure if
the recursion check thingy really works or we just end up in an
infinite recursion (that is, do we need to re-set to the default
handler before re-raising? I have a vague memory that the signal
handler for SIGXXX must finish before starting the handler for another
SIGXXX pending signal, which would make the current version safe).

> That imposes the constraint that all diagnostics have a fixed upper
> bound on the resources they need (not just buffer space, but that's
> the main one).  It's a real bummer when the system has some critical
> resources that you can't reserve, so you have to treat an allocation
> failure as an exception, but buffer space is not one such.
>
> That also tackles the thread problem, not very satisfactorily.  If a
> resource needs to be locked, you can try to get it for a bit, and
> then raise a higher exception if you can't.  And, typically, one or
> more of the highest levels are for closing down the process, and
> simply suspend any subsequent threads that call them (i.e. just leave
> them waiting for a lock they won't get).
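
The "try to get it for a bit" step quoted above could look something
like the sketch below. The function name lock_with_bound, the retry
count and the 1 ms pause are all hypothetical choices for illustration.
Note that the pthread mutex functions are not async-signal-safe, so
this only addresses the threading side, not signal handlers.

```c
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Try to take `lock` at most `attempts` times, pausing 1 ms between
   tries. Returns 0 once the lock is held, or -1 if we gave up, in
   which case the caller would escalate to the next error level. */
int lock_with_bound(pthread_mutex_t *lock, int attempts)
{
    const struct timespec pause = { 0, 1000000L };  /* 1 ms */

    for (int i = 0; i < attempts; i++) {
        int rc = pthread_mutex_trylock(lock);
        if (rc == 0)
            return 0;            /* got the lock */
        if (rc != EBUSY)
            return -1;           /* unexpected error: give up now */
        nanosleep(&pause, NULL); /* held by someone else: wait a bit */
    }
    return -1;                   /* bounded wait exhausted */
}
```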

I think in our case the situation is a bit easier in that we're not
trying to recover from a serious failure, merely print some diagnostic
information without getting stuck in a deadlock.
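
As a rough sketch of what printing without a static buffer and without
the stderr lock might look like: format into a per-call stack buffer
with a fixed upper bound, then hand the bytes straight to write(). The
names DIAG_BUFSZ, write_all and diag_printf are hypothetical, not the
functions in io/unix.c. One caveat: vsnprintf() is not on the POSIX
list of async-signal-safe functions, so this is an improvement over a
shared static buffer rather than a guarantee.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

#define DIAG_BUFSZ 256   /* hypothetical fixed upper bound */

/* Write all of buf to fd, retrying on EINTR and short writes.
   Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t) n;
    }
    return 0;
}

/* Format into a per-call stack buffer (no static state, no stdio
   lock), truncating if needed. Returns bytes written, or -1. */
int diag_printf(int fd, const char *fmt, ...)
{
    char buf[DIAG_BUFSZ];
    va_list ap;

    va_start(ap, fmt);
    int n = vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);

    if (n < 0)
        return -1;
    if ((size_t) n >= sizeof buf)
        n = (int) sizeof buf - 1;   /* output was truncated */
    return write_all(fd, buf, (size_t) n) == 0 ? n : -1;
}
```

Since each call uses its own buffer, two threads printing at once can
interleave whole messages but cannot corrupt each other's text.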


-- 
Janne Blomqvist