From mboxrd@z Thu Jan  1 00:00:00 1970
From: FX <fxcoudert@gmail.com>
Message-Id: <2DE7796D-F64D-420B-BAD7-12725690CE56@gmail.com>
Content-Type: multipart/mixed;
 boundary="Apple-Mail=_F1F919B5-727B-44D7-BE16-2E7214C78183"
Mime-Version: 1.0 (Mac OS X Mail 15.0 \(3693.40.0.1.81\))
Subject: Re: [patch, Fortran] IEEE support for aarch64-apple-darwin
Date: Mon, 20 Dec 2021 00:50:48 +0100
In-Reply-To: <0c2d5c11-4d72-8808-4d83-77797b1d9bdd@netcologne.de>
Cc: fortran@gcc.gnu.org,
 gcc-patches@gcc.gnu.org
To: Thomas Koenig <tkoenig@netcologne.de>
References: <93D8CCF9-4230-4517-A993-A811092ADC4B@gmail.com>
 <0c2d5c11-4d72-8808-4d83-77797b1d9bdd@netcologne.de>
X-Mailer: Apple Mail (2.3693.40.0.1.81)
X-BeenThere: fortran@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Fortran mailing list <fortran.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/fortran>,
 <mailto:fortran-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/fortran/>
List-Help: <mailto:fortran-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/fortran>,
 <mailto:fortran-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Sun, 19 Dec 2021 23:50:51 -0000


--Apple-Mail=_F1F919B5-727B-44D7-BE16-2E7214C78183
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=us-ascii

Hi Thomas,

> OK, and thanks for the patch!

Thanks for the review. I committed a slightly amended patch, with =
underflow control support added, as =
220b9bdfe8faebdd2aea0ab7cea81c162d42d8e0.

FX


--Apple-Mail=_F1F919B5-727B-44D7-BE16-2E7214C78183
Content-Disposition: attachment;
	filename=ieee.patch
Content-Type: application/octet-stream;
	x-unix-mode=0644;
	name="ieee.patch"
Content-Transfer-Encoding: 7bit

commit 220b9bdfe8faebdd2aea0ab7cea81c162d42d8e0
Author: Francois-Xavier Coudert <fxcoudert@gmail.com>
Date:   2021-12-20 00:45:31 +0100

    Fortran: add support for IEEE intrinsics on aarch64 non-glibc targets
    
    This enables IEEE support on the upcoming aarch64-apple-darwin target,
    and has been tested for some time in an external port.
    
    libgfortran/ChangeLog:
    
            * configure.host: Add aarch64-apple-darwin support.
            * config/fpu-aarch64.h: New file.

diff --git a/libgfortran/config/fpu-aarch64.h b/libgfortran/config/fpu-aarch64.h
new file mode 100644
index 00000000000..0746f42938a
--- /dev/null
+++ b/libgfortran/config/fpu-aarch64.h
@@ -0,0 +1,331 @@
+/* FPU-related code for aarch64.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   Contributed by Francois-Xavier Coudert <fxcoudert@gcc.gnu.org>
+
+This file is part of the GNU Fortran runtime library (libgfortran).
+
+Libgfortran is free software; you can redistribute it and/or
+modify it under the terms of the GNU General Public
+License as published by the Free Software Foundation; either
+version 3 of the License, or (at your option) any later version.
+
+Libgfortran is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+
+/* Rounding mask and modes */
+
+#define FPCR_RM_MASK  0x0c00000
+#define FE_TONEAREST  0x0000000
+#define FE_UPWARD     0x0400000
+#define FE_DOWNWARD   0x0800000
+#define FE_TOWARDZERO 0x0c00000
+#define FE_MAP_FZ     0x1000000
+
+/* Exceptions */
+
+#define FE_INVALID	1
+#define FE_DIVBYZERO	2
+#define FE_OVERFLOW	4
+#define FE_UNDERFLOW	8
+#define FE_INEXACT	16
+
+#define FE_ALL_EXCEPT (FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW | FE_INEXACT)
+#define FE_EXCEPT_SHIFT	8
+
+
+
+/* This structure holds the contents of the FPCR and FPSR registers,
+   as saved by get_fpu_state and restored by set_fpu_state.  */
+struct fenv
+{
+  unsigned int __fpcr;
+  unsigned int __fpsr;
+};
+
+/* Check we can actually store the FPU state in the allocated size.  */
+_Static_assert (sizeof(struct fenv) <= (size_t) GFC_FPE_STATE_BUFFER_SIZE,
+		"GFC_FPE_STATE_BUFFER_SIZE is too small");
+
+
+
+void
+set_fpu (void)
+{
+  if (options.fpe & GFC_FPE_DENORMAL)
+    estr_write ("Fortran runtime warning: Floating point 'denormal operand' "
+	        "exception not supported.\n");
+
+  set_fpu_trap_exceptions (options.fpe, 0);
+}
+
+
+int
+get_fpu_trap_exceptions (void)
+{
+  unsigned int fpcr, exceptions;
+  int res = 0;
+
+  fpcr = __builtin_aarch64_get_fpcr();
+  exceptions = (fpcr >> FE_EXCEPT_SHIFT) & FE_ALL_EXCEPT;
+
+  if (exceptions & FE_INVALID) res |= GFC_FPE_INVALID;
+  if (exceptions & FE_DIVBYZERO) res |= GFC_FPE_ZERO;
+  if (exceptions & FE_OVERFLOW) res |= GFC_FPE_OVERFLOW;
+  if (exceptions & FE_UNDERFLOW) res |= GFC_FPE_UNDERFLOW;
+  if (exceptions & FE_INEXACT) res |= GFC_FPE_INEXACT;
+
+  return res;
+}
+
+
+void set_fpu_trap_exceptions (int trap, int notrap)
+{
+  unsigned int mode_set = 0, mode_clr = 0;
+  unsigned int fpsr, fpsr_new;
+  unsigned int fpcr, fpcr_new;
+
+  if (trap & GFC_FPE_INVALID)
+    mode_set |= FE_INVALID;
+  if (notrap & GFC_FPE_INVALID)
+    mode_clr |= FE_INVALID;
+
+  if (trap & GFC_FPE_ZERO)
+    mode_set |= FE_DIVBYZERO;
+  if (notrap & GFC_FPE_ZERO)
+    mode_clr |= FE_DIVBYZERO;
+
+  if (trap & GFC_FPE_OVERFLOW)
+    mode_set |= FE_OVERFLOW;
+  if (notrap & GFC_FPE_OVERFLOW)
+    mode_clr |= FE_OVERFLOW;
+
+  if (trap & GFC_FPE_UNDERFLOW)
+    mode_set |= FE_UNDERFLOW;
+  if (notrap & GFC_FPE_UNDERFLOW)
+    mode_clr |= FE_UNDERFLOW;
+
+  if (trap & GFC_FPE_INEXACT)
+    mode_set |= FE_INEXACT;
+  if (notrap & GFC_FPE_INEXACT)
+    mode_clr |= FE_INEXACT;
+
+  /* Clear stale exception flags.  */
+  fpsr = __builtin_aarch64_get_fpsr();
+  fpsr_new = fpsr & ~FE_ALL_EXCEPT;
+  if (fpsr_new != fpsr)
+    __builtin_aarch64_set_fpsr(fpsr_new);
+
+  fpcr_new = fpcr = __builtin_aarch64_get_fpcr();
+  fpcr_new |= (mode_set << FE_EXCEPT_SHIFT);
+  fpcr_new &= ~(mode_clr << FE_EXCEPT_SHIFT);
+
+  if (fpcr_new != fpcr)
+    __builtin_aarch64_set_fpcr(fpcr_new);
+}
+
+
+int
+support_fpu_flag (int flag)
+{
+  if (flag & GFC_FPE_DENORMAL)
+    return 0;
+
+  return 1;
+}
+
+
+int
+support_fpu_trap (int flag)
+{
+  if (flag & GFC_FPE_DENORMAL)
+    return 0;
+
+  return 1;
+}
+
+
+int
+get_fpu_except_flags (void)
+{
+  int result;
+  unsigned int fpsr;
+
+  result = 0;
+  fpsr = __builtin_aarch64_get_fpsr() & FE_ALL_EXCEPT;
+
+  if (fpsr & FE_INVALID)
+    result |= GFC_FPE_INVALID;
+  if (fpsr & FE_DIVBYZERO)
+    result |= GFC_FPE_ZERO;
+  if (fpsr & FE_OVERFLOW)
+    result |= GFC_FPE_OVERFLOW;
+  if (fpsr & FE_UNDERFLOW)
+    result |= GFC_FPE_UNDERFLOW;
+  if (fpsr & FE_INEXACT)
+    result |= GFC_FPE_INEXACT;
+
+  return result;
+}
+
+
+void
+set_fpu_except_flags (int set, int clear)
+{
+  unsigned int exc_set = 0, exc_clr = 0;
+  unsigned int fpsr, fpsr_new;
+
+  if (set & GFC_FPE_INVALID)
+    exc_set |= FE_INVALID;
+  else if (clear & GFC_FPE_INVALID)
+    exc_clr |= FE_INVALID;
+
+  if (set & GFC_FPE_ZERO)
+    exc_set |= FE_DIVBYZERO;
+  else if (clear & GFC_FPE_ZERO)
+    exc_clr |= FE_DIVBYZERO;
+
+  if (set & GFC_FPE_OVERFLOW)
+    exc_set |= FE_OVERFLOW;
+  else if (clear & GFC_FPE_OVERFLOW)
+    exc_clr |= FE_OVERFLOW;
+
+  if (set & GFC_FPE_UNDERFLOW)
+    exc_set |= FE_UNDERFLOW;
+  else if (clear & GFC_FPE_UNDERFLOW)
+    exc_clr |= FE_UNDERFLOW;
+
+  if (set & GFC_FPE_INEXACT)
+    exc_set |= FE_INEXACT;
+  else if (clear & GFC_FPE_INEXACT)
+    exc_clr |= FE_INEXACT;
+
+  fpsr_new = fpsr = __builtin_aarch64_get_fpsr();
+  fpsr_new &= ~exc_clr;
+  fpsr_new |= exc_set;
+
+  if (fpsr_new != fpsr)
+    __builtin_aarch64_set_fpsr(fpsr_new);
+}
+
+
+void
+get_fpu_state (void *state)
+{
+  struct fenv *envp = state;
+  envp->__fpcr = __builtin_aarch64_get_fpcr();
+  envp->__fpsr = __builtin_aarch64_get_fpsr();
+}
+
+
+void
+set_fpu_state (void *state)
+{
+  struct fenv *envp = state;
+  __builtin_aarch64_set_fpcr(envp->__fpcr);
+  __builtin_aarch64_set_fpsr(envp->__fpsr);
+}
+
+
+int
+get_fpu_rounding_mode (void)
+{
+  unsigned int fpcr = __builtin_aarch64_get_fpcr();
+  fpcr &= FPCR_RM_MASK;
+
+  switch (fpcr)
+    {
+      case FE_TONEAREST:
+        return GFC_FPE_TONEAREST;
+      case FE_UPWARD:
+        return GFC_FPE_UPWARD;
+      case FE_DOWNWARD:
+        return GFC_FPE_DOWNWARD;
+      case FE_TOWARDZERO:
+        return GFC_FPE_TOWARDZERO;
+      default:
+        return 0; /* Should be unreachable.  */
+    }
+}
+
+
+void
+set_fpu_rounding_mode (int round)
+{
+  unsigned int fpcr, round_mode;
+
+  switch (round)
+    {
+    case GFC_FPE_TONEAREST:
+      round_mode = FE_TONEAREST;
+      break;
+    case GFC_FPE_UPWARD:
+      round_mode = FE_UPWARD;
+      break;
+    case GFC_FPE_DOWNWARD:
+      round_mode = FE_DOWNWARD;
+      break;
+    case GFC_FPE_TOWARDZERO:
+      round_mode = FE_TOWARDZERO;
+      break;
+    default:
+      return; /* Should be unreachable.  */
+    }
+
+  fpcr = __builtin_aarch64_get_fpcr();
+
+  /* Only set FPCR if requested mode is different from current.  */
+  round_mode = (fpcr ^ round_mode) & FPCR_RM_MASK;
+  if (round_mode != 0)
+    __builtin_aarch64_set_fpcr(fpcr ^ round_mode);
+}
+
+
+int
+support_fpu_rounding_mode (int mode __attribute__((unused)))
+{
+  return 1;
+}
+
+
+int
+support_fpu_underflow_control (int kind)
+{
+  /* Not supported for binary128.  */
+  return (kind == 4 || kind == 8) ? 1 : 0;
+}
+
+
+int
+get_fpu_underflow_mode (void)
+{
+  unsigned int fpcr = __builtin_aarch64_get_fpcr();
+
+  /* Return 0 for abrupt underflow (flush to zero), 1 for gradual underflow.  */
+  return (fpcr & FE_MAP_FZ) ? 0 : 1;
+}
+
+
+void
+set_fpu_underflow_mode (int gradual)
+{
+  unsigned int fpcr = __builtin_aarch64_get_fpcr();
+
+  if (gradual)
+    fpcr &= ~FE_MAP_FZ;
+  else
+    fpcr |= FE_MAP_FZ;
+
+  __builtin_aarch64_set_fpcr(fpcr);
+}
diff --git a/libgfortran/configure.host b/libgfortran/configure.host
index e9d92c9d34d..3d6c2db7772 100644
--- a/libgfortran/configure.host
+++ b/libgfortran/configure.host
@@ -39,17 +39,29 @@ if test "x${have_feenableexcept}" = "xyes"; then
   ieee_support='yes'
 fi
 
-# x86 asm should be used instead of glibc, since glibc doesn't support
-# the x86 denormal exception.
 case "${host_cpu}" in
+
+  # x86 asm should be used instead of glibc, since glibc doesn't support
+  # the x86 denormal exception.
   i?86 | x86_64)
     if test "x${have_soft_float}" = "xyes"; then
       fpu_host='fpu-generic'
+      ieee_support='no'
     else
       fpu_host='fpu-387'
+      ieee_support='yes'
     fi
-    ieee_support='yes'
     ;;
+
+  # Use the FPCR/FPSR builtins (fpu-aarch64.h) on aarch64-darwin.
+  aarch64)
+    case "${host_os}" in
+      darwin*)
+        fpu_host='fpu-aarch64'
+        ieee_support='yes'
+        ;;
+    esac
+
 esac
 
 # Some targets require additional compiler options for NaN/Inf.

--Apple-Mail=_F1F919B5-727B-44D7-BE16-2E7214C78183--