From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d])
	by sourceware.org (Postfix) with ESMTPS id 6E6523858C60;
	Sun, 19 Dec 2021 23:50:50 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6E6523858C60
Received: by mail-wr1-x42d.google.com with SMTP id j9so16648106wrc.0;
	Sun, 19 Dec 2021 15:50:50 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20210112;
	h=x-gm-message-state:from:message-id:mime-version:subject:date
	 :in-reply-to:cc:to:references;
	bh=KgkZXEgRI/6/kD2eO9mj5hmMV0o/kJJct4Hflkavr2I=;
	b=x3fqA8hr5w+MPRfv1ttLQ0ZKP8EudDuVv8uJXR0u9Vk0/r72dbyExJmFt3SaDNh1Ll
	 LWvYKylnU6XRq5l4L10iRh4g/cN5MIYXklu6yxdA/WJviV2jEQZ++BAl75ZAlx+DaquY
	 oZYIf9G7sqqyYeuQdW6+E+N8osHzQEyK/c2JJci8Gt20jdFdLA/iaFBJFMxiyHHqDDUt
	 16B/S62bEvMv/mZYCyl4Tb8Z7iP+BUP0X9kVCCE0DQy2x3XthfQuJAMY4IxEwPJY85qF
	 jb3sVYPG8IvdNpOz23Ke6FX60G3jjRu1j60B8p5eXf+3mw9RJJSXsWhyjbZRXk9P+FiX
	 iqAw==
X-Gm-Message-State: AOAM5300q4/F4iGqnEJdzHVei63K55JzHXCSCJypHctHDKbHKHA77t5v
	a5/QMUByVzAknokbUIgDdXY=
X-Google-Smtp-Source: ABdhPJwEupzBKBdT/S/SRqtuO1DGQ28yr0qT8ZYfMPYubXe6lJ7KhhxUZGIPtTmav3zqSLj2aHBoCw==
X-Received: by 2002:a5d:5744:: with SMTP id q4mr10713107wrw.698.1639957849462;
	Sun, 19 Dec 2021 15:50:49 -0800 (PST)
Received: from smtpclient.apple ([2a01:e34:ec28:8cb0:c4a8:8a63:d59a:c5a5])
	by smtp.gmail.com with ESMTPSA id q1sm13944319wra.82.2021.12.19.15.50.48
	(version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
	Sun, 19 Dec 2021 15:50:48 -0800 (PST)
From: FX
Message-Id: <2DE7796D-F64D-420B-BAD7-12725690CE56@gmail.com>
Content-Type: multipart/mixed;
	boundary="Apple-Mail=_F1F919B5-727B-44D7-BE16-2E7214C78183"
Mime-Version: 1.0 (Mac OS X Mail 15.0 \(3693.40.0.1.81\))
Subject: Re: [patch, Fortran] IEEE support for aarch64-apple-darwin
Date: Mon, 20 Dec 2021 00:50:48 +0100
In-Reply-To: <0c2d5c11-4d72-8808-4d83-77797b1d9bdd@netcologne.de>
Cc: fortran@gcc.gnu.org,
 gcc-patches@gcc.gnu.org
To: Thomas Koenig
References: <93D8CCF9-4230-4517-A993-A811092ADC4B@gmail.com>
 <0c2d5c11-4d72-8808-4d83-77797b1d9bdd@netcologne.de>
X-Mailer: Apple Mail (2.3693.40.0.1.81)
X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00, DKIM_SIGNED,
	DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM,
	RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS,
	TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org
X-BeenThere: fortran@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Fortran mailing list
List-Unsubscribe: ,
List-Archive:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 19 Dec 2021 23:50:51 -0000

--Apple-Mail=_F1F919B5-727B-44D7-BE16-2E7214C78183
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=us-ascii

Hi Thomas,

> OK, and thanks for the patch!

Thanks for the review, committed a slightly amended patch as
220b9bdfe8faebdd2aea0ab7cea81c162d42d8e0 with underflow control support
added.

FX

--Apple-Mail=_F1F919B5-727B-44D7-BE16-2E7214C78183
Content-Disposition: attachment;
	filename=ieee.patch
Content-Type: application/octet-stream;
	x-unix-mode=0644;
	name="ieee.patch"
Content-Transfer-Encoding: 7bit

commit 220b9bdfe8faebdd2aea0ab7cea81c162d42d8e0
Author: Francois-Xavier Coudert
Date:   2021-12-20 00:45:31 +0100

    Fortran: add support for IEEE intrinsics on aarch64 non-glibc targets

    This enables IEEE support on the upcoming aarch64-apple-darwin target, and
    has been tested for some time in an external port.

    libgfortran/ChangeLog:

            * configure.host: Add aarch64-apple-darwin support.
            * config/fpu-aarch64.h: New file.

diff --git a/libgfortran/config/fpu-aarch64.h b/libgfortran/config/fpu-aarch64.h
new file mode 100644
index 00000000000..0746f42938a
--- /dev/null
+++ b/libgfortran/config/fpu-aarch64.h
@@ -0,0 +1,331 @@
+/* FPU-related code for aarch64.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   Contributed by Francois-Xavier Coudert
+
+This file is part of the GNU Fortran runtime library (libgfortran).
+
+Libgfortran is free software; you can redistribute it and/or
+modify it under the terms of the GNU General Public
+License as published by the Free Software Foundation; either
+version 3 of the License, or (at your option) any later version.
+
+Libgfortran is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+
+/* Rounding mask and modes */
+
+#define FPCR_RM_MASK  0x0c00000
+#define FE_TONEAREST  0x0000000
+#define FE_UPWARD     0x0400000
+#define FE_DOWNWARD   0x0800000
+#define FE_TOWARDZERO 0x0c00000
+#define FE_MAP_FZ     0x1000000
+
+/* Exceptions */
+
+#define FE_INVALID	1
+#define FE_DIVBYZERO	2
+#define FE_OVERFLOW	4
+#define FE_UNDERFLOW	8
+#define FE_INEXACT	16
+
+#define FE_ALL_EXCEPT (FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW | FE_INEXACT)
+#define FE_EXCEPT_SHIFT	8
+
+
+
+/* This structure corresponds to the layout of the block
+   written by FSTENV.  */
+struct fenv
+{
+  unsigned int __fpcr;
+  unsigned int __fpsr;
+};
+
+/* Check we can actually store the FPU state in the allocated size.  */
+_Static_assert (sizeof(struct fenv) <= (size_t) GFC_FPE_STATE_BUFFER_SIZE,
+                "GFC_FPE_STATE_BUFFER_SIZE is too small");
+
+
+
+void
+set_fpu (void)
+{
+  if (options.fpe & GFC_FPE_DENORMAL)
+    estr_write ("Fortran runtime warning: Floating point 'denormal operand' "
+                "exception not supported.\n");
+
+  set_fpu_trap_exceptions (options.fpe, 0);
+}
+
+
+int
+get_fpu_trap_exceptions (void)
+{
+  unsigned int fpcr, exceptions;
+  int res = 0;
+
+  fpcr = __builtin_aarch64_get_fpcr();
+  exceptions = (fpcr >> FE_EXCEPT_SHIFT) & FE_ALL_EXCEPT;
+
+  if (exceptions & FE_INVALID) res |= GFC_FPE_INVALID;
+  if (exceptions & FE_DIVBYZERO) res |= GFC_FPE_ZERO;
+  if (exceptions & FE_OVERFLOW) res |= GFC_FPE_OVERFLOW;
+  if (exceptions & FE_UNDERFLOW) res |= GFC_FPE_UNDERFLOW;
+  if (exceptions & FE_INEXACT) res |= GFC_FPE_INEXACT;
+
+  return res;
+}
+
+
+void set_fpu_trap_exceptions (int trap, int notrap)
+{
+  unsigned int mode_set = 0, mode_clr = 0;
+  unsigned int fpsr, fpsr_new;
+  unsigned int fpcr, fpcr_new;
+
+  if (trap & GFC_FPE_INVALID)
+    mode_set |= FE_INVALID;
+  if (notrap & GFC_FPE_INVALID)
+    mode_clr |= FE_INVALID;
+
+  if (trap & GFC_FPE_ZERO)
+    mode_set |= FE_DIVBYZERO;
+  if (notrap & GFC_FPE_ZERO)
+    mode_clr |= FE_DIVBYZERO;
+
+  if (trap & GFC_FPE_OVERFLOW)
+    mode_set |= FE_OVERFLOW;
+  if (notrap & GFC_FPE_OVERFLOW)
+    mode_clr |= FE_OVERFLOW;
+
+  if (trap & GFC_FPE_UNDERFLOW)
+    mode_set |= FE_UNDERFLOW;
+  if (notrap & GFC_FPE_UNDERFLOW)
+    mode_clr |= FE_UNDERFLOW;
+
+  if (trap & GFC_FPE_INEXACT)
+    mode_set |= FE_INEXACT;
+  if (notrap & GFC_FPE_INEXACT)
+    mode_clr |= FE_INEXACT;
+
+  /* Clear stalled exception flags.  */
+  fpsr = __builtin_aarch64_get_fpsr();
+  fpsr_new = fpsr & ~FE_ALL_EXCEPT;
+  if (fpsr_new != fpsr)
+    __builtin_aarch64_set_fpsr(fpsr_new);
+
+  fpcr_new = fpcr = __builtin_aarch64_get_fpcr();
+  fpcr_new |= (mode_set << FE_EXCEPT_SHIFT);
+  fpcr_new &= ~(mode_clr << FE_EXCEPT_SHIFT);
+
+  if (fpcr_new != fpcr)
+    __builtin_aarch64_set_fpcr(fpcr_new);
+}
+
+
+int
+support_fpu_flag (int flag)
+{
+  if (flag & GFC_FPE_DENORMAL)
+    return 0;
+
+  return 1;
+}
+
+
+int
+support_fpu_trap (int flag)
+{
+  if (flag & GFC_FPE_DENORMAL)
+    return 0;
+
+  return 1;
+}
+
+
+int
+get_fpu_except_flags (void)
+{
+  int result;
+  unsigned int fpsr;
+
+  result = 0;
+  fpsr = __builtin_aarch64_get_fpsr() & FE_ALL_EXCEPT;
+
+  if (fpsr & FE_INVALID)
+    result |= GFC_FPE_INVALID;
+  if (fpsr & FE_DIVBYZERO)
+    result |= GFC_FPE_ZERO;
+  if (fpsr & FE_OVERFLOW)
+    result |= GFC_FPE_OVERFLOW;
+  if (fpsr & FE_UNDERFLOW)
+    result |= GFC_FPE_UNDERFLOW;
+  if (fpsr & FE_INEXACT)
+    result |= GFC_FPE_INEXACT;
+
+  return result;
+}
+
+
+void
+set_fpu_except_flags (int set, int clear)
+{
+  unsigned int exc_set = 0, exc_clr = 0;
+  unsigned int fpsr, fpsr_new;
+
+  if (set & GFC_FPE_INVALID)
+    exc_set |= FE_INVALID;
+  else if (clear & GFC_FPE_INVALID)
+    exc_clr |= FE_INVALID;
+
+  if (set & GFC_FPE_ZERO)
+    exc_set |= FE_DIVBYZERO;
+  else if (clear & GFC_FPE_ZERO)
+    exc_clr |= FE_DIVBYZERO;
+
+  if (set & GFC_FPE_OVERFLOW)
+    exc_set |= FE_OVERFLOW;
+  else if (clear & GFC_FPE_OVERFLOW)
+    exc_clr |= FE_OVERFLOW;
+
+  if (set & GFC_FPE_UNDERFLOW)
+    exc_set |= FE_UNDERFLOW;
+  else if (clear & GFC_FPE_UNDERFLOW)
+    exc_clr |= FE_UNDERFLOW;
+
+  if (set & GFC_FPE_INEXACT)
+    exc_set |= FE_INEXACT;
+  else if (clear & GFC_FPE_INEXACT)
+    exc_clr |= FE_INEXACT;
+
+  fpsr_new = fpsr = __builtin_aarch64_get_fpsr();
+  fpsr_new &= ~exc_clr;
+  fpsr_new |= exc_set;
+
+  if (fpsr_new != fpsr)
+    __builtin_aarch64_set_fpsr(fpsr_new);
+}
+
+
+void
+get_fpu_state (void *state)
+{
+  struct fenv *envp = state;
+  envp->__fpcr = __builtin_aarch64_get_fpcr();
+  envp->__fpsr = __builtin_aarch64_get_fpsr();
+}
+
+
+void
+set_fpu_state (void *state)
+{
+  struct fenv *envp = state;
+  __builtin_aarch64_set_fpcr(envp->__fpcr);
+  __builtin_aarch64_set_fpsr(envp->__fpsr);
+}
+
+
+int
+get_fpu_rounding_mode (void)
+{
+  unsigned int fpcr = __builtin_aarch64_get_fpcr();
+  fpcr &= FPCR_RM_MASK;
+
+  switch (fpcr)
+    {
+      case FE_TONEAREST:
+        return GFC_FPE_TONEAREST;
+      case FE_UPWARD:
+        return GFC_FPE_UPWARD;
+      case FE_DOWNWARD:
+        return GFC_FPE_DOWNWARD;
+      case FE_TOWARDZERO:
+        return GFC_FPE_TOWARDZERO;
+      default:
+        return 0; /* Should be unreachable.  */
+    }
+}
+
+
+void
+set_fpu_rounding_mode (int round)
+{
+  unsigned int fpcr, round_mode;
+
+  switch (round)
+    {
+    case GFC_FPE_TONEAREST:
+      round_mode = FE_TONEAREST;
+      break;
+    case GFC_FPE_UPWARD:
+      round_mode = FE_UPWARD;
+      break;
+    case GFC_FPE_DOWNWARD:
+      round_mode = FE_DOWNWARD;
+      break;
+    case GFC_FPE_TOWARDZERO:
+      round_mode = FE_TOWARDZERO;
+      break;
+    default:
+      return; /* Should be unreachable.  */
+    }
+
+  fpcr = __builtin_aarch64_get_fpcr();
+
+  /* Only set FPCR if requested mode is different from current.  */
+  round_mode = (fpcr ^ round_mode) & FPCR_RM_MASK;
+  if (round_mode != 0)
+    __builtin_aarch64_set_fpcr(fpcr ^ round_mode);
+}
+
+
+int
+support_fpu_rounding_mode (int mode __attribute__((unused)))
+{
+  return 1;
+}
+
+
+int
+support_fpu_underflow_control (int kind __attribute__((unused)))
+{
+  /* Not supported for binary128.  */
+  return (kind == 4 || kind == 8) ? 1 : 0;
+}
+
+
+int
+get_fpu_underflow_mode (void)
+{
+  unsigned int fpcr = __builtin_aarch64_get_fpcr();
+
+  /* Return 0 for abrupt underflow (flush to zero), 1 for gradual underflow.  */
+  return (fpcr & FE_MAP_FZ) ? 0 : 1;
+}
+
+
+void
+set_fpu_underflow_mode (int gradual __attribute__((unused)))
+{
+  unsigned int fpcr = __builtin_aarch64_get_fpcr();
+
+  if (gradual)
+    fpcr &= ~FE_MAP_FZ;
+  else
+    fpcr |= FE_MAP_FZ;
+
+  __builtin_aarch64_set_fpcr(fpcr);
+}
diff --git a/libgfortran/configure.host b/libgfortran/configure.host
index e9d92c9d34d..3d6c2db7772 100644
--- a/libgfortran/configure.host
+++ b/libgfortran/configure.host
@@ -39,17 +39,29 @@ if test "x${have_feenableexcept}" = "xyes"; then
   ieee_support='yes'
 fi
 
-# x86 asm should be used instead of glibc, since glibc doesn't support
-# the x86 denormal exception.
 case "${host_cpu}" in
+
+  # x86 asm should be used instead of glibc, since glibc doesn't support
+  # the x86 denormal exception.
   i?86 | x86_64)
     if test "x${have_soft_float}" = "xyes"; then
       fpu_host='fpu-generic'
+      ieee_support='no'
     else
       fpu_host='fpu-387'
+      ieee_support='yes'
     fi
-    ieee_support='yes'
     ;;
+
+  # use asm on aarch64-darwin
+  aarch64)
+    case "${host_os}" in
+      darwin*)
+        fpu_host='fpu-aarch64'
+        ieee_support='yes'
+        ;;
+    esac
+
 esac
 
 # Some targets require additional compiler options for NaN/Inf.

--Apple-Mail=_F1F919B5-727B-44D7-BE16-2E7214C78183--