From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 031203858431; Wed, 18 Jan 2023 19:02:21 +0000 (GMT)
From: "wilco at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libgcc/108279] Improved speed for float128 routines
Date: Wed, 18 Jan 2023 19:02:19 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: libgcc
X-Bugzilla-Version: unknown
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: wilco at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: cc
Message-ID: <bug-108279-4-sMth09lJa1@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-108279-4@http.gcc.gnu.org/bugzilla/>
References: <bug-108279-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279

Wilco <wilco at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilco at gcc dot gnu.org
--- Comment #18 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Michael_S from comment #12)

> This set of options does not map too well into real difficulties of
> implementation.
> There are only 2 things that are expensive:
> 1. Inexact Exception
> 2. Fetching of the current rounding mode.
> The rest of IEEE-754 features is so cheap that creating separate variants
> without them simply is not worth the effort of maintaining distinct
> variants, even if all difference is a single three-lines #ifdef

In general, reading the current rounding mode is relatively cheap, but modifying
it can be expensive, so optimized fenv implementations in GLIBC only modify the
FP status if a change is required. It should be feasible to check for
round-to-even and use optimized code for that case.
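
As a rough illustration of that check, here is a minimal sketch assuming
standard C99 <fenv.h>. The names fp_path, path_for_mode and choose_path are
hypothetical and are not taken from libgcc or GLIBC; the point is only that the
rounding mode is read, never written.

```c
#include <fenv.h>

/* Hypothetical labels for the two code paths; not libgcc names. */
enum fp_path { FP_PATH_FAST, FP_PATH_GENERAL };

/* Map a rounding mode (an FE_* value from <fenv.h>) to a code path.
   Only round-to-nearest (ties-to-even) takes the fast path. */
static inline enum fp_path path_for_mode(int mode)
{
    return mode == FE_TONEAREST ? FP_PATH_FAST : FP_PATH_GENERAL;
}

/* Read the current rounding mode once and dispatch on it.
   The FP status is never modified here. */
static inline enum fp_path choose_path(void)
{
    return path_for_mode(fegetround());
}
```

A routine would call choose_path() on entry and fall back to the general code
only when it returns FP_PATH_GENERAL. On GLIBC targets fegetround() lives in
libm, so a program using it would normally link with -lm.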

> BTW, Inexact Exception can be made fairly affordable with a little help from
> compiler. All we need for that is ability to say "don't remove this floating
> point addition even if you don't see that it produces any effect".
> Something similar to 'volatile', but with volatile compiler currently puts
> result of addition on stack, which adds undesirable cost.
> However, judged by comment of Jakub, compiler maintainers are not
> particularly interested in this enterprise.

There are macros in GLIBC's math-barriers.h which do what you want, e.g. for
AArch64:

#define math_opt_barrier(x)                                     \
  ({ __typeof (x) __x = (x); __asm ("" : "+w" (__x)); __x; })
#define math_force_eval(x)                                      \
  ({ __typeof (x) __x = (x); __asm __volatile__ ("" : : "w" (__x)); })

The first blocks optimizations (like constant folding) across the barrier; the
second forces evaluation of an expression even if it is deemed useless. These are
used in many math functions in GLIBC. They are target-specific because they need
inline assembler operands, but it should be easy to add similar definitions to
libgcc.
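
To show how such barriers could be used to raise the inexact exception, here
is a sketch that swaps the AArch64 "w" register constraint for a generic "m"
memory constraint so that it builds on any GCC-compatible target. That swap is
an assumption made for portability: it does go through memory, so a real
libgcc version would want a register constraint per target. The helper
force_inexact is hypothetical.

```c
/* Portable variants of the barriers, using a memory operand ("m")
   instead of a target-specific FP register constraint. */
#define math_opt_barrier(x) \
  ({ __typeof (x) __x = (x); __asm ("" : "+m" (__x)); __x; })
#define math_force_eval(x) \
  ({ __typeof (x) __x = (x); __asm __volatile__ ("" : : "m" (__x)); })

/* Hypothetical helper: perform an addition whose result is inexact and
   make sure the compiler neither folds it at compile time nor drops it. */
static inline void force_inexact(void)
{
    /* The barrier hides the constant, so the addition happens at run time. */
    float one = math_opt_barrier(1.0f);

    /* 1 + 2^-100 is not representable in float, so the addition raises
       the inexact exception; the result itself is discarded. */
    math_force_eval(one + 0x1p-100f);
}
```

A soft-float routine could call something like force_inexact() on the paths
where the result was rounded, instead of touching the FP status register
directly.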