From: "already5chosen at yahoo dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libgcc/108279] Improved speed for float128 routines
Date: Fri, 10 Feb 2023 13:38:09 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: libgcc
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279

--- Comment #24 from Michael_S ---
(In reply to Michael_S from comment #22)
> (In reply to Michael_S from comment #8)
> > (In reply to Thomas Koenig from comment #6)
> > > And there will have to be a decision about 32-bit targets.
> > >
> >
> > IMHO, 32-bit targets should be left in their current state.
> > People that use them probably do not care deeply about performance.
> > Technically, I can implement 32-bit targets in the same sources, by means of
> > few ifdefs and macros, but resulting source code will look much uglier than
> > how it looks today. Still, not to the same level of horror that you have in
> > matmul_r16.c, but certainly uglier than how I like it to look.
> > And I am not sure at all that my implementation of 32-bit targets would be
> > significantly faster than current soft float.
>
> I explored this path (implementing 32-bit and 64-bit targets from the same
> source with few ifdefs) a little more:
> Now I am even more sure that it is not a way to go. gcc compiler does not
> generate good 32-bit code for this style of sources. This especially applies
> to i386, other supported 32-bit targets (RV32, SPARC32) are affected less.
>

I can't explain to myself why I am doing it, but I continued exploring
32-bit targets. Well, not quite "targets": I don't have SPARC32 or RV32 to
play with, so I continued exploring i386 only.

As said above, using the same code for 32-bit and 64-bit does not produce
acceptable results. But a pure 32-bit source did better than I expected.

So when I wrote on 2023-01-13 "And I am not sure at all that my
implementation of 32-bit targets would be significantly faster than current
soft float", I was wrong. My implementation of 32-bit targets (i.e. i386) is
significantly faster than the current soft float: up to 3 times faster on
Zen3, and approximately 2 times faster on various oldish Intel CPUs.

Today I put the 32-bit sources into my GitHub repository.

I am still convinced that improving the performance of IEEE binary128 on
32-bit targets is a waste of time, but since the time is already wasted,
maybe the results can be used.
And maybe they can be used to bring IEEE binary128 to the Arm Cortex-M,
where it can be moderately useful in some situations.