From: "tkoenig at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libgcc/108279] Improved speed for float128 routines
Date: Tue, 03 Jan 2023 21:02:09 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108279

--- Comment #1 from Thomas Koenig ---
Created attachment 54183
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54183&action=edit
Example patch with Michael S's code just pasted over the libgcc
implementation, for a test

A benchmark: Just pasting over the code from the github repo yields an
improvement of gfortran's matmul by almost a factor of two, so significant
speedups are possible:

module tick
  interface
     ! Cycle counter, implemented in C under the name "rdtsc"
     function rdtsc() bind(C,name="rdtsc")
       use iso_c_binding
       integer(kind=c_long) :: rdtsc
     end function rdtsc
  end interface
end module tick

program main
  use tick
  use iso_c_binding
  implicit none
  ! Real kind with at least 30 decimal digits (kind 16 in the run below)
  integer, parameter :: wp = selected_real_kind(30)
!  integer, parameter :: n=5000, p=4000, m=3666
  integer, parameter :: n = 1000, p = 1000, m = 1000
  real (kind=wp) :: c(n,p), a(n,m), b(m, p)
  character(len=80) :: line
  integer(c_long) :: t1, t2, t3
  ! Floating-point operations in the matmul: n*m*p multiply-add pairs
  real (kind=wp) :: fl = 2.d0*n*m*p
  integer :: i,j

  print *,wp

  line = '10 10'
  call random_number(a)
  call random_number(b)
  ! t3 is the cost of two back-to-back rdtsc calls (timer overhead)
  t1 = rdtsc()
  t2 = rdtsc()
  t3 = t2-t1
  print *,t3
  t1 = rdtsc()
  c = matmul(a,b)
  t2 = rdtsc()
  ! Elapsed cycles minus timer overhead, divided by the operation count
  print *,1/(fl/(t2-t1-t3)),"Cycles per operation"
  ! Read indices at run time and use one element of the result
  read (unit=line,fmt=*) i,j
  write (unit=line,fmt=*) c(i,j)
end program main

showed

tkoenig@gcc188:~> ./original
          16
                   32
^C
tkoenig@gcc188:~> time ./original
          16
                   32
   90.5696151959999999999999999999999997      Cycles per operation

real    1m2,148s
user    1m2,123s
sys     0m0,008s
tkoenig@gcc188:~> time ./modified
          16
                   32
   52.8148391719999999999999999999999957      Cycles per operation

real    0m36,296s
user    0m36,278s
sys     0m0,008s

where "original" is the current libgcc soft-float implementation, and
"modified" is with the code from the repo.

It does not handle exceptions, so this causes a few regressions, but it
certainly shows the potential.
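
The benchmark calls a C function named rdtsc through bind(C), but the message does not include its source. The following is only a sketch of what such a helper could look like, not the code actually used for the timings above. It assumes an x86 target, where GCC and Clang provide __rdtsc() in <x86intrin.h>; the fallback branch is an assumption added for other targets and returns nanoseconds rather than cycles.

```c
/* Hypothetical C side of the Fortran interface
   "function rdtsc() bind(C,name="rdtsc")"; not taken from the message. */
#if defined(__x86_64__) || defined(__i386__)

#include <x86intrin.h>   /* provides __rdtsc() on GCC and Clang */

/* Return the time-stamp counter as a C long, matching integer(c_long).
   On 32-bit x86 the 64-bit counter is truncated to 32 bits. */
long rdtsc(void)
{
    return (long)__rdtsc();
}

#else

/* Assumed fallback for non-x86 targets: a monotonic clock in
   nanoseconds, so the printed figure is no longer in cycles. */
#define _POSIX_C_SOURCE 199309L
#include <time.h>

long rdtsc(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long)ts.tv_sec * 1000000000L + (long)ts.tv_nsec;
}

#endif
```

Assuming the helper is saved as rdtsc.c and the Fortran program as bench.f90 (both file names are made up here), something like "gcc -c rdtsc.c" followed by "gfortran bench.f90 rdtsc.o" should link the two.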