public inbox for fortran@gcc.gnu.org
* Tests of gcc development beyond its testsuite (in this case, for gfortran)
@ 2024-05-06 21:27 Toon Moene
  2024-05-06 21:32 ` Andrew Pinski
  0 siblings, 1 reply; 8+ messages in thread
From: Toon Moene @ 2024-05-06 21:27 UTC (permalink / raw)
  To: gcc mailing list, fortran

I have now, for some time, been running LAPACK's test programs on my 
gcc/gfortran builds on both the x86_64-linux-gnu architecture and the 
aarch64-linux-gnu one (see, e.g., 
http://moene.org/~toon/lapack-amd64-gfortran13-O3).

The results are rather alarming - this is r15-202 for aarch64 vs r15-204 
for x86_64 (compiled with -O3):

diff lapack-amd64-gfortran15-O3 lapack-aarch64-gfortran15-O3

3892,3895c3928,3931
< REAL             	1327023		0	(0.000%)	0	(0.000%)	
< DOUBLE PRECISION	1300917		6	(0.000%)	0	(0.000%)	
< COMPLEX          	786775		0	(0.000%)	0	(0.000%)	
< COMPLEX16         	787842		0	(0.000%)	0	(0.000%)	
---
 > REAL             	1317063		71	(0.005%)	0	(0.000%)	
 > DOUBLE PRECISION	1318331		54	(0.004%)	4	(0.000%)	
 > COMPLEX          	767023		390	(0.051%)	0	(0.000%)	
 > COMPLEX16         	772338		305	(0.039%)	0	(0.000%)	
3897c3933
< --> ALL PRECISIONS	4202557		6	(0.000%)	0	(0.000%)	
---
 > --> ALL PRECISIONS	4174755		820	(0.020%)	4	(0.000%)	

Note the excessive number of tests exceeding the error threshold on the 
aarch64 side (>).
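
As a rough cross-check of the percentages in the excerpt above, the summary rows can be parsed and the totals recomputed. This is only a sketch: the helper names (`parse_row`, `error_rate`) and the regular expression are illustrative assumptions, not part of LAPACK's own summary script; the rows are the aarch64 lines quoted above with the diff markers removed.

```python
import re

# Hypothetical pattern for one summary row:
# label, tests run, numerical errors (pct), other errors (pct).
ROW = re.compile(
    r"^(?P<label>.*?)\s+(?P<tests>\d+)\s+(?P<num>\d+)\s+\([\d.]+%\)"
    r"\s+(?P<other>\d+)\s+\([\d.]+%\)\s*$"
)

def parse_row(line):
    """Return (label, tests, numerical_errors, other_errors) for one row."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError(f"not a summary row: {line!r}")
    return (m["label"], int(m["tests"]), int(m["num"]), int(m["other"]))

def error_rate(tests, errors):
    """Percentage of tests that exceeded the threshold."""
    return 100.0 * errors / tests

# The aarch64 rows from the excerpt (diff prefix stripped).
aarch64_rows = [
    "REAL             \t1317063\t\t71\t(0.005%)\t0\t(0.000%)\t",
    "DOUBLE PRECISION\t1318331\t\t54\t(0.004%)\t4\t(0.000%)\t",
    "COMPLEX          \t767023\t\t390\t(0.051%)\t0\t(0.000%)\t",
    "COMPLEX16         \t772338\t\t305\t(0.039%)\t0\t(0.000%)\t",
]

parsed = [parse_row(r) for r in aarch64_rows]
total_tests = sum(p[1] for p in parsed)
total_numerical = sum(p[2] for p in parsed)
total_other = sum(p[3] for p in parsed)

print(total_tests, total_numerical, total_other)
print(f"{error_rate(total_tests, total_numerical):.3f}%")
```

The recomputed totals agree with the "ALL PRECISIONS" row for aarch64 (4174755 tests, 820 numerical errors, 4 other errors).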

Of course, this is only an excerpt of the full log file - there is more 
information in it to zoom in on the errors on the aarch64 side (note 
that the x86_64 side is not faultless).

Is there a way to pass this information to our websites, so that we do 
not "forget" this - or, alternatively, so that we can follow the 
progress in solving this?

Kind regards,

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands


* Re: Tests of gcc development beyond its testsuite (in this case, for gfortran)
  2024-05-06 21:27 Tests of gcc development beyond its testsuite (in this case, for gfortran) Toon Moene
@ 2024-05-06 21:32 ` Andrew Pinski
  2024-05-06 21:35   ` Toon Moene
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Pinski @ 2024-05-06 21:32 UTC (permalink / raw)
  To: Toon Moene; +Cc: gcc mailing list, fortran

On Mon, May 6, 2024 at 2:27 PM Toon Moene <toon@moene.org> wrote:
>
> I have now, for some time, ran LAPACK's test programs on my gcc/gfortran
> builds on both on the x86_64-linux-gnu architecture, as well as the
> aarch64-linux-gnu one (see, e.g.,
> http://moene.org/~toon/lapack-amd64-gfortran13-O3).
>
> The results are rather alarming - this is r15-202 for aarch64 vs r15-204
> for x86_64 (compiled with -O3):

Did you test x86_64 with -march=native (or with -mfma) or just -O3?
The reason why I am asking is aarch64 includes FMA by default while
x86_64 does not.
Most recent x86_64 includes an FMA instruction but since the base ISA
does not include it, it is not enabled by default.
I suspect the aarch64 "excessive exceeding the threshold for
errors" cases are all caused by the greater use of FMA rather than
anything else.
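
A minimal sketch of why fusing can change results at all: a fused multiply-add rounds a*b + c once, whereas separate multiply and add instructions round twice. Python does not contract floating-point expressions, so the fused variant below is only emulated with exact rational arithmetic; the function names and the input values are illustrative assumptions and are not taken from the LAPACK tests.

```python
from fractions import Fraction

def unfused_multiply_add(a, b, c):
    """Two roundings: a*b is rounded to double, then the sum is rounded."""
    product = a * b
    return product + c

def fused_multiply_add(a, b, c):
    """Emulated FMA: form a*b + c exactly, then round once to double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

# Inputs chosen so that the exact product, 1 - 2**-54, is not a double
# and rounds to 1.0 when the multiplication is rounded separately.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

print(unfused_multiply_add(a, b, c))  # 0.0
print(fused_multiply_add(a, b, c))    # -2**-54, about -5.55e-17
```

Neither answer is "wrong" in itself; the fused one is closer to the exact value, but a test threshold calibrated against one behaviour can be exceeded under the other.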

Thanks,
Andrew Pinski



* Re: Tests of gcc development beyond its testsuite (in this case, for gfortran)
  2024-05-06 21:32 ` Andrew Pinski
@ 2024-05-06 21:35   ` Toon Moene
  2024-05-06 22:02     ` Toon Moene
  0 siblings, 1 reply; 8+ messages in thread
From: Toon Moene @ 2024-05-06 21:35 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc mailing list, fortran

On 5/6/24 23:32, Andrew Pinski wrote:

> Did you test x86_64 with -march=native (or with -mfma) or just -O3?
> The reason why I am asking is aarch64 includes FMA by default while
> x86_64 does not.
> Most recent x86_64 includes an FMA instruction but since the base ISA
> does not include it, it is not enabled by default.
> I am suspect the aarch64 "excessive exceeding the threshold for
> errors" are all caused by the more use of FMA rather than anything
> else.

Aah, I forgot to include that tidbit, because it's readily apparent from 
the full logs - I compiled with *just* -O3.

Thanks,

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands



* Re: Tests of gcc development beyond its testsuite (in this case, for gfortran)
  2024-05-06 21:35   ` Toon Moene
@ 2024-05-06 22:02     ` Toon Moene
  2024-05-07 18:30       ` Toon Moene
  0 siblings, 1 reply; 8+ messages in thread
From: Toon Moene @ 2024-05-06 22:02 UTC (permalink / raw)
  To: fortran, gcc mailing list; +Cc: Andrew Pinski

On 5/6/24 23:35, Toon Moene wrote:

> On 5/6/24 23:32, Andrew Pinski wrote:
> 
>> Did you test x86_64 with -march=native (or with -mfma) or just -O3?
>> The reason why I am asking is aarch64 includes FMA by default while
>> x86_64 does not.
>> Most recent x86_64 includes an FMA instruction but since the base ISA
>> does not include it, it is not enabled by default.
>> I am suspect the aarch64 "excessive exceeding the threshold for
>> errors" are all caused by the more use of FMA rather than anything
>> else.
> 
> Aah, I forgot to include that tidbit, because its readily apparent from 
> the full logs - I compiled with *just* -O3.
> 
> Thanks,
> 

OK, perhaps on the aarch64 I need the following option to make the 
comparison fair:

‘rdma’

     Enable Round Double Multiply Accumulate instructions. This is on by 
default for -march=armv8.1-a.

I.e., -mno-rdma

(I hope that's correct - I'll try that when the Sun rises again and 
I have some power to run the AArch64 machine ...).

I must say I didn't expect this - the discussion on the "Intel" side 
was always that, because fused multiply-add instructions didn't 
express the "real computations" expressed by the program, they 
were evil and therefore had to be hidden behind some special compiler 
option that made it very clear that those instructions were evil.

Again, thanks for pointing me to the difference (in philosophy, if not 
math) between the two continents (i.e., the Americas and Europe's - 
before Brexit - England :-) ).

Kind regards,

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands



* Re: Tests of gcc development beyond its testsuite (in this case, for gfortran)
  2024-05-06 22:02     ` Toon Moene
@ 2024-05-07 18:30       ` Toon Moene
  2024-05-07 18:35         ` Andrew Pinski
  0 siblings, 1 reply; 8+ messages in thread
From: Toon Moene @ 2024-05-07 18:30 UTC (permalink / raw)
  To: fortran, gcc mailing list; +Cc: Andrew Pinski

On 5/7/24 00:02, Toon Moene wrote:

> OK, perhaps on the aarch64 I need the following option to make the 
> comparison fair:
> 
> ‘rdma’
> 
>      Enable Round Double Multiply Accumulate instructions. This is on by 
> default for -march=armv8.1-a.
> 
> I.e., -mno-rdma
> 
> (I hope that's correct - I'll will try that when the Sun rises again and 
> I have some power to run the AArch64 machine ...).

Well, I did two independent runs with gfortran-13.2 and the following 
options:

-O3 -march=armv8.1-a+rdma

and

-O3 -march=armv8.1-a+nordma

No difference in the number of error runs exceeding the prescribed 
thresholds.

So, unless I made a mistake in the option specification (or the compiler 
silently ignored them because they were not applicable to my machine - 
ugh), the cause of the problem lies elsewhere.

Kind regards,

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands



* Re: Tests of gcc development beyond its testsuite (in this case, for gfortran)
  2024-05-07 18:30       ` Toon Moene
@ 2024-05-07 18:35         ` Andrew Pinski
  2024-05-07 18:44           ` Toon Moene
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Pinski @ 2024-05-07 18:35 UTC (permalink / raw)
  To: Toon Moene; +Cc: fortran, gcc mailing list

On Tue, May 7, 2024 at 11:31 AM Toon Moene <toon@moene.org> wrote:
>
> Well, I did two independent runs with gfortran-13.2 and the following
> options:
>
> -O3 -march=armv8.1-a+rdma
>
> and
>
> -O3 -march=armv8.1-a+nordma
>
> No difference in the number of error runs exceeding the prescribed
> thresholds.
>
> So, unless I made a mistake in the option specification (or the compiler
> silently ignored them because they were not applicable to my machine -
> ugh), the cause of the problem lies elsewhere.


AArch64 armv8-a has FMA as part of its base ISA.
So you want to try with `-ffp-contract=off` instead.
RDMA turns on/off instructions which are not used by the
auto-vectorizer (yet) and are only used via their intrinsics (if I
read the code correctly).

Thanks,
Andrew Pinski

>
> Kind regards,
>
> --
> Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
>


* Re: Tests of gcc development beyond its testsuite (in this case, for gfortran)
  2024-05-07 18:35         ` Andrew Pinski
@ 2024-05-07 18:44           ` Toon Moene
  2024-05-08 12:43             ` Toon Moene
  0 siblings, 1 reply; 8+ messages in thread
From: Toon Moene @ 2024-05-07 18:44 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: fortran, gcc mailing list

On 5/7/24 20:35, Andrew Pinski wrote:

> AARCH64 armv8-a has FMA as part of its base ISA.
> So you want to try with `-ffp-contract=off` instead.
> RDMA turns on/off instructions which are not used by the
> auto-vectorizer (yet) and used by intrinsics for them (If I read the
> code correctly).

Ah, thanks - I'll try that tomorrow.

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands



* Re: Tests of gcc development beyond its testsuite (in this case, for gfortran)
  2024-05-07 18:44           ` Toon Moene
@ 2024-05-08 12:43             ` Toon Moene
  0 siblings, 0 replies; 8+ messages in thread
From: Toon Moene @ 2024-05-08 12:43 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: fortran, gcc mailing list

On 5/7/24 20:44, Toon Moene wrote:

> On 5/7/24 20:35, Andrew Pinski wrote:
> 
>> AARCH64 armv8-a has FMA as part of its base ISA.
>> So you want to try with `-ffp-contract=off` instead.
>> RDMA turns on/off instructions which are not used by the
>> auto-vectorizer (yet) and used by intrinsics for them (If I read the
>> code correctly).
> 
> Ah, thanks - I'll try that tomorrow.

Yep, that did it:


			-->   LAPACK TESTING SUMMARY  <--
		Processing LAPACK Testing output found in the TESTING directory
SUMMARY             	nb test run 	numerical error   	other error
================   	===========	=================	================
REAL             	1327023		0	(0.000%)	0	(0.000%)	
DOUBLE PRECISION	1327845		0	(0.000%)	0	(0.000%)	
COMPLEX          	786775		0	(0.000%)	0	(0.000%)	
COMPLEX16         	787842		0	(0.000%)	0	(0.000%)	

--> ALL PRECISIONS	4229485		0	(0.000%)	0	(0.000%)	

So, obviously, the threshold values for these tests were derived on a 
machine without fused multiply-add instructions, or without using them 
if present.

This is perhaps not surprising, as the default build-and-test setup 
(make.inc.example) of the LAPACK package as distributed from netlib.org 
lists as the compiler choice:

FC = gfortran
FFLAGS = -O2 -frecursive
FFLAGS_DRV = $(FFLAGS)
FFLAGS_NOOPT = -O0 -frecursive

which means that the choice of architecture on x86-64 would be "generic" 
and wouldn't include FMA instructions. If the authors had used that 
setup in deriving the thresholds, it is not surprising that you need 
-ffp-contract=off on architectures that include FMA instructions by default.

Thanks for helping me out with this !

-- 
Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

