public inbox for gcc@gcc.gnu.org
* [PATCH Linux] powerpc: add documentation for HWCAPs
@ 2022-05-24  9:38 Nicholas Piggin
  2022-05-24  9:52 ` Florian Weimer
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Nicholas Piggin @ 2022-05-24  9:38 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Nicholas Piggin, libc-alpha, gcc, Paul E Murphy,
	Segher Boessenkool, Peter Bergner, Michael Ellerman

Take the arm64 HWCAP documentation file and adjust it for powerpc.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---

Thanks for all the comments and corrections. It should be nearing the
point where it is useful now. Yes I do think it would be useful to align
this more with OpenPOWER docs (and possibly eventually move it into the
ABI, given that's the allocator of these numbers) but that's not
done yet.

Thanks,
Nick

 Documentation/powerpc/elf_hwcaps.rst | 225 +++++++++++++++++++++++++++
 1 file changed, 225 insertions(+)
 create mode 100644 Documentation/powerpc/elf_hwcaps.rst

diff --git a/Documentation/powerpc/elf_hwcaps.rst b/Documentation/powerpc/elf_hwcaps.rst
new file mode 100644
index 000000000000..0a39077cd5d5
--- /dev/null
+++ b/Documentation/powerpc/elf_hwcaps.rst
@@ -0,0 +1,225 @@
+.. _elf_hwcaps_index:
+
+==================
+POWERPC ELF HWCAPs
+==================
+
+This document describes the usage and semantics of the powerpc ELF HWCAPs.
+
+
+1. Introduction
+---------------
+
+Some hardware or software features are only available on some CPU
+implementations, and/or with certain kernel configurations, but have no other
+discovery mechanism available to userspace code. The kernel exposes the
+presence of these features to userspace through a set of flags called HWCAPs,
+exposed in the auxiliary vector.
+
+Userspace software can test for features by acquiring the AT_HWCAP or
+AT_HWCAP2 entry of the auxiliary vector, and testing whether the relevant
+flags are set, e.g.::
+
+	bool floating_point_is_present(void)
+	{
+		unsigned long HWCAPs = getauxval(AT_HWCAP);
+		if (HWCAPs & PPC_FEATURE_HAS_FPU)
+			return true;
+
+		return false;
+	}
+
+Where software relies on a feature described by a HWCAP, it should check the
+relevant HWCAP flag to verify that the feature is present before attempting to
+make use of the feature.
+
+Features should not be probed through other means. When a feature is not
+available, attempting to use it may result in unpredictable behaviour, and
+may not be guaranteed to result in any reliable indication that the feature
+is unavailable.
+
+Software that targets a particular platform does not necessarily have to
+test for required or implied features. For example if the program requires
+FPU, VMX, VSX, it is not necessary to test those HWCAPs, and it may be
+impossible to do so if the compiler generates code requiring those features.
+
+2. Facilities
+-------------
+The Power ISA uses the term "facility" to describe a class of instructions,
+registers, interrupts, etc. The presence or absence of a facility indicates
+whether this class is available to be used, but the specifics depend on the
+ISA version. For example, if the VSX facility is available, the VSX
+instructions that can be used differ between the v3.0B and v3.1B ISA
+verstions.
+
+3. HWCAP allocation
+-------------------
+
+HWCAPs are allocated as described in Power Architecture 64-Bit ELF V2 ABI
+Specification (which will be reflected in the kernel's uapi headers).
+
+4. The HWCAPs exposed in AT_HWCAP
+---------------------------------
+
+PPC_FEATURE_32
+    32-bit CPU
+
+PPC_FEATURE_64
+    64-bit CPU (userspace may be running in 32-bit mode).
+
+PPC_FEATURE_601_INSTR
+    The processor is PowerPC 601.
+    Unused in the kernel since:
+      f0ed73f3fa2c ("powerpc: Remove PowerPC 601")
+
+PPC_FEATURE_HAS_ALTIVEC
+    Vector (aka Altivec, VMX) facility is available.
+
+PPC_FEATURE_HAS_FPU
+    Floating point facility is available.
+
+PPC_FEATURE_HAS_MMU
+    Memory management unit is present and enabled.
+
+PPC_FEATURE_HAS_4xxMAC
+    The processor is 40x or 44x family.
+
+PPC_FEATURE_UNIFIED_CACHE
+    The processor has a unified L1 cache for instructions and data, as
+    found in NXP e200.
+    Unused in the kernel since:
+      39c8bf2b3cc1 ("powerpc: Retire e200 core (mpc555x processor)")
+
+PPC_FEATURE_HAS_SPE
+    Signal Processing Engine facility is available.
+
+PPC_FEATURE_HAS_EFP_SINGLE
+    Embedded Floating Point single precision operations are available.
+
+PPC_FEATURE_HAS_EFP_DOUBLE
+    Embedded Floating Point double precision operations are available.
+
+PPC_FEATURE_NO_TB
+    The timebase facility (mftb instruction) is not available.
+    This is a 601 specific HWCAP, so if it is known that the processor
+    running is not a 601, via other HWCAPs or other means, it is not
+    required to test this bit before using the timebase.
+    Unused in the kernel since:
+      f0ed73f3fa2c ("powerpc: Remove PowerPC 601")
+
+PPC_FEATURE_POWER4
+    The processor is POWER4 or PPC970/FX/MP.
+    POWER4 support dropped from the kernel since:
+      471d7ff8b51b ("powerpc/64s: Remove POWER4 support")
+
+PPC_FEATURE_POWER5
+    The processor is POWER5.
+
+PPC_FEATURE_POWER5_PLUS
+    The processor is POWER5+.
+
+PPC_FEATURE_CELL
+    The processor is Cell.
+
+PPC_FEATURE_BOOKE
+    The processor implements the BookE architecture.
+
+PPC_FEATURE_SMT
+    The processor implements SMT.
+
+PPC_FEATURE_ICACHE_SNOOP
+    The processor icache is coherent with the dcache, and instruction storage
+    can be made consistent with data storage for the purpose of executing
+    instructions with the sequence (as described in, e.g., POWER9 Processor
+    User's Manual, 4.6.2.2 Instruction Cache Block Invalidate (icbi)):
+        sync
+        icbi (to any address)
+        isync
+
+PPC_FEATURE_ARCH_2_05
+    The processor supports the v2.05 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE_PA6T
+    The processor is PA6T.
+
+PPC_FEATURE_HAS_DFP
+    DFP facility is available.
+
+PPC_FEATURE_POWER6_EXT
+    The processor is POWER6.
+
+PPC_FEATURE_ARCH_2_06
+    The processor supports the v2.06 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE_HAS_VSX
+    VSX facility is available.
+
+PPC_FEATURE_PSERIES_PERFMON_COMPAT
+    The processor supports architected PMU events in the range 0xE0-0xFF.
+
+PPC_FEATURE_TRUE_LE
+    The processor supports true little-endian mode.
+
+PPC_FEATURE_PPC_LE
+    The processor supports "PowerPC Little-Endian", that uses address
+    munging to make storage access appear to be little-endian, but the
+    data is stored in a different format that is unsuitable to be
+    accessed by other agents not running in this mode.
+
+5. The HWCAPs exposed in AT_HWCAP2
+----------------------------------
+
+PPC_FEATURE2_ARCH_2_07
+    The processor supports the v2.07 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE2_HTM
+    Transactional Memory feature is available.
+
+PPC_FEATURE2_DSCR
+    DSCR facility is available.
+
+PPC_FEATURE2_EBB
+    EBB facility is available.
+
+PPC_FEATURE2_ISEL
+    isel instruction is available. This is superseded by ARCH_2_07 and
+    later.
+
+PPC_FEATURE2_TAR
+    TAR facility is available.
+
+PPC_FEATURE2_VEC_CRYPTO
+    v2.07 crypto instructions are available.
+
+PPC_FEATURE2_HTM_NOSC
+    System calls fail if called in a transactional state, see
+    Documentation/powerpc/syscall64-abi.rst
+
+PPC_FEATURE2_ARCH_3_00
+    The processor supports the v3.0B / v3.0C userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE2_HAS_IEEE128
+    IEEE 128-bit binary floating point is supported with VSX
+    quad-precision instructions and data types.
+
+PPC_FEATURE2_DARN
+    darn instruction is available.
+
+PPC_FEATURE2_SCV
+    scv instruction may be used for system calls, see
+    Documentation/powerpc/syscall64-abi.rst.
+
+PPC_FEATURE2_HTM_NO_SUSPEND
+    A limited Transactional Memory facility that does not support suspend is
+    available, see Documentation/powerpc/transactional_memory.rst.
+
+PPC_FEATURE2_ARCH_3_1
+    The processor supports the v3.1 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE2_MMA
+    MMA facility is available.
-- 
2.35.1



* Re: [PATCH Linux] powerpc: add documentation for HWCAPs
  2022-05-24  9:38 [PATCH Linux] powerpc: add documentation for HWCAPs Nicholas Piggin
@ 2022-05-24  9:52 ` Florian Weimer
  2022-05-24 18:32   ` Segher Boessenkool
  2022-05-24 17:38 ` Segher Boessenkool
  2023-06-06 14:49 ` Passing the complex args in the GPR's Umesh Kalappa
  2 siblings, 1 reply; 16+ messages in thread
From: Florian Weimer @ 2022-05-24  9:52 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linuxppc-dev, gcc, libc-alpha, Paul E Murphy

* Nicholas Piggin:

> +2. Facilities
> +-------------
> +The Power ISA uses the term "facility" to describe a class of instructions,
> +registers, interrupts, etc. The presence or absence of a facility indicates
> +whether this class is available to be used, but the specifics depend on the
> +ISA version. For example, if the VSX facility is available, the VSX
> +instructions that can be used differ between the v3.0B and v3.1B ISA
> +verstions.

The 2.07 ISA manual also has categories.  ISA 3.0 made a lot of things
mandatory.  It may make sense to clarify that feature bits for mandatory
aspects of the ISA are still set, to help with backwards compatibility.
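
To illustrate why keeping the older bits set matters, here is a minimal
sketch of how userspace might derive an ISA level from the cumulative
ARCH_* bits. The function names isa_level() and current_isa_level() are
hypothetical, and the fallback bit values are assumed to match the
kernel's arch/powerpc/include/uapi/asm/cputable.h; they are only there so
the sketch builds on non-powerpc hosts.

```c
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP, AT_HWCAP2 (Linux, glibc) */

/* Fallbacks for non-powerpc builds; values assumed to match the kernel's
 * uapi cputable.h. On powerpc the system headers provide them. */
#ifndef PPC_FEATURE_ARCH_2_05
#define PPC_FEATURE_ARCH_2_05   0x00001000
#endif
#ifndef PPC_FEATURE_ARCH_2_06
#define PPC_FEATURE_ARCH_2_06   0x00000100
#endif
#ifndef PPC_FEATURE2_ARCH_2_07
#define PPC_FEATURE2_ARCH_2_07  0x80000000
#endif
#ifndef PPC_FEATURE2_ARCH_3_00
#define PPC_FEATURE2_ARCH_3_00  0x00800000
#endif
#ifndef PPC_FEATURE2_ARCH_3_1
#define PPC_FEATURE2_ARCH_3_1   0x00040000
#endif

/* Hypothetical helper: returns the highest advertised ISA level times 100
 * (e.g. 207 for v2.07), or 0 if none of the ARCH_* bits is set. */
static int isa_level(unsigned long hwcap, unsigned long hwcap2)
{
	if (hwcap2 & PPC_FEATURE2_ARCH_3_1)
		return 310;
	if (hwcap2 & PPC_FEATURE2_ARCH_3_00)
		return 300;
	if (hwcap2 & PPC_FEATURE2_ARCH_2_07)
		return 207;
	if (hwcap & PPC_FEATURE_ARCH_2_06)
		return 206;
	if (hwcap & PPC_FEATURE_ARCH_2_05)
		return 205;
	return 0;
}

/* Only meaningful when running on powerpc Linux. */
static int current_isa_level(void)
{
	return isa_level(getauxval(AT_HWCAP), getauxval(AT_HWCAP2));
}
```

Because later processors keep the earlier bits set, a program written
against v2.07 can simply test PPC_FEATURE2_ARCH_2_07 and keep working on
v3.0 and v3.1 hardware without knowing about the newer bits.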

Thanks,
Florian



* Re: [PATCH Linux] powerpc: add documentation for HWCAPs
  2022-05-24  9:38 [PATCH Linux] powerpc: add documentation for HWCAPs Nicholas Piggin
  2022-05-24  9:52 ` Florian Weimer
@ 2022-05-24 17:38 ` Segher Boessenkool
  2022-07-15  1:00   ` Nicholas Piggin
  2023-06-06 14:49 ` Passing the complex args in the GPR's Umesh Kalappa
  2 siblings, 1 reply; 16+ messages in thread
From: Segher Boessenkool @ 2022-05-24 17:38 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: linuxppc-dev, libc-alpha, gcc, Paul E Murphy, Peter Bergner,
	Michael Ellerman

Hi!

On Tue, May 24, 2022 at 07:38:28PM +1000, Nicholas Piggin wrote:
> Thanks for all the comments and corrections. It should be nearing the
> point where it is useful now. Yes I do think it would be useful to align
> this more with OpenPOWER docs (and possibly eventually move it into the
> ABI, given that's the allocator of these numbers) but that's not
> done yet.

The auxiliary vector is a Linux/glibc thing, it should not be described
in more generic ABI documents.  It is fine where you have it now afaics.

> +Where software relies on a feature described by a HWCAP, it should check the
> +relevant HWCAP flag to verify that the feature is present before attempting to
> +make use of the feature.
> +
> +Features should not be probed through other means. When a feature is not
> +available, attempting to use it may result in unpredictable behaviour, and
> +may not be guaranteed to result in any reliable indication that the feature
> +is unavailable.

Traditionally VMX was tested for by simply executing an instruction and
catching SIGILL.  This is portable even.  This has worked fine for over
two decades, it's a bit weird to declare this a forbidden practice
now :-)

It certainly isn't recommended for more complex and/or newer things.

> +verstions.

(typo.  spellcheck maybe?)


Segher


* Re: [PATCH Linux] powerpc: add documentation for HWCAPs
  2022-05-24  9:52 ` Florian Weimer
@ 2022-05-24 18:32   ` Segher Boessenkool
  2022-07-15  1:17     ` Nicholas Piggin
  0 siblings, 1 reply; 16+ messages in thread
From: Segher Boessenkool @ 2022-05-24 18:32 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Nicholas Piggin, gcc, libc-alpha, linuxppc-dev, Paul E Murphy

On Tue, May 24, 2022 at 11:52:00AM +0200, Florian Weimer wrote:
> * Nicholas Piggin:
> 
> > +2. Facilities
> > +-------------
> > +The Power ISA uses the term "facility" to describe a class of instructions,
> > +registers, interrupts, etc. The presence or absence of a facility indicates
> > +whether this class is available to be used, but the specifics depend on the
> > +ISA version. For example, if the VSX facility is available, the VSX
> > +instructions that can be used differ between the v3.0B and v3.1B ISA
> > +verstions.
> 
> The 2.07 ISA manual also has categories.  ISA 3.0 made a lot of things
> mandatory.  It may make sense to clarify that feature bits for mandatory
> aspects of the ISA are still set, to help with backwards compatibility.

Linux still runs on ISA 1.xx and ISA 2.01 machines.  "Category" had not
been invented yet for either of those, but similar concepts did exist of
course.


Segher


* Re: [PATCH Linux] powerpc: add documentation for HWCAPs
  2022-05-24 17:38 ` Segher Boessenkool
@ 2022-07-15  1:00   ` Nicholas Piggin
  0 siblings, 0 replies; 16+ messages in thread
From: Nicholas Piggin @ 2022-07-15  1:00 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Peter Bergner, gcc, libc-alpha, linuxppc-dev, Michael Ellerman,
	Paul E Murphy

Finally got some details about the icache snoop question, so I am just
coming back to this now; sorry for the delay... (POWER10 does support the
coherent icache flush sequence as expected. There were some updates to
the UM wording, but that will be fixed.)

Excerpts from Segher Boessenkool's message of May 25, 2022 3:38 am:
> Hi!
> 
> On Tue, May 24, 2022 at 07:38:28PM +1000, Nicholas Piggin wrote:
>> Thanks for all the comments and corrections. It should be nearing the
>> point where it is useful now. Yes I do think it would be useful to align
>> this more with OpenPOWER docs (and possibly eventually move it into the
>> ABI, given that's the allocator of these numbers) but that's not
>> done yet.
> 
> The auxiliary vector is a Linux/glibc thing, it should not be described
> in more generic ABI documents.  It is fine where you have it now afaics.

It is already in the ABI document. In fact that (not the kernel) had
been the allocator of the feature numbers, at least in the past I think.

> 
>> +Where software relies on a feature described by a HWCAP, it should check the
>> +relevant HWCAP flag to verify that the feature is present before attempting to
>> +make use of the feature.
>> +
>> +Features should not be probed through other means. When a feature is not
>> +available, attempting to use it may result in unpredictable behaviour, and
>> +may not be guaranteed to result in any reliable indication that the feature
>> +is unavailable.
> 

> Traditionally VMX was tested for by simply executing an instruction and
> catching SIGILL.  This is portable even.  This has worked fine for over
> two decades, it's a bit weird to declare this a forbidden practice
> now :-)

The statement does not override architectural specification, so
if an encoding does not exist then it should cause a trap and SIGILL.
I suppose in theory we could work around performance or correctness
issues in an implementation by clearing HWCAP even if the hardware does 
execute the instruction, so I would still say testing HWCAP is
preferred.
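
A minimal sketch of the preferred check, for comparison with the SIGILL
probe. The helper names hwcap_has() and altivec_is_present() are
hypothetical, and the fallback value for PPC_FEATURE_HAS_ALTIVEC is
assumed to match the kernel's uapi cputable.h; it is only used when the
system headers do not provide it.

```c
#include <stdbool.h>
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP (Linux, glibc) */

/* Fallback for non-powerpc builds; value assumed to match the kernel's
 * arch/powerpc/include/uapi/asm/cputable.h. */
#ifndef PPC_FEATURE_HAS_ALTIVEC
#define PPC_FEATURE_HAS_ALTIVEC 0x10000000
#endif

/* Hypothetical helper: true if every bit in flag is set in hwcap. */
static bool hwcap_has(unsigned long hwcap, unsigned long flag)
{
	return (hwcap & flag) == flag;
}

/* The kernel's answer is taken as authoritative: if the bit is clear,
 * the feature is treated as absent even if the instruction would
 * happen to execute. Only meaningful on powerpc Linux. */
static bool altivec_is_present(void)
{
	return hwcap_has(getauxval(AT_HWCAP), PPC_FEATURE_HAS_ALTIVEC);
}
```

Note that getauxval() returns 0 when the requested entry is not found,
which reads the same as "no features"; that is the safe direction for
this kind of test.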

> 
> It certainly isn't recommended for more complex and/or newer things.
> 
>> +verstions.
> 
> (typo.  spellcheck maybe?)

Thanks,
Nick


* Re: [PATCH Linux] powerpc: add documentation for HWCAPs
  2022-05-24 18:32   ` Segher Boessenkool
@ 2022-07-15  1:17     ` Nicholas Piggin
  2022-07-15 14:35       ` Segher Boessenkool
  0 siblings, 1 reply; 16+ messages in thread
From: Nicholas Piggin @ 2022-07-15  1:17 UTC (permalink / raw)
  To: Florian Weimer, Segher Boessenkool
  Cc: gcc, libc-alpha, linuxppc-dev, Paul E Murphy

Excerpts from Segher Boessenkool's message of May 25, 2022 4:32 am:
> On Tue, May 24, 2022 at 11:52:00AM +0200, Florian Weimer wrote:
>> * Nicholas Piggin:
>> 
>> > +2. Facilities
>> > +-------------
>> > +The Power ISA uses the term "facility" to describe a class of instructions,
>> > +registers, interrupts, etc. The presence or absence of a facility indicates
>> > +whether this class is available to be used, but the specifics depend on the
>> > +ISA version. For example, if the VSX facility is available, the VSX
>> > +instructions that can be used differ between the v3.0B and v3.1B ISA
>> > +verstions.
>> 
>> The 2.07 ISA manual also has categories.  ISA 3.0 made a lot of things
>> mandatory.  It may make sense to clarify that feature bits for mandatory
>> aspects of the ISA are still set, to help with backwards compatibility.
> 
> Linux still runs on ISA 1.xx and ISA 2.01 machines.  "Category" had not
> been invented yet for either of those, but similar concepts did exist of
> course.

Not sure what to say about this. It now also has "Compliancy Subset"
although maybe that's more like a set of features rather than
incompatible features or modes such as some of the category stuff
seems to be. I'll try to add something.

Thanks,
Nick


* Re: [PATCH Linux] powerpc: add documentation for HWCAPs
  2022-07-15  1:17     ` Nicholas Piggin
@ 2022-07-15 14:35       ` Segher Boessenkool
  0 siblings, 0 replies; 16+ messages in thread
From: Segher Boessenkool @ 2022-07-15 14:35 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Florian Weimer, gcc, libc-alpha, linuxppc-dev, Paul E Murphy

On Fri, Jul 15, 2022 at 11:17:24AM +1000, Nicholas Piggin wrote:
> Excerpts from Segher Boessenkool's message of May 25, 2022 4:32 am:
> > Linux still runs on ISA 1.xx and ISA 2.01 machines.  "Category" had not
> > been invented yet for either of those, but similar concepts did exist of
> > course.
> 
> Not sure what to say about this. It now also has "Compliancy Subset"
> although maybe that's more like a set of features rather than
> incompatible features or modes such as some of the category stuff
> seems to be. I'll try to add something.

The compliancy subset stuff is an attempt to simplify things again.
In most cases you want to require a whole swath of features at once;
if you really try to support fine-grained optional features you need to
test thousands of configurations, while you really can test only ten
(if you are lucky!)

Maybe it is best to just be a bit vague here?


Segher


* Passing the complex args in the GPR's
  2022-05-24  9:38 [PATCH Linux] powerpc: add documentation for HWCAPs Nicholas Piggin
  2022-05-24  9:52 ` Florian Weimer
  2022-05-24 17:38 ` Segher Boessenkool
@ 2023-06-06 14:49 ` Umesh Kalappa
  2023-06-06 14:58   ` Andrew Pinski
  2 siblings, 1 reply; 16+ messages in thread
From: Umesh Kalappa @ 2023-06-06 14:49 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: linuxppc-dev, gcc, libc-alpha, Segher Boessenkool,
	Michael Ellerman, Paul E Murphy

Hi all,

For the test case https://godbolt.org/z/vjs1vfs5W , we see a mismatch
in the ABI between GCC and Clang.

Do we have any supporting documents that back the GCC behaviour over Clang's?

The ABI states the following.

In the Power Architecture 64-Bit ELF V2 ABI Specification document
(v1.1 from 16 July 2015)

Page 53:

Map complex floating-point and complex integer types as if the
argument was specified as separate real
and imaginary parts.

So in this case each double complex argument is broken down into a double
real part and a double imaginary part, and these are expected to be passed
in FPRs, not GPRs.



Thank you
~Umesh


* Re: Passing the complex args in the GPR's
  2023-06-06 14:49 ` Passing the complex args in the GPR's Umesh Kalappa
@ 2023-06-06 14:58   ` Andrew Pinski
  2023-06-06 15:05     ` Umesh Kalappa
  2023-06-06 17:18     ` Joseph Myers
  0 siblings, 2 replies; 16+ messages in thread
From: Andrew Pinski @ 2023-06-06 14:58 UTC (permalink / raw)
  To: Umesh Kalappa
  Cc: Nicholas Piggin, linuxppc-dev, gcc, libc-alpha,
	Segher Boessenkool, Michael Ellerman, Paul E Murphy

On Tue, Jun 6, 2023 at 7:50 AM Umesh Kalappa via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> Hi all ,
>
> For the test case https://godbolt.org/z/vjs1vfs5W ,we see the mismatch
> in the ABI b/w gcc and clang .
>
> Do we have any supporting documents that second the GCC behaviour over CLANG ?
>
> EABI states like
>
> In the Power Architecture 64-Bit ELF V2 ABI Specification document
> (v1.1 from 16 July 2015)

You are looking at the wrong ABI document.
That is for the 64bit ABI.
The 32bit ABI document is located at:
http://refspecs.linux-foundation.org/elf/elfspec_ppc.pdf

Plus the 32bit ABI document does not document Complex argument passing
as it was written in 1995 and never updated.

https://www.nxp.com/docs/en/reference-manual/E500ABIUG.pdf does not
document it either.

Thanks,
Andrew Pinski

>
> Page 53:
>
> Map complex floating-point and complex integer types as if the
> argument was specified as separate real
> and imaginary parts.
>
> and in this case the double complexes are broken down with double real
> and double img and expected to pass in FPR not the GPR.
>
>
>
> Thank you
> ~Umesh


* Re: Passing the complex args in the GPR's
  2023-06-06 14:58   ` Andrew Pinski
@ 2023-06-06 15:05     ` Umesh Kalappa
  2023-06-06 15:16       ` Andrew Pinski
  2023-06-06 16:42       ` Segher Boessenkool
  2023-06-06 17:18     ` Joseph Myers
  1 sibling, 2 replies; 16+ messages in thread
From: Umesh Kalappa @ 2023-06-06 15:05 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Nicholas Piggin, linuxppc-dev, gcc, libc-alpha,
	Segher Boessenkool, Michael Ellerman, Paul E Murphy

Hi Andrew,
Thank you for the quick response. For PPC64 too, we do have
ABI mismatches for complex operations, like
https://godbolt.org/z/bjsYovx4c .

Is there any reason why GCC chose to use GPRs here?

~Umesh

On Tue, Jun 6, 2023 at 8:28 PM Andrew Pinski <pinskia@gmail.com> wrote:
>
> On Tue, Jun 6, 2023 at 7:50 AM Umesh Kalappa via Libc-alpha
> <libc-alpha@sourceware.org> wrote:
> >
> > Hi all ,
> >
> > For the test case https://godbolt.org/z/vjs1vfs5W ,we see the mismatch
> > in the ABI b/w gcc and clang .
> >
> > Do we have any supporting documents that second the GCC behaviour over CLANG ?
> >
> > EABI states like
> >
> > In the Power Architecture 64-Bit ELF V2 ABI Specification document
> > (v1.1 from 16 July 2015)
>
> You are looking at the wrong ABI document.
> That is for the 64bit ABI.
> The 32bit ABI document is located at:
> http://refspecs.linux-foundation.org/elf/elfspec_ppc.pdf
>
> Plus the 32bit ABI document does not document Complex argument passing
> as it was written in 1995 and never updated.
>
> https://www.nxp.com/docs/en/reference-manual/E500ABIUG.pdf does not
> document it either.
>
> Thanks,
> Andrew Pinski
>
> >
> > Page 53:
> >
> > Map complex floating-point and complex integer types as if the
> > argument was specified as separate real
> > and imaginary parts.
> >
> > and in this case the double complexes are broken down with double real
> > and double img and expected to pass in FPR not the GPR.
> >
> >
> >
> > Thank you
> > ~Umesh


* Re: Passing the complex args in the GPR's
  2023-06-06 15:05     ` Umesh Kalappa
@ 2023-06-06 15:16       ` Andrew Pinski
  2023-06-06 16:42       ` Segher Boessenkool
  1 sibling, 0 replies; 16+ messages in thread
From: Andrew Pinski @ 2023-06-06 15:16 UTC (permalink / raw)
  To: Umesh Kalappa
  Cc: Nicholas Piggin, linuxppc-dev, gcc, libc-alpha,
	Segher Boessenkool, Michael Ellerman, Paul E Murphy

On Tue, Jun 6, 2023 at 8:05 AM Umesh Kalappa <umesh.kalappa0@gmail.com> wrote:
>
> Hi Adnrew,
> Thank you for the quick response and for PPC64 too ,we do have
> mismatches in ABI b/w complex operations like
> https://godbolt.org/z/bjsYovx4c .
>
> Any reason why GCC chose to use GPR 's here ?

Yes because it was set before 2003. There could not be an ABI break.
r0-50273-gded9bf77e35ce9a2246 fixed GCC for the AIX ABI though.

>
> ~Umesh
>
> On Tue, Jun 6, 2023 at 8:28 PM Andrew Pinski <pinskia@gmail.com> wrote:
> >
> > On Tue, Jun 6, 2023 at 7:50 AM Umesh Kalappa via Libc-alpha
> > <libc-alpha@sourceware.org> wrote:
> > >
> > > Hi all ,
> > >
> > > For the test case https://godbolt.org/z/vjs1vfs5W ,we see the mismatch
> > > in the ABI b/w gcc and clang .
> > >
> > > Do we have any supporting documents that second the GCC behaviour over CLANG ?
> > >
> > > EABI states like
> > >
> > > In the Power Architecture 64-Bit ELF V2 ABI Specification document
> > > (v1.1 from 16 July 2015)
> >
> > You are looking at the wrong ABI document.
> > That is for the 64bit ABI.
> > The 32bit ABI document is located at:
> > http://refspecs.linux-foundation.org/elf/elfspec_ppc.pdf
> >
> > Plus the 32bit ABI document does not document Complex argument passing
> > as it was written in 1995 and never updated.
> >
> > https://www.nxp.com/docs/en/reference-manual/E500ABIUG.pdf does not
> > document it either.
> >
> > Thanks,
> > Andrew Pinski
> >
> > >
> > > Page 53:
> > >
> > > Map complex floating-point and complex integer types as if the
> > > argument was specified as separate real
> > > and imaginary parts.
> > >
> > > and in this case the double complexes are broken down with double real
> > > and double img and expected to pass in FPR not the GPR.
> > >
> > >
> > >
> > > Thank you
> > > ~Umesh


* Re: Passing the complex args in the GPR's
  2023-06-06 15:05     ` Umesh Kalappa
  2023-06-06 15:16       ` Andrew Pinski
@ 2023-06-06 16:42       ` Segher Boessenkool
  2023-06-06 17:07         ` Umesh Kalappa
  1 sibling, 1 reply; 16+ messages in thread
From: Segher Boessenkool @ 2023-06-06 16:42 UTC (permalink / raw)
  To: Umesh Kalappa
  Cc: Andrew Pinski, Nicholas Piggin, linuxppc-dev, gcc, libc-alpha,
	Michael Ellerman, Paul E Murphy

Hi!

On Tue, Jun 06, 2023 at 08:35:22PM +0530, Umesh Kalappa wrote:
> Hi Adnrew,
> Thank you for the quick response and for PPC64 too ,we do have
> mismatches in ABI b/w complex operations like
> https://godbolt.org/z/bjsYovx4c .
> 
> Any reason why GCC chose to use GPR 's here ?

What did you expect, what happened instead?  Why did you expect that,
and why then is it an error what did happen?

You used -O0.  As long as the code works, all is fine.  But unoptimised
code frequently is hard to read, please use -O2 instead?

As Andrew says, why did you use -m32 for GCC but -m64 for LLVM?  It is
hard to compare those at all!  32-bit PowerPC Linux ABI (based on 32-bit
PowerPC ELF ABI from 1995, BE version) vs. 64-bit ELFv2 ABI from 2015
(LE version).


Segher


* Re: Passing the complex args in the GPR's
  2023-06-06 16:42       ` Segher Boessenkool
@ 2023-06-06 17:07         ` Umesh Kalappa
  2023-06-06 17:33           ` David Edelsohn
  2023-06-07 13:17           ` Michael Matz
  0 siblings, 2 replies; 16+ messages in thread
From: Umesh Kalappa @ 2023-06-06 17:07 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Andrew Pinski, Nicholas Piggin, linuxppc-dev, gcc, libc-alpha,
	Michael Ellerman, Paul E Murphy

Hi Segher,

>>What did you expect, what happened instead?
For example, for cexp GCC passes the complex args in GPRs, whereas
Clang uses caller memory.

For reference: https://godbolt.org/z/MfMz3cTe7

We have cross tools where some of the libraries are built using GCC and
some using Clang.

We approached the Clang developers about this behaviour (why the stack and
not the FPRs, as on PPC64). They are not going to change it, and
asked us to refer back to GCC, hence this email thread.

The question is: why does GCC choose to use GPRs here, and is there any
reference to support this decision?

Thank you
~Umesh



On Tue, Jun 6, 2023 at 10:16 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> Hi!
>
> On Tue, Jun 06, 2023 at 08:35:22PM +0530, Umesh Kalappa wrote:
> > Hi Adnrew,
> > Thank you for the quick response and for PPC64 too ,we do have
> > mismatches in ABI b/w complex operations like
> > https://godbolt.org/z/bjsYovx4c .
> >
> > Any reason why GCC chose to use GPR 's here ?
>
> What did you expect, what happened instead?  Why did you expect that,
> and why then is it an error what did happen?
>
> You used -O0.  As long as the code works, all is fine.  But unoptimised
> code frequently is hard to read, please use -O2 instead?
>
> As Andrew says, why did you use -m32 for GCC but -m64 for LLVM?  It is
> hard to compare those at all!  32-bit PowerPC Linux ABI (based on 32-bit
> PowerPC ELF ABI from 1995, BE version) vs. 64-bit ELFv2 ABI from 2015
> (LE version).
>
>
> Segher


* Re: Passing the complex args in the GPR's
  2023-06-06 14:58   ` Andrew Pinski
  2023-06-06 15:05     ` Umesh Kalappa
@ 2023-06-06 17:18     ` Joseph Myers
  1 sibling, 0 replies; 16+ messages in thread
From: Joseph Myers @ 2023-06-06 17:18 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Umesh Kalappa, Nicholas Piggin, linuxppc-dev, gcc, libc-alpha,
	Segher Boessenkool, Michael Ellerman, Paul E Murphy

On Tue, 6 Jun 2023, Andrew Pinski via Gcc wrote:

> You are looking at the wrong ABI document.
> That is for the 64bit ABI.
> The 32bit ABI document is located at:
> http://refspecs.linux-foundation.org/elf/elfspec_ppc.pdf
> 
> Plus the 32bit ABI document does not document Complex argument passing
> as it was written in 1995 and never updated.

For the 32-bit ABI see 
https://www.polyomino.org.uk/publications/2011/Power-Arch-32-bit-ABI-supp-1.0-Unified.pdf 
(sources at https://github.com/ryanarn/powerabi - power.org has long since 
disappeared).

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: Passing the complex args in the GPR's
  2023-06-06 17:07         ` Umesh Kalappa
@ 2023-06-06 17:33           ` David Edelsohn
  2023-06-07 13:17           ` Michael Matz
  1 sibling, 0 replies; 16+ messages in thread
From: David Edelsohn @ 2023-06-06 17:33 UTC (permalink / raw)
  To: Umesh Kalappa
  Cc: Segher Boessenkool, Andrew Pinski, Nicholas Piggin, linuxppc-dev,
	gcc, libc-alpha, Michael Ellerman, Paul E Murphy

On Tue, Jun 6, 2023 at 1:08 PM Umesh Kalappa via Gcc <gcc@gcc.gnu.org>
wrote:

> Hi Segher ,
>
> >>What did you expect, what happened instead?
> For example the complex args are passed in GPR's for  cexp in the case
> GCC and Clang uses  caller memory .
>
> for reference : https://godbolt.org/z/MfMz3cTe7
>
> We have cross tools  like some of libraries built  using  the GCC and
> some use Clang .
>
> We approached Clang developers on this behaviour (Why stack , not the
> FPR's registers like PPC64)  and they are not going to change this
> behaviour, and asked us to refer back to GCC ,hence this email thread.
>
> Question is : Why does GCC choose to use GPR's here and have any
> reference to support this decision  ?
>

The use of GPRs to pass complex floating point arguments was an early
implementation mistake -- the parameter passing code missed the
enumeration of a type.  The behavior cannot be changed and corrected
without breaking the ABI.

I don't know what you mean by "support this decision".  It was not
intentionally chosen through careful performance analysis or type system
design as the preferred method to pass complex floating point values.  The
initial implementation was wrong and not discovered until it was too late.
The reference to support this is that one cannot break the ABI without
causing chaos in the ecosystem.
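
For anyone wanting to compare the two compilers like for like, a small
test function of the following shape should be enough. This is a
hypothetical stand-in and not the code behind the godbolt links in this
thread; the function names are made up for illustration.

```c
#include <complex.h>

/* Hypothetical callee taking one complex double by value. Inspect the
 * generated assembly to see where the two halves of z arrive. */
double complex_sum_parts(double _Complex z)
{
	return creal(z) + cimag(z);
}

/* The same values passed "as if specified as separate real and
 * imaginary parts", for comparison with the function above. */
double split_sum_parts(double re, double im)
{
	return re + im;
}
```

Build it with both compilers using the same target and the same
optimisation level (for example -m32 -O2 -S on each), then compare how
the arguments of the two functions are received. Mixing -m32 with -m64,
or reading -O0 output, makes the comparison much harder.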

Thanks, David


>
> Thank you
> ~Umesh
>
>
>
> On Tue, Jun 6, 2023 at 10:16 PM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> >
> > Hi!
> >
> > On Tue, Jun 06, 2023 at 08:35:22PM +0530, Umesh Kalappa wrote:
> > > Hi Adnrew,
> > > Thank you for the quick response and for PPC64 too ,we do have
> > > mismatches in ABI b/w complex operations like
> > > https://godbolt.org/z/bjsYovx4c .
> > >
> > > Any reason why GCC chose to use GPR 's here ?
> >
> > What did you expect, what happened instead?  Why did you expect that,
> > and why then is it an error what did happen?
> >
> > You used -O0.  As long as the code works, all is fine.  But unoptimised
> > code frequently is hard to read, please use -O2 instead?
> >
> > As Andrew says, why did you use -m32 for GCC but -m64 for LLVM?  It is
> > hard to compare those at all!  32-bit PowerPC Linux ABI (based on 32-bit
> > PowerPC ELF ABI from 1995, BE version) vs. 64-bit ELFv2 ABI from 2015
> > (LE version).
> >
> >
> > Segher
>

* Re: Passing the complex args in the GPR's
  2023-06-06 17:07         ` Umesh Kalappa
  2023-06-06 17:33           ` David Edelsohn
@ 2023-06-07 13:17           ` Michael Matz
  1 sibling, 0 replies; 16+ messages in thread
From: Michael Matz @ 2023-06-07 13:17 UTC (permalink / raw)
  To: Umesh Kalappa
  Cc: Segher Boessenkool, Andrew Pinski, Nicholas Piggin, linuxppc-dev,
	gcc, libc-alpha, Michael Ellerman, Paul E Murphy

Hey,

On Tue, 6 Jun 2023, Umesh Kalappa via Gcc wrote:

> Question is : Why does GCC choose to use GPR's here and have any
> reference to support this decision  ?

You explicitly used -m32 ppc, so
https://www.polyomino.org.uk/publications/2011/Power-Arch-32-bit-ABI-supp-1.0-Unified.pdf
applies.  It explicitly states in "B.1 ATR-Linux Inclusion and
Conformance" that it is "ATR-PASS-COMPLEX-IN-GPRS", and other sections
detail what that means (namely passing complex args in r3 .. r10, whatever
fits).  GCC adheres to that, and has to.
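
As a rough illustration of the kind of call being discussed, here is a
minimal C sketch.  The function name sum_parts is hypothetical and does not
appear in the thread; it simply takes one complex double by value, which is
the argument shape that cexp also has.  On a 32-bit PowerPC Linux target,
the rule quoted above would put such an argument in GPRs; the code itself is
portable and says nothing about registers.

```c
#include <complex.h>

/* Hypothetical helper (not from the thread): takes a complex double
 * by value, the same argument shape as cexp().  How the argument is
 * passed (GPRs, FPRs, or memory) is decided by the target ABI, not
 * by anything in this source. */
double sum_parts(double complex z)
{
    /* Add the real and imaginary parts so the result depends on both
     * halves of the argument. */
    return creal(z) + cimag(z);
}
```

Compiling this with something like gcc -m32 -O2 -S on a PowerPC toolchain
and reading the generated assembly is one way to see where each compiler
expects the two halves of z to arrive.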

The history of how that came to be was explained in the thread.


Ciao,
Michael.

>
> Thank you
> ~Umesh
> 
> 
> 
> On Tue, Jun 6, 2023 at 10:16 PM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> >
> > Hi!
> >
> > On Tue, Jun 06, 2023 at 08:35:22PM +0530, Umesh Kalappa wrote:
> > > Hi Adnrew,
> > > Thank you for the quick response and for PPC64 too ,we do have
> > > mismatches in ABI b/w complex operations like
> > > https://godbolt.org/z/bjsYovx4c .
> > >
> > > Any reason why GCC chose to use GPR 's here ?
> >
> > What did you expect, what happened instead?  Why did you expect that,
> > and why then is it an error what did happen?
> >
> > You used -O0.  As long as the code works, all is fine.  But unoptimised
> > code frequently is hard to read, please use -O2 instead?
> >
> > As Andrew says, why did you use -m32 for GCC but -m64 for LLVM?  It is
> > hard to compare those at all!  32-bit PowerPC Linux ABI (based on 32-bit
> > PowerPC ELF ABI from 1995, BE version) vs. 64-bit ELFv2 ABI from 2015
> > (LE version).
> >
> >
> > Segher
> 

Thread overview: 16+ messages
2022-05-24  9:38 [PATCH Linux] powerpc: add documentation for HWCAPs Nicholas Piggin
2022-05-24  9:52 ` Florian Weimer
2022-05-24 18:32   ` Segher Boessenkool
2022-07-15  1:17     ` Nicholas Piggin
2022-07-15 14:35       ` Segher Boessenkool
2022-05-24 17:38 ` Segher Boessenkool
2022-07-15  1:00   ` Nicholas Piggin
2023-06-06 14:49 ` Passing the complex args in the GPR's Umesh Kalappa
2023-06-06 14:58   ` Andrew Pinski
2023-06-06 15:05     ` Umesh Kalappa
2023-06-06 15:16       ` Andrew Pinski
2023-06-06 16:42       ` Segher Boessenkool
2023-06-06 17:07         ` Umesh Kalappa
2023-06-06 17:33           ` David Edelsohn
2023-06-07 13:17           ` Michael Matz
2023-06-06 17:18     ` Joseph Myers
