[Aarch64] Vector Function Application Binary Interface Specification for OpenMP

From: "Sekhar, Ashwin" <Ashwin.Sekhar@cavium.com>
To: "gcc@gcc.gnu.org" <gcc@gcc.gnu.org>
Cc: "richard.earnshaw@arm.com" <richard.earnshaw@arm.com>,
	"marcus.shawcroft@arm.com" <marcus.shawcroft@arm.com>,
	"james.greenhalgh@arm.com" <james.greenhalgh@arm.com>
Subject: [Aarch64] Vector Function Application Binary Interface Specification for OpenMP
Date: Wed, 15 Mar 2017 09:50:00 -0000
Message-ID: <BY2PR07MB2421EB8B7FDC1CFFDE065DDE92270@BY2PR07MB2421.namprd07.prod.outlook.com>

Hi GCC Team, AArch64 Maintainers,

The rules in the Vector Function Application Binary Interface Specification for OpenMP (https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt) are used on x86 for generating the SIMD clones of a function.

Is there a similar specification defined for AArch64?

If not, I would like to start a discussion on one for AArch64. To kick-start it, a draft proposal for AArch64 (along the same lines as the x86 ABI) is included below. The only change from the x86 ABI is in the function name mangling: here the letter 'b' is used to indicate the ASIMD ISA.

Please review and comment.

Thanks and Regards,

Ashwin Sekhar T K

------------------------------------ CUT HERE ----------------------------------

================================================================================
 AArch64 Vector Function Application Binary Interface Specification for OpenMP
================================================================================

1. Vector Function ABI Overview

The AArch64 Vector Function ABI defines the ABI for vector functions generated
by a compiler that supports the SIMD constructs of OpenMP 4.0 [1] on AArch64.
It is based on the x86 Vector Function Application Binary Interface
Specification for OpenMP [2].

================================================================================

2. Vector Function ABI

The Vector Function ABI defines a set of rules that the caller and the callee
functions must obey.

These rules consist of:
  * Calling convention
  * Vector length (the number of concurrent scalar invocations to be processed
    per invocation of the vector function)
  * Mapping from element data types to vector data types
  * Ordering of vector arguments
  * Vector function masking
  * Vector function name mangling
  * Compiler generated variants of vector function

--------------------------------------------------------------------------------

2.1. Calling Convention

Vector functions should use the calling convention described in the Procedure
Call Standard for the ARM 64-bit Architecture (AArch64) [3].

--------------------------------------------------------------------------------

2.2. Vector Length

Every vector variant of a SIMD-enabled function has a vector length (VLEN). If
the OpenMP clause "simdlen" is used, VLEN is the value of the argument of that
clause, which must be a power of 2. Otherwise, the function's "characteristic
data type" (CDT) is used to compute the vector length.

The CDT is determined by the first applicable rule below:
  a) For a non-void function, the CDT is the return type.
  b) Otherwise, if the function has any non-uniform, non-linear parameters, the
     CDT is the type of the first such parameter.
  c) If the CDT determined by a) or b) above is a struct, union, or class type
     that is passed by value (except for a type that maps to the built-in
     complex data type), the CDT is int.
  d) If none of the above three cases is applicable, the CDT is int.

VLEN = sizeof(vector_register) / sizeof(CDT)

For example, if the ISA is ASIMD, sizeof(vector_register) = 16, because the
vector registers are 128 bits wide. If the CDT of the function is "int",
sizeof(CDT) = 4, so VLEN = 16 / 4 = 4.
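
The CDT rules and the VLEN formula can be sketched in C as follows. This is
only an illustration under stated assumptions: the names param_kind, param,
cdt_size and vlen_for are hypothetical and not part of the ABI, and each type
is represented simply by its size in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a parameter of a "declare simd" function. */
enum param_kind { PARAM_VECTOR, PARAM_UNIFORM, PARAM_LINEAR };

struct param {
    enum param_kind kind;
    size_t size;        /* sizeof the parameter type, in bytes */
    int is_aggregate;   /* nonzero for a struct/union/class passed by value
                           (zero for the built-in complex types) */
};

/* Size in bytes of the characteristic data type (CDT), following rules
   a) to d). ret_size is 0 for a void function. */
static size_t cdt_size(size_t ret_size, int ret_is_aggregate,
                       const struct param *params, size_t nparams)
{
    size_t size = 0;
    int is_aggregate = 0;

    if (ret_size != 0) {                           /* rule a) */
        size = ret_size;
        is_aggregate = ret_is_aggregate;
    } else {
        for (size_t i = 0; i < nparams; i++) {
            if (params[i].kind == PARAM_VECTOR) {  /* rule b) */
                size = params[i].size;
                is_aggregate = params[i].is_aggregate;
                break;
            }
        }
    }
    if (size == 0 || is_aggregate)                 /* rules c) and d) */
        return sizeof(int);
    return size;
}

/* VLEN = sizeof(vector_register) / sizeof(CDT); the register size is
   16 bytes for ASIMD. */
static size_t vlen_for(size_t vector_register_size, size_t cdt)
{
    assert(cdt != 0 && vector_register_size % cdt == 0);
    return vector_register_size / cdt;
}
```

With a 16-byte register, a function returning float gets a 4-byte CDT and
VLEN = 4, while a function returning double gets VLEN = 2.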

--------------------------------------------------------------------------------

2.3. Element Data Type to Vector Data Type Mapping

The vector data types for parameters are selected depending on the ISA, the
vector length, the data type of the original parameter, and the parameter
specification.

For uniform and linear parameters (a detailed description can be found in [1]),
the original data type is preserved.

For vector parameters, vector data types are selected by the compiler. The
mapping from element data type to vector data type is described below.

  * The bit size of the vector data type of a parameter is computed as:

    size_of_vector_data_type = VLEN * sizeof(original_parameter_data_type) * 8

    For instance, for the ASIMD version of a vector function with parameter data
    type "int": if VLEN = 4, size_of_vector_data_type = 4 * 4 * 8 = 128 (bits),
    which means one argument of type __m128 is passed.

  * If size_of_vector_data_type is greater than the width of the vector
    register, the parameter is passed in multiple vector registers.

    For instance, for the ASIMD version of a vector function with parameter data
    type "int": if VLEN = 8, size_of_vector_data_type = 8 * 4 * 8 = 256 (bits),
    so the vector data type is __m256, which means 2 arguments of type __m128
    are passed.
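
The two rules above amount to a small calculation, sketched here in C. The
helper names and the ASIMD_REGISTER_BITS constant are hypothetical. Rounding a
vector type narrower than 128 bits up to one register is an assumption of this
sketch; the text above only covers the equal and the wider cases.

```c
#include <assert.h>
#include <stddef.h>

#define ASIMD_REGISTER_BITS 128u

/* Bit size of the vector data type for one vector parameter:
   VLEN * sizeof(original_parameter_data_type) * 8. */
static size_t vector_type_bits(size_t vlen, size_t elem_size)
{
    return vlen * elem_size * 8;
}

/* Number of vector registers (and hence vector arguments) needed to pass
   that vector data type, rounding up to whole registers. */
static size_t vector_regs_needed(size_t vlen, size_t elem_size)
{
    size_t bits = vector_type_bits(vlen, elem_size);
    assert(bits > 0);
    return (bits + ASIMD_REGISTER_BITS - 1) / ASIMD_REGISTER_BITS;
}
```

For an "int" parameter this gives one register at VLEN = 4 and two registers
at VLEN = 8, matching the two instances above.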

--------------------------------------------------------------------------------

2.4. Ordering of Vector Arguments

  * When a parameter of the original function results in one argument in the
    vector function, the ordering rule is a simple one-to-one match with the
    original argument order.

    For example, when the original argument list is (int a, float b, int c),
    VLEN is 4, the ISA is ASIMD, and a, b, and c are all classified as vector
    parameters, the vector function argument list becomes (__m128i vec_a,
    __m128 vec_b, __m128i vec_c).

  * There are cases where a single parameter of the original function results
    in multiple arguments in the vector function. The second and subsequent
    arguments are inserted in the argument list right after the corresponding
    first argument, not appended to the end of the argument list of the vector
    function.

    For example, when the original argument list is (int a, float b, int c),
    VLEN is 8, the ISA is ASIMD, and a, b, and c are all classified as vector
    parameters, the vector function argument list becomes
    (__m128i vec_a1, __m128i vec_a2, __m128 vec_b1, __m128 vec_b2,
    __m128i vec_c1, __m128i vec_c2).
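
The ordering rule can be illustrated with a short C sketch that prints the
expanded argument names. The function expand_args is hypothetical, and for
simplicity it assumes that every parameter needs the same number of registers
(true for the int and float parameters in the examples above).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes the expanded vector argument names for the scalar parameter names
   in `names` into `out`, separated by ", ". A parameter that needs `regs`
   registers expands to `regs` consecutive arguments, placed directly after
   one another. Returns the number of vector arguments. */
static size_t expand_args(const char *const *names, size_t nparams,
                          size_t regs, char *out, size_t outlen)
{
    size_t count = 0, used = 0;

    assert(outlen > 0);
    out[0] = '\0';
    for (size_t i = 0; i < nparams; i++) {
        for (size_t r = 1; r <= regs; r++) {
            int n;
            if (regs == 1)
                n = snprintf(out + used, outlen - used, "%svec_%s",
                             count ? ", " : "", names[i]);
            else
                n = snprintf(out + used, outlen - used, "%svec_%s%zu",
                             count ? ", " : "", names[i], r);
            assert(n >= 0 && (size_t)n < outlen - used);
            used += (size_t)n;
            count++;
        }
    }
    return count;
}
```

Called with the names a, b, c and regs = 2, the sketch produces the six names
in the order listed in the VLEN = 8 example, with both parts of each parameter
adjacent.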

--------------------------------------------------------------------------------

2.5. Masking of Vector Function

A masked vector function variant is used for invocation in conditional
statements (please refer to [1] for detailed information). It takes an
additional, implicit mask argument that disables processing of some of the
vector lanes, so masked vector functions require additional "mask" parameters.

Each element of a "mask" parameter has the data type of the CDT (see Section
2.2). The number of mask parameters is the same as the number of parameters
required to pass a vector of CDT elements for the given vector length. Each
element of a mask parameter must be a bit pattern of either all ones or all
zeros.

For each element of the vector, if the corresponding mask value is zero, the
return value associated with that element is zero. Mask parameters are passed
after all other parameters, in the same order as the parameters they apply
to.
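
The masking semantics can be modelled lane by lane in scalar C. This is a
sketch, not real vector code: it assumes a 32-bit "int" CDT, and the scalar
function being vectorized (squaring its argument) as well as the names
build_mask and masked_square are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Builds a per-lane mask: all ones for active lanes, all zeros otherwise. */
static void build_mask(const int *active, uint32_t *mask, size_t vlen)
{
    for (size_t i = 0; i < vlen; i++)
        mask[i] = active[i] ? UINT32_MAX : 0u;
}

/* Lane-by-lane model of the masked variant of a hypothetical scalar
   function that returns x * x. Lanes whose mask is zero are not processed
   and yield a return value of zero. */
static void masked_square(const int32_t *x, const uint32_t *mask,
                          int32_t *result, size_t vlen)
{
    for (size_t i = 0; i < vlen; i++) {
        /* Each mask element must be all ones or all zeros. */
        assert(mask[i] == 0u || mask[i] == UINT32_MAX);
        result[i] = mask[i] ? x[i] * x[i] : 0;
    }
}
```

In a real masked variant the mask would arrive as the last vector argument(s),
after all other parameters.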

--------------------------------------------------------------------------------

2.6. Vector Function Name Mangling

The name mangling of the generated vector function, based on the vector
annotation, is an important part of the Vector ABI. It allows the caller and the
callee functions to be compiled in separate files or compilation units. Using
the function prototype in header files to communicate vector function annotation
information, the compiler can perform function matching while vectorizing code
at call sites.

The vector function name is mangled as the concatenation of the following items:

<vector_prefix> <isa> <mask> <vlen> <vparameters> '_' <original_name>

The descriptions of each item are:
  * <vector_prefix>
      string "_ZGV"

  * <original_name>
      name of scalar function, including C++ and Fortran mangling

  * <isa>
      'b'    // ASIMD

  * <mask>
      'M'    // masked version
      | 'N'  // unmasked version

  * <vlen>
      decimal-number

  * <vparameters>
      /* empty */
      | <vparameter> <opt-align> <vparameters>
          o <vparameter>
          (please refer to [1] for information about parameter types used below)
              's' decimal-number // linear parameter, variable stride;
                                 // decimal-number is the position of the
                                 // stride argument, counting from 0
              | 'l' <number>     // linear parameter, constant stride
              | 'u'              // uniform parameter
              | 'v'              // vector parameter
                 o <number>
                     [n] non-negative decimal integer  // n indicates negative
          o <opt-align>
              /* empty */
              | 'a' non-negative-decimal-number

Please refer to section 2.7 Compiler generated variants of vector function for
examples of vector function mangling.
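
As an illustration of the grammar, the following C sketch assembles a mangled
name from a description of the parameters. All names in it (vparam_kind,
vparam, appendf, mangle_vector_name) are hypothetical. It also assumes that
the stride number is omitted after 'l' when the constant stride is 1, which is
what Example 1 in section 2.7 shows but the grammar does not state.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

enum vparam_kind { VP_VECTOR, VP_UNIFORM, VP_LINEAR, VP_LINEAR_VAR };

struct vparam {
    enum vparam_kind kind;
    long stride;     /* VP_LINEAR: constant stride;
                        VP_LINEAR_VAR: position of the stride argument */
    unsigned align;  /* 0 when there is no aligned clause */
};

/* Appends formatted text to out; returns -1 if it does not fit. */
static int appendf(char *out, size_t outlen, size_t *used, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(out + *used, outlen - *used, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= outlen - *used)
        return -1;
    *used += (size_t)n;
    return 0;
}

/* Builds "_ZGV" <isa> <mask> <vlen> <vparameters> "_" <original_name>.
   Returns 0 on success, -1 on error. */
static int mangle_vector_name(char isa, int masked, unsigned vlen,
                              const struct vparam *p, size_t nparams,
                              const char *original_name,
                              char *out, size_t outlen)
{
    size_t used = 0;

    if (outlen == 0 || original_name == NULL || strlen(original_name) == 0)
        return -1;
    if (appendf(out, outlen, &used, "_ZGV%c%c%u",
                isa, masked ? 'M' : 'N', vlen) != 0)
        return -1;
    for (size_t i = 0; i < nparams; i++) {
        int rc = 0;
        switch (p[i].kind) {
        case VP_VECTOR:
            rc = appendf(out, outlen, &used, "v");
            break;
        case VP_UNIFORM:
            rc = appendf(out, outlen, &used, "u");
            break;
        case VP_LINEAR_VAR:
            rc = appendf(out, outlen, &used, "s%ld", p[i].stride);
            break;
        case VP_LINEAR:
            if (p[i].stride == 1)       /* assumption: number omitted */
                rc = appendf(out, outlen, &used, "l");
            else if (p[i].stride < 0)   /* 'n' marks a negative stride */
                rc = appendf(out, outlen, &used, "ln%ld", -p[i].stride);
            else
                rc = appendf(out, outlen, &used, "l%ld", p[i].stride);
            break;
        }
        if (rc == 0 && p[i].align != 0)
            rc = appendf(out, outlen, &used, "a%u", p[i].align);
        if (rc != 0)
            return -1;
    }
    return appendf(out, outlen, &used, "_%s", original_name);
}
```

Given isa 'b', an unmasked variant, VLEN 4, and the parameters uniform with
alignment 16, vector, and linear with stride 1, the sketch yields
_ZGVbN4ua16vl_foo for the scalar name "foo".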

--------------------------------------------------------------------------------

2.7. Compiler generated variants of vector function

The compiler's architecture selection flags have no impact on ISA selection for
the generated vector variants.

For each ISA, the compiler should generate both masked and unmasked vector
variants, unless one of them is selected explicitly with the corresponding
clause. Compiler implementations must not generate calls to versions for other
ISAs unless some non-standard pragma or clause is used to declare that those
other versions are available.

Example 1.
#pragma omp declare simd uniform(q) aligned(q:16) linear(k:1)
float foo(float *q, float x, int k)
{
    q[k] = q[k] + x;
    return q[k];
}

Below is the list of generated function names, or the list of symbols provided
by a library with the same pragma on the "foo" prototype.

1) _ZGVbN4ua16vl_foo (ASIMD ISA, unmasked version)
2) _ZGVbM4ua16vl_foo (ASIMD ISA, masked version)

Here "foo" is the original mangled function name, "_ZGV" is the prefix of the
vector function name, "b" indicates the ASIMD ISA, "N" indicates an unmasked
version, "M" indicates a masked version, "4" is the vector length for the ASIMD
ISA, "ua16" indicates uniform(q) and aligned(q:16), "v" indicates that the
second argument x is a vector argument, and "l" indicates linear(k:1), that is,
k is a linear variable whose stride is 1.

Example 2.
#pragma omp declare simd notinbranch
double foo(double x)
{
    return x*x;
}

Below is the list of generated function names, or the list of symbols provided
by a library with the same pragma on the "foo" prototype.

1) _ZGVbN2v_foo (ASIMD ISA, unmasked version)

Here "foo" is the original mangled function name, "_ZGV" is the prefix of the
vector function name, "b" indicates the ASIMD ISA, "N" indicates an unmasked
version, "2" is the vector length for the ASIMD ISA, and "v" indicates that the
single argument x is a vector argument.
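
Reading a mangled name back into its components, as done in prose in the two
examples, can be sketched in C as follows. The struct vector_name and the
function parse_vector_name are hypothetical. The sketch relies on the fact
that, per the grammar in section 2.6, <vparameters> consists only of letters
and digits, so the first '_' after <vlen> separates it from the original name.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct vector_name {
    char isa;                   /* 'b' for ASIMD in this proposal */
    int masked;                 /* 1 for 'M', 0 for 'N' */
    unsigned long vlen;
    const char *vparams;        /* points into the input string */
    size_t vparams_len;         /* length of the <vparameters> part */
    const char *original_name;  /* points into the input string */
};

/* Splits a mangled vector function name into its components.
   Returns 0 on success, -1 if the name does not match the layout. */
static int parse_vector_name(const char *s, struct vector_name *out)
{
    char *end;
    const char *sep;

    if (strncmp(s, "_ZGV", 4) != 0)
        return -1;
    s += 4;
    if (s[0] == '\0' || (s[1] != 'M' && s[1] != 'N'))
        return -1;
    out->isa = s[0];
    out->masked = (s[1] == 'M');
    s += 2;
    if (!isdigit((unsigned char)*s))
        return -1;
    out->vlen = strtoul(s, &end, 10);
    sep = strchr(end, '_');
    if (sep == NULL)
        return -1;
    out->vparams = end;
    out->vparams_len = (size_t)(sep - end);
    out->original_name = sep + 1;
    return 0;
}
```

The sketch does not validate the individual <vparameter> codes; it only
splits the name into the fields described above.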

================================================================================

3. References

[1] OpenMP 4.0 Specification
http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf

[2] Vector Function Application Binary Interface Specification for OpenMP (x86)
https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt

[3] Procedure Call Standard for the ARM 64-bit Architecture (AArch64)
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
