public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/96342] New: [SVE] Add support for "omp declare simd"
@ 2020-07-27 16:51 rsandifo at gcc dot gnu.org
  2020-08-05  2:46 ` [Bug target/96342] " yangyang305 at huawei dot com
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2020-07-27 16:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96342

            Bug ID: 96342
           Summary: [SVE] Add support for "omp declare simd"
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rsandifo at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64*-*-*

We don't yet generate SVE functions for "omp declare simd".  See:

https://github.com/ARM-software/abi-aa/tree/master/vfabia64

for how this is supposed to work.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/96342] [SVE] Add support for "omp declare simd"
  2020-07-27 16:51 [Bug target/96342] New: [SVE] Add support for "omp declare simd" rsandifo at gcc dot gnu.org
@ 2020-08-05  2:46 ` yangyang305 at huawei dot com
  2020-08-05  8:53 ` rsandifo at gcc dot gnu.org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: yangyang305 at huawei dot com @ 2020-08-05  2:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96342

yangyang <yangyang305 at huawei dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |yangyang305 at huawei dot com

--- Comment #1 from yangyang <yangyang305 at huawei dot com> ---
I'm looking into this.  Will update when I have something to discuss.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/96342] [SVE] Add support for "omp declare simd"
  2020-07-27 16:51 [Bug target/96342] New: [SVE] Add support for "omp declare simd" rsandifo at gcc dot gnu.org
  2020-08-05  2:46 ` [Bug target/96342] " yangyang305 at huawei dot com
@ 2020-08-05  8:53 ` rsandifo at gcc dot gnu.org
  2020-10-21  8:38 ` yangyang305 at huawei dot com
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2020-08-05  8:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96342

--- Comment #2 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
That's great, thanks for the heads-up.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/96342] [SVE] Add support for "omp declare simd"
  2020-07-27 16:51 [Bug target/96342] New: [SVE] Add support for "omp declare simd" rsandifo at gcc dot gnu.org
  2020-08-05  2:46 ` [Bug target/96342] " yangyang305 at huawei dot com
  2020-08-05  8:53 ` rsandifo at gcc dot gnu.org
@ 2020-10-21  8:38 ` yangyang305 at huawei dot com
  2020-10-21  8:39 ` yangyang305 at huawei dot com
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: yangyang305 at huawei dot com @ 2020-10-21  8:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96342

--- Comment #3 from yangyang <yangyang305 at huawei dot com> ---
Hi,
    Sorry for the slow reply. After studying the specification of SVE "omp
declare simd" and GCC's current implementation of "omp declare simd", I have
developed a rough plan to support GCC to generating SVE functions for "omp
declare simd". However, there are still some uncertainties in the plan which
might need further discussion. 

    The work is mainly composed of three parts: the generating of SVE functions
for "omp declare simd" in pass_omp_simd_clone, the supporting of SVE PCS of
non-builtin types, and the generating of the call of SVE vectoried functions in
pass_vect. I plan to finish this work in the following five steps, each step
corresponds to a patch:

Part 1) Change the type of the field "simdlen" of struct cgraph_simd_clone from
unsigned int to poly_uint64 and related adaptation. Since the length might be
variable for the SVE cases.

PR96342-part1-v1.patch

Part 2) During debugging, I find that all the calls to interface
simd_clone_subparts needing to be replaced with calls to TYPE_VECTOR_SUBPARTS
due to the introduction of SVE simdclones. So I plan to complete all the
replacements in a patch.

PR96342-part1-v2.patch

Part 3) Add the generating of VLA SVE (vector length agnostic, without
"simdlen") functions for "omp declare simd" and skip the VLS (vector length
specific) ones, specifically:

a) In aarch64_simd_clone_compute_vecsize_and_simdlen, add 1 to “count” when
TARGE_SVE is specified.

b) Add bool type field “always_masked” in struct cgraph_simd_clone to mark
simdclones that always masked and skip the generating of noinbranch version
when always_masked is true. In aarch64_simd_clone_compute_vecsize_and_simdlen,
set it to true when processing SVE simdclones.

c) In aarch64_simd_clone_compute_vecsize_and_simdlen, set the “vecsize_mangle”
to ‘s’, and the “vec_bits” to BITS_PER_SVE_VECTOR when processing VLA SVE
simdclones. Report an unsupported warning when processing VLS SVE simdclones.

d) Adjust simd_clone_mangle.

e) Support SVE masking: For SVE vector functions, masked signatures are
generated by add a svbool_t mask (corresponds to a predicate register) as the
last parameter. Since aarch64 GCC currently doesn’t support muti-types
simdclones, the input predicate works for all the types, GCC doesn’t need to do
special adjustment. For now, I plan to follow current scheme, transform the
input predicate into a bool array with [16, 16] elements (since the input
predicate always has a mode of VNx16BImode), and use the active elements to
build the branch, the following gimple stmts are expected to be generated:

MEM <vector([16,16]) <signed-boolean:1>> [(<signed-boolean:1> *)&mask.34] =
mask.37_17(D);
…
_9 = iter.38_6 * 4;
_8 = mask.34[_9];
if (_8 == 0)
…

The number 4 in _9 = iter.38_6 * 4; comes from arg_unit_size / mask_unit_size.
For how to do this, set “clonei->mask_mode” to VNx16BImode when processing SVE
simdclones in aarch64_simd_clone_compute_vecsize_and_simdlen. And when
processing cgraph_simd_clone->mask_mode in common codes, add special treatment
if cgraph_simd_clone->mask_mode != VOIDmode and cgraph_simd_clone->mask_mode is
VECTOR_MODE, which corresponds to the SVE cases (It’s OK to do so since
cgraph_simd_clone->mask_mode != VOIDmode is established only when the mask is
passed in integer argument(s) in current GCC).

f) In pass_expand, only when a “SVE type” attribute is added to the tree nodes
of the types of arguments and return type, these types use the SVE PCS. For
now, GCC only has a mechanism for adding attributes to SVE builtin type, so I
plan to define a new hook to add attribute to the types of arguments and return
type of simdclones generated if needed. The related processing functions are
planned to be moved to aarch64.c from aarch64-sve-builtin.cc in addition.

Part 4) Add the generating of VLS SVE functions for "omp declare simd". The
specification writes: “When using a simdlen(len) clause, the compiler expects a
VLS vector version of the function that is tuned for a specific implementation
of SVE. ”. Therefore I think only when the number of bits in a SVE vector
register of the target is specified and coincides with the simdlen clause, GCC
is supposed to generate the VLS SVE functions for "omp declare simd",
specifically:

a) In aarch64_simd_clone_compute_vecsize_and_simdlen, when processing VLS SVE
simdclones, if the number of bits in an SVE vector register is specified and
coincides with the simdlen clause, set “clonei->vecsize_mangle”,
“clonei->mask_mode”, and “clonei->always_masked” and calculate the “vec_bits”,
otherwise report a warning and return NULL.

b) In this case, the field "simdlen" is a constant, so using build_vector_type
to build the vector type will get an advanced SIMD version instead of a SVE
version, which seems to be wrong. I plan to add a new hook. The hook does some
special treatment to build a SVE version vector type when processing VLS SVE
simdclones, while call build_vector_type directly in other cases.

Part 5) Generate the call of SVE vectoried functions in pass_vect,
specifically:

a) Define a new hook that return true if the target support variable vector
length simdclones and set the aarch64 return value to true if TARGET_SVE. In
vectorizable_simd_clone_call, continue analyzing instead of directly returning
false.

b) Adjustment to the calculation of badness.

c) The generating of mask.

    Since there is still not enough debugging, the detailed implementation
plans of Part 5) b) and Part 5) c) have not been developed yet.

    For now, I’m working on Part 3) and Part 4). I think it’s necessary to
propose the plan to be reviewed and see if there is any suggestion, since there
are many detailed designs that I’m not sure whether they are the best ways to
do so, any comments?

    In addition, I have finished the first two patches and attached them on
this PR. Is it necessary to send the patchs to the GCC patches mailing list for
reviewing?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/96342] [SVE] Add support for "omp declare simd"
  2020-07-27 16:51 [Bug target/96342] New: [SVE] Add support for "omp declare simd" rsandifo at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2020-10-21  8:38 ` yangyang305 at huawei dot com
@ 2020-10-21  8:39 ` yangyang305 at huawei dot com
  2020-10-21  8:39 ` yangyang305 at huawei dot com
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: yangyang305 at huawei dot com @ 2020-10-21  8:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96342

--- Comment #4 from yangyang <yangyang305 at huawei dot com> ---
Created attachment 49413
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49413&action=edit
part1-patch

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/96342] [SVE] Add support for "omp declare simd"
  2020-07-27 16:51 [Bug target/96342] New: [SVE] Add support for "omp declare simd" rsandifo at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2020-10-21  8:39 ` yangyang305 at huawei dot com
@ 2020-10-21  8:39 ` yangyang305 at huawei dot com
  2020-10-26 16:20 ` rsandifo at gcc dot gnu.org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: yangyang305 at huawei dot com @ 2020-10-21  8:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96342

--- Comment #5 from yangyang <yangyang305 at huawei dot com> ---
Created attachment 49414
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49414&action=edit
part2-patch

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/96342] [SVE] Add support for "omp declare simd"
  2020-07-27 16:51 [Bug target/96342] New: [SVE] Add support for "omp declare simd" rsandifo at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2020-10-21  8:39 ` yangyang305 at huawei dot com
@ 2020-10-26 16:20 ` rsandifo at gcc dot gnu.org
  2020-10-26 16:27 ` rsandifo at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2020-10-26 16:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96342

--- Comment #6 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Comment on attachment 49413
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49413
part1-patch

Thanks for the summary and patches, and sorry for the delayed reply.

Taking part1-patch first: this generally looks good to me.  Some minor
things:

- It isn't necessary to use poly_int stuff in i386.c, since the new
  poly_uint64 will naturally decay to a uint64_t in i386 target files.
  It might still be necessary to update some of the printf formats
  though.

- It's more correct to use "unsigned HOST_WIDE_INT" instead of
  "long unsigned int" and %wd instead of %ld.  One reason is that
  we support cross builds from 32-bit hosts, where long is 32 bits.
  Long is also 32 bits for ILP32.

- Some of the multiple_p calls can instead use exact_div, which is
  the preferred way of performing a division that is known to have
  no remainder.

I think it's clear that we need to make this change, and since it's
a natural "poly_intification", I don't think it needs to wait for
other SVE work to be completed.  So would you be able to submit
the patch to the list independently of the other work?  Stage 1
closes in three weeks' time so it would be good to get this in
before then.

Thanks again for doing this.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/96342] [SVE] Add support for "omp declare simd"
  2020-07-27 16:51 [Bug target/96342] New: [SVE] Add support for "omp declare simd" rsandifo at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2020-10-26 16:20 ` rsandifo at gcc dot gnu.org
@ 2020-10-26 16:27 ` rsandifo at gcc dot gnu.org
  2020-10-26 16:53 ` rsandifo at gcc dot gnu.org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2020-10-26 16:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96342

--- Comment #7 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Comment on attachment 49414
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49414
part2-patch

Nice :-)

For the constant_multiple_p calls that calculate a vector multiple,
it might be good to have a macro in poly-int-types.h with the same
definition as vector_element_size, but with a name that more closely
matches its use here.  Maybe "vector_unroll_factor" or something
(suggestions for better names welcome -- I'm not too happy with
my attempt.)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/96342] [SVE] Add support for "omp declare simd"
  2020-07-27 16:51 [Bug target/96342] New: [SVE] Add support for "omp declare simd" rsandifo at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2020-10-26 16:27 ` rsandifo at gcc dot gnu.org
@ 2020-10-26 16:53 ` rsandifo at gcc dot gnu.org
  2020-11-03 16:14 ` cvs-commit at gcc dot gnu.org
  2023-02-08 11:48 ` avieira at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2020-10-26 16:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96342

--- Comment #8 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
(In reply to yangyang from comment #3)
>     The work is mainly composed of three parts: the generating of SVE
> functions for "omp declare simd" in pass_omp_simd_clone, the supporting of
> SVE PCS of non-builtin types, and the generating of the call of SVE
> vectoried functions in pass_vect. I plan to finish this work in the
> following five steps, each step corresponds to a patch:
This plan looks really good to me, thanks.  I agree with everything
I've snipped in this reply.

> f) In pass_expand, only when a “SVE type” attribute is added to the tree
> nodes of the types of arguments and return type, these types use the SVE
> PCS. For now, GCC only has a mechanism for adding attributes to SVE builtin
> type, so I plan to define a new hook to add attribute to the types of
> arguments and return type of simdclones generated if needed. The related
> processing functions are planned to be moved to aarch64.c from
> aarch64-sve-builtin.cc in addition.
It's a very minor detail, sorry, but I'd prefer to keep stuff in
aarch64-sve-builtins.cc if possible, and simply export the functions
that we need via aarch64-protos.h.

> Part 4) Add the generating of VLS SVE functions for "omp declare simd". The
> specification writes: “When using a simdlen(len) clause, the compiler
> expects a VLS vector version of the function that is tuned for a specific
> implementation of SVE. ”. Therefore I think only when the number of bits in
> a SVE vector register of the target is specified and coincides with the
> simdlen clause, GCC is supposed to generate the VLS SVE functions for "omp
> declare simd",
I think in principle we should generate this unconditionally.
There are two possible approaches, in increasing order of
quality of implementation:

(1) Divide the problem into three cases:

    (a) -msve-vector-bits=scalable

        In this case, generate VLA code for the VLS routines.
        The point here is that the VLS interface guarantees
        that the SVE registers are a particular size, but the
        compiler is not required to take advantage of that
        information.  Using VLA code is a valid implementation
        choice.

    (b) -msve-vectors-bits=N, N matches the simdlen

        For this we'd generate VLS code in the way that you
        describe.

    (c) -msve-vectors-bits=N, N does not match the simdlen

        We should silently accept this for declarations, but emit
        a warning or an error if the compiler needs to generate a
        definition.

(2) Allow -msve-vector-bits= to vary on a function-by-function basis,
    in the same way that the set of target features can already vary
    on a function-by-function basis.  Then, as a follow-on change,
    use this feature to generate VLS code for whichever simdlen
    the code specifies.

(2) is likely to be tricky, so I'd recommend starting with (1)
and treating (2) as a potential future optimisation.

> Part 5) Generate the call of SVE vectoried functions in pass_vect,
> specifically:
> 
> a) Define a new hook that return true if the target support variable vector
> length simdclones and set the aarch64 return value to true if TARGET_SVE. In
> vectorizable_simd_clone_call, continue analyzing instead of directly
> returning false.
It would be good to generalise existing hooks if possible, rather than
add one specifically for VLA vs. VLS.

>     In addition, I have finished the first two patches and attached them on
> this PR. Is it necessary to send the patchs to the GCC patches mailing list
> for reviewing?
Yeah, if you could send them to gcc-patches, that'd be great.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/96342] [SVE] Add support for "omp declare simd"
  2020-07-27 16:51 [Bug target/96342] New: [SVE] Add support for "omp declare simd" rsandifo at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2020-10-26 16:53 ` rsandifo at gcc dot gnu.org
@ 2020-11-03 16:14 ` cvs-commit at gcc dot gnu.org
  2023-02-08 11:48 ` avieira at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-11-03 16:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96342

--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>:

https://gcc.gnu.org/g:abe93733a265f8a8b56dbdd307380f8c83dd3ab5

commit r11-4676-gabe93733a265f8a8b56dbdd307380f8c83dd3ab5
Author: Yang Yang <yangyang305@huawei.com>
Date:   Tue Nov 3 16:13:47 2020 +0000

    PR target/96342 Change field "simdlen" into poly_uint64

    This is the first patch of PR96342. In order to add support for
    "omp declare simd", change the type of the field "simdlen" of
    struct cgraph_simd_clone from unsigned int to poly_uint64 and
    related adaptation. Since the length might be variable for the
    SVE cases.

    2020-11-03  Yang Yang  <yangyang305@huawei.com>

    gcc/ChangeLog:

            * cgraph.h (struct cgraph_simd_clone): Change field "simdlen" of
            struct cgraph_simd_clone from unsigned int to poly_uint64.
            * config/aarch64/aarch64.c
            (aarch64_simd_clone_compute_vecsize_and_simdlen): adaptation of
            operations on "simdlen".
            * config/i386/i386.c (ix86_simd_clone_compute_vecsize_and_simdlen):
            Printf formats update.
            * gengtype.c (main): Handle poly_uint64.
            * omp-simd-clone.c (simd_clone_mangle): Likewise.Re
            (simd_clone_adjust_return_type): Likewise.
            (create_tmp_simd_array): Likewise.
            (simd_clone_adjust_argument_types): Likewise.
            (simd_clone_init_simd_arrays): Likewise.
            (ipa_simd_modify_function_body): Likewise.
            (simd_clone_adjust): Likewise.
            (expand_simd_clones): Likewise.
            * poly-int-types.h (vector_unroll_factor): New macro.
            * poly-int.h (constant_multiple_p): Add two-argument versions.
            * tree-vect-stmts.c (vectorizable_simd_clone_call): Likewise.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/96342] [SVE] Add support for "omp declare simd"
  2020-07-27 16:51 [Bug target/96342] New: [SVE] Add support for "omp declare simd" rsandifo at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2020-11-03 16:14 ` cvs-commit at gcc dot gnu.org
@ 2023-02-08 11:48 ` avieira at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: avieira at gcc dot gnu.org @ 2023-02-08 11:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96342

--- Comment #10 from avieira at gcc dot gnu.org ---
yang I assume you are no longer working on this?

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-02-08 11:48 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-27 16:51 [Bug target/96342] New: [SVE] Add support for "omp declare simd" rsandifo at gcc dot gnu.org
2020-08-05  2:46 ` [Bug target/96342] " yangyang305 at huawei dot com
2020-08-05  8:53 ` rsandifo at gcc dot gnu.org
2020-10-21  8:38 ` yangyang305 at huawei dot com
2020-10-21  8:39 ` yangyang305 at huawei dot com
2020-10-21  8:39 ` yangyang305 at huawei dot com
2020-10-26 16:20 ` rsandifo at gcc dot gnu.org
2020-10-26 16:27 ` rsandifo at gcc dot gnu.org
2020-10-26 16:53 ` rsandifo at gcc dot gnu.org
2020-11-03 16:14 ` cvs-commit at gcc dot gnu.org
2023-02-08 11:48 ` avieira at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).