public inbox for gcc-bugs@sourceware.org
* [Bug target/99195] New: Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
@ 2021-02-22 10:27 ktkachov at gcc dot gnu.org
  2021-03-04 11:55 ` [Bug target/99195] " ktkachov at gcc dot gnu.org
                   ` (20 more replies)
  0 siblings, 21 replies; 22+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2021-02-22 10:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

            Bug ID: 99195
           Summary: Optimise away vec_concat of 64-bit AdvancedSIMD
                    operations with zeroes in aarch64
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64

Motivating testcases:
#include <arm_neon.h>

#define ONE(OT,IT,OP,S)                         \
OT                                              \
foo_##OP##_##S (IT a, IT b)                     \
{                                               \
  IT zeros = vcreate_##S (0);                   \
  return vcombine_##S (v##OP##_##S (a, b), zeros);      \
}


#define FUNC(T,IS,OS,OP,S) ONE (T##x##OS##_t, T##x##IS##_t, OP, S)

#define OPTWO(T,IS,OS,S,OP1,OP2)        \
FUNC (T, IS, OS, OP1, S)                \
FUNC (T, IS, OS, OP2, S)

#define OPTHREE(T, IS, OS, S, OP1, OP2, OP3)    \
FUNC (T, IS, OS, OP1, S)        \
OPTWO (T, IS, OS, S, OP2, OP3)

#define OPFOUR(T,IS,OS,S,OP1,OP2,OP3,OP4)       \
FUNC (T, IS, OS, OP1, S)                \
OPTHREE (T, IS, OS, S, OP2, OP3, OP4)

#define OPFIVE(T,IS,OS,S,OP1,OP2,OP3,OP4, OP5)  \
FUNC (T, IS, OS, OP1, S)                \
OPFOUR (T, IS, OS, S, OP2, OP3, OP4, OP5)

#define OPSIX(T,IS,OS,S,OP1,OP2,OP3,OP4,OP5,OP6)        \
FUNC (T, IS, OS, OP1, S)                \
OPFIVE (T, IS, OS, S, OP2, OP3, OP4, OP5, OP6)

OPSIX (int8, 8, 16, s8, add, sub, mul, and, orr, eor)
OPSIX (int16, 4, 8, s16, add, sub, mul, and, orr, eor)
OPSIX (int32, 2, 4, s32, add, sub, mul, and, orr, eor)
OPFIVE (int64, 1, 2, s64, add, sub, and, orr, eor)

OPSIX (uint8, 8, 16, u8, add, sub, mul, and, orr, eor)
OPSIX (uint16, 4, 8, u16, add, sub, mul, and, orr, eor)
OPSIX (uint32, 2, 4, u32, add, sub, mul, and, orr, eor)
OPFIVE (uint64, 1, 2, u64, add, sub, and, orr, eor)

for example generates:
foo_add_s8:
        add     v0.8b, v0.8b, v1.8b
        mov     v0.8b, v0.8b
        ret

The 64-bit V8QI ADD instruction implicitly zeroes out the top bits of the
128-bit destination so the vec_concat with zeroes can be represented easily.
However, we don't have such a pattern for all the Advanced SIMD operations that
we support. Indeed, it would bloat the MD files quite a bit. Can we come up
with a define_subst scheme to auto-generate the patterns to match things like:
(set (reg:V16QI 93 [ <retval> ])
    (vec_concat:V16QI (plus:V8QI (reg:V8QI 98)
            (reg:V8QI 99))
        (const_vector:V8QI [
                (const_int 0 [0]) repeated x8
            ])))
?
Then we should be able to just generate:
foo_add_s8:
        add     v0.8b, v0.8b, v1.8b
        ret
etc.
The testcase above shows the problem for some of the simple binary ops, but
there are many more instructions that can benefit from this.
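
As a rough illustration of why the extra mov is redundant, the portable C
sketch below models a 128-bit register as 16 bytes. The names qreg_model,
model_add_8b, model_combine_with_zero and combine_is_redundant are
hypothetical; they only model the behaviour described above and are not GCC or
ACLE APIs.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 128-bit Advanced SIMD register as 16 bytes.  */
typedef struct { uint8_t b[16]; } qreg_model;

/* Models "add vD.8b, vN.8b, vM.8b": a lane-wise add on the low 8 bytes,
   with the upper 8 bytes of the destination cleared.  */
static qreg_model
model_add_8b (const qreg_model *n, const qreg_model *m)
{
  qreg_model d;
  memset (&d, 0, sizeof d);
  for (int i = 0; i < 8; i++)
    d.b[i] = (uint8_t) (n->b[i] + m->b[i]);
  return d;
}

/* Models vcombine_s8 (x, vcreate_s8 (0)): keep the low half of the
   register and force the high half to zero.  */
static qreg_model
model_combine_with_zero (const qreg_model *x)
{
  qreg_model d = *x;
  memset (d.b + 8, 0, 8);
  return d;
}

/* Returns 1 when the explicit combine-with-zero leaves the result of the
   64-bit add unchanged, i.e. when the extra instruction does no work.  */
static int
combine_is_redundant (const qreg_model *n, const qreg_model *m)
{
  qreg_model sum = model_add_8b (n, m);
  qreg_model widened = model_combine_with_zero (&sum);
  return memcmp (&sum, &widened, sizeof sum) == 0;
}
```

In this model combine_is_redundant holds for any inputs, because the 64-bit
add has already cleared the bytes that the combine would zero.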


* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
  2021-02-22 10:27 [Bug target/99195] New: Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64 ktkachov at gcc dot gnu.org
@ 2021-03-04 11:55 ` ktkachov at gcc dot gnu.org
  2021-03-07  2:05 ` pinskia at gcc dot gnu.org
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2021-03-04 11:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #1 from ktkachov at gcc dot gnu.org ---
Using a define_subst like:
(define_subst "add_vec_concat_subst"
  [(set (match_operand:VDMOV 0 "" "")
        (match_operand:VDMOV 1 "" ""))]
  "!BYTES_BIG_ENDIAN"
  [(set (match_operand:<VDBL> 0 "register_operand" "=w")
        (vec_concat:<VDBL>
         (match_dup 1)
         (match_operand:VDMOV 2 "aarch64_simd_or_scalar_imm_zero")))]
)

(define_subst_attr "add_vec_concat" "add_vec_concat_subst" "" "_vec_concat")

and adding it to patterns in aarch64-simd.md through <add_vec_concat> seems to
work. It doesn't handle the big-endian case, but maybe we can handle that
separately (or with a second define_subst?)

Does this approach make sense?


* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
  2021-02-22 10:27 [Bug target/99195] New: Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64 ktkachov at gcc dot gnu.org
  2021-03-04 11:55 ` [Bug target/99195] " ktkachov at gcc dot gnu.org
@ 2021-03-07  2:05 ` pinskia at gcc dot gnu.org
  2023-04-21 17:57 ` cvs-commit at gcc dot gnu.org
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-03-07  2:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-03-07

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.


* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
  2021-02-22 10:27 [Bug target/99195] New: Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64 ktkachov at gcc dot gnu.org
  2021-03-04 11:55 ` [Bug target/99195] " ktkachov at gcc dot gnu.org
  2021-03-07  2:05 ` pinskia at gcc dot gnu.org
@ 2023-04-21 17:57 ` cvs-commit at gcc dot gnu.org
  2023-04-23 13:41 ` cvs-commit at gcc dot gnu.org
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-04-21 17:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:f824216cdb078ea9de0980ae066a0e1e83494fd2

commit r14-153-gf824216cdb078ea9de0980ae066a0e1e83494fd2
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Fri Apr 21 18:56:21 2023 +0100

    aarch64: PR target/99195 Add scheme to optimise away vec_concat with zeroes on 64-bit Advanced SIMD ops

    I finally got around to trying out the define_subst approach for PR
    target/99195.
    The problem we have is that many Advanced SIMD instructions have 64-bit
    vector variants that clear the top half of the 128-bit Q register. This
    would allow the compiler to avoid generating explicit zeroing instructions
    to concat the 64-bit result with zeroes for code like:
    vcombine_u16(vadd_u16(a, b), vdup_n_u16(0))
    We've been getting user reports of GCC missing this optimisation in
    real-world code, so it's worth doing something about it.
    The straightforward approach that we've been taking so far is adding extra
    patterns in aarch64-simd.md that match the 64-bit result in a vec_concat
    with zeroes. Unfortunately for big-endian the vec_concat operands to match
    have to be the other way around, so we would end up adding two extra
    define_insns.
    This would lead to too much bloat in aarch64-simd.md.

    This patch defines a pair of define_subst constructs that allow us to
    annotate patterns in aarch64-simd.md with the <vczle> and <vczbe>
    subst_attrs and the compiler will automatically produce the vec_concat
    widening patterns, properly gated for BYTES_BIG_ENDIAN when needed. This
    seems like the least intrusive way to describe the extra zeroing semantics.

    I've had a look at the generated insn-*.cc files in the build directory and
    it seems that define_subst does what we want it to do when applied multiple
    times on a pattern in terms of insn conditions and modes.

    This patch adds the define_subst machinery and adds the annotations to some
    of the straightforward binary and unary integer operations. Many more such
    annotations are possible and I aim to add them in future patches if this
    approach is acceptable.

    Bootstrapped and tested on aarch64-none-linux-gnu and on
    aarch64_be-none-elf.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (add_vec_concat_subst_le): Define.
            (add_vec_concat_subst_be): Likewise.
            (vczle): Likewise.
            (vczbe): Likewise.
            (add<mode>3): Rename to...
            (add<mode>3<vczle><vczbe>): ... This.
            (sub<mode>3): Rename to...
            (sub<mode>3<vczle><vczbe>): ... This.
            (mul<mode>3): Rename to...
            (mul<mode>3<vczle><vczbe>): ... This.
            (and<mode>3): Rename to...
            (and<mode>3<vczle><vczbe>): ... This.
            (ior<mode>3): Rename to...
            (ior<mode>3<vczle><vczbe>): ... This.
            (xor<mode>3): Rename to...
            (xor<mode>3<vczle><vczbe>): ... This.
            * config/aarch64/iterators.md (VDZ): Define.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_1.c: New test.
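
The vcombine_u16 expression quoted in the commit message can be exercised with
a small sketch like the one below. The function names add_widen_u16,
add_widen_u16_ref and add_widen_u16_matches_ref are made up for illustration.
The intrinsic path is only compiled on AArch64; elsewhere only the scalar
reference runs. This is not the contents of pr99195_1.c.

```c
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>

/* The source form quoted in the commit message: a 64-bit add whose result
   is widened to 128 bits by combining it with a zero vector.  */
static uint16x8_t
add_widen_u16 (uint16x4_t a, uint16x4_t b)
{
  return vcombine_u16 (vadd_u16 (a, b), vdup_n_u16 (0));
}
#endif

/* Portable scalar reference: the low four lanes hold the wrapping sums,
   the high four lanes are zero.  */
static void
add_widen_u16_ref (const uint16_t a[4], const uint16_t b[4], uint16_t out[8])
{
  for (int i = 0; i < 4; i++)
    out[i] = (uint16_t) (a[i] + b[i]);
  for (int i = 4; i < 8; i++)
    out[i] = 0;
}

/* Returns 1 when the intrinsic version agrees with the scalar reference.
   On non-AArch64 hosts there is nothing to compare, so it returns 1.  */
static int
add_widen_u16_matches_ref (const uint16_t a[4], const uint16_t b[4])
{
  uint16_t expect[8];
  add_widen_u16_ref (a, b, expect);
#if defined(__aarch64__)
  uint16_t got[8];
  vst1q_u16 (got, add_widen_u16 (vld1_u16 (a), vld1_u16 (b)));
  for (int i = 0; i < 8; i++)
    if (got[i] != expect[i])
      return 0;
#endif
  return 1;
}
```

Whether the compiler emits a single add for add_widen_u16 depends on the GCC
version and options, so inspecting the generated assembly is the way to
confirm the optimisation.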


* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
  2021-02-22 10:27 [Bug target/99195] New: Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64 ktkachov at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-04-21 17:57 ` cvs-commit at gcc dot gnu.org
@ 2023-04-23 13:41 ` cvs-commit at gcc dot gnu.org
  2023-04-25 13:55 ` cvs-commit at gcc dot gnu.org
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-04-23 13:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:3b13c59c835f92b353ef318398e39907cdeec4fa

commit r14-178-g3b13c59c835f92b353ef318398e39907cdeec4fa
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Sun Apr 23 14:40:17 2023 +0100

    aarch64: Add vec_concat with zeroes annotation to addp pattern

    Similar to others, the addp pattern can be safely annotated with
    <vczle><vczbe> to create the implicit vec_concat-with-zero variants.

    Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (aarch64_addp<mode>): Rename to...
            (aarch64_addp<mode><vczle><vczbe>): ... This.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_1.c: Add testing for vpadd
            intrinsics.


* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
  2021-02-22 10:27 [Bug target/99195] New: Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64 ktkachov at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-04-23 13:41 ` cvs-commit at gcc dot gnu.org
@ 2023-04-25 13:55 ` cvs-commit at gcc dot gnu.org
  2023-04-28  8:34 ` cvs-commit at gcc dot gnu.org
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-04-25 13:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:9e9503e7b2c1517e8c46ea4d2e8805cc20301f34

commit r14-222-g9e9503e7b2c1517e8c46ea4d2e8805cc20301f34
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Tue Apr 25 14:52:37 2023 +0100

    aarch64: PR target/99195 Annotate more simple integer binary patterns with vcz subst rules

    This patch adds more straightforward annotations to some more integer
    binary ops to eliminate redundant fmovs around 64-bit SIMD results.

    Bootstrapped and tested on aarch64-none-linux.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (orn<mode>3): Rename to...
            (orn<mode>3<vczle><vczbe>): ... This.
            (bic<mode>3): Rename to...
            (bic<mode>3<vczle><vczbe>): ... This.
            (<su><maxmin><mode>3): Rename to...
            (<su><maxmin><mode>3<vczle><vczbe>): ... This.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_1.c: Add tests for orn, bic, max
            and min.


* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
  2021-02-22 10:27 [Bug target/99195] New: Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64 ktkachov at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2023-04-25 13:55 ` cvs-commit at gcc dot gnu.org
@ 2023-04-28  8:34 ` cvs-commit at gcc dot gnu.org
  2023-05-03 10:16 ` cvs-commit at gcc dot gnu.org
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-04-28  8:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:889a0791c632aa2804c4e01cc7dddca1ae0d229c

commit r14-320-g889a0791c632aa2804c4e01cc7dddca1ae0d229c
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Fri Apr 28 09:33:16 2023 +0100

    aarch64: PR target/99195 annotate more integer unary patterns for vec-concat with zero

    More of the straightforward cases to annotate plus tests, this time for
    simple integer unary ops.
    Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (aarch64_rbit<mode>): Rename to...
            (aarch64_rbit<mode><vczle><vczbe>): ... This.
            (neg<mode>2): Rename to...
            (neg<mode>2<vczle><vczbe>): ... This.
            (abs<mode>2): Rename to...
            (abs<mode>2<vczle><vczbe>): ... This.
            (aarch64_abs<mode>): Rename to...
            (aarch64_abs<mode><vczle><vczbe>): ... This.
            (one_cmpl<mode>2): Rename to...
            (one_cmpl<mode>2<vczle><vczbe>): ... This.
            (clrsb<mode>2): Rename to...
            (clrsb<mode>2<vczle><vczbe>): ... This.
            (clz<mode>2): Rename to...
            (clz<mode>2<vczle><vczbe>): ... This.
            (popcount<mode>2): Rename to...
            (popcount<mode>2<vczle><vczbe>): ... This.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_1.c: Add tests for unary integer
            ops.


* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
  2021-02-22 10:27 [Bug target/99195] New: Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64 ktkachov at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2023-04-28  8:34 ` cvs-commit at gcc dot gnu.org
@ 2023-05-03 10:16 ` cvs-commit at gcc dot gnu.org
  2023-05-03 10:18 ` cvs-commit at gcc dot gnu.org
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-03 10:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:1133cfab47258c147fcb2d453465d10e72acbfd9

commit r14-422-g1133cfab47258c147fcb2d453465d10e72acbfd9
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Wed May 3 11:15:34 2023 +0100

    aarch64: PR target/99195 annotate simple floating-point patterns for vec-concat with zero

    Continuing the almost mechanical series, this patch adds annotations for
    some of the simple floating-point patterns we have, and adds testing to
    ensure that redundant zeroing instructions are eliminated.

    Bootstrapped and tested on aarch64-none-linux-gnu and also
    aarch64_be-none-elf.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (add<mode>3): Rename to...
            (add<mode>3<vczle><vczbe>): ... This.
            (sub<mode>3): Rename to...
            (sub<mode>3<vczle><vczbe>): ... This.
            (mul<mode>3): Rename to...
            (mul<mode>3<vczle><vczbe>): ... This.
            (*div<mode>3): Rename to...
            (*div<mode>3<vczle><vczbe>): ... This.
            (neg<mode>2): Rename to...
            (neg<mode>2<vczle><vczbe>): ... This.
            (abs<mode>2): Rename to...
            (abs<mode>2<vczle><vczbe>): ... This.
            (<frint_pattern><mode>2): Rename to...
            (<frint_pattern><mode>2<vczle><vczbe>): ... This.
            (<fmaxmin><mode>3): Rename to...
            (<fmaxmin><mode>3<vczle><vczbe>): ... This.
            (*sqrt<mode>2): Rename to...
            (*sqrt<mode>2<vczle><vczbe>): ... This.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_1.c: Add testing for some unary
            and binary floating-point ops.
            * gcc.target/aarch64/simd/pr99195_2.c: New test.


* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
  2021-02-22 10:27 [Bug target/99195] New: Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64 ktkachov at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2023-05-03 10:16 ` cvs-commit at gcc dot gnu.org
@ 2023-05-03 10:18 ` cvs-commit at gcc dot gnu.org
  2023-05-04  8:45 ` cvs-commit at gcc dot gnu.org
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-03 10:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:12fae1f7fbe4df554bb257f805d9d324e276ab57

commit r14-423-g12fae1f7fbe4df554bb257f805d9d324e276ab57
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Wed May 3 11:17:28 2023 +0100

    aarch64: PR target/99195 annotate HADDSUB patterns for vec-concat with zero

    Further straightforward patch for the various halving intrinsics with or
    without rounding, plus tests.
    Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (aarch64_<sur>h<addsub><mode>):
            Rename to...
            (aarch64_<sur>h<addsub><mode><vczle><vczbe>): ... This.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_1.c: Add tests for halving and
            rounding add/sub intrinsics.


* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
  2021-02-22 10:27 [Bug target/99195] New: Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64 ktkachov at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2023-05-03 10:18 ` cvs-commit at gcc dot gnu.org
@ 2023-05-04  8:45 ` cvs-commit at gcc dot gnu.org
  2023-05-04  8:45 ` cvs-commit at gcc dot gnu.org
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-04  8:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:d840bc5cab39aa3dd8222d72b2cd40942bf91c93

commit r14-472-gd840bc5cab39aa3dd8222d72b2cd40942bf91c93
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Thu May 4 09:41:46 2023 +0100

    aarch64: PR target/99195 annotate more simple binary ops for vec-concat with zero

    More pattern annotations and tests to eliminate redundant vec-concat with
    zero instructions.
    These are for the abd family of instructions and the pairwise
    floating-point max/min and fadd operations too.

    Bootstrapped and tested on aarch64-none-linux-gnu.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (aarch64_<su>abd<mode>): Rename
            to...
            (aarch64_<su>abd<mode><vczle><vczbe>): ... This.
            (fabd<mode>3): Rename to...
            (fabd<mode>3<vczle><vczbe>): ... This.
            (aarch64_<optab>p<mode>): Rename to...
            (aarch64_<optab>p<mode><vczle><vczbe>): ... This.
            (aarch64_faddp<mode>): Rename to...
            (aarch64_faddp<mode><vczle><vczbe>): ... This.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_1.c: Add testing for more binary
            ops.
            * gcc.target/aarch64/simd/pr99195_2.c: Add testing for more binary
            ops.


* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
  2021-02-22 10:27 [Bug target/99195] New: Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64 ktkachov at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2023-05-04  8:45 ` cvs-commit at gcc dot gnu.org
@ 2023-05-04  8:45 ` cvs-commit at gcc dot gnu.org
  2023-05-10  9:42 ` cvs-commit at gcc dot gnu.org
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-04  8:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:93c26deab98fc80b616a1c53c324a88f61036f53

commit r14-473-g93c26deab98fc80b616a1c53c324a88f61036f53
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Thu May 4 09:42:37 2023 +0100

    aarch64: PR target/99195 annotate simple ternary ops for vec-concat with zero

    We're now moving onto various simple ternary instructions, including some
    lane forms.
    These include intrinsics that map down to mla, mls, fma, aba, bsl
    instructions.
    Tests are added for lane 0 and lane 1 as for some of these instructions the
    lane 0 variants use separate simpler patterns that need a separate
    annotation.

    Bootstrapped and tested on aarch64-none-linux-gnu.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (aarch64_<su>aba<mode>): Rename
            to...
            (aarch64_<su>aba<mode><vczle><vczbe>): ... This.
            (aarch64_mla<mode>): Rename to...
            (aarch64_mla<mode><vczle><vczbe>): ... This.
            (*aarch64_mla_elt<mode>): Rename to...
            (*aarch64_mla_elt<mode><vczle><vczbe>): ... This.
            (*aarch64_mla_elt_<vswap_width_name><mode>): Rename to...
            (*aarch64_mla_elt_<vswap_width_name><mode><vczle><vczbe>): ...
            This.
            (aarch64_mla_n<mode>): Rename to...
            (aarch64_mla_n<mode><vczle><vczbe>): ... This.
            (aarch64_mls<mode>): Rename to...
            (aarch64_mls<mode><vczle><vczbe>): ... This.
            (*aarch64_mls_elt<mode>): Rename to...
            (*aarch64_mls_elt<mode><vczle><vczbe>): ... This.
            (*aarch64_mls_elt_<vswap_width_name><mode>): Rename to...
            (*aarch64_mls_elt_<vswap_width_name><mode><vczle><vczbe>): ...
            This.
            (aarch64_mls_n<mode>): Rename to...
            (aarch64_mls_n<mode><vczle><vczbe>): ... This.
            (fma<mode>4): Rename to...
            (fma<mode>4<vczle><vczbe>): ... This.
            (*aarch64_fma4_elt<mode>): Rename to...
            (*aarch64_fma4_elt<mode><vczle><vczbe>): ... This.
            (*aarch64_fma4_elt_<vswap_width_name><mode>): Rename to...
            (*aarch64_fma4_elt_<vswap_width_name><mode><vczle><vczbe>): ...
            This.
            (*aarch64_fma4_elt_from_dup<mode>): Rename to...
            (*aarch64_fma4_elt_from_dup<mode><vczle><vczbe>): ... This.
            (fnma<mode>4): Rename to...
            (fnma<mode>4<vczle><vczbe>): ... This.
            (*aarch64_fnma4_elt<mode>): Rename to...
            (*aarch64_fnma4_elt<mode><vczle><vczbe>): ... This.
            (*aarch64_fnma4_elt_<vswap_width_name><mode>): Rename to...
            (*aarch64_fnma4_elt_<vswap_width_name><mode><vczle><vczbe>): ...
            This.
            (*aarch64_fnma4_elt_from_dup<mode>): Rename to...
            (*aarch64_fnma4_elt_from_dup<mode><vczle><vczbe>): ... This.
            (aarch64_simd_bsl<mode>_internal): Rename to...
            (aarch64_simd_bsl<mode>_internal<vczle><vczbe>): ... This.
            (*aarch64_simd_bsl<mode>_alt): Rename to...
            (*aarch64_simd_bsl<mode>_alt<vczle><vczbe>): ... This.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_3.c: New test.


* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
  2021-02-22 10:27 [Bug target/99195] New: Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64 ktkachov at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2023-05-04  8:45 ` cvs-commit at gcc dot gnu.org
@ 2023-05-10  9:42 ` cvs-commit at gcc dot gnu.org
  2023-05-10 10:51 ` cvs-commit at gcc dot gnu.org
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-10  9:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #11 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:d1e7f9993084b87e6676a5ccef3c8b7f807a6013

commit r14-651-gd1e7f9993084b87e6676a5ccef3c8b7f807a6013
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Wed May 10 10:40:06 2023 +0100

    aarch64: PR target/99195 annotate simple narrowing patterns for vec-concat-zero

    This patch cleans up some almost-duplicate patterns for the XTN, SQXTN,
    UQXTN instructions.
    Using the <vczle><vczbe> attributes we can remove the BYTES_BIG_ENDIAN and
    !BYTES_BIG_ENDIAN cases, as well as the intrinsic expanders that select
    between the two.
    Tests are also added. Thankfully the diffstat comes out negative \O/.

    Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (aarch64_xtn<mode>_insn_le):
            Delete.
            (aarch64_xtn<mode>_insn_be): Likewise.
            (trunc<mode><Vnarrowq>2): Rename to...
            (trunc<mode><Vnarrowq>2<vczle><vczbe>): ... This.
            (aarch64_xtn<mode>): Move under the above.  Just emit the truncate
            RTL.
            (aarch64_<su>qmovn<mode>): Likewise.
            (aarch64_<su>qmovn<mode><vczle><vczbe>): New define_insn.
            (aarch64_<su>qmovn<mode>_insn_le): Delete.
            (aarch64_<su>qmovn<mode>_insn_be): Likewise.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_4.c: Add tests for vmovn, vqmovn.


* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
  2021-02-22 10:27 [Bug target/99195] New: Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64 ktkachov at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2023-05-10  9:42 ` cvs-commit at gcc dot gnu.org
@ 2023-05-10 10:51 ` cvs-commit at gcc dot gnu.org
  2023-05-10 11:02 ` cvs-commit at gcc dot gnu.org
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-10 10:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:c8977cf5f2daa9fecfc5d67a737506d0d31c578a

commit r14-653-gc8977cf5f2daa9fecfc5d67a737506d0d31c578a
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Wed May 10 11:50:01 2023 +0100

    aarch64: PR target/99195 annotate simple saturating add/sub patterns for vec-concat-zero

    Moving onto the saturating instructions, this one goes through the simple
    add/sub ones.
    Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (aarch64_<su_optab>q<addsub><mode>):
            Rename to...
            (aarch64_<su_optab>q<addsub><mode><vczle><vczbe>): ... This.
            (aarch64_<sur>qadd<mode>): Rename to...
            (aarch64_<sur>qadd<mode><vczle><vczbe>): ... This.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_1.c: Add testing for qadd, qsub.
            * gcc.target/aarch64/simd/pr99195_6.c: New test.


* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
  2021-02-22 10:27 [Bug target/99195] New: Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64 ktkachov at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2023-05-10 10:51 ` cvs-commit at gcc dot gnu.org
@ 2023-05-10 11:02 ` cvs-commit at gcc dot gnu.org
  2023-05-15  8:50 ` cvs-commit at gcc dot gnu.org
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-10 11:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #13 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:3ed5677bb61b334a2d01c769859cdd3279e12a07

commit r14-654-g3ed5677bb61b334a2d01c769859cdd3279e12a07
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Wed May 10 12:00:17 2023 +0100

    [PATCH] aarch64: PR target/99195 annotate simple permutation patterns for vec-concat-zero

    Another straightforward patch annotating patterns for the zip1, zip2, uzp1, uzp2, rev* instructions, plus tests.
    Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (aarch64_<PERMUTE:perm_insn><mode>):
            Rename to...
            (aarch64_<PERMUTE:perm_insn><mode><vczle><vczbe>): ... This.
            (aarch64_rev<REVERSE:rev_op><mode>): Rename to...
            (aarch64_rev<REVERSE:rev_op><mode><vczle><vczbe>): ... This.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_1.c: Add tests for zip and rev
            intrinsics.
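
For the permutation case, a sketch along the same lines might look as follows. The function names are again hypothetical. The intrinsic version uses vzip1_s8, and the scalar model shows what zip1 computes on 8-lane inputs before the zero half is appended.

```c
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__)
#include <arm_neon.h>

/* Hypothetical example: 64-bit zip1 combined with a zero upper half.  */
int8x16_t
zip1_s8_zero_high (int8x8_t a, int8x8_t b)
{
  return vcombine_s8 (vzip1_s8 (a, b), vcreate_s8 (0));
}
#endif

/* Portable scalar model: zip1 interleaves the low four lanes of A and B
   into lanes 0..7; lanes 8..15 are zero.  */
void
zip1_s8_zero_high_ref (const int8_t a[8], const int8_t b[8], int8_t out[16])
{
  for (int i = 0; i < 4; i++)
    {
      out[2 * i] = a[i];
      out[2 * i + 1] = b[i];
    }
  memset (out + 8, 0, 8);
}
```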

* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
From: cvs-commit at gcc dot gnu.org @ 2023-05-15  8:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #14 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:676d33f95eede2dfa629c9b0174c15cc55c4a45a

commit r14-819-g676d33f95eede2dfa629c9b0174c15cc55c4a45a
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Mon May 15 09:49:48 2023 +0100

    aarch64: PR target/99195 annotate qabs,qneg patterns for vec-concat-zero

    Straightforward like previous patches in this series.
    Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (aarch64_s<optab><mode>): Rename to...
            (aarch64_s<optab><mode><vczle><vczbe>): ... This.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_4.c: Add testing for qabs, qneg.

* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
From: cvs-commit at gcc dot gnu.org @ 2023-05-15  8:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #15 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:e90791e5a02b021d22ffb4c36673b9af623e2063

commit r14-820-ge90791e5a02b021d22ffb4c36673b9af623e2063
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Mon May 15 09:55:44 2023 +0100

    aarch64: PR target/99195 annotate vector compare patterns for vec-concat-zero

    This instalment of the series goes through the vector comparison patterns in the backend.
    One wart is the int64x1_t comparisons, which this patch doesn't touch.
    Those are a bit trickier because they have define_insn_and_split mechanisms for falling back to
    GP reg comparisons after reload and I don't think a simple annotation will catch those cases correctly.
    Those will need more custom thinking.
    As said, this patch doesn't touch those and is a decent straightforward improvement on its own.

    Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (aarch64_cm<optab><mode>): Rename to...
            (aarch64_cm<optab><mode><vczle><vczbe>): ... This.
            (aarch64_cmtst<mode>): Rename to...
            (aarch64_cmtst<mode><vczle><vczbe>): ... This.
            (*aarch64_cmtst_same_<mode>): Rename to...
            (*aarch64_cmtst_same_<mode><vczle><vczbe>): ... This.
            (*aarch64_cmtstdi): Rename to...
            (*aarch64_cmtstdi<vczle><vczbe>): ... This.
            (aarch64_fac<optab><mode>): Rename to...
            (aarch64_fac<optab><mode><vczle><vczbe>): ... This.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_7.c: New test.
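
To make the distinction in this commit message concrete, the sketch below contrasts a comparison on a multi-lane 64-bit vector, which is the kind of pattern annotated here, with an int64x1_t comparison, which the message says is left untouched. The function names are hypothetical, and whether the second function still emits an extra instruction depends on the compiler version.

```c
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__)
#include <arm_neon.h>

/* Hypothetical example of a case this patch is meant to cover.  */
uint8x16_t
ceq_s8_zero_high (int8x8_t a, int8x8_t b)
{
  return vcombine_u8 (vceq_s8 (a, b), vcreate_u8 (0));
}

/* Hypothetical example of the int64x1_t case that the commit message
   says is NOT handled by this patch.  */
uint64x2_t
ceq_s64_zero_high (int64x1_t a, int64x1_t b)
{
  return vcombine_u64 (vceq_s64 (a, b), vcreate_u64 (0));
}
#endif

/* Portable scalar model of the 8-lane comparison: each result lane is
   all-ones when the inputs are equal, otherwise zero; lanes 8..15 are zero.  */
void
ceq_s8_zero_high_ref (const int8_t a[8], const int8_t b[8], uint8_t out[16])
{
  for (int i = 0; i < 8; i++)
    out[i] = (a[i] == b[i]) ? 0xFF : 0x00;
  memset (out + 8, 0, 8);
}
```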

* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
From: cvs-commit at gcc dot gnu.org @ 2023-05-24 13:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #16 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:b30ab0dcf9db2ac6d81fb3743add1fbfa0d18f6e

commit r14-1167-gb30ab0dcf9db2ac6d81fb3743add1fbfa0d18f6e
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Wed May 24 14:52:34 2023 +0100

    aarch64: PR target/99195 Annotate vector shift patterns for vec-concat-zero

    Continuing the series of straightforward annotations, this one handles the normal (not widening or narrowing) vector shifts.
    Tests included.

    Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (aarch64_simd_lshr<mode>): Rename to...
            (aarch64_simd_lshr<mode><vczle><vczbe>): ... This.
            (aarch64_simd_ashr<mode>): Rename to...
            (aarch64_simd_ashr<mode><vczle><vczbe>): ... This.
            (aarch64_simd_imm_shl<mode>): Rename to...
            (aarch64_simd_imm_shl<mode><vczle><vczbe>): ... This.
            (aarch64_simd_reg_sshl<mode>): Rename to...
            (aarch64_simd_reg_sshl<mode><vczle><vczbe>): ... This.
            (aarch64_simd_reg_shl<mode>_unsigned): Rename to...
            (aarch64_simd_reg_shl<mode>_unsigned<vczle><vczbe>): ... This.
            (aarch64_simd_reg_shl<mode>_signed): Rename to...
            (aarch64_simd_reg_shl<mode>_signed<vczle><vczbe>): ... This.
            (vec_shr_<mode>): Rename to...
            (vec_shr_<mode><vczle><vczbe>): ... This.
            (aarch64_<sur>shl<mode>): Rename to...
            (aarch64_<sur>shl<mode><vczle><vczbe>): ... This.
            (aarch64_<sur>q<r>shl<mode>): Rename to...
            (aarch64_<sur>q<r>shl<mode><vczle><vczbe>): ... This.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_1.c: Add testing for shifts.
            * gcc.target/aarch64/simd/pr99195_6.c: Likewise.
            * gcc.target/aarch64/simd/pr99195_8.c: New test.

* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
From: cvs-commit at gcc dot gnu.org @ 2023-05-25 14:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #17 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:560bb845321f5ad039a318a081b0e88d9900f5cb

commit r14-1241-g560bb845321f5ad039a318a081b0e88d9900f5cb
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Thu May 25 15:00:16 2023 +0100

    aarch64: PR target/99195 Annotate complex FP patterns for vec-concat-zero

    This patch annotates the complex add and mla patterns for vec-concat-zero.
    Testing showed an interesting bug in our MD patterns where they were defined to match:
            (plus:VHSDF (match_operand:VHSDF 1 "register_operand" "0")
                        (unspec:VHSDF [(match_operand:VHSDF 2 "register_operand" "w")
                                       (match_operand:VHSDF 3 "register_operand" "w")
                                       (match_operand:SI 4 "const_int_operand" "n")]
                                       FCMLA))

    but the canonicalisation rules for PLUS require the more "complex" operand to be first, so
    during combine, when it attempted to form the new substituted patterns, combine/recog would
    try to match:
    (plus:V2SF (unspec:V2SF [
                            (reg:V2SF 100)
                            (reg:V2SF 101)
                            (const_int 0 [0])
                        ] UNSPEC_FCMLA270)
                    (reg:V2SF 99))
    instead. This patch fixes the operands of the PLUS RTX in these patterns.
    Similar patterns for the dot-product instructions already used the right order.

    Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (aarch64_fcadd<rot><mode>): Rename to...
            (aarch64_fcadd<rot><mode><vczle><vczbe>): ... This.
            Fix canonicalization of PLUS operands.
            (aarch64_fcmla<rot><mode>): Rename to...
            (aarch64_fcmla<rot><mode><vczle><vczbe>): ... This.
            Fix canonicalization of PLUS operands.
            (aarch64_fcmla_lane<rot><mode>): Rename to...
            (aarch64_fcmla_lane<rot><mode><vczle><vczbe>): ... This.
            Fix canonicalization of PLUS operands.
            (aarch64_fcmla_laneq<rot>v4hf): Rename to...
            (aarch64_fcmla_laneq<rot>v4hf<vczle><vczbe>): ... This.
            Fix canonicalization of PLUS operands.
            (aarch64_fcmlaq_lane<rot><mode>): Fix canonicalization of PLUS operands.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_9.c: New test.
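
The PLUS operand-order problem described in this commit message can be illustrated with a toy model. This is a deliberately simplified sketch, not GCC code: the enum, the struct, the precedence values and all function names are hypothetical stand-ins for the idea that a commutative operation puts its more complex operand first, so a pattern written with a plain register first never matches the canonical form that combine produces.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified operand kinds; not GCC's rtx codes.  */
enum toy_kind { TOY_CONST, TOY_REG, TOY_UNSPEC };

/* A toy PLUS expression with two operands.  */
struct toy_plus
{
  enum toy_kind op0;
  enum toy_kind op1;
};

/* Higher value means "more complex", and therefore placed first in a
   commutative operation.  The numbers are illustrative assumptions.  */
static int
toy_precedence (enum toy_kind k)
{
  switch (k)
    {
    case TOY_CONST:
      return -2;
    case TOY_REG:
      return -1;
    case TOY_UNSPEC:
      return 0;
    }
  return 0;
}

/* Return P with its operands swapped if needed so that the more complex
   operand comes first.  */
struct toy_plus
toy_canonicalize_plus (struct toy_plus p)
{
  if (toy_precedence (p.op0) < toy_precedence (p.op1))
    {
      enum toy_kind tmp = p.op0;
      p.op0 = p.op1;
      p.op1 = tmp;
    }
  return p;
}

/* A pattern written in a given operand order can only match canonical
   expressions if the pattern itself is already in canonical order.  */
bool
toy_matches_canonical_form (struct toy_plus pattern)
{
  struct toy_plus c = toy_canonicalize_plus (pattern);
  return c.op0 == pattern.op0 && c.op1 == pattern.op1;
}
```

In this model the old pattern shape (register first, unspec second) fails the check, while the fixed shape (unspec first, register second) passes, which mirrors the operand swap described above.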

* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
From: cvs-commit at gcc dot gnu.org @ 2023-05-31 16:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #18 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:547d3bce0c02dbcbb6f62d9469a71eedf17bd688

commit r14-1447-g547d3bce0c02dbcbb6f62d9469a71eedf17bd688
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Wed May 31 17:43:20 2023 +0100

    aarch64: PR target/99195 Annotate saturating mult patterns for vec-concat-zero

    This patch goes through the various alphabet soup saturating multiplication patterns, including those in TARGET_RDMA
    and annotates them with <vczle><vczbe>. Many other patterns are widening and always write the full 128-bit vectors
    so this annotation doesn't apply to them. Nothing out of the ordinary in this patch.

    Bootstrapped and tested on aarch64-none-linux and aarch64_be-none-elf.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (aarch64_sq<r>dmulh<mode>): Rename to...
            (aarch64_sq<r>dmulh<mode><vczle><vczbe>): ... This.
            (aarch64_sq<r>dmulh_n<mode>): Rename to...
            (aarch64_sq<r>dmulh_n<mode><vczle><vczbe>): ... This.
            (aarch64_sq<r>dmulh_lane<mode>): Rename to...
            (aarch64_sq<r>dmulh_lane<mode><vczle><vczbe>): ... This.
            (aarch64_sq<r>dmulh_laneq<mode>): Rename to...
            (aarch64_sq<r>dmulh_laneq<mode><vczle><vczbe>): ... This.
            (aarch64_sqrdml<SQRDMLH_AS:rdma_as>h<mode>): Rename to...
            (aarch64_sqrdml<SQRDMLH_AS:rdma_as>h<mode><vczle><vczbe>): ... This.
            (aarch64_sqrdml<SQRDMLH_AS:rdma_as>h_lane<mode>): Rename to...
            (aarch64_sqrdml<SQRDMLH_AS:rdma_as>h_lane<mode><vczle><vczbe>): ... This.
            (aarch64_sqrdml<SQRDMLH_AS:rdma_as>h_laneq<mode>): Rename to...
            (aarch64_sqrdml<SQRDMLH_AS:rdma_as>h_laneq<mode><vczle><vczbe>): ... This.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_1.c: Add tests for qdmulh, qrdmulh.
            * gcc.target/aarch64/simd/pr99195_10.c: New test.

* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
From: cvs-commit at gcc dot gnu.org @ 2023-05-31 16:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #19 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <ktkachov@gcc.gnu.org>:

https://gcc.gnu.org/g:d0c064c3eabc75cf83df296ebcd1db19b4a68851

commit r14-1448-gd0c064c3eabc75cf83df296ebcd1db19b4a68851
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Wed May 31 17:46:19 2023 +0100

    aarch64: PR target/99195 Annotate dot-product patterns for vec-concat-zero

    This straightforward patch annotates the dotproduct instructions, including the i8mm ones.
    Tests included.
    Nothing unexpected here.

    Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

    gcc/ChangeLog:

            PR target/99195
            * config/aarch64/aarch64-simd.md (<sur>dot_prod<vsi2qi>): Rename to...
            (<sur>dot_prod<vsi2qi><vczle><vczbe>): ... This.
            (usdot_prod<vsi2qi>): Rename to...
            (usdot_prod<vsi2qi><vczle><vczbe>): ... This.
            (aarch64_<sur>dot_lane<vsi2qi>): Rename to...
            (aarch64_<sur>dot_lane<vsi2qi><vczle><vczbe>): ... This.
            (aarch64_<sur>dot_laneq<vsi2qi>): Rename to...
            (aarch64_<sur>dot_laneq<vsi2qi><vczle><vczbe>): ... This.
            (aarch64_<DOTPROD_I8MM:sur>dot_lane<VB:isquadop><VS:vsi2qi>): Rename to...
            (aarch64_<DOTPROD_I8MM:sur>dot_lane<VB:isquadop><VS:vsi2qi><vczle><vczbe>): ... This.

    gcc/testsuite/ChangeLog:

            PR target/99195
            * gcc.target/aarch64/simd/pr99195_11.c: New test.
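
A dot-product instance of the same idiom could be sketched as below. The function names are hypothetical. The intrinsic version is guarded on __ARM_FEATURE_DOTPROD because vdot_s32 needs the dot-product extension to be enabled, and the scalar model spells out the accumulate-and-multiply that the 64-bit form performs before the zero half is appended.

```c
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>

/* Hypothetical example: 64-bit signed dot product with accumulate,
   combined with a zero upper half.  */
int32x4_t
sdot_zero_high (int32x2_t acc, int8x8_t a, int8x8_t b)
{
  return vcombine_s32 (vdot_s32 (acc, a, b), vcreate_s32 (0));
}
#endif

/* Portable scalar model: each 32-bit result lane adds the dot product of
   four adjacent 8-bit lanes to the accumulator; lanes 2 and 3 are zero.  */
void
sdot_zero_high_ref (const int32_t acc[2], const int8_t a[8],
                    const int8_t b[8], int32_t out[4])
{
  for (int i = 0; i < 2; i++)
    {
      int32_t sum = acc[i];
      for (int j = 0; j < 4; j++)
        sum += (int32_t) a[4 * i + j] * (int32_t) b[4 * i + j];
      out[i] = sum;
    }
  out[2] = 0;
  out[3] = 0;
}
```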

* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
From: pinskia at gcc dot gnu.org @ 2024-02-27  8:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

--- Comment #20 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Are there any remaining patterns that need vczle/vczbe added to them?

Otherwise please close this as fixed for GCC 14.

* [Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
From: ktkachov at gcc dot gnu.org @ 2024-04-04  8:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |14.0
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #21 from ktkachov at gcc dot gnu.org ---
I think all the straightforward cases are handled and the infrastructure for
doing this is added. Any future improvements in the area should be tracked
separately. Marking as fixed for GCC 14.1.
