public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/96366] New: [AArch64] ICE due to lack of support for VNx2SI sub instruction
@ 2020-07-29 12:21 bule1 at huawei dot com
  2020-07-29 13:41 ` [Bug target/96366] " rsandifo at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: bule1 at huawei dot com @ 2020-07-29 12:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96366

            Bug ID: 96366
           Summary: [AArch64] ICE due to lack of support for VNx2SI sub
                    instruction
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bule1 at huawei dot com
                CC: richard.sandiford at arm dot com
  Target Milestone: ---

Created attachment 48950
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48950&action=edit
preprocessed source code for recurent the problem

Hi, 

The test case bb-slp-20.c in the gcc testsuit will cause an ICE in the expand
pass because the gcc lack of a pattern for subtraction of the VNx2SI mode.

The preprocessed file is attached and the problem will be triggered when
compiled with -march=armv8.5-a+sve -msve-vector-bits=256 -O3 -fno-tree-forwprop
options.

By tracing the debug infomation, it is found that the error is due to a
vectorized subtraction gimple with VNx2SI mode cannot find its pattern during
the expand pass.

I tried to extend the mode of this pattern from SVE_FULL_I to SVE_I as
following, after which the problem is solved. 

diff -Nurp a/gcc/config/aarch64/aarch64-sve.md
b/gcc/config/aarch64/aarch64-sve.md
--- a/gcc/config/aarch64/aarch64-sve.md 2020-07-29 15:54:39.360000000 +0800
+++ b/gcc/config/aarch64/aarch64-sve.md 2020-07-29 14:37:21.932000000 +0800
@@ -3644,10 +3644,10 @@
 ;; -------------------------------------------------------------------------

 (define_insn "sub<mode>3"
-  [(set (match_operand:SVE_FULL_I 0 "register_operand" "=w, w, ?&w")
-       (minus:SVE_FULL_I
-         (match_operand:SVE_FULL_I 1 "aarch64_sve_arith_operand" "w, vsa,
vsa")
-         (match_operand:SVE_FULL_I 2 "register_operand" "w, 0, w")))]
+  [(set (match_operand:SVE_I 0 "register_operand" "=w, w, ?&w")
+       (minus:SVE_I
+         (match_operand:SVE_I 1 "aarch64_sve_arith_operand" "w, vsa, vsa")
+         (match_operand:SVE_I 2 "register_operand" "w, 0, w")))]
   "TARGET_SVE"
   "@
    sub\t%0.<Vetype>, %1.<Vetype>, %2.<Vetype>

I noticed that this mode iterator was changed from SVE_I to SVE_FULL_I in Nov
2019 by richard to support partial SVE vectors. However, in the following patch
the addition pattern is supported by changing SVE_FULL_I to SVE_I but not the
subtraction pattern. Is there any specific reason why this pattern is not
supported?

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/96366] [AArch64] ICE due to lack of support for VNx2SI sub instruction
  2020-07-29 12:21 [Bug target/96366] New: [AArch64] ICE due to lack of support for VNx2SI sub instruction bule1 at huawei dot com
@ 2020-07-29 13:41 ` rsandifo at gcc dot gnu.org
  2020-07-30 13:01 ` bule1 at huawei dot com
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2020-07-29 13:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96366

rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #1 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
(In reply to Bu Le from comment #0)
> Created attachment 48950 [details]
> preprocessed source code for recurent the problem
> 
> Hi, 
> 
> The test case bb-slp-20.c in the gcc testsuit will cause an ICE in the
> expand pass because the gcc lack of a pattern for subtraction of the VNx2SI
> mode.
> 
> The preprocessed file is attached and the problem will be triggered when
> compiled with -march=armv8.5-a+sve -msve-vector-bits=256 -O3
> -fno-tree-forwprop options.
> 
> By tracing the debug infomation, it is found that the error is due to a
> vectorized subtraction gimple with VNx2SI mode cannot find its pattern
> during the expand pass.

Hmm.  In general, the lack of a vector pattern shouldn't case ICEs,
but I suppose the add/sub pairing is somewhat special because of
the canonicalisation rules.  It would be worth looking at exactly
why we generate the subtract though, just to confirm that this is
an “expected” ICE rather than a symptom of a deeper problem.

> I tried to extend the mode of this pattern from SVE_FULL_I to SVE_I as
> following, after which the problem is solved. 
> 
> diff -Nurp a/gcc/config/aarch64/aarch64-sve.md
> b/gcc/config/aarch64/aarch64-sve.md
> --- a/gcc/config/aarch64/aarch64-sve.md 2020-07-29 15:54:39.360000000 +0800
> +++ b/gcc/config/aarch64/aarch64-sve.md 2020-07-29 14:37:21.932000000 +0800
> @@ -3644,10 +3644,10 @@
>  ;; -------------------------------------------------------------------------
> 
>  (define_insn "sub<mode>3"
> -  [(set (match_operand:SVE_FULL_I 0 "register_operand" "=w, w, ?&w")
> -       (minus:SVE_FULL_I
> -         (match_operand:SVE_FULL_I 1 "aarch64_sve_arith_operand" "w, vsa,
> vsa")
> -         (match_operand:SVE_FULL_I 2 "register_operand" "w, 0, w")))]
> +  [(set (match_operand:SVE_I 0 "register_operand" "=w, w, ?&w")
> +       (minus:SVE_I
> +         (match_operand:SVE_I 1 "aarch64_sve_arith_operand" "w, vsa, vsa")
> +         (match_operand:SVE_I 2 "register_operand" "w, 0, w")))]
>    "TARGET_SVE"
>    "@
>     sub\t%0.<Vetype>, %1.<Vetype>, %2.<Vetype>
> 
> I noticed that this mode iterator was changed from SVE_I to SVE_FULL_I in
> Nov 2019 by richard to support partial SVE vectors. However, in the
> following patch the addition pattern is supported by changing SVE_FULL_I to
> SVE_I but not the subtraction pattern. Is there any specific reason why this
> pattern is not supported?

The idea was for that patch to add the bare minimum needed
to support the “unpacked vector” infrastructure.  Then, once the
infrastructure was in place, we could add support for other
unpacked vector operations too.

However, the infrastructure went in late during the GCC 10
cycle, so the idea was to postpone any new unpacked vector
support to GCC 11.  So far the only additional operations
has been Joe Ramsay's patches for logical operations
(g:bb3ab62a8b4a108f01ea2eddfe31e9f733bd9cb6 and
g:6802b5ba8234427598abfd9f0163eb5e7c0d6aa8).

The reason for not changing many operations at once is that,
in each case, a decision needs to be made whether the
operation should use the container mode (as for INDEX),
the element mode (as for right shifts, once they're
implemented) or whether it doesn't matter (as for addition).
Each operation also needs tests.  So from that point of view,
it's more convenient to have a separate patch for each
operation (or at least closely-related groups of operations).

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/96366] [AArch64] ICE due to lack of support for VNx2SI sub instruction
  2020-07-29 12:21 [Bug target/96366] New: [AArch64] ICE due to lack of support for VNx2SI sub instruction bule1 at huawei dot com
  2020-07-29 13:41 ` [Bug target/96366] " rsandifo at gcc dot gnu.org
@ 2020-07-30 13:01 ` bule1 at huawei dot com
  2020-07-31 12:31 ` rsandifo at gcc dot gnu.org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: bule1 at huawei dot com @ 2020-07-30 13:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96366

--- Comment #2 from Bu Le <bule1 at huawei dot com> ---
(In reply to rsandifo@gcc.gnu.org from comment #1)
> (In reply to Bu Le from comment #0)
> Hmm.  In general, the lack of a vector pattern shouldn't case ICEs,
> but I suppose the add/sub pairing is somewhat special because of
> the canonicalisation rules.  It would be worth looking at exactly
> why we generate the subtract though, just to confirm that this is
> an “expected” ICE rather than a symptom of a deeper problem.

Sure. The logic is that the subtraction will be expanded in expr.c:8989, before
which I believe it still works fine. The gimple to be expand is 

vect__5.16_77 = { 4294967273, 4294967154, 4294967294, 4294967265 } -
vect__1.14_73

When the logic goes on, it went into the binop routine at expr:9948 because op1
(vect__1.14_73) is not a const op and missed the oppotunity to turn into an add
negative pair. Then,the routine will call expand_binop to finalize the
subtraction. The expand_binop function also has an oppotunity to turn this
subtraction to add a negative number, but also missed because op1 is not a
constant.

It occurs to me that we can brought the check for the availbility of this
pattern to the decision condition for whether turning the subtraction to a
addition of negative equivalent. This can be an insurance measurement for the
similar case that the pattern is missed, preventing the ICE. So I tried
following change, which turns out could also solve the problem by turning the
subtraction into addition as expected.

diff -Nurp gcc-20200728-org/gcc/optabs.c gcc-20200728/gcc/optabs.c
--- gcc-20200728-org/gcc/optabs.c       2020-07-29 15:53:52.760000000 +0800
+++ gcc-20200728/gcc/optabs.c   2020-07-30 11:00:00.964000000 +0800
@@ -1171,10 +1171,12 @@ expand_binop (machine_mode mode, optab b

   mclass = GET_MODE_CLASS (mode);

-  /* If subtracting an integer constant, convert this into an addition of
-     the negated constant.  */
+  /* If subtracting an integer constant, or if no subtraction pattern
available
+     for this mode, convert this into an addition of the negated constant.  */

-  if (binoptab == sub_optab && CONST_INT_P (op1))
+  if (binoptab == sub_optab
+      && (CONST_INT_P (op1)
+      || optab_handler (binoptab, mode) == CODE_FOR_nothing))
     {
       op1 = negate_rtx (mode, op1);
       binoptab = add_optab;


> The idea was for that patch to add the bare minimum needed
> to support the “unpacked vector” infrastructure.  Then, once the
> infrastructure was in place, we could add support for other
> unpacked vector operations too.
> 
> However, the infrastructure went in late during the GCC 10
> cycle, so the idea was to postpone any new unpacked vector
> support to GCC 11.  So far the only additional operations
> has been Joe Ramsay's patches for logical operations
> (g:bb3ab62a8b4a108f01ea2eddfe31e9f733bd9cb6 and
> g:6802b5ba8234427598abfd9f0163eb5e7c0d6aa8).
> 
> The reason for not changing many operations at once is that,
> in each case, a decision needs to be made whether the
> operation should use the container mode (as for INDEX),
> the element mode (as for right shifts, once they're
> implemented) or whether it doesn't matter (as for addition).
> Each operation also needs tests.  So from that point of view,
> it's more convenient to have a separate patch for each
> operation (or at least closely-related groups of operations).

Oh, I see. From the performance's point of view, I beleive that add the
subtraction pattern is necessary eventually. I compiled and ran the test case
attached with the subtraction pattern sulotion, which works fine. Logically,
the subtraction should be the same as the addition, which is not sensetive to
the operation mode.

My idea of solving this problem is that we upstream the patch for mode
extension independently, after which upstream the sub-to-add patch for
insurance that other cases might step into the same routine.

Any suggestions?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/96366] [AArch64] ICE due to lack of support for VNx2SI sub instruction
  2020-07-29 12:21 [Bug target/96366] New: [AArch64] ICE due to lack of support for VNx2SI sub instruction bule1 at huawei dot com
  2020-07-29 13:41 ` [Bug target/96366] " rsandifo at gcc dot gnu.org
  2020-07-30 13:01 ` bule1 at huawei dot com
@ 2020-07-31 12:31 ` rsandifo at gcc dot gnu.org
  2020-08-03 15:42 ` rsandifo at gcc dot gnu.org
  2020-08-04  2:30 ` bule1 at huawei dot com
  4 siblings, 0 replies; 6+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2020-07-31 12:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96366

rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2020-07-31
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #3 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
(In reply to Bu Le from comment #2)
> (In reply to rsandifo@gcc.gnu.org from comment #1)
> > (In reply to Bu Le from comment #0)
> > Hmm.  In general, the lack of a vector pattern shouldn't case ICEs,
> > but I suppose the add/sub pairing is somewhat special because of
> > the canonicalisation rules.  It would be worth looking at exactly
> > why we generate the subtract though, just to confirm that this is
> > an “expected” ICE rather than a symptom of a deeper problem.
> 
> Sure. The logic is that the subtraction will be expanded in expr.c:8989,
> before which I believe it still works fine. The gimple to be expand is 
> 
> vect__5.16_77 = { 4294967273, 4294967154, 4294967294, 4294967265 } -
> vect__1.14_73
Yeah.  What I was worried about was: why did we generate this
in the first place if the target doesn't support it?

But it looks like the answer is that, after
g:bb3ab62a8b4a108f01ea2eddfe31e9f733bd9cb6, SVE now advertises
support for both unpacked addition *and* unpacked negation.
Initially we generate a supported vector negation:

  vect_a0_52.15_78 = vect__1.14_73 + { 23, 142, 2, 31 };
  vect__5.16_77 = -vect_a0_52.15_78;

and this later gets folded into the subtraction above.
Before the patch there was no ICE.

Generating a subtraction out of an addition seemed odd since
canonicalisations usually go the other way.  But if the target
says it supports negation too then that changes things.  It doesn't
make much sense to support addition and negation but not subtraction.

I think that's enough to show that the patch to the .md file
is the right fix and isn't papering over a problem elsewhere.
Could you post it to gcc-patches?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/96366] [AArch64] ICE due to lack of support for VNx2SI sub instruction
  2020-07-29 12:21 [Bug target/96366] New: [AArch64] ICE due to lack of support for VNx2SI sub instruction bule1 at huawei dot com
                   ` (2 preceding siblings ...)
  2020-07-31 12:31 ` rsandifo at gcc dot gnu.org
@ 2020-08-03 15:42 ` rsandifo at gcc dot gnu.org
  2020-08-04  2:30 ` bule1 at huawei dot com
  4 siblings, 0 replies; 6+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2020-08-03 15:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96366

rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #4 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Fixed by g:9623f61b142174b87760c81f78928dd14af7cbc6.

As far as I know, only GCC 11 needs the fix, but we can backport
to GCC 10 as well if we find a testcase that needs it.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/96366] [AArch64] ICE due to lack of support for VNx2SI sub instruction
  2020-07-29 12:21 [Bug target/96366] New: [AArch64] ICE due to lack of support for VNx2SI sub instruction bule1 at huawei dot com
                   ` (3 preceding siblings ...)
  2020-08-03 15:42 ` rsandifo at gcc dot gnu.org
@ 2020-08-04  2:30 ` bule1 at huawei dot com
  4 siblings, 0 replies; 6+ messages in thread
From: bule1 at huawei dot com @ 2020-08-04  2:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96366

--- Comment #5 from Bu Le <bule1 at huawei dot com> ---
(In reply to rsandifo@gcc.gnu.org from comment #3)
> (In reply to Bu Le from comment #2)
> > (In reply to rsandifo@gcc.gnu.org from comment #1)
> > > (In reply to Bu Le from comment #0)

> Generating a subtraction out of an addition seemed odd since
> canonicalisations usually go the other way.  But if the target
> says it supports negation too then that changes things.  It doesn't
> make much sense to support addition and negation but not subtraction.

If some mode or target do not have a subtraction pattern, should we let the
compiler try to use the addition and negation before it fall into an ICE? If
so, the changes for optabs seems reasonable as well.


(In reply to rsandifo@gcc.gnu.org from comment #4)
> Fixed by g:9623f61b142174b87760c81f78928dd14af7cbc6.
> 
> As far as I know, only GCC 11 needs the fix, but we can backport
> to GCC 10 as well if we find a testcase that needs it.

Sure, I will have a try to see whether the problem also exists in gcc10.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-08-04  2:30 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-29 12:21 [Bug target/96366] New: [AArch64] ICE due to lack of support for VNx2SI sub instruction bule1 at huawei dot com
2020-07-29 13:41 ` [Bug target/96366] " rsandifo at gcc dot gnu.org
2020-07-30 13:01 ` bule1 at huawei dot com
2020-07-31 12:31 ` rsandifo at gcc dot gnu.org
2020-08-03 15:42 ` rsandifo at gcc dot gnu.org
2020-08-04  2:30 ` bule1 at huawei dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).