public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/111006] New: [SVE] Extra neg for storing to short from int comparison
@ 2023-08-12 20:11 pinskia at gcc dot gnu.org
  2023-08-12 20:48 ` [Bug tree-optimization/111006] " pinskia at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-12 20:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111006

            Bug ID: 111006
           Summary: [SVE] Extra neg for storing to short from int
                    comparison
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take:
```
void __attribute__ ((noipa))
f0 (unsigned short *__restrict r,
   int *__restrict a,
   int *__restrict pred)
{
  for (int i = 0; i < 1024; ++i)
  {
    int p = pred[i]?-1:0;
    r[i] = p ;
  }
}
```
Compile with `-march=armv8.5+sve2 -O3`.

Currently we get:
```
.L2:
        ld1w    z31.s, p7/z, [x2, x1, lsl 2]
        cmpne   p15.s, p6/z, z31.s, #0
        mov     z31.s, p15/z, #1
        neg     z31.h, p6/m, z31.h
        st1h    z31.s, p7, [x0, x1, lsl 1]
        incw    x1
        whilelo p7.s, w1, w3
        b.any   .L2
```
But we should just get:
```
.L2:
        ld1w    z31.s, p7/z, [x2, x1, lsl 2]
        cmpne   p15.s, p6/z, z31.s, #0
        mov     z31.s, p15/z, #-1
        st1h    z31.s, p7, [x0, x1, lsl 1]
        incw    x1
        whilelo p7.s, w1, w3
        b.any   .L2
```

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/111006] [SVE] Extra neg for storing to short from int comparison
  2023-08-12 20:11 [Bug tree-optimization/111006] New: [SVE] Extra neg for storing to short from int comparison pinskia at gcc dot gnu.org
@ 2023-08-12 20:48 ` pinskia at gcc dot gnu.org
  2023-08-12 20:54 ` pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-12 20:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111006

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |pinskia at gcc dot gnu.org
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2023-08-12
     Ever confirmed|0                           |1

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
We have:
  vect_patt_30.10_67 = VEC_COND_EXPR <mask__16.9_64, { 1, ... }, { 0, ... }>;
  vect_patt_29.11_68 = (vector([4,4]) signed short) vect_patt_30.10_67;
  vect_patt_28.12_69 = -vect_patt_29.11_68;

So:
/* Sink convert to branches, but only if we do fold both and @0 is the thurth
type for the new version too. */
(simplify
 (convert (vec_cond:s @0 @1 @2))
 (if (is_truth_type_for (type, TREE_TYPE (@0)))
  (vec_cond @0 (convert! @1) (convert! @2))))

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/111006] [SVE] Extra neg for storing to short from int comparison
  2023-08-12 20:11 [Bug tree-optimization/111006] New: [SVE] Extra neg for storing to short from int comparison pinskia at gcc dot gnu.org
  2023-08-12 20:48 ` [Bug tree-optimization/111006] " pinskia at gcc dot gnu.org
@ 2023-08-12 20:54 ` pinskia at gcc dot gnu.org
  2023-08-15 22:03 ` pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-12 20:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111006

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note the non-SVE code generation can be improved too.
With:

(simplify
 (negate
  (vec_pack_trunc:s
   (vec_cond:s @0 uniform_integer_cst_p@1 uniform_integer_cst_p@2)
   (vec_cond:s @3 @1 @2)))
  (with {
   tree outer_mask_type = truth_type_for (type);
   tree allones = build_minus_one_cst (type);
   tree zeros = build_zero_cst (type);
  }
  (if (integer_onep (@1) && integer_zerop (@2))
   (vec_cond (vec_pack_trunc:outer_mask_type @0 @3) { allones; } { zeros; } )
   (if (integer_onep (@2) && integer_zerop (@1))
    (vec_cond (vec_pack_trunc:outer_mask_type @0 @3) { zeros; } { allones; }
)))))


I will submit both later next week.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/111006] [SVE] Extra neg for storing to short from int comparison
  2023-08-12 20:11 [Bug tree-optimization/111006] New: [SVE] Extra neg for storing to short from int comparison pinskia at gcc dot gnu.org
  2023-08-12 20:48 ` [Bug tree-optimization/111006] " pinskia at gcc dot gnu.org
  2023-08-12 20:54 ` pinskia at gcc dot gnu.org
@ 2023-08-15 22:03 ` pinskia at gcc dot gnu.org
  2023-08-20  7:27 ` cvs-commit at gcc dot gnu.org
  2023-08-20  7:27 ` pinskia at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-15 22:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111006

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 55740
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55740&action=edit
The patch which fixes the SVE part

It took me longer to come up with this due to failures which I didn't know
about expand_vec_cond_expr_p, I was just thinking is_truth_type_for would have
worked here but nope that crashes for types with BLKmode .

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/111006] [SVE] Extra neg for storing to short from int comparison
  2023-08-12 20:11 [Bug tree-optimization/111006] New: [SVE] Extra neg for storing to short from int comparison pinskia at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-08-15 22:03 ` pinskia at gcc dot gnu.org
@ 2023-08-20  7:27 ` cvs-commit at gcc dot gnu.org
  2023-08-20  7:27 ` pinskia at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-08-20  7:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111006

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Andrew Pinski <pinskia@gcc.gnu.org>:

https://gcc.gnu.org/g:70c50c87273d940918225d5c6b03f1ccfb6f978e

commit r14-3337-g70c50c87273d940918225d5c6b03f1ccfb6f978e
Author: Andrew Pinski <apinski@marvell.com>
Date:   Mon Aug 14 18:35:53 2023 -0700

    MATCH: Sink convert for vec_cond

    Convert be sinked into a vec_cond if both sides
    fold. Unlike other unary operations, we need to check that we still can
handle
    this vec_cond's first operand is the same as the new truth type.

    I tried a few different versions of this patch:
    view_convert to the new truth_type but that does not work as we always
support all vec_cond
    afterwards.
    using expand_vec_cond_expr_p; but that would allow too much.

    I also tried to see if view_convert can be handled here but we end up with:
      _3 = VEC_COND_EXPR <_2, {  Nan(-1),  Nan(-1),  Nan(-1),  Nan(-1) }, {
0.0, 0.0, 0.0, 0.0 }>;
    Which isel does not know how to handle as just being a view_convert from
`vector(4) <signed-boolean:32>`
    to `vector(4) float` and causes a regression with
`g++.target/i386/pr88152.C`

    Note, in the case of the SVE testcase, we will sink negate after the
convert and be able
    to remove a few extra instructions in the end.
    Also with this change gcc.target/aarch64/sve/cond_unary_5.c will now pass.

    Committed as approved after a bootstrapped and tested on x86_64-linux-gnu
and aarch64-linux-gnu.

    gcc/ChangeLog:

            PR tree-optimization/111006
            PR tree-optimization/110986
            * match.pd: (op(vec_cond(a,b,c))): Handle convert for op.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/111006
            * gcc.target/aarch64/sve/cond_convert_7.c: New test.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/111006] [SVE] Extra neg for storing to short from int comparison
  2023-08-12 20:11 [Bug tree-optimization/111006] New: [SVE] Extra neg for storing to short from int comparison pinskia at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-08-20  7:27 ` cvs-commit at gcc dot gnu.org
@ 2023-08-20  7:27 ` pinskia at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-20  7:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111006

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Fixed.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-08-20  7:27 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-12 20:11 [Bug tree-optimization/111006] New: [SVE] Extra neg for storing to short from int comparison pinskia at gcc dot gnu.org
2023-08-12 20:48 ` [Bug tree-optimization/111006] " pinskia at gcc dot gnu.org
2023-08-12 20:54 ` pinskia at gcc dot gnu.org
2023-08-15 22:03 ` pinskia at gcc dot gnu.org
2023-08-20  7:27 ` cvs-commit at gcc dot gnu.org
2023-08-20  7:27 ` pinskia at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).