[Bug target/95969] New: Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug target/95969] New: Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation
@ 2020-06-29 15:50 rsandifo at gcc dot gnu.org
  2021-09-02  5:31 ` [Bug target/95969] " pinskia at gcc dot gnu.org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2020-06-29 15:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95969

            Bug ID: 95969
           Summary: Use of __builtin_aarch64_im_lane_boundsi in AArch64
                    arm_neon.h interferes with gimple optimisation
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rsandifo at gcc dot gnu.org
            Blocks: 95958
  Target Milestone: ---
            Target: aarch64*-*-*

For:

----------------------------------------------------
#include <arm_neon.h>

void
f (float32x4_t **ptr)
{
  float32x4_t res = vsetq_lane_f32 (0.0f, **ptr, 0);
  **ptr = res;
}
----------------------------------------------------

the final gimple .optimized dump looks like:

----------------------------------------------------
  float32x4_t __vec;
  float32x4_t * _1;
  __Float32x4_t _2;
  float32x4_t * _3;

  <bb 2> [local count: 1073741824]:
  _1 = *ptr_5(D);
  _2 = *_1;
  __builtin_aarch64_im_lane_boundsi (16, 4, 0);
  __vec_8 = BIT_INSERT_EXPR <_2, 0.0, 0>;
  _3 = *ptr_5(D);
  *_3 = __vec_8;
  return;
----------------------------------------------------

where we still have two loads from *ptr.  This is because
__builtin_aarch64_im_lane_boundsi has no attributes:
it's modelled a general function that could do pretty
much anything.

Although we fix this testcase in the RTL optimisers, it's easy
for the issue to cause unoptimised code in larger, more realistic
testcases.

The problem is similar to PR95964.  The difficulty is that
here we have to model the function as having some kind of
side-effect, otherwise it will simply get optimised away.

Ideally we'd fix this by implementing the intrinsics directly
in the compiler and doing the checks in the frontend via
TARGET_CHECK_BUILTIN_CALL.  That's obviously a big change
though.  Until then, we should optimise away calls whose
arguments are already correct so that they don't clog
up the IL.

If not returning a value makes it harder to fold the call
for some reason, perhaps an alternative would be to pass
a vector value through a dummy const function, e.g.:

  _4 = __builtin_aarch64_... (_2, 16, 4, 0);
  __vec_8 = BIT_INSERT_EXPR <_4, 0.0, 0>;

That might not be necessary though -- haven't checked.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95958
[Bug 95958] [meta-bug] Inefficient arm_neon.h code for AArch64

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/95969] Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation
  2020-06-29 15:50 [Bug target/95969] New: Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation rsandifo at gcc dot gnu.org
@ 2021-09-02  5:31 ` pinskia at gcc dot gnu.org
  2021-09-02  5:32 ` pinskia at gcc dot gnu.org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-02  5:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95969

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-09-02
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
                 CC|                            |pinskia at gcc dot gnu.org

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I guess a data dependence is indeed the only viable thing here, but that
will also hinder optimization (on that data).  If you're merely delaying
some static checking to RTL expansion by this builtin that's of course
quite stupid.

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.

Shouldn't be too hard to add a few attributes to it or fold it when the
arguments are constant.

Let me look into doing the second thing.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/95969] Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation
  2020-06-29 15:50 [Bug target/95969] New: Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation rsandifo at gcc dot gnu.org
  2021-09-02  5:31 ` [Bug target/95969] " pinskia at gcc dot gnu.org
@ 2021-09-02  5:32 ` pinskia at gcc dot gnu.org
  2021-09-02  7:14 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-02  5:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95969

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |pinskia at gcc dot gnu.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/95969] Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation
  2020-06-29 15:50 [Bug target/95969] New: Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation rsandifo at gcc dot gnu.org
  2021-09-02  5:31 ` [Bug target/95969] " pinskia at gcc dot gnu.org
  2021-09-02  5:32 ` pinskia at gcc dot gnu.org
@ 2021-09-02  7:14 ` pinskia at gcc dot gnu.org
  2021-09-02  8:28 ` ktkachov at gcc dot gnu.org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-02  7:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95969

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 51396
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51396&action=edit
Patch

Simple patch which adds both generic and gimple level folding for
__builtin_aarch64_im_lane_boundsi.
In this case (and most likely others), __builtin_aarch64_im_lane_boundsi is
removed during early inlining so it will fix the majority of the issue.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/95969] Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation
  2020-06-29 15:50 [Bug target/95969] New: Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation rsandifo at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-09-02  7:14 ` pinskia at gcc dot gnu.org
@ 2021-09-02  8:28 ` ktkachov at gcc dot gnu.org
  2021-09-02  8:37 ` pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2021-09-02  8:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95969

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ktkachov at gcc dot gnu.org

--- Comment #4 from ktkachov at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #3)
> Created attachment 51396 [details]
> Patch
> 
> Simple patch which adds both generic and gimple level folding for
> __builtin_aarch64_im_lane_boundsi.
> In this case (and most likely others), __builtin_aarch64_im_lane_boundsi is
> removed during early inlining so it will fix the majority of the issue.

looks like the wrong patch was attached?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/95969] Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation
  2020-06-29 15:50 [Bug target/95969] New: Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation rsandifo at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2021-09-02  8:28 ` ktkachov at gcc dot gnu.org
@ 2021-09-02  8:37 ` pinskia at gcc dot gnu.org
  2021-09-02 10:35 ` pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-02  8:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95969

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51396|0                           |1
        is obsolete|                            |

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 51397
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51397&action=edit
Correct patch

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/95969] Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation
  2020-06-29 15:50 [Bug target/95969] New: Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation rsandifo at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2021-09-02  8:37 ` pinskia at gcc dot gnu.org
@ 2021-09-02 10:35 ` pinskia at gcc dot gnu.org
  2021-09-02 20:51 ` pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-02 10:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95969

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
There are two bugs with this change.
First is in the change itself; I forgot to check to make sure totalsize and
elementsize are non-zero. I have that fixed.
The second issue is some gimple level passes are not ready to change the
function to a GIMPLE_NOP yet.  I will fix those tomorrow.

Right now I see evpr and cunroll so far.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/95969] Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation
  2020-06-29 15:50 [Bug target/95969] New: Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation rsandifo at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2021-09-02 10:35 ` pinskia at gcc dot gnu.org
@ 2021-09-02 20:51 ` pinskia at gcc dot gnu.org
  2021-09-02 22:26 ` pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-02 20:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95969

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Testcases that fail:
gcc.target/aarch64/vmov_n_1.c

at -O1:t.c: In function ‘test_vmov_n_f32’:
t.c:165:10: error: missing definition
for SSA_NAME: .MEM_161 in statement:
# VUSE <.MEM_161>
_162 = BIT_FIELD_REF <__a, 32, 0>;
during GIMPLE pass: cunroll
0xfdd233 verify_ssa(bool, bool)
        /home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/tree-ssa.c:1214
0xf092ab verify_loop_closed_ssa(bool, loop*)
        /home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/tree-ssa-loop-manip.c:766
0xf092ab verify_loop_closed_ssa(bool, loop*)
        /home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/tree-ssa-loop-manip.c:760
0xeeeed3 tree_unroll_loops_completely
       
/home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/tree-ssa-loop-ivcanon.c:1498
0xeeef97 execute
       
/home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/tree-ssa-loop-ivcanon.c:1602
0xeeef97 execute
       
/home/ubuntu/src/upstream-gcc-aarch64/gcc/gcc/tree-ssa-loop-ivcanon.c:1592
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

at -O2+:
t.c: In function ‘test_vmov_n_f64’:
t.c:356:1: error: missing definition
  356 | }
      | ^
for SSA_NAME: .MEM_129 in statement:
# .MEM_131 = VDEF <.MEM_129>
__a ={v} {CLOBBER};
during GIMPLE pass: evrp
t.c:356:1: internal compiler error: verify_ssa failed

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/95969] Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation
  2020-06-29 15:50 [Bug target/95969] New: Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation rsandifo at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2021-09-02 20:51 ` pinskia at gcc dot gnu.org
@ 2021-09-02 22:26 ` pinskia at gcc dot gnu.org
  2021-09-13 15:19 ` cvs-commit at gcc dot gnu.org
  2021-09-13 15:20 ` pinskia at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-02 22:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95969

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #51397|0                           |1
        is obsolete|                            |

--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 51407
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51407&action=edit
Updated patch which should fix the regressions

So it turns out it was not evpr/complete_unrolling handling of this but rather
if a replacement happens inside gimple_fold you need to unlink the vdefs and
such.

Attached is the new patch which I am now doing a full test run on.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/95969] Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation
  2020-06-29 15:50 [Bug target/95969] New: Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation rsandifo at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2021-09-02 22:26 ` pinskia at gcc dot gnu.org
@ 2021-09-13 15:19 ` cvs-commit at gcc dot gnu.org
  2021-09-13 15:20 ` pinskia at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-09-13 15:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95969

--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Andrew Pinski <pinskia@gcc.gnu.org>:

https://gcc.gnu.org/g:03312cbd54f337dfb25be356a1d1abc9925c6c03

commit r12-3493-g03312cbd54f337dfb25be356a1d1abc9925c6c03
Author: Andrew Pinski <apinski@marvell.com>
Date:   Thu Sep 2 07:08:22 2021 +0000

    [aarch64] Fix target/95969: __builtin_aarch64_im_lane_boundsi interferes
with gimple

    This patch adds simple folding of __builtin_aarch64_im_lane_boundsi where
    we are not going to error out. It fixes the problem by the removal
    of the function from the IR.

    OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.

    gcc/ChangeLog:

            PR target/95969
            * config/aarch64/aarch64-builtins.c
(aarch64_fold_builtin_lane_check):
            New function.
            (aarch64_general_fold_builtin): Handle
AARCH64_SIMD_BUILTIN_LANE_CHECK.
            (aarch64_general_gimple_fold_builtin): Likewise.

    gcc/testsuite/ChangeLog:

            PR target/95969
            * gcc.target/aarch64/lane-bound-1.c: New test.
            * gcc.target/aarch64/lane-bound-2.c: New test.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug target/95969] Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation
  2020-06-29 15:50 [Bug target/95969] New: Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation rsandifo at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2021-09-13 15:19 ` cvs-commit at gcc dot gnu.org
@ 2021-09-13 15:20 ` pinskia at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-13 15:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95969

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #10 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Fixed on the trunk for GCC 12.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-09-13 15:20 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-29 15:50 [Bug target/95969] New: Use of __builtin_aarch64_im_lane_boundsi in AArch64 arm_neon.h interferes with gimple optimisation rsandifo at gcc dot gnu.org
2021-09-02  5:31 ` [Bug target/95969] " pinskia at gcc dot gnu.org
2021-09-02  5:32 ` pinskia at gcc dot gnu.org
2021-09-02  7:14 ` pinskia at gcc dot gnu.org
2021-09-02  8:28 ` ktkachov at gcc dot gnu.org
2021-09-02  8:37 ` pinskia at gcc dot gnu.org
2021-09-02 10:35 ` pinskia at gcc dot gnu.org
2021-09-02 20:51 ` pinskia at gcc dot gnu.org
2021-09-02 22:26 ` pinskia at gcc dot gnu.org
2021-09-13 15:19 ` cvs-commit at gcc dot gnu.org
2021-09-13 15:20 ` pinskia at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).