public inbox for gcc-bugs@sourceware.org
* [Bug target/95964] New: AArch64 arm_neon.h arithmetic functions lack appropriate attributes
From: rsandifo at gcc dot gnu.org @ 2020-06-29 13:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95964

            Bug ID: 95964
           Summary: AArch64 arm_neon.h arithmetic functions lack
                    appropriate attributes
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rsandifo at gcc dot gnu.org
            Blocks: 95958
  Target Milestone: ---
            Target: aarch64*-*-*

For:

---------------------------------------
#include <arm_neon.h>
#include <vector>

std::vector<float32x4_t> a, b, c;

void
foo (size_t n)
{
  for (size_t i = 0; i < n; ++i)
    a[i] = vfmaq_f32(a[i], b[i], c[i]);
}
---------------------------------------

we generate code that loads the start of a, b and c
in every iteration of the loop:

---------------------------------------
        .cfi_startproc
        cbz     x0, .L4
        adrp    x3, .LANCHOR0
        add     x3, x3, :lo12:.LANCHOR0
        mov     x2, 0
        .p2align 3,,7
.L6:
        ldr     x4, [x3]
        lsl     x1, x2, 4
        ldr     x6, [x3, 24]
        add     x2, x2, 1
        ldr     x5, [x3, 48]
        ldr     q0, [x4, x1]
        ldr     q2, [x6, x1]
        ldr     q1, [x5, x1]
        fmla    v0.4s, v2.4s, v1.4s
        str     q0, [x4, x1]
        cmp     x0, x2
        bne     .L6
.L4:
        ret
        .cfi_endproc
---------------------------------------

The problem is that __builtin_aarch64_fmav4sf and similar
operations are treated as general functions that can read
memory, write memory, and call other functions.  If the
intrinsic is replaced by arithmetic then the start addresses
are hoisted, as expected.
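
For comparison, here is a rough sketch of what "replaced by arithmetic"
could look like.  The names v4sf and foo_arith are made up for this
example, and v4sf uses the GCC vector extension as a stand-in for
float32x4_t so that the sketch also builds on non-AArch64 hosts.  Note
that a + b * c is only contracted into a fused multiply-add if
-ffp-contract allows it, so this is not an exact replacement for
vfmaq_f32.

```cpp
#include <cstddef>
#include <vector>

// Assumption: a generic 4 x float vector type standing in for float32x4_t.
typedef float v4sf __attribute__ ((vector_size (16)));

std::vector<v4sf> a, b, c;

void
foo_arith (std::size_t n)
{
  // Plain vector arithmetic instead of a builtin call, so the compiler
  // can see that nothing in the loop body touches the vectors' start
  // pointers.
  for (std::size_t i = 0; i < n; ++i)
    a[i] = a[i] + b[i] * c[i];
}
```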


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95958
[Bug 95958] [meta-bug] Inefficient arm_neon.h code for AArch64


* [Bug target/95964] AArch64 arm_neon.h arithmetic functions lack appropriate attributes
From: rguenth at gcc dot gnu.org @ 2020-06-29 13:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95964

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
You could use fnspec attributes to improve things but of course open-coding
those as GIMPLE is preferable (last resort is to "fold" the calls to
GIMPLE sequences as powerpc does for select builtins).


* [Bug target/95964] AArch64 arm_neon.h arithmetic functions lack appropriate attributes
From: rsandifo at gcc dot gnu.org @ 2020-06-29 14:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95964

--- Comment #2 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #1)
> You could use fnspec attributes to improve things but of course open-coding
> those as GIMPLE is preferable (last resort is to "fold" the calls to
> GIMPLE sequences as powerpc does for select builtins).
Yeah, open-coding is another option, if the operation and
command-line options are right.  Part of the problem though
is that the intrinsics are supposed to behave like the associated
instructions, including in terms of honouring the rounding mode,
etc.  They're also not supposed to inherit C's idea of what's
undefined behaviour for the closest “equivalent” operators or
libm functions.
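
To illustrate the rounding-mode point, a small sketch in standard C++
(nothing to do with arm_neon.h itself).  third_rounded is a hypothetical
helper, and the volatile variables are only there to force the division
to happen at run time.  It assumes the target defines FE_UPWARD and
FE_DOWNWARD.

```cpp
#include <cfenv>

// Hypothetical helper: compute 1/3 at run time under the given rounding
// mode, then restore the previous mode.
float
third_rounded (int mode)
{
  const int saved = std::fegetround ();
  std::fesetround (mode);
  // volatile keeps the division between the two fesetround calls and
  // stops it from being folded at compile time.
  volatile float one = 1.0f, three = 3.0f;
  volatile float q = one / three;
  std::fesetround (saved);
  return q;
}
```

Since 1/3 is not exactly representable, third_rounded (FE_UPWARD) and
third_rounded (FE_DOWNWARD) should differ by one ulp, which is the kind
of dependence an open-coded operation would need to preserve.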

So yeah, I think adding attributes is the way to go in general.
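
As a loose analogy only: fnspec is, as far as I know, internal to GCC
and not something user code can write, but the user-visible "const"
attribute makes a similar kind of promise.  In the sketch below, mul_add
and bar are hypothetical names.  Because mul_add is declared const, the
compiler may assume the call neither reads nor writes memory, which is
the property that should allow the vectors' start pointers to be loaded
once outside the loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not a GCC builtin.  "const" promises that the
// result depends only on the arguments; "noinline" keeps it as a real
// call so that the attribute is what the optimizers rely on.
__attribute__ ((const, noinline)) float
mul_add (float acc, float x, float y)
{
  return acc + x * y;
}

std::vector<float> va, vb, vc;

void
bar (std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    va[i] = mul_add (va[i], vb[i], vc[i]);
}
```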


* [Bug target/95964] AArch64 arm_neon.h arithmetic functions lack appropriate attributes
From: ktkachov at gcc dot gnu.org @ 2021-02-10 12:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95964

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-02-10
     Ever confirmed|0                           |1
                 CC|                            |ktkachov at gcc dot gnu.org
             Status|UNCONFIRMED                 |NEW

--- Comment #3 from ktkachov at gcc dot gnu.org ---
In GCC 11 these builtins do get a fnspec attribute and the start pointer
is hoisted. But this happens only with -fno-trapping-math (part of -Ofast)
because the operation can raise FP exceptions and is therefore considered to
modify global state unless -fno-trapping-math is in effect.

Is that good enough for this PR or do we want something more fine-grained?
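
To make the "modifies global state" point concrete, here is a minimal
sketch in standard C++ showing that an FP operation can leave an
observable trace in the floating-point environment.
overflow_sets_flag is a hypothetical name, and the volatile variables
are only there to force the multiplication to happen at run time.

```cpp
#include <cfenv>
#include <cfloat>

// Hypothetical helper: returns true if multiplying FLT_MAX by 2 at run
// time raised the overflow exception flag.
bool
overflow_sets_flag ()
{
  volatile float big = FLT_MAX;
  std::feclearexcept (FE_ALL_EXCEPT);
  // The volatile load and store keep the multiplication between the
  // feclearexcept and fetestexcept calls.
  volatile float r = big * 2.0f;
  (void) r;
  return std::fetestexcept (FE_OVERFLOW) != 0;
}
```

On typical IEEE hardware I would expect this to return true.  With
-fno-trapping-math the compiler is allowed to ignore this side effect.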


* [Bug target/95964] AArch64 arm_neon.h arithmetic functions lack appropriate attributes
From: tnfchris at gcc dot gnu.org @ 2021-08-12  7:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95964

Tamar Christina <tnfchris at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tnfchris at gcc dot gnu.org

--- Comment #4 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to ktkachov from comment #3)
> In GCC 11 these builtins do get a fnspec attribute and the start
> pointer is hoisted. But this happens only with -fno-trapping-math (part of
> -Ofast) because the operation can raise FP exceptions and is therefore
> considered to modify global state unless -fno-trapping-math is in effect.
> 
> Is that good enough for this PR or do we want something more
> fine-grained?

This behavior seems reasonable to me given the default trapping math behavior.
Is the current behavior good enough for you, Richard S?


* [Bug target/95964] AArch64 arm_neon.h arithmetic functions lack appropriate attributes
From: rsandifo at gcc dot gnu.org @ 2021-08-20 11:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95964

--- Comment #5 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
There are some lingering aarch64-simd-builtins.def entries
that use “, ALL”, so I think we should keep this open until
they've all been converted.  The testcase was just an example,
rather than the single motivating case.
