public inbox for gcc-bugs@sourceware.org
* [Bug target/98891] New: [11 regression] Neon logical operations not vectorized in DImode since g:cdfc0e863a03698a80c74896cbdc9f5c8c652e64
@ 2021-01-29 14:49 clyon at gcc dot gnu.org
  2021-01-29 16:14 ` [Bug target/98891] " rguenth at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: clyon at gcc dot gnu.org @ 2021-01-29 14:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98891

            Bug ID: 98891
           Summary: [11 regression] Neon logical operations not vectorized
                    in DImode since
                    g:cdfc0e863a03698a80c74896cbdc9f5c8c652e64
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Since g:cdfc0e863a03698a80c74896cbdc9f5c8c652e64 (r10-2761), I have noticed
that vorr/vorn is no longer vectorized in DImode.

I am compiling a modified version of
gcc/testsuite/gcc.target/arm/neon-vorns64.c with -mfloat-abi=hard
-mcpu=cortex-a9 -mfpu=auto -O3:

====================================================
#include "arm_neon.h"
#include <stdlib.h>

int64x1_t out_int64x1_t = 0;
int64x1_t arg0_int64x1_t = (int64x1_t)0xdeadbeef00000000LL;
int64x1_t arg1_int64x1_t = (int64x1_t)(~0xdead00000000beefLL);

int main (void)
{

  out_int64x1_t = vorn_s64 (arg0_int64x1_t, arg1_int64x1_t);
  if (out_int64x1_t != (int64x1_t)0xdeadbeef0000beefLL)
    abort();
  return 0;
}
====================================================

Before that commit I get:

        vldr.64 d17, [r3]       @ int
...
        vldr.64 d16, [r3, #8]   @ int
        vorn    d16, d16, d17
...

After that commit:
        ldr     lr, [r3]
        ldr     r4, [r3, #8]
        ldr     ip, [r3, #4]
        ldr     r6, [r3, #12]
        mvn     r3, lr
        orr     r0, r4, r3
...
        mvn     r3, ip
        orr     r1, r6, r3
...


Recent trunk has:
        ldrd    r2, [r1]
        ldrd    r0, [r1, #8]
        mvn     r2, r2
        mvn     r3, r3
        orr     r2, r2, r0
...
        orr     r3, r3, r1

* [Bug target/98891] [11 regression] Neon logical operations not vectorized in DImode since g:cdfc0e863a03698a80c74896cbdc9f5c8c652e64
From: rguenth at gcc dot gnu.org @ 2021-01-29 16:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98891

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Target Milestone|---                         |11.0

* [Bug target/98891] [10/11 regression] Neon logical operations not vectorized in DImode since g:cdfc0e863a03698a80c74896cbdc9f5c8c652e64
From: rguenth at gcc dot gnu.org @ 2021-02-26 11:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98891

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|11.0                        |10.3
            Summary|[11 regression] Neon        |[10/11 regression] Neon
                   |logical operations not      |logical operations not
                   |vectorized in DImode since  |vectorized in DImode since
                   |g:cdfc0e863a03698a80c74896c |g:cdfc0e863a03698a80c74896c
                   |bdc9f5c8c652e64             |bdc9f5c8c652e64
           Priority|P3                          |P2

* [Bug target/98891] [10/11 regression] Neon logical operations not vectorized in DImode since g:cdfc0e863a03698a80c74896cbdc9f5c8c652e64
From: jakub at gcc dot gnu.org @ 2021-03-17  9:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98891

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |wilco at gcc dot gnu.org
   Last reconfirmed|                            |2021-03-17

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Reduced testcase:
extern unsigned long long a, b, c;

void
foo (void)
{
  a = b | ~c;
}

This seems to be the usual dilemma between splitting double-word operations
early and splitting them late; each has its advantages and serious
disadvantages. When splitting early, the combiner can't really do much with it:
the operation is split into loads, not, or and store of the halves separately,
and the combiner doesn't see the two halves together. One would essentially
need vectorization on RTL to match that.

* [Bug target/98891] [10/11 regression] Neon logical operations not vectorized in DImode since g:cdfc0e863a03698a80c74896cbdc9f5c8c652e64
From: jakub at gcc dot gnu.org @ 2021-03-17  9:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98891

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
E.g. x86_64 (both -m32 and -m64) keeps the double-word logicals in the IL, then
has its machine-dependent stv pass that promotes some sets of operations into
SIMD ones, and finally (admittedly, clearly too late) splits the double-word
operations into operations on the halves when SIMD wasn't beneficial.

* [Bug target/98891] [10/11 regression] Neon logical operations not vectorized in DImode since g:cdfc0e863a03698a80c74896cbdc9f5c8c652e64
From: wilco at gcc dot gnu.org @ 2021-03-17 12:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98891

--- Comment #3 from Wilco <wilco at gcc dot gnu.org> ---
Older GCCs only ever did this for vorn, not for other operations like
add/sub/and/orr/eor, so the current behaviour is now fully consistent, and I
don't consider it a bug.

One could argue these intrinsics should always map to Neon instructions rather
than being optimized into 64-bit integer operations. However, GCC never
supported this except for vorn, so it's not clear whether there is an advantage
in changing this now.

* [Bug target/98891] [10/11 regression] Neon logical operations not vectorized in DImode since g:cdfc0e863a03698a80c74896cbdc9f5c8c652e64
From: wilco at gcc dot gnu.org @ 2021-03-17 13:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98891

--- Comment #4 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #1)
> Reduced testcase:
> extern unsigned long long a, b, c;
> 
> void
> foo (void)
> {
>   a = b | ~c;
> }
> 
> This seems to be the usual dilemma between splitting double-word operations
> early and splitting them late; each has its advantages and serious
> disadvantages. When splitting early, the combiner can't really do much with
> it: the operation is split into loads, not, or and store of the halves
> separately, and the combiner doesn't see the two halves together. One would
> essentially need vectorization on RTL to match that.

Splitting early is required since it results in much more efficient code.
However, the real underlying problem is the concept that a type can map to
different register files. Generally a compiler must decide the register file
for each operand before register allocation, but GCC does this during register
allocation, and it does it badly, with incomplete knowledge and way too many
costing hacks. To get decent code for AArch64 we had to add special hooks to
force the allocator to strongly prefer allocating integer types to integer
registers and FP/SIMD types to FP/SIMD registers.

* [Bug target/98891] [10/11 regression] Neon logical operations not vectorized in DImode since g:cdfc0e863a03698a80c74896cbdc9f5c8c652e64
From: rguenth at gcc dot gnu.org @ 2021-04-08 12:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98891

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.3                        |10.4

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 10.3 is being released, retargeting bugs to GCC 10.4.

* [Bug target/98891] [10/11 regression] Neon logical operations not vectorized in DImode since g:cdfc0e863a03698a80c74896cbdc9f5c8c652e64
From: wilco at gcc dot gnu.org @ 2021-04-08 12:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98891

Wilco <wilco at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
             Status|NEW                         |RESOLVED

--- Comment #6 from Wilco <wilco at gcc dot gnu.org> ---
Current codegen is better (in general there is no gain from using Neon for
64-bit types), so closing.
