* [Bug target/102575] New: Failure to optimize double _Complex stores to use largest loads/stores possible
@ 2021-10-03 14:01 gabravier at gmail dot com
  2021-10-03 18:48 ` [Bug tree-optimization/102575] " pinskia at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: gabravier at gmail dot com @ 2021-10-03 14:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102575

            Bug ID: 102575
           Summary: Failure to optimize double _Complex stores to use
                    largest loads/stores possible
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

void test(double _Complex *a)
{
    a[0] = 1;
    a[1] = 1;
}

With -O3, on AMD64 GCC outputs this:

test(double _Complex*):
        movsd   xmm1, QWORD PTR .LC0[rip]
        movsd   xmm0, QWORD PTR .LC0[rip+8]
        movsd   QWORD PTR [rdi], xmm1
        movsd   QWORD PTR [rdi+8], xmm0
        movsd   QWORD PTR [rdi+16], xmm1
        movsd   QWORD PTR [rdi+24], xmm0
        ret

Clang instead outputs this:

test(double _Complex*):
        movsd   xmm0, qword ptr [rip + .LCPI0_0] # xmm0 = mem[0],zero
        movups  xmmword ptr [rdi], xmm0
        movups  xmmword ptr [rdi + 16], xmm0
        ret

The second sequence needs one load instead of two and two 16-byte stores
instead of four 8-byte stores, so it seems to me it should always be faster.

PS: The difference is even larger with `-mavx2`.
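
For illustration only, here is a source-level sketch of the wider stores I
have in mind. It relies on the C99 guarantee that a double _Complex is laid
out like an array of two doubles (real part, then imaginary part). The names
test_wide and variants_match are made up for this example, and a compiler may
or may not lower each fixed-size memcpy to a single 16-byte store.

```c
#include <string.h>

/* The function from the report: two scalar complex stores. */
void test(double _Complex *a)
{
    a[0] = 1;
    a[1] = 1;
}

/* Hypothetical hand-written variant: one 16-byte copy per element.
   { 1.0, 0.0 } is the object representation of the complex value 1. */
void test_wide(double _Complex *a)
{
    static const double one[2] = { 1.0, 0.0 };
    memcpy(&a[0], one, sizeof one);
    memcpy(&a[1], one, sizeof one);
}

/* Returns 1 when both variants leave identical bytes in memory. */
int variants_match(void)
{
    double _Complex x[2], y[2];
    memset(x, 0xff, sizeof x);
    memset(y, 0xff, sizeof y);
    test(x);
    test_wide(y);
    return memcmp(x, y, sizeof x) == 0;
}
```

Both variants write the same 32 bytes, so the transformation is only about
how wide each store is, not about what gets stored.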


* [Bug tree-optimization/102575] Failure to optimize double _Complex stores to use largest loads/stores possible
From: pinskia at gcc dot gnu.org @ 2021-10-03 18:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102575

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |tree-optimization
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-10-03
           Severity|normal                      |enhancement

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---

/app/example.cpp:5:1: note:  === vect_slp_analyze_bb ===
/app/example.cpp:5:1: note:   === vect_analyze_data_refs ===
/app/example.cpp:5:1: missed:   not vectorized: no vectype for stmt: *a_2(D) = __complex__ (1.0e+0, 0.0);
 scalar_type: complex double
/app/example.cpp:5:1: missed:   not vectorized: no vectype for stmt: MEM[(complex double *)a_2(D) + 16B] = __complex__ (1.0e+0, 0.0);
 scalar_type: complex double
/app/example.cpp:5:1: note:   === vect_analyze_data_ref_accesses ===
/app/example.cpp:5:1: missed:  not vectorized: no grouped stores in basic block.
/app/example.cpp:5:1: note: ***** Analysis failed with vector mode VOID
void test (complex double * a)

Yes, SLP does not understand how to handle complex double.

Expand then splits each complex-typed store into two separate stores.
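
As a rough source-level picture of that split (an illustration, not the
actual GIMPLE or RTL GCC produces), the two complex stores end up as four
adjacent double stores. The names test_split and split_matches_complex are
invented for this sketch. The cast relies on the C99 rule that a complex type
has the same representation as an array of two elements of the corresponding
real type.

```c
#include <string.h>

/* Hypothetical scalar form of the testcase after each complex store has
   been split into a real-part store and an imaginary-part store. */
void test_split(double _Complex *a)
{
    double *p = (double *)a;  /* a[i] occupies p[2*i] and p[2*i + 1] */
    p[0] = 1.0;               /* real part of a[0] */
    p[1] = 0.0;               /* imaginary part of a[0] */
    p[2] = 1.0;               /* real part of a[1] */
    p[3] = 0.0;               /* imaginary part of a[1] */
}

/* Returns 1 when the split form stores the same bytes as the original
   complex assignments. */
int split_matches_complex(void)
{
    double _Complex got[2], want[2];
    want[0] = 1;
    want[1] = 1;
    test_split(got);
    return memcmp(got, want, sizeof got) == 0;
}
```

Written this way the stores form one contiguous group of doubles, which is
the shape basic-block SLP normally looks for; with the complex-typed
statements it never gets that far because no vectype is found.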


* [Bug tree-optimization/102575] Failure to optimize double _Complex stores to use largest loads/stores possible
From: rguenth at gcc dot gnu.org @ 2021-10-04  7:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102575

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
             Blocks|                            |53947

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The vectorizer doesn't handle vector- or complex-typed loads and stores (and
that starts before SLP).  There are duplicate bug reports, and unfortunately
it's not as easy as it looks.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations


* [Bug tree-optimization/102575] Failure to optimize double _Complex stores to use largest loads/stores possible
From: crazylht at gmail dot com @ 2022-07-20  8:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102575

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
This should be fixed by r13-1762-gf9d4c3b45c5ed5f45c8089c990dbd4e181929c3d


* [Bug tree-optimization/102575] Failure to optimize double _Complex stores to use largest loads/stores possible
From: rguenth at gcc dot gnu.org @ 2022-07-20  9:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102575

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Thanks Hongtao
