public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/110630] New: Missed optimization: bb-slp-pr95839.c not vectorised with V2SF targets
@ 2023-07-11 14:49 macro at orcam dot me.uk
  2023-07-12  8:53 ` [Bug tree-optimization/110630] " rguenth at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: macro at orcam dot me.uk @ 2023-07-11 14:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110630

            Bug ID: 110630
           Summary: Missed optimization: bb-slp-pr95839.c not vectorised
                    with V2SF targets
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: macro at orcam dot me.uk
                CC: rguenth at gcc dot gnu.org
            Blocks: 53947
  Target Milestone: ---

With targets that only support the V2SF vector mode, such as:

mips-linux-gnu-gcc -march=mips64 -mabi=64 -mpaired-single

scalar code is produced for the V4SF data in bb-slp-pr95839.c, whereas a
pair of V2SF operations would be more efficient.
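
As a rough illustration of what "a pair of V2SF operations" means for V4SF
data, here is a minimal sketch using the GCC vector_size extension. The
helper name add_v4f32_as_two_v2f32 and the v2f32/v4f32 typedefs are
assumptions made for this example only; they are not taken from the test
case, and the code the vectorizer actually emits will differ.

```c
#include <assert.h>
#include <string.h>

typedef float __attribute__((vector_size(16))) v4f32;  /* V4SF: four floats */
typedef float __attribute__((vector_size(8)))  v2f32;  /* V2SF: two floats  */

/* Hypothetical helper: add two V4SF values using only V2SF arithmetic.  */
static v4f32
add_v4f32_as_two_v2f32 (v4f32 a, v4f32 b)
{
  v2f32 a_half[2], b_half[2], r_half[2];
  v4f32 r;

  /* Two V2SF halves cover exactly the same 16 bytes as one V4SF value.  */
  assert (sizeof (a_half) == sizeof (a));

  /* View each V4SF operand as a low half (lanes 0..1) and a high half
     (lanes 2..3).  */
  memcpy (a_half, &a, sizeof (a));
  memcpy (b_half, &b, sizeof (b));

  r_half[0] = a_half[0] + b_half[0];  /* first V2SF add, lanes 0..1  */
  r_half[1] = a_half[1] + b_half[1];  /* second V2SF add, lanes 2..3 */

  memcpy (&r, r_half, sizeof (r));
  return r;
}
```

The point of the sketch is only that the four scalar additions collapse into
two two-lane additions when the V4SF operands are treated as two V2SF halves.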


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations


* [Bug tree-optimization/110630] Missed optimization: bb-slp-pr95839.c not vectorised with V2SF targets
From: rguenth at gcc dot gnu.org @ 2023-07-12  8:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110630

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2023-07-12
     Ever confirmed|0                           |1
           Keywords|                            |missed-optimization
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
typedef float __attribute__((vector_size(32))) v8f32;

v8f32 f(v8f32 a, v8f32 b)
{
  /* Check that we vectorize this CTOR without any loads.  */
  return (v8f32){a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3],
      a[4] + b[4], a[5] + b[5], a[6] + b[6], a[7] + b[7]};
}

fails to vectorize optimally with SSE2 on x86_64 (it would need AVX2).

It works OK when the ABI issues are avoided, as in the following variant,
so the importance of fixing this might be low.

typedef float __attribute__((vector_size(32))) v8f32;
v8f32 a, b;
v8f32 res;
void f()
{
  /* Check that we vectorize this CTOR without any loads.  */
  res = (v8f32){a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3],
      a[4] + b[4], a[5] + b[5], a[6] + b[6], a[7] + b[7]};
}

The issue on x86_64 is that we run into:

t.c:6:10: note:   vectorizing permutation op0[0] op0[1] op0[2] op0[3] op0[4] op0[5] op0[6] op0[7]
t.c:6:10: note:   vectorizing permutation op0[0] op0[1] op0[2] op0[3] op0[4] op0[5] op0[6] op0[7]
t.c:6:10: note:   as vops0[0][0] vops0[0][1] vops0[0][2] vops0[0][3], vops0[0][4] vops0[0][5] vops0[0][6] vops0[0][7]
t.c:6:10: missed:   unsupported vect permute { 4 5 6 7 }
t.c:6:10: note:   Building vector operands of 0x47865f0 from scalars instead

The issue on MIPS with -mpaired-single is the same:

t.c:6:10: note:   vectorizing permutation op0[0] op0[1] op0[2] op0[3]
t.c:6:10: note:   vectorizing permutation op0[0] op0[1] op0[2] op0[3]
t.c:6:10: note:   as vops0[0][0] vops0[0][1], vops0[0][2] vops0[0][3]
t.c:6:10: missed:   unsupported vect permute { 2 3 }

Interestingly, though, the MIPS target doesn't emit any psABI warning, so
maybe it has a defined ABI for V4SFmode vectors.

The fix is for vectorizable_slp_permutation to try a vector extraction as
well or, for BLKmode vector operands, to simply allow this to go through.
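
To make the "vector extraction" idea concrete: the rejected permutes above,
{ 4 5 6 7 } with four-lane vectors and { 2 3 } with two-lane vectors, each
select one whole, aligned sub-vector of the wider operand, so no shuffle is
needed. The sketch below checks for that shape. It is an illustration only;
is_aligned_extract and its parameters are hypothetical names and this is not
the code in tree-vect-slp.cc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not GCC's implementation.  Return true when
   PERM[0..NUNITS-1] selects NUNITS consecutive lanes starting at a
   multiple of NUNITS, i.e. one whole aligned sub-vector of the source.
   On success store the index of that sub-vector in *VEC_INDEX.  */
static bool
is_aligned_extract (const unsigned *perm, size_t nunits, size_t *vec_index)
{
  assert (nunits > 0);

  /* The first selected lane must sit on a vector boundary.  */
  if (perm[0] % nunits != 0)
    return false;

  /* The remaining lanes must follow consecutively.  */
  for (size_t i = 1; i < nunits; i++)
    if (perm[i] != perm[0] + i)
      return false;

  *vec_index = perm[0] / nunits;
  return true;
}
```

Under these assumptions, { 4 5 6 7 } with nunits 4 and { 2 3 } with nunits 2
both map to sub-vector index 1 (the high half), while something like
{ 1 2 3 4 } is rejected because it straddles two sub-vectors.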


* [Bug tree-optimization/110630] Missed optimization: bb-slp-pr95839.c not vectorised with V2SF targets
From: cvs-commit at gcc dot gnu.org @ 2023-07-12 11:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110630

--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:25f831eab368d1bbec4dc67bf058cb7cf6b721ee

commit r14-2460-g25f831eab368d1bbec4dc67bf058cb7cf6b721ee
Author: Richard Biener <rguenther@suse.de>
Date:   Wed Jul 12 11:19:58 2023 +0200

    tree-optimization/110630 - enhance SLP permute support

    The following enhances the existing lowpart extraction support for
    SLP VEC_PERM nodes to cover all vector aligned extractions.  This
    allows the existing bb-slp-pr95839.c testcase to be vectorized
    with mips -mpaired-single and the new bb-slp-pr95839-3.c testcase
    with SSE2.

            PR tree-optimization/110630
            * tree-vect-slp.cc (vect_add_slp_permutation): New
            offset parameter, honor that for the extract code generation.
            (vectorizable_slp_permutation_1): Handle offsetted identities.

            * gcc.dg/vect/bb-slp-pr95839.c: Make stricter.
            * gcc.dg/vect/bb-slp-pr95839-3.c: New variant testcase.


* [Bug tree-optimization/110630] Missed optimization: bb-slp-pr95839.c not vectorised with V2SF targets
From: rguenth at gcc dot gnu.org @ 2023-07-12 11:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110630

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.
