public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/116573] New: [15 Regression] Recent SLP work appears to generate significantly worse code on RISC-V
@ 2024-09-02 16:47 law at gcc dot gnu.org
  2024-09-02 17:37 ` [Bug tree-optimization/116573] " rguenther at suse dot de
                   ` (13 more replies)
  0 siblings, 14 replies; 15+ messages in thread
From: law at gcc dot gnu.org @ 2024-09-02 16:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116573

            Bug ID: 116573
           Summary: [15 Regression] Recent SLP work appears to generate
                    significantly worse code on RISC-V
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: law at gcc dot gnu.org
  Target Milestone: ---

This change:

commit 9aaedfc4146c5e4b8412913a6ca4092a2731c35c (HEAD)
Author: Richard Biener <rguenther@suse.de>
Date:   Fri Jul 5 10:35:08 2024 +0200

    load and store-lanes with SLP

    The following is a prototype for how to represent load/store-lanes
    within SLP.  I've for now settled with having a single load node
    with multiple permute nodes acting as selection, one for each loaded lane
    and a single store node fed from all stored lanes.  For
[ ... ]

is causing multiple scan failures in the testsuite.  Several are "don't care"
changes in code generation, but one class seems to indicate a notable
regression in the quality of the generated code:

Before the change for the attached testcase we would generate:

.L3:
        vsetvli a5,a3,e8,m1,ta,ma
        vle8.v  v2,0(a0)
        vle8.v  v3,0(a1)
        slli    a4,a5,1
        sub     a3,a3,a5
        add     a0,a0,a5
        add     a1,a1,a5
        vsseg2e8.v      v2,(a2)
        add     a2,a2,a4
        bne     a3,zero,.L3

Nothing really of note there.  Load two vectors, then store them interleaved
with a segmented store, plus the usual pointer updates.
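
For readers less familiar with RVV segment stores, the effect of the
vsseg2e8.v in the loop above can be sketched in scalar C.  This is only an
illustrative model: the name model_vsseg2e8 and its parameters are made up
here, and the sketch ignores masking and tail handling.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a two-field, 8-bit, unit-stride segment store.
   Element i of the first register group and element i of the second
   are written to consecutive bytes, so memory ends up as
   field0[0], field1[0], field0[1], field1[1], ...
   (hypothetical helper, for illustration only).  */
void
model_vsseg2e8 (uint8_t *dst, const uint8_t *field0,
                const uint8_t *field1, size_t vl)
{
  for (size_t i = 0; i < vl; ++i)
    {
      dst[2 * i] = field0[i];      /* lane i of v2 */
      dst[2 * i + 1] = field1[i];  /* lane i of v3 */
    }
}
```

With vl elements per field, the store writes 2 * vl bytes, which matches the
"slli a4,a5,1" / "add a2,a2,a4" update of the destination pointer.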

After the change:
.L4:
        mv      a6,a3
        mv      a4,a3
        bleu    a3,a5,.L3
        csrr    a4,vlenb
.L3:
        vsetvli zero,a4,e8,m1,ta,ma
        vle8.v  v2,0(a0)
        vle8.v  v3,0(a1)
        sub     a3,a3,a5
        add     a0,a0,a5
        add     a1,a1,a5
        vsseg2e8.v      v2,(a2)
        add     a2,a2,a7
        bgtu    a6,a5,.L4

Ugh.  We've got a conditional branch in the middle of the loop, a CSR read, and
a bit of silliness with those extra move instructions.  I'm not really sure what
went wrong, but it's a reasonable assumption that this code is less performant
than the original.
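
One way to read the two listings is as two different loop-control schemes.
The C below is a sketch of that reading, not a statement of what the
vectorizer does internally.  It assumes that a5 in the second listing holds
the vlenb value and that a7 holds twice that (neither setup is shown in the
excerpt), and it models vsetvli as min(remaining, VLMAX).  The function names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the two vle8.v loads plus the vsseg2e8.v store.  */
static void
store_pairs (uint8_t *c, const uint8_t *a, const uint8_t *b,
             size_t off, size_t vl)
{
  for (size_t i = 0; i < vl; ++i)
    {
      c[2 * (off + i)] = a[off + i];
      c[2 * (off + i) + 1] = b[off + i];
    }
}

/* "Before" shape: the vector length returned by vsetvli drives both
   the remaining count and the address updates.  */
void
loop_before (const uint8_t *a, const uint8_t *b, uint8_t *c,
             size_t n, size_t vlmax)
{
  assert (vlmax > 0);
  size_t off = 0;
  while (n != 0)
    {
      size_t vl = n < vlmax ? n : vlmax;  /* vsetvli a5,a3,...  (modelled) */
      store_pairs (c, a, b, off, vl);
      n -= vl;                            /* sub a3,a3,a5 */
      off += vl;                          /* add a0/a1 by vl, a2 by 2*vl */
    }
}

/* "After" shape: the step is a fixed value (assumed to be vlenb) and the
   vector length is computed separately as min(remaining, step).  */
void
loop_after (const uint8_t *a, const uint8_t *b, uint8_t *c,
            size_t n, size_t step)
{
  assert (step > 0);
  if (n == 0)
    return;
  size_t off = 0;
  size_t prev;
  do
    {
      prev = n;                           /* mv a6,a3 */
      size_t vl = n <= step ? n : step;   /* mv a4,a3; bleu; csrr a4,vlenb */
      store_pairs (c, a, b, off, vl);
      n -= step;                          /* sub a3,a3,a5; wraps on the last
                                             iteration, where it is unused */
      off += step;
    }
  while (prev > step);                    /* bgtu a6,a5,.L4 */
}
```

Under these assumptions both functions write the same bytes; the second
simply spends extra instructions recomputing a length that vsetvli would
otherwise hand back directly.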

Compile with "-O3 -ftree-vectorize -std=c99 -march=rv32gcv_zvfh -mabi=ilp32d
-mrvv-vector-bits=scalable -fno-vect-cost-model"



typedef unsigned char uint8_t;
typedef signed char int8_t;
#ifndef TYPE
#define TYPE uint8_t
#define ITYPE int8_t
#endif

void __attribute__ ((noinline, noclone))
g2 (TYPE *__restrict a, TYPE *__restrict b, TYPE *__restrict c, ITYPE n)
{
  for (ITYPE i = 0; i < n; ++i)
    {
      c[i * 2] = a[i];
      c[i * 2 + 1] = b[i];
    }
}

