public inbox for gcc-bugs@sourceware.org
* [Bug c/114887] New: RISC-V: expect M8 but M4 generated with dynamic LMUL for TSVC s319
@ 2024-04-29  9:52 deminhan at gcc dot gnu.org
  2024-04-29 12:58 ` [Bug target/114887] " juzhe.zhong at rivai dot ai
  2024-04-29 13:02 ` juzhe.zhong at rivai dot ai
  0 siblings, 2 replies; 3+ messages in thread
From: deminhan at gcc dot gnu.org @ 2024-04-29  9:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114887

            Bug ID: 114887
           Summary: RISC-V: expect M8 but M4 generated with dynamic LMUL
                    for TSVC s319
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: deminhan at gcc dot gnu.org
  Target Milestone: ---

We expect M8 with the following code and options, but M4 is generated.

-march=rv64gcv_zba_zbb_zvl256b -mabi=lp64d -mrvv-max-lmul=dynamic -O3
-ffast-math

typedef float real_t;
__attribute__((aligned(64))) real_t
a[32000],b[32000],c[32000],d[32000],e[32000],
aa[256][256],bb[256][256],cc[256][256],tt[256][256];

real_t s319()
{
    real_t sum;
    for (int nl = 0; nl < 2*256; nl++) {
        sum = 0.;
        for (int i = 0; i < 32000; i++) {
            a[i] = c[i] + d[i];
            sum += a[i];
            b[i] = c[i] + e[i];
            sum += b[i];
        }
    }
    return sum;
}

Generated asm:
.L2:
        vsetvli t0,zero,e32,m4,ta,ma
        vmv.v.i v12,0
        li      a4,32768
        addi    a4,a4,-768
        mv      a2,t6
        mv      a6,t5
        mv      a3,t4
        mv      a0,t3
        mv      a1,t1
.L3:
        vsetvli a5,a4,e32,m4,tu,ma
        vle32.v v8,0(a1)
        vle32.v v4,0(a0)
        vle32.v v16,0(a6)
        sub     a4,a4,a5
        sh2add  a1,a5,a1
        sh2add  a0,a5,a0
        sh2add  a6,a5,a6
        vfadd.vv        v4,v4,v8
        vfadd.vv        v8,v8,v16
        vse32.v v4,0(a3)
        vfadd.vv        v4,v4,v8
        sh2add  a3,a5,a3
        vfadd.vv        v12,v12,v4
        vse32.v v8,0(a2)
        sh2add  a2,a5,a2
        bne     a4,zero,.L3
        addiw   a7,a7,-1
        bne     a7,zero,.L2
        fmv.s.x fa5,zero
        vsetvli a4,zero,e32,m4,ta,ma
        vfmv.s.f        v1,fa5
        vfredusum.vs    v12,v12,v1
        vfmv.f.s        fa0,v12
        ret
        .cfi_endproc


* [Bug target/114887] RISC-V: expect M8 but M4 generated with dynamic LMUL for TSVC s319
From: juzhe.zhong at rivai dot ai @ 2024-04-29 12:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114887

--- Comment #1 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
The "vect" cost model analysis:

https://godbolt.org/z/qbqzon8x1

note:   Maximum lmul = 8, At most 40 number of live V_REG at program point 6
for bb 3

It seems that we count one variable too many at program point 6?
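
For illustration, the arithmetic behind that note can be sketched as below.
This is not the code in GCC's RISC-V cost model: pick_max_lmul is a
hypothetical helper, and the rule that every live variable occupies one group
of LMUL registers is a simplifying assumption. RVV has 32 vector registers, so
5 live variables at LMUL = 8 would need 40 registers and do not fit, while 4
live variables would need exactly 32.

```c
/* RVV provides 32 architectural vector registers (v0..v31). */
enum { NUM_VREGS = 32 };

/* Hypothetical helper (not a GCC function): return the largest LMUL in
   {8, 4, 2, 1} for which live_vars register groups, each LMUL registers
   wide, still fit in the vector register file.  */
int pick_max_lmul(int live_vars)
{
    for (int lmul = 8; lmul > 1; lmul /= 2) {
        if (live_vars * lmul <= NUM_VREGS)
            return lmul;
    }
    return 1;
}
```

Under these assumptions pick_max_lmul(5) yields 4, which matches the m4 in the
generated vsetvli, and pick_max_lmul(4) yields 8.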


* [Bug target/114887] RISC-V: expect M8 but M4 generated with dynamic LMUL for TSVC s319
From: juzhe.zhong at rivai dot ai @ 2024-04-29 13:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114887

--- Comment #2 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
I think the analysis here is too conservative:

note:   _1: type = float, start = 1, end = 6
note:   _5: type = float, start = 6, end = 8
note:   _3: type = float, start = 3, end = 7
note:   _4: type = float, start = 5, end = 6
note:   _2: type = float, start = 2, end = 3
note:   _28: type = float, start = 7, end = 9
note:   sum_18: type = real_t, start = 9, end = 9
note:   sum_26: type = real_t, start = 0, end = 9

The variables live at point 6 should be:
1. _1
2. _3
3. _4
4. sum_26

So there are 4 live variables in total, and each occupies 8 registers at
LMUL = 8.

The total number of live registers should then be 4 * 8 = 32, which is OK for
picking LMUL = 8.
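
A minimal sketch of the two ways of counting, using only the live ranges
quoted above. The names live_range, ranges and count_live are made up for this
example, and the closed-interval rule is only an assumption about how the 40
could arise, not a statement about what the GCC analysis actually does.
Counting every range with start <= point <= end includes _5 (which is defined
at point 6) and gives 5 variables; not counting a variable at the point where
it is defined gives the 4 listed above.

```c
#include <stddef.h>

/* One live range as printed in the dump: variable name, first and last
   program point.  */
struct live_range {
    const char *name;
    int start;
    int end;
};

/* The ranges quoted in the dump above.  */
static const struct live_range ranges[] = {
    { "_1", 1, 6 },     { "_5", 6, 8 },     { "_3", 3, 7 },
    { "_4", 5, 6 },     { "_2", 2, 3 },     { "_28", 7, 9 },
    { "sum_18", 9, 9 }, { "sum_26", 0, 9 },
};

/* Count the variables live at POINT.  If include_defs is nonzero, a variable
   is counted from its defining point on (start <= point <= end); otherwise it
   is counted only after its definition (start < point <= end).  */
int count_live(int point, int include_defs)
{
    int n = 0;
    for (size_t i = 0; i < sizeof ranges / sizeof ranges[0]; i++) {
        const struct live_range *r = &ranges[i];
        int started = include_defs ? r->start <= point : r->start < point;
        if (started && point <= r->end)
            n++;
    }
    return n;
}
```

With these ranges, count_live(6, 1) is 5 (5 * 8 = 40 registers) and
count_live(6, 0) is 4 (4 * 8 = 32 registers).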
