public inbox for gcc-bugs@sourceware.org
* [Bug c/111451] New: RISC-V: Missed optimization of vrgather.vv into vrgatherei16.vv
@ 2023-09-18  3:24 juzhe.zhong at rivai dot ai
  2023-09-21  8:23 ` [Bug target/111451] " juzhe.zhong at rivai dot ai
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2023-09-18  3:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111451

            Bug ID: 111451
           Summary: RISC-V: Missed optimization of vrgather.vv into
                    vrgatherei16.vv
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

Consider the following case:

#include <stdint.h>
typedef int32_t vnx32si __attribute__ ((vector_size (128)));

#define MASK_2(X, Y) (Y) - 1 - (X), (Y) - 2 - (X)
#define MASK_4(X, Y) MASK_2 (X, Y), MASK_2 (X + 2, Y)
#define MASK_8(X, Y) MASK_4 (X, Y), MASK_4 (X + 4, Y)
#define MASK_16(X, Y) MASK_8 (X, Y), MASK_8 (X + 8, Y)
#define MASK_32(X, Y) MASK_16 (X, Y), MASK_16 (X + 16, Y)
#define MASK_64(X, Y) MASK_32 (X, Y), MASK_32 (X + 32, Y)
#define MASK_128(X, Y) MASK_64 (X, Y), MASK_64 (X + 64, Y)

#define PERMUTE(TYPE, NUNITS)                                                  \
  __attribute__ ((noipa)) void permute_##TYPE (TYPE values1, TYPE values2,     \
                                               TYPE *out)                      \
  {                                                                            \
    TYPE v                                                                     \
      = __builtin_shufflevector (values1, values2, MASK_##NUNITS (0, NUNITS)); \
    *(TYPE *) out = v;                                                         \
  }

#define TEST_ALL(T)                                                            \
  T (vnx32si, 32)                                                              \

TEST_ALL (PERMUTE)

ASM:

permute_vnx32si:
        li      a5,32
        li      a4,31
        vsetvli zero,a5,e32,m8,ta,ma
        vid.v   v8
        vle32.v v24,0(a0)
        vrsub.vx        v8,v8,a4
        vrgather.vv     v16,v24,v8
        vse32.v v16,0(a2)
        ret

https://godbolt.org/z/Mh77YY91r
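
For readers less familiar with the sequence above, here is a minimal scalar
sketch in C of what it computes. It is only a model, not GCC code: the
function names (model_build_index, model_vrgather, check_model_matches_mask)
are made up for illustration. It assumes vid.v writes the element index i,
vrsub.vx computes 31 - i, and vrgather.vv reads vs2[vs1[i]] (0 when the index
is out of range). All shuffle indices are below 32, so only values1 is read,
which matches the single vle32.v from 0(a0).

```c
#include <assert.h>
#include <stdint.h>

/* Mask macros copied from the test case (only the ones needed for 32 lanes). */
#define MASK_2(X, Y) (Y) - 1 - (X), (Y) - 2 - (X)
#define MASK_4(X, Y) MASK_2 (X, Y), MASK_2 (X + 2, Y)
#define MASK_8(X, Y) MASK_4 (X, Y), MASK_4 (X + 4, Y)
#define MASK_16(X, Y) MASK_8 (X, Y), MASK_8 (X + 8, Y)
#define MASK_32(X, Y) MASK_16 (X, Y), MASK_16 (X + 16, Y)

enum { NUNITS = 32 };

/* The shuffle indices the test case passes to __builtin_shufflevector.  */
static const int shuffle_mask[NUNITS] = { MASK_32 (0, NUNITS) };

/* Hypothetical scalar model of "vid.v v8" followed by "vrsub.vx v8,v8,a4"
   with a4 = 31: idx[i] = 31 - i.  */
static void
model_build_index (uint32_t idx[NUNITS])
{
  for (uint32_t i = 0; i < NUNITS; i++)
    idx[i] = (NUNITS - 1) - i;
}

/* Hypothetical scalar model of "vrgather.vv vd,vs2,vs1":
   vd[i] = vs2[vs1[i]], or 0 when the index is out of range.  */
static void
model_vrgather (int32_t dst[NUNITS], const int32_t src[NUNITS],
                const uint32_t idx[NUNITS])
{
  for (int i = 0; i < NUNITS; i++)
    dst[i] = idx[i] < NUNITS ? src[idx[i]] : 0;
}

/* Asserts that the modelled index vector equals the shuffle mask.  */
static void
check_model_matches_mask (void)
{
  uint32_t idx[NUNITS];
  model_build_index (idx);
  for (int i = 0; i < NUNITS; i++)
    assert (idx[i] == (uint32_t) shuffle_mask[i]);
}
```

The largest index in this mask is 31, so every index fits comfortably in a
16-bit element.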

Here we use:

vsetvli zero,a5,e32,m8,ta,ma
...
vrgather.vv     v16,v24,v8

With SEW=32 and LMUL=8, the index vector operand "v8" occupies 8 registers.
We should optimize this into vrgatherei16.vv, which uses 16-bit (int16) index
elements.

With vrgatherei16.vv, the index operand v8 occupies 4 registers instead of 8.
This lowers register consumption and register pressure.
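
The register counts can be checked with a small C sketch of the arithmetic.
It assumes the RVV rule that the index operand of vrgatherei16.vv has EEW=16
and EMUL = (16 / SEW) * LMUL, and it only handles integer LMUL. The helper
names (ei16_index_regs, vv_index_regs, index_fits_ei16) are hypothetical and
not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Registers occupied by the index operand of vrgatherei16.vv, assuming
   EMUL = (16 / SEW) * LMUL.  Returns 0 when EMUL would exceed 8, which is
   a reserved configuration.  Integer LMUL only.  */
static unsigned
ei16_index_regs (unsigned sew_bits, unsigned lmul)
{
  assert (sew_bits == 8 || sew_bits == 16 || sew_bits == 32 || sew_bits == 64);
  assert (lmul == 1 || lmul == 2 || lmul == 4 || lmul == 8);

  unsigned scaled = 16u * lmul;   /* EMUL = scaled / sew_bits */
  if (scaled > 8u * sew_bits)
    return 0;                     /* EMUL > 8: reserved */
  if (scaled < sew_bits)
    return 1;                     /* fractional EMUL still uses one register */
  return scaled / sew_bits;
}

/* For vrgather.vv the index operand uses the same SEW and LMUL as the data,
   so it occupies LMUL registers.  */
static unsigned
vv_index_regs (unsigned lmul)
{
  return lmul;
}

/* 16-bit indices are only usable when the largest index fits in uint16_t.  */
static int
index_fits_ei16 (uint32_t max_index)
{
  return max_index <= UINT16_MAX;
}
```

For the case in this report, ei16_index_regs (32, 8) gives 4 while
vv_index_regs (8) gives 8, and index_fits_ei16 (31) holds.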


Thread overview: 5+ messages
2023-09-18  3:24 [Bug c/111451] New: RISC-V: Missed optimization of vrgather.vv into vrgatherei16.vv juzhe.zhong at rivai dot ai
2023-09-21  8:23 ` [Bug target/111451] " juzhe.zhong at rivai dot ai
2023-09-21  8:28 ` juzhe.zhong at rivai dot ai
2023-09-22  4:20 ` cvs-commit at gcc dot gnu.org
2023-09-22  7:20 ` juzhe.zhong at rivai dot ai