* [PATCH 0/3] RISC-V:Enable basic auto-vectorization for RVV
@ 2023-04-06 14:42 juzhe.zhong
  2023-04-06 14:42 ` [PATCH 1/3] VECT: Add WHILE_LEN pattern to support decrement IV manipulation for loop vectorizer juzhe.zhong
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: juzhe.zhong @ 2023-04-06 14:42 UTC
  To: gcc-patches
  Cc: kito.cheng, palmer, richard.sandiford, rguenther, jeffreyalaw, Juzhe-Zhong

From: Juzhe-Zhong <juzhe.zhong@rivai.ai>

PATCH 1: Add the WHILE_LEN pattern to the loop vectorizer to support a decrementing IV for RVV.
PATCH 2: Enable basic auto-vectorization for RVV in the RISC-V port.
PATCH 3: Add testcases for basic RVV auto-vectorization of the WHILE_LEN pattern, including a single-rgroup test and a multiple-rgroup SLP test.

Juzhe-Zhong (3):
  VECT: Add WHILE_LEN pattern to support decrement IV manipulation for loop vectorizer.
  RISC-V: Enable basic RVV auto-vectorization and support WHILE_LEN/LEN_LOAD/LEN_STORE pattern
  RISC-V: Add testcase for basic RVV auto-vectorization

 gcc/config/riscv/autovec.md | 63 ++
 gcc/config/riscv/riscv-opts.h | 16 +
 gcc/config/riscv/riscv-protos.h | 3 +-
 gcc/config/riscv/riscv-v.cc | 61 +-
 gcc/config/riscv/riscv-vector-switch.def | 47 +-
 gcc/config/riscv/riscv-vsetvl.cc | 210 ++++-
 gcc/config/riscv/riscv-vsetvl.h | 1 +
 gcc/config/riscv/riscv.cc | 34 +-
 gcc/config/riscv/riscv.opt | 40 +
 gcc/config/riscv/vector.md | 6 +-
 gcc/doc/md.texi | 14 +
 gcc/internal-fn.cc | 29 +
 gcc/internal-fn.def | 1 +
 gcc/optabs.def | 1 +
 gcc/testsuite/gcc.target/riscv/rvv/api/vadc.c | 361 ++++++++
 gcc/testsuite/gcc.target/riscv/rvv/api/vadd.c | 713 ++++++++++++++++
 gcc/testsuite/gcc.target/riscv/rvv/api/vand.c | 713 ++++++++++++++++
 .../gcc.target/riscv/rvv/api/vcpop.c | 65 ++
 gcc/testsuite/gcc.target/riscv/rvv/api/vdiv.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vdivu.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vfirst.c | 65 ++
 gcc/testsuite/gcc.target/riscv/rvv/api/vid.c | 185 +++++
 .../gcc.target/riscv/rvv/api/viota.c | 185 +++++
 .../gcc.target/riscv/rvv/api/vle16.c | 105 +++
 .../gcc.target/riscv/rvv/api/vle32.c | 129 +++
 .../gcc.target/riscv/rvv/api/vle64.c | 73 ++
 gcc/testsuite/gcc.target/riscv/rvv/api/vle8.c | 121 +++
 gcc/testsuite/gcc.target/riscv/rvv/api/vlm.c | 37 +
 .../gcc.target/riscv/rvv/api/vloxei16.c | 385 +++++++++
 .../gcc.target/riscv/rvv/api/vloxei32.c | 353 ++++++++
 .../gcc.target/riscv/rvv/api/vloxei64.c | 297 +++++++
 .../gcc.target/riscv/rvv/api/vloxei8.c | 401 +++++++++
 .../gcc.target/riscv/rvv/api/vlse16.c | 105 +++
 .../gcc.target/riscv/rvv/api/vlse32.c | 129 +++
 .../gcc.target/riscv/rvv/api/vlse64.c | 73 ++
 .../gcc.target/riscv/rvv/api/vlse8.c | 121 +++
 .../gcc.target/riscv/rvv/api/vluxei16.c | 385 +++++++++
 .../gcc.target/riscv/rvv/api/vluxei32.c | 353 ++++++++
 .../gcc.target/riscv/rvv/api/vluxei64.c | 297 +++++++
 .../gcc.target/riscv/rvv/api/vluxei8.c | 401 +++++++++
 .../gcc.target/riscv/rvv/api/vmacc.c | 713 ++++++++++++++++
 .../gcc.target/riscv/rvv/api/vmadc.c | 713 ++++++++++++++++
 .../gcc.target/riscv/rvv/api/vmadd.c | 713 ++++++++++++++++
 .../gcc.target/riscv/rvv/api/vmand.c | 37 +
 .../gcc.target/riscv/rvv/api/vmandn.c | 37 +
 gcc/testsuite/gcc.target/riscv/rvv/api/vmax.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vmaxu.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vmclr.c | 37 +
 .../gcc.target/riscv/rvv/api/vmerge.c | 361 ++++++++
 gcc/testsuite/gcc.target/riscv/rvv/api/vmin.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vminu.c | 361 ++++++++
 gcc/testsuite/gcc.target/riscv/rvv/api/vmmv.c | 37 +
 .../gcc.target/riscv/rvv/api/vmnand.c | 37 +
 .../gcc.target/riscv/rvv/api/vmnor.c | 37 +
 .../gcc.target/riscv/rvv/api/vmnot.c | 37 +
 gcc/testsuite/gcc.target/riscv/rvv/api/vmor.c | 37 +
 .../gcc.target/riscv/rvv/api/vmorn.c | 37 +
 .../gcc.target/riscv/rvv/api/vmsbc.c | 713 ++++++++++++++++
 .../gcc.target/riscv/rvv/api/vmsbf.c | 65 ++
 .../gcc.target/riscv/rvv/api/vmseq.c | 713 ++++++++++++++++
 .../gcc.target/riscv/rvv/api/vmset.c | 37 +
 .../gcc.target/riscv/rvv/api/vmsge.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vmsgeu.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vmsgt.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vmsgtu.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vmsif.c | 65 ++
 .../gcc.target/riscv/rvv/api/vmsle.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vmsleu.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vmslt.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vmsltu.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vmsne.c | 713 ++++++++++++++++
 .../gcc.target/riscv/rvv/api/vmsof.c | 65 ++
 gcc/testsuite/gcc.target/riscv/rvv/api/vmul.c | 713 ++++++++++++++++
 .../gcc.target/riscv/rvv/api/vmulh.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vmulhsu.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vmulhu.c | 361 ++++++++
 gcc/testsuite/gcc.target/riscv/rvv/api/vmv.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vmxnor.c | 37 +
 .../gcc.target/riscv/rvv/api/vmxor.c | 37 +
 gcc/testsuite/gcc.target/riscv/rvv/api/vneg.c | 185 +++++
 .../gcc.target/riscv/rvv/api/vnmsac.c | 713 ++++++++++++++++
 .../gcc.target/riscv/rvv/api/vnmsub.c | 713 ++++++++++++++++
 gcc/testsuite/gcc.target/riscv/rvv/api/vnot.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vnsra.c | 249 ++++++
 .../gcc.target/riscv/rvv/api/vnsrl.c | 249 ++++++
 gcc/testsuite/gcc.target/riscv/rvv/api/vor.c | 713 ++++++++++++++++
 gcc/testsuite/gcc.target/riscv/rvv/api/vrem.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vremu.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vrsub.c | 361 ++++++++
 gcc/testsuite/gcc.target/riscv/rvv/api/vsbc.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vse16.c | 105 +++
 .../gcc.target/riscv/rvv/api/vse32.c | 129 +++
 .../gcc.target/riscv/rvv/api/vse64.c | 73 ++
 gcc/testsuite/gcc.target/riscv/rvv/api/vse8.c | 121 +++
 .../gcc.target/riscv/rvv/api/vsetvl.c | 97 +++
 .../gcc.target/riscv/rvv/api/vsetvlmax.c | 97 +++
 .../gcc.target/riscv/rvv/api/vsext_vf2.c | 129 +++
 .../gcc.target/riscv/rvv/api/vsext_vf4.c | 81 ++
 .../gcc.target/riscv/rvv/api/vsext_vf8.c | 41 +
 gcc/testsuite/gcc.target/riscv/rvv/api/vsll.c | 713 ++++++++++++++++
 gcc/testsuite/gcc.target/riscv/rvv/api/vsm.c | 37 +
 .../gcc.target/riscv/rvv/api/vsoxei16.c | 385 +++++++++
 .../gcc.target/riscv/rvv/api/vsoxei32.c | 353 ++++++++
 .../gcc.target/riscv/rvv/api/vsoxei64.c | 297 +++++++
 .../gcc.target/riscv/rvv/api/vsoxei8.c | 401 +++++++++
 gcc/testsuite/gcc.target/riscv/rvv/api/vsra.c | 361 ++++++++
 gcc/testsuite/gcc.target/riscv/rvv/api/vsrl.c | 361 ++++++++
 .../gcc.target/riscv/rvv/api/vsse16.c | 105 +++
 .../gcc.target/riscv/rvv/api/vsse32.c | 129 +++
 .../gcc.target/riscv/rvv/api/vsse64.c | 73 ++
 .../gcc.target/riscv/rvv/api/vsse8.c | 121 +++
 gcc/testsuite/gcc.target/riscv/rvv/api/vsub.c | 713 ++++++++++++++++
 .../gcc.target/riscv/rvv/api/vsuxei16.c | 385 +++++++++
 .../gcc.target/riscv/rvv/api/vsuxei32.c | 353 ++++++++
 .../gcc.target/riscv/rvv/api/vsuxei64.c | 297 +++++++
 .../gcc.target/riscv/rvv/api/vsuxei8.c | 401 +++++++++
 .../gcc.target/riscv/rvv/api/vwadd.c | 489 +++++++++++
 .../gcc.target/riscv/rvv/api/vwaddu.c | 489 +++++++++++
 .../gcc.target/riscv/rvv/api/vwmacc.c | 249 ++++++
 .../gcc.target/riscv/rvv/api/vwmaccsu.c | 249 ++++++
 .../gcc.target/riscv/rvv/api/vwmaccu.c | 249 ++++++
 .../gcc.target/riscv/rvv/api/vwmaccus.c | 129 +++
 .../gcc.target/riscv/rvv/api/vwmul.c | 249 ++++++
 .../gcc.target/riscv/rvv/api/vwmulsu.c | 249 ++++++
 .../gcc.target/riscv/rvv/api/vwmulu.c | 249 ++++++
 .../gcc.target/riscv/rvv/api/vwsub.c | 489 +++++++++++
 .../gcc.target/riscv/rvv/api/vwsubu.c | 489 +++++++++++
 gcc/testsuite/gcc.target/riscv/rvv/api/vxor.c | 713 ++++++++++++++++
 .../gcc.target/riscv/rvv/api/vzext_vf2.c | 129 +++
 .../gcc.target/riscv/rvv/api/vzext_vf4.c | 81 ++
 .../gcc.target/riscv/rvv/api/vzext_vf8.c | 41 +
 .../rvv/autovec/partial/multiple_rgroup-1.c | 6 +
 .../rvv/autovec/partial/multiple_rgroup-1.h | 304 +++++++
 .../rvv/autovec/partial/multiple_rgroup-2.c | 6 +
 .../rvv/autovec/partial/multiple_rgroup-2.h | 546 ++++++++++++
 .../rvv/autovec/partial/multiple_rgroup-2.s | 774 ++++++++++++++++++
 .../autovec/partial/multiple_rgroup_run-1.c | 19 +
 .../autovec/partial/multiple_rgroup_run-2.c | 19 +
 .../rvv/autovec/partial/single_rgroup-1.c | 8 +
 .../rvv/autovec/partial/single_rgroup-1.h | 106 +++
 .../rvv/autovec/partial/single_rgroup_run-1.c | 19 +
 .../gcc.target/riscv/rvv/autovec/template-1.h | 68 ++
 .../gcc.target/riscv/rvv/autovec/v-1.c | 4 +
 .../gcc.target/riscv/rvv/autovec/v-2.c | 6 +
 .../gcc.target/riscv/rvv/autovec/zve32f-1.c | 4 +
 .../gcc.target/riscv/rvv/autovec/zve32f-2.c | 5 +
 .../riscv/rvv/autovec/zve32f_zvl128b-1.c | 4 +
 .../riscv/rvv/autovec/zve32f_zvl128b-2.c | 6 +
 .../gcc.target/riscv/rvv/autovec/zve32x-1.c | 4 +
 .../gcc.target/riscv/rvv/autovec/zve32x-2.c | 6 +
 .../riscv/rvv/autovec/zve32x_zvl128b-1.c | 5 +
 .../riscv/rvv/autovec/zve32x_zvl128b-2.c | 6 +
 .../gcc.target/riscv/rvv/autovec/zve64d-1.c | 4 +
 .../gcc.target/riscv/rvv/autovec/zve64d-2.c | 4 +
 .../riscv/rvv/autovec/zve64d_zvl128b-1.c | 4 +
 .../riscv/rvv/autovec/zve64d_zvl128b-2.c | 6 +
 .../gcc.target/riscv/rvv/autovec/zve64f-1.c | 4 +
 .../gcc.target/riscv/rvv/autovec/zve64f-2.c | 4 +
 .../riscv/rvv/autovec/zve64f_zvl128b-1.c | 4 +
 .../riscv/rvv/autovec/zve64f_zvl128b-2.c | 6 +
 .../gcc.target/riscv/rvv/autovec/zve64x-1.c | 4 +
 .../gcc.target/riscv/rvv/autovec/zve64x-2.c | 4 +
 .../riscv/rvv/autovec/zve64x_zvl128b-1.c | 4 +
 .../riscv/rvv/autovec/zve64x_zvl128b-2.c | 6 +
 gcc/testsuite/gcc.target/riscv/rvv/rvv.exp | 16 +
 .../gcc.target/riscv/rvv/vsetvl/vsetvl-17.c | 2 +-
 gcc/tree-ssa-loop-manip.cc | 4 +-
 gcc/tree-ssa-loop-manip.h | 2 +-
 gcc/tree-vect-loop-manip.cc | 186 ++++-
 gcc/tree-vect-loop.cc | 35 +-
 gcc/tree-vect-stmts.cc | 9 +-
 gcc/tree-vectorizer.h | 4 +-
 172 files changed, 36786 insertions(+), 46 deletions(-)
 create mode 100644 gcc/config/riscv/autovec.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vadc.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vadd.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vand.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vcpop.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vdiv.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vdivu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vfirst.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vid.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/viota.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vle16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vle32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vle64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vle8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vlm.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vloxei16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vloxei32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vloxei64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vloxei8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vlse16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vlse32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vlse64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vlse8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vluxei16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vluxei32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vluxei64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vluxei8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmacc.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmadc.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmadd.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmand.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmandn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmax.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmaxu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmclr.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmerge.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmin.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vminu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmmv.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmnand.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmnor.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmnot.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmor.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmorn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmsbc.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmsbf.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmseq.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmset.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmsge.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmsgeu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmsgt.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmsgtu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmsif.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmsle.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmsleu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmslt.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmsltu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmsne.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmsof.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmul.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmulh.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmulhsu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmulhu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmv.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmxnor.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vmxor.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vneg.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vnmsac.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vnmsub.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vnot.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vnsra.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vnsrl.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vor.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vrem.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vremu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vrsub.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsbc.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vse16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vse32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vse64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vse8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsetvl.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsetvlmax.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsext_vf2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsext_vf4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsext_vf8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsll.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsm.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsoxei16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsoxei32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsoxei64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsoxei8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsra.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsrl.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsse16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsse32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsse64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsse8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsub.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsuxei16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsuxei32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsuxei64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vsuxei8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vwadd.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vwaddu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vwmacc.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vwmaccsu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vwmaccu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vwmaccus.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vwmul.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vwmulsu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vwmulu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vwsub.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vwsubu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vxor.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vzext_vf2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vzext_vf4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/api/vzext_vf8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.s
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/template-1.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/v-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/v-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-2.c

--
2.36.3
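
For readers new to the series, the decrement-IV scheme that PATCH 1 introduces can be modelled in plain scalar C. This is only a minimal sketch, not the vectorizer's implementation: the helper names while_len and copy_with_decrement_iv are hypothetical, and vf stands in for the vectorization factor. It assumes WHILE_LEN behaves as MIN (remaining, vf), as the md.texi text in PATCH 1 states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of WHILE_LEN: the number of elements to
   process this iteration, i.e. MIN (remaining, vf).  */
static size_t
while_len (size_t remaining, size_t vf)
{
  return remaining < vf ? remaining : vf;
}

/* Copy N ints from SRC to DEST in chunks of at most VF elements.
   The loop is driven by a decrementing IV (REMAINING) rather than by
   an index counting up from 0.  Returns the number of iterations.  */
static size_t
copy_with_decrement_iv (int *dest, const int *src, size_t n, size_t vf)
{
  assert (vf != 0);
  size_t iterations = 0;
  size_t remaining = n;
  while (remaining != 0)
    {
      size_t len = while_len (remaining, vf);
      /* Stand-in for a length-controlled vector load and store.  */
      for (size_t i = 0; i < len; i++)
        dest[i] = src[i];
      src += len;
      dest += len;
      /* LEN never exceeds REMAINING, so this cannot wrap below 0.  */
      remaining -= len;
      iterations++;
    }
  return iterations;
}
```

With n = 10 and vf = 4 the chunk lengths are 4, 4 and 2, so the loop runs three times and exits when the IV reaches exactly 0. Only the last iteration can be partial, which is why a single instruction computing MIN (remaining, vf) is enough to control the loop.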
* [PATCH 1/3] VECT: Add WHILE_LEN pattern to support decrement IV manipulation for loop vectorizer.
  2023-04-06 14:42 [PATCH 0/3] RISC-V:Enable basic auto-vectorization for RVV juzhe.zhong
@ 2023-04-06 14:42 ` juzhe.zhong
  2023-04-06 14:42 ` [PATCH 2/3] RISC-V: Enable basic RVV auto-vectorization and support WHILE_LEN/LEN_LOAD/LEN_STORE pattern juzhe.zhong
  2023-04-06 14:42 ` [PATCH] RISC-V: Add RVV auto-vectorization testcase juzhe.zhong
  2 siblings, 0 replies; 7+ messages in thread
From: juzhe.zhong @ 2023-04-06 14:42 UTC
  To: gcc-patches
  Cc: kito.cheng, palmer, richard.sandiford, rguenther, jeffreyalaw, Juzhe-Zhong

From: Juzhe-Zhong <juzhe.zhong@rivai.ai>

This patch adds the WHILE_LEN pattern.
It is inspired by the simple "vvaddint32.s" example in the RVV ISA spec:
https://github.com/riscv/riscv-v-spec/blob/master/example/vvaddint32.s

More details are in the implementation of and comments on
"vect_set_loop_controls_by_while_len".

Consider the following case:

#define N 16
int src[N];
int dest[N];

void
foo (int n)
{
  for (int i = 0; i < n; i++)
    dest[i] = src[i];
}

-march=rv64gcv -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -fno-tree-loop-distribute-patterns:

foo:
        ble     a0,zero,.L1
        lui     a4,%hi(.LANCHOR0)
        addi    a4,a4,%lo(.LANCHOR0)
        addi    a3,a4,64
        csrr    a2,vlenb
.L3:
        vsetvli a5,a0,e32,m1,ta,ma
        vle32.v v1,0(a4)
        sub     a0,a0,a5
        vse32.v v1,0(a3)
        add     a4,a4,a2
        add     a3,a3,a2
        bne     a0,zero,.L3
.L1:
        ret

We also support multiple rgroups for SLP; more testcases are in
gcc/testsuite/gcc.target/riscv/rvv/autovec.

gcc/ChangeLog:

        * doc/md.texi: Add while_len support.
        * internal-fn.cc (while_len_direct): Ditto.
        (expand_while_len_optab_fn): Ditto.
        (direct_while_len_optab_supported_p): Ditto.
        * internal-fn.def (WHILE_LEN): Ditto.
        * optabs.def (OPTAB_D): Ditto.
        * tree-ssa-loop-manip.cc (create_iv): Ditto.
        * tree-ssa-loop-manip.h (create_iv): Ditto.
        * tree-vect-loop-manip.cc (vect_set_loop_controls_by_while_len): New function.
        (vect_set_loop_condition_partial_vectors): Add while_len support.
        * tree-vect-loop.cc (vect_get_loop_len): Ditto.
        * tree-vect-stmts.cc (vectorizable_store): Ditto.
        (vectorizable_load): Ditto.
        * tree-vectorizer.h (vect_get_loop_len): Ditto.

---
 gcc/doc/md.texi             |  14 +++
 gcc/internal-fn.cc          |  29 ++++++
 gcc/internal-fn.def         |   1 +
 gcc/optabs.def              |   1 +
 gcc/tree-ssa-loop-manip.cc  |   4 +-
 gcc/tree-ssa-loop-manip.h   |   2 +-
 gcc/tree-vect-loop-manip.cc | 186 ++++++++++++++++++++++++++++++++++--
 gcc/tree-vect-loop.cc       |  35 +++++--
 gcc/tree-vect-stmts.cc      |   9 +-
 gcc/tree-vectorizer.h       |   4 +-
 10 files changed, 264 insertions(+), 21 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 8e3113599fd..72178ab014c 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -4965,6 +4965,20 @@ for (i = 1; i < operand3; i++)
   operand0[i] = operand0[i - 1] && (operand1 + i < operand2);
 @end smallexample

+@cindex @code{while_len@var{m}@var{n}} instruction pattern
+@item @code{while_len@var{m}@var{n}}
+Set operand 0 to the number of active elements in vector will be updated value.
+operand 1 is the total elements need to be updated value.
+operand 2 is the vectorization factor.
+The operation is equivalent to:
+
+@smallexample
+operand0 = MIN (operand1, operand2);
+operand2 can be const_poly_int or poly_int related to vector mode size.
+Some target like RISC-V has a standalone instruction to get MIN (n, MODE SIZE) so
+that we can reduce a use of general purpose register.
+@end smallexample
+
 @cindex @code{check_raw_ptrs@var{m}} instruction pattern
 @item @samp{check_raw_ptrs@var{m}}
 Check whether, given two pointers @var{a} and @var{b} and a length @var{len},
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 6e81dc05e0e..5f44def90d3 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -127,6 +127,7 @@ init_internal_fns ()
 #define cond_binary_direct { 1, 1, true }
 #define cond_ternary_direct { 1, 1, true }
 #define while_direct { 0, 2, false }
+#define while_len_direct { 0, 0, false }
 #define fold_extract_direct { 2, 2, false }
 #define fold_left_direct { 1, 1, false }
 #define mask_fold_left_direct { 1, 1, false }
@@ -3702,6 +3703,33 @@ expand_while_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
     emit_move_insn (lhs_rtx, ops[0].value);
 }

+/* Expand WHILE_LEN call STMT using optab OPTAB. */
+static void
+expand_while_len_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
+{
+  expand_operand ops[3];
+  tree rhs_type[2];
+
+  tree lhs = gimple_call_lhs (stmt);
+  tree lhs_type = TREE_TYPE (lhs);
+  rtx lhs_rtx = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+  create_output_operand (&ops[0], lhs_rtx, TYPE_MODE (lhs_type));
+
+  for (unsigned int i = 0; i < gimple_call_num_args (stmt); ++i)
+    {
+      tree rhs = gimple_call_arg (stmt, i);
+      rhs_type[i] = TREE_TYPE (rhs);
+      rtx rhs_rtx = expand_normal (rhs);
+      create_input_operand (&ops[i + 1], rhs_rtx, TYPE_MODE (rhs_type[i]));
+    }
+
+  insn_code icode = direct_optab_handler (optab, TYPE_MODE (rhs_type[0]));
+
+  expand_insn (icode, 3, ops);
+  if (!rtx_equal_p (lhs_rtx, ops[0].value))
+    emit_move_insn (lhs_rtx, ops[0].value);
+}
+
 /* Expand a call to a convert-like optab using the operands in STMT.
    FN has a single output operand and NARGS input operands. */

@@ -3843,6 +3871,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 #define direct_scatter_store_optab_supported_p convert_optab_supported_p
 #define direct_len_store_optab_supported_p direct_optab_supported_p
 #define direct_while_optab_supported_p convert_optab_supported_p
+#define direct_while_len_optab_supported_p direct_optab_supported_p
 #define direct_fold_extract_optab_supported_p direct_optab_supported_p
 #define direct_fold_left_optab_supported_p direct_optab_supported_p
 #define direct_mask_fold_left_optab_supported_p direct_optab_supported_p
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 7fe742c2ae7..3a933abff5d 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -153,6 +153,7 @@ DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set)
 DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store)

 DEF_INTERNAL_OPTAB_FN (WHILE_ULT, ECF_CONST | ECF_NOTHROW, while_ult, while)
+DEF_INTERNAL_OPTAB_FN (WHILE_LEN, ECF_CONST | ECF_NOTHROW, while_len, while_len)
 DEF_INTERNAL_OPTAB_FN (CHECK_RAW_PTRS, ECF_CONST | ECF_NOTHROW,
                        check_raw_ptrs, check_ptrs)
 DEF_INTERNAL_OPTAB_FN (CHECK_WAR_PTRS, ECF_CONST | ECF_NOTHROW,
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 695f5911b30..f5938bd2c24 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -476,3 +476,4 @@ OPTAB_DC (vec_series_optab, "vec_series$a", VEC_SERIES)
 OPTAB_D (vec_shl_insert_optab, "vec_shl_insert_$a")
 OPTAB_D (len_load_optab, "len_load_$a")
 OPTAB_D (len_store_optab, "len_store_$a")
+OPTAB_D (while_len_optab, "while_len$a")
diff --git a/gcc/tree-ssa-loop-manip.cc b/gcc/tree-ssa-loop-manip.cc
index 09acc1c94cc..cdbf280e249 100644
--- a/gcc/tree-ssa-loop-manip.cc
+++ b/gcc/tree-ssa-loop-manip.cc
@@ -59,14 +59,14 @@ static bitmap_obstack loop_renamer_obstack;
 void
 create_iv (tree base, tree step, tree var, class loop *loop,
            gimple_stmt_iterator *incr_pos, bool after,
-           tree *var_before, tree *var_after)
+           tree *var_before, tree *var_after, enum tree_code code)
 {
   gassign *stmt;
   gphi *phi;
   tree initial, step1;
   gimple_seq stmts;
   tree vb, va;
-  enum tree_code incr_op = PLUS_EXPR;
+  enum tree_code incr_op = code;
   edge pe = loop_preheader_edge (loop);

   if (var != NULL_TREE)
diff --git a/gcc/tree-ssa-loop-manip.h b/gcc/tree-ssa-loop-manip.h
index d49273a3987..da755320a3a 100644
--- a/gcc/tree-ssa-loop-manip.h
+++ b/gcc/tree-ssa-loop-manip.h
@@ -23,7 +23,7 @@ along with GCC; see the file COPYING3. If not see
 typedef void (*transform_callback)(class loop *, void *);

 extern void create_iv (tree, tree, tree, class loop *, gimple_stmt_iterator *,
-                       bool, tree *, tree *);
+                       bool, tree *, tree *, enum tree_code = PLUS_EXPR);
 extern void rewrite_into_loop_closed_ssa (bitmap, unsigned);
 extern void verify_loop_closed_ssa (bool, class loop * = NULL);

diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index f60fa50e8f4..f3cd6c51d2e 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -682,6 +682,173 @@ vect_set_loop_controls_directly (class loop *loop, loop_vec_info loop_vinfo,
   return next_ctrl;
 }

+/* Helper for vect_set_loop_condition_partial_vectors. Generate definitions
+   for all the rgroup controls in RGC and return a control that is nonzero
+   when the loop needs to iterate. Add any new preheader statements to
+   PREHEADER_SEQ. Use LOOP_COND_GSI to insert code before the exit gcond.
+
+   RGC belongs to loop LOOP. The loop originally iterated NITERS
+   times and has been vectorized according to LOOP_VINFO.
+
+   Unlike vect_set_loop_controls_directly which is iterating from 0-based IV
+   to TEST_LIMIT - bias.
+
+   In vect_set_loop_controls_by_while_len, we are iterating from start at
+   IV = TEST_LIMIT - bias and keep subtract IV by the length calculated by
+   IFN_WHILE_LEN pattern.
+
+   Note: the cost of the code generated by this function is modeled
+   by vect_estimate_min_profitable_iters, so changes here may need
+   corresponding changes there.
+
+   1. Single rgroup, the Gimple IR should be:
+
+        <bb 3>
+        _19 = (unsigned long) n_5(D);
+        ...
+
+        <bb 4>:
+        ...
+        # ivtmp_20 = PHI <ivtmp_21(4), _19(3)>
+        ...
+        _22 = .WHILE_LEN (ivtmp_20, vf);
+        ...
+        vector statement (use _22);
+        ...
+        ivtmp_21 = ivtmp_20 - _22;
+        ...
+        if (ivtmp_21 != 0)
+          goto <bb 4>; [75.00%]
+        else
+          goto <bb 5>; [25.00%]
+
+        <bb 5>
+        return;
+
+   Note: IFN_WHILE_LEN will guarantee "ivtmp_21 = ivtmp_20 - _22" never
+   underflow 0.
+
+   2. Multiple rgroup, the Gimple IR should be:
+
+        <bb 3>
+        _70 = (unsigned long) bnd.7_52;
+        _71 = _70 * 2;
+        _72 = MAX_EXPR <_71, 4>;
+        _73 = _72 + 18446744073709551612;
+        ...
+
+        <bb 4>:
+        ...
+        # ivtmp_74 = PHI <ivtmp_75(6), _73(12)>
+        # ivtmp_77 = PHI <ivtmp_78(6), _71(12)>
+        _76 = .WHILE_LEN (ivtmp_74, vf * nitems_per_ctrl);
+        _79 = .WHILE_LEN (ivtmp_77, vf * nitems_per_ctrl);
+        ...
+        vector statement (use _79);
+        ...
+        vector statement (use _76);
+        ...
+        _65 = _79 / 2;
+        vector statement (use _65);
+        ...
+        _68 = _76 / 2;
+        vector statement (use _68);
+        ...
+        ivtmp_78 = ivtmp_77 - _79;
+        ivtmp_75 = ivtmp_74 - _76;
+        ...
+        if (ivtmp_78 != 0)
+          goto <bb 4>; [75.00%]
+        else
+          goto <bb 5>; [25.00%]
+
+        <bb 5>
+        return;
+
+*/
+
+static tree
+vect_set_loop_controls_by_while_len (class loop *loop, loop_vec_info loop_vinfo,
+                                     gimple_seq *preheader_seq,
+                                     gimple_seq *header_seq,
+                                     rgroup_controls *rgc, tree niters)
+{
+  tree compare_type = LOOP_VINFO_RGROUP_COMPARE_TYPE (loop_vinfo);
+  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
+  /* We are not allowing masked approach in WHILE_LEN. */
+  gcc_assert (!LOOP_VINFO_FULLY_MASKED_P (loop_vinfo));
+
+  tree ctrl_type = rgc->type;
+  unsigned int nitems_per_iter = rgc->max_nscalars_per_iter * rgc->factor;
+  poly_uint64 nitems_per_ctrl = TYPE_VECTOR_SUBPARTS (ctrl_type) * rgc->factor;
+  poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
+
+  /* Calculate the maximum number of item values that the rgroup
+     handles in total, the number that it handles for each iteration
+     of the vector loop. */
+  tree nitems_total = niters;
+  if (nitems_per_iter != 1)
+    {
+      /* We checked before setting LOOP_VINFO_USING_PARTIAL_VECTORS_P that
+         these multiplications don't overflow. */
+      tree compare_factor = build_int_cst (compare_type, nitems_per_iter);
+      nitems_total = gimple_build (preheader_seq, MULT_EXPR, compare_type,
+                                   nitems_total, compare_factor);
+    }
+
+  /* Convert the comparison value to the IV type (either a no-op or
+     a promotion). */
+  nitems_total = gimple_convert (preheader_seq, iv_type, nitems_total);
+
+  /* Create an induction variable that counts the number of items
+     processed. */
+  tree index_before_incr, index_after_incr;
+  gimple_stmt_iterator incr_gsi;
+  bool insert_after;
+  standard_iv_increment_position (loop, &incr_gsi, &insert_after);
+
+  /* Test the decremented IV, which will never underflow 0 since we have
+     IFN_WHILE_LEN to guarantee that. */
+  tree test_limit = nitems_total;
+
+  /* Provide a definition of each control in the group. */
+  tree ctrl;
+  unsigned int i;
+  FOR_EACH_VEC_ELT_REVERSE (rgc->controls, i, ctrl)
+    {
+      /* Previous controls will cover BIAS items. This control covers the
+         next batch. */
+      poly_uint64 bias = nitems_per_ctrl * i;
+      tree bias_tree = build_int_cst (iv_type, bias);
+
+      /* Rather than have a new IV that starts at TEST_LIMIT and goes down to
+         BIAS, prefer to use the same TEST_LIMIT - BIAS based IV for each
+         control and adjust the bound down by BIAS. */
+      tree this_test_limit = test_limit;
+      if (i != 0)
+        {
+          this_test_limit = gimple_build (preheader_seq, MAX_EXPR, iv_type,
+                                          this_test_limit, bias_tree);
+          this_test_limit = gimple_build (preheader_seq, MINUS_EXPR, iv_type,
+                                          this_test_limit, bias_tree);
+        }
+
+      /* Create decrement IV. */
+      create_iv (this_test_limit, ctrl, NULL_TREE, loop, &incr_gsi,
+                 insert_after, &index_before_incr, &index_after_incr,
+                 MINUS_EXPR);
+
+      poly_uint64 final_vf = vf * nitems_per_iter;
+      tree vf_step = build_int_cst (iv_type, final_vf);
+      tree res_len = gimple_build (header_seq, IFN_WHILE_LEN, iv_type,
+                                   index_before_incr, vf_step);
+      gassign *assign = gimple_build_assign (ctrl, res_len);
+      gimple_seq_add_stmt (header_seq, assign);
+    }
+
+  return index_after_incr;
+}
+
 /* Set up the iteration condition and rgroup controls for LOOP, given
    that LOOP_VINFO_USING_PARTIAL_VECTORS_P is true for the vectorized
    loop. LOOP_VINFO describes the vectorization of LOOP. NITERS is
@@ -703,6 +870,7 @@ vect_set_loop_condition_partial_vectors (class loop *loop,

   bool use_masks_p = LOOP_VINFO_FULLY_MASKED_P (loop_vinfo);
   tree compare_type = LOOP_VINFO_RGROUP_COMPARE_TYPE (loop_vinfo);
+  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
   unsigned int compare_precision = TYPE_PRECISION (compare_type);
   tree orig_niters = niters;

@@ -757,12 +925,18 @@ vect_set_loop_condition_partial_vectors (class loop *loop,
         bool might_wrap_p = vect_rgroup_iv_might_wrap_p (loop_vinfo, rgc);

         /* Set up all controls for this group. */
-        test_ctrl = vect_set_loop_controls_directly (loop, loop_vinfo,
-                                                     &preheader_seq,
-                                                     &header_seq,
-                                                     loop_cond_gsi, rgc,
-                                                     niters, niters_skip,
-                                                     might_wrap_p);
+        if (direct_internal_fn_supported_p (IFN_WHILE_LEN, iv_type,
+                                            OPTIMIZE_FOR_SPEED))
+          test_ctrl
+            = vect_set_loop_controls_by_while_len (loop, loop_vinfo,
+                                                   &preheader_seq, &header_seq,
+                                                   rgc, niters);
+        else
+          test_ctrl
+            = vect_set_loop_controls_directly (loop, loop_vinfo, &preheader_seq,
+                                               &header_seq, loop_cond_gsi, rgc,
+                                               niters, niters_skip,
+                                               might_wrap_p);
       }

   /* Emit all accumulated statements. */
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 1ba9f18d73e..5bffd9a6322 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -10360,12 +10360,14 @@ vect_record_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens,
    rgroup that operates on NVECTORS vectors, where 0 <= INDEX < NVECTORS. */

 tree
-vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens,
-                   unsigned int nvectors, unsigned int index)
+vect_get_loop_len (gimple_stmt_iterator *gsi, loop_vec_info loop_vinfo,
+                   vec_loop_lens *lens, unsigned int nvectors, tree vectype,
+                   unsigned int index)
 {
   rgroup_controls *rgl = &(*lens)[nvectors - 1];
-  bool use_bias_adjusted_len =
-    LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo) != 0;
+  bool use_bias_adjusted_len
+    = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo) != 0;
+  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);

   /* Populate the rgroup's len array, if this is the first time we've
      used it.
*/ @@ -10386,8 +10388,8 @@ vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens, if (use_bias_adjusted_len) { gcc_assert (i == 0); - tree adjusted_len = - make_temp_ssa_name (len_type, NULL, "adjusted_loop_len"); + tree adjusted_len + = make_temp_ssa_name (len_type, NULL, "adjusted_loop_len"); SSA_NAME_DEF_STMT (adjusted_len) = gimple_build_nop (); rgl->bias_adjusted_ctrl = adjusted_len; } @@ -10396,6 +10398,27 @@ vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens, if (use_bias_adjusted_len) return rgl->bias_adjusted_ctrl; + else if (direct_internal_fn_supported_p (IFN_WHILE_LEN, iv_type, + OPTIMIZE_FOR_SPEED)) + { + tree loop_len = rgl->controls[index]; + poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type); + poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype); + if (maybe_ne (nunits1, nunits2)) + { + /* A loop len for data type X can be reused for data type Y + if X has N times more elements than Y and if Y's elements + are N times bigger than X's. */ + gcc_assert (multiple_p (nunits1, nunits2)); + unsigned int factor = exact_div (nunits1, nunits2).to_constant (); + gimple_seq seq = NULL; + loop_len = gimple_build (&seq, RDIV_EXPR, iv_type, loop_len, + build_int_cst (iv_type, factor)); + if (seq) + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); + } + return loop_len; + } else return rgl->controls[index]; } diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index efa2d0daa52..708c8a1d806 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -8653,8 +8653,9 @@ vectorizable_store (vec_info *vinfo, else if (loop_lens) { tree final_len - = vect_get_loop_len (loop_vinfo, loop_lens, - vec_num * ncopies, vec_num * j + i); + = vect_get_loop_len (gsi, loop_vinfo, loop_lens, + vec_num * ncopies, vectype, + vec_num * j + i); tree ptr = build_int_cst (ref_type, align * BITS_PER_UNIT); machine_mode vmode = TYPE_MODE (vectype); opt_machine_mode new_ovmode @@ -10009,8 +10010,8 @@ vectorizable_load (vec_info *vinfo, else if 
(loop_lens && memory_access_type != VMAT_INVARIANT) { tree final_len - = vect_get_loop_len (loop_vinfo, loop_lens, - vec_num * ncopies, + = vect_get_loop_len (gsi, loop_vinfo, loop_lens, + vec_num * ncopies, vectype, vec_num * j + i); tree ptr = build_int_cst (ref_type, align * BITS_PER_UNIT); diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index 9cf2fb23fe3..e5cf38caf4b 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -2293,8 +2293,8 @@ extern tree vect_get_loop_mask (gimple_stmt_iterator *, vec_loop_masks *, unsigned int, tree, unsigned int); extern void vect_record_loop_len (loop_vec_info, vec_loop_lens *, unsigned int, tree, unsigned int); -extern tree vect_get_loop_len (loop_vec_info, vec_loop_lens *, unsigned int, - unsigned int); +extern tree vect_get_loop_len (gimple_stmt_iterator *, loop_vec_info, + vec_loop_lens *, unsigned int, tree, unsigned int); extern gimple_seq vect_gen_len (tree, tree, tree, tree); extern stmt_vec_info info_for_reduction (vec_info *, stmt_vec_info); extern bool reduction_fn_for_scalar_code (code_helper, internal_fn *); -- 2.36.3 ^ permalink raw reply [flat|nested] 7+ messages in thread
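For readers following PATCH 1/3, the single-rgroup loop shape described in the comment above vect_set_loop_controls_by_while_len can be modelled in plain C. This is only a sketch under stated assumptions: `while_len` is a hypothetical scalar stand-in that returns min (remaining, vf), which matches the umin fallback in expand_while_len from PATCH 2/3; a real target may pick a different value as long as it never exceeds the remaining count. The function names and the element-wise "add one" body are illustrative and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of .WHILE_LEN (remaining, vf): the number of
   elements to process in this iteration, modelled as min (remaining, vf).  */
static size_t
while_len (size_t remaining, size_t vf)
{
  return remaining < vf ? remaining : vf;
}

/* Model of the single-rgroup loop: dst[i] = src[i] + 1 for N elements,
   processed in chunks of at most VF.  Returns the number of loop
   iterations executed.  */
static size_t
add_one_by_len (int *dst, const int *src, size_t n, size_t vf)
{
  size_t iterations = 0;
  size_t remaining = n;   /* Plays the role of ivtmp_20: counts down from n.  */
  size_t done = 0;        /* Offset of the first unprocessed element.  */

  /* The GIMPLE in the patch tests at the bottom of the loop; testing at the
     top here additionally covers n == 0.  */
  while (remaining != 0)
    {
      size_t len = while_len (remaining, vf);   /* _22 */

      /* len <= remaining is what keeps the decrement from wrapping below 0;
         len != 0 is what guarantees progress.  */
      assert (len != 0 && len <= remaining);

      /* "vector statement (use _22)": operate on LEN elements.  */
      for (size_t i = 0; i < len; i++)
        dst[done + i] = src[done + i] + 1;

      done += len;
      remaining -= len;                         /* ivtmp_21 = ivtmp_20 - _22 */
      iterations++;
    }
  return iterations;
}
```

With n = 10 and vf = 4 the model processes chunks of 4, 4 and 2 elements, so the final partial chunk needs no separate epilogue loop.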
* [PATCH 2/3] RISC-V: Enable basic RVV auto-vectorization and support WHILE_LEN/LEN_LOAD/LEN_STORE pattern
From: juzhe.zhong @ 2023-04-06 14:42 UTC
To: gcc-patches
Cc: kito.cheng, palmer, richard.sandiford, rguenther, jeffreyalaw, Juzhe-Zhong

From: Juzhe-Zhong <juzhe.zhong@rivai.ai>

gcc/ChangeLog:

	* config/riscv/riscv-opts.h (enum riscv_autovec_preference_enum): Add compile option for RVV auto-vectorization.
	(enum riscv_autovec_lmul_enum): Ditto.
	* config/riscv/riscv-protos.h (get_vector_mode): Remove unused global function.
	(preferred_simd_mode): Enable basic auto-vectorization for RVV.
	(expand_while_len): Enable while_len pattern.
	* config/riscv/riscv-v.cc (get_avl_type_rtx): Ditto.
	(autovec_use_vlmax_p): New function.
	(preferred_simd_mode): New function.
	(expand_while_len): Ditto.
	* config/riscv/riscv-vector-switch.def (ENTRY): Disable SEW = 64 for MIN_VLEN > 32 but EEW = 32.
	* config/riscv/riscv-vsetvl.cc (get_all_successors): New function.
	(get_all_overlap_blocks): Ditto.
	(local_eliminate_vsetvl_insn): Ditto.
	(vector_insn_info::skip_avl_compatible_p): Ditto.
	(vector_insn_info::merge): Ditto.
	(pass_vsetvl::compute_local_backward_infos): Enhance VSETVL PASS for RVV auto-vectorization.
	(pass_vsetvl::global_eliminate_vsetvl_p): Ditto.
	(pass_vsetvl::cleanup_insns): Ditto.
	* config/riscv/riscv-vsetvl.h: Ditto.
	* config/riscv/riscv.cc (riscv_convert_vector_bits): Add basic RVV auto-vectorization support.
	(riscv_preferred_simd_mode): Ditto.
	(TARGET_VECTORIZE_PREFERRED_SIMD_MODE): Ditto.
	* config/riscv/riscv.opt: Add compile option.
* config/riscv/vector.md: Add RVV auto-vectorization. * config/riscv/autovec.md: New file. --- gcc/config/riscv/autovec.md | 63 +++++++ gcc/config/riscv/riscv-opts.h | 16 ++ gcc/config/riscv/riscv-protos.h | 3 +- gcc/config/riscv/riscv-v.cc | 61 ++++++- gcc/config/riscv/riscv-vector-switch.def | 47 +++-- gcc/config/riscv/riscv-vsetvl.cc | 210 ++++++++++++++++++++++- gcc/config/riscv/riscv-vsetvl.h | 1 + gcc/config/riscv/riscv.cc | 34 +++- gcc/config/riscv/riscv.opt | 40 +++++ gcc/config/riscv/vector.md | 6 +- 10 files changed, 457 insertions(+), 24 deletions(-) create mode 100644 gcc/config/riscv/autovec.md diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md new file mode 100644 index 00000000000..ff616d81586 --- /dev/null +++ b/gcc/config/riscv/autovec.md @@ -0,0 +1,63 @@ +;; Machine description for auto-vectorization using RVV for GNU compiler. +;; Copyright (C) 2023-2023 Free Software Foundation, Inc. +;; Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. + +;; This file is part of GCC. + +;; GCC is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. + +;; GCC is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; <http://www.gnu.org/licenses/>. 
+ +;; ========================================================================= +;; == While_len +;; ========================================================================= + +(define_expand "while_len<mode>" + [(match_operand:P 0 "register_operand") + (match_operand:P 1 "vector_length_operand") + (match_operand:P 2 "")] + "TARGET_VECTOR" +{ + riscv_vector::expand_while_len (operands); + DONE; +}) + +;; ========================================================================= +;; == Loads/Stores +;; ========================================================================= + +;; len_load/len_store is sub-optimal pattern for RVV auto-vectorization support. +;; We will replace them when len_maskload/len_maskstore is supported in loop vectorizer. +(define_expand "len_load_<mode>" + [(match_operand:V 0 "register_operand") + (match_operand:V 1 "memory_operand") + (match_operand 2 "vector_length_operand") + (match_operand 3 "const_0_operand")] + "TARGET_VECTOR" +{ + riscv_vector::emit_nonvlmax_op (code_for_pred_mov (<MODE>mode), operands[0], + operands[1], operands[2], <VM>mode); + DONE; +}) + +(define_expand "len_store_<mode>" + [(match_operand:V 0 "memory_operand") + (match_operand:V 1 "register_operand") + (match_operand 2 "vector_length_operand") + (match_operand 3 "const_0_operand")] + "TARGET_VECTOR" +{ + riscv_vector::emit_nonvlmax_op (code_for_pred_mov (<MODE>mode), operands[0], + operands[1], operands[2], <VM>mode); + DONE; +}) diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h index cf0cd669be4..22b79b65de5 100644 --- a/gcc/config/riscv/riscv-opts.h +++ b/gcc/config/riscv/riscv-opts.h @@ -67,6 +67,22 @@ enum stack_protector_guard { SSP_GLOBAL /* global canary */ }; +/* RISC-V auto-vectorization preference. */ +enum riscv_autovec_preference_enum { + NO_AUTOVEC, + RVV_SCALABLE, + RVV_FIXED_VLMIN, + RVV_FIXED_VLMAX +}; + +/* RISC-V auto-vectorization RVV LMUL. 
*/ +enum riscv_autovec_lmul_enum { + RVV_M1 = 1, + RVV_M2 = 2, + RVV_M4 = 4, + RVV_M8 = 8 +}; + #define MASK_ZICSR (1 << 0) #define MASK_ZIFENCEI (1 << 1) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 4611447ddde..7db0deb4dbf 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -184,7 +184,6 @@ enum mask_policy enum tail_policy get_prefer_tail_policy (); enum mask_policy get_prefer_mask_policy (); rtx get_avl_type_rtx (enum avl_type); -opt_machine_mode get_vector_mode (scalar_mode, poly_uint64); bool simm5_p (rtx); bool neg_simm5_p (rtx); #ifdef RTX_CODE @@ -206,6 +205,8 @@ enum vlen_enum bool slide1_sew64_helper (int, machine_mode, machine_mode, machine_mode, rtx *); rtx gen_avl_for_scalar_move (rtx); +machine_mode preferred_simd_mode (scalar_mode); +void expand_while_len (rtx *); } /* We classify builtin types into two classes: diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index ed3c5e0756f..0e0cffaf5a4 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -43,6 +43,7 @@ #include "optabs.h" #include "tm-constrs.h" #include "rtx-vector-builder.h" +#include "targhooks.h" using namespace riscv_vector; @@ -424,7 +425,7 @@ get_avl_type_rtx (enum avl_type type) /* Return the RVV vector mode that has NUNITS elements of mode INNER_MODE. This function is not only used by builtins, but also will be used by auto-vectorization in the future. */ -opt_machine_mode +static opt_machine_mode get_vector_mode (scalar_mode inner_mode, poly_uint64 nunits) { enum mode_class mclass; @@ -729,4 +730,62 @@ gen_avl_for_scalar_move (rtx avl) } } +/* SCALABLE means that the vector-length is agnostic (run-time invariant and + compile-time unknown). FIXED means that the vector-length is specific + (compile-time known). Both RVV_SCALABLE and RVV_FIXED_VLMAX are doing + auto-vectorization using VLMAX vsetvl configuration. 
*/ +static bool +autovec_use_vlmax_p (void) +{ + return riscv_autovec_preference == RVV_SCALABLE + || riscv_autovec_preference == RVV_FIXED_VLMAX; +} + +/* Return the vectorization machine mode for RVV according to LMUL. */ +machine_mode +preferred_simd_mode (scalar_mode mode) +{ + if (autovec_use_vlmax_p ()) + { + /* We use LMUL = 1 as base bytesize which is BYTES_PER_RISCV_VECTOR and + riscv_autovec_lmul as multiply factor to calculate the the NUNITS to + get the auto-vectorization mode. */ + poly_uint64 nunits; + poly_uint64 vector_size + = BYTES_PER_RISCV_VECTOR * ((int) riscv_autovec_lmul); + poly_uint64 scalar_size = GET_MODE_SIZE (mode); + if (!multiple_p (vector_size, scalar_size, &nunits)) + return word_mode; + machine_mode rvv_mode; + if (get_vector_mode (mode, nunits).exists (&rvv_mode)) + return rvv_mode; + } + /* TODO: We will support minimum length VLS auto-vectorization in the future. + */ + return word_mode; +} + +void +expand_while_len (rtx *ops) +{ + poly_int64 nunits; + gcc_assert (poly_int_rtx_p (ops[2], &nunits)); + /* We arbitrary picked QImode as inner scalar mode to get vector mode. + since vsetvl only demand ratio. We let VSETVL PASS to optimize it. */ + scalar_int_mode mode = QImode; + machine_mode rvv_mode; + if (get_vector_mode (mode, nunits).exists (&rvv_mode)) + { + rtx vsetvl_rtx + = gen_no_side_effects_vsetvl_rtx (rvv_mode, ops[0], ops[1]); + emit_insn (vsetvl_rtx); + } + else + { + rtx tmp = gen_reg_rtx (Pmode); + emit_move_insn (tmp, gen_int_mode (nunits, Pmode)); + expand_binop (Pmode, umin_optab, tmp, ops[1], ops[0], true, OPTAB_LIB); + } +} + } // namespace riscv_vector diff --git a/gcc/config/riscv/riscv-vector-switch.def b/gcc/config/riscv/riscv-vector-switch.def index bfb591773dc..f75287d9070 100644 --- a/gcc/config/riscv/riscv-vector-switch.def +++ b/gcc/config/riscv/riscv-vector-switch.def @@ -121,37 +121,43 @@ TODO: FP16 vector needs support of 'zvfh', we don't support it yet. */ /* Mask modes. 
Disable VNx128BI when TARGET_MIN_VLEN < 128. */ /* Mask modes. Disable VNx64BImode when TARGET_MIN_VLEN == 32. */ /* Mask modes. Disable VNx1BImode when TARGET_MIN_VLEN >= 128. */ -ENTRY (VNx128BI, TARGET_MIN_VLEN >= 128, LMUL_RESERVED, 0, LMUL_RESERVED, 0, LMUL_8, 1) +ENTRY (VNx128BI, TARGET_MIN_VLEN >= 128, LMUL_RESERVED, 0, LMUL_RESERVED, 0, + LMUL_8, 1) ENTRY (VNx64BI, TARGET_MIN_VLEN > 32, LMUL_RESERVED, 0, LMUL_8, 1, LMUL_4, 2) ENTRY (VNx32BI, true, LMUL_8, 1, LMUL_4, 2, LMUL_2, 4) ENTRY (VNx16BI, true, LMUL_4, 2, LMUL_2, 4, LMUL_1, 8) ENTRY (VNx8BI, true, LMUL_2, 4, LMUL_1, 8, LMUL_F2, 16) ENTRY (VNx4BI, true, LMUL_1, 8, LMUL_F2, 16, LMUL_F4, 32) ENTRY (VNx2BI, true, LMUL_F2, 16, LMUL_F4, 32, LMUL_F8, 64) -ENTRY (VNx1BI, TARGET_MIN_VLEN < 128, LMUL_F4, 32, LMUL_F8, 64, LMUL_RESERVED, 0) +ENTRY (VNx1BI, TARGET_MIN_VLEN < 128, LMUL_F4, 32, LMUL_F8, 64, LMUL_RESERVED, + 0) /* SEW = 8. Disable VNx128QImode when TARGET_MIN_VLEN < 128. */ /* SEW = 8. Disable VNx64QImode when TARGET_MIN_VLEN == 32. */ /* SEW = 8. Disable VNx1QImode when TARGET_MIN_VLEN >= 128. */ -ENTRY (VNx128QI, TARGET_MIN_VLEN >= 128, LMUL_RESERVED, 0, LMUL_RESERVED, 0, LMUL_8, 1) +ENTRY (VNx128QI, TARGET_MIN_VLEN >= 128, LMUL_RESERVED, 0, LMUL_RESERVED, 0, + LMUL_8, 1) ENTRY (VNx64QI, TARGET_MIN_VLEN > 32, LMUL_RESERVED, 0, LMUL_8, 1, LMUL_4, 2) ENTRY (VNx32QI, true, LMUL_8, 1, LMUL_4, 2, LMUL_2, 4) ENTRY (VNx16QI, true, LMUL_4, 2, LMUL_2, 4, LMUL_1, 8) ENTRY (VNx8QI, true, LMUL_2, 4, LMUL_1, 8, LMUL_F2, 16) ENTRY (VNx4QI, true, LMUL_1, 8, LMUL_F2, 16, LMUL_F4, 32) ENTRY (VNx2QI, true, LMUL_F2, 16, LMUL_F4, 32, LMUL_F8, 64) -ENTRY (VNx1QI, TARGET_MIN_VLEN < 128, LMUL_F4, 32, LMUL_F8, 64, LMUL_RESERVED, 0) +ENTRY (VNx1QI, TARGET_MIN_VLEN < 128, LMUL_F4, 32, LMUL_F8, 64, LMUL_RESERVED, + 0) /* SEW = 16. Disable VNx64HImode when TARGET_MIN_VLEN < 128. */ /* SEW = 16. Disable VNx32HImode when TARGET_MIN_VLEN == 32. */ /* SEW = 16. Disable VNx1HImode when TARGET_MIN_VLEN >= 128. 
*/ -ENTRY (VNx64HI, TARGET_MIN_VLEN >= 128, LMUL_RESERVED, 0, LMUL_RESERVED, 0, LMUL_8, 2) +ENTRY (VNx64HI, TARGET_MIN_VLEN >= 128, LMUL_RESERVED, 0, LMUL_RESERVED, 0, + LMUL_8, 2) ENTRY (VNx32HI, TARGET_MIN_VLEN > 32, LMUL_RESERVED, 0, LMUL_8, 2, LMUL_4, 4) ENTRY (VNx16HI, true, LMUL_8, 2, LMUL_4, 4, LMUL_2, 8) ENTRY (VNx8HI, true, LMUL_4, 4, LMUL_2, 8, LMUL_1, 16) ENTRY (VNx4HI, true, LMUL_2, 8, LMUL_1, 16, LMUL_F2, 32) ENTRY (VNx2HI, true, LMUL_1, 16, LMUL_F2, 32, LMUL_F4, 64) -ENTRY (VNx1HI, TARGET_MIN_VLEN < 128, LMUL_F2, 32, LMUL_F4, 64, LMUL_RESERVED, 0) +ENTRY (VNx1HI, TARGET_MIN_VLEN < 128, LMUL_F2, 32, LMUL_F4, 64, LMUL_RESERVED, + 0) /* TODO:Disable all FP16 vector, enable them when 'zvfh' is supported. */ ENTRY (VNx64HF, false, LMUL_RESERVED, 0, LMUL_RESERVED, 0, LMUL_8, 2) @@ -167,38 +173,45 @@ ENTRY (VNx1HF, false, LMUL_F2, 32, LMUL_F4, 64, LMUL_RESERVED, 0) For single-precision floating-point, we need TARGET_VECTOR_FP32 == RVV_ENABLE. */ /* SEW = 32. Disable VNx1SImode/VNx1SFmode when TARGET_MIN_VLEN >= 128. 
*/ -ENTRY (VNx32SI, TARGET_MIN_VLEN >= 128, LMUL_RESERVED, 0, LMUL_RESERVED, 0, LMUL_8, 4) +ENTRY (VNx32SI, TARGET_MIN_VLEN >= 128, LMUL_RESERVED, 0, LMUL_RESERVED, 0, + LMUL_8, 4) ENTRY (VNx16SI, TARGET_MIN_VLEN > 32, LMUL_RESERVED, 0, LMUL_8, 4, LMUL_4, 8) ENTRY (VNx8SI, true, LMUL_8, 4, LMUL_4, 8, LMUL_2, 16) ENTRY (VNx4SI, true, LMUL_4, 8, LMUL_2, 16, LMUL_1, 32) ENTRY (VNx2SI, true, LMUL_2, 16, LMUL_1, 32, LMUL_F2, 64) ENTRY (VNx1SI, TARGET_MIN_VLEN < 128, LMUL_1, 32, LMUL_F2, 64, LMUL_RESERVED, 0) -ENTRY (VNx32SF, TARGET_VECTOR_FP32 && (TARGET_MIN_VLEN >= 128), LMUL_RESERVED, 0, LMUL_RESERVED, 0, LMUL_8, 4) +ENTRY (VNx32SF, TARGET_VECTOR_FP32 && (TARGET_MIN_VLEN >= 128), LMUL_RESERVED, + 0, LMUL_RESERVED, 0, LMUL_8, 4) ENTRY (VNx16SF, TARGET_VECTOR_FP32 && (TARGET_MIN_VLEN > 32), LMUL_RESERVED, 0, LMUL_8, 4, LMUL_4, 8) ENTRY (VNx8SF, TARGET_VECTOR_FP32, LMUL_8, 4, LMUL_4, 8, LMUL_2, 16) ENTRY (VNx4SF, TARGET_VECTOR_FP32, LMUL_4, 8, LMUL_2, 16, LMUL_1, 32) ENTRY (VNx2SF, TARGET_VECTOR_FP32, LMUL_2, 16, LMUL_1, 32, LMUL_F2, 64) -ENTRY (VNx1SF, TARGET_VECTOR_FP32 && TARGET_MIN_VLEN < 128, LMUL_1, 32, LMUL_F2, 64, LMUL_RESERVED, 0) +ENTRY (VNx1SF, TARGET_VECTOR_FP32 && TARGET_MIN_VLEN < 128, LMUL_1, 32, LMUL_F2, + 64, LMUL_RESERVED, 0) /* SEW = 64. Disable VNx16DImode/VNx16DFmode when TARGET_MIN_VLEN < 128. */ /* SEW = 64. Enable VNx8DImode/VNx8DFmode when TARGET_MIN_VLEN > 32. For double-precision floating-point, we need TARGET_VECTOR_FP64 == RVV_ENABLE. */ /* SEW = 64. Disable VNx1DImode/VNx1DFmode when TARGET_MIN_VLEN >= 128. 
*/ -ENTRY (VNx16DI, TARGET_MIN_VLEN >= 128, LMUL_RESERVED, 0, LMUL_RESERVED, 0, LMUL_8, 8) -ENTRY (VNx8DI, TARGET_MIN_VLEN > 32, LMUL_RESERVED, 0, LMUL_8, 8, LMUL_4, 16) -ENTRY (VNx4DI, TARGET_MIN_VLEN > 32, LMUL_RESERVED, 0, LMUL_4, 16, LMUL_2, 32) -ENTRY (VNx2DI, TARGET_MIN_VLEN > 32, LMUL_RESERVED, 0, LMUL_2, 32, LMUL_1, 64) -ENTRY (VNx1DI, TARGET_MIN_VLEN > 32 && TARGET_MIN_VLEN < 128, LMUL_RESERVED, 0, LMUL_1, 64, LMUL_RESERVED, 0) - -ENTRY (VNx16DF, TARGET_VECTOR_FP64 && (TARGET_MIN_VLEN >= 128), LMUL_RESERVED, 0, LMUL_RESERVED, 0, LMUL_8, 8) +ENTRY (VNx16DI, TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128, LMUL_RESERVED, + 0, LMUL_RESERVED, 0, LMUL_8, 8) +ENTRY (VNx8DI, TARGET_VECTOR_ELEN_64, LMUL_RESERVED, 0, LMUL_8, 8, LMUL_4, 16) +ENTRY (VNx4DI, TARGET_VECTOR_ELEN_64, LMUL_RESERVED, 0, LMUL_4, 16, LMUL_2, 32) +ENTRY (VNx2DI, TARGET_VECTOR_ELEN_64, LMUL_RESERVED, 0, LMUL_2, 32, LMUL_1, 64) +ENTRY (VNx1DI, TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128, LMUL_RESERVED, 0, + LMUL_1, 64, LMUL_RESERVED, 0) + +ENTRY (VNx16DF, TARGET_VECTOR_FP64 && (TARGET_MIN_VLEN >= 128), LMUL_RESERVED, + 0, LMUL_RESERVED, 0, LMUL_8, 8) ENTRY (VNx8DF, TARGET_VECTOR_FP64 && (TARGET_MIN_VLEN > 32), LMUL_RESERVED, 0, LMUL_8, 8, LMUL_4, 16) ENTRY (VNx4DF, TARGET_VECTOR_FP64, LMUL_RESERVED, 0, LMUL_4, 16, LMUL_2, 32) ENTRY (VNx2DF, TARGET_VECTOR_FP64, LMUL_RESERVED, 0, LMUL_2, 32, LMUL_1, 64) -ENTRY (VNx1DF, TARGET_VECTOR_FP64 && TARGET_MIN_VLEN < 128, LMUL_RESERVED, 0, LMUL_1, 64, LMUL_RESERVED, 0) +ENTRY (VNx1DF, TARGET_VECTOR_FP64 && TARGET_MIN_VLEN < 128, LMUL_RESERVED, 0, + LMUL_1, 64, LMUL_RESERVED, 0) #undef TARGET_VECTOR_FP32 #undef TARGET_VECTOR_FP64 diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc index 7e8a5376705..52b453a7660 100644 --- a/gcc/config/riscv/riscv-vsetvl.cc +++ b/gcc/config/riscv/riscv-vsetvl.cc @@ -532,6 +532,43 @@ get_all_predecessors (basic_block cfg_bb) return blocks; } +/* Recursively find all successor blocks for 
cfg_bb. */ +static hash_set<basic_block> +get_all_successors (basic_block cfg_bb) +{ + hash_set<basic_block> blocks; + auto_vec<basic_block> work_list; + hash_set<basic_block> visited_list; + work_list.safe_push (cfg_bb); + + while (!work_list.is_empty ()) + { + basic_block new_cfg_bb = work_list.pop (); + visited_list.add (new_cfg_bb); + edge e; + edge_iterator ei; + FOR_EACH_EDGE (e, ei, new_cfg_bb->succs) + { + if (!visited_list.contains (e->dest)) + work_list.safe_push (e->dest); + blocks.add (e->dest); + } + } + return blocks; +} + +/* Get all overlap blocks between set. */ +static hash_set<basic_block> +get_all_overlap_blocks (hash_set<basic_block> blocks1, + hash_set<basic_block> blocks2) +{ + hash_set<basic_block> blocks; + for (const auto &block : blocks1) + if (blocks2.contains (block)) + blocks.add (block); + return blocks; +} + /* Return true if there is an INSN in insns staying in the block BB. */ static bool any_set_in_bb_p (hash_set<set_info *> sets, const bb_info *bb) @@ -1054,6 +1091,51 @@ change_vsetvl_insn (const insn_info *insn, const vector_insn_info &info) change_insn (rinsn, new_pat); } +static void +local_eliminate_vsetvl_insn (const vector_insn_info &dem) +{ + const insn_info *insn = dem.get_insn (); + if (!insn || insn->is_artificial ()) + return; + rtx_insn *rinsn = insn->rtl (); + const bb_info *bb = insn->bb (); + if (vsetvl_insn_p (rinsn)) + { + rtx vl = get_vl (rinsn); + for (insn_info *i = insn->next_nondebug_insn (); + real_insn_and_same_bb_p (i, bb); i = i->next_nondebug_insn ()) + { + if (i->is_call () || i->is_asm () + || find_access (i->defs (), VL_REGNUM) + || find_access (i->defs (), VTYPE_REGNUM)) + return; + + if (has_vtype_op (i->rtl ())) + { + if (!vsetvl_discard_result_insn_p (PREV_INSN (i->rtl ()))) + return; + rtx avl = get_avl (i->rtl ()); + if (avl != vl) + return; + set_info *def = find_access (i->uses (), REGNO (avl))->def (); + if (def->insn () != insn) + return; + + vector_insn_info new_info; + new_info.parse_insn 
(i); + if (!new_info.skip_avl_compatible_p (dem)) + return; + + new_info.set_avl_info (dem.get_avl_info ()); + new_info = dem.merge (new_info, LOCAL_MERGE); + change_vsetvl_insn (insn, new_info); + eliminate_insn (PREV_INSN (i->rtl ())); + return; + } + } + } +} + static bool source_equal_p (insn_info *insn1, insn_info *insn2) { @@ -1984,6 +2066,19 @@ vector_insn_info::compatible_p (const vector_insn_info &other) const return true; } +bool +vector_insn_info::skip_avl_compatible_p (const vector_insn_info &other) const +{ + gcc_assert (valid_or_dirty_p () && other.valid_or_dirty_p () + && "Can't compare invalid demanded infos"); + unsigned array_size = sizeof (incompatible_conds) / sizeof (demands_cond); + /* Bypass AVL incompatible cases. */ + for (unsigned i = 1; i < array_size; i++) + if (incompatible_conds[i].dual_incompatible_p (*this, other)) + return false; + return true; +} + bool vector_insn_info::compatible_avl_p (const vl_vtype_info &other) const { @@ -2178,7 +2273,7 @@ vector_insn_info::fuse_mask_policy (const vector_insn_info &info1, vector_insn_info vector_insn_info::merge (const vector_insn_info &merge_info, - enum merge_type type = LOCAL_MERGE) const + enum merge_type type) const { if (!vsetvl_insn_p (get_insn ()->rtl ())) gcc_assert (this->compatible_p (merge_info) @@ -2642,6 +2737,7 @@ private: void pre_vsetvl (void); /* Phase 5. */ + bool global_eliminate_vsetvl_p (const bb_info *) const; void cleanup_insns (void) const; /* Phase 6. */ @@ -2716,7 +2812,7 @@ pass_vsetvl::compute_local_backward_infos (const bb_info *bb) && !reg_available_p (insn, change)) && change.compatible_p (info)) { - info = change.merge (info); + info = change.merge (info, LOCAL_MERGE); /* Fix PR109399, we should update user vsetvl instruction if there is a change in demand fusion. 
*/ if (vsetvl_insn_p (insn->rtl ())) @@ -3990,14 +4086,124 @@ pass_vsetvl::pre_vsetvl (void) commit_edge_insertions (); } +/* Eliminate VSETVL insn that has multiple AVL source, we don't let LCM + do that since it's quite complicated and may be buggy in some situations. +*/ +bool +pass_vsetvl::global_eliminate_vsetvl_p (const bb_info *bb) const +{ + const auto &dem + = m_vector_manager->vector_block_infos[bb->index ()].local_dem; + if (!dem.valid_p ()) + return false; + if (dem.get_insn ()->is_artificial ()) + return false; + + insn_info *insn = dem.get_insn (); + if (!has_vtype_op (insn->rtl ())) + return false; + + rtx_insn *prev_rinsn = PREV_INSN (insn->rtl ()); + if (!prev_rinsn) + return false; + if (!vsetvl_discard_result_insn_p (prev_rinsn)) + return false; + + if (!dem.has_avl_reg ()) + return false; + rtx avl = dem.get_avl (); + set_info *def = find_access (insn->uses (), REGNO (avl))->def (); + hash_set<set_info *> sets = get_all_sets (def, true, true, true); + if (sets.is_empty ()) + return false; + + sbitmap avin = m_vector_manager->vector_avin[bb->index ()]; + if (!bitmap_empty_p (avin)) + return false; + + hash_set<basic_block> pred_cfg_bbs = get_all_predecessors (bb->cfg_bb ()); + auto_vec<vector_insn_info> vsetvl_infos; + for (const auto &set : sets) + { + if (set->insn ()->is_artificial ()) + return false; + insn_info *set_insn = set->insn (); + if (!vsetvl_insn_p (set_insn->rtl ())) + return false; + vector_insn_info vsetvl_info; + vsetvl_info.parse_insn (set_insn); + if (!vsetvl_info.skip_avl_compatible_p (dem)) + return false; + + /* Make sure there is no other vsetvl from set_bb to bb. 
*/ + hash_set<basic_block> succ_cfg_bbs + = get_all_successors (set->insn ()->bb ()->cfg_bb ()); + hash_set<basic_block> overlap_cfg_bbs + = get_all_overlap_blocks (pred_cfg_bbs, succ_cfg_bbs); + for (const auto &overlap_cfg_bb : overlap_cfg_bbs) + { + unsigned int index = overlap_cfg_bb->index; + if (index == bb->index ()) + continue; + const auto &overlap_dem + = m_vector_manager->vector_block_infos[index].local_dem; + /* TODO: Currently, we only allow optimize user vsetvl when + there is empty overlap blocks. + + We could support check accurately there is no instructions + modifiy VL/VTYPE in overlap blocks. */ + if (!overlap_dem.empty_p ()) + return false; + } + vsetvl_infos.safe_push (vsetvl_info); + } + + /* Update VTYPE for each SET vsetvl instructions. */ + for (const auto &vsetvl_info : vsetvl_infos) + { + vector_insn_info info = dem; + info.set_avl_info (vsetvl_info.get_avl_info ()); + info = vsetvl_info.merge (info, LOCAL_MERGE); + insn_info *vsetvl_insn = vsetvl_info.get_insn (); + change_vsetvl_insn (vsetvl_insn, info); + } + + return true; +} + void pass_vsetvl::cleanup_insns (void) const { for (const bb_info *bb : crtl->ssa->bbs ()) { + /* Eliminate global vsetvl: + bb 0: + vsetvl a5,zero,... + bb 1: + vsetvl a5,a6,... + + bb 2: + vsetvl zero,a5. + + Eliminate vsetvl in bb2 when a5 is only coming from + bb 0 and bb1. */ + const auto &local_dem + = m_vector_manager->vector_block_infos[bb->index ()].local_dem; + if (global_eliminate_vsetvl_p (bb)) + eliminate_insn (PREV_INSN (local_dem.get_insn ()->rtl ())); + for (insn_info *insn : bb->real_nondebug_insns ()) { rtx_insn *rinsn = insn->rtl (); + const auto &dem = m_vector_manager->vector_insn_infos[insn->uid ()]; + /* Eliminate local vsetvl: + bb 0: + vsetvl a5,a6,... + vsetvl zero,a5. + + Eliminate vsetvl in bb2 when a5 is only coming from + bb 0. 
*/ + local_eliminate_vsetvl_insn (dem); if (vlmax_avl_insn_p (rinsn)) { diff --git a/gcc/config/riscv/riscv-vsetvl.h b/gcc/config/riscv/riscv-vsetvl.h index d05472c86a0..d7a6c14e931 100644 --- a/gcc/config/riscv/riscv-vsetvl.h +++ b/gcc/config/riscv/riscv-vsetvl.h @@ -380,6 +380,7 @@ public: void fuse_mask_policy (const vector_insn_info &, const vector_insn_info &); bool compatible_p (const vector_insn_info &) const; + bool skip_avl_compatible_p (const vector_insn_info &) const; bool compatible_avl_p (const vl_vtype_info &) const; bool compatible_avl_p (const avl_info &) const; bool compatible_vtype_p (const vl_vtype_info &) const; diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index b460c8a0b8b..3f68740737d 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -6217,7 +6217,15 @@ riscv_convert_vector_bits (void) to set RVV mode size. The RVV machine modes size are run-time constant if TARGET_VECTOR is enabled. The RVV machine modes size remains default compile-time constant if TARGET_VECTOR is disabled. */ - return TARGET_VECTOR ? poly_uint16 (1, 1) : 1; + if (TARGET_VECTOR) + { + if (riscv_autovec_preference == RVV_FIXED_VLMAX) + return (int) TARGET_MIN_VLEN / (riscv_bytes_per_vector_chunk * 8); + else + return poly_uint16 (1, 1); + } + else + return 1; } /* Implement TARGET_OPTION_OVERRIDE. */ @@ -7076,6 +7084,27 @@ riscv_shamt_matches_mask_p (int shamt, HOST_WIDE_INT mask) return shamt == ctz_hwi (mask); } +/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE. */ + +static machine_mode +riscv_preferred_simd_mode (scalar_mode mode) +{ + /* We only enable auto-vectorization when TARGET_MIN_VLEN >= 128 + which is -march=rv64gcv. Since GCC loop vectorizer report ICE + when we enable -march=rv64gc_zve32* and -march=rv32gc_zve64x. + in tree-vect-slp.cc:437. Since we have VNx1SImode in -march=*zve32* + and VNx1DImode in -march=*zve64*, they are enabled in targetm. + vector_mode_supported_p and SLP vectorizer will try to use them. 
+ Currently, we can support auto-vectorization in -march=rv32_zve32x_zvl128b. + Whereas, -march=rv32_zve32x_zvl32b or -march=rv32_zve32x_zvl64b are + disabled. + */ + if (TARGET_VECTOR && TARGET_MIN_VLEN >= 128) + return riscv_vector::preferred_simd_mode (mode); + + return word_mode; +} + /* Initialize the GCC target structure. */ #undef TARGET_ASM_ALIGNED_HI_OP #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t" @@ -7327,6 +7356,9 @@ riscv_shamt_matches_mask_p (int shamt, HOST_WIDE_INT mask) #undef TARGET_DWARF_POLY_INDETERMINATE_VALUE #define TARGET_DWARF_POLY_INDETERMINATE_VALUE riscv_dwarf_poly_indeterminate_value +#undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE +#define TARGET_VECTORIZE_PREFERRED_SIMD_MODE riscv_preferred_simd_mode + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-riscv.h" diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt index ff1dd4ddd4f..7d26e450be5 100644 --- a/gcc/config/riscv/riscv.opt +++ b/gcc/config/riscv/riscv.opt @@ -254,3 +254,43 @@ Enum(isa_spec_class) String(20191213) Value(ISA_SPEC_CLASS_20191213) misa-spec= Target RejectNegative Joined Enum(isa_spec_class) Var(riscv_isa_spec) Init(TARGET_DEFAULT_ISA_SPEC) Set the version of RISC-V ISA spec. + +Enum +Name(riscv_autovec_preference) Type(enum riscv_autovec_preference_enum) +The RISC-V auto-vectorization preference: + +EnumValue +Enum(riscv_autovec_preference) String(none) Value(NO_AUTOVEC) + +EnumValue +Enum(riscv_autovec_preference) String(scalable) Value(RVV_SCALABLE) + +EnumValue +Enum(riscv_autovec_preference) String(fixed-vlmin) Value(RVV_FIXED_VLMIN) + +EnumValue +Enum(riscv_autovec_preference) String(fixed-vlmax) Value(RVV_FIXED_VLMAX) + +-param=riscv-autovec-preference= +Target RejectNegative Joined Enum(riscv_autovec_preference) Var(riscv_autovec_preference) Init(NO_AUTOVEC) +-param=riscv-autovec-preference=<string> Set the preference of auto-vectorization in RISC-V port. 
+ +Enum +Name(riscv_autovec_lmul) Type(enum riscv_autovec_lmul_enum) +The RVV possible LMUL: + +EnumValue +Enum(riscv_autovec_lmul) String(m1) Value(RVV_M1) + +EnumValue +Enum(riscv_autovec_lmul) String(m2) Value(RVV_M2) + +EnumValue +Enum(riscv_autovec_lmul) String(m4) Value(RVV_M4) + +EnumValue +Enum(riscv_autovec_lmul) String(m8) Value(RVV_M8) + +-param=riscv-autovec-lmul= +Target RejectNegative Joined Enum(riscv_autovec_lmul) Var(riscv_autovec_lmul) Init(RVV_M1) +-param=riscv-autovec-lmul=<string> Set the RVV LMUL of auto-vectorization in RISC-V port. diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 27bdacc35af..9151a4c9891 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -23,7 +23,7 @@ ;; This file include : ;; ;; - Intrinsics (https://github.com/riscv/rvv-intrinsic-doc) -;; - Auto-vectorization (TBD) +;; - Auto-vectorization (autovec.md) ;; - Combine optimization (TBD) (include "vector-iterators.md") @@ -2015,7 +2015,7 @@ riscv_vector::neg_simm5_p (operands[4]), [] (rtx *operands, rtx boardcast_scalar) { emit_insn (gen_pred_sub<mode> (operands[0], operands[1], - operands[2], operands[3], boardcast_scalar, operands[5], + operands[2], boardcast_scalar, operands[3], operands[5], operands[6], operands[7], operands[8])); })) DONE; @@ -7688,3 +7688,5 @@ "vle<sew>ff.v\t%0,%3%p1" [(set_attr "type" "vldff") (set_attr "mode" "<MODE>")]) + +(include "autovec.md") -- 2.36.3 ^ permalink raw reply [flat|nested] 7+ messages in thread
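As a reading aid for the riscv_convert_vector_bits hunk above, the fixed-vlmax branch reduces to one integer division. The following is only a minimal sketch of that arithmetic, not the port's code: fixed_vlmax_chunks is a hypothetical name, and the sample chunk sizes (4 or 8 bytes) are assumptions, since the patch does not show how riscv_bytes_per_vector_chunk is set.

```c
#include <assert.h>

/* Hypothetical stand-in for the RVV_FIXED_VLMAX branch of
   riscv_convert_vector_bits: the number of chunks in one vector
   register when the vector length is fixed at TARGET_MIN_VLEN.
   min_vlen_bits models TARGET_MIN_VLEN; bytes_per_chunk models
   riscv_bytes_per_vector_chunk (its actual value is an assumption).  */
int
fixed_vlmax_chunks (int min_vlen_bits, int bytes_per_chunk)
{
  assert (bytes_per_chunk > 0);
  /* Same expression as the patch: MIN_VLEN / (chunk bytes * 8 bits).  */
  return min_vlen_bits / (bytes_per_chunk * 8);
}
```

Under these assumptions a 128-bit MIN_VLEN with 8-byte chunks gives 2 chunks per register, a compile-time constant, whereas the scalable path keeps the run-time-variable poly_uint16 (1, 1).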
* Re: [PATCH 2/3] RISC-V: Enable basic RVV auto-vectorization and support WHILE_LEN/LEN_LOAD/LEN_STORE pattern
  2023-04-06 14:42 ` [PATCH 2/3] RISC-V: Enable basic RVV auto-vectorization and support WHILE_LEN/LEN_LOAD/LEN_STORE pattern juzhe.zhong
@ 2023-04-06 16:04   ` Kito Cheng
  2023-04-07  1:40     ` juzhe.zhong
  0 siblings, 1 reply; 7+ messages in thread
From: Kito Cheng @ 2023-04-06 16:04 UTC (permalink / raw)
  To: juzhe.zhong
  Cc: gcc-patches, palmer, richard.sandiford, rguenther, jeffreyalaw

Are the changes to riscv-vsetvl.cc necessary for autovec, or are they
an additional optimization for the autovec use case? I would suggest
splitting them out if it's the latter.

And please split the fixed-vlmax part out into a separate patch; that
would be easier to review.

On Thu, Apr 6, 2023 at 10:44 PM <juzhe.zhong@rivai.ai> wrote:
>
> From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
>
> gcc/ChangeLog:
>
>         * config/riscv/riscv-opts.h (enum riscv_autovec_preference_enum): Add compile option for RVV auto-vectorization.
>         (enum riscv_autovec_lmul_enum): Ditto.
>         * config/riscv/riscv-protos.h (get_vector_mode): Remove unused global function.
>         (preferred_simd_mode): Enable basic auto-vectorization for RVV.
>         (expand_while_len): Enable while_len pattern.
>         * config/riscv/riscv-v.cc (get_avl_type_rtx): Ditto.
>         (autovec_use_vlmax_p): New function.
>         (preferred_simd_mode): New function.
>         (expand_while_len): Ditto.
>         * config/riscv/riscv-vector-switch.def (ENTRY): Disable SEW = 64 for MIN_VLEN > 32 but EEW = 32.

Is this a bug fix? Please send a separate patch if it's a bug.

>         * config/riscv/riscv-vsetvl.cc (get_all_successors): New function.
>         (get_all_overlap_blocks): Ditto.
>         (local_eliminate_vsetvl_insn): Ditto.
>         (vector_insn_info::skip_avl_compatible_p): Ditto.
>         (vector_insn_info::merge): Ditto.
>         (pass_vsetvl::compute_local_backward_infos): Ehance VSETVL PASS for RVV auto-vectorization.
>         (pass_vsetvl::global_eliminate_vsetvl_p): Ditto.
>         (pass_vsetvl::cleanup_insns): Ditto.
>         * config/riscv/riscv-vsetvl.h: Ditto.
>         * config/riscv/riscv.cc (riscv_convert_vector_bits): Add basic RVV auto-vectorization support.
>         (riscv_preferred_simd_mode): Ditto.
>         (TARGET_VECTORIZE_PREFERRED_SIMD_MODE): Ditto.
>         * config/riscv/riscv.opt: Add compile option.
>         * config/riscv/vector.md: Add RVV auto-vectorization.
>         * config/riscv/autovec.md: New file.
>
> ---
>  gcc/config/riscv/autovec.md              |  63 +++++++
>  gcc/config/riscv/riscv-opts.h            |  16 ++
>  gcc/config/riscv/riscv-protos.h          |   3 +-
>  gcc/config/riscv/riscv-v.cc              |  61 ++++++-
>  gcc/config/riscv/riscv-vector-switch.def |  47 +++--
>  gcc/config/riscv/riscv-vsetvl.cc         | 210 ++++++++++++++++++++++-
>  gcc/config/riscv/riscv-vsetvl.h          |   1 +
>  gcc/config/riscv/riscv.cc                |  34 +++-
>  gcc/config/riscv/riscv.opt               |  40 +++++
>  gcc/config/riscv/vector.md               |   6 +-
>  10 files changed, 457 insertions(+), 24 deletions(-)
>  create mode 100644 gcc/config/riscv/autovec.md
>
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> new file mode 100644
> index 00000000000..ff616d81586
> --- /dev/null
> +++ b/gcc/config/riscv/autovec.md
> @@ -0,0 +1,63 @@
> +;; Machine description for auto-vectorization using RVV for GNU compiler.
> +;; Copyright (C) 2023-2023 Free Software Foundation, Inc.

2023 rather than 2023-2023

> +;; Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd.
> +
> +;; This file is part of GCC.
> +
> +;; GCC is free software; you can redistribute it and/or modify
> +;; it under the terms of the GNU General Public License as published by
> +;; the Free Software Foundation; either version 3, or (at your option)
> +;; any later version.
> +
> +;; GCC is distributed in the hope that it will be useful,
> +;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +;; GNU General Public License for more details.
> +
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3. If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +;; =========================================================================
> +;; == While_len
> +;; =========================================================================
> +
> +(define_expand "while_len<mode>"
> +  [(match_operand:P 0 "register_operand")
> +   (match_operand:P 1 "vector_length_operand")
> +   (match_operand:P 2 "")]
> +  "TARGET_VECTOR"
> +{
> +  riscv_vector::expand_while_len (operands);
> +  DONE;
> +})
> +
> +;; =========================================================================
> +;; == Loads/Stores
> +;; =========================================================================
> +
> +;; len_load/len_store is sub-optimal pattern for RVV auto-vectorization support.

Google Docs says you need an `a`: "is a sub-optimal" :P

> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 4611447ddde..7db0deb4dbf 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -184,7 +184,6 @@ enum mask_policy
>  enum tail_policy get_prefer_tail_policy ();
>  enum mask_policy get_prefer_mask_policy ();
>  rtx get_avl_type_rtx (enum avl_type);
> -opt_machine_mode get_vector_mode (scalar_mode, poly_uint64);

This should be a separate NFC patch. Yanzhang's patch already uses
that function, so I don't think there is any rush to remove it.

>  /* Implement TARGET_OPTION_OVERRIDE. */
> @@ -7076,6 +7084,27 @@ riscv_shamt_matches_mask_p (int shamt, HOST_WIDE_INT mask)
>    return shamt == ctz_hwi (mask);
>  }
>
> +/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE. */
> +
> +static machine_mode
> +riscv_preferred_simd_mode (scalar_mode mode)
> +{
> +  /* We only enable auto-vectorization when TARGET_MIN_VLEN >= 128
> +     which is -march=rv64gcv. Since GCC loop vectorizer report ICE
> +     when we enable -march=rv64gc_zve32* and -march=rv32gc_zve64x.
> +     in tree-vect-slp.cc:437. Since we have VNx1SImode in -march=*zve32*

Reference the function name rather than the line number. Line numbers
change; function names might change too, but that is less likely.

> +     and VNx1DImode in -march=*zve64*, they are enabled in targetm.
> +     vector_mode_supported_p and SLP vectorizer will try to use them.
> +     Currently, we can support auto-vectorization in -march=rv32_zve32x_zvl128b.
> +     Wheras, -march=rv32_zve32x_zvl32b or -march=rv32_zve32x_zvl64b are
> +     disabled.
> +   */

What if we use M2 when TARGET_MIN_VLEN = 64 and M4 when
TARGET_MIN_VLEN = 32? Or maybe just return word_mode for those cases?

> +  if (TARGET_VECTOR && TARGET_MIN_VLEN >= 128)
> +    return riscv_vector::preferred_simd_mode (mode);
> +
> +  return word_mode;
> +}
> +
>  /* Initialize the GCC target structure. */
>  #undef TARGET_ASM_ALIGNED_HI_OP
>  #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
> @@ -7327,6 +7356,9 @@ riscv_shamt_matches_mask_p (int shamt, HOST_WIDE_INT mask)
>  #undef TARGET_DWARF_POLY_INDETERMINATE_VALUE
>  #define TARGET_DWARF_POLY_INDETERMINATE_VALUE riscv_dwarf_poly_indeterminate_value
>
> +#undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE
> +#define TARGET_VECTORIZE_PREFERRED_SIMD_MODE riscv_preferred_simd_mode
> +
>  struct gcc_target targetm = TARGET_INITIALIZER;
>
>  #include "gt-riscv.h"
> diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
> index ff1dd4ddd4f..7d26e450be5 100644
> --- a/gcc/config/riscv/riscv.opt
> +++ b/gcc/config/riscv/riscv.opt
> @@ -254,3 +254,43 @@ Enum(isa_spec_class) String(20191213) Value(ISA_SPEC_CLASS_20191213)
>  misa-spec=
>  Target RejectNegative Joined Enum(isa_spec_class) Var(riscv_isa_spec) Init(TARGET_DEFAULT_ISA_SPEC)
>  Set the version of RISC-V ISA spec.
> +
> +Enum
> +Name(riscv_autovec_preference) Type(enum riscv_autovec_preference_enum)
> +The RISC-V auto-vectorization preference:
> +
> +EnumValue
> +Enum(riscv_autovec_preference) String(none) Value(NO_AUTOVEC)
> +
> +EnumValue
> +Enum(riscv_autovec_preference) String(scalable) Value(RVV_SCALABLE)
> +
> +EnumValue
> +Enum(riscv_autovec_preference) String(fixed-vlmin) Value(RVV_FIXED_VLMIN)

Drop the unsupported stuff, and add it back once it has been
implemented.

> +
> +EnumValue
> +Enum(riscv_autovec_preference) String(fixed-vlmax) Value(RVV_FIXED_VLMAX)
> +
> +-param=riscv-autovec-preference=
> +Target RejectNegative Joined Enum(riscv_autovec_preference) Var(riscv_autovec_preference) Init(NO_AUTOVEC)
> +-param=riscv-autovec-preference=<string>       Set the preference of auto-vectorization in RISC-V port.
> +-param=riscv-autovec-lmul=
> +Target RejectNegative Joined Enum(riscv_autovec_lmul) Var(riscv_autovec_lmul) Init(RVV_M1)
> +-param=riscv-autovec-lmul=<string>     Set the RVV LMUL of auto-vectorization in RISC-V port.

in the RISC-V port

> diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
> index 27bdacc35af..9151a4c9891 100644
> --- a/gcc/config/riscv/vector.md
> +++ b/gcc/config/riscv/vector.md
> @@ -23,7 +23,7 @@
>  ;; This file include :
>  ;;
>  ;; - Intrinsics (https://github.com/riscv/rvv-intrinsic-doc)
> -;; - Auto-vectorization (TBD)
> +;; - Auto-vectorization (autovec.md)
>  ;; - Combine optimization (TBD)
>
>  (include "vector-iterators.md")
> @@ -2015,7 +2015,7 @@
>        riscv_vector::neg_simm5_p (operands[4]),
>        [] (rtx *operands, rtx boardcast_scalar) {
>          emit_insn (gen_pred_sub<mode> (operands[0], operands[1],
> -             operands[2], operands[3], boardcast_scalar, operands[5],
> +             operands[2], boardcast_scalar, operands[3], operands[5],

Seems like you mixed in some other patch by accident here.

>               operands[6], operands[7], operands[8]));
>        }))
>      DONE;
> @@ -7688,3 +7688,5 @@
>    "vle<sew>ff.v\t%0,%3%p1"
>    [(set_attr "type" "vldff")
>     (set_attr "mode" "<MODE>")])
> +
> +(include "autovec.md")
> --
> 2.36.3
>

^ permalink raw reply	[flat|nested] 7+ messages in thread
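The M2/M4 question in the review above can be made concrete with a small sketch. This is not code from the patch or from GCC: pick_autovec_lmul and the lmul_choice enum are hypothetical names, and the sketch assumes the intent is to scale LMUL so that MIN_VLEN * LMUL still reaches the 128 bits that the patch currently requires.

```c
#include <assert.h>

/* Hypothetical LMUL values; the numeric value is the register-group
   multiplier.  LMUL_NONE models "fall back to word_mode".  */
enum lmul_choice { LMUL_NONE = 0, LMUL_M1 = 1, LMUL_M2 = 2, LMUL_M4 = 4 };

/* Sketch of the reviewer's suggestion: instead of refusing to
   vectorize when MIN_VLEN < 128, choose a larger LMUL so the
   register group is still 128 bits wide.  */
enum lmul_choice
pick_autovec_lmul (int min_vlen_bits)
{
  assert (min_vlen_bits > 0);
  if (min_vlen_bits >= 128)
    return LMUL_M1;   /* Already wide enough with a single register.  */
  if (min_vlen_bits == 64)
    return LMUL_M2;   /* 64 * 2 = 128 bits.  */
  if (min_vlen_bits == 32)
    return LMUL_M4;   /* 32 * 4 = 128 bits.  */
  return LMUL_NONE;   /* Anything else: do not vectorize.  */
}
```

Whether this would avoid the SLP ICE mentioned in the quoted comment is not established anywhere in the thread; the sketch only spells out the arithmetic behind the suggestion.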
* Re: Re: [PATCH 2/3] RISC-V: Enable basic RVV auto-vectorization and support WHILE_LEN/LEN_LOAD/LEN_STORE pattern
  2023-04-06 16:04 ` Kito Cheng
@ 2023-04-07  1:40   ` juzhe.zhong
  0 siblings, 0 replies; 7+ messages in thread
From: juzhe.zhong @ 2023-04-07 1:40 UTC (permalink / raw)
  To: kito.cheng; +Cc: gcc-patches, palmer, richard.sandiford, rguenther, jeffreyalaw

I have addressed all the comments and fixed them in the following
split patches. These 5 patches include only RISC-V port changes:

https://patchwork.sourceware.org/project/gcc/patch/20230407011143.46004-1-juzhe.zhong@rivai.ai/
https://patchwork.sourceware.org/project/gcc/patch/20230407012129.63142-1-juzhe.zhong@rivai.ai/
https://patchwork.sourceware.org/project/gcc/patch/20230407012503.65215-1-juzhe.zhong@rivai.ai/
https://patchwork.sourceware.org/project/gcc/patch/20230407013413.127686-1-juzhe.zhong@rivai.ai/
https://patchwork.sourceware.org/project/gcc/patch/20230407013701.129875-1-juzhe.zhong@rivai.ai/

I would like to resend a separate patch containing only the middle-end
changes for WHILE_LEN pattern support. Please ignore this series of
patches.

Thanks!

juzhe.zhong@rivai.ai

^ permalink raw reply	[flat|nested] 7+ messages in thread
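For readers following the thread, the loop shape that the WHILE_LEN pattern is meant to enable (a decrementing induction variable, as the cover letter puts it) can be sketched in scalar C. This is a model under stated assumptions, not the vectorizer's output: model_while_len and add_with_decrement_iv are hypothetical names, and the sketch assumes the pattern yields the minimum of the remaining element count and the vectorization factor, which is the simplest choice a target could make.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of WHILE_LEN: how many elements the next
   iteration processes.  Assumed here to be MIN (remaining, vf).  */
size_t
model_while_len (size_t remaining, size_t vf)
{
  return remaining < vf ? remaining : vf;
}

/* dst[i] = a[i] + b[i] for i in [0, n), written with a decrementing
   IV: the loop counts the remaining elements down to zero instead of
   counting an index up to n.  The inner loop stands in for one
   length-controlled vector load/add/store.  Returns the number of
   outer (vector) iterations.  */
size_t
add_with_decrement_iv (int *dst, const int *a, const int *b,
                       size_t n, size_t vf)
{
  size_t iterations = 0;
  size_t remaining = n;

  assert (vf > 0);   /* A zero factor would never make progress.  */
  while (remaining > 0)
    {
      size_t len = model_while_len (remaining, vf);
      for (size_t i = 0; i < len; ++i)
        dst[i] = a[i] + b[i];
      /* Advance the pointers by the elements just processed.  */
      dst += len;
      a += len;
      b += len;
      remaining -= len;
      ++iterations;
    }
  return iterations;
}
```

With n = 10 and vf = 4 the model processes 4, 4 and then 2 elements, so no separate scalar epilogue loop is needed for the tail.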
* [PATCH] RISC-V: Add RVV auto-vectorization testcase 2023-04-06 14:42 [PATCH 0/3] RISC-V:Enable basic auto-vectorization for RVV juzhe.zhong 2023-04-06 14:42 ` [PATCH 1/3] VECT: Add WHILE_LEN pattern to support decrement IV manipulation for loop vectorizer juzhe.zhong 2023-04-06 14:42 ` [PATCH 2/3] RISC-V: Enable basic RVV auto-vectorization and support WHILE_LEN/LEN_LOAD/LEN_STORE pattern juzhe.zhong @ 2023-04-06 14:42 ` juzhe.zhong 2023-04-06 15:36 ` Kito Cheng 2 siblings, 1 reply; 7+ messages in thread From: juzhe.zhong @ 2023-04-06 14:42 UTC (permalink / raw) To: gcc-patches Cc: kito.cheng, palmer, richard.sandiford, rguenther, jeffreyalaw, Juzhe-Zhong From: Juzhe-Zhong <juzhe.zhong@rivai.ai> gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/rvv.exp: Add testing for RVV auto-vectorization. * gcc.target/riscv/rvv/vsetvl/vsetvl-17.c: Adapt testcase. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c: New test. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h: New test. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c: New test. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: New test. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.s: New test. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c: New test. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c: New test. * gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.c: New test. * gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.h: New test. * gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-1.c: New test. * gcc.target/riscv/rvv/autovec/template-1.h: New test. * gcc.target/riscv/rvv/autovec/v-1.c: New test. * gcc.target/riscv/rvv/autovec/v-2.c: New test. * gcc.target/riscv/rvv/autovec/zve32f-1.c: New test. * gcc.target/riscv/rvv/autovec/zve32f-2.c: New test. * gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c: New test. * gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c: New test. 
* gcc.target/riscv/rvv/autovec/zve32x-1.c: New test. * gcc.target/riscv/rvv/autovec/zve32x-2.c: New test. * gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c: New test. * gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c: New test. * gcc.target/riscv/rvv/autovec/zve64d-1.c: New test. * gcc.target/riscv/rvv/autovec/zve64d-2.c: New test. * gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c: New test. * gcc.target/riscv/rvv/autovec/zve64d_zvl128b-2.c: New test. * gcc.target/riscv/rvv/autovec/zve64f-1.c: New test. * gcc.target/riscv/rvv/autovec/zve64f-2.c: New test. * gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c: New test. * gcc.target/riscv/rvv/autovec/zve64f_zvl128b-2.c: New test. * gcc.target/riscv/rvv/autovec/zve64x-1.c: New test. * gcc.target/riscv/rvv/autovec/zve64x-2.c: New test. * gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c: New test. * gcc.target/riscv/rvv/autovec/zve64x_zvl128b-2.c: New test. --- .../rvv/autovec/partial/multiple_rgroup-1.c | 6 + .../rvv/autovec/partial/multiple_rgroup-1.h | 304 +++++++ .../rvv/autovec/partial/multiple_rgroup-2.c | 6 + .../rvv/autovec/partial/multiple_rgroup-2.h | 546 ++++++++++++ .../rvv/autovec/partial/multiple_rgroup-2.s | 774 ++++++++++++++++++ .../autovec/partial/multiple_rgroup_run-1.c | 19 + .../autovec/partial/multiple_rgroup_run-2.c | 19 + .../rvv/autovec/partial/single_rgroup-1.c | 8 + .../rvv/autovec/partial/single_rgroup-1.h | 106 +++ .../rvv/autovec/partial/single_rgroup_run-1.c | 19 + .../gcc.target/riscv/rvv/autovec/template-1.h | 68 ++ .../gcc.target/riscv/rvv/autovec/v-1.c | 4 + .../gcc.target/riscv/rvv/autovec/v-2.c | 6 + .../gcc.target/riscv/rvv/autovec/zve32f-1.c | 4 + .../gcc.target/riscv/rvv/autovec/zve32f-2.c | 5 + .../riscv/rvv/autovec/zve32f_zvl128b-1.c | 4 + .../riscv/rvv/autovec/zve32f_zvl128b-2.c | 6 + .../gcc.target/riscv/rvv/autovec/zve32x-1.c | 4 + .../gcc.target/riscv/rvv/autovec/zve32x-2.c | 6 + .../riscv/rvv/autovec/zve32x_zvl128b-1.c | 5 + .../riscv/rvv/autovec/zve32x_zvl128b-2.c | 6 + 
.../gcc.target/riscv/rvv/autovec/zve64d-1.c | 4 + .../gcc.target/riscv/rvv/autovec/zve64d-2.c | 4 + .../riscv/rvv/autovec/zve64d_zvl128b-1.c | 4 + .../riscv/rvv/autovec/zve64d_zvl128b-2.c | 6 + .../gcc.target/riscv/rvv/autovec/zve64f-1.c | 4 + .../gcc.target/riscv/rvv/autovec/zve64f-2.c | 4 + .../riscv/rvv/autovec/zve64f_zvl128b-1.c | 4 + .../riscv/rvv/autovec/zve64f_zvl128b-2.c | 6 + .../gcc.target/riscv/rvv/autovec/zve64x-1.c | 4 + .../gcc.target/riscv/rvv/autovec/zve64x-2.c | 4 + .../riscv/rvv/autovec/zve64x_zvl128b-1.c | 4 + .../riscv/rvv/autovec/zve64x_zvl128b-2.c | 6 + gcc/testsuite/gcc.target/riscv/rvv/rvv.exp | 16 + .../gcc.target/riscv/rvv/vsetvl/vsetvl-17.c | 2 +- 35 files changed, 1996 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.s create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.h create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/template-1.h create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/v-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/v-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-2.c 
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-2.c diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c new file mode 100644 index 00000000000..69cc3be78f7 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax" } */ + +#include "multiple_rgroup-1.h" + +TEST_ALL (test_1) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h new file mode 100644 index 00000000000..755ee2b3616 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h @@ -0,0 +1,304 @@ +#include <stddef.h> +#include <stdint.h> + +#define test_1(TYPE1, TYPE2) \ + void __attribute__ ((noinline, noclone)) \ + test_1_##TYPE1_##TYPE2 (TYPE1 *__restrict f, TYPE2 *__restrict d, TYPE1 x, \ + TYPE1 x2, TYPE2 y, int n) \ + { \ + for (int i = 0; i < n; ++i) \ + { \ + f[i * 2 + 0] = x; \ + f[i * 2 + 1] = x2; \ + d[i] = y; \ + } \ + } + +#define run_1(TYPE1, TYPE2) \ + int n_1_##TYPE1_##TYPE2 = 1; \ + TYPE1 x_1_##TYPE1 = 117; \ + TYPE1 x2_1_##TYPE1 = 232; \ + TYPE2 y_1_##TYPE2 = 9762; \ + TYPE1 f_1_##TYPE1[2 * 2 + 1] = {0}; \ + TYPE2 d_1_##TYPE2[2] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_1_##TYPE1, d_1_##TYPE2, x_1_##TYPE1, x2_1_##TYPE1, \ + y_1_##TYPE2, n_1_##TYPE1_##TYPE2); \ + for (int i = 0; i < n_1_##TYPE1_##TYPE2; ++i) \ + { \ + if (f_1_##TYPE1[i * 2 + 0] != x_1_##TYPE1) \ + __builtin_abort (); \ + if (f_1_##TYPE1[i * 2 + 1] != x2_1_##TYPE1) \ + __builtin_abort (); \ + if (d_1_##TYPE2[i] != y_1_##TYPE2) \ + __builtin_abort (); \ + } \ + for (int i = n_1_##TYPE1_##TYPE2; i < n_1_##TYPE1_##TYPE2 + 1; ++i) \ + { \ + if (f_1_##TYPE1[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (f_1_##TYPE1[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (d_1_##TYPE2[i] != 0) \ + __builtin_abort (); \ + } + +#define run_2(TYPE1, TYPE2) \ + int n_2_##TYPE1_##TYPE2 = 17; \ + TYPE1 x_2_##TYPE1 = 133; \ + TYPE1 x2_2_##TYPE1 = 94; \ + TYPE2 y_2_##TYPE2 = 8672; \ + TYPE1 f_2_##TYPE1[18 * 2 + 1] = {0}; \ + TYPE2 d_2_##TYPE2[18] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_2_##TYPE1, d_2_##TYPE2, x_2_##TYPE1, x2_2_##TYPE1, \ + y_2_##TYPE2, n_2_##TYPE1_##TYPE2); \ + for (int i = 0; i < n_2_##TYPE1_##TYPE2; ++i) \ + { \ + if (f_2_##TYPE1[i * 2 + 0] != x_2_##TYPE1) \ + __builtin_abort (); \ + if (f_2_##TYPE1[i * 2 + 1] != x2_2_##TYPE1) \ + __builtin_abort (); \ + 
if (d_2_##TYPE2[i] != y_2_##TYPE2) \ + __builtin_abort (); \ + } \ + for (int i = n_2_##TYPE1_##TYPE2; i < n_2_##TYPE1_##TYPE2 + 1; ++i) \ + { \ + if (f_2_##TYPE1[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (f_2_##TYPE1[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (d_2_##TYPE2[i] != 0) \ + __builtin_abort (); \ + } + +#define run_3(TYPE1, TYPE2) \ + int n_3_##TYPE1_##TYPE2 = 32; \ + TYPE1 x_3_##TYPE1 = 233; \ + TYPE1 x2_3_##TYPE1 = 78; \ + TYPE2 y_3_##TYPE2 = 1234; \ + TYPE1 f_3_##TYPE1[33 * 2 + 1] = {0}; \ + TYPE2 d_3_##TYPE2[33] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_3_##TYPE1, d_3_##TYPE2, x_3_##TYPE1, x2_3_##TYPE1, \ + y_3_##TYPE2, n_3_##TYPE1_##TYPE2); \ + for (int i = 0; i < n_3_##TYPE1_##TYPE2; ++i) \ + { \ + if (f_3_##TYPE1[i * 2 + 0] != x_3_##TYPE1) \ + __builtin_abort (); \ + if (f_3_##TYPE1[i * 2 + 1] != x2_3_##TYPE1) \ + __builtin_abort (); \ + if (d_3_##TYPE2[i] != y_3_##TYPE2) \ + __builtin_abort (); \ + } \ + for (int i = n_3_##TYPE1_##TYPE2; i < n_3_##TYPE1_##TYPE2 + 1; ++i) \ + { \ + if (f_3_##TYPE1[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (f_3_##TYPE1[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (d_3_##TYPE2[i] != 0) \ + __builtin_abort (); \ + } + +#define run_4(TYPE1, TYPE2) \ + int n_4_##TYPE1_##TYPE2 = 128; \ + TYPE1 x_4_##TYPE1 = 222; \ + TYPE1 x2_4_##TYPE1 = 59; \ + TYPE2 y_4_##TYPE2 = 4321; \ + TYPE1 f_4_##TYPE1[129 * 2 + 1] = {0}; \ + TYPE2 d_4_##TYPE2[129] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_4_##TYPE1, d_4_##TYPE2, x_4_##TYPE1, x2_4_##TYPE1, \ + y_4_##TYPE2, n_4_##TYPE1_##TYPE2); \ + for (int i = 0; i < n_4_##TYPE1_##TYPE2; ++i) \ + { \ + if (f_4_##TYPE1[i * 2 + 0] != x_4_##TYPE1) \ + __builtin_abort (); \ + if (f_4_##TYPE1[i * 2 + 1] != x2_4_##TYPE1) \ + __builtin_abort (); \ + if (d_4_##TYPE2[i] != y_4_##TYPE2) \ + __builtin_abort (); \ + } \ + for (int i = n_4_##TYPE1_##TYPE2; i < n_4_##TYPE1_##TYPE2 + 1; ++i) \ + { \ + if (f_4_##TYPE1[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (f_4_##TYPE1[i * 2 + 1] != 0) \ + 
__builtin_abort (); \ + if (d_4_##TYPE2[i] != 0) \ + __builtin_abort (); \ + } + +#define run_5(TYPE1, TYPE2) \ + int n_5_##TYPE1_##TYPE2 = 177; \ + TYPE1 x_5_##TYPE1 = 111; \ + TYPE1 x2_5_##TYPE1 = 189; \ + TYPE2 y_5_##TYPE2 = 5555; \ + TYPE1 f_5_##TYPE1[178 * 2 + 1] = {0}; \ + TYPE2 d_5_##TYPE2[178] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_5_##TYPE1, d_5_##TYPE2, x_5_##TYPE1, x2_5_##TYPE1, \ + y_5_##TYPE2, n_5_##TYPE1_##TYPE2); \ + for (int i = 0; i < n_5_##TYPE1_##TYPE2; ++i) \ + { \ + if (f_5_##TYPE1[i * 2 + 0] != x_5_##TYPE1) \ + __builtin_abort (); \ + if (f_5_##TYPE1[i * 2 + 1] != x2_5_##TYPE1) \ + __builtin_abort (); \ + if (d_5_##TYPE2[i] != y_5_##TYPE2) \ + __builtin_abort (); \ + } \ + for (int i = n_5_##TYPE1_##TYPE2; i < n_5_##TYPE1_##TYPE2 + 1; ++i) \ + { \ + if (f_5_##TYPE1[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (f_5_##TYPE1[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (d_5_##TYPE2[i] != 0) \ + __builtin_abort (); \ + } + +#define run_6(TYPE1, TYPE2) \ + int n_6_##TYPE1_##TYPE2 = 255; \ + TYPE1 x_6_##TYPE1 = 123; \ + TYPE1 x2_6_##TYPE1 = 132; \ + TYPE2 y_6_##TYPE2 = 6655; \ + TYPE1 f_6_##TYPE1[256 * 2 + 1] = {0}; \ + TYPE2 d_6_##TYPE2[256] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_6_##TYPE1, d_6_##TYPE2, x_6_##TYPE1, x2_6_##TYPE1, \ + y_6_##TYPE2, n_6_##TYPE1_##TYPE2); \ + for (int i = 0; i < n_6_##TYPE1_##TYPE2; ++i) \ + { \ + if (f_6_##TYPE1[i * 2 + 0] != x_6_##TYPE1) \ + __builtin_abort (); \ + if (f_6_##TYPE1[i * 2 + 1] != x2_6_##TYPE1) \ + __builtin_abort (); \ + if (d_6_##TYPE2[i] != y_6_##TYPE2) \ + __builtin_abort (); \ + } \ + for (int i = n_6_##TYPE1_##TYPE2; i < n_6_##TYPE1_##TYPE2 + 1; ++i) \ + { \ + if (f_6_##TYPE1[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (f_6_##TYPE1[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (d_6_##TYPE2[i] != 0) \ + __builtin_abort (); \ + } + +#define run_7(TYPE1, TYPE2) \ + int n_7_##TYPE1_##TYPE2 = 333; \ + TYPE1 x_7_##TYPE1 = 39; \ + TYPE1 x2_7_##TYPE1 = 59; \ + TYPE2 y_7_##TYPE2 = 5968; \ + TYPE1 
f_7_##TYPE1[334 * 2 + 1] = {0}; \ + TYPE2 d_7_##TYPE2[334] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_7_##TYPE1, d_7_##TYPE2, x_7_##TYPE1, x2_7_##TYPE1, \ + y_7_##TYPE2, n_7_##TYPE1_##TYPE2); \ + for (int i = 0; i < n_7_##TYPE1_##TYPE2; ++i) \ + { \ + if (f_7_##TYPE1[i * 2 + 0] != x_7_##TYPE1) \ + __builtin_abort (); \ + if (f_7_##TYPE1[i * 2 + 1] != x2_7_##TYPE1) \ + __builtin_abort (); \ + if (d_7_##TYPE2[i] != y_7_##TYPE2) \ + __builtin_abort (); \ + } \ + for (int i = n_7_##TYPE1_##TYPE2; i < n_7_##TYPE1_##TYPE2 + 1; ++i) \ + { \ + if (f_7_##TYPE1[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (f_7_##TYPE1[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (d_7_##TYPE2[i] != 0) \ + __builtin_abort (); \ + } + +#define run_8(TYPE1, TYPE2) \ + int n_8_##TYPE1_##TYPE2 = 512; \ + TYPE1 x_8_##TYPE1 = 71; \ + TYPE1 x2_8_##TYPE1 = 255; \ + TYPE2 y_8_##TYPE2 = 3366; \ + TYPE1 f_8_##TYPE1[513 * 2 + 1] = {0}; \ + TYPE2 d_8_##TYPE2[513] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_8_##TYPE1, d_8_##TYPE2, x_8_##TYPE1, x2_8_##TYPE1, \ + y_8_##TYPE2, n_8_##TYPE1_##TYPE2); \ + for (int i = 0; i < n_8_##TYPE1_##TYPE2; ++i) \ + { \ + if (f_8_##TYPE1[i * 2 + 0] != x_8_##TYPE1) \ + __builtin_abort (); \ + if (f_8_##TYPE1[i * 2 + 1] != x2_8_##TYPE1) \ + __builtin_abort (); \ + if (d_8_##TYPE2[i] != y_8_##TYPE2) \ + __builtin_abort (); \ + } \ + for (int i = n_8_##TYPE1_##TYPE2; i < n_8_##TYPE1_##TYPE2 + 1; ++i) \ + { \ + if (f_8_##TYPE1[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (f_8_##TYPE1[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (d_8_##TYPE2[i] != 0) \ + __builtin_abort (); \ + } + +#define run_9(TYPE1, TYPE2) \ + int n_9_##TYPE1_##TYPE2 = 637; \ + TYPE1 x_9_##TYPE1 = 157; \ + TYPE1 x2_9_##TYPE1 = 89; \ + TYPE2 y_9_##TYPE2 = 5511; \ + TYPE1 f_9_##TYPE1[638 * 2 + 1] = {0}; \ + TYPE2 d_9_##TYPE2[638] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_9_##TYPE1, d_9_##TYPE2, x_9_##TYPE1, x2_9_##TYPE1, \ + y_9_##TYPE2, n_9_##TYPE1_##TYPE2); \ + for (int i = 0; i < n_9_##TYPE1_##TYPE2; ++i) \ + { \ + 
if (f_9_##TYPE1[i * 2 + 0] != x_9_##TYPE1) \
+        __builtin_abort (); \
+      if (f_9_##TYPE1[i * 2 + 1] != x2_9_##TYPE1) \
+        __builtin_abort (); \
+      if (d_9_##TYPE2[i] != y_9_##TYPE2) \
+        __builtin_abort (); \
+    } \
+  for (int i = n_9_##TYPE1_##TYPE2; i < n_9_##TYPE1_##TYPE2 + 1; ++i) \
+    { \
+      if (f_9_##TYPE1[i * 2 + 0] != 0) \
+        __builtin_abort (); \
+      if (f_9_##TYPE1[i * 2 + 1] != 0) \
+        __builtin_abort (); \
+      if (d_9_##TYPE2[i] != 0) \
+        __builtin_abort (); \
+    }
+
+#define run_10(TYPE1, TYPE2) \
+  int n_10_##TYPE1_##TYPE2 = 777; \
+  TYPE1 x_10_##TYPE1 = 203; \
+  TYPE1 x2_10_##TYPE1 = 200; \
+  TYPE2 y_10_##TYPE2 = 2023; \
+  TYPE1 f_10_##TYPE1[778 * 2 + 1] = {0}; \
+  TYPE2 d_10_##TYPE2[778] = {0}; \
+  test_1_##TYPE1_##TYPE2 (f_10_##TYPE1, d_10_##TYPE2, x_10_##TYPE1, \
+                          x2_10_##TYPE1, y_10_##TYPE2, n_10_##TYPE1_##TYPE2); \
+  for (int i = 0; i < n_10_##TYPE1_##TYPE2; ++i) \
+    { \
+      if (f_10_##TYPE1[i * 2 + 0] != x_10_##TYPE1) \
+        __builtin_abort (); \
+      if (f_10_##TYPE1[i * 2 + 1] != x2_10_##TYPE1) \
+        __builtin_abort (); \
+      if (d_10_##TYPE2[i] != y_10_##TYPE2) \
+        __builtin_abort (); \
+    } \
+  for (int i = n_10_##TYPE1_##TYPE2; i < n_10_##TYPE1_##TYPE2 + 1; ++i) \
+    { \
+      if (f_10_##TYPE1[i * 2 + 0] != 0) \
+        __builtin_abort (); \
+      if (f_10_##TYPE1[i * 2 + 1] != 0) \
+        __builtin_abort (); \
+      if (d_10_##TYPE2[i] != 0) \
+        __builtin_abort (); \
+    }
+
+#define TEST_ALL(T) \
+  T (int8_t, int16_t) \
+  T (uint8_t, uint16_t) \
+  T (int16_t, int32_t) \
+  T (uint16_t, uint32_t) \
+  T (int32_t, int64_t) \
+  T (uint32_t, uint64_t) \
+  T (float, double)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c
new file mode 100644
index 00000000000..d1c41907547
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax" } */
+
+#include "multiple_rgroup-2.h"
+
+TEST_ALL (test_1)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
new file mode 100644
index 00000000000..aa50726697c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
@@ -0,0 +1,546 @@
+#include <stddef.h>
+#include <stdint.h>
+
+#define test_1(TYPE1, TYPE2, TYPE3) \
+  void __attribute__ ((noinline, noclone)) \
+  test_1_##TYPE1_##TYPE2 (TYPE1 *__restrict f, TYPE2 *__restrict d, \
+                          TYPE3 *__restrict e, TYPE1 x, TYPE1 x2, TYPE1 x3, \
+                          TYPE1 x4, TYPE2 y, TYPE2 y2, TYPE3 z, int n) \
+  { \
+    for (int i = 0; i < n; ++i) \
+      { \
+        f[i * 4 + 0] = x; \
+        f[i * 4 + 1] = x2; \
+        f[i * 4 + 2] = x3; \
+        f[i * 4 + 3] = x4; \
+        d[i * 2 + 0] = y; \
+        d[i * 2 + 1] = y2; \
+        e[i] = z; \
+      } \
+  }
+
+#define run_1(TYPE1, TYPE2, TYPE3) \
+  int n_1_##TYPE1_##TYPE2_##TYPE3 = 1; \
+  TYPE1 x_1_##TYPE1 = 117; \
+  TYPE1 x2_1_##TYPE1 = 232; \
+  TYPE1 x3_1_##TYPE1 = 127; \
+  TYPE1 x4_1_##TYPE1 = 11; \
+  TYPE2 y_1_##TYPE2 = 9762; \
+  TYPE2 y2_1_##TYPE2 = 6279; \
+  TYPE3 z_1_##TYPE3 = 5891663; \
+  TYPE1 f_1_##TYPE1[2 * 4 + 1] = {0}; \
+  TYPE2 d_1_##TYPE2[2 * 2 + 1] = {0}; \
+  TYPE3 e_1_##TYPE3[2] = {0}; \
+  test_1_##TYPE1_##TYPE2 (f_1_##TYPE1, d_1_##TYPE2, e_1_##TYPE3, x_1_##TYPE1, \
+                          x2_1_##TYPE1, x3_1_##TYPE1, x4_1_##TYPE1, \
+                          y_1_##TYPE2, y2_1_##TYPE2, z_1_##TYPE3, \
+                          n_1_##TYPE1_##TYPE2_##TYPE3); \
+  for (int i = 0; i < n_1_##TYPE1_##TYPE2_##TYPE3; ++i) \
+    { \
+      if (f_1_##TYPE1[i * 4 + 0] != x_1_##TYPE1) \
+        __builtin_abort (); \
+      if (f_1_##TYPE1[i * 4 + 1] != x2_1_##TYPE1) \
+        __builtin_abort (); \
+      if (f_1_##TYPE1[i * 4 + 2] != x3_1_##TYPE1) \
+        __builtin_abort (); \
+      if (f_1_##TYPE1[i * 4 + 3] != x4_1_##TYPE1) \
+        __builtin_abort (); \
+      if (d_1_##TYPE2[i * 2 + 0] != y_1_##TYPE2) \
+        __builtin_abort (); \
+      if (d_1_##TYPE2[i * 2 + 1] != y2_1_##TYPE2) \
+ __builtin_abort (); \ + if (e_1_##TYPE3[i] != z_1_##TYPE3) \ + __builtin_abort (); \ + } \ + for (int i = n_1_##TYPE1_##TYPE2_##TYPE3; \ + i < n_1_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ + { \ + if (f_1_##TYPE1[i * 4 + 0] != 0) \ + __builtin_abort (); \ + if (f_1_##TYPE1[i * 4 + 1] != 0) \ + __builtin_abort (); \ + if (f_1_##TYPE1[i * 4 + 2] != 0) \ + __builtin_abort (); \ + if (f_1_##TYPE1[i * 4 + 3] != 0) \ + __builtin_abort (); \ + if (d_1_##TYPE2[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (d_1_##TYPE2[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (e_1_##TYPE3[i] != 0) \ + __builtin_abort (); \ + } + +#define run_2(TYPE1, TYPE2, TYPE3) \ + int n_2_##TYPE1_##TYPE2_##TYPE3 = 17; \ + TYPE1 x_2_##TYPE1 = 107; \ + TYPE1 x2_2_##TYPE1 = 202; \ + TYPE1 x3_2_##TYPE1 = 17; \ + TYPE1 x4_2_##TYPE1 = 53; \ + TYPE2 y_2_##TYPE2 = 5566; \ + TYPE2 y2_2_##TYPE2 = 7926; \ + TYPE3 z_2_##TYPE3 = 781545971; \ + TYPE1 f_2_##TYPE1[18 * 4 + 1] = {0}; \ + TYPE2 d_2_##TYPE2[18 * 2 + 1] = {0}; \ + TYPE3 e_2_##TYPE3[18] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_2_##TYPE1, d_2_##TYPE2, e_2_##TYPE3, x_2_##TYPE1, \ + x2_2_##TYPE1, x3_2_##TYPE1, x4_2_##TYPE1, \ + y_2_##TYPE2, y2_2_##TYPE2, z_2_##TYPE3, \ + n_2_##TYPE1_##TYPE2_##TYPE3); \ + for (int i = 0; i < n_2_##TYPE1_##TYPE2_##TYPE3; ++i) \ + { \ + if (f_2_##TYPE1[i * 4 + 0] != x_2_##TYPE1) \ + __builtin_abort (); \ + if (f_2_##TYPE1[i * 4 + 1] != x2_2_##TYPE1) \ + __builtin_abort (); \ + if (f_2_##TYPE1[i * 4 + 2] != x3_2_##TYPE1) \ + __builtin_abort (); \ + if (f_2_##TYPE1[i * 4 + 3] != x4_2_##TYPE1) \ + __builtin_abort (); \ + if (d_2_##TYPE2[i * 2 + 0] != y_2_##TYPE2) \ + __builtin_abort (); \ + if (d_2_##TYPE2[i * 2 + 1] != y2_2_##TYPE2) \ + __builtin_abort (); \ + if (e_2_##TYPE3[i] != z_2_##TYPE3) \ + __builtin_abort (); \ + } \ + for (int i = n_2_##TYPE1_##TYPE2_##TYPE3; \ + i < n_2_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ + { \ + if (f_2_##TYPE1[i * 4 + 0] != 0) \ + __builtin_abort (); \ + if (f_2_##TYPE1[i * 4 + 1] != 0) \ + 
__builtin_abort (); \ + if (f_2_##TYPE1[i * 4 + 2] != 0) \ + __builtin_abort (); \ + if (f_2_##TYPE1[i * 4 + 3] != 0) \ + __builtin_abort (); \ + if (d_2_##TYPE2[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (d_2_##TYPE2[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (e_2_##TYPE3[i] != 0) \ + __builtin_abort (); \ + } + +#define run_3(TYPE1, TYPE2, TYPE3) \ + int n_3_##TYPE1_##TYPE2_##TYPE3 = 32; \ + TYPE1 x_3_##TYPE1 = 109; \ + TYPE1 x2_3_##TYPE1 = 239; \ + TYPE1 x3_3_##TYPE1 = 151; \ + TYPE1 x4_3_##TYPE1 = 3; \ + TYPE2 y_3_##TYPE2 = 1234; \ + TYPE2 y2_3_##TYPE2 = 4321; \ + TYPE3 z_3_##TYPE3 = 145615615; \ + TYPE1 f_3_##TYPE1[33 * 4 + 1] = {0}; \ + TYPE2 d_3_##TYPE2[33 * 2 + 1] = {0}; \ + TYPE3 e_3_##TYPE3[33] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_3_##TYPE1, d_3_##TYPE2, e_3_##TYPE3, x_3_##TYPE1, \ + x2_3_##TYPE1, x3_3_##TYPE1, x4_3_##TYPE1, \ + y_3_##TYPE2, y2_3_##TYPE2, z_3_##TYPE3, \ + n_3_##TYPE1_##TYPE2_##TYPE3); \ + for (int i = 0; i < n_3_##TYPE1_##TYPE2_##TYPE3; ++i) \ + { \ + if (f_3_##TYPE1[i * 4 + 0] != x_3_##TYPE1) \ + __builtin_abort (); \ + if (f_3_##TYPE1[i * 4 + 1] != x2_3_##TYPE1) \ + __builtin_abort (); \ + if (f_3_##TYPE1[i * 4 + 2] != x3_3_##TYPE1) \ + __builtin_abort (); \ + if (f_3_##TYPE1[i * 4 + 3] != x4_3_##TYPE1) \ + __builtin_abort (); \ + if (d_3_##TYPE2[i * 2 + 0] != y_3_##TYPE2) \ + __builtin_abort (); \ + if (d_3_##TYPE2[i * 2 + 1] != y2_3_##TYPE2) \ + __builtin_abort (); \ + if (e_3_##TYPE3[i] != z_3_##TYPE3) \ + __builtin_abort (); \ + } \ + for (int i = n_3_##TYPE1_##TYPE2_##TYPE3; \ + i < n_3_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ + { \ + if (f_3_##TYPE1[i * 4 + 0] != 0) \ + __builtin_abort (); \ + if (f_3_##TYPE1[i * 4 + 1] != 0) \ + __builtin_abort (); \ + if (f_3_##TYPE1[i * 4 + 2] != 0) \ + __builtin_abort (); \ + if (f_3_##TYPE1[i * 4 + 3] != 0) \ + __builtin_abort (); \ + if (d_3_##TYPE2[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (d_3_##TYPE2[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (e_3_##TYPE3[i] != 0) \ + 
__builtin_abort (); \ + } + +#define run_4(TYPE1, TYPE2, TYPE3) \ + int n_4_##TYPE1_##TYPE2_##TYPE3 = 128; \ + TYPE1 x_4_##TYPE1 = 239; \ + TYPE1 x2_4_##TYPE1 = 132; \ + TYPE1 x3_4_##TYPE1 = 39; \ + TYPE1 x4_4_##TYPE1 = 48; \ + TYPE2 y_4_##TYPE2 = 1036; \ + TYPE2 y2_4_##TYPE2 = 3665; \ + TYPE3 z_4_##TYPE3 = 5145656; \ + TYPE1 f_4_##TYPE1[129 * 4 + 1] = {0}; \ + TYPE2 d_4_##TYPE2[129 * 2 + 1] = {0}; \ + TYPE3 e_4_##TYPE3[129] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_4_##TYPE1, d_4_##TYPE2, e_4_##TYPE3, x_4_##TYPE1, \ + x2_4_##TYPE1, x3_4_##TYPE1, x4_4_##TYPE1, \ + y_4_##TYPE2, y2_4_##TYPE2, z_4_##TYPE3, \ + n_4_##TYPE1_##TYPE2_##TYPE3); \ + for (int i = 0; i < n_4_##TYPE1_##TYPE2_##TYPE3; ++i) \ + { \ + if (f_4_##TYPE1[i * 4 + 0] != x_4_##TYPE1) \ + __builtin_abort (); \ + if (f_4_##TYPE1[i * 4 + 1] != x2_4_##TYPE1) \ + __builtin_abort (); \ + if (f_4_##TYPE1[i * 4 + 2] != x3_4_##TYPE1) \ + __builtin_abort (); \ + if (f_4_##TYPE1[i * 4 + 3] != x4_4_##TYPE1) \ + __builtin_abort (); \ + if (d_4_##TYPE2[i * 2 + 0] != y_4_##TYPE2) \ + __builtin_abort (); \ + if (d_4_##TYPE2[i * 2 + 1] != y2_4_##TYPE2) \ + __builtin_abort (); \ + if (e_4_##TYPE3[i] != z_4_##TYPE3) \ + __builtin_abort (); \ + } \ + for (int i = n_4_##TYPE1_##TYPE2_##TYPE3; \ + i < n_4_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ + { \ + if (f_4_##TYPE1[i * 4 + 0] != 0) \ + __builtin_abort (); \ + if (f_4_##TYPE1[i * 4 + 1] != 0) \ + __builtin_abort (); \ + if (f_4_##TYPE1[i * 4 + 2] != 0) \ + __builtin_abort (); \ + if (f_4_##TYPE1[i * 4 + 3] != 0) \ + __builtin_abort (); \ + if (d_4_##TYPE2[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (d_4_##TYPE2[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (e_4_##TYPE3[i] != 0) \ + __builtin_abort (); \ + } + +#define run_5(TYPE1, TYPE2, TYPE3) \ + int n_5_##TYPE1_##TYPE2_##TYPE3 = 177; \ + TYPE1 x_5_##TYPE1 = 239; \ + TYPE1 x2_5_##TYPE1 = 132; \ + TYPE1 x3_5_##TYPE1 = 39; \ + TYPE1 x4_5_##TYPE1 = 48; \ + TYPE2 y_5_##TYPE2 = 1036; \ + TYPE2 y2_5_##TYPE2 = 3665; \ + TYPE3 
z_5_##TYPE3 = 5145656; \ + TYPE1 f_5_##TYPE1[178 * 4 + 1] = {0}; \ + TYPE2 d_5_##TYPE2[178 * 2 + 1] = {0}; \ + TYPE3 e_5_##TYPE3[178] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_5_##TYPE1, d_5_##TYPE2, e_5_##TYPE3, x_5_##TYPE1, \ + x2_5_##TYPE1, x3_5_##TYPE1, x4_5_##TYPE1, \ + y_5_##TYPE2, y2_5_##TYPE2, z_5_##TYPE3, \ + n_5_##TYPE1_##TYPE2_##TYPE3); \ + for (int i = 0; i < n_5_##TYPE1_##TYPE2_##TYPE3; ++i) \ + { \ + if (f_5_##TYPE1[i * 4 + 0] != x_5_##TYPE1) \ + __builtin_abort (); \ + if (f_5_##TYPE1[i * 4 + 1] != x2_5_##TYPE1) \ + __builtin_abort (); \ + if (f_5_##TYPE1[i * 4 + 2] != x3_5_##TYPE1) \ + __builtin_abort (); \ + if (f_5_##TYPE1[i * 4 + 3] != x4_5_##TYPE1) \ + __builtin_abort (); \ + if (d_5_##TYPE2[i * 2 + 0] != y_5_##TYPE2) \ + __builtin_abort (); \ + if (d_5_##TYPE2[i * 2 + 1] != y2_5_##TYPE2) \ + __builtin_abort (); \ + if (e_5_##TYPE3[i] != z_5_##TYPE3) \ + __builtin_abort (); \ + } \ + for (int i = n_5_##TYPE1_##TYPE2_##TYPE3; \ + i < n_5_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ + { \ + if (f_5_##TYPE1[i * 4 + 0] != 0) \ + __builtin_abort (); \ + if (f_5_##TYPE1[i * 4 + 1] != 0) \ + __builtin_abort (); \ + if (f_5_##TYPE1[i * 4 + 2] != 0) \ + __builtin_abort (); \ + if (f_5_##TYPE1[i * 4 + 3] != 0) \ + __builtin_abort (); \ + if (d_5_##TYPE2[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (d_5_##TYPE2[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (e_5_##TYPE3[i] != 0) \ + __builtin_abort (); \ + } + +#define run_6(TYPE1, TYPE2, TYPE3) \ + int n_6_##TYPE1_##TYPE2_##TYPE3 = 255; \ + TYPE1 x_6_##TYPE1 = 239; \ + TYPE1 x2_6_##TYPE1 = 132; \ + TYPE1 x3_6_##TYPE1 = 39; \ + TYPE1 x4_6_##TYPE1 = 48; \ + TYPE2 y_6_##TYPE2 = 1036; \ + TYPE2 y2_6_##TYPE2 = 3665; \ + TYPE3 z_6_##TYPE3 = 5145656; \ + TYPE1 f_6_##TYPE1[256 * 4 + 1] = {0}; \ + TYPE2 d_6_##TYPE2[256 * 2 + 1] = {0}; \ + TYPE3 e_6_##TYPE3[256] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_6_##TYPE1, d_6_##TYPE2, e_6_##TYPE3, x_6_##TYPE1, \ + x2_6_##TYPE1, x3_6_##TYPE1, x4_6_##TYPE1, \ + y_6_##TYPE2, y2_6_##TYPE2, 
z_6_##TYPE3, \ + n_6_##TYPE1_##TYPE2_##TYPE3); \ + for (int i = 0; i < n_6_##TYPE1_##TYPE2_##TYPE3; ++i) \ + { \ + if (f_6_##TYPE1[i * 4 + 0] != x_6_##TYPE1) \ + __builtin_abort (); \ + if (f_6_##TYPE1[i * 4 + 1] != x2_6_##TYPE1) \ + __builtin_abort (); \ + if (f_6_##TYPE1[i * 4 + 2] != x3_6_##TYPE1) \ + __builtin_abort (); \ + if (f_6_##TYPE1[i * 4 + 3] != x4_6_##TYPE1) \ + __builtin_abort (); \ + if (d_6_##TYPE2[i * 2 + 0] != y_6_##TYPE2) \ + __builtin_abort (); \ + if (d_6_##TYPE2[i * 2 + 1] != y2_6_##TYPE2) \ + __builtin_abort (); \ + if (e_6_##TYPE3[i] != z_6_##TYPE3) \ + __builtin_abort (); \ + } \ + for (int i = n_6_##TYPE1_##TYPE2_##TYPE3; \ + i < n_6_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ + { \ + if (f_6_##TYPE1[i * 4 + 0] != 0) \ + __builtin_abort (); \ + if (f_6_##TYPE1[i * 4 + 1] != 0) \ + __builtin_abort (); \ + if (f_6_##TYPE1[i * 4 + 2] != 0) \ + __builtin_abort (); \ + if (f_6_##TYPE1[i * 4 + 3] != 0) \ + __builtin_abort (); \ + if (d_6_##TYPE2[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (d_6_##TYPE2[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (e_6_##TYPE3[i] != 0) \ + __builtin_abort (); \ + } + +#define run_7(TYPE1, TYPE2, TYPE3) \ + int n_7_##TYPE1_##TYPE2_##TYPE3 = 333; \ + TYPE1 x_7_##TYPE1 = 239; \ + TYPE1 x2_7_##TYPE1 = 132; \ + TYPE1 x3_7_##TYPE1 = 39; \ + TYPE1 x4_7_##TYPE1 = 48; \ + TYPE2 y_7_##TYPE2 = 1036; \ + TYPE2 y2_7_##TYPE2 = 3665; \ + TYPE3 z_7_##TYPE3 = 5145656; \ + TYPE1 f_7_##TYPE1[334 * 4 + 1] = {0}; \ + TYPE2 d_7_##TYPE2[334 * 2 + 1] = {0}; \ + TYPE3 e_7_##TYPE3[334] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_7_##TYPE1, d_7_##TYPE2, e_7_##TYPE3, x_7_##TYPE1, \ + x2_7_##TYPE1, x3_7_##TYPE1, x4_7_##TYPE1, \ + y_7_##TYPE2, y2_7_##TYPE2, z_7_##TYPE3, \ + n_7_##TYPE1_##TYPE2_##TYPE3); \ + for (int i = 0; i < n_7_##TYPE1_##TYPE2_##TYPE3; ++i) \ + { \ + if (f_7_##TYPE1[i * 4 + 0] != x_7_##TYPE1) \ + __builtin_abort (); \ + if (f_7_##TYPE1[i * 4 + 1] != x2_7_##TYPE1) \ + __builtin_abort (); \ + if (f_7_##TYPE1[i * 4 + 2] != 
x3_7_##TYPE1) \ + __builtin_abort (); \ + if (f_7_##TYPE1[i * 4 + 3] != x4_7_##TYPE1) \ + __builtin_abort (); \ + if (d_7_##TYPE2[i * 2 + 0] != y_7_##TYPE2) \ + __builtin_abort (); \ + if (d_7_##TYPE2[i * 2 + 1] != y2_7_##TYPE2) \ + __builtin_abort (); \ + if (e_7_##TYPE3[i] != z_7_##TYPE3) \ + __builtin_abort (); \ + } \ + for (int i = n_7_##TYPE1_##TYPE2_##TYPE3; \ + i < n_7_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ + { \ + if (f_7_##TYPE1[i * 4 + 0] != 0) \ + __builtin_abort (); \ + if (f_7_##TYPE1[i * 4 + 1] != 0) \ + __builtin_abort (); \ + if (f_7_##TYPE1[i * 4 + 2] != 0) \ + __builtin_abort (); \ + if (f_7_##TYPE1[i * 4 + 3] != 0) \ + __builtin_abort (); \ + if (d_7_##TYPE2[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (d_7_##TYPE2[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (e_7_##TYPE3[i] != 0) \ + __builtin_abort (); \ + } + +#define run_8(TYPE1, TYPE2, TYPE3) \ + int n_8_##TYPE1_##TYPE2_##TYPE3 = 512; \ + TYPE1 x_8_##TYPE1 = 239; \ + TYPE1 x2_8_##TYPE1 = 132; \ + TYPE1 x3_8_##TYPE1 = 39; \ + TYPE1 x4_8_##TYPE1 = 48; \ + TYPE2 y_8_##TYPE2 = 1036; \ + TYPE2 y2_8_##TYPE2 = 3665; \ + TYPE3 z_8_##TYPE3 = 5145656; \ + TYPE1 f_8_##TYPE1[513 * 4 + 1] = {0}; \ + TYPE2 d_8_##TYPE2[513 * 2 + 1] = {0}; \ + TYPE3 e_8_##TYPE3[513] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_8_##TYPE1, d_8_##TYPE2, e_8_##TYPE3, x_8_##TYPE1, \ + x2_8_##TYPE1, x3_8_##TYPE1, x4_8_##TYPE1, \ + y_8_##TYPE2, y2_8_##TYPE2, z_8_##TYPE3, \ + n_8_##TYPE1_##TYPE2_##TYPE3); \ + for (int i = 0; i < n_8_##TYPE1_##TYPE2_##TYPE3; ++i) \ + { \ + if (f_8_##TYPE1[i * 4 + 0] != x_8_##TYPE1) \ + __builtin_abort (); \ + if (f_8_##TYPE1[i * 4 + 1] != x2_8_##TYPE1) \ + __builtin_abort (); \ + if (f_8_##TYPE1[i * 4 + 2] != x3_8_##TYPE1) \ + __builtin_abort (); \ + if (f_8_##TYPE1[i * 4 + 3] != x4_8_##TYPE1) \ + __builtin_abort (); \ + if (d_8_##TYPE2[i * 2 + 0] != y_8_##TYPE2) \ + __builtin_abort (); \ + if (d_8_##TYPE2[i * 2 + 1] != y2_8_##TYPE2) \ + __builtin_abort (); \ + if (e_8_##TYPE3[i] != z_8_##TYPE3) \ + 
__builtin_abort (); \ + } \ + for (int i = n_8_##TYPE1_##TYPE2_##TYPE3; \ + i < n_8_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ + { \ + if (f_8_##TYPE1[i * 4 + 0] != 0) \ + __builtin_abort (); \ + if (f_8_##TYPE1[i * 4 + 1] != 0) \ + __builtin_abort (); \ + if (f_8_##TYPE1[i * 4 + 2] != 0) \ + __builtin_abort (); \ + if (f_8_##TYPE1[i * 4 + 3] != 0) \ + __builtin_abort (); \ + if (d_8_##TYPE2[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (d_8_##TYPE2[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (e_8_##TYPE3[i] != 0) \ + __builtin_abort (); \ + } + +#define run_9(TYPE1, TYPE2, TYPE3) \ + int n_9_##TYPE1_##TYPE2_##TYPE3 = 637; \ + TYPE1 x_9_##TYPE1 = 222; \ + TYPE1 x2_9_##TYPE1 = 111; \ + TYPE1 x3_9_##TYPE1 = 11; \ + TYPE1 x4_9_##TYPE1 = 7; \ + TYPE2 y_9_##TYPE2 = 2034; \ + TYPE2 y2_9_##TYPE2 = 6987; \ + TYPE3 z_9_##TYPE3 = 1564616; \ + TYPE1 f_9_##TYPE1[638 * 4 + 1] = {0}; \ + TYPE2 d_9_##TYPE2[638 * 2 + 1] = {0}; \ + TYPE3 e_9_##TYPE3[638] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_9_##TYPE1, d_9_##TYPE2, e_9_##TYPE3, x_9_##TYPE1, \ + x2_9_##TYPE1, x3_9_##TYPE1, x4_9_##TYPE1, \ + y_9_##TYPE2, y2_9_##TYPE2, z_9_##TYPE3, \ + n_9_##TYPE1_##TYPE2_##TYPE3); \ + for (int i = 0; i < n_9_##TYPE1_##TYPE2_##TYPE3; ++i) \ + { \ + if (f_9_##TYPE1[i * 4 + 0] != x_9_##TYPE1) \ + __builtin_abort (); \ + if (f_9_##TYPE1[i * 4 + 1] != x2_9_##TYPE1) \ + __builtin_abort (); \ + if (f_9_##TYPE1[i * 4 + 2] != x3_9_##TYPE1) \ + __builtin_abort (); \ + if (f_9_##TYPE1[i * 4 + 3] != x4_9_##TYPE1) \ + __builtin_abort (); \ + if (d_9_##TYPE2[i * 2 + 0] != y_9_##TYPE2) \ + __builtin_abort (); \ + if (d_9_##TYPE2[i * 2 + 1] != y2_9_##TYPE2) \ + __builtin_abort (); \ + if (e_9_##TYPE3[i] != z_9_##TYPE3) \ + __builtin_abort (); \ + } \ + for (int i = n_9_##TYPE1_##TYPE2_##TYPE3; \ + i < n_9_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ + { \ + if (f_9_##TYPE1[i * 4 + 0] != 0) \ + __builtin_abort (); \ + if (f_9_##TYPE1[i * 4 + 1] != 0) \ + __builtin_abort (); \ + if (f_9_##TYPE1[i * 4 + 2] != 0) \ + 
__builtin_abort (); \ + if (f_9_##TYPE1[i * 4 + 3] != 0) \ + __builtin_abort (); \ + if (d_9_##TYPE2[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (d_9_##TYPE2[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (e_9_##TYPE3[i] != 0) \ + __builtin_abort (); \ + } + +#define run_10(TYPE1, TYPE2, TYPE3) \ + int n_10_##TYPE1_##TYPE2_##TYPE3 = 777; \ + TYPE1 x_10_##TYPE1 = 222; \ + TYPE1 x2_10_##TYPE1 = 111; \ + TYPE1 x3_10_##TYPE1 = 11; \ + TYPE1 x4_10_##TYPE1 = 7; \ + TYPE2 y_10_##TYPE2 = 2034; \ + TYPE2 y2_10_##TYPE2 = 6987; \ + TYPE3 z_10_##TYPE3 = 1564616; \ + TYPE1 f_10_##TYPE1[778 * 4 + 1] = {0}; \ + TYPE2 d_10_##TYPE2[778 * 2 + 1] = {0}; \ + TYPE3 e_10_##TYPE3[778] = {0}; \ + test_1_##TYPE1_##TYPE2 (f_10_##TYPE1, d_10_##TYPE2, e_10_##TYPE3, x_10_##TYPE1, \ + x2_10_##TYPE1, x3_10_##TYPE1, x4_10_##TYPE1, \ + y_10_##TYPE2, y2_10_##TYPE2, z_10_##TYPE3, \ + n_10_##TYPE1_##TYPE2_##TYPE3); \ + for (int i = 0; i < n_10_##TYPE1_##TYPE2_##TYPE3; ++i) \ + { \ + if (f_10_##TYPE1[i * 4 + 0] != x_10_##TYPE1) \ + __builtin_abort (); \ + if (f_10_##TYPE1[i * 4 + 1] != x2_10_##TYPE1) \ + __builtin_abort (); \ + if (f_10_##TYPE1[i * 4 + 2] != x3_10_##TYPE1) \ + __builtin_abort (); \ + if (f_10_##TYPE1[i * 4 + 3] != x4_10_##TYPE1) \ + __builtin_abort (); \ + if (d_10_##TYPE2[i * 2 + 0] != y_10_##TYPE2) \ + __builtin_abort (); \ + if (d_10_##TYPE2[i * 2 + 1] != y2_10_##TYPE2) \ + __builtin_abort (); \ + if (e_10_##TYPE3[i] != z_10_##TYPE3) \ + __builtin_abort (); \ + } \ + for (int i = n_10_##TYPE1_##TYPE2_##TYPE3; \ + i < n_10_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ + { \ + if (f_10_##TYPE1[i * 4 + 0] != 0) \ + __builtin_abort (); \ + if (f_10_##TYPE1[i * 4 + 1] != 0) \ + __builtin_abort (); \ + if (f_10_##TYPE1[i * 4 + 2] != 0) \ + __builtin_abort (); \ + if (f_10_##TYPE1[i * 4 + 3] != 0) \ + __builtin_abort (); \ + if (d_10_##TYPE2[i * 2 + 0] != 0) \ + __builtin_abort (); \ + if (d_10_##TYPE2[i * 2 + 1] != 0) \ + __builtin_abort (); \ + if (e_10_##TYPE3[i] != 0) \ + __builtin_abort 
(); \ + } + +#define TEST_ALL(T) \ + T (int8_t, int16_t, int32_t) \ + T (uint8_t, uint16_t, uint32_t) \ + T (int16_t, int32_t, int64_t) \ + T (uint16_t, uint32_t, uint64_t) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.s b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.s new file mode 100644 index 00000000000..64b36fe5092 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.s @@ -0,0 +1,774 @@ + .file "multiple_rgroup-2.c" + .option nopic + .attribute arch, "rv32i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_v1p0_zicsr2p0_zifencei2p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0" + .attribute unaligned_access, 0 + .attribute stack_align, 16 + .text + .align 1 + .globl test_1_TYPE1_int16_t + .type test_1_TYPE1_int16_t, @function +test_1_TYPE1_int16_t: + addi sp,sp,-48 + lw t5,56(sp) + lh t3,48(sp) + lw t6,52(sp) + ble t5,zero,.L1 + or t1,a2,a1 + or t1,a0,t1 + andi t1,t1,15 + bne t1,zero,.L3 + andi a4,a4,0xff + andi a3,a3,0xff + slli a4,a4,8 + andi a5,a5,0xff + slli a5,a5,16 + or t4,a3,a4 + vsetvli t1,zero,e8,m1,ta,ma + li t2,16777216 + addi t2,t2,-1 + or t1,t4,a5 + slli a6,a6,24 + and t1,t1,t2 + vmv.v.i v1,0 + sw s0,44(sp) + vs1r.v v1,0(sp) + or s0,t1,a6 + sw s0,0(sp) + lw s0,4(sp) + li t4,-65536 + addi t4,t4,255 + andi t0,s0,-256 + or t0,t0,a3 + and s0,t0,t4 + li t1,-16711680 + addi t1,t1,-1 + or s0,s0,a4 + and t0,s0,t1 + or t0,t0,a5 + and t0,t0,t2 + or s0,t0,a6 + sw s0,4(sp) + lw s0,8(sp) + slli a7,a7,16 + li t0,65536 + srli a7,a7,16 + addi t0,t0,-1 + sw s1,40(sp) + slli s1,t3,16 + vsetvli t3,zero,e16,m1,ta,ma + sw s2,36(sp) + sw s3,32(sp) + andi s2,s0,-256 + addi t3,sp,16 + and s0,a7,t0 + vmv.v.i v1,0 + or s0,s0,s1 + vs1r.v v1,0(t3) + sw s0,16(sp) + lw s0,20(sp) + li t3,-65536 + or s2,s2,a3 + and s0,t3,s0 + and s3,s2,t4 + or s0,s0,a7 + or s3,s3,a4 + and s0,s0,t0 + and s2,s3,t1 + or s0,s0,s1 + or s2,s2,a5 + sw s0,20(sp) + lw s0,12(sp) + and s2,s2,t2 + or 
s2,s2,a6 + sw s2,8(sp) + andi s2,s0,-256 + lw s0,24(sp) + or s2,s2,a3 + and t4,s2,t4 + and s0,t3,s0 + or a3,s0,a7 + and a3,a3,t0 + or t4,t4,a4 + or a4,a3,s1 + sw a4,24(sp) + lw a4,28(sp) + and t1,t4,t1 + or t1,t1,a5 + and t3,t3,a4 + or t3,t3,a7 + and t1,t1,t2 + and t3,t3,t0 + or a4,t1,a6 + sw a4,12(sp) + or a4,t3,s1 + slli a7,t5,2 + sw a4,28(sp) + li a5,48 + vsetvli a4,zero,e32,m1,ta,ma + mv t3,a7 + vmv.v.x v1,t6 + bltu a7,a5,.L14 + li a5,32 + addi t3,t3,-48 + mv t4,a7 + bltu a7,a5,.L15 +.L5: + li a5,16 + addi t4,t4,-32 + mv t1,a7 + bltu a7,a5,.L16 +.L6: + addi t1,t1,-16 +.L7: + vsetvli a6,a7,e8,m4,tu,mu + vl1re8.v v2,0(sp) + addi a4,a0,16 + vsetvli zero,a6,e8,m1,ta,ma + vse8.v v2,0(a0) + addi a5,a0,32 + vsetvli a3,t1,e8,m4,tu,mu + vsetvli zero,a3,e8,m1,ta,ma + vse8.v v2,0(a4) + addi t6,a0,48 + vsetvli a4,t4,e8,m4,tu,mu + vsetvli zero,a4,e8,m1,ta,ma + vse8.v v2,0(a5) + srli t5,a6,1 + vsetvli a5,t3,e8,m4,tu,mu + addi s0,sp,16 + vsetvli zero,a5,e8,m1,ta,ma + vse8.v v2,0(t6) + vl1re16.v v2,0(s0) + srli t6,a3,1 + vsetvli zero,t5,e16,m1,ta,ma + addi t5,a1,16 + vse16.v v2,0(a1) + vsetvli zero,t6,e16,m1,ta,ma + srli t6,a4,1 + vse16.v v2,0(t5) + addi t5,a1,32 + vsetvli zero,t6,e16,m1,ta,ma + srli t6,a5,1 + vse16.v v2,0(t5) + addi t5,a1,48 + vsetvli zero,t6,e16,m1,ta,ma + vse16.v v2,0(t5) + srli t5,a6,2 + vsetvli zero,t5,e32,m1,ta,ma + addi t6,a2,16 + vse32.v v1,0(a2) + srli a3,a3,2 + vsetvli zero,a3,e32,m1,ta,ma + addi t5,a2,32 + vse32.v v1,0(t6) + srli a4,a4,2 + addi a3,a2,48 + vsetvli zero,a4,e32,m1,ta,ma + srli a5,a5,2 + vse32.v v1,0(t5) + sub a7,a7,a6 + vsetvli zero,a5,e32,m1,ta,ma + vse32.v v1,0(a3) + sub t1,t1,a6 + sub t4,t4,a6 + sub t3,t3,a6 + addi a0,a0,64 + addi a1,a1,64 + addi a2,a2,64 + bne a7,zero,.L7 + lw s0,44(sp) + lw s1,40(sp) + lw s2,36(sp) + lw s3,32(sp) +.L1: + addi sp,sp,48 + jr ra +.L16: + li t1,16 + j .L6 +.L15: + li t4,32 + li a5,16 + addi t4,t4,-32 + mv t1,a7 + bgeu a7,a5,.L6 + j .L16 +.L14: + li t3,48 + li a5,32 + addi t3,t3,-48 + mv t4,a7 + bgeu 
a7,a5,.L5 + j .L15 +.L3: + slli t5,t5,2 + add t5,a0,t5 +.L9: + sb a3,0(a0) + sb a4,1(a0) + sb a5,2(a0) + sb a6,3(a0) + sh a7,0(a1) + sh t3,2(a1) + sw t6,0(a2) + addi a0,a0,4 + addi a1,a1,4 + addi a2,a2,4 + bne a0,t5,.L9 + addi sp,sp,48 + jr ra + .size test_1_TYPE1_int16_t, .-test_1_TYPE1_int16_t + .align 1 + .globl test_1_TYPE1_uint16_t + .type test_1_TYPE1_uint16_t, @function +test_1_TYPE1_uint16_t: + addi sp,sp,-48 + lw t1,56(sp) + lhu t4,48(sp) + lw t5,52(sp) + ble t1,zero,.L17 + or t3,a2,a1 + or t3,a0,t3 + andi t3,t3,15 + bne t3,zero,.L19 + slli a4,a4,8 + slli a5,a5,16 + or t6,a3,a4 + vsetvli t3,zero,e8,m1,ta,ma + li t2,16777216 + addi t2,t2,-1 + or t3,t6,a5 + slli a6,a6,24 + and t3,t3,t2 + vmv.v.i v1,0 + sw s0,44(sp) + vs1r.v v1,0(sp) + or s0,t3,a6 + sw s0,0(sp) + lw s0,4(sp) + li t6,-65536 + addi t6,t6,255 + andi t0,s0,-256 + or t0,t0,a3 + and s0,t0,t6 + li t3,-16711680 + addi t3,t3,-1 + or s0,s0,a4 + and t0,s0,t3 + or t0,t0,a5 + and t0,t0,t2 + or s0,t0,a6 + sw s0,4(sp) + lw s0,8(sp) + li t0,65536 + addi t0,t0,-1 + sw s2,36(sp) + andi s2,s0,-256 + slli s0,t4,16 + vsetvli t4,zero,e16,m1,ta,ma + sw s1,40(sp) + sw s3,32(sp) + addi t4,sp,16 + and s1,a7,t0 + vmv.v.i v1,0 + or s1,s1,s0 + vs1r.v v1,0(t4) + sw s1,16(sp) + lw s1,20(sp) + li t4,-65536 + or s2,s2,a3 + and s1,t4,s1 + and s3,s2,t6 + or s1,s1,a7 + or s3,s3,a4 + and s1,s1,t0 + and s2,s3,t3 + or s1,s1,s0 + or s2,s2,a5 + sw s1,20(sp) + lw s1,12(sp) + and s2,s2,t2 + or s2,s2,a6 + sw s2,8(sp) + andi s2,s1,-256 + lw s1,24(sp) + or s2,s2,a3 + and t6,s2,t6 + and s1,t4,s1 + or a3,s1,a7 + and a3,a3,t0 + or t6,t6,a4 + or a4,a3,s0 + sw a4,24(sp) + lw a4,28(sp) + and t3,t6,t3 + or t3,t3,a5 + and t4,t4,a4 + and t3,t3,t2 + or t4,t4,a7 + or a4,t3,a6 + and t4,t4,t0 + sw a4,12(sp) + or a4,t4,s0 + slli t1,t1,2 + sw a4,28(sp) + li a5,48 + vsetvli a4,zero,e32,m1,ta,ma + mv t3,t1 + vmv.v.x v1,t5 + bltu t1,a5,.L29 + li a5,32 + addi t3,t3,-48 + mv t4,t1 + bltu t1,a5,.L30 +.L21: + li a5,16 + addi t4,t4,-32 + mv a7,t1 + bltu 
t1,a5,.L31 +.L22: + addi a7,a7,-16 +.L23: + vsetvli a6,t1,e8,m4,tu,mu + vl1re8.v v2,0(sp) + addi a4,a0,16 + vsetvli zero,a6,e8,m1,ta,ma + vse8.v v2,0(a0) + addi a5,a0,32 + vsetvli a3,a7,e8,m4,tu,mu + vsetvli zero,a3,e8,m1,ta,ma + vse8.v v2,0(a4) + addi t6,a0,48 + vsetvli a4,t4,e8,m4,tu,mu + vsetvli zero,a4,e8,m1,ta,ma + vse8.v v2,0(a5) + srli t5,a6,1 + vsetvli a5,t3,e8,m4,tu,mu + addi s0,sp,16 + vsetvli zero,a5,e8,m1,ta,ma + vse8.v v2,0(t6) + vl1re16.v v2,0(s0) + srli t6,a3,1 + vsetvli zero,t5,e16,m1,ta,ma + addi t5,a1,16 + vse16.v v2,0(a1) + vsetvli zero,t6,e16,m1,ta,ma + srli t6,a4,1 + vse16.v v2,0(t5) + addi t5,a1,32 + vsetvli zero,t6,e16,m1,ta,ma + srli t6,a5,1 + vse16.v v2,0(t5) + addi t5,a1,48 + vsetvli zero,t6,e16,m1,ta,ma + vse16.v v2,0(t5) + srli t5,a6,2 + vsetvli zero,t5,e32,m1,ta,ma + addi t6,a2,16 + vse32.v v1,0(a2) + srli a3,a3,2 + vsetvli zero,a3,e32,m1,ta,ma + addi t5,a2,32 + vse32.v v1,0(t6) + srli a4,a4,2 + addi a3,a2,48 + vsetvli zero,a4,e32,m1,ta,ma + srli a5,a5,2 + vse32.v v1,0(t5) + sub t1,t1,a6 + vsetvli zero,a5,e32,m1,ta,ma + vse32.v v1,0(a3) + sub a7,a7,a6 + sub t4,t4,a6 + sub t3,t3,a6 + addi a0,a0,64 + addi a1,a1,64 + addi a2,a2,64 + bne t1,zero,.L23 + lw s0,44(sp) + lw s1,40(sp) + lw s2,36(sp) + lw s3,32(sp) +.L17: + addi sp,sp,48 + jr ra +.L31: + li a7,16 + j .L22 +.L30: + li t4,32 + li a5,16 + addi t4,t4,-32 + mv a7,t1 + bgeu t1,a5,.L22 + j .L31 +.L29: + li t3,48 + li a5,32 + addi t3,t3,-48 + mv t4,t1 + bgeu t1,a5,.L21 + j .L30 +.L19: + slli t1,t1,2 + add t1,a0,t1 +.L25: + sb a3,0(a0) + sb a4,1(a0) + sb a5,2(a0) + sb a6,3(a0) + sh a7,0(a1) + sh t4,2(a1) + sw t5,0(a2) + addi a0,a0,4 + addi a1,a1,4 + addi a2,a2,4 + bne a0,t1,.L25 + addi sp,sp,48 + jr ra + .size test_1_TYPE1_uint16_t, .-test_1_TYPE1_uint16_t + .align 1 + .globl test_1_TYPE1_int32_t + .type test_1_TYPE1_int32_t, @function +test_1_TYPE1_int32_t: + addi sp,sp,-64 + lw t1,80(sp) + lw t3,64(sp) + lw t5,72(sp) + lw t6,76(sp) + ble t1,zero,.L32 + addi t0,t1,-1 + li t4,6 + bleu 
t0,t4,.L34 + or t4,a2,a1 + or t4,a0,t4 + andi t4,t4,15 + beq t4,zero,.L44 +.L34: + slli t1,t1,3 + add t1,a0,t1 +.L40: + sh a3,0(a0) + sh a4,2(a0) + sh a5,4(a0) + sh a6,6(a0) + sw a7,0(a1) + sw t3,4(a1) + sw t5,0(a2) + sw t6,4(a2) + addi a0,a0,8 + addi a1,a1,8 + addi a2,a2,8 + bne a0,t1,.L40 +.L32: + addi sp,sp,64 + jr ra +.L44: + sw s0,60(sp) + slli t2,a3,16 + li s0,65536 + addi s0,s0,-1 + vsetvli t4,zero,e16,m1,ta,ma + srli t2,t2,16 + slli a4,a4,16 + and a3,t2,s0 + addi t4,sp,8 + or a3,a3,a4 + vmv.v.i v1,0 + vs1r.v v1,0(t4) + sw a3,8(sp) + lw a3,12(sp) + li t4,-65536 + slli a5,a5,16 + and t0,t4,a3 + srli a5,a5,16 + or t0,t0,a5 + slli a6,a6,16 + and t0,t0,s0 + or a3,t0,a6 + sw a3,12(sp) + lw a3,16(sp) + sw t3,28(sp) + sw t3,36(sp) + and a3,t4,a3 + or a3,a3,t2 + and a3,a3,s0 + or a4,a3,a4 + sw a4,16(sp) + lw a4,20(sp) + sw a7,24(sp) + sw a7,32(sp) + and t4,t4,a4 + or a5,t4,a5 + and t4,a5,s0 + or a4,t4,a6 + sw a4,20(sp) + sw t5,40(sp) + sw t6,44(sp) + slli a5,t1,2 + addi s0,sp,40 + li a3,24 + vsetvli a4,zero,e64,m1,ta,ma + mv t3,a5 + vlse64.v v1,0(s0),zero + bltu a5,a3,.L45 + li a4,16 + addi t3,t3,-24 + mv t4,a5 + bltu a5,a4,.L46 +.L36: + li a4,8 + addi t4,t4,-16 + mv t1,a5 + bltu a5,a4,.L47 +.L37: + addi t1,t1,-8 +.L38: + addi a3,sp,8 + vsetvli a4,a5,e8,m2,tu,mu + vl1re16.v v2,0(a3) + addi a6,a0,16 + vsetvli zero,a4,e16,m1,ta,ma + vse16.v v2,0(a0) + addi a3,a0,32 + vsetvli a7,t1,e8,m2,tu,mu + vsetvli zero,a7,e16,m1,ta,ma + vse16.v v2,0(a6) + addi t6,a0,48 + vsetvli a6,t4,e8,m2,tu,mu + vsetvli zero,a6,e16,m1,ta,ma + vse16.v v2,0(a3) + srli t5,a4,1 + vsetvli a3,t3,e8,m2,tu,mu + addi s0,sp,24 + vsetvli zero,a3,e16,m1,ta,ma + vse16.v v2,0(t6) + vl1re32.v v2,0(s0) + srli t6,a7,1 + vsetvli zero,t5,e32,m1,ta,ma + addi t5,a1,16 + vse32.v v2,0(a1) + vsetvli zero,t6,e32,m1,ta,ma + srli t6,a6,1 + vse32.v v2,0(t5) + addi t5,a1,32 + vsetvli zero,t6,e32,m1,ta,ma + srli t6,a3,1 + vse32.v v2,0(t5) + addi t5,a1,48 + vsetvli zero,t6,e32,m1,ta,ma + vse32.v v2,0(t5) + srli t5,a4,2 + 
vsetvli zero,t5,e64,m1,ta,ma + addi t6,a2,16 + vse64.v v1,0(a2) + srli a7,a7,2 + vsetvli zero,a7,e64,m1,ta,ma + addi t5,a2,32 + vse64.v v1,0(t6) + srli a6,a6,2 + addi a7,a2,48 + vsetvli zero,a6,e64,m1,ta,ma + srli a3,a3,2 + vse64.v v1,0(t5) + sub a5,a5,a4 + vsetvli zero,a3,e64,m1,ta,ma + vse64.v v1,0(a7) + sub t1,t1,a4 + sub t4,t4,a4 + sub t3,t3,a4 + addi a0,a0,64 + addi a1,a1,64 + addi a2,a2,64 + bne a5,zero,.L38 + lw s0,60(sp) + addi sp,sp,64 + jr ra +.L47: + li t1,8 + j .L37 +.L46: + li t4,16 + li a4,8 + addi t4,t4,-16 + mv t1,a5 + bgeu a5,a4,.L37 + j .L47 +.L45: + li t3,24 + li a4,16 + addi t3,t3,-24 + mv t4,a5 + bgeu a5,a4,.L36 + j .L46 + .size test_1_TYPE1_int32_t, .-test_1_TYPE1_int32_t + .align 1 + .globl test_1_TYPE1_uint32_t + .type test_1_TYPE1_uint32_t, @function +test_1_TYPE1_uint32_t: + addi sp,sp,-48 + lw t1,64(sp) + lw t5,48(sp) + lw t3,56(sp) + lw t4,60(sp) + ble t1,zero,.L48 + addi t0,t1,-1 + li t6,6 + bleu t0,t6,.L50 + or t6,a2,a1 + or t6,a0,t6 + andi t6,t6,15 + beq t6,zero,.L60 +.L50: + slli t1,t1,3 + add t1,a0,t1 +.L56: + sh a3,0(a0) + sh a4,2(a0) + sh a5,4(a0) + sh a6,6(a0) + sw a7,0(a1) + sw t5,4(a1) + sw t3,0(a2) + sw t4,4(a2) + addi a0,a0,8 + addi a1,a1,8 + addi a2,a2,8 + bne a0,t1,.L56 +.L48: + addi sp,sp,48 + jr ra +.L60: + li t0,65536 + addi t0,t0,-1 + vsetvli t6,zero,e16,m1,ta,ma + slli a4,a4,16 + and t2,a3,t0 + addi t6,sp,8 + or t2,t2,a4 + vmv.v.i v1,0 + vs1r.v v1,0(t6) + sw t2,8(sp) + lw t2,12(sp) + li t6,-65536 + slli a6,a6,16 + and t2,t6,t2 + or t2,t2,a5 + and t2,t2,t0 + or t2,t2,a6 + sw t2,12(sp) + lw t2,16(sp) + sw t3,40(sp) + sw a7,24(sp) + and t2,t6,t2 + or a3,t2,a3 + and a3,a3,t0 + or a4,a3,a4 + sw a4,16(sp) + lw a4,20(sp) + sw t5,28(sp) + sw a7,32(sp) + and t6,t6,a4 + or t6,t6,a5 + and t6,t6,t0 + or a5,t6,a6 + sw a5,20(sp) + sw t4,44(sp) + slli a4,t1,2 + sw t5,36(sp) + addi a6,sp,40 + li a3,24 + vsetvli a5,zero,e64,m1,ta,ma + mv t3,a4 + vlse64.v v1,0(a6),zero + bltu a4,a3,.L61 + li a5,16 + addi t3,t3,-24 + mv t4,a4 + bltu 
a4,a5,.L62 +.L52: + li a5,8 + addi t4,t4,-16 + mv t1,a4 + bltu a4,a5,.L63 +.L53: + addi t1,t1,-8 +.L54: + addi a3,sp,8 + vsetvli a5,a4,e8,m2,tu,mu + vl1re16.v v2,0(a3) + addi a6,a0,16 + vsetvli zero,a5,e16,m1,ta,ma + vse16.v v2,0(a0) + addi a3,a0,32 + vsetvli a7,t1,e8,m2,tu,mu + vsetvli zero,a7,e16,m1,ta,ma + vse16.v v2,0(a6) + addi t6,a0,48 + vsetvli a6,t4,e8,m2,tu,mu + vsetvli zero,a6,e16,m1,ta,ma + vse16.v v2,0(a3) + srli t5,a5,1 + vsetvli a3,t3,e8,m2,tu,mu + vsetvli zero,a3,e16,m1,ta,ma + vse16.v v2,0(t6) + addi t6,sp,24 + vl1re32.v v2,0(t6) + vsetvli zero,t5,e32,m1,ta,ma + srli t6,a7,1 + vse32.v v2,0(a1) + addi t5,a1,16 + vsetvli zero,t6,e32,m1,ta,ma + srli t6,a6,1 + vse32.v v2,0(t5) + addi t5,a1,32 + vsetvli zero,t6,e32,m1,ta,ma + srli t6,a3,1 + vse32.v v2,0(t5) + addi t5,a1,48 + vsetvli zero,t6,e32,m1,ta,ma + vse32.v v2,0(t5) + srli t5,a5,2 + vsetvli zero,t5,e64,m1,ta,ma + addi t6,a2,16 + vse64.v v1,0(a2) + srli a7,a7,2 + vsetvli zero,a7,e64,m1,ta,ma + addi t5,a2,32 + vse64.v v1,0(t6) + srli a6,a6,2 + addi a7,a2,48 + vsetvli zero,a6,e64,m1,ta,ma + srli a3,a3,2 + vse64.v v1,0(t5) + sub a4,a4,a5 + vsetvli zero,a3,e64,m1,ta,ma + vse64.v v1,0(a7) + sub t1,t1,a5 + sub t4,t4,a5 + sub t3,t3,a5 + addi a0,a0,64 + addi a1,a1,64 + addi a2,a2,64 + bne a4,zero,.L54 + addi sp,sp,48 + jr ra +.L63: + li t1,8 + j .L53 +.L62: + li t4,16 + li a5,8 + addi t4,t4,-16 + mv t1,a4 + bgeu a4,a5,.L53 + j .L63 +.L61: + li t3,24 + li a5,16 + addi t3,t3,-24 + mv t4,a4 + bgeu a4,a5,.L52 + j .L62 + .size test_1_TYPE1_uint32_t, .-test_1_TYPE1_uint32_t + .ident "GCC: (GNU) 13.0.1 20230324 (experimental)" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c new file mode 100644 index 00000000000..d3e187eae68 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c @@ -0,0 +1,19 @@ +/* { dg-do run { target { riscv_vector } } } */ +/* { 
dg-additional-options "--param riscv-autovec-preference=fixed-vlmax" } */ + +#include "multiple_rgroup-1.c" + +int main (void) +{ + TEST_ALL (run_1) + TEST_ALL (run_2) + TEST_ALL (run_3) + TEST_ALL (run_4) + TEST_ALL (run_5) + TEST_ALL (run_6) + TEST_ALL (run_7) + TEST_ALL (run_8) + TEST_ALL (run_9) + TEST_ALL (run_10) + return 0; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c new file mode 100644 index 00000000000..5166c9e35a0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c @@ -0,0 +1,19 @@ +/* { dg-do run { target { riscv_vector } } } */ +/* { dg-additional-options "--param riscv-autovec-preference=fixed-vlmax" } */ + +#include "multiple_rgroup-2.c" + +int main (void) +{ + TEST_ALL (run_1) + TEST_ALL (run_2) + TEST_ALL (run_3) + TEST_ALL (run_4) + TEST_ALL (run_5) + TEST_ALL (run_6) + TEST_ALL (run_7) + TEST_ALL (run_8) + TEST_ALL (run_9) + TEST_ALL (run_10) + return 0; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.c new file mode 100644 index 00000000000..6384888dd03 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=scalable -fno-vect-cost-model -fno-tree-loop-distribute-patterns" } */ + +#include "single_rgroup-1.h" + +TEST_ALL (test_1) + +/* { dg-final { scan-assembler-times {vsetvli} 10 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.h new file mode 100644 index 00000000000..be6b4c641cb --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.h @@ -0,0 +1,106 @@ +#include <stddef.h> 
+#include <stdint.h> + +#define N 777 + +#define test_1(TYPE) \ + TYPE a_##TYPE[N]; \ + TYPE b_##TYPE[N]; \ + void __attribute__ ((noinline, noclone)) test_1_##TYPE (unsigned int n) \ + { \ + unsigned int i = 0; \ + for (i = 0; i < n; i++) \ + b_##TYPE[i] = a_##TYPE[i]; \ + } + +#define run_1(TYPE) \ + for (unsigned int i = 0; i < N; i++) \ + a_##TYPE[i] = i * 2 * 33 + 1 + 109; \ + test_1_##TYPE (5); \ + for (unsigned int i = 0; i < 5; i++) \ + if (b_##TYPE[i] != a_##TYPE[i]) \ + __builtin_abort (); + +#define run_2(TYPE) \ + for (unsigned int i = 0; i < N; i++) \ + a_##TYPE[i] = i * 2 * 57 + 1 + 999; \ + test_1_##TYPE (17); \ + for (unsigned int i = 0; i < 17; i++) \ + if (b_##TYPE[i] != a_##TYPE[i]) \ + __builtin_abort (); + +#define run_3(TYPE) \ + for (unsigned int i = 0; i < N; i++) \ + a_##TYPE[i] = i * 2 * 77 + 1 + 3; \ + test_1_##TYPE (32); \ + for (unsigned int i = 0; i < 32; i++) \ + if (b_##TYPE[i] != a_##TYPE[i]) \ + __builtin_abort (); + +#define run_4(TYPE) \ + for (unsigned int i = 0; i < N; i++) \ + a_##TYPE[i] = i * 2 * 45 + 1 + 11; \ + test_1_##TYPE (128); \ + for (unsigned int i = 0; i < 128; i++) \ + if (b_##TYPE[i] != a_##TYPE[i]) \ + __builtin_abort (); + +#define run_5(TYPE) \ + for (unsigned int i = 0; i < N; i++) \ + a_##TYPE[i] = i * 2 * 199 + 1 + 79; \ + test_1_##TYPE (177); \ + for (unsigned int i = 0; i < 177; i++) \ + if (b_##TYPE[i] != a_##TYPE[i]) \ + __builtin_abort (); + +#define run_6(TYPE) \ + for (unsigned int i = 0; i < N; i++) \ + a_##TYPE[i] = i * 2 * 377 + 1 + 73; \ + test_1_##TYPE (255); \ + for (unsigned int i = 0; i < 255; i++) \ + if (b_##TYPE[i] != a_##TYPE[i]) \ + __builtin_abort (); + +#define run_7(TYPE) \ + for (unsigned int i = 0; i < N; i++) \ + a_##TYPE[i] = i * 2 * 98 + 1 + 66; \ + test_1_##TYPE (333); \ + for (unsigned int i = 0; i < 333; i++) \ + if (b_##TYPE[i] != a_##TYPE[i]) \ + __builtin_abort (); + +#define run_8(TYPE) \ + for (unsigned int i = 0; i < N; i++) \ + a_##TYPE[i] = i * 2 * 7 + 1 * 7; \ + 
test_1_##TYPE (512); \ + for (unsigned int i = 0; i < 512; i++) \ + if (b_##TYPE[i] != a_##TYPE[i]) \ + __builtin_abort (); + +#define run_9(TYPE) \ + for (unsigned int i = 0; i < N; i++) \ + a_##TYPE[i] = i * 2 + 1 + 88; \ + test_1_##TYPE (637); \ + for (unsigned int i = 0; i < 637; i++) \ + if (b_##TYPE[i] != a_##TYPE[i]) \ + __builtin_abort (); + +#define run_10(TYPE) \ + for (unsigned int i = 0; i < N; i++) \ + a_##TYPE[i] = i * 2 * 331 + 1 + 547; \ + test_1_##TYPE (777); \ + for (unsigned int i = 0; i < 777; i++) \ + if (b_##TYPE[i] != a_##TYPE[i]) \ + __builtin_abort (); + +#define TEST_ALL(T) \ + T (int8_t) \ + T (uint8_t) \ + T (int16_t) \ + T (uint16_t) \ + T (int32_t) \ + T (uint32_t) \ + T (int64_t) \ + T (uint64_t) \ + T (float) \ + T (double) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-1.c new file mode 100644 index 00000000000..4af2f18de8a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-1.c @@ -0,0 +1,19 @@ +/* { dg-do run { target { riscv_vector } } } */ +/* { dg-additional-options "-fno-vect-cost-model -fno-tree-loop-distribute-patterns --param riscv-autovec-preference=scalable" } */ + +#include "single_rgroup-1.c" + +int main (void) +{ + TEST_ALL (run_1) + TEST_ALL (run_2) + TEST_ALL (run_3) + TEST_ALL (run_4) + TEST_ALL (run_5) + TEST_ALL (run_6) + TEST_ALL (run_7) + TEST_ALL (run_8) + TEST_ALL (run_9) + TEST_ALL (run_10) + return 0; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/template-1.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/template-1.h new file mode 100644 index 00000000000..799e2d7d754 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/template-1.h @@ -0,0 +1,68 @@ +#include <stddef.h> +#include <stdint.h> + +void +foo0 (int8_t *__restrict f, int16_t *__restrict d, int n) +{ + for (int i = 0; i < n; ++i) + { + f[i * 2 + 0] = 1; + f[i * 2 + 1] = 2; + d[i] = 
3; + } +} + +void +foo1 (int16_t *__restrict f, int32_t *__restrict d, int n) +{ + for (int i = 0; i < n; ++i) + { + f[i * 2 + 0] = 1; + f[i * 2 + 1] = 2; + d[i] = 3; + } +} + +void +foo2 (int32_t *__restrict f, int64_t *__restrict d, int n) +{ + for (int i = 0; i < n; ++i) + { + f[i * 2 + 0] = 1; + f[i * 2 + 1] = 2; + d[i] = 3; + } +} + +void +foo3 (int16_t *__restrict f, float *__restrict d, int n) +{ + for (int i = 0; i < n; ++i) + { + f[i * 2 + 0] = 1; + f[i * 2 + 1] = 2; + d[i] = 3; + } +} + +void +foo4 (int32_t *__restrict f, float *__restrict d, int n) +{ + for (int i = 0; i < n; ++i) + { + f[i * 2 + 0] = 1; + f[i * 2 + 1] = 2; + d[i] = 3; + } +} + +void +foo5 (float *__restrict f, double *__restrict d, int n) +{ + for (int i = 0; i < n; ++i) + { + f[i * 2 + 0] = 1; + f[i * 2 + 1] = 2; + d[i] = 3; + } +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-1.c new file mode 100644 index 00000000000..7ff84f60749 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-1.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-2.c new file mode 100644 index 00000000000..dc22eefbd36 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-2.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" + +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 5 "vect" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-1.c new file mode 100644 index 00000000000..36f6d98a5cb 
--- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-1.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve32f -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-2.c new file mode 100644 index 00000000000..794f28e73bd --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-2.c @@ -0,0 +1,5 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve32f -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" + diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c new file mode 100644 index 00000000000..d5e36190b31 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve32f_zvl128b -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c new file mode 100644 index 00000000000..d154df4c4ba --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve32f_zvl128b -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" + +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 4 "vect" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-1.c new file mode 100644 index 00000000000..68e7696ed65 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-1.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve32x -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-2.c new file mode 100644 index 00000000000..f8860a36332 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-2.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve32x -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" + + diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c new file mode 100644 index 00000000000..3a6a3aa1261 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c @@ -0,0 +1,5 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve32x_zvl128b -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" + diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c new file mode 100644 index 00000000000..d1aaf3f4297 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve32x_zvl128b -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" + +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 2 "vect" } } */ 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-1.c new file mode 100644 index 00000000000..0d03536389f --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-1.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve64d -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-2.c new file mode 100644 index 00000000000..ca423285011 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-2.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve64d -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c new file mode 100644 index 00000000000..4c6c7e2fb3b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve64d_zvl128b -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-2.c new file mode 100644 index 00000000000..b8253476973 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-2.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve64d_zvl128b -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" + +/* { dg-final { 
scan-tree-dump-times "vectorized 1 loops in function" 5 "vect" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-1.c new file mode 100644 index 00000000000..e7900b82215 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-1.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve64f -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-2.c new file mode 100644 index 00000000000..1c0e8c2785b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-2.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve64f -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c new file mode 100644 index 00000000000..daf4a4e8e64 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve64f_zvl128b -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-2.c new file mode 100644 index 00000000000..3866e45546c --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-2.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve64f_zvl128b -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details 
-save-temps" } */ + +#include "template-1.h" + +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 4 "vect" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-1.c new file mode 100644 index 00000000000..4c190c303c1 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-1.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve64x -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-2.c new file mode 100644 index 00000000000..66bb1f44170 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-2.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve64x -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c new file mode 100644 index 00000000000..6920a395d1c --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve64x_zvl128b -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-2.c new file mode 100644 index 00000000000..d8b60babf9a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-2.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve64x_zvl128b -mabi=ilp32d --param 
riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */ + +#include "template-1.h" + +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 3 "vect" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp index 7a9a2b6ac48..5893dbf9742 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp +++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp @@ -44,6 +44,22 @@ dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/base/*.\[cS\]]] \ "" $CFLAGS gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/vsetvl/*.\[cS\]]] \ "" $CFLAGS +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/*.\[cS\]]] \ + "" $CFLAGS + +set AUTOVEC_TEST_OPTS [list \ + {-ftree-vectorize -O3 --param riscv-autovec-lmul=m1} \ + {-ftree-vectorize -O3 --param riscv-autovec-lmul=m2} \ + {-ftree-vectorize -O3 --param riscv-autovec-lmul=m4} \ + {-ftree-vectorize -O3 --param riscv-autovec-lmul=m8} \ + {-ftree-vectorize -O2 --param riscv-autovec-lmul=m1} \ + {-ftree-vectorize -O2 --param riscv-autovec-lmul=m2} \ + {-ftree-vectorize -O2 --param riscv-autovec-lmul=m4} \ + {-ftree-vectorize -O2 --param riscv-autovec-lmul=m8} ] +foreach op $AUTOVEC_TEST_OPTS { + gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/partial/*.\[cS\]]] \ + "" "$op" +} # All done. 
dg-finish diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvl-17.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvl-17.c index ee58f9bbdfc..8a1bbb40fc8 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvl-17.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvl-17.c @@ -11,4 +11,4 @@ void foo(int32_t *in1, int32_t *in2, int32_t *in3, int32_t *out, size_t n, int c __riscv_vse32_v_i32m1(out, c, __riscv_vsetvl_e8mf2 (vl)); } -/* { dg-final { scan-assembler-times {vsetvli} 8 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ \ No newline at end of file +/* { dg-final { scan-assembler-times {vsetvli} 7 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */ -- 2.36.3
* Re: [PATCH] RISC-V: Add RVV auto-vectorization testcase
  2023-04-06 14:42 ` [PATCH] RISC-V: Add RVV auto-vectorization testcase juzhe.zhong
@ 2023-04-06 15:36 ` Kito Cheng
From: Kito Cheng @ 2023-04-06 15:36 UTC
To: juzhe.zhong
Cc: gcc-patches, palmer, richard.sandiford, rguenther, jeffreyalaw

You included asm output by accident :P

On Thu, Apr 6, 2023 at 10:45 PM <juzhe.zhong@rivai.ai> wrote:
>
> From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/rvv.exp: Add testing for RVV auto-vectorization.
> * gcc.target/riscv/rvv/vsetvl/vsetvl-17.c: Adapt testcase.
> * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h: New test.
> * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: New test.
> * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.s: New test.
> * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.c: New test.
> * gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.h: New test.
> * gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-1.c: New test.
> * gcc.target/riscv/rvv/autovec/template-1.h: New test.
> * gcc.target/riscv/rvv/autovec/v-1.c: New test.
> * gcc.target/riscv/rvv/autovec/v-2.c: New test.
> * gcc.target/riscv/rvv/autovec/zve32f-1.c: New test.
> * gcc.target/riscv/rvv/autovec/zve32f-2.c: New test.
> * gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c: New test.
> * gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c: New test.
> * gcc.target/riscv/rvv/autovec/zve32x-1.c: New test.
> * gcc.target/riscv/rvv/autovec/zve32x-2.c: New test.
> * gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c: New test.
> * gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c: New test.
> * gcc.target/riscv/rvv/autovec/zve64d-1.c: New test.
> * gcc.target/riscv/rvv/autovec/zve64d-2.c: New test.
> * gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c: New test.
> * gcc.target/riscv/rvv/autovec/zve64d_zvl128b-2.c: New test.
> * gcc.target/riscv/rvv/autovec/zve64f-1.c: New test.
> * gcc.target/riscv/rvv/autovec/zve64f-2.c: New test.
> * gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c: New test.
> * gcc.target/riscv/rvv/autovec/zve64f_zvl128b-2.c: New test.
> * gcc.target/riscv/rvv/autovec/zve64x-1.c: New test.
> * gcc.target/riscv/rvv/autovec/zve64x-2.c: New test.
> * gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c: New test.
> * gcc.target/riscv/rvv/autovec/zve64x_zvl128b-2.c: New test.
>
> ---
> .../rvv/autovec/partial/multiple_rgroup-1.c | 6 +
> .../rvv/autovec/partial/multiple_rgroup-1.h | 304 +++++++
> .../rvv/autovec/partial/multiple_rgroup-2.c | 6 +
> .../rvv/autovec/partial/multiple_rgroup-2.h | 546 ++++++++++++
> .../rvv/autovec/partial/multiple_rgroup-2.s | 774 ++++++++++++++++++
> .../autovec/partial/multiple_rgroup_run-1.c | 19 +
> .../autovec/partial/multiple_rgroup_run-2.c | 19 +
> .../rvv/autovec/partial/single_rgroup-1.c | 8 +
> .../rvv/autovec/partial/single_rgroup-1.h | 106 +++
> .../rvv/autovec/partial/single_rgroup_run-1.c | 19 +
> .../gcc.target/riscv/rvv/autovec/template-1.h | 68 ++
> .../gcc.target/riscv/rvv/autovec/v-1.c | 4 +
> .../gcc.target/riscv/rvv/autovec/v-2.c | 6 +
> .../gcc.target/riscv/rvv/autovec/zve32f-1.c | 4 +
> .../gcc.target/riscv/rvv/autovec/zve32f-2.c | 5 +
> .../riscv/rvv/autovec/zve32f_zvl128b-1.c | 4 +
> .../riscv/rvv/autovec/zve32f_zvl128b-2.c | 6 +
> .../gcc.target/riscv/rvv/autovec/zve32x-1.c | 4 +
> .../gcc.target/riscv/rvv/autovec/zve32x-2.c | 6 +
> .../riscv/rvv/autovec/zve32x_zvl128b-1.c | 5 +
> .../riscv/rvv/autovec/zve32x_zvl128b-2.c | 6 +
> .../gcc.target/riscv/rvv/autovec/zve64d-1.c | 4 +
> .../gcc.target/riscv/rvv/autovec/zve64d-2.c | 4 +
> .../riscv/rvv/autovec/zve64d_zvl128b-1.c | 4 +
> .../riscv/rvv/autovec/zve64d_zvl128b-2.c | 6 +
> .../gcc.target/riscv/rvv/autovec/zve64f-1.c | 4 +
> .../gcc.target/riscv/rvv/autovec/zve64f-2.c | 4 +
> .../riscv/rvv/autovec/zve64f_zvl128b-1.c | 4 +
> .../riscv/rvv/autovec/zve64f_zvl128b-2.c | 6 +
> .../gcc.target/riscv/rvv/autovec/zve64x-1.c | 4 +
> .../gcc.target/riscv/rvv/autovec/zve64x-2.c | 4 +
> .../riscv/rvv/autovec/zve64x_zvl128b-1.c | 4 +
> .../riscv/rvv/autovec/zve64x_zvl128b-2.c | 6 +
> gcc/testsuite/gcc.target/riscv/rvv/rvv.exp | 16 +
> .../gcc.target/riscv/rvv/vsetvl/vsetvl-17.c | 2 +-
> 35 files changed, 1996 insertions(+), 1 deletion(-)
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.s
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.h
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/template-1.h
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/v-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/v-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-2.c
>
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
> new file mode 100644
> index 00000000000..69cc3be78f7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax" } */
> +
> +#include "multiple_rgroup-1.h"
> +
> +TEST_ALL (test_1)
> diff --git
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h > new file mode 100644 > index 00000000000..755ee2b3616 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h > @@ -0,0 +1,304 @@ > +#include <stddef.h> > +#include <stdint.h> > + > +#define test_1(TYPE1, TYPE2) \ > + void __attribute__ ((noinline, noclone)) \ > + test_1_##TYPE1_##TYPE2 (TYPE1 *__restrict f, TYPE2 *__restrict d, TYPE1 x, \ > + TYPE1 x2, TYPE2 y, int n) \ > + { \ > + for (int i = 0; i < n; ++i) \ > + { \ > + f[i * 2 + 0] = x; \ > + f[i * 2 + 1] = x2; \ > + d[i] = y; \ > + } \ > + } > + > +#define run_1(TYPE1, TYPE2) \ > + int n_1_##TYPE1_##TYPE2 = 1; \ > + TYPE1 x_1_##TYPE1 = 117; \ > + TYPE1 x2_1_##TYPE1 = 232; \ > + TYPE2 y_1_##TYPE2 = 9762; \ > + TYPE1 f_1_##TYPE1[2 * 2 + 1] = {0}; \ > + TYPE2 d_1_##TYPE2[2] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_1_##TYPE1, d_1_##TYPE2, x_1_##TYPE1, x2_1_##TYPE1, \ > + y_1_##TYPE2, n_1_##TYPE1_##TYPE2); \ > + for (int i = 0; i < n_1_##TYPE1_##TYPE2; ++i) \ > + { \ > + if (f_1_##TYPE1[i * 2 + 0] != x_1_##TYPE1) \ > + __builtin_abort (); \ > + if (f_1_##TYPE1[i * 2 + 1] != x2_1_##TYPE1) \ > + __builtin_abort (); \ > + if (d_1_##TYPE2[i] != y_1_##TYPE2) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_1_##TYPE1_##TYPE2; i < n_1_##TYPE1_##TYPE2 + 1; ++i) \ > + { \ > + if (f_1_##TYPE1[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_1_##TYPE1[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (d_1_##TYPE2[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_2(TYPE1, TYPE2) \ > + int n_2_##TYPE1_##TYPE2 = 17; \ > + TYPE1 x_2_##TYPE1 = 133; \ > + TYPE1 x2_2_##TYPE1 = 94; \ > + TYPE2 y_2_##TYPE2 = 8672; \ > + TYPE1 f_2_##TYPE1[18 * 2 + 1] = {0}; \ > + TYPE2 d_2_##TYPE2[18] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_2_##TYPE1, d_2_##TYPE2, x_2_##TYPE1, x2_2_##TYPE1, \ > + y_2_##TYPE2, n_2_##TYPE1_##TYPE2); \ > + for (int 
i = 0; i < n_2_##TYPE1_##TYPE2; ++i) \ > + { \ > + if (f_2_##TYPE1[i * 2 + 0] != x_2_##TYPE1) \ > + __builtin_abort (); \ > + if (f_2_##TYPE1[i * 2 + 1] != x2_2_##TYPE1) \ > + __builtin_abort (); \ > + if (d_2_##TYPE2[i] != y_2_##TYPE2) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_2_##TYPE1_##TYPE2; i < n_2_##TYPE1_##TYPE2 + 1; ++i) \ > + { \ > + if (f_2_##TYPE1[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_2_##TYPE1[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (d_2_##TYPE2[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_3(TYPE1, TYPE2) \ > + int n_3_##TYPE1_##TYPE2 = 32; \ > + TYPE1 x_3_##TYPE1 = 233; \ > + TYPE1 x2_3_##TYPE1 = 78; \ > + TYPE2 y_3_##TYPE2 = 1234; \ > + TYPE1 f_3_##TYPE1[33 * 2 + 1] = {0}; \ > + TYPE2 d_3_##TYPE2[33] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_3_##TYPE1, d_3_##TYPE2, x_3_##TYPE1, x2_3_##TYPE1, \ > + y_3_##TYPE2, n_3_##TYPE1_##TYPE2); \ > + for (int i = 0; i < n_3_##TYPE1_##TYPE2; ++i) \ > + { \ > + if (f_3_##TYPE1[i * 2 + 0] != x_3_##TYPE1) \ > + __builtin_abort (); \ > + if (f_3_##TYPE1[i * 2 + 1] != x2_3_##TYPE1) \ > + __builtin_abort (); \ > + if (d_3_##TYPE2[i] != y_3_##TYPE2) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_3_##TYPE1_##TYPE2; i < n_3_##TYPE1_##TYPE2 + 1; ++i) \ > + { \ > + if (f_3_##TYPE1[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_3_##TYPE1[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (d_3_##TYPE2[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_4(TYPE1, TYPE2) \ > + int n_4_##TYPE1_##TYPE2 = 128; \ > + TYPE1 x_4_##TYPE1 = 222; \ > + TYPE1 x2_4_##TYPE1 = 59; \ > + TYPE2 y_4_##TYPE2 = 4321; \ > + TYPE1 f_4_##TYPE1[129 * 2 + 1] = {0}; \ > + TYPE2 d_4_##TYPE2[129] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_4_##TYPE1, d_4_##TYPE2, x_4_##TYPE1, x2_4_##TYPE1, \ > + y_4_##TYPE2, n_4_##TYPE1_##TYPE2); \ > + for (int i = 0; i < n_4_##TYPE1_##TYPE2; ++i) \ > + { \ > + if (f_4_##TYPE1[i * 2 + 0] != x_4_##TYPE1) \ > + __builtin_abort (); \ > + if 
(f_4_##TYPE1[i * 2 + 1] != x2_4_##TYPE1) \ > + __builtin_abort (); \ > + if (d_4_##TYPE2[i] != y_4_##TYPE2) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_4_##TYPE1_##TYPE2; i < n_4_##TYPE1_##TYPE2 + 1; ++i) \ > + { \ > + if (f_4_##TYPE1[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_4_##TYPE1[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (d_4_##TYPE2[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_5(TYPE1, TYPE2) \ > + int n_5_##TYPE1_##TYPE2 = 177; \ > + TYPE1 x_5_##TYPE1 = 111; \ > + TYPE1 x2_5_##TYPE1 = 189; \ > + TYPE2 y_5_##TYPE2 = 5555; \ > + TYPE1 f_5_##TYPE1[178 * 2 + 1] = {0}; \ > + TYPE2 d_5_##TYPE2[178] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_5_##TYPE1, d_5_##TYPE2, x_5_##TYPE1, x2_5_##TYPE1, \ > + y_5_##TYPE2, n_5_##TYPE1_##TYPE2); \ > + for (int i = 0; i < n_5_##TYPE1_##TYPE2; ++i) \ > + { \ > + if (f_5_##TYPE1[i * 2 + 0] != x_5_##TYPE1) \ > + __builtin_abort (); \ > + if (f_5_##TYPE1[i * 2 + 1] != x2_5_##TYPE1) \ > + __builtin_abort (); \ > + if (d_5_##TYPE2[i] != y_5_##TYPE2) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_5_##TYPE1_##TYPE2; i < n_5_##TYPE1_##TYPE2 + 1; ++i) \ > + { \ > + if (f_5_##TYPE1[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_5_##TYPE1[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (d_5_##TYPE2[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_6(TYPE1, TYPE2) \ > + int n_6_##TYPE1_##TYPE2 = 255; \ > + TYPE1 x_6_##TYPE1 = 123; \ > + TYPE1 x2_6_##TYPE1 = 132; \ > + TYPE2 y_6_##TYPE2 = 6655; \ > + TYPE1 f_6_##TYPE1[256 * 2 + 1] = {0}; \ > + TYPE2 d_6_##TYPE2[256] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_6_##TYPE1, d_6_##TYPE2, x_6_##TYPE1, x2_6_##TYPE1, \ > + y_6_##TYPE2, n_6_##TYPE1_##TYPE2); \ > + for (int i = 0; i < n_6_##TYPE1_##TYPE2; ++i) \ > + { \ > + if (f_6_##TYPE1[i * 2 + 0] != x_6_##TYPE1) \ > + __builtin_abort (); \ > + if (f_6_##TYPE1[i * 2 + 1] != x2_6_##TYPE1) \ > + __builtin_abort (); \ > + if (d_6_##TYPE2[i] != y_6_##TYPE2) \ > + __builtin_abort 
(); \ > + } \ > + for (int i = n_6_##TYPE1_##TYPE2; i < n_6_##TYPE1_##TYPE2 + 1; ++i) \ > + { \ > + if (f_6_##TYPE1[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_6_##TYPE1[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (d_6_##TYPE2[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_7(TYPE1, TYPE2) \ > + int n_7_##TYPE1_##TYPE2 = 333; \ > + TYPE1 x_7_##TYPE1 = 39; \ > + TYPE1 x2_7_##TYPE1 = 59; \ > + TYPE2 y_7_##TYPE2 = 5968; \ > + TYPE1 f_7_##TYPE1[334 * 2 + 1] = {0}; \ > + TYPE2 d_7_##TYPE2[334] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_7_##TYPE1, d_7_##TYPE2, x_7_##TYPE1, x2_7_##TYPE1, \ > + y_7_##TYPE2, n_7_##TYPE1_##TYPE2); \ > + for (int i = 0; i < n_7_##TYPE1_##TYPE2; ++i) \ > + { \ > + if (f_7_##TYPE1[i * 2 + 0] != x_7_##TYPE1) \ > + __builtin_abort (); \ > + if (f_7_##TYPE1[i * 2 + 1] != x2_7_##TYPE1) \ > + __builtin_abort (); \ > + if (d_7_##TYPE2[i] != y_7_##TYPE2) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_7_##TYPE1_##TYPE2; i < n_7_##TYPE1_##TYPE2 + 1; ++i) \ > + { \ > + if (f_7_##TYPE1[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_7_##TYPE1[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (d_7_##TYPE2[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_8(TYPE1, TYPE2) \ > + int n_8_##TYPE1_##TYPE2 = 512; \ > + TYPE1 x_8_##TYPE1 = 71; \ > + TYPE1 x2_8_##TYPE1 = 255; \ > + TYPE2 y_8_##TYPE2 = 3366; \ > + TYPE1 f_8_##TYPE1[513 * 2 + 1] = {0}; \ > + TYPE2 d_8_##TYPE2[513] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_8_##TYPE1, d_8_##TYPE2, x_8_##TYPE1, x2_8_##TYPE1, \ > + y_8_##TYPE2, n_8_##TYPE1_##TYPE2); \ > + for (int i = 0; i < n_8_##TYPE1_##TYPE2; ++i) \ > + { \ > + if (f_8_##TYPE1[i * 2 + 0] != x_8_##TYPE1) \ > + __builtin_abort (); \ > + if (f_8_##TYPE1[i * 2 + 1] != x2_8_##TYPE1) \ > + __builtin_abort (); \ > + if (d_8_##TYPE2[i] != y_8_##TYPE2) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_8_##TYPE1_##TYPE2; i < n_8_##TYPE1_##TYPE2 + 1; ++i) \ > + { \ > + if (f_8_##TYPE1[i * 2 + 0] != 0) \ 
> + __builtin_abort (); \ > + if (f_8_##TYPE1[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (d_8_##TYPE2[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_9(TYPE1, TYPE2) \ > + int n_9_##TYPE1_##TYPE2 = 637; \ > + TYPE1 x_9_##TYPE1 = 157; \ > + TYPE1 x2_9_##TYPE1 = 89; \ > + TYPE2 y_9_##TYPE2 = 5511; \ > + TYPE1 f_9_##TYPE1[638 * 2 + 1] = {0}; \ > + TYPE2 d_9_##TYPE2[638] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_9_##TYPE1, d_9_##TYPE2, x_9_##TYPE1, x2_9_##TYPE1, \ > + y_9_##TYPE2, n_9_##TYPE1_##TYPE2); \ > + for (int i = 0; i < n_9_##TYPE1_##TYPE2; ++i) \ > + { \ > + if (f_9_##TYPE1[i * 2 + 0] != x_9_##TYPE1) \ > + __builtin_abort (); \ > + if (f_9_##TYPE1[i * 2 + 1] != x2_9_##TYPE1) \ > + __builtin_abort (); \ > + if (d_9_##TYPE2[i] != y_9_##TYPE2) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_9_##TYPE1_##TYPE2; i < n_9_##TYPE1_##TYPE2 + 1; ++i) \ > + { \ > + if (f_9_##TYPE1[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_9_##TYPE1[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (d_9_##TYPE2[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_10(TYPE1, TYPE2) \ > + int n_10_##TYPE1_##TYPE2 = 777; \ > + TYPE1 x_10_##TYPE1 = 203; \ > + TYPE1 x2_10_##TYPE1 = 200; \ > + TYPE2 y_10_##TYPE2 = 2023; \ > + TYPE1 f_10_##TYPE1[778 * 2 + 1] = {0}; \ > + TYPE2 d_10_##TYPE2[778] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_10_##TYPE1, d_10_##TYPE2, x_10_##TYPE1, \ > + x2_10_##TYPE1, y_10_##TYPE2, n_10_##TYPE1_##TYPE2); \ > + for (int i = 0; i < n_10_##TYPE1_##TYPE2; ++i) \ > + { \ > + if (f_10_##TYPE1[i * 2 + 0] != x_10_##TYPE1) \ > + __builtin_abort (); \ > + if (f_10_##TYPE1[i * 2 + 1] != x2_10_##TYPE1) \ > + __builtin_abort (); \ > + if (d_10_##TYPE2[i] != y_10_##TYPE2) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_10_##TYPE1_##TYPE2; i < n_10_##TYPE1_##TYPE2 + 1; ++i) \ > + { \ > + if (f_10_##TYPE1[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_10_##TYPE1[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if 
(d_10_##TYPE2[i] != 0) \
> + __builtin_abort (); \
> + }
> +
> +#define TEST_ALL(T) \
> + T (int8_t, int16_t) \
> + T (uint8_t, uint16_t) \
> + T (int16_t, int32_t) \
> + T (uint16_t, uint32_t) \
> + T (int32_t, int64_t) \
> + T (uint32_t, uint64_t) \
> + T (float, double)
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c
> new file mode 100644
> index 00000000000..d1c41907547
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax" } */
> +
> +#include "multiple_rgroup-2.h"
> +
> +TEST_ALL (test_1)
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
> new file mode 100644
> index 00000000000..aa50726697c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
> @@ -0,0 +1,546 @@
> +#include <stddef.h>
> +#include <stdint.h>
> +
> +#define test_1(TYPE1, TYPE2, TYPE3) \
> + void __attribute__ ((noinline, noclone)) \
> + test_1_##TYPE1_##TYPE2 (TYPE1 *__restrict f, TYPE2 *__restrict d, \
> + TYPE3 *__restrict e, TYPE1 x, TYPE1 x2, TYPE1 x3, \
> + TYPE1 x4, TYPE2 y, TYPE2 y2, TYPE3 z, int n) \
> + { \
> + for (int i = 0; i < n; ++i) \
> + { \
> + f[i * 4 + 0] = x; \
> + f[i * 4 + 1] = x2; \
> + f[i * 4 + 2] = x3; \
> + f[i * 4 + 3] = x4; \
> + d[i * 2 + 0] = y; \
> + d[i * 2 + 1] = y2; \
> + e[i] = z; \
> + } \
> + }
> +
> +#define run_1(TYPE1, TYPE2, TYPE3) \
> + int n_1_##TYPE1_##TYPE2_##TYPE3 = 1; \
> + TYPE1 x_1_##TYPE1 = 117; \
> + TYPE1 x2_1_##TYPE1 = 232; \
> + TYPE1 x3_1_##TYPE1 = 127; \
> + TYPE1 x4_1_##TYPE1 = 11; \
> + TYPE2 y_1_##TYPE2 = 9762; \
> + TYPE2 y2_1_##TYPE2 = 6279; \
> + TYPE3 z_1_##TYPE3 = 5891663; \
> + TYPE1
f_1_##TYPE1[2 * 4 + 1] = {0}; \ > + TYPE2 d_1_##TYPE2[2 * 2 + 1] = {0}; \ > + TYPE3 e_1_##TYPE3[2] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_1_##TYPE1, d_1_##TYPE2, e_1_##TYPE3, x_1_##TYPE1, \ > + x2_1_##TYPE1, x3_1_##TYPE1, x4_1_##TYPE1, \ > + y_1_##TYPE2, y2_1_##TYPE2, z_1_##TYPE3, \ > + n_1_##TYPE1_##TYPE2_##TYPE3); \ > + for (int i = 0; i < n_1_##TYPE1_##TYPE2_##TYPE3; ++i) \ > + { \ > + if (f_1_##TYPE1[i * 4 + 0] != x_1_##TYPE1) \ > + __builtin_abort (); \ > + if (f_1_##TYPE1[i * 4 + 1] != x2_1_##TYPE1) \ > + __builtin_abort (); \ > + if (f_1_##TYPE1[i * 4 + 2] != x3_1_##TYPE1) \ > + __builtin_abort (); \ > + if (f_1_##TYPE1[i * 4 + 3] != x4_1_##TYPE1) \ > + __builtin_abort (); \ > + if (d_1_##TYPE2[i * 2 + 0] != y_1_##TYPE2) \ > + __builtin_abort (); \ > + if (d_1_##TYPE2[i * 2 + 1] != y2_1_##TYPE2) \ > + __builtin_abort (); \ > + if (e_1_##TYPE3[i] != z_1_##TYPE3) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_1_##TYPE1_##TYPE2_##TYPE3; \ > + i < n_1_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ > + { \ > + if (f_1_##TYPE1[i * 4 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_1_##TYPE1[i * 4 + 1] != 0) \ > + __builtin_abort (); \ > + if (f_1_##TYPE1[i * 4 + 2] != 0) \ > + __builtin_abort (); \ > + if (f_1_##TYPE1[i * 4 + 3] != 0) \ > + __builtin_abort (); \ > + if (d_1_##TYPE2[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (d_1_##TYPE2[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (e_1_##TYPE3[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_2(TYPE1, TYPE2, TYPE3) \ > + int n_2_##TYPE1_##TYPE2_##TYPE3 = 17; \ > + TYPE1 x_2_##TYPE1 = 107; \ > + TYPE1 x2_2_##TYPE1 = 202; \ > + TYPE1 x3_2_##TYPE1 = 17; \ > + TYPE1 x4_2_##TYPE1 = 53; \ > + TYPE2 y_2_##TYPE2 = 5566; \ > + TYPE2 y2_2_##TYPE2 = 7926; \ > + TYPE3 z_2_##TYPE3 = 781545971; \ > + TYPE1 f_2_##TYPE1[18 * 4 + 1] = {0}; \ > + TYPE2 d_2_##TYPE2[18 * 2 + 1] = {0}; \ > + TYPE3 e_2_##TYPE3[18] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_2_##TYPE1, d_2_##TYPE2, e_2_##TYPE3, x_2_##TYPE1, \ > + 
x2_2_##TYPE1, x3_2_##TYPE1, x4_2_##TYPE1, \ > + y_2_##TYPE2, y2_2_##TYPE2, z_2_##TYPE3, \ > + n_2_##TYPE1_##TYPE2_##TYPE3); \ > + for (int i = 0; i < n_2_##TYPE1_##TYPE2_##TYPE3; ++i) \ > + { \ > + if (f_2_##TYPE1[i * 4 + 0] != x_2_##TYPE1) \ > + __builtin_abort (); \ > + if (f_2_##TYPE1[i * 4 + 1] != x2_2_##TYPE1) \ > + __builtin_abort (); \ > + if (f_2_##TYPE1[i * 4 + 2] != x3_2_##TYPE1) \ > + __builtin_abort (); \ > + if (f_2_##TYPE1[i * 4 + 3] != x4_2_##TYPE1) \ > + __builtin_abort (); \ > + if (d_2_##TYPE2[i * 2 + 0] != y_2_##TYPE2) \ > + __builtin_abort (); \ > + if (d_2_##TYPE2[i * 2 + 1] != y2_2_##TYPE2) \ > + __builtin_abort (); \ > + if (e_2_##TYPE3[i] != z_2_##TYPE3) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_2_##TYPE1_##TYPE2_##TYPE3; \ > + i < n_2_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ > + { \ > + if (f_2_##TYPE1[i * 4 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_2_##TYPE1[i * 4 + 1] != 0) \ > + __builtin_abort (); \ > + if (f_2_##TYPE1[i * 4 + 2] != 0) \ > + __builtin_abort (); \ > + if (f_2_##TYPE1[i * 4 + 3] != 0) \ > + __builtin_abort (); \ > + if (d_2_##TYPE2[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (d_2_##TYPE2[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (e_2_##TYPE3[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_3(TYPE1, TYPE2, TYPE3) \ > + int n_3_##TYPE1_##TYPE2_##TYPE3 = 32; \ > + TYPE1 x_3_##TYPE1 = 109; \ > + TYPE1 x2_3_##TYPE1 = 239; \ > + TYPE1 x3_3_##TYPE1 = 151; \ > + TYPE1 x4_3_##TYPE1 = 3; \ > + TYPE2 y_3_##TYPE2 = 1234; \ > + TYPE2 y2_3_##TYPE2 = 4321; \ > + TYPE3 z_3_##TYPE3 = 145615615; \ > + TYPE1 f_3_##TYPE1[33 * 4 + 1] = {0}; \ > + TYPE2 d_3_##TYPE2[33 * 2 + 1] = {0}; \ > + TYPE3 e_3_##TYPE3[33] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_3_##TYPE1, d_3_##TYPE2, e_3_##TYPE3, x_3_##TYPE1, \ > + x2_3_##TYPE1, x3_3_##TYPE1, x4_3_##TYPE1, \ > + y_3_##TYPE2, y2_3_##TYPE2, z_3_##TYPE3, \ > + n_3_##TYPE1_##TYPE2_##TYPE3); \ > + for (int i = 0; i < n_3_##TYPE1_##TYPE2_##TYPE3; ++i) \ > + { \ > + 
if (f_3_##TYPE1[i * 4 + 0] != x_3_##TYPE1) \ > + __builtin_abort (); \ > + if (f_3_##TYPE1[i * 4 + 1] != x2_3_##TYPE1) \ > + __builtin_abort (); \ > + if (f_3_##TYPE1[i * 4 + 2] != x3_3_##TYPE1) \ > + __builtin_abort (); \ > + if (f_3_##TYPE1[i * 4 + 3] != x4_3_##TYPE1) \ > + __builtin_abort (); \ > + if (d_3_##TYPE2[i * 2 + 0] != y_3_##TYPE2) \ > + __builtin_abort (); \ > + if (d_3_##TYPE2[i * 2 + 1] != y2_3_##TYPE2) \ > + __builtin_abort (); \ > + if (e_3_##TYPE3[i] != z_3_##TYPE3) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_3_##TYPE1_##TYPE2_##TYPE3; \ > + i < n_3_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ > + { \ > + if (f_3_##TYPE1[i * 4 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_3_##TYPE1[i * 4 + 1] != 0) \ > + __builtin_abort (); \ > + if (f_3_##TYPE1[i * 4 + 2] != 0) \ > + __builtin_abort (); \ > + if (f_3_##TYPE1[i * 4 + 3] != 0) \ > + __builtin_abort (); \ > + if (d_3_##TYPE2[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (d_3_##TYPE2[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (e_3_##TYPE3[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_4(TYPE1, TYPE2, TYPE3) \ > + int n_4_##TYPE1_##TYPE2_##TYPE3 = 128; \ > + TYPE1 x_4_##TYPE1 = 239; \ > + TYPE1 x2_4_##TYPE1 = 132; \ > + TYPE1 x3_4_##TYPE1 = 39; \ > + TYPE1 x4_4_##TYPE1 = 48; \ > + TYPE2 y_4_##TYPE2 = 1036; \ > + TYPE2 y2_4_##TYPE2 = 3665; \ > + TYPE3 z_4_##TYPE3 = 5145656; \ > + TYPE1 f_4_##TYPE1[129 * 4 + 1] = {0}; \ > + TYPE2 d_4_##TYPE2[129 * 2 + 1] = {0}; \ > + TYPE3 e_4_##TYPE3[129] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_4_##TYPE1, d_4_##TYPE2, e_4_##TYPE3, x_4_##TYPE1, \ > + x2_4_##TYPE1, x3_4_##TYPE1, x4_4_##TYPE1, \ > + y_4_##TYPE2, y2_4_##TYPE2, z_4_##TYPE3, \ > + n_4_##TYPE1_##TYPE2_##TYPE3); \ > + for (int i = 0; i < n_4_##TYPE1_##TYPE2_##TYPE3; ++i) \ > + { \ > + if (f_4_##TYPE1[i * 4 + 0] != x_4_##TYPE1) \ > + __builtin_abort (); \ > + if (f_4_##TYPE1[i * 4 + 1] != x2_4_##TYPE1) \ > + __builtin_abort (); \ > + if (f_4_##TYPE1[i * 4 + 2] != x3_4_##TYPE1) 
\ > + __builtin_abort (); \ > + if (f_4_##TYPE1[i * 4 + 3] != x4_4_##TYPE1) \ > + __builtin_abort (); \ > + if (d_4_##TYPE2[i * 2 + 0] != y_4_##TYPE2) \ > + __builtin_abort (); \ > + if (d_4_##TYPE2[i * 2 + 1] != y2_4_##TYPE2) \ > + __builtin_abort (); \ > + if (e_4_##TYPE3[i] != z_4_##TYPE3) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_4_##TYPE1_##TYPE2_##TYPE3; \ > + i < n_4_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ > + { \ > + if (f_4_##TYPE1[i * 4 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_4_##TYPE1[i * 4 + 1] != 0) \ > + __builtin_abort (); \ > + if (f_4_##TYPE1[i * 4 + 2] != 0) \ > + __builtin_abort (); \ > + if (f_4_##TYPE1[i * 4 + 3] != 0) \ > + __builtin_abort (); \ > + if (d_4_##TYPE2[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (d_4_##TYPE2[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (e_4_##TYPE3[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_5(TYPE1, TYPE2, TYPE3) \ > + int n_5_##TYPE1_##TYPE2_##TYPE3 = 177; \ > + TYPE1 x_5_##TYPE1 = 239; \ > + TYPE1 x2_5_##TYPE1 = 132; \ > + TYPE1 x3_5_##TYPE1 = 39; \ > + TYPE1 x4_5_##TYPE1 = 48; \ > + TYPE2 y_5_##TYPE2 = 1036; \ > + TYPE2 y2_5_##TYPE2 = 3665; \ > + TYPE3 z_5_##TYPE3 = 5145656; \ > + TYPE1 f_5_##TYPE1[178 * 4 + 1] = {0}; \ > + TYPE2 d_5_##TYPE2[178 * 2 + 1] = {0}; \ > + TYPE3 e_5_##TYPE3[178] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_5_##TYPE1, d_5_##TYPE2, e_5_##TYPE3, x_5_##TYPE1, \ > + x2_5_##TYPE1, x3_5_##TYPE1, x4_5_##TYPE1, \ > + y_5_##TYPE2, y2_5_##TYPE2, z_5_##TYPE3, \ > + n_5_##TYPE1_##TYPE2_##TYPE3); \ > + for (int i = 0; i < n_5_##TYPE1_##TYPE2_##TYPE3; ++i) \ > + { \ > + if (f_5_##TYPE1[i * 4 + 0] != x_5_##TYPE1) \ > + __builtin_abort (); \ > + if (f_5_##TYPE1[i * 4 + 1] != x2_5_##TYPE1) \ > + __builtin_abort (); \ > + if (f_5_##TYPE1[i * 4 + 2] != x3_5_##TYPE1) \ > + __builtin_abort (); \ > + if (f_5_##TYPE1[i * 4 + 3] != x4_5_##TYPE1) \ > + __builtin_abort (); \ > + if (d_5_##TYPE2[i * 2 + 0] != y_5_##TYPE2) \ > + __builtin_abort (); \ > + if 
(d_5_##TYPE2[i * 2 + 1] != y2_5_##TYPE2) \ > + __builtin_abort (); \ > + if (e_5_##TYPE3[i] != z_5_##TYPE3) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_5_##TYPE1_##TYPE2_##TYPE3; \ > + i < n_5_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ > + { \ > + if (f_5_##TYPE1[i * 4 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_5_##TYPE1[i * 4 + 1] != 0) \ > + __builtin_abort (); \ > + if (f_5_##TYPE1[i * 4 + 2] != 0) \ > + __builtin_abort (); \ > + if (f_5_##TYPE1[i * 4 + 3] != 0) \ > + __builtin_abort (); \ > + if (d_5_##TYPE2[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (d_5_##TYPE2[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (e_5_##TYPE3[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_6(TYPE1, TYPE2, TYPE3) \ > + int n_6_##TYPE1_##TYPE2_##TYPE3 = 255; \ > + TYPE1 x_6_##TYPE1 = 239; \ > + TYPE1 x2_6_##TYPE1 = 132; \ > + TYPE1 x3_6_##TYPE1 = 39; \ > + TYPE1 x4_6_##TYPE1 = 48; \ > + TYPE2 y_6_##TYPE2 = 1036; \ > + TYPE2 y2_6_##TYPE2 = 3665; \ > + TYPE3 z_6_##TYPE3 = 5145656; \ > + TYPE1 f_6_##TYPE1[256 * 4 + 1] = {0}; \ > + TYPE2 d_6_##TYPE2[256 * 2 + 1] = {0}; \ > + TYPE3 e_6_##TYPE3[256] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_6_##TYPE1, d_6_##TYPE2, e_6_##TYPE3, x_6_##TYPE1, \ > + x2_6_##TYPE1, x3_6_##TYPE1, x4_6_##TYPE1, \ > + y_6_##TYPE2, y2_6_##TYPE2, z_6_##TYPE3, \ > + n_6_##TYPE1_##TYPE2_##TYPE3); \ > + for (int i = 0; i < n_6_##TYPE1_##TYPE2_##TYPE3; ++i) \ > + { \ > + if (f_6_##TYPE1[i * 4 + 0] != x_6_##TYPE1) \ > + __builtin_abort (); \ > + if (f_6_##TYPE1[i * 4 + 1] != x2_6_##TYPE1) \ > + __builtin_abort (); \ > + if (f_6_##TYPE1[i * 4 + 2] != x3_6_##TYPE1) \ > + __builtin_abort (); \ > + if (f_6_##TYPE1[i * 4 + 3] != x4_6_##TYPE1) \ > + __builtin_abort (); \ > + if (d_6_##TYPE2[i * 2 + 0] != y_6_##TYPE2) \ > + __builtin_abort (); \ > + if (d_6_##TYPE2[i * 2 + 1] != y2_6_##TYPE2) \ > + __builtin_abort (); \ > + if (e_6_##TYPE3[i] != z_6_##TYPE3) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_6_##TYPE1_##TYPE2_##TYPE3; \ > + 
i < n_6_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ > + { \ > + if (f_6_##TYPE1[i * 4 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_6_##TYPE1[i * 4 + 1] != 0) \ > + __builtin_abort (); \ > + if (f_6_##TYPE1[i * 4 + 2] != 0) \ > + __builtin_abort (); \ > + if (f_6_##TYPE1[i * 4 + 3] != 0) \ > + __builtin_abort (); \ > + if (d_6_##TYPE2[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (d_6_##TYPE2[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (e_6_##TYPE3[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_7(TYPE1, TYPE2, TYPE3) \ > + int n_7_##TYPE1_##TYPE2_##TYPE3 = 333; \ > + TYPE1 x_7_##TYPE1 = 239; \ > + TYPE1 x2_7_##TYPE1 = 132; \ > + TYPE1 x3_7_##TYPE1 = 39; \ > + TYPE1 x4_7_##TYPE1 = 48; \ > + TYPE2 y_7_##TYPE2 = 1036; \ > + TYPE2 y2_7_##TYPE2 = 3665; \ > + TYPE3 z_7_##TYPE3 = 5145656; \ > + TYPE1 f_7_##TYPE1[334 * 4 + 1] = {0}; \ > + TYPE2 d_7_##TYPE2[334 * 2 + 1] = {0}; \ > + TYPE3 e_7_##TYPE3[334] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_7_##TYPE1, d_7_##TYPE2, e_7_##TYPE3, x_7_##TYPE1, \ > + x2_7_##TYPE1, x3_7_##TYPE1, x4_7_##TYPE1, \ > + y_7_##TYPE2, y2_7_##TYPE2, z_7_##TYPE3, \ > + n_7_##TYPE1_##TYPE2_##TYPE3); \ > + for (int i = 0; i < n_7_##TYPE1_##TYPE2_##TYPE3; ++i) \ > + { \ > + if (f_7_##TYPE1[i * 4 + 0] != x_7_##TYPE1) \ > + __builtin_abort (); \ > + if (f_7_##TYPE1[i * 4 + 1] != x2_7_##TYPE1) \ > + __builtin_abort (); \ > + if (f_7_##TYPE1[i * 4 + 2] != x3_7_##TYPE1) \ > + __builtin_abort (); \ > + if (f_7_##TYPE1[i * 4 + 3] != x4_7_##TYPE1) \ > + __builtin_abort (); \ > + if (d_7_##TYPE2[i * 2 + 0] != y_7_##TYPE2) \ > + __builtin_abort (); \ > + if (d_7_##TYPE2[i * 2 + 1] != y2_7_##TYPE2) \ > + __builtin_abort (); \ > + if (e_7_##TYPE3[i] != z_7_##TYPE3) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_7_##TYPE1_##TYPE2_##TYPE3; \ > + i < n_7_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ > + { \ > + if (f_7_##TYPE1[i * 4 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_7_##TYPE1[i * 4 + 1] != 0) \ > + __builtin_abort (); \ > + if 
(f_7_##TYPE1[i * 4 + 2] != 0) \ > + __builtin_abort (); \ > + if (f_7_##TYPE1[i * 4 + 3] != 0) \ > + __builtin_abort (); \ > + if (d_7_##TYPE2[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (d_7_##TYPE2[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (e_7_##TYPE3[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_8(TYPE1, TYPE2, TYPE3) \ > + int n_8_##TYPE1_##TYPE2_##TYPE3 = 512; \ > + TYPE1 x_8_##TYPE1 = 239; \ > + TYPE1 x2_8_##TYPE1 = 132; \ > + TYPE1 x3_8_##TYPE1 = 39; \ > + TYPE1 x4_8_##TYPE1 = 48; \ > + TYPE2 y_8_##TYPE2 = 1036; \ > + TYPE2 y2_8_##TYPE2 = 3665; \ > + TYPE3 z_8_##TYPE3 = 5145656; \ > + TYPE1 f_8_##TYPE1[513 * 4 + 1] = {0}; \ > + TYPE2 d_8_##TYPE2[513 * 2 + 1] = {0}; \ > + TYPE3 e_8_##TYPE3[513] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_8_##TYPE1, d_8_##TYPE2, e_8_##TYPE3, x_8_##TYPE1, \ > + x2_8_##TYPE1, x3_8_##TYPE1, x4_8_##TYPE1, \ > + y_8_##TYPE2, y2_8_##TYPE2, z_8_##TYPE3, \ > + n_8_##TYPE1_##TYPE2_##TYPE3); \ > + for (int i = 0; i < n_8_##TYPE1_##TYPE2_##TYPE3; ++i) \ > + { \ > + if (f_8_##TYPE1[i * 4 + 0] != x_8_##TYPE1) \ > + __builtin_abort (); \ > + if (f_8_##TYPE1[i * 4 + 1] != x2_8_##TYPE1) \ > + __builtin_abort (); \ > + if (f_8_##TYPE1[i * 4 + 2] != x3_8_##TYPE1) \ > + __builtin_abort (); \ > + if (f_8_##TYPE1[i * 4 + 3] != x4_8_##TYPE1) \ > + __builtin_abort (); \ > + if (d_8_##TYPE2[i * 2 + 0] != y_8_##TYPE2) \ > + __builtin_abort (); \ > + if (d_8_##TYPE2[i * 2 + 1] != y2_8_##TYPE2) \ > + __builtin_abort (); \ > + if (e_8_##TYPE3[i] != z_8_##TYPE3) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_8_##TYPE1_##TYPE2_##TYPE3; \ > + i < n_8_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ > + { \ > + if (f_8_##TYPE1[i * 4 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_8_##TYPE1[i * 4 + 1] != 0) \ > + __builtin_abort (); \ > + if (f_8_##TYPE1[i * 4 + 2] != 0) \ > + __builtin_abort (); \ > + if (f_8_##TYPE1[i * 4 + 3] != 0) \ > + __builtin_abort (); \ > + if (d_8_##TYPE2[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if 
(d_8_##TYPE2[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (e_8_##TYPE3[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_9(TYPE1, TYPE2, TYPE3) \ > + int n_9_##TYPE1_##TYPE2_##TYPE3 = 637; \ > + TYPE1 x_9_##TYPE1 = 222; \ > + TYPE1 x2_9_##TYPE1 = 111; \ > + TYPE1 x3_9_##TYPE1 = 11; \ > + TYPE1 x4_9_##TYPE1 = 7; \ > + TYPE2 y_9_##TYPE2 = 2034; \ > + TYPE2 y2_9_##TYPE2 = 6987; \ > + TYPE3 z_9_##TYPE3 = 1564616; \ > + TYPE1 f_9_##TYPE1[638 * 4 + 1] = {0}; \ > + TYPE2 d_9_##TYPE2[638 * 2 + 1] = {0}; \ > + TYPE3 e_9_##TYPE3[638] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_9_##TYPE1, d_9_##TYPE2, e_9_##TYPE3, x_9_##TYPE1, \ > + x2_9_##TYPE1, x3_9_##TYPE1, x4_9_##TYPE1, \ > + y_9_##TYPE2, y2_9_##TYPE2, z_9_##TYPE3, \ > + n_9_##TYPE1_##TYPE2_##TYPE3); \ > + for (int i = 0; i < n_9_##TYPE1_##TYPE2_##TYPE3; ++i) \ > + { \ > + if (f_9_##TYPE1[i * 4 + 0] != x_9_##TYPE1) \ > + __builtin_abort (); \ > + if (f_9_##TYPE1[i * 4 + 1] != x2_9_##TYPE1) \ > + __builtin_abort (); \ > + if (f_9_##TYPE1[i * 4 + 2] != x3_9_##TYPE1) \ > + __builtin_abort (); \ > + if (f_9_##TYPE1[i * 4 + 3] != x4_9_##TYPE1) \ > + __builtin_abort (); \ > + if (d_9_##TYPE2[i * 2 + 0] != y_9_##TYPE2) \ > + __builtin_abort (); \ > + if (d_9_##TYPE2[i * 2 + 1] != y2_9_##TYPE2) \ > + __builtin_abort (); \ > + if (e_9_##TYPE3[i] != z_9_##TYPE3) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_9_##TYPE1_##TYPE2_##TYPE3; \ > + i < n_9_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ > + { \ > + if (f_9_##TYPE1[i * 4 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_9_##TYPE1[i * 4 + 1] != 0) \ > + __builtin_abort (); \ > + if (f_9_##TYPE1[i * 4 + 2] != 0) \ > + __builtin_abort (); \ > + if (f_9_##TYPE1[i * 4 + 3] != 0) \ > + __builtin_abort (); \ > + if (d_9_##TYPE2[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (d_9_##TYPE2[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (e_9_##TYPE3[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define run_10(TYPE1, TYPE2, TYPE3) \ > + int 
n_10_##TYPE1_##TYPE2_##TYPE3 = 777; \ > + TYPE1 x_10_##TYPE1 = 222; \ > + TYPE1 x2_10_##TYPE1 = 111; \ > + TYPE1 x3_10_##TYPE1 = 11; \ > + TYPE1 x4_10_##TYPE1 = 7; \ > + TYPE2 y_10_##TYPE2 = 2034; \ > + TYPE2 y2_10_##TYPE2 = 6987; \ > + TYPE3 z_10_##TYPE3 = 1564616; \ > + TYPE1 f_10_##TYPE1[778 * 4 + 1] = {0}; \ > + TYPE2 d_10_##TYPE2[778 * 2 + 1] = {0}; \ > + TYPE3 e_10_##TYPE3[778] = {0}; \ > + test_1_##TYPE1_##TYPE2 (f_10_##TYPE1, d_10_##TYPE2, e_10_##TYPE3, x_10_##TYPE1, \ > + x2_10_##TYPE1, x3_10_##TYPE1, x4_10_##TYPE1, \ > + y_10_##TYPE2, y2_10_##TYPE2, z_10_##TYPE3, \ > + n_10_##TYPE1_##TYPE2_##TYPE3); \ > + for (int i = 0; i < n_10_##TYPE1_##TYPE2_##TYPE3; ++i) \ > + { \ > + if (f_10_##TYPE1[i * 4 + 0] != x_10_##TYPE1) \ > + __builtin_abort (); \ > + if (f_10_##TYPE1[i * 4 + 1] != x2_10_##TYPE1) \ > + __builtin_abort (); \ > + if (f_10_##TYPE1[i * 4 + 2] != x3_10_##TYPE1) \ > + __builtin_abort (); \ > + if (f_10_##TYPE1[i * 4 + 3] != x4_10_##TYPE1) \ > + __builtin_abort (); \ > + if (d_10_##TYPE2[i * 2 + 0] != y_10_##TYPE2) \ > + __builtin_abort (); \ > + if (d_10_##TYPE2[i * 2 + 1] != y2_10_##TYPE2) \ > + __builtin_abort (); \ > + if (e_10_##TYPE3[i] != z_10_##TYPE3) \ > + __builtin_abort (); \ > + } \ > + for (int i = n_10_##TYPE1_##TYPE2_##TYPE3; \ > + i < n_10_##TYPE1_##TYPE2_##TYPE3 + 1; ++i) \ > + { \ > + if (f_10_##TYPE1[i * 4 + 0] != 0) \ > + __builtin_abort (); \ > + if (f_10_##TYPE1[i * 4 + 1] != 0) \ > + __builtin_abort (); \ > + if (f_10_##TYPE1[i * 4 + 2] != 0) \ > + __builtin_abort (); \ > + if (f_10_##TYPE1[i * 4 + 3] != 0) \ > + __builtin_abort (); \ > + if (d_10_##TYPE2[i * 2 + 0] != 0) \ > + __builtin_abort (); \ > + if (d_10_##TYPE2[i * 2 + 1] != 0) \ > + __builtin_abort (); \ > + if (e_10_##TYPE3[i] != 0) \ > + __builtin_abort (); \ > + } > + > +#define TEST_ALL(T) \ > + T (int8_t, int16_t, int32_t) \ > + T (uint8_t, uint16_t, uint32_t) \ > + T (int16_t, int32_t, int64_t) \ > + T (uint16_t, uint32_t, uint64_t) > diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.s b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.s > new file mode 100644 > index 00000000000..64b36fe5092 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.s > @@ -0,0 +1,774 @@ > + .file "multiple_rgroup-2.c" > + .option nopic > + .attribute arch, "rv32i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_v1p0_zicsr2p0_zifencei2p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0" > + .attribute unaligned_access, 0 > + .attribute stack_align, 16 > + .text > + .align 1 > + .globl test_1_TYPE1_int16_t > + .type test_1_TYPE1_int16_t, @function > +test_1_TYPE1_int16_t: > + addi sp,sp,-48 > + lw t5,56(sp) > + lh t3,48(sp) > + lw t6,52(sp) > + ble t5,zero,.L1 > + or t1,a2,a1 > + or t1,a0,t1 > + andi t1,t1,15 > + bne t1,zero,.L3 > + andi a4,a4,0xff > + andi a3,a3,0xff > + slli a4,a4,8 > + andi a5,a5,0xff > + slli a5,a5,16 > + or t4,a3,a4 > + vsetvli t1,zero,e8,m1,ta,ma > + li t2,16777216 > + addi t2,t2,-1 > + or t1,t4,a5 > + slli a6,a6,24 > + and t1,t1,t2 > + vmv.v.i v1,0 > + sw s0,44(sp) > + vs1r.v v1,0(sp) > + or s0,t1,a6 > + sw s0,0(sp) > + lw s0,4(sp) > + li t4,-65536 > + addi t4,t4,255 > + andi t0,s0,-256 > + or t0,t0,a3 > + and s0,t0,t4 > + li t1,-16711680 > + addi t1,t1,-1 > + or s0,s0,a4 > + and t0,s0,t1 > + or t0,t0,a5 > + and t0,t0,t2 > + or s0,t0,a6 > + sw s0,4(sp) > + lw s0,8(sp) > + slli a7,a7,16 > + li t0,65536 > + srli a7,a7,16 > + addi t0,t0,-1 > + sw s1,40(sp) > + slli s1,t3,16 > + vsetvli t3,zero,e16,m1,ta,ma > + sw s2,36(sp) > + sw s3,32(sp) > + andi s2,s0,-256 > + addi t3,sp,16 > + and s0,a7,t0 > + vmv.v.i v1,0 > + or s0,s0,s1 > + vs1r.v v1,0(t3) > + sw s0,16(sp) > + lw s0,20(sp) > + li t3,-65536 > + or s2,s2,a3 > + and s0,t3,s0 > + and s3,s2,t4 > + or s0,s0,a7 > + or s3,s3,a4 > + and s0,s0,t0 > + and s2,s3,t1 > + or s0,s0,s1 > + or s2,s2,a5 > + sw s0,20(sp) > + lw s0,12(sp) > + and s2,s2,t2 > + or s2,s2,a6 
> + sw s2,8(sp) > + andi s2,s0,-256 > + lw s0,24(sp) > + or s2,s2,a3 > + and t4,s2,t4 > + and s0,t3,s0 > + or a3,s0,a7 > + and a3,a3,t0 > + or t4,t4,a4 > + or a4,a3,s1 > + sw a4,24(sp) > + lw a4,28(sp) > + and t1,t4,t1 > + or t1,t1,a5 > + and t3,t3,a4 > + or t3,t3,a7 > + and t1,t1,t2 > + and t3,t3,t0 > + or a4,t1,a6 > + sw a4,12(sp) > + or a4,t3,s1 > + slli a7,t5,2 > + sw a4,28(sp) > + li a5,48 > + vsetvli a4,zero,e32,m1,ta,ma > + mv t3,a7 > + vmv.v.x v1,t6 > + bltu a7,a5,.L14 > + li a5,32 > + addi t3,t3,-48 > + mv t4,a7 > + bltu a7,a5,.L15 > +.L5: > + li a5,16 > + addi t4,t4,-32 > + mv t1,a7 > + bltu a7,a5,.L16 > +.L6: > + addi t1,t1,-16 > +.L7: > + vsetvli a6,a7,e8,m4,tu,mu > + vl1re8.v v2,0(sp) > + addi a4,a0,16 > + vsetvli zero,a6,e8,m1,ta,ma > + vse8.v v2,0(a0) > + addi a5,a0,32 > + vsetvli a3,t1,e8,m4,tu,mu > + vsetvli zero,a3,e8,m1,ta,ma > + vse8.v v2,0(a4) > + addi t6,a0,48 > + vsetvli a4,t4,e8,m4,tu,mu > + vsetvli zero,a4,e8,m1,ta,ma > + vse8.v v2,0(a5) > + srli t5,a6,1 > + vsetvli a5,t3,e8,m4,tu,mu > + addi s0,sp,16 > + vsetvli zero,a5,e8,m1,ta,ma > + vse8.v v2,0(t6) > + vl1re16.v v2,0(s0) > + srli t6,a3,1 > + vsetvli zero,t5,e16,m1,ta,ma > + addi t5,a1,16 > + vse16.v v2,0(a1) > + vsetvli zero,t6,e16,m1,ta,ma > + srli t6,a4,1 > + vse16.v v2,0(t5) > + addi t5,a1,32 > + vsetvli zero,t6,e16,m1,ta,ma > + srli t6,a5,1 > + vse16.v v2,0(t5) > + addi t5,a1,48 > + vsetvli zero,t6,e16,m1,ta,ma > + vse16.v v2,0(t5) > + srli t5,a6,2 > + vsetvli zero,t5,e32,m1,ta,ma > + addi t6,a2,16 > + vse32.v v1,0(a2) > + srli a3,a3,2 > + vsetvli zero,a3,e32,m1,ta,ma > + addi t5,a2,32 > + vse32.v v1,0(t6) > + srli a4,a4,2 > + addi a3,a2,48 > + vsetvli zero,a4,e32,m1,ta,ma > + srli a5,a5,2 > + vse32.v v1,0(t5) > + sub a7,a7,a6 > + vsetvli zero,a5,e32,m1,ta,ma > + vse32.v v1,0(a3) > + sub t1,t1,a6 > + sub t4,t4,a6 > + sub t3,t3,a6 > + addi a0,a0,64 > + addi a1,a1,64 > + addi a2,a2,64 > + bne a7,zero,.L7 > + lw s0,44(sp) > + lw s1,40(sp) > + lw s2,36(sp) > + lw s3,32(sp) > +.L1: > + 
addi sp,sp,48 > + jr ra > +.L16: > + li t1,16 > + j .L6 > +.L15: > + li t4,32 > + li a5,16 > + addi t4,t4,-32 > + mv t1,a7 > + bgeu a7,a5,.L6 > + j .L16 > +.L14: > + li t3,48 > + li a5,32 > + addi t3,t3,-48 > + mv t4,a7 > + bgeu a7,a5,.L5 > + j .L15 > +.L3: > + slli t5,t5,2 > + add t5,a0,t5 > +.L9: > + sb a3,0(a0) > + sb a4,1(a0) > + sb a5,2(a0) > + sb a6,3(a0) > + sh a7,0(a1) > + sh t3,2(a1) > + sw t6,0(a2) > + addi a0,a0,4 > + addi a1,a1,4 > + addi a2,a2,4 > + bne a0,t5,.L9 > + addi sp,sp,48 > + jr ra > + .size test_1_TYPE1_int16_t, .-test_1_TYPE1_int16_t > + .align 1 > + .globl test_1_TYPE1_uint16_t > + .type test_1_TYPE1_uint16_t, @function > +test_1_TYPE1_uint16_t: > + addi sp,sp,-48 > + lw t1,56(sp) > + lhu t4,48(sp) > + lw t5,52(sp) > + ble t1,zero,.L17 > + or t3,a2,a1 > + or t3,a0,t3 > + andi t3,t3,15 > + bne t3,zero,.L19 > + slli a4,a4,8 > + slli a5,a5,16 > + or t6,a3,a4 > + vsetvli t3,zero,e8,m1,ta,ma > + li t2,16777216 > + addi t2,t2,-1 > + or t3,t6,a5 > + slli a6,a6,24 > + and t3,t3,t2 > + vmv.v.i v1,0 > + sw s0,44(sp) > + vs1r.v v1,0(sp) > + or s0,t3,a6 > + sw s0,0(sp) > + lw s0,4(sp) > + li t6,-65536 > + addi t6,t6,255 > + andi t0,s0,-256 > + or t0,t0,a3 > + and s0,t0,t6 > + li t3,-16711680 > + addi t3,t3,-1 > + or s0,s0,a4 > + and t0,s0,t3 > + or t0,t0,a5 > + and t0,t0,t2 > + or s0,t0,a6 > + sw s0,4(sp) > + lw s0,8(sp) > + li t0,65536 > + addi t0,t0,-1 > + sw s2,36(sp) > + andi s2,s0,-256 > + slli s0,t4,16 > + vsetvli t4,zero,e16,m1,ta,ma > + sw s1,40(sp) > + sw s3,32(sp) > + addi t4,sp,16 > + and s1,a7,t0 > + vmv.v.i v1,0 > + or s1,s1,s0 > + vs1r.v v1,0(t4) > + sw s1,16(sp) > + lw s1,20(sp) > + li t4,-65536 > + or s2,s2,a3 > + and s1,t4,s1 > + and s3,s2,t6 > + or s1,s1,a7 > + or s3,s3,a4 > + and s1,s1,t0 > + and s2,s3,t3 > + or s1,s1,s0 > + or s2,s2,a5 > + sw s1,20(sp) > + lw s1,12(sp) > + and s2,s2,t2 > + or s2,s2,a6 > + sw s2,8(sp) > + andi s2,s1,-256 > + lw s1,24(sp) > + or s2,s2,a3 > + and t6,s2,t6 > + and s1,t4,s1 > + or a3,s1,a7 > + and 
a3,a3,t0 > + or t6,t6,a4 > + or a4,a3,s0 > + sw a4,24(sp) > + lw a4,28(sp) > + and t3,t6,t3 > + or t3,t3,a5 > + and t4,t4,a4 > + and t3,t3,t2 > + or t4,t4,a7 > + or a4,t3,a6 > + and t4,t4,t0 > + sw a4,12(sp) > + or a4,t4,s0 > + slli t1,t1,2 > + sw a4,28(sp) > + li a5,48 > + vsetvli a4,zero,e32,m1,ta,ma > + mv t3,t1 > + vmv.v.x v1,t5 > + bltu t1,a5,.L29 > + li a5,32 > + addi t3,t3,-48 > + mv t4,t1 > + bltu t1,a5,.L30 > +.L21: > + li a5,16 > + addi t4,t4,-32 > + mv a7,t1 > + bltu t1,a5,.L31 > +.L22: > + addi a7,a7,-16 > +.L23: > + vsetvli a6,t1,e8,m4,tu,mu > + vl1re8.v v2,0(sp) > + addi a4,a0,16 > + vsetvli zero,a6,e8,m1,ta,ma > + vse8.v v2,0(a0) > + addi a5,a0,32 > + vsetvli a3,a7,e8,m4,tu,mu > + vsetvli zero,a3,e8,m1,ta,ma > + vse8.v v2,0(a4) > + addi t6,a0,48 > + vsetvli a4,t4,e8,m4,tu,mu > + vsetvli zero,a4,e8,m1,ta,ma > + vse8.v v2,0(a5) > + srli t5,a6,1 > + vsetvli a5,t3,e8,m4,tu,mu > + addi s0,sp,16 > + vsetvli zero,a5,e8,m1,ta,ma > + vse8.v v2,0(t6) > + vl1re16.v v2,0(s0) > + srli t6,a3,1 > + vsetvli zero,t5,e16,m1,ta,ma > + addi t5,a1,16 > + vse16.v v2,0(a1) > + vsetvli zero,t6,e16,m1,ta,ma > + srli t6,a4,1 > + vse16.v v2,0(t5) > + addi t5,a1,32 > + vsetvli zero,t6,e16,m1,ta,ma > + srli t6,a5,1 > + vse16.v v2,0(t5) > + addi t5,a1,48 > + vsetvli zero,t6,e16,m1,ta,ma > + vse16.v v2,0(t5) > + srli t5,a6,2 > + vsetvli zero,t5,e32,m1,ta,ma > + addi t6,a2,16 > + vse32.v v1,0(a2) > + srli a3,a3,2 > + vsetvli zero,a3,e32,m1,ta,ma > + addi t5,a2,32 > + vse32.v v1,0(t6) > + srli a4,a4,2 > + addi a3,a2,48 > + vsetvli zero,a4,e32,m1,ta,ma > + srli a5,a5,2 > + vse32.v v1,0(t5) > + sub t1,t1,a6 > + vsetvli zero,a5,e32,m1,ta,ma > + vse32.v v1,0(a3) > + sub a7,a7,a6 > + sub t4,t4,a6 > + sub t3,t3,a6 > + addi a0,a0,64 > + addi a1,a1,64 > + addi a2,a2,64 > + bne t1,zero,.L23 > + lw s0,44(sp) > + lw s1,40(sp) > + lw s2,36(sp) > + lw s3,32(sp) > +.L17: > + addi sp,sp,48 > + jr ra > +.L31: > + li a7,16 > + j .L22 > +.L30: > + li t4,32 > + li a5,16 > + addi t4,t4,-32 > + mv a7,t1 
> + bgeu t1,a5,.L22 > + j .L31 > +.L29: > + li t3,48 > + li a5,32 > + addi t3,t3,-48 > + mv t4,t1 > + bgeu t1,a5,.L21 > + j .L30 > +.L19: > + slli t1,t1,2 > + add t1,a0,t1 > +.L25: > + sb a3,0(a0) > + sb a4,1(a0) > + sb a5,2(a0) > + sb a6,3(a0) > + sh a7,0(a1) > + sh t4,2(a1) > + sw t5,0(a2) > + addi a0,a0,4 > + addi a1,a1,4 > + addi a2,a2,4 > + bne a0,t1,.L25 > + addi sp,sp,48 > + jr ra > + .size test_1_TYPE1_uint16_t, .-test_1_TYPE1_uint16_t > + .align 1 > + .globl test_1_TYPE1_int32_t > + .type test_1_TYPE1_int32_t, @function > +test_1_TYPE1_int32_t: > + addi sp,sp,-64 > + lw t1,80(sp) > + lw t3,64(sp) > + lw t5,72(sp) > + lw t6,76(sp) > + ble t1,zero,.L32 > + addi t0,t1,-1 > + li t4,6 > + bleu t0,t4,.L34 > + or t4,a2,a1 > + or t4,a0,t4 > + andi t4,t4,15 > + beq t4,zero,.L44 > +.L34: > + slli t1,t1,3 > + add t1,a0,t1 > +.L40: > + sh a3,0(a0) > + sh a4,2(a0) > + sh a5,4(a0) > + sh a6,6(a0) > + sw a7,0(a1) > + sw t3,4(a1) > + sw t5,0(a2) > + sw t6,4(a2) > + addi a0,a0,8 > + addi a1,a1,8 > + addi a2,a2,8 > + bne a0,t1,.L40 > +.L32: > + addi sp,sp,64 > + jr ra > +.L44: > + sw s0,60(sp) > + slli t2,a3,16 > + li s0,65536 > + addi s0,s0,-1 > + vsetvli t4,zero,e16,m1,ta,ma > + srli t2,t2,16 > + slli a4,a4,16 > + and a3,t2,s0 > + addi t4,sp,8 > + or a3,a3,a4 > + vmv.v.i v1,0 > + vs1r.v v1,0(t4) > + sw a3,8(sp) > + lw a3,12(sp) > + li t4,-65536 > + slli a5,a5,16 > + and t0,t4,a3 > + srli a5,a5,16 > + or t0,t0,a5 > + slli a6,a6,16 > + and t0,t0,s0 > + or a3,t0,a6 > + sw a3,12(sp) > + lw a3,16(sp) > + sw t3,28(sp) > + sw t3,36(sp) > + and a3,t4,a3 > + or a3,a3,t2 > + and a3,a3,s0 > + or a4,a3,a4 > + sw a4,16(sp) > + lw a4,20(sp) > + sw a7,24(sp) > + sw a7,32(sp) > + and t4,t4,a4 > + or a5,t4,a5 > + and t4,a5,s0 > + or a4,t4,a6 > + sw a4,20(sp) > + sw t5,40(sp) > + sw t6,44(sp) > + slli a5,t1,2 > + addi s0,sp,40 > + li a3,24 > + vsetvli a4,zero,e64,m1,ta,ma > + mv t3,a5 > + vlse64.v v1,0(s0),zero > + bltu a5,a3,.L45 > + li a4,16 > + addi t3,t3,-24 > + mv t4,a5 > + bltu 
a5,a4,.L46 > +.L36: > + li a4,8 > + addi t4,t4,-16 > + mv t1,a5 > + bltu a5,a4,.L47 > +.L37: > + addi t1,t1,-8 > +.L38: > + addi a3,sp,8 > + vsetvli a4,a5,e8,m2,tu,mu > + vl1re16.v v2,0(a3) > + addi a6,a0,16 > + vsetvli zero,a4,e16,m1,ta,ma > + vse16.v v2,0(a0) > + addi a3,a0,32 > + vsetvli a7,t1,e8,m2,tu,mu > + vsetvli zero,a7,e16,m1,ta,ma > + vse16.v v2,0(a6) > + addi t6,a0,48 > + vsetvli a6,t4,e8,m2,tu,mu > + vsetvli zero,a6,e16,m1,ta,ma > + vse16.v v2,0(a3) > + srli t5,a4,1 > + vsetvli a3,t3,e8,m2,tu,mu > + addi s0,sp,24 > + vsetvli zero,a3,e16,m1,ta,ma > + vse16.v v2,0(t6) > + vl1re32.v v2,0(s0) > + srli t6,a7,1 > + vsetvli zero,t5,e32,m1,ta,ma > + addi t5,a1,16 > + vse32.v v2,0(a1) > + vsetvli zero,t6,e32,m1,ta,ma > + srli t6,a6,1 > + vse32.v v2,0(t5) > + addi t5,a1,32 > + vsetvli zero,t6,e32,m1,ta,ma > + srli t6,a3,1 > + vse32.v v2,0(t5) > + addi t5,a1,48 > + vsetvli zero,t6,e32,m1,ta,ma > + vse32.v v2,0(t5) > + srli t5,a4,2 > + vsetvli zero,t5,e64,m1,ta,ma > + addi t6,a2,16 > + vse64.v v1,0(a2) > + srli a7,a7,2 > + vsetvli zero,a7,e64,m1,ta,ma > + addi t5,a2,32 > + vse64.v v1,0(t6) > + srli a6,a6,2 > + addi a7,a2,48 > + vsetvli zero,a6,e64,m1,ta,ma > + srli a3,a3,2 > + vse64.v v1,0(t5) > + sub a5,a5,a4 > + vsetvli zero,a3,e64,m1,ta,ma > + vse64.v v1,0(a7) > + sub t1,t1,a4 > + sub t4,t4,a4 > + sub t3,t3,a4 > + addi a0,a0,64 > + addi a1,a1,64 > + addi a2,a2,64 > + bne a5,zero,.L38 > + lw s0,60(sp) > + addi sp,sp,64 > + jr ra > +.L47: > + li t1,8 > + j .L37 > +.L46: > + li t4,16 > + li a4,8 > + addi t4,t4,-16 > + mv t1,a5 > + bgeu a5,a4,.L37 > + j .L47 > +.L45: > + li t3,24 > + li a4,16 > + addi t3,t3,-24 > + mv t4,a5 > + bgeu a5,a4,.L36 > + j .L46 > + .size test_1_TYPE1_int32_t, .-test_1_TYPE1_int32_t > + .align 1 > + .globl test_1_TYPE1_uint32_t > + .type test_1_TYPE1_uint32_t, @function > +test_1_TYPE1_uint32_t: > + addi sp,sp,-48 > + lw t1,64(sp) > + lw t5,48(sp) > + lw t3,56(sp) > + lw t4,60(sp) > + ble t1,zero,.L48 > + addi t0,t1,-1 > + li t6,6 > + bleu 
t0,t6,.L50 > + or t6,a2,a1 > + or t6,a0,t6 > + andi t6,t6,15 > + beq t6,zero,.L60 > +.L50: > + slli t1,t1,3 > + add t1,a0,t1 > +.L56: > + sh a3,0(a0) > + sh a4,2(a0) > + sh a5,4(a0) > + sh a6,6(a0) > + sw a7,0(a1) > + sw t5,4(a1) > + sw t3,0(a2) > + sw t4,4(a2) > + addi a0,a0,8 > + addi a1,a1,8 > + addi a2,a2,8 > + bne a0,t1,.L56 > +.L48: > + addi sp,sp,48 > + jr ra > +.L60: > + li t0,65536 > + addi t0,t0,-1 > + vsetvli t6,zero,e16,m1,ta,ma > + slli a4,a4,16 > + and t2,a3,t0 > + addi t6,sp,8 > + or t2,t2,a4 > + vmv.v.i v1,0 > + vs1r.v v1,0(t6) > + sw t2,8(sp) > + lw t2,12(sp) > + li t6,-65536 > + slli a6,a6,16 > + and t2,t6,t2 > + or t2,t2,a5 > + and t2,t2,t0 > + or t2,t2,a6 > + sw t2,12(sp) > + lw t2,16(sp) > + sw t3,40(sp) > + sw a7,24(sp) > + and t2,t6,t2 > + or a3,t2,a3 > + and a3,a3,t0 > + or a4,a3,a4 > + sw a4,16(sp) > + lw a4,20(sp) > + sw t5,28(sp) > + sw a7,32(sp) > + and t6,t6,a4 > + or t6,t6,a5 > + and t6,t6,t0 > + or a5,t6,a6 > + sw a5,20(sp) > + sw t4,44(sp) > + slli a4,t1,2 > + sw t5,36(sp) > + addi a6,sp,40 > + li a3,24 > + vsetvli a5,zero,e64,m1,ta,ma > + mv t3,a4 > + vlse64.v v1,0(a6),zero > + bltu a4,a3,.L61 > + li a5,16 > + addi t3,t3,-24 > + mv t4,a4 > + bltu a4,a5,.L62 > +.L52: > + li a5,8 > + addi t4,t4,-16 > + mv t1,a4 > + bltu a4,a5,.L63 > +.L53: > + addi t1,t1,-8 > +.L54: > + addi a3,sp,8 > + vsetvli a5,a4,e8,m2,tu,mu > + vl1re16.v v2,0(a3) > + addi a6,a0,16 > + vsetvli zero,a5,e16,m1,ta,ma > + vse16.v v2,0(a0) > + addi a3,a0,32 > + vsetvli a7,t1,e8,m2,tu,mu > + vsetvli zero,a7,e16,m1,ta,ma > + vse16.v v2,0(a6) > + addi t6,a0,48 > + vsetvli a6,t4,e8,m2,tu,mu > + vsetvli zero,a6,e16,m1,ta,ma > + vse16.v v2,0(a3) > + srli t5,a5,1 > + vsetvli a3,t3,e8,m2,tu,mu > + vsetvli zero,a3,e16,m1,ta,ma > + vse16.v v2,0(t6) > + addi t6,sp,24 > + vl1re32.v v2,0(t6) > + vsetvli zero,t5,e32,m1,ta,ma > + srli t6,a7,1 > + vse32.v v2,0(a1) > + addi t5,a1,16 > + vsetvli zero,t6,e32,m1,ta,ma > + srli t6,a6,1 > + vse32.v v2,0(t5) > + addi t5,a1,32 > + vsetvli 
zero,t6,e32,m1,ta,ma > + srli t6,a3,1 > + vse32.v v2,0(t5) > + addi t5,a1,48 > + vsetvli zero,t6,e32,m1,ta,ma > + vse32.v v2,0(t5) > + srli t5,a5,2 > + vsetvli zero,t5,e64,m1,ta,ma > + addi t6,a2,16 > + vse64.v v1,0(a2) > + srli a7,a7,2 > + vsetvli zero,a7,e64,m1,ta,ma > + addi t5,a2,32 > + vse64.v v1,0(t6) > + srli a6,a6,2 > + addi a7,a2,48 > + vsetvli zero,a6,e64,m1,ta,ma > + srli a3,a3,2 > + vse64.v v1,0(t5) > + sub a4,a4,a5 > + vsetvli zero,a3,e64,m1,ta,ma > + vse64.v v1,0(a7) > + sub t1,t1,a5 > + sub t4,t4,a5 > + sub t3,t3,a5 > + addi a0,a0,64 > + addi a1,a1,64 > + addi a2,a2,64 > + bne a4,zero,.L54 > + addi sp,sp,48 > + jr ra > +.L63: > + li t1,8 > + j .L53 > +.L62: > + li t4,16 > + li a5,8 > + addi t4,t4,-16 > + mv t1,a4 > + bgeu a4,a5,.L53 > + j .L63 > +.L61: > + li t3,24 > + li a5,16 > + addi t3,t3,-24 > + mv t4,a4 > + bgeu a4,a5,.L52 > + j .L62 > + .size test_1_TYPE1_uint32_t, .-test_1_TYPE1_uint32_t > + .ident "GCC: (GNU) 13.0.1 20230324 (experimental)" > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c > new file mode 100644 > index 00000000000..d3e187eae68 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c > @@ -0,0 +1,19 @@ > +/* { dg-do run { target { riscv_vector } } } */ > +/* { dg-additional-options "--param riscv-autovec-preference=fixed-vlmax" } */ > + > +#include "multiple_rgroup-1.c" > + > +int main (void) > +{ > + TEST_ALL (run_1) > + TEST_ALL (run_2) > + TEST_ALL (run_3) > + TEST_ALL (run_4) > + TEST_ALL (run_5) > + TEST_ALL (run_6) > + TEST_ALL (run_7) > + TEST_ALL (run_8) > + TEST_ALL (run_9) > + TEST_ALL (run_10) > + return 0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c > new file mode 100644 > index 00000000000..5166c9e35a0 > --- /dev/null > +++ 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c
> @@ -0,0 +1,19 @@
> +/* { dg-do run { target { riscv_vector } } } */
> +/* { dg-additional-options "--param riscv-autovec-preference=fixed-vlmax" } */
> +
> +#include "multiple_rgroup-2.c"
> +
> +int main (void)
> +{
> +  TEST_ALL (run_1)
> +  TEST_ALL (run_2)
> +  TEST_ALL (run_3)
> +  TEST_ALL (run_4)
> +  TEST_ALL (run_5)
> +  TEST_ALL (run_6)
> +  TEST_ALL (run_7)
> +  TEST_ALL (run_8)
> +  TEST_ALL (run_9)
> +  TEST_ALL (run_10)
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.c
> new file mode 100644
> index 00000000000..6384888dd03
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=scalable -fno-vect-cost-model -fno-tree-loop-distribute-patterns" } */
> +
> +#include "single_rgroup-1.h"
> +
> +TEST_ALL (test_1)
> +
> +/* { dg-final { scan-assembler-times {vsetvli} 10 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.h
> new file mode 100644
> index 00000000000..be6b4c641cb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup-1.h
> @@ -0,0 +1,106 @@
> +#include <stddef.h>
> +#include <stdint.h>
> +
> +#define N 777
> +
> +#define test_1(TYPE) \
> +  TYPE a_##TYPE[N]; \
> +  TYPE b_##TYPE[N]; \
> +  void __attribute__ ((noinline, noclone)) test_1_##TYPE (unsigned int n) \
> +  { \
> +    unsigned int i = 0; \
> +    for (i = 0; i < n; i++) \
> +      b_##TYPE[i] = a_##TYPE[i]; \
> +  }
> +
> +#define run_1(TYPE) \
> +  for (unsigned int i = 0; i < N; i++) \
> +    a_##TYPE[i] = i * 2 * 33 + 1 + 109; \
> +  test_1_##TYPE (5); \
> +  for (unsigned int i = 0; i < 5; i++) \
> +    if (b_##TYPE[i] != a_##TYPE[i]) \
> +      __builtin_abort ();
> +
> +#define run_2(TYPE) \
> +  for (unsigned int i = 0; i < N; i++) \
> +    a_##TYPE[i] = i * 2 * 57 + 1 + 999; \
> +  test_1_##TYPE (17); \
> +  for (unsigned int i = 0; i < 17; i++) \
> +    if (b_##TYPE[i] != a_##TYPE[i]) \
> +      __builtin_abort ();
> +
> +#define run_3(TYPE) \
> +  for (unsigned int i = 0; i < N; i++) \
> +    a_##TYPE[i] = i * 2 * 77 + 1 + 3; \
> +  test_1_##TYPE (32); \
> +  for (unsigned int i = 0; i < 32; i++) \
> +    if (b_##TYPE[i] != a_##TYPE[i]) \
> +      __builtin_abort ();
> +
> +#define run_4(TYPE) \
> +  for (unsigned int i = 0; i < N; i++) \
> +    a_##TYPE[i] = i * 2 * 45 + 1 + 11; \
> +  test_1_##TYPE (128); \
> +  for (unsigned int i = 0; i < 128; i++) \
> +    if (b_##TYPE[i] != a_##TYPE[i]) \
> +      __builtin_abort ();
> +
> +#define run_5(TYPE) \
> +  for (unsigned int i = 0; i < N; i++) \
> +    a_##TYPE[i] = i * 2 * 199 + 1 + 79; \
> +  test_1_##TYPE (177); \
> +  for (unsigned int i = 0; i < 177; i++) \
> +    if (b_##TYPE[i] != a_##TYPE[i]) \
> +      __builtin_abort ();
> +
> +#define run_6(TYPE) \
> +  for (unsigned int i = 0; i < N; i++) \
> +    a_##TYPE[i] = i * 2 * 377 + 1 + 73; \
> +  test_1_##TYPE (255); \
> +  for (unsigned int i = 0; i < 255; i++) \
> +    if (b_##TYPE[i] != a_##TYPE[i]) \
> +      __builtin_abort ();
> +
> +#define run_7(TYPE) \
> +  for (unsigned int i = 0; i < N; i++) \
> +    a_##TYPE[i] = i * 2 * 98 + 1 + 66; \
> +  test_1_##TYPE (333); \
> +  for (unsigned int i = 0; i < 333; i++) \
> +    if (b_##TYPE[i] != a_##TYPE[i]) \
> +      __builtin_abort ();
> +
> +#define run_8(TYPE) \
> +  for (unsigned int i = 0; i < N; i++) \
> +    a_##TYPE[i] = i * 2 * 7 + 1 * 7; \
> +  test_1_##TYPE (512); \
> +  for (unsigned int i = 0; i < 512; i++) \
> +    if (b_##TYPE[i] != a_##TYPE[i]) \
> +      __builtin_abort ();
> +
> +#define run_9(TYPE) \
> +  for (unsigned int i = 0; i < N; i++) \
> +    a_##TYPE[i] = i * 2 + 1 + 88; \
> +  test_1_##TYPE (637); \
> +  for (unsigned int i = 0; i < 637; i++) \
> +    if (b_##TYPE[i] != a_##TYPE[i]) \
> +      __builtin_abort ();
> +
> +#define run_10(TYPE) \
> +  for (unsigned int i = 0; i < N; i++) \
> +    a_##TYPE[i] = i * 2 * 331 + 1 + 547; \
> +  test_1_##TYPE (777); \
> +  for (unsigned int i = 0; i < 777; i++) \
> +    if (b_##TYPE[i] != a_##TYPE[i]) \
> +      __builtin_abort ();
> +
> +#define TEST_ALL(T) \
> +  T (int8_t) \
> +  T (uint8_t) \
> +  T (int16_t) \
> +  T (uint16_t) \
> +  T (int32_t) \
> +  T (uint32_t) \
> +  T (int64_t) \
> +  T (uint64_t) \
> +  T (float) \
> +  T (double)
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-1.c
> new file mode 100644
> index 00000000000..4af2f18de8a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-1.c
> @@ -0,0 +1,19 @@
> +/* { dg-do run { target { riscv_vector } } } */
> +/* { dg-additional-options "-fno-vect-cost-model -fno-tree-loop-distribute-patterns --param riscv-autovec-preference=scalable" } */
> +
> +#include "single_rgroup-1.c"
> +
> +int main (void)
> +{
> +  TEST_ALL (run_1)
> +  TEST_ALL (run_2)
> +  TEST_ALL (run_3)
> +  TEST_ALL (run_4)
> +  TEST_ALL (run_5)
> +  TEST_ALL (run_6)
> +  TEST_ALL (run_7)
> +  TEST_ALL (run_8)
> +  TEST_ALL (run_9)
> +  TEST_ALL (run_10)
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/template-1.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/template-1.h
> new file mode 100644
> index 00000000000..799e2d7d754
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/template-1.h
> @@ -0,0 +1,68 @@
> +#include <stddef.h>
> +#include <stdint.h>
> +
> +void
> +foo0 (int8_t *__restrict f, int16_t *__restrict d, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      f[i * 2 + 0] = 1;
> +      f[i * 2 + 1] = 2;
> +      d[i] = 3;
> +    }
> +}
> +
> +void
> +foo1 (int16_t *__restrict f, int32_t *__restrict d, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      f[i * 2 + 0] = 1;
> +      f[i * 2 + 1] = 2;
> +      d[i] = 3;
> +    }
> +}
> +
> +void
> +foo2 (int32_t *__restrict f, int64_t *__restrict d, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      f[i * 2 + 0] = 1;
> +      f[i * 2 + 1] = 2;
> +      d[i] = 3;
> +    }
> +}
> +
> +void
> +foo3 (int16_t *__restrict f, float *__restrict d, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      f[i * 2 + 0] = 1;
> +      f[i * 2 + 1] = 2;
> +      d[i] = 3;
> +    }
> +}
> +
> +void
> +foo4 (int32_t *__restrict f, float *__restrict d, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      f[i * 2 + 0] = 1;
> +      f[i * 2 + 1] = 2;
> +      d[i] = 3;
> +    }
> +}
> +
> +void
> +foo5 (float *__restrict f, double *__restrict d, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      f[i * 2 + 0] = 1;
> +      f[i * 2 + 1] = 2;
> +      d[i] = 3;
> +    }
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-1.c
> new file mode 100644
> index 00000000000..7ff84f60749
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-1.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-2.c
> new file mode 100644
> index 00000000000..dc22eefbd36
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-2.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 5 "vect" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-1.c
> new file mode 100644
> index 00000000000..36f6d98a5cb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-1.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve32f -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-2.c
> new file mode 100644
> index 00000000000..794f28e73bd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-2.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve32f -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> +
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c
> new file mode 100644
> index 00000000000..d5e36190b31
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve32f_zvl128b -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c
> new file mode 100644
> index 00000000000..d154df4c4ba
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-2.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve32f_zvl128b -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 4 "vect" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-1.c
> new file mode 100644
> index 00000000000..68e7696ed65
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-1.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve32x -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-2.c
> new file mode 100644
> index 00000000000..f8860a36332
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x-2.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve32x -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> +
> +
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c
> new file mode 100644
> index 00000000000..3a6a3aa1261
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-1.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve32x_zvl128b -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> +
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c
> new file mode 100644
> index 00000000000..d1aaf3f4297
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32x_zvl128b-2.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve32x_zvl128b -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 2 "vect" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-1.c
> new file mode 100644
> index 00000000000..0d03536389f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-1.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve64d -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-2.c
> new file mode 100644
> index 00000000000..ca423285011
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d-2.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve64d -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c
> new file mode 100644
> index 00000000000..4c6c7e2fb3b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve64d_zvl128b -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-2.c
> new file mode 100644
> index 00000000000..b8253476973
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-2.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve64d_zvl128b -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 5 "vect" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-1.c
> new file mode 100644
> index 00000000000..e7900b82215
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-1.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve64f -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-2.c
> new file mode 100644
> index 00000000000..1c0e8c2785b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f-2.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve64f -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c
> new file mode 100644
> index 00000000000..daf4a4e8e64
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve64f_zvl128b -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-2.c
> new file mode 100644
> index 00000000000..3866e45546c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-2.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve64f_zvl128b -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 4 "vect" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-1.c
> new file mode 100644
> index 00000000000..4c190c303c1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-1.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve64x -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-2.c
> new file mode 100644
> index 00000000000..66bb1f44170
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x-2.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve64x -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c
> new file mode 100644
> index 00000000000..6920a395d1c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-1.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve64x_zvl128b -mabi=ilp32d --param riscv-autovec-preference=scalable -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-2.c
> new file mode 100644
> index 00000000000..d8b60babf9a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64x_zvl128b-2.c
> @@ -0,0 +1,6 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zve64x_zvl128b -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
> +
> +#include "template-1.h"
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 3 "vect" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> index 7a9a2b6ac48..5893dbf9742 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> @@ -44,6 +44,22 @@ dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/base/*.\[cS\]]] \
>          "" $CFLAGS
>  gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/vsetvl/*.\[cS\]]] \
>          "" $CFLAGS
> +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/*.\[cS\]]] \
> +        "" $CFLAGS
> +
> +set AUTOVEC_TEST_OPTS [list \
> +  {-ftree-vectorize -O3 --param riscv-autovec-lmul=m1} \
> +  {-ftree-vectorize -O3 --param riscv-autovec-lmul=m2} \
> +  {-ftree-vectorize -O3 --param riscv-autovec-lmul=m4} \
> +  {-ftree-vectorize -O3 --param riscv-autovec-lmul=m8} \
> +  {-ftree-vectorize -O2 --param riscv-autovec-lmul=m1} \
> +  {-ftree-vectorize -O2 --param riscv-autovec-lmul=m2} \
> +  {-ftree-vectorize -O2 --param riscv-autovec-lmul=m4} \
> +  {-ftree-vectorize -O2 --param riscv-autovec-lmul=m8} ]
> +foreach op $AUTOVEC_TEST_OPTS {
> +  gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/partial/*.\[cS\]]] \
> +    "" "$op"
> +}
>
>  # All done.
>  dg-finish
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvl-17.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvl-17.c
> index ee58f9bbdfc..8a1bbb40fc8 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvl-17.c
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvl-17.c
> @@ -11,4 +11,4 @@ void foo(int32_t *in1, int32_t *in2, int32_t *in3, int32_t *out, size_t n, int c
>    __riscv_vse32_v_i32m1(out, c, __riscv_vsetvl_e8mf2 (vl));
>  }
>
> -/* { dg-final { scan-assembler-times {vsetvli} 8 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
> \ No newline at end of file
> +/* { dg-final { scan-assembler-times {vsetvli} 7 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
> --
> 2.36.3
>
Thread overview: 7+ messages
2023-04-06 14:42 [PATCH 0/3] RISC-V:Enable basic auto-vectorization for RVV juzhe.zhong
2023-04-06 14:42 ` [PATCH 1/3] VECT: Add WHILE_LEN pattern to support decrement IV manipulation for loop vectorizer juzhe.zhong
2023-04-06 14:42 ` [PATCH 2/3] RISC-V: Enable basic RVV auto-vectorization and support WHILE_LEN/LEN_LOAD/LEN_STORE pattern juzhe.zhong
2023-04-06 16:04 ` Kito Cheng
2023-04-07 1:40 ` juzhe.zhong
2023-04-06 14:42 ` [PATCH] RISC-V: Add RVV auto-vectorization testcase juzhe.zhong
2023-04-06 15:36 ` Kito Cheng