From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 1005)
	id 7BA4D3858C60; Sat, 12 Nov 2022 03:13:40 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 7BA4D3858C60
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1668222820;
	bh=+4UIkZIFJs+hCC84DERHqn/K/4HSrptD6ysb6nUzsp4=;
	h=From:To:Subject:Date:From;
	b=ONhkpd874fyu0p8nNb5FP1YMog2APsZEqE82rTr861TA8Kmq02wnihOUM2gb5EIfb
	 XHUi9zeAocI8lHXbltNFYwaOgYQ8oMmGoH/YC6Z/yT1/CKLJwPDGpfm6tLy4zld4E2
	 6gTHMRwpvpqp1iHPkBWPSkdIFeseMm+GZhGWV9Kw=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Michael Meissner
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/meissner/heads/dmf004)] Support load/store vector with right length.
X-Act-Checkin: gcc
X-Git-Author: Michael Meissner
X-Git-Refname: refs/users/meissner/heads/dmf004
X-Git-Oldrev: 694e282516d28933354db5824ad07c16785c8a36
X-Git-Newrev: 6cbe4777f457a063fc57acec9119f9162003cc43
Message-Id: <20221112031340.7BA4D3858C60@sourceware.org>
Date: Sat, 12 Nov 2022 03:13:40 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:6cbe4777f457a063fc57acec9119f9162003cc43

commit 6cbe4777f457a063fc57acec9119f9162003cc43
Author: Michael Meissner
Date:   Fri Nov 11 22:13:23 2022 -0500

    Support load/store vector with right length.

    This patch adds support for new instructions that may be added to the
    PowerPC architecture in the future to enhance the load and store
    vector with length instructions.

    The current instructions (lxvl, lxvll, stxvl, and stxvll) are
    inconvenient to use because the byte count must be placed in the top 8
    bits of the GPR, rather than the bottom 8 bits.  Code that generates
    these instructions therefore typically has to shift the count left by
    56 bits to move it into position.

    In a future version of the PowerPC architecture, new variants of these
    instructions might be added that expect the count in the bottom 8 bits
    of the GPR.  These patches add that support to GCC when the
    -mcpu=future option is used.
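
    As a minimal sketch of the intended effect (assuming the documented
    ISA 3.0 vec_xl_len built-in from altivec.h; the lxvrl line shows the
    prospective -mcpu=future output described above and is illustrative,
    not final):

	#include <altivec.h>
	#include <stddef.h>

	/* Load the first n bytes at p into a vector register.  */
	vector unsigned char
	load_first_n (unsigned char *p, size_t n)
	{
	  /* -mcpu=power9:  sldi rT,rN,56 ; lxvl xT,rB,rT  (explicit shift)
	     -mcpu=future:  lxvrl xT,rB,rN  (count used directly)
	     Register names are placeholders, not fixed assignments.  */
	  return vec_xl_len (p, n);
	}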
(define_expand "lxvl" - [(set (match_dup 3) - (ashift:DI (match_operand:DI 2 "register_operand") - (const_int 56))) - (set (match_operand:V16QI 0 "vsx_register_operand") - (unspec:V16QI - [(match_operand:DI 1 "gpc_reg_operand") - (mem:V16QI (match_dup 1)) - (match_dup 3)] - UNSPEC_LXVL))] + [(use (match_operand:V16QI 0 "vsx_register_operand")) + (use (match_operand:DI 1 "gpc_reg_operand")) + (use (match_operand:DI 2 "gpc_reg_operand"))] "TARGET_P9_VECTOR && TARGET_64BIT" { - operands[3] = gen_reg_rtx (DImode); + rtx shift_len = gen_rtx_ASHIFT (DImode, operands[2], GEN_INT (56)); + rtx len; + + if (TARGET_FUTURE) + len = shift_len; + else + { + len = gen_reg_rtx (DImode); + emit_insn (gen_rtx_SET (len, shift_len)); + } + + rtx dest = operands[0]; + rtx addr = operands[1]; + rtx mem = gen_rtx_MEM (V16QImode, addr); + rtvec rv = gen_rtvec (3, addr, mem, len); + rtx lxvl = gen_rtx_UNSPEC (V16QImode, rv, UNSPEC_LXVL); + emit_insn (gen_rtx_SET (dest, lxvl)); + DONE; }) (define_insn "*lxvl" @@ -5619,6 +5631,34 @@ "lxvll %x0,%1,%2" [(set_attr "type" "vecload")]) +;; For lxvrl and lxvrll, use the combiner to eliminate the shift. The +;; define_expand for lxvl will already incorporate the shift in generating the +;; insn. The lxvll buitl-in function required the user to have already done +;; the shift. Defining lxvrll this way, will optimize cases where the user has +;; done the shift immediately before the built-in. +(define_insn "*lxvrl" + [(set (match_operand:V16QI 0 "vsx_register_operand" "=wa") + (unspec:V16QI + [(match_operand:DI 1 "gpc_reg_operand" "b") + (mem:V16QI (match_dup 1)) + (ashift:DI (match_operand:DI 2 "register_operand" "r") + (const_int 56))] + UNSPEC_LXVL))] + "TARGET_FUTURE && TARGET_64BIT" + "lxvrl %x0,%1,%2" + [(set_attr "type" "vecload")]) + +(define_insn "*lxvrll" + [(set (match_operand:V16QI 0 "vsx_register_operand" "=wa") + (unspec:V16QI [(match_operand:DI 1 "gpc_reg_operand" "b") + (mem:V16QI (match_dup 1)) + (ashift:DI (match_operand:DI 2 "register_operand" "r") + (const_int 56))] + UNSPEC_LXVLL))] + "TARGET_FUTURE" + "lxvrll %x0,%1,%2" + [(set_attr "type" "vecload")]) + ;; Expand for builtin xl_len_r (define_expand "xl_len_r" [(match_operand:V16QI 0 "vsx_register_operand") @@ -5650,18 +5690,29 @@ ;; Store VSX Vector with Length (define_expand "stxvl" - [(set (match_dup 3) - (ashift:DI (match_operand:DI 2 "register_operand") - (const_int 56))) - (set (mem:V16QI (match_operand:DI 1 "gpc_reg_operand")) - (unspec:V16QI - [(match_operand:V16QI 0 "vsx_register_operand") - (mem:V16QI (match_dup 1)) - (match_dup 3)] - UNSPEC_STXVL))] + [(use (match_operand:V16QI 0 "vsx_register_operand")) + (use (match_operand:DI 1 "gpc_reg_operand")) + (use (match_operand:DI 2 "gpc_reg_operand"))] "TARGET_P9_VECTOR && TARGET_64BIT" { - operands[3] = gen_reg_rtx (DImode); + rtx shift_len = gen_rtx_ASHIFT (DImode, operands[2], GEN_INT (56)); + rtx len; + + if (TARGET_FUTURE) + len = shift_len; + else + { + len = gen_reg_rtx (DImode); + emit_insn (gen_rtx_SET (len, shift_len)); + } + + rtx src = operands[0]; + rtx addr = operands[1]; + rtx mem = gen_rtx_MEM (V16QImode, addr); + rtvec rv = gen_rtvec (3, src, mem, len); + rtx stxvl = gen_rtx_UNSPEC (V16QImode, rv, UNSPEC_STXVL); + emit_insn (gen_rtx_SET (mem, stxvl)); + DONE; }) ;; Define optab for vector access with length vectorization exploitation. @@ -5705,6 +5756,35 @@ "stxvl %x0,%1,%2" [(set_attr "type" "vecstore")]) +;; For stxvrl and stxvrll, use the combiner to eliminate the shift. 
+;; define_expand for stxvl will already incorporate the shift in generating the
+;; insn.  The stxvll built-in function required the user to have already done
+;; the shift.  Defining stxvrll this way will optimize cases where the user
+;; has done the shift immediately before the built-in.
+
+(define_insn "*stxvrl"
+  [(set (mem:V16QI (match_operand:DI 1 "gpc_reg_operand" "b"))
+        (unspec:V16QI
+         [(match_operand:V16QI 0 "vsx_register_operand" "wa")
+          (mem:V16QI (match_dup 1))
+          (ashift:DI (match_operand:DI 2 "register_operand" "r")
+                     (const_int 56))]
+         UNSPEC_STXVL))]
+  "TARGET_FUTURE && TARGET_64BIT"
+  "stxvrl %x0,%1,%2"
+  [(set_attr "type" "vecstore")])
+
+(define_insn "*stxvrll"
+  [(set (mem:V16QI (match_operand:DI 1 "gpc_reg_operand" "b"))
+        (unspec:V16QI [(match_operand:V16QI 0 "vsx_register_operand" "wa")
+                       (mem:V16QI (match_dup 1))
+                       (ashift:DI (match_operand:DI 2 "register_operand" "r")
+                                  (const_int 56))]
+                      UNSPEC_STXVLL))]
+  "TARGET_FUTURE"
+  "stxvrll %x0,%1,%2"
+  [(set_attr "type" "vecstore")])
+
 ;; Expand for builtin xst_len_r
 (define_expand "xst_len_r"
   [(match_operand:V16QI 0 "vsx_register_operand" "=wa")
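
    A matching sketch for the store side (again assuming the documented
    vec_xst_len built-in; the stxvrl form and register names are
    illustrative only):

	#include <altivec.h>
	#include <stddef.h>

	/* Store the first n bytes of v to p.  */
	void
	store_first_n (vector unsigned char v, unsigned char *p, size_t n)
	{
	  /* -mcpu=power9:  sldi rT,rN,56 ; stxvl xS,rB,rT
	     -mcpu=future:  stxvrl xS,rB,rN.  Because the new patterns match
	     the ashift directly, combine can also fold a user-written
	     "n << 56" into them, per the comments in the patch.  */
	  vec_xst_len (v, p, n);
	}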