From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 1005)
	id 2058E3858D1E; Mon,  1 May 2023 21:03:41 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 2058E3858D1E
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1682975021;
	bh=A0ufNWWnBaLoUQMsJCiiH/rDcG+bDIe9OEX33WUR8pU=;
	h=From:To:Subject:Date:From;
	b=XvFBQInUtyuiWgIaYMzditOH3b3PukflWFVdNvnblT15R1rCY2aU/mQK0YIWDe3Bi
	 gMiOZ0QOLKewPmuUUcwahC3Kv6oLHb8Pub0AkXmm50yl+sQrjqmbKc3i7mEkCNwjWX
	 cwoDQzKwXM+TUhhwzW6RA4izJTm+a2OlgYsR/qKU=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Michael Meissner
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/meissner/heads/work120)] Optimize vec_extract of V4SF from memory with constant element numbers.
X-Act-Checkin: gcc
X-Git-Author: Michael Meissner
X-Git-Refname: refs/users/meissner/heads/work120
X-Git-Oldrev: 6f56688cc61d607d85d2a775b623dabaaa8e8518
X-Git-Newrev: ad816e4f31013b615806632e00ad7b1d1914c27f
Message-Id: <20230501210341.2058E3858D1E@sourceware.org>
Date: Mon,  1 May 2023 21:03:41 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:ad816e4f31013b615806632e00ad7b1d1914c27f

commit ad816e4f31013b615806632e00ad7b1d1914c27f
Author: Michael Meissner
Date:   Mon May 1 17:03:22 2023 -0400

    Optimize vec_extract of V4SF from memory with constant element numbers.
    
    This patch updates vec_extract of V4SF from memory with constant element
    numbers.  This patch corrects the ISA for loading SF values to altivec
    registers to be power8 vector, and not power7.
    
    This patch adds a combiner pattern to combine loading up a SF element and
    converting it to double.
    
    It also removes the '?' from the 'r' constraint so that if the SFmode is
    needed in a GPR, it doesn't have to load it to the vector unit and then
    store it.
    
    2023-05-01   Michael Meissner
    
    gcc/
    
            * gcc/config/rs6000/vsx.md (vsx_extract_v4sf_load): Fix ISA for
            loading up SFmode values with x-form addresses.  Remove ? from 'r'
            constraint.
            (vsx_extract_v4sf_load_to_df): New insn.
    
    gcc/testsuite/
    
            * gcc.target/powerpc/vec-extract-mem-float-1.c: New file.

Diff:
---
 gcc/config/rs6000/vsx.md                           | 73 +++++++++++++++++++---
 .../gcc.target/powerpc/vec-extract-mem-float-1.c   | 29 +++++++++
 2 files changed, 95 insertions(+), 7 deletions(-)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 417aff5e24b..9d3b3441ed5 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -3549,12 +3549,33 @@
   [(set_attr "length" "8")
    (set_attr "type" "fp")])
 
+;; V4SF extract from memory with constant element number.
+;; Alternatives:
+;;       Reg:  Ele:  Cpu:  Addr:                need scratch
+;;  1:   FPR   0     any   normal address       no
+;;  2:   FPR   1-3   any   offsettable address  no
+;;  3:   FPR   1-3   any   single register      yes
+;;  4:   VMX   0     p8    reg+reg or reg       no
+;;  5:   VMX   1-3   p8    single register      yes
+;;  6:   VMX   0     p9    normal address       no
+;;  7:   VMX   1-3   p9    offsettable address  no
+;;  8:   GPR   0     any   normal address       no
+;;  9:   GPR   0-3   any   offsettable address  no
+;; 10:   GPR   0-3   any   single register      yes
 (define_insn_and_split "*vsx_extract_v4sf_load"
-  [(set (match_operand:SF 0 "register_operand" "=f,v,v,?r")
+  [(set (match_operand:SF 0 "register_operand"
+         "=f,    f,     f,     v,     v,     v,     v,
+          r,     r,     r")
         (vec_select:SF
-         (match_operand:V4SF 1 "memory_operand" "m,Z,m,m")
-         (parallel [(match_operand:QI 2 "const_0_to_3_operand" "n,n,n,n")])))
-   (clobber (match_scratch:P 3 "=&b,&b,&b,&b"))]
+         (match_operand:V4SF 1 "memory_operand"
+         "m,     o,     Q,     Z,     Q,     m,     o,
+          m,     o,     Q")
+         (parallel [(match_operand:QI 2 "const_0_to_3_operand"
+         "O,     n,     n,     O,     n,     O,     n,
+          O,     n,     n")])))
+   (clobber (match_scratch:P 3
+         "=X,    X,     &b,    X,     &b,    X,     X,
+          X,     X,     &b"))]
   "VECTOR_MEM_VSX_P (V4SFmode)"
   "#"
   "&& reload_completed"
@@ -3563,9 +3584,47 @@
   operands[4] = rs6000_adjust_vec_address (operands[0], operands[1], operands[2],
                                            operands[3], SFmode);
 }
-  [(set_attr "type" "fpload,fpload,fpload,load")
-   (set_attr "length" "8")
-   (set_attr "isa" "*,p7v,p9v,*")])
+  [(set_attr "type"
+         "fpload, fpload, fpload, fpload, fpload, fpload, fpload,
+          load,   load,   load")
+   (set_attr "isa"
+         "*,     *,     *,     p8v,   p8v,   p9v,   p9v,
+          *,     *,     *")])
+
+;; V4SF extract from memory with constant element number and convert to DFmode.
+;; Alternatives:
+;;       Reg:  Ele:  Cpu:  Addr:                need scratch
+;;  1:   FPR   0     any   normal address       no
+;;  2:   FPR   1-3   any   offsettable address  no
+;;  3:   FPR   1-3   any   single register      yes
+;;  4:   VMX   0     p8    reg+reg or reg       no
+;;  5:   VMX   1-3   p8    single register      yes
+;;  6:   VMX   0     p9    normal address       no
+;;  7:   VMX   1-3   p9    offsettable address  no
+(define_insn_and_split "*vsx_extract_v4sf_load_to_df"
+  [(set (match_operand:DF 0 "register_operand"
+         "=f,    f,     f,     v,     v,     v,     v")
+        (float_extend:DF
+         (vec_select:SF
+          (match_operand:V4SF 1 "memory_operand"
+         "m,     o,     Q,     Z,     Q,     m,     o")
+          (parallel [(match_operand:QI 2 "const_0_to_3_operand"
+         "O,     n,     n,     O,     n,     O,     n")]))))
+   (clobber (match_scratch:P 3
+         "=X,    X,     &b,    X,     &b,    X,     X"))]
+  "VECTOR_MEM_VSX_P (V4SFmode)"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+        (float_extend:DF (match_dup 4)))]
+{
+  operands[4] = rs6000_adjust_vec_address (operands[0], operands[1], operands[2],
+                                           operands[3], SFmode);
+}
+  [(set_attr "type"
+         "fpload, fpload, fpload, fpload, fpload, fpload, fpload")
+   (set_attr "isa"
+         "*,     *,     *,     p8v,   p8v,   p9v,   p9v")])
 
 ;; Variable V4SF extract from a register
 (define_insn_and_split "vsx_extract_v4sf_var"
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-extract-mem-float-1.c b/gcc/testsuite/gcc.target/powerpc/vec-extract-mem-float-1.c
new file mode 100644
index 00000000000..4670e261ba8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-extract-mem-float-1.c
@@ -0,0 +1,29 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
+
+/* Test to verify that the vec_extract with constant element numbers can load
+   float elements into a GPR register without doing a LFS/STFS.  */
+
+#include <altivec.h>
+
+void
+extract_v4sf_gpr_0 (vector float *p, float *q)
+{
+  float x = vec_extract (*p, 0);
+  __asm__ (" # %0" : "+r" (x));	/* lwz, no lfs/stfs.  */
+  *q = x;
+}
+
+void
+extract_v4sf_gpr_1 (vector float *p, float *q)
+{
+  float x = vec_extract (*p, 1);
+  __asm__ (" # %0" : "+r" (x));	/* lwz, no lfs/stfs.  */
+  *q = x;
+}
+
+/* { dg-final { scan-assembler-times {\mlwzx?\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mstw\M} 2 } } */
+/* { dg-final { scan-assembler-not {\mlfsx?\M|\mlxsspx?\M} } } */
+/* { dg-final { scan-assembler-not {\mstfsx?\M|\mstxsspx?\M} } } */
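
For illustration only, and not part of the commit: the new test file exercises the GPR alternatives, but nothing in it touches the float-to-double case that *vsx_extract_v4sf_load_to_df is meant to cover. The sketch below shows the kind of source that would be expected to produce that RTL shape (a float_extend of a vec_select from memory). It is an assumption-laden example: the typedef v4sf and the function name extract_v4sf_to_df_1 are hypothetical, and it uses GCC's generic vector extension instead of <altivec.h> so that it compiles on any target. Whether the combiner actually matches the new pattern depends on the target, compiler revision, and options.

```c
/* Hypothetical example; names are not taken from the commit.  */

/* Generic GCC vector type standing in for "vector float" so that the
   example builds on non-PowerPC hosts as well.  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Load element 1 of a V4SF that lives in memory and widen it to double.
   Semantically this is a 4-byte load at offset 4 followed by an
   SFmode -> DFmode conversion.  */
double
extract_v4sf_to_df_1 (const v4sf *p)
{
  return (double) (*p)[1];
}
```

On a PowerPC build with VSX enabled, the equivalent function written with vec_extract (*p, 1) from <altivec.h> would be the natural candidate for a follow-up scan-assembler test.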