From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 421483858D35; Wed, 8 Jul 2020 03:20:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 421483858D35 Received: from pps.filterd (m0127361.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 06833Bai153512; Tue, 7 Jul 2020 23:20:46 -0400 Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 324y2xjddt-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 07 Jul 2020 23:20:45 -0400 Received: from m0127361.ppops.net (m0127361.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.36/8.16.0.36) with SMTP id 06833F77153642; Tue, 7 Jul 2020 23:20:45 -0400 Received: from ppma02fra.de.ibm.com (47.49.7a9f.ip4.static.sl-reverse.com [159.122.73.71]) by mx0a-001b2d01.pphosted.com with ESMTP id 324y2xjddg-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 07 Jul 2020 23:20:45 -0400 Received: from pps.filterd (ppma02fra.de.ibm.com [127.0.0.1]) by ppma02fra.de.ibm.com (8.16.0.42/8.16.0.42) with SMTP id 0683H0hE014716; Wed, 8 Jul 2020 03:20:43 GMT Received: from b06cxnps4075.portsmouth.uk.ibm.com (d06relay12.portsmouth.uk.ibm.com [9.149.109.197]) by ppma02fra.de.ibm.com with ESMTP id 322hd845c7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 08 Jul 2020 03:20:43 +0000 Received: from d06av24.portsmouth.uk.ibm.com (mk.ibm.com [9.149.105.60]) by b06cxnps4075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 0683JPjD3539290 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 8 Jul 2020 03:19:26 GMT Received: from d06av24.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id DB02A42042; Wed, 8 Jul 2020 03:19:25 +0000 (GMT) Received: from d06av24.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 765EC4203F; Wed, 8 Jul 2020 03:19:23 +0000 (GMT) Received: from luoxhus-MacBook-Pro.local (unknown [9.197.233.182]) by d06av24.portsmouth.uk.ibm.com (Postfix) with ESMTP; Wed, 8 Jul 2020 03:19:23 +0000 (GMT) Subject: Re: [PATCH] rs6000: Split movsf_from_si from high word before reload[PR89310] To: Segher Boessenkool Cc: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com, wschmidt@linux.ibm.com, guojiufu@linux.ibm.com, linkw@gcc.gnu.org References: <20200706021757.1118129-1-luoxhu@linux.ibm.com> <20200707001803.GR3598@gate.crashing.org> <66c7b5d6-afa6-53d7-704d-44834ff00311@linux.ibm.com> <20200707213116.GU3598@gate.crashing.org> From: luoxhu Message-ID: <66faac54-0620-5ee0-ff48-5609ad9e3fa7@linux.ibm.com> Date: Wed, 8 Jul 2020 11:19:21 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <20200707213116.GU3598@gate.crashing.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.235, 18.0.687 definitions=2020-07-08_01:2020-07-08, 2020-07-07 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 priorityscore=1501 suspectscore=0 bulkscore=0 lowpriorityscore=0 phishscore=0 mlxlogscore=999 cotscore=-2147483648 spamscore=0 clxscore=1011 impostorscore=0 adultscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2004280000 definitions=main-2007080020 X-Spam-Status: No, score=-14.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Jul 2020 03:20:47 -0000 On 2020/7/8 05:31, Segher Boessenkool wrote: > Hi! > > On Tue, Jul 07, 2020 at 04:39:58PM +0800, luoxhu wrote: >>> Lots of questions, sorry! >> >> Thanks for the nice suggestions of the initial patch contains many issues:), > > Pretty much all of it should *work*, it just can be improved and > simplified quite a bit :-) > >> For this case, %1:SF matches with "=wa"? And how to construct cases to >> match("=?r", "wa") and ("=!r", "r") combinations, please? > > operands[0], not operands[1]? > > Simple testcases will not put the output into a GPR, unless you force > the compiler to do that, because of the ? and !. > > Often you can just do > > asm("#" : "+r"(x)); > > to force "x" into a GPR at that point of the program. But there is > nothing stopping the compiler from copying it back to a VSR where it > thinks that is cheaper ;-) > > So maybe this pattern should just have the GPR-to-VSR alternative? It > does not look like the GPR destination variants are useful? > >> + rtx op0 = operands[0]; >> + rtx op1 = operands[1]; >> + rtx op2 = operands[2]; > > (Please just write out operands[N] everywhere). > >> + if (GET_CODE (operands[2]) == SCRATCH) >> + op2 = gen_reg_rtx (DImode); >> + >> + rtx mask = GEN_INT (HOST_WIDE_INT_M1U << 32); >> + emit_insn (gen_anddi3 (op2, op1, mask)); > > Groovy :-) > > So, it looks like you can remove the ? and ! alternatives, leaving just > the first alternative? > Thanks. V3 Update: Leave only GPR-to-VSR alternative and use operands[N]. Bootstrap and regression tested pass on Power8-LE. For extracting high part element from DImode register like: {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;} split it before reload with "and mask" to avoid generating shift right 32 bit then shift left 32 bit. This pattern also exists in PR42475 and PR67741, etc. srdi 3,3,32 sldi 9,3,32 mtvsrd 1,9 xscvspdpn 1,1 => rldicr 3,3,0,31 mtvsrd 1,3 xscvspdpn 1,1 gcc/ChangeLog: 2020-07-08 Xionghu Luo PR rtl-optimization/89310 * config/rs6000/rs6000.md (movsf_from_si2): New define_insn_and_split. gcc/testsuite/ChangeLog: 2020-07-08 Xionghu Luo PR rtl-optimization/89310 * gcc.target/powerpc/pr89310.c: New test. --- gcc/config/rs6000/rs6000.md | 34 ++++++++++++++++++++++ gcc/testsuite/gcc.target/powerpc/pr89310.c | 17 +++++++++++ 2 files changed, 51 insertions(+) create mode 100644 gcc/testsuite/gcc.target/powerpc/pr89310.c diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 4fcd6a94022..02c5171378c 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -7593,6 +7593,40 @@ (define_insn_and_split "movsf_from_si" "*, *, p9v, p8v, *, *, p8v, p8v, p8v, *")]) +;; For extracting high part element from DImode register like: +;; {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;} +;; split it before reload with "and mask" to avoid generating shift right +;; 32 bit then shift left 32 bit. +(define_insn_and_split "movsf_from_si2" + [(set (match_operand:SF 0 "gpc_reg_operand" "=wa") + (unspec:SF [ + (subreg:SI (ashiftrt:DI + (match_operand:DI 1 "input_operand" "r") + (const_int 32)) + 0)] + UNSPEC_SF_FROM_SI)) + (clobber (match_scratch:DI 2 "=r"))] + "TARGET_NO_SF_SUBREG" + "@ + #" + + "&& !reload_completed + && vsx_reg_sfsubreg_ok (operands[0], SFmode)" + [(const_int 0)] +{ + if (GET_CODE (operands[2]) == SCRATCH) + operands[2] = gen_reg_rtx (DImode); + + rtx mask = GEN_INT (HOST_WIDE_INT_M1U << 32); + emit_insn (gen_anddi3 (operands[2], operands[1], mask)); + emit_insn (gen_p8_mtvsrd_sf (operands[0], operands[2])); + emit_insn (gen_vsx_xscvspdpn_directmove (operands[0], operands[0])); + DONE; +} + [(set_attr "length" "12") + (set_attr "type" "vecfloat") + (set_attr "isa" "p8v")]) + ;; Move 64-bit binary/decimal floating point (define_expand "mov" diff --git a/gcc/testsuite/gcc.target/powerpc/pr89310.c b/gcc/testsuite/gcc.target/powerpc/pr89310.c new file mode 100644 index 00000000000..15e78509246 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr89310.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +struct s { + int i; + float f; +}; + +float +foo (struct s arg) +{ + return arg.f; +} + +/* { dg-final { scan-assembler-not {\msrdi\M} } } */ +/* { dg-final { scan-assembler-not {\msldi\M} } } */ +/* { dg-final { scan-assembler-times {\mrldicr\M} 1 } } */ -- 2.21.0.777.g83232e3864