From mboxrd@z Thu Jan 1 00:00:00 1970
To: GCC Patches
Cc: Segher Boessenkool, Bill Schmidt, David Edelsohn
From: "Kewen.Lin"
Subject: [PATCH] rs6000: Use rldimi for vec init instead of shift + ior
Message-ID: <4c85b45c-fbaa-5509-2344-91113478e2d1@linux.ibm.com>
Date: Wed, 3 Feb 2021 14:37:05 +0800
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------FECFE1A508C5EDB67AFD2731"
Content-Language: en-US
List-Id: Gcc-patches mailing list

This is a multi-part message in MIME format.
--------------FECFE1A508C5EDB67AFD2731
Content-Type: text/plain; charset=gbk
Content-Transfer-Encoding: 7bit

Hi,

This patch merges the previously approved patch [1] with the patch
from Segher [2] that it depends on.  It makes unsigned int vector
init use rldimi to merge two integers instead of a shift and an ior.

Segher's patch in [2] is required for the test case to pass.  Without
it, the costing of the new pseudo-to-pseudo copies and the folding
with nonzero_bits in combine cause the rl*imi pattern to be simplified
into a more compact form, which is then unexpectedly split into ior
and shift.  The commit log of Segher's patch describes this in more
detail:

  "An rl*imi is usually written as an IOR of an ASHIFT or similar,
  and an AND of a register with a constant mask.  In some cases
  combine knows that that AND doesn't do anything (because all zero
  bits in that mask correspond to bits known to be already zero), and
  then no pattern matches.  This patch adds a define_split for such
  cases.  It uses nonzero_bits in the condition of the splitter, but
  does not need it afterwards for the instruction to be recognised.
  This is necessary because later passes can see fewer nonzero_bits.

  Because it is a splitter, combine will only use it when starting
  with three insns (or more), even though the result is just one.
  This isn't a huge problem in practice, but some possible
  combinations still won't happen."

Bootstrapped and regtested on powerpc64le-linux-gnu P9 and
powerpc64-linux-gnu P8; SPEC2017 build/run also passed on P9.

Is it ok for trunk?

BR,
Kewen

[1] https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562407.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563526.html

gcc/ChangeLog:

2021-02-03  Segher Boessenkool
	    Kewen Lin

	* config/rs6000/rs6000.md (*rotl<mode>3_insert_3): Renamed to...
	(rotl<mode>3_insert_3): ...this.
	(plus_ior_xor): New code_iterator.
	(define_split for GPR rl*imi): New splitter.
	* config/rs6000/vsx.md (vsx_init_v4si): Use gen_rotldi3_insert_3
	for integer merging.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/vec-init-10.c: New test.

-----

--------------FECFE1A508C5EDB67AFD2731
Content-Type: text/plain; charset=UTF-8; x-mac-type="0"; x-mac-creator="0"; name="vec_init.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="vec_init.patch"

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index bb9fb42f82a..dca311ebc80 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4067,7 +4067,7 @@
   [(set_attr "type" "insert")])
 
 ; There are also some forms without one of the ANDs.
-(define_insn "*rotl<mode>3_insert_3"
+(define_insn "rotl<mode>3_insert_3"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
         (ior:GPR (and:GPR (match_operand:GPR 3 "gpc_reg_operand" "0")
                           (match_operand:GPR 4 "const_int_operand" "n"))
@@ -4082,6 +4082,24 @@
 }
   [(set_attr "type" "insert")])
 
+(define_code_iterator plus_ior_xor [plus ior xor])
+
+(define_split
+  [(set (match_operand:GPR 0 "gpc_reg_operand")
+        (plus_ior_xor:GPR (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand")
+                                      (match_operand:SI 2 "const_int_operand"))
+                          (match_operand:GPR 3 "gpc_reg_operand")))]
+  "nonzero_bits (operands[3], <MODE>mode)
+   < HOST_WIDE_INT_1U << INTVAL (operands[2])"
+  [(set (match_dup 0)
+        (ior:GPR (and:GPR (match_dup 3)
+                          (match_dup 4))
+                 (ashift:GPR (match_dup 1)
+                             (match_dup 2))))]
+{
+  operands[4] = GEN_INT ((HOST_WIDE_INT_1U << INTVAL (operands[2])) - 1);
+})
+
 (define_insn "*rotl<mode>3_insert_4"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
         (ior:GPR (and:GPR (match_operand:GPR 3 "gpc_reg_operand" "0")
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 0c1bda522a9..07c2f7ffa6e 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -3008,28 +3008,22 @@
    (use (match_operand:SI 4 "gpc_reg_operand"))]
    "VECTOR_MEM_VSX_P (V4SImode) && TARGET_DIRECT_MOVE_64BIT"
 {
-  rtx a = gen_reg_rtx (DImode);
-  rtx b = gen_reg_rtx (DImode);
-  rtx c = gen_reg_rtx (DImode);
-  rtx d = gen_reg_rtx (DImode);
-  emit_insn (gen_zero_extendsidi2 (a, operands[1]));
-  emit_insn (gen_zero_extendsidi2 (b, operands[2]));
-  emit_insn (gen_zero_extendsidi2 (c, operands[3]));
-  emit_insn (gen_zero_extendsidi2 (d, operands[4]));
+  rtx a = gen_lowpart_SUBREG (DImode, operands[1]);
+  rtx b = gen_lowpart_SUBREG (DImode, operands[2]);
+  rtx c = gen_lowpart_SUBREG (DImode, operands[3]);
+  rtx d = gen_lowpart_SUBREG (DImode, operands[4]);
   if (!BYTES_BIG_ENDIAN)
     {
       std::swap (a, b);
       std::swap (c, d);
     }
 
-  rtx aa = gen_reg_rtx (DImode);
   rtx ab = gen_reg_rtx (DImode);
-  rtx cc = gen_reg_rtx (DImode);
   rtx cd = gen_reg_rtx (DImode);
-  emit_insn (gen_ashldi3 (aa, a, GEN_INT (32)));
-  emit_insn (gen_ashldi3 (cc, c, GEN_INT (32)));
-  emit_insn (gen_iordi3 (ab, aa, b));
-  emit_insn (gen_iordi3 (cd, cc, d));
+  emit_insn (gen_rotldi3_insert_3 (ab, a, GEN_INT (32), b,
+                                   GEN_INT (0xffffffff)));
+  emit_insn (gen_rotldi3_insert_3 (cd, c, GEN_INT (32), d,
+                                   GEN_INT (0xffffffff)));
 
   rtx abcd = gen_reg_rtx (V2DImode);
   emit_insn (gen_vsx_concat_v2di (abcd, ab, cd));
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-init-10.c b/gcc/testsuite/gcc.target/powerpc/vec-init-10.c
new file mode 100644
index 00000000000..680538e67f3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-init-10.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
+
+/* Check that we can optimize sldi + or to rldimi for vector int init.  */
+
+vector unsigned int
+testu (unsigned int i1, unsigned int i2, unsigned int i3, unsigned int i4)
+{
+  vector unsigned int v = {i1, i2, i3, i4};
+  return v;
+}
+
+vector signed int
+tests (signed int i1, signed int i2, signed int i3, signed int i4)
+{
+  vector signed int v = {i1, i2, i3, i4};
+  return v;
+}
+
+/* { dg-final { scan-assembler-not "sldi" } } */
+/* { dg-final { scan-assembler-not "or" } } */
+/* { dg-final { scan-assembler-times {\mrldimi\M} 4 } } */
--------------FECFE1A508C5EDB67AFD2731--
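
The arithmetic behind the new splitter can be checked with a small C sketch. This is an illustration only, not part of the patch: the helper names insert_form, fits_below_shift and merge_forms_agree are hypothetical, and the concrete run-time value of the low operand stands in for the conservative compile-time estimate that nonzero_bits gives combine. It shows that when every set bit of the low operand lies below the shift count, the plus, ior and xor forms all equal the (lo & mask) | (hi << sh) shape matched by the insert pattern, with mask = (1 << sh) - 1 as computed for operands[4].

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the insert shape, (lo & mask) | (hi << sh),
   with mask = (1 << sh) - 1 as the splitter computes for operands[4].  */
static uint64_t
insert_form (uint64_t hi, uint64_t lo, unsigned sh)
{
  assert (sh < 64);
  uint64_t mask = (UINT64_C (1) << sh) - 1;
  return (lo & mask) | (hi << sh);
}

/* Hypothetical stand-in for the splitter condition.  The real condition
   uses nonzero_bits on RTL; here the actual value of LO is used instead.  */
static int
fits_below_shift (uint64_t lo, unsigned sh)
{
  return sh < 64 && lo < (UINT64_C (1) << sh);
}

/* Return 1 if the condition holds and the plus, ior and xor forms all
   equal the insert form; return 0 otherwise.  */
static int
merge_forms_agree (uint64_t hi, uint64_t lo, unsigned sh)
{
  if (!fits_below_shift (lo, sh))
    return 0;
  uint64_t expect = insert_form (hi, lo, sh);
  return ((hi << sh) | lo) == expect
         && ((hi << sh) + lo) == expect
         && ((hi << sh) ^ lo) == expect;
}
```

For the vec init case the shift is 32 and the low operand is a 32-bit value viewed in a 64-bit register, so merge_forms_agree (0xdeadbeef, 0x12345678, 32) returns 1 and insert_form yields 0xdeadbeef12345678. The mask in the insert form also discards whatever sits above bit 31 of the low operand, which is presumably why the rewritten expander can pass a lowpart subreg instead of an explicit zero extension.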