From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1])
	by sourceware.org (Postfix) with ESMTPS id 4A7C93858D39
	for ; Thu, 14 Oct 2021 06:17:30 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 4A7C93858D39
Received: from pps.filterd (m0098399.ppops.net [127.0.0.1])
	by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19E3EmCL024945;
	Thu, 14 Oct 2021 02:17:28 -0400
Received: from pps.reinject (localhost [127.0.0.1])
	by mx0a-001b2d01.pphosted.com with ESMTP id 3bnshgw7v3-1
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
	Thu, 14 Oct 2021 02:17:27 -0400
Received: from m0098399.ppops.net (m0098399.ppops.net [127.0.0.1])
	by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19E55s74007558;
	Thu, 14 Oct 2021 02:17:27 -0400
Received: from ppma06fra.de.ibm.com (48.49.7a9f.ip4.static.sl-reverse.com [159.122.73.72])
	by mx0a-001b2d01.pphosted.com with ESMTP id 3bnshgw7ud-1
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
	Thu, 14 Oct 2021 02:17:27 -0400
Received: from pps.filterd (ppma06fra.de.ibm.com [127.0.0.1])
	by ppma06fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19E681G0032523;
	Thu, 14 Oct 2021 06:17:25 GMT
Received: from b06avi18626390.portsmouth.uk.ibm.com (b06avi18626390.portsmouth.uk.ibm.com [9.149.26.192])
	by ppma06fra.de.ibm.com with ESMTP id 3bk2bjy7b6-1
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
	Thu, 14 Oct 2021 06:17:24 +0000
Received: from d06av26.portsmouth.uk.ibm.com (d06av26.portsmouth.uk.ibm.com [9.149.105.62])
	by b06avi18626390.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19E6BjtC43909522
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
	Thu, 14 Oct 2021 06:11:45 GMT
Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1])
	by IMSVA (Postfix) with ESMTP id 2B24FAE059;
	Thu, 14 Oct 2021 06:17:22 +0000 (GMT)
Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1])
	by IMSVA (Postfix) with ESMTP id AB6BDAE05D;
	Thu, 14 Oct 2021 06:17:20 +0000 (GMT)
Received: from [9.200.54.45] (unknown [9.200.54.45])
	by d06av26.portsmouth.uk.ibm.com (Postfix) with ESMTP;
	Thu, 14 Oct 2021 06:17:20 +0000 (GMT)
Content-Type: multipart/mixed; boundary="------------j8DfIJ0zgyHWE5foXOzngcut"
Message-ID: <9c39ac50-aee1-b50f-dfb3-badb6752e921@linux.ibm.com>
Date: Thu, 14 Oct 2021 14:17:17 +0800
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.1.2
Content-Language: en-US
To: gcc-patches
Cc: David, Segher Boessenkool, Bill Schmidt
From: HAO CHEN GUI
Subject: [PATCH, rs6000] Optimization for vec_xl_sext
X-TM-AS-GCONF: 00
X-Proofpoint-GUID: GB_OcUcoYnAtzZIk8WeGpQAx01xGg593
X-Proofpoint-ORIG-GUID: FkcXRSCHxUfloboraoba-NbUtzNtFCel
X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475
	definitions=2021-10-14_01,2021-10-14_01,2020-04-07_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0
	mlxscore=0 phishscore=0 priorityscore=1501 impostorscore=0 spamscore=0
	clxscore=1015 malwarescore=0 suspectscore=0 mlxlogscore=999 adultscore=0
	bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1
	engine=8.12.0-2109230001 definitions=main-2110140035
X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED,
	DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_MSPIKE_H2,
	SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 14 Oct 2021 06:17:34 -0000

This is a multi-part message in MIME format.
--------------j8DfIJ0zgyHWE5foXOzngcut
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

Hi,
  This patch optimizes code generation for the vec_xl_sext builtin. All sign extensions are now done directly in VSX registers.

  Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot.

  I refined the patch according to Bill's and David's advice. I have also attached patch.diff and the ChangeLog in case the indentation does not display correctly in the email body.

ChangeLog

2021-10-11 Haochen Gui <guihaoc@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.c (altivec_expand_lxvr_builtin):
	Modify the expansion for sign extension. All extensions are done
	within VSX registers.

gcc/testsuite/
	* gcc.target/powerpc/p10_vec_xl_sext.c: New test.

patch.diff

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index b4e13af4dc6..587e9fa2a2a 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -9779,7 +9779,7 @@ altivec_expand_lxvr_builtin (enum insn_code icode, tree exp, rtx target, bool bl
 
   if (sign_extend)
     {
-      rtx discratch = gen_reg_rtx (DImode);
+      rtx discratch = gen_reg_rtx (V2DImode);
       rtx tiscratch = gen_reg_rtx (TImode);
 
       /* Emit the lxvr*x insn.  */
@@ -9788,20 +9788,31 @@ altivec_expand_lxvr_builtin (enum insn_code icode, tree exp, rtx target, bool bl
        return 0;
       emit_insn (pat);
 
-      /* Emit a sign extension from QI,HI,WI to double (DI).  */
-      rtx scratch = gen_lowpart (smode, tiscratch);
+      /* Emit a sign extension from V16QI,V8HI,V4SI to V2DI.  */
+      rtx temp1, temp2;
       if (icode == CODE_FOR_vsx_lxvrbx)
-       emit_insn (gen_extendqidi2 (discratch, scratch));
+       {
+         temp1  = simplify_gen_subreg (V16QImode, tiscratch, TImode, 0);
+         emit_insn (gen_vsx_sign_extend_qi_v2di (discratch, temp1));
+       }
       else if (icode == CODE_FOR_vsx_lxvrhx)
-       emit_insn (gen_extendhidi2 (discratch, scratch));
+       {
+         temp1  = simplify_gen_subreg (V8HImode, tiscratch, TImode, 0);
+         emit_insn (gen_vsx_sign_extend_hi_v2di (discratch, temp1));
+       }
       else if (icode == CODE_FOR_vsx_lxvrwx)
-       emit_insn (gen_extendsidi2 (discratch, scratch));
-      /*  Assign discratch directly if scratch is already DI.  */
-      if (icode == CODE_FOR_vsx_lxvrdx)
-       discratch = scratch;
+       {
+         temp1  = simplify_gen_subreg (V4SImode, tiscratch, TImode, 0);
+         emit_insn (gen_vsx_sign_extend_si_v2di (discratch, temp1));
+       }
+      else if (icode == CODE_FOR_vsx_lxvrdx)
+       discratch = simplify_gen_subreg (V2DImode, tiscratch, TImode, 0);
+      else
+       gcc_unreachable ();
 
-      /* Emit the sign extension from DI (double) to TI (quad). */
-      emit_insn (gen_extendditi2 (target, discratch));
+      /* Emit the sign extension from V2DI (double) to TI (quad).  */
+      temp2 = simplify_gen_subreg (TImode, discratch, V2DImode, 0);
+      emit_insn (gen_extendditi2_vector (target, temp2));
 
       return target;
     }
diff --git a/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
new file mode 100644
index 00000000000..78e72ac5425
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int128 } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <altivec.h>
+
+vector signed __int128
+foo1 (signed long a, signed char *b)
+{
+  return vec_xl_sext (a, b);
+}
+
+vector signed __int128
+foo2 (signed long a, signed short *b)
+{
+  return vec_xl_sext (a, b);
+}
+
+vector signed __int128
+foo3 (signed long a, signed int *b)
+{
+  return vec_xl_sext (a, b);
+}
+
+vector signed __int128
+foo4 (signed long a, signed long *b)
+{
+  return vec_xl_sext (a, b);
+}
+
+/* { dg-final { scan-assembler-times {\mvextsd2q\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mvextsb2d\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextsh2d\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextsw2d\M} 1 } } */

--------------j8DfIJ0zgyHWE5foXOzngcut
Content-Type: text/plain; charset=UTF-8; name="ChangeLog.txt"
Content-Disposition: attachment; filename="ChangeLog.txt"
Content-Transfer-Encoding: base64

MjAyMS0xMC0xMSBIYW9jaGVuIEd1aSA8Z3VpaGFvY0BsaW51eC5pYm0uY29tPgoKZ2NjLwoJ
KiBjb25maWcvcnM2MDAwL3JzNjAwMC1jYWxsLmMgKGFsdGl2ZWNfZXhwYW5kX2x4dnJfYnVp
bHRpbik6CglNb2RpZnkgdGhlIGV4cGFuc2lvbiBmb3Igc2lnbiBleHRlbnNpb24uIEFsbCBl
eHRlbnNpb25zIGFyZSBkb25lCgl3aXRoaW4gVlNYIHJlZ2lzdGVycy4KCmdjYy90ZXN0c3Vp
dGUvCgkqIGdjYy50YXJnZXQvcG93ZXJwYy9wMTBfdmVjX3hsX3NleHQuYzogTmV3IHRlc3Qu
Cgo=
--------------j8DfIJ0zgyHWE5foXOzngcut
Content-Type: text/plain; charset=UTF-8; name="patch.diff.txt"
Content-Disposition: attachment; filename="patch.diff.txt"
Content-Transfer-Encoding: base64 ZGlmZiAtLWdpdCBhL2djYy9jb25maWcvcnM2MDAwL3JzNjAwMC1jYWxsLmMgYi9nY2MvY29u ZmlnL3JzNjAwMC9yczYwMDAtY2FsbC5jCmluZGV4IGI0ZTEzYWY0ZGM2Li41ODdlOWZhMmEy YSAxMDA2NDQKLS0tIGEvZ2NjL2NvbmZpZy9yczYwMDAvcnM2MDAwLWNhbGwuYworKysgYi9n Y2MvY29uZmlnL3JzNjAwMC9yczYwMDAtY2FsbC5jCkBAIC05Nzc5LDcgKzk3NzksNyBAQCBh bHRpdmVjX2V4cGFuZF9seHZyX2J1aWx0aW4gKGVudW0gaW5zbl9jb2RlIGljb2RlLCB0cmVl IGV4cCwgcnR4IHRhcmdldCwgYm9vbCBibAogCiAgIGlmIChzaWduX2V4dGVuZCkKICAgICB7 Ci0gICAgICBydHggZGlzY3JhdGNoID0gZ2VuX3JlZ19ydHggKERJbW9kZSk7CisgICAgICBy dHggZGlzY3JhdGNoID0gZ2VuX3JlZ19ydHggKFYyREltb2RlKTsKICAgICAgIHJ0eCB0aXNj cmF0Y2ggPSBnZW5fcmVnX3J0eCAoVEltb2RlKTsKIAogICAgICAgLyogRW1pdCB0aGUgbHh2 cip4IGluc24uICAqLwpAQCAtOTc4OCwyMCArOTc4OCwzMSBAQCBhbHRpdmVjX2V4cGFuZF9s eHZyX2J1aWx0aW4gKGVudW0gaW5zbl9jb2RlIGljb2RlLCB0cmVlIGV4cCwgcnR4IHRhcmdl dCwgYm9vbCBibAogCXJldHVybiAwOwogICAgICAgZW1pdF9pbnNuIChwYXQpOwogCi0gICAg ICAvKiBFbWl0IGEgc2lnbiBleHRlbnNpb24gZnJvbSBRSSxISSxXSSB0byBkb3VibGUgKERJ KS4gICovCi0gICAgICBydHggc2NyYXRjaCA9IGdlbl9sb3dwYXJ0IChzbW9kZSwgdGlzY3Jh dGNoKTsKKyAgICAgIC8qIEVtaXQgYSBzaWduIGV4dGVuc2lvbiBmcm9tIFYxNlFJLFY4SEks VjRTSSB0byBWMkRJLiAgKi8KKyAgICAgIHJ0eCB0ZW1wMSwgdGVtcDI7CiAgICAgICBpZiAo aWNvZGUgPT0gQ09ERV9GT1JfdnN4X2x4dnJieCkKLQllbWl0X2luc24gKGdlbl9leHRlbmRx aWRpMiAoZGlzY3JhdGNoLCBzY3JhdGNoKSk7CisJeworCSAgdGVtcDEgID0gc2ltcGxpZnlf Z2VuX3N1YnJlZyAoVjE2UUltb2RlLCB0aXNjcmF0Y2gsIFRJbW9kZSwgMCk7CisJICBlbWl0 X2luc24gKGdlbl92c3hfc2lnbl9leHRlbmRfcWlfdjJkaSAoZGlzY3JhdGNoLCB0ZW1wMSkp OworCX0KICAgICAgIGVsc2UgaWYgKGljb2RlID09IENPREVfRk9SX3ZzeF9seHZyaHgpCi0J ZW1pdF9pbnNuIChnZW5fZXh0ZW5kaGlkaTIgKGRpc2NyYXRjaCwgc2NyYXRjaCkpOworCXsK KwkgIHRlbXAxICA9IHNpbXBsaWZ5X2dlbl9zdWJyZWcgKFY4SEltb2RlLCB0aXNjcmF0Y2gs IFRJbW9kZSwgMCk7CisJICBlbWl0X2luc24gKGdlbl92c3hfc2lnbl9leHRlbmRfaGlfdjJk aSAoZGlzY3JhdGNoLCB0ZW1wMSkpOworCX0KICAgICAgIGVsc2UgaWYgKGljb2RlID09IENP REVfRk9SX3ZzeF9seHZyd3gpCi0JZW1pdF9pbnNuIChnZW5fZXh0ZW5kc2lkaTIgKGRpc2Ny 
YXRjaCwgc2NyYXRjaCkpOwotICAgICAgLyogIEFzc2lnbiBkaXNjcmF0Y2ggZGlyZWN0bHkg aWYgc2NyYXRjaCBpcyBhbHJlYWR5IERJLiAgKi8KLSAgICAgIGlmIChpY29kZSA9PSBDT0RF X0ZPUl92c3hfbHh2cmR4KQotCWRpc2NyYXRjaCA9IHNjcmF0Y2g7CisJeworCSAgdGVtcDEg ID0gc2ltcGxpZnlfZ2VuX3N1YnJlZyAoVjRTSW1vZGUsIHRpc2NyYXRjaCwgVEltb2RlLCAw KTsKKwkgIGVtaXRfaW5zbiAoZ2VuX3ZzeF9zaWduX2V4dGVuZF9zaV92MmRpIChkaXNjcmF0 Y2gsIHRlbXAxKSk7CisJfQorICAgICAgZWxzZSBpZiAoaWNvZGUgPT0gQ09ERV9GT1JfdnN4 X2x4dnJkeCkKKwlkaXNjcmF0Y2ggPSBzaW1wbGlmeV9nZW5fc3VicmVnIChWMkRJbW9kZSwg dGlzY3JhdGNoLCBUSW1vZGUsIDApOworICAgICAgZWxzZQorCWdjY191bnJlYWNoYWJsZSAo KTsKIAotICAgICAgLyogRW1pdCB0aGUgc2lnbiBleHRlbnNpb24gZnJvbSBESSAoZG91Ymxl KSB0byBUSSAocXVhZCkuICAqLwotICAgICAgZW1pdF9pbnNuIChnZW5fZXh0ZW5kZGl0aTIg KHRhcmdldCwgZGlzY3JhdGNoKSk7CisgICAgICAvKiBFbWl0IHRoZSBzaWduIGV4dGVuc2lv biBmcm9tIFYyREkgKGRvdWJsZSkgdG8gVEkgKHF1YWQpLiAgKi8KKyAgICAgIHRlbXAyID0g c2ltcGxpZnlfZ2VuX3N1YnJlZyAoVEltb2RlLCBkaXNjcmF0Y2gsIFYyREltb2RlLCAwKTsK KyAgICAgIGVtaXRfaW5zbiAoZ2VuX2V4dGVuZGRpdGkyX3ZlY3RvciAodGFyZ2V0LCB0ZW1w MikpOwogCiAgICAgICByZXR1cm4gdGFyZ2V0OwogICAgIH0KZGlmZiAtLWdpdCBhL2djYy90 ZXN0c3VpdGUvZ2NjLnRhcmdldC9wb3dlcnBjL3AxMF92ZWNfeGxfc2V4dC5jIGIvZ2NjL3Rl c3RzdWl0ZS9nY2MudGFyZ2V0L3Bvd2VycGMvcDEwX3ZlY194bF9zZXh0LmMKbmV3IGZpbGUg bW9kZSAxMDA2NDQKaW5kZXggMDAwMDAwMDAwMDAuLjc4ZTcyYWM1NDI1Ci0tLSAvZGV2L251 bGwKKysrIGIvZ2NjL3Rlc3RzdWl0ZS9nY2MudGFyZ2V0L3Bvd2VycGMvcDEwX3ZlY194bF9z ZXh0LmMKQEAgLTAsMCArMSwzNSBAQAorLyogeyBkZy1kbyBjb21waWxlIH0gKi8KKy8qIHsg ZGctcmVxdWlyZS1lZmZlY3RpdmUtdGFyZ2V0IGludDEyOCB9ICovCisvKiB7IGRnLXJlcXVp cmUtZWZmZWN0aXZlLXRhcmdldCBwb3dlcjEwX29rIH0gKi8KKy8qIHsgZGctb3B0aW9ucyAi LW1kZWphZ251LWNwdT1wb3dlcjEwIC1PMiIgfSAqLworCisjaW5jbHVkZSA8YWx0aXZlYy5o PgorCit2ZWN0b3Igc2lnbmVkIF9faW50MTI4Citmb28xIChzaWduZWQgbG9uZyBhLCBzaWdu ZWQgY2hhciAqYikKK3sKKyAgcmV0dXJuIHZlY194bF9zZXh0IChhLCBiKTsKK30KKwordmVj dG9yIHNpZ25lZCBfX2ludDEyOAorZm9vMiAoc2lnbmVkIGxvbmcgYSwgc2lnbmVkIHNob3J0 ICpiKQoreworICByZXR1cm4gdmVjX3hsX3NleHQgKGEsIGIpOworfQorCit2ZWN0b3Igc2ln 
bmVkIF9faW50MTI4Citmb28zIChzaWduZWQgbG9uZyBhLCBzaWduZWQgaW50ICpiKQorewor ICByZXR1cm4gdmVjX3hsX3NleHQgKGEsIGIpOworfQorCit2ZWN0b3Igc2lnbmVkIF9faW50 MTI4Citmb280IChzaWduZWQgbG9uZyBhLCBzaWduZWQgbG9uZyAqYikKK3sKKyAgcmV0dXJu IHZlY194bF9zZXh0IChhLCBiKTsKK30KKworLyogeyBkZy1maW5hbCB7IHNjYW4tYXNzZW1i bGVyLXRpbWVzIHtcbXZleHRzZDJxXE19IDQgfSB9ICovCisvKiB7IGRnLWZpbmFsIHsgc2Nh bi1hc3NlbWJsZXItdGltZXMge1xtdmV4dHNiMmRcTX0gMSB9IH0gKi8KKy8qIHsgZGctZmlu YWwgeyBzY2FuLWFzc2VtYmxlci10aW1lcyB7XG12ZXh0c2gyZFxNfSAxIH0gfSAqLworLyog eyBkZy1maW5hbCB7IHNjYW4tYXNzZW1ibGVyLXRpbWVzIHtcbXZleHRzdzJkXE19IDEgfSB9 ICovCg== --------------j8DfIJ0zgyHWE5foXOzngcut--
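
For readers following the patch, the two-step extension it emits (narrow element to doubleword, then doubleword to quadword) can be modelled in portable C. This is only a sketch under stated assumptions, not the GCC implementation: `model_i128` and the `model_*` functions are hypothetical names invented here, the offset `a` is assumed to be a byte offset added to `b`, and the 128-bit result is represented as two 64-bit halves so that the code does not depend on `__int128` or on a Power10 target.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit value: high and low doublewords.
   This is an illustration type, not anything defined by GCC.  */
typedef struct
{
  uint64_t hi;
  uint64_t lo;
} model_i128;

/* Step 1: widen a 1-, 2-, 4- or 8-byte signed element to 64 bits.
   In the patch this step corresponds to vextsb2d, vextsh2d or vextsw2d;
   an 8-byte element needs no widening.  */
static int64_t
model_extend_to_di (const void *p, size_t size)
{
  int8_t b;
  int16_t h;
  int32_t w;
  int64_t d;

  switch (size)
    {
    case 1:
      memcpy (&b, p, sizeof b);
      return b;
    case 2:
      memcpy (&h, p, sizeof h);
      return h;
    case 4:
      memcpy (&w, p, sizeof w);
      return w;
    default:
      memcpy (&d, p, sizeof d);
      return d;
    }
}

/* Step 2: widen 64 bits to 128 bits by replicating the sign bit into
   the high doubleword.  In the patch this corresponds to vextsd2q.  */
static model_i128
model_extend_di_to_ti (int64_t d)
{
  model_i128 r;

  r.lo = (uint64_t) d;
  r.hi = d < 0 ? UINT64_MAX : 0;
  return r;
}

/* Model of vec_xl_sext (a, b): load one element of SIZE bytes at byte
   offset A from B (assumed semantics) and sign-extend it to 128 bits.  */
static model_i128
model_vec_xl_sext (long a, const void *b, size_t size)
{
  const char *addr = (const char *) b + a;

  return model_extend_di_to_ti (model_extend_to_di (addr, size));
}
```

With `signed char bytes[2] = { 5, -2 };`, a call such as `model_vec_xl_sext (1, bytes, 1)` yields a high half of all ones and a low half of 0xfffffffffffffffe, which is -2 as a 128-bit two's complement value.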