From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: [PATCH v3] vect: Recog mul_highpart pattern
To: Richard Biener
Cc: Richard Sandiford, Bill Schmidt, GCC Patches, Segher Boessenkool, Uros Bizjak
References: <0b72fa77-a281-35e6-34e3-17cf26f18bc1@linux.ibm.com>
From: "Kewen.Lin"
Message-ID: <46838de4-3d92-a270-e71a-73fbe923d306@linux.ibm.com>
Date: Thu, 15 Jul 2021 15:06:58 +0800
MIME-Version: 1.0
In-Reply-To: <0b72fa77-a281-35e6-34e3-17cf26f18bc1@linux.ibm.com>
Content-Type: multipart/mixed; boundary="------------B7DD34C278CA859D49272858"
Content-Language: en-US
List-Id: Gcc-patches mailing list
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Jul 2021 07:07:14 -0000

This is a multi-part message in MIME format.
--------------B7DD34C278CA859D49272858
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

on 2021/7/14 3:45 PM, Kewen.Lin via Gcc-patches wrote:
> on 2021/7/14 2:38 PM, Richard Biener wrote:
>> On Tue, Jul 13, 2021 at 4:59 PM Kewen.Lin wrote:
>>>
>>> on 2021/7/13 8:42 PM, Richard Biener wrote:
>>>> On Tue, Jul 13, 2021 at 12:25 PM Kewen.Lin wrote:
>>
>>> I guess the proposed IFN would be directly mapped for [us]mul_highpart?
>>
>> Yes.
>>
>
> Thanks for confirming!  The related patch v2 is attached and the testing
> is ongoing.
>

It's bootstrapped & regtested on powerpc64le-linux-gnu P9 and
aarch64-linux-gnu.  But on x86_64-redhat-linux there are XPASSes as below:

XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhuw
XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhuw
XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhw
XFAIL->XPASS: gcc.target/i386/pr100637-3w.c scan-assembler pmulhw

They did not show up in the testing run with the previous patch, which
did not use the IFN approach.  On investigation, the difference comes from
the different costing of MULT_HIGHPART_EXPR and IFN_MULH.

MULT_HIGHPART_EXPR is costed at 16, through the call below:

      case MULT_EXPR:
      case WIDEN_MULT_EXPR:
      case MULT_HIGHPART_EXPR:
        stmt_cost = ix86_multiplication_cost (ix86_cost, mode);

IFN_MULH, on the other hand, is costed at 4 as a normal stmt, so the total
cost becomes profitable and the expected vectorization happens.

One conservative fix seems to be to make IFN_MULH costing go through the
same cost interface as multiplication, that is:

      case CFN_MULH:
        stmt_cost = ix86_multiplication_cost (ix86_cost, mode);
        break;

As the test case marks these checks as "xfail", it is probably worth
revisiting the costing of mul_highpart to make sure it is not overpriced.
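
For a concrete picture of the operation being costed, here is a minimal
scalar sketch in C of a 16-bit multiply high part, i.e. the "normal multiply
high" shape from the patch comment (widen both inputs, multiply, shift right
by half the wide precision, keep only the bottom half of the result).  The
names mulh_s16, mulh_u16 and mulh_loop are made up for illustration and are
not taken from pr100637-3w.c.  The signed variant assumes an arithmetic right
shift of negative values, which is what GCC implements.

```c
#include <stdint.h>

/* Signed multiply high part: widen to 32 bits, multiply, and keep the
   top 16 bits of the 32-bit product.  Assumes >> on a negative int32_t
   is an arithmetic shift (true for GCC).  */
static int16_t
mulh_s16 (int16_t a, int16_t b)
{
  return (int16_t) (((int32_t) a * (int32_t) b) >> 16);
}

/* Unsigned counterpart; the 32-bit product of two 16-bit values
   cannot overflow uint32_t.  */
static uint16_t
mulh_u16 (uint16_t a, uint16_t b)
{
  return (uint16_t) (((uint32_t) a * (uint32_t) b) >> 16);
}

/* Hypothetical loop of the kind a vectorizer could turn into a vector
   multiply-high operation, one element per lane.  */
static void
mulh_loop (int16_t *restrict r, const int16_t *a, const int16_t *b, int n)
{
  for (int i = 0; i < n; i++)
    r[i] = mulh_s16 (a[i], b[i]);
}
```

On x86, a vectorized version of such a loop would be expected to use pmulhw
(pmulhuw for the unsigned variant), which are the instructions the XPASSing
checks above scan for.
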
The attached patch also addressed Richard S.'s review comments on two reformatting hunks. Is it ok for trunk? BR, Kewen ----- gcc/ChangeLog: * internal-fn.c (first_commutative_argument): Add info for IFN_MULH. * internal-fn.def (IFN_MULH): New internal function. * tree-vect-patterns.c (vect_recog_mulhs_pattern): Add support to recog normal multiply highpart as IFN_MULH. * config/i386/i386.c (ix86_add_stmt_cost): Adjust for combined function CFN_MULH. --------------B7DD34C278CA859D49272858 Content-Type: text/plain; charset=UTF-8; x-mac-type="0"; x-mac-creator="0"; name="vect-recog-hmul-v3.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="vect-recog-hmul-v3.patch" LS0tCiBnY2MvY29uZmlnL2kzODYvaTM4Ni5jICAgfCAgMyArKysKIGdjYy9pbnRlcm5hbC1m bi5jICAgICAgICB8ICAxICsKIGdjYy9pbnRlcm5hbC1mbi5kZWYgICAgICB8ICAyICsrCiBn Y2MvdHJlZS12ZWN0LXBhdHRlcm5zLmMgfCAzOCArKysrKysrKysrKysrKysrKysrKysrKysr KysrLS0tLS0tLS0tLQogNCBmaWxlcyBjaGFuZ2VkLCAzNCBpbnNlcnRpb25zKCspLCAxMCBk ZWxldGlvbnMoLSkKCmRpZmYgLS1naXQgYS9nY2MvY29uZmlnL2kzODYvaTM4Ni5jIGIvZ2Nj L2NvbmZpZy9pMzg2L2kzODYuYwppbmRleCBhOTMxMjhmYTBhNC4uMWRkOTEwODM1M2MgMTAw NjQ0Ci0tLSBhL2djYy9jb25maWcvaTM4Ni9pMzg2LmMKKysrIGIvZ2NjL2NvbmZpZy9pMzg2 L2kzODYuYwpAQCAtMjI1NTksNiArMjI1NTksOSBAQCBpeDg2X2FkZF9zdG10X2Nvc3QgKGNs YXNzIHZlY19pbmZvICp2aW5mbywgdm9pZCAqZGF0YSwgaW50IGNvdW50LAogCQkJCSAgIG1v ZGUgPT0gU0Ztb2RlID8gaXg4Nl9jb3N0LT5mbWFzcwogCQkJCSAgIDogaXg4Nl9jb3N0LT5m bWFzZCk7CiAJYnJlYWs7CisgICAgICBjYXNlIENGTl9NVUxIOgorCXN0bXRfY29zdCA9IGl4 ODZfbXVsdGlwbGljYXRpb25fY29zdCAoaXg4Nl9jb3N0LCBtb2RlKTsKKwlicmVhazsKICAg ICAgIGRlZmF1bHQ6CiAJYnJlYWs7CiAgICAgICB9CmRpZmYgLS1naXQgYS9nY2MvaW50ZXJu YWwtZm4uYyBiL2djYy9pbnRlcm5hbC1mbi5jCmluZGV4IGZiOGI0M2QxY2UyLi5iMWI0Mjg5 MzU3YyAxMDA2NDQKLS0tIGEvZ2NjL2ludGVybmFsLWZuLmMKKysrIGIvZ2NjL2ludGVybmFs LWZuLmMKQEAgLTM3MDMsNiArMzcwMyw3IEBAIGZpcnN0X2NvbW11dGF0aXZlX2FyZ3VtZW50 IChpbnRlcm5hbF9mbiBmbikKICAgICBjYXNlIElGTl9GTk1TOgogICAgIGNhc2UgSUZOX0FW 
R19GTE9PUjoKICAgICBjYXNlIElGTl9BVkdfQ0VJTDoKKyAgICBjYXNlIElGTl9NVUxIOgog ICAgIGNhc2UgSUZOX01VTEhTOgogICAgIGNhc2UgSUZOX01VTEhSUzoKICAgICBjYXNlIElG Tl9GTUlOOgpkaWZmIC0tZ2l0IGEvZ2NjL2ludGVybmFsLWZuLmRlZiBiL2djYy9pbnRlcm5h bC1mbi5kZWYKaW5kZXggYzNiOGU3MzA5NjAuLmVkNmQ3ZGUxNjgwIDEwMDY0NAotLS0gYS9n Y2MvaW50ZXJuYWwtZm4uZGVmCisrKyBiL2djYy9pbnRlcm5hbC1mbi5kZWYKQEAgLTE2OSw2 ICsxNjksOCBAQCBERUZfSU5URVJOQUxfU0lHTkVEX09QVEFCX0ZOIChBVkdfRkxPT1IsIEVD Rl9DT05TVCB8IEVDRl9OT1RIUk9XLCBmaXJzdCwKIERFRl9JTlRFUk5BTF9TSUdORURfT1BU QUJfRk4gKEFWR19DRUlMLCBFQ0ZfQ09OU1QgfCBFQ0ZfTk9USFJPVywgZmlyc3QsCiAJCQkg ICAgICBzYXZnX2NlaWwsIHVhdmdfY2VpbCwgYmluYXJ5KQogCitERUZfSU5URVJOQUxfU0lH TkVEX09QVEFCX0ZOIChNVUxILCBFQ0ZfQ09OU1QgfCBFQ0ZfTk9USFJPVywgZmlyc3QsCisJ CQkgICAgICBzbXVsX2hpZ2hwYXJ0LCB1bXVsX2hpZ2hwYXJ0LCBiaW5hcnkpCiBERUZfSU5U RVJOQUxfU0lHTkVEX09QVEFCX0ZOIChNVUxIUywgRUNGX0NPTlNUIHwgRUNGX05PVEhST1cs IGZpcnN0LAogCQkJICAgICAgc211bGhzLCB1bXVsaHMsIGJpbmFyeSkKIERFRl9JTlRFUk5B TF9TSUdORURfT1BUQUJfRk4gKE1VTEhSUywgRUNGX0NPTlNUIHwgRUNGX05PVEhST1csIGZp cnN0LApkaWZmIC0tZ2l0IGEvZ2NjL3RyZWUtdmVjdC1wYXR0ZXJucy5jIGIvZ2NjL3RyZWUt dmVjdC1wYXR0ZXJucy5jCmluZGV4IGIyZTdmYzJjYzdhLi5hZGE4OWQ3MDYwYiAxMDA2NDQK LS0tIGEvZ2NjL3RyZWUtdmVjdC1wYXR0ZXJucy5jCisrKyBiL2djYy90cmVlLXZlY3QtcGF0 dGVybnMuYwpAQCAtMTg5Niw4ICsxODk2LDE1IEBAIHZlY3RfcmVjb2dfb3Zlcl93aWRlbmlu Z19wYXR0ZXJuICh2ZWNfaW5mbyAqdmluZm8sCiAKICAgIDEpIE11bHRpcGx5IGhpZ2ggd2l0 aCBzY2FsaW5nCiAgICAgIFRZUEUgcmVzID0gKChUWVBFKSBhICogKFRZUEUpIGIpID4+IGM7 CisgICAgIEhlcmUsIGMgaXMgYml0c2l6ZSAoVFlQRSkgLyAyIC0gMS4KKwogICAgMikgLi4u IG9yIGFsc28gd2l0aCByb3VuZGluZwogICAgICBUWVBFIHJlcyA9ICgoKFRZUEUpIGEgKiAo VFlQRSkgYikgPj4gZCArIDEpID4+IDE7CisgICAgIEhlcmUsIGQgaXMgYml0c2l6ZSAoVFlQ RSkgLyAyIC0gMi4KKworICAgMykgTm9ybWFsIG11bHRpcGx5IGhpZ2gKKyAgICAgVFlQRSBy ZXMgPSAoKFRZUEUpIGEgKiAoVFlQRSkgYikgPj4gZTsKKyAgICAgSGVyZSwgZSBpcyBiaXRz aXplIChUWVBFKSAvIDIuCiAKICAgIHdoZXJlIG9ubHkgdGhlIGJvdHRvbSBoYWxmIG9mIHJl cyBpcyB1c2VkLiAgKi8KIApAQCAtMTk0Miw3ICsxOTQ5LDYgQEAgdmVjdF9yZWNvZ19tdWxo 
c19wYXR0ZXJuICh2ZWNfaW5mbyAqdmluZm8sCiAgIHN0bXRfdmVjX2luZm8gbXVsaF9zdG10 X2luZm87CiAgIHRyZWUgc2NhbGVfdGVybTsKICAgaW50ZXJuYWxfZm4gaWZuOwotICB1bnNp Z25lZCBpbnQgZXhwZWN0X29mZnNldDsKIAogICAvKiBDaGVjayBmb3IgdGhlIHByZXNlbmNl IG9mIHRoZSByb3VuZGluZyB0ZXJtLiAgKi8KICAgaWYgKGdpbXBsZV9hc3NpZ25fcmhzX2Nv ZGUgKHJzaGlmdF9pbnB1dF9zdG10KSA9PSBQTFVTX0VYUFIpCkBAIC0xOTkxLDI1ICsxOTk3 LDM3IEBAIHZlY3RfcmVjb2dfbXVsaHNfcGF0dGVybiAodmVjX2luZm8gKnZpbmZvLAogCiAg ICAgICAvKiBHZXQgdGhlIHNjYWxpbmcgdGVybS4gICovCiAgICAgICBzY2FsZV90ZXJtID0g Z2ltcGxlX2Fzc2lnbl9yaHMyIChwbHVzX2lucHV0X3N0bXQpOworICAgICAgLyogQ2hlY2sg dGhhdCB0aGUgc2NhbGluZyBmYWN0b3IgaXMgY29ycmVjdC4gICovCisgICAgICBpZiAoVFJF RV9DT0RFIChzY2FsZV90ZXJtKSAhPSBJTlRFR0VSX0NTVCkKKwlyZXR1cm4gTlVMTDsKKwor ICAgICAgLyogQ2hlY2sgcGF0dGVybiAyKS4gICovCisgICAgICBpZiAod2k6OnRvX3dpZGVz dCAoc2NhbGVfdGVybSkgKyB0YXJnZXRfcHJlY2lzaW9uICsgMgorCSAgIT0gVFlQRV9QUkVD SVNJT04gKGxoc190eXBlKSkKKwlyZXR1cm4gTlVMTDsKIAotICAgICAgZXhwZWN0X29mZnNl dCA9IHRhcmdldF9wcmVjaXNpb24gKyAyOwogICAgICAgaWZuID0gSUZOX01VTEhSUzsKICAg ICB9CiAgIGVsc2UKICAgICB7CiAgICAgICBtdWxoX3N0bXRfaW5mbyA9IHJzaGlmdF9pbnB1 dF9zdG10X2luZm87CiAgICAgICBzY2FsZV90ZXJtID0gZ2ltcGxlX2Fzc2lnbl9yaHMyIChs YXN0X3N0bXQpOworICAgICAgLyogQ2hlY2sgdGhhdCB0aGUgc2NhbGluZyBmYWN0b3IgaXMg Y29ycmVjdC4gICovCisgICAgICBpZiAoVFJFRV9DT0RFIChzY2FsZV90ZXJtKSAhPSBJTlRF R0VSX0NTVCkKKwlyZXR1cm4gTlVMTDsKIAotICAgICAgZXhwZWN0X29mZnNldCA9IHRhcmdl dF9wcmVjaXNpb24gKyAxOwotICAgICAgaWZuID0gSUZOX01VTEhTOworICAgICAgLyogQ2hl Y2sgZm9yIHBhdHRlcm4gMSkuICAqLworICAgICAgaWYgKHdpOjp0b193aWRlc3QgKHNjYWxl X3Rlcm0pICsgdGFyZ2V0X3ByZWNpc2lvbiArIDEKKwkgID09IFRZUEVfUFJFQ0lTSU9OIChs aHNfdHlwZSkpCisJaWZuID0gSUZOX01VTEhTOworICAgICAgLyogQ2hlY2sgZm9yIHBhdHRl cm4gMykuICAqLworICAgICAgZWxzZSBpZiAod2k6OnRvX3dpZGVzdCAoc2NhbGVfdGVybSkg KyB0YXJnZXRfcHJlY2lzaW9uCisJICAgICAgID09IFRZUEVfUFJFQ0lTSU9OIChsaHNfdHlw ZSkpCisJaWZuID0gSUZOX01VTEg7CisgICAgICBlbHNlCisJcmV0dXJuIE5VTEw7CiAgICAg fQogCi0gIC8qIENoZWNrIHRoYXQgdGhlIHNjYWxpbmcgZmFjdG9yIGlzIGNvcnJlY3QuICAq 
LwotICBpZiAoVFJFRV9DT0RFIChzY2FsZV90ZXJtKSAhPSBJTlRFR0VSX0NTVAotICAgICAg fHwgd2k6OnRvX3dpZGVzdCAoc2NhbGVfdGVybSkgKyBleHBlY3Rfb2Zmc2V0Ci0JICAgIT0g VFlQRV9QUkVDSVNJT04gKGxoc190eXBlKSkKLSAgICByZXR1cm4gTlVMTDsKLQogICAvKiBD aGVjayB3aGV0aGVyIHRoZSBzY2FsaW5nIGlucHV0IHRlcm0gY2FuIGJlIHNlZW4gYXMgdHdv IHdpZGVuZWQKICAgICAgaW5wdXRzIG11bHRpcGxpZWQgdG9nZXRoZXIuICAqLwogICB2ZWN0 X3VucHJvbW90ZWRfdmFsdWUgdW5wcm9tX211bHRbMl07Ci0tIAoyLjE3LjEKCg== --------------B7DD34C278CA859D49272858--