From mboxrd@z Thu Jan 1 00:00:00 1970
From: Oluwatamilore Adebayo
Subject: [PATCH 1/2] Mid engine setup [SU]ABDL
Date: Wed, 28 Jun 2023 16:09:48 +0100
Message-ID: <20230628150948.47843-1-oluwatamilore.adebayo@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230628150747.47729-1-oluwatamilore.adebayo@arm.com>
References: <20230628150747.47729-1-oluwatamilore.adebayo@arm.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain

From: oluade01

This updates vect_recog_abd_pattern to recognize the widening
variant of absolute difference (ABDL, ABDL2).

gcc/ChangeLog:

	* internal-fn.cc (widening_fn_p, decomposes_to_hilo_fn_p):
	Add IFN_VEC_WIDEN_ABD to the switch statement.
	* internal-fn.def (VEC_WIDEN_ABD): New internal hilo optab.
	* optabs.def (vec_widen_sabd_optab,
	vec_widen_sabd_hi_optab, vec_widen_sabd_lo_optab,
	vec_widen_sabd_odd_optab, vec_widen_sabd_even_optab,
	vec_widen_uabd_optab,
	vec_widen_uabd_hi_optab, vec_widen_uabd_lo_optab,
	vec_widen_uabd_odd_optab, vec_widen_uabd_even_optab): New optabs.
	* tree-vect-patterns.cc (vect_recog_abd_pattern): Update to
	build a VEC_WIDEN_ABD call if the input precision is smaller
	than the precision of the output.
	(vect_recog_widen_abd_pattern): Should an ABD expression be
	found preceding an extension, replace the two with a
	VEC_WIDEN_ABD.
---
 gcc/doc/md.texi           |  11 ++
 gcc/internal-fn.def       |   5 +
 gcc/optabs.def            |  10 ++
 gcc/tree-vect-patterns.cc | 205 +++++++++++++++++++++++++++++---------
 4 files changed, 183 insertions(+), 48 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index e11b10d2fca11016232921bc85e47975f700e6c6..2ae6182b925d0cf8950dc830d083cf93baf2eaa1 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5617,6 +5617,17 @@ signed/unsigned elements of size S@. Subtract the high/low elements of 2 from
 1 and widen the resulting elements. Put the N/2 results of size 2*S in the
 output vector (operand 0).
 
+@cindex @code{vec_widen_sabdl_hi_@var{m}} instruction pattern
+@cindex @code{vec_widen_sabdl_lo_@var{m}} instruction pattern
+@cindex @code{vec_widen_uabdl_hi_@var{m}} instruction pattern
+@cindex @code{vec_widen_uabdl_lo_@var{m}} instruction pattern
+@item @samp{vec_widen_uabdl_hi_@var{m}}, @samp{vec_widen_uabdl_lo_@var{m}}
+@itemx @samp{vec_widen_sabdl_hi_@var{m}}, @samp{vec_widen_sabdl_lo_@var{m}}
+Signed/Unsigned widening absolute difference long. Operands 1 and 2 are
+vectors with N signed/unsigned elements of size S@. Find the absolute
+difference between 1 and 2 and widen the resulting elements. Put the N/2
+results of size 2*S in the output vector (operand 0).
+
 @cindex @code{vec_addsub@var{m}3} instruction pattern
 @item @samp{vec_addsub@var{m}3}
 Alternating subtract, add with even lanes doing subtract and odd
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 116965f4830cec8f60642ff011a86b6562e2c509..d67274d68b49943a88c531e903fd03b42343ab97 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -352,6 +352,11 @@ DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_MINUS,
                                 first,
                                 vec_widen_ssub, vec_widen_usub,
                                 binary)
+DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_ABD,
+                                ECF_CONST | ECF_NOTHROW,
+                                first,
+                                vec_widen_sabd, vec_widen_uabd,
+                                binary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
 
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 35b835a6ac56d72417dac8ddfd77a8a7e2475e65..68dfa1550f791a2fe833012157601ecfa68f1e09 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -418,6 +418,11 @@ OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a")
 OPTAB_D (vec_widen_sadd_lo_optab, "vec_widen_sadd_lo_$a")
 OPTAB_D (vec_widen_sadd_odd_optab, "vec_widen_sadd_odd_$a")
 OPTAB_D (vec_widen_sadd_even_optab, "vec_widen_sadd_even_$a")
+OPTAB_D (vec_widen_sabd_optab, "vec_widen_sabd_$a")
+OPTAB_D (vec_widen_sabd_hi_optab, "vec_widen_sabd_hi_$a")
+OPTAB_D (vec_widen_sabd_lo_optab, "vec_widen_sabd_lo_$a")
+OPTAB_D (vec_widen_sabd_odd_optab, "vec_widen_sabd_odd_$a")
+OPTAB_D (vec_widen_sabd_even_optab, "vec_widen_sabd_even_$a")
 OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a")
 OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a")
 OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a")
@@ -436,6 +441,11 @@ OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a")
 OPTAB_D (vec_widen_uadd_lo_optab, "vec_widen_uadd_lo_$a")
 OPTAB_D (vec_widen_uadd_odd_optab, "vec_widen_uadd_odd_$a")
 OPTAB_D (vec_widen_uadd_even_optab, "vec_widen_uadd_even_$a")
+OPTAB_D (vec_widen_uabd_optab, "vec_widen_uabd_$a")
+OPTAB_D (vec_widen_uabd_hi_optab, "vec_widen_uabd_hi_$a")
+OPTAB_D (vec_widen_uabd_lo_optab, "vec_widen_uabd_lo_$a")
+OPTAB_D (vec_widen_uabd_odd_optab, "vec_widen_uabd_odd_$a")
+OPTAB_D (vec_widen_uabd_even_optab, "vec_widen_uabd_even_$a")
 OPTAB_D (vec_addsub_optab, "vec_addsub$a3")
 OPTAB_D (vec_fmaddsub_optab, "vec_fmaddsub$a4")
 OPTAB_D (vec_fmsubadd_optab, "vec_fmsubadd$a4")
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index e2392113bff4065c909aefc760b4c48978b73a5a..281d7bc2e9945ee415be051f5ec1cce19251fbbf 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -1404,15 +1404,28 @@ vect_recog_sad_pattern (vec_info *vinfo,
       gcall *abd_stmt = dyn_cast <gcall *> (abs_stmt_vinfo->stmt);
       if (!abd_stmt
           || !gimple_call_internal_p (abd_stmt)
-          || gimple_call_internal_fn (abd_stmt) != IFN_ABD)
+          || gimple_call_num_args (abd_stmt) != 2)
         return NULL;
 
       tree abd_oprnd0 = gimple_call_arg (abd_stmt, 0);
       tree abd_oprnd1 = gimple_call_arg (abd_stmt, 1);
 
-      if (!vect_look_through_possible_promotion (vinfo, abd_oprnd0, &unprom[0])
-          || !vect_look_through_possible_promotion (vinfo, abd_oprnd1,
-                                                    &unprom[1]))
+      if (gimple_call_internal_fn (abd_stmt) == IFN_ABD)
+        {
+          if (!vect_look_through_possible_promotion (vinfo, abd_oprnd0,
+                                                     &unprom[0])
+              || !vect_look_through_possible_promotion (vinfo, abd_oprnd1,
+                                                        &unprom[1]))
+            return NULL;
+        }
+      else if (gimple_call_internal_fn (abd_stmt) == IFN_VEC_WIDEN_ABD)
+        {
+          unprom[0].op = abd_oprnd0;
+          unprom[0].type = TREE_TYPE (abd_oprnd0);
+          unprom[1].op = abd_oprnd1;
+          unprom[1].type = TREE_TYPE (abd_oprnd1);
+        }
+      else
         return NULL;
 
       half_type = unprom[0].type;
@@ -1442,16 +1455,19 @@ vect_recog_sad_pattern (vec_info *vinfo,
 
 /* Function vect_recog_abd_pattern
 
-   Try to find the following ABsolute Difference (ABD) pattern:
+   Try to find the following ABsolute Difference (ABD) or
+   widening ABD (WIDEN_ABD) pattern:
 
-   VTYPE x, y, out;
-   type diff;
-   loop i in range:
-     S1 diff = x[i] - y[i]
-     S2 out[i] = ABS_EXPR <diff>;
+   TYPE1 x;
+   TYPE2 y;
+   TYPE3 x_cast = (TYPE3) x;              // widening or no-op
+   TYPE3 y_cast = (TYPE3) y;              // widening or no-op
+   TYPE3 diff = x_cast - y_cast;
+   TYPE4 diff_cast = (TYPE4) diff;        // widening or no-op
+   TYPE5 abs = ABS(U)_EXPR <diff_cast>;
 
-   where 'type' is a integer and 'VTYPE' is a vector of integers
-   the same size as 'type'
+   WIDEN_ABD exists to optimize the case where WTYPE is at least
+   twice as wide as VTYPE.
 
    Input:
 
@@ -1459,30 +1475,18 @@ vect_recog_sad_pattern (vec_info *vinfo,
 
    Output:
 
-   * TYPE_out: The type of the output of this pattern
+   * TYPE_OUT: The type of the output of this pattern
 
    * Return value: A new stmt that will be used to replace the sequence of
-     stmts that constitute the pattern; either SABD or UABD:
-        SABD_EXPR
-        UABD_EXPR
+     stmts that constitute the pattern; either SABD, UABD, SABDL or UABDL:
+        IFN_ABD
+        IFN_WIDEN_ABD
  */
 
 static gimple *
 vect_recog_abd_pattern (vec_info *vinfo,
                         stmt_vec_info stmt_vinfo, tree *type_out)
 {
-  /* Look for the following patterns
-        X = x[i]
-        Y = y[i]
-        DIFF = X - Y
-        DAD = ABS_EXPR <DIFF>
-        out[i] = DAD
-
-     In which
-      - X, Y, DIFF, DAD all have the same type
-      - x, y, out are all vectors of the same type
-  */
-
   gassign *last_stmt = dyn_cast <gassign *> (STMT_VINFO_STMT (stmt_vinfo));
   if (!last_stmt)
     return NULL;
@@ -1496,54 +1500,83 @@ vect_recog_abd_pattern (vec_info *vinfo,
                                        unprom, &diff_stmt))
     return NULL;
 
-  tree abd_type = out_type, vectype;
-  tree abd_oprnds[2];
-  bool extend = false;
+  tree abd_in_type, abd_out_type;
+
   if (half_type)
     {
-      vectype = get_vectype_for_scalar_type (vinfo, half_type);
-      abd_type = half_type;
-      extend = TYPE_PRECISION (abd_type) < TYPE_PRECISION (out_type);
+      abd_in_type = half_type;
+      abd_out_type = abd_in_type;
     }
   else
     {
       unprom[0].op = gimple_assign_rhs1 (diff_stmt);
       unprom[1].op = gimple_assign_rhs2 (diff_stmt);
-      tree signed_out = signed_type_for (out_type);
-      vectype = get_vectype_for_scalar_type (vinfo, signed_out);
+      abd_in_type = signed_type_for (out_type);
+      abd_out_type = abd_in_type;
     }
 
-  vect_pattern_detected ("vect_recog_abd_pattern", last_stmt);
+  tree vectype_in = get_vectype_for_scalar_type (vinfo, abd_in_type);
+  if (!vectype_in)
+    return NULL;
 
-  if (!vectype
-      || !direct_internal_fn_supported_p (IFN_ABD, vectype,
+  internal_fn ifn = IFN_ABD;
+  tree vectype_out = vectype_in;
+
+  if (TYPE_PRECISION (out_type) >= TYPE_PRECISION (abd_in_type) * 2
+      && TYPE_PRECISION (abd_out_type) != stmt_vinfo->min_output_precision)
+    {
+      tree mid_type
+        = build_nonstandard_integer_type (TYPE_PRECISION (abd_in_type) * 2,
+                                          TYPE_UNSIGNED (abd_in_type));
+      tree mid_vectype = get_vectype_for_scalar_type (vinfo, mid_type);
+
+      code_helper dummy_code;
+      int dummy_int;
+      auto_vec<tree> dummy_vec;
+      if (mid_vectype
+          && supportable_widening_operation (vinfo, IFN_VEC_WIDEN_ABD,
+                                             stmt_vinfo, mid_vectype,
+                                             vectype_in,
+                                             &dummy_code, &dummy_code,
+                                             &dummy_int, &dummy_vec))
+        {
+          ifn = IFN_VEC_WIDEN_ABD;
+          abd_out_type = mid_type;
+          vectype_out = mid_vectype;
+        }
+    }
+
+  if (ifn == IFN_ABD
+      && !direct_internal_fn_supported_p (ifn, vectype_in,
                                           OPTIMIZE_FOR_SPEED))
     return NULL;
 
+  vect_pattern_detected ("vect_recog_abd_pattern", last_stmt);
+
+  tree abd_oprnds[2];
   vect_convert_inputs (vinfo, stmt_vinfo, 2, abd_oprnds,
-                       TREE_TYPE (vectype), unprom, vectype);
+                       abd_in_type, unprom, vectype_in);
 
   *type_out = get_vectype_for_scalar_type (vinfo, out_type);
 
-  tree abd_result = vect_recog_temp_ssa_var (abd_type, NULL);
-  gcall *abd_stmt = gimple_build_call_internal (IFN_ABD, 2,
+  tree abd_result = vect_recog_temp_ssa_var (abd_out_type, NULL);
+  gcall *abd_stmt = gimple_build_call_internal (ifn, 2,
                                                 abd_oprnds[0], abd_oprnds[1]);
   gimple_call_set_lhs (abd_stmt, abd_result);
   gimple_set_location (abd_stmt, gimple_location (last_stmt));
 
-  if (!extend)
-    return abd_stmt;
-
   gimple *stmt = abd_stmt;
-  if (!TYPE_UNSIGNED (abd_type))
+  if (TYPE_PRECISION (abd_in_type) == TYPE_PRECISION (abd_out_type)
+      && TYPE_PRECISION (abd_out_type) < TYPE_PRECISION (out_type)
+      && !TYPE_UNSIGNED (abd_out_type))
     {
-      tree unsign = unsigned_type_for (abd_type);
+      tree unsign = unsigned_type_for (abd_out_type);
       tree unsign_vectype = get_vectype_for_scalar_type (vinfo, unsign);
       stmt = vect_convert_output (vinfo, stmt_vinfo, unsign, stmt,
                                   unsign_vectype);
     }
 
-  return vect_convert_output (vinfo, stmt_vinfo, out_type, stmt, vectype);
+  return vect_convert_output (vinfo, stmt_vinfo, out_type, stmt, vectype_out);
 }
 
 /* Recognize an operation that performs ORIG_CODE on widened inputs,
@@ -1703,6 +1736,81 @@ vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
                                       &subtype);
 }
 
+/* Try to detect abd on widened inputs, converting IFN_ABD
+   to IFN_VEC_WIDEN_ABD. */
+static gimple *
+vect_recog_widen_abd_pattern (vec_info *vinfo, stmt_vec_info stmt_vinfo,
+                              tree *type_out)
+{
+  gassign *last_stmt = dyn_cast <gassign *> (STMT_VINFO_STMT (stmt_vinfo));
+  if (!last_stmt || !gimple_assign_cast_p (last_stmt))
+    return NULL;
+
+  tree last_rhs = gimple_assign_rhs1 (last_stmt);
+
+  tree in_type = TREE_TYPE (last_rhs);
+  tree out_type = TREE_TYPE (gimple_assign_lhs (last_stmt));
+  if (TYPE_PRECISION (in_type) * 2 != TYPE_PRECISION (out_type))
+    return NULL;
+
+  stmt_vec_info abs_vinfo = vect_get_internal_def (vinfo, last_rhs);
+  if (!abs_vinfo)
+    return NULL;
+
+  stmt_vec_info abd_pattern_vinfo = STMT_VINFO_RELATED_STMT (abs_vinfo);
+  if (!abd_pattern_vinfo)
+    return NULL;
+
+  gimple *pattern_stmt = STMT_VINFO_STMT (abd_pattern_vinfo);
+  if (gimple_assign_cast_p (pattern_stmt))
+    {
+      tree op = gimple_assign_rhs1 (pattern_stmt);
+      vect_unpromoted_value unprom;
+      op = vect_look_through_possible_promotion (vinfo, op, &unprom);
+
+      if (!op)
+        return NULL;
+
+      abd_pattern_vinfo = vect_get_internal_def (vinfo, op);
+      if (!abd_pattern_vinfo)
+        return NULL;
+
+      pattern_stmt = STMT_VINFO_STMT (abd_pattern_vinfo);
+    }
+
+  gcall *abd_stmt = dyn_cast <gcall *> (pattern_stmt);
+  if (!abd_stmt || gimple_call_internal_fn (abd_stmt) != IFN_ABD)
+    return NULL;
+
+  tree abd_oprnd0 = gimple_call_arg (abd_stmt, 0);
+  tree abd_oprnd1 = gimple_call_arg (abd_stmt, 1);
+  if (TYPE_PRECISION (TREE_TYPE (abd_oprnd0)) != TYPE_PRECISION (in_type))
+    return NULL;
+
+  tree vectype_in = get_vectype_for_scalar_type (vinfo, in_type);
+  tree vectype_out = get_vectype_for_scalar_type (vinfo, out_type);
+
+  code_helper dummy_code;
+  int dummy_int;
+  auto_vec<tree> dummy_vec;
+  if (!supportable_widening_operation (vinfo, IFN_VEC_WIDEN_ABD, stmt_vinfo,
+                                       vectype_out, vectype_in,
+                                       &dummy_code, &dummy_code,
+                                       &dummy_int, &dummy_vec))
+    return NULL;
+
+  vect_pattern_detected ("vect_recog_widen_abd_pattern", last_stmt);
+
+  *type_out = vectype_out;
+
+  tree widen_abd_result = vect_recog_temp_ssa_var (out_type, NULL);
+  gcall *widen_abd_stmt = gimple_build_call_internal (IFN_VEC_WIDEN_ABD, 2,
+                                                      abd_oprnd0, abd_oprnd1);
+  gimple_call_set_lhs (widen_abd_stmt, widen_abd_result);
+  gimple_set_location (widen_abd_stmt, gimple_location (last_stmt));
+  return widen_abd_stmt;
+}
+
 /* Function vect_recog_ctz_ffs_pattern
 
    Try to find the following pattern:
@@ -6670,6 +6778,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_mask_conversion_pattern, "mask_conversion" },
   { vect_recog_widen_plus_pattern, "widen_plus" },
   { vect_recog_widen_minus_pattern, "widen_minus" },
+  { vect_recog_widen_abd_pattern, "widen_abd" },
   /* These must come after the double widening ones. */
 };
-- 
2.25.1
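
For readers less familiar with the operation, the scalar C++ sketch below models what a widening absolute difference is assumed to compute, following the md.texi wording in the patch (N input elements of size S, N/2 results of size 2*S per hi/lo half). It is not GCC code: widen_sabd, widen_uabd_half and widen_abd_demo are hypothetical names, and treating "hi" as the upper-indexed half of the input is an assumption made only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a signed widening absolute difference:
// both 8-bit inputs are widened before subtracting, so the result
// always fits in the 16-bit output type.
static inline int16_t
widen_sabd (int8_t a, int8_t b)
{
  int diff = static_cast<int> (a) - static_cast<int> (b);
  return static_cast<int16_t> (diff < 0 ? -diff : diff);
}

// Hypothetical model of the hi/lo split for unsigned inputs: N input lanes
// of 8 bits produce N/2 output lanes of 16 bits.  Treating "hi" as the
// upper-indexed half is an assumption made for illustration only.
template <std::size_t N>
std::array<uint16_t, N / 2>
widen_uabd_half (const std::array<uint8_t, N> &x,
                 const std::array<uint8_t, N> &y, bool hi)
{
  static_assert (N % 2 == 0, "lane count must be even");
  std::array<uint16_t, N / 2> out{};
  const std::size_t base = hi ? N / 2 : 0;
  for (std::size_t i = 0; i < N / 2; ++i)
    {
      int diff = static_cast<int> (x[base + i]) - static_cast<int> (y[base + i]);
      out[i] = static_cast<uint16_t> (diff < 0 ? -diff : diff);
    }
  return out;
}

// Small self-check of the model.
static void
widen_abd_demo ()
{
  // |-128 - 127| = 255 does not fit in int8_t but does fit in int16_t.
  assert (widen_sabd (-128, 127) == 255);

  const std::array<uint8_t, 4> x = {0, 255, 10, 20};
  const std::array<uint8_t, 4> y = {255, 0, 20, 10};
  const auto lo = widen_uabd_half (x, y, false);
  const auto hi = widen_uabd_half (x, y, true);
  assert (lo[0] == 255 && lo[1] == 255);
  assert (hi[0] == 10 && hi[1] == 10);
}
```

The signed case shows why the non-widening path in the patch converts a signed ABD result to unsigned before extending it: the difference of two signed 8-bit values can reach 255, which only an unsigned 8-bit or a wider type can hold.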