From mboxrd@z Thu Jan  1 00:00:00 1970
From: Oluwatamilore Adebayo
To:
CC: , ,
Subject: [PATCH 1/2] Missed opportunity to use [SU]ABD
Date: Wed, 14 Jun 2023 16:26:56 +0100
Message-ID: <20230614152656.51278-1-oluwatamilore.adebayo@arm.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain

From: oluade01

This adds a recognition pattern for the non-widening
absolute difference (ABD).

gcc/ChangeLog:

	* doc/md.texi (sabd, uabd): Document them.
	* internal-fn.def (ABD): Use new optab.
	* optabs.def (sabd_optab, uabd_optab): New optabs.
	* tree-vect-patterns.cc (vect_recog_absolute_difference):
	Recognize the following idiom abs (a - b).
	(vect_recog_sad_pattern): Refactor to use
	vect_recog_absolute_difference.
	(vect_recog_abd_pattern): Use patterns found by
	vect_recog_absolute_difference to build a new ABD
	internal call.
---
 gcc/doc/md.texi           |  10 ++
 gcc/internal-fn.def       |   3 +
 gcc/optabs.def            |   2 +
 gcc/tree-vect-patterns.cc | 233 +++++++++++++++++++++++++++++++++-----
 4 files changed, 217 insertions(+), 31 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 6a435eb44610960513e9739ac9ac1e8a27182c10..e11b10d2fca11016232921bc85e47975f700e6c6 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5787,6 +5787,16 @@ Other shift and rotate instructions, analogous to the
 Vector shift and rotate instructions that take vectors as operand 2
 instead of a scalar type.
 
+@cindex @code{uabd@var{m}} instruction pattern
+@cindex @code{sabd@var{m}} instruction pattern
+@item @samp{uabd@var{m}}, @samp{sabd@var{m}}
+Signed and unsigned absolute difference instructions.  These
+instructions find the difference between operands 1 and 2
+then return the absolute value.  A C code equivalent would be:
+@smallexample
+op0 = op1 > op2 ? op1 - op2 : op2 - op1;
+@end smallexample
+
 @cindex @code{avg@var{m}3_floor} instruction pattern
 @cindex @code{uavg@var{m}3_floor} instruction pattern
 @item @samp{avg@var{m}3_floor}
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 3ac9d82aace322bd8ef108596e5583daa18c76e3..116965f4830cec8f60642ff011a86b6562e2c509 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -191,6 +191,9 @@ DEF_INTERNAL_OPTAB_FN (FMS, ECF_CONST, fms, ternary)
 DEF_INTERNAL_OPTAB_FN (FNMA, ECF_CONST, fnma, ternary)
 DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST, fnms, ternary)
 
+DEF_INTERNAL_SIGNED_OPTAB_FN (ABD, ECF_CONST | ECF_NOTHROW, first,
+                              sabd, uabd, binary)
+
 DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_FLOOR, ECF_CONST | ECF_NOTHROW, first,
                               savg_floor, uavg_floor, binary)
 DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_CEIL, ECF_CONST | ECF_NOTHROW, first,
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 6c064ff4993620067d38742a0bfe0a3efb511069..35b835a6ac56d72417dac8ddfd77a8a7e2475e65 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -359,6 +359,8 @@ OPTAB_D (mask_fold_left_plus_optab, "mask_fold_left_plus_$a")
 OPTAB_D (extract_last_optab, "extract_last_$a")
 OPTAB_D (fold_extract_last_optab, "fold_extract_last_$a")
 
+OPTAB_D (uabd_optab, "uabd$a3")
+OPTAB_D (sabd_optab, "sabd$a3")
 OPTAB_D (savg_floor_optab, "avg$a3_floor")
 OPTAB_D (uavg_floor_optab, "uavg$a3_floor")
 OPTAB_D (savg_ceil_optab, "avg$a3_ceil")
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index dc102c919352a0328cf86eabceb3a38c41a7e4fd..e2392113bff4065c909aefc760b4c48978b73a5a 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -782,6 +782,83 @@ vect_split_statement (vec_info *vinfo, stmt_vec_info stmt2_info, tree new_rhs,
     }
 }
 
+/* Look for the following pattern
+        X = x[i]
+        Y = y[i]
+        DIFF = X - Y
+        DAD = ABS_EXPR <DIFF>
+
+   ABS_STMT should point to a statement of code ABS_EXPR or ABSU_EXPR.
+   HALF_TYPE and UNPROM will be set should the statement be found to
+   be a widened operation.
+   DIFF_STMT will be set to the MINUS_EXPR
+   statement that precedes the ABS_STMT unless vect_widened_op_tree
+   succeeds.
+ */
+static bool
+vect_recog_absolute_difference (vec_info *vinfo, gassign *abs_stmt,
+                                tree *half_type,
+                                vect_unpromoted_value unprom[2],
+                                gassign **diff_stmt)
+{
+  if (!abs_stmt)
+    return false;
+
+  /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
+     inside the loop (in case we are analyzing an outer-loop).  */
+  enum tree_code code = gimple_assign_rhs_code (abs_stmt);
+  if (code != ABS_EXPR && code != ABSU_EXPR)
+    return false;
+
+  tree abs_oprnd = gimple_assign_rhs1 (abs_stmt);
+  tree abs_type = TREE_TYPE (abs_oprnd);
+  if (!abs_oprnd)
+    return false;
+  if (!ANY_INTEGRAL_TYPE_P (abs_type)
+      || TYPE_OVERFLOW_WRAPS (abs_type)
+      || TYPE_UNSIGNED (abs_type))
+    return false;
+
+  /* Peel off conversions from the ABS input.  This can involve sign
+     changes (e.g. from an unsigned subtraction to a signed ABS input)
+     or signed promotion, but it can't include unsigned promotion.
+     (Note that ABS of an unsigned promotion should have been folded
+     away before now anyway.)  */
+  vect_unpromoted_value unprom_diff;
+  abs_oprnd = vect_look_through_possible_promotion (vinfo, abs_oprnd,
+                                                    &unprom_diff);
+  if (!abs_oprnd)
+    return false;
+  if (TYPE_PRECISION (unprom_diff.type) != TYPE_PRECISION (abs_type)
+      && TYPE_UNSIGNED (unprom_diff.type))
+    return false;
+
+  /* We then detect if the operand of abs_expr is defined by a minus_expr.  */
+  stmt_vec_info diff_stmt_vinfo = vect_get_internal_def (vinfo, abs_oprnd);
+  if (!diff_stmt_vinfo)
+    return false;
+
+  /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
+     inside the loop (in case we are analyzing an outer-loop).  */
+  if (vect_widened_op_tree (vinfo, diff_stmt_vinfo,
+                            MINUS_EXPR, IFN_VEC_WIDEN_MINUS,
+                            false, 2, unprom, half_type))
+    return true;
+
+  /* Failed to find a widen operation so we check for a regular MINUS_EXPR.  */
+  gassign *diff = dyn_cast <gassign *> (STMT_VINFO_STMT (diff_stmt_vinfo));
+  if (diff_stmt && diff
+      && gimple_assign_rhs_code (diff) == MINUS_EXPR
+      && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (abs_oprnd)))
+    {
+      *diff_stmt = diff;
+      *half_type = NULL_TREE;
+      return true;
+    }
+
+  return false;
+}
+
 /* Convert UNPROM to TYPE and return the result, adding new statements
    to STMT_INFO's pattern definition statements if no better way is
    available.  VECTYPE is the vector form of TYPE.
@@ -1320,41 +1397,28 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   gassign *abs_stmt = dyn_cast <gassign *> (abs_stmt_vinfo->stmt);
-  if (!abs_stmt
-      || (gimple_assign_rhs_code (abs_stmt) != ABS_EXPR
-          && gimple_assign_rhs_code (abs_stmt) != ABSU_EXPR))
-    return NULL;
+  vect_unpromoted_value unprom[2];
 
-  tree abs_oprnd = gimple_assign_rhs1 (abs_stmt);
-  tree abs_type = TREE_TYPE (abs_oprnd);
-  if (TYPE_UNSIGNED (abs_type))
-    return NULL;
+  if (!abs_stmt)
+    {
+      gcall *abd_stmt = dyn_cast <gcall *> (abs_stmt_vinfo->stmt);
+      if (!abd_stmt
+          || !gimple_call_internal_p (abd_stmt)
+          || gimple_call_internal_fn (abd_stmt) != IFN_ABD)
+        return NULL;
 
-  /* Peel off conversions from the ABS input.  This can involve sign
-     changes (e.g. from an unsigned subtraction to a signed ABS input)
-     or signed promotion, but it can't include unsigned promotion.
-     (Note that ABS of an unsigned promotion should have been folded
-     away before now anyway.)  */
-  vect_unpromoted_value unprom_diff;
-  abs_oprnd = vect_look_through_possible_promotion (vinfo, abs_oprnd,
-                                                    &unprom_diff);
-  if (!abs_oprnd)
-    return NULL;
-  if (TYPE_PRECISION (unprom_diff.type) != TYPE_PRECISION (abs_type)
-      && TYPE_UNSIGNED (unprom_diff.type))
-    return NULL;
+      tree abd_oprnd0 = gimple_call_arg (abd_stmt, 0);
+      tree abd_oprnd1 = gimple_call_arg (abd_stmt, 1);
 
-  /* We then detect if the operand of abs_expr is defined by a minus_expr.  */
-  stmt_vec_info diff_stmt_vinfo = vect_get_internal_def (vinfo, abs_oprnd);
-  if (!diff_stmt_vinfo)
-    return NULL;
+      if (!vect_look_through_possible_promotion (vinfo, abd_oprnd0, &unprom[0])
+          || !vect_look_through_possible_promotion (vinfo, abd_oprnd1,
+                                                    &unprom[1]))
+        return NULL;
 
-  /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
-     inside the loop (in case we are analyzing an outer-loop).  */
-  vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
-                             IFN_VEC_WIDEN_MINUS,
-                             false, 2, unprom, &half_type))
+      half_type = unprom[0].type;
+    }
+  else if (!vect_recog_absolute_difference (vinfo, abs_stmt, &half_type,
+                                            unprom, NULL))
     return NULL;
 
   vect_pattern_detected ("vect_recog_sad_pattern", last_stmt);
@@ -1376,6 +1440,112 @@ vect_recog_sad_pattern (vec_info *vinfo,
   return pattern_stmt;
 }
 
+/* Function vect_recog_abd_pattern
+
+   Try to find the following ABsolute Difference (ABD) pattern:
+
+   VTYPE x, y, out;
+   type diff;
+   loop i in range:
+   S1 diff = x[i] - y[i]
+   S2 out[i] = ABS_EXPR <diff>;
+
+   where 'type' is an integer and 'VTYPE' is a vector of integers
+   the same size as 'type'
+
+   Input:
+
+   * STMT_VINFO: The stmt from which the pattern search begins
+
+   Output:
+
+   * TYPE_out: The type of the output of this pattern
+
+   * Return value: A new stmt that will be used to replace the sequence of
+     stmts that constitute the pattern; either SABD or UABD:
+        SABD_EXPR
+        UABD_EXPR
+ */
+
+static gimple *
+vect_recog_abd_pattern (vec_info *vinfo,
+                        stmt_vec_info stmt_vinfo, tree *type_out)
+{
+  /* Look for the following patterns
+        X = x[i]
+        Y = y[i]
+        DIFF = X - Y
+        DAD = ABS_EXPR <DIFF>
+        out[i] = DAD
+
+     In which
+      - X, Y, DIFF, DAD all have the same type
+      - x, y, out are all vectors of the same type
+  */
+
+  gassign *last_stmt = dyn_cast <gassign *> (STMT_VINFO_STMT (stmt_vinfo));
+  if (!last_stmt)
+    return NULL;
+
+  tree out_type = TREE_TYPE (gimple_assign_lhs (last_stmt));
+
+  vect_unpromoted_value unprom[2];
+  gassign *diff_stmt;
+  tree half_type;
+  if (!vect_recog_absolute_difference (vinfo, last_stmt, &half_type,
+                                       unprom, &diff_stmt))
+    return NULL;
+
+  tree abd_type = out_type, vectype;
+  tree abd_oprnds[2];
+  bool extend = false;
+  if (half_type)
+    {
+      vectype = get_vectype_for_scalar_type (vinfo, half_type);
+      abd_type = half_type;
+      extend = TYPE_PRECISION (abd_type) < TYPE_PRECISION (out_type);
+    }
+  else
+    {
+      unprom[0].op = gimple_assign_rhs1 (diff_stmt);
+      unprom[1].op = gimple_assign_rhs2 (diff_stmt);
+      tree signed_out = signed_type_for (out_type);
+      vectype = get_vectype_for_scalar_type (vinfo, signed_out);
+    }
+
+  vect_pattern_detected ("vect_recog_abd_pattern", last_stmt);
+
+  if (!vectype
+      || !direct_internal_fn_supported_p (IFN_ABD, vectype,
+                                          OPTIMIZE_FOR_SPEED))
+    return NULL;
+
+  vect_convert_inputs (vinfo, stmt_vinfo, 2, abd_oprnds,
+                       TREE_TYPE (vectype), unprom, vectype);
+
+  *type_out = get_vectype_for_scalar_type (vinfo, out_type);
+
+  tree abd_result = vect_recog_temp_ssa_var (abd_type, NULL);
+  gcall *abd_stmt = gimple_build_call_internal (IFN_ABD, 2,
+                                                abd_oprnds[0], abd_oprnds[1]);
+  gimple_call_set_lhs (abd_stmt, abd_result);
+  gimple_set_location (abd_stmt, gimple_location (last_stmt));
+
+  if (!extend)
+    return abd_stmt;
+
+  gimple *stmt = abd_stmt;
+  if (!TYPE_UNSIGNED (abd_type))
+    {
+      tree unsign = unsigned_type_for (abd_type);
+      tree unsign_vectype = get_vectype_for_scalar_type (vinfo, unsign);
+      stmt = vect_convert_output (vinfo, stmt_vinfo, unsign, stmt,
+                                  unsign_vectype);
+    }
+
+  return vect_convert_output (vinfo, stmt_vinfo, out_type, stmt, vectype);
+}
+
 /* Recognize an operation that performs ORIG_CODE on widened inputs,
    so that it can be treated as though it had the form:
 
@@ -6471,6 +6641,7 @@ struct vect_recog_func
 static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_bitfield_ref_pattern, "bitfield_ref" },
   { vect_recog_bit_insert_pattern, "bit_insert" },
+  { vect_recog_abd_pattern, "abd" },
   { vect_recog_over_widening_pattern, "over_widening" },
   /* Must come after over_widening, which narrows the shift as much as
      possible beforehand.  */
-- 
2.25.1
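
For illustration only (not part of the patch): a minimal C sketch of the kind
of scalar loop the non-widening pattern above is written to match, next to a
scalar reference that follows the md.texi wording.  The names abd_ref_s32 and
abd_loop_s32 are hypothetical and do not come from the patch or its testsuite.
Whether the loop actually becomes an ABD instruction is an assumption that
depends on the target providing the sabd/uabd optabs and on the vectorizer
options in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference, following the md.texi description:
     op0 = op1 > op2 ? op1 - op2 : op2 - op1;
   Callers must keep the difference representable in int32_t.  */
int32_t
abd_ref_s32 (int32_t a, int32_t b)
{
  return a > b ? a - b : b - a;
}

/* Hypothetical candidate loop: DIFF = X - Y followed by ABS_EXPR <DIFF>,
   with X, Y, DIFF and the result all of the same signed type.  Overflow of
   the signed subtraction is undefined, which corresponds to the
   TYPE_OVERFLOW_UNDEFINED check in vect_recog_absolute_difference.  */
void
abd_loop_s32 (int32_t *restrict out, const int32_t *restrict x,
              const int32_t *restrict y, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = abs (x[i] - y[i]);
}
```

For inputs whose difference does not overflow, both functions give the same
result per element, which is the equivalence the new pattern relies on.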