From: Oluwatamilore Adebayo
Subject: Re: [PATCH 1/2] Mid engine setup [SU]ABDL
Date: Wed, 28 Jun 2023 16:07:47 +0100
Message-ID: <20230628150747.47729-1-oluwatamilore.adebayo@arm.com>

> The new optabs need to be documented in doc/md.texi.

Done.

> “Long” is a bit of an architecture-specific term. Maybe just:
>
>   Try to find the following ABsolute Difference (ABD) or
>   widening ABD (WIDEN_ABD) pattern:

Change made.

> >> -  VTYPE x, y, out;
> >> +  VTYPE x, y;
> >> +  WTYPE out;
> >>    type diff;
> >>    loop i in range:
> >>      S1 diff = x[i] - y[i]
> >>      S2 out[i] = ABS_EXPR <diff>;
> >>
> >> -  where 'type' is a integer and 'VTYPE' is a vector of integers
> >> -  the same size as 'type'
> >> +  where 'VTYPE' and 'WTYPE' are vectors of integers.
> >> +  'WTYPE' may be wider than 'VTYPE'.
> >> +  'type' is as wide as 'WTYPE'.
> >
> > I don't think the existing comment is right about the types. What we're
> > matching is scalar code, so VTYPE and (now) WTYPE are integers rather
> > than vectors of integers.
>
> Gah, sorry, I realise now that the point was that VTYPE and WTYPE
> are sequences rather than scalars. But patterns are used for SLP
> as well as loops, and the inputs and outputs might not be memory
> objects. So:
>
> > I think it would be clearer to write:
> >
> >   S1 diff = (type) x[i] - (type) y[i]
> >   S2 out[i] = ABS_EXPR <(WTYPE) diff>;
> >
> > since the promotions happen on the operands.
> >
> > It'd be good to keep the part about 'type' being an integer.
> >
> > Rather than:
> >
> >   'WTYPE' may be wider than 'VTYPE'.
> >   'type' is as wide as 'WTYPE'.
> >
> > maybe:
> >
> >   'type' is no narrower than 'VTYPE' (but may be wider)
> >   'WTYPE' is no narrower than 'type' (but may be wider)
>
> ...how about:
>
>   TYPE1 x;
>   TYPE2 y;
>   TYPE3 x_cast = (TYPE3) x;        // widening or no-op
>   TYPE3 y_cast = (TYPE3) y;        // widening or no-op
>   TYPE3 diff = x_cast - y_cast;
>   TYPE4 diff_cast = (TYPE4) diff;  // widening or no-op
>   TYPE5 abs = ABS(U)_EXPR <diff_cast>;
>
> (based on the comment above vect_recog_widen_op_pattern).

Done.

> WTYPE can't be narrower than VTYPE though. I think with the changes
> suggested above, the text before this block describes the conditions
> in enough detail, and so we can just say:
>
>   WIDEN_ABD exists to optimize the case where WTYPE is at least twice as
>   wide as VTYPE.

Change made.

> SABD_EXPR/UABD_EXPR should be IFN_ABD
> SABDL_EXPR/UABDL_EXPR should be IFN_WIDEN_ABD

Change made.

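To make the TYPE1..TYPE5 shape quoted above concrete, here is a minimal C sketch of two scalar loops of that form. The function names abd_u8 and widen_abd_s8 are made up for this illustration and do not appear in the patch. Whether a given compiler and target actually turn loops like these into ABD or widening-ABD instructions is not something the sketch asserts.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Same-width case (hypothetical example): 8-bit inputs, 8-bit output.
   The subtraction is done in a wider type (int), so it cannot wrap.  */
void abd_u8(const uint8_t *x, const uint8_t *y, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int x_cast = (int) x[i];          /* TYPE3 x_cast = (TYPE3) x */
        int y_cast = (int) y[i];          /* TYPE3 y_cast = (TYPE3) y */
        int diff = x_cast - y_cast;       /* in the range -255..255 */
        out[i] = (uint8_t) abs(diff);     /* 0..255, so narrowing is lossless */
    }
}

/* Widening case (hypothetical example): 8-bit inputs, 16-bit output,
   i.e. the output is twice as wide as the inputs.  */
void widen_abd_s8(const int8_t *x, const int8_t *y, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int16_t x_cast = (int16_t) x[i];
        int16_t y_cast = (int16_t) y[i];
        int16_t diff = (int16_t) (x_cast - y_cast);  /* -255..255 fits */
        out[i] = (int16_t) abs(diff);
    }
}
```

In the second loop the largest possible result, 255 (from 127 and -128), does not fit in a signed 8-bit value, which is one way to see why a widened output type is useful there.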
> Maybe it would be easier to remove this comment, since I think the
> comment above the function says enough.

Done.

> Rather than have the "extend" variable, how about:
>
> >
> > -  vect_pattern_detected ("vect_recog_abd_pattern", last_stmt);
> > +  tree vectype_in = get_vectype_for_scalar_type (vinfo, abd_in_type);
> > +  tree vectype_out = get_vectype_for_scalar_type (vinfo, abd_out_type);
> > +  if (!vectype_in || !vectype_out)
> > +    return NULL;
> >
> > -  if (!vectype
> > -      || !direct_internal_fn_supported_p (IFN_ABD, vectype,
> > +  if (ifn == IFN_VEC_WIDEN_ABD)
> > +    {
> > +      code_helper dummy_code;
> > +      int dummy_int;
> > +      auto_vec<tree> dummy_vec;
> > +      if (!supportable_widening_operation (vinfo, ifn, stmt_vinfo,
> > +                                           vectype_out, vectype_in,
> > +                                           &dummy_code, &dummy_code,
> > +                                           &dummy_int, &dummy_vec))
> > +        {
> > +          /* There are architectures that have the ABD instructions
> > +             but not the ABDL instructions. If we just return NULL here
> > +             we will miss an occasion where we should have used ABD.
> > +             So we change back to ABD and try again. */
> > +          ifn = IFN_ABD;
> > +          abd_out_type = abd_in_type;
> > +          extend = true;
> > +        }
> > +    }
>
> making this:
>
>   if (TYPE_PRECISION (out_type) >= TYPE_PRECISION (abd_in_type) * 2)
>     {
>       tree mid_type
>         = build_nonstandard_integer_type (TYPE_PRECISION (abd_in_type) * 2,
>                                           TYPE_UNSIGNED (abd_in_type));
>       tree mid_vectype = get_vectype_for_scalar_type (vinfo, mid_type);
>       code_helper dummy_code;
>       int dummy_int;
>       auto_vec<tree> dummy_vec;
>       if (mid_vectype
>           && supportable_widening_operation (vinfo, IFN_WIDEN_ABD, stmt_vinfo,
>                                              mid_vectype, vectype_in,
>                                              &dummy_code, &dummy_code,
>                                              &dummy_int, &dummy_vec))
>         {
>           ifn = IFN_WIDEN_ABD;
>           abd_out_type = mid_type;
>           vectype_out = mid_vectype;
>         }
>     }
>
> The idea is to keep the assumption that we're using IFN_ABD
> until we've proven conclusively otherwise.
>
> I think the later:
>
>   if (!extend)
>     return abd_stmt;
>
> should then be deleted, since we should still use vect_convert_output
> if abd_out_type has a different sign from out_type.
>
> .....
>
> And this condition would then be:
>
>   if (TYPE_PRECISION (abd_out_type) == TYPE_PRECISION (abd_in_type)
>       && TYPE_PRECISION (abd_out_type) < TYPE_PRECISION (out_type)
>       && !TYPE_UNSIGNED (abd_out_type))
>
> since:
>
> (a) we don't need to force the type to unsigned if the final
>     vect_convert_output is just a sign change
>
> (b) we don't need to force the type to unsigned if abd_out_type
>     is large enough to hold the resu

Done.

> I think it'd be more natural to test this on TREE_TYPE (last_rhs)
> rather than TREE_TYPE (abd_oprnd0), and do it after the assignment
> to last_rhs above.

Done.

> > +  tree abdl_result = vect_recog_temp_ssa_var (out_type, NULL);
> > +  gcall *abdl_stmt = gimple_build_call_internal (IFN_VEC_WIDEN_ABD, 2,
> > +                                                 abd_oprnd0, abd_oprnd1);
> > +  gimple_call_set_lhs (abdl_stmt, abdl_result);
> > +  gimple_set_location (abdl_stmt, gimple_location (last_stmt));
> > +  return abdl_stmt;
>
> I think “widen_abd” would be better than “abdl” here

Done.
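
For readers following the signedness condition quoted above, here is a small standalone C sketch of the case it appears to guard against, as I read it: a same-precision signed ABD result that is then widened. The helper names (abd8_bits, widen_as_signed, widen_as_unsigned) are hypothetical and are not part of the patch or of GCC. This is plain scalar arithmetic, not vectorizer code.

```c
#include <stdint.h>

/* Hypothetical helper: the raw 8-bit result of an 8-bit absolute
   difference.  The true magnitude is in 0..255, so it always fits
   in 8 bits when those bits are read as unsigned.  */
uint8_t abd8_bits(int8_t x, int8_t y)
{
    int diff = (int) x - (int) y;
    int mag = diff < 0 ? -diff : diff;
    return (uint8_t) mag;
}

/* Widen the 8 result bits to 16 bits, reading them as a signed value
   (manual sign extension, to avoid implementation-defined casts).  */
int16_t widen_as_signed(uint8_t bits)
{
    return bits >= 128 ? (int16_t) ((int) bits - 256) : (int16_t) bits;
}

/* Widen the 8 result bits to 16 bits, reading them as unsigned
   (zero extension).  */
int16_t widen_as_unsigned(uint8_t bits)
{
    return (int16_t) bits;
}
```

With x = 127 and y = -128 the difference is 255. Reading the 8 result bits as signed and then widening gives -1, whereas reading them as unsigned gives 255, which is the intended value. When the ABD result already has the wider precision, or the final conversion only changes the sign at the same precision, no such reinterpretation problem arises, which matches points (a) and (b) in the quoted review.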