From: Tamar Christina
To: Tamar Christina, Richard Biener
CC: nd, "gcc-patches@gcc.gnu.org", "ook@ucw.cz"
Subject: RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.
Date: Mon, 28 Sep 2020 17:24:03 +0000
References: <20200925142753.GA13692@arm.com>
List-Id: Gcc-patches mailing list

> -----Original Message-----
> From: Gcc-patches On Behalf Of Tamar Christina
> Sent: Monday, September 28, 2020 3:56 PM
> To: Richard Biener
> Cc: nd; gcc-patches@gcc.gnu.org; ook@ucw.cz
> Subject: RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching
> scaffolding.
>
> Hi Richi,
>
> Thanks for the review!
>
> Just some answers to your questions:
>
> > -----Original Message-----
> > From: rguenther@c653.arch.suse.de On Behalf Of Richard Biener
> > Sent: Monday, September 28, 2020 1:37 PM
> > To: Tamar Christina
> > Cc: gcc-patches@gcc.gnu.org; nd; ook@ucw.cz
> > Subject: Re: [PATCH v2 3/16]middle-end Add basic SLP pattern matching
> > scaffolding.
> >
> > On Fri, 25 Sep 2020, Tamar Christina wrote:
> >
> > > Hi All,
> > >
> > > This patch adds the basic infrastructure for doing pattern matching
> > > on SLP trees.
> > > This is done immediately after the SLP tree creation because it can
> > > change the shape of the tree in radical ways and so we would like to
> > > do it before any analysis is performed on the tree.
> > >
> > > A new file tree-vect-slp-patterns.c is added which contains all the
> > > code for pattern matching on SLP trees.
> > >
> > > This cover letter is short because the changes are heavily commented.
> > >
> > > All pattern matchers need to implement the abstract type
> > > VectPatternMatch.
> > > The VectSimplePatternMatch abstract class provides some default
> > > functionality for pattern matchers that need to rebuild nodes.
> > >
> > > The pattern matcher requires if replacing a statement in a node,
> > > that ALL statements be replaced.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> >
> > +    gcall *build ()
> > +    {
> > +      stmt_vec_info stmt_info;
> > +
> >
> > please define functions out-of-line (apart from the 1-liners)
> >
> > +      /* We have to explicitly mark the old statement as unused
> > +         because during statement analysis the original and new
> > +         pattern statement may require different level of unrolling.
> > +         As an example add/sub when vectorized without a pattern
> > +         requires 4 copies, whereas with a COMPLEX_ADD pattern this
> > +         only requires 2 copies and the two statement will be treated
> > +         as hand unrolled.
> > +         That means that the analysis won't happen as it'll find
> > +         a mismatch.  So we don't analyze the old statement and if we
> > +         end up needing it, e.g. SLP fails then we have to quickly
> > +         re-analyze it.  */
> > +      STMT_VINFO_RELEVANT (stmt_info) = vect_unused_in_scope;
> > +      STMT_VINFO_SLP_VECT_ONLY (call_stmt_info) = true;
> > +      STMT_VINFO_RELATED_STMT (call_stmt_info) = stmt_info;
> >
> > so this means all uses have to be inside the pattern as otherwise
> > there may be even non-SLP uses.  vect_mark_pattern_stmts supports
> > detecting patterns of patterns, I suppose the two-phase analysis for
> > SLP patterns does not support this right now?
> >
> > +      SLP_TREE_CODE (this->m_node) = gimple_expr_code (call_stmt);;
> >
> > double ;, just make it CALL_EXPR literally (or leave it ERROR_MARK)
> >
> > You seem to do in-place changing of the SLP node you match off?
>
> Yes since this would allow me to change the root node as well, though
> thinking about it I can probably do it by passing it as a reference which
> then would allow me to re-use vect_create_new_slp_node which is probably
> preferable.
>
> >
> > @@ -2192,6 +2378,17 @@ vect_analyze_slp_instance (vec_info *vinfo,
> >                               &tree_size, bst_map);
> >    if (node != NULL)
> >      {
> > +      /* Temporarily allow add_stmt calls again.  */
> > +      vinfo->stmt_vec_info_ro = false;
> > +
> > +      /* See if any patterns can be found in the constructed SLP tree
> > +         before we do any analysis on it.  */
> > +      vect_match_slp_patterns (node, vinfo, group_size, &max_nunits,
> > +                               matches, &npermutes, &tree_size,
> > +                               bst_map);
> > +
> > +      /* After this no more add_stmt calls are allowed.  */
> > +      vinfo->stmt_vec_info_ro = true;
> > +
> >
> > I think this is a bit early to match patterns - I'd defer it to the
> > point where all entries into the same SLP subgraph are analyzed, thus
> > somewhere at the end of vect_analyze_slp loop over all instances and
> > match patterns?
> > That way phases are more clearly separated.
>
> That would probably work, my only worry is that the SLP analysis itself
> may fail and bail out at
>
>       /* If the loads and stores can be handled with load/store-lane
>          instructions do not generate this SLP instance.  */
>       if (is_a <loop_vec_info> (vinfo)
>           && loads_permuted
>           && dr && vect_store_lanes_supported (vectype, group_size,
>                                                false))
>
> Which in the initial tree may be true, but in the patterned tree may not
> be.
> In the previous revision of the patch you had suggested I return a boolean
> which can be used to cancel such checks.  Would that be the preferred
> approach?
>
> >
> > Note that fiddling with vinfo->stmt_vec_info_ro is a bit ugly, maybe
> > add a ->add_pattern_stmt (gimple *pattern_stmt, stmt_vec_info
> > orig_stmt) variant that also sets STMT_VINFO_RELATED_STMT but doesn't
> > check !stmt_vec_info_ro.  That could be used from tree-vect-patterns.c
> > as well and we could set stmt_vec_info_ro earlier.
> >
> > +  VectPattern *pattern = patt_fn (node, vinfo);
> > +  uint8_t n = pattern->get_arity ();
> > +
> > +  if (group_size % n != 0)
> > +    {
> > +      delete pattern;
> >
> > seems to require VectPattern allocation even upon failure, I suggest
> > to return NULL then to avoid excessive allocations.
> >
> > +      if (!pattern->matches (stmt_infos, i))
> > +        {
> > +          /* We can only do replacements for entire groups, we must
> > +             replace all statements in a node as the argument
> > +             list/children may not have equal height then.  Operations
> > +             that don't rewrite the arguments may be safe to do, so
> > +             perhaps paramatrise it.  */
> > +
> > +          found_p = false;
> >
> > I find it a bit ugly to iterate over "unrolls" in the machinery rather
> > than the individual pattern matcher which might have an easier and in
> > particular cheaper job here.  Since you require
> > all lanes to match the same pattern anyway.
> > Not sure if your
> > later patches support say, mixing complex add with different rotate in
> > the same SLP node.
>
> It does, as the constraint only applies to one pattern matcher class
> handling the entire node.
>
> An example of such case is
>
> node 0x531a1f0 (max_nunits=2, refcnt=2)
>   stmt 0 *_9 = _10;
>   stmt 1 *_15 = _16;
>   stmt 2 *_25 = _26;
>   stmt 3 *_31 = _32;
>   children 0x531a980
> node 0x531a980 (max_nunits=2, refcnt=2)
>   stmt 0 slp_patt_112 = .COMPLEX_ADD_ROT90 (_4, _14);
>   stmt 1 slp_patt_111 = .COMPLEX_ADD_ROT90 (_12, _8);
>   stmt 2 slp_patt_110 = .COMPLEX_ADD_ROT270 (_20, _30);
>   stmt 3 slp_patt_109 = .COMPLEX_ADD_ROT270 (_28, _24);
>   lane permutation { 0[0] 1[1] 1[2] 0[3] }
>   children 0x5310680 0x530e040
> node 0x5310680 (max_nunits=2, refcnt=4)
>   stmt 0 _4 = *_3;
>   stmt 1 _12 = *_11;
>   stmt 2 _20 = *_19;
>   stmt 3 _28 = *_27;
>   load permutation { 0 1 2 3 }
> node 0x530e040 (max_nunits=2, refcnt=2)
>   stmt 0 _14 = *_13;
>   stmt 1 _8 = *_7;
>   stmt 2 _30 = *_29;
>   stmt 3 _24 = *_23;
>   load permutation { 0 1 2 3 }
>
> though looking at the resulting assembly the code is incorrect,
>
> .L6:
>         ldr     q1, [x1, x3]
>         ldr     q0, [x0, x3]
>         fcadd   v0.2d, v0.2d, v1.2d, #270
>         str     q0, [x2, x3]
>         ldr     q1, [x5, x3]
>         ldr     q0, [x6, x3]
>         fcadd   v0.2d, v0.2d, v1.2d, #270
>         str     q0, [x4, x3]
>         add     x3, x3, 32
>         cmp     x3, 1600
>         bne     .L6
>         ret
>
> Which I assume is because SLP_TREE_REPRESENTATIVE is pointing to the
> rotate 270?
>
> > Note the ultimate idea in the end is that a SLP node can, of course,
> > be split into two [but at this point the vector type / unroll factor
> > is not final so general splitting at vector boundary is not desired yet].
> > The split can be undone for consumers by inserting a VEC_PERM node
> > (which should semantically be a concat + select)
> >
> > +      tree type = gimple_expr_type (STMT_VINFO_STMT (stmt_info));
> > +      tree vectype = get_vectype_for_scalar_type (vinfo, type, node);
> >
> > use
> >
> >   tree vectype = SLP_TREE_VECTYPE (node);
> >
> > generally avoid looking at scalar stmts, iff then look at
> > SLP_TREE_REPRESENTATIVE - all lanes have uniform operations applied to
> > (but the scalar stmts may not appear to do so!  the scalar stmts
> > merely stand for their 'def').
> >
> > +  /* Perform recursive matching, it's important to do this after
> > +     matching things in the current node as the matches here may
> > +     re-order the nodes below it.  As such the pattern that needs to
> > +     be subsequently match may change.  */
> > +
> > +  if (SLP_TREE_CHILDREN (node).exists ()) {
> > +    slp_tree child;
> > +    FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
> > +      found_rec_p |= vect_match_slp_patterns_2 (child, vinfo, group_size,
> > +                                                patt_fn, max_nunits, matches,
> > +                                                npermutes, tree_size, bst_map);
> > +  }
> >
> >
> > you definitely need a visited set - you are walking a graph and nodes
> > can appear along multiple paths!
> >
> > +  vect_mark_slp_stmts_relevant (node);
> >
> > that walks the whole subgraph but if you need to do anything you at
> > most want to touch the node itself, no?
> >
> > To make patterns-of-patterns viable you need to do all parts of the
> > walk in post-order.  What breaks if you do ->matches/->validate in
> > post-order?  I think that would be more future-proof.
>
> You lose the ability to match the longest pattern.  As an example the
> complex add and complex fma patterns overlap.  Right now I can try
> matching the fma first and then add.
> But doing it in post order the fma would never match as the subtree would
> be too small and the add would always match.
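
A minimal sketch of the kind of walk under discussion, assuming a toy
Node type rather than GCC's slp_tree.  Every name here (Node, Matcher,
match_fma, match_add, walk_postorder, match_all) is hypothetical and only
illustrates the idea: one post-order walk per pattern, with a visited set
so that a node reachable along several paths is processed once, and with
the longer pattern tried over the whole graph before the shorter one.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an SLP node; this is not GCC's slp_tree.
struct Node
{
  std::string op;                // operation applied to every lane
  std::vector<Node *> children;  // operands; a node may have several parents
};

// A matcher rewrites a node in place and reports whether it applied.
using Matcher = std::function<bool (Node &)>;

// Toy "long" pattern: add (mul (a, b), c) becomes fma (a, b, c).
bool
match_fma (Node &n)
{
  if (n.op != "add" || n.children.size () != 2
      || n.children[0]->op != "mul" || n.children[0]->children.size () != 2)
    return false;
  Node *mul = n.children[0];
  n.children = { mul->children[0], mul->children[1], n.children[1] };
  n.op = "fma";
  return true;
}

// Toy "short" pattern: any remaining add becomes cadd.
bool
match_add (Node &n)
{
  if (n.op != "add")
    return false;
  n.op = "cadd";
  return true;
}

// Post-order walk for a single matcher.  VISITED keeps a node that is
// reachable along several paths from being processed twice.
std::size_t
walk_postorder (Node *node, const Matcher &matcher,
                std::unordered_set<Node *> &visited)
{
  if (!node || !visited.insert (node).second)
    return 0;
  std::size_t rewritten = 0;
  for (Node *child : node->children)
    rewritten += walk_postorder (child, matcher, visited);
  if (matcher (*node))
    ++rewritten;
  return rewritten;
}

// One full walk per matcher, in the order given (longest pattern first).
std::size_t
match_all (Node *root, const std::vector<Matcher> &matchers)
{
  std::size_t rewritten = 0;
  for (const Matcher &m : matchers)
    {
      std::unordered_set<Node *> visited;  // fresh set for each walk
      rewritten += walk_postorder (root, m, visited);
    }
  return rewritten;
}
```

With the matchers ordered { match_fma, match_add }, an add fed by a mul
is rewritten to fma during the first walk, and only the adds left over
are turned into cadd during the second.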
>

Oops, I forgot that this new version tries the same pattern over the
entire tree, so it should work.

You would only lose the ability to navigate by SSA name, but it doesn't
need that anyway.  I'll make that change and see.

Thanks,
Tamar

> Aside from that it makes it very difficult to rebuild the subtrees as the
> SSA names have changed (since build is already done in post order), So
> right now I can use e.g. _3, _4 etc, however if the patterns have already
> been applied I would need to know what their replacements are since
> build () would replace them and you lose the ability to navigate by SSA
> name.
>
> Regards,
> Tamar
>
> >
> > Otherwise this looks like an OK overall design.
> >
> > Thanks for working on it!
> >
> > Richard.
> >
> >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > > 	* Makefile.in (tree-vect-slp-patterns.o): New.
> > > 	* doc/passes.texi: Update documentation.
> > > 	* tree-vect-slp.c (vect_match_slp_patterns_2,
> > > 	vect_match_slp_patterns): New.
> > > 	(vect_analyze_slp_instance): Call pattern matcher.
> > > 	* tree-vectorizer.h (class VectPatternMatch, class VectPattern): New.
> > > 	* tree-vect-slp-patterns.c: New file.
> > >
> > >
> >
> > --
> > Richard Biener
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > Nuernberg, Germany; GF: Felix Imend