From: Tamar Christina
To: Richard Biener
CC: "gcc-patches@gcc.gnu.org", nd, "ook@ucw.cz"
Subject: RE: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.
Date: Mon, 28 Sep 2020 14:56:19 +0000
References: <20200925142753.GA13692@arm.com>
List-Id: Gcc-patches mailing list

Hi Richi,

Thanks for the review!

Just some answers to your questions:

> -----Original Message-----
> From: rguenther@c653.arch.suse.de On Behalf Of Richard Biener
> Sent: Monday, September 28, 2020 1:37 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; ook@ucw.cz
> Subject: Re: [PATCH v2 3/16]middle-end Add basic SLP pattern matching
> scaffolding.
>
> On Fri, 25 Sep 2020, Tamar Christina wrote:
>
> > Hi All,
> >
> > This patch adds the basic infrastructure for doing pattern matching on
> > SLP trees.
> > This is done immediately after the SLP tree creation because it can
> > change the shape of the tree in radical ways and so we would like to
> > do it before any analysis is performed on the tree.
> >
> > A new file tree-vect-slp-patterns.c is added which contains all the
> > code for pattern matching on SLP trees.
> >
> > This cover letter is short because the changes are heavily commented.
> >
> > All pattern matchers need to implement the abstract type
> > VectPatternMatch.
> > The VectSimplePatternMatch abstract class provides some default
> > functionality for pattern matchers that need to rebuild nodes.
> >
> > The pattern matcher requires if replacing a statement in a node, that
> > ALL statements be replaced.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
>
> +  gcall *build ()
> +  {
> +    stmt_vec_info stmt_info;
> +
>
> please define functions out-of-line (apart from the 1-liners)
>
> +  /* We have to explicitly mark the old statement as unused because during
> +     statement analysis the original and new pattern statement may require
> +     different level of unrolling.  As an example add/sub when vectorized
> +     without a pattern requires 4 copies, whereas with a COMPLEX_ADD pattern
> +     this only requires 2 copies and the two statement will be treated as
> +     hand unrolled.  That means that the analysis won't happen as it'll find
> +     a mismatch.  So we don't analyze the old statement and if we end up
> +     needing it, e.g. SLP fails then we have to quickly re-analyze it.
> +     */
> +  STMT_VINFO_RELEVANT (stmt_info) = vect_unused_in_scope;
> +  STMT_VINFO_SLP_VECT_ONLY (call_stmt_info) = true;
> +  STMT_VINFO_RELATED_STMT (call_stmt_info) = stmt_info;
>
> so this means all uses have to be inside the pattern as otherwise there may
> be even non-SLP uses.  vect_mark_pattern_stmts supports detecting
> patterns of patterns, I suppose the two-phase analysis for SLP patterns does
> not support this right now?
>
> +  SLP_TREE_CODE (this->m_node) = gimple_expr_code (call_stmt);;
>
> double ;, just make it CALL_EXPR literally (or leave it ERROR_MARK)
>
> You seem to do in-place changing of the SLP node you match off?

Yes since this would allow me to change the root node as well, though
thinking about it I can probably do it by passing it as a reference which
then would allow me to re-use vect_create_new_slp_node which is probably
preferable.

>
> @@ -2192,6 +2378,17 @@ vect_analyze_slp_instance (vec_info *vinfo,
>                                &tree_size, bst_map);
>    if (node != NULL)
>      {
> +      /* Temporarily allow add_stmt calls again.  */
> +      vinfo->stmt_vec_info_ro = false;
> +
> +      /* See if any patterns can be found in the constructed SLP tree
> +         before we do any analysis on it.  */
> +      vect_match_slp_patterns (node, vinfo, group_size, &max_nunits,
> +                               matches, &npermutes, &tree_size,
> +                               bst_map);
> +
> +      /* After this no more add_stmt calls are allowed.  */
> +      vinfo->stmt_vec_info_ro = true;
> +
>
> I think this is a bit early to match patterns - I'd defer it to the point
> where all entries into the same SLP subgraph are analyzed, thus somewhere at
> the end of vect_analyze_slp loop over all instances and match patterns?  That
> way phases are more clearly separated.

That would probably work, my only worry is that the SLP analysis itself may
fail and bail out at

      /* If the loads and stores can be handled with load/store-lane
         instructions do not generate this SLP instance.  */
      if (is_a (vinfo)
          && loads_permuted
          && dr && vect_store_lanes_supported (vectype, group_size, false))

Which in the initial tree may be true, but in the patterned tree may not be.
In the previous revision of the patch you had suggested I return a boolean
which can be used to cancel such checks.

Would that be the preferred approach?

>
> Note that fiddling with vinfo->stmt_vec_info_ro is a bit ugly, maybe add a
> ->add_pattern_stmt (gimple *pattern_stmt, stmt_vec_info orig_stmt) variant
> that also sets STMT_VINFO_RELATED_STMT but doesn't check !stmt_vec_info_ro.
> That could be used from tree-vect-patterns.c as well and we could set
> stmt_vec_info_ro earlier.
>
> +  VectPattern *pattern = patt_fn (node, vinfo);
> +  uint8_t n = pattern->get_arity ();
> +
> +  if (group_size % n != 0)
> +    {
> +      delete pattern;
>
> seems to require VectPattern allocation even upon failure, I suggest to
> return NULL then to avoid excessive allocations.
>
> +      if (!pattern->matches (stmt_infos, i))
> +        {
> +          /* We can only do replacements for entire groups, we must replace all
> +             statements in a node as the argument list/children may not have
> +             equal height then.  Operations that don't rewrite the arguments
> +             may be safe to do, so perhaps paramatrise it.  */
> +
> +          found_p = false;
>
> I find it a bit ugly to iterate over "unrolls" in the machinery rather than
> the individual pattern matcher which might have an easier and in particular
> cheaper job here.  Since you require all lanes to match the same pattern
> anyway.  Not sure if your later patches support say, mixing complex add with
> different rotate in the same SLP node.

It does, as the constraint only applies to one pattern matcher class handling
the entire node.
An example of such a case is

node 0x531a1f0 (max_nunits=2, refcnt=2)
        stmt 0 *_9 = _10;
        stmt 1 *_15 = _16;
        stmt 2 *_25 = _26;
        stmt 3 *_31 = _32;
        children 0x531a980
node 0x531a980 (max_nunits=2, refcnt=2)
        stmt 0 slp_patt_112 = .COMPLEX_ADD_ROT90 (_4, _14);
        stmt 1 slp_patt_111 = .COMPLEX_ADD_ROT90 (_12, _8);
        stmt 2 slp_patt_110 = .COMPLEX_ADD_ROT270 (_20, _30);
        stmt 3 slp_patt_109 = .COMPLEX_ADD_ROT270 (_28, _24);
        lane permutation { 0[0] 1[1] 1[2] 0[3] }
        children 0x5310680 0x530e040
node 0x5310680 (max_nunits=2, refcnt=4)
        stmt 0 _4 = *_3;
        stmt 1 _12 = *_11;
        stmt 2 _20 = *_19;
        stmt 3 _28 = *_27;
        load permutation { 0 1 2 3 }
node 0x530e040 (max_nunits=2, refcnt=2)
        stmt 0 _14 = *_13;
        stmt 1 _8 = *_7;
        stmt 2 _30 = *_29;
        stmt 3 _24 = *_23;
        load permutation { 0 1 2 3 }

though looking at the resulting assembly the code is incorrect,

.L6:
        ldr     q1, [x1, x3]
        ldr     q0, [x0, x3]
        fcadd   v0.2d, v0.2d, v1.2d, #270
        str     q0, [x2, x3]
        ldr     q1, [x5, x3]
        ldr     q0, [x6, x3]
        fcadd   v0.2d, v0.2d, v1.2d, #270
        str     q0, [x4, x3]
        add     x3, x3, 32
        cmp     x3, 1600
        bne     .L6
        ret

Which I assume is because SLP_TREE_REPRESENTATIVE is pointing to the rotate 270?

> Note the ultimate idea in the end is that a SLP node can, of
> course, be split into two [but at this point the vector type / unroll factor
> is not final so general splitting at vector boundary is not desired yet].
> The split can be undone for consumers by inserting a VEC_PERM node (which
> should semantically be a concat + select)
>
> +  tree type = gimple_expr_type (STMT_VINFO_STMT (stmt_info));
> +  tree vectype = get_vectype_for_scalar_type (vinfo, type, node);
>
> use
>
>   tree vectype = SLP_TREE_VECTYPE (node);
>
> generally avoid looking at scalar stmts, iff then look at
> SLP_TREE_REPRESENTATIVE - all lanes have uniform operations applied to
> (but the scalar stmts may not appear to do so!  the scalar stmts merely stand
> for their 'def').
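
To illustrate the uniformity point, here is a toy sketch in C++. It is not GCC code: ToyNode, Rot, lanes_uniform and emit_from_representative are hypothetical names made up for this note. It assumes the guess above is right, i.e. that code generation consults a single representative statement per node. Under that assumption a node mixing ROT90 and ROT270 lanes would come out with one rotation everywhere, which is what the two #270 fcadd instructions in the assembly look like.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for illustration only; none of these are GCC types.
enum class Rot { R90, R270 };

struct ToyNode {
  std::vector<Rot> lanes;      // operation each scalar lane asks for
  std::size_t representative;  // lane consulted during code generation
};

// True when every lane carries the same operation, the property the
// review says an SLP node is expected to have.
bool lanes_uniform(const ToyNode &n) {
  for (Rot r : n.lanes)
    if (r != n.lanes.front())
      return false;
  return true;
}

// Emit one operation per lane by looking only at the representative.
// This is correct for uniform nodes and silently wrong otherwise.
std::vector<Rot> emit_from_representative(const ToyNode &n) {
  assert(n.representative < n.lanes.size());
  return std::vector<Rot>(n.lanes.size(), n.lanes[n.representative]);
}
```

With lanes { R90, R90, R270, R270 } and the representative at lane 3, every emitted lane is R270, and lanes_uniform reports false for that node. A check of this kind, or splitting the node as suggested in the review, would be one way to avoid the mismatch in the toy model.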
>
> +  /* Perform recursive matching, it's important to do this after matching things
> +     in the current node as the matches here may re-order the nodes below it.
> +     As such the pattern that needs to be subsequently match may change.  */
> +
> +  if (SLP_TREE_CHILDREN (node).exists ()) {
> +    slp_tree child;
> +    FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
> +      found_rec_p |= vect_match_slp_patterns_2 (child, vinfo, group_size,
> +                                                 patt_fn, max_nunits, matches,
> +                                                 npermutes, tree_size, bst_map);
> +  }
>
>
> you definitely need a visited set - you are walking a graph and nodes can
> appear along multiple paths!
>
> +  vect_mark_slp_stmts_relevant (node);
>
> that walks the whole subgraph but if you need to do anything you at most
> want to touch the node itself, no?
>
> To make patterns-of-patterns viable you need to do all parts of the walk in
> post-order.  What breaks if you do ->matches/->validate in post-order?  I
> think that would be more future-proof.

You lose the ability to match the longest pattern.  As an example the complex
add and complex fma patterns overlap.  Right now I can try matching the fma
first and then the add.  But doing it in post order the fma would never match,
as the subtree would be too small and the add would always match.

Aside from that it makes it very difficult to rebuild the subtrees, as the SSA
names have changed (since build is already done in post order).  So right now
I can use e.g. _3, _4 etc, however if the patterns have already been applied I
would need to know what their replacements are, since build () would replace
them and you lose the ability to navigate by SSA name.

Regards,
Tamar

>
> Otherwise this looks like an OK overall design.
>
> Thanks for working on it!
>
> Richard.
>
>
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> >     * Makefile.in (tree-vect-slp-patterns.o): New.
> >     * doc/passes.texi: Update documentation.
> >     * tree-vect-slp.c (vect_match_slp_patterns_2, vect_match_slp_patterns):
> >     New.
> >     (vect_analyze_slp_instance): Call pattern matcher.
> >     * tree-vectorizer.h (class VectPatternMatch, class VectPattern): New.
> >     * tree-vect-slp-patterns.c: New file.
> >
> >
>
> --
> Richard Biener
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> Nuernberg, Germany; GF: Felix Imend
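
On the visited-set remark in the review: the SLP structure is a graph and not a tree, so a recursive walk can reach the same node along more than one path. A minimal sketch of a pre-order walk guarded by a visited set follows. It is a toy model under stated assumptions, not the patch's code: ToyNode, times_matched and match_walk are hypothetical names, and the "match" step is reduced to a counter.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy graph node; not GCC's slp_tree.
struct ToyNode {
  std::vector<ToyNode *> children;
  int times_matched = 0;  // how often the matcher ran on this node
};

// Pre-order walk: handle the current node first, then recurse.
// The visited set ensures a node reachable along several paths is
// processed exactly once.
void match_walk(ToyNode *node, std::unordered_set<ToyNode *> &visited) {
  assert(node != nullptr);
  if (!visited.insert(node).second)
    return;  // already handled via another path
  ++node->times_matched;
  for (ToyNode *child : node->children)
    match_walk(child, visited);
}
```

For a diamond-shaped graph (a root with two children that share one leaf) the shared leaf is matched once, where an unguarded recursion would run the matcher on it twice. The sketch keeps the pre-order shape of the quoted hunk; a post-order variant would simply move the match step after the loop over the children.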