From: Tamar Christina
To: Richard Biener
CC: "gcc-patches@gcc.gnu.org", nd, "jlaw@ventanamicro.com"
Subject: RE: [PATCH 1/3]middle-end: Refactor vectorizer loop conditionals and separate out IV to new variables
Date: Wed, 11 Oct 2023 10:45:05 +0000

> > @@ -2664,7 +2679,7 @@ slpeel_update_phi_nodes_for_loops (loop_vec_info loop_vinfo,
> >       for correct vectorization of live stmts.  */
> >    if (loop == first)
> >      {
> > -      basic_block orig_exit = single_exit (second)->dest;
> > +      basic_block orig_exit = second_loop_e->dest;
> >        for (gsi_orig = gsi_start_phis (orig_exit);
> >             !gsi_end_p (gsi_orig); gsi_next (&gsi_orig))
> >          {
> > @@ -2673,13 +2688,14 @@ slpeel_update_phi_nodes_for_loops (loop_vec_info loop_vinfo,
> >            if (TREE_CODE (orig_arg) != SSA_NAME || virtual_operand_p (orig_arg))
> >              continue;
> >
> > +          const_edge exit_e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> >            /* Already created in the above loop.  */
> > -          if (find_guard_arg (first, second, orig_phi))
> > +          if (find_guard_arg (first, second, exit_e, orig_phi))
> >              continue;
> >
> >            tree new_res = copy_ssa_name (orig_arg);
> >            gphi *lcphi = create_phi_node (new_res, between_bb);
> > -          add_phi_arg (lcphi, orig_arg, single_exit (first), UNKNOWN_LOCATION);
> > +          add_phi_arg (lcphi, orig_arg, first_loop_e, UNKNOWN_LOCATION);
> >          }
> >      }
> >  }
> > @@ -2847,7 +2863,8 @@ slpeel_update_phi_nodes_for_guard2 (class loop *loop, class loop *epilog,
> >        if (!merge_arg)
> >          merge_arg = old_arg;
> >
> > -      tree guard_arg = find_guard_arg (loop, epilog, update_phi);
> > +      tree guard_arg
> > +        = find_guard_arg (loop, epilog, single_exit (loop), update_phi);
>
> missed adjustment?
> you are introducing a single_exit call here ...
>

It's a very temporary one that gets removed in patch 3/3, when I start passing
the rest of the edges down explicitly.  It allowed me to split the patches a
bit more.

> >        /* If the var is live after loop but not a reduction, we simply
> >           use the old arg.  */
> >        if (!guard_arg)
> > @@ -3201,27 +3218,37 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> >      }
> >
> >    if (vect_epilogues)
> > -    /* Make sure to set the epilogue's epilogue scalar loop, such that we can
> > -       use the original scalar loop as remaining epilogue if necessary.  */
> > -    LOOP_VINFO_SCALAR_LOOP (epilogue_vinfo)
> > -      = LOOP_VINFO_SCALAR_LOOP (loop_vinfo);
> > +    {
> > +      /* Make sure to set the epilogue's epilogue scalar loop, such that we can
> > +         use the original scalar loop as remaining epilogue if necessary.  */
> > +      LOOP_VINFO_SCALAR_LOOP (epilogue_vinfo)
> > +        = LOOP_VINFO_SCALAR_LOOP (loop_vinfo);
> > +      LOOP_VINFO_SCALAR_IV_EXIT (epilogue_vinfo)
> > +        = LOOP_VINFO_SCALAR_IV_EXIT (loop_vinfo);
> > +    }
> >
> >    if (prolog_peeling)
> >      {
> >        e = loop_preheader_edge (loop);
> > -      gcc_checking_assert (slpeel_can_duplicate_loop_p (loop, e));
> > +      edge exit_e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > +      gcc_checking_assert (slpeel_can_duplicate_loop_p (loop, exit_e,
> > +                                                        e));
> >
> >        /* Peel prolog and put it on preheader edge of loop.  */
> > -      prolog = slpeel_tree_duplicate_loop_to_edge_cfg (loop, scalar_loop, e);
> > +      edge scalar_e = LOOP_VINFO_SCALAR_IV_EXIT (loop_vinfo);
> > +      edge prolog_e = NULL;
> > +      prolog = slpeel_tree_duplicate_loop_to_edge_cfg (loop, exit_e,
> > +                                                       scalar_loop, scalar_e,
> > +                                                       e, &prolog_e);
> >        gcc_assert (prolog);
> >        prolog->force_vectorize = false;
> > -      slpeel_update_phi_nodes_for_loops (loop_vinfo, prolog, loop, true);
> > +      slpeel_update_phi_nodes_for_loops (loop_vinfo, prolog, prolog_e, loop,
> > +                                         exit_e, true);
> >        first_loop = prolog;
> >        reset_original_copy_tables ();
> >
> >        /* Update the number of iterations for prolog loop.  */
> >        tree step_prolog = build_one_cst (TREE_TYPE (niters_prolog));
> > -      vect_set_loop_condition (prolog, NULL, niters_prolog,
> > +      vect_set_loop_condition (prolog, prolog_e, loop_vinfo,
> > +                               niters_prolog,
> >                                 step_prolog, NULL_TREE, false);
> >
> >        /* Skip the prolog loop.  */
> > @@ -3275,8 +3302,8 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> >
> >    if (epilog_peeling)
> >      {
> > -      e = single_exit (loop);
> > -      gcc_checking_assert (slpeel_can_duplicate_loop_p (loop, e));
> > +      e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > +      gcc_checking_assert (slpeel_can_duplicate_loop_p (loop, e, e));
> >
> >        /* Peel epilog and put it on exit edge of loop.  If we are vectorizing
> >           said epilog then we should use a copy of the main loop as a starting
> > @@ -3285,12 +3312,18 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> >           If we are not vectorizing the epilog then we should use the scalar loop
> >           as the transformations mentioned above make less or no sense when not
> >           vectorizing.  */
> > +      edge scalar_e = LOOP_VINFO_SCALAR_IV_EXIT (loop_vinfo);
> >        epilog = vect_epilogues ? get_loop_copy (loop) : scalar_loop;
> > -      epilog = slpeel_tree_duplicate_loop_to_edge_cfg (loop, epilog, e);
> > +      edge epilog_e = vect_epilogues ? e : scalar_e;
> > +      edge new_epilog_e = NULL;
> > +      epilog = slpeel_tree_duplicate_loop_to_edge_cfg (loop, e, epilog,
> > +                                                       epilog_e, e,
> > +                                                       &new_epilog_e);
> > +      LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo) = new_epilog_e;
> >        gcc_assert (epilog);
> > -
> >        epilog->force_vectorize = false;
> > -      slpeel_update_phi_nodes_for_loops (loop_vinfo, loop, epilog, false);
> > +      slpeel_update_phi_nodes_for_loops (loop_vinfo, loop, e, epilog,
> > +                                         new_epilog_e, false);
> >        bb_before_epilog = loop_preheader_edge (epilog)->src;
> >
> >        /* Scalar version loop may be preferred.  In this case, add guard
> > @@ -3374,16 +3407,16 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> >          {
> >            guard_cond = fold_build2 (EQ_EXPR, boolean_type_node,
> >                                      niters, niters_vector_mult_vf);
> > -          guard_bb = single_exit (loop)->dest;
> > -          guard_to = split_edge (single_exit (epilog));
> > +          guard_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
> > +          edge epilog_e = LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo);
> > +          guard_to = split_edge (epilog_e);
> >            guard_e = slpeel_add_loop_guard (guard_bb, guard_cond, guard_to,
> >                                             skip_vector ? anchor : guard_bb,
> >                                             prob_epilog.invert (),
> >                                             irred_flag);
> >            if (vect_epilogues)
> >              epilogue_vinfo->skip_this_loop_edge = guard_e;
> > -          slpeel_update_phi_nodes_for_guard2 (loop, epilog, guard_e,
> > -                                              single_exit (epilog));
> > +          slpeel_update_phi_nodes_for_guard2 (loop, epilog, guard_e, epilog_e);
> >            /* Only need to handle basic block before epilog loop if it's not
> >               the guard_bb, which is the case when skip_vector is true.  */
> >            if (guard_bb != bb_before_epilog)
> > @@ -3416,6 +3449,8 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> >      {
> >        epilog->aux = epilogue_vinfo;
> >        LOOP_VINFO_LOOP (epilogue_vinfo) = epilog;
> > +      LOOP_VINFO_IV_EXIT (epilogue_vinfo)
> > +        = LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo);
> >
> >        loop_constraint_clear (epilog, LOOP_C_INFINITE);
> >
> > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> > index 23c6e8259e7b133cd7acc6bcf0bad26423e9993a..6e60d84143626a8e1d801bb580f4dcebc73c7ba7 100644
> > --- a/gcc/tree-vect-loop.cc
> > +++ b/gcc/tree-vect-loop.cc
> > @@ -855,10 +855,9 @@ vect_fixup_scalar_cycles_with_patterns (loop_vec_info loop_vinfo)
> >
> >
> >  static gcond *
> > -vect_get_loop_niters (class loop *loop, tree *assumptions,
> > +vect_get_loop_niters (class loop *loop, edge exit, tree *assumptions,
> >                        tree *number_of_iterations, tree *number_of_iterationsm1)
> >  {
> > -  edge exit = single_exit (loop);
> >    class tree_niter_desc niter_desc;
> >    tree niter_assumptions, niter, may_be_zero;
> >    gcond *cond = get_loop_exit_condition (loop);
> > @@ -927,6 +926,20 @@ vect_get_loop_niters (class loop *loop, tree *assumptions,
> >    return cond;
> >  }
> >
> > +/* Determine the main loop exit for the vectorizer.  */
> > +
> > +edge
>
> can't this be 'static'?

No, since it's also used by set_uid_loop_bbs, which sets the scalar loop from
the result of get_loop.  If I understand correctly, the loop expected from
this is the ifcvt loop?  If that's the case I may be able to match it up
through the ->aux again, but since set_uid_loop_bbs isn't called often I
figure I can just re-analyze.

Regards,
Tamar

>
> > +vec_init_loop_exit_info (class loop *loop)
> > +{
> > +  /* Before we begin we must first determine which exit is the main one and
> > +     which are auxilary exits.  */
> > +  auto_vec<edge> exits = get_loop_exit_edges (loop);
> > +  if (exits.length () == 1)
> > +    return exits[0];
> > +  else
> > +    return NULL;
> > +}
> > +
> >  /* Function bb_in_loop_p
> >
> >     Used as predicate for dfs order traversal of the loop bbs.  */
> > @@ -987,7 +1000,10 @@ _loop_vec_info::_loop_vec_info (class loop *loop_in, vec_info_shared *shared)
> >      has_mask_store (false),
> >      scalar_loop_scaling (profile_probability::uninitialized ()),
> >      scalar_loop (NULL),
> > -    orig_loop_info (NULL)
> > +    orig_loop_info (NULL),
> > +    vec_loop_iv (NULL),
> > +    vec_epilogue_loop_iv (NULL),
> > +    scalar_loop_iv (NULL)
> >  {
> >    /* CHECKME: We want to visit all BBs before their successors (except for
> >       latch blocks, for which this assertion wouldn't hold).  In the simple
> > @@ -1646,6 +1662,18 @@ vect_analyze_loop_form (class loop *loop, vect_loop_form_info *info)
> >  {
> >    DUMP_VECT_SCOPE ("vect_analyze_loop_form");
> >
> > +  edge exit_e = vec_init_loop_exit_info (loop);
> > +  if (!exit_e)
> > +    return opt_result::failure_at (vect_location,
> > +                                   "not vectorized:"
> > +                                   " could not determine main exit from"
> > +                                   " loop with multiple exits.\n");
> > +  info->loop_exit = exit_e;
> > +  if (dump_enabled_p ())
> > +    dump_printf_loc (MSG_NOTE, vect_location,
> > +                     "using as main loop exit: %d -> %d [AUX: %p]\n",
> > +                     exit_e->src->index, exit_e->dest->index, exit_e->aux);
> > +
> >    /* Different restrictions apply when we are considering an inner-most loop,
> >       vs. an outer (nested) loop.
> >       (FORNOW. May want to relax some of these restrictions in the future).  */
> > @@ -1767,7 +1795,7 @@ vect_analyze_loop_form (class loop *loop, vect_loop_form_info *info)
> >                                     " abnormal loop exit edge.\n");
> >
> >    info->loop_cond
> > -    = vect_get_loop_niters (loop, &info->assumptions,
> > +    = vect_get_loop_niters (loop, e, &info->assumptions,
> >                              &info->number_of_iterations,
> >                              &info->number_of_iterationsm1);
> >    if (!info->loop_cond)
> > @@ -1821,6 +1849,9 @@ vect_create_loop_vinfo (class loop *loop, vec_info_shared *shared,
> >
> >    stmt_vec_info loop_cond_info = loop_vinfo->lookup_stmt (info->loop_cond);
> >    STMT_VINFO_TYPE (loop_cond_info) = loop_exit_ctrl_vec_info_type;
> > +
> > +  LOOP_VINFO_IV_EXIT (loop_vinfo) = info->loop_exit;
> > +
> >    if (info->inner_loop_cond)
> >      {
> >        stmt_vec_info inner_loop_cond_info
> > @@ -3063,9 +3094,9 @@ start_over:
> >        if (dump_enabled_p ())
> >          dump_printf_loc (MSG_NOTE, vect_location, "epilog loop required\n");
> >        if (!vect_can_advance_ivs_p (loop_vinfo)
> > -          || !slpeel_can_duplicate_loop_p (LOOP_VINFO_LOOP (loop_vinfo),
> > -                                           single_exit (LOOP_VINFO_LOOP
> > -                                                        (loop_vinfo))))
> > +          || !slpeel_can_duplicate_loop_p (loop,
> > +                                           LOOP_VINFO_IV_EXIT (loop_vinfo),
> > +                                           LOOP_VINFO_IV_EXIT (loop_vinfo)))
> >          {
> >            ok = opt_result::failure_at (vect_location,
> >                                         "not vectorized: can't create required "
> > @@ -6002,7 +6033,7 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
> >       Store them in NEW_PHIS.  */
> >    if (double_reduc)
> >      loop = outer_loop;
> > -  exit_bb = single_exit (loop)->dest;
> > +  exit_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
> >    exit_gsi = gsi_after_labels (exit_bb);
> >    reduc_inputs.create (slp_node ? vec_num : ncopies);
> >    for (unsigned i = 0; i < vec_num; i++)
> > @@ -6018,7 +6049,7 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
> >            phi = create_phi_node (new_def, exit_bb);
> >            if (j)
> >              def = gimple_get_lhs (STMT_VINFO_VEC_STMTS (rdef_info)[j]);
> > -          SET_PHI_ARG_DEF (phi, single_exit (loop)->dest_idx, def);
> > +          SET_PHI_ARG_DEF (phi, LOOP_VINFO_IV_EXIT (loop_vinfo)->dest_idx, def);
> >            new_def = gimple_convert (&stmts, vectype, new_def);
> >            reduc_inputs.quick_push (new_def);
> >          }
> > @@ -10416,12 +10447,12 @@ vectorizable_live_operation (vec_info *vinfo, stmt_vec_info stmt_info,
> >             lhs' = new_tree;  */
> >
> >        class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> > -      basic_block exit_bb = single_exit (loop)->dest;
> > +      basic_block exit_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
> >        gcc_assert (single_pred_p (exit_bb));
> >
> >        tree vec_lhs_phi = copy_ssa_name (vec_lhs);
> >        gimple *phi = create_phi_node (vec_lhs_phi, exit_bb);
> > -      SET_PHI_ARG_DEF (phi, single_exit (loop)->dest_idx, vec_lhs);
> > +      SET_PHI_ARG_DEF (phi, LOOP_VINFO_IV_EXIT
> > +                       (loop_vinfo)->dest_idx, vec_lhs);
> >
> >        gimple_seq stmts = NULL;
> >        tree new_tree;
> > @@ -10965,7 +10996,7 @@ vect_get_loop_len (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi,
> >     profile.  */
> >
> >  static void
> > -scale_profile_for_vect_loop (class loop *loop, unsigned vf, bool flat)
> > +scale_profile_for_vect_loop (class loop *loop, edge exit_e, unsigned vf, bool flat)
> >  {
> >    /* For flat profiles do not scale down proportionally by VF and only
> >       cap by known iteration count bounds.  */
> > @@ -10980,7 +11011,6 @@ scale_profile_for_vect_loop (class loop *loop, unsigned vf, bool flat)
> >        return;
> >      }
> >    /* Loop body executes VF fewer times and exit increases VF times.
> >    */
> > -  edge exit_e = single_exit (loop);
> >    profile_count entry_count = loop_preheader_edge (loop)->count ();
> >
> >    /* If we have unreliable loop profile avoid dropping entry
> > @@ -11350,7 +11380,7 @@ vect_transform_loop (loop_vec_info loop_vinfo, gimple *loop_vectorized_call)
> >
> >    /* Make sure there exists a single-predecessor exit bb.  Do this before
> >       versioning.  */
> > -  edge e = single_exit (loop);
> > +  edge e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> >    if (! single_pred_p (e->dest))
> >      {
> >        split_loop_exit_edge (e, true);
> > @@ -11376,7 +11406,7 @@ vect_transform_loop (loop_vec_info loop_vinfo, gimple *loop_vectorized_call)
> >       loop closed PHI nodes on the exit.  */
> >    if (LOOP_VINFO_SCALAR_LOOP (loop_vinfo))
> >      {
> > -      e = single_exit (LOOP_VINFO_SCALAR_LOOP (loop_vinfo));
> > +      e = LOOP_VINFO_SCALAR_IV_EXIT (loop_vinfo);
> >        if (! single_pred_p (e->dest))
> >          {
> >            split_loop_exit_edge (e, true);
> > @@ -11625,8 +11655,9 @@ vect_transform_loop (loop_vec_info loop_vinfo, gimple *loop_vectorized_call)
> >       a zero NITERS becomes a nonzero NITERS_VECTOR.  */
> >    if (integer_onep (step_vector))
> >      niters_no_overflow = true;
> > -  vect_set_loop_condition (loop, loop_vinfo, niters_vector, step_vector,
> > -                           niters_vector_mult_vf, !niters_no_overflow);
> > +  vect_set_loop_condition (loop, LOOP_VINFO_IV_EXIT (loop_vinfo), loop_vinfo,
> > +                           niters_vector, step_vector, niters_vector_mult_vf,
> > +                           !niters_no_overflow);
> >
> >    unsigned int assumed_vf = vect_vf_for_cost (loop_vinfo);
> >
> > @@ -11699,7 +11730,8 @@ vect_transform_loop (loop_vec_info loop_vinfo, gimple *loop_vectorized_call)
> >                            assumed_vf) - 1
> >          : wi::udiv_floor (loop->nb_iterations_estimate + bias_for_assumed,
> >                            assumed_vf) - 1);
> > -  scale_profile_for_vect_loop (loop, assumed_vf, flat);
> > +  scale_profile_for_vect_loop (loop, LOOP_VINFO_IV_EXIT (loop_vinfo),
> > +                               assumed_vf, flat);
> >
> >    if (dump_enabled_p ())
> >      {
> > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > index f1d0cd79961abb095bc79d3b59a81930f0337e59..afa7a8e30891c782a0e5e3740ecc4377f5a31e54 100644
> > --- a/gcc/tree-vectorizer.h
> > +++ b/gcc/tree-vectorizer.h
> > @@ -919,10 +919,24 @@ public:
> >       analysis.  */
> >    vec<_loop_vec_info *> epilogue_vinfos;
> >
> > +  /* The controlling loop IV for the current loop when vectorizing.  This IV
> > +     controls the natural exits of the loop.  */
> > +  edge vec_loop_iv;
> > +
> > +  /* The controlling loop IV for the epilogue loop when vectorizing.  This IV
> > +     controls the natural exits of the loop.  */
> > +  edge vec_epilogue_loop_iv;
> > +
> > +  /* The controlling loop IV for the scalar loop being vectorized.  This IV
> > +     controls the natural exits of the loop.  */
> > +  edge scalar_loop_iv;
>
> all of the above sound as if they were IVs, the access macros have _EXIT at the
> end, can you make the above as well?
>
> Otherwise looks good to me.
>
> Feel free to push approved patches of the series, no need to wait until
> everything is approved.
>
> Thanks,
> Richard.
>
> >  } *loop_vec_info;
> >
> >  /* Access Functions.  */
> >  #define LOOP_VINFO_LOOP(L)                 (L)->loop
> > +#define LOOP_VINFO_IV_EXIT(L)              (L)->vec_loop_iv
> > +#define LOOP_VINFO_EPILOGUE_IV_EXIT(L)     (L)->vec_epilogue_loop_iv
> > +#define LOOP_VINFO_SCALAR_IV_EXIT(L)       (L)->scalar_loop_iv
> >  #define LOOP_VINFO_BBS(L)                  (L)->bbs
> >  #define LOOP_VINFO_NITERSM1(L)             (L)->num_itersm1
> >  #define LOOP_VINFO_NITERS(L)               (L)->num_iters
> > @@ -2155,11 +2169,13 @@ class auto_purge_vect_location
> >
> >  /* Simple loop peeling and versioning utilities for vectorizer's purposes -
> >     in tree-vect-loop-manip.cc.  */
> > -extern void vect_set_loop_condition (class loop *, loop_vec_info,
> > +extern void vect_set_loop_condition (class loop *, edge, loop_vec_info,
> >                                       tree, tree, tree, bool);
> > -extern bool slpeel_can_duplicate_loop_p (const class loop *, const_edge);
> > -class loop *slpeel_tree_duplicate_loop_to_edge_cfg (class loop *,
> > -                                                    class loop *, edge);
> > +extern bool slpeel_can_duplicate_loop_p (const class loop *, const_edge,
> > +                                         const_edge);
> > +class loop *slpeel_tree_duplicate_loop_to_edge_cfg (class loop *, edge,
> > +                                                    class loop *, edge,
> > +                                                    edge, edge *);
> >  class loop *vect_loop_versioning (loop_vec_info, gimple *);
> >  extern class loop *vect_do_peeling (loop_vec_info, tree, tree,
> >                                      tree *, tree *, tree *, int, bool, bool,
> > @@ -2169,6 +2185,7 @@ extern void vect_prepare_for_masked_peels (loop_vec_info);
> >  extern dump_user_location_t find_loop_location (class loop *);
> >  extern bool vect_can_advance_ivs_p (loop_vec_info);
> >  extern void vect_update_inits_of_drs (loop_vec_info, tree, tree_code);
> > +extern edge vec_init_loop_exit_info (class loop *);
> >
> >  /* In tree-vect-stmts.cc.  */
> >  extern tree get_related_vectype_for_scalar_type (machine_mode, tree,
> > @@ -2358,6 +2375,7 @@ struct vect_loop_form_info
> >    tree assumptions;
> >    gcond *loop_cond;
> >    gcond *inner_loop_cond;
> > +  edge loop_exit;
> >  };
> >  extern opt_result vect_analyze_loop_form (class loop *, vect_loop_form_info *);
> >  extern loop_vec_info vect_create_loop_vinfo (class loop *, vec_info_shared *,
> > diff --git a/gcc/tree-vectorizer.cc b/gcc/tree-vectorizer.cc
> > index a048e9d89178a37455bd7b83ab0f2a238a4ce69e..d97e2b54c25ac60378935392aa7b73476efed74b 100644
> > --- a/gcc/tree-vectorizer.cc
> > +++ b/gcc/tree-vectorizer.cc
> > @@ -943,6 +943,8 @@ set_uid_loop_bbs (loop_vec_info loop_vinfo, gimple *loop_vectorized_call,
> >    class loop *scalar_loop = get_loop (fun, tree_to_shwi (arg));
> >
> >    LOOP_VINFO_SCALAR_LOOP (loop_vinfo) = scalar_loop;
> > +  LOOP_VINFO_SCALAR_IV_EXIT (loop_vinfo)
> > +    = vec_init_loop_exit_info (scalar_loop);
> >    gcc_checking_assert (vect_loop_vectorized_call (scalar_loop)
> >                         == loop_vectorized_call);
> >    /* If we are going to vectorize outer loop, prevent vectorization
> >
>
> --
> Richard Biener
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> Nuernberg)
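
For anyone following the thread without a GCC tree at hand, the core idea of
the patch (pick the main exit once during analysis, cache it, and have later
code read the cached edge instead of calling single_exit again) can be
sketched stand-alone roughly as below.  This is only an illustration under
stated assumptions: Edge, Loop, LoopInfo, pick_main_exit, analyze_loop_form
and exit_dest_index are hypothetical stand-ins and are not GCC's edge,
class loop, _loop_vec_info or any real GCC function.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a CFG edge; only the block indices are modelled.
struct Edge {
  int src_index;
  int dest_index;
};

// Hypothetical stand-in for a loop: just the edges that leave it.
struct Loop {
  std::vector<Edge> exits;
};

// Hypothetical stand-in for the per-loop vectorizer state.  iv_exit plays
// the role of the field behind LOOP_VINFO_IV_EXIT in the quoted patch.
struct LoopInfo {
  const Edge *iv_exit = nullptr;
};

// Same shape as vec_init_loop_exit_info in the quoted patch: a loop with
// exactly one exit yields that exit, anything else yields nullptr.
const Edge *pick_main_exit (const Loop &loop)
{
  if (loop.exits.size () == 1)
    return &loop.exits[0];
  return nullptr;
}

// Analysis step: choose the exit once and cache it.  Returns false when no
// main exit can be determined, which the caller treats as "not vectorized".
// The cached pointer stays valid only while loop.exits is left unmodified.
bool analyze_loop_form (const Loop &loop, LoopInfo &info)
{
  const Edge *exit_e = pick_main_exit (loop);
  if (!exit_e)
    return false;
  info.iv_exit = exit_e;
  return true;
}

// Transform step: read the cached exit rather than recomputing it.
int exit_dest_index (const LoopInfo &info)
{
  assert (info.iv_exit && "main exit must be recorded during analysis");
  return info.iv_exit->dest_index;
}
```

With this shape, a later change that supports multiple exits would only need
to touch pick_main_exit; the users of the cached edge stay as they are.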