From: Tamar Christina
To: Richard Biener
CC: "gcc-patches@gcc.gnu.org", nd, "jlaw@ventanamicro.com"
Subject: RE: [PATCH 7/21]middle-end: update IV update code to support early breaks and arbitrary exits
Date: Thu, 16 Nov 2023 11:08:27 +0000
> -----Original Message-----
> From: Richard Biener
> Sent: Thursday, November 16, 2023 10:40 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd; jlaw@ventanamicro.com
> Subject: RE: [PATCH 7/21]middle-end: update IV update code to support early
> breaks and arbitrary exits
>
> On Wed, 15 Nov 2023, Tamar Christina wrote:
>
> > > -----Original Message-----
> > > From: Richard Biener
> > > Sent: Wednesday, November 15, 2023 1:23 PM
> > > To: Tamar Christina
> > > Cc: gcc-patches@gcc.gnu.org; nd; jlaw@ventanamicro.com
> > > Subject: RE: [PATCH 7/21]middle-end: update IV update code to
> > > support early breaks and arbitrary exits
> > >
> > > On Wed, 15 Nov 2023, Tamar Christina wrote:
> > >
> > > > > -----Original Message-----
> > > > > From: Richard Biener
> > > > > Sent: Wednesday, November 15, 2023 1:01 PM
> > > > > To: Tamar Christina
> > > > > Cc: gcc-patches@gcc.gnu.org; nd; jlaw@ventanamicro.com
> > > > > Subject: RE: [PATCH 7/21]middle-end: update IV update code to
> > > > > support early breaks and arbitrary exits
> > > > >
> > > > > On Wed, 15 Nov 2023, Tamar Christina wrote:
> > > > >
> > > > > > Patch updated to latest trunk:
> > > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > This changes the PHI node updates to support early breaks.
> > > > > > It has to support both the case where the loop's exit matches
> > > > > > the normal loop exit and one where the early exit is "inverted",
> > > > > > i.e. it's an early exit edge.
> > > > > >
> > > > > > In the latter case we must always restart the loop for VF iterations.
> > > > > > For an early exit the reason is obvious, but there are cases
> > > > > > where the "normal" exit is located before the early one.  This
> > > > > > exit then does a check on ivtmp resulting in us leaving the
> > > > > > loop since it thinks we're done.
> > > > > >
> > > > > > In these case we may still have side-effects to perform so we
> > > > > > also go to the scalar loop.
> > > > > >
> > > > > > For the "normal" exit niters has already been adjusted for
> > > > > > peeling, for the early exits we must find out how many
> > > > > > iterations we actually did.  So we have to recalculate the new
> > > > > > position for each exit.
> > > > > >
> > > > > > Thanks,
> > > > > > Tamar
> > > > > >
> > > > > > gcc/ChangeLog:
> > > > > >
> > > > > >     * tree-vect-loop-manip.cc (vect_set_loop_condition_normal): Hide unused.
> > > > > >     (vect_update_ivs_after_vectorizer): Support early break.
> > > > > >     (vect_do_peeling): Use it.
> > > > > >
> > > > > > --- inline copy of patch ---
> > > > > >
> > > > > > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > > > > > index d3fa8699271c4d7f404d648a38a95beabeabc99a..e1d210ab4617c894dab3d2654cf1c842baac58f5 100644
> > > > > > --- a/gcc/tree-vect-loop-manip.cc
> > > > > > +++ b/gcc/tree-vect-loop-manip.cc
> > > > > > @@ -1200,7 +1200,7 @@ vect_set_loop_condition_partial_vectors_avx512 (class loop *loop,
> > > > > >     loop handles exactly VF scalars per iteration.  */
> > > > > >
> > > > > >  static gcond *
> > > > > > -vect_set_loop_condition_normal (loop_vec_info loop_vinfo, edge exit_edge,
> > > > > > +vect_set_loop_condition_normal (loop_vec_info /* loop_vinfo */, edge exit_edge,
> > > > > >                                  class loop *loop, tree niters, tree step,
> > > > > >                                  tree final_iv, bool niters_maybe_zero,
> > > > > >                                  gimple_stmt_iterator loop_cond_gsi)
> > > > > > @@ -1412,7 +1412,7 @@ vect_set_loop_condition (class loop *loop, edge loop_e, loop_vec_info loop_vinfo
> > > > > >     When this happens we need to flip the understanding of main and other
> > > > > >     exits by peeling and IV updates.  */
> > > > > >
> > > > > > -bool inline
> > > > > > +bool
> > > > > >  vect_is_loop_exit_latch_pred (edge loop_exit, class loop *loop)
> > > > > >  {
> > > > > >    return single_pred (loop->latch) == loop_exit->src;
> > > > > > @@ -2142,6 +2142,7 @@ vect_can_advance_ivs_p (loop_vec_info loop_vinfo)
> > > > > >       Input:
> > > > > >       - LOOP - a loop that is going to be vectorized. The last few iterations
> > > > > >                of LOOP were peeled.
> > > > > > +     - VF   - The chosen vectorization factor for LOOP.
> > > > > >       - NITERS - the number of iterations that LOOP executes (before it is
> > > > > >                  vectorized). i.e, the number of times the ivs should be bumped.
> > > > > >       - UPDATE_E - a successor edge of LOOP->exit that is on the (only) path
> > > > >
> > > > > the comment on this is now a bit misleading, can you try to
> > > > > update it and/or move the comment bits to the docs on EARLY_EXIT?
> > > > >
> > > > > > @@ -2152,6 +2153,9 @@ vect_can_advance_ivs_p (loop_vec_info loop_vinfo)
> > > > > >                    The phi args associated with the edge UPDATE_E in the bb
> > > > > >                    UPDATE_E->dest are updated accordingly.
> > > > > >
> > > > > > +     - restart_loop - Indicates whether the scalar loop needs to restart the
> > > > >
> > > > > params are ALL_CAPS
> > > > >
> > > > > > +                      iteration count where the vector loop began.
> > > > > > +
> > > > > >       Assumption 1: Like the rest of the vectorizer, this function assumes
> > > > > >       a single loop exit that has a single predecessor.
> > > > > >
> > > > > > @@ -2169,18 +2173,22 @@ vect_can_advance_ivs_p (loop_vec_info loop_vinfo)
> > > > > >   */
> > > > > >
> > > > > >  static void
> > > > > > -vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo,
> > > > > > -                                  tree niters, edge update_e)
> > > > > > +vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo, poly_uint64 vf,
> > > > >
> > > > > LOOP_VINFO_VECT_FACTOR?
> > > > >
> > > > > > +                                  tree niters, edge update_e, bool restart_loop)
> > > > >
> > > > > I think 'bool early_exit' is better here?  I wonder if we have an "early"
> > > > > exit after the main exit we are probably sure there are no
> > > > > side-effects to re-execute and could avoid this restarting?
> > > >
> > > > Side effects yes, but the actual check may not have been performed yet.
> > > > If you remember
> > > > https://gist.github.com/Mistuke/66f14fe5c1be32b91ce149bd9b8bb35f
> > > > There in the clz loop through the "main" exit you still have to
> > > > see if that iteration did not contain the entry.  This is because
> > > > the loop counter is incremented before you iterate.
> > > >
> > > > >
> > > > > >  {
> > > > > >    gphi_iterator gsi, gsi1;
> > > > > >    class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> > > > > >    basic_block update_bb = update_e->dest;
> > > > > > -
> > > > > > -  basic_block exit_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
> > > > > > -
> > > > > > -  /* Make sure there exists a single-predecessor exit bb:  */
> > > > > > -  gcc_assert (single_pred_p (exit_bb));
> > > > > > -  gcc_assert (single_succ_edge (exit_bb) == update_e);
> > > > > > +  bool inversed_iv
> > > > > > +    = !vect_is_loop_exit_latch_pred (LOOP_VINFO_IV_EXIT (loop_vinfo),
> > > > > > +                                     LOOP_VINFO_LOOP (loop_vinfo));
> > > > > > +  bool needs_interm_block = LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
> > > > > > +                            && flow_bb_inside_loop_p (loop, update_e->src);
> > > > > > +  edge loop_e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > > > > > +  gcond *cond = get_loop_exit_condition (loop_e);
> > > > > > +  basic_block exit_bb = loop_e->dest;
> > > > > > +  basic_block iv_block = NULL;
> > > > > > +  gimple_stmt_iterator last_gsi = gsi_last_bb (exit_bb);
> > > > > >
> > > > > >    for (gsi = gsi_start_phis (loop->header), gsi1 = gsi_start_phis (update_bb);
> > > > > >         !gsi_end_p (gsi) && !gsi_end_p (gsi1);
> > > > > > @@ -2190,7 +2198,6 @@ vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo,
> > > > > >        tree step_expr, off;
> > > > > >        tree type;
> > > > > >        tree var, ni, ni_name;
> > > > > > -      gimple_stmt_iterator last_gsi;
> > > > > >
> > > > > >        gphi *phi = gsi.phi ();
> > > > > >        gphi *phi1 = gsi1.phi ();
> > > > > > @@ -2222,11 +2229,52 @@ vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo,
> > > > > >        enum vect_induction_op_type induction_type
> > > > > >          = STMT_VINFO_LOOP_PHI_EVOLUTION_TYPE (phi_info);
> > > > > >
> > > > > > -      if (induction_type == vect_step_op_add)
> > > > > > +      tree iv_var = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (loop));
> > > > > > +      /* create_iv always places it on the LHS.  Alternatively we can set a
> > > > > > +         property during create_iv to identify it.  */
> > > > > > +      bool ivtemp = gimple_cond_lhs (cond) == iv_var;
> > > > > > +      if (restart_loop && ivtemp)
> > > > > >          {
> > > > > > +          type = TREE_TYPE (gimple_phi_result (phi));
> > > > > > +          ni = build_int_cst (type, vf);
> > > > > > +          if (inversed_iv)
> > > > > > +            ni = fold_build2 (MINUS_EXPR, type, ni,
> > > > > > +                              fold_convert (type, step_expr));
> > > > > > +        }
> > > > > > +      else if (induction_type == vect_step_op_add)
> > > > > > +        {
> > > > > > +
> > > > > >            tree stype = TREE_TYPE (step_expr);
> > > > > > -          off = fold_build2 (MULT_EXPR, stype,
> > > > > > -                             fold_convert (stype, niters), step_expr);
> > > > > > +
> > > > > > +          /* Early exits always use last iter value not niters. */
> > > > > > +          if (restart_loop)
> > > > > > +            {
> > > > > > +              /* Live statements in the non-main exit shouldn't be adjusted.  We
> > > > > > +                 normally didn't have this problem with a single exit as live
> > > > > > +                 values would be in the exit block.  However when dealing with
> > > > > > +                 multiple exits all exits are redirected to the merge block
> > > > > > +                 and we restart the iteration.  */
> > > > >
> > > > > Hmm, I fail to see how this works - we're either using the value
> > > > > to continue the induction or not, independent of STMT_VINFO_LIVE_P.
> > > >
> > > > That becomes clear in the patch to update live reductions.
> > > > Essentially any live Reductions inside an alternative exit will
> > > > reduce to the first element rather than the last and use that as
> > > > the seed for the scalar loop.
> > >
> > > Hum.  Reductions are vectorized as N separate reductions.  I don't
> > > think you can simply change the reduction between the lanes to "skip"
> > > part of the vector iteration.  But you can use the value of the
> > > vector from before the vector iteration - the loop header PHI
> > > result, and fully reduce that to get at the proper value.
> >
> > That's what It's supposed to be doing though.  The reason live
> > operations are skipped here is that if we don't we'll re-adjust the IV
> > even though the value will already be correct after vectorization.
> >
> > Remember that this code only gets so far for IV PHI nodes.
> >
> > The loop phi header result itself can be live, i.e. see testcases
> > vect-early-break_70.c to vect-early-break_75.c
> >
> > you have i_15 = PHI
> >
> > we use i_15 in the early exit.  This should not be adjusted because
> > when it's vectorized the value at 0[lane 0] is already correct.  This
> > is why for any PHI inside the early exits it uses the value 0[0] instead of
> > N[lane_max].
> >
> > Perhaps I'm missing something here?
>
> OK, so I refreshed my mind of what vect_update_ivs_after_vectorizer does.
>
> I still do not understand the (complexity of the) patch.  Basically the function
> computes the new value of the IV "from scratch" based on the number of
> scalar iterations of the vector loop, the 'niter'
> argument.  I would have expected that for the early exits we either pass in a
> different 'niter' or alternatively a 'niter_adjustment'.

But for an early exit there's no static value for the adjusted niter, since you
don't know which iteration you exited from.  This is unlike the normal exit,
where you know that if you get there you've done all possible iterations.

So you must compute the scalar iteration count on the exit itself.

>
> It seems your change handles different kinds of inductions differently.
> Specifically
>
>       bool ivtemp = gimple_cond_lhs (cond) == iv_var;
>       if (restart_loop && ivtemp)
>         {
>           type = TREE_TYPE (gimple_phi_result (phi));
>           ni = build_int_cst (type, vf);
>           if (inversed_iv)
>             ni = fold_build2 (MINUS_EXPR, type, ni,
>                               fold_convert (type, step_expr));
>         }
>
> it looks like for the exit test IV we use either 'VF' or 'VF - step'
> as the new value.  That seems to be very odd special casing for unknown
> reasons.  And while you adjust vec_step_op_add, you don't adjust
> vect_peel_nonlinear_iv_init (maybe not supported - better assert here).

The VF case is for a normal "non-inverted" loop, where if you take an early
exit you know that you have to do at most VF iterations.  The VF - step case
accounts for the inverted loop control flow, where you exit after the IV has
already been adjusted by + step.

Peeling doesn't matter here: since you know you were able to do a vector
iteration, it's safe to do VF iterations.  So having peeled doesn't affect the
remaining iters count.

>
> Also the vec_step_op_add case will keep the original scalar IV live even when it
> is a vectorized induction.  The code recomputing the value from scratch avoids
> this.
>
>       /* For non-main exit create an intermediat edge to get any updated iv
>          calculations.  */
>       if (needs_interm_block
>           && !iv_block
>           && (!gimple_seq_empty_p (stmts) || !gimple_seq_empty_p (new_stmts)))
>         {
>           iv_block = split_edge (update_e);
>           update_e = single_succ_edge (update_e->dest);
>           last_gsi = gsi_last_bb (iv_block);
>         }
>
> this is also odd, can we adjust the API instead?  I suppose this is because your
> computation uses the original loop IV, if you based the computation off the
> initial value only this might not be necessary?

No, on the main exit the code updates the value in the loop header and puts the
calculation in the merge block.  This works because it only needs to consume PHI
nodes in the merge block, and things like niters are adjusted in the guard block.

For an early exit, we don't have a guard block, only the merge block.  We have
to update the PHI nodes in that block, but can't do so directly, since you can't
produce a value and consume it in a PHI node in the same BB.  So, because there's
no "guard" block for early exits, we need to create a block to hold the values
for use in the merge block.

The API can be adjusted by always creating the empty block during peeling.
That would prevent us from having to do anything special here.  Would that work
better?  Alternatively I can do it in the loop that iterates over the exits,
before the call to vect_update_ivs_after_vectorizer, which I think might be more
consistent.

>
> That said, I wonder why we cannot simply pass in an adjusted niter which
> would be niters_vector_mult_vf - vf and be done with that?
>

We can of course drop this and recompute it from niters itself; however, this
does affect the epilog code layout.  In particular, knowing the static number of
iterations left usually causes the epilog loop to be unrolled and to share some
of the computations, i.e. the scalar code is often more efficient.

The computation would be niters_vector_mult_vf - iters_done * vf, since the
value put here is the remaining iteration count.  It's static for early exits.

But I can do whatever you prefer here.  Let me know what you prefer for the above.

Thanks,
Tamar

> Thanks,
> Richard.
>
>
> > Regards,
> > Tamar
> > >
> > > > It has to do this since you have to perform the side effects for
> > > > the non-matching elements still.
> > > >
> > > > Regards,
> > > > Tamar
> > > >
> > > > >
> > > > > > +              if (STMT_VINFO_LIVE_P (phi_info))
> > > > > > +                continue;
> > > > > > +
> > > > > > +              /* For early break the final loop IV is:
> > > > > > +                 init + (final - init) * vf which takes into account peeling
> > > > > > +                 values and non-single steps.  The main exit can use niters
> > > > > > +                 since if you exit from the main exit you've done all vector
> > > > > > +                 iterations.  For an early exit we don't know when we exit so we
> > > > > > +                 must re-calculate this on the exit.  */
> > > > > > +              tree start_expr = gimple_phi_result (phi);
> > > > > > +              off = fold_build2 (MINUS_EXPR, stype,
> > > > > > +                                 fold_convert (stype, start_expr),
> > > > > > +                                 fold_convert (stype, init_expr));
> > > > > > +              /* Now adjust for VF to get the final iteration value.  */
> > > > > > +              off = fold_build2 (MULT_EXPR, stype, off,
> > > > > > +                                 build_int_cst (stype, vf));
> > > > > > +            }
> > > > > > +          else
> > > > > > +            off = fold_build2 (MULT_EXPR, stype,
> > > > > > +                               fold_convert (stype, niters), step_expr);
> > > > > > +
> > > > > >            if (POINTER_TYPE_P (type))
> > > > > >              ni = fold_build_pointer_plus (init_expr, off);
> > > > > >            else
> > > > > > @@ -2238,6 +2286,8 @@ vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo,
> > > > > >        /* Don't bother call vect_peel_nonlinear_iv_init.  */
> > > > > >        else if (induction_type == vect_step_op_neg)
> > > > > >          ni = init_expr;
> > > > > > +      else if (restart_loop)
> > > > > > +        continue;
> > > > >
> > > > > This looks all a bit complicated - why wouldn't we simply always
> > > > > use the PHI result when 'restart_loop'?  Isn't that the correct
> > > > > old start value in all cases?
> > > > >
> > > > > >        else
> > > > > >          ni = vect_peel_nonlinear_iv_init (&stmts, init_expr,
> > > > > >                                            niters, step_expr,
> > > > > > @@ -2245,9 +2295,20 @@ vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo,
> > > > > >
> > > > > >        var = create_tmp_var (type, "tmp");
> > > > > >
> > > > > > -      last_gsi = gsi_last_bb (exit_bb);
> > > > > >        gimple_seq new_stmts = NULL;
> > > > > >        ni_name = force_gimple_operand (ni, &new_stmts, false, var);
> > > > > > +
> > > > > > +      /* For non-main exit create an intermediat edge to get any updated iv
> > > > > > +         calculations.  */
> > > > > > +      if (needs_interm_block
> > > > > > +          && !iv_block
> > > > > > +          && (!gimple_seq_empty_p (stmts) || !gimple_seq_empty_p (new_stmts)))
> > > > > > +        {
> > > > > > +          iv_block = split_edge (update_e);
> > > > > > +          update_e = single_succ_edge (update_e->dest);
> > > > > > +          last_gsi = gsi_last_bb (iv_block);
> > > > > > +        }
> > > > > > +
> > > > > >        /* Exit_bb shouldn't be empty.  */
> > > > > >        if (!gsi_end_p (last_gsi))
> > > > > >          {
> > > > > > @@ -3342,8 +3403,26 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
> > > > > >           niters_vector_mult_vf steps.  */
> > > > > >        gcc_checking_assert (vect_can_advance_ivs_p (loop_vinfo));
> > > > > >        update_e = skip_vector ? e : loop_preheader_edge (epilog);
> > > > > > -      vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
> > > > > > -                                        update_e);
> > > > > > +      if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
> > > > > > +        update_e = single_succ_edge (e->dest);
> > > > > > +      bool inversed_iv
> > > > > > +        = !vect_is_loop_exit_latch_pred (LOOP_VINFO_IV_EXIT (loop_vinfo),
> > > > > > +                                         LOOP_VINFO_LOOP (loop_vinfo));
> > > > >
> > > > > You are computing this here and in vect_update_ivs_after_vectorizer?
> > > > >
> > > > > > +
> > > > > > +      /* Update the main exit first.  */
> > > > > > +      vect_update_ivs_after_vectorizer (loop_vinfo, vf, niters_vector_mult_vf,
> > > > > > +                                        update_e, inversed_iv);
> > > > > > +
> > > > > > +      /* And then update the early exits.  */
> > > > > > +      for (auto exit : get_loop_exit_edges (loop))
> > > > > > +        {
> > > > > > +          if (exit == LOOP_VINFO_IV_EXIT (loop_vinfo))
> > > > > > +            continue;
> > > > > > +
> > > > > > +          vect_update_ivs_after_vectorizer (loop_vinfo, vf,
> > > > > > +                                            niters_vector_mult_vf,
> > > > > > +                                            exit, true);
> > > > >
> > > > > ... why does the same not work here?  Wouldn't the proper
> > > > > condition be !dominated_by_p (CDI_DOMINATORS, exit->src,
> > > > > LOOP_VINFO_IV_EXIT (loop_vinfo)->src) or similar?  That is,
> > > > > whether the exit is at or after the main IV exit?  (consider
> > > > > having two)
> > > > >
> > > > > > +        }
> > > > > >
> > > > > >        if (skip_epilog)
> > > > > >          {
> > > > > >
> > > >
> > >
> > > --
> > > Richard Biener
> > > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
> > > Nuernberg, Germany;
> > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > > Nuernberg)
> >
>
> --
> Richard Biener
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> Nuernberg)
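
Illustrative note (not part of the patch): the restart behaviour discussed in
this thread can be sketched with a small stand-alone model.  This is only a
sketch under stated assumptions; it is plain C++, not GCC internals, and every
name in it (VectorLoopResult, run_vector_loop, scalar_epilogue,
find_vectorized, find_scalar) is hypothetical.  It assumes a simple search
loop with unit step and no peeling.  On an early exit the scalar loop resumes
from the IV value at the start of the vector iteration (lane 0) and needs at
most VF iterations; on the main exit the IV is niters_vector_mult_vf and the
usual remainder is left.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of what the vector loop hands to the scalar epilogue.
struct VectorLoopResult {
  std::size_t restart_iv;  // IV value the scalar loop resumes from
  std::size_t remaining;   // upper bound on scalar iterations still to run
  bool early_exit;         // true if we left through the early exit
};

// Model of a vectorized search: each "vector iteration" checks vf elements
// and breaks early if any lane matches.  Assumes vf > 0.
VectorLoopResult run_vector_loop(const std::vector<int>& data, int key,
                                 std::size_t vf) {
  const std::size_t niters = data.size();
  const std::size_t niters_vector_mult_vf = niters - niters % vf;
  for (std::size_t iv = 0; iv < niters_vector_mult_vf; iv += vf) {
    bool any_match = false;
    for (std::size_t lane = 0; lane < vf; ++lane)
      any_match |= (data[iv + lane] == key);
    if (any_match) {
      // Early exit: the matching lane is unknown here, so restart at lane 0
      // of this vector iteration.  At most vf scalar iterations are needed.
      return {iv, vf, true};
    }
  }
  // Main exit: all vector iterations were done, only the remainder is left.
  return {niters_vector_mult_vf, niters - niters_vector_mult_vf, false};
}

// Scalar epilogue: re-runs the original loop body from the restart point.
std::size_t scalar_epilogue(const std::vector<int>& data, int key,
                            const VectorLoopResult& r) {
  std::size_t i = r.restart_iv;
  for (std::size_t n = 0; n < r.remaining && i < data.size(); ++n, ++i)
    if (data[i] == key)
      break;
  return i;
}

// Vector loop followed by the scalar epilogue.
std::size_t find_vectorized(const std::vector<int>& data, int key,
                            std::size_t vf) {
  return scalar_epilogue(data, key, run_vector_loop(data, key, vf));
}

// Reference: the original scalar loop.
std::size_t find_scalar(const std::vector<int>& data, int key) {
  std::size_t i = 0;
  for (; i < data.size(); ++i)
    if (data[i] == key)
      break;
  return i;
}
```

In this model the early-exit restart value equals iters_done * vf, where
iters_done is the number of completed vector iterations, so it can only be
known at the exit itself, while the bound of VF remaining iterations is a
compile-time constant.  The model does not cover the inverted ("VF - step")
case or non-unit steps.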