From: Tamar Christina
To: Richard Biener
CC: "gcc-patches@gcc.gnu.org", nd, "jlaw@ventanamicro.com"
Subject: RE: [PATCH]middle-end: check memory accesses in the destination block [PR113588].
Date: Tue, 30 Jan 2024 12:57:25 +0000
References: <1096n2p8-s430-5p01-orn1-ro09s5162o8n@fhfr.qr>
In-Reply-To: <1096n2p8-s430-5p01-orn1-ro09s5162o8n@fhfr.qr>

> -----Original Message-----
> From: Richard Biener
> Sent: Tuesday, January 30, 2024 9:51 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd; jlaw@ventanamicro.com
> Subject: Re: [PATCH]middle-end: check memory accesses in the destination block
> [PR113588].
>
> On Mon, 29 Jan 2024, Tamar Christina wrote:
>
> > Hi All,
> >
> > When analyzing loads for early break it was always the intention that
> > for the exit where things get moved to we only check the loads that can
> > be reached from the condition.
>
> Looking at the code I'm a bit confused that we always move to
> single_pred (loop->latch) - IIRC that was different at some point?
>
> Shouldn't we move stores after the last early exit condition instead?

Yes, it was changed during another PR fix. The rationale at that time didn't take
into account the peeled case. It used to be that we would "search" for the exit to
place it in.

At that time the rationale was, well, it doesn't make sense. It has to go in the block
that is the last to be executed. With the non-peeled case it's always the one before
the latch.

Or put differently, I think the destination should be the main IV block. I am not
quite sure I'm following why you want to put the peeled cases inside the latch block.

Ah, is it because the latch block is always going to only be executed when you make
a full iteration? That makes sense, but then I think we should also analyze the stores
in all blocks (which your change maybe already does, let me check), since we'll also be
lifting past the final block and need to update the vuses there too.

If the above is correct then I think I understand what you're saying and will update
the patch and do some checks.

Thanks,
Tamar

>
> In particular for the peeled case single_pred (loop->latch) is the
> block with the actual early exit condition? So for that case we'd
> need to move to the latch itself instead? For non-peeled we move
> to the block with the IV condition which looks OK.
>
> > However the main loop checks all loads and we skip the destination BB.
> > As such we never actually check the loads reachable from the COND in the
> > last BB unless this BB was also the exit chosen by the vectorizer.
> >
> > This leads us to incorrectly vectorize the loop in the PR and in doing so access
> > out of bounds.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
>
> The patch ends up with a worklist and another confusing comment
>
> +  /* For the destination BB we need to only analyze loads reachable from the early
> +     break statement itself. */
>
> But I think it's a downstream issue from the issue above. That said,
> even for the non-peeled case we need to check ref_within_array_bound,
> no?
>
> So what about re-doing that initial loop like the following instead
> (and also fix dest_bb, but I'd like clarification here). Basically
> walk all blocks, do the ref_within_array_bound first and only
> after we've seen 'dest_bb' do the checks required for moving
> stores for all upstream BBs.
>
> And dest_bb should be
>
>   /* Move side-effects to the in-loop destination of the last early
>      exit. */
>   if (LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
>     dest_bb = loop->latch;
>   else
>     dest_bb = single_pred (loop->latch);
>
>
> diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> index f592aeb8028..d6c8910dd6c 100644
> --- a/gcc/tree-vect-data-refs.cc
> +++ b/gcc/tree-vect-data-refs.cc
> @@ -668,7 +668,6 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
>    auto_vec bases;
>    basic_block dest_bb = NULL;
>
> -  hash_set visited;
>    class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>    class loop *loop_nest = loop_outer (loop);
>
> @@ -681,15 +680,11 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
>       side-effects to is always the latch connected exit. When we support
>       general control flow we can do better but for now this is fine. */
>    dest_bb = single_pred (loop->latch);
> -  basic_block bb = dest_bb;
> +  basic_block bb = loop->latch;
> +  bool check_deps = false;
>
>    do
>      {
> -      /* If the destination block is also the header then we have nothing to do. */
> -      if (!single_pred_p (bb))
> -        continue;
> -
> -      bb = single_pred (bb);
>        gimple_stmt_iterator gsi = gsi_last_bb (bb);
>
>        /* Now analyze all the remaining statements and try to determine which
> @@ -707,6 +702,25 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
>            if (!dr_ref)
>              continue;
>
> +          /* Check if vector accesses to the object will be within bounds.
> +             must be a constant or assume loop will be versioned or niters
> +             bounded by VF so accesses are within range. */
> +          if (!ref_within_array_bound (stmt, DR_REF (dr_ref)))
> +            {
> +              if (dump_enabled_p ())
> +                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> +                                 "early breaks not supported: vectorization "
> +                                 "would %s beyond size of obj.",
> +                                 DR_IS_READ (dr_ref) ? "read" : "write");
> +              return opt_result::failure_at (stmt,
> +                                 "can't safely apply code motion to "
> +                                 "dependencies of %G to vectorize "
> +                                 "the early exit.\n", stmt);
> +            }
> +
> +          if (!check_deps)
> +            continue;
> +
>            /* We currently only support statically allocated objects due to
>               not having first-faulting loads support or peeling for
>               alignment support. Compute the size of the referenced object
> @@ -739,22 +753,6 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
>                                   "the early exit.\n", stmt);
>              }
>
> -          /* Check if vector accesses to the object will be within bounds.
> -             must be a constant or assume loop will be versioned or niters
> -             bounded by VF so accesses are within range. */
> -          if (!ref_within_array_bound (stmt, DR_REF (dr_ref)))
> -            {
> -              if (dump_enabled_p ())
> -                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> -                                 "early breaks not supported: vectorization "
> -                                 "would %s beyond size of obj.",
> -                                 DR_IS_READ (dr_ref) ? "read" : "write");
> -              return opt_result::failure_at (stmt,
> -                                 "can't safely apply code motion to "
> -                                 "dependencies of %G to vectorize "
> -                                 "the early exit.\n", stmt);
> -            }
> -
>            if (DR_IS_READ (dr_ref))
>              bases.safe_push (dr_ref);
>            else if (DR_IS_WRITE (dr_ref))
> @@ -814,8 +812,16 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
>                                   "marked statement for vUSE update: %G", stmt);
>            }
>        }
> +      if (!single_pred_p (bb))
> +        {
> +          gcc_assert (bb == loop->header);
> +          break;
> +        }
> +      if (bb == dest_bb)
> +        check_deps = true;
> +      bb = single_pred (bb);
>      }
> -  while (bb != loop->header);
> +  while (1);
>
>    /* We don't allow outer -> inner loop transitions which should have been
>       trapped already during loop form analysis. */
>
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> >     PR tree-optimization/113588
> >     * tree-vect-data-refs.cc (vect_analyze_early_break_dependences_1): New.
> >     (vect_analyze_data_ref_dependence): Use it.
> >     (vect_analyze_early_break_dependences): Update comments.
> >
> > gcc/testsuite/ChangeLog:
> >
> >     PR tree-optimization/113588
> >     * gcc.dg/vect/vect-early-break_108-pr113588.c: New test.
> >     * gcc.dg/vect/vect-early-break_109-pr113588.c: New test.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_108-pr113588.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_108-pr113588.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..e488619c9aac41fafbcf479818392a6bb7c6924f
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_108-pr113588.c
> > @@ -0,0 +1,15 @@
> > +/* { dg-do compile } */
> > +/* { dg-add-options vect_early_break } */
> > +/* { dg-require-effective-target vect_early_break } */
> > +/* { dg-require-effective-target vect_int } */
> > +
> > +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
> > +
> > +int foo (const char *s, unsigned long n)
> > +{
> > + unsigned long len = 0;
> > + while (*s++ && n--)
> > +   ++len;
> > + return len;
> > +}
> > +
> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_109-pr113588.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_109-pr113588.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..488c19d3ede809631d1a7ede0e7f7bcdc7a1ae43
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_109-pr113588.c
> > @@ -0,0 +1,44 @@
> > +/* { dg-add-options vect_early_break } */
> > +/* { dg-require-effective-target vect_early_break } */
> > +/* { dg-require-effective-target vect_int } */
> > +/* { dg-require-effective-target mmap } */
> > +
> > +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
> > +
> > +#include <sys/mman.h>
> > +#include <unistd.h>
> > +
> > +#include "tree-vect.h"
> > +
> > +__attribute__((noipa))
> > +int foo (const char *s, unsigned long n)
> > +{
> > + unsigned long len = 0;
> > + while (*s++ && n--)
> > +   ++len;
> > + return len;
> > +}
> > +
> > +int main()
> > +{
> > +
> > +  check_vect ();
> > +
> > +  long pgsz = sysconf (_SC_PAGESIZE);
> > +  void *p = mmap (NULL, pgsz * 3, PROT_READ|PROT_WRITE,
> > +     MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
> > +  if (p == MAP_FAILED)
> > +    return 0;
> > +  mprotect (p, pgsz, PROT_NONE);
> > +  mprotect (p+2*pgsz, pgsz, PROT_NONE);
> > +  char *p1 = p + pgsz;
> > +  p1[0] = 1;
> > +  p1[1] = 0;
> > +  foo (p1, 1000);
> > +  p1 = p + 2*pgsz - 2;
> > +  p1[0] = 1;
> > +  p1[1] = 0;
> > +  foo (p1, 1000);
> > +  return 0;
> > +}
> > +
> > diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> > index f592aeb8028afd4fd70e2175104efab2a2c0d82e..52cef242a7ce5d0e525bff639fa1dc2f0a6f30b9 100644
> > --- a/gcc/tree-vect-data-refs.cc
> > +++ b/gcc/tree-vect-data-refs.cc
> > @@ -619,10 +619,69 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
> >    return opt_result::success ();
> >  }
> >
> > -/* Funcion vect_analyze_early_break_dependences.
> > +/* Function vect_analyze_early_break_dependences_1
> >
> > -   Examime all the data references in the loop and make sure that if we have
> > -   mulitple exits that we are able to safely move stores such that they become
> > +   Helper function of vect_analyze_early_break_dependences which performs safety
> > +   analysis for load operations in an early break. */
> > +
> > +static opt_result
> > +vect_analyze_early_break_dependences_1 (data_reference *dr_ref, gimple *stmt)
> > +{
> > +  /* We currently only support statically allocated objects due to
> > +     not having first-faulting loads support or peeling for
> > +     alignment support. Compute the size of the referenced object
> > +     (it could be dynamically allocated). */
> > +  tree obj = DR_BASE_ADDRESS (dr_ref);
> > +  if (!obj || TREE_CODE (obj) != ADDR_EXPR)
> > +    {
> > +      if (dump_enabled_p ())
> > +        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > +                         "early breaks only supported on statically"
> > +                         " allocated objects.\n");
> > +      return opt_result::failure_at (stmt,
> > +                         "can't safely apply code motion to "
> > +                         "dependencies of %G to vectorize "
> > +                         "the early exit.\n", stmt);
> > +    }
> > +
> > +  tree refop = TREE_OPERAND (obj, 0);
> > +  tree refbase = get_base_address (refop);
> > +  if (!refbase || !DECL_P (refbase) || !DECL_SIZE (refbase)
> > +      || TREE_CODE (DECL_SIZE (refbase)) != INTEGER_CST)
> > +    {
> > +      if (dump_enabled_p ())
> > +        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > +                         "early breaks only supported on"
> > +                         " statically allocated objects.\n");
> > +      return opt_result::failure_at (stmt,
> > +                         "can't safely apply code motion to "
> > +                         "dependencies of %G to vectorize "
> > +                         "the early exit.\n", stmt);
> > +    }
> > +
> > +  /* Check if vector accesses to the object will be within bounds.
> > +     must be a constant or assume loop will be versioned or niters
> > +     bounded by VF so accesses are within range. */
> > +  if (!ref_within_array_bound (stmt, DR_REF (dr_ref)))
> > +    {
> > +      if (dump_enabled_p ())
> > +        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > +                         "early breaks not supported: vectorization "
> > +                         "would %s beyond size of obj.",
> > +                         DR_IS_READ (dr_ref) ? "read" : "write");
> > +      return opt_result::failure_at (stmt,
> > +                         "can't safely apply code motion to "
> > +                         "dependencies of %G to vectorize "
> > +                         "the early exit.\n", stmt);
> > +    }
> > +
> > +  return opt_result::success ();
> > +}
> > +
> > +/* Function vect_analyze_early_break_dependences.
> > +
> > +   Examine all the data references in the loop and make sure that if we have
> > +   multiple exits that we are able to safely move stores such that they become
> >     safe for vectorization. The function also calculates the place where to move
> >     the instructions to and computes what the new vUSE chain should be.
> >
> > @@ -639,7 +698,7 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
> >       - Multiple loads are allowed as long as they don't alias.
> >
> >     NOTE:
> > -     This implemementation is very conservative. Any overlappig loads/stores
> > +     This implementation is very conservative. Any overlapping loads/stores
> >       that take place before the early break statement gets rejected aside from
> >       WAR dependencies.
> >
> > @@ -668,7 +727,6 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> >    auto_vec bases;
> >    basic_block dest_bb = NULL;
> >
> > -  hash_set visited;
> >    class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> >    class loop *loop_nest = loop_outer (loop);
> >
> > @@ -683,6 +741,7 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> >    dest_bb = single_pred (loop->latch);
> >    basic_block bb = dest_bb;
> >
> > +  /* First analyse all blocks leading to dest_bb excluding dest_bb itself. */
> >    do
> >      {
> >        /* If the destination block is also the header then we have nothing to do. */
> > @@ -707,53 +766,11 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> >            if (!dr_ref)
> >              continue;
> >
> > -          /* We currently only support statically allocated objects due to
> > -             not having first-faulting loads support or peeling for
> > -             alignment support. Compute the size of the referenced object
> > -             (it could be dynamically allocated). */
> > -          tree obj = DR_BASE_ADDRESS (dr_ref);
> > -          if (!obj || TREE_CODE (obj) != ADDR_EXPR)
> > -            {
> > -              if (dump_enabled_p ())
> > -                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > -                                 "early breaks only supported on statically"
> > -                                 " allocated objects.\n");
> > -              return opt_result::failure_at (stmt,
> > -                                 "can't safely apply code motion to "
> > -                                 "dependencies of %G to vectorize "
> > -                                 "the early exit.\n", stmt);
> > -            }
> > -
> > -          tree refop = TREE_OPERAND (obj, 0);
> > -          tree refbase = get_base_address (refop);
> > -          if (!refbase || !DECL_P (refbase) || !DECL_SIZE (refbase)
> > -              || TREE_CODE (DECL_SIZE (refbase)) != INTEGER_CST)
> > -            {
> > -              if (dump_enabled_p ())
> > -                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > -                                 "early breaks only supported on"
> > -                                 " statically allocated objects.\n");
> > -              return opt_result::failure_at (stmt,
> > -                                 "can't safely apply code motion to "
> > -                                 "dependencies of %G to vectorize "
> > -                                 "the early exit.\n", stmt);
> > -            }
> > -
> > -          /* Check if vector accesses to the object will be within bounds.
> > -             must be a constant or assume loop will be versioned or niters
> > -             bounded by VF so accesses are within range. */
> > -          if (!ref_within_array_bound (stmt, DR_REF (dr_ref)))
> > -            {
> > -              if (dump_enabled_p ())
> > -                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > -                                 "early breaks not supported: vectorization "
> > -                                 "would %s beyond size of obj.",
> > -                                 DR_IS_READ (dr_ref) ? "read" : "write");
> > -              return opt_result::failure_at (stmt,
> > -                                 "can't safely apply code motion to "
> > -                                 "dependencies of %G to vectorize "
> > -                                 "the early exit.\n", stmt);
> > -            }
> > +          /* Check if the operation is one we can safely do. */
> > +          opt_result res
> > +            = vect_analyze_early_break_dependences_1 (dr_ref, stmt);
> > +          if (!res)
> > +            return res;
> >
> >            if (DR_IS_READ (dr_ref))
> >              bases.safe_push (dr_ref);
> > @@ -817,6 +834,51 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> >      }
> >    while (bb != loop->header);
> >
> > +  /* For the destination BB we need to only analyze loads reachable from the early
> > +     break statement itself. */
> > +  auto_vec <tree> workset;
> > +  hash_set <tree> visited;
> > +  gimple *last_stmt = gsi_stmt (gsi_last_bb (dest_bb));
> > +  gcond *last_cond = dyn_cast <gcond *> (last_stmt);
> > +  /* If the cast fails we have a different control flow statement in the latch. Most
> > +     commonly this is a switch. */
> > +  if (!last_cond)
> > +    return opt_result::failure_at (last_stmt,
> > +                         "can't safely apply code motion to dependencies"
> > +                         " to vectorize the early exit, unknown control fow"
> > +                         " in stmt %G", last_stmt);
> > +  workset.safe_push (gimple_cond_lhs (last_cond));
> > +  workset.safe_push (gimple_cond_rhs (last_cond));
> > +
> > +  imm_use_iterator imm_iter;
> > +  use_operand_p use_p;
> > +  tree lhs;
> > +  do
> > +    {
> > +      tree op = workset.pop ();
> > +      if (visited.add (op))
> > +        continue;
> > +      stmt_vec_info stmt_vinfo = loop_vinfo->lookup_def (op);
> > +
> > +      /* Not defined in loop, don't care. */
> > +      if (!stmt_vinfo)
> > +        continue;
> > +      gimple *stmt = STMT_VINFO_STMT (stmt_vinfo);
> > +      auto dr_ref = STMT_VINFO_DATA_REF (stmt_vinfo);
> > +      if (dr_ref)
> > +        {
> > +          opt_result res
> > +            = vect_analyze_early_break_dependences_1 (dr_ref, stmt);
> > +          if (!res)
> > +            return res;
> > +        }
> > +      else
> > +        FOR_EACH_IMM_USE_FAST (use_p, imm_iter, op)
> > +          if ((lhs = gimple_get_lhs (USE_STMT (use_p))))
> > +            workset.safe_push (lhs);
> > +    }
> > +  while (!workset.is_empty ());
> > +
> >    /* We don't allow outer -> inner loop transitions which should have been
> >       trapped already during loop form analysis. */
> >    gcc_assert (dest_bb->loop_father == loop);
> >
> >
> >
> >
> >
>
> --
> Richard Biener
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
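
For readers following the thread, the block walk suggested in the quoted review can be modelled in isolation. The sketch below is a toy C++ model, not GCC code: `Block`, `WalkResult`, `pick_dest` and `walk` are hypothetical names invented for illustration, and a block's "checks" are reduced to recording its name. It assumes, as the quoted diff does, that every block between the latch and the header has a single predecessor. The choice of destination for the peeled case mirrors the suggestion under discussion and is not a settled design.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a basic block; this is not GCC's basic_block.
struct Block {
  std::string name;
  Block *single_pred;  // nullptr marks the loop header in this toy model
};

// Records which blocks received which kind of check, in walk order.
struct WalkResult {
  std::vector<std::string> bounds_checked;
  std::vector<std::string> deps_checked;
};

// Destination for moved side-effects: the latch itself when the loop is
// peeled, otherwise the block just before the latch (assumption taken from
// the suggestion in the thread).
Block *pick_dest(Block *latch, bool peeled) {
  return peeled ? latch : latch->single_pred;
}

// Walk from the latch up to the header.  Every block gets the bounds check;
// the dependence checks only start with the blocks upstream of dest_bb.
WalkResult walk(Block *latch, Block *dest_bb) {
  WalkResult result;
  bool check_deps = false;
  for (Block *bb = latch; bb != nullptr; bb = bb->single_pred) {
    result.bounds_checked.push_back(bb->name);
    if (check_deps)
      result.deps_checked.push_back(bb->name);
    // Flip the flag only after dest_bb itself has been processed.
    if (bb == dest_bb)
      check_deps = true;
  }
  return result;
}
```

With a chain header <- cond <- iv <- latch, the non-peeled walk bounds-checks all four blocks and runs the dependence checks on cond and header only; in the peeled variant iv is included as well, because the destination becomes the latch.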