From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR04-VI1-obe.outbound.protection.outlook.com (mail-vi1eur04on2051.outbound.protection.outlook.com [40.107.8.51]) by sourceware.org (Postfix) with ESMTPS id E076C3858D20 for ; Mon, 22 Jan 2024 22:35:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E076C3858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E076C3858D20 Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=40.107.8.51 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1705962904; cv=pass; b=XoQesP3fM4RhmudDx+qxkSLaVjUfVjrNdtrCTn1qxQMgy6Ty9o6RX20dJkIPTK1qrXZp5BRT3mgdECOcL7nU9IVx8w90Qudf86jRLFXN3osJ2LZNeeCE0/n7b31AZrk2cDjSwSqXSkeVntZ+D+CDYqwyxyE62Ct6OWUmQ15WvPU= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1705962904; c=relaxed/simple; bh=VL+QcyKh7Phs4nQRBpEMAA3ueeKxtHab3UsjYVdxeq4=; h=DKIM-Signature:DKIM-Signature:Date:From:To:Subject:Message-ID: MIME-Version; b=SVBBjSBGFAsGUcSkmWJ4Py4hvEz7XThj0EfLHbuoMRaXS/XGibd1LvFuoEiNeJieRzvXchMNMv0JKaHWglzcEPGYrbEAJDONm6R7LIUQxAJlWQmACJvpBcmqinT5Nnay1If0nLAjntMVnzDlGKRnvlc0mkJOHh6GB/H1pJA7KUk= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=S/1bSJAdujARJ93p+V+Tz7KSAxHdGcm+yx84ru2sbEFGwMBBUKQUc2PyzoypvmNSjXoUtRMzSK5avTWnUz4pSZ5sSHGxgJKxo1jfg/OEAZb9/iaBfk4AT24qmMzW4INnB4zAETIvqhMIlfm+7B427FVfsf/7RKsm3YW2jiCz16Fy5wf/1x+WNNtR3HmeH8t6at/Tld076mFfFdpvHqdy0kaLKWmwpAM+9PJI2nrw2ym077Fh06FHn20j5c3yzep2+yaaZBMUOlLh0cki1EWJZbPsH3USIJSR58MKsxae8OZ9G8bwvWX890AIC4+k9BN5VNW8iW4BpkU4a4/s+fcyCA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=93sgRq9cq7AhE9hB1wOsBq21Fi3YduCboKboibNb+40=; b=J2qTMNvghy0lxIoKzlhqY7t7RfwqL/QMQaq1WPp/I/gghLBj9vWlbyX5N4zuNfazFvMUyuTkRLAZo1j90Kl9iKFn0iIWDXOtmJu1jE/rwORG2kZSFNxY5S9MByESWJZ3f6Pd6UaV8kMmtLROLrD8vX2L4hHn0+OD+afbXtB29G8eCx4SXIfNXewHWQewxZDoRzZS2GztwjpZT3f1GvOCWIr5GB6IqB460BVTw9y5fjS2I+G7PYqfqkFxzmOjidY1S8Hu2t3Nv2BqmrgyIWYLJAL+jGFO2CXOAwmXLiRqi23ItEaLFtVjpcrObltmyjpExmhqf75fdaQmdIr/+wJyTQ== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=93sgRq9cq7AhE9hB1wOsBq21Fi3YduCboKboibNb+40=; b=PrrG5C8yMy7HMSGkmQMkjsOKqe1MlZ3oeLc5HuouIDn9K6qPyc8itOCWgx6io6ETLuljMnhyxTgRIzj9XZBW7HcQmG7q+i9N+Te2eE305PbJ04zrR5/ipblVw7qAeONwSnwOIuwTeArx9XFi+fWepg+6pHtXNQxkfoXcYwseup4= Received: from DUZPR01CA0072.eurprd01.prod.exchangelabs.com (2603:10a6:10:3c2::10) by PA4PR08MB5935.eurprd08.prod.outlook.com (2603:10a6:102:e4::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7202.36; Mon, 22 Jan 2024 22:34:54 +0000 Received: from DB5PEPF00014B90.eurprd02.prod.outlook.com (2603:10a6:10:3c2:cafe::e3) by DUZPR01CA0072.outlook.office365.com (2603:10a6:10:3c2::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7202.32 via Frontend Transport; Mon, 22 Jan 2024 22:34:54 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DB5PEPF00014B90.mail.protection.outlook.com (10.167.8.228) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7202.16 via Frontend Transport; Mon, 22 Jan 2024 22:34:54 +0000 Received: ("Tessian outbound 1076c872ecc6:v228"); Mon, 22 Jan 2024 22:34:54 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 854582e72e231b4e X-CR-MTA-TID: 64aa7808 Received: from 94563493d78b.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 97934E02-B7E9-4D9B-99AB-C564A82DC029.1; Mon, 22 Jan 2024 22:34:43 +0000 Received: from EUR05-DB8-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 94563493d78b.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Mon, 22 Jan 2024 22:34:43 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=AKoOwL6p2brOyBOv7L9kTH4zmSkPS0TiMNqWgJzn3YzaJHMkZD5l23Nl2RP9yEEfv/sURMyumD2Y9wwWF1N2LSbcq4WT7YCpn3hRprxIoZg49Y/kPGSEZG2XfMaGl5qcmtxTf5GfnhT1f/DK/SJm83rRgoO8x0MPiw/i32+yKDe7Aig4Ah6eU3OeGmm8rHMhVVebeFYi5tkfMNBctrCPgHSzVg9PExcf5HihzSf8pSfO6mFKR0YGr5XKJ0AWE7JthThxtv9FtHd2iYAQmxyvSQuavvHXPDaDQwgkX48kXFfIQX/DibGKvlyJYOrtkmeVq6Kj2ZH+BqPg/yK6sSzHdQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=93sgRq9cq7AhE9hB1wOsBq21Fi3YduCboKboibNb+40=; b=SY6i5gD0nPTqqxkRy/N8CiV8FZtrgyC8UFo8LNVj/mfnCN09v7tzyzRVcLz7Ho4PmvfeAVaCH8dRDJNxrojK0RwvDLQHZVtgklzH1FNpcHB21W0nLYEet+8JFgIU++uI+uRxoMFerAgiruDb6bXN19oRYUG55GhvLX2GYUDFR1qBd5/WqD9v1A4R+8wQFoophUIJnjtfqRS/QDaQJOCnkdGsVLdt9F1C9/lexLSt9wP5ioBZumLuIvMOQORyaHvg717vCd98lSq8mUtTdbLTllAT/5CNGpFJRXmHseQFzyrUw89DRa9EQj8QalCa30kjQE4WLa5a6RVzsHqTs+nXYA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=93sgRq9cq7AhE9hB1wOsBq21Fi3YduCboKboibNb+40=; b=PrrG5C8yMy7HMSGkmQMkjsOKqe1MlZ3oeLc5HuouIDn9K6qPyc8itOCWgx6io6ETLuljMnhyxTgRIzj9XZBW7HcQmG7q+i9N+Te2eE305PbJ04zrR5/ipblVw7qAeONwSnwOIuwTeArx9XFi+fWepg+6pHtXNQxkfoXcYwseup4= Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from PAWPR08MB8958.eurprd08.prod.outlook.com (2603:10a6:102:33e::15) by PR3PR08MB5739.eurprd08.prod.outlook.com (2603:10a6:102:8e::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7202.30; Mon, 22 Jan 2024 22:34:41 +0000 Received: from PAWPR08MB8958.eurprd08.prod.outlook.com ([fe80::48ca:fbcb:84bf:ed17]) by PAWPR08MB8958.eurprd08.prod.outlook.com ([fe80::48ca:fbcb:84bf:ed17%4]) with mapi id 15.20.7202.035; Mon, 22 Jan 2024 22:34:40 +0000 Date: Mon, 22 Jan 2024 22:34:36 +0000 From: Alex Coplan To: gcc-patches@gcc.gnu.org, Kyrylo Tkachov , Richard Earnshaw , richard.sandiford@arm.com Subject: Re: [PATCH 3/3] aarch64: Fix up debug uses in ldp/stp pass [PR113089] Message-ID: References: Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: X-ClientProxiedBy: LO4P265CA0141.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:2c4::11) To PAWPR08MB8958.eurprd08.prod.outlook.com (2603:10a6:102:33e::15) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: PAWPR08MB8958:EE_|PR3PR08MB5739:EE_|DB5PEPF00014B90:EE_|PA4PR08MB5935:EE_ X-MS-Office365-Filtering-Correlation-Id: 54e8522d-2af7-423b-6b00-08dc1b9a5afd x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: CxdkFeyrMupTIk4m0cM8DJwwLGNd6fV8loD1JgQUry9U64JZjm0FV69vclHglaO9fadtOSVRoygtFku0DgMYhcd1oNMHCyleJOXOuO2QlA4WnSHS3JmIIRgUfWX76d00rnSOfckUitTsmONHHVdvwAsPFhkdHKX8Jj8xzzQ/R74y+pWjdsDJBT+ChoLn6dNbWjn2rTW/xW/yog/ute8+o+EnLUFhZgo/khd671jE4aiOoObonwJvqOZnUvvClsovyol3nWpIBLp1Co6TATI5NtFslY0rve9fL1psvCdfec1mEFYFEy1Ej+MSSrZnSWsdeCQiS2yH2Y27oL29RqcMcBSTpGBfWi1qv/ssBfLHWrLoAR4L2ZNCSanEaF0TRzQB0+GBTv2DI+1qi3qPQeeftJbfSry1ZUInGXtzvxkWpWAALD62y9QkK+sfYHUHR4RKvnZySLPQchlEaLMgc3CgE/TLRdEDyrd81mV520+U4V6jQJCLS8+hSCJ+se/hIWVgPFYMEqZPdMn8uZhg0rK598taZe7OtDx0g3Qp7Y+oeRE= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:PAWPR08MB8958.eurprd08.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230031)(346002)(39860400002)(136003)(376002)(396003)(366004)(230922051799003)(451199024)(186009)(1800799012)(64100799003)(5660300002)(8936002)(44832011)(2906002)(8676002)(66556008)(966005)(316002)(6486002)(66946007)(6636002)(110136005)(30864003)(66476007)(86362001)(36756003)(6512007)(38100700002)(53546011)(6506007)(478600001)(2616005)(6666004)(83380400001)(26005)(41300700001);DIR:OUT;SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: PR3PR08MB5739 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DB5PEPF00014B90.eurprd02.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: 24cda1b6-2456-4fdb-f6de-08dc1b9a52be X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 9wx8xwnMgPVG5fI90iFmMSspQWC6JAPUdayNXv3VURXhs7/ILLaGSgcayo3QrQSllJPXJzWXg1ZTsnbhCfdx+rJOmDntAD0KyRjIkEVTQ0yJPMLaSW9X+j3o0jNefyvd1lTSUrh+1fvpRMIG7sMcqCbMgMtu3r2Gq9RoP16xi5n1Jo+hQ19Fomfg56oRsj22kLo7JLj52j3xxV62oOO92JhPrxN6Zv6QdxAZWkJ5K/T+iTIFVhpXM3lid+PqtDSbazJXFUrfT5eJSlbbrU5rME5V7OSQQWDX9FbKv9TPKsfd8TmIMhDSlnZ1gTW+FMCBFCUnRpdVdVD2D+gkS72dOG7nVEnbYtg8RumuWsvfgimiVsbZilUZHiKoPGmq1RgS3g0wPRnAuHzaClHvYQwGTLGBc9GzkMoGyhjkKktPQIQIw/80AikyX7QmLBRC/ox7BJDuspB5rHrFO+GZmts28Zq3CVAiJw4VLYn66MkZn5gMh4XDt4lWkyYqSC6/ndpwvYF5dejGq4id3LpYRm/KRR+IDoYTEz2/2dVyEkRL1CzmzYR52cGgUqO/nipA7HNpZ+K1QUE3jvCD26pgBBEa+VAbiqb39l98/j9BS2m7Sih3WLADRI7uyuS9DArhkU2EEUBFMvHzKYSm0uhpkY2G9pAzTz2rF+4iBRK7pDzWu9hJxtZ61p+MmM1LrnvWyt5F1S3zuqG4boObd92NSai84W4Ey0xb45jJKQmJLxKnOXs= X-Forefront-Antispam-Report: CIP:63.35.35.123;CTRY:IE;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:64aa7808-outbound-1.mta.getcheckrecipient.com;PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com;CAT:NONE;SFS:(13230031)(4636009)(136003)(376002)(39860400002)(396003)(346002)(230922051799003)(1800799012)(186009)(451199024)(64100799003)(82310400011)(36840700001)(46966006)(40470700004)(83380400001)(36860700001)(47076005)(336012)(2616005)(26005)(356005)(82740400003)(81166007)(41300700001)(70586007)(70206006)(316002)(6636002)(5660300002)(44832011)(2906002)(30864003)(8676002)(110136005)(8936002)(6512007)(6506007)(6666004)(53546011)(478600001)(6486002)(966005)(36756003)(40480700001)(40460700003)(86362001);DIR:OUT;SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Jan 2024 22:34:54.5238 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 54e8522d-2af7-423b-6b00-08dc1b9a5afd X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d;Ip=[63.35.35.123];Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DB5PEPF00014B90.eurprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PA4PR08MB5935 X-Spam-Status: No, score=-12.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,FORGED_SPF_HELO,GIT_PATCH_0,KAM_DMARC_NONE,KAM_SHORT,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE,UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 22/01/2024 17:09, Richard Sandiford wrote: > Sorry for the earlier review comment about debug insns. I hadn't > looked far enough into the queue to see this patch. > > Alex Coplan writes: > > As the PR shows, we were missing code to update debug uses in the > > load/store pair fusion pass. This patch fixes that. > > > > Note that this patch depends on the following patch to create new uses > > in RTL-SSA, submitted as part of the fixes for PR113070: > > https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642919.html > > > > The patch tries to give a complete treatment of the debug uses that will > > be affected by the changes we make, and in particular makes an effort to > > preserve debug info where possible, e.g. when re-ordering an update of > > a base register by a constant over a debug use of that register. When > > re-ordering loads over a debug use of a transfer register, we reset the > > debug insn. Likewise when re-ordering stores over debug uses of mem. > > > > While doing this I noticed that try_promote_writeback used a strange > > choice of move_range for the pair insn, in that it chose the previous > > nondebug insn instead of the insn itself. Since the insn is being > > changed, these move ranges are equivalent (at least in terms of nondebug > > insn placement as far as RTL-SSA is concerned), but I think it is more > > natural to choose the pair insn itself. This is needed to avoid > > incorrectly updating some debug uses. > > > > Notes on testing: > > - The series was bootstrapped/regtested on top of the fixes for > > PR113070 and PR113356. It seemed to make more sense to test with > > correct use/def info, and as mentioned above, this patch depends on > > one of the PR113070 patches. > > - I also ran the testsuite with -g -funroll-loops -mearly-ldp-fusion > > -mlate-ldp-fusion to try and flush out more issues, and worked > > through some examples where writeback updates were triggered to > > make sure it was doing the right thing. > > - The patches also survived an LTO+PGO bootstrap with > > --enable-languages=all (with the passes enabled). > > > > Bootstrapped/regtested as a series on aarch64-linux-gnu (with/without > > the pass enabled). OK for trunk? > > > > Thanks, > > Alex > > > > gcc/ChangeLog: > > > > PR target/113089 > > * config/aarch64/aarch64-ldp-fusion.cc (reset_debug_use): New. > > (fixup_debug_use): New. > > (fixup_debug_uses_trailing_add): New. > > (fixup_debug_uses): New. Use it ... > > (ldp_bb_info::fuse_pair): ... here. > > (try_promote_writeback): Call fixup_debug_uses_trailing_add to > > fix up debug uses of the base register that are affected by > > folding in the trailing add insn. > > > > gcc/testsuite/ChangeLog: > > > > PR target/113089 > > * gcc.c-torture/compile/pr113089.c: New test. > > --- > > gcc/config/aarch64/aarch64-ldp-fusion.cc | 332 +++++++++++++++++- > > .../gcc.c-torture/compile/pr113089.c | 26 ++ > > 2 files changed, 351 insertions(+), 7 deletions(-) > > create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr113089.c > > > > diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc b/gcc/config/aarch64/aarch64-ldp-fusion.cc > > index 4d7fd72c6b1..fd0278e7acf 100644 > > --- a/gcc/config/aarch64/aarch64-ldp-fusion.cc > > +++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc > > @@ -1342,6 +1342,309 @@ ldp_bb_info::track_tombstone (int uid) > > gcc_unreachable (); // Bit should have changed. > > } > > > > +// Reset the debug insn containing USE (the debug insn has been > > +// optimized away). > > +static void > > +reset_debug_use (use_info *use) > > +{ > > + auto use_insn = use->insn (); > > + auto use_rtl = use_insn->rtl (); > > + insn_change change (use_insn); > > + change.new_uses = {}; > > + INSN_VAR_LOCATION_LOC (use_rtl) = gen_rtx_UNKNOWN_VAR_LOC (); > > + crtl->ssa->change_insn (change); > > +} > > + > > +// USE is a debug use that needs updating because DEF (a def of the same > > +// register) is being re-ordered over it. If BASE is non-null, then DEF > > +// is an update of the register BASE by a constant, given by WB_OFFSET, > > +// and we can preserve debug info by accounting for the change in side > > +// effects. > > +static void > > +fixup_debug_use (obstack_watermark &attempt, > > + use_info *use, > > + def_info *def, > > + rtx base, > > + poly_int64 wb_offset) > > +{ > > + auto use_insn = use->insn (); > > + if (base) > > + { > > + auto use_rtl = use_insn->rtl (); > > + insn_change change (use_insn); > > + > > + gcc_checking_assert (REG_P (base) && use->regno () == REGNO (base)); > > + change.new_uses = check_remove_regno_access (attempt, > > + change.new_uses, > > + use->regno ()); > > + > > + // The effect of the writeback is to add WB_OFFSET to BASE. If > > + // we're re-ordering DEF below USE, then we update USE by adding > > + // WB_OFFSET to it. Otherwise, if we're re-ordering DEF above > > + // USE, we update USE by undoing the effect of the writeback > > + // (subtracting WB_OFFSET). > > + use_info *new_use; > > + if (*def->insn () > *use_insn) > > + { > > + // We now need USE_INSN to consume DEF. Create a new use of DEF. > > + // > > + // N.B. this means until we call change_insns for the main change > > + // group we will temporarily have a debug use consuming a def that > > + // comes after it, but RTL-SSA doesn't currently support updating > > + // debug insns as part of the main change group (together with > > + // nondebug changes), so we will have to live with this update > > + // leaving the IR being temporarily inconsistent. It seems to > > + // work out OK once the main change group is applied. > > + wb_offset *= -1; > > + new_use = crtl->ssa->create_use (attempt, > > + use_insn, > > + as_a (def)); > > + } > > + else > > + new_use = find_access (def->insn ()->uses (), use->regno ()); > > + > > + change.new_uses = insert_access (attempt, new_use, change.new_uses); > > + > > + if (dump_file) > > + { > > + const char *dir = (*def->insn () < *use_insn) ? "down" : "up"; > > + pretty_printer pp; > > + pp_string (&pp, "["); > > + pp_access (&pp, use, 0); > > + pp_string (&pp, "]"); > > + pp_string (&pp, " due to wb def "); > > + pp_string (&pp, "["); > > + pp_access (&pp, def, 0); > > + pp_string (&pp, "]"); > > + fprintf (dump_file, > > + " i%d: fix up debug use %s re-ordered %s, " > > + "sub r%u -> r%u + ", > > + use_insn->uid (), pp_formatted_text (&pp), > > + dir, REGNO (base), REGNO (base)); > > + print_dec (wb_offset, dump_file); > > + fprintf (dump_file, "\n"); > > + } > > + > > + insn_propagation prop (use_rtl, base, > > + plus_constant (GET_MODE (base), base, wb_offset)); > > + if (prop.apply_to_pattern (&INSN_VAR_LOCATION_LOC (use_rtl))) > > + crtl->ssa->change_insn (change); > > + else > > + { > > + if (dump_file) > > + fprintf (dump_file, " i%d: RTL substitution failed (%s)" > > + ", resetting debug insn", use_insn->uid (), > > + prop.failure_reason); > > + reset_debug_use (use); > > + } > > + } > > + else > > + { > > + if (dump_file) > > + { > > + pretty_printer pp; > > + pp_string (&pp, "["); > > + pp_access (&pp, use, 0); > > + pp_string (&pp, "] due to re-ordered load def ["); > > + pp_access (&pp, def, 0); > > + pp_string (&pp, "]"); > > + fprintf (dump_file, " i%d: resetting debug use %s\n", > > + use_insn->uid (), pp_formatted_text (&pp)); > > + } > > + reset_debug_use (use); > > + } > > +} > > + > > +// Update debug uses when folding in a trailing add insn to form a > > +// writeback pair. > > +// > > +// ATTEMPT is used to allocate RTL-SSA temporaries for the changes, > > +// the final pair is placed immediately after PAIR_DST, TRAILING_ADD > > +// is a trailing add insn which is being folded into the pair to make it > > +// use writeback addressing, and WRITEBACK_EFFECT is the pattern for > > +// TRAILING_ADD. > > +static void > > +fixup_debug_uses_trailing_add (obstack_watermark &attempt, > > + insn_info *pair_dst, > > + insn_info *trailing_add, > > + rtx writeback_effect) > > +{ > > + rtx base = SET_DEST (writeback_effect); > > + > > + poly_int64 wb_offset; > > + rtx base2 = strip_offset (SET_SRC (writeback_effect), &wb_offset); > > + gcc_checking_assert (rtx_equal_p (base, base2)); > > + > > + auto defs = trailing_add->defs (); > > + gcc_checking_assert (defs.size () == 1); > > + def_info *def = defs[0]; > > + > > + if (auto set = safe_dyn_cast (def->prev_def ())) > > + for (auto use : set->debug_insn_uses ()) > > + if (*use->insn () > *pair_dst) > > + // DEF is getting re-ordered above USE, fix up USE accordingly. > > + fixup_debug_use (attempt, use, def, base, wb_offset); > > +} > > + > > +// Called from fuse_pair, fixes up any debug uses that will be affected > > +// by the changes. > > +// > > +// ATTEMPT is the obstack watermark used to allocate RTL-SSA temporaries for > > +// the changes, INSNS gives the candidate insns: at this point the use/def > > +// information should still be as on entry to fuse_pair, but the patterns may > > +// have changed, hence we pass ORIG_RTL which contains the original patterns > > +// for the candidate insns. > > +// > > +// The final pair will be placed immediately after PAIR_DST, LOAD_P is true if > > +// it is a load pair, bit I of WRITEBACK is set if INSNS[I] originally had > > +// writeback, and WRITEBACK_EFFECT is an rtx describing the overall update to > > +// the base register in the final pair (if any). BASE_REGNO gives the register > > +// number of the base register used in the final pair. > > +static void > > +fixup_debug_uses (obstack_watermark &attempt, > > + insn_info *insns[2], > > + rtx orig_rtl[2], > > + insn_info *pair_dst, > > + insn_info *trailing_add, > > + bool load_p, > > + int writeback, > > + rtx writeback_effect, > > + unsigned base_regno) > > +{ > > + // USE is a debug use that needs updating because DEF (a def of the > > + // resource) is being re-ordered over it. If WRITEBACK_PAT is non-NULL, > > + // then it gives the original RTL pattern for DEF's insn, and DEF is a > > + // writeback update of the base register. > > + // > > + // This simply unpacks WRITEBACK_PAT if needed and calls fixup_debug_use. > > + auto update_debug_use = [&](use_info *use, def_info *def, > > + rtx writeback_pat) > > + { > > + poly_int64 offset = 0; > > + rtx base = NULL_RTX; > > + if (writeback_pat) > > + { > > + rtx mem = XEXP (writeback_pat, load_p); > > + gcc_checking_assert (GET_RTX_CLASS (GET_CODE (XEXP (mem, 0))) > > + == RTX_AUTOINC); > > + > > + base = ldp_strip_offset (mem, &offset); > > + gcc_checking_assert (REG_P (base) && REGNO (base) == base_regno); > > + } > > + fixup_debug_use (attempt, use, def, base, offset); > > + }; > > + > > + // Reset any debug uses of mem over which we re-ordered a store. > > + // > > + // It would be nice to try and preserve debug info here, but it seems that > > + // would require doing alias analysis to see if the store aliases with the > > + // debug use, which seems a little extravagant just to preserve debug info. > > + if (!load_p) > > + { > > + auto def = memory_access (insns[0]->defs ()); > > + auto last_def = memory_access (insns[1]->defs ()); > > + for (; def != last_def; def = def->next_def ()) > > + if (auto set = dyn_cast (def)) > > Similarly to the review of the other patch, I think we should use > an unconditional as_a here. Sounds good, I'll make that change. > > > + for (auto use : set->debug_insn_uses ()) > > + { > > + if (dump_file) > > + fprintf (dump_file, " i%d: resetting debug use of mem\n", > > + use->insn ()->uid ()); > > + reset_debug_use (use); > > + } > > + } > > + > > + // Now let's take care of register uses, starting with debug uses > > + // attached to defs from our first insn. > > + for (auto def : insns[0]->defs ()) > > + { > > + auto set = dyn_cast (def); > > + if (!set || set->is_mem () || !set->first_debug_insn_use ()) > > + continue; > > + > > + def_info *defs[2] = { > > + def, > > + find_access (insns[1]->defs (), def->regno ()) > > + }; > > + > > + rtx writeback_pats[2] = {}; > > + if (def->regno () == base_regno) > > + for (int i = 0; i < 2; i++) > > + if (defs[i] && (writeback & (1 << i))) > > Are both checks needed? If so, what does it mean for the writeback > bit to be set but defs[i] to be null? Could we assert the defs[i] > part instead? Yeah, we could assert the defs[i] bit, (writeback & (1 << i)) should imply defs[i] is non-null. I'll make that change, thanks. > > > + writeback_pats[i] = orig_rtl[i]; > > + > > + // Now that we've characterized the defs involved, go through the > > + // debug uses and determine how to update them (if needed). > > + for (auto use : set->debug_insn_uses ()) > > + { > > + if (*pair_dst < *use->insn () && defs[1]) > > + // We're re-ordering defs[1] above a previous use of the > > + // same resource. > > + update_debug_use (use, defs[1], writeback_pats[1]); > > + else if (*pair_dst >= *use->insn ()) > > + // We're re-ordering defs[0] below its use. > > + update_debug_use (use, defs[0], writeback_pats[0]); > > + } > > + } > > + > > + // Now let's look at registers which are def'd by the second insn > > + // but not by the first insn, there may still be debug uses of a > > + // previous def which can be affected by moving the second insn up. > > + for (auto def : insns[1]->defs ()) > > + { > > + // This should be M log N where N is the number of defs in > > + // insns[0] and M is the number of defs in insns[1]. > > + if (def->is_mem () || find_access (insns[0]->defs (), def->regno ())) > > + continue; > > + > > + auto prev_set = safe_dyn_cast (def->prev_def ()); > > + if (!prev_set) > > + continue; > > + > > + rtx writeback_pat = NULL_RTX; > > + if (def->regno () == base_regno && (writeback & 2)) > > + writeback_pat = orig_rtl[1]; > > + > > + // We have a def in insns[1] which isn't def'd by the first insn. > > + // Look to the previous def and see if it has any debug uses. > > + for (auto use : prev_set->debug_insn_uses ()) > > + if (*pair_dst < *use->insn ()) > > + // We're ordering DEF above a previous use of the same register. > > + update_debug_use (use, def, writeback_pat); > > This might be more efficient as a reverse walk that breaks as soon as > *pair_dst >= *use->insn (), since the previous def could be arbitrarily > far away and have arbitrarily many leading uses. But it probably doesn't > matter much in practice. Agreed, provided it's easy to get at the last debug use in constant time (if so I guess 1/3 might need updating to add a last_debug_insn_use accessor). I'll look into that tomorrow. > > > + } > > + > > + if ((writeback & 2) && !writeback_effect) > > + { > > + // If the second insn initially had writeback but the final > > + // pair does not, then there may be trailing debug uses of the > > + // second writeback def which need re-parenting: do that. > > + auto def = find_access (insns[1]->defs (), base_regno); > > + gcc_assert (def); > > + for (auto use : as_a (def)->debug_insn_uses ()) > > + { > > + insn_change change (use->insn ()); > > + change.new_uses = check_remove_regno_access (attempt, > > + change.new_uses, > > + base_regno); > > + auto new_use = find_access (insns[0]->uses (), base_regno); > > + > > + // N.B. insns must have already shared a common base due to writeback. > > + gcc_assert (new_use); > > + > > + if (dump_file) > > + fprintf (dump_file, > > + " i%d: cancelling wb, re-parenting trailing debug use\n", > > + use->insn ()->uid ()); > > + > > + change.new_uses = insert_access (attempt, new_use, change.new_uses); > > + crtl->ssa->change_insn (change); > > + } > > + } > > + else if (trailing_add) > > + fixup_debug_uses_trailing_add (attempt, pair_dst, trailing_add, > > + writeback_effect); > > +} > > + > > // Try and actually fuse the pair given by insns I1 and I2. > > // > > // Here we've done enough analysis to know this is safe, we only > > @@ -1378,6 +1681,9 @@ ldp_bb_info::fuse_pair (bool load_p, > > insn_info *first = (*i1 < *i2) ? i1 : i2; > > insn_info *second = (first == i1) ? i2 : i1; > > > > + insn_info *pair_dst = move_range.singleton (); > > + gcc_assert (pair_dst); > > + > > insn_info *insns[2] = { first, second }; > > > > auto_vec changes; > > @@ -1388,6 +1694,13 @@ ldp_bb_info::fuse_pair (bool load_p, > > PATTERN (second->rtl ()) > > }; > > > > + // Make copies of the patterns as we might need to refer to the original RTL > > + // later, for example when updating debug uses (which is after we've updated > > + // one or both of the patterns in the candidate insns). > > + rtx orig_rtl[2]; > > + for (int i = 0; i < 2; i++) > > + orig_rtl[i] = copy_rtx (pats[i]); > > + > > FWIW, an alternative (that avoids RTL allocation) would be to use > temporarily_undo_changes and redo_changes. But this is fine too. Presumably using temporarily_undo_changes would require some bookkeping around the change numbers that might end up making things more complicated? I thought about making the copies conditional on MAY_HAVE_DEBUG_INSNS as that at least saves copying in the common case with no debug insns (and perhaps value-initializing orig_rtl to avoid it being used uninitialized by potential future patches). WDYT about that? > > The patch is OK from my POV with the as_a change above, but I'd be > interested in the answer to the writeback question above. Great, thanks a lot for the reviews! Alex > > Thanks, > Richard > > > use_array input_uses[2] = { first->uses (), second->uses () }; > > def_array input_defs[2] = { first->defs (), second->defs () }; > > > > @@ -1604,9 +1917,7 @@ ldp_bb_info::fuse_pair (bool load_p, > > using Action = stp_change_builder::action; > > insn_info *store_to_change = try_repurpose_store (first, second, > > move_range); > > - insn_info *stp_dest = move_range.singleton (); > > - gcc_assert (stp_dest); > > - stp_change_builder builder (insns, store_to_change, stp_dest); > > + stp_change_builder builder (insns, store_to_change, pair_dst); > > insn_change *change; > > set_info *new_set = nullptr; > > for (; !builder.done (); builder.advance ()) > > @@ -1677,7 +1988,7 @@ ldp_bb_info::fuse_pair (bool load_p, > > fprintf (dump_file, > > " stp: changing i%d to use mem from new stp " > > "(after i%d)\n", > > - action.insn->uid (), stp_dest->uid ()); > > + action.insn->uid (), pair_dst->uid ()); > > change->new_uses = drop_memory_access (change->new_uses); > > gcc_assert (new_set); > > auto new_use = crtl->ssa->create_use (attempt, action.insn, > > @@ -1741,6 +2052,11 @@ ldp_bb_info::fuse_pair (bool load_p, > > > > gcc_assert (crtl->ssa->verify_insn_changes (changes)); > > > > + // Fix up any debug uses that will be affected by the changes. > > + if (MAY_HAVE_DEBUG_INSNS) > > + fixup_debug_uses (attempt, insns, orig_rtl, pair_dst, trailing_add, > > + load_p, writeback, writeback_effect, base_regno); > > + > > confirm_change_group (); > > crtl->ssa->change_insns (changes); > > > > @@ -2807,7 +3123,7 @@ try_promote_writeback (insn_info *insn) > > > > rtx wb_effect = NULL_RTX; > > def_info *add_def; > > - const insn_range_info pair_range (insn->prev_nondebug_insn ()); > > + const insn_range_info pair_range (insn); > > insn_info *insns[2] = { nullptr, insn }; > > insn_info *trailing_add = find_trailing_add (insns, pair_range, 0, &wb_effect, > > &add_def, base_def, offset, > > @@ -2830,14 +3146,16 @@ try_promote_writeback (insn_info *insn) > > pair_change.new_defs); > > gcc_assert (pair_change.new_defs.is_valid ()); > > > > - pair_change.move_range = insn_range_info (insn->prev_nondebug_insn ()); > > - > > auto is_changing = insn_is_changing (changes); > > for (unsigned i = 0; i < ARRAY_SIZE (changes); i++) > > gcc_assert (rtl_ssa::restrict_movement_ignoring (*changes[i], is_changing)); > > > > gcc_assert (rtl_ssa::recog_ignoring (attempt, pair_change, is_changing)); > > gcc_assert (crtl->ssa->verify_insn_changes (changes)); > > + > > + if (MAY_HAVE_DEBUG_INSNS) > > + fixup_debug_uses_trailing_add (attempt, insn, trailing_add, wb_effect); > > + > > confirm_change_group (); > > crtl->ssa->change_insns (changes); > > } > > diff --git a/gcc/testsuite/gcc.c-torture/compile/pr113089.c b/gcc/testsuite/gcc.c-torture/compile/pr113089.c > > new file mode 100644 > > index 00000000000..70c71f23f1c > > --- /dev/null > > +++ b/gcc/testsuite/gcc.c-torture/compile/pr113089.c > > @@ -0,0 +1,26 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-g -funroll-loops" } */ > > + > > +typedef unsigned short uint16; > > + > > +void intrapred_chroma_plane(uint16 ***mb_preds, int* max_imgpel_values, int crx, int cry, int px) { > > + for (int uv = 0; uv < 2; uv++) { > > + uint16 **mb_pred = mb_preds[uv + 1]; > > + uint16 **predU2 = &mb_pred[px - 2]; > > + uint16 *upPred = &mb_pred[px][px]; > > + int max_imgpel_value = max_imgpel_values[uv]; > > + > > + int ih = upPred[crx - 1]; > > + for (int i = 0; i < crx*3; ++i) > > + ih += upPred[crx*3]; > > + > > + int iv = (mb_pred[cry - 1][px+1]); > > + for (int i = 0; i < cry - 1; ++i) { > > + iv += (i + 1) * (*(mb_preds[uv][0]) - *(*predU2--)); > > + } > > + > > + for (int j = 0; j < cry; ++j) > > + for (int i = 0; i < crx; ++i) > > + mb_pred[j][i] = (uint16) (max_imgpel_value * ((i * ih + iv))); > > + } > > +}