From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 12 Feb 2024 17:11:33 +0000
From: Alex Coplan
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford, Kyrylo Tkachov, Richard Earnshaw
Subject: [PATCH][GCC 12] aarch64: Avoid out-of-range shrink-wrapped saves [PR111677]
Content-Type: multipart/mixed; boundary="KTv8mx9XwlUoZ+dU"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0

--KTv8mx9XwlUoZ+dU
Content-Type: text/plain; charset=utf-8; format=fixed
Content-Disposition: inline

This is a backport of the GCC 13 fix for PR111677 to the GCC 12 branch.
The only part of the patch that isn't a straight cherry-pick is due to
the TX iterator lacking TDmode for GCC 12, so this version adjusts
TX_V16QI accordingly.

Bootstrapped/regtested on aarch64-linux-gnu.  The only changes in the
testsuite I saw were in
gcc/testsuite/c-c++-common/hwasan/large-aligned-1.c, where the dg-output
"READ of size 4 [...]" check appears to be flaky on the GCC 12 branch
since libhwasan gained the short granule tag feature.  I've requested a
backport of the following patch (committed as
r13-100-g3771486daa1e904ceae6f3e135b28e58af33849f), which should fix
that (independent) issue for GCC 12:
https://gcc.gnu.org/pipermail/gcc-patches/2024-February/645278.html

OK for the GCC 12 branch?

Thanks,
Alex

-- >8 --

The PR shows us ICEing due to an unrecognizable TFmode save emitted by
aarch64_process_components.  The problem is that for T{I,F,D}mode we
conservatively require mems to be in range for x-register ldp/stp.  That
is because (at least for TImode) it can be allocated to both GPRs and
FPRs, and in the GPR case that is an x-reg ldp/stp, and the FPR case is
a q-register load/store.

As Richard pointed out in the PR, aarch64_get_separate_components
already checks that the offsets are suitable for a single load, so we
just need to choose a mode in aarch64_reg_save_mode that gives the full
q-register range.  In this patch, we choose V16QImode as an alternative
16-byte "bag-of-bits" mode that doesn't have the artificial range
restrictions imposed on T{I,F,D}mode.

Unlike for GCC 14 we need additional handling in the load/store pair
code as various cases are not expecting to see V16QImode (particularly
the writeback patterns, but also aarch64_gen_load_pair).

gcc/ChangeLog:

	PR target/111677
	* config/aarch64/aarch64.cc (aarch64_reg_save_mode): Use
	V16QImode for the full 16-byte FPR saves in the vector PCS case.
	(aarch64_gen_storewb_pair): Handle V16QImode.
	(aarch64_gen_loadwb_pair): Likewise.
	(aarch64_gen_load_pair): Likewise.
	* config/aarch64/aarch64.md (loadwb_pair<TX:mode>_<P:mode>):
	Rename to ...
	(loadwb_pair<TX_V16QI:mode>_<P:mode>): ... this, extending to
	V16QImode.
	(storewb_pair<TX:mode>_<P:mode>): Rename to ...
	(storewb_pair<TX_V16QI:mode>_<P:mode>): ... this, extending to
	V16QImode.
	* config/aarch64/iterators.md (TX_V16QI): New.

gcc/testsuite/ChangeLog:

	PR target/111677
	* gcc.target/aarch64/torture/pr111677.c: New test.

(cherry picked from commit 2bd8264a131ee1215d3bc6181722f9d30f5569c3)
---
 gcc/config/aarch64/aarch64.cc                 | 13 ++++++-
 gcc/config/aarch64/aarch64.md                 | 35 ++++++++++---------
 gcc/config/aarch64/iterators.md               |  3 ++
 .../gcc.target/aarch64/torture/pr111677.c     | 28 +++++++++++++++
 4 files changed, 61 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/torture/pr111677.c


--KTv8mx9XwlUoZ+dU
Content-Type: text/x-patch; charset=utf-8; name="0001-GCC-12-aarch64-Avoid-out-of-range-shrink-wrapped-sav.patch"
Content-Disposition: attachment; filename="0001-GCC-12-aarch64-Avoid-out-of-range-shrink-wrapped-sav.patch"
Content-Transfer-Encoding: 8bit

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 3bccd96a23d..2bbba323770 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -4135,7 +4135,7 @@ aarch64_reg_save_mode (unsigned int regno)
       case ARM_PCS_SIMD:
 	/* The vector PCS saves the low 128 bits (which is the full
 	   register on non-SVE targets).  */
-	return TFmode;
+	return V16QImode;
 
       case ARM_PCS_SVE:
 	/* Use vectors of DImode for registers that need frame
@@ -8602,6 +8602,10 @@ aarch64_gen_storewb_pair (machine_mode mode, rtx base, rtx reg, rtx reg2,
       return gen_storewb_pairtf_di (base, base, reg, reg2,
 				    GEN_INT (-adjustment),
 				    GEN_INT (UNITS_PER_VREG - adjustment));
+    case E_V16QImode:
+      return gen_storewb_pairv16qi_di (base, base, reg, reg2,
+				       GEN_INT (-adjustment),
+				       GEN_INT (UNITS_PER_VREG - adjustment));
     default:
       gcc_unreachable ();
     }
@@ -8647,6 +8651,10 @@ aarch64_gen_loadwb_pair (machine_mode mode, rtx base, rtx reg, rtx reg2,
     case E_TFmode:
       return gen_loadwb_pairtf_di (base, base, reg, reg2, GEN_INT (adjustment),
 				   GEN_INT (UNITS_PER_VREG));
+    case E_V16QImode:
+      return gen_loadwb_pairv16qi_di (base, base, reg, reg2,
+				      GEN_INT (adjustment),
+				      GEN_INT (UNITS_PER_VREG));
     default:
       gcc_unreachable ();
     }
@@ -8730,6 +8738,9 @@ aarch64_gen_load_pair (machine_mode mode, rtx reg1, rtx mem1, rtx reg2,
     case E_V4SImode:
       return gen_load_pairv4siv4si (reg1, mem1, reg2, mem2);
 
+    case E_V16QImode:
+      return gen_load_pairv16qiv16qi (reg1, mem1, reg2, mem2);
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index fb100bdf6b3..99f185718c9 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1874,17 +1874,18 @@ (define_insn "loadwb_pair<GPF:mode>_<P:mode>"
   [(set_attr "type" "neon_load1_2reg")]
 )
 
-(define_insn "loadwb_pair<TX:mode>_<P:mode>"
+(define_insn "loadwb_pair<TX_V16QI:mode>_<P:mode>"
   [(parallel
     [(set (match_operand:P 0 "register_operand" "=k")
-          (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
-     (set (match_operand:TX 2 "register_operand" "=w")
-          (mem:TX (match_dup 1)))
-     (set (match_operand:TX 3 "register_operand" "=w")
-          (mem:TX (plus:P (match_dup 1)
+	  (plus:P (match_operand:P 1 "register_operand" "0")
+		  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
+     (set (match_operand:TX_V16QI 2 "register_operand" "=w")
+	  (mem:TX_V16QI (match_dup 1)))
+     (set (match_operand:TX_V16QI 3 "register_operand" "=w")
+	  (mem:TX_V16QI (plus:P (match_dup 1)
 			  (match_operand:P 5 "const_int_operand" "n"))))])]
-  "TARGET_SIMD && INTVAL (operands[5]) == GET_MODE_SIZE (<TX:MODE>mode)"
+  "TARGET_SIMD
+   && known_eq (INTVAL (operands[5]), GET_MODE_SIZE (<TX_V16QI:MODE>mode))"
   "ldp\\t%q2, %q3, [%1], %4"
   [(set_attr "type" "neon_ldp_q")]
 )
@@ -1923,20 +1924,20 @@ (define_insn "storewb_pair<GPF:mode>_<P:mode>"
   [(set_attr "type" "neon_store1_2reg")]
 )
 
-(define_insn "storewb_pair<TX:mode>_<P:mode>"
+(define_insn "storewb_pair<TX_V16QI:mode>_<P:mode>"
   [(parallel
     [(set (match_operand:P 0 "register_operand" "=&k")
-          (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
-     (set (mem:TX (plus:P (match_dup 0)
+	  (plus:P (match_operand:P 1 "register_operand" "0")
+		  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
+     (set (mem:TX_V16QI (plus:P (match_dup 0)
 			  (match_dup 4)))
-          (match_operand:TX 2 "register_operand" "w"))
-     (set (mem:TX (plus:P (match_dup 0)
+	  (match_operand:TX_V16QI 2 "register_operand" "w"))
+     (set (mem:TX_V16QI (plus:P (match_dup 0)
 			  (match_operand:P 5 "const_int_operand" "n")))
-          (match_operand:TX 3 "register_operand" "w"))])]
+	  (match_operand:TX_V16QI 3 "register_operand" "w"))])]
   "TARGET_SIMD
-   && INTVAL (operands[5])
-      == INTVAL (operands[4]) + GET_MODE_SIZE (<TX:MODE>mode)"
+   && known_eq (INTVAL (operands[5]),
+		INTVAL (operands[4]) + GET_MODE_SIZE (<TX_V16QI:MODE>mode))"
   "stp\\t%q2, %q3, [%0, %4]!"
   [(set_attr "type" "neon_stp_q")]
 )
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 26a840d7fe9..d49e37893df 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -303,6 +303,9 @@ (define_mode_iterator VS [V2SI V4SI])
 
 (define_mode_iterator TX [TI TF])
 
+;; TX plus V16QImode.
+(define_mode_iterator TX_V16QI [TI TF V16QI])
+
 ;; Advanced SIMD opaque structure modes.
 (define_mode_iterator VSTRUCT [OI CI XI])
 
diff --git a/gcc/testsuite/gcc.target/aarch64/torture/pr111677.c b/gcc/testsuite/gcc.target/aarch64/torture/pr111677.c
new file mode 100644
index 00000000000..6bb640c42c0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/torture/pr111677.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target fopenmp } */
+/* { dg-options "-ffast-math -fstack-protector-strong -fopenmp" } */
+typedef struct {
+  long size_z;
+  int width;
+} dt_bilateral_t;
+typedef float dt_aligned_pixel_t[4];
+#pragma omp declare simd
+void dt_bilateral_splat(dt_bilateral_t *b) {
+  float *buf;
+  long offsets[8];
+  for (; b;) {
+    int firstrow;
+    for (int j = firstrow; j; j++)
+      for (int i; i < b->width; i++) {
+        dt_aligned_pixel_t contrib;
+        for (int k = 0; k < 4; k++)
+          buf[offsets[k]] += contrib[k];
+      }
+    float *dest;
+    for (int j = (long)b; j; j++) {
+      float *src = (float *)b->size_z;
+      for (int i = 0; i < (long)b; i++)
+        dest[i] += src[i];
+    }
+  }
+}

--KTv8mx9XwlUoZ+dU--