From mboxrd@z Thu Jan 1 00:00:00 1970
From: Indu Bhagat
To: binutils@sourceware.org
Cc: Indu Bhagat
Subject: [PATCH,RFC 8/9] gas: synthesize CFI for hand-written asm
Date: Wed, 20 Sep 2023 16:04:00 -0700
Message-ID: <20230920230401.1739139-9-indu.bhagat@oracle.com>
In-Reply-To: <20230920230401.1739139-1-indu.bhagat@oracle.com>
References: <20230920230401.1739139-1-indu.bhagat@oracle.com>

This patch adds support in GAS to generate generic GAS instructions
(a.k.a. ginsns) for the x86 backend (AMD64 ABI only at this time).
Using this ginsn infrastructure, GAS can also synthesize CFI for
hand-written asm for x86_64.

A ginsn is a target-independent representation of a machine
instruction.  One machine instruction may need one or more ginsns.

Since the current use-case of ginsn is to synthesize CFI, the x86 target
generates the ginsns necessary for the following machine instructions
only:

  - All change-of-flow instructions, including all conditional and
    unconditional branches, calls, and returns from functions.
  - All register saves to and restores from the stack.
  - All instructions affecting the two registers that could potentially
    be used as the base register for CFA tracking.
    For SCFI, the base register for CFA tracking is limited to REG_SP
    and REG_FP only for now.

The representation of a ginsn is kept simple:

  - A ginsn has GINSN_NUM_SRC_OPNDS (defined to be 2 at this time)
    source operands and one destination operand.
  - A ginsn uses DWARF register numbers in its representation and does
    not track register size.
  - A ginsn carries location information (file name and line number).
  - Each ginsn is identified by a natural number, assigned in the order
    of its addition to the list.  This ID can be used as a proxy for the
    static program order of the corresponding machine instructions.

Note that the ginsn format does not support GINSN_TYPE_PUSH and
GINSN_TYPE_POP.  Some architectures, like aarch64, do not have push and
pop instructions, but rather STP/LDP instructions.  Further, these
instructions have a variety of addressing modes, such as pre-indexing
and post-indexing.  Among other things, one of the differences between
these addressing modes is _when_ the base pointer is updated with the
result of the address calculation: before or after the memory operation.
To best support such needs, generic instructions like GINSN_TYPE_LDS and
GINSN_TYPE_STS (load from and store to stack), together with
GINSN_TYPE_ADD and GINSN_TYPE_SUB, may be used.

The functionality provided in ginsn.c and scfi.c is compiled in when a
target defines both TARGET_USE_SCFI and TARGET_USE_GINSN.  This can be
revisited later when there are other use-cases for creating ginsns in
GAS, apart from the current use-case of synthesizing CFI for
hand-written asm.

Support is added only for the AMD64 ABI at this time.
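
To illustrate the point above about not having PUSH/POP ginsn types,
here is a minimal, self-contained sketch of how a push or a pop could be
decomposed into a stack-pointer adjustment plus a store or load.  All
names in it (sketch_ginsn, sketch_expand_push, SK_*, and so on) are
hypothetical and exist only for this illustration; they are not the
types or functions defined in ginsn.h.  The register numbers 7 and 6 are
the DWARF numbers of %rsp and %rbp on x86_64, and the fixed 8-byte slot
size is an assumption matching the AMD64 ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the definitions in ginsn.h.  */
enum sketch_ginsn_type { SK_ADD, SK_SUB, SK_LOAD, SK_STORE };

#define SK_REG_FP 6		/* DWARF register number of %rbp.  */
#define SK_REG_SP 7		/* DWARF register number of %rsp.  */
#define SK_NO_REG UINT32_MAX	/* Operand is not a register.  */

struct sketch_ginsn
{
  enum sketch_ginsn_type type;
  uint32_t src_reg;		/* Source register, or SK_NO_REG.  */
  int64_t imm;			/* Immediate operand for ADD / SUB.  */
  uint32_t dst_reg;		/* Destination register, or SK_NO_REG.  */
};

/* push %reg: adjust the stack pointer first, then store to the stack.  */
size_t
sketch_expand_push (uint32_t reg, struct sketch_ginsn out[2])
{
  out[0] = (struct sketch_ginsn) { SK_SUB, SK_REG_SP, 8, SK_REG_SP };
  out[1] = (struct sketch_ginsn) { SK_STORE, reg, 0, SK_NO_REG };
  return 2;
}

/* pop %reg: load from the stack first, then adjust the stack pointer.  */
size_t
sketch_expand_pop (uint32_t reg, struct sketch_ginsn out[2])
{
  out[0] = (struct sketch_ginsn) { SK_LOAD, SK_NO_REG, 0, reg };
  out[1] = (struct sketch_ginsn) { SK_ADD, SK_REG_SP, 8, SK_REG_SP };
  return 2;
}

/* Net change to the stack pointer caused by N ginsns.  Only ADD / SUB
   of an immediate to the stack pointer itself are considered.  */
int64_t
sketch_sp_delta (const struct sketch_ginsn *g, size_t n)
{
  int64_t delta = 0;

  assert (g != NULL || n == 0);
  for (size_t k = 0; k < n; k++)
    {
      if (g[k].src_reg != SK_REG_SP || g[k].dst_reg != SK_REG_SP)
	continue;
      if (g[k].type == SK_ADD)
	delta += g[k].imm;
      else if (g[k].type == SK_SUB)
	delta -= g[k].imm;
    }
  return delta;
}
```

Because the order of the two ginsns is explicit, the same building
blocks could in principle describe a pre-indexed or a post-indexed
store on another architecture simply by swapping which one comes first.
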
If the user specifies --scfi --32, GAS issues an error:

  "Fatal error: Synthesizing CFI is not supported for this ABI"

For synthesizing (DWARF) CFI, the SCFI machinery requires the programmer
to adhere to some prerequisites for their asm:

  - A hand-written asm block must begin with a .type foo, @function
  - A hand-written asm block must end with a .size foo, .-foo

Further, the SCFI machinery employs some heuristics / rules.  These
heuristics imply certain restrictions on how the programmer writes the
hand-written asm.  For example:

  - The base register for CFA tracking may be either REG_SP or REG_FP.
  - If the base register for CFA tracking is REG_SP, the precise amount
    of stack usage (and hence, the value of REG_SP) must be known at all
    times.
  - If using dynamic stack allocation, the function must switch to
    FP-based CFA.  This means using instructions like the following (in
    AMD64) in the prologue:
       pushq   %rbp
       movq    %rsp, %rbp
    and analogous instructions in the epilogue.
  - Save and restore of callee-saved registers must be symmetrical.
    However, the SCFI machinery at this time only warns if any such
    asymmetry is seen.

These heuristics / rules are architecture-independent and are meant to
be employed for all architectures/ABIs using SCFI in the future.

gas/
	* Makefile.am: Add new files.
	* Makefile.in: Regenerated.
	* as.c (defined): Guard with both TARGET_USE_SCFI and
	TARGET_USE_GINSN.
	* config/obj-elf.c (obj_elf_size): Invoke ginsn_data_end.
	(obj_elf_type): Invoke ginsn_data_begin.
	* config/tc-i386.c (ginsn_new): New functionality to generate
	ginsns.
	(x86_scfi_callee_saved_p): New function.
	(ginsn_dw2_regnum): Likewise.
	(ginsn_set_where): Likewise.
	(x86_ginsn_alu): Likewise.
	(x86_ginsn_move): Likewise.
	(x86_ginsn_lea): Likewise.
	(x86_ginsn_jump): Likewise.
	(x86_ginsn_jump_cond): Likewise.
	(md_assemble): Invoke ginsn_new.
	(s_insn): Likewise.
	(i386_target_format): Add hard error for usage of --scfi with
	non-AMD64 ABIs.
	* config/tc-i386.h (TARGET_USE_GINSN): New definition.
(TARGET_USE_SCFI): Likewise. (SCFI_NUM_REGS): Likewise. (REG_FP): Likewise. (REG_SP): Likewise. (SCFI_INIT_CFA_OFFSET): Likewise. (SCFI_CALLEE_SAVED_REG_P): Likewise. (x86_scfi_callee_saved_p): Likewise. * subsegs.h (struct frch_ginsn_data): New forward declaration. (struct frchain): New member for ginsn data. * symbols.c: Invoke ginsn_frob_label to convey user-defined labels to ginsn infrastructure. * ginsn.c: New file. * ginsn.h: New file. * scfi.c: New file. * scfi.h: New file. --- gas/Makefile.am | 4 + gas/Makefile.in | 19 +- gas/as.c | 4 +- gas/config/obj-elf.c | 8 + gas/config/tc-i386.c | 646 ++++++++++++++++++++++++- gas/config/tc-i386.h | 21 + gas/ginsn.c | 985 ++++++++++++++++++++++++++++++++++++++ gas/ginsn.h | 347 ++++++++++++++ gas/scfi.c | 1090 ++++++++++++++++++++++++++++++++++++++++++ gas/scfi.h | 31 ++ gas/subsegs.h | 2 + gas/symbols.c | 3 + 12 files changed, 3151 insertions(+), 9 deletions(-) create mode 100644 gas/ginsn.c create mode 100644 gas/ginsn.h create mode 100644 gas/scfi.c create mode 100644 gas/scfi.h diff --git a/gas/Makefile.am b/gas/Makefile.am index e174305ca62..b477d74cb53 100644 --- a/gas/Makefile.am +++ b/gas/Makefile.am @@ -82,6 +82,7 @@ GAS_CFILES = \ flonum-mult.c \ frags.c \ gen-sframe.c \ + ginsn.c \ hash.c \ input-file.c \ input-scrub.c \ @@ -94,6 +95,7 @@ GAS_CFILES = \ remap.c \ sb.c \ scfidw2gen.c \ + scfi.c \ sframe-opt.c \ stabs.c \ subsegs.c \ @@ -119,6 +121,7 @@ HFILES = \ flonum.h \ frags.h \ gen-sframe.h \ + ginsn.h \ hash.h \ input-file.h \ itbl-lex.h \ @@ -130,6 +133,7 @@ HFILES = \ read.h \ sb.h \ scfidw2gen.h \ + scfi.h \ subsegs.h \ symbols.h \ tc.h \ diff --git a/gas/Makefile.in b/gas/Makefile.in index 87428bc46b8..99edb365a00 100644 --- a/gas/Makefile.in +++ b/gas/Makefile.in @@ -167,12 +167,13 @@ am__objects_1 = app.$(OBJEXT) as.$(OBJEXT) atof-generic.$(OBJEXT) \ ecoff.$(OBJEXT) ehopt.$(OBJEXT) expr.$(OBJEXT) \ flonum-copy.$(OBJEXT) flonum-konst.$(OBJEXT) \ flonum-mult.$(OBJEXT) frags.$(OBJEXT) 
gen-sframe.$(OBJEXT) \ - hash.$(OBJEXT) input-file.$(OBJEXT) input-scrub.$(OBJEXT) \ - listing.$(OBJEXT) literal.$(OBJEXT) macro.$(OBJEXT) \ - messages.$(OBJEXT) output-file.$(OBJEXT) read.$(OBJEXT) \ - remap.$(OBJEXT) sb.$(OBJEXT) scfidw2gen.$(OBJEXT) \ - sframe-opt.$(OBJEXT) stabs.$(OBJEXT) subsegs.$(OBJEXT) \ - symbols.$(OBJEXT) write.$(OBJEXT) + ginsn.$(OBJEXT) hash.$(OBJEXT) input-file.$(OBJEXT) \ + input-scrub.$(OBJEXT) listing.$(OBJEXT) literal.$(OBJEXT) \ + macro.$(OBJEXT) messages.$(OBJEXT) output-file.$(OBJEXT) \ + read.$(OBJEXT) remap.$(OBJEXT) sb.$(OBJEXT) \ + scfidw2gen.$(OBJEXT) scfi.$(OBJEXT) sframe-opt.$(OBJEXT) \ + stabs.$(OBJEXT) subsegs.$(OBJEXT) symbols.$(OBJEXT) \ + write.$(OBJEXT) am_as_new_OBJECTS = $(am__objects_1) am__dirstamp = $(am__leading_dot)dirstamp as_new_OBJECTS = $(am_as_new_OBJECTS) @@ -570,6 +571,7 @@ GAS_CFILES = \ flonum-mult.c \ frags.c \ gen-sframe.c \ + ginsn.c \ hash.c \ input-file.c \ input-scrub.c \ @@ -582,6 +584,7 @@ GAS_CFILES = \ remap.c \ sb.c \ scfidw2gen.c \ + scfi.c \ sframe-opt.c \ stabs.c \ subsegs.c \ @@ -606,6 +609,7 @@ HFILES = \ flonum.h \ frags.h \ gen-sframe.h \ + ginsn.h \ hash.h \ input-file.h \ itbl-lex.h \ @@ -617,6 +621,7 @@ HFILES = \ read.h \ sb.h \ scfidw2gen.h \ + scfi.h \ subsegs.h \ symbols.h \ tc.h \ @@ -1325,6 +1330,7 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/flonum-mult.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/frags.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/gen-sframe.Po@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginsn.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/hash.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/input-file.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/input-scrub.Po@am__quote@ @@ -1339,6 +1345,7 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/read.Po@am__quote@ @AMDEP_TRUE@@am__include@ 
@am__quote@./$(DEPDIR)/remap.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sb.Po@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/scfi.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/scfidw2gen.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sframe-opt.Po@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stabs.Po@am__quote@ diff --git a/gas/as.c b/gas/as.c index 523169e66e3..9d68b2c8d67 100644 --- a/gas/as.c +++ b/gas/as.c @@ -372,7 +372,7 @@ Options:\n\ -R fold data section into text section\n")); fprintf (stream, _("\ --reduce-memory-overheads ignored\n")); -# ifdef TARGET_USE_SCFI +# if defined (TARGET_USE_SCFI) && defined (TARGET_USE_GINSN) fprintf (stream, _("\ --scfi=[all,none] synthesize DWARF CFI for hand-written asm (not inline)\n\ (default --scfi=all)\n")); @@ -592,7 +592,7 @@ parse_args (int * pargc, char *** pargv) ,{"no-pad-sections", no_argument, NULL, OPTION_NO_PAD_SECTIONS} ,{"no-warn", no_argument, NULL, 'W'} ,{"reduce-memory-overheads", no_argument, NULL, OPTION_REDUCE_MEMORY_OVERHEADS} -#ifdef TARGET_USE_SCFI +# if defined (TARGET_USE_SCFI) && defined (TARGET_USE_GINSN) ,{"scfi", no_argument, NULL, OPTION_SCFI} #endif ,{"statistics", no_argument, NULL, OPTION_STATISTICS} diff --git a/gas/config/obj-elf.c b/gas/config/obj-elf.c index 681e75f9a48..e09f26b85f9 100644 --- a/gas/config/obj-elf.c +++ b/gas/config/obj-elf.c @@ -24,6 +24,7 @@ #include "subsegs.h" #include "obstack.h" #include "dwarf2dbg.h" +#include "scfi.h" #ifndef ECOFF_DEBUGGING #define ECOFF_DEBUGGING 0 @@ -2298,6 +2299,10 @@ obj_elf_size (int ignore ATTRIBUTE_UNUSED) symbol_get_obj (sym)->size = XNEW (expressionS); *symbol_get_obj (sym)->size = exp; } + + if (S_IS_FUNCTION (sym) && flag_synth_cfi) + ginsn_data_end (symbol_temp_new_now ()); + demand_empty_rest_of_line (); } @@ -2486,6 +2491,9 @@ obj_elf_type (int ignore ATTRIBUTE_UNUSED) elfsym->symbol.flags &= ~mask; } + if (S_IS_FUNCTION (sym) && flag_synth_cfi) + 
ginsn_data_begin (sym); + demand_empty_rest_of_line (); } diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index cec9a02be52..6f30ffac64c 100644 --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -30,6 +30,7 @@ #include "subsegs.h" #include "dwarf2dbg.h" #include "dw2gencfi.h" +#include "scfi.h" #include "gen-sframe.h" #include "sframe.h" #include "elf/x86-64.h" @@ -193,8 +194,11 @@ static unsigned int x86_isa_1_used; static unsigned int x86_feature_2_used; /* Generate x86 used ISA and feature properties. */ static unsigned int x86_used_note = DEFAULT_X86_USED_NOTE; + #endif +static ginsnS *ginsn_new (symbolS *sym, enum ginsn_gen_mode gmode); + static const char *default_arch = DEFAULT_ARCH; /* parse_register() returns this when a register alias cannot be used. */ @@ -5075,6 +5079,627 @@ static INLINE bool may_need_pass2 (const insn_template *t) && t->base_opcode == 0x63); } +bool +x86_scfi_callee_saved_p (uint32_t dw2reg_num) +{ + if (dw2reg_num == 3 /* rbx. */ + || dw2reg_num == REG_FP /* rbp. */ + || dw2reg_num == REG_SP /* rsp. */ + || (dw2reg_num >= 12 && dw2reg_num <= 15) /* r12 - r15. */) + return true; + + return false; +} + +static uint32_t +ginsn_dw2_regnum (const reg_entry *ireg) +{ + /* PS: Note the data type here as int32_t, because of Dw2Inval (-1). */ + int32_t dwarf_reg = Dw2Inval; + const reg_entry *temp; + + if (ireg->dw2_regnum[0] == Dw2Inval && ireg->dw2_regnum[1] == Dw2Inval) + return dwarf_reg; + + dwarf_reg = ireg->dw2_regnum[flag_code >> 1]; + if (dwarf_reg == Dw2Inval) + { + temp = ireg + 16; + dwarf_reg = ginsn_dw2_regnum (temp); + } + + if (dwarf_reg == Dw2Inval) + gas_assert (1); /* Needs to be addressed. 
*/ + + return (uint32_t) dwarf_reg; +} + +static void +ginsn_set_where (ginsnS* ginsn) +{ + const char *file; + unsigned int line; + file = as_where (&line); + ginsn_set_file_line (ginsn, file, line); +} + +static ginsnS * +x86_ginsn_alu (i386_insn insn, symbolS *insn_end_sym) +{ + offsetT src_imm; + uint32_t dw2_regnum; + ginsnS *ginsn = NULL; + + /* FIXME - create ginsn for REG_SP target only ? */ + /* Map for insn.tm.extension_opcode + 000 ADD 100 AND + 001 OR 101 SUB + 010 ADC 110 XOR + 011 SBB 111 CMP */ + + /* add/sub imm, %reg. + and imm, %reg only at this time for SCFI. */ + if (!(insn.tm.extension_opcode == 0 + || insn.tm.extension_opcode == 4 + || insn.tm.extension_opcode == 5)) + return ginsn; + + /* TBD_GINSN_REPRESENTATION_LIMIT: There is no representation for when a + symbol is used as an operand, like so: + addq $simd_cmp_op+8, %rdx + Skip generating any ginsn for this. */ + if (insn.imm_operands == 1 + && insn.op[0].imms->X_op == O_symbol) + return ginsn; + + gas_assert (insn.imm_operands == 1 + && insn.op[0].imms->X_op == O_constant); + src_imm = insn.op[0].imms->X_add_number; + dw2_regnum = ginsn_dw2_regnum (insn.op[1].regs); + /* For ginsn, keep the imm as second src operand. 
*/ + if (insn.tm.extension_opcode == 5) + ginsn = ginsn_new_sub (insn_end_sym, true, + GINSN_SRC_REG, dw2_regnum, + GINSN_SRC_IMM, src_imm, + GINSN_DST_REG, dw2_regnum); + else if (insn.tm.extension_opcode == 4) + ginsn = ginsn_new_and (insn_end_sym, true, + GINSN_SRC_REG, dw2_regnum, + GINSN_SRC_IMM, src_imm, + GINSN_DST_REG, dw2_regnum); + else if (insn.tm.extension_opcode == 0) + ginsn = ginsn_new_add (insn_end_sym, true, + GINSN_SRC_REG, dw2_regnum, + GINSN_SRC_IMM, src_imm, + GINSN_DST_REG, dw2_regnum); + + ginsn_set_where (ginsn); + + return ginsn; +} + +static ginsnS * +x86_ginsn_move (i386_insn insn, symbolS *insn_end_sym) +{ + ginsnS *ginsn; + uint16_t opcode; + uint32_t dst_reg; + uint32_t src_reg; + offsetT dst_disp; + offsetT src_disp; + const reg_entry *dst = NULL; + const reg_entry *src = NULL; + enum ginsn_dst_type dst_type; + enum ginsn_src_type src_type; + + opcode = insn.tm.base_opcode; + src_type = GINSN_SRC_REG; + src_disp = dst_disp = 0; + dst_type = GINSN_DST_REG; + + if (opcode == 0x8b) + { + /* mov disp(%reg), %reg. */ + if (insn.mem_operands && insn.base_reg) + { + src = insn.base_reg; + if (insn.disp_operands == 1) + src_disp = insn.op[0].disps->X_add_number; + src_type = GINSN_SRC_INDIRECT; + } + else + src = insn.op[0].regs; + + dst = insn.op[1].regs; + } + else if (opcode == 0x89 || opcode == 0x88) + { + /* mov %reg, disp(%reg). 
*/ + src = insn.op[0].regs; + if (insn.mem_operands && insn.base_reg) + { + dst = insn.base_reg; + if (insn.disp_operands == 1) + dst_disp = insn.op[1].disps->X_add_number; + dst_type = GINSN_DST_INDIRECT; + } + else + dst = insn.op[1].regs; + } + + src_reg = ginsn_dw2_regnum (src); + dst_reg = ginsn_dw2_regnum (dst); + + ginsn = ginsn_new_mov (insn_end_sym, true, + src_type, src_reg, src_disp, + dst_type, dst_reg, dst_disp); + ginsn_set_where (ginsn); + + return ginsn; +} + +static ginsnS * +x86_ginsn_lea (i386_insn insn, symbolS *insn_end_sym) +{ + offsetT src_disp = 0; + ginsnS *ginsn = NULL; + uint32_t base_reg; + uint32_t index_reg; + offsetT index_scale; + uint32_t dst_reg; + + if (!insn.index_reg && !insn.base_reg) + { + /* lea symbol, %rN. */ + dst_reg = ginsn_dw2_regnum (insn.op[1].regs); + /* FIXME - Skip encoding information about the symbol. + This is TBD_GINSN_INFO_LOSS, but it is fine if the mode is + GINSN_GEN_SCFI. */ + ginsn = ginsn_new_mov (insn_end_sym, false, + GINSN_SRC_IMM, 0xf /* arbitrary const. */, 0, + GINSN_DST_REG, dst_reg, 0); + } + else if (insn.base_reg && !insn.index_reg) + { + /* lea -0x2(%base),%dst. */ + base_reg = ginsn_dw2_regnum (insn.base_reg); + dst_reg = ginsn_dw2_regnum (insn.op[1].regs); + + if (insn.disp_operands) + src_disp = insn.op[0].disps->X_add_number; + + if (src_disp) + /* Generate an ADD ginsn. */ + ginsn = ginsn_new_add (insn_end_sym, true, + GINSN_SRC_REG, base_reg, + GINSN_SRC_IMM, src_disp, + GINSN_DST_REG, dst_reg); + else + /* Generate a MOV ginsn. */ + ginsn = ginsn_new_mov (insn_end_sym, true, + GINSN_SRC_REG, base_reg, 0, + GINSN_DST_REG, dst_reg, 0); + } + else if (!insn.base_reg && insn.index_reg) + { + /* lea (,%index,imm), %dst. */ + /* FIXME - Skip encoding an explicit multiply operation, instead use + GINSN_TYPE_OTHER. This is TBD_GINSN_INFO_LOSS, but it is fine if + the mode is GINSN_GEN_SCFI. 
*/ + index_scale = insn.log2_scale_factor; + index_reg = ginsn_dw2_regnum (insn.index_reg); + dst_reg = ginsn_dw2_regnum (insn.op[1].regs); + ginsn = ginsn_new_other (insn_end_sym, true, + GINSN_SRC_REG, index_reg, + GINSN_SRC_IMM, index_scale, + GINSN_DST_REG, dst_reg); + } + else + { + /* lea disp(%base,%index,imm) %dst. */ + /* FIXME - Skip encoding information about the disp and imm for index + reg. This is TBD_GINSN_INFO_LOSS, but it is fine if the mode is + GINSN_GEN_SCFI. */ + base_reg = ginsn_dw2_regnum (insn.base_reg); + index_reg = ginsn_dw2_regnum (insn.index_reg); + dst_reg = ginsn_dw2_regnum (insn.op[1].regs); + /* Generate an ADD ginsn. */ + ginsn = ginsn_new_add (insn_end_sym, true, + GINSN_SRC_REG, base_reg, + GINSN_SRC_REG, index_reg, + GINSN_DST_REG, dst_reg); + } + + ginsn_set_where (ginsn); + + return ginsn; +} + +static ginsnS * +x86_ginsn_jump (i386_insn insn, symbolS *insn_end_sym) +{ + ginsnS *ginsn = NULL; + symbolS *src_symbol; + + gas_assert (insn.disp_operands == 1); + + if (insn.op[0].disps->X_op == O_symbol) + { + src_symbol = insn.op[0].disps->X_add_symbol; + /* The jump target is expected to be a symbol with 0 addend. + Assert for now to see if this assumption is true. */ + gas_assert (insn.op[0].disps->X_add_number == 0); + ginsn = ginsn_new_jump (insn_end_sym, true, + GINSN_SRC_SYMBOL, 0, src_symbol); + + ginsn_set_where (ginsn); + } + + return ginsn; +} + +static ginsnS * +x86_ginsn_jump_cond (i386_insn insn, symbolS *insn_end_sym) +{ + ginsnS *ginsn = NULL; + symbolS *src_symbol; + + /* TBD_GINSN_GEN_NOT_SCFI: Ignore move to or from xmm reg for mode. */ + if (i.tm.opcode_space == SPACE_0F) + return ginsn; + + gas_assert (insn.disp_operands == 1); + + if (insn.op[0].disps->X_op == O_symbol) + { + src_symbol = insn.op[0].disps->X_add_symbol; + /* The jump target is expected to be a symbol with 0 addend. + Assert for now to see if this assumption is true. 
*/ + gas_assert (insn.op[0].disps->X_add_number == 0); + ginsn = ginsn_new_jump_cond (insn_end_sym, true, + GINSN_SRC_SYMBOL, 0, src_symbol); + ginsn_set_where (ginsn); + } + else + /* Catch them for now so we know what we are dealing with. */ + gas_assert (0); + + return ginsn; +} + +/* Generate one or more GAS instructions for the current machine dependent + instruction. + + Returns the head of linked list of ginsn(s) added, if success; + Returns NULL if failure. */ + +static ginsnS * +ginsn_new (symbolS *insn_end_sym, enum ginsn_gen_mode gmode) +{ + uint16_t opcode; + uint32_t dw2_regnum; + uint32_t src2_dw2_regnum; + ginsnS *ginsn = NULL; + ginsnS *ginsn_next = NULL; + ginsnS *ginsn_last = NULL; + + /* FIXME - Need a way to check whether the decoding is sane. The specific + checks around i.tm.opcode_space were added as issues were seen. Likely + insufficient. */ + + /* Currently supports generation of selected ginsns, sufficient for + the use-case of SCFI only. To remove this condition will require + work on this target-specific process of creation of ginsns. Some + of such places are tagged with TBD_GINSN_GEN_NOT_SCFI to serve as + examples. */ + if (gmode != GINSN_GEN_SCFI) + return ginsn; + + opcode = i.tm.base_opcode; + + switch (opcode) + { + case 0x1: + /* add reg, reg. */ + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + src2_dw2_regnum = ginsn_dw2_regnum (i.op[1].regs); + ginsn = ginsn_new_add (insn_end_sym, true, + GINSN_SRC_REG, dw2_regnum, + GINSN_SRC_REG, src2_dw2_regnum, + GINSN_DST_REG, src2_dw2_regnum); + ginsn_set_where (ginsn); + break; + case 0x29: + /* sub reg, reg. 
*/ + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + src2_dw2_regnum = ginsn_dw2_regnum (i.op[1].regs); + ginsn = ginsn_new_sub (insn_end_sym, true, + GINSN_SRC_REG, dw2_regnum, + GINSN_SRC_REG, src2_dw2_regnum, + GINSN_DST_REG, src2_dw2_regnum); + ginsn_set_where (ginsn); + break; + case 0xa0: + case 0xa8: + gas_assert (i.tm.opcode_space == SPACE_0F); + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + /* push fs / push gs. */ + ginsn = ginsn_new_sub (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, + GINSN_SRC_IMM, 8, + GINSN_DST_REG, REG_SP); + ginsn_set_where (ginsn); + + ginsn_next = ginsn_new_store (insn_end_sym, false, + GINSN_SRC_REG, dw2_regnum, + GINSN_DST_STACK); + ginsn_set_where (ginsn_next); + + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + break; + case 0xa1: + case 0xa9: + /* If opcode_space != SPACE_0F, this is test insn. Skip it + for GINSN_GEN_SCFI. */ + if (i.tm.opcode_space != SPACE_0F) + break; + + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + /* pop fs / pop gs. */ + ginsn = ginsn_new_load (insn_end_sym, false, + GINSN_SRC_STACK, + GINSN_DST_REG, dw2_regnum); + ginsn_set_where (ginsn); + + ginsn_next = ginsn_new_add (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, + GINSN_SRC_IMM, 8, + GINSN_DST_REG, REG_SP); + ginsn_set_where (ginsn_next); + + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + break; + case 0x50 ... 0x57: + /* push reg. */ + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + ginsn = ginsn_new_sub (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, + GINSN_SRC_IMM, 8, + GINSN_DST_REG, REG_SP); + ginsn_set_where (ginsn); + + ginsn_next = ginsn_new_store (insn_end_sym, false, + GINSN_SRC_REG, dw2_regnum, + GINSN_DST_STACK); + ginsn_set_where (ginsn_next); + + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + break; + case 0x58 ... 0x5f: + if (i.tm.opcode_space != SPACE_BASE) + break; + /* pop reg. 
*/ + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + ginsn = ginsn_new_load (insn_end_sym, false, + GINSN_SRC_STACK, + GINSN_DST_REG, dw2_regnum); + ginsn_set_where (ginsn); + + ginsn_next = ginsn_new_add (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, + GINSN_SRC_IMM, 8, + GINSN_DST_REG, REG_SP); + ginsn_set_where (ginsn_next); + + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + break; + case 0x68: + case 0x6a: + /* push imm. */ + /* Skip getting the value of imm from machine instruction + because for ginsn generation this is not important. */ + ginsn = ginsn_new_sub (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, + GINSN_SRC_IMM, 8, + GINSN_DST_REG, REG_SP); + ginsn_set_where (ginsn); + + ginsn_next = ginsn_new_store (insn_end_sym, false, + GINSN_SRC_IMM, 0, + GINSN_DST_STACK); + ginsn_set_where (ginsn_next); + + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + break; + case 0x71 ... 0x7f: + ginsn = x86_ginsn_jump_cond (i, insn_end_sym); + break; + case 0x81: + case 0x83: + ginsn = x86_ginsn_alu (i, insn_end_sym); + break; + case 0x8b: + /* Move r/m64 to r64. */ + case 0x88: + case 0x89: + /* mov reg, reg/mem. */ + ginsn = x86_ginsn_move (i, insn_end_sym); + break; + case 0x8d: + /* lea disp(%src), %dst */ + ginsn = x86_ginsn_lea (i, insn_end_sym); + break; + case 0x8f: + /* pop to mem. */ + ginsn = ginsn_new_load (insn_end_sym, false, + GINSN_SRC_STACK, + GINSN_DST_MEM, 0); + ginsn_set_where (ginsn); + + ginsn_next = ginsn_new_add (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, + GINSN_SRC_IMM, 8, + GINSN_DST_REG, REG_SP); + ginsn_set_where (ginsn_next); + + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + break; + case 0x9c: + /* pushf / pushfd / pushfq. + Tracking EFLAGS register by number is not necessary. 
*/ + ginsn = ginsn_new_sub (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, + GINSN_SRC_IMM, 8, + GINSN_DST_REG, REG_SP); + ginsn_set_where (ginsn); + + ginsn_next = ginsn_new_store (insn_end_sym, false, + GINSN_SRC_IMM, 0, + GINSN_DST_STACK); + ginsn_set_where (ginsn_next); + + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + + break; + case 0xff: + /* push from mem. */ + if (i.tm.extension_opcode == 6) + { + ginsn = ginsn_new_sub (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, + GINSN_SRC_IMM, 8, + GINSN_DST_REG, REG_SP); + ginsn_set_where (ginsn); + + ginsn_next = ginsn_new_store (insn_end_sym, false, + GINSN_SRC_MEM, 0, + GINSN_DST_STACK); + ginsn_set_where (ginsn_next); + + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + } + else if (i.tm.extension_opcode == 4) + { + /* jmp r/m. E.g., notrack jmp *%rax. */ + if (i.reg_operands) + { + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + ginsn = ginsn_new_jump (insn_end_sym, true, + GINSN_SRC_REG, dw2_regnum, NULL); + ginsn_set_where (ginsn); + } + else if (i.mem_operands && i.index_reg) + { + /* jmp *0x0(,%rax,8). */ + dw2_regnum = ginsn_dw2_regnum (i.index_reg); + ginsn = ginsn_new_jump (insn_end_sym, true, + GINSN_SRC_REG, dw2_regnum, NULL); + ginsn_set_where (ginsn); + } + else + /* Catch them for now so we know what we are dealing with. */ + gas_assert (0); + } + else if (i.tm.extension_opcode == 2) + { + /* 0xFF /2 (call). */ + if (i.reg_operands) + { + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + ginsn = ginsn_new_call (insn_end_sym, true, + GINSN_SRC_REG, dw2_regnum, NULL); + ginsn_set_where (ginsn); + } + else if (i.mem_operands && i.base_reg) + { + dw2_regnum = ginsn_dw2_regnum (i.base_reg); + ginsn = ginsn_new_call (insn_end_sym, true, + GINSN_SRC_REG, dw2_regnum, NULL); + ginsn_set_where (ginsn); + } + else + /* Catch them for now so we know what we are dealing with. */ + gas_assert (0); + } + else + /* Catch them for now so we know what we are dealing with. 
*/ + gas_assert (0); + break; + case 0xc2: + case 0xc3: + /* Near ret. */ + ginsn = ginsn_new_return (insn_end_sym, true); + ginsn_set_where (ginsn); + break; + case 0xc9: + /* The 'leave' instruction copies the contents of the RBP register + into the RSP register to release all stack space allocated to the + procedure. */ + ginsn = ginsn_new_mov (insn_end_sym, false, + GINSN_SRC_REG, REG_FP, 0, + GINSN_DST_REG, REG_SP, 0); + ginsn_set_where (ginsn); + + /* Then it restores the old value of the RBP register from the stack. */ + ginsn_next = ginsn_new_load (insn_end_sym, false, + GINSN_SRC_STACK, + GINSN_DST_REG, REG_FP); + ginsn_set_where (ginsn_next); + + gas_assert (!ginsn_link_next (ginsn, ginsn_next)); + ginsn_last = ginsn_new_add (insn_end_sym, false, + GINSN_SRC_REG, REG_SP, + GINSN_SRC_IMM, 8, + GINSN_DST_REG, REG_SP); + ginsn_set_where (ginsn_last); + + gas_assert (!ginsn_link_next (ginsn_next, ginsn_last)); + break; + case 0xe8: + /* PS: SCFI machinery does not care about which func is being + called. OK to skip that info. */ + ginsn = ginsn_new_call (insn_end_sym, true, + GINSN_SRC_SYMBOL, 0, NULL); + ginsn_set_where (ginsn); + break; + case 0xe9: + case 0xeb: + /* Unconditional jmp. */ + ginsn = x86_ginsn_jump (i, insn_end_sym); + ginsn_set_where (ginsn); + break; + /* Fall Through. */ + default: + /* TBD_GINSN_GEN_NOT_SCFI: Keep a warning, for now, to find out about + possibly missed instructions affecting REG_SP or REG_FP. These + checks may not be completely exhaustive as they do not involve + index / base reg. 
*/ + if (i.op[0].regs) + { + dw2_regnum = ginsn_dw2_regnum (i.op[0].regs); + if (dw2_regnum == REG_SP || dw2_regnum == REG_FP) + as_warn_where (last_insn.file, last_insn.line, + _("SCFI: unhandled op 0x%x may cause incorrect CFI"), + i.tm.base_opcode); + } + if (i.op[1].regs) + { + dw2_regnum = ginsn_dw2_regnum (i.op[1].regs); + if (dw2_regnum == REG_SP || dw2_regnum == REG_FP) + as_warn_where (last_insn.file, last_insn.line, + _("SCFI: unhandled op 0x%x may cause incorrect CFI"), + i.tm.base_opcode); + } + /* Keep an eye on other instructions affecting control flow. */ + gas_assert (!i.tm.opcode_modifier.jump); + /* TBD_GINSN_GEN_NOT_SCFI: Skip all other opcodes uninteresting for + GINSN_GEN_SCFI mode. */ + break; + } + + return ginsn; +} + /* This is the guts of the machine-dependent assembler. LINE points to a machine dependent instruction. This function is supposed to emit the frags/bytes it assembles to. */ @@ -5087,6 +5712,7 @@ md_assemble (char *line) const char *end, *pass1_mnem = NULL; enum i386_error pass1_err = 0; const insn_template *t; + ginsnS *ginsn; /* Initialize globals. */ current_templates = NULL; @@ -5609,6 +6235,13 @@ md_assemble (char *line) /* We are ready to output the insn. */ output_insn (); + /* At this time, SCFI is enabled only for AMD64 ABI. */ + if (flag_synth_cfi && x86_elf_abi == X86_64_ABI) + { + ginsn = ginsn_new (symbol_temp_new_now (), frch_ginsn_gen_mode ()); + frch_ginsn_data_append (ginsn); + } + insert_lfence_after (); last_insn.seg = now_seg; @@ -10817,6 +11450,7 @@ s_insn (int dummy ATTRIBUTE_UNUSED) valueT val; bool vex = false, xop = false, evex = false; static const templates tt = { &i.tm, &i.tm + 1 }; + ginsnS *ginsn; init_globals (); @@ -11566,7 +12200,14 @@ s_insn (int dummy ATTRIBUTE_UNUSED) output_insn (); - done: + /* At this time, SCFI is enabled only for AMD64 ABI. 
*/ + if (flag_synth_cfi && x86_elf_abi == X86_64_ABI) + { + ginsn = ginsn_new (symbol_temp_new_now (), frch_ginsn_gen_mode ()); + frch_ginsn_data_append (ginsn); + } + +done: *saved_ilp = saved_char; input_line_pointer = line; @@ -15208,6 +15849,9 @@ i386_target_format (void) else as_fatal (_("unknown architecture")); + if (flag_synth_cfi && x86_elf_abi != X86_64_ABI) + as_fatal (_("Synthesizing CFI is not supported for this ABI")); + if (cpu_flags_all_zero (&cpu_arch_isa_flags)) cpu_arch_isa_flags = cpu_arch[flag_code == CODE_64BIT].enable; if (cpu_flags_all_zero (&cpu_arch_tune_flags)) diff --git a/gas/config/tc-i386.h b/gas/config/tc-i386.h index 80d66c1ce15..4695d1a4940 100644 --- a/gas/config/tc-i386.h +++ b/gas/config/tc-i386.h @@ -359,6 +359,27 @@ extern int i386_elf_section_type (const char *, size_t); extern void i386_solaris_fix_up_eh_frame (segT); #endif +#define TARGET_USE_GINSN 1 +/* Allow GAS to synthesize DWARF CFI for hand-written asm. + PS: TARGET_USE_CFIPOP is a pre-condition. */ +#define TARGET_USE_SCFI 1 +/* Identify the maximum DWARF register number of all the registers being + tracked for SCFI. This is the last DWARF register number of the set + of SP, BP, and all callee-saved registers. For AMD64, this means + R15 (15). Use SCFI_CALLEE_SAVED_REG_P to identify which registers + are callee-saved from this set. */ +#define SCFI_NUM_REGS 15 +/* Identify the DWARF register number of the frame-pointer register. */ +#define REG_FP 6 +/* Identify the DWARF register number of the stack-pointer register. */ +#define REG_SP 7 +/* Some ABIs, like AMD64, use stack for call instruction. + If so, identify the Initial (CFA) offset from RSP at the entry of function. 
*/ +#define SCFI_INIT_CFA_OFFSET 8 + +#define SCFI_CALLEE_SAVED_REG_P(dw2reg) x86_scfi_callee_saved_p (dw2reg) +extern bool x86_scfi_callee_saved_p (uint32_t dw2reg_num); + /* Support for SHF_X86_64_LARGE */ extern bfd_vma x86_64_section_letter (int, const char **); #define md_elf_section_letter(LETTER, PTR_MSG) x86_64_section_letter (LETTER, PTR_MSG) diff --git a/gas/ginsn.c b/gas/ginsn.c new file mode 100644 index 00000000000..4aec5482243 --- /dev/null +++ b/gas/ginsn.c @@ -0,0 +1,985 @@ +/* ginsn.c - GAS instruction representation. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GAS, the GNU Assembler. + + GAS is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GAS is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GAS; see the file COPYING. If not, write to the Free + Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA + 02110-1301, USA. */ + +#include "as.h" +#include "subsegs.h" +#include "ginsn.h" +#include "scfi.h" + +#ifdef TARGET_USE_GINSN + +static +ginsnS *ginsn_alloc (void) +{ + ginsnS *ginsn = XCNEW (ginsnS); + return ginsn; +} + +static ginsnS* +ginsn_init (enum ginsn_type type, symbolS *sym, bool real_p) +{ + ginsnS *ginsn = ginsn_alloc (); + ginsn->type = type; + ginsn->sym = sym; + if (real_p) + ginsn->flags |= GINSN_F_INSN_REAL; + return ginsn; +} + +static void +ginsn_set_src (struct ginsn_src *src, enum ginsn_src_type type, uint32_t reg, + int32_t immdisp) +{ + if (!src) + return; + + src->type = type; + /* Even when the use-case is SCFI, the value of reg may be > SCFI_NUM_REGS. 
+ E.g., in AMD64, push fs etc. */ + src->reg = reg; + + if (type == GINSN_SRC_IMM || type == GINSN_SRC_INDIRECT) + src->immdisp = immdisp; +} + +static void +ginsn_set_dst (struct ginsn_dst *dst, enum ginsn_dst_type type, uint32_t reg, + int32_t disp) +{ + if (!dst) + return; + + dst->type = type; + dst->reg = reg; + + if (type == GINSN_DST_INDIRECT) + dst->disp = disp; +} + +# if 0 +static void +free_ginsn (ginsnS *ginsn) +{ + free (ginsn); + ginsn = NULL; +} +#endif + +struct ginsn_src * +ginsn_get_src1 (ginsnS *ginsn) +{ + return &ginsn->src[0]; +} + +struct ginsn_src * +ginsn_get_src2 (ginsnS *ginsn) +{ + return &ginsn->src[1]; +} + +struct ginsn_dst * +ginsn_get_dst (ginsnS *ginsn) +{ + return &ginsn->dst; +} + +uint32_t +ginsn_get_src_reg (struct ginsn_src *src) +{ + return src->reg; +} + +enum ginsn_src_type +ginsn_get_src_type (struct ginsn_src *src) +{ + return src->type; +} + +uint32_t +ginsn_get_src_disp (struct ginsn_src *src) +{ + return src->immdisp; +} + +uint32_t +ginsn_get_src_imm (struct ginsn_src *src) +{ + return src->immdisp; +} + +uint32_t +ginsn_get_dst_reg (struct ginsn_dst *dst) +{ + return dst->reg; +} + +enum ginsn_dst_type +ginsn_get_dst_type (struct ginsn_dst *dst) +{ + return dst->type; +} + +int32_t +ginsn_get_dst_disp (struct ginsn_dst *dst) +{ + return (int32_t) dst->disp; +} + +void +label_ginsn_map_insert (symbolS *label, ginsnS *ginsn) +{ + const char *name = S_GET_NAME (label); + str_hash_insert (frchain_now->frch_ginsn_data->label_ginsn_map, + name, ginsn, 0 /* noreplace. 
*/); +} + +ginsnS * +label_ginsn_map_find (symbolS *label) +{ + const char *name = S_GET_NAME (label); + ginsnS *ginsn + = (ginsnS *) str_hash_find (frchain_now->frch_ginsn_data->label_ginsn_map, + name); + return ginsn; +} + +ginsnS * +ginsn_new_symbol (symbolS *sym, bool func_begin_p) +{ + ginsnS *ginsn = ginsn_alloc (); + ginsn->type = GINSN_TYPE_SYMBOL; + ginsn->sym = sym; + if (func_begin_p) + ginsn->flags |= GINSN_F_FUNC_MARKER; + return ginsn; +} + +ginsnS * +ginsn_new_symbol_func_begin (symbolS *sym) +{ + return ginsn_new_symbol (sym, true); +} + +ginsnS * +ginsn_new_symbol_func_end (symbolS *sym) +{ + return ginsn_new_symbol (sym, false); +} + +ginsnS * +ginsn_new_symbol_user_label (symbolS *sym) +{ + ginsnS *ginsn = ginsn_new_symbol (sym, false); + ginsn->flags |= GINSN_F_USER_LABEL; + return ginsn; +} + +/* PS: In some of the ginsn_new_* APIs below, a 'uint32_t src[1-2]_val' is + used to carry even an 'int32_t disp'. This is done to keep the number + of arguments in the APIs in check, in hope that this is more readable + code. */ + +ginsnS * +ginsn_new_add (symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, uint32_t src1_val, + enum ginsn_src_type src2_type, uint32_t src2_val, + enum ginsn_dst_type dst_type, uint32_t dst_reg) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_ADD, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src1_type, src1_val, src1_val); + /* GINSN_SRC_INDIRECT src2_type is not expected. */ + gas_assert (src2_type != GINSN_SRC_INDIRECT); + ginsn_set_src (&ginsn->src[1], src2_type, src2_val, src2_val); + /* dst info. 
*/ + gas_assert (dst_type != GINSN_DST_INDIRECT); + ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, 0); + + return ginsn; +} + +ginsnS * +ginsn_new_and (symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, uint32_t src1_val, + enum ginsn_src_type src2_type, uint32_t src2_val, + enum ginsn_dst_type dst_type, uint32_t dst_reg) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_AND, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src1_type, src1_val, src1_val); + /* GINSN_SRC_INDIRECT src2_type is not expected. */ + gas_assert (src2_type != GINSN_SRC_INDIRECT); + ginsn_set_src (&ginsn->src[1], src2_type, src2_val, src2_val); + /* dst info. */ + gas_assert (dst_type != GINSN_DST_INDIRECT); + ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, 0); + + return ginsn; +} + +ginsnS * +ginsn_new_call (symbolS *sym, bool real_p, + enum ginsn_src_type src_type, uint32_t src_val, + symbolS *src_text_sym) + +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_CALL, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src_type, src_val, 0); + + if (src_type == GINSN_SRC_SYMBOL) + ginsn->src[0].sym = src_text_sym; + + return ginsn; +} + +ginsnS * +ginsn_new_jump (symbolS *sym, bool real_p, + enum ginsn_src_type src_type, uint32_t src_val, + symbolS *src_ginsn_sym) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_JUMP, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src_type, src_val, 0); + + if (src_type == GINSN_SRC_SYMBOL) + ginsn->src[0].sym = src_ginsn_sym; + + return ginsn; +} + +ginsnS * +ginsn_new_jump_cond (symbolS *sym, bool real_p, + enum ginsn_src_type src_type, uint32_t src_val, + symbolS *src_ginsn_sym) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_JUMP_COND, sym, real_p); + /* src info. 
*/ + ginsn_set_src (&ginsn->src[0], src_type, src_val, 0); + + if (src_type == GINSN_SRC_SYMBOL) + ginsn->src[0].sym = src_ginsn_sym; + + return ginsn; +} + +ginsnS * +ginsn_new_mov (symbolS *sym, bool real_p, + enum ginsn_src_type src_type, uint32_t src_reg, int32_t src_disp, + enum ginsn_dst_type dst_type, uint32_t dst_reg, int32_t dst_disp) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_MOV, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src_type, src_reg, src_disp); + /* dst info. */ + ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp); + + return ginsn; +} + +ginsnS * +ginsn_new_store (symbolS *sym, bool real_p, + enum ginsn_src_type src_type, uint32_t src_reg, + enum ginsn_dst_type dst_type) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_STS, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src_type, src_reg, 0); + /* dst info. */ + gas_assert (dst_type == GINSN_DST_STACK || dst_type == GINSN_DST_MEM); + ginsn_set_dst (&ginsn->dst, dst_type, 0, 0); + + return ginsn; +} + +ginsnS * +ginsn_new_load (symbolS *sym, bool real_p, + enum ginsn_src_type src_type, + enum ginsn_dst_type dst_type, uint32_t dst_reg) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_LDS, sym, real_p); + /* src info. */ + gas_assert (src_type == GINSN_SRC_STACK || src_type == GINSN_SRC_MEM); + ginsn_set_src (&ginsn->src[0], src_type, 0, 0); + /* dst info. */ + ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, 0); + + return ginsn; +} + +ginsnS * +ginsn_new_sub (symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, uint32_t src1_val, + enum ginsn_src_type src2_type, uint32_t src2_val, + enum ginsn_dst_type dst_type, uint32_t dst_reg) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_SUB, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src1_type, src1_val, src1_val); + /* GINSN_SRC_INDIRECT src2_type is not expected. */ + gas_assert (src2_type != GINSN_SRC_INDIRECT); + ginsn_set_src (&ginsn->src[1], src2_type, src2_val, src2_val); + /* dst info. 
*/ + ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, 0); + + return ginsn; +} + +ginsnS * +ginsn_new_other (symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, uint32_t src1_val, + enum ginsn_src_type src2_type, uint32_t src2_val, + enum ginsn_dst_type dst_type, uint32_t dst_reg) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_OTHER, sym, real_p); + /* src info. */ + ginsn_set_src (&ginsn->src[0], src1_type, src1_val, src1_val); + /* GINSN_SRC_INDIRECT src2_type is not expected. */ + gas_assert (src2_type != GINSN_SRC_INDIRECT); + ginsn_set_src (&ginsn->src[1], src2_type, src2_val, src2_val); + /* dst info. */ + ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, 0); + + return ginsn; +} + +ginsnS * +ginsn_new_return (symbolS *sym, bool real_p) +{ + ginsnS *ginsn = ginsn_init (GINSN_TYPE_RETURN, sym, real_p); + return ginsn; +} + +void +ginsn_set_file_line (ginsnS *ginsn, const char *file, unsigned int line) +{ + if (!ginsn) + return; + + ginsn->file = file; + ginsn->line = line; +} + +int +ginsn_link_next (ginsnS *ginsn, ginsnS *next) +{ + int ret = 0; + + /* Avoid data corruption by limiting the scope of the API. */ + if (!ginsn || ginsn->next) + return 1; + + ginsn->next = next; + + return ret; +} + +bool +ginsn_track_reg_p (uint32_t dw2reg, enum ginsn_gen_mode gmode) +{ + bool track_p = false; + + if (gmode == GINSN_GEN_SCFI && dw2reg <= SCFI_NUM_REGS) + { + /* FIXME - rename this to tc_ ? 
*/ + track_p |= SCFI_CALLEE_SAVED_REG_P(dw2reg); + track_p |= (dw2reg == REG_FP); + track_p |= (dw2reg == REG_SP); + } + + return track_p; +} + +static bool +ginsn_indirect_jump_p (ginsnS *ginsn) +{ + bool ret_p = false; + if (!ginsn) + return ret_p; + + ret_p = (ginsn->type == GINSN_TYPE_JUMP + && ginsn->src[0].type == GINSN_SRC_REG); + return ret_p; +} + +static bool +ginsn_direct_local_jump_p (ginsnS *ginsn) +{ + bool ret_p = false; + if (!ginsn) + return ret_p; + + ret_p |= (ginsn->type == GINSN_TYPE_JUMP + && ginsn->src[0].type == GINSN_SRC_SYMBOL + && S_IS_LOCAL (ginsn->src[0].sym)); + return ret_p; +} + +static void +bb_add_edge (gbbS* from_bb, gbbS *to_bb) +{ + gedgeS *tmpedge = NULL; + gedgeS *gedge; + bool exists = false; + + if (!from_bb || !to_bb) + return; + + /* Create a new edge object. */ + gedge = XCNEW (gedgeS); + gedge->dst_bb = to_bb; + gedge->next = NULL; + gedge->visited = false; + + /* Add it in. */ + if (from_bb->out_gedges == NULL) + { + from_bb->out_gedges = gedge; + from_bb->num_out_gedges++; + } + else + { + /* Get the tail of the list. */ + tmpedge = from_bb->out_gedges; + while (tmpedge) + { + /* Do not add duplicate edges. Duplicated edges will cause unwanted + failures in the forward and backward passes for SCFI. 
*/ + if (tmpedge->dst_bb == to_bb) + { + exists = true; + break; + } + if (tmpedge->next) + tmpedge = tmpedge->next; + else + break; + } + + if (!exists) + { + tmpedge->next = gedge; + from_bb->num_out_gedges++; + } + else + free (gedge); + } +} + +static void +cfg_add_bb (gcfgS *gcfg, gbbS *gbb) +{ + gbbS *last_bb = NULL; + + if (!gcfg->root_bb) + gcfg->root_bb = gbb; + else + { + last_bb = gcfg->root_bb; + while (last_bb->next) + last_bb = last_bb->next; + + last_bb->next = gbb; + } + gcfg->num_gbbs++; + + gbb->id = gcfg->num_gbbs; +} + +static gbbS* +add_bb_at_ginsn (gcfgS *gcfg, ginsnS *ginsn, gbbS *prev_bb); + +static gbbS* +find_bb (gcfgS *gcfg, ginsnS *ginsn) +{ + gbbS *found_bb = NULL; + gbbS *gbb = NULL; + + if (!ginsn) + return found_bb; + + if (ginsn->visited) + { + cfg_for_each_bb(gcfg, gbb) + { + if (gbb->first_ginsn == ginsn) + { + found_bb = gbb; + break; + } + } + /* Must be found if ginsn is visited. */ + gas_assert (found_bb); + } + + return found_bb; +} + +static gbbS* +find_or_make_bb (gcfgS *gcfg, ginsnS *ginsn, gbbS *prev_bb) +{ + gbbS *found_bb = NULL; + + found_bb = find_bb (gcfg, ginsn); + if (found_bb) + return found_bb; + + return add_bb_at_ginsn (gcfg, ginsn, prev_bb); +} + +/* Add the basic block starting at GINSN to the given GCFG. + Also adds an edge from the PREV_BB to the newly added basic block. + + This is a recursive function which returns the root of the added + basic blocks. */ + +static gbbS* +add_bb_at_ginsn (gcfgS *gcfg, ginsnS *ginsn, gbbS *prev_bb) +{ + gbbS *current_bb = NULL; + ginsnS *gins = NULL; + symbolS *taken_label; + + while (ginsn) + { + /* Skip these as they may be right after a GINSN_TYPE_RETURN. + For GINSN_TYPE_RETURN, we have already considered that as + end of bb, and a logical exit from function. */ + if (GINSN_F_FUNC_END_P(ginsn)) + { + ginsn = ginsn->next; + continue; + } + + if (ginsn->visited) + { + /* If the ginsn has been visited earlier, the bb must exist by now + in the cfg. 
*/ + prev_bb = current_bb; + current_bb = find_bb (gcfg, ginsn); + gas_assert (current_bb); + /* Add edge from the prev_bb. */ + if (prev_bb) + bb_add_edge (prev_bb, current_bb); + break; + } + else if (current_bb && GINSN_F_USER_LABEL_P(ginsn)) + { + /* Create new bb starting at this label ginsn. */ + prev_bb = current_bb; + find_or_make_bb (gcfg, ginsn, prev_bb); + break; + } + + if (current_bb == NULL) + { + /* Create a new bb. */ + current_bb = XCNEW (gbbS); + cfg_add_bb (gcfg, current_bb); + /* Add edge for the Not Taken, or Fall-through path. */ + if (prev_bb) + bb_add_edge (prev_bb, current_bb); + } + + if (current_bb->first_ginsn == NULL) + current_bb->first_ginsn = ginsn; + + ginsn->visited = true; + current_bb->num_ginsns++; + current_bb->last_ginsn = ginsn; + + /* Note that BB is _not_ split on ginsn of type GINSN_TYPE_CALL. */ + if (ginsn->type == GINSN_TYPE_JUMP + || ginsn->type == GINSN_TYPE_JUMP_COND + || ginsn->type == GINSN_TYPE_RETURN) + { + /* Indirect Jumps or direct jumps to symbols non-local to the + function must not be seen here. The caller must have already + checked for that. */ + gas_assert (!ginsn_indirect_jump_p (ginsn)); + if (ginsn->type == GINSN_TYPE_JUMP) + gas_assert (ginsn_direct_local_jump_p (ginsn)); + + /* Direct Jumps. May include conditional or unconditional change of + flow. What is important for CFG creation is that the target be + local to function. */ + if (ginsn->type == GINSN_TYPE_JUMP_COND + || ginsn_direct_local_jump_p (ginsn)) + { + gas_assert (ginsn->src[0].type == GINSN_SRC_SYMBOL); + taken_label = ginsn->src[0].sym; + gas_assert (taken_label && S_IS_LOCAL (taken_label)); + + /* Follow the target on the taken path. */ + gins = label_ginsn_map_find (taken_label); + gas_assert (gins); + + /* Preserve the prev_bb to be the dominator bb as we are + going to follow the taken path of the conditional branch + soon. */ + prev_bb = current_bb; + + /* Add the bb for the target of the taken branch. 
*/ + find_or_make_bb (gcfg, gins, prev_bb); + } + else if (ginsn->type == GINSN_TYPE_RETURN) + { + /* We'll come back to the following ginsns after GINSN_TYPE_RETURN + from another path if it is indeed reachable code. */ + break; + } + + /* Current BB has been processed. */ + current_bb = NULL; + } + ginsn = ginsn->next; + } + + return current_bb; +} + +static int +gbbs_compare (const void *v1, const void *v2) +{ + const gbbS *bb1 = *(const gbbS **) v1; + const gbbS *bb2 = *(const gbbS **) v2; + + if (bb1->first_ginsn->id < bb2->first_ginsn->id) + return -1; + else if (bb1->first_ginsn->id > bb2->first_ginsn->id) + return 1; + else if (bb1->first_ginsn->id == bb2->first_ginsn->id) + return 0; + + return 0; +} + +/* Traverse the list of ginsns for the function and warn if some + ginsns are not visited. + + FIXME - this code assumes the caller has already performed a pass over + ginsns such that the reachable ginsns are already marked. Revisit this - we + should ideally make this pass self-sufficient. */ + +static int +ginsn_pass_warn_unreachable_code (symbolS *func, gcfgS *gcfg ATTRIBUTE_UNUSED, + ginsnS *root_ginsn) +{ + ginsnS *ginsn; + bool unreach_p = false; + + if (!gcfg || !func || !root_ginsn) + return 0; + + ginsn = root_ginsn; + + while (ginsn) + { + /* Some ginsns of type GINSN_TYPE_SYMBOL remain unvisited. Some + may even be excluded from the CFG as they are not reachable, given + their function, e.g., user labels after return machine insn. 
*/ + if (!ginsn->visited + && !GINSN_F_FUNC_END_P(ginsn) + && !GINSN_F_USER_LABEL_P(ginsn)) + { + unreach_p = true; + break; + } + ginsn = ginsn->next; + } + + if (unreach_p) + as_warn_where (ginsn->file, ginsn->line, + _("GINSN: found unreachable code in func '%s'"), + S_GET_NAME (func)); + + return unreach_p; +} + +void +gcfg_get_bbs_in_prog_order (gcfgS *gcfg, gbbS **prog_order_bbs) +{ + int i = 0; + gbbS *gbb; + + if (!prog_order_bbs) + return; + + cfg_for_each_bb(gcfg, gbb) + { + gas_assert (i < gcfg->num_gbbs); + prog_order_bbs[i++] = gbb; + } + + qsort (prog_order_bbs, gcfg->num_gbbs, sizeof (gbbS *), gbbs_compare); +} + +/* Build the control flow graph for the ginsns of the function. + + It is important that the target adds an appropriate ginsn: + - GINSN_TYPE_JUMP, + - GINSN_TYPE_JUMP_COND, + - GINSN_TYPE_CALL, + - GINSN_TYPE_RETURN + at the associated points in the function. The correctness of the CFG + depends on the accuracy of these 'change of flow instructions'. */ + +gcfgS * +build_gcfg (void) +{ + gcfgS *gcfg; + ginsnS *first_ginsn; + + gcfg = XCNEW (gcfgS); + first_ginsn = frchain_now->frch_ginsn_data->gins_rootP; + add_bb_at_ginsn (gcfg, first_ginsn, NULL /* prev_bb. */); + + return gcfg; +} + +gbbS * +get_rootbb_gcfg (gcfgS *gcfg) +{ + gbbS *rootbb = NULL; + + if (!gcfg || !gcfg->num_gbbs) + return NULL; + + rootbb = gcfg->root_bb; + + return rootbb; +} + +void +frch_ginsn_data_init (symbolS *func, symbolS *start_addr, + enum ginsn_gen_mode gmode) +{ + /* FIXME - error out if prev object is not free'd ? */ + frchain_now->frch_ginsn_data = XCNEW (struct frch_ginsn_data); + + frchain_now->frch_ginsn_data->mode = gmode; + /* Annotate with the current function symbol. */ + frchain_now->frch_ginsn_data->func = func; + /* Create a new start address symbol now. */ + frchain_now->frch_ginsn_data->start_addr = start_addr; + /* Assume the set of ginsn are apt for CFG creation, by default. 
*/ + frchain_now->frch_ginsn_data->gcfg_apt_p = true; + + frchain_now->frch_ginsn_data->label_ginsn_map = str_htab_create (); +} + +void +frch_ginsn_data_cleanup (void) +{ + ginsnS *ginsn = NULL; + ginsnS *next_ginsn = NULL; + + ginsn = frchain_now->frch_ginsn_data->gins_rootP; + while (ginsn) + { + next_ginsn = ginsn->next; + free (ginsn); + ginsn = next_ginsn; + } + + if (frchain_now->frch_ginsn_data->label_ginsn_map) + htab_delete (frchain_now->frch_ginsn_data->label_ginsn_map); + + free (frchain_now->frch_ginsn_data); + frchain_now->frch_ginsn_data = NULL; +} + +/* Append GINSN to the list of ginsns for the current function being + assembled. */ + +int +frch_ginsn_data_append (ginsnS *ginsn) +{ + ginsnS *last = NULL; + ginsnS *temp = NULL; + uint64_t id = 0; + + if (!ginsn) + return 1; + + if (frchain_now->frch_ginsn_data->gins_lastP) + id = frchain_now->frch_ginsn_data->gins_lastP->id; + + /* Do the necessary preprocessing on the set of input GINSNs: + - Update each ginsn with its ID. + While you iterate, also keep gcfg_apt_p updated by checking whether any + ginsn is inappropriate for GCFG creation. */ + temp = ginsn; + while (temp) + { + temp->id = ++id; + + if (ginsn_indirect_jump_p (temp) + || (temp->type == GINSN_TYPE_JUMP + && !ginsn_direct_local_jump_p (temp))) + frchain_now->frch_ginsn_data->gcfg_apt_p = false; + + /* The input GINSN may be a linked list of multiple ginsns chained + together. Find the last ginsn in the input chain of ginsns. */ + last = temp; + + temp = temp->next; + } + + /* Link in the ginsn to the tail. 
*/ + if (!frchain_now->frch_ginsn_data->gins_rootP) + frchain_now->frch_ginsn_data->gins_rootP = ginsn; + else + ginsn_link_next (frchain_now->frch_ginsn_data->gins_lastP, ginsn); + + frchain_now->frch_ginsn_data->gins_lastP = last; + + return 0; +} + +enum ginsn_gen_mode frch_ginsn_gen_mode (void) +{ + enum ginsn_gen_mode gmode = GINSN_GEN_NONE; + + if (frchain_now->frch_ginsn_data) + gmode = frchain_now->frch_ginsn_data->mode; + + return gmode; +} + +int +ginsn_data_begin (symbolS *func) +{ + ginsnS *ginsn; + + /* The previous block of asm must have been processed by now. */ + if (frchain_now->frch_ginsn_data) + as_bad (_("GINSN process for prev func not done")); + + /* FIXME - hard code the mode to GINSN_GEN_SCFI. + This can be changed later when other passes on ginsns are formalised. */ + frch_ginsn_data_init (func, symbol_temp_new_now (), GINSN_GEN_SCFI); + + /* Create and insert ginsn with function begin marker. */ + ginsn = ginsn_new_symbol_func_begin (func); + frch_ginsn_data_append (ginsn); + + return 0; +} + +int +ginsn_data_end (symbolS *label) +{ + ginsnS *ginsn; + gbbS *root_bb; + gcfgS *gcfg; + symbolS *func; + + int ret = 0; + + /* Insert Function end marker. */ + ginsn = ginsn_new_symbol_func_end (label); + frch_ginsn_data_append (ginsn); + + func = frchain_now->frch_ginsn_data->func; + + /* Build the cfg of ginsn(s) of the function. */ + if (!frchain_now->frch_ginsn_data->gcfg_apt_p) + { + as_warn (_("Untraceable control flow for func '%s'"), S_GET_NAME (func)); + goto end; + } + + gcfg = build_gcfg (); + + root_bb = get_rootbb_gcfg (gcfg); + if (!root_bb) + { + as_bad (_("Bad cfg of ginsn of func '%s'"), S_GET_NAME (func)); + goto end; + } + + /* Synthesize DWARF CFI and emit it. */ + ret = scfi_synthesize_dw2cfi (func, gcfg, root_bb); + if (ret) + goto end; + scfi_emit_dw2cfi (func); + + /* Other passes, e.g. warn for unreachable code can be enabled too. 
*/ + ginsn = frchain_now->frch_ginsn_data->gins_rootP; + ginsn_pass_warn_unreachable_code (func, gcfg, ginsn); + +end: + frch_ginsn_data_cleanup (); + return ret; +} + +/* Add GINSN_TYPE_SYMBOL type ginsn for user-defined labels. These may be + branch targets, and hence are necessary for control flow graph. */ + +void +ginsn_frob_label (symbolS *label) +{ + ginsnS *label_ginsn; + symbolS *gsym; + const char *file; + unsigned int line; + + if (frchain_now->frch_ginsn_data) + { + /* PS: Note how we use the last ginsn's sym for this GINSN_TYPE_SYMBOL + ginsn (i.e., skip keeping the actual LABEL symbol as ginsn->sym). + We try to avoid keeping GAS symbols in ginsn(s) to avoid inadvertent + updates or cleanups. */ + gsym = frchain_now->frch_ginsn_data->gins_lastP->sym; + label_ginsn = ginsn_new_symbol_user_label (gsym); + /* Keep the location updated. */ + file = as_where (&line); + ginsn_set_file_line (label_ginsn, file, line); + + frch_ginsn_data_append (label_ginsn); + + label_ginsn_map_insert (label, label_ginsn); + } +} + +#else + +int +ginsn_data_begin (symbolS *func ATTRIBUTE_UNUSED) +{ + as_bad (_("ginsn unsupported for target")); + return 1; +} + +int +ginsn_data_end (symbolS *label ATTRIBUTE_UNUSED) +{ + as_bad (_("ginsn unsupported for target")); + return 1; +} + +void +ginsn_frob_label (symbolS *sym ATTRIBUTE_UNUSED) +{ + return; +} + +#endif /* TARGET_USE_GINSN. */ diff --git a/gas/ginsn.h b/gas/ginsn.h new file mode 100644 index 00000000000..242aeeb3607 --- /dev/null +++ b/gas/ginsn.h @@ -0,0 +1,347 @@ +/* ginsn.h - GAS instruction representation. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GAS, the GNU Assembler. + + GAS is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. 
+ + GAS is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GAS; see the file COPYING. If not, write to the Free + Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA + 02110-1301, USA. */ + +#ifndef GINSN_H +#define GINSN_H + +#include "as.h" + +/* Maximum number of source operands of a ginsn. */ +#define GINSN_NUM_SRC_OPNDS 2 + +enum ginsn_gen_mode +{ + GINSN_GEN_NONE, + /* Generate ginsns for program validation passes. */ + GINSN_GEN_FVAL, + /* Generate ginsns for synthesizing DWARF CFI. */ + GINSN_GEN_SCFI, +}; + +enum ginsn_type +{ + GINSN_TYPE_SYMBOL = 0, + GINSN_TYPE_ADD, + GINSN_TYPE_AND, + GINSN_TYPE_CALL, + GINSN_TYPE_JUMP, + GINSN_TYPE_JUMP_COND, + GINSN_TYPE_MOV, + GINSN_TYPE_LDS, /* Load from stack. */ + GINSN_TYPE_STS, /* Store to stack. */ + GINSN_TYPE_RETURN, + GINSN_TYPE_SUB, + GINSN_TYPE_OTHER, +}; + +enum ginsn_src_type +{ + GINSN_SRC_UNKNOWN, + GINSN_SRC_REG, + GINSN_SRC_IMM, + GINSN_SRC_INDIRECT, + GINSN_SRC_STACK, + GINSN_SRC_SYMBOL, + GINSN_SRC_MEM, +}; + +/* GAS instruction source operand representation. */ + +struct ginsn_src +{ + enum ginsn_src_type type; + /* DWARF register number. */ + uint32_t reg; + /* 32-bit immediate or disp for indirect memory access. */ + int32_t immdisp; + /* Src symbol. May be needed for some control flow instructions. */ + symbolS *sym; +}; + +enum ginsn_dst_type +{ + GINSN_DST_UNKNOWN, + GINSN_DST_REG, + GINSN_DST_INDIRECT, + GINSN_DST_STACK, + GINSN_DST_MEM +}; + +/* GAS instruction destination operand representation. */ + +struct ginsn_dst +{ + enum ginsn_dst_type type; + /* DWARF register number. */ + uint32_t reg; + /* 32-bit disp for indirect memory access. 
*/ + int32_t disp; +}; + +/* Various flags for additional information per GAS instruction. */ + +/* Function begin or end symbol. */ +#define GINSN_F_FUNC_MARKER 0x1 +/* Identify real or implicit GAS insn. + Some targets employ CISC-like instructions. Multiple ginsns may be used + for a single machine instruction in some ISAs. For some optimizations, + there is a need to identify whether a ginsn, e.g., GINSN_TYPE_ADD or + GINSN_TYPE_SUB, is the result of a user-specified instruction or not. */ +#define GINSN_F_INSN_REAL 0x2 +/* Identify if the GAS insn of type GINSN_TYPE_SYMBOL is due to a user-defined + label. Each user-defined label in a function causes the addition of a new + ginsn. This simplifies control flow graph creation. + See htab_t label_ginsn_map usage. */ +#define GINSN_F_USER_LABEL 0x4 +/* Max bit position for flags (uint32_t). */ +#define GINSN_F_MAX 0x20 + +#define GINSN_F_FUNC_BEGIN_P(ginsn) \ + ((ginsn != NULL) \ + && (ginsn->type == GINSN_TYPE_SYMBOL) \ + && (ginsn->flags & GINSN_F_FUNC_MARKER)) + +/* PS: For ginsn associated with a user-defined symbol location, + GINSN_F_FUNC_MARKER is unset, but GINSN_F_USER_LABEL is set. */ +#define GINSN_F_FUNC_END_P(ginsn) \ + ((ginsn != NULL) \ + && (ginsn->type == GINSN_TYPE_SYMBOL) \ + && !(ginsn->flags & GINSN_F_FUNC_MARKER) \ + && !(ginsn->flags & GINSN_F_USER_LABEL)) + +#define GINSN_F_INSN_REAL_P(ginsn) \ + ((ginsn != NULL) \ + && (ginsn->flags & GINSN_F_INSN_REAL)) + +#define GINSN_F_USER_LABEL_P(ginsn) \ + ((ginsn != NULL) \ + && (ginsn->flags & GINSN_F_USER_LABEL)) + +typedef struct ginsn ginsnS; +typedef struct scfi_op scfi_opS; +typedef struct scfi_state scfi_stateS; + +/* GAS generic instruction. + + Generic instructions are used by GAS to abstract out the binary machine + instructions. In other words, ginsn is a target/ABI independent internal + representation for GAS. Note that, depending on the target, there may be + more than one ginsn per binary machine instruction. 
+ + ginsns can be used by GAS to perform validations, or even generate + additional information, like synthesizing DWARF CFI for hand-written asm. + + FIXME - what back references should we keep - frag ? frchainS ? + */ + +struct ginsn +{ + enum ginsn_type type; + /* GAS instructions are simple instructions with GINSN_NUM_SRC_OPNDS number + of source operands and one destination operand at this time. */ + struct ginsn_src src[GINSN_NUM_SRC_OPNDS]; + struct ginsn_dst dst; + /* Additional information per instruction. */ + uint32_t flags; + /* Symbol. For ginsn of type other than GINSN_TYPE_SYMBOL, this identifies + the end of the corresponding machine instruction in the .text segment. + These symbols are created anew by the targets and are not used elsewhere + in GAS. These can be safely cleaned up when a ginsn is free'd. */ + symbolS *sym; + /* Identifier (linearly increasing natural number) for each ginsn. Used as + a proxy for program order of ginsns. */ + uint64_t id; + /* Location information for user-facing messages. Only ginsns with + GINSN_F_FUNC_BEGIN_P and GINSN_F_FUNC_END_P may present themselves with no + file or line information. */ + const char *file; + unsigned int line; + + /* Information needed for synthesizing CFI. */ + scfi_opS **scfi_ops; + uint32_t num_scfi_ops; + + /* Flag to keep track of visited instructions for CFG creation. */ + bool visited; + + ginsnS *next; /* A linked list. 
*/ +}; + +struct ginsn_src *ginsn_get_src1 (ginsnS *ginsn); +struct ginsn_src *ginsn_get_src2 (ginsnS *ginsn); +struct ginsn_dst *ginsn_get_dst (ginsnS *ginsn); + +uint32_t ginsn_get_src_reg (struct ginsn_src *src); +enum ginsn_src_type ginsn_get_src_type (struct ginsn_src *src); +uint32_t ginsn_get_src_disp (struct ginsn_src *src); +uint32_t ginsn_get_src_imm (struct ginsn_src *src); + +uint32_t ginsn_get_dst_reg (struct ginsn_dst *dst); +enum ginsn_dst_type ginsn_get_dst_type (struct ginsn_dst *dst); +int32_t ginsn_get_dst_disp (struct ginsn_dst *dst); + +/* Data object for book-keeping information related to GAS generic + instructions. */ +struct frch_ginsn_data +{ + /* Mode for GINSN creation. */ + enum ginsn_gen_mode mode; + /* Head of the list of ginsns. */ + ginsnS *gins_rootP; + /* Tail of the list of ginsns. */ + ginsnS *gins_lastP; + /* Function symbol. */ + symbolS *func; + /* Start address of the function. */ + symbolS *start_addr; + /* User-defined label to ginsn mapping. */ + htab_t label_ginsn_map; + /* Is the list of ginsn apt for creating CFG. 
*/ + bool gcfg_apt_p; +}; + +int ginsn_data_begin (symbolS *func); +int ginsn_data_end (symbolS *label); +void ginsn_frob_label (symbolS *sym); + +void frch_ginsn_data_init (symbolS *func, symbolS *start_addr, + enum ginsn_gen_mode gmode); +void frch_ginsn_data_cleanup (void); +int frch_ginsn_data_append (ginsnS *ginsn); +enum ginsn_gen_mode frch_ginsn_gen_mode (void); + +void label_ginsn_map_insert (symbolS *label, ginsnS *ginsn); +ginsnS *label_ginsn_map_find (symbolS *label); + +ginsnS *ginsn_new_symbol_func_begin (symbolS *sym); +ginsnS *ginsn_new_symbol_func_end (symbolS *sym); +ginsnS *ginsn_new_symbol_user_label (symbolS *sym); + +ginsnS *ginsn_new_symbol (symbolS *sym, bool real_p); +ginsnS *ginsn_new_add (symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, uint32_t src1_val, + enum ginsn_src_type src2_type, uint32_t src2_val, + enum ginsn_dst_type dst_type, uint32_t dst_reg); +ginsnS *ginsn_new_and (symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, uint32_t src1_val, + enum ginsn_src_type src2_type, uint32_t src2_val, + enum ginsn_dst_type dst_type, uint32_t dst_reg); +ginsnS *ginsn_new_call (symbolS *sym, bool real_p, + enum ginsn_src_type src_type, uint32_t src_val, + symbolS *src_text_sym); +ginsnS *ginsn_new_jump (symbolS *sym, bool real_p, + enum ginsn_src_type src_type, uint32_t src_val, + symbolS *src_ginsn_sym); +ginsnS *ginsn_new_jump_cond (symbolS *sym, bool real_p, + enum ginsn_src_type src_type, uint32_t src_val, + symbolS *src_ginsn_sym); +ginsnS *ginsn_new_mov (symbolS *sym, bool real_p, + enum ginsn_src_type src_type, uint32_t src_reg, int32_t src_disp, + enum ginsn_dst_type dst_type, uint32_t dst_reg, int32_t dst_disp); +ginsnS *ginsn_new_store (symbolS *sym, bool real_p, + enum ginsn_src_type src_type, uint32_t src_reg, + enum ginsn_dst_type dst_type); +ginsnS *ginsn_new_load (symbolS *sym, bool real_p, + enum ginsn_src_type src_type, + enum ginsn_dst_type dst_type, uint32_t dst_reg); +ginsnS *ginsn_new_sub (symbolS *sym, 
bool real_p, + enum ginsn_src_type src1_type, uint32_t src1_val, + enum ginsn_src_type src2_type, uint32_t src2_val, + enum ginsn_dst_type dst_type, uint32_t dst_reg); +ginsnS *ginsn_new_other (symbolS *sym, bool real_p, + enum ginsn_src_type src1_type, uint32_t src1_val, + enum ginsn_src_type src2_type, uint32_t src2_val, + enum ginsn_dst_type dst_type, uint32_t dst_reg); +ginsnS *ginsn_new_return (symbolS *sym, bool real_p); + +void ginsn_set_file_line (ginsnS *ginsn, const char *file, unsigned int line); + +bool ginsn_track_reg_p (uint32_t dw2reg, enum ginsn_gen_mode); + +int ginsn_link_next (ginsnS *ginsn, ginsnS *next); + +typedef struct gbb gbbS; +typedef struct gedge gedgeS; + +/* GBB - Basic block of generic GAS instructions. */ + +struct gbb +{ + ginsnS *first_ginsn; + ginsnS *last_ginsn; + int64_t num_ginsns; + + /* Identifier (linearly increasing natural number) for each gbb. Added for + debugging purpose only. */ + int64_t id; + + bool visited; + + int32_t num_out_gedges; + gedgeS *out_gedges; + + /* FIXME - keep a separate map or add like this. */ + /* SCFI state at the entry of basic block. */ + scfi_stateS *entry_state; + /* SCFI state at the exit of basic block. */ + scfi_stateS *exit_state; + /* A linked list. In order of addition. */ + gbbS *next; +}; + +struct gedge +{ + gbbS *dst_bb; + /* A linked list. In order of addition. */ + gedgeS *next; + bool visited; +}; + +/* Control flow graph of generic GAS instructions. */ + +struct gcfg +{ + int64_t num_gbbs; + gbbS *root_bb; +}; + +typedef struct gcfg gcfgS; + +#define bb_for_each_insn(bb, ginsn) \ + for (ginsn = bb->first_ginsn; ginsn; \ + ginsn = (ginsn != bb->last_ginsn) ? 
ginsn->next : NULL) + +#define bb_for_each_edge(bb, edge) \ + for (edge = bb->out_gedges; edge; edge = edge->next) + +#define cfg_for_each_bb(cfg, bb) \ + for (bb = cfg->root_bb; bb; bb = bb->next) + +#define bb_get_first_ginsn(bb) \ + (bb->first_ginsn) + +#define bb_get_last_ginsn(bb) \ + (bb->last_ginsn) + +gcfgS *build_gcfg (void); +gbbS *get_rootbb_gcfg (gcfgS *gcfg); +void gcfg_get_bbs_in_prog_order (gcfgS *gcfg, gbbS **prog_order_bbs); + +#endif /* GINSN_H. */ diff --git a/gas/scfi.c b/gas/scfi.c new file mode 100644 index 00000000000..585083aadf9 --- /dev/null +++ b/gas/scfi.c @@ -0,0 +1,1090 @@ +/* scfi.c - Support for synthesizing DWARF CFI for hand-written asm. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GAS, the GNU Assembler. + + GAS is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GAS is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GAS; see the file COPYING. If not, write to the Free + Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA + 02110-1301, USA. */ + +#include "as.h" +#include "scfi.h" +#include "subsegs.h" +#include "scfidw2gen.h" + +# if defined (TARGET_USE_SCFI) && defined (TARGET_USE_GINSN) + +/* Beyond the target defined number of registers to be tracked (SCFI_NUM_REGS), + keep the next register ID, in sequence, for REG_CFA. */ +#define REG_CFA (SCFI_NUM_REGS+1) +/* Define the total number of registers being tracked. Used as index into an + array of cfi_reglocS. 
*/ +#define MAX_NUM_SCFI_REGS (REG_CFA+1) + +enum cfi_reglocstate +{ + CFI_UNDEFINED, + CFI_IN_REG, + CFI_ON_STACK +}; + +/* Location at which CFI register is saved. + + A CFI register (callee-saved registers, RA/LR) are always an offset from + the CFA. REG_CFA itself, however, may have REG_SP or REG_FP as base + register. Hence, keep the base reg ID and offset per tracked register. */ + +struct cfi_regloc +{ + /* Base reg ID (DWARF register number). */ + uint32_t base; + /* Location as offset from the CFA. */ + int32_t offset; + /* Current state of the CFI register. */ + enum cfi_reglocstate state; +}; + +typedef struct cfi_regloc cfi_reglocS; + +/* SCFI operation. + + An SCFI operation represents a single atomic change to the SCFI state. + This can also be understood as an abstraction for what eventually gets + emitted as a DWARF CFI operation. */ + +struct scfi_op +{ + /* An SCFI op updates the state of either the CFA or other tracked + (callee-saved, REG_SP etc) registers. 'reg' is in the DWARF register + number space and must be strictly less than MAX_NUM_SCFI_REGS. */ + uint32_t reg; + /* Location of the reg. */ + cfi_reglocS loc; + /* DWARF CFI opcode. */ + uint32_t dw2cfi_op; + /* A linked list. */ + struct scfi_op *next; +}; + +/* SCFI State - accumulated unwind information at a PC. + + SCFI state is the accumulated unwind information encompassing: + - REG_SP, REG_FP, + - RA, and + - all callee-saved registers. + + Note that SCFI_NUM_REGS is target/ABI dependent and is provided by the + backends. The backend must also identify the REG_SP, and REG_FP + registers. */ + +struct scfi_state +{ + cfi_reglocS regs[MAX_NUM_SCFI_REGS]; + cfi_reglocS scratch[MAX_NUM_SCFI_REGS]; + /* Current stack size. */ + int32_t stack_size; + /* Is the stack size known? + Stack size may become untraceable depending on the specific stack + manipulation machine instruction, e.g., rsp = rsp op reg. */ + bool traceable_p; +}; + +/* Initialize a new SCFI op. 
*/ + +static scfi_opS * +init_scfi_op (void) +{ + scfi_opS *op = XCNEW (scfi_opS); + + return op; +} + +/* Compare two SCFI states. Return zero if they match, nonzero otherwise. */ + +static int +cmp_scfi_state (scfi_stateS *state1, scfi_stateS *state2) +{ + int ret; + + if (!state1 || !state2) + return 1; + + /* Skip comparing the scratch[] value of registers. The user visible + unwind information is derived from the regs[] from the SCFI state. */ + ret = memcmp (state1->regs, state2->regs, + sizeof (cfi_reglocS) * MAX_NUM_SCFI_REGS); + ret |= state1->stack_size != state2->stack_size; + ret |= state1->traceable_p != state2->traceable_p; + + return ret; +} + +#if 0 +static void +scfi_state_update_reg (scfi_stateS *state, uint32_t dst, uint32_t base, + int32_t offset) +{ + if (dst >= MAX_NUM_SCFI_REGS) + return; + + state->regs[dst].base = base; + state->regs[dst].offset = offset; +} +#endif + +/* Update the SCFI state of REG as available on execution stack at OFFSET + from REG_CFA (BASE). + + Note that BASE must be REG_CFA, because any other base (REG_SP, REG_FP) + is by definition transitory in the function. */ + +static void +scfi_state_save_reg (scfi_stateS *state, uint32_t reg, uint32_t base, + int32_t offset) +{ + if (reg >= MAX_NUM_SCFI_REGS) + return; + + gas_assert (base == REG_CFA); + + state->regs[reg].base = base; + state->regs[reg].offset = offset; + state->regs[reg].state = CFI_ON_STACK; +} + +static void +scfi_state_restore_reg (scfi_stateS *state, uint32_t reg) +{ + if (reg >= MAX_NUM_SCFI_REGS) + return; + + /* Sanity check. See Rule 4. */ + gas_assert (state->regs[reg].state == CFI_ON_STACK); + gas_assert (state->regs[reg].base == REG_CFA); + + state->regs[reg].base = reg; + state->regs[reg].offset = 0; + /* PS: the register may still be on stack much after the restore, but the + SCFI state keeps the state as 'in register'. */ + state->regs[reg].state = CFI_IN_REG; +} + +/* Identify if the given GAS instruction GINSN saves a register + (of interest) on stack. 
*/ + +static bool +ginsn_scfi_save_reg_p (ginsnS *ginsn, scfi_stateS *state) +{ + bool save_reg_p = false; + struct ginsn_src *src; + struct ginsn_dst *dst; + + src = ginsn_get_src1 (ginsn); + dst = ginsn_get_dst (ginsn); + + if (!ginsn_track_reg_p (ginsn_get_src_reg (src), GINSN_GEN_SCFI)) + return save_reg_p; + + /* A register save insn may be an indirect mov. */ + if (ginsn->type == GINSN_TYPE_MOV + && ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT + && (ginsn_get_dst_reg (dst) == REG_SP + || (ginsn_get_dst_reg (dst) == REG_FP + && state->regs[REG_CFA].base == REG_FP))) + save_reg_p = true; + /* or an explicit store to stack. */ + else if (ginsn->type == GINSN_TYPE_STS) + save_reg_p = true; + + return save_reg_p; +} + +/* Identify if the given GAS instruction GINSN restores a register + (of interest) from stack. */ + +static bool +ginsn_scfi_reg_restore_p (ginsnS *ginsn, scfi_stateS *state) +{ + bool reg_restore_p = false; + struct ginsn_dst *dst; + struct ginsn_src *src1; + + dst = ginsn_get_dst (ginsn); + src1 = ginsn_get_src1 (ginsn); + + if (!ginsn_track_reg_p (ginsn_get_dst_reg (dst), GINSN_GEN_SCFI)) + return reg_restore_p; + + /* A register restore insn may be a mov with an indirect source. */ + if (ginsn->type == GINSN_TYPE_MOV + && ginsn_get_src_type (src1) == GINSN_SRC_INDIRECT + && (ginsn_get_src_reg (src1) == REG_SP + || (ginsn_get_src_reg (src1) == REG_FP + && state->regs[REG_CFA].base == REG_FP))) + reg_restore_p = true; + /* or an explicit load from stack. */ + else if (ginsn->type == GINSN_TYPE_LDS) + reg_restore_p = true; + + return reg_restore_p; +} + +/* Append the SCFI operation OP to the list of SCFI operations in the + given GINSN. */ + +static int +ginsn_append_scfi_op (ginsnS *ginsn, scfi_opS *op) +{ + scfi_opS *sop; + + if (!ginsn || !op) + return 1; + + if (!ginsn->scfi_ops) + { + ginsn->scfi_ops = XCNEW (scfi_opS *); + *ginsn->scfi_ops = op; + } + else + { + /* Add to tail. 
Most ginsns have a single SCFI operation, + so this traversal for every insertion is acceptable for now. */ + sop = *ginsn->scfi_ops; + while (sop->next) + sop = sop->next; + + sop->next = op; + } + ginsn->num_scfi_ops++; + + return 0; +} + +static void +scfi_op_add_def_cfa_reg (scfi_stateS *state, ginsnS *ginsn, uint32_t reg) +{ + scfi_opS *op = NULL; + + state->regs[REG_CFA].base = reg; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_def_cfa_register; + op->reg = REG_CFA; + op->loc = state->regs[REG_CFA]; + + ginsn_append_scfi_op (ginsn, op); +} + +static void +scfi_op_add_cfa_offset_inc (scfi_stateS *state, ginsnS *ginsn, int32_t num) +{ + scfi_opS *op = NULL; + + state->regs[REG_CFA].offset -= num; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_def_cfa_offset; + op->reg = REG_CFA; + op->loc = state->regs[REG_CFA]; + + ginsn_append_scfi_op (ginsn, op); +} + +static void +scfi_op_add_cfa_offset_dec (scfi_stateS *state, ginsnS *ginsn, int32_t num) +{ + scfi_opS *op = NULL; + + state->regs[REG_CFA].offset += num; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_def_cfa_offset; + op->reg = REG_CFA; + op->loc = state->regs[REG_CFA]; + + ginsn_append_scfi_op (ginsn, op); +} + +static void +scfi_op_add_def_cfa (scfi_stateS *state, ginsnS *ginsn, uint32_t reg, + int32_t num) +{ + scfi_opS *op = NULL; + + /* On most architectures, CFA is already somewhere on stack. 
*/ + gas_assert (num > 0); + + state->regs[REG_CFA].base = reg; + state->regs[REG_CFA].offset = num; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_def_cfa; + op->reg = REG_CFA; + op->loc = state->regs[REG_CFA]; + + ginsn_append_scfi_op (ginsn, op); +} + +static void +scfi_op_add_cfi_offset (scfi_stateS *state, ginsnS *ginsn, uint32_t reg) +{ + scfi_opS *op = NULL; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_offset; + op->reg = reg; + op->loc = state->regs[reg]; + + ginsn_append_scfi_op (ginsn, op); +} + +static void +scfi_op_add_cfa_restore (ginsnS *ginsn, uint32_t reg) +{ + scfi_opS *op = NULL; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_restore; + op->reg = reg; + op->loc.base = -1; /* FIXME invalidate. */ + op->loc.offset = 0; + + ginsn_append_scfi_op (ginsn, op); +} + +static void +scfi_op_add_cfi_remember_state (ginsnS *ginsn) +{ + scfi_opS *op = NULL; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_remember_state; + + ginsn_append_scfi_op (ginsn, op); +} + +static void +scfi_op_add_cfi_restore_state (ginsnS *ginsn) +{ + scfi_opS *op = NULL; + + op = init_scfi_op (); + + op->dw2cfi_op = DW_CFA_restore_state; + + /* FIXME - add to the beginning of the scfi_ops. */ + ginsn_append_scfi_op (ginsn, op); +} + +static int +verify_heuristic_traceable_reg_bp (ginsnS *ginsn, scfi_stateS *state) +{ + /* The function uses this variable to issue an error to the user right + away. */ + int reg_bp_scratch_p = 0; + struct ginsn_dst *dst; + struct ginsn_src *src1; + struct ginsn_src *src2; + + src1 = ginsn_get_src1 (ginsn); + src2 = ginsn_get_src2 (ginsn); + dst = ginsn_get_dst (ginsn); + + /* Stack manipulation can be done in a variety of ways. A program may + allocate it statically in the prologue or may need to do dynamic stack + allocation. 
+ + The SCFI machinery in GAS is based on some heuristics: + + - Rule 3 If the base register for CFA tracking is REG_FP, the program + must not clobber REG_FP, unless it is to switch to REG_SP based CFA + tracking (via, say, a pop %rbp on x86). Currently the code does not + guard the programmer from violations of this rule. */ + + /* Check add/sub insn with imm usage when CFA base register is REG_FP. */ + if (state->regs[REG_CFA].base == REG_FP && ginsn_get_dst_reg (dst) == REG_FP) + { + if ((ginsn->type == GINSN_TYPE_ADD || ginsn->type == GINSN_TYPE_SUB) + && ginsn_get_src_reg (src1) == REG_FP + && ginsn_get_src_type (src2) == GINSN_SRC_IMM) + reg_bp_scratch_p = 0; + /* REG_FP restore is allowed. */ + else if (ginsn->type == GINSN_TYPE_LDS) + reg_bp_scratch_p = 0; + /* mov's to memory with REG_FP base. */ + else if (ginsn->type == GINSN_TYPE_MOV + && ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT) + reg_bp_scratch_p = 0; + /* All other ginsns with REG_FP as destination make REG_FP not + traceable. */ + else + reg_bp_scratch_p = 1; + } + + if (reg_bp_scratch_p) + as_bad_where (ginsn->file, ginsn->line, + _("SCFI: usage of REG_FP as scratch not supported")); + + return reg_bp_scratch_p; +} + +static int +verify_heuristic_traceable_stack_manipulation (ginsnS *ginsn, + scfi_stateS *state) +{ + /* The function uses this variable to issue an error to the user right + away. */ + int not_traceable = 0; + struct ginsn_dst *dst; + struct ginsn_src *src1; + struct ginsn_src *src2; + + src1 = ginsn_get_src1 (ginsn); + src2 = ginsn_get_src2 (ginsn); + dst = ginsn_get_dst (ginsn); + + /* Stack manipulation can be done in a variety of ways. A program may + allocate it statically in the prologue or may need to do dynamic stack + allocation. + + The SCFI machinery in GAS is based on some heuristics: + + - Rule 1 The base register for CFA tracking may be either REG_SP or + REG_FP. 
+ + - Rule 2 If the base register for CFA tracking is REG_SP, the precise + amount of stack usage (and hence, the value of rsp) must be known at + all times. */ + + /* Check add/sub/and insn usage when CFA base register is REG_SP. + Any stack size manipulation, including stack realignment is not allowed + if CFA base register is REG_SP. */ + if (ginsn_get_dst_reg (dst) == REG_SP + && (((ginsn->type == GINSN_TYPE_ADD || ginsn->type == GINSN_TYPE_SUB) + && ginsn_get_src_type (src2) != GINSN_SRC_IMM) + || ginsn->type == GINSN_TYPE_AND)) + { + /* See Rule 2. For SP-based CFA, this (src2 not being imm) makes CFA + tracking not possible. Propagate now to caller. */ + if (state->regs[REG_CFA].base == REG_SP) + not_traceable = 1; + else if (state->traceable_p) + { + /* An extension of Rule 2. + For FP-based CFA, this may be a problem *if* certain specific + changes to the SCFI state are seen beyond this point. E.g., + register save / restore from stack. */ + gas_assert (state->regs[REG_CFA].base == REG_FP); + /* Simply make a note in the SCFI state object for now and + continue. Indicate an error when register save / restore + for callee-saved registers is seen. 
*/ + not_traceable = 0; + state->traceable_p = false; + } + } + else if (ginsn_scfi_save_reg_p (ginsn, state) && !state->traceable_p) + { + if (ginsn->type == GINSN_TYPE_MOV + && ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT + && (ginsn_get_dst_reg (dst) == REG_SP + || (ginsn_get_dst_reg (dst) == REG_FP + && state->regs[REG_CFA].base != REG_FP))) + not_traceable = 1; + } + else if (ginsn_scfi_reg_restore_p (ginsn, state) && !state->traceable_p) + { + if (ginsn->type == GINSN_TYPE_MOV + && ginsn_get_src_type (src1) == GINSN_SRC_INDIRECT + && (ginsn_get_src_reg (src1) == REG_SP + || (ginsn_get_src_reg (src1) == REG_FP + && state->regs[REG_CFA].base != REG_FP))) + not_traceable = 1; + } + + if (not_traceable) + as_bad_where (ginsn->file, ginsn->line, + _("SCFI: unsupported stack manipulation pattern")); + + return not_traceable; +} + +static int +verify_heuristic_symmetrical_restore_reg (scfi_stateS *state, uint32_t reg, + int32_t expected_offset) +{ + int sym_restore; + + /* Rule 4: Save and Restore of callee-saved registers must be symmetrical. + It is expected that the value of the saved register is restored correctly. + E.g., + push reg1 + push reg2 + ... + body of func which uses reg1, reg2 as scratch, + and may even spill them to stack. + ... + pop reg2 + pop reg1 + It is difficult to verify Rule 4 in all cases. For the SCFI machinery, + it is difficult to separate prologue-epilogue from the body of the function. + + Hence, the SCFI machinery, at this time, should only warn on an asymmetrical + restore. */ + + /* The register must have been saved on stack, for sure. */ + gas_assert (state->regs[reg].state == CFI_ON_STACK); + gas_assert (state->regs[reg].base == REG_CFA); + + sym_restore = (expected_offset == state->regs[reg].offset); + + return sym_restore; +} + +/* Perform symbolic execution of the GINSN and update its list of scfi_ops. + scfi_ops are later used to directly generate the DWARF CFI directives. 
+ Also update the SCFI state object STATE for the caller. */ + +static int +gen_scfi_ops (ginsnS *ginsn, scfi_stateS *state) +{ + int ret = 0; + int32_t offset; + int32_t expected_offset; + struct ginsn_src *src1; + struct ginsn_src *src2; + struct ginsn_dst *dst; + + if (!ginsn || !state) + return 1; + + /* For the first ginsn (of type GINSN_TYPE_SYMBOL) in the gbb, generate + the SCFI op with DW_CFA_def_cfa. Note that the register and offset are + target-specific. */ + if (GINSN_F_FUNC_BEGIN_P (ginsn)) + { + scfi_op_add_def_cfa (state, ginsn, REG_SP, SCFI_INIT_CFA_OFFSET); + state->stack_size += SCFI_INIT_CFA_OFFSET; + return ret; + } + + src1 = ginsn_get_src1 (ginsn); + src2 = ginsn_get_src2 (ginsn); + dst = ginsn_get_dst (ginsn); + + ret = verify_heuristic_traceable_stack_manipulation (ginsn, state); + if (ret) + return ret; + + ret = verify_heuristic_traceable_reg_bp (ginsn, state); + if (ret) + return ret; + + switch (ginsn->dst.type) + { + case GINSN_DST_REG: + switch (ginsn->type) + { + case GINSN_TYPE_MOV: + if (ginsn_get_src_type (src1) == GINSN_SRC_REG + && ginsn_get_src_reg (src1) == REG_SP + && ginsn_get_dst_reg (dst) == REG_FP + && state->regs[REG_CFA].base == REG_SP) + { + /* mov %rsp, %rbp. */ + scfi_op_add_def_cfa_reg (state, ginsn, ginsn_get_dst_reg (dst)); + } + else if (ginsn_get_src_type (src1) == GINSN_SRC_REG + && ginsn_get_src_reg (src1) == REG_FP + && ginsn_get_dst_reg (dst) == REG_SP + && state->regs[REG_CFA].base == REG_FP) + { + /* mov %rbp, %rsp. */ + state->stack_size = -state->regs[REG_FP].offset; + scfi_op_add_def_cfa_reg (state, ginsn, ginsn_get_dst_reg (dst)); + state->traceable_p = true; + } + else if (ginsn_get_src_type (src1) == GINSN_SRC_INDIRECT + && (ginsn_get_src_reg (src1) == REG_SP + || ginsn_get_src_reg (src1) == REG_FP) + && ginsn_track_reg_p (ginsn_get_dst_reg (dst), GINSN_GEN_SCFI)) + { + /* mov disp(%rsp), reg. */ + /* mov disp(%rbp), reg. */ + expected_offset = (((ginsn_get_src_reg (src1) == REG_SP) + ? 
-state->stack_size + : state->regs[REG_FP].offset) + + ginsn_get_src_disp (src1)); + if (verify_heuristic_symmetrical_restore_reg (state, ginsn_get_dst_reg (dst), + expected_offset)) + { + scfi_state_restore_reg (state, ginsn_get_dst_reg (dst)); + scfi_op_add_cfa_restore (ginsn, ginsn_get_dst_reg (dst)); + } + else + as_warn_where (ginsn->file, ginsn->line, + _("SCFI: asymetrical register restore")); + } + else if (ginsn_get_src_type (src1) == GINSN_SRC_REG + && ginsn_get_dst_type (dst) == GINSN_DST_REG + && ginsn_get_src_reg (src1) == REG_SP) + { + /* mov %rsp, %reg. */ + /* The value of rsp is taken directly from state->stack_size. + IMP: The workflow in gen_scfi_ops must keep it updated. + PS: Not taking the value from state->scratch[REG_SP] is + intentional. */ + state->scratch[ginsn_get_dst_reg (dst)].base = REG_CFA; + state->scratch[ginsn_get_dst_reg (dst)].offset = -state->stack_size; + } + else if (ginsn_get_src_type (src1) == GINSN_SRC_REG + && ginsn_get_dst_type (dst) == GINSN_DST_REG + && ginsn_get_dst_reg (dst) == REG_SP) + { + /* mov %reg, %rsp. */ + /* Keep the value of REG_SP updated. */ + state->stack_size = -state->scratch[ginsn_get_src_reg (src1)].offset; +# if 0 + scfi_state_update_reg (state, ginsn_get_dst_reg (dst), + state->scratch[ginsn_get_src_reg (src1)].base, + state->scratch[ginsn_get_src_reg (src1)].offset); +#endif + + state->traceable_p = true; + } + break; + case GINSN_TYPE_SUB: + if (ginsn_get_src_reg (src1) == REG_SP + && ginsn_get_dst_reg (dst) == REG_SP) + { + /* Stack inc/dec offset, when generated due to stack push and pop is + target-specific. Use the value encoded in the ginsn. */ + state->stack_size += ginsn_get_src_imm (src2); + if (state->regs[REG_CFA].base == REG_SP) + { + /* push reg. 
*/ + scfi_op_add_cfa_offset_dec (state, ginsn, ginsn_get_src_imm (src2)); + } + } + break; + case GINSN_TYPE_ADD: + if (ginsn_get_src_reg (src1) == REG_SP + && ginsn_get_dst_reg (dst) == REG_SP) + { + /* Stack inc/dec offset is target-specific. Use the value + encoded in the ginsn. */ + state->stack_size -= ginsn_get_src_imm (src2); + /* pop %reg affects CFA offset only if CFA is currently + stack-pointer based. */ + if (state->regs[REG_CFA].base == REG_SP) + { + scfi_op_add_cfa_offset_inc (state, ginsn, ginsn_get_src_imm (src2)); + } + } + else if (ginsn_get_src_reg (src1) == REG_FP + && ginsn_get_dst_reg (dst) == REG_SP + && state->regs[REG_CFA].base == REG_FP) + { + /* FIXME - what is this for ? */ + state->stack_size = 0 - (state->regs[REG_FP].offset + ginsn_get_src_imm (src2)); + } + break; + case GINSN_TYPE_LDS: + /* pop %rbp when CFA tracking is frame-pointer based. */ + if (ginsn_get_dst_reg (dst) == REG_FP && state->regs[REG_CFA].base == REG_FP) + { + scfi_op_add_def_cfa_reg (state, ginsn, REG_SP); + } + if (ginsn_track_reg_p (ginsn_get_dst_reg (dst), GINSN_GEN_SCFI)) + { + expected_offset = -state->stack_size; + if (verify_heuristic_symmetrical_restore_reg (state, ginsn_get_dst_reg (dst), + expected_offset)) + { + scfi_state_restore_reg (state, ginsn_get_dst_reg (dst)); + scfi_op_add_cfa_restore (ginsn, ginsn_get_dst_reg (dst)); + } + else + as_warn_where (ginsn->file, ginsn->line, + _("SCFI: asymetrical register restore")); + } + break; + default: + break; + } + break; + + case GINSN_DST_STACK: + gas_assert (ginsn->type == GINSN_TYPE_STS); + + if (ginsn_track_reg_p (ginsn_get_src_reg (src1), GINSN_GEN_SCFI) + && state->regs[ginsn_get_src_reg (src1)].state != CFI_ON_STACK) + { + /* reg is saved on stack at the current value of REG_SP. */ + offset = 0 - state->stack_size; + scfi_state_save_reg (state, ginsn_get_src_reg (src1), REG_CFA, offset); + /* Track callee-saved registers. 
*/ + scfi_op_add_cfi_offset (state, ginsn, ginsn_get_src_reg (src1)); + } + break; + + case GINSN_DST_INDIRECT: + gas_assert (ginsn->type == GINSN_TYPE_MOV); + /* mov reg, disp(%rbp) */ + /* mov reg, disp(%rsp) */ + if (ginsn_track_reg_p (ginsn_get_src_reg (src1), GINSN_GEN_SCFI) + && state->regs[ginsn_get_src_reg (src1)].state != CFI_ON_STACK) + { + if (ginsn_get_dst_reg (dst) == REG_SP) + { + /* mov reg, disp(%rsp) */ + offset = 0 - state->stack_size + ginsn_get_dst_disp (dst); + scfi_state_save_reg (state, ginsn_get_src_reg (src1), REG_CFA, offset); + scfi_op_add_cfi_offset (state, ginsn, ginsn_get_src_reg (src1)); + } + else if (ginsn_get_dst_reg (dst) == REG_FP) + { + gas_assert (state->regs[REG_CFA].base == REG_FP); + /* mov reg, disp(%rbp) */ + offset = 0 - state->regs[REG_CFA].offset + ginsn_get_dst_disp (dst); + scfi_state_save_reg (state, ginsn_get_src_reg (src1), REG_CFA, offset); + scfi_op_add_cfi_offset (state, ginsn, ginsn_get_src_reg (src1)); + } + } + break; + + default: + /* Skip GINSN_DST_UNKNOWN and GINSN_DST_MEM as they are uninteresting + currently for SCFI. */ + break; + } + + return ret; +} + +/* Recursively perform forward flow of the (unwind information) SCFI state + starting at basic block GBB. + + The forward flow process propagates the SCFI state at exit of a basic block + to the successor basic block. + + Returns error code, if any. */ + +static int +forward_flow_scfi_state (gcfgS *gcfg, gbbS *gbb, scfi_stateS *state) +{ + ginsnS *ginsn; + gbbS *prev_bb; + gedgeS *gedge = NULL; + int ret = 0; + + if (gbb->visited) + { + /* Check that the SCFI state is the same as previous. 
*/ + ret = cmp_scfi_state (state, gbb->entry_state); + if (ret) + as_bad (_("SCFI: Bad CFI propagation perhaps")); + return ret; + } + + gbb->visited = true; + + gbb->entry_state = XCNEW (scfi_stateS); + memcpy (gbb->entry_state, state, sizeof (scfi_stateS)); + + /* Perform symbolic execution of each ginsn in the gbb and update the + scfi_ops list of each ginsn (and also update the STATE object). */ + bb_for_each_insn(gbb, ginsn) + { + ret = gen_scfi_ops (ginsn, state); + if (ret) + goto fail; + } + + gbb->exit_state = XCNEW (scfi_stateS); + memcpy (gbb->exit_state, state, sizeof (scfi_stateS)); + + /* Forward flow the SCFI state. Currently, we process the next basic block + in DFS order. But any forward traversal order should be fine. */ + prev_bb = gbb; + if (gbb->num_out_gedges) + { + bb_for_each_edge(gbb, gedge) + { + gbb = gedge->dst_bb; + if (gbb->visited) + { + ret = cmp_scfi_state (gbb->entry_state, state); + if (ret) + goto fail; + } + + if (!gedge->visited) + { + gedge->visited = true; + + /* Entry SCFI state for the destination bb of the edge is the + same as the exit SCFI state of the source bb of the edge. */ + memcpy (state, prev_bb->exit_state, sizeof (scfi_stateS)); + ret = forward_flow_scfi_state (gcfg, gbb, state); + if (ret) + goto fail; + } + } + } + + return 0; + +fail: + + if (gedge) + gedge->visited = true; + return 1; +} + +static int +backward_flow_scfi_state (symbolS *func ATTRIBUTE_UNUSED, gcfgS *gcfg) +{ + gbbS **prog_order_bbs; + gbbS **restore_bbs; + gbbS *current_bb; + gbbS *prev_bb; + gbbS *dst_bb; + ginsnS *ginsn; + gedgeS *gedge; + + int ret = 0; + int i, j; + + /* Basic blocks in reverse program order. */ + prog_order_bbs = XCNEWVEC (gbbS *, gcfg->num_gbbs); + /* Basic blocks for which CFI remember op needs to be generated. */ + restore_bbs = XCNEWVEC (gbbS *, gcfg->num_gbbs); + + gcfg_get_bbs_in_prog_order (gcfg, prog_order_bbs); + + i = gcfg->num_gbbs - 1; + /* Traverse in reverse program order. 
*/ + while (i > 0) + { + current_bb = prog_order_bbs[i]; + prev_bb = prog_order_bbs[i-1]; + if (cmp_scfi_state (prev_bb->exit_state, current_bb->entry_state)) + { + /* Candidate for .cfi_restore_state found. */ + ginsn = bb_get_first_ginsn (current_bb); + scfi_op_add_cfi_restore_state (ginsn); + /* Memorize current_bb now to find location for its remember state + later. */ + restore_bbs[i] = current_bb; + } + else + { + bb_for_each_edge (current_bb, gedge) + { + dst_bb = gedge->dst_bb; + for (j = 0; j < gcfg->num_gbbs; j++) + if (restore_bbs[j] == dst_bb) + { + ginsn = bb_get_last_ginsn (current_bb); + scfi_op_add_cfi_remember_state (ginsn); + /* Remove the memorised restore_bb from the list. */ + restore_bbs[j] = NULL; + break; + } + } + } + i--; + } + + /* All .cfi_restore_state pseudo-ops must have a corresponding + .cfi_remember_state by now. */ + for (j = 0; j < gcfg->num_gbbs; j++) + if (restore_bbs[j] != NULL) + { + ret = 1; + break; + } + + free (restore_bbs); + free (prog_order_bbs); + + return ret; +} + +/* Synthesize DWARF CFI for a function. */ + +int +scfi_synthesize_dw2cfi (symbolS *func, gcfgS *gcfg, gbbS *root_bb) +{ + int ret; + scfi_stateS *init_state; + + init_state = XCNEW (scfi_stateS); + init_state->traceable_p = true; + + /* Traverse the input GCFG and perform forward flow of information. + Update the scfi_op(s) per ginsn. */ + ret = forward_flow_scfi_state (gcfg, root_bb, init_state); + if (ret) + { + as_warn (_("SCFI: forward pass failed for func '%s'"), S_GET_NAME (func)); + goto end; + } + + ret = backward_flow_scfi_state (func, gcfg); + if (ret) + { + as_warn (_("SCFI: backward pass failed for func '%s'"), S_GET_NAME (func)); + goto end; + } + +end: + free (init_state); + return ret; +} + +static int +handle_scfi_dot_cfi (ginsnS *ginsn) +{ + scfi_opS *op; + + /* Nothing to do. 
*/ + if (!ginsn->scfi_ops) + return 0; + + op = *ginsn->scfi_ops; + if (!op) + goto bad; + + while (op) + { + switch (op->dw2cfi_op) + { + case DW_CFA_def_cfa_register: + scfi_dot_cfi (DW_CFA_def_cfa_register, op->loc.base, 0, 0, + ginsn->sym); + break; + case DW_CFA_def_cfa_offset: + scfi_dot_cfi (DW_CFA_def_cfa_offset, op->loc.base, 0, + op->loc.offset, ginsn->sym); + break; + case DW_CFA_def_cfa: + scfi_dot_cfi (DW_CFA_def_cfa, op->loc.base, 0, op->loc.offset, + ginsn->sym ); + break; + case DW_CFA_offset: + scfi_dot_cfi (DW_CFA_offset, op->reg, 0, op->loc.offset, ginsn->sym); + break; + case DW_CFA_restore: + scfi_dot_cfi (DW_CFA_restore, op->reg, 0, 0, ginsn->sym); + break; + case DW_CFA_remember_state: + scfi_dot_cfi (DW_CFA_remember_state, 0, 0, 0, ginsn->sym); + break; + case DW_CFA_restore_state: + scfi_dot_cfi (DW_CFA_restore_state, 0, 0, 0, ginsn->sym); + break; + default: + goto bad; + break; + } + op = op->next; + } + + return 0; +bad: + as_bad (_("SCFI: Invalid DWARF CFI opcode data")); + return 1; +} + +/* Emit Synthesized DWARF CFI. */ + +int +scfi_emit_dw2cfi (symbolS *func) +{ + struct frch_ginsn_data *frch_gdata; + ginsnS* ginsn = NULL; + + frch_gdata = frchain_now->frch_ginsn_data; + ginsn = frch_gdata->gins_rootP; + + while (ginsn) + { + switch (ginsn->type) + { + case GINSN_TYPE_SYMBOL: + /* .cfi_startproc and .cfi_endproc pseudo-ops. */ + if (GINSN_F_FUNC_BEGIN_P(ginsn)) + { + scfi_dot_cfi_startproc (frch_gdata->start_addr); + break; + } + else if (GINSN_F_FUNC_END_P(ginsn)) + { + scfi_dot_cfi_endproc (ginsn->sym); + break; + } + /* Fall through. */ + case GINSN_TYPE_ADD: + case GINSN_TYPE_AND: + case GINSN_TYPE_CALL: + case GINSN_TYPE_JUMP: + case GINSN_TYPE_JUMP_COND: + case GINSN_TYPE_MOV: + case GINSN_TYPE_LDS: + case GINSN_TYPE_STS: + case GINSN_TYPE_SUB: + case GINSN_TYPE_OTHER: + case GINSN_TYPE_RETURN: + + /* For all other SCFI ops, invoke the handler. 
*/ + if (ginsn->scfi_ops) + handle_scfi_dot_cfi (ginsn); + break; + + default: + /* No other GINSN_TYPE_* expected. */ + as_bad (_("SCFI: bad ginsn for func '%s'"), + S_GET_NAME (func)); + break; + } + ginsn = ginsn->next; + } + return 0; +} + +#else + +int +scfi_emit_dw2cfi (symbolS *func ATTRIBUTE_UNUSED) +{ + as_bad (_("SCFI: unsupported for target")); + return 1; +} + +int +scfi_synthesize_dw2cfi (symbolS *func ATTRIBUTE_UNUSED, + gcfgS *gcfg ATTRIBUTE_UNUSED, + gbbS *root_bb ATTRIBUTE_UNUSED) +{ + as_bad (_("SCFI: unsupported for target")); + return 1; +} + +#endif /* defined (TARGET_USE_SCFI) && defined (TARGET_USE_GINSN). */ diff --git a/gas/scfi.h b/gas/scfi.h new file mode 100644 index 00000000000..8dcf8412655 --- /dev/null +++ b/gas/scfi.h @@ -0,0 +1,31 @@ +/* scfi.h - Support for synthesizing CFI for asm. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GAS, the GNU Assembler. + + GAS is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GAS is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GAS; see the file COPYING. If not, write to the Free + Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA + 02110-1301, USA. */ + +#ifndef SCFI_H +#define SCFI_H + +#include "as.h" +#include "ginsn.h" + +int scfi_emit_dw2cfi (symbolS *func); + +int scfi_synthesize_dw2cfi (symbolS *func, gcfgS *gcfg, gbbS *root_bb); + +#endif /* SCFI_H. 
*/ diff --git a/gas/subsegs.h b/gas/subsegs.h index ace0657bdfb..c90c5622465 100644 --- a/gas/subsegs.h +++ b/gas/subsegs.h @@ -40,6 +40,7 @@ #include "obstack.h" struct frch_cfi_data; +struct frch_ginsn_data; struct frchain /* control building of a frag chain */ { /* FRCH = FRagment CHain control */ @@ -52,6 +53,7 @@ struct frchain /* control building of a frag chain */ struct obstack frch_obstack; /* for objects in this frag chain */ fragS *frch_frag_now; /* frag_now for this subsegment */ struct frch_cfi_data *frch_cfi_data; + struct frch_ginsn_data *frch_ginsn_data; }; typedef struct frchain frchainS; diff --git a/gas/symbols.c b/gas/symbols.c index 45e46ed39b7..b595ffad104 100644 --- a/gas/symbols.c +++ b/gas/symbols.c @@ -25,6 +25,7 @@ #include "obstack.h" /* For "symbols.h" */ #include "subsegs.h" #include "write.h" +#include "scfi.h" #include #ifndef CHAR_BIT @@ -709,6 +710,8 @@ colon (/* Just seen "x:" - rattle symbols & frags. */ #ifdef obj_frob_label obj_frob_label (symbolP); #endif + if (flag_synth_cfi) + ginsn_frob_label (symbolP); return symbolP; } -- 2.41.0