From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 2109)
 id CA4B93858D20; Tue, 22 Nov 2022 12:42:31 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org CA4B93858D20
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
 s=default; t=1669120951;
 bh=hBbbZVWbryBlcHu9QFMo3Ec3k3c1Qber5sEEdpZmLIs=;
 h=From:To:Subject:Date:From;
 b=mp5zHpkmpaSMgj3R94zuiAO32Uu698Lt2ZMMeW9DeWBToFnX8ImPX2CbKYEG7FINc
 VXUX58kuh+Zct862/PMGTMm6wh1uYXcqF/WufAZZvcDoA5CusfivxoX7lZsYKmRc8b
 GK7dkFR7ssWo4CTjMfKZRll7PZCeaxP/FOm5iZBQ=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Stam Markianos-Wright
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/vendors/ARM/heads/morello)] Un-obfuscate addresses with cheri bounds for debug info
X-Act-Checkin: gcc
X-Git-Author: Matthew Malcomson
X-Git-Refname: refs/vendors/ARM/heads/morello
X-Git-Oldrev: ef254a8c7f201389ff7963bacb84254c61a2053e
X-Git-Newrev: be61a38a5eeb79f1e21d57c069555af7c987a1e3
Message-Id: <20221122124231.CA4B93858D20@sourceware.org>
Date: Tue, 22 Nov 2022 12:42:31 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:be61a38a5eeb79f1e21d57c069555af7c987a1e3

commit be61a38a5eeb79f1e21d57c069555af7c987a1e3
Author: Matthew Malcomson
Date: Fri Nov 11 10:55:21 2022 +0000

Un-obfuscate addresses with cheri bounds for debug info

After the stack bounds patch in 7bc86e22db7 we have a lot of IL floating
around with bounds applied to pointers. This is a great thing, but our
code for outputting debug information does not understand the RTL we use
to represent this.

In order to teach our dwarf2out functions to understand this RTL, it
seems best that we should describe this operation with a mid-end code
and act to ignore that wrapper in the debug code (since it does not
change the location which is accessed).
This would be beneficial since we would then have a generic
representation for a generic CHERI operation, and we could attempt to
get optimizers throughout the compiler to understand this operation and
act accordingly.

At the time of writing the bounds patch we considered this approach and
decided against implementing it because of time constraints. Those time
constraints are no looser now. Hence we stick with the current approach
of using an UNSPEC for the moment.

Given the artificial limitation that we stick with the UNSPEC
representation, we need to teach the debug output code to understand
this particular UNSPEC. Since UNSPEC codes are target specific, this
requires a target hook.

As it happens, there is already a target hook for extracting a
representation that the mid-end can understand from an UNSPEC. This is
the TARGET_DELEGITIMIZE_ADDRESS hook. This hook is currently defined
such that the mid-end can understand addresses obfuscated by
LEGITIMIZE_ADDRESS. Its definition currently permits use in optimizers
as well as debug information, though in practice the majority of uses
are for debug information.

For this patch we change the semantics of this hook slightly to work on
the UNSPEC that represents a bounded pointer. Our changes are:
1) This hook now undoes the obfuscating effects of the force_operand,
   legitimize_address, or legitimize_reload_address hooks.
2) There is a new distinction between using this hook to get something
   that is semantically equivalent (and hence can be used as a
   replacement in an optimisation pass), and to get something that is
   equivalent enough for debug purposes.

Removing the action of applying bounds is only done for debug purposes.

We could have introduced a slightly different split, between extracting
an equivalent address and extracting something that it is semantically
equivalent to access through.
This would have meant that the use in avoid_constant_pool_reference
would also look through any bounds setting IL, which is fine since the
point of that function is to replace a memory access of a constant pool
with the constant that it points to. This doesn't matter for us right
now since accesses of the constant pool are not bounded and the concept
is slightly more difficult to write down in documentation, so we choose
the debug/non-debug split above.

On top of the reasons above, we fully intend to revisit this when we are
revisiting the stack bounds patch. When that happens I believe it's
likely that we would simply introduce a new RTL code and remove this
distinction as unnecessary.

Diff:
---
 gcc/config/aarch64/aarch64.c                       | 20 ++++++++++++++++++++
 gcc/doc/tm.texi                                    | 10 ++++++++--
 gcc/dwarf2out.c                                    |  6 +++---
 gcc/fwprop.c                                       |  2 +-
 gcc/reload1.c                                      |  2 +-
 gcc/rtl.h                                          |  2 +-
 gcc/simplify-rtx.c                                 |  4 ++--
 gcc/target.def                                     | 12 +++++++++---
 gcc/testsuite/c-c++-common/dwarf2/basic-location.c |  8 ++++++++
 gcc/var-tracking.c                                 |  6 +++---
 10 files changed, 56 insertions(+), 16 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c5829e2b0f7..bfe5a312134 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11756,6 +11756,23 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x */, machine_mode mode)
   return x;
 }
 
+/* N.b. this is a little bit of a hack.
+   This hook is introduced to try to counter any obfuscating effects of
+   legitimize_address. MORELLO TODO understand whether this is all working
+   fine.
*/
+static rtx
+aarch64_delegitimize_address (rtx x, bool debug_only)
+{
+  x = delegitimize_mem_from_attrs (x, debug_only);
+  if (!debug_only || GET_CODE (x) != UNSPEC)
+    return x;
+  if (XINT (x, 1) != UNSPEC_CHERI_BOUNDS_SET
+      && XINT (x, 1) != UNSPEC_CHERI_BOUNDS_SET_EXACT
+      && XINT (x, 1) != UNSPEC_CHERI_BOUNDS_SET_MAYBE_EXACT)
+    return x;
+  return XVECEXP (x, 0, 0);
+}
+
 static reg_class_t
 aarch64_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x,
                           reg_class_t rclass,
@@ -25788,6 +25805,9 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_LEGITIMIZE_ADDRESS
 #define TARGET_LEGITIMIZE_ADDRESS aarch64_legitimize_address
 
+#undef TARGET_DELEGITIMIZE_ADDRESS
+#define TARGET_DELEGITIMIZE_ADDRESS aarch64_delegitimize_address
+
 #undef TARGET_SCHED_CAN_SPECULATE_INSN
 #define TARGET_SCHED_CAN_SPECULATE_INSN aarch64_sched_can_speculate_insn
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 30705cb576d..32f901b281c 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -5970,14 +5970,20 @@ This hook returns true if @var{x} is a legitimate constant for a
 The default definition returns true.
 @end deftypefn
 
-@deftypefn {Target Hook} rtx TARGET_DELEGITIMIZE_ADDRESS (rtx @var{x})
+@deftypefn {Target Hook} rtx TARGET_DELEGITIMIZE_ADDRESS (rtx @var{x}, bool @var{debug})
 This hook is used to undo the possibly obfuscating effects of the
-@code{LEGITIMIZE_ADDRESS} and @code{LEGITIMIZE_RELOAD_ADDRESS} target
+@code{FORCE_OPERAND}, @code{LEGITIMIZE_ADDRESS} and @code{LEGITIMIZE_RELOAD_ADDRESS} target
 macros. Some backend implementations of these macros wrap symbol
 references inside an @code{UNSPEC} rtx to represent PIC or similar
 addressing modes. This target hook allows GCC's optimizers to understand
 the semantics of these opaque @code{UNSPEC}s by converting them back
 into their original form.
+
+The second argument @var{DEBUG} is there to indicate whether this
+information is for debug purposes or not.
Some implementations of this hook
+may return an address which is not semantically equivalent (e.g. has
+different security properties when accessing) but is equivalent for the user
+to know where values are.
 @end deftypefn
 
 @deftypefn {Target Hook} bool TARGET_CONST_NOT_OK_FOR_DEBUG_P (rtx @var{x})
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 7dd1e781f35..b31124a0419 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -15504,7 +15504,7 @@ mem_loc_descriptor (rtx rtl, machine_mode mode,
      actually within the array. That's *not* necessarily the same as the
      zeroth element of the array. */
 
-  rtl = targetm.delegitimize_address (rtl);
+  rtl = targetm.delegitimize_address (rtl, true);
 
   if (mode != GET_MODE (rtl) && GET_MODE (rtl) != VOIDmode)
     return NULL;
@@ -20085,7 +20085,7 @@ rtl_for_decl_location (tree decl)
                   && VAR_P (decl)
                   && TREE_STATIC (decl))))
         {
-          rtl = targetm.delegitimize_address (rtl);
+          rtl = targetm.delegitimize_address (rtl, true);
           return rtl;
         }
       rtl = NULL_RTX;
@@ -20191,7 +20191,7 @@ rtl_for_decl_location (tree decl)
     rtl = rtl_for_decl_init (DECL_INITIAL (decl), TREE_TYPE (decl));
 
   if (rtl)
-    rtl = targetm.delegitimize_address (rtl);
+    rtl = targetm.delegitimize_address (rtl, true);
 
   /* If we don't look past the constant pool, we risk emitting a
      reference to a constant pool entry that isn't referenced from
diff --git a/gcc/fwprop.c b/gcc/fwprop.c
index 705d2885aae..5718eea2272 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -619,7 +619,7 @@ propagate_rtx_1 (rtx *px, rtx old_rtx, rtx new_rtx, int flags)
       if (!can_simplify_addr (op0))
         return true;
 
-      op0 = new_op0 = targetm.delegitimize_address (op0);
+      op0 = new_op0 = targetm.delegitimize_address (op0, false);
       valid_ops &= propagate_rtx_1 (&new_op0, old_rtx, new_rtx,
                                     flags | PR_CAN_APPEAR);
 
diff --git a/gcc/reload1.c b/gcc/reload1.c
index 688e8aedfd3..e2d3b492138 100644
--- a/gcc/reload1.c
+++ b/gcc/reload1.c
@@ -1122,7 +1122,7 @@ reload (rtx_insn *first, int global)
           else if (reg_equiv_invariant (i))
             equiv = reg_equiv_invariant (i);
           else if (reg && MEM_P (reg))
-            equiv = targetm.delegitimize_address (reg);
+            equiv = targetm.delegitimize_address (reg, true);
           else if (reg && REG_P (reg) && (int)REGNO (reg) != i)
             equiv = reg;
 
diff --git a/gcc/rtl.h b/gcc/rtl.h
index b10fee0f9db..83fd713f175 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -3512,7 +3512,7 @@ extern rtx simplify_rtx (const_rtx);
 extern rtx gen_pointer_plus (scalar_addr_mode, rtx, rtx);
 extern rtx gen_pointer_minus (scalar_addr_mode, rtx, rtx);
 extern rtx avoid_constant_pool_reference (rtx);
-extern rtx delegitimize_mem_from_attrs (rtx);
+extern rtx delegitimize_mem_from_attrs (rtx, bool debug_only=false);
 extern bool mode_signbit_p (machine_mode, const_rtx);
 extern bool val_signbit_p (machine_mode, unsigned HOST_WIDE_INT);
 extern bool val_signbit_known_set_p (machine_mode,
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index b209a9a419b..0417149a42a 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -261,7 +261,7 @@ avoid_constant_pool_reference (rtx x)
   addr = XEXP (x, 0);
 
   /* Call target hook to avoid the effects of -fpic etc.... */
-  addr = targetm.delegitimize_address (addr);
+  addr = targetm.delegitimize_address (addr, false);
 
   /* Split the address into a base and integer offset. */
   addr = strip_offset (addr, &offset);
@@ -298,7 +298,7 @@ avoid_constant_pool_reference (rtx x)
    overrider call it. */
 
 rtx
-delegitimize_mem_from_attrs (rtx x)
+delegitimize_mem_from_attrs (rtx x, bool debug ATTRIBUTE_UNUSED)
 {
   /* MEMs without MEM_OFFSETs may have been offset, so we can't just
      use their base addresses as equivalent.
*/
diff --git a/gcc/target.def b/gcc/target.def
index 8ae729d9a63..2f6e5a7c092 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2839,13 +2839,19 @@ strategy can generate better code.",
 DEFHOOK
 (delegitimize_address,
  "This hook is used to undo the possibly obfuscating effects of the\n\
-@code{LEGITIMIZE_ADDRESS} and @code{LEGITIMIZE_RELOAD_ADDRESS} target\n\
+@code{FORCE_OPERAND}, @code{LEGITIMIZE_ADDRESS} and @code{LEGITIMIZE_RELOAD_ADDRESS} target\n\
 macros. Some backend implementations of these macros wrap symbol\n\
 references inside an @code{UNSPEC} rtx to represent PIC or similar\n\
 addressing modes. This target hook allows GCC's optimizers to understand\n\
 the semantics of these opaque @code{UNSPEC}s by converting them back\n\
-into their original form.",
- rtx, (rtx x),
+into their original form.\n\
+\n\
+The second argument @var{DEBUG} is there to indicate whether this\n\
+information is for debug purposes or not. Some implementations of this hook\n\
+may return an address which is not semantically equivalent (e.g. has\n\
+different security properties when accessing) but is equivalent for the user\n\
+to know where values are.",
+ rtx, (rtx x, bool debug),
  delegitimize_mem_from_attrs)
 
 /* Given an RTX, return true if it is not ok to emit it into debug info
diff --git a/gcc/testsuite/c-c++-common/dwarf2/basic-location.c b/gcc/testsuite/c-c++-common/dwarf2/basic-location.c
new file mode 100644
index 00000000000..765954d929b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/dwarf2/basic-location.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-gdwarf-2 -dA -O0" } */
+/* Check that we have a location on the formal parameter "x".
+   Do this by checking that we have a formal_parameter with the name "x" and
+   a location of some kind.
*/
+/* { dg-final { scan-assembler {DW_TAG_formal_parameter\)[\r\n]+[^\r\n]*"x\\0"\t// DW_AT_name[\r\n]+[^\r\n]+DW_AT_decl_file[^\r\n]*[\r\n]+[^\r\n]+DW_AT_decl_line[\r\n]+[^\r\n]+DW_AT_decl_column[\r\n]+[^\r\n]+DW_AT_type[\r\n]+[^\r\n]+DW_AT_location} } } */
+int f(int x) { return x + 1; }
+
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index a164f5d1fd3..a55e9285d93 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -1069,7 +1069,7 @@ adjust_mems (rtx loc, const_rtx old_rtx, void *data)
       mem = loc;
       if (!amd->store)
         {
-          mem = targetm.delegitimize_address (mem);
+          mem = targetm.delegitimize_address (mem, true);
           if (mem != loc && !MEM_P (mem))
             return simplify_replace_fn_rtx (mem, old_rtx, adjust_mems, data);
         }
@@ -1083,7 +1083,7 @@ adjust_mems (rtx loc, const_rtx old_rtx, void *data)
       amd->store = store_save;
       amd->mem_mode = mem_mode_save;
       if (mem == loc)
-        addr = targetm.delegitimize_address (addr);
+        addr = targetm.delegitimize_address (addr, true);
       if (addr != XEXP (mem, 0))
         mem = replace_equiv_address_nv (mem, addr);
       if (!amd->store)
@@ -8656,7 +8656,7 @@ resolve_expansions_pending_recursion (vec *pending)
       (d).expanding.release (); \
 \
       if ((l) && MEM_P (l)) \
-        (l) = targetm.delegitimize_address (l); \
+        (l) = targetm.delegitimize_address (l, true); \
     } \
   while (0)
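
Illustration only, not part of the commit: the debug/non-debug split described in the commit message can be sketched with a toy model. Everything below is hypothetical (the Expr type, make_reg, make_bounds_set, toy_delegitimize and demo are stand-ins, not GCC's rtx or its APIs). It only mirrors the shape of the new hook: the bounds-setting wrapper is looked through when the caller asks for a debug-only answer, and is left alone otherwise.

```cpp
#include <memory>
#include <string>
#include <utility>

// Toy stand-in for an RTL expression. This is NOT GCC's rtx; it exists
// only to illustrate the debug_only distinction.
struct Expr {
  enum Kind { Reg, BoundsSet } kind;  // BoundsSet models the bounds-setting UNSPEC
  std::string name;                   // register name, used when kind == Reg
  std::shared_ptr<Expr> inner;        // wrapped address, used when kind == BoundsSet
};

using ExprPtr = std::shared_ptr<Expr>;

// Hypothetical helpers for building toy expressions.
ExprPtr make_reg(const std::string &name) {
  return std::make_shared<Expr>(Expr{Expr::Reg, name, nullptr});
}

ExprPtr make_bounds_set(ExprPtr inner) {
  return std::make_shared<Expr>(Expr{Expr::BoundsSet, "", std::move(inner)});
}

// Mirrors the shape of the hook in the patch: outside debug-only use the
// expression is returned unchanged, because dropping the bounds would not
// be semantically equivalent. For debug-only use the wrapper is removed,
// since it does not change the location being accessed.
ExprPtr toy_delegitimize(ExprPtr x, bool debug_only) {
  if (!debug_only || x->kind != Expr::BoundsSet)
    return x;
  return x->inner;
}

// Small self-check of the three interesting cases.
bool demo() {
  ExprPtr sp = make_reg("sp");
  ExprPtr bounded = make_bounds_set(sp);
  return toy_delegitimize(bounded, true) == sp          // debug: wrapper removed
         && toy_delegitimize(bounded, false) == bounded // optimizer: unchanged
         && toy_delegitimize(sp, true) == sp;           // plain address: unchanged
}
```

In this model an optimizer-style caller (debug_only false) always gets back exactly what it passed in, which corresponds to the requirement that only debug consumers may see an address with the bounds removed.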