From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by sourceware.org (Postfix) with ESMTP id 60BF53857C48 for ; Fri, 26 Mar 2021 17:29:20 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 60BF53857C48 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-404-1ZOJfWpaPJmqjnTt821M0Q-1; Fri, 26 Mar 2021 13:29:18 -0400 X-MC-Unique: 1ZOJfWpaPJmqjnTt821M0Q-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 6C034107ACCA; Fri, 26 Mar 2021 17:29:17 +0000 (UTC) Received: from tucnak.zalov.cz (ovpn-112-95.ams2.redhat.com [10.36.112.95]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 7903260937; Fri, 26 Mar 2021 17:29:14 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.16.1/8.16.1) with ESMTPS id 12QHTBKN3282018 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Fri, 26 Mar 2021 18:29:12 +0100 Received: (from jakub@localhost) by tucnak.zalov.cz (8.16.1/8.16.1/Submit) id 12QHTAxZ3282017; Fri, 26 Mar 2021 18:29:10 +0100 Date: Fri, 26 Mar 2021 18:29:10 +0100 From: Jakub Jelinek To: Jason Merrill Cc: Richard Henderson , gdb@sourceware.org, gcc-patches@gcc.gnu.org, Mark Wielaard Subject: Re: [PATCH] dwarf2cfi: Defer queued register saves some more [PR99334] Message-ID: <20210326172910.GX1179226@tucnak> Reply-To: Jakub Jelinek References: <20210324093132.GE186063@tucnak> <0d60aeed-783d-1128-65e5-fc24c504bf20@redhat.com> MIME-Version: 1.0 In-Reply-To: <0d60aeed-783d-1128-65e5-fc24c504bf20@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-5.9 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gdb@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Mar 2021 17:29:22 -0000 On Thu, Mar 25, 2021 at 05:43:57PM -0400, Jason Merrill via Gcc-patches wrote: > > Will e.g. GDB be happy about the changes? > > I would expect so, but it would be good to have someone from GDB verify. Ok, have verified it now on the testcase from the PR (stubbed out what it calls, called it from main with all NULL pointers and just in gdb stepped through the prologue and at each insn looked at p $rsp p $rbp up p $rsp p $rbp down stepi With unpatched gcc indeed it works everywhere except when stopping at the start of movq %rsp, %rbp instruction where $rbp in the parent frame was unrelated garbage. And with the patch below, I get the expected values in all cases. > > Thinking some more about this, what can be problematic on the GCC side > > is inline asm, that can and often will contain multiple instructions. > > For that is an easy fix, just test asm_noperands and handle > > clobbers_queued_reg_save before the insn rather than after in that case. > > Sure, but does inline asm go through dwarf2out_frame_debug? Not through dwarf2out_frame_debug, as inline asm is not RTX_FRAME_RELATED_P, but it goes through that scan_trace hunk I was changing like any other insn (except for BARRIERs, instructions not in basic blocks, some NOTEs and DEBUG_INSNs). And clobbers_queued_reg_save (insn) can be true for the inline asm either because it stores some registers or because it clobbers them. And then there are other backend insns that expand to multiple hw instructions, e.g. in the i386 backend typically (but not sure if we guarantee that) marked as get_attr_type (insn) == TYPE_MULTI. So, if we wanted to do what my patch did, we would need to come up with some new (optional) insn attribute, say get_attr_undivisible (insn), e.g. define that attribute on i386 to "type" attribute equal to "multi" by default and perhaps with some exceptions if needed, and use that attribute if HAVE_ATTR_undivisible in dwarf2cfi.c to decide whether it can be emitted after the insn or needs to be emitted before it. > > But there is another problem, instruction patterns that emit multiple > > hw instructions, code can stop in between them. So, do we want some > > instruction attribute that would conservatively tell us which instruction > > patterns are guaranteed to be single machine instructions? > > It seems to me that in that situation you'd want to add the save to *%rsp, > and still not update to *%rbp until after the combined instruction. So, today I've tried instead to deal with it through REG_FRAME_RELATED_EXPR from the backend, but that failed miserably as explained in the PR, dwarf2cfi.c has some rules (Rule 16 to Rule 19) that are specific to the dynamic stack realignment using drap register that only the i386 backend does right now, and by using REG_FRAME_RELATED_EXPR or REG_CFA* notes we can't emulate those rules. The following patch instead does the deferring of the hard frame pointer save rule in dwarf2cfi.c Rule 18 handling and emits it on the (set hfp sp) assignment that must appear shortly after it and adds assertion that it is the case. The difference before/after the patch on the assembly is: --- pr99334.s~ 2021-03-26 15:42:40.881749380 +0100 +++ pr99334.s 2021-03-26 17:38:05.729161910 +0100 @@ -11,8 +11,8 @@ _func_with_dwarf_issue_: andq $-16, %rsp pushq -8(%r10) pushq %rbp - .cfi_escape 0x10,0x6,0x2,0x76,0 movq %rsp, %rbp + .cfi_escape 0x10,0x6,0x2,0x76,0 pushq %r15 pushq %r14 pushq %r13 i.e. does just what we IMHO need, after pushq %rbp %rbp still contains parent's frame value and so the save rule doesn't need to be overridden there, ditto at the start of the next insn before the side-effect took effect, and we override it only after it when %rbp already has the right value. If some other target adds dynamic stack realignment in the future and the offset 0 case wouldn't be true there, the code can be adjusted so that it works on all the drap architectures, I'm pretty sure the code would need other adjustments too. For the rule 18 and for the (set hfp sp) after it we already have asserts for the drap cases that check whether the code looks the way i?86/x86_64 emit it currently. Is this ok for trunk if it passes bootstrap/regtest on x86_64-linux and i686-linux (I've verified no other target has the drap stuff right now, nvptx does nvptx_get_drap_rtx but no other realignment stack (unclear why?), but nvptx is a non-RA target so dwarf2cfi doesn't apply to it)? 2021-03-26 Jakub Jelinek PR debug/99334 * dwarf2out.h (struct dw_fde_node): Add rule18 member. * dwarf2cfi.c (dwarf2out_frame_debug_expr): When handling (set hfp sp) assignment with drap_reg active, queue reg save for hfp with offset 0 and flush queued reg saves. When handling a push with rule18, defer queueing reg save for hfp and just assert the offset is 0. (scan_trace): Assert that fde->rule18 is false. --- gcc/dwarf2out.h.jj 2021-01-04 10:25:38.718235126 +0100 +++ gcc/dwarf2out.h 2021-03-26 17:12:30.599007479 +0100 @@ -108,6 +108,12 @@ struct GTY(()) dw_fde_node { /* True iff dw_fde_second_begin label is in text_section or cold_text_section. */ unsigned second_in_std_section : 1; + /* True if Rule 18 described in dwarf2cfi.c is in action, i.e. for dynamic + stack realignment in between pushing of hard frame pointer to stack + and setting hard frame pointer to stack pointer. The register save for + hard frame pointer register should be emitted only on the latter + instruction. */ + unsigned rule18 : 1; }; --- gcc/dwarf2cfi.c.jj 2021-03-23 19:42:05.000000000 +0100 +++ gcc/dwarf2cfi.c 2021-03-26 17:31:58.133165709 +0100 @@ -1695,9 +1695,19 @@ dwarf2out_frame_debug_expr (rtx expr) if (fde && fde->stack_realign && REGNO (src) == STACK_POINTER_REGNUM) - gcc_assert (REGNO (dest) == HARD_FRAME_POINTER_REGNUM - && fde->drap_reg != INVALID_REGNUM - && cur_cfa->reg != dwf_regno (src)); + { + gcc_assert (REGNO (dest) == HARD_FRAME_POINTER_REGNUM + && fde->drap_reg != INVALID_REGNUM + && cur_cfa->reg != dwf_regno (src) + && fde->rule18); + fde->rule18 = 0; + /* The save of hard frame pointer has been deferred + until this point when Rule 18 applied. Emit it now. */ + queue_reg_save (dest, NULL_RTX, 0); + /* And as the instruction modifies the hard frame pointer, + flush the queue as well. */ + dwarf2out_flush_queued_reg_saves (); + } else queue_reg_save (src, dest, 0); } @@ -1907,6 +1917,7 @@ dwarf2out_frame_debug_expr (rtx expr) { gcc_assert (cur_cfa->reg != dw_frame_pointer_regnum); cur_trace->cfa_store.offset = 0; + fde->rule18 = 1; } if (cur_cfa->reg == dw_stack_pointer_regnum) @@ -2041,7 +2052,17 @@ dwarf2out_frame_debug_expr (rtx expr) span = NULL; if (!span) - queue_reg_save (src, NULL_RTX, offset); + { + if (fde->rule18) + /* Just verify the hard frame pointer save when doing dynamic + realignment uses expected offset. The actual queue_reg_save + needs to be deferred until the instruction that sets + hard frame pointer to stack pointer, see PR99334 for + details. */ + gcc_assert (known_eq (offset, 0)); + else + queue_reg_save (src, NULL_RTX, offset); + } else { /* We have a PARALLEL describing where the contents of SRC live. @@ -2732,6 +2753,7 @@ scan_trace (dw_trace_info *trace, bool e create_trace_edges (control); } + gcc_assert (!cfun->fde || !cfun->fde->rule18); add_cfi_insn = NULL; cur_row = NULL; cur_trace = NULL; Jakub