From: Richard Sandiford
To: Qing Zhao
Cc: Segher Boessenkool, Kees Cook, Kees Cook via Gcc-patches,
 Jakub Jelinek, Uros Bizjak, "Rodriguez Bahena, Victor"
Subject: Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]
Date: Thu, 17 Sep 2020 07:17:49 +0100

Qing Zhao writes:
> Segher and Richard,
>
> Now there are two major concerns from the discussion so far:
>
> 1. (From Richard): Inserting the zeroing insns should be done after
>    pass_thread_prologue_and_epilogue, since later passes (for example,
>    pass_regrename) might introduce newly used caller-saved registers.
>    So we should do this at the beginning of pass_late_compilation
>    (some targets wouldn't cope with doing it later).
>
> 2. (From Segher): The inserted zeroing insns should stay together with
>    the return; no other insns should move in between the zeroing insns
>    and the return insns.  Otherwise, a valid gadget could be formed.
>
> I think that both of the above concerns are important and should be
> addressed for a correct implementation.
>
> In order to support 1, we cannot implement it in "targetm.gen_return()"
> and "targetm.gen_simple_return()", since both hooks are called during
> pass_thread_prologue_and_epilogue, and at that point the use
> information is still not correct.
>
> In order to support 2, enhancing EPILOGUE_USES to include the zeroed
> registers is NOT enough to prevent all the zeroing insns from moving
> around.

Right.  The purpose of EPILOGUE_USES was instead to stop the moves from
being deleted as dead.

> More restrictions need to be added to these new zeroing insns.  (I
> think that marking these newly zeroed registers as "unspec_volatile"
> at the RTL level is necessary to prevent them from being deleted or
> moved around.)
>
> So, based on the above, I propose the following approach to resolve
> the two concerns:
>
> 1. Add 2 new target hooks:
>
>    A. targetm.pro_epilogue_use (reg)
>       This hook should return an UNSPEC_VOLATILE rtx that marks a
>       register as in use, to prevent the register-setting instructions
>       in the prologue and epilogue from being deleted.
>
>    B. targetm.gen_zero_call_used_regs (need_zeroed_hardregs)
>       This hook will generate a sequence of zeroing insns that zero
>       the registers specified in NEED_ZEROED_HARDREGS.
>
>       A default handler of "gen_zero_call_used_regs" could be defined
>       in the middle end, which uses mov insns to zero the registers
>       and then uses "targetm.pro_epilogue_use (reg)" to mark each
>       zeroed register.

This sounds like you're going back to using:

(insn 18 16 19 2 (set (reg:SI 1 dx)
        (const_int 0 [0])) "t10.c":11:1 -1
     (nil))
(insn 19 18 20 2 (unspec_volatile [
            (reg:SI 1 dx)
        ] UNSPECV_PRO_EPILOGUE_USE) "t10.c":11:1 -1
     (nil))

This also doesn't prevent the zeroing from being moved around.  Like the
EPILOGUE_USES approach, it only prevents the clearing from being removed
as dead.  I still think that the EPILOGUE_USES approach is the better
way of doing that.

In other words: the use insns themselves are volatile and so can't be
moved relative to each other and to other volatile insns.  But the uses
are fake instructions that don't do anything.  The preceding zeroing
instructions are just normal instructions that can be moved around
freely before their respective uses.

I don't think there's a foolproof way of preventing an unknown target
machine_reorg pass from moving the instructions around.  But since we
don't have unknown machine_reorgs (at least not in-tree), I think
instead we should be prepared to patch machine_reorgs where necessary
to ensure that they do the right thing.

If you want to increase the chances that machine_reorgs don't need to
be patched, you could either:

(a) make the zeroing instructions themselves volatile, or
(b) insert a volatile reference to the register before (rather than
    after) the zeroing instruction.

IMO (b) is the way to go, because it avoids the need to define special
volatile move patterns for each type of register.  (b) would be needed
on top of (rather than instead of) the EPILOGUE_USES thing.

I don't think we need a new target-specific unspec_volatile code to do
(b).  We can just use an automatically-generated volatile asm to
clobber the registers first.  See e.g. how expand_asm_memory_blockage
handles memory scheduling barriers.
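To make that last suggestion concrete, here is a minimal sketch of such
a register "blockage".  It is modelled on expand_asm_memory_blockage but
is not an existing function; the function name is made up, and the use
of reg_raw_mode to pick a mode for each hard register is an assumption,
so treat it as an illustration of the idea rather than the actual patch:

/* Sketch only: emit one volatile asm that appears to clobber every
   register in NEED_ZEROED_HARDREGS, in the same spirit as
   expand_asm_memory_blockage.  The zeroing (reg := 0) moves would be
   emitted immediately after this insn.  Assumes the usual GCC internal
   includes (rtl.h, regs.h, emit-rtl.h, ...).  */

static void
emit_zeroed_reg_blockage (HARD_REG_SET need_zeroed_hardregs)
{
  unsigned int nregs = 0;
  for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
    if (TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
      nregs++;

  /* An empty volatile asm, as in expand_asm_memory_blockage.  */
  rtx asm_op = gen_rtx_ASM_OPERANDS (VOIDmode, "", "", 0,
				     rtvec_alloc (0), rtvec_alloc (0),
				     rtvec_alloc (0), UNKNOWN_LOCATION);
  MEM_VOLATILE_P (asm_op) = 1;

  /* Wrap the asm and one clobber per to-be-zeroed register in a single
     PARALLEL, so that the asm looks like it writes all of them.  */
  rtvec vec = rtvec_alloc (nregs + 1);
  RTVEC_ELT (vec, 0) = asm_op;
  unsigned int i = 1;
  for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
    if (TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
      RTVEC_ELT (vec, i++)
	= gen_rtx_CLOBBER (VOIDmode,
			   gen_rtx_REG (reg_raw_mode[regno], regno));

  emit_insn (gen_rtx_PARALLEL (VOIDmode, vec));
}

The clobbers give the zeroing moves that follow a dependence on a
volatile insn, so they can't be hoisted above it; the EPILOGUE_USES
handling is still what keeps the moves from being deleted as dead.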
Thanks,
Richard
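For reference, concern 1 above (emitting the zeroing after
pass_thread_prologue_and_epilogue and pass_regrename) would roughly
correspond to the following placement in gcc/passes.def.  The pass name
pass_zero_call_used_regs is hypothetical at this point and the
neighbouring entries are only an approximation, so this is a sketch of
the intended position rather than an actual hunk:

  NEXT_PASS (pass_late_compilation);
  PUSH_INSERT_PASSES_WITHIN (pass_late_compilation)
      /* Hypothetical new pass: first within pass_late_compilation,
	 i.e. after pass_thread_prologue_and_epilogue and pass_regrename
	 but before the target's machine_reorg and final.  */
      NEXT_PASS (pass_zero_call_used_regs);
      NEXT_PASS (pass_compute_alignments);
      NEXT_PASS (pass_variable_tracking);
      NEXT_PASS (pass_free_cfg);
      NEXT_PASS (pass_machine_reorg);
      /* ... remaining late passes down to pass_final ... */
  POP_INSERT_PASSES ()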