Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Subject: Re: [PATCH 21/25] GCN Back-end (part 2/2).
To: Jeff Law
References: <4c633833-1954-4b62-1a96-4f1c2cf541fd@codesourcery.com> <4fe09f84-10d6-d30c-e458-27a4a3220eef@codesourcery.com>
From: Andrew Stubbs
Date: Thu, 15 Nov 2018 09:55:00 -0000
X-SW-Source: 2018-11/txt/msg01355.txt.bz2

On 14/11/2018 22:30, Jeff Law wrote:
> There's a particular case that has historically been problematical.
>
> If you have this kind of sequence in the epilogue
>
>     restore register using FP
>     move fp->sp (deallocates frame)
>     return
>
> Under certain circumstances the scheduler can swap the register restore
> and move from fp into sp creating something like this:
>
>     move fp->sp (deallocates frame)
>     restore register using FP (reads from deallocated frame)
>     return
>
> That would normally be OK, except if you take an interrupt between the
> first two instructions. If interrupt handling is done without switching
> stacks, then the interrupt handler may write into the just de-allocated
> frame destroying the values that were saved in the prologue.

OK, so the barrier needs to be right before the stack pointer moves. I
can do that. :-)

Presumably the same is true for prologues, except that the barrier needs
to be after the stack adjustment.

> You may not need to worry about that today on the GCN port, but you
> really want to fix it now so that it's never a problem. You *really*
> don't want to have to debug this kind of problem in the wild. Been
> there, done that, more than once :(

I'm not exactly sure how interrupts work on this platform -- we've had
no use for them yet -- but without a debugger, and with up to 1024
threads running simultaneously, you can be sure I don't want to debug it!

> I would hazard a guess that combine saw the one without the use as
> "simpler" and preferred it. I think you've made a bit of a fundamental
> problem with the way the EXEC register is being handled. Hopefully you
> can get by with some magic UNSPEC wrappers without having to do too much
> surgery.

Exactly so. An initial experiment with combine re-enabled has not shown
any errors, so it's possible the problem has gone away, but I've not
been over the full testsuite yet (and you wouldn't expect actual
failures anyway).

>> In future, I'd like to have the scheduler insert real instructions into
>> these slots, but that's very much on the to-do list.
> If you can model this as a latency between the two points where you
> need to insert the nops, then the scheduler will fill in what it can.
> But it doesn't generally handle non-interlocked processors. So you'll
> still want your little pass to fix things up when the scheduler couldn't
> find useful work to schedule into those bubbles.

Absolutely, the scheduler is about optimization and this md_reorg pass
is about correctness.

>> I have no idea whether the architecture has those issues or not.
> The guideline I would give to determine if you're vulnerable... Do you
> have speculation, including the ability to speculate past a memory
> operation, branch prediction, memory caches and high resolution timer
> (ie, like a cycle timer). If you've got those, then the processor is
> likely vulnerable to a spectre V1 style attack. Those are the basic
> building blocks.

We have cycle timers and caches, but I'll have to ask AMD about the
other details.

Andrew