From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-490153-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 62148 invoked by alias); 15 Nov 2018 09:55:02 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Subject: Re: [PATCH 21/25] GCN Back-end (part 2/2).
To: Jeff Law <law@redhat.com>, <gcc-patches@gcc.gnu.org>
References: <cover.1536144068.git.ams@codesourcery.com> <4c633833-1954-4b62-1a96-4f1c2cf541fd@codesourcery.com> <a8084ecb-c638-8e46-8734-3446079e002a@redhat.com> <4fe09f84-10d6-d30c-e458-27a4a3220eef@codesourcery.com> <e59c33d3-68d0-1349-6d79-235477136867@redhat.com>
From: Andrew Stubbs <ams@codesourcery.com>
Message-ID: <eaa5f9af-e8ae-fc33-c30b-0f475d59d8a8@codesourcery.com>
Date: Thu, 15 Nov 2018 09:55:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1
MIME-Version: 1.0
In-Reply-To: <e59c33d3-68d0-1349-6d79-235477136867@redhat.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2018-11/txt/msg01355.txt.bz2

On 14/11/2018 22:30, Jeff Law wrote:
> There's a particular case that has historically been problematical.
> 
> If you have this kind of sequence in the epilogue
> 
> 	restore register using FP
> 	move fp->sp  (deallocates frame)
> 	return
> 
> Under certain circumstances the scheduler can swap the register restore
> and move from fp into sp creating something like this:
> 
> 	move fp->sp (deallocates frame)
> 	restore register using FP (reads from deallocated frame)
> 	return
> 
> That would normally be OK, except if you take an interrupt between the
> first two instructions.  If interrupt handling is done without switching
> stacks, then the interrupt handler may write into the just de-allocated
> frame destroying the values that were saved in the prologue.

OK, so the barrier needs to be right before the stack pointer moves. I 
can do that. :-)

Presumably the same is true for prologues, except that the barrier needs 
to be after the stack adjustment.
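
To make the hazard concrete, here is a toy C model of the two epilogue
orderings quoted above. It is only a sketch: struct machine, the
interrupt() handler, the 4-word frame and the 0xDEADBEEF marker are all
made up for illustration and say nothing about how GCN or the GCC
sources actually behave. It assumes a downward-growing stack and a
handler that runs on the interrupted thread's stack.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_WORDS 16

/* Toy machine (hypothetical): a downward-growing stack plus fp and sp,
   both held as indices into the stack array. */
struct machine {
    uint32_t stack[STACK_WORDS];
    int sp;   /* everything below this index is free */
    int fp;   /* base of the current frame */
};

/* Hypothetical interrupt handler that does not switch stacks: it
   scribbles on the four words just below the current sp. */
static void interrupt(struct machine *m)
{
    for (int i = m->sp - 1; i >= 0 && i >= m->sp - 4; i--)
        m->stack[i] = 0xDEADBEEFu;
}

/* Prologue: allocate a 4-word frame and save one value at fp-1. */
static void prologue(struct machine *m, uint32_t saved)
{
    m->fp = m->sp;
    m->sp -= 4;
    assert(m->sp >= 0);
    m->stack[m->fp - 1] = saved;
}

/* Intended order: restore first, then deallocate. */
static uint32_t epilogue_safe(struct machine *m, int take_interrupt)
{
    uint32_t restored = m->stack[m->fp - 1]; /* restore register using FP */
    if (take_interrupt)
        interrupt(m);                        /* only touches free words */
    m->sp = m->fp;                           /* move fp->sp */
    return restored;
}

/* Order after the scheduler has swapped the two instructions. */
static uint32_t epilogue_swapped(struct machine *m, int take_interrupt)
{
    m->sp = m->fp;                           /* move fp->sp, frame is free */
    if (take_interrupt)
        interrupt(m);                        /* overwrites the old frame */
    return m->stack[m->fp - 1];              /* reads from deallocated frame */
}

/* Save 42 in the prologue, then return whatever the epilogue restores. */
uint32_t run_epilogue(int swapped, int take_interrupt)
{
    struct machine m;
    memset(&m, 0, sizeof m);
    m.sp = STACK_WORDS;
    m.fp = STACK_WORDS;
    prologue(&m, 42);
    return swapped ? epilogue_swapped(&m, take_interrupt)
                   : epilogue_safe(&m, take_interrupt);
}
```

In this model run_epilogue(1, 0) still restores the saved value, which
is why the swapped order looks harmless in ordinary testing; only
run_epilogue(1, 1), with the interrupt landing between the two
instructions, restores garbage. A barrier immediately before the stack
pointer move is what rules the swapped order out.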

> You may not need to worry about that today on the GCN port, but you
> really want to fix it now so that it's never a problem.  You *really*
> don't want to have to debug this kind of problem in the wild.  Been
> there, done that, more than once :(

I'm not exactly sure how interrupts work on this platform -- we've had 
no use for them yet -- but without a debugger, and with up to 1024 
threads running simultaneously, you can be sure I don't want to debug it!

> I would hazard a guess that combine saw the one without the use as
> "simpler" and preferred it.  I think you've made a bit of a fundamental
> problem with the way the EXEC register is being handled.  Hopefully you
> can get by with some magic UNSPEC wrappers without having to do too much
> surgery.

Exactly so. An initial experiment with combine re-enabled has not shown 
any errors, so it's possible the problem has gone away, but I've not 
been over the full testsuite yet (and you wouldn't expect actual 
failures anyway).

>> In future, I'd like to have the scheduler insert real instructions into
>> these slots, but that's very much on the to-do list.
> If you can model this as a latency between the two points where you
> need to insert the nops, then the scheduler will fill in what it can.
> But it doesn't generally handle non-interlocked processors.   So you'll
> still want your little pass to fix things up when the scheduler couldn't
> find useful work to schedule into those bubbles.

Absolutely, the scheduler is about optimization and this md_reorg pass 
is about correctness.
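
As a rough illustration of that split, here is a toy nop-padding pass
in C. It is not the GCN md_reorg code: struct insn, its single-use
dependency model, NO_REG and pad_hazards() are all hypothetical, and
the "latency" is simply the number of instructions that must issue
between a producer and its consumer. Whatever gap the scheduler has
already filled with useful work is counted, and only the remainder is
padded with nops.

```c
#include <assert.h>
#include <stddef.h>

#define NO_REG (-1)

/* Hypothetical instruction: writes register `def`, reads register `use`
   (NO_REG for none). A reader of `def` must be at least `latency`
   instructions further down the stream. */
struct insn {
    int def;
    int use;
    int latency;
};

/* Copy `in` to `out`, inserting nops wherever a consumer follows its
   most recent producer too closely. Returns the number of instructions
   written; `cap` is the capacity of `out`. */
size_t pad_hazards(const struct insn *in, size_t n,
                   struct insn *out, size_t cap)
{
    static const struct insn nop = { NO_REG, NO_REG, 0 };
    size_t len = 0;

    for (size_t i = 0; i < n; i++) {
        int needed = 0;

        if (in[i].use != NO_REG) {
            /* Walk backwards to the most recent writer of the register. */
            for (size_t j = len; j-- > 0; ) {
                if (out[j].def == in[i].use) {
                    /* Instructions already sitting between the two. */
                    int gap = (int)(len - 1 - j);
                    if (out[j].latency > gap)
                        needed = out[j].latency - gap;
                    break;
                }
            }
        }

        /* Pad only the part of the bubble that is still empty. */
        while (needed-- > 0) {
            assert(len < cap);
            out[len++] = nop;
        }
        assert(len < cap);
        out[len++] = in[i];
    }
    return len;
}
```

For a producer with latency 2 followed directly by its consumer, the
pass emits two nops; if one unrelated instruction already sits between
them, it emits only one. The output is correct either way, and better
scheduling just means fewer nops.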

>> I have no idea whether the architecture has those issues or not.
> The guideline I would give to determine if you're vulnerable...  Do you
> have speculation, including the ability to speculate past a memory
> operation, branch prediction, memory caches and a high resolution timer
> (i.e., like a cycle timer)?  If you've got those, then the processor is
> likely vulnerable to a spectre V1 style attack.  Those are the basic
> building blocks.

We have cycle timers and caches, but I'll have to ask AMD about the 
other details.

Andrew