Date: Tue, 8 Jun 2021 14:08:03 +0000 (UTC)
From: Michael Matz
To: Jeff Law
cc: GCC Patches
Subject: Re: Aligning stack offsets for spills
In-Reply-To: <98179c8e-bcec-83ed-5b99-6f54791bd7cd@tachyum.com>

Hello,

On Mon, 7 Jun 2021, Jeff Law wrote:

> So, as many of you know I left Red Hat a while ago and joined Tachyum.
> We're building a new processor and we've come across an issue where I
> think we need upstream discussion.
>
> I can't divulge many of the details right now, but one of the quirks of
> our architecture is that reg+d addressing modes for our vector
> loads/stores require the displacement to be aligned.  This is an
> artifact of how these instructions are encoded.
>
> Obviously we can emit a load of the address into a register when the
> displacement isn't aligned.  From a correctness point that works
> perfectly.  Unfortunately, it's a significant performance hit on some
> standard benchmarks (spec) where we have a great number of spills of
> vector objects into the stack at unaligned offsets in the hot parts of
> the code.
>
> We've considered 3 possible approaches to solve this problem.
>
> 1. When the displacement isn't properly aligned, allocate more space in
> assign_stack_local so that we can make the offset aligned.  The downside
> is this potentially burns a lot of stack space, but in practice the cost
> was minimal (16 bytes in a 9k frame).  From a performance standpoint this
> works perfectly.
>
> 2. Abuse the register elimination code to create a second pointer into
> the stack.  Spills would start as + offset, then either get eliminated
> to sp+offset' when the offset is aligned or gpr+offset'' when the offset
> wasn't properly aligned.  We started a bit down this path, but with #1
> working so well, we didn't get this approach to proof-of-concept.
>
> 3. Hack up the post-reload optimizers to fix things up as best as we
> can.  This may still be advantageous, but again with #1 working so well,
> we didn't explore this in any significant way.  We may still look at
> this at some point in other contexts.
>
> Here's what we're playing with.  Obviously we'd need a target hook to
> drive this behavior.  I was thinking that we'd pass in any slot offset
> alignment requirements (from the target hook) to assign_stack_local and
> that would bubble down to this point in try_fit_stack_local:

Why is the machinery involving STACK_SLOT_ALIGNMENT and
spill_slot_alignment() (for spilling) or get_stack_local_alignment() (for
backing stack slots) not working for you?  If everything is set up
correctly, the input alignment to try_fit_stack_local ought to be correct
already.

Ciao,
Michael.
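
As a rough illustration of approach 1 in the quoted message (padding the frame so that a slot's displacement comes out aligned), here is a minimal standalone sketch. It is not GCC code: `align_up`, `struct toy_frame` and `toy_assign_slot` are made-up names, the frame is assumed to grow upward from offset 0, and alignments are assumed to be powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Round OFFSET up to the next multiple of ALIGN.
   Assumption: ALIGN is a positive power of two.  */
static int64_t
align_up (int64_t offset, int64_t align)
{
  assert (align > 0 && (align & (align - 1)) == 0);
  return (offset + align - 1) & -align;
}

/* Hypothetical toy frame: next free offset plus the number of bytes
   lost to alignment padding so far.  */
struct toy_frame
{
  int64_t next_free;
  int64_t padding;
};

/* Hand out a slot of SIZE bytes whose start offset is a multiple of
   ALIGN, skipping (and accounting for) any padding bytes needed.  */
static int64_t
toy_assign_slot (struct toy_frame *f, int64_t size, int64_t align)
{
  int64_t start = align_up (f->next_free, align);
  f->padding += start - f->next_free;
  f->next_free = start + size;
  return start;
}
```

Starting from an empty frame, an 8-byte slot followed by a 64-byte slot with 64-byte alignment would place the second slot at offset 64 and record 56 bytes of padding. That padding is the stack-space cost the quoted message describes as small in practice.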