References: <2f9b6580-4c7d-d29d-157c-24fe6dd8f781@arm.com>
From: Richard Biener
Date: Tue, 06 Mar 2018 16:04:00 -0000
Message-ID:
Subject: Re: BLKmode parameters are stored in unaligned stack slot when passed via registers.
To: Renlin Li
Cc: "gcc@gcc.gnu.org"
Content-Type: text/plain; charset="UTF-8"
X-IsSubscribed: yes
X-SW-Source: 2018-03/txt/msg00071.txt.bz2

On Tue, Mar 6, 2018 at 4:21 PM, Renlin Li wrote:
> Hi all,
>
> The problem described here probably only affects targets whose ABI
> allows passing structured arguments of certain size via registers.
>
> If the mode of the parameter type is BLKmode, in the callee, during
> RTL expanding, a stack slot will be reserved for this parameter, and
> the incoming value will be copied into the stack slot.
>
> However, the stack slot for the parameter will not be aligned if the
> alignment of parameter type exceeds MAX_SUPPORTED_STACK_ALIGNMENT.
> Chances are, unaligned memory access might cause run-time errors.
>
> For local variables on the stack, the alignment of the data type is
> honored, although the documentation states that it is not guaranteed.
>
> For example:
>
> #include <stdint.h>
> union U {
>     uint32_t M0;
>     uint32_t M1;
>     uint32_t M2;
>     uint32_t M3;
> } __attribute((aligned(16)));
>
> void tmp (union U *);
> void foo (union U P0)
> {
>     union U P1 = P0;
>     tmp (&P1);
> }
>
> The code-gen from armv7-a is like this:
>
> foo:
>     @ args = 0, pretend = 0, frame = 48
>     @ frame_needed = 0, uses_anonymous_args = 0
>     str lr, [sp, #-4]!
>     sub sp, sp, #52
>     mov ip, sp
>     stm ip, {r0, r1, r2, r3}   --> ip is not 128-bit aligned
>     add lr, sp, #39
>     bic lr, lr, #15
>     ldm ip, {r0, r1, r2, r3}
>     stm lr, {r0, r1, r2, r3}   --> lr is 128-bit aligned
>     mov r0, lr
>     bl tmp
>     add sp, sp, #52
>     @ sp needed
>     ldr pc, [sp], #4
>
> There are other obvious missed optimizations in the code-generation
> above. The stack slot for parameter P0 and local variable P1 could be
> merged, so that some of the load/store instructions could be removed.
> I think this is a known missed optimization case.
>
> To summarize, there are two issues here:
> 1, (wrong code) unaligned stack slot allocated for parameters during
>    function expansion.
> 2, (missed optimization) stack slot for parameter sometimes is not
>    necessary. In certain scenarios, the argument register could
>    directly be used. Currently, this is only possible when the
>    parameter mode is not BLKmode.
>
> For issue 1, we can do similar things as expand_used_vars.
> Dynamically align the stack slot address for parameters whose
> alignment exceeds PREFERRED_STACK_BOUNDARY. Other parameters could be
> stored in the gap between the aligned address and fp when possible.
>
> For issue 2, I checked the behavior of LLVM. It seems the stack slot
> allocation for parameters is explicitly exposed by the alloca IR
> instruction at the very beginning. Later, there are
> optimization/transformation passes like mem2reg, reg2mem, sroa etc.
> to remove unnecessary alloca instructions.
>
> In gcc, the stack allocation for parameters and local variables is
> done during the expand pass, implicitly. And RTL passes are not able
> to remove the unnecessary stack allocation and load/store operations.
>
> For example:
>
> uint32_t bar(union U P0)
> {
>     return P0.M0;
> }
>
> Currently, the code-gen is different on different targets.
> There are various backend hooks which make the code-gen sub-optimal.
> For example, aarch64 target could directly return with w0 while
> armv7-a target generates unnecessary store and load.
>
> However, this optimization should be target independent, unrelated to
> target alignment configuration. Both issues 1 and 2 could be resolved
> if gcc had a similar approach. But I assume the change is big.
>
> Are there any suggestions for solving issue 1 and improving issue 2
> in a generic way?
> I can create a bugzilla ticket to record the issue.

What does the ABI say for passing such over-aligned data types?
For solving 1) you could copy the argument as passed by the ABI
to a properly aligned stack location in the callee. Generally
it sounds like either the ABI doesn't specify anything or the
ABI specifies something that violates user expectation.

For 2) again, it is the ABI which specifies whether an argument
is passed via the stack or via registers. So - what does the ABI
say?

Richard.

> Regards,
> Renlin
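
The suggestion for issue 1 above (copy the argument, as passed by the ABI, to a properly aligned stack location in the callee) can be sketched at source level roughly as follows. This is only an illustration under stated assumptions, not what GCC's expander does: `align_up`, `foo_realigned` and `U_ALIGN` are hypothetical names introduced here, and the rounding simply mirrors the `add`/`bic` pair in the armv7-a listing quoted earlier. The slot is over-allocated by `U_ALIGN - 1` bytes so that an aligned address always fits inside it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same over-aligned union as in the quoted example. */
union U {
    uint32_t M0;
    uint32_t M1;
    uint32_t M2;
    uint32_t M3;
} __attribute__((aligned(16)));

#define U_ALIGN ((uintptr_t)16)   /* hypothetical constant: alignment of union U */

/* Hypothetical helper: round addr up to a multiple of align (a power of two).
   This is the same arithmetic as "add lr, sp, #N; bic lr, lr, #15". */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/* Hypothetical callee: copies the incoming argument into a slot that is
   aligned by hand, reports whether the slot is aligned, and returns the
   first member read back from that slot. */
static uint32_t foo_realigned(union U P0, int *slot_aligned)
{
    /* Over-allocate so an aligned sub-range of sizeof(union U) bytes exists. */
    unsigned char raw[sizeof(union U) + 16 - 1];
    unsigned char *slot = (unsigned char *)align_up((uintptr_t)raw, U_ALIGN);
    uint32_t m0;

    assert(((uintptr_t)slot & (U_ALIGN - 1)) == 0);
    memcpy(slot, &P0, sizeof P0);      /* copy the argument into the aligned slot */
    *slot_aligned = ((uintptr_t)slot & (U_ALIGN - 1)) == 0;

    memcpy(&m0, slot, sizeof m0);      /* read back via memcpy to avoid aliasing issues */
    return m0;
}
```

A caller would fill in a `union U`, pass it by value to `foo_realigned` together with a pointer to an `int`, and check that the flag comes back as 1. Whether the incoming copy of `P0` itself is aligned is still up to the ABI and the compiler, which is the open question in this thread.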