From: Richard Henderson
Date: Fri, 19 Dec 2014 15:33:00 -0000
To: Ulrich Weigand, Richard Henderson
CC: libffi-discuss@sourceware.org, Ulrich.Weigand@de.ibm.com, vogt@linux.vnet.ibm.com, krebbel@linux.vnet.ibm.com
Subject: Re: [PATCH 0/4] s390 improvements

On 12/19/2014 09:06 AM,
Ulrich Weigand wrote:
> Richard Henderson wrote:
>
>> This is relative to Dominik's patch from the 16th. The complete
>> tree can be found at
>>
>>   git://github.com/rth7680/libffi.git s390
>>
>> Mostly relevant is patch 3, which converts the s390 port to the
>> more modern arrangement where there's no callback into ffi_prep_args.
>
> This is a bit confusing to me. The assembler routine now does:
>
>         lg      %r15,120(%r2)   # Set up outgoing stack
>
> without ever restoring the initial stack pointer before returning
> to its caller. This probably works right now since the value loaded
> here is determined like that:
>
>   /* Pass the outgoing stack frame in the r15 save slot. */
>   frame->gpr_save[8] = (unsigned long)(stack - sizeof(struct call_frame));
>
> and since "stack" was allocated via alloca and ffi_call_int does not
> require an argument save area when calling any of its subroutines,
> the stack pointer value computed here should always in fact be
> identical to the value %r15 already has at the above location.
>
> Using the 160 bytes below "stack" as register save area for use of the
> target function called by ffi_call_SYSV is also only safe if those bytes
> are in fact the register save area ffi_call_int provides for its caller,
> e.g. again if the value is already identical to %r15. (If this were
> any other value, we might clobber parts of ffi_call_int's stack frame
> that it conceivably might still access.)
>
> However, if the procedure only works if the "lg" is a nop, why is it
> even done? Also, the whole setup seems a bit fragile since changes
> to ffi_call_int might cause it to need an argument save area ...

The stack frame we install is created with alloca, and so we know for a fact
that ffi_call_int must be using a frame pointer to hold its own frame. Since
we do not adjust %r15 on the way out of ffi_call_SYSV, we leave the stack frame
chain intact.
If there were another function for ffi_call_int to call after
ffi_call_SYSV (there isn't one), the outgoing 160-byte save area would be
present.

It's true that the load of %r15 is now a nop. It hadn't been at one point in
my development; ffi_prep_args had had more than 5 parameters, and so there was
extra stack allocated.

I suppose if ffi_prep_args were inlined, one could be certain of this (since
there will be no function calls) and document it as such.


r~
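
For anyone following the offsets in this thread, here is a rough sketch of the
arithmetic. It assumes the frame mirrors the standard 64-bit s390 register save
area (back chain, one reserved word, %r2-%r6, %r7-%r15, four FPRs). The struct
name, field names and helper functions below are hypothetical and are not
copied from the patch; fixed-width types are used so the layout can be checked
on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the 160-byte frame; not the struct from the patch. */
struct call_frame_model {
    uint64_t back_chain;   /* offset   0 */
    uint64_t eos;          /* offset   8 */
    uint64_t gpr_args[5];  /* offset  16: %r2-%r6 */
    uint64_t gpr_save[9];  /* offset  56: %r7-%r15 */
    uint64_t fpr_args[4];  /* offset 128: four FPR slots */
};

/* Byte offset of the %r15 slot, i.e. gpr_save[8]. */
size_t r15_slot_offset(void)
{
    return offsetof(struct call_frame_model, gpr_save) + 8 * sizeof(uint64_t);
}

/* Mirrors "stack - sizeof(struct call_frame)" from the quoted code. */
uintptr_t outgoing_frame(uintptr_t stack)
{
    return stack - sizeof(struct call_frame_model);
}

/* The reload of %r15 is a nop exactly when the stored value equals the
   stack pointer the caller already has. */
int reload_is_nop(uintptr_t stack, uintptr_t current_r15)
{
    return outgoing_frame(stack) == current_r15;
}

void check_layout(void)
{
    assert(sizeof(struct call_frame_model) == 160);
    assert(r15_slot_offset() == 120);  /* matches lg %r15,120(%r2) */
}
```

Under these assumptions the %r15 slot sits at byte 120 of a 160-byte frame,
which is where the 120 in the lg instruction comes from. reload_is_nop() only
restates the condition discussed above; it does not show that the condition
holds for any particular build of ffi_call_int.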