On 01/05/15 19:51, Jeff Law wrote: > On 04/28/2015 03:54 AM, Kyrill Tkachov wrote: >> >> On 27/04/15 21:13, Jeff Law wrote: >>> On 04/21/2015 11:33 AM, Kyrill Tkachov wrote: >>>> On 21/04/15 15:09, Jeff Law wrote: >>>>> On 04/21/2015 02:30 AM, Kyrill Tkachov wrote: >>>>>> From reading config/stormy16/stormy-abi it seems to me that we >>>>>> don't >>>>>> pass arguments partially in stormy16, so this code would never be >>>>>> called >>>>>> there. That leaves pa as the potential problematic target. >>>>>> I don't suppose there's an easy way to test on pa? My checkout of >>>>>> binutils >>>>>> doesn't seem to include a sim target for it. >>>>> No simulator, no machines in the testfarm, the box I had access to via >>>>> parisc-linux.org seems dead and my ancient PA overheats well before a >>>>> bootstrap could complete. I often regret knowing about the backwards >>>>> way many things were done on the PA because it makes me think about >>>>> cases that only matter on dead architectures. >>>> So what should be the action plan here? I can't add an assert on >>>> positive result as a negative result is valid. >>>> >>>> We want to catch the case where this would cause trouble on >>>> pa, or change the patch until we're confident that it's fine >>>> for pa. >>>> >>>> That being said, reading the documentation of STACK_GROWS_UPWARD >>>> and ARGS_GROW_DOWNWARD I'm having a hard time visualising a case >>>> where this would cause trouble on pa. >>>> >>>> Is the problem that in the function: >>>> >>>> +/* Add SIZE to X and check whether it's greater than Y. >>>> + If it is, return the constant amount by which it's greater or >>>> smaller. >>>> + If the two are not statically comparable (for example, X and Y >>>> contain >>>> + different registers) return -1. This is used in expand_push_insn to >>>> + figure out if reading SIZE bytes from location X will end up reading >>>> from >>>> + location Y. */ >>>> +static int >>>> +memory_load_overlap (rtx x, rtx y, HOST_WIDE_INT size) >>>> +{ >>>> + rtx tmp = plus_constant (Pmode, x, size); >>>> + rtx sub = simplify_gen_binary (MINUS, Pmode, tmp, y); >>>> + >>>> + if (!CONST_INT_P (sub)) >>>> + return -1; >>>> + >>>> + return INTVAL (sub); >>>> +} >>>> >>>> for ARGS_GROW_DOWNWARD we would be reading 'backwards' from x, >>>> so the function should something like the following? >>> So I had to go back and compile some simple examples. >>> >>> References to outgoing arguments will be SP relative. References to the >>> incoming arguments will be ARGP relative. And that brings me to the >>> another issue. Isn't X in this context the incoming argument slot and >>> the destination an outgoing argument slot? >>> >>> If so, the approach of memory_load_overlap simply won't work on a target >>> with calling conventions like the PA. And you might really want to >>> consider punting for these kind of calling conventions >> >> Ok, thanks for the guidance. >> How about this? This patch disables sibcall optimisation when >> encountering a partial argument when ARGS_GROW_DOWNWARD && >> !STACK_GROWS_DOWNWARD. >> Hopefully this shouldn't harm codegen on parisc if, as you say, it's >> rare to have >> partial arguments anyway on PA due to the large number of argument regs. >> >> I tested this on arm and bootstrapped on x86_64. >> I am now going through the process of getting access to a Debian PA >> machine to >> give it a test there (thanks Dave!) >> >> Ok if testing comes clean? >> >> Thanks, >> Kyrill >> >> 2015-04-28 Kyrylo Tkachov >> >> PR target/65358 >> * calls.c (expand_call): Cancel sibcall optimisation when encountering >> partial argument on targets with ARGS_GROW_DOWNWARD and >> !STACK_GROWS_DOWNWARD. >> * expr.c (memory_load_overlap): New function. >> (emit_push_insn): When pushing partial args to the stack would >> clobber the register part load the overlapping part into a pseudo >> and put it into the hard reg after pushing. >> >> 2015-04-28 Honggyu Kim >> >> PR target/65358 >> * gcc.dg/pr65358.c: New test. > The more I think about this, the more I think it's an ugly can of worms and maybe we should just disable sibcalls for partial arguments. I doubt it's a big performance issue in general. We already have quite a bit of code in calls.c to detect cases with partial argument overlap for the explicit purpose of allowing sibcalls when partial arguments occur in the general case. However, that code only detects when a partial argument overlaps with other arguments in a call. In this PR the partial argument overlaps with itself. It would be a shame to disable sibcalls for all partial arguments when there is already infrastructure in place to handle them. > > > In addition to the argument/stack direction stuff, I've been pondering the stack/frame/arg pointer issues. Your approach assumes that the incoming and outgoing areas are always referenced off the same base register. If they aren't, then > the routine returns no overlap. > > But we'd need to consider the case where we have a reference to the arg or frame pointer which later gets rewritten into a stack pointer relative address. > > Is it too late at the point were you do the checks to reject the sibling call? If not, then maybe the overlap routine should return a tri-state. No overlap, overlap, don't know. The last would be used when the two addresses use a > different register. Ok, here is my attempt at that. The overlap functions returns -2 when it cannot staticall compare the two pointers (i.e. when the base registers are different) and the caller then disables sibcalls. The code in calls.c that calls this code will undo any emitted instructions in the meantime if sibcall optimisation fails. This required me to change the type of emit_push_insn to bool and add an extra parameter, so this patch touches a bit more code than the original version. Bootstrapped on x86_64 and tested on arm. The testcase in this PR still performs a sibcall correctly on arm. What do you think of this? Thanks, Kyrill 2015-05-11 Kyrylo Tkachov PR target/65358 * expr.c (memory_load_overlap): New function. (emit_push_insn): When pushing partial args to the stack would clobber the register part load the overlapping part into a pseudo and put it into the hard reg after pushing. Change return type to bool. Add bool argument. * expr.h (emit_push_insn): Change return type to bool. Add bool argument. * calls.c (expand_call): Cancel sibcall optimisation when encountering partial argument on targets with ARGS_GROW_DOWNWARD and !STACK_GROWS_DOWNWARD. (emit_library_call_value_1): Update callsite of emit_push_insn. (store_one_arg): Likewise. 2015-05-11 Honggyu Kim PR target/65358 * gcc.dg/pr65358.c: New test. > > Jeff > >