From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-437201-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 32489 invoked by alias); 28 Sep 2016 16:33:20 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 32470 invoked by uid 89); 28 Sep 2016 16:33:19 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-4.9 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=thousand, life, visit
X-HELO: mx1.redhat.com
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 28 Sep 2016 16:33:18 +0000
Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26])	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))	(No client certificate requested)	by mx1.redhat.com (Postfix) with ESMTPS id 0A4DE65727;	Wed, 28 Sep 2016 16:33:17 +0000 (UTC)
Received: from localhost.localdomain (ovpn-116-142.phx2.redhat.com [10.3.116.142])	by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u8SGXGOt012237;	Wed, 28 Sep 2016 12:33:16 -0400
Subject: Re: [PATCH 4/5] shrink-wrap: Shrink-wrapping for separate components
To: Segher Boessenkool <segher@kernel.crashing.org>
References: <cover.1474616087.git.segher@kernel.crashing.org> <cd8a358e576756d1611ca2d7862d4c718ded8bf5.1474616087.git.segher@kernel.crashing.org> <cf28e01b-eb41-acfd-8d4b-7cc724006054@redhat.com> <20160928090407.GH30477@gate.crashing.org>
Cc: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
From: Jeff Law <law@redhat.com>
Message-ID: <2effb66f-132a-be2d-dcd0-038709eea201@redhat.com>
Date: Wed, 28 Sep 2016 16:36:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.3.0
MIME-Version: 1.0
In-Reply-To: <20160928090407.GH30477@gate.crashing.org>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
X-SW-Source: 2016-09/txt/msg02164.txt.bz2

On 09/28/2016 03:04 AM, Segher Boessenkool wrote:
>
>>> +static void
>>> +place_prologue_for_one_component (unsigned int which, basic_block head)
>>> +{
>>> +  /* The block we are currently dealing with.  */
>>> +  basic_block bb = head;
>>> +  /* Is this the first time we visit this block, i.e. have we just gone
>>> +     down the tree.  */
>>> +  bool first_visit = true;
>>> +
>>> +  /* Walk the dominator tree, visit one block per iteration of this loop.
>>> +     Each basic block is visited twice: once before visiting any children
>>> +     of the block, and once after visiting all of them (leaf nodes are
>>> +     visited only once).  As an optimization, we do not visit subtrees
>>> +     that can no longer influence the prologue placement.  */
>>> +  for (;;)
>> Is there some reason you wrote this as a loop rather than recursion?
>> IMHO it makes this function (and spread_components) more difficult to
>> reason about than it needs to be.
>
> It would recurse a few thousand frames deep on not terribly big testcases
> (i.e. I've seen it happen).  This uses a lot of stack space on some ABIs,
> and is very slow as well (this is the slowest function here by far).
> Unlimited recursion is bad.
I'm surprised the recursion was that deep.  Such is life.   Thanks for 
clarifying.  I won't object to the iterative version. :-)

>>> +/* Place code for prologues and epilogues for COMPONENTS where we can put
>>> +   that code at the end of basic blocks.  */
>>> +static void
>>> +emit_common_tails_for_components (sbitmap components)
>> [ Snip. ]
>>> +
>>> +      /* Put the code at the end of the BB, but before any final jump.  */
>>> +      if (!bitmap_empty_p (epi))
>> So what if the final jump uses hard registers in one way or another?   I
>> don't immediately see anything that verifies it is safe to transpose the
>> epilogue and the final jump.
>
> Whoops.  Thanks for catching this.
I missed it the first time though the code too.

>
>> Conceptually want the epilogue to occur on the outgoing edge(s).  But
>> you want to actually transpose the epilogue and the control flow insn so
>> that you can insert the epilogue in one place.
>
> The same problem happens with prologues, too.
Yea.  I guess if a had a jump with an embedded side effect (such as movb 
or addb on the PA), then transposing the control flow insn with the 
prologue would be problematical as well.

>
> A cc0 target can not use separate shrink-wrapping *anyway* if any of the
> components would clobber cc0, so that is no problem here.
True, but I'd be more comfortable if we filtered out cc0 targets explicitly.

>
>>   I think you need to handle the former more cleanly.  The latter I'd
>> be comfortable filtering out in try_shrink_wrapping_separate.
>
> I'm thinking to not do the common tail optimisation if BB_END is a
> JUMP_INSN but not simplejump_p (or a return or a sibling call).  Do
> you see any problem with that?
Seems reasonable.


Jeff