public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Yuri Rumyantsev <ysrumyan@gmail.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	Igor Zamyatin <izamyatin@gmail.com>
Subject: Re: [PATCH 2/3] Extended if-conversion
Date: Thu, 04 Dec 2014 13:15:00 -0000	[thread overview]
Message-ID: <CAEoMCqTC5XHC2QjLrMPjNGj7f6APpgF6FXMxj+8ZNN_F=Bx6fA@mail.gmail.com> (raw)
In-Reply-To: <CAFiYyc3Cj0E37qBc5scafkS5OoecczKqt0zfv_upkGPAM-9Q2g@mail.gmail.com>

Richard,

I did simple change by saving gsi iterator for each bb that has
critical edges by adding additional field to bb_predicate_s:

typedef struct bb_predicate_s {

  /* The condition under which this basic block is executed.  */
  tree predicate;

  /* PREDICATE is gimplified, and the sequence of statements is
     recorded here, in order to avoid the duplication of computations
     that occur in previous conditions.  See PR44483.  */
  gimple_seq predicate_gimplified_stmts;

  /* Insertion point for blocks having incoming critical edges.  */
  gimple_stmt_iterator gsi;
} *bb_predicate_p;

and this iterator is saved in  insert_gimplified_predicates before
insertion code for predicate computation. I checked that this fix
works.

Now I am implementing merging of predicate_extended.. and
predicate_arbitrary.. functions as you proposed.

Best regards.
Yuri.

2014-12-04 15:41 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
> On Tue, Dec 2, 2014 at 4:28 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>> Thanks Richard for your quick reply!
>>
>> 1. I agree that we can combine predicate_extended_ and
>> predicate_arbitrary_ to one function as you proposed.
>> 2. What is your opinion about using more simple decision about
>> insertion point - if bb has use of phi result insert phi predication
>> before it and at the bb end otherwise. I assume that critical edge
>> splitting is not a good decision.
>
> Why not always insert before the use?  Which would be after labels,
> what we do for two-arg PHIs.  That is, how can it be that you predicate
> a PHI in BB1 and then for an edge predicate on one of its incoming
> edges you get SSA uses with defs that are in BB1 itself?  That
> can only happen for backedges but those you can't remove in any case.
>
> Richard.
>
>>
>> Best regards.
>> Yuri.
>>
>> 2014-12-02 16:28 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>> On Mon, Dec 1, 2014 at 4:53 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>> Hi Richard,
>>>>
>>>> I resend you patch1 and patch2 with minor changes:
>>>> 1. I renamed flag_force_vectorize to aggressive_if_conv.
>>>> 2. Use static cast for the first argument of gimple_phi_arg_edge.
>>>> I also very sorry that I sent you bad patch.
>>>>
>>>> Now let me answer on your questions related to second patch.
>>>> 1. Why we need both predicate_extended_scalar_phi and
>>>> predicate_arbitrary_scalar_phi?
>>>>
>>>> Let's consider the following simple test-case:
>>>>
>>>>   #pragma omp simd safelen(8)
>>>>   for (i=0; i<512; i++)
>>>>   {
>>>>     float t = a[i];
>>>>     if (t > 0.0f & t < 1.0e+17f)
>>>>       if (c[i] != 0)  /* c is integer array. */
>>>> res += 1;
>>>>   }
>>>>
>>>> we can see the following phi node correspondent to res:
>>>>
>>>> # res_1 = PHI <res_15(3), res_15(4), res_10(5)>
>>>>
>>>> It is clear that we can optimize it to phi node with 2 arguments only
>>>> and only one check can be used for phi predication (for reduction in
>>>> our case), namely predicate of bb_5. In general case we can't do it
>>>> even if we sort all phi argument values since we still have to produce
>>>> a chain of cond expressions to perform phi predication (see comments
>>>> for predicate_arbitrary_scalar_phi).
>>>
>>> How so?  We can always use !(condition) for the "last" value, thus
>>> treat it as an 'else' case.  That even works for
>>>
>>> # res_1 = PHI <res_15(3), res_15(4), res_10(5), res_10(7)>
>>>
>>> where the condition for edges 5 and 7 can be computed as
>>> ! (condition for 3 || condition for 4).
>>>
>>> Of course it is worthwhile to also sort single-occurances first
>>> so your case gets just the condiiton for edge 5 and its inversion
>>> used for edges 3 and 4 combined.
>>>
>>>> 2. Why we need to introduce find_insertion_point?
>>>>  Let's consider another test-case extracted from 175.vpr ( t5.c is
>>>> attached) and we can see that bb_7 and bb_9 containig phi nodes has
>>>> only critical incoming edges and both contain code computing edge
>>>> predicates, e.g.
>>>>
>>>> <bb 7>:
>>>> # xmax_edge_30 = PHI <xmax_edge_36(6), xmax_edge_18(5)>
>>>> _46 = xmax_17 == xmax_37;
>>>> _47 = xmax_17 == xmax_27;
>>>> _48 = _46 & _47;
>>>> _53 = xmax_17 == xmax_37;
>>>> _54 = ~_53;
>>>> _55 = xmax_17 == xmax_27;
>>>> _56 = _54 & _55;
>>>> _57 = _48 | _56;
>>>> xmax_edge_19 = xmax_edge_39 + 1;
>>>> goto <bb 11>;
>>>>
>>>> It is evident that we can not put phi predication at the block
>>>> beginning but need to put it after predicate computations.
>>>> Note also that if there are no critical edges for phi arguments
>>>> insertion point will be "after labels" Note also that phi result can
>>>> have use in this block too, so we can't put predication code to the
>>>> block end.
>>>
>>> So the issue is that predicate insertion for edge predicates does
>>> not happen on the edge but somewhere else (generally impossible
>>> for critical edges unless you split them).
>>>
>>> I think I've told you before that I prefer simple solutions to such issues,
>>> like splitting the edge!  Certainly not involving a function walking
>>> GENERIC expressions.
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Let me know if you still have any questions.
>>>>
>>>> Best regards.
>>>> Yuri.
>>>>
>>>>
>>>>
>>>>
>>>> 2014-11-28 15:43 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>> On Wed, Nov 12, 2014 at 2:35 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> Here is the second patch related to extended predication.
>>>>>> Few comments which explain a main goal of design.
>>>>>>
>>>>>> 1. I don't want to insert any critical edge splitting since it may
>>>>>> lead to less efficient binaries.
>>>>>> 2. One special case of extended PHI node predication was introduced
>>>>>> when #arguments is more than 2 but only two arguments are different
>>>>>> and one argument has the only occurrence. For such PHI conditional
>>>>>> scalar reduction is applied.
>>>>>> This is correspondent to the following statement:
>>>>>>     if (q1 && q2 && q3) var++
>>>>>>  New function phi_has_two_different_args was introduced to detect such phi.
>>>>>> 3. Original algorithm for PHI predication used assumption that at
>>>>>> least one incoming edge for blocks containing PHI is not critical - it
>>>>>> guarantees that all computations related to predicate of normal edge
>>>>>> are already inserted above this block and
>>>>>> code related to PHI predication can be inserted at the beginning of
>>>>>> block. But this is not true for critical edges for which predicate
>>>>>> computations are  in the block where code for phi predication must be
>>>>>> inserted. So new function find_insertion_point is introduced which is
>>>>>> simply found out the last statement in block defining predicates
>>>>>> correspondent to all incoming edges and insert phi predication code
>>>>>> after it (with some minor exceptions).
>>>>>
>>>>> Unfortunately the patch doesn't apply for me - I get
>>>>>
>>>>> patch: **** malformed patch at line 505: @@ -1720,6 +2075,8 @@
>>>>> predicate_all_scalar_phis (struct loop *loop)
>>>>>
>>>>> a few remarks nevertheless.  I don't see how we need both
>>>>> predicate_extended_scalar_phi and predicate_arbitrary_scalar_phi.
>>>>> Couldn't we simply sort an array of (edge, value) pairs after value
>>>>> and handle equal values specially in predicate_extended_scalar_phi?
>>>>> That would even make PHI <a, a, b, c, c> more optimal.
>>>>>
>>>>> I don't understand the need for find_insertion_point.  All SSA names
>>>>> required for the predicates are defined upward - and the complex CFG
>>>>> is squashed to a single basic-block, thus the defs will dominate the
>>>>> inserted code if you insert after labels just like for the other case.
>>>>> Or what am I missing?  ("flattening" of the basic-blocks of course needs
>>>>> to happen in dominator order - but I guess that happens already?)
>>>>>
>>>>> I'd like the extended PHI handling to be enablable by a flag even
>>>>> for !force-vectorization - I've seen cases with 3 PHI args multiple
>>>>> times that would have been nice to vectorize.  I suggest to
>>>>> add -ftree-loop-if-convert-aggressive for this.  We can do this as
>>>>> followup, but please rename the local flag_force_vectorize flag
>>>>> to something less looking like a flag, like simply 'aggressive'.
>>>>>
>>>>> Otherwise patch 2 looks ok to me.
>>>>>
>>>>> Richard.
>>>>>
>>>>>
>>>>>> ChangeLog:
>>>>>>
>>>>>> 2014-10-24  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>
>>>>>> * tree-if-conv.c (ifcvt_can_use_mask_load_store): Use
>>>>>> FLAG_FORCE_VECTORIZE instead of loop flag.
>>>>>> (if_convertible_bb_p): Allow bb has more than 2 predecessors if
>>>>>> FLAG_FORCE_VECTORIZE is true.
>>>>>> (if_convertible_bb_p): Delete check that bb has at least one
>>>>>> non-critical incoming edge.
>>>>>> (phi_has_two_different_args): New function.
>>>>>> (is_cond_scalar_reduction): Add argument EXTENDED to choose access
>>>>>> to phi arguments. Invoke phi_has_two_different_args to get phi
>>>>>> arguments if EXTENDED is true. Change check that block
>>>>>> containing reduction statement candidate is predecessor
>>>>>> of phi-block since phi may have more than two arguments.
>>>>>> (convert_scalar_cond_reduction): Add argument BEFORE to insert
>>>>>> statement before/after gsi point.
>>>>>> (predicate_scalar_phi): Add argument false (which means non-extended
>>>>>> predication) to call of is_cond_scalar_reduction. Add argument
>>>>>> true (which correspondent to argument BEFORE) to call of
>>>>>> convert_scalar_cond_reduction.
>>>>>> (get_predicate_for_edge): New function.
>>>>>> (predicate_arbitrary_scalar_phi): New function.
>>>>>> (predicate_extended_scalar_phi): New function.
>>>>>> (find_insertion_point): New function.
>>>>>> (predicate_all_scalar_phis): Add two boolean variables EXTENDED and
>>>>>> BEFORE. Initialize EXTENDED to true if BB containing phi has more
>>>>>> than 2 predecessors or both incoming edges are critical. Invoke
>>>>>> find_phi_replacement_condition and predicate_scalar_phi or
>>>>>> find_insertion_point and predicate_extended_scalar_phi depending on
>>>>>> EXTENDED value.
>>>>>> (insert_gimplified_predicates): Add check that non-predicated block
>>>>>> may have statements to insert. Insert predicate of BB just after label
>>>>>> if FLAG_FORCE_VECTORIZE is true.
>>>>>> (tree_if_conversion): Add initialization of FLAG_FORCE_VECTORIZE which
>>>>>> is copy of inner or outer loop field force_vectorize.

  reply	other threads:[~2014-12-04 13:15 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-12 13:36 Yuri Rumyantsev
2014-11-28 12:46 ` Richard Biener
2014-12-01 15:53   ` Yuri Rumyantsev
2014-12-02 13:28     ` Richard Biener
2014-12-02 15:29       ` Yuri Rumyantsev
2014-12-04 12:41         ` Richard Biener
2014-12-04 13:15           ` Yuri Rumyantsev [this message]
2014-12-04 13:37             ` Richard Biener
2014-12-09 13:11               ` Yuri Rumyantsev
2014-12-09 15:21                 ` Richard Biener
2014-12-10 10:54                   ` Yuri Rumyantsev
2014-12-10 14:31                     ` Richard Biener
2014-12-10 15:22                       ` Yuri Rumyantsev
2014-12-11  8:59                         ` Richard Biener
2014-12-16 15:16                           ` Yuri Rumyantsev
2014-12-17 15:45                             ` Richard Biener
2014-12-18 13:48                               ` Yuri Rumyantsev
2014-12-19 11:46                                 ` Richard Biener
2014-12-22 14:49                                   ` Yuri Rumyantsev
2015-01-09 12:31                                     ` Richard Biener
2015-01-14 13:33                                       ` Yuri Rumyantsev
2015-01-14 15:00                                         ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAEoMCqTC5XHC2QjLrMPjNGj7f6APpgF6FXMxj+8ZNN_F=Bx6fA@mail.gmail.com' \
    --to=ysrumyan@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=izamyatin@gmail.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).