From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 3CFA53858402; Fri, 24 Sep 2021 10:45:31 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 3CFA53858402 Received: from pps.filterd (m0098404.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 18OAhNat011217; Fri, 24 Sep 2021 06:45:30 -0400 Received: from ppma06ams.nl.ibm.com (66.31.33a9.ip4.static.sl-reverse.com [169.51.49.102]) by mx0a-001b2d01.pphosted.com with ESMTP id 3b9d5rr18v-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 24 Sep 2021 06:45:30 -0400 Received: from pps.filterd (ppma06ams.nl.ibm.com [127.0.0.1]) by ppma06ams.nl.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 18OAW08c011078; Fri, 24 Sep 2021 10:45:26 GMT Received: from b06avi18626390.portsmouth.uk.ibm.com (b06avi18626390.portsmouth.uk.ibm.com [9.149.26.192]) by ppma06ams.nl.ibm.com with ESMTP id 3b93g6vrxy-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 24 Sep 2021 10:45:26 +0000 Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06avi18626390.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 18OAeW0P61866396 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 24 Sep 2021 10:40:32 GMT Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 2B73DA408D; Fri, 24 Sep 2021 10:45:24 +0000 (GMT) Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E5F9DA4094; Fri, 24 Sep 2021 10:45:23 +0000 (GMT) Received: from sig-9-145-45-184.uk.ibm.com (unknown [9.145.45.184]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP; Fri, 24 Sep 2021 10:45:23 +0000 (GMT) Message-ID: <76de6663043e7cf1b9ecc935a3d32ea85e7a3aaf.camel@linux.ibm.com> Subject: Re: [PATCH v2 2/3] reassoc: Propagate PHI_LOOP_BIAS along single uses From: Ilya Leoshkevich To: Richard Biener Cc: gcc-patches@gcc.gnu.org, Andreas Krebbel Date: Fri, 24 Sep 2021 12:45:23 +0200 In-Reply-To: <6s684943-866-q9sq-s7nr-1r1r9osrr95q@fhfr.qr> References: <20210921225429.326017-1-iii@linux.ibm.com> <20210921225429.326017-3-iii@linux.ibm.com> <6s684943-866-q9sq-s7nr-1r1r9osrr95q@fhfr.qr> Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.38.4 (3.38.4-1.fc33) MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 X-Proofpoint-GUID: oakGvYFKjjw_CJ9zygSmmFZBKTLBzbCN X-Proofpoint-ORIG-GUID: oakGvYFKjjw_CJ9zygSmmFZBKTLBzbCN X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.391,FMLib:17.0.607.475 definitions=2021-09-24_03,2021-09-24_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 suspectscore=0 impostorscore=0 lowpriorityscore=0 mlxlogscore=999 spamscore=0 mlxscore=0 malwarescore=0 adultscore=0 priorityscore=1501 clxscore=1015 bulkscore=0 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2109240063 X-Spam-Status: No, score=-10.4 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Sep 2021 10:45:33 -0000 On Thu, 2021-09-23 at 13:55 +0200, Richard Biener wrote: > On Wed, 22 Sep 2021, Ilya Leoshkevich wrote: > > > PR tree-optimization/49749 introduced code that shortens dependency > > chains containing loop accumulators by placing them last on operand > > lists of associative operations. > > > > 456.hmmer benchmark on s390 could benefit from this, however, the > > code > > that needs it modifies loop accumulator before using it, and since > > only > > so-called loop-carried phis are are treated as loop accumulators, > > the > > code in the present form doesn't really help.   According to Bill > > Schmidt - the original author - such a conservative approach was > > chosen > > so as to avoid unnecessarily swapping operands, which might cause > > unpredictable effects.  However, giving special treatment to forms > > of > > loop accumulators is acceptable. > > > > The definition of loop-carried phi is: it's a single-use phi, which > > is > > used in the same innermost loop it's defined in, at least one > > argument > > of which is defined in the same innermost loop as the phi itself. > > Given this, it seems natural to treat single uses of such phis as > > phis > > themselves. > > > > gcc/ChangeLog: > > > >         * tree-ssa-reassoc.c (biased_names): New global. > >         (propagate_bias_p): New function. > >         (loop_carried_phi): Remove. > >         (propagate_rank): Propagate bias along single uses. > >         (get_rank): Update biased_names when needed. > > --- > >  gcc/tree-ssa-reassoc.c | 97 ++++++++++++++++++++++++++++---------- > > ---- > >  1 file changed, 64 insertions(+), 33 deletions(-) > > > > diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c > > index 420c14e8cf5..2f7a8882aac 100644 > > --- a/gcc/tree-ssa-reassoc.c > > +++ b/gcc/tree-ssa-reassoc.c > > @@ -211,6 +211,10 @@ static int64_t *bb_rank; > >  /* Operand->rank hashtable.  */ > >  static hash_map *operand_rank; > >   > > +/* SSA_NAMEs that are forms of loop accumulators and whose ranks > > need to be > > +   biased.  */ > > +static auto_bitmap biased_names; > > + > >  /* Vector of SSA_NAMEs on which after reassociate_bb is done with > >     all basic blocks the CFG should be adjusted - basic blocks > >     split right after that SSA_NAME's definition statement and > > before > > @@ -256,6 +260,50 @@ reassoc_remove_stmt (gimple_stmt_iterator > > *gsi) > >     the rank difference between two blocks.  */ > >  #define PHI_LOOP_BIAS (1 << 15) > >   > > +/* Return TRUE iff PHI_LOOP_BIAS should be propagated from one of > > the STMT's > > +   operands to the STMT's left-hand side.  The goal is to preserve > > bias in code > > +   like this: > > + > > +     x_1 = phi(x_0, x_2) > > +     a = x_1 | 1 > > +     b = a ^ 2 > > +     .MEM = b > > +     c = b + d > > +     x_2 = c + e > > + > > +   That is, we need to preserve bias along single-use chains > > originating from > > +   loop-carried phis.  Only GIMPLE_ASSIGNs to SSA_NAMEs are > > considered to be > > +   uses, because only they participate in rank propagation.  */ > > +static bool > > +propagate_bias_p (gimple *stmt) > > +{ > > +  use_operand_p use; > > +  imm_use_iterator use_iter; > > +  gimple *single_use_stmt = NULL; > > + > > +  FOR_EACH_IMM_USE_FAST (use, use_iter, gimple_assign_lhs (stmt)) > > +    { > > +      gimple *current_use_stmt = USE_STMT (use); > > + > > +      if (is_gimple_assign (current_use_stmt) > > +         && TREE_CODE (gimple_assign_lhs (current_use_stmt)) == > > SSA_NAME) > > +       { > > +         if (single_use_stmt != NULL) > > what if single_use_stmt == current_use_stmt?  We might have two > uses on a stmt after all - should that still be biased?  I guess not > and thus the check is correct? Come to think of it, it should be ok to bias it. Things like x = x + x are fine (this particular case can be transformed into something else earlier, but I think the overall point still holds). > > > +           return false; > > +         single_use_stmt = current_use_stmt; > > +       } > > +    } > > + > > +  if (single_use_stmt == NULL) > > +    return false; > > + > > +  if (gimple_bb (stmt)->loop_father > > +      != gimple_bb (single_use_stmt)->loop_father) > > +    return false; > > + > > +  return true; > > +} > > + > >  /* Rank assigned to a phi statement.  If STMT is a loop-carried > > phi of > >     an innermost loop, and the phi has only a single use which is > > inside > >     the loop, then the rank is the block rank of the loop latch > > plus an > > @@ -313,46 +361,23 @@ phi_rank (gimple *stmt) > >    return bb_rank[bb->index]; > >  } > >   > > -/* If EXP is an SSA_NAME defined by a PHI statement that > > represents a > > -   loop-carried dependence of an innermost loop, return TRUE; else > > -   return FALSE.  */ > > -static bool > > -loop_carried_phi (tree exp) > > -{ > > -  gimple *phi_stmt; > > -  int64_t block_rank; > > - > > -  if (TREE_CODE (exp) != SSA_NAME > > -      || SSA_NAME_IS_DEFAULT_DEF (exp)) > > -    return false; > > - > > -  phi_stmt = SSA_NAME_DEF_STMT (exp); > > - > > -  if (gimple_code (SSA_NAME_DEF_STMT (exp)) != GIMPLE_PHI) > > -    return false; > > - > > -  /* Non-loop-carried phis have block rank.  Loop-carried phis > > have > > -     an additional bias added in.  If this phi doesn't have block > > rank, > > -     it's biased and should not be propagated.  */ > > -  block_rank = bb_rank[gimple_bb (phi_stmt)->index]; > > - > > -  if (phi_rank (phi_stmt) != block_rank) > > -    return true; > > - > > -  return false; > > -} > > - > >  /* Return the maximum of RANK and the rank that should be > > propagated > >     from expression OP.  For most operands, this is just the rank > > of OP. > >     For loop-carried phis, the value is zero to avoid undoing the > > bias > >     in favor of the phi.  */ > >  static int64_t > > -propagate_rank (int64_t rank, tree op) > > +propagate_rank (int64_t rank, tree op, gimple *stmt, bool *bias_p) > >  { > >    int64_t op_rank; > >   > > -  if (loop_carried_phi (op)) > > -    return rank; > > +  if (TREE_CODE (op) == SSA_NAME > > +      && bitmap_bit_p (biased_names, SSA_NAME_VERSION (op))) > > +    { > > +      if (propagate_bias_p (stmt)) > > +       *bias_p = true; > > +      else > > +       return rank; > > +    } > >   > >    op_rank = get_rank (op); > >   > > @@ -440,6 +465,8 @@ get_rank (tree e) > >   > >        else > >         { > > +         bool bias_p = false; > > + > >           /* Otherwise, find the maximum rank for the operands.  As > > an > >              exception, remove the bias from loop-carried phis when > > propagating > >              the rank so that dependent operations are not also > > biased.  */ > > @@ -448,9 +475,12 @@ get_rank (tree e) > >              thus have rank 0.  */ > >           rank = 0; > >           FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE) > > -           rank = propagate_rank (rank, op); > > +           rank = propagate_rank (rank, op, stmt, &bias_p); > > It looks like if (propagate_bias_p (stmt)) is loop invariant here > and so when we inline the head this should simplify, avoiding > the new parameters to propagate_rank? I managed to move propagate_bias_p (stmt) out of the loop, but couldn't find any worthy simplification opportunities. For example, if we move the biasing logic out of propagate_rank () into the loop, nothing gets cancelled out and the resulting code looks more cluttered. I'll post the v3 after regtest finishes. > > Otherwise looks good to me. > > Richard.