public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jeff Law <jeffreyalaw@gmail.com>
To: Mariam Arutunian <mariamarutunian@gmail.com>,
	GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [RFC/RFA] [PATCH 08/12] Add a new pass for naive CRC loops detection
Date: Mon, 27 May 2024 22:20:33 -0600	[thread overview]
Message-ID: <b525ae0c-af72-4399-bd44-f8a381133f3e@gmail.com> (raw)
In-Reply-To: <CAE65F3PWUGsqZOJ=ympR_AC-fWNJFrYWS2Dqsp+GiENPmOY==w@mail.gmail.com>



On 5/24/24 2:42 AM, Mariam Arutunian wrote:
> This patch adds a new compiler pass aimed at identifying naive CRC 
> implementations,
> characterized by the presence of a loop calculating a CRC (polynomial 
> long division).
> Upon detection of a potential CRC, the pass prints an informational message.
> 
> Performs CRC optimization if optimization level is >= 2,
> besides optimizations for size and if fno_gimple_crc_optimization given.
> 
> This pass is added for the detection and optimization of naive CRC 
> implementations,
> improving the efficiency of CRC-related computations.
> 
> This patch includes only initial fast checks for filtering out non-CRCs,
> detected possible CRCs verification and optimization parts will be 
> provided in subsequent patches.
> 
>    gcc/
> 
>      * Makefile.in (OBJS): Add gimple-crc-optimization.o.
>      * common.opt (fgimple-crc-optimization): New option.
>      * doc/invoke.texi (-fgimple-crc-optimization): Add documentation.
>      * gimple-crc-optimization.cc: New file.
>      * gimple.cc (set_phi_stmts_not_visited): New function.
>      (set_gimple_stmts_not_visited): Likewise.
>      (set_bbs_stmts_not_visited): Likewise.
>      * gimple.h (set_gimple_stmts_not_visited): New extern function 
> declaration.
>      (set_phi_stmts_not_visited): New extern function declaration.
>      (set_bbs_stmts_not_visited): New extern function declaration.
>      * opts.cc (default_options_table): Add OPT_fgimple_crc_optimization.
>      (enable_fdo_optimizations): Enable gimple-crc-optimization.
>      * passes.def (pass_crc_optimization): Add new pass.
>      * timevar.def (TV_GIMPLE_CRC_OPTIMIZATION): New timevar.
>      * tree-pass.h (make_pass_crc_optimization): New extern function 
> declaration.
> 
> Signed-off-by: Mariam Arutunian <mariamarutunian@gmail.com 
> <mailto:mariamarutunian@gmail.com>>

> diff --git a/gcc/common.opt b/gcc/common.opt
> index 2c078fdd1f8..53f7ab255dd 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1757,6 +1757,12 @@ Common Var(flag_gcse_after_reload) Optimization
>  Perform global common subexpression elimination after register allocation has
>  finished.
>  
> +fgimple-crc-optimization
> +Common Var(flag_gimple_crc_optimization) Optimization
> +Detect loops calculating CRC and replace with faster implementation.
> +If the target supports carry-less-multiplication instruction, generate CRC using
> +it; otherwise generate table-based CRC.
This probably needs a minor update since your code can also generate a 
CRC instruction on x86 when it detects a CRC loop with the right polynomial.





> +
> +  /* Returns true if there is only two conditional blocks in the loop
> +     (one may be for the CRC bit check and the other for the loop counter).
> +     This may filter out some real CRCs, where more than one condition
> +     is checked for the CRC calculation.  */
> +  static bool loop_contains_two_conditional_bb (basic_block *loop_bbs,
> +						unsigned loop_num_nodes);
It's been a while, so if we're rehashing something we already worked 
through, I apologize.

IIRC we looked at the problem of canonicalizing the loop into a form 
where we didn't necessarily have conditional blocks, instead we had 
branchless sequences for the conditional xor and dealing with the high 
bit in the crc.  My recollection was that the coremark CRC loop would 
always canonicalize, but that in general we still saw multiple CRC 
implementations that did not canonicalize and thus we still needed the 
more complex matching.  Correct?


> +
> +  /* Checks whether found XOR_STMT is for calculating CRC.
> +     The function CRC_FUN calculates CRC only if there is a shift operation
> +     in the crc loop.  */
> +  bool xor_calculates_crc (function *crc_fun, basic_block *loop_bbs,
> +			   const gimple *xor_stmt);
So the second sentence in the comment doesn't really seem to relate to 
this function.  It also seems like we potentially have two xors in a CRC 
loop.  One which xors one bit of the data with one bit of the crc.  The 
other is the conditional xor.  Which does this refer to?  I'm guesing 
the latter since the former can likely hoist out of the loop given a 
sufficiently smart loop optimizer.



> +
> +  /* Checks that the variable used in the condition COND is the assumed CRC
> +     (or depends on the assumed CRC).
> +     Also sets data member m_phi_for_data if it isn't set and exists.  */
> +  bool is_crc_checked (gcond *cond, basic_block *loop_bbs);
> +
> +  /* Returns true if condition COND checks MSB/LSB bit is 1.
> +     Otherwise, return false.  */
> +  static bool cond_true_is_checked_for_bit_one (const gcond *cond);
> +
> +  /* Returns opposite block of the XOR_BB from PRED_BB's dest blocks.  */
> +  static basic_block get_xor_bb_opposite (basic_block pred_bb,
> +					  basic_block xor_bb);
> +
> +  /* Checks whether the pair of xor's shift exists in the opposite
> +     basic block (OPPOSITE_BB).
> +     If there is a shift and xor in the same block,
> +     then in the opposite block must be another shift.  */
> +  bool exists_shift_for_opp_xor_shift (basic_block opposite_bb);
> +
> +  /* Goes down by def-use chain (within the CRC loop) and returns the statement
> +     where variable (dependent on xor-ed variable) is shifted with 1.
> +     Between xor and shift operations only phi statements are allowed.
> +     Otherwise, returns nullptr.  */
> +  gimple *find_shift_after_xor (tree xored_crc);
> +
> +  /* Returns the statement which does shift 1 operation.
> +     If there is no such statement, returns nullptr.
> +     STMTS - are the statements within the loop before xor operation on
> +     which possible CRC variable depends.  */
> +  gimple *find_shift_before_xor (const auto_vec <gimple *> &stmts);
> +
> +  /* Returns true if ASSIGN_STMT does shift with 1.
> +     Otherwise, returns false.  */
> +  bool can_be_crc_shift (gimple *assign_stmt);
> +
> +  /* Returns true if the operation done in ASSIGN_STMT can occur during CRC
> +     calculation.  Otherwise, returns false.  */
> +  bool can_not_be_crc_stmt (gimple *assign_stmt);
> +
> +  /* Returns true if the statement with STMT_CODE may occur in CRC calculation.
> +     Otherwise, returns false.  */
> +  static bool is_acceptable_stmt_code (const tree_code &stmt_code);
> +
> +  /* Prints extracted details of CRC calculation.  */
> +  void dump_crc_information ();
> +
> + public:
> +  unsigned int execute (function *fun);
> +};
> +


> +
> +/* Goes down by def-use chain (within the CRC loop) and returns the statement
> +   where variable (dependent on xor-ed variable) is shifted with 1.
> +   Between xor and shift operations only phi statements are allowed.
> +   Otherwise, returns nullptr.  */
Follow def-use chains of XORED_CRC and return the statement where 
XORED_CRC is shifted by one bit position.  Only PHI statements are 
allowed between XORED_CRC and the shift in the def-use chain.

If no such statement is found, return NULL.

Seems like a better function comment here.


> +
> +gimple *
> +crc_optimization::find_shift_after_xor (tree xored_crc)
> +{
> +  imm_use_iterator imm_iter;
> +  use_operand_p use_p;
> +
> +  if (TREE_CODE (xored_crc) != SSA_NAME)
> +    return nullptr;
If we always expect XORED_CRC to be an SSA_NAME, we might be able to use
gcc_assert TREE_CODE (XORED_CRC) == SSA_NAME);

The difference is the assert means the property should always hold. 
Whereas the IF as written just says that if we don't have an SSA_NAME, 
give up trying to optimize.
 > +> +/* Returns opposite block of the XOR_BB from PRED_BB's dest 
blocks.  */
> +
> +basic_block
> +crc_optimization::get_xor_bb_opposite (basic_block pred_bb, basic_block xor_bb)
> +{
> +  if (EDGE_COUNT (pred_bb->succs) != 2)
> +    return nullptr;
> +
> +  if (EDGE_SUCC (pred_bb, 0)->dest != xor_bb)
> +    return EDGE_SUCC (pred_bb, 0)->dest;
> +  return EDGE_SUCC (pred_bb, 1)->dest;
> +}
Consider using a gcc_assert to ensure that one of the two successors is 
xor_bb.

You might also want to reject edge->flags & EDGE_COMPLEX is nonzero.> +


> +/* Returns true if condition COND checks MSB/LSB bit is 1.
> +   Otherwise, return false.  */
> +
> +bool
> +crc_optimization::cond_true_is_checked_for_bit_one (const gcond *cond)
> +{
> +  if (!cond)
> +    return false;
> +
> +  tree rhs = gimple_cond_rhs (cond);
> +  enum tree_code code = gimple_cond_code (cond);
> +
> +  /* If the condition is something == 1 -> return true.  */
> +  if (integer_onep (rhs) && code == EQ_EXPR)
> +    return true;
Generally we prefer to have the cheapest test first.  So checking the 
code against EQ_EXPR first is marginally better.

> +
> +  /* If the condition is something != 0  or something < 0 -> return true.  */
> +  if (integer_zerop (rhs) && (code == NE_EXPR || code == LT_EXPR))
> +    return true;
We generally discourage having && and || conditions on the same line. 
I'd rewrite as
   if ((code == NE_EXPR || CODE == LT_EXPR)
       && integer_zerop (rhs))

Do you need to check for EQ_EXPR with rhs having the value 0?  What 
about NE_EXPR with the value 1?

 > +> +/* Checks that the condition is for checking CRC.
> +   Returns true if xor is done under the condition of MSB/LSB being 1, and
> +   the condition's variable and xor-ed variable depend on the same variable.
> +   Otherwise, returns false.
> +   XOR_BB is the basic block, where the xor operation is done.
> +   PRED_BB is the predecessor basic block of the XOR_BB, it is assumed that
> +   the last stmt of PRED_BB checks the condition under which xor is done.  */
> +
> +bool
> +crc_optimization::crc_cond (basic_block pred_bb, basic_block xor_bb,
> +			    basic_block *loop_bbs)
> +{
> +  /* Check whether PRED_BB contains condition.  We will consider only those
> +     cases when xor is done immediately under the condition.  */
> +  gcond *cond = safe_dyn_cast<gcond *> (*gsi_last_bb (pred_bb));
Hmm.  I think this works, but it's a bit obscure as it depends on 
operator* returning a pointer to the statement.  I didn't even know that 
worked.

Something like this might be more obvious:

gsi_stmt (gsi_last_bb (pred_bb)) should give you the last statement. 
Then you can see if it's a gimple cond.



> +
> +/* Returns true if the statement with STMT_CODE may occur in CRC calculation.
> +   Otherwise, returns false.  */
> +
> +bool
> +crc_optimization::is_acceptable_stmt_code (const tree_code &stmt_code)
> +{
> +  return stmt_code == BIT_IOR_EXPR
> +	 || stmt_code == BIT_AND_EXPR
> +	 || stmt_code == BIT_XOR_EXPR
> +	 || stmt_code == MINUS_EXPR
> +	 || stmt_code == PLUS_EXPR
> +	 || stmt_code == RSHIFT_EXPR
> +	 || stmt_code == LSHIFT_EXPR
> +	 || TREE_CODE_CLASS (stmt_code) == tcc_unary;
Go ahead and put parenthesis around all those tests :-)



> +
> +/* Set M_PHI_FOR_CRC and M_PHI_FOR_DATA fields.
> +   Returns false if there are more than two (as in CRC calculation only CRC's
> +   and data's phi may exist) or no phi statements in STMTS (at least there must
> +   be CRC's phi).
> +   Otherwise, returns true.  */
> +
> +bool
> +crc_optimization::set_crc_and_data_phi (auto_vec<gimple *> &stmts)
> +{
> +  for (auto stmt_it = stmts.begin (); stmt_it != stmts.end (); stmt_it++)
> +    {
> +      if (is_a<gphi *> (*stmt_it) && bb_loop_header_p (gimple_bb (*stmt_it)))
> +	{
> +	  if (!m_phi_for_crc)
> +	    m_phi_for_crc = as_a<gphi *> (*stmt_it);
> +	  else if (!m_phi_for_data)
> +	    m_phi_for_data = as_a<gphi *> (*stmt_it);
> +	  else
> +	    {
> +	      if (dump_file && (dump_flags & TDF_DETAILS))
> +		fprintf (dump_file, "Xor-ed variable depends on more than 2 "
> +				    "phis.\n");
> +	      return false;
> +	    }
> +	}
> +    }
> +  return m_phi_for_crc;
Hmm.  For a given PHI, how do we know if it's for the data item or the 
crc item, or something else (like a loop counter) entirely?



> diff --git a/gcc/gimple.cc b/gcc/gimple.cc
> index a9f968cb038..76518d8897d 100644
> --- a/gcc/gimple.cc
> +++ b/gcc/gimple.cc
> @@ -3426,6 +3426,44 @@ gimple_or_expr_nonartificial_location (gimple *stmt, tree expr)
>    return expansion_point_location_if_in_system_header (loc);
>  }
>  
> +/* Set GIMPLE_PHI statements of the BB not visited.  */
> +
> +void
> +set_phi_stmts_not_visited (basic_block bb)
[ ... ]
Something tells me we probably need to convert some code to use these 
new API points.  Not a requirement for this to go forward, but a useful 
cleanup.


Overall it looks pretty good.  I don't see anything that looks horribly 
wrong.  My biggest worry is that set_crc_and_data_phi routine.  But 
there's other small things noted above that we need to clean up.

So let's get the questions answered and some cleanups done, then repost.

Thanks,
Jeff

  reply	other threads:[~2024-05-28  4:20 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-24  8:42 Mariam Arutunian
2024-05-28  4:20 ` Jeff Law [this message]
2024-05-29 11:12   ` Mariam Arutunian
2024-06-08 22:00     ` Jeff Law
2024-06-19 15:27       ` Mariam Arutunian
2024-05-28  7:01 ` Richard Biener
2024-05-29 22:30   ` Jeff Law
2024-05-30  7:33     ` Richard Biener
2024-05-29 12:43 ` David Malcolm
2024-06-04 13:41 Mariam Arutunian
2024-06-08 21:47 ` Jeff Law
2024-06-19 14:29   ` Mariam Arutunian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b525ae0c-af72-4399-bd44-f8a381133f3e@gmail.com \
    --to=jeffreyalaw@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=mariamarutunian@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).