From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 116078 invoked by alias); 4 Nov 2015 15:37:35 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 116056 invoked by uid 89); 4 Nov 2015 15:37:34 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=0.6 required=5.0 tests=AWL,BAYES_50,SPF_PASS,UNSUBSCRIBE_BODY autolearn=no version=3.3.2
X-HELO: eu-smtp-delivery-143.mimecast.com
Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (146.101.78.143)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 04 Nov 2015 15:37:32 +0000
Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140])
 by eu-smtp-1.mimecast.com with ESMTP id uk-mta-37-euK8RrPJThyR7iyhsai2Xw-1; Wed, 04 Nov 2015 15:37:26 +0000
Received: from e107456-lin.cambridge.arm.com ([10.1.2.79])
 by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Wed, 4 Nov 2015 15:37:25 +0000
From: James Greenhalgh
To: gcc-patches@gcc.gnu.org
Cc: bernds_cb1@t-online.de, law@redhat.com, ebotcazou@libertysurf.fr, steven@gcc.gnu.org, Kyrylo.Tkachov@arm.com, richard.guenther@gmail.com
Subject: Re: [Patch ifcvt] Teach RTL ifcvt to handle multiple simple set instructions
Date: Wed, 04 Nov 2015 15:37:00 -0000
Message-Id: <1446651440-23017-1-git-send-email-james.greenhalgh@arm.com>
In-Reply-To: <5639E633.7030705@redhat.com>
References: <5639E633.7030705@redhat.com>
MIME-Version: 1.0
X-MC-Unique: euK8RrPJThyR7iyhsai2Xw-1
Content-Type: multipart/mixed; boundary="------------2.2.0.1.gd394abb.dirty"
X-IsSubscribed: yes
X-SW-Source: 2015-11/txt/msg00365.txt.bz2

This is a multi-part message in MIME format.
--------------2.2.0.1.gd394abb.dirty
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: quoted-printable
Content-length: 1716

On Wed, Nov 04, 2015 at 12:04:19PM +0100, Bernd Schmidt wrote:
> On 10/30/2015 07:03 PM, James Greenhalgh wrote:
> >+     i = tmp_i; <- Should be cleaned up
>
> Maybe reword as "Subsequent passes are expected to clean up the
> extra moves", otherwise it sounds like a TODO item.
>
> >+   read back in anotyher SET, as might occur in a swap idiom or
>
> Typo.
>
> >+          if (find_reg_note (insn, REG_DEAD, new_val) != NULL_RTX)
> >+            {
> >+              /* The write to targets[i] is only live until the read
> >+                 here.  As the condition codes match, we can propagate
> >+                 the set to here.  */
> >+              new_val = SET_SRC (single_set (unmodified_insns[i]));
> >+            }
>
> Shouldn't use braces around single statements (also goes for the
> surrounding for loop).
>
> >+  /* We must have at least one real insn to convert, or there will
> >+     be trouble!  */
> >+  unsigned count = 0;
>
> The comment seems a bit strange in this context - I think it's left
> over from the earlier version?
>
> As far as I'm concerned this is otherwise ok.

Thanks,

I've updated the patch with those issues addressed. As the cost model was
controversial in an earlier revision, I'll leave this on list for 24 hours
and, if nobody jumps in to object, commit it tomorrow.

I've bootstrapped and tested the updated patch on x86_64-none-linux-gnu
just to check that I got the braces right, with no issues.

Thanks,
James

---
gcc/

2015-11-04  James Greenhalgh

	* ifcvt.c (bb_ok_for_noce_convert_multiple_sets): New.
	(noce_convert_multiple_sets): Likewise.
	(noce_process_if_block): Call them.

gcc/testsuite/

2015-11-04  James Greenhalgh

	* gcc.dg/ifcvt-4.c: New.

--------------2.2.0.1.gd394abb.dirty
Content-Type: text/x-patch;
 name=0001-Re-Patch-ifcvt-Teach-RTL-ifcvt-to-handle-multiple-si.patch
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
 filename="0001-Re-Patch-ifcvt-Teach-RTL-ifcvt-to-handle-multiple-si.patch"
Content-length: 9077

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index f23d9afd..1c33283 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -3016,6 +3016,244 @@ bb_valid_for_noce_process_p (basic_block test_bb, rtx cond,
   return false;
 }
 
+/* We have something like:
+
+     if (x > y)
+       { i = a; j = b; k = c; }
+
+   Make it:
+
+     tmp_i = (x > y) ? a : i;
+     tmp_j = (x > y) ? b : j;
+     tmp_k = (x > y) ? c : k;
+     i = tmp_i;
+     j = tmp_j;
+     k = tmp_k;
+
+   Subsequent passes are expected to clean up the extra moves.
+
+   Look for special cases such as writes to one register which are
+   read back in another SET, as might occur in a swap idiom or
+   similar.
+
+   These look like:
+
+   if (x > y)
+     i = a;
+     j = i;
+
+   Which we want to rewrite to:
+
+     tmp_i = (x > y) ? a : i;
+     tmp_j = (x > y) ? tmp_i : j;
+     i = tmp_i;
+     j = tmp_j;
+
+   We can catch these when looking at (SET x y) by keeping a list of the
+   registers we would have targeted before if-conversion and looking back
+   through it for an overlap with Y.  If we find one, we rewire the
+   conditional set to use the temporary we introduced earlier.
+
+   IF_INFO contains the useful information about the block structure and
+   jump instructions.  */
+
+static int
+noce_convert_multiple_sets (struct noce_if_info *if_info)
+{
+  basic_block test_bb = if_info->test_bb;
+  basic_block then_bb = if_info->then_bb;
+  basic_block join_bb = if_info->join_bb;
+  rtx_insn *jump = if_info->jump;
+  rtx_insn *cond_earliest;
+  rtx_insn *insn;
+
+  start_sequence ();
+
+  /* Decompose the condition attached to the jump.  */
+  rtx cond = noce_get_condition (jump, &cond_earliest, false);
+  rtx x = XEXP (cond, 0);
+  rtx y = XEXP (cond, 1);
+  rtx_code cond_code = GET_CODE (cond);
+
+  /* The true targets for a conditional move.  */
+  vec<rtx> targets = vNULL;
+  /* The temporaries introduced to allow us to not consider register
+     overlap.  */
+  vec<rtx> temporaries = vNULL;
+  /* The insns we've emitted.  */
+  vec<rtx_insn *> unmodified_insns = vNULL;
+  int count = 0;
+
+  FOR_BB_INSNS (then_bb, insn)
+    {
+      /* Skip over non-insns.  */
+      if (!active_insn_p (insn))
+        continue;
+
+      rtx set = single_set (insn);
+      gcc_checking_assert (set);
+
+      rtx target = SET_DEST (set);
+      rtx temp = gen_reg_rtx (GET_MODE (target));
+      rtx new_val = SET_SRC (set);
+      rtx old_val = target;
+
+      /* If we were supposed to read from an earlier write in this block,
+         we've changed the register allocation.  Rewire the read.  While
+         we are looking, also try to catch a swap idiom.  */
+      for (int i = count - 1; i >= 0; --i)
+        if (reg_overlap_mentioned_p (new_val, targets[i]))
+          {
+            /* Catch a "swap" style idiom.  */
+            if (find_reg_note (insn, REG_DEAD, new_val) != NULL_RTX)
+              /* The write to targets[i] is only live until the read
+                 here.  As the condition codes match, we can propagate
+                 the set to here.  */
+              new_val = SET_SRC (single_set (unmodified_insns[i]));
+            else
+              new_val = temporaries[i];
+            break;
+          }
+
+      /* If we had a non-canonical conditional jump (i.e. one where
+         the fallthrough is to the "else" case) we need to reverse
+         the conditional select.  */
+      if (if_info->then_else_reversed)
+        std::swap (old_val, new_val);
+
+      /* Actually emit the conditional move.  */
+      rtx temp_dest = noce_emit_cmove (if_info, temp, cond_code,
+                                       x, y, new_val, old_val);
+
+      /* If we failed to expand the conditional move, drop out and don't
+         try to continue.  */
+      if (temp_dest == NULL_RTX)
+        {
+          end_sequence ();
+          return FALSE;
+        }
+
+      /* Bookkeeping.  */
+      count++;
+      targets.safe_push (target);
+      temporaries.safe_push (temp_dest);
+      unmodified_insns.safe_push (insn);
+    }
+
+  /* We must have seen some sort of insn to insert, otherwise we were
+     given an empty BB to convert, and we can't handle that.  */
+  gcc_assert (!unmodified_insns.is_empty ());
+
+  /* Now fixup the assignments.  */
+  for (int i = 0; i < count; i++)
+    noce_emit_move_insn (targets[i], temporaries[i]);
+
+  /* Actually emit the sequence.  */
+  rtx_insn *seq = get_insns ();
+
+  for (insn = seq; insn; insn = NEXT_INSN (insn))
+    set_used_flags (insn);
+
+  /* Mark all our temporaries and targets as used.  */
+  for (int i = 0; i < count; i++)
+    {
+      set_used_flags (temporaries[i]);
+      set_used_flags (targets[i]);
+    }
+
+  set_used_flags (cond);
+  set_used_flags (x);
+  set_used_flags (y);
+
+  unshare_all_rtl_in_chain (seq);
+  end_sequence ();
+
+  if (!seq)
+    return FALSE;
+
+  for (insn = seq; insn; insn = NEXT_INSN (insn))
+    if (JUMP_P (insn)
+        || recog_memoized (insn) == -1)
+      return FALSE;
+
+  emit_insn_before_setloc (seq, if_info->jump,
+                           INSN_LOCATION (unmodified_insns.last ()));
+
+  /* Clean up THEN_BB and the edges in and out of it.  */
+  remove_edge (find_edge (test_bb, join_bb));
+  remove_edge (find_edge (then_bb, join_bb));
+  redirect_edge_and_branch_force (single_succ_edge (test_bb), join_bb);
+  delete_basic_block (then_bb);
+  num_true_changes++;
+
+  /* Maybe merge blocks now the jump is simple enough.  */
+  if (can_merge_blocks_p (test_bb, join_bb))
+    {
+      merge_blocks (test_bb, join_bb);
+      num_true_changes++;
+    }
+
+  num_updated_if_blocks++;
+  return TRUE;
+}
+
+/* Return true iff basic block TEST_BB is comprised of only
+   (SET (REG) (REG)) insns suitable for conversion to a series
+   of conditional moves.  FORNOW: Use II to find the expected cost of
+   the branch into/over TEST_BB.
+
+   TODO: This creates an implicit "magic number" for branch_cost.
+   II->branch_cost now guides the maximum number of set instructions in
+   a basic block which is considered profitable to completely
+   if-convert.  */
+
+static bool
+bb_ok_for_noce_convert_multiple_sets (basic_block test_bb,
+                                      struct noce_if_info *ii)
+{
+  rtx_insn *insn;
+  unsigned count = 0;
+
+  FOR_BB_INSNS (test_bb, insn)
+    {
+      /* Skip over notes etc.  */
+      if (!active_insn_p (insn))
+        continue;
+
+      /* We only handle SET insns.  */
+      rtx set = single_set (insn);
+      if (set == NULL_RTX)
+        return false;
+
+      rtx dest = SET_DEST (set);
+      rtx src = SET_SRC (set);
+
+      /* We can possibly relax this, but for now only handle REG to REG
+         moves.  This avoids any issues that might come from introducing
+         loads/stores that might violate data-race-freedom guarantees.  */
+      if (!(REG_P (src) && REG_P (dest)))
+        return false;
+
+      /* Destination must be appropriate for a conditional write.  */
+      if (!noce_operand_ok (dest))
+        return false;
+
+      /* We must be able to conditionally move in this mode.  */
+      if (!can_conditionally_move_p (GET_MODE (dest)))
+        return false;
+
+      ++count;
+    }
+
+  /* FORNOW: Our cost model is a count of the number of instructions we
+     would if-convert.  This is suboptimal, and should be improved as part
+     of a wider rework of branch_cost.  */
+  if (count > ii->branch_cost)
+    return FALSE;
+
+  return count > 0;
+}
+
 /* Given a simple IF-THEN-JOIN or IF-THEN-ELSE-JOIN block, attempt to convert
    it without using conditional execution.  Return TRUE if we were successful
    at converting the block.  */
@@ -3038,12 +3276,22 @@ noce_process_if_block (struct noce_if_info *if_info)
      (1) if (...) x = a; else x = b;
      (2) x = b; if (...) x = a;
      (3) if (...) x = a;   // as if with an initial x = x.
-
+     (4) if (...) { x = a; y = b; z = c; }  // Like 3, for multiple SETS.
      The later patterns require jumps to be more expensive.
      For the if (...) x = a; else x = b; case we allow multiple insns
      inside the then and else blocks as long as their only effect is
      to calculate a value for x.
-     ??? For future expansion, look for multiple X in such patterns.  */
+     ??? For future expansion, further expand the "multiple X" rules.  */
+
+  /* First look for multiple SETS.  */
+  if (!else_bb
+      && HAVE_conditional_move
+      && !HAVE_cc0
+      && bb_ok_for_noce_convert_multiple_sets (then_bb, if_info))
+    {
+      if (noce_convert_multiple_sets (if_info))
+        return TRUE;
+    }
 
   if (! bb_valid_for_noce_process_p (then_bb, cond, &if_info->then_cost,
                                      &if_info->then_simple))
diff --git a/gcc/testsuite/gcc.dg/ifcvt-4.c b/gcc/testsuite/gcc.dg/ifcvt-4.c
new file mode 100644
index 0000000..16be2b0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ifcvt-4.c
@@ -0,0 +1,16 @@
+/* { dg-options "-fdump-rtl-ce1 -O2" } */
+int
+foo (int x, int y, int a)
+{
+  int i = x;
+  int j = y;
+  /* Try to make taking the branch likely.  */
+  __builtin_expect (x > y, 1);
+  if (x > y)
+    {
+      i = a;
+      j = i;
+    }
+  return i * j;
+}
+/* { dg-final { scan-rtl-dump "2 true changes made" "ce1" } } */

--------------2.2.0.1.gd394abb.dirty--
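
For readers following along without the RTL in front of them, here is a
minimal C sketch of the source-level effect described in the comment above
noce_convert_multiple_sets, using the dependent-set case from
gcc.dg/ifcvt-4.c. The names foo_branchy, foo_ifcvt and check_same are made
up for illustration and do not appear in the patch. The real pass works on
RTL conditional moves rather than C ternaries, so treat this only as a
picture of the intended semantics, not as what the compiler emits.

```c
#include <assert.h>

/* Branchy form, as in gcc.dg/ifcvt-4.c: j reads the i written earlier
   in the same block.  */
int
foo_branchy (int x, int y, int a)
{
  int i = x;
  int j = y;
  if (x > y)
    {
      i = a;
      j = i;
    }
  return i * j;
}

/* Hand-written picture of the if-converted form (hypothetical; the pass
   itself emits RTL conditional moves).  Each set becomes a select into
   a fresh temporary, and the read of i is rewired to tmp_i.  */
int
foo_ifcvt (int x, int y, int a)
{
  int i = x;
  int j = y;
  int tmp_i = (x > y) ? a : i;
  int tmp_j = (x > y) ? tmp_i : j;
  i = tmp_i;
  j = tmp_j;
  return i * j;
}

/* Hypothetical helper: both forms must agree for a given input.  */
void
check_same (int x, int y, int a)
{
  assert (foo_branchy (x, y, a) == foo_ifcvt (x, y, a));
}
```

Calling check_same (5, 3, 7) exercises the taken branch and
check_same (1, 2, 7) the fall-through. Reading tmp_i rather than i in the
second select is what keeps the two forms equivalent when the branch is
taken, since i itself is not updated until after both selects.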