public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* RFA: Avoiding unprofitable speculation
@ 2011-08-16 22:00 Jeff Law
  2011-08-17  9:34 ` Richard Guenther
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Law @ 2011-08-16 22:00 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2597 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


ifcvt.c is sometimes over-aggressive in speculating instructions from a
not-predicted path.

Given:

        if (test) goto E; // x not live
        x = big();
        goto L;
        E:
        x = b;
        goto M;


ifcvt wants to turn it into:

        x = b;
        if (test) goto M;
        x = big();
        goto L;

ie, we speculate x = b and remove a branch.

Similarly:

        if (test) goto over; // x not live
        x = a;
        goto label;
        over:

   becomes

        x = a;
        if (! test) goto label;


where again we speculate x = a and eliminate a branch.


ifcvt has tests to see if the code to speculate is cheap relative to the
cost of the branch we get to remove.  Unfortunately, that only takes
into account a static RTX cost.   We need to be looking at the branch
prediction too -- often the branch we're going to eliminate isn't going
to be executed at all!

Specifically, we should take the cost of the branch we want to eliminate
and scale that by how often we expect to reach that branch at runtime.
That allows us to compare the runtime cost of the speculative code vs
the runtime benefit of eliminating the branch.

Looking at branch & insn counts before/after that change is quite
interesting.   I've got gcc-4.6 built with/without the attached change.
 I then use that gcc-4.6 to compile a bunch of .i files under the
watchful eye of valgrind.

Using this data we can see the actual costs...  For every dynamic branch
eliminated, we had to execute an additional 300 instructions!
Again, remember these are dynamic counts, so we may have only speculated
one static instruction to eliminate one branch, but we hit that
speculated instruction far more often dynamically than the branch we
ultimately eliminated.



instructions w/o patch:1267286482371
instructions w   patch:1264140292731

branches w/o     patch: 231180206508
branches w       patch: 231190636813

Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  OK for
trunk?



-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJOSt2sAAoJEBRtltQi2kC7UZUIAJ7fthVsCXxU3JOtIVbUSX5t
grCG73peQnBB7FhB58/jW1GJWc011mExLIJf74FDrNU+gMp3gn01L0zdjcaytmY6
sNjso7dLjW42a/wByzNlHNUy2KRMUqhobEhHYWgC0tMJFz8/ekCulI7h98pVISmT
np9G/1zRXn3uD7F3pKw7lLDS994nSUmjObPFIyFxTfVGhBTWZYY8JjKP7NsOCNli
Dr2BXFF4rahoSDUlcLHwPPBJPABLvxwvMo0dsmNkB3HEiajj7qVPGUYaGrTJ5M1g
Bvww+ozzJRT96qQ/smjVZutD2Cu/U74Ix6EX8Yj54Zp2HeX8tCJ3kXWPFI6cBpk=
=3BEA
-----END PGP SIGNATURE-----

[-- Attachment #2: Q --]
[-- Type: text/plain, Size: 6368 bytes --]

Index: ifcvt.c
===================================================================
*** ifcvt.c	(revision 177629)
--- ifcvt.c	(working copy)
*************** static int cond_exec_changed_p;
*** 85,91 ****
  
  /* Forward references.  */
  static int count_bb_insns (const_basic_block);
! static bool cheap_bb_rtx_cost_p (const_basic_block, int);
  static rtx first_active_insn (basic_block);
  static rtx last_active_insn (basic_block, int);
  static rtx find_active_insn_before (basic_block, rtx);
--- 85,91 ----
  
  /* Forward references.  */
  static int count_bb_insns (const_basic_block);
! static bool cheap_bb_rtx_cost_p (const_basic_block, int, int);
  static rtx first_active_insn (basic_block);
  static rtx last_active_insn (basic_block, int);
  static rtx find_active_insn_before (basic_block, rtx);
*************** count_bb_insns (const_basic_block bb)
*** 131,150 ****
  
  /* Determine whether the total insn_rtx_cost on non-jump insns in
     basic block BB is less than MAX_COST.  This function returns
!    false if the cost of any instruction could not be estimated.  */
  
  static bool
! cheap_bb_rtx_cost_p (const_basic_block bb, int max_cost)
  {
    int count = 0;
    rtx insn = BB_HEAD (bb);
    bool speed = optimize_bb_for_speed_p (bb);
  
    while (1)
      {
        if (NONJUMP_INSN_P (insn))
  	{
! 	  int cost = insn_rtx_cost (PATTERN (insn), speed);
  	  if (cost == 0)
  	    return false;
  
--- 131,161 ----
  
  /* Determine whether the total insn_rtx_cost on non-jump insns in
     basic block BB is less than MAX_COST.  This function returns
!    false if the cost of any instruction could not be estimated. 
! 
!    The cost of the non-jump insns in BB is scaled by REG_BR_PROB_BASE
!    as those insns are being speculated.  MAX_COST is scaled with SCALE
!    plus a small fudge factor.  */
  
  static bool
! cheap_bb_rtx_cost_p (const_basic_block bb, int scale, int max_cost)
  {
    int count = 0;
    rtx insn = BB_HEAD (bb);
    bool speed = optimize_bb_for_speed_p (bb);
  
+   /* Our branch probability/scaling factors are just estimates and don't
+      account for cases where we can get speculation for free and other
+      secondary benefits.  So we fudge the scale factor to make speculating
+      appear a little more profitable.  */
+   scale += REG_BR_PROB_BASE / 8;
+   max_cost *= scale;
+ 
    while (1)
      {
        if (NONJUMP_INSN_P (insn))
  	{
! 	  int cost = insn_rtx_cost (PATTERN (insn), speed) * REG_BR_PROB_BASE;
  	  if (cost == 0)
  	    return false;
  
*************** find_if_case_1 (basic_block test_bb, edg
*** 3796,3802 ****
    basic_block then_bb = then_edge->dest;
    basic_block else_bb = else_edge->dest;
    basic_block new_bb;
!   int then_bb_index;
  
    /* If we are partitioning hot/cold basic blocks, we don't want to
       mess up unconditional or indirect jumps that cross between hot
--- 3807,3814 ----
    basic_block then_bb = then_edge->dest;
    basic_block else_bb = else_edge->dest;
    basic_block new_bb;
!   int then_bb_index, then_prob;
!   rtx note;
  
    /* If we are partitioning hot/cold basic blocks, we don't want to
       mess up unconditional or indirect jumps that cross between hot
*************** find_if_case_1 (basic_block test_bb, edg
*** 3839,3846 ****
  	     "\nIF-CASE-1 found, start %d, then %d\n",
  	     test_bb->index, then_bb->index);
  
!   /* THEN is small.  */
!   if (! cheap_bb_rtx_cost_p (then_bb,
  	COSTS_N_INSNS (BRANCH_COST (optimize_bb_for_speed_p (then_edge->src),
  				    predictable_edge_p (then_edge)))))
      return FALSE;
--- 3851,3865 ----
  	     "\nIF-CASE-1 found, start %d, then %d\n",
  	     test_bb->index, then_bb->index);
  
!   note = find_reg_note (BB_END (test_bb), REG_BR_PROB, NULL_RTX);
!   if (!note)
!     then_prob = REG_BR_PROB_BASE / 2;
!   else
!     then_prob = REG_BR_PROB_BASE - INTVAL (XEXP (note, 0));
! 
!   /* We're speculating from the THEN path, we want to make sure the cost
!      of speculation is within reason.  */
!   if (! cheap_bb_rtx_cost_p (then_bb, then_prob,
  	COSTS_N_INSNS (BRANCH_COST (optimize_bb_for_speed_p (then_edge->src),
  				    predictable_edge_p (then_edge)))))
      return FALSE;
*************** find_if_case_2 (basic_block test_bb, edg
*** 3899,3904 ****
--- 3918,3924 ----
    basic_block then_bb = then_edge->dest;
    basic_block else_bb = else_edge->dest;
    edge else_succ;
+   int then_prob, else_prob;
    rtx note;
  
    /* If we are partitioning hot/cold basic blocks, we don't want to
*************** find_if_case_2 (basic_block test_bb, edg
*** 3938,3946 ****
    if (then_bb->index < NUM_FIXED_BLOCKS)
      return FALSE;
  
-   /* ELSE is predicted or SUCC(ELSE) postdominates THEN.  */
    note = find_reg_note (BB_END (test_bb), REG_BR_PROB, NULL_RTX);
!   if (note && INTVAL (XEXP (note, 0)) >= REG_BR_PROB_BASE / 2)
      ;
    else if (else_succ->dest->index < NUM_FIXED_BLOCKS
  	   || dominated_by_p (CDI_POST_DOMINATORS, then_bb,
--- 3958,3977 ----
    if (then_bb->index < NUM_FIXED_BLOCKS)
      return FALSE;
  
    note = find_reg_note (BB_END (test_bb), REG_BR_PROB, NULL_RTX);
!   if (!note)
!     {
!       else_prob = REG_BR_PROB_BASE / 2;
!       then_prob = REG_BR_PROB_BASE / 2;
!     }
!   else
!     {
!       else_prob = INTVAL (XEXP (note, 0));
!       then_prob = REG_BR_PROB_BASE - else_prob;
!     }
! 
!   /* ELSE is predicted or SUCC(ELSE) postdominates THEN.  */
!   if (else_prob > then_prob)
      ;
    else if (else_succ->dest->index < NUM_FIXED_BLOCKS
  	   || dominated_by_p (CDI_POST_DOMINATORS, then_bb,
*************** find_if_case_2 (basic_block test_bb, edg
*** 3955,3962 ****
  	     "\nIF-CASE-2 found, start %d, else %d\n",
  	     test_bb->index, else_bb->index);
  
!   /* ELSE is small.  */
!   if (! cheap_bb_rtx_cost_p (else_bb,
  	COSTS_N_INSNS (BRANCH_COST (optimize_bb_for_speed_p (else_edge->src),
  				    predictable_edge_p (else_edge)))))
      return FALSE;
--- 3986,3994 ----
  	     "\nIF-CASE-2 found, start %d, else %d\n",
  	     test_bb->index, else_bb->index);
  
!   /* We're speculating from the ELSE path, we want to make sure the cost
!      of speculation is within reason.  */
!   if (! cheap_bb_rtx_cost_p (else_bb, else_prob,
  	COSTS_N_INSNS (BRANCH_COST (optimize_bb_for_speed_p (else_edge->src),
  				    predictable_edge_p (else_edge)))))
      return FALSE;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: RFA: Avoiding unprofitable speculation
  2011-08-16 22:00 RFA: Avoiding unprofitable speculation Jeff Law
@ 2011-08-17  9:34 ` Richard Guenther
  2011-08-17 22:50   ` Jeff Law
  2011-08-18 22:59   ` Richard Henderson
  0 siblings, 2 replies; 7+ messages in thread
From: Richard Guenther @ 2011-08-17  9:34 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Tue, Aug 16, 2011 at 11:14 PM, Jeff Law <law@redhat.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> ifcvt.c is sometimes over-aggressive in speculating instructions from a
> not-predicted path.
>
> Given:
>
>        if (test) goto E; // x not live
>        x = big();
>        goto L;
>        E:
>        x = b;
>        goto M;
>
>
> ifcvt wants to turn it into:
>
>        x = b;
>        if (test) goto M;
>        x = big();
>        goto L;
>
> ie, we speculate x = b and remove a branch.
>
> Similarly:
>
>        if (test) goto over; // x not live
>        x = a;
>        goto label;
>        over:
>
>   becomes
>
>        x = a;
>        if (! test) goto label;
>
>
> where again we speculate x = a and eliminate a branch.
>
>
> ifcvt has tests to see if the code to speculate is cheap relative to the
> cost of the branch we get to remove.  Unfortunately, that only takes
> into account a static RTX cost.   We need to be looking at the branch
> prediction too -- often the branch we're going to eliminate isn't going
> to be executed at all!
>
> Specifically, we should take the cost of the branch we want to eliminate
> and scale that by how often we expect to reach that branch at runtime.
> That allows us to compare the runtime cost of the speculative code vs
> the runtime benefit of eliminating the branch.
>
> Looking at branch & insn counts before/after that change is quite
> interesting.   I've got gcc-4.6 built with/without the attached change.
>  I then use that gcc-4.6 to compile a bunch of .i files under the
> watchful eye of valgrind.
>
> Using this data we can see the actual costs...  For every dynamic branch
> eliminated, we had to execute an additional 300 instructions!
> Again, remember these are dynamic counts, so we may have only speculated
> one static instruction to eliminate one branch, but we hit that
> speculated instruction far more often dynamically than the branch we
> ultimately eliminated.
>
>
>
> instructions w/o patch:1267286482371
> instructions w   patch:1264140292731
>
> branches w/o     patch: 231180206508
> branches w       patch: 231190636813
>
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  OK for
> trunk?

We don't have a way to distinguish branch-taken vs. branch-not-take
costs, right?
I would expect that for non-pipelined archs the branch does have the same cost
all the time, so ifcvt would be correct, no? Do you happen to know how valgrind
counts branches here?

The patch itself looks sensible, though I am surprised ifcvt doesn't run in
cfglayout mode (so you have to use reg notes to find probabilities ...)

Thanks,
Richard.

>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iQEcBAEBAgAGBQJOSt2sAAoJEBRtltQi2kC7UZUIAJ7fthVsCXxU3JOtIVbUSX5t
> grCG73peQnBB7FhB58/jW1GJWc011mExLIJf74FDrNU+gMp3gn01L0zdjcaytmY6
> sNjso7dLjW42a/wByzNlHNUy2KRMUqhobEhHYWgC0tMJFz8/ekCulI7h98pVISmT
> np9G/1zRXn3uD7F3pKw7lLDS994nSUmjObPFIyFxTfVGhBTWZYY8JjKP7NsOCNli
> Dr2BXFF4rahoSDUlcLHwPPBJPABLvxwvMo0dsmNkB3HEiajj7qVPGUYaGrTJ5M1g
> Bvww+ozzJRT96qQ/smjVZutD2Cu/U74Ix6EX8Yj54Zp2HeX8tCJ3kXWPFI6cBpk=
> =3BEA
> -----END PGP SIGNATURE-----
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: RFA: Avoiding unprofitable speculation
  2011-08-17  9:34 ` Richard Guenther
@ 2011-08-17 22:50   ` Jeff Law
  2011-08-18 22:59   ` Richard Henderson
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff Law @ 2011-08-17 22:50 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/17/11 01:21, Richard Guenther wrote:

> 
> We don't have a way to distinguish branch-taken vs. branch-not-take 
> costs, right?
Not that I'm aware of.  However, BRANCH_COST does allow the backend to
change the cost of a branch based on its predictability.


> I would expect that for non-pipelined archs the branch does have the
> same cost all the time, so ifcvt would be correct, no?
The branch we're able to eliminate is typically (always?) an
unconditional branch.  The unconditional branch we eliminate is only
executed some percentage of the time based on the conditional branch
that remains in the stream.

So regardless of the underlying cost of the branch we eliminate, we
still want to compare the cost of the speculated insns against the
scaled cost of the branch we're able to eliminate.



> Do you happen to know how valgrind counts branches here?
I don't know all the low level details -- I'd have to sit down with the
annotation code in the cachegrind plugin.  Presumably since valgrind
also attempts to model branch prediction it carries a number of counters
associated with each branch which get bumped as needed by the
instrumented code.

Jeff
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJOTBW/AAoJEBRtltQi2kC70kIH/183zcbFr5NiHbM7JO9xkGoL
lxiwEKsWW3m5x/PYRb+S82prPjRI/2ZzcnDE+ZzjffF+W2agOCdE29DvFjm8JkdI
bUwAMPUrxip5y9iZSwreywtbm73yw/9GTkkr+oHYZupqTUbbC3rw3kV5f/DJq/xP
jmzFPeK1U1Glmus9mWruiSwRloyh2o5usysdnB7aRhO/KdH1jWG7EfZ7cvfQhSWf
u8IYkxRsdQD/xd+6TpxOgmRj8kjJlYw0oAMjNsGkiNnNqyqzZBjOe6sHE59IWc27
Z35HrpRvXevuImE6XQF6KmBWiK1cjExVdmlnZMTuOdy/tXklmp0zLP1EAbP5Sd8=
=o4mr
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: RFA: Avoiding unprofitable speculation
  2011-08-17  9:34 ` Richard Guenther
  2011-08-17 22:50   ` Jeff Law
@ 2011-08-18 22:59   ` Richard Henderson
  2011-08-19 16:49     ` Jeff Law
  2011-09-27  0:11     ` Jeff Law
  1 sibling, 2 replies; 7+ messages in thread
From: Richard Henderson @ 2011-08-18 22:59 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Jeff Law, gcc-patches

On 08/17/2011 12:21 AM, Richard Guenther wrote:
> The patch itself looks sensible, though I am surprised ifcvt doesn't run in
> cfglayout mode (so you have to use reg notes to find probabilities ...)

It does run in cfglayout mode.

Jeff, I believe you're supposed to get the probabilities from some combination of

 bb->frequency
 edge->probability
 EDGE_FREQUENCY(edge)


r~

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: RFA: Avoiding unprofitable speculation
  2011-08-18 22:59   ` Richard Henderson
@ 2011-08-19 16:49     ` Jeff Law
  2011-09-27  0:11     ` Jeff Law
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff Law @ 2011-08-19 16:49 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Richard Guenther, gcc-patches

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/18/11 15:59, Richard Henderson wrote:
> On 08/17/2011 12:21 AM, Richard Guenther wrote:
>> The patch itself looks sensible, though I am surprised ifcvt
>> doesn't run in cfglayout mode (so you have to use reg notes to find
>> probabilities ...)
> 
> It does run in cfglayout mode.
> 
> Jeff, I believe you're supposed to get the probabilities from some
> combination of
> 
> bb->frequency edge->probability EDGE_FREQUENCY(edge)
I was just utilizing code that was already sprinkled all over ifcvt.c to
get the probabilities.

Changing it to edge->probability should be trivial.

Jeff
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJOToNVAAoJEBRtltQi2kC7VJwH/itfjDUdJW+JavPAzXAOsMO2
JjSEGVtmokaxQoj2aVpF2uV8y0LyRY2eWQK2kNu9JC6B5BPXiWKknqJmDkAUTLO4
TAjbXrYcghcFoqQMBCC9yolyhbi5LxbIcFs7KT72s3vigOD6TSFRJZk/a02uyLNb
JxW3X8lB+K81LQKzLqNgJEW7k2FmbFYXDEJp0MZq+Y+un3vUTWytyX0Zbm6/caJc
cFxG+qi4HDCKCMBxwkuoxV+T+bEpW+VhIJecIVbIIU/GbzXJO9O+IgAJUMOPGIsh
lj239VeinlwE+4SGdwZQWTfJnmZRkDR3qS3xL67QlSLWpaM393D6uoUhRUfZlyg=
=uFvC
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: RFA: Avoiding unprofitable speculation
  2011-08-18 22:59   ` Richard Henderson
  2011-08-19 16:49     ` Jeff Law
@ 2011-09-27  0:11     ` Jeff Law
  2011-09-27 13:51       ` Richard Guenther
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff Law @ 2011-09-27  0:11 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Richard Guenther, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1294 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/18/11 15:59, Richard Henderson wrote:
> On 08/17/2011 12:21 AM, Richard Guenther wrote:
>> The patch itself looks sensible, though I am surprised ifcvt
>> doesn't run in cfglayout mode (so you have to use reg notes to
>> find probabilities ...)
> 
> It does run in cfglayout mode.
> 
> Jeff, I believe you're supposed to get the probabilities from some
> combination of
> 
> bb->frequency edge->probability EDGE_FREQUENCY(edge)
OK.  Here's the revised patch.  There's other places in ifcvt.c that
utilize the notes that I didn't modify.

Bootstrapped & regression tested x86_64-unknown-linux-gnu.  Also
verified performance data hasn't changed materially.

OK for trunk?

Thanks,
jeff


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJOgPzuAAoJEBRtltQi2kC7uQkH+wTyLl3MwKkqC11ozyOAVSQ9
CrlwLAuN7JV7kZcguKZ9GpXYHePZHZHCISXaVv3LQQBXnE7PehJOWsd1D5BQRv2/
eqVHIAOYg0LamY2cRiWW8pKMiMjs7vb9q0fiehQGg0zxAJMc9crBjwLPGFjZAksw
UNzzon/NKfSMYsz9X/olDfk8DPa1DmAjBnNOcHzKLGdx7KDa6Npo20k3D/PwDbIe
y1Ff9pZBXJP6tNU+0cn9lyyt+w6ghFQRkpKJoJ6iSOxKQ6v23+03o4sT5GHv3Gvy
07R6NJU9vqt7a9GvxcyJ9BsOlCCJ/pA/4lProHrdcAZrOYWZAA4uoJeDBPOEq+g=
=a5LI
-----END PGP SIGNATURE-----

[-- Attachment #2: Q --]
[-- Type: text/plain, Size: 6923 bytes --]

	* ifcvt.c (cheap_bb_rtx_cost_p): Add SCALE argument.  Scale
	non-jumping insns by REG_BR_PROB_BASE and the maximum cost
	by SCALE.
	(find_if_case_1): Use the probability of the THEN clause when
	determining if speculation if profitable.
	(find_if_case_2): Similarly for the ELSE clause.
	
Index: ifcvt.c
===================================================================
*** ifcvt.c	(revision 179218)
--- ifcvt.c	(working copy)
*************** static int cond_exec_changed_p;
*** 85,91 ****
  
  /* Forward references.  */
  static int count_bb_insns (const_basic_block);
! static bool cheap_bb_rtx_cost_p (const_basic_block, int);
  static rtx first_active_insn (basic_block);
  static rtx last_active_insn (basic_block, int);
  static rtx find_active_insn_before (basic_block, rtx);
--- 85,91 ----
  
  /* Forward references.  */
  static int count_bb_insns (const_basic_block);
! static bool cheap_bb_rtx_cost_p (const_basic_block, int, int);
  static rtx first_active_insn (basic_block);
  static rtx last_active_insn (basic_block, int);
  static rtx find_active_insn_before (basic_block, rtx);
*************** count_bb_insns (const_basic_block bb)
*** 131,150 ****
  
  /* Determine whether the total insn_rtx_cost on non-jump insns in
     basic block BB is less than MAX_COST.  This function returns
!    false if the cost of any instruction could not be estimated.  */
  
  static bool
! cheap_bb_rtx_cost_p (const_basic_block bb, int max_cost)
  {
    int count = 0;
    rtx insn = BB_HEAD (bb);
    bool speed = optimize_bb_for_speed_p (bb);
  
    while (1)
      {
        if (NONJUMP_INSN_P (insn))
  	{
! 	  int cost = insn_rtx_cost (PATTERN (insn), speed);
  	  if (cost == 0)
  	    return false;
  
--- 131,161 ----
  
  /* Determine whether the total insn_rtx_cost on non-jump insns in
     basic block BB is less than MAX_COST.  This function returns
!    false if the cost of any instruction could not be estimated. 
! 
!    The cost of the non-jump insns in BB is scaled by REG_BR_PROB_BASE
!    as those insns are being speculated.  MAX_COST is scaled with SCALE
!    plus a small fudge factor.  */
  
  static bool
! cheap_bb_rtx_cost_p (const_basic_block bb, int scale, int max_cost)
  {
    int count = 0;
    rtx insn = BB_HEAD (bb);
    bool speed = optimize_bb_for_speed_p (bb);
  
+   /* Our branch probability/scaling factors are just estimates and don't
+      account for cases where we can get speculation for free and other
+      secondary benefits.  So we fudge the scale factor to make speculating
+      appear a little more profitable.  */
+   scale += REG_BR_PROB_BASE / 8;
+   max_cost *= scale;
+ 
    while (1)
      {
        if (NONJUMP_INSN_P (insn))
  	{
! 	  int cost = insn_rtx_cost (PATTERN (insn), speed) * REG_BR_PROB_BASE;
  	  if (cost == 0)
  	    return false;
  
*************** find_if_case_1 (basic_block test_bb, edg
*** 3796,3803 ****
    basic_block then_bb = then_edge->dest;
    basic_block else_bb = else_edge->dest;
    basic_block new_bb;
    rtx else_target = NULL_RTX;
-   int then_bb_index;
  
    /* If we are partitioning hot/cold basic blocks, we don't want to
       mess up unconditional or indirect jumps that cross between hot
--- 3807,3814 ----
    basic_block then_bb = then_edge->dest;
    basic_block else_bb = else_edge->dest;
    basic_block new_bb;
+   int then_bb_index, then_prob;
    rtx else_target = NULL_RTX;
  
    /* If we are partitioning hot/cold basic blocks, we don't want to
       mess up unconditional or indirect jumps that cross between hot
*************** find_if_case_1 (basic_block test_bb, edg
*** 3840,3847 ****
  	     "\nIF-CASE-1 found, start %d, then %d\n",
  	     test_bb->index, then_bb->index);
  
!   /* THEN is small.  */
!   if (! cheap_bb_rtx_cost_p (then_bb,
  	COSTS_N_INSNS (BRANCH_COST (optimize_bb_for_speed_p (then_edge->src),
  				    predictable_edge_p (then_edge)))))
      return FALSE;
--- 3851,3864 ----
  	     "\nIF-CASE-1 found, start %d, then %d\n",
  	     test_bb->index, then_bb->index);
  
!   if (then_edge->probability)
!     then_prob = REG_BR_PROB_BASE - then_edge->probability;
!   else
!     then_prob = REG_BR_PROB_BASE / 2;
! 
!   /* We're speculating from the THEN path, we want to make sure the cost
!      of speculation is within reason.  */
!   if (! cheap_bb_rtx_cost_p (then_bb, then_prob,
  	COSTS_N_INSNS (BRANCH_COST (optimize_bb_for_speed_p (then_edge->src),
  				    predictable_edge_p (then_edge)))))
      return FALSE;
*************** find_if_case_2 (basic_block test_bb, edg
*** 3910,3916 ****
    basic_block then_bb = then_edge->dest;
    basic_block else_bb = else_edge->dest;
    edge else_succ;
!   rtx note;
  
    /* If we are partitioning hot/cold basic blocks, we don't want to
       mess up unconditional or indirect jumps that cross between hot
--- 3927,3933 ----
    basic_block then_bb = then_edge->dest;
    basic_block else_bb = else_edge->dest;
    edge else_succ;
!   int then_prob, else_prob;
  
    /* If we are partitioning hot/cold basic blocks, we don't want to
       mess up unconditional or indirect jumps that cross between hot
*************** find_if_case_2 (basic_block test_bb, edg
*** 3949,3957 ****
    if (then_bb->index < NUM_FIXED_BLOCKS)
      return FALSE;
  
    /* ELSE is predicted or SUCC(ELSE) postdominates THEN.  */
!   note = find_reg_note (BB_END (test_bb), REG_BR_PROB, NULL_RTX);
!   if (note && INTVAL (XEXP (note, 0)) >= REG_BR_PROB_BASE / 2)
      ;
    else if (else_succ->dest->index < NUM_FIXED_BLOCKS
  	   || dominated_by_p (CDI_POST_DOMINATORS, then_bb,
--- 3966,3984 ----
    if (then_bb->index < NUM_FIXED_BLOCKS)
      return FALSE;
  
+   if (else_edge->probability)
+     {
+       else_prob = else_edge->probability;
+       then_prob = REG_BR_PROB_BASE - else_prob;
+     }
+   else
+     {
+       else_prob = REG_BR_PROB_BASE / 2;
+       then_prob = REG_BR_PROB_BASE / 2;
+     }
+ 
    /* ELSE is predicted or SUCC(ELSE) postdominates THEN.  */
!   if (else_prob > then_prob)
      ;
    else if (else_succ->dest->index < NUM_FIXED_BLOCKS
  	   || dominated_by_p (CDI_POST_DOMINATORS, then_bb,
*************** find_if_case_2 (basic_block test_bb, edg
*** 3966,3973 ****
  	     "\nIF-CASE-2 found, start %d, else %d\n",
  	     test_bb->index, else_bb->index);
  
!   /* ELSE is small.  */
!   if (! cheap_bb_rtx_cost_p (else_bb,
  	COSTS_N_INSNS (BRANCH_COST (optimize_bb_for_speed_p (else_edge->src),
  				    predictable_edge_p (else_edge)))))
      return FALSE;
--- 3993,4001 ----
  	     "\nIF-CASE-2 found, start %d, else %d\n",
  	     test_bb->index, else_bb->index);
  
!   /* We're speculating from the ELSE path, we want to make sure the cost
!      of speculation is within reason.  */
!   if (! cheap_bb_rtx_cost_p (else_bb, else_prob,
  	COSTS_N_INSNS (BRANCH_COST (optimize_bb_for_speed_p (else_edge->src),
  				    predictable_edge_p (else_edge)))))
      return FALSE;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: RFA: Avoiding unprofitable speculation
  2011-09-27  0:11     ` Jeff Law
@ 2011-09-27 13:51       ` Richard Guenther
  0 siblings, 0 replies; 7+ messages in thread
From: Richard Guenther @ 2011-09-27 13:51 UTC (permalink / raw)
  To: Jeff Law; +Cc: Richard Henderson, gcc-patches

On Tue, Sep 27, 2011 at 12:30 AM, Jeff Law <law@redhat.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 08/18/11 15:59, Richard Henderson wrote:
>> On 08/17/2011 12:21 AM, Richard Guenther wrote:
>>> The patch itself looks sensible, though I am surprised ifcvt
>>> doesn't run in cfglayout mode (so you have to use reg notes to
>>> find probabilities ...)
>>
>> It does run in cfglayout mode.
>>
>> Jeff, I believe you're supposed to get the probabilities from some
>> combination of
>>
>> bb->frequency edge->probability EDGE_FREQUENCY(edge)
> OK.  Here's the revised patch.  There's other places in ifcvt.c that
> utilize the notes that I didn't modify.
>
> Bootstrapped & regression tested x86_64-unknown-linux-gnu.  Also
> verified performance data hasn't changed materially.
>
> OK for trunk?

Looks good to me.

Thanks,
Richard.

> Thanks,
> jeff
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iQEcBAEBAgAGBQJOgPzuAAoJEBRtltQi2kC7uQkH+wTyLl3MwKkqC11ozyOAVSQ9
> CrlwLAuN7JV7kZcguKZ9GpXYHePZHZHCISXaVv3LQQBXnE7PehJOWsd1D5BQRv2/
> eqVHIAOYg0LamY2cRiWW8pKMiMjs7vb9q0fiehQGg0zxAJMc9crBjwLPGFjZAksw
> UNzzon/NKfSMYsz9X/olDfk8DPa1DmAjBnNOcHzKLGdx7KDa6Npo20k3D/PwDbIe
> y1Ff9pZBXJP6tNU+0cn9lyyt+w6ghFQRkpKJoJ6iSOxKQ6v23+03o4sT5GHv3Gvy
> 07R6NJU9vqt7a9GvxcyJ9BsOlCCJ/pA/4lProHrdcAZrOYWZAA4uoJeDBPOEq+g=
> =a5LI
> -----END PGP SIGNATURE-----
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-09-27 12:57 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-16 22:00 RFA: Avoiding unprofitable speculation Jeff Law
2011-08-17  9:34 ` Richard Guenther
2011-08-17 22:50   ` Jeff Law
2011-08-18 22:59   ` Richard Henderson
2011-08-19 16:49     ` Jeff Law
2011-09-27  0:11     ` Jeff Law
2011-09-27 13:51       ` Richard Guenther

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).