public inbox for gcc-help@gcc.gnu.org
 help / color / mirror / Atom feed
* Question about RTL for bitwise AND
@ 2006-04-13 16:57 Matt Lee
  2006-04-13 20:29 ` Ian Lance Taylor
  0 siblings, 1 reply; 6+ messages in thread
From: Matt Lee @ 2006-04-13 16:57 UTC (permalink / raw)
  To: gcc-help

Hi,

 I am using powerpc-eabi-gcc (version 3.4.1) and I have a question
about the RTL that is produced for,

 test.c
 int a;

 if (a & 2) {
  // Do something
 } else
  // Do something else
 }


 I see in test.c.01.rtl,

 (insn 12 11 13 (set (reg:SI 122)
         (lshiftrt:SI (reg:SI 121)
             (const_int 1 [0x1]))) -1 (nil)
     (nil))

 (insn 13 12 14 (parallel [
             (set (reg:SI 123)
                 (and:SI (reg:SI 122)
                     (const_int 1 [0x1])))
             (clobber (scratch:CC))
         ]) -1 (nil)
     (nil))


 My question is, why is a logical shift right required? Wouldn't a
direct bit-wise AND with const_int 2 suffice?
 I am wondering if I am missing some C expression rules here? This
seems to be happening only for one-hot encoded bitwise ANDs.
 For e.g, if the expression is (a & 3), then I only see RTL generated
for the bitwise AND and there is no right shift. I am using -O3 -S as
compiler switches.
 I do see other architectures produce only the bitwise AND even for
one-hot encoded immediate operands such as 0x2.

 This is causing problems in my (other) port where I can do only
single-bit shifts. In the worst case, a & 0x80000000 the final
assembly contains 31 right shifts. This is a big optimization problem.

 Any advice is much helpful.

 thanks,
 Matt

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question about RTL for bitwise AND
  2006-04-13 16:57 Question about RTL for bitwise AND Matt Lee
@ 2006-04-13 20:29 ` Ian Lance Taylor
  2006-04-13 23:31   ` Matt Lee
  0 siblings, 1 reply; 6+ messages in thread
From: Ian Lance Taylor @ 2006-04-13 20:29 UTC (permalink / raw)
  To: Matt Lee; +Cc: gcc-help

"Matt Lee" <reachmatt.lee@gmail.com> writes:

>  I am using powerpc-eabi-gcc (version 3.4.1) and I have a question
> about the RTL that is produced for,
> 
>  test.c
>  int a;
> 
>  if (a & 2) {
>   // Do something
>  } else
>   // Do something else
>  }
> 
> 
>  I see in test.c.01.rtl,
> 
>  (insn 12 11 13 (set (reg:SI 122)
>          (lshiftrt:SI (reg:SI 121)
>              (const_int 1 [0x1]))) -1 (nil)
>      (nil))
> 
>  (insn 13 12 14 (parallel [
>              (set (reg:SI 123)
>                  (and:SI (reg:SI 122)
>                      (const_int 1 [0x1])))
>              (clobber (scratch:CC))
>          ]) -1 (nil)
>      (nil))
> 
> 
>  My question is, why is a logical shift right required? Wouldn't a
> direct bit-wise AND with const_int 2 suffice?

In general this kind of decision is made based on target specific
costs.  See prefer_and_bit_test in dojump.c.

For the PowerPC, you should look at later optimization passes.  It
seems possible that the two instructions above will get combined into
a single rlwinm instruction.  Although I haven't tried it.

>  This is causing problems in my (other) port where I can do only
> single-bit shifts. In the worst case, a & 0x80000000 the final
> assembly contains 31 right shifts. This is a big optimization problem.

Fix your costs to indicate this.

Ian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question about RTL for bitwise AND
  2006-04-13 20:29 ` Ian Lance Taylor
@ 2006-04-13 23:31   ` Matt Lee
  2006-04-14  3:21     ` Ian Lance Taylor
  0 siblings, 1 reply; 6+ messages in thread
From: Matt Lee @ 2006-04-13 23:31 UTC (permalink / raw)
  To: gcc-help

Hi Ian,

Thanks for the help. I did suspect the costs and reviewd them,

Here is what I have,

    case AND:
    case IOR:
    case XOR:
        *total = COSTS_N_INSNS (1);
        return true;
    case ASHIFT:
    case ASHIFTRT:
    case LSHIFTRT:
         if (GET_CODE (XEXP (x, 1)) == CONST_INT)
             *total = COSTS_N_INSNS (INTVAL (XEXP (x, 1)));
         return true;


This looks quite OK to me. I tried debugging if rtx_costs was doing
something wrong. I do see rtx_cost being invoked for the lshiftrt
expressions in question, but never for the "and". Seems like GCC had
pre-decided to only use lshiftrt even though it is expensive.

Any other ideas? Btw, I couldn't find prefer_and_bit_test in dojump.c.

thanks,
Matt



On 13 Apr 2006 13:29:16 -0700, Ian Lance Taylor <ian@airs.com> wrote:
> "Matt Lee" <reachmatt.lee@gmail.com> writes:
>
> >  I am using powerpc-eabi-gcc (version 3.4.1) and I have a question
> > about the RTL that is produced for,
> >
> >  test.c
> >  int a;
> >
> >  if (a & 2) {
> >   // Do something
> >  } else
> >   // Do something else
> >  }
> >
> >
> >  I see in test.c.01.rtl,
> >
> >  (insn 12 11 13 (set (reg:SI 122)
> >          (lshiftrt:SI (reg:SI 121)
> >              (const_int 1 [0x1]))) -1 (nil)
> >      (nil))
> >
> >  (insn 13 12 14 (parallel [
> >              (set (reg:SI 123)
> >                  (and:SI (reg:SI 122)
> >                      (const_int 1 [0x1])))
> >              (clobber (scratch:CC))
> >          ]) -1 (nil)
> >      (nil))
> >
> >
> >  My question is, why is a logical shift right required? Wouldn't a
> > direct bit-wise AND with const_int 2 suffice?
>
> In general this kind of decision is made based on target specific
> costs.  See prefer_and_bit_test in dojump.c.
>
> For the PowerPC, you should look at later optimization passes.  It
> seems possible that the two instructions above will get combined into
> a single rlwinm instruction.  Although I haven't tried it.
>
> >  This is causing problems in my (other) port where I can do only
> > single-bit shifts. In the worst case, a & 0x80000000 the final
> > assembly contains 31 right shifts. This is a big optimization problem.
>
> Fix your costs to indicate this.
>
> Ian
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question about RTL for bitwise AND
  2006-04-13 23:31   ` Matt Lee
@ 2006-04-14  3:21     ` Ian Lance Taylor
  2006-04-14 21:49       ` Matt Lee
  0 siblings, 1 reply; 6+ messages in thread
From: Ian Lance Taylor @ 2006-04-14  3:21 UTC (permalink / raw)
  To: Matt Lee; +Cc: gcc-help

"Matt Lee" <reachmatt.lee@gmail.com> writes:

> Here is what I have,
> 
>     case AND:
>     case IOR:
>     case XOR:
>         *total = COSTS_N_INSNS (1);
>         return true;
>     case ASHIFT:
>     case ASHIFTRT:
>     case LSHIFTRT:
>          if (GET_CODE (XEXP (x, 1)) == CONST_INT)
>              *total = COSTS_N_INSNS (INTVAL (XEXP (x, 1)));
>          return true;
> 
> 
> This looks quite OK to me. I tried debugging if rtx_costs was doing
> something wrong. I do see rtx_cost being invoked for the lshiftrt
> expressions in question, but never for the "and". Seems like GCC had
> pre-decided to only use lshiftrt even though it is expensive.

Your computations look wrong to me.  You are saying that an AND always
costs 1 instruction.  But you need to actually look at the operands.
Specifically, the compiler is going to compare these two:
    (and (reg ...) (const_int ...)
    (and (ashiftrt (reg ...) (const_int ...)) (const_int 1))

> Any other ideas? Btw, I couldn't find prefer_and_bit_test in dojump.c.

I'm looking at the current version of gcc.  prefer_and_bit_test was
added 2004-03-20, so I guess that just missed 3.4.  I don't have 3.4
sources handy.  In general the decision is made in the do_jump
function.

Ian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question about RTL for bitwise AND
  2006-04-14  3:21     ` Ian Lance Taylor
@ 2006-04-14 21:49       ` Matt Lee
  2006-04-18  4:33         ` Ian Lance Taylor
  0 siblings, 1 reply; 6+ messages in thread
From: Matt Lee @ 2006-04-14 21:49 UTC (permalink / raw)
  To: gcc-help, reachmatt.lee

Hi Ian,

Thanks again. I do agree that my costs calculation do not recurse into
the sub-expressions, but I was just mimicing other ports in this
regard (see rs6000 for instance) . They do not recurse into
sub-expressions as well.

Also, I think I traced the problem to fold-const.c::fold_single_bit_test()

      /* If we have (A & C) != 0 or (A & C) == 0 and C is a power of
         2, then fold the expression into shifts and logical operations.  */
      tem = fold_single_bit_test (code, arg0, arg1, type);
      if (tem)
        return tem;

Commenting the above code out, makes the compiler spit out the right
expression -- direct bitwise AND. Comments at the top of the function
also seem to indicate that this is considered an optimization! 
haven't dug into this to see why this was considered so. This constant
folding might be suboptimal depending on features available on the
target machine.

I have also checked that GCC 4.0.1 code has not changed in this
behavior. The same transformation is present in fold-const.c

Please advise.

On a related note, it looks the if-conversion phase as well, chooses
to use a bunch of shifts and other logical operations to convert a
branch into linear code. This is fine, except that the resultant
operations prove more expensive than a branch. I will take a closer
look at the IFCVT_* macros and see if one of them is useful for
catching these conditions.


thanks,
Matt

On 13 Apr 2006 20:21:41 -0700, Ian Lance Taylor <ian@airs.com> wrote:
> "Matt Lee" <reachmatt.lee@gmail.com> writes:
>
> > Here is what I have,
> >
> >     case AND:
> >     case IOR:
> >     case XOR:
> >         *total = COSTS_N_INSNS (1);
> >         return true;
> >     case ASHIFT:
> >     case ASHIFTRT:
> >     case LSHIFTRT:
> >          if (GET_CODE (XEXP (x, 1)) == CONST_INT)
> >              *total = COSTS_N_INSNS (INTVAL (XEXP (x, 1)));
> >          return true;
> >
> >
> > This looks quite OK to me. I tried debugging if rtx_costs was doing
> > something wrong. I do see rtx_cost being invoked for the lshiftrt
> > expressions in question, but never for the "and". Seems like GCC had
> > pre-decided to only use lshiftrt even though it is expensive.
>
> Your computations look wrong to me.  You are saying that an AND always
> costs 1 instruction.  But you need to actually look at the operands.
> Specifically, the compiler is going to compare these two:
>     (and (reg ...) (const_int ...)
>     (and (ashiftrt (reg ...) (const_int ...)) (const_int 1))
>
> > Any other ideas? Btw, I couldn't find prefer_and_bit_test in dojump.c.
>
> I'm looking at the current version of gcc.  prefer_and_bit_test was
> added 2004-03-20, so I guess that just missed 3.4.  I don't have 3.4
> sources handy.  In general the decision is made in the do_jump
> function.
>
> Ian
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question about RTL for bitwise AND
  2006-04-14 21:49       ` Matt Lee
@ 2006-04-18  4:33         ` Ian Lance Taylor
  0 siblings, 0 replies; 6+ messages in thread
From: Ian Lance Taylor @ 2006-04-18  4:33 UTC (permalink / raw)
  To: Matt Lee; +Cc: gcc-help

"Matt Lee" <reachmatt.lee@gmail.com> writes:

> Thanks again. I do agree that my costs calculation do not recurse into
> the sub-expressions, but I was just mimicing other ports in this
> regard (see rs6000 for instance) . They do not recurse into
> sub-expressions as well.

In the current mainline, the costs calculation in rs6000.c returns
false for AND and friends.  I haven't looked at the 3.4 sources (I
don't have them checked out).

> Also, I think I traced the problem to fold-const.c::fold_single_bit_test()
> 
>       /* If we have (A & C) != 0 or (A & C) == 0 and C is a power of
>          2, then fold the expression into shifts and logical operations.  */
>       tem = fold_single_bit_test (code, arg0, arg1, type);
>       if (tem)
>         return tem;

You're right, it looks like this is fixed in 4.1 and current mainline
with the patch in PR 14846.

> On a related note, it looks the if-conversion phase as well, chooses
> to use a bunch of shifts and other logical operations to convert a
> branch into linear code. This is fine, except that the resultant
> operations prove more expensive than a branch. I will take a closer
> look at the IFCVT_* macros and see if one of them is useful for
> catching these conditions.

If-conversion also uses costs to make these determinations.  The thing
to tweak is BRANCH_COST, and perhaps MAX_CONDITIONAL_EXECUTE.

Hope this helps.

Ian

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2006-04-18  4:33 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-04-13 16:57 Question about RTL for bitwise AND Matt Lee
2006-04-13 20:29 ` Ian Lance Taylor
2006-04-13 23:31   ` Matt Lee
2006-04-14  3:21     ` Ian Lance Taylor
2006-04-14 21:49       ` Matt Lee
2006-04-18  4:33         ` Ian Lance Taylor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).