Re: [PATCH][RFC] Bit CCP and pointer alignment propagation

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
@ 2010-08-04  7:43 Jay Foad
  2010-08-04  8:18 ` Richard Guenther
  0 siblings, 1 reply; 21+ messages in thread
From: Jay Foad @ 2010-08-04  7:43 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Richard Guenther, gcc-patches

> I'm sure that you can optimize the addition, I'll think about it.

  /* Do the addition with unknown bits set to zero, to give carry-ins of zero
     wherever possible.  */
  lo = double_int_add(r1val.value & ~r1val.bit_lattice_val,
                      r2val.value & ~r2val.bit_lattice_val);

  /* Do the addition with unknown bits set to one, to give carry-ins of one
     wherever possible.  */
  hi = double_int_add(r1val.value | r1val.bit_lattice_val,
                      r2val.value | r2val.bit_lattice_val);

  /* Each bit in the result is known if (a) the corresponding bits in both
     inputs are known, and (b) the carry-in to that bit position is known. We
     can check condition (b) by seeing if we got the same result with minimised
     carries as with maximised carries.  */
  val.bit_lattice_val = r1val.bit_lattive_val | r2val.bit_lattive_val
    | (lo ^ hi);

  /* It shouldn't matter whether we choose lo or hi here.  */
  gcc_assert(((lo ^ hi) & ~val.bit_lattice_val) == 0);
  val.value = lo;

Jay.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-08-04  7:43 [PATCH][RFC] Bit CCP and pointer alignment propagation Jay Foad
@ 2010-08-04  8:18 ` Richard Guenther
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Guenther @ 2010-08-04  8:18 UTC (permalink / raw)
  To: Jay Foad; +Cc: Paolo Bonzini, gcc-patches

On Wed, 4 Aug 2010, Jay Foad wrote:

> > I'm sure that you can optimize the addition, I'll think about it.
> 
>   /* Do the addition with unknown bits set to zero, to give carry-ins of zero
>      wherever possible.  */
>   lo = double_int_add(r1val.value & ~r1val.bit_lattice_val,
>                       r2val.value & ~r2val.bit_lattice_val);
> 
>   /* Do the addition with unknown bits set to one, to give carry-ins of one
>      wherever possible.  */
>   hi = double_int_add(r1val.value | r1val.bit_lattice_val,
>                       r2val.value | r2val.bit_lattice_val);
> 
>   /* Each bit in the result is known if (a) the corresponding bits in both
>      inputs are known, and (b) the carry-in to that bit position is known. We
>      can check condition (b) by seeing if we got the same result with minimised
>      carries as with maximised carries.  */
>   val.bit_lattice_val = r1val.bit_lattive_val | r2val.bit_lattive_val
>     | (lo ^ hi);
> 
>   /* It shouldn't matter whether we choose lo or hi here.  */
>   gcc_assert(((lo ^ hi) & ~val.bit_lattice_val) == 0);
>   val.value = lo;

Nice - thanks.

Richard.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-08-04 13:05 Jay Foad
@ 2010-08-04 13:11 ` Richard Guenther
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Guenther @ 2010-08-04 13:11 UTC (permalink / raw)
  To: Jay Foad; +Cc: gcc-patches

On Wed, 4 Aug 2010, Jay Foad wrote:

> > > > + typedef struct bit_prop_value_d {
> > > > +   /* Lattice value.  */
> > > > +   unsigned lattice_val;
> > > > +
> > > > +   /* Bit lattice value.  Zero for constant, one for varying.  */
> > > > +   double_int bit_lattice_val;
> > >
> > > I always think of this member as a mask, why not call it so?  At runtime
> > > for a value X that has MASK and VALUE in the lattice this holds:
> > > (X & ~MASK) == VALUE .  Actually I'm not sure you ensure this as you
> > > sometimes only adjust the mask, but not value, so you we might know only
> > > this: (X & ~MASK) == (VALUE & ~MASK).
> >
> > That's true, varying value bits have undefined content.
> 
> If you define a canonical form where varying value bits have to be zero:
> 
>   gcc_assert((foo.value & foo.bit_lattice_val) == 0);
> 
> then you can use non-canonical values in the (bit_lattice_val, value)
> pair to represent different lattice values. For example:
> 
> lattice_val   : represented by (bit_lattice_val, value)
> -------------------------------------------------------
> UNINITIALIZED : (-1, -2)
> UNDEFINED     : (-1, -1)
> CONSTANT      : any (blv, v) where v & blv == 0
> VARYING       : (-1, 1)
> 
> With this scheme (v & blv) + 2 gives you the lattice value as you have
> it now.  Then you don't need the lattice_val field in struct
> bit_prop_value_d, which should make them pack together much better.
> Dunno whether it's worth the obfuscation, though.

IMHO it's not.  You also would loose the convenient trick to
be able to sign-/zero-extend the mask to treat unknown sign-bits
correctly.

Richard.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
@ 2010-08-04 13:05 Jay Foad
  2010-08-04 13:11 ` Richard Guenther
  0 siblings, 1 reply; 21+ messages in thread
From: Jay Foad @ 2010-08-04 13:05 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

> > > + typedef struct bit_prop_value_d {
> > > +   /* Lattice value.  */
> > > +   unsigned lattice_val;
> > > +
> > > +   /* Bit lattice value.  Zero for constant, one for varying.  */
> > > +   double_int bit_lattice_val;
> >
> > I always think of this member as a mask, why not call it so?  At runtime
> > for a value X that has MASK and VALUE in the lattice this holds:
> > (X & ~MASK) == VALUE .  Actually I'm not sure you ensure this as you
> > sometimes only adjust the mask, but not value, so you we might know only
> > this: (X & ~MASK) == (VALUE & ~MASK).
>
> That's true, varying value bits have undefined content.

If you define a canonical form where varying value bits have to be zero:

  gcc_assert((foo.value & foo.bit_lattice_val) == 0);

then you can use non-canonical values in the (bit_lattice_val, value)
pair to represent different lattice values. For example:

lattice_val   : represented by (bit_lattice_val, value)
-------------------------------------------------------
UNINITIALIZED : (-1, -2)
UNDEFINED     : (-1, -1)
CONSTANT      : any (blv, v) where v & blv == 0
VARYING       : (-1, 1)

With this scheme (v & blv) + 2 gives you the lattice value as you have
it now.  Then you don't need the lattice_val field in struct
bit_prop_value_d, which should make them pack together much better.
Dunno whether it's worth the obfuscation, though.

Jay.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 16:49           ` Joseph S. Myers
@ 2010-07-30 18:06             ` Paolo Bonzini
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2010-07-30 18:06 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Richard Henderson, Richard Guenther, gcc-patches

On 07/30/2010 06:41 PM, Joseph S. Myers wrote:
> On Fri, 30 Jul 2010, Richard Henderson wrote:
>
>> On 07/30/2010 06:15 AM, Richard Guenther wrote:
>>> I think we can have negative shift counts (at least the constant folding
>>> code suggests so), this is why I have the code as-is.
>>
>> VAX has them; I can't recall any other target that does.
>>
>> Almost all truncate the shift count at some bit position
>> (sometimes at the largest supported shift count, e.g.
>> truncate to 6 bits even for a 32-bit shift because there
>> exists a 64-bit shift; TARGET_SHIFT_TRUNCATION_MASK will
>> tell you about this).
>
> ARM NEON vector shifts can have negative shift counts (and also truncate).
> The VSHL (register) instruction takes its shift counts from the least
> significant byte of each element of the second vector, sign-extended (so
> right shifts are represented by shift counts whose low byte is negative).
>
> Obviously that is a description of the instruction semantics, not of RTL
> or GIMPLE.

Yes, that's the crux.  If you wanted to access it you could with an 
intrinsic, not with << and >>.  If it is better for us (e.g. simpler or 
less code), we should define our intermediate representation to disallow 
negative shifts.  It will make GIMPLE and RTL slightly worse for VAX, 
and slightly better for everything else.  Targets that care about 
negative shifts can use a builtin or unspec respectively.

For the case at hand of course it's not too contorted to allow negative 
shifts, so all this is theoretical.

Paolo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 16:23         ` Richard Henderson
  2010-07-30 16:38           ` Richard Guenther
@ 2010-07-30 16:49           ` Joseph S. Myers
  2010-07-30 18:06             ` Paolo Bonzini
  1 sibling, 1 reply; 21+ messages in thread
From: Joseph S. Myers @ 2010-07-30 16:49 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Richard Guenther, Paolo Bonzini, gcc-patches

On Fri, 30 Jul 2010, Richard Henderson wrote:

> On 07/30/2010 06:15 AM, Richard Guenther wrote:
> > I think we can have negative shift counts (at least the constant folding
> > code suggests so), this is why I have the code as-is.
> 
> VAX has them; I can't recall any other target that does.
> 
> Almost all truncate the shift count at some bit position
> (sometimes at the largest supported shift count, e.g. 
> truncate to 6 bits even for a 32-bit shift because there
> exists a 64-bit shift; TARGET_SHIFT_TRUNCATION_MASK will
> tell you about this).

ARM NEON vector shifts can have negative shift counts (and also truncate).  
The VSHL (register) instruction takes its shift counts from the least 
significant byte of each element of the second vector, sign-extended (so 
right shifts are represented by shift counts whose low byte is negative).

Obviously that is a description of the instruction semantics, not of RTL 
or GIMPLE.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 16:23         ` Richard Henderson
@ 2010-07-30 16:38           ` Richard Guenther
  2010-07-30 16:49           ` Joseph S. Myers
  1 sibling, 0 replies; 21+ messages in thread
From: Richard Guenther @ 2010-07-30 16:38 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Paolo Bonzini, gcc-patches

On Fri, 30 Jul 2010, Richard Henderson wrote:

> On 07/30/2010 06:15 AM, Richard Guenther wrote:
> > I think we can have negative shift counts (at least the constant folding
> > code suggests so), this is why I have the code as-is.
> 
> VAX has them; I can't recall any other target that does.
> 
> Almost all truncate the shift count at some bit position
> (sometimes at the largest supported shift count, e.g. 
> truncate to 6 bits even for a 32-bit shift because there
> exists a 64-bit shift; TARGET_SHIFT_TRUNCATION_MASK will
> tell you about this).
> 
> A very few actually check all bits of the value and treat
> very large (unsigned) shift counts as producing a zero.
> 
> It does seem like a large code burden to carry around just
> for one (or a few) targets, when instead we could simply
> treat the value as implementation-defined (varying in this
> case, unfolded in the other case).

Indeed.  We can just interpret the shift count as unsigned
(even forcing unsigned types in the middle-end for the
2nd operand of the shift and rotate codes).

Richard.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 13:29       ` Richard Guenther
  2010-07-30 13:38         ` Paolo Bonzini
@ 2010-07-30 16:23         ` Richard Henderson
  2010-07-30 16:38           ` Richard Guenther
  2010-07-30 16:49           ` Joseph S. Myers
  1 sibling, 2 replies; 21+ messages in thread
From: Richard Henderson @ 2010-07-30 16:23 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Paolo Bonzini, gcc-patches

On 07/30/2010 06:15 AM, Richard Guenther wrote:
> I think we can have negative shift counts (at least the constant folding
> code suggests so), this is why I have the code as-is.

VAX has them; I can't recall any other target that does.

Almost all truncate the shift count at some bit position
(sometimes at the largest supported shift count, e.g. 
truncate to 6 bits even for a 32-bit shift because there
exists a 64-bit shift; TARGET_SHIFT_TRUNCATION_MASK will
tell you about this).

A very few actually check all bits of the value and treat
very large (unsigned) shift counts as producing a zero.

It does seem like a large code burden to carry around just
for one (or a few) targets, when instead we could simply
treat the value as implementation-defined (varying in this
case, unfolded in the other case).

r~

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 14:24             ` Paolo Bonzini
@ 2010-07-30 14:51               ` Richard Guenther
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Guenther @ 2010-07-30 14:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc-patches

On Fri, 30 Jul 2010, Paolo Bonzini wrote:

> On 07/30/2010 03:39 PM, Richard Guenther wrote:
> > On Fri, 30 Jul 2010, Paolo Bonzini wrote:
> > 
> > > On 07/30/2010 03:15 PM, Richard Guenther wrote:
> > > > I think we can have negative shift counts (at least the constant folding
> > > > code suggests so), this is why I have the code as-is.
> > > 
> > > No, that seems very weird.  Sure expand does not handle it, and the
> > > implementation-defined section of the manual does not mention it.  I'm
> > > more
> > > inclined to consider it a historical wart, given this comment:
> > > 
> > >    /* Previously detected shift-counts computed by NEGATE_EXPR
> > >       and shifted in the other direction; but that does not work
> > >       on all machines.  */
> > > 
> > > dating back to the beginning of the GCC repo.  I wonder what the attached
> > > patch would do.
> > 
> > Maybe - at least with 2<<  -1 we only warn:
> > 
> > t.c:3: warning: left shift count is negative
> > 
> > and happily continue, doing a constant right shift.
> 
> So the patch would crash.  The right solution is probably to change the shift
> count to unsigned and fix the fallout.

Yes.  Btw, the intel compiler warns the same but treats the shift
count unsigned (at least it produces zero, not 1).

Richard.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 13:39   ` Richard Guenther
@ 2010-07-30 14:26     ` Michael Matz
  0 siblings, 0 replies; 21+ messages in thread
From: Michael Matz @ 2010-07-30 14:26 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

Hi,

On Fri, 30 Jul 2010, Richard Guenther wrote:

> > > + bit_prop_value_convert (tree type, bit_prop_value_t rval, tree rval_type)
> > > + {
> > > +   bit_prop_value_t val;
> > > +   bool uns;
> > > + 
> > > +   /* First extend lattice and value according to the original type.  */
> > > +   uns = (TREE_CODE (rval_type) == INTEGER_TYPE && TYPE_IS_SIZETYPE (rval_type)
> > > + 	 ? 0 : TYPE_UNSIGNED (rval_type));
> > > +   val.bit_lattice_val = double_int_ext (rval.bit_lattice_val,
> > > + 					TYPE_PRECISION (rval_type), uns);
> > 
> > ... but here you extend until the width of double_int.  Perhaps you want 
> > to mask out the upper bits again?
> > (And a lattice can't be sign/zero-extended :)  A mask could be.)
> 
> Well - if the value is signed and we do not know its sign bit then
> we have to sign extend the lattice.

You can't extend a lattice, because a lattice is a set of elements with 
some properties.  I was refering to the comment "/* First extend lattice 
...".

But yes, I didn't notice that your use of TYPE_PRECISION in generating the 
initial mask was only in the context of pointer types, hence unsigned.

> > Here you test the high bit of the mask, but above you fill the mask 
> > only upto TYPE_PRECISION.  I'm not sure if it matters for correctness, 
> > though.
> 
> Where exactly?  The lattice should always be extended as to how the
> value is extended.

Yeah, see above, I was confused.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 13:45           ` Richard Guenther
@ 2010-07-30 14:24             ` Paolo Bonzini
  2010-07-30 14:51               ` Richard Guenther
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2010-07-30 14:24 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

On 07/30/2010 03:39 PM, Richard Guenther wrote:
> On Fri, 30 Jul 2010, Paolo Bonzini wrote:
>
>> On 07/30/2010 03:15 PM, Richard Guenther wrote:
>>> I think we can have negative shift counts (at least the constant folding
>>> code suggests so), this is why I have the code as-is.
>>
>> No, that seems very weird.  Sure expand does not handle it, and the
>> implementation-defined section of the manual does not mention it.  I'm more
>> inclined to consider it a historical wart, given this comment:
>>
>>    /* Previously detected shift-counts computed by NEGATE_EXPR
>>       and shifted in the other direction; but that does not work
>>       on all machines.  */
>>
>> dating back to the beginning of the GCC repo.  I wonder what the attached
>> patch would do.
>
> Maybe - at least with 2<<  -1 we only warn:
>
> t.c:3: warning: left shift count is negative
>
> and happily continue, doing a constant right shift.

So the patch would crash.  The right solution is probably to change the 
shift count to unsigned and fix the fallout.

Paolo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 13:38         ` Paolo Bonzini
@ 2010-07-30 13:45           ` Richard Guenther
  2010-07-30 14:24             ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Richard Guenther @ 2010-07-30 13:45 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc-patches

On Fri, 30 Jul 2010, Paolo Bonzini wrote:

> On 07/30/2010 03:15 PM, Richard Guenther wrote:
> > I think we can have negative shift counts (at least the constant folding
> > code suggests so), this is why I have the code as-is.
> 
> No, that seems very weird.  Sure expand does not handle it, and the
> implementation-defined section of the manual does not mention it.  I'm more
> inclined to consider it a historical wart, given this comment:
> 
>   /* Previously detected shift-counts computed by NEGATE_EXPR
>      and shifted in the other direction; but that does not work
>      on all machines.  */
> 
> dating back to the beginning of the GCC repo.  I wonder what the attached
> patch would do.

Maybe - at least with 2 << -1 we only warn:

t.c:3: warning: left shift count is negative

and happily continue, doing a constant right shift.  With -1 put
into a temporary we get sall with a negative shift at -O0 (and
zero as a result) and one as result when optimizing.

> BTW the SHIFT_COUNT_TRUNCATED handling is not needed because you get it from
> lshift_double.  However, this opens another small can of worms. lshift_double
> does
> 
>   if (SHIFT_COUNT_TRUNCATED)
>     count %= prec;
> 
> which makes bit-level analysis totally wrong for non-power-of-two precisions,
> including IIRC bitfields bigger than sizeof(int).  I think the shifting
> functions are wrong however, and should round prec up to the next power of two
> before applying the truncation.

Ick.  Indeed ...

Richard.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 13:36 ` Michael Matz
@ 2010-07-30 13:39   ` Richard Guenther
  2010-07-30 14:26     ` Michael Matz
  0 siblings, 1 reply; 21+ messages in thread
From: Richard Guenther @ 2010-07-30 13:39 UTC (permalink / raw)
  To: Michael Matz; +Cc: gcc-patches

On Fri, 30 Jul 2010, Michael Matz wrote:

> Hi,
> 
> On Fri, 30 Jul 2010, Richard Guenther wrote:
> 
> > The pass is heavily copied from CCP and then massaged to only track
> 
> I think they both should be unified.  You duplicate quite some amount of 
> code.  Possibly

I thought about it - it shouldn't be terribly hard.  The core pieces
are inside the new folders.

> > + /* Bitwise conditional constant propagation (CCP) is based on the SSA
> 
> This whole block comment describes CCP, not the bitwise piece of it that 
> would actually be more interesting to quickly describe as overview.

I'll adjust that.

> > + typedef struct bit_prop_value_d {
> > +   /* Lattice value.  */
> > +   unsigned lattice_val;
> > + 
> > +   /* Bit lattice value.  Zero for constant, one for varying.  */
> > +   double_int bit_lattice_val;
> 
> I always think of this member as a mask, why not call it so?  At runtime 
> for a value X that has MASK and VALUE in the lattice this holds:
> (X & ~MASK) == VALUE .  Actually I'm not sure you ensure this as you 
> sometimes only adjust the mask, but not value, so you we might know only 
> this: (X & ~MASK) == (VALUE & ~MASK).

That's true, varying value bits have undefined content.

> > + get_value (tree var)
> > + {
> > +   bit_prop_value_t *val;
> > + 
> > +   if (const_val == NULL)
> > +     return NULL;
> 
> const_val can't be NULL here I think.

Copied from CCP where get_value is used in fold_constant_aggregate_ref ...

Fixed.

> > + 
> > +           case GIMPLE_TERNARY_RHS:
> > +             {
> > +               /* Handle binary operators that can appear in GIMPLE form.  */
> 
> ternary

Copied from CCP.  Fixed.

> > + get_value_for_expr (tree expr)
> > + {
> > +   else if (TREE_CODE (expr) == ADDR_EXPR
> > + 	   && ((align = get_object_alignment (TREE_OPERAND (expr, 0),
> > + 					      BITS_PER_UNIT, BIGGEST_ALIGNMENT))
> > + 	       > BITS_PER_UNIT))
> > +     {
> > +       val.lattice_val = CONSTANT;
> > +       val.bit_lattice_val
> > + 	= double_int_and_not (double_int_mask (TYPE_PRECISION
> > + 					         (TREE_TYPE (expr))),
> > + 			      uhwi_to_double_int (align / BITS_PER_UNIT - 1));
> > +       val.value = double_int_zero;
> 
> So here your mask contains zeros for bits outside the type precision ...

Assuming pointers are unsigned.  Indeed.

> > + bit_prop_value_convert (tree type, bit_prop_value_t rval, tree rval_type)
> > + {
> > +   bit_prop_value_t val;
> > +   bool uns;
> > + 
> > +   /* First extend lattice and value according to the original type.  */
> > +   uns = (TREE_CODE (rval_type) == INTEGER_TYPE && TYPE_IS_SIZETYPE (rval_type)
> > + 	 ? 0 : TYPE_UNSIGNED (rval_type));
> > +   val.bit_lattice_val = double_int_ext (rval.bit_lattice_val,
> > + 					TYPE_PRECISION (rval_type), uns);
> 
> ... but here you extend until the width of double_int.  Perhaps you want 
> to mask out the upper bits again?
> (And a lattice can't be sign/zero-extended :)  A mask could be.)

Well - if the value is signed and we do not know its sign bit then
we have to sign extend the lattice.

> > + evaluate_stmt (gimple stmt)
> > + {
> ...
> > + 	    case LT_EXPR:
> > + 	      /* If the most significant bits are not known we know nothing.  */
> > + 	      if (double_int_negative_p (r1val.bit_lattice_val)
> > + 		  || double_int_negative_p (r2val.bit_lattice_val))
> > + 		break;
> 
> Here you test the high bit of the mask, but above you fill the mask only 
> upto TYPE_PRECISION.  I'm not sure if it matters for correctness, though.

Where exactly?  The lattice should always be extended as to how the
value is extended.

> > +       tem = 1;
> > +       while (tem < (1u << (sizeof (unsigned int) * 8 - 1))
> > + 	     && !(val->bit_lattice_val.low & tem))
> > + 	tem <<= 1;
> > +       if (tem == 1)
> > + 	continue;
> 
> "(x ^ (x-1)) >> 1" will give you the mask of the trailing zero bits (that 
> plus one is your align).

Thanks,
Richard.
k

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 13:29       ` Richard Guenther
@ 2010-07-30 13:38         ` Paolo Bonzini
  2010-07-30 13:45           ` Richard Guenther
  2010-07-30 16:23         ` Richard Henderson
  1 sibling, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2010-07-30 13:38 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1123 bytes --]

On 07/30/2010 03:15 PM, Richard Guenther wrote:
> I think we can have negative shift counts (at least the constant folding
> code suggests so), this is why I have the code as-is.

No, that seems very weird.  Sure expand does not handle it, and the 
implementation-defined section of the manual does not mention it.  I'm 
more inclined to consider it a historical wart, given this comment:

   /* Previously detected shift-counts computed by NEGATE_EXPR
      and shifted in the other direction; but that does not work
      on all machines.  */

dating back to the beginning of the GCC repo.  I wonder what the 
attached patch would do.

BTW the SHIFT_COUNT_TRUNCATED handling is not needed because you get it 
from lshift_double.  However, this opens another small can of worms. 
lshift_double does

   if (SHIFT_COUNT_TRUNCATED)
     count %= prec;

which makes bit-level analysis totally wrong for non-power-of-two 
precisions, including IIRC bitfields bigger than sizeof(int).  I think 
the shifting functions are wrong however, and should round prec up to 
the next power of two before applying the truncation.

Paolo

[-- Attachment #2: lshift-double-neg-count.patch --]
[-- Type: text/plain, Size: 2100 bytes --]

Index: double-int.c
===================================================================
--- double-int.c	(revision 160609)
+++ double-int.c	(working copy)
@@ -314,12 +314,7 @@ lshift_double (unsigned HOST_WIDE_INT l1
 {
   unsigned HOST_WIDE_INT signmask;
 
-  if (count < 0)
-    {
-      rshift_double (l1, h1, -count, prec, lv, hv, arith);
-      return;
-    }
-
+  gcc_assert (count >= 0);
   if (SHIFT_COUNT_TRUNCATED)
     count %= prec;
 
@@ -377,12 +372,7 @@ rshift_double (unsigned HOST_WIDE_INT l1
 {
   unsigned HOST_WIDE_INT signmask;
 
-  if (count < 0)
-    {
-      lshift_double (l1, h1, -count, prec, lv, hv, arith);
-      return;
-    }
-
+  gcc_assert (count >= 0);
   signmask = (arith
 	      ? -((unsigned HOST_WIDE_INT) h1 >> (HOST_BITS_PER_WIDE_INT - 1))
 	      : 0);
@@ -445,6 +435,7 @@ lrotate_double (unsigned HOST_WIDE_INT l
   unsigned HOST_WIDE_INT s1l, s2l;
   HOST_WIDE_INT s1h, s2h;
 
+  gcc_assert (count >= 0);
   count %= prec;
   if (count < 0)
     count += prec;
@@ -467,6 +458,7 @@ rrotate_double (unsigned HOST_WIDE_INT l
   unsigned HOST_WIDE_INT s1l, s2l;
   HOST_WIDE_INT s1h, s2h;
 
+  gcc_assert (count >= 0);
   count %= prec;
   if (count < 0)
     count += prec;
Index: fold-const.c
===================================================================
--- fold-const.c	(revision 160609)
+++ fold-const.c	(working copy)
@@ -957,7 +957,10 @@ int_const_binop (enum tree_code code, co
       break;
 
     case RSHIFT_EXPR:
-      int2l = -int2l;
+      rshift_double (int1l, int1h, int2l, TYPE_PRECISION (type),
+		     &low, &hi, !uns);
+      break;
+
     case LSHIFT_EXPR:
       /* It's unclear from the C standard whether shifts can overflow.
 	 The following code ignores overflow; perhaps a C standard
@@ -967,7 +970,10 @@ int_const_binop (enum tree_code code, co
       break;
 
     case RROTATE_EXPR:
-      int2l = - int2l;
+      rrotate_double (int1l, int1h, int2l, TYPE_PRECISION (type),
+		      &low, &hi);
+      break;
+
     case LROTATE_EXPR:
       lrotate_double (int1l, int1h, int2l, TYPE_PRECISION (type),
 		      &low, &hi);

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 12:06 Richard Guenther
  2010-07-30 12:39 ` Paolo Bonzini
@ 2010-07-30 13:36 ` Michael Matz
  2010-07-30 13:39   ` Richard Guenther
  1 sibling, 1 reply; 21+ messages in thread
From: Michael Matz @ 2010-07-30 13:36 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

Hi,

On Fri, 30 Jul 2010, Richard Guenther wrote:

> The pass is heavily copied from CCP and then massaged to only track

I think they both should be unified.  You duplicate quite some amount of 
code.  Possibly

> + /* Bitwise conditional constant propagation (CCP) is based on the SSA

This whole block comment describes CCP, not the bitwise piece of it that 
would actually be more interesting to quickly describe as overview.

> + typedef struct bit_prop_value_d {
> +   /* Lattice value.  */
> +   unsigned lattice_val;
> + 
> +   /* Bit lattice value.  Zero for constant, one for varying.  */
> +   double_int bit_lattice_val;

I always think of this member as a mask, why not call it so?  At runtime 
for a value X that has MASK and VALUE in the lattice this holds:
(X & ~MASK) == VALUE .  Actually I'm not sure you ensure this as you 
sometimes only adjust the mask, but not value, so you we might know only 
this: (X & ~MASK) == (VALUE & ~MASK).

> + get_value (tree var)
> + {
> +   bit_prop_value_t *val;
> + 
> +   if (const_val == NULL)
> +     return NULL;

const_val can't be NULL here I think.

> + 
> +           case GIMPLE_TERNARY_RHS:
> +             {
> +               /* Handle binary operators that can appear in GIMPLE form.  */

ternary

> + get_value_for_expr (tree expr)
> + {
> +   else if (TREE_CODE (expr) == ADDR_EXPR
> + 	   && ((align = get_object_alignment (TREE_OPERAND (expr, 0),
> + 					      BITS_PER_UNIT, BIGGEST_ALIGNMENT))
> + 	       > BITS_PER_UNIT))
> +     {
> +       val.lattice_val = CONSTANT;
> +       val.bit_lattice_val
> + 	= double_int_and_not (double_int_mask (TYPE_PRECISION
> + 					         (TREE_TYPE (expr))),
> + 			      uhwi_to_double_int (align / BITS_PER_UNIT - 1));
> +       val.value = double_int_zero;

So here your mask contains zeros for bits outside the type precision ...

> + bit_prop_value_convert (tree type, bit_prop_value_t rval, tree rval_type)
> + {
> +   bit_prop_value_t val;
> +   bool uns;
> + 
> +   /* First extend lattice and value according to the original type.  */
> +   uns = (TREE_CODE (rval_type) == INTEGER_TYPE && TYPE_IS_SIZETYPE (rval_type)
> + 	 ? 0 : TYPE_UNSIGNED (rval_type));
> +   val.bit_lattice_val = double_int_ext (rval.bit_lattice_val,
> + 					TYPE_PRECISION (rval_type), uns);

... but here you extend until the width of double_int.  Perhaps you want 
to mask out the upper bits again?
(And a lattice can't be sign/zero-extended :)  A mask could be.)

> + evaluate_stmt (gimple stmt)
> + {
...
> + 	    case LT_EXPR:
> + 	      /* If the most significant bits are not known we know nothing.  */
> + 	      if (double_int_negative_p (r1val.bit_lattice_val)
> + 		  || double_int_negative_p (r2val.bit_lattice_val))
> + 		break;

Here you test the high bit of the mask, but above you fill the mask only 
upto TYPE_PRECISION.  I'm not sure if it matters for correctness, though.

> +       tem = 1;
> +       while (tem < (1u << (sizeof (unsigned int) * 8 - 1))
> + 	     && !(val->bit_lattice_val.low & tem))
> + 	tem <<= 1;
> +       if (tem == 1)
> + 	continue;

"(x ^ (x-1)) >> 1" will give you the mask of the trailing zero bits (that 
plus one is your align).

Ciao,
Michael.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 13:16     ` Paolo Bonzini
@ 2010-07-30 13:29       ` Richard Guenther
  2010-07-30 13:38         ` Paolo Bonzini
  2010-07-30 16:23         ` Richard Henderson
  0 siblings, 2 replies; 21+ messages in thread
From: Richard Guenther @ 2010-07-30 13:29 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc-patches

On Fri, 30 Jul 2010, Paolo Bonzini wrote:

> On 07/30/2010 02:39 PM, Richard Guenther wrote:
> > > > +     case LSHIFT_EXPR:
> > > > +     case RSHIFT_EXPR:
> > > > +       if (r2val.lattice_val == CONSTANT
> > > > + 	&&  double_int_zero_p (r2val.bit_lattice_val))
> > > 
> > > Even if some bits are varying, the result will have at least r2val.value&
> > > ~r2val.bit_lattice_val bits shifted in.  So you can do something like
> > > 
> > >        if (r2val.lattice_val != CONSTANT)
> > >          break;
> > >        if (double_int_zero_p (r2val.bit_lattice_val)) {
> > >          in_mask = r1val.bit_lattice_val;
> > >          in_value = r1val.value;
> > >        } else {
> > >          in_mask = double_int_minus_one;
> > >          in_value = double_int_minus_one;
> > >        }
> > > 
> > >        shift = r2val.value&  ~r2val.bit_lattice_val;
> > >        if (SHIFT_COUNT_TRUNCATED)
> > >          shift&= GETMODE_BITSIZE (TYPE_MODE (type)) - 1;
> > > 
> > >        val.lattice_val = CONSTANT;
> > >        val.bit_lattice_val
> > > 	= double_int_lshift (in_mask, shift,
> > > 			     TYPE_PRECISION (type), false);
> > >        val.value = double_int_lshift (in_value, shift,
> > > 				     TYPE_PRECISION (type), false);
> > 
> > Hmm.  I don't quite understand.  Are you saying that only the
> > lower bits of r2val are important if SHIFT_COUNT_TRUNCATED?
> > At least we need to know the sign of the shift amount
> > to be able to tell if we're shifting in zeros from the right
> > or ones from the left.
> 
> Well, yes. :)
> 
> I wonder if using negative shift counts is just a useless complication, since
> that's what got me confused.  If you just make "shift" a positive shift count
> and test the tree code rather than the sign of shift, the code I gave above to
> handle SHIFT_COUNT_TRUNCATED magically starts to work.

I think we can have negative shift counts (at least the constant folding
code suggests so), this is why I have the code as-is.

So if we know the MSB of the shift count we can for positive
shifts shift in v & ~m zeros and drop the rest to varying.
For negative shift counts we can shift in v & ~m and drop the
rest to varying.  So your (x << (y | 8)) & 255 becomes zero
if y is known to be non-negative.

I'll add a fixme for now.

> > > > + 	  else if (shift<  0)
> > > > + 	    {
> > > > + 	      val.lattice_val = CONSTANT;
> > > > + 	      val.bit_lattice_val
> > > > + 		= double_int_rshift (r1val.bit_lattice_val,
> > > > + 				     -shift, TYPE_PRECISION (type),
> > > > true);
> > > > + 	      val.value = double_int_rshift (r1val.value, shift,
> > > > + 					     TYPE_PRECISION (type),
> > > > true);
> > > > + 	    }
> > > 
> > > Here shifted in bits are varying for signed types (unless the sign bit is
> > > constant, but that's not handled here either).
> > 
> > That should be handled fine by using an arithmetic shift for both
> > the lattice and the value.
> 
> Aha, right.
> 
> > So for varying lattice "sign" bit we
> > shift in varying.  Hm, but indeed for unsigned constants we shift
> > in the sign bit - fixed to
> > 
> >            else if (shift<  0)
> >              {
> >                val.bit_lattice_val
> >                  = double_int_rshift (r1val.bit_lattice_val,
> >                                       -shift, TYPE_PRECISION (type), true);
> 
> This should be !uns too, since for unsigned values shifted-in bits are
> constant.  But it's much more clever than I thought: if the topmost bit is
> constant, the newly computed lattice is also constant.  Very nice.
> 
> This unfortunately wouldn't work if r1val.bit_lattice_val is changed to
> in_mask as in my example above.  To handle that you could use
> 
>    msb_constant = uns || !double_bit_set_p (r1val.bit_lattice_val,
>                                             TYPE_PRECISION (type) - 1)
> 
> and then you pass !msb_constant to this right shift.  I'll let you decide
> whether it's overkill, but in any case the current clever code needs
> documentation.

Yeah, I added some.

Richard.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 13:02   ` Richard Guenther
@ 2010-07-30 13:16     ` Paolo Bonzini
  2010-07-30 13:29       ` Richard Guenther
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2010-07-30 13:16 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

On 07/30/2010 02:39 PM, Richard Guenther wrote:
>>> +     case LSHIFT_EXPR:
>>> +     case RSHIFT_EXPR:
>>> +       if (r2val.lattice_val == CONSTANT
>>> + 	&&  double_int_zero_p (r2val.bit_lattice_val))
>>
>> Even if some bits are varying, the result will have at least r2val.value&
>> ~r2val.bit_lattice_val bits shifted in.  So you can do something like
>>
>>        if (r2val.lattice_val != CONSTANT)
>>          break;
>>        if (double_int_zero_p (r2val.bit_lattice_val)) {
>>          in_mask = r1val.bit_lattice_val;
>>          in_value = r1val.value;
>>        } else {
>>          in_mask = double_int_minus_one;
>>          in_value = double_int_minus_one;
>>        }
>>
>>        shift = r2val.value&  ~r2val.bit_lattice_val;
>>        if (SHIFT_COUNT_TRUNCATED)
>>          shift&= GETMODE_BITSIZE (TYPE_MODE (type)) - 1;
>>
>>        val.lattice_val = CONSTANT;
>>        val.bit_lattice_val
>> 	= double_int_lshift (in_mask, shift,
>> 			     TYPE_PRECISION (type), false);
>>        val.value = double_int_lshift (in_value, shift,
>> 				     TYPE_PRECISION (type), false);
>
> Hmm.  I don't quite understand.  Are you saying that only the
> lower bits of r2val are important if SHIFT_COUNT_TRUNCATED?
> At least we need to know the sign of the shift amount
> to be able to tell if we're shifting in zeros from the right
> or ones from the left.

Well, yes. :)

I wonder if using negative shift counts is just a useless complication, 
since that's what got me confused.  If you just make "shift" a positive 
shift count and test the tree code rather than the sign of shift, the 
code I gave above to handle SHIFT_COUNT_TRUNCATED magically starts to work.

>>> + 	  else if (shift<  0)
>>> + 	    {
>>> + 	      val.lattice_val = CONSTANT;
>>> + 	      val.bit_lattice_val
>>> + 		= double_int_rshift (r1val.bit_lattice_val,
>>> + 				     -shift, TYPE_PRECISION (type), true);
>>> + 	      val.value = double_int_rshift (r1val.value, shift,
>>> + 					     TYPE_PRECISION (type), true);
>>> + 	    }
>>
>> Here shifted in bits are varying for signed types (unless the sign bit is
>> constant, but that's not handled here either).
>
> That should be handled fine by using an arithmetic shift for both
> the lattice and the value.

Aha, right.

> So for varying lattice "sign" bit we
> shift in varying.  Hm, but indeed for unsigned constants we shift
> in the sign bit - fixed to
>
>            else if (shift<  0)
>              {
>                val.bit_lattice_val
>                  = double_int_rshift (r1val.bit_lattice_val,
>                                       -shift, TYPE_PRECISION (type), true);

This should be !uns too, since for unsigned values shifted-in bits are 
constant.  But it's much more clever than I thought: if the topmost bit 
is constant, the newly computed lattice is also constant.  Very nice.

This unfortunately wouldn't work if r1val.bit_lattice_val is changed to 
in_mask as in my example above.  To handle that you could use

    msb_constant = uns || !double_bit_set_p (r1val.bit_lattice_val,
                                             TYPE_PRECISION (type) - 1)

and then you pass !msb_constant to this right shift.  I'll let you 
decide whether it's overkill, but in any case the current clever code 
needs documentation.

Just remember to add testcases covering shifting of unsigned values, 
positive signed values, and negative signed values.  Also file a missing 
optimization bug for e.g. (x << (y | 8)) & 255 if you don't include my 
suggestions.

>>> + 	  unsigned r1ctz = 0, r2ctz = 0;
>>> + 	  while (r1ctz<  HOST_BITS_PER_WIDE_INT
>>> + 		&&  !(r1val.bit_lattice_val.low&  (1<<  r1ctz))
>>> + 		&&  !(r1val.value.low&  (1<<  r1ctz)))
>>> + 	    r1ctz++;
>>> + 	  while (r2ctz<  HOST_BITS_PER_WIDE_INT
>>> + 		&&  !(r2val.bit_lattice_val.low&  (1<<  r2ctz))
>>> + 		&&  !(r2val.value.low&  (1<<  r2ctz)))
>>> + 	    r2ctz++;
>>
>> This is just a ctz on (v | m), no?  This makes it easier to track high bits as
>> well.
>
> Yes.  Probably worth adding double_int_ctz.

Agreed.

Paolo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 12:39 ` Paolo Bonzini
  2010-07-30 12:54   ` Paolo Bonzini
@ 2010-07-30 13:02   ` Richard Guenther
  2010-07-30 13:16     ` Paolo Bonzini
  1 sibling, 1 reply; 21+ messages in thread
From: Richard Guenther @ 2010-07-30 13:02 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc-patches

On Fri, 30 Jul 2010, Paolo Bonzini wrote:

> > + 	CONSTANT	->   V_i has been found to hold a constant
> > + 			    value C.
> 
> "Some bits of V_i have been found to..."

Done.

> Or just remove/rewrite the head comment and move that comment here:
> 
> > + /* Possible lattice values.  */
> > + typedef enum
> > + {
> > +   UNINITIALIZED,
> > +   UNDEFINED,
> > +   CONSTANT,
> > +   VARYING
> > + } ccp_lattice_t;
> 
> > + 	  val.bit_lattice_val
> > + 	    = double_int_and
> > + 	        (double_int_ior (r1val.bit_lattice_val,
> > + 				 r2val.bit_lattice_val),
> > + 		 double_int_and (double_int_ior (r1val.value,
> > + 						 r1val.bit_lattice_val),
> > + 				 double_int_ior (r2val.value,
> > + 						 r2val.bit_lattice_val)));
> > + 	  if (!double_int_minus_one_p (val.bit_lattice_val))
> > + 	    val.lattice_val = CONSTANT;
> 
> In other places of this function you just set it to CONSTANT without the if.
> You can just set the lattice_val to CONSTANT here:
> 
> > +   bit_prop_value_t val = { VARYING, { -1, -1 }, { 0, 0 } };
> 
> and change it to VARYING ("if minus one then VARYING") at the end of the
> function.  If no computation is done in the function, the bit_lattice_val will
> stay -1.

True.  I changed it to initial { CONSTANT, { -1, -1 }, {0, 0}} and
drop to varying in the end.  So I don't need to set the lattice value
at all.

> > +     case LSHIFT_EXPR:
> > +     case RSHIFT_EXPR:
> > +       if (r2val.lattice_val == CONSTANT
> > + 	  && double_int_zero_p (r2val.bit_lattice_val))
> 
> Even if some bits are varying, the result will have at least r2val.value &
> ~r2val.bit_lattice_val bits shifted in.  So you can do something like
> 
>       if (r2val.lattice_val != CONSTANT)
>         break;
>       if (double_int_zero_p (r2val.bit_lattice_val)) {
>         in_mask = r1val.bit_lattice_val;
>         in_value = r1val.value;
>       } else {
>         in_mask = double_int_minus_one;
>         in_value = double_int_minus_one;
>       }
> 
>       shift = r2val.value & ~r2val.bit_lattice_val;
>       if (SHIFT_COUNT_TRUNCATED)
>         shift &= GETMODE_BITSIZE (TYPE_MODE (type)) - 1;
> 
>       val.lattice_val = CONSTANT;
>       val.bit_lattice_val
> 	= double_int_lshift (in_mask, shift,
> 			     TYPE_PRECISION (type), false);
>       val.value = double_int_lshift (in_value, shift,
> 				     TYPE_PRECISION (type), false);

Hmm.  I don't quite understand.  Are you saying that only the
lower bits of r2val are important if SHIFT_COUNT_TRUNCATED?
At least we need to know the sign of the shift amount
to be able to tell if we're shifting in zeros from the right
or ones from the left.

> > + 	  else if (shift < 0)
> > + 	    {
> > + 	      val.lattice_val = CONSTANT;
> > + 	      val.bit_lattice_val
> > + 		= double_int_rshift (r1val.bit_lattice_val,
> > + 				     -shift, TYPE_PRECISION (type), true);
> > + 	      val.value = double_int_rshift (r1val.value, shift,
> > + 					     TYPE_PRECISION (type), true);
> > + 	    }
> 
> Here shifted in bits are varying for signed types (unless the sign bit is
> constant, but that's not handled here either).

That should be handled fine by using an arithmetic shift for both
the lattice and the value.  So for varying lattice "sign" bit we
shift in varying.  Hm, but indeed for unsigned constants we shift
in the sign bit - fixed to

          else if (shift < 0)
            {
              val.bit_lattice_val
                = double_int_rshift (r1val.bit_lattice_val,
                                     -shift, TYPE_PRECISION (type), true);
              val.value = double_int_rshift (r1val.value, -shift,
                                             TYPE_PRECISION (type), !uns);

(typo for using 'shift' fixed as well)

> > + 	  unsigned r1ctz = 0, r2ctz = 0;
> > + 	  while (r1ctz < HOST_BITS_PER_WIDE_INT
> > + 		 && !(r1val.bit_lattice_val.low & (1 << r1ctz))
> > + 		 && !(r1val.value.low & (1 << r1ctz)))
> > + 	    r1ctz++;
> > + 	  while (r2ctz < HOST_BITS_PER_WIDE_INT
> > + 		 && !(r2val.bit_lattice_val.low & (1 << r2ctz))
> > + 		 && !(r2val.value.low & (1 << r2ctz)))
> > + 	    r2ctz++;
> 
> This is just a ctz on (v | m), no?  This makes it easier to track high bits as
> well.

Yes.  Probably worth adding double_int_ctz.

> I'm sure that you can optimize the addition, I'll think about it.

I thought about it as well and decided to be safe for now ;)
But iterating sth with full-width operations should be possible.

Thanks for the review,
Richard.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 12:39 ` Paolo Bonzini
@ 2010-07-30 12:54   ` Paolo Bonzini
  2010-07-30 13:02   ` Richard Guenther
  1 sibling, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2010-07-30 12:54 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches

> + 	CONSTANT	->   V_i has been found to hold a constant
> + 			    value C.

"Some bits of V_i have been found to..."

Or just remove/rewrite the head comment and move that comment here:

> + /* Possible lattice values.  */
> + typedef enum
> + {
> +   UNINITIALIZED,
> +   UNDEFINED,
> +   CONSTANT,
> +   VARYING
> + } ccp_lattice_t;

> + 	  val.bit_lattice_val
> + 	    = double_int_and
> + 	        (double_int_ior (r1val.bit_lattice_val,
> + 				 r2val.bit_lattice_val),
> + 		 double_int_and (double_int_ior (r1val.value,
> + 						 r1val.bit_lattice_val),
> + 				 double_int_ior (r2val.value,
> + 						 r2val.bit_lattice_val)));
> + 	  if (!double_int_minus_one_p (val.bit_lattice_val))
> + 	    val.lattice_val = CONSTANT;

In other places of this function you just set it to CONSTANT without the 
if.  You can just set the lattice_val to CONSTANT here:

> +   bit_prop_value_t val = { VARYING, { -1, -1 }, { 0, 0 } };

and change it to VARYING ("if minus one then VARYING") at the end of the 
function.  If no computation is done in the function, the 
bit_lattice_val will stay -1.

> +     case LSHIFT_EXPR:
> +     case RSHIFT_EXPR:
> +       if (r2val.lattice_val == CONSTANT
> + 	  && double_int_zero_p (r2val.bit_lattice_val))

Even if some bits are varying, the result will have at least r2val.value 
& ~r2val.bit_lattice_val bits shifted in.  So you can do something like

       if (r2val.lattice_val != CONSTANT)
         break;
       if (double_int_zero_p (r2val.bit_lattice_val)) {
         in_mask = r1val.bit_lattice_val;
         in_value = r1val.value;
       } else {
         in_mask = double_int_minus_one;
         in_value = double_int_minus_one;
       }

       shift = r2val.value & ~r2val.bit_lattice_val;
       if (SHIFT_COUNT_TRUNCATED)
         shift &= GETMODE_BITSIZE (TYPE_MODE (type)) - 1;

       val.lattice_val = CONSTANT;
       val.bit_lattice_val
	= double_int_lshift (in_mask, shift,
			     TYPE_PRECISION (type), false);
       val.value = double_int_lshift (in_value, shift,
				     TYPE_PRECISION (type), false);

> + 	  else if (shift < 0)
> + 	    {
> + 	      val.lattice_val = CONSTANT;
> + 	      val.bit_lattice_val
> + 		= double_int_rshift (r1val.bit_lattice_val,
> + 				     -shift, TYPE_PRECISION (type), true);
> + 	      val.value = double_int_rshift (r1val.value, shift,
> + 					     TYPE_PRECISION (type), true);
> + 	    }

Here shifted in bits are varying for signed types (unless the sign bit 
is constant, but that's not handled here either).

> + 	  unsigned r1ctz = 0, r2ctz = 0;
> + 	  while (r1ctz < HOST_BITS_PER_WIDE_INT
> + 		 && !(r1val.bit_lattice_val.low & (1 << r1ctz))
> + 		 && !(r1val.value.low & (1 << r1ctz)))
> + 	    r1ctz++;
> + 	  while (r2ctz < HOST_BITS_PER_WIDE_INT
> + 		 && !(r2val.bit_lattice_val.low & (1 << r2ctz))
> + 		 && !(r2val.value.low & (1 << r2ctz)))
> + 	    r2ctz++;

This is just a ctz on (v | m), no?  This makes it easier to track high 
bits as well.

I'm sure that you can optimize the addition, I'll think about it.

Paolo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH][RFC] Bit CCP and pointer alignment propagation
  2010-07-30 12:06 Richard Guenther
@ 2010-07-30 12:39 ` Paolo Bonzini
  2010-07-30 12:54   ` Paolo Bonzini
  2010-07-30 13:02   ` Richard Guenther
  2010-07-30 13:36 ` Michael Matz
  1 sibling, 2 replies; 21+ messages in thread
From: Paolo Bonzini @ 2010-07-30 12:39 UTC (permalink / raw)
  To: gcc-patches; +Cc: gcc-patches

> + 	CONSTANT	->   V_i has been found to hold a constant
> + 			    value C.

"Some bits of V_i have been found to..."

Or just remove/rewrite the head comment and move that comment here:

> + /* Possible lattice values.  */
> + typedef enum
> + {
> +   UNINITIALIZED,
> +   UNDEFINED,
> +   CONSTANT,
> +   VARYING
> + } ccp_lattice_t;

> + 	  val.bit_lattice_val
> + 	    = double_int_and
> + 	        (double_int_ior (r1val.bit_lattice_val,
> + 				 r2val.bit_lattice_val),
> + 		 double_int_and (double_int_ior (r1val.value,
> + 						 r1val.bit_lattice_val),
> + 				 double_int_ior (r2val.value,
> + 						 r2val.bit_lattice_val)));
> + 	  if (!double_int_minus_one_p (val.bit_lattice_val))
> + 	    val.lattice_val = CONSTANT;

In other places of this function you just set it to CONSTANT without the 
if.  You can just set the lattice_val to CONSTANT here:

> +   bit_prop_value_t val = { VARYING, { -1, -1 }, { 0, 0 } };

and change it to VARYING ("if minus one then VARYING") at the end of the 
function.  If no computation is done in the function, the 
bit_lattice_val will stay -1.

> +     case LSHIFT_EXPR:
> +     case RSHIFT_EXPR:
> +       if (r2val.lattice_val == CONSTANT
> + 	  && double_int_zero_p (r2val.bit_lattice_val))

Even if some bits are varying, the result will have at least r2val.value 
& ~r2val.bit_lattice_val bits shifted in.  So you can do something like

       if (r2val.lattice_val != CONSTANT)
         break;
       if (double_int_zero_p (r2val.bit_lattice_val)) {
         in_mask = r1val.bit_lattice_val;
         in_value = r1val.value;
       } else {
         in_mask = double_int_minus_one;
         in_value = double_int_minus_one;
       }

       shift = r2val.value & ~r2val.bit_lattice_val;
       if (SHIFT_COUNT_TRUNCATED)
         shift &= GETMODE_BITSIZE (TYPE_MODE (type)) - 1;

       val.lattice_val = CONSTANT;
       val.bit_lattice_val
	= double_int_lshift (in_mask, shift,
			     TYPE_PRECISION (type), false);
       val.value = double_int_lshift (in_value, shift,
				     TYPE_PRECISION (type), false);

> + 	  else if (shift < 0)
> + 	    {
> + 	      val.lattice_val = CONSTANT;
> + 	      val.bit_lattice_val
> + 		= double_int_rshift (r1val.bit_lattice_val,
> + 				     -shift, TYPE_PRECISION (type), true);
> + 	      val.value = double_int_rshift (r1val.value, shift,
> + 					     TYPE_PRECISION (type), true);
> + 	    }

Here shifted in bits are varying for signed types (unless the sign bit 
is constant, but that's not handled here either).

> + 	  unsigned r1ctz = 0, r2ctz = 0;
> + 	  while (r1ctz < HOST_BITS_PER_WIDE_INT
> + 		 && !(r1val.bit_lattice_val.low & (1 << r1ctz))
> + 		 && !(r1val.value.low & (1 << r1ctz)))
> + 	    r1ctz++;
> + 	  while (r2ctz < HOST_BITS_PER_WIDE_INT
> + 		 && !(r2val.bit_lattice_val.low & (1 << r2ctz))
> + 		 && !(r2val.value.low & (1 << r2ctz)))
> + 	    r2ctz++;

This is just a ctz on (v | m), no?  This makes it easier to track high 
bits as well.

I'm sure that you can optimize the addition, I'll think about it.

Paolo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH][RFC] Bit CCP and pointer alignment propagation
@ 2010-07-30 12:06 Richard Guenther
  2010-07-30 12:39 ` Paolo Bonzini
  2010-07-30 13:36 ` Michael Matz
  0 siblings, 2 replies; 21+ messages in thread
From: Richard Guenther @ 2010-07-30 12:06 UTC (permalink / raw)
  To: gcc-patches


This implements bit CCP as a way to do pointer alignment propagation.
The pass is heavily copied from CCP and then massaged to only track
integer (or pointer) constants together with a sub-lattice for
constness or varyingness of individual bits.  A value is considered
CONSTANT as long as it has at least a single constant bit.
Compared to the regular CCP it adds piecewise constant foldings
in bit_prop_value_unop, bit_prop_value_binop, bit_prop_value_convert
and the not-yet-split comparison folding in evaluate_stmt.
get_value_for_expr extracts initial alignment from addresses
(and misses that &a.b.c has more knowledge than the alignment
of the element, and also misses to propagate across &p->b.c).

Pointer alignment (and known misalignment) information is stored
in the SSA_NAME_PTR_INFO structure and currently unused.  The
plan is to use that infrastructure piece to get rid of
MISALIGNED_INDIRECT_REF and to make get_pointer_alignment
more correct.

There is the issue of lack of suitable alignment information
for pointers that are not based on decls - eventually we have
to give the frontends some way to transfer knowledge to the
middle-end or continue to be optimistic here.

Bootstrapped and tested on x86_64-unknown-linux-gnu with
the excessive placement of this pass like in the patch below.

Any comments?

Thanks,
Richard.

2010-07-28  Richard Guenther  <rguenther@suse.de>

	* tree-flow.h (struct ptr_info_def): Add align and misalign fields.
	* tree-ssa-alias.c (get_ptr_info): Move ...
	* tree-ssanames.c (get_ptr_info): ... here.  Initialize new fields.
	* Makefile.in (OBJS-common): Add tree-ssa-bit-ccp.o.
	(tree-ssa-bit-ccp.o): Dependencies for new target.
	* passes.c (init_optimization_passes): Schedule bit-ccp.
	* tree-pass.h (pass_bit_ccp): Declare.
	* tree-ssa-bit-ccp.c: New file.
	* timevar.def (TV_TREE_STORE_CCP): Remove.
	(TV_TREE_BIT_CCP): New.
	* common.opt (ftree-bit-ccp): New flag.
	* gimple-pretty-print.c (dump_gimple_phi): Dump alignment info.
	(dump_gimple_stmt): Likewise.
	* opts.c (decode_options): Enable bit-ccp at -O1 and up.
	* doc/invoke.texi (ftree-bit-ccp): Document.

Index: gcc/tree-flow.h
===================================================================
*** gcc/tree-flow.h.orig	2010-07-29 15:44:45.000000000 +0200
--- gcc/tree-flow.h	2010-07-29 15:50:17.000000000 +0200
*************** struct GTY(()) ptr_info_def
*** 119,124 ****
--- 119,129 ----
  {
    /* The points-to solution.  */
    struct pt_solution pt;
+ 
+   /* Alignment of the pointer in bytes.  ptr & ~(align-1) == 0.  */
+   unsigned int align;
+   /* Misalignment of the pointer in bytes.  ptr & (align-1) == misalign.  */
+   unsigned int misalign;
  };
  
  
Index: gcc/tree-ssa-alias.c
===================================================================
*** gcc/tree-ssa-alias.c.orig	2010-07-29 15:44:45.000000000 +0200
--- gcc/tree-ssa-alias.c	2010-07-29 15:50:17.000000000 +0200
*************** debug_alias_info (void)
*** 368,394 ****
  }
  
  
- /* Return the alias information associated with pointer T.  It creates a
-    new instance if none existed.  */
- 
- struct ptr_info_def *
- get_ptr_info (tree t)
- {
-   struct ptr_info_def *pi;
- 
-   gcc_assert (POINTER_TYPE_P (TREE_TYPE (t)));
- 
-   pi = SSA_NAME_PTR_INFO (t);
-   if (pi == NULL)
-     {
-       pi = ggc_alloc_cleared_ptr_info_def ();
-       pt_solution_reset (&pi->pt);
-       SSA_NAME_PTR_INFO (t) = pi;
-     }
- 
-   return pi;
- }
- 
  /* Dump the points-to set *PT into FILE.  */
  
  void
--- 368,373 ----
Index: gcc/tree-ssanames.c
===================================================================
*** gcc/tree-ssanames.c.orig	2010-07-29 15:44:45.000000000 +0200
--- gcc/tree-ssanames.c	2010-07-29 15:50:17.000000000 +0200
*************** release_ssa_name (tree var)
*** 240,259 ****
      }
  }
  
- /* Creates a duplicate of a ssa name NAME defined in statement STMT.  */
  
! tree
! duplicate_ssa_name (tree name, gimple stmt)
  {
!   tree new_name = make_ssa_name (SSA_NAME_VAR (name), stmt);
!   struct ptr_info_def *old_ptr_info = SSA_NAME_PTR_INFO (name);
  
!   if (old_ptr_info)
!     duplicate_ssa_name_ptr_info (new_name, old_ptr_info);
  
!   return new_name;
! }
  
  
  /* Creates a duplicate of the ptr_info_def at PTR_INFO for use by
     the SSA name NAME.  */
--- 240,268 ----
      }
  }
  
  
! /* Return the alias information associated with pointer T.  It creates a
!    new instance if none existed.  */
! 
! struct ptr_info_def *
! get_ptr_info (tree t)
  {
!   struct ptr_info_def *pi;
  
!   gcc_assert (POINTER_TYPE_P (TREE_TYPE (t)));
  
!   pi = SSA_NAME_PTR_INFO (t);
!   if (pi == NULL)
!     {
!       pi = ggc_alloc_cleared_ptr_info_def ();
!       pt_solution_reset (&pi->pt);
!       pi->align = 1;
!       pi->misalign = 0;
!       SSA_NAME_PTR_INFO (t) = pi;
!     }
  
+   return pi;
+ }
  
  /* Creates a duplicate of the ptr_info_def at PTR_INFO for use by
     the SSA name NAME.  */
*************** duplicate_ssa_name_ptr_info (tree name,
*** 276,281 ****
--- 285,305 ----
  }
  
  
+ /* Creates a duplicate of a ssa name NAME defined in statement STMT.  */
+ 
+ tree
+ duplicate_ssa_name (tree name, gimple stmt)
+ {
+   tree new_name = make_ssa_name (SSA_NAME_VAR (name), stmt);
+   struct ptr_info_def *old_ptr_info = SSA_NAME_PTR_INFO (name);
+ 
+   if (old_ptr_info)
+     duplicate_ssa_name_ptr_info (new_name, old_ptr_info);
+ 
+   return new_name;
+ }
+ 
+ 
  /* Release all the SSA_NAMEs created by STMT.  */
  
  void
Index: gcc/Makefile.in
===================================================================
*** gcc/Makefile.in.orig	2010-07-29 15:44:45.000000000 +0200
--- gcc/Makefile.in	2010-07-29 15:50:17.000000000 +0200
*************** OBJS-common = \
*** 1378,1383 ****
--- 1378,1384 ----
  	tree-ssa-address.o \
  	tree-ssa-alias.o \
  	tree-ssa-ccp.o \
+ 	tree-ssa-bit-ccp.o \
  	tree-ssa-coalesce.o \
  	tree-ssa-copy.o \
  	tree-ssa-copyrename.o \
*************** tree-ssa-ccp.o : tree-ssa-ccp.c $(TREE_F
*** 3129,3134 ****
--- 3130,3141 ----
     $(SYSTEM_H) $(TREE_H) $(TM_P_H) $(EXPR_H) output.h \
     $(DIAGNOSTIC_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) coretypes.h \
     $(TREE_DUMP_H) $(BASIC_BLOCK_H) $(TREE_PASS_H) langhooks.h \
+    tree-ssa-propagate.h value-prof.h $(FLAGS_H) $(TARGET_H) $(TOPLEV_H) $(DIAGNOSTIC_CORE_H) \
+    $(DBGCNT_H) tree-pretty-print.h gimple-pretty-print.h
+ tree-ssa-bit-ccp.o : tree-ssa-bit-ccp.c $(TREE_FLOW_H) $(CONFIG_H) \
+    $(SYSTEM_H) $(TREE_H) $(TM_P_H) $(EXPR_H) output.h \
+    $(DIAGNOSTIC_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) coretypes.h \
+    $(TREE_DUMP_H) $(BASIC_BLOCK_H) $(TREE_PASS_H) langhooks.h \
     tree-ssa-propagate.h value-prof.h $(FLAGS_H) $(TARGET_H) $(TOPLEV_H) $(DIAGNOSTIC_CORE_H) \
     $(DBGCNT_H) tree-pretty-print.h gimple-pretty-print.h
  tree-sra.o : tree-sra.c $(CONFIG_H) $(SYSTEM_H) coretypes.h alloc-pool.h \
Index: gcc/passes.c
===================================================================
*** gcc/passes.c.orig	2010-07-29 15:44:45.000000000 +0200
--- gcc/passes.c	2010-07-30 11:33:52.000000000 +0200
*************** init_optimization_passes (void)
*** 776,781 ****
--- 776,782 ----
  	  struct opt_pass **p = &pass_all_early_optimizations.pass.sub;
  	  NEXT_PASS (pass_remove_cgraph_callee_edges);
  	  NEXT_PASS (pass_rename_ssa_copies);
+ 	  NEXT_PASS (pass_bit_ccp);
  	  NEXT_PASS (pass_ccp);
  	  NEXT_PASS (pass_forwprop);
  	  /* pass_build_ealias is a dummy pass that ensures that we
*************** init_optimization_passes (void)
*** 909,914 ****
--- 910,916 ----
  	    }
  	  NEXT_PASS (pass_iv_canon);
  	  NEXT_PASS (pass_if_conversion);
+ 	  NEXT_PASS (pass_bit_ccp);
  	  NEXT_PASS (pass_vectorize);
  	    {
  	      struct opt_pass **p = &pass_vectorize.pass.sub;
*************** init_optimization_passes (void)
*** 949,954 ****
--- 951,957 ----
        NEXT_PASS (pass_dse);
        NEXT_PASS (pass_forwprop);
        NEXT_PASS (pass_phiopt);
+       NEXT_PASS (pass_bit_ccp);
        NEXT_PASS (pass_fold_builtins);
        NEXT_PASS (pass_optimize_widening_mul);
        NEXT_PASS (pass_tail_calls);
Index: gcc/tree-pass.h
===================================================================
*** gcc/tree-pass.h.orig	2010-07-29 15:44:45.000000000 +0200
--- gcc/tree-pass.h	2010-07-29 15:50:17.000000000 +0200
*************** extern struct gimple_opt_pass pass_iv_op
*** 386,391 ****
--- 386,392 ----
  extern struct gimple_opt_pass pass_tree_loop_done;
  extern struct gimple_opt_pass pass_ch;
  extern struct gimple_opt_pass pass_ccp;
+ extern struct gimple_opt_pass pass_bit_ccp;
  extern struct gimple_opt_pass pass_phi_only_cprop;
  extern struct gimple_opt_pass pass_build_ssa;
  extern struct gimple_opt_pass pass_build_alias;
Index: gcc/tree-ssa-bit-ccp.c
===================================================================
*** /dev/null	1970-01-01 00:00:00.000000000 +0000
--- gcc/tree-ssa-bit-ccp.c	2010-07-29 17:18:29.000000000 +0200
***************
*** 0 ****
--- 1,1826 ----
+ /* Bitwise conditional constant propagation pass for the GNU compiler.
+    Copyright (C) 2010 Free Software Foundation, Inc.
+    Copied from tree-ssa-ccp.c and adapted to do bit tracking
+    by Richard Guenther <rguenther@suse.de>.
+ 
+ This file is part of GCC.
+ 
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published by the
+ Free Software Foundation; either version 3, or (at your option) any
+ later version.
+ 
+ GCC is distributed in the hope that it will be useful, but WITHOUT
+ ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ for more details.
+ 
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3.  If not see
+ <http://www.gnu.org/licenses/>.  */
+ 
+ /* Bitwise conditional constant propagation (CCP) is based on the SSA
+    propagation engine (tree-ssa-propagate.c).  Constant assignments of
+    the form VAR = CST are propagated from the assignments into uses of
+    VAR, which in turn may generate new constants.  The simulation uses
+    a four level lattice to keep track of constant values associated
+    with SSA names.  Given an SSA name V_i, it may take one of the
+    following values:
+ 
+ 	UNINITIALIZED   ->  the initial state of the value.  This value
+ 			    is replaced with a correct initial value
+ 			    the first time the value is used, so the
+ 			    rest of the pass does not need to care about
+ 			    it.  Using this value simplifies initialization
+ 			    of the pass, and prevents us from needlessly
+ 			    scanning statements that are never reached.
+ 
+ 	UNDEFINED	->  V_i is a local variable whose definition
+ 			    has not been processed yet.  Therefore we
+ 			    don't yet know if its value is a constant
+ 			    or not.
+ 
+ 	CONSTANT	->  V_i has been found to hold a constant
+ 			    value C.
+ 
+ 	VARYING		->  V_i cannot take a constant value, or if it
+ 			    does, it is not possible to determine it
+ 			    at compile time.
+ 
+    The core of SSA-CCP is in ccp_visit_stmt and ccp_visit_phi_node:
+ 
+    1- In ccp_visit_stmt, we are interested in assignments whose RHS
+       evaluates into a constant and conditional jumps whose predicate
+       evaluates into a boolean true or false.  When an assignment of
+       the form V_i = CONST is found, V_i's lattice value is set to
+       CONSTANT and CONST is associated with it.  This causes the
+       propagation engine to add all the SSA edges coming out the
+       assignment into the worklists, so that statements that use V_i
+       can be visited.
+ 
+       If the statement is a conditional with a constant predicate, we
+       mark the outgoing edges as executable or not executable
+       depending on the predicate's value.  This is then used when
+       visiting PHI nodes to know when a PHI argument can be ignored.
+ 
+ 
+    2- In ccp_visit_phi_node, if all the PHI arguments evaluate to the
+       same constant C, then the LHS of the PHI is set to C.  This
+       evaluation is known as the "meet operation".  Since one of the
+       goals of this evaluation is to optimistically return constant
+       values as often as possible, it uses two main short cuts:
+ 
+       - If an argument is flowing in through a non-executable edge, it
+ 	is ignored.  This is useful in cases like this:
+ 
+ 			if (PRED)
+ 			  a_9 = 3;
+ 			else
+ 			  a_10 = 100;
+ 			a_11 = PHI (a_9, a_10)
+ 
+ 	If PRED is known to always evaluate to false, then we can
+ 	assume that a_11 will always take its value from a_10, meaning
+ 	that instead of consider it VARYING (a_9 and a_10 have
+ 	different values), we can consider it CONSTANT 100.
+ 
+       - If an argument has an UNDEFINED value, then it does not affect
+ 	the outcome of the meet operation.  If a variable V_i has an
+ 	UNDEFINED value, it means that either its defining statement
+ 	hasn't been visited yet or V_i has no defining statement, in
+ 	which case the original symbol 'V' is being used
+ 	uninitialized.  Since 'V' is a local variable, the compiler
+ 	may assume any initial value for it.
+ 
+ 
+    After propagation, every variable V_i that ends up with a lattice
+    value of CONSTANT will have the associated constant value in the
+    array CONST_VAL[i].VALUE.  That is fed into substitute_and_fold for
+    final substitution and folding.
+ 
+    References:
+ 
+      Constant propagation with conditional branches,
+      Wegman and Zadeck, ACM TOPLAS 13(2):181-210.
+ 
+      Building an Optimizing Compiler,
+      Robert Morgan, Butterworth-Heinemann, 1998, Section 8.9.
+ 
+      Advanced Compiler Design and Implementation,
+      Steven Muchnick, Morgan Kaufmann, 1997, Section 12.6  */
+ 
+ #include "config.h"
+ #include "system.h"
+ #include "coretypes.h"
+ #include "tm.h"
+ #include "tree.h"
+ #include "flags.h"
+ #include "tm_p.h"
+ #include "basic-block.h"
+ #include "output.h"
+ #include "function.h"
+ #include "tree-pretty-print.h"
+ #include "gimple-pretty-print.h"
+ #include "timevar.h"
+ #include "tree-dump.h"
+ #include "tree-flow.h"
+ #include "tree-pass.h"
+ #include "tree-ssa-propagate.h"
+ #include "value-prof.h"
+ #include "langhooks.h"
+ #include "target.h"
+ #include "diagnostic-core.h"
+ #include "toplev.h"
+ #include "dbgcnt.h"
+ 
+ 
+ /* Possible lattice values.  */
+ typedef enum
+ {
+   UNINITIALIZED,
+   UNDEFINED,
+   CONSTANT,
+   VARYING
+ } ccp_lattice_t;
+ 
+ typedef struct bit_prop_value_d {
+   /* Lattice value.  */
+   unsigned lattice_val;
+ 
+   /* Bit lattice value.  Zero for constant, one for varying.  */
+   double_int bit_lattice_val;
+ 
+   /* Propagated value.  */
+   double_int value;
+ } bit_prop_value_t;
+ 
+ /* Array of propagated constant values.  After propagation,
+    CONST_VAL[I].VALUE holds the constant value for SSA_NAME(I).  If
+    the constant is held in an SSA name representing a memory store
+    (i.e., a VDEF), CONST_VAL[I].MEM_REF will contain the actual
+    memory reference used to store (i.e., the LHS of the assignment
+    doing the store).  */
+ static bit_prop_value_t *const_val;
+ 
+ static bool bit_ccp_fold_stmt (gimple_stmt_iterator *);
+ 
+ /* Dump constant propagation value VAL to file OUTF prefixed by PREFIX.  */
+ 
+ static void
+ dump_lattice_value (FILE *outf, const char *prefix, bit_prop_value_t val)
+ {
+   switch (val.lattice_val)
+     {
+     case UNINITIALIZED:
+       fprintf (outf, "%sUNINITIALIZED", prefix);
+       break;
+     case UNDEFINED:
+       fprintf (outf, "%sUNDEFINED", prefix);
+       break;
+     case VARYING:
+       fprintf (outf, "%sVARYING", prefix);
+       break;
+     case CONSTANT:
+       {
+ 	double_int cval = double_int_and_not (val.value, val.bit_lattice_val);
+ 	fprintf (outf, "%sCONSTANT " HOST_WIDE_INT_PRINT_DOUBLE_HEX,
+ 		 prefix, cval.high, cval.low);
+ 	if (!double_int_zero_p (val.bit_lattice_val))
+ 	  fprintf (outf, " (" HOST_WIDE_INT_PRINT_DOUBLE_HEX ")",
+ 		   val.bit_lattice_val.high, val.bit_lattice_val.low);
+ 	break;
+       }
+     default:
+       gcc_unreachable ();
+     }
+ }
+ 
+ 
+ /* Print lattice value VAL to stderr.  */
+ 
+ void debug_bit_lattice_value (bit_prop_value_t val);
+ 
+ DEBUG_FUNCTION void
+ debug_bit_lattice_value (bit_prop_value_t val)
+ {
+   dump_lattice_value (stderr, "", val);
+   fprintf (stderr, "\n");
+ }
+ 
+ 
+ /* Compute a default value for variable VAR and store it in the
+    CONST_VAL array.  The following rules are used to get default
+    values:
+ 
+    1- Global and static variables that are declared constant are
+       considered CONSTANT.
+ 
+    2- Any other value is considered UNDEFINED.  This is useful when
+       considering PHI nodes.  PHI arguments that are undefined do not
+       change the constant value of the PHI node, which allows for more
+       constants to be propagated.
+ 
+    3- Variables defined by statements other than assignments and PHI
+       nodes are considered VARYING.
+ 
+    4- Initial values of variables that are not GIMPLE registers are
+       considered VARYING.  */
+ 
+ static bit_prop_value_t
+ get_default_value (tree var)
+ {
+   tree sym = SSA_NAME_VAR (var);
+   bit_prop_value_t val = { UNINITIALIZED, { 0, 0 }, { 0, 0 } };
+   gimple stmt;
+ 
+   stmt = SSA_NAME_DEF_STMT (var);
+ 
+   if (gimple_nop_p (stmt))
+     {
+       /* Variables defined by an empty statement are those used
+ 	 before being initialized.  If VAR is a local variable, we
+ 	 can assume initially that it is UNDEFINED, otherwise we must
+ 	 consider it VARYING.  */
+       if (is_gimple_reg (sym)
+ 	  && TREE_CODE (sym) == VAR_DECL)
+ 	val.lattice_val = UNDEFINED;
+       else
+ 	{
+ 	  val.lattice_val = VARYING;
+ 	  val.bit_lattice_val = double_int_minus_one;
+ 	}
+     }
+   else if (is_gimple_assign (stmt)
+ 	   /* Value-returning GIMPLE_CALL statements assign to
+ 	      a variable, and are treated similarly to GIMPLE_ASSIGN.  */
+ 	   || (is_gimple_call (stmt)
+ 	       && gimple_call_lhs (stmt) != NULL_TREE)
+ 	   || gimple_code (stmt) == GIMPLE_PHI)
+     {
+       tree cst;
+       if (gimple_assign_single_p (stmt)
+ 	  && DECL_P (gimple_assign_rhs1 (stmt))
+ 	  && (cst = get_symbol_constant_value (gimple_assign_rhs1 (stmt)))
+ 	  && TREE_CODE (cst) == INTEGER_CST)
+ 	{
+ 	  val.lattice_val = CONSTANT;
+ 	  val.value = tree_to_double_int (cst);
+ 	  val.bit_lattice_val = double_int_zero;
+ 	}
+       else
+ 	/* Any other variable defined by an assignment or a PHI node
+ 	   is considered UNDEFINED.  */
+ 	val.lattice_val = UNDEFINED;
+     }
+   else
+     {
+       /* Otherwise, VAR will never take on a constant value.  */
+       val.lattice_val = VARYING;
+       val.bit_lattice_val = double_int_minus_one;
+     }
+ 
+   return val;
+ }
+ 
+ 
+ /* Get the constant value associated with variable VAR.  */
+ 
+ static inline bit_prop_value_t *
+ get_value (tree var)
+ {
+   bit_prop_value_t *val;
+ 
+   if (const_val == NULL)
+     return NULL;
+ 
+   val = &const_val[SSA_NAME_VERSION (var)];
+   if (val->lattice_val == UNINITIALIZED)
+     *val = get_default_value (var);
+ 
+   return val;
+ }
+ 
+ /* Get the fully constant tree value associated with variable VAR
+    or NULL_TREE if it is not fully constant.  */
+ 
+ static tree
+ get_tree_value (tree var)
+ {
+   bit_prop_value_t *val = get_value (var);
+   if (val->lattice_val == CONSTANT
+       && double_int_zero_p (val->bit_lattice_val))
+     return double_int_to_tree (TREE_TYPE (var), val->value);
+   return NULL_TREE;
+ }
+ 
+ /* Sets the value associated with VAR to VARYING.  */
+ 
+ static inline void
+ set_value_varying (tree var)
+ {
+   bit_prop_value_t *val = &const_val[SSA_NAME_VERSION (var)];
+ 
+   val->lattice_val = VARYING;
+   val->bit_lattice_val = double_int_minus_one;
+ }
+ 
+ /* Set the value for variable VAR to NEW_VAL.  Return true if the new
+    value is different from VAR's previous value.  */
+ 
+ static bool
+ set_lattice_value (tree var, bit_prop_value_t new_val)
+ {
+   /* We can deal with old UNINITIALIZED values just fine here.  */
+   bit_prop_value_t *old_val = &const_val[SSA_NAME_VERSION (var)];
+ 
+   /* Lattice transitions must always be monotonically increasing in
+      value.  If *OLD_VAL and NEW_VAL are the same, return false to
+      inform the caller that this was a non-transition.  */
+ 
+   gcc_assert (old_val->lattice_val < new_val.lattice_val
+               || (old_val->lattice_val == new_val.lattice_val
+ 		  && (old_val->lattice_val != CONSTANT
+ 		      || 1 /* ???  See the fixup below.  */
+ 		      || (double_int_equal_p
+ 			    (double_int_ior (old_val->bit_lattice_val,
+ 					     new_val.bit_lattice_val),
+ 			     new_val.bit_lattice_val)
+ 			  && double_int_equal_p
+ 			       (double_int_and_not (old_val->value,
+ 						    new_val.bit_lattice_val),
+ 				double_int_and_not (new_val.value,
+ 						    new_val.bit_lattice_val))))));
+ 
+   /* We have to be careful to not go up the bitwise lattice.
+      ???  This doesn't seem to be the best place to do this.  */
+   if (new_val.lattice_val == CONSTANT
+       && old_val->lattice_val == CONSTANT)
+     new_val.bit_lattice_val
+       = double_int_ior (new_val.bit_lattice_val,
+ 			double_int_ior (old_val->bit_lattice_val,
+ 					double_int_xor (new_val.value,
+ 							old_val->value)));
+ 
+   if (old_val->lattice_val != new_val.lattice_val
+       || (new_val.lattice_val == CONSTANT
+ 	  && !double_int_equal_p (old_val->bit_lattice_val,
+ 				  new_val.bit_lattice_val)))
+     {
+       if (dump_file && (dump_flags & TDF_DETAILS))
+ 	{
+ 	  dump_lattice_value (dump_file, "Lattice value changed to ", new_val);
+ 	  fprintf (dump_file, ".  Adding SSA edges to worklist.\n");
+ 	}
+ 
+       *old_val = new_val;
+ 
+       gcc_assert (new_val.lattice_val != UNINITIALIZED);
+       return true;
+     }
+ 
+   return false;
+ }
+ 
+ 
+ /* Return the likely CCP lattice value for STMT.
+ 
+    If STMT has no operands, then return CONSTANT.
+ 
+    Else if undefinedness of operands of STMT cause its value to be
+    undefined, then return UNDEFINED.
+ 
+    Else if any operands of STMT are constants, then return CONSTANT.
+ 
+    Else return VARYING.  */
+ 
+ static ccp_lattice_t
+ likely_value (gimple stmt)
+ {
+   bool has_constant_operand, has_undefined_operand, all_undefined_operands;
+   tree use;
+   ssa_op_iter iter;
+   unsigned i;
+ 
+   enum gimple_code code = gimple_code (stmt);
+ 
+   /* This function appears to be called only for assignments, calls,
+      conditionals, and switches, due to the logic in visit_stmt.  */
+   gcc_assert (code == GIMPLE_ASSIGN
+               || code == GIMPLE_CALL
+               || code == GIMPLE_COND
+               || code == GIMPLE_SWITCH);
+ 
+   /* If the statement has volatile operands, it won't fold to a
+      constant value.  */
+   if (gimple_has_volatile_ops (stmt))
+     return VARYING;
+ 
+   /* Arrive here for more complex cases.  */
+   has_constant_operand = false;
+   has_undefined_operand = false;
+   all_undefined_operands = true;
+   FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE)
+     {
+       bit_prop_value_t *val = get_value (use);
+ 
+       if (val->lattice_val == UNDEFINED)
+ 	has_undefined_operand = true;
+       else
+ 	all_undefined_operands = false;
+ 
+       if (val->lattice_val == CONSTANT)
+ 	has_constant_operand = true;
+     }
+ 
+   /* There may be constants in regular rhs operands.  For calls we
+      have to ignore lhs, fndecl and static chain, otherwise only
+      the lhs.  */
+   for (i = (is_gimple_call (stmt) ? 2 : 0) + gimple_has_lhs (stmt);
+        i < gimple_num_ops (stmt); ++i)
+     {
+       tree op = gimple_op (stmt, i);
+       if (!op || TREE_CODE (op) == SSA_NAME)
+ 	continue;
+       if (is_gimple_min_invariant (op))
+ 	has_constant_operand = true;
+     }
+ 
+   if (has_constant_operand)
+     all_undefined_operands = false;
+ 
+   /* If the operation combines operands like COMPLEX_EXPR make sure to
+      not mark the result UNDEFINED if only one part of the result is
+      undefined.  */
+   if (has_undefined_operand && all_undefined_operands)
+     return UNDEFINED;
+   else if (code == GIMPLE_ASSIGN && has_undefined_operand)
+     {
+       switch (gimple_assign_rhs_code (stmt))
+ 	{
+ 	/* Unary operators are handled with all_undefined_operands.  */
+ 	case PLUS_EXPR:
+ 	case MINUS_EXPR:
+ 	case POINTER_PLUS_EXPR:
+ 	  /* Not MIN_EXPR, MAX_EXPR.  One VARYING operand may be selected.
+ 	     Not bitwise operators, one VARYING operand may specify the
+ 	     result completely.  Not logical operators for the same reason.
+ 	     Not COMPLEX_EXPR as one VARYING operand makes the result partly
+ 	     not UNDEFINED.  Not *DIV_EXPR, comparisons and shifts because
+ 	     the undefined operand may be promoted.  */
+ 	  return UNDEFINED;
+ 
+ 	default:
+ 	  ;
+ 	}
+     }
+   /* If there was an UNDEFINED operand but the result may be not UNDEFINED
+      fall back to VARYING even if there were CONSTANT operands.  */
+   if (has_undefined_operand)
+     return VARYING;
+ 
+   /* We do not consider virtual operands here -- load from read-only
+      memory may have only VARYING virtual operands, but still be
+      constant.  */
+   if (has_constant_operand
+       || gimple_references_memory_p (stmt))
+     return CONSTANT;
+ 
+   return VARYING;
+ }
+ 
+ /* Returns true if STMT cannot be constant.  */
+ 
+ static bool
+ surely_varying_stmt_p (gimple stmt)
+ {
+   /* If the statement has operands that we cannot handle, it cannot be
+      constant.  */
+   if (gimple_has_volatile_ops (stmt))
+     return true;
+ 
+   /* If it is a call and does not return a value or is not a
+      builtin and not an indirect call, it is varying.  */
+   if (is_gimple_call (stmt))
+     {
+       tree fndecl;
+       if (!gimple_call_lhs (stmt)
+ 	  || ((fndecl = gimple_call_fndecl (stmt)) != NULL_TREE
+ 	      && !DECL_BUILT_IN (fndecl)))
+ 	return true;
+     }
+ 
+   /* Any other store operation is not interesting.  */
+   else if (gimple_vdef (stmt))
+     return true;
+ 
+   /* Anything other than assignments and conditional jumps are not
+      interesting for CCP.  */
+   if (gimple_code (stmt) != GIMPLE_ASSIGN
+       && gimple_code (stmt) != GIMPLE_COND
+       && gimple_code (stmt) != GIMPLE_SWITCH
+       && gimple_code (stmt) != GIMPLE_CALL)
+     return true;
+ 
+   return false;
+ }
+ 
+ 
+ /* Compute the meet operator between *VAL1 and *VAL2.  Store the result
+    in VAL1.
+ 
+    		any  M UNDEFINED   = any
+ 		any  M VARYING     = VARYING
+ 		Ci   M Cj	   = Ci		if (i == j)
+ 		Ci   M Cj	   = VARYING	if (i != j)
+    */
+ 
+ static void
+ bit_ccp_lattice_meet (bit_prop_value_t *val1, bit_prop_value_t *val2)
+ {
+   if (val1->lattice_val == UNDEFINED)
+     {
+       /* UNDEFINED M any = any   */
+       *val1 = *val2;
+     }
+   else if (val2->lattice_val == UNDEFINED)
+     {
+       /* any M UNDEFINED = any
+          Nothing to do.  VAL1 already contains the value we want.  */
+       ;
+     }
+   else if (val1->lattice_val == VARYING
+            || val2->lattice_val == VARYING)
+     {
+       /* any M VARYING = VARYING.  */
+       val1->lattice_val = VARYING;
+       val1->bit_lattice_val = double_int_minus_one;
+     }
+   else if (val1->lattice_val == CONSTANT
+ 	   && val2->lattice_val == CONSTANT)
+     {
+       /* Drop unequal or varying bits to varying.  */
+       val1->bit_lattice_val
+         = double_int_ior (double_int_ior (val1->bit_lattice_val,
+ 					  val2->bit_lattice_val),
+ 			  double_int_xor (val1->value, val2->value));
+       if (double_int_minus_one_p (val1->bit_lattice_val))
+ 	val1->lattice_val = VARYING;
+     }
+   else
+     {
+       /* Any other combination is VARYING.  */
+       val1->lattice_val = VARYING;
+       val1->bit_lattice_val = double_int_minus_one;
+     }
+ }
+ 
+ /* Get operand number OPNR from the rhs of STMT.  Before returning it,
+    simplify it to a constant if possible.  */
+ 
+ static tree
+ get_rhs_assign_op_for_ccp (gimple stmt, int opnr)
+ {
+   tree op = gimple_op (stmt, opnr);
+ 
+   if (TREE_CODE (op) == SSA_NAME)
+     {
+       bit_prop_value_t *val = get_value (op);
+       if (val->lattice_val == CONSTANT
+ 	  && double_int_zero_p (val->bit_lattice_val))
+ 	return double_int_to_tree (TREE_TYPE (op), val->value);
+     }
+   return op;
+ }
+ 
+ /* CCP specific front-end to the non-destructive constant folding
+    routines.
+ 
+    Attempt to simplify the RHS of STMT knowing that one or more
+    operands are constants.
+ 
+    If simplification is possible, return the simplified RHS,
+    otherwise return the original RHS or NULL_TREE.  */
+ 
+ static tree
+ ccp_fold (gimple stmt)
+ {
+   location_t loc = gimple_location (stmt);
+   switch (gimple_code (stmt))
+     {
+     case GIMPLE_ASSIGN:
+       {
+         enum tree_code subcode = gimple_assign_rhs_code (stmt);
+ 
+         switch (get_gimple_rhs_class (subcode))
+           {
+           case GIMPLE_SINGLE_RHS:
+             {
+               tree rhs = gimple_assign_rhs1 (stmt);
+               enum tree_code_class kind = TREE_CODE_CLASS (subcode);
+ 
+               if (TREE_CODE (rhs) == SSA_NAME)
+                 {
+                   /* If the RHS is an SSA_NAME, return its known constant value,
+                      if any.  */
+                   return get_tree_value (rhs);
+                 }
+ 	      else if (TREE_CODE (rhs) == CONSTRUCTOR
+ 		       && TREE_CODE (TREE_TYPE (rhs)) == VECTOR_TYPE
+ 		       && (CONSTRUCTOR_NELTS (rhs)
+ 			   == TYPE_VECTOR_SUBPARTS (TREE_TYPE (rhs))))
+ 		{
+ 		  unsigned i;
+ 		  tree val, list;
+ 
+ 		  list = NULL_TREE;
+ 		  FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (rhs), i, val)
+ 		    {
+ 		      tree cval;
+ 		      if (TREE_CODE (val) == SSA_NAME
+ 			  && (cval = get_tree_value (val)))
+ 			val = cval;
+ 		      if (TREE_CODE (val) == INTEGER_CST
+ 			  || TREE_CODE (val) == REAL_CST
+ 			  || TREE_CODE (val) == FIXED_CST)
+ 			list = tree_cons (NULL_TREE, val, list);
+ 		      else
+ 			return NULL_TREE;
+ 		    }
+ 
+ 		  return build_vector (TREE_TYPE (rhs), nreverse (list));
+ 		}
+ 
+               if (kind == tcc_reference)
+ 		{
+ 		  if ((TREE_CODE (rhs) == VIEW_CONVERT_EXPR
+ 		       || TREE_CODE (rhs) == REALPART_EXPR
+ 		       || TREE_CODE (rhs) == IMAGPART_EXPR)
+ 		      && TREE_CODE (TREE_OPERAND (rhs, 0)) == SSA_NAME)
+ 		    {
+ 		      tree val = get_tree_value (TREE_OPERAND (rhs, 0));
+ 		      if (val)
+ 			return fold_unary_loc (EXPR_LOCATION (rhs),
+ 					       TREE_CODE (rhs),
+ 					       TREE_TYPE (rhs), val);
+ 		    }
+ 		}
+               else if (kind == tcc_declaration)
+                 return get_symbol_constant_value (rhs);
+               return rhs;
+             }
+ 
+           case GIMPLE_UNARY_RHS:
+             {
+               /* Handle unary operators that can appear in GIMPLE form.
+                  Note that we know the single operand must be a constant,
+                  so this should almost always return a simplified RHS.  */
+               tree op0 = get_rhs_assign_op_for_ccp (stmt, 1);
+ 
+               return
+ 		fold_unary_ignore_overflow_loc (loc, subcode,
+ 						gimple_expr_type (stmt), op0);
+             }
+ 
+           case GIMPLE_BINARY_RHS:
+             {
+               /* Handle binary operators that can appear in GIMPLE form.  */
+               tree op0 = get_rhs_assign_op_for_ccp (stmt, 1);
+               tree op1 = get_rhs_assign_op_for_ccp (stmt, 2);
+ 
+               return fold_binary_loc (loc, subcode,
+ 				      gimple_expr_type (stmt), op0, op1);
+             }
+ 
+           case GIMPLE_TERNARY_RHS:
+             {
+               /* Handle binary operators that can appear in GIMPLE form.  */
+               tree op0 = get_rhs_assign_op_for_ccp (stmt, 1);
+               tree op1 = get_rhs_assign_op_for_ccp (stmt, 2);
+               tree op2 = get_rhs_assign_op_for_ccp (stmt, 3);
+ 
+               return fold_ternary_loc (loc, subcode,
+ 				       gimple_expr_type (stmt), op0, op1, op2);
+             }
+ 
+           default:
+             gcc_unreachable ();
+           }
+       }
+       break;
+ 
+     case GIMPLE_CALL:
+       {
+ 	tree fn = gimple_call_fn (stmt);
+ 	tree val;
+ 
+ 	if (TREE_CODE (fn) == ADDR_EXPR
+ 	    && TREE_CODE (TREE_OPERAND (fn, 0)) == FUNCTION_DECL
+ 	    && DECL_BUILT_IN (TREE_OPERAND (fn, 0)))
+ 	  {
+ 	    tree *args = XALLOCAVEC (tree, gimple_call_num_args (stmt));
+ 	    tree call, retval;
+ 	    unsigned i;
+ 	    for (i = 0; i < gimple_call_num_args (stmt); ++i)
+ 	      {
+ 		args[i] = gimple_call_arg (stmt, i);
+ 		if (TREE_CODE (args[i]) == SSA_NAME)
+ 		  {
+ 		    val = get_tree_value (args[i]);
+ 		    if (val)
+ 		      args[i] = val;
+ 		  }
+ 	      }
+ 	    call = build_call_array_loc (loc,
+ 					 gimple_call_return_type (stmt),
+ 					 fn, gimple_call_num_args (stmt), args);
+ 	    retval = fold_call_expr (EXPR_LOCATION (call), call, false);
+ 	    if (retval)
+ 	      /* fold_call_expr wraps the result inside a NOP_EXPR.  */
+ 	      STRIP_NOPS (retval);
+ 	    return retval;
+ 	  }
+ 	return NULL_TREE;
+       }
+ 
+     case GIMPLE_COND:
+       {
+         /* Handle comparison operators that can appear in GIMPLE form.  */
+         tree op0 = gimple_cond_lhs (stmt);
+         tree op1 = gimple_cond_rhs (stmt);
+         enum tree_code code = gimple_cond_code (stmt);
+ 
+         /* Simplify the operands down to constants when appropriate.  */
+         if (TREE_CODE (op0) == SSA_NAME)
+           {
+             tree val = get_tree_value (op0);
+             if (val)
+               op0 = val;
+           }
+ 
+         if (TREE_CODE (op1) == SSA_NAME)
+           {
+             tree val = get_tree_value (op1);
+             if (val)
+               op1 = val;
+           }
+ 
+         return fold_binary_loc (loc, code, boolean_type_node, op0, op1);
+       }
+ 
+     case GIMPLE_SWITCH:
+       {
+         tree rhs = gimple_switch_index (stmt);
+ 
+         if (TREE_CODE (rhs) == SSA_NAME)
+           {
+             /* If the RHS is an SSA_NAME, return its known constant value,
+                if any.  */
+             return get_tree_value (rhs);
+           }
+ 
+         return rhs;
+       }
+ 
+     default:
+       gcc_unreachable ();
+     }
+ }
+ 
+ /* Return the value for the tree operand EXPR.  */
+ 
+ static bit_prop_value_t
+ get_value_for_expr (tree expr)
+ {
+   bit_prop_value_t val;
+   int align;
+ 
+   if (TREE_CODE (expr) == SSA_NAME)
+     val = *get_value (expr);
+   else if (TREE_CODE (expr) == INTEGER_CST)
+     {
+       val.lattice_val = CONSTANT;
+       val.bit_lattice_val = double_int_zero;
+       val.value = tree_to_double_int (expr);
+     }
+   else if (TREE_CODE (expr) == ADDR_EXPR
+ 	   && ((align = get_object_alignment (TREE_OPERAND (expr, 0),
+ 					      BITS_PER_UNIT, BIGGEST_ALIGNMENT))
+ 	       > BITS_PER_UNIT))
+     {
+       val.lattice_val = CONSTANT;
+       val.bit_lattice_val
+ 	= double_int_and_not (double_int_mask (TYPE_PRECISION
+ 					         (TREE_TYPE (expr))),
+ 			      uhwi_to_double_int (align / BITS_PER_UNIT - 1));
+       val.value = double_int_zero;
+     }
+   else
+     {
+       val.lattice_val = VARYING;
+       val.bit_lattice_val = double_int_minus_one;
+       val.value = double_int_zero;
+     }
+   return val;
+ }
+ 
+ static bit_prop_value_t
+ bit_prop_value_binop (enum tree_code, tree, bit_prop_value_t, bit_prop_value_t);
+ 
+ /* Return the propagation value when applying the operation CODE to
+    the propagation value RVAL yielding type TYPE.  */
+ 
+ static bit_prop_value_t
+ bit_prop_value_unop (enum tree_code code, tree type, bit_prop_value_t rval)
+ {
+   bit_prop_value_t val = { VARYING, { -1, -1 }, { 0, 0 } };
+   switch (code)
+     {
+     case BIT_NOT_EXPR:
+       if (rval.lattice_val == CONSTANT)
+ 	{
+ 	  val = rval;
+ 	  val.value = double_int_not (rval.value);
+ 	}
+       break;
+ 
+     case NEGATE_EXPR:
+       {
+ 	bit_prop_value_t mone;
+ 	mone.lattice_val = CONSTANT;
+ 	mone.bit_lattice_val = double_int_zero;
+ 	mone.value = double_int_one;
+ 	/* Return ~rval + 1.  */
+ 	return bit_prop_value_binop
+ 	         (PLUS_EXPR, type,
+ 		  bit_prop_value_unop (BIT_NOT_EXPR, type, rval), mone);
+       }
+ 
+     default:;
+     }
+ 
+   return val;
+ }
+ 
+ /* Return the propagation value when applying the operation CODE to
+    the propagation values R1VAL and R2VAL yielding type TYPE.  */
+ 
+ static bit_prop_value_t
+ bit_prop_value_binop (enum tree_code code, tree type,
+ 		      bit_prop_value_t r1val, bit_prop_value_t r2val)
+ {
+   bit_prop_value_t val = { VARYING, { -1, -1 }, { 0, 0 } };
+   switch (code)
+     {
+     case BIT_AND_EXPR:
+       if (r1val.lattice_val == CONSTANT
+ 	  || r2val.lattice_val == CONSTANT)
+ 	{
+ 	  /* The lattice is constant where there is a known not
+ 	     set bit, (m1 | m2) & ((v1 | m1) & (v2 | m2)) */
+ 	  val.bit_lattice_val
+ 	    = double_int_and
+ 	        (double_int_ior (r1val.bit_lattice_val,
+ 				 r2val.bit_lattice_val),
+ 		 double_int_and (double_int_ior (r1val.value,
+ 						 r1val.bit_lattice_val),
+ 				 double_int_ior (r2val.value,
+ 						 r2val.bit_lattice_val)));
+ 	  if (!double_int_minus_one_p (val.bit_lattice_val))
+ 	    val.lattice_val = CONSTANT;
+ 	  val.value = double_int_and (r1val.value, r2val.value);
+ 	}
+       break;
+ 
+     case BIT_IOR_EXPR:
+       if (r1val.lattice_val == CONSTANT
+ 	  || r2val.lattice_val == CONSTANT)
+ 	{
+ 	  /* The lattice is constant where there is a known
+ 	     set bit, (m1 | m2) & ~((v1 & ~m1) | (v2 & ~m2)).  */
+ 	  val.bit_lattice_val
+ 	    = double_int_and_not
+ 	        (double_int_ior (r1val.bit_lattice_val,
+ 				 r2val.bit_lattice_val),
+ 		 double_int_ior
+ 		   (double_int_and_not (r1val.value,
+ 					r1val.bit_lattice_val),
+ 		    double_int_and_not (r2val.value,
+ 					r2val.bit_lattice_val)));
+ 	  if (!double_int_minus_one_p (val.bit_lattice_val))
+ 	    val.lattice_val = CONSTANT;
+ 	  val.value = double_int_ior (r1val.value, r2val.value);
+ 	}
+       break;
+ 
+     case BIT_XOR_EXPR:
+       if (r1val.lattice_val == CONSTANT
+ 	  && r2val.lattice_val == CONSTANT)
+ 	{
+ 	  /* m1 | m2  */
+ 	  val.bit_lattice_val = double_int_ior (r1val.bit_lattice_val,
+ 						r2val.bit_lattice_val);
+ 	  if (!double_int_minus_one_p (val.bit_lattice_val))
+ 	    val.lattice_val = CONSTANT;
+ 	  val.value = double_int_xor (r1val.value, r2val.value);
+ 	}
+       break;
+ 
+     case LROTATE_EXPR:
+     case RROTATE_EXPR:
+       if (r1val.lattice_val == CONSTANT
+ 	  && r2val.lattice_val == CONSTANT
+ 	  && double_int_zero_p (r2val.bit_lattice_val))
+ 	{
+ 	  HOST_WIDE_INT shift = r2val.value.low;
+ 	  if (code == RROTATE_EXPR)
+ 	    shift = -shift;
+ 	  val.lattice_val = CONSTANT;
+ 	  val.bit_lattice_val
+ 	    = double_int_lrotate (r1val.bit_lattice_val,
+ 				  shift, TYPE_PRECISION (type));
+ 	  val.value = double_int_lrotate (r1val.value, shift,
+ 					  TYPE_PRECISION (type));
+ 	}
+       break;
+ 
+     case LSHIFT_EXPR:
+     case RSHIFT_EXPR:
+       if (r2val.lattice_val == CONSTANT
+ 	  && double_int_zero_p (r2val.bit_lattice_val))
+ 	{
+ 	  HOST_WIDE_INT shift = r2val.value.low;
+ 	  if (code == RSHIFT_EXPR)
+ 	    shift = -shift;
+ 	  if (shift > 0)
+ 	    {
+ 	      val.lattice_val = CONSTANT;
+ 	      val.bit_lattice_val
+ 		= double_int_lshift (r1val.bit_lattice_val,
+ 				     shift, TYPE_PRECISION (type), false);
+ 	      val.value = double_int_lshift (r1val.value, shift,
+ 					     TYPE_PRECISION (type), false);
+ 	    }
+ 	  else if (shift < 0)
+ 	    {
+ 	      val.lattice_val = CONSTANT;
+ 	      val.bit_lattice_val
+ 		= double_int_rshift (r1val.bit_lattice_val,
+ 				     -shift, TYPE_PRECISION (type), true);
+ 	      val.value = double_int_rshift (r1val.value, shift,
+ 					     TYPE_PRECISION (type), true);
+ 	    }
+ 	  else
+ 	    val = r1val;
+ 	}
+       break;
+ 
+     case PLUS_EXPR:
+     case POINTER_PLUS_EXPR:
+       if (r1val.lattice_val == CONSTANT
+ 	  && r2val.lattice_val == CONSTANT)
+ 	{
+ 	  /* Perform a bit-wise addition.  */
+ 	  unsigned i;
+ 	  bool ovf = false;
+ 	  bool vovf = false;
+ 	  for (i = 0; i < HOST_BITS_PER_WIDE_INT; ++i)
+ 	    {
+ 	      unsigned HOST_WIDE_INT mask = (unsigned HOST_WIDE_INT)1 << i;
+ 	      unsigned var1 = r1val.bit_lattice_val.low & mask;
+ 	      unsigned var2 = r2val.bit_lattice_val.low & mask;
+ 	      unsigned bit1 = r1val.value.low & mask;
+ 	      unsigned bit2 = r2val.value.low & mask;
+ 	      if (var1 || var2 || vovf)
+ 		val.bit_lattice_val.low = val.bit_lattice_val.low | mask;
+ 	      else if ((bit1 + bit2 + ovf) & 1)
+ 		val.value.low = val.value.low | mask;
+ 	      else
+ 		val.value.low = val.value.low & ~mask;
+ 	      ovf = (bit1 + bit2
+ 		     + (ovf ? 1 : 0)) > 1;
+ 	      vovf = ((var1 ? var1 : bit1) + (var2 ? var2 : bit2)
+ 		      + (vovf ? 1 : 0)) > 1;
+ 	    }
+ 	  for (i = 0; i < HOST_BITS_PER_WIDE_INT; ++i)
+ 	    {
+ 	      unsigned HOST_WIDE_INT mask = (unsigned HOST_WIDE_INT)1 << i;
+ 	      unsigned var1 = r1val.bit_lattice_val.high & mask;
+ 	      unsigned var2 = r2val.bit_lattice_val.high & mask;
+ 	      unsigned bit1 = r1val.value.high & mask;
+ 	      unsigned bit2 = r2val.value.high & mask;
+ 	      if (var1 || var2 || vovf)
+ 		val.bit_lattice_val.high = val.bit_lattice_val.high | mask;
+ 	      else if ((bit1 + bit2 + ovf) & 1)
+ 		val.value.high = val.value.high | mask;
+ 	      else
+ 		val.value.high = val.value.high & ~mask;
+ 	      ovf = (bit1 + bit2
+ 		     + (ovf ? 1 : 0)) > 1;
+ 	      vovf = ((var1 ? var1 : bit1) + (var2 ? var2 : bit2)
+ 		      + (vovf ? 1 : 0)) > 1;
+ 	    }
+ 	  if (double_int_minus_one_p (val.bit_lattice_val))
+ 	    val.lattice_val = VARYING;
+ 	}
+       break;
+ 
+     case MINUS_EXPR:
+       return bit_prop_value_binop (PLUS_EXPR, type, r1val,
+ 				   bit_prop_value_unop (NEGATE_EXPR,
+ 							type, r2val));
+ 
+     case MULT_EXPR:
+       if (r1val.lattice_val == CONSTANT
+ 	  || r2val.lattice_val == CONSTANT)
+ 	{
+ 	  /* Just track trailing zeros in both operands and transfer
+ 	     them to the other.  */
+ 	  unsigned r1ctz = 0, r2ctz = 0;
+ 	  while (r1ctz < HOST_BITS_PER_WIDE_INT
+ 		 && !(r1val.bit_lattice_val.low & (1 << r1ctz))
+ 		 && !(r1val.value.low & (1 << r1ctz)))
+ 	    r1ctz++;
+ 	  while (r2ctz < HOST_BITS_PER_WIDE_INT
+ 		 && !(r2val.bit_lattice_val.low & (1 << r2ctz))
+ 		 && !(r2val.value.low & (1 << r2ctz)))
+ 	    r2ctz++;
+ 	  if (r1ctz + r2ctz > 0)
+ 	    {
+ 	      val.lattice_val = CONSTANT;
+ 	      val.bit_lattice_val
+ 		= double_int_not (double_int_mask (r1ctz + r2ctz));
+ 	      val.value = double_int_zero;
+ 	    }
+ 	}
+       break;
+ 
+     default:;
+     }
+ 
+   return val;
+ }
+ 
+ /* Convert the propagation value RVAL from its original type RVAL_TYPE
+    to TYPE and return the resulting propagation value.  */
+ 
+ static bit_prop_value_t
+ bit_prop_value_convert (tree type, bit_prop_value_t rval, tree rval_type)
+ {
+   bit_prop_value_t val;
+   bool uns;
+ 
+   /* If the lattice value isn't constant just return it unchanged.  */
+   if (!rval.lattice_val == CONSTANT)
+     return rval;
+ 
+   val.lattice_val = CONSTANT;
+ 
+   /* First extend lattice and value according to the original type.  */
+   uns = (TREE_CODE (rval_type) == INTEGER_TYPE && TYPE_IS_SIZETYPE (rval_type)
+ 	 ? 0 : TYPE_UNSIGNED (rval_type));
+   val.bit_lattice_val = double_int_ext (rval.bit_lattice_val,
+ 					TYPE_PRECISION (rval_type), uns);
+   val.value = double_int_ext (rval.value,
+ 			      TYPE_PRECISION (rval_type), uns);
+ 
+   /* Then extend lattice and value according to the target type.  */
+   uns = (TREE_CODE (type) == INTEGER_TYPE && TYPE_IS_SIZETYPE (type)
+ 	 ? 0 : TYPE_UNSIGNED (type));
+   val.bit_lattice_val = double_int_ext (val.bit_lattice_val,
+ 					TYPE_PRECISION (type), uns);
+   val.value = double_int_ext (val.value,
+ 			      TYPE_PRECISION (type), uns);
+ 
+   return val;
+ }
+ 
+ 
+ /* Evaluate statement STMT.
+    Valid only for assignments, calls, conditionals, and switches. */
+ 
+ static bit_prop_value_t
+ evaluate_stmt (gimple stmt)
+ {
+   bit_prop_value_t val;
+   tree simplified = NULL_TREE;
+   ccp_lattice_t likelyvalue = likely_value (stmt);
+   bool is_constant;
+ 
+   fold_defer_overflow_warnings ();
+ 
+   /* If the statement is likely to have a CONSTANT result, then try
+      to fold the statement to determine the constant value.  */
+   /* FIXME.  This is the only place that we call ccp_fold.
+      Since likely_value never returns CONSTANT for calls, we will
+      not attempt to fold them, including builtins that may profit.  */
+   if (likelyvalue == CONSTANT)
+     simplified = ccp_fold (stmt);
+   /* If the statement is likely to have a VARYING result, then do not
+      bother folding the statement.  */
+   else if (likelyvalue == VARYING)
+     {
+       enum gimple_code code = gimple_code (stmt);
+       if (code == GIMPLE_ASSIGN)
+         {
+           enum tree_code subcode = gimple_assign_rhs_code (stmt);
+ 
+           /* Other cases cannot satisfy is_gimple_min_invariant
+              without folding.  */
+           if (get_gimple_rhs_class (subcode) == GIMPLE_SINGLE_RHS)
+             simplified = gimple_assign_rhs1 (stmt);
+         }
+       else if (code == GIMPLE_SWITCH)
+         simplified = gimple_switch_index (stmt);
+       else
+ 	/* These cannot satisfy is_gimple_min_invariant without folding.  */
+ 	gcc_assert (code == GIMPLE_CALL || code == GIMPLE_COND);
+     }
+ 
+   is_constant = simplified && is_gimple_min_invariant (simplified);
+ 
+   fold_undefer_overflow_warnings (is_constant, stmt, 0);
+ 
+   if (dump_file && (dump_flags & TDF_DETAILS))
+     {
+       fprintf (dump_file, "which is likely ");
+       switch (likelyvalue)
+ 	{
+ 	case CONSTANT:
+ 	  fprintf (dump_file, "CONSTANT");
+ 	  break;
+ 	case UNDEFINED:
+ 	  fprintf (dump_file, "UNDEFINED");
+ 	  break;
+ 	case VARYING:
+ 	  fprintf (dump_file, "VARYING");
+ 	  break;
+ 	default:;
+ 	}
+       fprintf (dump_file, "\n");
+     }
+ 
+   if (is_constant
+       && TREE_CODE (simplified) == INTEGER_CST)
+     {
+       /* The statement produced a constant value.  */
+       val.lattice_val = CONSTANT;
+       val.bit_lattice_val = double_int_zero;
+       val.value = tree_to_double_int (simplified);
+       return val;
+     }
+ 
+   val.lattice_val = VARYING;
+   if (likelyvalue == CONSTANT
+       && gimple_code (stmt) == GIMPLE_ASSIGN)
+     {
+       enum tree_code subcode = gimple_assign_rhs_code (stmt);
+       tree rhs1 = gimple_assign_rhs1 (stmt);
+       switch (get_gimple_rhs_class (subcode))
+ 	{
+ 	case GIMPLE_SINGLE_RHS:
+ 	case GIMPLE_UNARY_RHS:
+ 	  if (CONVERT_EXPR_CODE_P (subcode)
+ 	      && (INTEGRAL_TYPE_P (gimple_expr_type (stmt))
+ 		  || POINTER_TYPE_P (gimple_expr_type (stmt)))
+ 	      && (INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
+ 		  || POINTER_TYPE_P (TREE_TYPE (rhs1))))
+ 	    val = bit_prop_value_convert (gimple_expr_type (stmt),
+ 					  get_value_for_expr (rhs1),
+ 					  TREE_TYPE (rhs1));
+ 	  else if (subcode == ADDR_EXPR)
+ 	    val = get_value_for_expr (gimple_assign_rhs1 (stmt));
+ 	  else if ((INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
+ 		    || POINTER_TYPE_P (TREE_TYPE (rhs1)))
+ 		   && (INTEGRAL_TYPE_P (gimple_expr_type (stmt))
+ 		       || POINTER_TYPE_P (gimple_expr_type (stmt))))
+ 	    {
+ 	      val = bit_prop_value_unop (subcode, gimple_expr_type (stmt),
+ 					 get_value_for_expr (rhs1));
+ 	    }
+ 	  break;
+ 
+ 	case GIMPLE_BINARY_RHS:
+ 	  if (INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
+ 	      || POINTER_TYPE_P (TREE_TYPE (rhs1)))
+ 	    {
+ 	      tree rhs2 = gimple_assign_rhs2 (stmt);
+ 	      val = bit_prop_value_binop (subcode, TREE_TYPE (rhs1),
+ 					  get_value_for_expr (rhs1),
+ 					  get_value_for_expr (rhs2));
+ 	    }
+ 	  break;
+ 
+ 	default:;
+ 	}
+     }
+   else if (likelyvalue == CONSTANT
+ 	   && gimple_code (stmt) == GIMPLE_COND)
+     {
+       enum tree_code code = gimple_cond_code (stmt);
+       tree rhs1 = gimple_cond_lhs (stmt);
+       tree rhs2 = gimple_cond_rhs (stmt);
+       bit_prop_value_t r1val = get_value_for_expr (rhs1);
+       bit_prop_value_t r2val = get_value_for_expr (rhs2);
+       tree type = TREE_TYPE (rhs1);
+       if (r1val.lattice_val == CONSTANT
+ 	  && r2val.lattice_val == CONSTANT)
+ 	{
+ 	  double_int mask = double_int_ior (r1val.bit_lattice_val,
+ 					    r2val.bit_lattice_val);
+ 	  bool uns = (TREE_CODE (type) == INTEGER_TYPE
+ 		      && TYPE_IS_SIZETYPE (type) ? 0 : TYPE_UNSIGNED (type));
+ 	  int minmax, maxmin;
+ 	  switch (code)
+ 	    {
+ 	    case EQ_EXPR:
+ 	    case NE_EXPR:
+ 	      if (!double_int_equal_p (double_int_and_not (r1val.value, mask),
+ 				       double_int_and_not (r2val.value, mask)))
+ 		{
+ 		  val.lattice_val = CONSTANT;
+ 		  val.bit_lattice_val = double_int_zero;
+ 		  val.value = ((code == EQ_EXPR)
+ 			       ? double_int_zero : double_int_one);
+ 		}
+ 	      break;
+ 
+ 	    case GE_EXPR:
+ 	    case GT_EXPR:
+ 	      {
+ 		bit_prop_value_t tem = r1val;
+ 		r1val = r2val;
+ 		r2val = tem;
+ 		code = swap_tree_comparison (code);
+ 	      }
+ 
+ 	      /* Fallthru.  */
+ 	    case LE_EXPR:
+ 	    case LT_EXPR:
+ 	      /* If the most significant bits are not known we know nothing.  */
+ 	      if (double_int_negative_p (r1val.bit_lattice_val)
+ 		  || double_int_negative_p (r2val.bit_lattice_val))
+ 		break;
+ 
+ 	      /* If we know the most significant bits we know the values
+ 	         value ranges by means of treating varying bits as zero
+ 		 or one.  Do a cross comparison of the max/min pairs.  */
+ 	      maxmin = double_int_cmp
+ 			 (double_int_ior (r1val.value,
+ 					  r1val.bit_lattice_val),
+ 			  double_int_and_not (r2val.value,
+ 					      r2val.bit_lattice_val), uns);
+ 	      minmax = double_int_cmp
+ 			 (double_int_and_not (r1val.value,
+ 					      r1val.bit_lattice_val),
+ 			  double_int_ior (r2val.value,
+ 					  r2val.bit_lattice_val), uns);
+ 	      if (maxmin < 0)  /* r1 is less than r2.  */
+ 		{
+ 		  val.lattice_val = CONSTANT;
+ 		  val.bit_lattice_val = double_int_zero;
+ 		  val.value = double_int_one;
+ 		}
+ 	      else if (minmax > 0)  /* r1 is not less or equal to r2.  */
+ 		{
+ 		  val.lattice_val = CONSTANT;
+ 		  val.bit_lattice_val = double_int_zero;
+ 		  val.value = double_int_zero;
+ 		}
+ 	      else if (maxmin == minmax)  /* r1 and r2 are equal.  */
+ 		{
+ 		  /* This probably should never happen as we'd have
+ 		     folded the thing during fully constant value folding.  */
+ 		  val.lattice_val = CONSTANT;
+ 		  val.bit_lattice_val = double_int_zero;
+ 		  val.value = (code == LE_EXPR
+ 			       ? double_int_one :  double_int_zero);
+ 		}
+ 	      break;
+ 
+ 	    default:;
+ 	    }
+ 	}
+     }
+   if (val.lattice_val != VARYING)
+     return val;
+ 
+   /* The statement produced a nonconstant value.  If the statement
+      had UNDEFINED operands, then the result of the statement
+      should be UNDEFINED.  Otherwise, the statement is VARYING.  */
+   if (likelyvalue == UNDEFINED)
+     {
+       val.lattice_val = UNDEFINED;
+       val.bit_lattice_val = double_int_zero;
+     }
+   else
+     {
+       val.lattice_val = VARYING;
+       val.bit_lattice_val = double_int_minus_one;
+     }
+ 
+   return val;
+ }
+ 
+ /* Fold the stmt at *GSI with CCP specific information that propagating
+    and regular folding does not catch.  */
+ 
+ static bool
+ bit_ccp_fold_stmt (gimple_stmt_iterator *gsi)
+ {
+   gimple stmt = gsi_stmt (*gsi);
+ 
+   switch (gimple_code (stmt))
+     {
+     case GIMPLE_COND:
+       {
+ 	bit_prop_value_t val;
+ 	/* Statement evaluation will handle type mismatches in constants
+ 	   more gracefully than the final propagation.  This allows us to
+ 	   fold more conditionals here.  */
+ 	val = evaluate_stmt (stmt);
+ 	if (val.lattice_val != CONSTANT
+ 	    || !double_int_zero_p (val.bit_lattice_val))
+ 	  return false;
+ 
+ 	if (double_int_zero_p (val.value))
+ 	  gimple_cond_make_false (stmt);
+ 	else
+ 	  gimple_cond_make_true (stmt);
+ 
+ 	return true;
+       }
+ 
+     case GIMPLE_CALL:
+       {
+ 	tree lhs = gimple_call_lhs (stmt);
+ 	bit_prop_value_t *val;
+ 	tree argt;
+ 	bool changed = false;
+ 	unsigned i;
+ 
+ 	/* If the call was folded into a constant make sure it goes
+ 	   away even if we cannot propagate into all uses because of
+ 	   type issues.  */
+ 	if (lhs
+ 	    && TREE_CODE (lhs) == SSA_NAME
+ 	    && (val = get_value (lhs))
+ 	    && val->lattice_val == CONSTANT
+ 	    && double_int_zero_p (val->bit_lattice_val))
+ 	  {
+ 	    tree new_rhs = double_int_to_tree (TREE_TYPE (lhs), val->value);
+ 	    bool res = update_call_from_tree (gsi, new_rhs);
+ 	    gcc_assert (res);
+ 	    return true;
+ 	  }
+ 
+ 	/* Propagate into the call arguments.  Compared to replace_uses_in
+ 	   this can use the argument slot types for type verification
+ 	   instead of the current argument type.  We also can safely
+ 	   drop qualifiers here as we are dealing with constants anyway.  */
+ 	argt = TYPE_ARG_TYPES (TREE_TYPE (TREE_TYPE (gimple_call_fn (stmt))));
+ 	for (i = 0; i < gimple_call_num_args (stmt) && argt;
+ 	     ++i, argt = TREE_CHAIN (argt))
+ 	  {
+ 	    tree arg = gimple_call_arg (stmt, i);
+ 	    if (TREE_CODE (arg) == SSA_NAME
+ 		&& (val = get_value (arg))
+ 		&& val->lattice_val == CONSTANT
+ 		&& double_int_zero_p (val->bit_lattice_val)
+ 		&& (INTEGRAL_TYPE_P (TREE_VALUE (argt))
+ 		    || POINTER_TYPE_P (TREE_VALUE (argt))))
+ 	      {
+ 		arg = double_int_to_tree (TREE_VALUE (argt), val->value);
+ 		gimple_call_set_arg (stmt, i, arg);
+ 		changed = true;
+ 	      }
+ 	  }
+ 
+ 	return changed;
+       }
+ 
+     case GIMPLE_ASSIGN:
+       {
+ 	tree lhs = gimple_assign_lhs (stmt);
+ 	bit_prop_value_t *val;
+ 
+ 	/* If we have a load that turned out to be constant replace it
+ 	   as we cannot propagate into all uses in all cases.  */
+ 	if (gimple_assign_single_p (stmt)
+ 	    && TREE_CODE (lhs) == SSA_NAME
+ 	    && (val = get_value (lhs))
+ 	    && val->lattice_val == CONSTANT
+ 	    && double_int_zero_p (val->bit_lattice_val)
+ 	    && !is_gimple_min_invariant (gimple_assign_rhs1 (stmt)))
+ 	  {
+ 	    tree rhs = double_int_to_tree (TREE_TYPE (lhs), val->value);
+ 	    gimple_assign_set_rhs_from_tree (gsi, rhs);
+ 	    return true;
+ 	  }
+ 
+ 	return false;
+       }
+ 
+     default:
+       return false;
+     }
+ }
+ 
+ /* Visit the assignment statement STMT.  Set the value of its LHS to the
+    value computed by the RHS and store LHS in *OUTPUT_P.  If STMT
+    creates virtual definitions, set the value of each new name to that
+    of the RHS (if we can derive a constant out of the RHS).
+    Value-returning call statements also perform an assignment, and
+    are handled here.  */
+ 
+ static enum ssa_prop_result
+ visit_assignment (gimple stmt, tree *output_p)
+ {
+   bit_prop_value_t val;
+   enum ssa_prop_result retval;
+ 
+   tree lhs = gimple_get_lhs (stmt);
+ 
+   gcc_assert (gimple_code (stmt) != GIMPLE_CALL
+               || gimple_call_lhs (stmt) != NULL_TREE);
+ 
+   if (gimple_assign_copy_p (stmt))
+     {
+       tree rhs = gimple_assign_rhs1 (stmt);
+ 
+       if  (TREE_CODE (rhs) == SSA_NAME)
+         {
+           /* For a simple copy operation, we copy the lattice values.  */
+           bit_prop_value_t *nval = get_value (rhs);
+           val = *nval;
+         }
+       else
+         val = evaluate_stmt (stmt);
+     }
+   else
+     /* Evaluate the statement, which could be
+        either a GIMPLE_ASSIGN or a GIMPLE_CALL.  */
+     val = evaluate_stmt (stmt);
+ 
+   retval = SSA_PROP_NOT_INTERESTING;
+ 
+   /* Set the lattice value of the statement's output.  */
+   if (TREE_CODE (lhs) == SSA_NAME)
+     {
+       /* If STMT is an assignment to an SSA_NAME, we only have one
+ 	 value to set.  */
+       if (set_lattice_value (lhs, val))
+ 	{
+ 	  *output_p = lhs;
+ 	  if (val.lattice_val == VARYING)
+ 	    retval = SSA_PROP_VARYING;
+ 	  else
+ 	    retval = SSA_PROP_INTERESTING;
+ 	}
+     }
+ 
+   return retval;
+ }
+ 
+ 
+ /* Visit the conditional statement STMT.  Return SSA_PROP_INTERESTING
+    if it can determine which edge will be taken.  Otherwise, return
+    SSA_PROP_VARYING.  */
+ 
+ static enum ssa_prop_result
+ visit_cond_stmt (gimple stmt, edge *taken_edge_p)
+ {
+   bit_prop_value_t val;
+   basic_block block;
+   tree tree_val;
+ 
+   block = gimple_bb (stmt);
+   val = evaluate_stmt (stmt);
+ 
+   /* If the value is not fully constant we cannot determine the
+      edge statically.  */
+   if (val.lattice_val != CONSTANT
+       || !double_int_zero_p (val.bit_lattice_val))
+     return SSA_PROP_VARYING;
+ 
+   if (gimple_code (stmt) == GIMPLE_SWITCH)
+     tree_val = double_int_to_tree (TREE_TYPE (gimple_switch_index (stmt)),
+ 				   val.value);
+   else if (gimple_code (stmt) == GIMPLE_COND)
+     tree_val = (double_int_zero_p (val.value)
+ 		? boolean_false_node : boolean_true_node);
+   else
+     gcc_unreachable ();
+ 
+   /* Find which edge out of the conditional block will be taken and add it
+      to the worklist.  If no single edge can be determined statically,
+      return SSA_PROP_VARYING to feed all the outgoing edges to the
+      propagation engine.  */
+   *taken_edge_p = find_taken_edge (block, tree_val);
+   if (*taken_edge_p)
+     return SSA_PROP_INTERESTING;
+   else
+     return SSA_PROP_VARYING;
+ }
+ 
+ /* Initialize local data structures for CCP.  */
+ 
+ static void
+ bit_ccp_initialize (void)
+ {
+   basic_block bb;
+ 
+   const_val = XCNEWVEC (bit_prop_value_t, num_ssa_names);
+ 
+   /* Initialize simulation flags for PHI nodes and statements.  */
+   FOR_EACH_BB (bb)
+     {
+       gimple_stmt_iterator i;
+ 
+       for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+         {
+ 	  gimple stmt = gsi_stmt (i);
+ 	  bool is_varying;
+ 
+ 	  /* If the statement is a control insn, then we do not
+ 	     want to avoid simulating the statement once.  Failure
+ 	     to do so means that those edges will never get added.  */
+ 	  if (stmt_ends_bb_p (stmt))
+ 	    is_varying = false;
+ 	  else
+ 	    is_varying = surely_varying_stmt_p (stmt);
+ 
+ 	  if (is_varying)
+ 	    {
+ 	      tree def;
+ 	      ssa_op_iter iter;
+ 
+ 	      /* If the statement will not produce a constant, mark
+ 		 all its outputs VARYING.  */
+ 	      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_ALL_DEFS)
+ 		set_value_varying (def);
+ 	    }
+           prop_set_simulate_again (stmt, !is_varying);
+ 	}
+     }
+ 
+   /* Now process PHI nodes.  We never clear the simulate_again flag on
+      phi nodes, since we do not know which edges are executable yet,
+      except for phi nodes for virtual operands when we do not do store ccp.  */
+   FOR_EACH_BB (bb)
+     {
+       gimple_stmt_iterator i;
+ 
+       for (i = gsi_start_phis (bb); !gsi_end_p (i); gsi_next (&i))
+         {
+           gimple phi = gsi_stmt (i);
+ 
+ 	  if (!is_gimple_reg (gimple_phi_result (phi)))
+             prop_set_simulate_again (phi, false);
+ 	  else
+             prop_set_simulate_again (phi, true);
+ 	}
+     }
+ }
+ 
+ /* Do final substitution of propagated values, cleanup the flowgraph and
+    free allocated storage.
+ 
+    Return TRUE when something was optimized.  */
+ 
+ static bool
+ bit_ccp_finalize (void)
+ {
+   bool something_changed;
+   unsigned i;
+ 
+   /* Perform substitutions based on the known constant values.  */
+   /* ???  Build a constant lattice?  */
+   something_changed = substitute_and_fold (NULL, bit_ccp_fold_stmt, true);
+ 
+   for (i = 1; i < num_ssa_names; ++i)
+     {
+       tree name = ssa_name (i);
+       bit_prop_value_t *val;
+       struct ptr_info_def *pi;
+       unsigned int tem;
+ 
+       if (!name
+ 	  || !POINTER_TYPE_P (TREE_TYPE (name)))
+ 	continue;
+ 
+       val = get_value (name);
+       if (val->lattice_val != CONSTANT)
+ 	continue;
+ 
+       tem = 1;
+       while (tem < (1u << (sizeof (unsigned int) * 8 - 1))
+ 	     && !(val->bit_lattice_val.low & tem))
+ 	tem <<= 1;
+       if (tem == 1)
+ 	continue;
+ 
+       /* Trailing constant bits specify the alignment, trailing value
+ 	 bits the misalignment.  */
+       pi = get_ptr_info (name);
+       pi->align = tem;
+       pi->misalign = val->value.low & (tem - 1);
+     }
+ 
+   free (const_val);
+   const_val = NULL;
+   return something_changed;;
+ }
+ 
+ /* Loop through the PHI_NODE's parameters for BLOCK and compare their
+    lattice values to determine PHI_NODE's lattice value.  The value of a
+    PHI node is determined calling ccp_lattice_meet with all the arguments
+    of the PHI node that are incoming via executable edges.  */
+ 
+ static enum ssa_prop_result
+ bit_ccp_visit_phi_node (gimple phi)
+ {
+   unsigned i;
+   bit_prop_value_t *old_val, new_val;
+ 
+   if (dump_file && (dump_flags & TDF_DETAILS))
+     {
+       fprintf (dump_file, "\nVisiting PHI node: ");
+       print_gimple_stmt (dump_file, phi, 0, dump_flags);
+     }
+ 
+   old_val = get_value (gimple_phi_result (phi));
+   switch (old_val->lattice_val)
+     {
+     case VARYING:
+       return SSA_PROP_VARYING;
+ 
+     case CONSTANT:
+       new_val = *old_val;
+       break;
+ 
+     case UNDEFINED:
+       new_val.lattice_val = UNDEFINED;
+       new_val.bit_lattice_val = double_int_zero;
+       break;
+ 
+     default:
+       gcc_unreachable ();
+     }
+ 
+   for (i = 0; i < gimple_phi_num_args (phi); i++)
+     {
+       /* Compute the meet operator over all the PHI arguments flowing
+ 	 through executable edges.  */
+       edge e = gimple_phi_arg_edge (phi, i);
+ 
+       if (dump_file && (dump_flags & TDF_DETAILS))
+ 	{
+ 	  fprintf (dump_file,
+ 	      "\n    Argument #%d (%d -> %d %sexecutable)\n",
+ 	      i, e->src->index, e->dest->index,
+ 	      (e->flags & EDGE_EXECUTABLE) ? "" : "not ");
+ 	}
+ 
+       /* If the incoming edge is executable, Compute the meet operator for
+ 	 the existing value of the PHI node and the current PHI argument.  */
+       if (e->flags & EDGE_EXECUTABLE)
+ 	{
+ 	  tree arg = gimple_phi_arg (phi, i)->def;
+ 	  bit_prop_value_t arg_val = get_value_for_expr (arg);
+ 	  bit_ccp_lattice_meet (&new_val, &arg_val);
+ 
+ 	  if (dump_file && (dump_flags & TDF_DETAILS))
+ 	    {
+ 	      fprintf (dump_file, "\t");
+ 	      print_generic_expr (dump_file, arg, dump_flags);
+ 	      dump_lattice_value (dump_file, "\tValue: ", arg_val);
+ 	      fprintf (dump_file, "\n");
+ 	    }
+ 
+ 	  if (new_val.lattice_val == VARYING)
+ 	    break;
+ 	}
+     }
+ 
+   if (dump_file && (dump_flags & TDF_DETAILS))
+     {
+       dump_lattice_value (dump_file, "\n    PHI node value: ", new_val);
+       fprintf (dump_file, "\n\n");
+     }
+ 
+   /* Make the transition to the new value.  */
+   if (set_lattice_value (gimple_phi_result (phi), new_val))
+     {
+       if (new_val.lattice_val == VARYING)
+ 	return SSA_PROP_VARYING;
+       else
+ 	return SSA_PROP_INTERESTING;
+     }
+   else
+     return SSA_PROP_NOT_INTERESTING;
+ }
+ 
+ /* Evaluate statement STMT.  If the statement produces an output value and
+    its evaluation changes the lattice value of its output, return
+    SSA_PROP_INTERESTING and set *OUTPUT_P to the SSA_NAME holding the
+    output value.
+ 
+    If STMT is a conditional branch and we can determine its truth
+    value, set *TAKEN_EDGE_P accordingly.  If STMT produces a varying
+    value, return SSA_PROP_VARYING.  */
+ 
+ static enum ssa_prop_result
+ bit_ccp_visit_stmt (gimple stmt, edge *taken_edge_p, tree *output_p)
+ {
+   tree def;
+   ssa_op_iter iter;
+ 
+   if (dump_file && (dump_flags & TDF_DETAILS))
+     {
+       fprintf (dump_file, "\nVisiting statement:\n");
+       print_gimple_stmt (dump_file, stmt, 0, dump_flags);
+     }
+ 
+   switch (gimple_code (stmt))
+     {
+       case GIMPLE_ASSIGN:
+         /* If the statement is an assignment that produces a single
+            output value, evaluate its RHS to see if the lattice value of
+            its output has changed.  */
+         return visit_assignment (stmt, output_p);
+ 
+       case GIMPLE_CALL:
+         /* A value-returning call also performs an assignment.  */
+         if (gimple_call_lhs (stmt) != NULL_TREE)
+           return visit_assignment (stmt, output_p);
+         break;
+ 
+       case GIMPLE_COND:
+       case GIMPLE_SWITCH:
+         /* If STMT is a conditional branch, see if we can determine
+            which branch will be taken.   */
+         /* FIXME.  It appears that we should be able to optimize
+            computed GOTOs here as well.  */
+         return visit_cond_stmt (stmt, taken_edge_p);
+ 
+       default:
+         break;
+     }
+ 
+   /* Any other kind of statement is not interesting for constant
+      propagation and, therefore, not worth simulating.  */
+   if (dump_file && (dump_flags & TDF_DETAILS))
+     fprintf (dump_file, "No interesting values produced.  Marked VARYING.\n");
+ 
+   /* Definitions made by statements other than assignments to
+      SSA_NAMEs represent unknown modifications to their outputs.
+      Mark them VARYING.  */
+   FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_ALL_DEFS)
+     {
+       bit_prop_value_t v = { VARYING, { -1, -1 }, { 0, 0 } };
+       set_lattice_value (def, v);
+     }
+ 
+   return SSA_PROP_VARYING;
+ }
+ 
+ 
+ /* Main entry point for SSA Conditional Constant Propagation.  */
+ 
+ static unsigned int
+ do_ssa_bit_ccp (void)
+ {
+   bit_ccp_initialize ();
+   ssa_propagate (bit_ccp_visit_stmt, bit_ccp_visit_phi_node);
+   if (bit_ccp_finalize ())
+     return (TODO_cleanup_cfg | TODO_update_ssa | TODO_remove_unused_locals);
+   return 0;
+ }
+ 
+ 
+ static bool
+ gate_bit_ccp (void)
+ {
+   return flag_tree_bit_ccp != 0;
+ }
+ 
+ 
+ struct gimple_opt_pass pass_bit_ccp =
+ {
+  {
+   GIMPLE_PASS,
+   "bit_ccp",				/* name */
+   gate_bit_ccp,				/* gate */
+   do_ssa_bit_ccp,			/* execute */
+   NULL,					/* sub */
+   NULL,					/* next */
+   0,					/* static_pass_number */
+   TV_TREE_BIT_CCP,			/* tv_id */
+   PROP_cfg | PROP_ssa,			/* properties_required */
+   0,					/* properties_provided */
+   0,					/* properties_destroyed */
+   0,					/* todo_flags_start */
+   TODO_dump_func | TODO_verify_ssa
+   | TODO_verify_stmts | TODO_ggc_collect/* todo_flags_finish */
+  }
+ };
Index: gcc/timevar.def
===================================================================
*** gcc/timevar.def.orig	2010-07-29 15:48:14.000000000 +0200
--- gcc/timevar.def	2010-07-29 15:50:40.000000000 +0200
*************** DEFTIMEVAR (TV_TREE_OPS	             , "
*** 122,127 ****
--- 122,128 ----
  DEFTIMEVAR (TV_TREE_SSA_DOMINATOR_OPTS   , "dominator optimization")
  DEFTIMEVAR (TV_TREE_SRA              , "tree SRA")
  DEFTIMEVAR (TV_TREE_CCP		     , "tree CCP")
+ DEFTIMEVAR (TV_TREE_BIT_CCP	     , "tree bit-CCP")
  DEFTIMEVAR (TV_TREE_PHI_CPROP	     , "tree PHI const/copy prop")
  DEFTIMEVAR (TV_TREE_SPLIT_EDGES      , "tree split crit edges")
  DEFTIMEVAR (TV_TREE_REASSOC          , "tree reassociation")
Index: gcc/common.opt
===================================================================
*** gcc/common.opt.orig	2010-07-29 15:44:45.000000000 +0200
--- gcc/common.opt	2010-07-29 15:50:17.000000000 +0200
*************** ftree-ccp
*** 1285,1290 ****
--- 1285,1294 ----
  Common Report Var(flag_tree_ccp) Optimization
  Enable SSA-CCP optimization on trees
  
+ ftree-bit-ccp
+ Common Report Var(flag_tree_bit_ccp) Optimization
+ Enable SSA-BIT-CCP optimization on trees
+ 
  ftree-store-ccp
  Common
  Does nothing.  Preserved for backward compatibility.
Index: gcc/gimple-pretty-print.c
===================================================================
*** gcc/gimple-pretty-print.c.orig	2010-07-29 15:44:45.000000000 +0200
--- gcc/gimple-pretty-print.c	2010-07-29 15:50:17.000000000 +0200
*************** dump_gimple_phi (pretty_printer *buffer,
*** 1363,1370 ****
        && POINTER_TYPE_P (TREE_TYPE (lhs))
        && SSA_NAME_PTR_INFO (lhs))
      {
        pp_string (buffer, "PT = ");
!       pp_points_to_solution (buffer, &SSA_NAME_PTR_INFO (lhs)->pt);
        newline_and_indent (buffer, spc);
        pp_string (buffer, "# ");
      }
--- 1363,1375 ----
        && POINTER_TYPE_P (TREE_TYPE (lhs))
        && SSA_NAME_PTR_INFO (lhs))
      {
+       struct ptr_info_def *pi = SSA_NAME_PTR_INFO (lhs);
        pp_string (buffer, "PT = ");
!       pp_points_to_solution (buffer, &pi->pt);
!       newline_and_indent (buffer, spc);
!       if (pi->align != 1)
! 	pp_printf (buffer, "# ALIGN = %u, MISALIGN = %u",
! 		   pi->align, pi->misalign);
        newline_and_indent (buffer, spc);
        pp_string (buffer, "# ");
      }
*************** dump_gimple_stmt (pretty_printer *buffer
*** 1650,1658 ****
  	  && POINTER_TYPE_P (TREE_TYPE (lhs))
  	  && SSA_NAME_PTR_INFO (lhs))
  	{
  	  pp_string (buffer, "# PT = ");
! 	  pp_points_to_solution (buffer, &SSA_NAME_PTR_INFO (lhs)->pt);
  	  newline_and_indent (buffer, spc);
  	}
      }
  
--- 1655,1670 ----
  	  && POINTER_TYPE_P (TREE_TYPE (lhs))
  	  && SSA_NAME_PTR_INFO (lhs))
  	{
+ 	  struct ptr_info_def *pi = SSA_NAME_PTR_INFO (lhs);
  	  pp_string (buffer, "# PT = ");
! 	  pp_points_to_solution (buffer, &pi->pt);
  	  newline_and_indent (buffer, spc);
+ 	  if (pi->align != 1)
+ 	    {
+ 	      pp_printf (buffer, "# ALIGN = %u, MISALIGN = %u",
+ 			 pi->align, pi->misalign);
+ 	      newline_and_indent (buffer, spc);
+ 	    }
  	}
      }
  
Index: gcc/opts.c
===================================================================
*** gcc/opts.c.orig	2010-07-29 15:44:45.000000000 +0200
--- gcc/opts.c	2010-07-29 15:50:17.000000000 +0200
*************** decode_options (unsigned int argc, const
*** 767,772 ****
--- 767,773 ----
    flag_merge_constants = opt1;
    flag_split_wide_types = opt1;
    flag_tree_ccp = opt1;
+   flag_tree_bit_ccp = opt1;
    flag_tree_dce = opt1;
    flag_tree_dom = opt1;
    flag_tree_dse = opt1;
Index: gcc/doc/invoke.texi
===================================================================
*** gcc/doc/invoke.texi.orig	2010-07-29 12:54:25.000000000 +0200
--- gcc/doc/invoke.texi	2010-07-29 15:54:03.000000000 +0200
*************** Objective-C and Objective-C++ Dialects}.
*** 380,386 ****
  -fsel-sched-pipelining -fsel-sched-pipelining-outer-loops @gol
  -fsignaling-nans -fsingle-precision-constant -fsplit-ivs-in-unroller @gol
  -fsplit-wide-types -fstack-protector -fstack-protector-all @gol
! -fstrict-aliasing -fstrict-overflow -fthread-jumps -ftracer @gol
  -ftree-builtin-call-dce -ftree-ccp -ftree-ch -ftree-copy-prop @gol
  -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
  -ftree-forwprop -ftree-fre -ftree-loop-if-convert -ftree-loop-im @gol
--- 380,386 ----
  -fsel-sched-pipelining -fsel-sched-pipelining-outer-loops @gol
  -fsignaling-nans -fsingle-precision-constant -fsplit-ivs-in-unroller @gol
  -fsplit-wide-types -fstack-protector -fstack-protector-all @gol
! -fstrict-aliasing -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
  -ftree-builtin-call-dce -ftree-ccp -ftree-ch -ftree-copy-prop @gol
  -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
  -ftree-forwprop -ftree-fre -ftree-loop-if-convert -ftree-loop-im @gol
*************** compilation time.
*** 5848,5853 ****
--- 5848,5854 ----
  -fipa-reference @gol
  -fmerge-constants
  -fsplit-wide-types @gol
+ -ftree-bit-ccp @gol
  -ftree-builtin-call-dce @gol
  -ftree-ccp @gol
  -ftree-ch @gol
*************** Transposing is enabled only if profiling
*** 6737,6742 ****
--- 6738,6750 ----
  Perform forward store motion  on trees.  This flag is
  enabled by default at @option{-O} and higher.
  
+ @item -ftree-bit-ccp
+ @opindex ftree-bit-ccp
+ Perform sparse conditional bit constant propagation on trees and propagate
+ pointer alignment information.
+ This pass only operates on local scalar variables and is enabled by default
+ at @option{-O} and higher.
+ 
  @item -ftree-ccp
  @opindex ftree-ccp
  Perform sparse conditional constant propagation (CCP) on trees.  This

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2010-08-04 13:11 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-08-04  7:43 [PATCH][RFC] Bit CCP and pointer alignment propagation Jay Foad
2010-08-04  8:18 ` Richard Guenther
  -- strict thread matches above, loose matches on Subject: below --
2010-08-04 13:05 Jay Foad
2010-08-04 13:11 ` Richard Guenther
2010-07-30 12:06 Richard Guenther
2010-07-30 12:39 ` Paolo Bonzini
2010-07-30 12:54   ` Paolo Bonzini
2010-07-30 13:02   ` Richard Guenther
2010-07-30 13:16     ` Paolo Bonzini
2010-07-30 13:29       ` Richard Guenther
2010-07-30 13:38         ` Paolo Bonzini
2010-07-30 13:45           ` Richard Guenther
2010-07-30 14:24             ` Paolo Bonzini
2010-07-30 14:51               ` Richard Guenther
2010-07-30 16:23         ` Richard Henderson
2010-07-30 16:38           ` Richard Guenther
2010-07-30 16:49           ` Joseph S. Myers
2010-07-30 18:06             ` Paolo Bonzini
2010-07-30 13:36 ` Michael Matz
2010-07-30 13:39   ` Richard Guenther
2010-07-30 14:26     ` Michael Matz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).