public inbox for gcc-patches@gcc.gnu.org
* Reload bug & SRA oddness
@ 2007-04-20  0:29 Bernd Schmidt
  2007-04-20  3:19 ` Andrew Pinski
                   ` (2 more replies)
  0 siblings, 3 replies; 104+ messages in thread
From: Bernd Schmidt @ 2007-04-20  0:29 UTC (permalink / raw)
  To: GCC Patches; +Cc: Diego Novillo

[-- Attachment #1: Type: text/plain, Size: 5023 bytes --]

execute/20040709-2.c is currently failing on the Blackfin at -O3.  This 
is a problem in reload: these days register allocation can assign the 
same hard register to two pseudos even if their lifetimes overlap, 
provided one of them is uninitialized.  If you then have an insn where 
the uninitialized pseudo dies, and reload needs an output reload, it can 
choose the dying register for it.  In doing so, it clobbers the other 
live pseudo that shares the hard register with the uninitialized one.

We've fixed this bug twice before, once in find_dummy_reload and once 
in push_reload.  This adds the same check to combine_reloads, where we can 
hit the same problem.

Bootstrapped and regression tested on i686-linux; committed as 123986.
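A toy model of the check this patch adds may make the logic concrete.  
This is a sketch, not GCC code: the struct, the helper name and the 
register-number cutoff are invented, while the real test in 
combine_reloads compares ORIGINAL_REGNO of the REG_DEAD note against 
FIRST_PSEUDO_REGISTER and the entry block's global_live_at_end bitmap 
(see the attached diff):

```c
#include <stdbool.h>

/* A pseudo that is live on entry to the function without ever being
   set uses its "default definition", i.e. it is uninitialized, and the
   allocator may have given its hard register to another live pseudo.  */

#define FIRST_PSEUDO_REG 53     /* illustrative cutoff, not GCC's value */

struct dying_reg
{
  int original_regno;   /* register the REG_DEAD note refers to */
  bool live_at_entry;   /* proxy for "uninitialized pseudo" */
};

/* A dying register is safe to reuse as an output reload only if it is
   a hard register, or a pseudo that was actually initialized.  */
static bool
safe_as_output_reload (const struct dying_reg *r)
{
  return r->original_regno < FIRST_PSEUDO_REG || !r->live_at_entry;
}
```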

Now, the reason why we have an uninitialized register at all is some 
strangeness in the tree optimizers.  I'm not too familiar with the 
tree-ssa machinery yet, but from what I've seen I blame the SRA pass. 
I've attached a reduced version of the testcase.  Consider the following 
diff between 029t.forwprop1 and 030t.esra:

;; Function fn1D (fn1D)                 ;; Function fn1D (fn1D)

                                       > Initial instantiation for y
                                       >   y.k -> y$k
                                       > Using block-copy for y
                                       > Initial instantiation for x
                                       > Using block-copy for x
                                       > Initial instantiation for D.1631
                                       > Using block-copy for D.1631
                                       >
                                       >
                                       >
                                       > Symbols to be put in SSA form
                                       >
                                       > { y y$k }
                                       >
                                       >
                                       > Incremental SSA update started
                                       >
                                       > Number of blocks in CFG: 3
                                       > Number of blocks to update: 2
                                       >
                                       >
                                       >
fn1D (x)                                fn1D (x)
{                                       {
                                       >   <unnamed-unsigned:29> y$k;
   struct D D.1631;                        struct D D.1631;
   struct D x;                             struct D x;
   struct D y;                             struct D y;
   unsigned int D.1534;                    unsigned int D.1534;
   <unnamed-unsigned:29> D.1533;           <unnamed-unsigned:29> D.1533;
   unsigned int D.1532;                    unsigned int D.1532;
   unsigned int D.1531;                    unsigned int D.1531;
   <unnamed-unsigned:29> D.1530;           <unnamed-unsigned:29> D.1530;

<bb 2>:                                 <bb 2>:
                                       >   y.k = y$k_9(D);
   y = sD;                                 y = sD;
   D.1530_1 = y.k;                     |   y$k_10 = y.k;
                                       >   D.1530_1 = y$k_10;
   D.1531_2 = (unsigned int) D.1530_1;     D.1531_2 = (unsigned int) D.1530_1;
   D.1532_4 = D.1531_2 + x_3(D);           D.1532_4 = D.1531_2 + x_3(D);
   D.1533_5 = (<unnamed-unsigned:29>)      D.1533_5 = (<unnamed-unsigned:29>)
   y.k = D.1533_5;                     |   y$k_11 = D.1533_5;
                                       >   y.k = y$k_11;
   x = y;                                  x = y;
   D.1631 = x;                             D.1631 = x;
                                       >   y.k = y$k_11;
   y = D.1631;                             y = D.1631;
   D.1530_6 = y.k;                     |   y$k_12 = y.k;
                                       >   D.1530_6 = y$k_12;
   D.1534_7 = (unsigned int) D.1530_6;     D.1534_7 = (unsigned int) D.1530_6;
   return D.1534_7;                        return D.1534_7;

}                                       }

What strikes me as odd here is the following piece of code:
    y.k = y$k_9(D);
    y = sD;
The (D) apparently stands for "default definition", which I take to mean 
it's considered uninitialized.  I don't know why SRA thinks it needs to 
insert this statement, as all of y is overwritten immediately afterwards. 
This survives into RTL: there are several uninitialized pseudos named 
"y$k", and since we're dealing with bit fields they are not eliminated, 
so a lot of unneeded code seems to make it into the final output.

I'll try to dig deeper, but someone more familiar with this code 
(Diego?) can probably fix it more quickly.


Bernd
-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, Vincent Roche, Joseph E. McDonough

[-- Attachment #2: uninitialized.diff --]
[-- Type: text/plain, Size: 1243 bytes --]

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 123985)
+++ ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2007-04-19  Bernd Schmidt  <bernd.schmidt@analog.com>
+
+	* reload.c (combine_reloads): When trying to use a dying register,
+	check whether it's uninitialized and don't use it if so.
+
 2007-04-19  Brooks Moses  <brooks.moses@codesourcery.com>
 
 	* fold-const.c: Remove prototypes for native_encode_expr and
Index: reload.c
===================================================================
--- reload.c	(revision 123979)
+++ reload.c	(working copy)
@@ -1853,7 +1853,12 @@ combine_reloads (void)
 		    ||  ! (TEST_HARD_REG_BIT
 			   (reg_class_contents[(int) rld[secondary_out].class],
 			    REGNO (XEXP (note, 0)))))))
-	&& ! fixed_regs[REGNO (XEXP (note, 0))])
+	&& ! fixed_regs[REGNO (XEXP (note, 0))]
+	/* Check that we don't use a hardreg for an uninitialized
+	   pseudo.  See also find_dummy_reload().  */
+	&& (ORIGINAL_REGNO (XEXP (note, 0)) < FIRST_PSEUDO_REGISTER
+	    || ! bitmap_bit_p (ENTRY_BLOCK_PTR->il.rtl->global_live_at_end,
+			       ORIGINAL_REGNO (XEXP (note, 0)))))
       {
 	rld[output_reload].reg_rtx
 	  = gen_rtx_REG (rld[output_reload].outmode,

[-- Attachment #3: 20040709-2-reduced.c --]
[-- Type: text/plain, Size: 1445 bytes --]


extern void abort (void);
extern void exit (int);

unsigned int
myrnd (void)
{
  static unsigned int s = 1388815473;
  s *= 1103515245;
  s += 12345;
  return (s / 65536) % 2048;
}
struct __attribute__((packed)) D { unsigned long long l : 6, i : 6, j : 23, k : 29; };
struct D sD;
struct D retmeD (struct D x)
{
    return x;
}
unsigned int fn1D (unsigned int x)
{
    struct D y = sD;
    y.k += x;
    y = retmeD (y);
    return y.k;
}
unsigned int fn2D (unsigned int x)
{
    struct D y = sD;
    y.k += x;
    y.k %= 15;
    return y.k;
}
unsigned int retitD (void)
{
    return sD.k;
}
unsigned int fn3D (unsigned int x)
{
    sD.k += x;
    return retitD ();
}
void testD (void)
{
    int i;
    unsigned int mask, v, a, r;
    struct D x;
    char *p = (char *) &sD;
    for (i = 0; i < sizeof (sD); ++i)
	*p++ = myrnd ();
    sD.k = -1;
    mask = sD.k;
    v = myrnd ();
    a = myrnd ();
    sD.k = v;
    x = sD;
    r = fn1D (a);
    if (x.i != sD.i || x.j != sD.j || x.k != sD.k || x.l != sD.l || ((v + a) & mask) != r)
	abort ();
    v = myrnd ();
    a = myrnd ();
    sD.k = v;
    x = sD;
    r = fn2D (a);
    if (x.i != sD.i || x.j != sD.j || x.k != sD.k || x.l != sD.l || ((((v + a) & mask) % 15) & mask) != r)
	abort ();
    v = myrnd ();
    a = myrnd ();
    sD.k = v;
    x = sD;
    r = fn3D (a);
    if (x.i != sD.i || x.j != sD.j || sD.k != r || x.l != sD.l)
	abort ();
}

int
main (void)
{
  testD ();
  exit (0);
}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-04-20  0:29 Reload bug & SRA oddness Bernd Schmidt
@ 2007-04-20  3:19 ` Andrew Pinski
  2007-04-20  3:53 ` Daniel Berlin
  2007-04-20 14:01 ` Bernd Schmidt
  2 siblings, 0 replies; 104+ messages in thread
From: Andrew Pinski @ 2007-04-20  3:19 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches, Diego Novillo, Alexandre Oliva

On 4/19/07, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> execute/20040709-2.c is currently failing on the Blackfin at -O3.  This
> is a problem in reload: these days register allocation can assign the
> same hard register to two pseudos even if their lifetimes overlap,
> provided one of them is uninitialized.  If you then have an insn where
> the uninitialized pseudo dies, and reload needs an output reload, it can
> choose the dying register for it.  In doing so, it clobbers the other
> live pseudo that shares the hard register with the uninitialized one.

I reported this same bug to the original developer of the SRA
patch that caused the regression on spu-elf, in
http://gcc.gnu.org/ml/gcc-patches/2007-04/msg01036.html .  I did not
look into why it failed once I saw that SRA was misbehaving in the
first place.

Thanks,
Andrew Pinski


* Re: Reload bug & SRA oddness
  2007-04-20  0:29 Reload bug & SRA oddness Bernd Schmidt
  2007-04-20  3:19 ` Andrew Pinski
@ 2007-04-20  3:53 ` Daniel Berlin
  2007-04-20  4:30   ` Bernd Schmidt
  2007-04-28 20:48   ` Bernd Schmidt
  2007-04-20 14:01 ` Bernd Schmidt
  2 siblings, 2 replies; 104+ messages in thread
From: Daniel Berlin @ 2007-04-20  3:53 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches, Diego Novillo

On 4/19/07, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
>
> What strikes me as odd here is the following piece of code:
>     y.k = y$k_9(D);
>     y = sD;
> The (D) apparently stands for "default definition" which I take to mean
> it's considered uninitialized.

Yes
>  I don't know why SRA thinks it needs to
> insert this statement, as all of y is overwritten immediately after.
SRA does not eliminate dead stores; the dead store elimination pass does.

SRA also does not try to discover which pieces of the structure are
actually used in the function prior to performing SRA, so it will
generate uninitialized uses when it does element-by-element copies.

This has been the way SRA has worked forever.  I noticed it when I
wrote create_structure_field_vars, since after SRA, we end up with
"used" fields that have no structure field vars.
I mentioned this, and was told it was going to be too expensive to
track which pieces are used.  After trying to do the same thing in
create_structure_vars, I agree.

It would also be a mistake to try to do significant optimization of
copy in/out placement directly in SRA.  Again, that is why we have PRE
and DSE.  If they are not moving/eliminating these things, they need
to be fixed.
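As a source-level illustration of the copy-in/out placement under
discussion (a sketch with invented names, not SRA's actual GIMPLE
output; the scalar y_k is zero-initialized only to keep the sketch
well-defined C, standing in for the uninitialized default definition
y$k_9(D) in the dumps):

```c
/* Mirrors the shape of fn1D after scalarization: because SRA does not
   first work out which fields are live, the copy-in of a field that
   was never written materializes a read of an uninitialized scalar,
   which DSE/DCE are then expected to clean up.  */
struct D { unsigned k; unsigned j; };

unsigned
fn1D_scalarized (struct D sD, unsigned x)
{
  unsigned y_k = 0;        /* stands in for y$k_9(D); zeroed only to
                              keep this sketch well-defined C */
  struct D y;

  y.k = y_k;               /* copy-in: the odd statement from the dump */
  y = sD;                  /* overwrites all of y: prior store is dead */
  y_k = y.k;               /* copy-out so field work uses the scalar */
  y_k += x;
  y.k = y_k;               /* copy-in before y is next used as a block */
  return y.k;
}
```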

> This survives into RTL, there are several uninitialized pseudos with
> variable name "y$k",

This is a deficiency in tree level DSE then, it should have been eliminated.


* Re: Reload bug & SRA oddness
  2007-04-20  3:53 ` Daniel Berlin
@ 2007-04-20  4:30   ` Bernd Schmidt
  2007-04-20  4:48     ` Andrew Pinski
                       ` (2 more replies)
  2007-04-28 20:48   ` Bernd Schmidt
  1 sibling, 3 replies; 104+ messages in thread
From: Bernd Schmidt @ 2007-04-20  4:30 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: GCC Patches, Diego Novillo, Alexandre Oliva, Andrew Pinski

[-- Attachment #1: Type: text/plain, Size: 1940 bytes --]

Daniel Berlin wrote:
> On 4/19/07, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
>>
>> What strikes me as odd here is the following piece of code:
>>     y.k = y$k_9(D);
>>     y = sD;
>> The (D) apparently stands for "default definition" which I take to mean
>> it's considered uninitialized.
> 
> Yes
>>  I don't know why SRA thinks it needs to
>> insert this statement, as all of y is overwritten immediately after.
> SRA does not eliminate dead stores, the dead store elimination does.
> 
> SRA also does not try to discover which pieces of the structure are
> actually used in the function prior to performing SRA, so it will
> generate uninitialized uses when it does element by element copies.
> 
> This has been the way SRA has worked forever.  I noticed it when I
> wrote create_structure_field_vars, since after SRA, we end up with
> "used" fields that have no structure field vars.
> I mentioned this, and was told it was going to be too expensive to
> track which pieces are used.  After trying to do the same thing in
> create_structure_vars, I agree.
> 
> It would also be a mistake to try to do significant optimization of
> copy in/out placement directly in SRA.  Again, that is why we have PRE
> and DSE.  If they are not moving/eliminating these things, they need
> to be fixed.

I'm not convinced it can't be done with changes to SRA - this behaviour 
does seem to have been introduced by a recent patch by aoliva.  I'm 
testing the appended patch which cures the inefficiency in the testcase 
I sent previously and is pretty far along in a bootstrap.  Alex, any 
particular reason why you made this change in the first place?  Andrew, 
does this fix your problems too?


Bernd
-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, Vincent Roche, Joseph E. McDonough

[-- Attachment #2: sra.diff --]
[-- Type: text/plain, Size: 1037 bytes --]

Index: tree-sra.c
===================================================================
--- tree-sra.c	(revision 123979)
+++ tree-sra.c	(working copy)
@@ -2523,25 +2523,17 @@ scalarize_use (struct sra_elt *elt, tree
 	 a structure is passed as more than one argument to a function call.
 	 This optimization would be most effective if sra_walk_function
 	 processed the blocks in dominator order.  */
-
-      generate_copy_inout (elt, false, generate_element_ref (elt), &list);
-      if (list)
+      generate_copy_inout (elt, is_output, generate_element_ref (elt), &list);
+      if (list == NULL)
+	return;
+      mark_all_v_defs (list);
+      if (is_output)
+	sra_insert_after (bsi, list);
+      else
 	{
-	  mark_all_v_defs (list);
 	  sra_insert_before (bsi, list);
 	  mark_no_warning (elt);
 	}
-
-      if (is_output)
-	{
-	  list = NULL;
-	  generate_copy_inout (elt, true, generate_element_ref (elt), &list);
-	  if (list)
-	    {
-	      mark_all_v_defs (list);
-	      sra_insert_after (bsi, list);
-	    }
-	}
     }
 }
 


* Re: Reload bug & SRA oddness
  2007-04-20  4:30   ` Bernd Schmidt
@ 2007-04-20  4:48     ` Andrew Pinski
  2007-04-20  7:32     ` Alexandre Oliva
  2007-04-20 17:07     ` Daniel Berlin
  2 siblings, 0 replies; 104+ messages in thread
From: Andrew Pinski @ 2007-04-20  4:48 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Daniel Berlin, GCC Patches, Diego Novillo, Alexandre Oliva

On 4/19/07, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
>
> I'm not convinced it can't be done with changes to SRA - this behaviour
> does seem to have been introduced by a recent patch by aoliva.  I'm
> testing the appended patch which cures the inefficiency in the testcase
> I sent previously and is pretty far along in a bootstrap.  Alex, any
> particular reason why you made this change in the first place?  Andrew,
> does this fix your problems too?
>

I will check tomorrow as my spu-elf build is still going.  Also the
reload change did not fix up the regressions there.

Thanks,
Andrew Pinski


* Re: Reload bug & SRA oddness
  2007-04-20  4:30   ` Bernd Schmidt
  2007-04-20  4:48     ` Andrew Pinski
@ 2007-04-20  7:32     ` Alexandre Oliva
  2007-04-20 12:57       ` Bernd Schmidt
  2007-04-20 17:07     ` Daniel Berlin
  2 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-04-20  7:32 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

On Apr 20, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:

> Alex, any particular reason why you made this change in
> the first place?

Sure.  IIRC, without it, we'd initialize scalarized fields, fail to
copy them into the unscalarized variable, then block-copy the entire
variable with outdated contents.  I remember this was one of the last
fixes I added to the patch, so if you take it out you're certainly going 
to observe regressions on x86_64-linux-gnu and/or i686-linux-gnu.
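A source-level sketch of the failure mode being described, under the
assumption that the copy-out back into the aggregate is the piece that
goes missing (names invented; plain C rather than GIMPLE):

```c
/* If the scalarized field is updated but never copied back into the
   aggregate before a whole-struct use, the block copy propagates
   stale contents.  */
struct D3 { unsigned k; unsigned j; };

unsigned
stale_block_copy (struct D3 in, unsigned x)
{
  struct D3 y = in;
  unsigned y_k = y.k;      /* scalarized field */
  y_k += x;                /* field updated in the scalar only */
  /* Missing here: y.k = y_k;  (the copy emitted by the removed code) */
  struct D3 out = y;       /* block copy sees the stale y.k */
  return out.k;            /* returns in.k, not in.k + x */
}
```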

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


* Re: Reload bug & SRA oddness
  2007-04-20  7:32     ` Alexandre Oliva
@ 2007-04-20 12:57       ` Bernd Schmidt
  2007-04-20 18:28         ` Alexandre Oliva
  2007-04-20 18:40         ` Alexandre Oliva
  0 siblings, 2 replies; 104+ messages in thread
From: Bernd Schmidt @ 2007-04-20 12:57 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

Alexandre Oliva wrote:
> On Apr 20, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> 
>> Alex, any particular reason why you made this change in
>> the first place?
> 
> Sure.  IIRC, without it, we'd initialize scalarized fields, fail to
> copy them into the unscalarized variable, then block-copy the entire
> variable with outdated contents.

Doesn't this indicate we're passing the wrong value for the IS_OUTPUT 
arg of scalarize_use somewhere?

> I remember this was one of the last
> fixes I added to the patch, so taking it out you're certainly going to
> observe regressions on x86_64-linux-gnu and/or i686-linux-gnu.

Bootstrap completed, and no check-gcc regressions on i686-linux.  Can 
you send me a testcase that fails when this change is made?


Bernd
-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, Vincent Roche, Joseph E. McDonough


* Re: Reload bug & SRA oddness
  2007-04-20  0:29 Reload bug & SRA oddness Bernd Schmidt
  2007-04-20  3:19 ` Andrew Pinski
  2007-04-20  3:53 ` Daniel Berlin
@ 2007-04-20 14:01 ` Bernd Schmidt
  2007-04-20 22:00   ` Eric Botcazou
  2007-04-29  6:42   ` Bernd Schmidt
  2 siblings, 2 replies; 104+ messages in thread
From: Bernd Schmidt @ 2007-04-20 14:01 UTC (permalink / raw)
  To: GCC Patches

Bernd Schmidt wrote:
> execute/20040709-2.c is currently failing on the Blackfin at -O3.  This 
> is a problem in reload: these days register allocation can assign the 
> same hard register to two pseudos even if their lifetimes overlap, 
> provided one of them is uninitialized.  If you then have an insn where 
> the uninitialized pseudo dies, and reload needs an output reload, it can 
> choose the dying register for it.  In doing so, it clobbers the other 
> live pseudo that shares the hard register with the uninitialized one.
> 
> We've fixed this bug twice before, once in find_dummy_reload, and in 
> push_reload.  This adds the same check to combine_reloads, where we can 
> hit the same problem.

I've bootstrapped & tested this on the 4.2 branch (i686-linux) and 
committed there too, as 123995.  Probably won't get around to doing 4.1 
for a week or so.


Bernd
-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, Vincent Roche, Joseph E. McDonough


* Re: Reload bug & SRA oddness
  2007-04-20  4:30   ` Bernd Schmidt
  2007-04-20  4:48     ` Andrew Pinski
  2007-04-20  7:32     ` Alexandre Oliva
@ 2007-04-20 17:07     ` Daniel Berlin
  2 siblings, 0 replies; 104+ messages in thread
From: Daniel Berlin @ 2007-04-20 17:07 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches, Diego Novillo, Alexandre Oliva, Andrew Pinski

On 4/19/07, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> Daniel Berlin wrote:
> > On 4/19/07, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> >>
> >> What strikes me as odd here is the following piece of code:
> >>     y.k = y$k_9(D);
> >>     y = sD;
> >> The (D) apparently stands for "default definition" which I take to mean
> >> it's considered uninitialized.
> >
> > Yes
> >>  I don't know why SRA thinks it needs to
> >> insert this statement, as all of y is overwritten immediately after.
> > SRA does not eliminate dead stores, the dead store elimination does.
> >
> > SRA also does not try to discover which pieces of the structure are
> > actually used in the function prior to performing SRA, so it will
> > generate uninitialized uses when it does element by element copies.
> >
> > This has been the way SRA has worked forever.  I noticed it when I
> > wrote create_structure_field_vars, since after SRA, we end up with
> > "used" fields that have no structure field vars.
> > I mentioned this, and was told it was going to be too expensive to
> > track which pieces are used.  After trying to do the same thing in
> > create_structure_vars, I agree.
> >
> > It would also be a mistake to try to do significant optimization of
> > copy in/out placement directly in SRA.  Again, that is why we have PRE
> > and DSE.  If they are not moving/eliminating these things, they need
> > to be fixed.
>
> I'm not convinced it can't be done with changes to SRA - this behaviour
> does seem to have been introduced by a recent patch by aoliva.

So, looking again at your testcase, it looks like you should be right.
We definitely don't need to *copy into the scalarized variable* before
a store to the unscalarized version, ever.
But this won't get you fewer uninitialized variables generated from SRA
in general, just as an FYI.


* Re: Reload bug & SRA oddness
  2007-04-20 12:57       ` Bernd Schmidt
@ 2007-04-20 18:28         ` Alexandre Oliva
  2007-04-20 18:40         ` Alexandre Oliva
  1 sibling, 0 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-04-20 18:28 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

On Apr 20, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:

> Doesn't this indicate we're passing the wrong value for the IS_OUTPUT
> arg of scalarize_use somewhere?

Not quite.  Modifying a bit-field needs to preserve part of the
original value, so it's both input and output.
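A machine-level sketch of why a bit-field store is also an input (the
3/29 bit split and the helper name are invented, loosely modelled on the
testcase's struct D):

```c
#include <stdint.h>

/* Storing VAL into a 29-bit field that shares a 32-bit word with a
   3-bit neighbour: the old word is an input as well as an output,
   because the neighbour's bits must survive.  Layout here is invented
   for the sketch (low 3 bits = neighbour, high 29 bits = the field).  */
static uint32_t
store_bitfield_k (uint32_t word, uint32_t val)
{
  uint32_t mask = 0xFFFFFFF8u;            /* bits 3..31 hold the field */
  return (word & ~mask)                   /* read: keep neighbour bits */
         | ((val << 3) & mask);           /* modify/write: insert VAL */
}
```

This is the read-modify-write a compiler roughly emits, which is why
the statement has to be treated as both a use and a def of the word.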

> Bootstrap completed, and no check-gcc regressions on i686-linux.  Can
> you send me a testcase that fails when this change is made?

I'll give your patch a try.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


* Re: Reload bug & SRA oddness
  2007-04-20 12:57       ` Bernd Schmidt
  2007-04-20 18:28         ` Alexandre Oliva
@ 2007-04-20 18:40         ` Alexandre Oliva
  2007-04-30 10:09           ` Bernd Schmidt
  1 sibling, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-04-20 18:40 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

On Apr 20, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:

> Bootstrap completed, and no check-gcc regressions on i686-linux.  Can
> you send me a testcase that fails when this change is made?

Native configuration is x86_64-pc-linux-gnu
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -O3 -g 

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


* Re: Reload bug & SRA oddness
  2007-04-20 14:01 ` Bernd Schmidt
@ 2007-04-20 22:00   ` Eric Botcazou
  2007-04-28 16:25     ` Bernd Schmidt
  2007-04-29  6:42   ` Bernd Schmidt
  1 sibling, 1 reply; 104+ messages in thread
From: Eric Botcazou @ 2007-04-20 22:00 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches

> I've bootstrapped & tested this on the 4.2 branch (i686-linux) and
> committed there too, as 123995.  Probably won't get around to doing 4.1
> for a week or so.

Is the problem a regression on the 4.1 branch too?  If not, please refrain 
from touching reload at this point.

-- 
Eric Botcazou


* Re: Reload bug & SRA oddness
  2007-04-20 22:00   ` Eric Botcazou
@ 2007-04-28 16:25     ` Bernd Schmidt
  2007-04-28 17:46       ` Eric Botcazou
  0 siblings, 1 reply; 104+ messages in thread
From: Bernd Schmidt @ 2007-04-28 16:25 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

Eric Botcazou wrote:
>> I've bootstrapped & tested this on the 4.2 branch (i686-linux) and
>> committed there too, as 123995.  Probably won't get around to doing 4.1
>> for a week or so.
> 
> Is the problem a regression on the 4.1 branch too?  If not, please refrain 
> from touching reload at this point.

4.1 has Vlad's accurate liveness code, and the two other bug fixes I 
mentioned, so there's no reason to believe the problem does not affect 
the branch.  The patch is conservative, and not intrusive.  I've 
bootstrapped and regression tested it on i686-linux.  Ok to commit?


Bernd
-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif


* Re: Reload bug & SRA oddness
  2007-04-28 16:25     ` Bernd Schmidt
@ 2007-04-28 17:46       ` Eric Botcazou
  0 siblings, 0 replies; 104+ messages in thread
From: Eric Botcazou @ 2007-04-28 17:46 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches

> 4.1 has Vlad's accurate liveness code, and the two other bug fixes I
> mentioned, so there's no reason to believe the problem does not affect
> the branch.  The patch is conservative, and not intrusive.  I've
> bootstrapped and regression tested it on i686-linux.  Ok to commit?

Well, you have more authority than anyone else over this, so it's up to you to 
answer your own question, I'm afraid. :-)

-- 
Eric Botcazou


* Re: Reload bug & SRA oddness
  2007-04-20  3:53 ` Daniel Berlin
  2007-04-20  4:30   ` Bernd Schmidt
@ 2007-04-28 20:48   ` Bernd Schmidt
  2007-04-28 21:26     ` Richard Guenther
  1 sibling, 1 reply; 104+ messages in thread
From: Bernd Schmidt @ 2007-04-28 20:48 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: GCC Patches, Diego Novillo

Daniel Berlin wrote:
> It would also be a mistake to try to do significant optimization of
> copy in/out placement directly in SRA.  Again, that is why we have PRE
> and DSE.  If they are not moving/eliminating these things, they need
> to be fixed.

I've had a look at why these statements are not eliminated.  I figure 
that dce is the pass that should get rid of them, but it doesn't.  From 
the dumps:

   # SFT.70_14 = VDEF <SFT.70_13(D)>
   y.k = y$k_9(D);
   # VUSE <sD_15(D)>
   # SFT.70_19 = VDEF <SFT.70_14>
   # SFT.71_20 = VDEF <SFT.71_16(D)>
   # SFT.72_21 = VDEF <SFT.72_17(D)>
   # SFT.73_22 = VDEF <SFT.73_18(D)>
   y = sD;

Note how all the VDEFs have an input argument, which causes us to 
believe that the second statement depends on the first.  AFAICT, these 
VDEFs should have no arguments, but the compiler always generates them 
with exactly one input.  I've tried to fix that, but so far the patch I 
have is only useful as an SSA learning tool for myself, as it creates 
lots of failures elsewhere in the compiler...

Any other ideas how to address this?


Bernd
-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif


* Re: Reload bug & SRA oddness
  2007-04-28 20:48   ` Bernd Schmidt
@ 2007-04-28 21:26     ` Richard Guenther
  2007-04-28 21:49       ` Daniel Berlin
  0 siblings, 1 reply; 104+ messages in thread
From: Richard Guenther @ 2007-04-28 21:26 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Daniel Berlin, GCC Patches, Diego Novillo

On 4/28/07, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> Daniel Berlin wrote:
> > It would also be a mistake to try to do significant optimization of
> > copy in/out placement directly in SRA.  Again, that is why we have PRE
> > and DSE.  If they are not moving/eliminating these things, they need
> > to be fixed.
>
> I've had a look at why these statements are not eliminated.  I figure
> that dce is the pass that should get rid of them, but it doesn't.  From
> the dumps:
>
>    # SFT.70_14 = VDEF <SFT.70_13(D)>
>    y.k = y$k_9(D);
>    # VUSE <sD_15(D)>
>    # SFT.70_19 = VDEF <SFT.70_14>
>    # SFT.71_20 = VDEF <SFT.71_16(D)>
>    # SFT.72_21 = VDEF <SFT.72_17(D)>
>    # SFT.73_22 = VDEF <SFT.73_18(D)>
>    y = sD;
>
> Note how all the VDEFs have an input argument, which causes us to
> believe that the second statement depends on the first.  AFAICT, these
> VDEFs should have no arguments, but the compiler always generates them
> with exactly one input.  I've tried to fix that, but so far the patch I
> have is only useful as an SSA learning tool for myself, as it creates
> lots of failures elsewhere in the compiler...
>
> Any other ideas how to address this?

These are all the default SSA_NAMEs of the SFTs (note the (D)).  If
y is a local variable, DCE should "ignore" them (those uses are undefined
in that case).  So the only relevant VDEF for DCE/DSE is

# SFT.70_19 = VDEF <SFT.70_14>

So it's a DCE problem, not a problem of the SSA form.

Richard.
>
> Bernd
> --
> This footer brought to you by insane German lawmakers.
> Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
> Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
> Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif
>


* Re: Reload bug & SRA oddness
  2007-04-28 21:26     ` Richard Guenther
@ 2007-04-28 21:49       ` Daniel Berlin
  2007-04-29 10:04         ` Richard Guenther
  0 siblings, 1 reply; 104+ messages in thread
From: Daniel Berlin @ 2007-04-28 21:49 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Bernd Schmidt, GCC Patches, Diego Novillo

On 4/28/07, Richard Guenther <richard.guenther@gmail.com> wrote:
> On 4/28/07, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> > Daniel Berlin wrote:
> > > It would also be a mistake to try to do significant optimization of
> > > copy in/out placement directly in SRA.  Again, that is why we have PRE
> > > and DSE.  If they are not moving/eliminating these things, they need
> > > to be fixed.
> >
> > I've had a look at why these statements are not eliminated.  I figure
> > that dce is the pass that should get rid of them, but it doesn't.  From
> > the dumps:
> >
> >    # SFT.70_14 = VDEF <SFT.70_13(D)>
> >    y.k = y$k_9(D);
> >    # VUSE <sD_15(D)>
> >    # SFT.70_19 = VDEF <SFT.70_14>
> >    # SFT.71_20 = VDEF <SFT.71_16(D)>
> >    # SFT.72_21 = VDEF <SFT.72_17(D)>
> >    # SFT.73_22 = VDEF <SFT.73_18(D)>
> >    y = sD;
> >
> > Note how all the VDEFs have an input argument, which causes us to
> > believe that the second statement depends on the first.  AFAICT, these
> > VDEFs should have no arguments, but the compiler always generates them
> > with exactly one input.  I've tried to fix that, but so far the patch I
> > have is only useful as an SSA learning tool for myself, as it creates
> > lots of failures elsewhere in the compiler...
> >
> > Any other ideas how to address this?
>
> These are all the default SSA_NAMEs of the SFTs (note the (D)).  If
> y is a local variable DCE should "ignore" them (those uses are undefined
> in that case).  So the only relevant VDEF for DCE/DSE is
>
> # SFT.70_19 = VDEF <SFT.70_14>
>
Right.

I would imagine *DSE* should eliminate this, not DCE, because it's
store over store.
However, this does show a problem with our SSA form, actually.
Our VDEFs treat the RHS argument of the VDEF as an implicit use (and
will avoid adding explicit vuses of those arguments), besides just the
next part of the def-def chain to follow.  However, in cases like the
above, there *is no use of SFT.70_14* in that statement, only a def.
We don't have a way to specify that, so DCE will mark the statement
defining SFT.70_14 as necessary.

DSE should do better, since it was taught how to deal with this
problem by seeing if the store is to the same place, or subsumed by
the later store with no intervening uses.


* Re: Reload bug & SRA oddness
  2007-04-20 14:01 ` Bernd Schmidt
  2007-04-20 22:00   ` Eric Botcazou
@ 2007-04-29  6:42   ` Bernd Schmidt
  1 sibling, 0 replies; 104+ messages in thread
From: Bernd Schmidt @ 2007-04-29  6:42 UTC (permalink / raw)
  To: GCC Patches

Bernd Schmidt wrote:
> Bernd Schmidt wrote:
>> execute/20040709-2.c is currently failing on the Blackfin at -O3. 
>> This is a problem in reload: these days register allocation can assign
>> the same hard register to two pseudos even if their lifetimes overlap
>> if one of them is uninitialized.  If you then have an insn where the
>> uninitialized pseudo dies, and reload needs an output reload, it can
>> choose the dying register for it.  Doing so, it clobbers the other
>> live pseudo that shares the hardregister with the uninitialized register.
>>
>> We've fixed this bug twice before, once in find_dummy_reload, and in
>> push_reload.  This adds the same check to combine_reloads, where we
>> can hit the same problem.
> 
> I've bootstrapped & tested this on the 4.2 branch (i686-linux) and
> committed there too, as 123995.  Probably won't get around to doing 4.1
> for a week or so.

Now committed on the 4.1 branch as 124268 after the same tests.


Bernd
-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif


* Re: Reload bug & SRA oddness
  2007-04-28 21:49       ` Daniel Berlin
@ 2007-04-29 10:04         ` Richard Guenther
  2007-04-29 10:27           ` Richard Guenther
  0 siblings, 1 reply; 104+ messages in thread
From: Richard Guenther @ 2007-04-29 10:04 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Bernd Schmidt, GCC Patches, Diego Novillo

On 4/28/07, Daniel Berlin <dberlin@dberlin.org> wrote:
> On 4/28/07, Richard Guenther <richard.guenther@gmail.com> wrote:
> > On 4/28/07, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> > > Daniel Berlin wrote:
> > > > It would also be a mistake to try to do significant optimization of
> > > > copy in/out placement directly in SRA.  Again, that is why we have PRE
> > > > and DSE.  If they are not moving/eliminating these things, they need
> > > > to be fixed.
> > >
> > > I've had a look at why these statements are not eliminated.  I figure
> > > that dce is the pass that should get rid of them, but it doesn't.  From
> > > the dumps:
> > >
> > >    # SFT.70_14 = VDEF <SFT.70_13(D)>
> > >    y.k = y$k_9(D);
> > >    # VUSE <sD_15(D)>
> > >    # SFT.70_19 = VDEF <SFT.70_14>
> > >    # SFT.71_20 = VDEF <SFT.71_16(D)>
> > >    # SFT.72_21 = VDEF <SFT.72_17(D)>
> > >    # SFT.73_22 = VDEF <SFT.73_18(D)>
> > >    y = sD;
> > >
> > > Note how all the VDEFs have an input argument, which causes us to
> > > believe that the second statement depends on the first.  AFAICT, these
> > > VDEFs should have no arguments, but the compiler always generates them
> > > with exactly one input.  I've tried to fix that, but so far the patch I
> > > have is only useful as an SSA learning tool for myself, as it creates
> > > lots of failures elsewhere in the compiler...
> > >
> > > Any other ideas how to address this?
> >
> > These are all the default SSA_NAMEs of the SFTs (note the (D)).  If
> > y is a local variable DCE should "ignore" them (those uses are undefined
> > in that case).  So the only relevant VDEF for DCE/DSE is
> >
> > # SFT.70_19 = VDEF <SFT.70_14>
> >
> Right.
>
> I would imagine *DSE* should eliminate this, not DCE, because it's
> store over store.
> However, this does show a problem with our SSA form, actually.
> Our VDEFs treat the RHS argument of the VDEF as an implicit use (and
> will avoid adding explicit vuses of those arguments), besides just the
> next part of the def-def chain to follow.  However, in cases like the
> above, there *is no use of SFT.70_14* in that statement, only a def.
> We don't have a way to specify that, so DCE will mark the statement
> defining SFT.70_14 as necessary.
>
> DSE should do better, since it was taught how to deal with this
> problem by seeing if the store is to the same place, or subsumed by
> the later store with no intervening uses.

Now the problem with DSE seems to be that for

(gdb) call debug_generic_expr (stmt)
# SFT.84_24 = VDEF <SFT.84_10(D)> { SFT.84 }
y.k = y$k_23(D)
(gdb) call debug_generic_expr (use_stmt)
# VUSE <sD_9(D)> { sD }
# SFT.84_14 = VDEF <SFT.84_24>
# SFT.85_15 = VDEF <SFT.85_11(D)>
# SFT.86_16 = VDEF <SFT.86_12(D)>
# SFT.87_17 = VDEF <SFT.87_13(D)> { SFT.84 SFT.85 SFT.86 SFT.87 }
y = sD

at tree-ssa-dse.c:604 (the call to dse_partial_kill_p):

      /* If we have precisely one immediate use at this point, then we may
         have found redundant store.  Make sure that the stores are to
         the same memory location.  This includes checking that any
         SSA-form variables in the address will have the same values.  */
      if (use_p != NULL_USE_OPERAND_P
          && bitmap_bit_p (dse_gd->stores, get_stmt_uid (use_stmt))
          && (operand_equal_p (GIMPLE_STMT_OPERAND (stmt, 0),
                               GIMPLE_STMT_OPERAND (use_stmt, 0), 0)
              || dse_partial_kill_p (stmt, dse_gd))
          && memory_address_same (stmt, use_stmt))

dse_partial_kill_p does not do what its name suggests, but first asserts
that stmt stores to all of the aggregate (which it of course does not).
It _looks_ like maybe dse_partial_kill_p (use_stmt, dse_gd) was
intended here.  Otherwise DSE detects stmt as a possible dead store.

Richard.


* Re: Reload bug & SRA oddness
  2007-04-29 10:04         ` Richard Guenther
@ 2007-04-29 10:27           ` Richard Guenther
  2007-04-29 10:31             ` Richard Guenther
  0 siblings, 1 reply; 104+ messages in thread
From: Richard Guenther @ 2007-04-29 10:27 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Bernd Schmidt, GCC Patches, Diego Novillo

On 4/29/07, Richard Guenther <richard.guenther@gmail.com> wrote:
> On 4/28/07, Daniel Berlin <dberlin@dberlin.org> wrote:
> > On 4/28/07, Richard Guenther <richard.guenther@gmail.com> wrote:
> > > On 4/28/07, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> > > > Daniel Berlin wrote:
> > > > > It would also be a mistake to try to do significant optimization of
> > > > > copy in/out placement directly in SRA.  Again, that is why we have PRE
> > > > > and DSE.  If they are not moving/eliminating these things, they need
> > > > > to be fixed.
> > > >
> > > > I've had a look at why these statements are not eliminated.  I figure
> > > > that dce is the pass that should get rid of them, but it doesn't.  From
> > > > the dumps:
> > > >
> > > >    # SFT.70_14 = VDEF <SFT.70_13(D)>
> > > >    y.k = y$k_9(D);
> > > >    # VUSE <sD_15(D)>
> > > >    # SFT.70_19 = VDEF <SFT.70_14>
> > > >    # SFT.71_20 = VDEF <SFT.71_16(D)>
> > > >    # SFT.72_21 = VDEF <SFT.72_17(D)>
> > > >    # SFT.73_22 = VDEF <SFT.73_18(D)>
> > > >    y = sD;
> > > >
> > > > Note how all the VDEFs have an input argument, which causes us to
> > > > believe that the second statement depends on the first.  AFAICT, these
> > > > VDEFs should have no arguments, but the compiler always generates them
> > > > with exactly one input.  I've tried to fix that, but so far the patch I
> > > > have is only useful as an SSA learning tool for myself, as it creates
> > > > lots of failures elsewhere in the compiler...
> > > >
> > > > Any other ideas how to address this?
> > >
> > > These are all the default SSA_NAMEs of the SFTs (note the (D)).  If
> > > y is a local variable DCE should "ignore" them (those uses are undefined
> > > in that case).  So the only relevant VDEF for DCE/DSE is
> > >
> > > # SFT.70_19 = VDEF <SFT.70_14>
> > >
> > Right.
> >
> > I would imagine *DSE* should eliminate this, not DCE, because it's
> > store over store.
> > However, this does show a problem with our SSA form, actually.
> > Our VDEFs treat the RHS argument of the VDEF as an implicit use (and
> > will avoid adding explicit vuses of those arguments), besides just the
> > next part of the def-def chain to follow.  However, in cases like the
> > above, there *is no use of SFT.70_14* in that statement, only a def.
> > We don't have a way to specify that, so DCE will mark the statement
> > defining SFT.70_14 as necessary.
> >
> > DSE should do better, since it was taught how to deal with this
> > problem by seeing if the store is to the same place, or subsumed by
> > the later store with no intervening uses.
>
> Now the problem with DSE seems to be that for
>
> (gdb) call debug_generic_expr (stmt)
> # SFT.84_24 = VDEF <SFT.84_10(D)> { SFT.84 }
> y.k = y$k_23(D)
> (gdb) call debug_generic_expr (use_stmt)
> # VUSE <sD_9(D)> { sD }
> # SFT.84_14 = VDEF <SFT.84_24>
> # SFT.85_15 = VDEF <SFT.85_11(D)>
> # SFT.86_16 = VDEF <SFT.86_12(D)>
> # SFT.87_17 = VDEF <SFT.87_13(D)> { SFT.84 SFT.85 SFT.86 SFT.87 }
> y = sD
>
> at tree-ssa-dse.c:604 (the call to dse_partial_kill_p):
>
>       /* If we have precisely one immediate use at this point, then we may
>          have found redundant store.  Make sure that the stores are to
>          the same memory location.  This includes checking that any
>          SSA-form variables in the address will have the same values.  */
>       if (use_p != NULL_USE_OPERAND_P
>           && bitmap_bit_p (dse_gd->stores, get_stmt_uid (use_stmt))
>           && (operand_equal_p (GIMPLE_STMT_OPERAND (stmt, 0),
>                                GIMPLE_STMT_OPERAND (use_stmt, 0), 0)
>               || dse_partial_kill_p (stmt, dse_gd))
>           && memory_address_same (stmt, use_stmt))
>
> dse_partial_kill_p does not do what its name suggests, but first asserts
> that stmt stores to all of the aggregate (which it of course does not).
> It _looks_ like maybe dse_partial_kill_p (use_stmt, dse_gd) was
> intended here.  Otherwise DSE detects stmt as a possible dead store.

Oh, and that doesn't work because we don't visit

  # SFT.84D.1820_19 = VDEF <SFT.84D.1820_28>
  # SFT.85D.1821_20 = VDEF <SFT.85D.1821_15>
  # SFT.86D.1822_21 = VDEF <SFT.86D.1822_16>
  # SFT.87D.1823_22 = VDEF <SFT.87D.1823_17>
  yD.1648 = retmeD (yD.1648) [return slot optimization];

and thus don't record y as dse_whole_aggregate_clobbered_p.

Richard.


* Re: Reload bug & SRA oddness
  2007-04-29 10:27           ` Richard Guenther
@ 2007-04-29 10:31             ` Richard Guenther
  2007-04-29 11:16               ` Richard Guenther
  0 siblings, 1 reply; 104+ messages in thread
From: Richard Guenther @ 2007-04-29 10:31 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Bernd Schmidt, GCC Patches, Diego Novillo

On 4/29/07, Richard Guenther <richard.guenther@gmail.com> wrote:
> On 4/29/07, Richard Guenther <richard.guenther@gmail.com> wrote:
> > Now the problem with DSE seems to be that for
> >
> > (gdb) call debug_generic_expr (stmt)
> > # SFT.84_24 = VDEF <SFT.84_10(D)> { SFT.84 }
> > y.k = y$k_23(D)
> > (gdb) call debug_generic_expr (use_stmt)
> > # VUSE <sD_9(D)> { sD }
> > # SFT.84_14 = VDEF <SFT.84_24>
> > # SFT.85_15 = VDEF <SFT.85_11(D)>
> > # SFT.86_16 = VDEF <SFT.86_12(D)>
> > # SFT.87_17 = VDEF <SFT.87_13(D)> { SFT.84 SFT.85 SFT.86 SFT.87 }
> > y = sD
> >
> > at tree-ssa-dse.c:604 (the call to dse_partial_kill_p):
> >
> >       /* If we have precisely one immediate use at this point, then we may
> >          have found redundant store.  Make sure that the stores are to
> >          the same memory location.  This includes checking that any
> >          SSA-form variables in the address will have the same values.  */
> >       if (use_p != NULL_USE_OPERAND_P
> >           && bitmap_bit_p (dse_gd->stores, get_stmt_uid (use_stmt))
> >           && (operand_equal_p (GIMPLE_STMT_OPERAND (stmt, 0),
> >                                GIMPLE_STMT_OPERAND (use_stmt, 0), 0)
> >               || dse_partial_kill_p (stmt, dse_gd))
> >           && memory_address_same (stmt, use_stmt))
> >
> > dse_partial_kill_p does not do what its name suggests, but first asserts
> > that stmt stores to all of the aggregate (which it of course does not).
> > It _looks_ like maybe dse_partial_kill_p (use_stmt, dse_gd) was
> > intended here.  Otherwise DSE detects stmt as a possible dead store.
>
> Oh, and that doesn't work because we don't visit
>
>   # SFT.84D.1820_19 = VDEF <SFT.84D.1820_28>
>   # SFT.85D.1821_20 = VDEF <SFT.85D.1821_15>
>   # SFT.86D.1822_21 = VDEF <SFT.86D.1822_16>
>   # SFT.87D.1823_22 = VDEF <SFT.87D.1823_17>
>   yD.1648 = retmeD (yD.1648) [return slot optimization];
>
> and thus don't record y as dse_whole_aggregate_clobbered_p.

We don't record whole-aggregate sets in that bitmap anyway.

Richard.


* Re: Reload bug & SRA oddness
  2007-04-29 10:31             ` Richard Guenther
@ 2007-04-29 11:16               ` Richard Guenther
  0 siblings, 0 replies; 104+ messages in thread
From: Richard Guenther @ 2007-04-29 11:16 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Bernd Schmidt, GCC Patches, Diego Novillo

On 4/29/07, Richard Guenther <richard.guenther@gmail.com> wrote:
> On 4/29/07, Richard Guenther <richard.guenther@gmail.com> wrote:
> > Now the problem with DSE seems to be that for
> >
> > (gdb) call debug_generic_expr (stmt)
> > # SFT.84_24 = VDEF <SFT.84_10(D)> { SFT.84 }
> > y.k = y$k_23(D)
> > (gdb) call debug_generic_expr (use_stmt)
> > # VUSE <sD_9(D)> { sD }
> > # SFT.84_14 = VDEF <SFT.84_24>
> > # SFT.85_15 = VDEF <SFT.85_11(D)>
> > # SFT.86_16 = VDEF <SFT.86_12(D)>
> > # SFT.87_17 = VDEF <SFT.87_13(D)> { SFT.84 SFT.85 SFT.86 SFT.87 }
> > y = sD
> >
> > at tree-ssa-dse.c:604 (the call to dse_partial_kill_p):
> >
> >       /* If we have precisely one immediate use at this point, then we may
> >          have found redundant store.  Make sure that the stores are to
> >          the same memory location.  This includes checking that any
> >          SSA-form variables in the address will have the same values.  */
> >       if (use_p != NULL_USE_OPERAND_P
> >           && bitmap_bit_p (dse_gd->stores, get_stmt_uid (use_stmt))
> >           && (operand_equal_p (GIMPLE_STMT_OPERAND (stmt, 0),
> >                                GIMPLE_STMT_OPERAND (use_stmt, 0), 0)
> >               || dse_partial_kill_p (stmt, dse_gd))
> >           && memory_address_same (stmt, use_stmt))
> >
> > dse_partial_kill_p does not do what its name suggests, but first asserts
> > that stmt stores to all of the aggregate (which it of course does not).
> > It _looks_ like maybe dse_partial_kill_p (use_stmt, dse_gd) was
> > intended here.  Otherwise DSE detects stmt as a possible dead store.
>
> Oh, and that doesn't work because we don't visit
>
>   # SFT.84D.1820_19 = VDEF <SFT.84D.1820_28>
>   # SFT.85D.1821_20 = VDEF <SFT.85D.1821_15>
>   # SFT.86D.1822_21 = VDEF <SFT.86D.1822_16>
>   # SFT.87D.1823_22 = VDEF <SFT.87D.1823_17>
>   yD.1648 = retmeD (yD.1648) [return slot optimization];
>
> and thus don't record y as dse_whole_aggregate_clobbered_p.

We don't record whole-aggregate sets in that bitmap anyway.

Richard.


* Re: Reload bug & SRA oddness
  2007-04-20 18:40         ` Alexandre Oliva
@ 2007-04-30 10:09           ` Bernd Schmidt
  2007-04-30 19:04             ` Alexandre Oliva
  2007-05-01 15:38             ` Alexandre Oliva
  0 siblings, 2 replies; 104+ messages in thread
From: Bernd Schmidt @ 2007-04-30 10:09 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

[-- Attachment #1: Type: text/plain, Size: 2791 bytes --]

Alexandre Oliva wrote:
> On Apr 20, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> 
>> Bootstrap completed, and no check-gcc regressions on i686-linux.  Can
>> you send me a testcase that fails when this change is made?
> 
> Native configuration is x86_64-pc-linux-gnu
> FAIL: gcc.c-torture/execute/20040709-2.c execution,  -O3 -g 

Okay, I've set up an x86-64 install and debugged this a little further.
 I've attached a reduced testcase for this.

What's happening is that in esra, the following happens to function retmeA:
retmeA (x)                              retmeA (x)
{                                       {
                                  >       <unnamed-unsigned:11> x$k;
                                  >       unsigned char x$B0F5;
  struct A D.2000;                        struct A D.2000;

<bb 2>:                                 <bb 2>:
  D.2000 = x;                     |       x$k_1 = x.k;
                                  >       x$B0F5_2 = BIT_FIELD_REF <x,
                                  >       D.2000.k = x$k_1;
                                  >       BIT_FIELD_REF <D.2000, 5, 0>
  return D.2000;                          return D.2000;

}                                       }

It is then inlined into fn1A, where esra runs again, and when scanning
the function, we run into the new BIT_FIELD_REFs.  When SRA tries to deal
with the following stmt:

   BIT_FIELD_REF <D.4935, 5, 0> = BIT_FIELD_REF <x$B0F5_11, 5, 0>;

it just calls scalarize_use with D.4935 and is_output==true, causing us
to insert two additional stmts:

+  D.4935.k = SR.282_16;
   BIT_FIELD_REF <D.4935, 5, 0> = BIT_FIELD_REF <x$B0F5_11, 5, 0>;
+  SR.282_17 = D.4935.k;

Both of these are unnecessary (and inefficient), as the BIT_FIELD_REF
does not touch "k".  With my patch to restore is_output behaviour, we
only eliminate the first, but not the second.

IMO the real problem is that SRA doesn't know how to deal with these
BIT_FIELD_REFs properly - possibly something similar to how we deal with
ARRAY_RANGE_REF could be used?  I'll keep investigating this, but for
the moment there are two more points I'd like to make:

 * 20040709-2.c (which is very bitfield intensive) at -O3 has a pretty
large code size regression at least on x86-64 when compared to before
your SRA patch.
 * I've seen no serious attempt to investigate or fix the regressions
Andrew Pinski has reported.

As a consequence of all these problems, I'd like to see this patch
reverted for now until we know how to do it properly.


Bernd
-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif

[-- Attachment #2: 20040709-2-A.c --]
[-- Type: text/plain, Size: 1477 bytes --]

extern void abort (void);
extern void exit (int);

unsigned int
myrnd (void)
{
  static unsigned int s = 1388815473;
  s *= 1103515245;
  s += 12345;
  return (s / 65536) % 2048;
}

struct __attribute__ ((packed)) A
{
  unsigned short i:1, l:1, j:3, k:11;
};
struct A sA;
struct A
retmeA (struct A x)
{
  return x;
}
unsigned int
fn1A (unsigned int x)
{
  struct A y = sA;
  y.k += x;
  y = retmeA (y);
  return y.k;
}
unsigned int
fn2A (unsigned int x)
{
  struct A y = sA;
  y.k += x;
  y.k %= 15;
  return y.k;
}
unsigned int
retitA (void)
{
  return sA.k;
}
unsigned int
fn3A (unsigned int x)
{
  sA.k += x;
  return retitA ();
}

void
testA (void)
{
  int i;
  unsigned int mask, v, a, r;
  struct A x;
  char *p = (char *) &sA;
  for (i = 0; i < sizeof (sA); ++i)
    *p++ = myrnd ();
  if (__builtin_classify_type (sA.l) == 8)
    sA.l = 5.25;
  sA.k = -1;
  mask = sA.k;
  v = myrnd ();
  a = myrnd ();
  sA.k = v;
  x = sA;
  r = fn1A (a);
  if (x.i != sA.i || x.j != sA.j || x.k != sA.k || x.l != sA.l
      || ((v + a) & mask) != r)
    abort ();
  v = myrnd ();
  a = myrnd ();
  sA.k = v;
  x = sA;
  r = fn2A (a);
  if (x.i != sA.i || x.j != sA.j || x.k != sA.k || x.l != sA.l
      || ((((v + a) & mask) % 15) & mask) != r)
    abort ();
  v = myrnd ();
  a = myrnd ();
  sA.k = v;
  x = sA;
  r = fn3A (a);
  if (x.i != sA.i || x.j != sA.j || sA.k != r || x.l != sA.l
      || ((v + a) & mask) != r)
    abort ();
}

int
main (void)
{
  testA ();
  exit (0);
}


* Re: Reload bug & SRA oddness
  2007-04-30 10:09           ` Bernd Schmidt
@ 2007-04-30 19:04             ` Alexandre Oliva
  2007-04-30 19:08               ` Alexandre Oliva
  2007-05-01 15:38             ` Alexandre Oliva
  1 sibling, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-04-30 19:04 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

On Apr 30, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:

> As a consequence of all these problems, I'd like to see this patch
> reverted for now until we know how to do it properly.

Sounds like a reasonable trade-off.  I'm sorry that I haven't had time
to look into this throughout the past two weeks.  I was finally able
to start looking into this yesterday, but I haven't got anywhere so
far.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


* Re: Reload bug & SRA oddness
  2007-04-30 19:04             ` Alexandre Oliva
@ 2007-04-30 19:08               ` Alexandre Oliva
  0 siblings, 0 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-04-30 19:08 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

[-- Attachment #1: Type: text/plain, Size: 523 bytes --]

On Apr 30, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:

> On Apr 30, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
>> As a consequence of all these problems, I'd like to see this patch
>> reverted for now until we know how to do it properly.

> Sounds like a reasonable trade-off.  I'm sorry that I haven't had time
> to look into this throughout the past two weeks.  I was finally able
> to start looking into this yesterday, but I haven't got anywhere so
> far.

Here's the reversal patch I'm checking in.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: gcc-sra-bit-field-ref-revert.patch --]
[-- Type: text/x-patch, Size: 27508 bytes --]

Index: gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR middle-end/22156
	Temporarily revert:
	2007-04-06  Andreas Tobler  <a.tobler@schweiz.org>
	* tree-sra.c (sra_build_elt_assignment): Initialize min/maxshift.
	2007-04-05  Alexandre Oliva  <aoliva@redhat.com>
	* tree-sra.c (try_instantiate_multiple_fields): Needlessly
	initialize align to silence bogus warning.
	2007-04-05  Alexandre Oliva  <aoliva@redhat.com>
	* tree-sra.c (struct sra_elt): Add in_bitfld_block.  Remove
	all_no_warning.
	(struct sra_walk_fns): Remove use_all parameter from use.
	(sra_hash_tree): Handle BIT_FIELD_REFs.
	(sra_elt_hash): Don't hash bitfld blocks.
	(sra_elt_eq): Skip them in parent compares as well.  Handle
	BIT_FIELD_REFs.
	(sra_walk_expr): Don't maintain or pass down use_all_p.
	(scan_use): Remove use_all parameter.
	(scalarize_use): Likewise.  Re-expand assignment to
	BIT_FIELD_REF of gimple_reg.  De-scalarize before input or
	output, and re-scalarize after output.  Don't mark anything
	for no warning.
	(scalarize_ldst): Adjust.
	(scalarize_walk_gimple_modify_statement): Likewise.
	(build_element_name_1): Handle BIT_FIELD_REFs.
	(instantiate_element): Don't warn for any element whose parent
	is used as a whole.
	(instantiate_missing_elements_1): Return the sra_elt.
	(canon_type_for_field): New.
	(try_instantiate_multiple_fields): New.
	(instantiate_missing_elemnts): Use them.
	(mark_no_warning): Removed.
	(generate_one_element_ref): Handle BIT_FIELD_REFs.
	(REPLDUP, sra_build_elt_assignment): New.
	(generate_copy_inout): Use them.
	(generate_element_copy): Likewise.  Handle bitfld differences.
	(generate_element_zero): Don't recurse for blocks.  Use
	sra_build_elt_assignment.
	(generate_one_element_int): Take elt instead of var.  Use
	sra_build_elt_assignment.
	(generate_element_init_1): Adjust.
	(scalarize_use, scalarize_copy): Use REPLDUP.
	(scalarize_ldst): Move assert before dereference.
	(dump_sra_elt_name): Handle BIT_FIELD_REFs.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c	(revision 124300)
+++ gcc/tree-sra.c	(working copy)
@@ -147,10 +147,6 @@
 
   /* True if there is BIT_FIELD_REF on the lhs with a vector. */
   bool is_vector_lhs;
-
-  /* 1 if the element is a field that is part of a block, 2 if the field
-     is the block itself, 0 if it's neither.  */
-  char in_bitfld_block;
 };
 
 #define IS_ELEMENT_FOR_GROUP(ELEMENT) (TREE_CODE (ELEMENT) == RANGE_EXPR)
@@ -465,12 +461,6 @@
       h = iterative_hash_expr (DECL_FIELD_BIT_OFFSET (t), h);
       break;
 
-    case BIT_FIELD_REF:
-      /* Don't take operand 0 into account, that's our parent.  */
-      h = iterative_hash_expr (TREE_OPERAND (t, 1), 0);
-      h = iterative_hash_expr (TREE_OPERAND (t, 2), h);
-      break;
-
     default:
       gcc_unreachable ();
     }
@@ -489,14 +479,12 @@
 
   h = sra_hash_tree (e->element);
 
-  /* Take into account everything except bitfield blocks back up the
-     chain.  Given that chain lengths are rarely very long, this
-     should be acceptable.  If we truly identify this as a performance
-     problem, it should work to hash the pointer value
-     "e->parent".  */
+  /* Take into account everything back up the chain.  Given that chain
+     lengths are rarely very long, this should be acceptable.  If we
+     truly identify this as a performance problem, it should work to
+     hash the pointer value "e->parent".  */
   for (p = e->parent; p ; p = p->parent)
-    if (!p->in_bitfld_block)
-      h = (h * 65521) ^ sra_hash_tree (p->element);
+    h = (h * 65521) ^ sra_hash_tree (p->element);
 
   return h;
 }
@@ -509,17 +497,8 @@
   const struct sra_elt *a = x;
   const struct sra_elt *b = y;
   tree ae, be;
-  const struct sra_elt *ap = a->parent;
-  const struct sra_elt *bp = b->parent;
 
-  if (ap)
-    while (ap->in_bitfld_block)
-      ap = ap->parent;
-  if (bp)
-    while (bp->in_bitfld_block)
-      bp = bp->parent;
-
-  if (ap != bp)
+  if (a->parent != b->parent)
     return false;
 
   ae = a->element;
@@ -554,11 +533,6 @@
 	return false;
       return fields_compatible_p (ae, be);
 
-    case BIT_FIELD_REF:
-      return
-	tree_int_cst_equal (TREE_OPERAND (ae, 1), TREE_OPERAND (be, 1))
-	&& tree_int_cst_equal (TREE_OPERAND (ae, 2), TREE_OPERAND (be, 2));
-
     default:
       gcc_unreachable ();
     }
@@ -697,9 +671,10 @@
   /* Invoked when ELT is required as a unit.  Note that ELT might refer to
      a leaf node, in which case this is a simple scalar reference.  *EXPR_P
      points to the location of the expression.  IS_OUTPUT is true if this
-     is a left-hand-side reference.  */
+     is a left-hand-side reference.  USE_ALL is true if we saw something we
+     couldn't quite identify and had to force the use of the entire object.  */
   void (*use) (struct sra_elt *elt, tree *expr_p,
-	       block_stmt_iterator *bsi, bool is_output);
+	       block_stmt_iterator *bsi, bool is_output, bool use_all);
 
   /* Invoked when we have a copy between two scalarizable references.  */
   void (*copy) (struct sra_elt *lhs_elt, struct sra_elt *rhs_elt,
@@ -753,6 +728,7 @@
   tree expr = *expr_p;
   tree inner = expr;
   bool disable_scalarization = false;
+  bool use_all_p = false;
 
   /* We're looking to collect a reference expression between EXPR and INNER,
      such that INNER is a scalarizable decl and all other nodes through EXPR
@@ -773,7 +749,7 @@
 	    if (disable_scalarization)
 	      elt->cannot_scalarize = true;
 	    else
-	      fns->use (elt, expr_p, bsi, is_output);
+	      fns->use (elt, expr_p, bsi, is_output, use_all_p);
 	  }
 	return;
 
@@ -860,6 +836,7 @@
       use_all:
         expr_p = &TREE_OPERAND (inner, 0);
 	inner = expr = *expr_p;
+	use_all_p = true;
 	break;
 
       default:
@@ -907,14 +884,11 @@
   sra_walk_tree_list (ASM_OUTPUTS (expr), bsi, true, fns);
 }
 
-static void sra_replace (block_stmt_iterator *bsi, tree list);
-static tree sra_build_elt_assignment (struct sra_elt *elt, tree src);
-
 /* Walk a GIMPLE_MODIFY_STMT and categorize the assignment appropriately.  */
 
 static void
 sra_walk_gimple_modify_stmt (tree expr, block_stmt_iterator *bsi,
-			     const struct sra_walk_fns *fns)
+		      const struct sra_walk_fns *fns)
 {
   struct sra_elt *lhs_elt, *rhs_elt;
   tree lhs, rhs;
@@ -937,7 +911,7 @@
       if (!rhs_elt->is_scalar && !TREE_SIDE_EFFECTS (lhs))
 	fns->ldst (rhs_elt, lhs, bsi, false);
       else
-	fns->use (rhs_elt, &GIMPLE_STMT_OPERAND (expr, 1), bsi, false);
+	fns->use (rhs_elt, &GIMPLE_STMT_OPERAND (expr, 1), bsi, false, false);
     }
 
   /* If it isn't scalarizable, there may be scalarizable variables within, so
@@ -984,9 +958,7 @@
       /* Otherwise we're being used in some context that requires the
 	 aggregate to be seen as a whole.  Invoke USE.  */
       else
-	{
-	  fns->use (lhs_elt, &GIMPLE_STMT_OPERAND (expr, 0), bsi, true);
-	}
+	fns->use (lhs_elt, &GIMPLE_STMT_OPERAND (expr, 0), bsi, true, false);
     }
 
   /* Similarly to above, LHS_ELT being null only means that the LHS as a
@@ -1097,7 +1069,7 @@
 static void
 scan_use (struct sra_elt *elt, tree *expr_p ATTRIBUTE_UNUSED,
 	  block_stmt_iterator *bsi ATTRIBUTE_UNUSED,
-	  bool is_output ATTRIBUTE_UNUSED)
+	  bool is_output ATTRIBUTE_UNUSED, bool use_all ATTRIBUTE_UNUSED)
 {
   elt->n_uses += 1;
 }
@@ -1205,15 +1177,6 @@
       sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (t));
       obstack_grow (&sra_obstack, buffer, strlen (buffer));
     }
-  else if (TREE_CODE (t) == BIT_FIELD_REF)
-    {
-      sprintf (buffer, "B" HOST_WIDE_INT_PRINT_DEC,
-	       tree_low_cst (TREE_OPERAND (t, 2), 1));
-      obstack_grow (&sra_obstack, buffer, strlen (buffer));
-      sprintf (buffer, "F" HOST_WIDE_INT_PRINT_DEC,
-	       tree_low_cst (TREE_OPERAND (t, 1), 1));
-      obstack_grow (&sra_obstack, buffer, strlen (buffer));
-    }
   else
     {
       tree name = DECL_NAME (t);
@@ -1246,12 +1209,9 @@
 {
   struct sra_elt *base_elt;
   tree var, base;
-  bool nowarn = TREE_NO_WARNING (elt->element);
 
   for (base_elt = elt; base_elt->parent; base_elt = base_elt->parent)
-    if (!nowarn)
-      nowarn = base_elt->parent->n_uses
-	|| TREE_NO_WARNING (base_elt->parent->element);
+    continue;
   base = base_elt->element;
 
   elt->replacement = var = make_rename_temp (elt->type, "SR");
@@ -1280,7 +1240,9 @@
       DECL_DEBUG_EXPR_IS_FROM (var) = 1;
       
       DECL_IGNORED_P (var) = 0;
-      TREE_NO_WARNING (var) = nowarn;
+      TREE_NO_WARNING (var) = TREE_NO_WARNING (base);
+      if (elt->element && TREE_NO_WARNING (elt->element))
+	TREE_NO_WARNING (var) = 1;
     }
   else
     {
@@ -1375,7 +1337,7 @@
 
 static void instantiate_missing_elements (struct sra_elt *elt);
 
-static struct sra_elt *
+static void
 instantiate_missing_elements_1 (struct sra_elt *elt, tree child, tree type)
 {
   struct sra_elt *sub = lookup_element (elt, child, type, INSERT);
@@ -1386,268 +1348,8 @@
     }
   else
     instantiate_missing_elements (sub);
-  return sub;
 }
 
-/* Obtain the canonical type for field F of ELEMENT.  */
-
-static tree
-canon_type_for_field (tree f, tree element)
-{
-  tree field_type = TREE_TYPE (f);
-
-  /* canonicalize_component_ref() unwidens some bit-field types (not
-     marked as DECL_BIT_FIELD in C++), so we must do the same, lest we
-     may introduce type mismatches.  */
-  if (INTEGRAL_TYPE_P (field_type)
-      && DECL_MODE (f) != TYPE_MODE (field_type))
-    field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
-						   field_type,
-						   element,
-						   f, NULL_TREE),
-					   NULL_TREE));
-
-  return field_type;
-}
-
-/* Look for adjacent fields of ELT starting at F that we'd like to
-   scalarize as a single variable.  Return the last field of the
-   group.  */
-
-static tree
-try_instantiate_multiple_fields (struct sra_elt *elt, tree f)
-{
-  unsigned HOST_WIDE_INT align, oalign, word, bit, size, alchk;
-  enum machine_mode mode;
-  tree first = f, prev;
-  tree type, var;
-  struct sra_elt *block;
-
-  if (!is_sra_scalar_type (TREE_TYPE (f))
-      || !host_integerp (DECL_FIELD_OFFSET (f), 1)
-      || !host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
-      || !host_integerp (DECL_SIZE (f), 1)
-      || lookup_element (elt, f, NULL, NO_INSERT))
-    return f;
-
-  /* Taking the alignment of elt->element is not enough, since it
-     might be just an array index or some such.  We shouldn't need to
-     initialize align here, but our optimizers don't always realize
-     that, if we leave the loop without initializing align, we'll fail
-     the assertion right after the loop.  */
-  align = (unsigned HOST_WIDE_INT)-1;
-  for (block = elt; block; block = block->parent)
-    if (DECL_P (block->element))
-      {
-	align = DECL_ALIGN (block->element);
-	break;
-      }
-  gcc_assert (block);
-
-  oalign = DECL_OFFSET_ALIGN (f);
-  word = tree_low_cst (DECL_FIELD_OFFSET (f), 1);
-  bit = tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
-  size = tree_low_cst (DECL_SIZE (f), 1);
-
-  if (align > oalign)
-    align = oalign;
-
-  alchk = align - 1;
-  alchk = ~alchk;
-
-  if ((bit & alchk) != ((bit + size - 1) & alchk))
-    return f;
-
-  /* Find adjacent fields in the same alignment word.  */
-
-  for (prev = f, f = TREE_CHAIN (f);
-       f && TREE_CODE (f) == FIELD_DECL
-	 && is_sra_scalar_type (TREE_TYPE (f))
-	 && host_integerp (DECL_FIELD_OFFSET (f), 1)
-	 && host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
-	 && host_integerp (DECL_SIZE (f), 1)
-	 && (HOST_WIDE_INT)word == tree_low_cst (DECL_FIELD_OFFSET (f), 1)
-	 && !lookup_element (elt, f, NULL, NO_INSERT);
-       prev = f, f = TREE_CHAIN (f))
-    {
-      unsigned HOST_WIDE_INT nbit, nsize;
-
-      nbit = tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
-      nsize = tree_low_cst (DECL_SIZE (f), 1);
-
-      if (bit + size == nbit)
-	{
-	  if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
-	    break;
-	  size += nsize;
-	}
-      else if (nbit + nsize == bit)
-	{
-	  if ((nbit & alchk) != ((bit + size - 1) & alchk))
-	    break;
-	  bit = nbit;
-	  size += nsize;
-	}
-      else
-	break;
-    }
-
-  f = prev;
-
-  if (f == first)
-    return f;
-
-  gcc_assert ((bit & alchk) == ((bit + size - 1) & alchk));
-
-  /* Try to widen the bit range so as to cover padding bits as well.  */
-
-  if ((bit & ~alchk) || size != align)
-    {
-      unsigned HOST_WIDE_INT mbit = bit & alchk;
-      unsigned HOST_WIDE_INT msize = align;
-
-      for (f = TYPE_FIELDS (elt->type);
-	   f; f = TREE_CHAIN (f))
-	{
-	  unsigned HOST_WIDE_INT fword, fbit, fsize;
-
-	  /* Skip the fields from first to prev.  */
-	  if (f == first)
-	    {
-	      f = prev;
-	      continue;
-	    }
-
-	  if (!(TREE_CODE (f) == FIELD_DECL
-		&& host_integerp (DECL_FIELD_OFFSET (f), 1)
-		&& host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)))
-	    continue;
-
-	  fword = tree_low_cst (DECL_FIELD_OFFSET (f), 1);
-	  /* If we're past the selected word, we're fine.  */
-	  if (word < fword)
-	    continue;
-
-	  fbit = tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
-
-	  if (host_integerp (DECL_SIZE (f), 1))
-	    fsize = tree_low_cst (DECL_SIZE (f), 1);
-	  else
-	    /* Assume a variable-sized field takes up all space till
-	       the end of the word.  ??? Endianness issues?  */
-	    fsize = align - fbit;
-
-	  if (fword < word)
-	    {
-	      /* A large field might start at a previous word and
-		 extend into the selected word.  Exclude those
-		 bits.  ??? Endianness issues? */
-	      HOST_WIDE_INT diff = fbit + fsize
-		- (HOST_WIDE_INT)((word - fword) * BITS_PER_UNIT + mbit);
-
-	      if (diff <= 0)
-		continue;
-
-	      mbit += diff;
-	      msize -= diff;
-	    }
-	  else
-	    {
-	      gcc_assert (fword == word);
-
-	      /* Non-overlapping, great.  */
-	      if (fbit + fsize <= mbit
-		  || mbit + msize <= fbit)
-		continue;
-
-	      if (fbit <= mbit)
-		{
-		  unsigned HOST_WIDE_INT diff = fbit + fsize - mbit;
-		  mbit += diff;
-		  msize -= diff;
-		}
-	      else if (fbit > mbit)
-		msize -= (mbit + msize - fbit);
-	      else
-		gcc_unreachable ();
-	    }
-	}
-
-      bit = mbit;
-      size = msize;
-    }
-
-  /* Now we know the bit range we're interested in.  Find the smallest
-     machine mode we can use to access it.  */
-
-  for (mode = smallest_mode_for_size (size, MODE_INT);
-       ;
-       mode = GET_MODE_WIDER_MODE (mode))
-    {
-      gcc_assert (mode != VOIDmode);
-
-      alchk = GET_MODE_PRECISION (mode) - 1;
-      alchk = ~alchk;
-
-      if ((bit & alchk) == ((bit + size - 1) & alchk))
-	break;
-    }
-
-  gcc_assert (~alchk < align);
-
-  /* Create the field group as a single variable.  */
-
-  type = lang_hooks.types.type_for_mode (mode, 1);
-  gcc_assert (type);
-  var = build3 (BIT_FIELD_REF, type, NULL_TREE,
-		bitsize_int (size),
-		bitsize_int (word * BITS_PER_UNIT + bit));
-  BIT_FIELD_REF_UNSIGNED (var) = 1;
-
-  block = instantiate_missing_elements_1 (elt, var, type);
-  gcc_assert (block && block->is_scalar);
-
-  var = block->replacement;
-
-  if (((word * BITS_PER_UNIT + bit) & ~alchk)
-      || (HOST_WIDE_INT)size != tree_low_cst (DECL_SIZE (var), 1))
-    {
-      block->replacement = build3 (BIT_FIELD_REF,
-				   TREE_TYPE (block->element), var,
-				   bitsize_int (size),
-				   bitsize_int ((word * BITS_PER_UNIT
-						 + bit) & ~alchk));
-      BIT_FIELD_REF_UNSIGNED (block->replacement) = 1;
-      TREE_NO_WARNING (block->replacement) = 1;
-    }
-
-  block->in_bitfld_block = 2;
-
-  /* Add the member fields to the group, such that they access
-     portions of the group variable.  */
-
-  for (f = first; f != TREE_CHAIN (prev); f = TREE_CHAIN (f))
-    {
-      tree field_type = canon_type_for_field (f, elt->element);
-      struct sra_elt *fld = lookup_element (block, f, field_type, INSERT);
-
-      gcc_assert (fld && fld->is_scalar && !fld->replacement);
-
-      fld->replacement = build3 (BIT_FIELD_REF, field_type, var,
-				 DECL_SIZE (f),
-				 bitsize_int
-				 ((word * BITS_PER_UNIT
-				   + (TREE_INT_CST_LOW
-				      (DECL_FIELD_BIT_OFFSET (f))))
-				  & ~alchk));
-      BIT_FIELD_REF_UNSIGNED (fld->replacement) = TYPE_UNSIGNED (field_type);
-      TREE_NO_WARNING (block->replacement) = 1;
-      fld->in_bitfld_block = 1;
-    }
-
-  return prev;
-}
-
 static void
 instantiate_missing_elements (struct sra_elt *elt)
 {
@@ -1661,17 +1363,21 @@
 	for (f = TYPE_FIELDS (type); f ; f = TREE_CHAIN (f))
 	  if (TREE_CODE (f) == FIELD_DECL)
 	    {
-	      tree last = try_instantiate_multiple_fields (elt, f);
+	      tree field_type = TREE_TYPE (f);
 
-	      if (last != f)
-		{
-		  f = last;
-		  continue;
-		}
+	      /* canonicalize_component_ref() unwidens some bit-field
+		 types (not marked as DECL_BIT_FIELD in C++), so we
+		 must do the same, lest we may introduce type
+		 mismatches.  */
+	      if (INTEGRAL_TYPE_P (field_type)
+		  && DECL_MODE (f) != TYPE_MODE (field_type))
+		field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
+							       field_type,
+							       elt->element,
+							       f, NULL_TREE),
+						       NULL_TREE));
 
-	      instantiate_missing_elements_1 (elt, f,
-					      canon_type_for_field
-					      (f, elt->element));
+	      instantiate_missing_elements_1 (elt, f, field_type);
 	    }
 	break;
       }
@@ -1983,16 +1689,6 @@
       {
 	tree field = elt->element;
 
-	/* We can't test elt->in_bitfld_blk here because, when this is
-	   called from instantiate_element, we haven't set this field
-	   yet.  */
-	if (TREE_CODE (field) == BIT_FIELD_REF)
-	  {
-	    tree ret = copy_node (field);
-	    TREE_OPERAND (ret, 0) = base;
-	    return ret;
-	  }
-
 	/* Watch out for compatible records with differing field lists.  */
 	if (DECL_FIELD_CONTEXT (field) != TYPE_MAIN_VARIANT (TREE_TYPE (base)))
 	  field = find_compatible_field (TREE_TYPE (base), field);
@@ -2045,126 +1741,6 @@
   return build_gimple_modify_stmt (dst, src);
 }
 
-/* BIT_FIELD_REFs must not be shared.  sra_build_elt_assignment()
-   takes care of assignments, but we must create copies for uses.  */
-#define REPLDUP(t) (TREE_CODE (t) != BIT_FIELD_REF ? (t) : copy_node (t))
-
-static tree
-sra_build_elt_assignment (struct sra_elt *elt, tree src)
-{
-  tree dst = elt->replacement;
-  tree var, type, tmp, tmp2, tmp3;
-  tree list, stmt;
-  tree cst, cst2, mask;
-  tree minshift = NULL, maxshift = NULL;
-
-  if (TREE_CODE (dst) != BIT_FIELD_REF
-      || !elt->in_bitfld_block)
-    return sra_build_assignment (REPLDUP (dst), src);
-
-  var = TREE_OPERAND (dst, 0);
-
-  /* Try to widen the assignment to the entire variable.
-     We need the source to be a BIT_FIELD_REF as well, such that, for
-     BIT_FIELD_REF<d,sz,dp> = BIT_FIELD_REF<s,sz,sp>,
-     if sp >= dp, we can turn it into
-     d = BIT_FIELD_REF<s,sp+sz,sp-dp>.  */
-  if (elt->in_bitfld_block == 2
-      && TREE_CODE (src) == BIT_FIELD_REF
-      && !tree_int_cst_lt (TREE_OPERAND (src, 2), TREE_OPERAND (dst, 2)))
-    {
-      src = fold_build3 (BIT_FIELD_REF, TREE_TYPE (var),
-			 TREE_OPERAND (src, 0),
-			 size_binop (PLUS_EXPR, TREE_OPERAND (src, 1),
-				     TREE_OPERAND (dst, 2)),
-			 size_binop (MINUS_EXPR, TREE_OPERAND (src, 2),
-				     TREE_OPERAND (dst, 2)));
-      BIT_FIELD_REF_UNSIGNED (src) = 1;
-
-      return sra_build_assignment (var, src);
-    }
-
-  if (!is_gimple_reg (var))
-    return sra_build_assignment (REPLDUP (dst), src);
-
-  list = alloc_stmt_list ();
-
-  cst = TREE_OPERAND (dst, 2);
-  if (WORDS_BIG_ENDIAN)
-    {
-      cst = size_binop (MINUS_EXPR, DECL_SIZE (var), cst);
-      maxshift = cst;
-    }
-  else
-    minshift = cst;
-
-  cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (dst, 1),
-		     TREE_OPERAND (dst, 2));
-  if (WORDS_BIG_ENDIAN)
-    {
-      cst2 = size_binop (MINUS_EXPR, DECL_SIZE (var), cst2);
-      minshift = cst2;
-    }
-  else
-    maxshift = cst2;
-
-  type = TREE_TYPE (var);
-
-  mask = build_int_cst_wide (type, 1, 0);
-  cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, 1);
-  cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, 1);
-  mask = int_const_binop (MINUS_EXPR, cst, cst2, 1);
-  mask = fold_build1 (BIT_NOT_EXPR, type, mask);
-
-  if (!WORDS_BIG_ENDIAN)
-    cst2 = TREE_OPERAND (dst, 2);
-
-  tmp = make_rename_temp (type, "SR");
-  stmt = build_gimple_modify_stmt (tmp,
-				   fold_build2 (BIT_AND_EXPR, type,
-						var, mask));
-  append_to_statement_list (stmt, &list);
-
-  if (is_gimple_reg (src))
-    tmp2 = src;
-  else
-    {
-      tmp2 = make_rename_temp (TREE_TYPE (src), "SR");
-      stmt = sra_build_assignment (tmp2, src);
-      append_to_statement_list (stmt, &list);
-    }
-
-  if (!TYPE_UNSIGNED (TREE_TYPE (tmp2))
-      || TYPE_MAIN_VARIANT (TREE_TYPE (tmp2)) != TYPE_MAIN_VARIANT (type))
-    {
-      tmp3 = make_rename_temp (type, "SR");
-      tmp2 = fold_build3 (BIT_FIELD_REF, type, tmp2, TREE_OPERAND (dst, 1),
-			  bitsize_int (0));
-      if (TREE_CODE (tmp2) == BIT_FIELD_REF)
-	BIT_FIELD_REF_UNSIGNED (tmp2) = 1;
-      stmt = sra_build_assignment (tmp3, tmp2);
-      append_to_statement_list (stmt, &list);
-      tmp2 = tmp3;
-    }
-
-  if (!integer_zerop (minshift))
-    {
-      tmp3 = make_rename_temp (type, "SR");
-      stmt = build_gimple_modify_stmt (tmp3,
-				       fold_build2 (LSHIFT_EXPR, type,
-						    tmp2, minshift));
-      append_to_statement_list (stmt, &list);
-      tmp2 = tmp3;
-    }
-
-  stmt = build_gimple_modify_stmt (var,
-				   fold_build2 (BIT_IOR_EXPR, type,
-						tmp, tmp2));
-  append_to_statement_list (stmt, &list);
-
-  return list;
-}
-
 /* Generate a set of assignment statements in *LIST_P to copy all
    instantiated elements under ELT to or from the equivalent structure
    rooted at EXPR.  COPY_OUT controls the direction of the copy, with
@@ -2195,9 +1771,9 @@
   else if (elt->replacement)
     {
       if (copy_out)
-	t = sra_build_elt_assignment (elt, expr);
+	t = sra_build_assignment (elt->replacement, expr);
       else
-	t = sra_build_assignment (expr, REPLDUP (elt->replacement));
+	t = sra_build_assignment (expr, elt->replacement);
       append_to_statement_list (t, list_p);
     }
   else
@@ -2222,19 +1798,6 @@
   FOR_EACH_ACTUAL_CHILD (dc, dst)
     {
       sc = lookup_element (src, dc->element, NULL, NO_INSERT);
-      if (!sc && dc->in_bitfld_block == 2)
-	{
-	  struct sra_elt *dcs;
-
-	  FOR_EACH_ACTUAL_CHILD (dcs, dc)
-	    {
-	      sc = lookup_element (src, dcs->element, NULL, NO_INSERT);
-	      gcc_assert (sc);
-	      generate_element_copy (dcs, sc, list_p);
-	    }
-
-	  continue;
-	}
       gcc_assert (sc);
       generate_element_copy (dc, sc, list_p);
     }
@@ -2245,7 +1808,7 @@
 
       gcc_assert (src->replacement);
 
-      t = sra_build_elt_assignment (dst, REPLDUP (src->replacement));
+      t = sra_build_assignment (dst->replacement, src->replacement);
       append_to_statement_list (t, list_p);
     }
 }
@@ -2266,9 +1829,8 @@
       return;
     }
 
-  if (!elt->in_bitfld_block)
-    FOR_EACH_ACTUAL_CHILD (c, elt)
-      generate_element_zero (c, list_p);
+  FOR_EACH_ACTUAL_CHILD (c, elt)
+    generate_element_zero (c, list_p);
 
   if (elt->replacement)
     {
@@ -2277,7 +1839,7 @@
       gcc_assert (elt->is_scalar);
       t = fold_convert (elt->type, integer_zero_node);
 
-      t = sra_build_elt_assignment (elt, t);
+      t = sra_build_assignment (elt->replacement, t);
       append_to_statement_list (t, list_p);
     }
 }
@@ -2286,10 +1848,10 @@
    Add the result to *LIST_P.  */
 
 static void
-generate_one_element_init (struct sra_elt *elt, tree init, tree *list_p)
+generate_one_element_init (tree var, tree init, tree *list_p)
 {
   /* The replacement can be almost arbitrarily complex.  Gimplify.  */
-  tree stmt = sra_build_elt_assignment (elt, init);
+  tree stmt = sra_build_assignment (var, init);
   gimplify_and_add (stmt, list_p);
 }
 
@@ -2318,7 +1880,7 @@
     {
       if (elt->replacement)
 	{
-	  generate_one_element_init (elt, init, list_p);
+	  generate_one_element_init (elt->replacement, init, list_p);
 	  elt->visited = true;
 	}
       return result;
@@ -2477,7 +2039,7 @@
 
 static void
 scalarize_use (struct sra_elt *elt, tree *expr_p, block_stmt_iterator *bsi,
-	       bool is_output)
+	       bool is_output, bool use_all)
 {
   tree list = NULL, stmt = bsi_stmt (*bsi);
 
@@ -2486,27 +2048,8 @@
       /* If we have a replacement, then updating the reference is as
 	 simple as modifying the existing statement in place.  */
       if (is_output)
-	{
-	  if (TREE_CODE (elt->replacement) == BIT_FIELD_REF
-	      && is_gimple_reg (TREE_OPERAND (elt->replacement, 0))
-	      && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
-	      && &GIMPLE_STMT_OPERAND (stmt, 0) == expr_p)
-	    {
-	      tree newstmt = sra_build_elt_assignment
-		(elt, GIMPLE_STMT_OPERAND (stmt, 1));
-	      if (TREE_CODE (newstmt) != STATEMENT_LIST)
-		{
-		  tree list = alloc_stmt_list ();
-		  append_to_statement_list (newstmt, &list);
-		  newstmt = list;
-		}
-	      sra_replace (bsi, newstmt);
-	      return;
-	    }
-
-	  mark_all_v_defs (stmt);
-	}
-      *expr_p = REPLDUP (elt->replacement);
+	mark_all_v_defs (stmt);
+      *expr_p = elt->replacement;
       update_stmt (stmt);
     }
   else
@@ -2524,24 +2067,18 @@
 	 This optimization would be most effective if sra_walk_function
 	 processed the blocks in dominator order.  */
 
-      generate_copy_inout (elt, false, generate_element_ref (elt), &list);
-      if (list)
+      generate_copy_inout (elt, is_output, generate_element_ref (elt), &list);
+      if (list == NULL)
+	return;
+      mark_all_v_defs (list);
+      if (is_output)
+	sra_insert_after (bsi, list);
+      else
 	{
-	  mark_all_v_defs (list);
 	  sra_insert_before (bsi, list);
-	  mark_no_warning (elt);
+	  if (use_all)
+	    mark_no_warning (elt);
 	}
-
-      if (is_output)
-	{
-	  list = NULL;
-	  generate_copy_inout (elt, true, generate_element_ref (elt), &list);
-	  if (list)
-	    {
-	      mark_all_v_defs (list);
-	      sra_insert_after (bsi, list);
-	    }
-	}
     }
 }
 
@@ -2564,7 +2101,7 @@
       gcc_assert (TREE_CODE (stmt) == GIMPLE_MODIFY_STMT);
 
       GIMPLE_STMT_OPERAND (stmt, 0) = lhs_elt->replacement;
-      GIMPLE_STMT_OPERAND (stmt, 1) = REPLDUP (rhs_elt->replacement);
+      GIMPLE_STMT_OPERAND (stmt, 1) = rhs_elt->replacement;
       update_stmt (stmt);
     }
   else if (lhs_elt->use_block_copy || rhs_elt->use_block_copy)
@@ -2706,7 +2243,7 @@
     {
       /* Since ELT is not fully instantiated, we have to leave the
 	 block copy in place.  Treat this as a USE.  */
-      scalarize_use (elt, NULL, bsi, is_output);
+      scalarize_use (elt, NULL, bsi, is_output, false);
     }
   else
     {
@@ -2718,8 +2255,8 @@
 
       mark_all_v_defs (stmt);
       generate_copy_inout (elt, is_output, other, &list);
-      gcc_assert (list);
       mark_all_v_defs (list);
+      gcc_assert (list);
 
       /* Preserve EH semantics.  */
       if (stmt_ends_bb_p (stmt))
@@ -2815,10 +2352,6 @@
 	    fputc ('.', f);
 	  print_generic_expr (f, elt->element, dump_flags);
 	}
-      else if (TREE_CODE (elt->element) == BIT_FIELD_REF)
-	fprintf (f, "$B" HOST_WIDE_INT_PRINT_DEC "F" HOST_WIDE_INT_PRINT_DEC,
-		 tree_low_cst (TREE_OPERAND (elt->element, 2), 1),
-		 tree_low_cst (TREE_OPERAND (elt->element, 1), 1));
       else if (TREE_CODE (elt->element) == RANGE_EXPR)
 	fprintf (f, "["HOST_WIDE_INT_PRINT_DEC".."HOST_WIDE_INT_PRINT_DEC"]",
 		 TREE_INT_CST_LOW (TREE_OPERAND (elt->element, 0)),

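[Editorial note: for readers following the sra_build_elt_assignment code removed
in the diff above, the mask/shift/or sequence it emitted for a bit-field store
into a scalarized register corresponds roughly to the plain-C sketch below.
The helper name, the fixed 32-bit word, and little-endian bit numbering are
illustrative assumptions, not GCC internals.]

```c
#include <assert.h>
#include <stdint.h>

/* Store the low SIZE bits of SRC into WORD starting at bit MINSHIFT
   (little-endian bit numbering).  This mirrors the sequence the
   removed sra_build_elt_assignment open-coded: build a mask covering
   the field, clear those bits in the old value, shift the new value
   into position, and OR the two halves together.  */
static uint32_t
store_bitfield (uint32_t word, uint32_t src,
                unsigned minshift, unsigned size)
{
  unsigned maxshift = minshift + size;
  /* mask has 1-bits exactly in [minshift, maxshift); the (maxshift < 32)
     guard avoids an out-of-range shift when the field ends at bit 32.  */
  uint32_t mask = (maxshift < 32 ? (1u << maxshift) : 0u) - (1u << minshift);
  uint32_t kept = word & ~mask;            /* tmp  = var & ~mask  */
  uint32_t ins  = (src << minshift) & mask; /* tmp2 = src << minshift */
  return kept | ins;                       /* var  = tmp | tmp2  */
}
```

On big-endian targets the patch computes the shifts from the other end of
the word (the WORDS_BIG_ENDIAN branches), but the masking idea is the same.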
[-- Attachment #3: Type: text/plain, Size: 249 bytes --]


-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-04-30 10:09           ` Bernd Schmidt
  2007-04-30 19:04             ` Alexandre Oliva
@ 2007-05-01 15:38             ` Alexandre Oliva
  2007-05-01 15:45               ` Eric Botcazou
                                 ` (2 more replies)
  1 sibling, 3 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-01 15:38 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

[-- Attachment #1: Type: text/plain, Size: 377 bytes --]

On Apr 30, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:

> As a consequence of all these problems, I'd like to see this patch
> reverted for now until we know how to do it properly.

Here's a new patch that I hope will correct all the regressions,
including those on platforms I don't have access to.  Bootstrapped and
regtested on x86_64-linux-gnu.  Ok to install?


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: gcc-sra-bit-field-ref-take2.patch --]
[-- Type: text/x-patch, Size: 26432 bytes --]

2007-04-30  Alexandre Oliva  <aoliva@redhat.com>

	PR middle-end/22156
	2007-04-05  Alexandre Oliva  <aoliva@redhat.com>
	* tree-sra.c (struct sra_elt): Add in_bitfld_block.
	(sra_hash_tree): Handle BIT_FIELD_REFs.
	(sra_elt_hash): Don't hash bitfld blocks.
	(sra_elt_eq): Skip them in parent compares as well.  Handle
	BIT_FIELD_REFs.
	(sra_walk_expr): Use a variable as both input and output if it's
	referenced in a BIT_FIELD_REF in a lhs.
	(build_element_name_1): Handle BIT_FIELD_REFs.
	(instantiate_element): Don't warn for any element whose parent
	is used as a whole.
	(instantiate_missing_elements_1): Return the sra_elt.
	(canon_type_for_field): New.
	(try_instantiate_multiple_fields): New.  Initialize align.
	(instantiate_missing_elements): Use them.
	(generate_one_element_ref): Handle BIT_FIELD_REFs.
	(REPLDUP): New.
	(sra_build_elt_assignment): New.  Avoid needless initialization
	of maxalign and minalign.
	(generate_copy_inout): Use them.
	(generate_element_copy): Likewise.  Handle bitfld differences.
	(generate_element_zero): Don't recurse for blocks.  Use
	sra_build_elt_assignment.
	(generate_one_element_init): Take elt instead of var.  Use
	sra_build_elt_assignment.
	(generate_element_init_1): Adjust.
	(scalarize_use): Re-expand assignment to BIT_FIELD_REF of
	gimple_reg.  Use REPLDUP.
	(scalarize_copy): Use REPLDUP.
	(scalarize_ldst): Move assert before dereference.
	(dump_sra_elt_name): Handle BIT_FIELD_REFs.
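[Editorial note: the grouping decision in try_instantiate_multiple_fields below
hinges on the "alchk" test -- whether a field's bit range stays inside one
alignment word.  A minimal sketch of that check in plain C, with an
illustrative helper name (not a GCC function), assuming ALIGN is a power of
two as DECL_ALIGN guarantees:]

```c
#include <assert.h>

/* Return nonzero if a field occupying bits [bit, bit + size) lies
   entirely within a single ALIGN-bit alignment word.  This is the
   test the patch performs with alchk = ~(align - 1): masking the
   first and last bit positions with alchk yields the word each one
   falls in, and grouping is only attempted when they match.  */
static int
same_alignment_word (unsigned long bit, unsigned long size,
                     unsigned long align)
{
  unsigned long alchk = ~(align - 1);   /* mask selecting the word */
  return (bit & alchk) == ((bit + size - 1) & alchk);
}
```

Adjacent scalar fields that pass this test (and whose bit ranges abut) are
then widened into one block variable accessed via BIT_FIELD_REFs.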

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-04-30 14:36:34.000000000 -0300
+++ gcc/tree-sra.c	2007-04-30 17:38:56.000000000 -0300
@@ -147,6 +147,10 @@ struct sra_elt
 
   /* True if there is BIT_FIELD_REF on the lhs with a vector. */
   bool is_vector_lhs;
+
+  /* 1 if the element is a field that is part of a block, 2 if the field
+     is the block itself, 0 if it's neither.  */
+  char in_bitfld_block;
 };
 
 #define IS_ELEMENT_FOR_GROUP(ELEMENT) (TREE_CODE (ELEMENT) == RANGE_EXPR)
@@ -461,6 +465,12 @@ sra_hash_tree (tree t)
       h = iterative_hash_expr (DECL_FIELD_BIT_OFFSET (t), h);
       break;
 
+    case BIT_FIELD_REF:
+      /* Don't take operand 0 into account, that's our parent.  */
+      h = iterative_hash_expr (TREE_OPERAND (t, 1), 0);
+      h = iterative_hash_expr (TREE_OPERAND (t, 2), h);
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -479,12 +489,14 @@ sra_elt_hash (const void *x)
 
   h = sra_hash_tree (e->element);
 
-  /* Take into account everything back up the chain.  Given that chain
-     lengths are rarely very long, this should be acceptable.  If we
-     truly identify this as a performance problem, it should work to
-     hash the pointer value "e->parent".  */
+  /* Take into account everything except bitfield blocks back up the
+     chain.  Given that chain lengths are rarely very long, this
+     should be acceptable.  If we truly identify this as a performance
+     problem, it should work to hash the pointer value
+     "e->parent".  */
   for (p = e->parent; p ; p = p->parent)
-    h = (h * 65521) ^ sra_hash_tree (p->element);
+    if (!p->in_bitfld_block)
+      h = (h * 65521) ^ sra_hash_tree (p->element);
 
   return h;
 }
@@ -497,8 +509,17 @@ sra_elt_eq (const void *x, const void *y
   const struct sra_elt *a = x;
   const struct sra_elt *b = y;
   tree ae, be;
+  const struct sra_elt *ap = a->parent;
+  const struct sra_elt *bp = b->parent;
 
-  if (a->parent != b->parent)
+  if (ap)
+    while (ap->in_bitfld_block)
+      ap = ap->parent;
+  if (bp)
+    while (bp->in_bitfld_block)
+      bp = bp->parent;
+
+  if (ap != bp)
     return false;
 
   ae = a->element;
@@ -533,6 +554,11 @@ sra_elt_eq (const void *x, const void *y
 	return false;
       return fields_compatible_p (ae, be);
 
+    case BIT_FIELD_REF:
+      return
+	tree_int_cst_equal (TREE_OPERAND (ae, 1), TREE_OPERAND (be, 1))
+	&& tree_int_cst_equal (TREE_OPERAND (ae, 2), TREE_OPERAND (be, 2));
+
     default:
       gcc_unreachable ();
     }
@@ -729,6 +755,7 @@ sra_walk_expr (tree *expr_p, block_stmt_
   tree inner = expr;
   bool disable_scalarization = false;
   bool use_all_p = false;
+  bool use_both_p = false;
 
   /* We're looking to collect a reference expression between EXPR and INNER,
      such that INNER is a scalarizable decl and all other nodes through EXPR
@@ -749,7 +776,11 @@ sra_walk_expr (tree *expr_p, block_stmt_
 	    if (disable_scalarization)
 	      elt->cannot_scalarize = true;
 	    else
-	      fns->use (elt, expr_p, bsi, is_output, use_all_p);
+	      {
+		if (use_both_p)
+		  fns->use (elt, expr_p, bsi, false, use_all_p);
+		fns->use (elt, expr_p, bsi, is_output, use_all_p);
+	      }
 	  }
 	return;
 
@@ -816,6 +847,11 @@ sra_walk_expr (tree *expr_p, block_stmt_
 	    if (elt)
 	      elt->is_vector_lhs = true;
 	  }
+	else if (is_output)
+	  /* If we're writing to a BIT-FIELD, we have to make sure the
+	     inner variable is back in place before we start modifying
+	     it.  */
+	  use_both_p = true;
 	/* A bit field reference (access to *multiple* fields simultaneously)
 	   is not currently scalarized.  Consider this an access to the
 	   complete outer element, to which walk_tree will bring us next.  */
@@ -884,11 +920,14 @@ sra_walk_asm_expr (tree expr, block_stmt
   sra_walk_tree_list (ASM_OUTPUTS (expr), bsi, true, fns);
 }
 
+static void sra_replace (block_stmt_iterator *bsi, tree list);
+static tree sra_build_elt_assignment (struct sra_elt *elt, tree src);
+
 /* Walk a GIMPLE_MODIFY_STMT and categorize the assignment appropriately.  */
 
 static void
 sra_walk_gimple_modify_stmt (tree expr, block_stmt_iterator *bsi,
-		      const struct sra_walk_fns *fns)
+			     const struct sra_walk_fns *fns)
 {
   struct sra_elt *lhs_elt, *rhs_elt;
   tree lhs, rhs;
@@ -1177,6 +1216,15 @@ build_element_name_1 (struct sra_elt *el
       sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (t));
       obstack_grow (&sra_obstack, buffer, strlen (buffer));
     }
+  else if (TREE_CODE (t) == BIT_FIELD_REF)
+    {
+      sprintf (buffer, "B" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 2), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+      sprintf (buffer, "F" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 1), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+    }
   else
     {
       tree name = DECL_NAME (t);
@@ -1209,9 +1257,12 @@ instantiate_element (struct sra_elt *elt
 {
   struct sra_elt *base_elt;
   tree var, base;
+  bool nowarn = TREE_NO_WARNING (elt->element);
 
   for (base_elt = elt; base_elt->parent; base_elt = base_elt->parent)
-    continue;
+    if (!nowarn)
+      nowarn = base_elt->parent->n_uses
+	|| TREE_NO_WARNING (base_elt->parent->element);
   base = base_elt->element;
 
   elt->replacement = var = make_rename_temp (elt->type, "SR");
@@ -1240,9 +1291,10 @@ instantiate_element (struct sra_elt *elt
       DECL_DEBUG_EXPR_IS_FROM (var) = 1;
       
       DECL_IGNORED_P (var) = 0;
-      TREE_NO_WARNING (var) = TREE_NO_WARNING (base);
-      if (elt->element && TREE_NO_WARNING (elt->element))
-	TREE_NO_WARNING (var) = 1;
+      TREE_NO_WARNING (var) = nowarn;
+
+      if (TREE_CODE (elt->element) == BIT_FIELD_REF)
+	DECL_INITIAL (var) = integer_zero_node;
     }
   else
     {
@@ -1337,7 +1389,7 @@ sum_instantiated_sizes (struct sra_elt *
 
 static void instantiate_missing_elements (struct sra_elt *elt);
 
-static void
+static struct sra_elt *
 instantiate_missing_elements_1 (struct sra_elt *elt, tree child, tree type)
 {
   struct sra_elt *sub = lookup_element (elt, child, type, INSERT);
@@ -1348,6 +1400,266 @@ instantiate_missing_elements_1 (struct s
     }
   else
     instantiate_missing_elements (sub);
+  return sub;
+}
+
+/* Obtain the canonical type for field F of ELEMENT.  */
+
+static tree
+canon_type_for_field (tree f, tree element)
+{
+  tree field_type = TREE_TYPE (f);
+
+  /* canonicalize_component_ref() unwidens some bit-field types (not
+     marked as DECL_BIT_FIELD in C++), so we must do the same, lest we
+     may introduce type mismatches.  */
+  if (INTEGRAL_TYPE_P (field_type)
+      && DECL_MODE (f) != TYPE_MODE (field_type))
+    field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
+						   field_type,
+						   element,
+						   f, NULL_TREE),
+					   NULL_TREE));
+
+  return field_type;
+}
+
+/* Look for adjacent fields of ELT starting at F that we'd like to
+   scalarize as a single variable.  Return the last field of the
+   group.  */
+
+static tree
+try_instantiate_multiple_fields (struct sra_elt *elt, tree f)
+{
+  unsigned HOST_WIDE_INT align, oalign, word, bit, size, alchk;
+  enum machine_mode mode;
+  tree first = f, prev;
+  tree type, var;
+  struct sra_elt *block;
+
+  if (!is_sra_scalar_type (TREE_TYPE (f))
+      || !host_integerp (DECL_FIELD_OFFSET (f), 1)
+      || !host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+      || !host_integerp (DECL_SIZE (f), 1)
+      || lookup_element (elt, f, NULL, NO_INSERT))
+    return f;
+
+  /* Taking the alignment of elt->element is not enough, since it
+     might be just an array index or some such.  We shouldn't need to
+     initialize align here, but our optimizers don't always realize
+     that, if we leave the loop without initializing align, we'll fail
+     the assertion right after the loop.  */
+  align = (unsigned HOST_WIDE_INT)-1;
+  for (block = elt; block; block = block->parent)
+    if (DECL_P (block->element))
+      {
+	align = DECL_ALIGN (block->element);
+	break;
+      }
+  gcc_assert (block);
+
+  oalign = DECL_OFFSET_ALIGN (f);
+  word = tree_low_cst (DECL_FIELD_OFFSET (f), 1);
+  bit = tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+  size = tree_low_cst (DECL_SIZE (f), 1);
+
+  if (align > oalign)
+    align = oalign;
+
+  alchk = align - 1;
+  alchk = ~alchk;
+
+  if ((bit & alchk) != ((bit + size - 1) & alchk))
+    return f;
+
+  /* Find adjacent fields in the same alignment word.  */
+
+  for (prev = f, f = TREE_CHAIN (f);
+       f && TREE_CODE (f) == FIELD_DECL
+	 && is_sra_scalar_type (TREE_TYPE (f))
+	 && host_integerp (DECL_FIELD_OFFSET (f), 1)
+	 && host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+	 && host_integerp (DECL_SIZE (f), 1)
+	 && (HOST_WIDE_INT)word == tree_low_cst (DECL_FIELD_OFFSET (f), 1)
+	 && !lookup_element (elt, f, NULL, NO_INSERT);
+       prev = f, f = TREE_CHAIN (f))
+    {
+      unsigned HOST_WIDE_INT nbit, nsize;
+
+      nbit = tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+      nsize = tree_low_cst (DECL_SIZE (f), 1);
+
+      if (bit + size == nbit)
+	{
+	  if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+	    break;
+	  size += nsize;
+	}
+      else if (nbit + nsize == bit)
+	{
+	  if ((nbit & alchk) != ((bit + size - 1) & alchk))
+	    break;
+	  bit = nbit;
+	  size += nsize;
+	}
+      else
+	break;
+    }
+
+  f = prev;
+
+  if (f == first)
+    return f;
+
+  gcc_assert ((bit & alchk) == ((bit + size - 1) & alchk));
+
+  /* Try to widen the bit range so as to cover padding bits as well.  */
+
+  if ((bit & ~alchk) || size != align)
+    {
+      unsigned HOST_WIDE_INT mbit = bit & alchk;
+      unsigned HOST_WIDE_INT msize = align;
+
+      for (f = TYPE_FIELDS (elt->type);
+	   f; f = TREE_CHAIN (f))
+	{
+	  unsigned HOST_WIDE_INT fword, fbit, fsize;
+
+	  /* Skip the fields from first to prev.  */
+	  if (f == first)
+	    {
+	      f = prev;
+	      continue;
+	    }
+
+	  if (!(TREE_CODE (f) == FIELD_DECL
+		&& host_integerp (DECL_FIELD_OFFSET (f), 1)
+		&& host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)))
+	    continue;
+
+	  fword = tree_low_cst (DECL_FIELD_OFFSET (f), 1);
+	  /* If we're past the selected word, we're fine.  */
+	  if (word < fword)
+	    continue;
+
+	  fbit = tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+
+	  if (host_integerp (DECL_SIZE (f), 1))
+	    fsize = tree_low_cst (DECL_SIZE (f), 1);
+	  else
+	    /* Assume a variable-sized field takes up all space till
+	       the end of the word.  ??? Endianness issues?  */
+	    fsize = align - fbit;
+
+	  if (fword < word)
+	    {
+	      /* A large field might start at a previous word and
+		 extend into the selected word.  Exclude those
+		 bits.  ??? Endianness issues? */
+	      HOST_WIDE_INT diff = fbit + fsize
+		- (HOST_WIDE_INT)((word - fword) * BITS_PER_UNIT + mbit);
+
+	      if (diff <= 0)
+		continue;
+
+	      mbit += diff;
+	      msize -= diff;
+	    }
+	  else
+	    {
+	      gcc_assert (fword == word);
+
+	      /* Non-overlapping, great.  */
+	      if (fbit + fsize <= mbit
+		  || mbit + msize <= fbit)
+		continue;
+
+	      if (fbit <= mbit)
+		{
+		  unsigned HOST_WIDE_INT diff = fbit + fsize - mbit;
+		  mbit += diff;
+		  msize -= diff;
+		}
+	      else if (fbit > mbit)
+		msize -= (mbit + msize - fbit);
+	      else
+		gcc_unreachable ();
+	    }
+	}
+
+      bit = mbit;
+      size = msize;
+    }
+
+  /* Now we know the bit range we're interested in.  Find the smallest
+     machine mode we can use to access it.  */
+
+  for (mode = smallest_mode_for_size (size, MODE_INT);
+       ;
+       mode = GET_MODE_WIDER_MODE (mode))
+    {
+      gcc_assert (mode != VOIDmode);
+
+      alchk = GET_MODE_PRECISION (mode) - 1;
+      alchk = ~alchk;
+
+      if ((bit & alchk) == ((bit + size - 1) & alchk))
+	break;
+    }
+
+  gcc_assert (~alchk < align);
+
+  /* Create the field group as a single variable.  */
+
+  type = lang_hooks.types.type_for_mode (mode, 1);
+  gcc_assert (type);
+  var = build3 (BIT_FIELD_REF, type, NULL_TREE,
+		bitsize_int (size),
+		bitsize_int (word * BITS_PER_UNIT + bit));
+  BIT_FIELD_REF_UNSIGNED (var) = 1;
+
+  block = instantiate_missing_elements_1 (elt, var, type);
+  gcc_assert (block && block->is_scalar);
+
+  var = block->replacement;
+
+  if (((word * BITS_PER_UNIT + bit) & ~alchk)
+      || (HOST_WIDE_INT)size != tree_low_cst (DECL_SIZE (var), 1))
+    {
+      block->replacement = build3 (BIT_FIELD_REF,
+				   TREE_TYPE (block->element), var,
+				   bitsize_int (size),
+				   bitsize_int ((word * BITS_PER_UNIT
+						 + bit) & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (block->replacement) = 1;
+      TREE_NO_WARNING (block->replacement) = 1;
+    }
+
+  block->in_bitfld_block = 2;
+
+  /* Add the member fields to the group, such that they access
+     portions of the group variable.  */
+
+  for (f = first; f != TREE_CHAIN (prev); f = TREE_CHAIN (f))
+    {
+      tree field_type = canon_type_for_field (f, elt->element);
+      struct sra_elt *fld = lookup_element (block, f, field_type, INSERT);
+
+      gcc_assert (fld && fld->is_scalar && !fld->replacement);
+
+      fld->replacement = build3 (BIT_FIELD_REF, field_type, var,
+				 DECL_SIZE (f),
+				 bitsize_int
+				 ((word * BITS_PER_UNIT
+				   + (TREE_INT_CST_LOW
+				      (DECL_FIELD_BIT_OFFSET (f))))
+				  & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (fld->replacement) = TYPE_UNSIGNED (field_type);
+      TREE_NO_WARNING (block->replacement) = 1;
+      fld->in_bitfld_block = 1;
+    }
+
+  return prev;
 }
 
 static void
@@ -1363,21 +1675,17 @@ instantiate_missing_elements (struct sra
 	for (f = TYPE_FIELDS (type); f ; f = TREE_CHAIN (f))
 	  if (TREE_CODE (f) == FIELD_DECL)
 	    {
-	      tree field_type = TREE_TYPE (f);
+	      tree last = try_instantiate_multiple_fields (elt, f);
 
-	      /* canonicalize_component_ref() unwidens some bit-field
-		 types (not marked as DECL_BIT_FIELD in C++), so we
-		 must do the same, lest we may introduce type
-		 mismatches.  */
-	      if (INTEGRAL_TYPE_P (field_type)
-		  && DECL_MODE (f) != TYPE_MODE (field_type))
-		field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
-							       field_type,
-							       elt->element,
-							       f, NULL_TREE),
-						       NULL_TREE));
+	      if (last != f)
+		{
+		  f = last;
+		  continue;
+		}
 
-	      instantiate_missing_elements_1 (elt, f, field_type);
+	      instantiate_missing_elements_1 (elt, f,
+					      canon_type_for_field
+					      (f, elt->element));
 	    }
 	break;
       }
@@ -1689,6 +1997,16 @@ generate_one_element_ref (struct sra_elt
       {
 	tree field = elt->element;
 
+	/* We can't test elt->in_bitfld_blk here because, when this is
+	   called from instantiate_element, we haven't set this field
+	   yet.  */
+	if (TREE_CODE (field) == BIT_FIELD_REF)
+	  {
+	    tree ret = copy_node (field);
+	    TREE_OPERAND (ret, 0) = base;
+	    return ret;
+	  }
+
 	/* Watch out for compatible records with differing field lists.  */
 	if (DECL_FIELD_CONTEXT (field) != TYPE_MAIN_VARIANT (TREE_TYPE (base)))
 	  field = find_compatible_field (TREE_TYPE (base), field);
@@ -1741,6 +2059,122 @@ sra_build_assignment (tree dst, tree src
   return build_gimple_modify_stmt (dst, src);
 }
 
+/* BIT_FIELD_REFs must not be shared.  sra_build_elt_assignment()
+   takes care of assignments, but we must create copies for uses.  */
+#define REPLDUP(t) (TREE_CODE (t) != BIT_FIELD_REF ? (t) : copy_node (t))
+
+static tree
+sra_build_elt_assignment (struct sra_elt *elt, tree src)
+{
+  tree dst = elt->replacement;
+  tree var, type, tmp, tmp2, tmp3;
+  tree list, stmt;
+  tree cst, cst2, mask;
+  tree minshift, maxshift;
+
+  if (TREE_CODE (dst) != BIT_FIELD_REF
+      || !elt->in_bitfld_block)
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  var = TREE_OPERAND (dst, 0);
+
+  /* Try to widen the assignment to the entire variable.
+     We need the source to be a BIT_FIELD_REF as well, such that, for
+     BIT_FIELD_REF<d,sz,dp> = BIT_FIELD_REF<s,sz,sp>,
+     if sp >= dp, we can turn it into
+     d = BIT_FIELD_REF<s,sz+dp,sp-dp>.  */
+  if (elt->in_bitfld_block == 2
+      && TREE_CODE (src) == BIT_FIELD_REF
+      && !tree_int_cst_lt (TREE_OPERAND (src, 2), TREE_OPERAND (dst, 2)))
+    {
+      src = fold_build3 (BIT_FIELD_REF, TREE_TYPE (var),
+			 TREE_OPERAND (src, 0),
+			 size_binop (PLUS_EXPR, TREE_OPERAND (src, 1),
+				     TREE_OPERAND (dst, 2)),
+			 size_binop (MINUS_EXPR, TREE_OPERAND (src, 2),
+				     TREE_OPERAND (dst, 2)));
+      BIT_FIELD_REF_UNSIGNED (src) = 1;
+
+      return sra_build_assignment (var, src);
+    }
+
+  if (!is_gimple_reg (var))
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  list = alloc_stmt_list ();
+
+  cst = TREE_OPERAND (dst, 2);
+  cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (dst, 1),
+		     TREE_OPERAND (dst, 2));
+
+  if (WORDS_BIG_ENDIAN)
+    {
+      maxshift = size_binop (MINUS_EXPR, DECL_SIZE (var), cst);
+      minshift = size_binop (MINUS_EXPR, DECL_SIZE (var), cst2);
+    }
+  else
+    {
+      maxshift = cst2;
+      minshift = cst;
+    }
+
+  type = TREE_TYPE (var);
+
+  mask = build_int_cst_wide (type, 1, 0);
+  cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, 1);
+  cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, 1);
+  mask = int_const_binop (MINUS_EXPR, cst, cst2, 1);
+  mask = fold_build1 (BIT_NOT_EXPR, type, mask);
+
+  if (!WORDS_BIG_ENDIAN)
+    cst2 = TREE_OPERAND (dst, 2);
+
+  tmp = make_rename_temp (type, "SR");
+  stmt = build_gimple_modify_stmt (tmp,
+				   fold_build2 (BIT_AND_EXPR, type,
+						var, mask));
+  append_to_statement_list (stmt, &list);
+
+  if (is_gimple_reg (src))
+    tmp2 = src;
+  else
+    {
+      tmp2 = make_rename_temp (TREE_TYPE (src), "SR");
+      stmt = sra_build_assignment (tmp2, src);
+      append_to_statement_list (stmt, &list);
+    }
+
+  if (!TYPE_UNSIGNED (TREE_TYPE (tmp2))
+      || TYPE_MAIN_VARIANT (TREE_TYPE (tmp2)) != TYPE_MAIN_VARIANT (type))
+    {
+      tmp3 = make_rename_temp (type, "SR");
+      tmp2 = fold_build3 (BIT_FIELD_REF, type, tmp2, TREE_OPERAND (dst, 1),
+			  bitsize_int (0));
+      if (TREE_CODE (tmp2) == BIT_FIELD_REF)
+	BIT_FIELD_REF_UNSIGNED (tmp2) = 1;
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (!integer_zerop (minshift))
+    {
+      tmp3 = make_rename_temp (type, "SR");
+      stmt = build_gimple_modify_stmt (tmp3,
+				       fold_build2 (LSHIFT_EXPR, type,
+						    tmp2, minshift));
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  stmt = build_gimple_modify_stmt (var,
+				   fold_build2 (BIT_IOR_EXPR, type,
+						tmp, tmp2));
+  append_to_statement_list (stmt, &list);
+
+  return list;
+}
+
 /* Generate a set of assignment statements in *LIST_P to copy all
    instantiated elements under ELT to or from the equivalent structure
    rooted at EXPR.  COPY_OUT controls the direction of the copy, with
@@ -1771,9 +2205,9 @@ generate_copy_inout (struct sra_elt *elt
   else if (elt->replacement)
     {
       if (copy_out)
-	t = sra_build_assignment (elt->replacement, expr);
+	t = sra_build_elt_assignment (elt, expr);
       else
-	t = sra_build_assignment (expr, elt->replacement);
+	t = sra_build_assignment (expr, REPLDUP (elt->replacement));
       append_to_statement_list (t, list_p);
     }
   else
@@ -1798,6 +2232,19 @@ generate_element_copy (struct sra_elt *d
   FOR_EACH_ACTUAL_CHILD (dc, dst)
     {
       sc = lookup_element (src, dc->element, NULL, NO_INSERT);
+      if (!sc && dc->in_bitfld_block == 2)
+	{
+	  struct sra_elt *dcs;
+
+	  FOR_EACH_ACTUAL_CHILD (dcs, dc)
+	    {
+	      sc = lookup_element (src, dcs->element, NULL, NO_INSERT);
+	      gcc_assert (sc);
+	      generate_element_copy (dcs, sc, list_p);
+	    }
+
+	  continue;
+	}
       gcc_assert (sc);
       generate_element_copy (dc, sc, list_p);
     }
@@ -1808,7 +2255,7 @@ generate_element_copy (struct sra_elt *d
 
       gcc_assert (src->replacement);
 
-      t = sra_build_assignment (dst->replacement, src->replacement);
+      t = sra_build_elt_assignment (dst, REPLDUP (src->replacement));
       append_to_statement_list (t, list_p);
     }
 }
@@ -1829,8 +2276,9 @@ generate_element_zero (struct sra_elt *e
       return;
     }
 
-  FOR_EACH_ACTUAL_CHILD (c, elt)
-    generate_element_zero (c, list_p);
+  if (!elt->in_bitfld_block)
+    FOR_EACH_ACTUAL_CHILD (c, elt)
+      generate_element_zero (c, list_p);
 
   if (elt->replacement)
     {
@@ -1839,7 +2287,7 @@ generate_element_zero (struct sra_elt *e
       gcc_assert (elt->is_scalar);
       t = fold_convert (elt->type, integer_zero_node);
 
-      t = sra_build_assignment (elt->replacement, t);
+      t = sra_build_elt_assignment (elt, t);
       append_to_statement_list (t, list_p);
     }
 }
@@ -1848,10 +2296,10 @@ generate_element_zero (struct sra_elt *e
    Add the result to *LIST_P.  */
 
 static void
-generate_one_element_init (tree var, tree init, tree *list_p)
+generate_one_element_init (struct sra_elt *elt, tree init, tree *list_p)
 {
   /* The replacement can be almost arbitrarily complex.  Gimplify.  */
-  tree stmt = sra_build_assignment (var, init);
+  tree stmt = sra_build_elt_assignment (elt, init);
   gimplify_and_add (stmt, list_p);
 }
 
@@ -1880,7 +2328,7 @@ generate_element_init_1 (struct sra_elt 
     {
       if (elt->replacement)
 	{
-	  generate_one_element_init (elt->replacement, init, list_p);
+	  generate_one_element_init (elt, init, list_p);
 	  elt->visited = true;
 	}
       return result;
@@ -2041,23 +2489,45 @@ static void
 scalarize_use (struct sra_elt *elt, tree *expr_p, block_stmt_iterator *bsi,
 	       bool is_output, bool use_all)
 {
-  tree list = NULL, stmt = bsi_stmt (*bsi);
+  tree stmt = bsi_stmt (*bsi);
 
   if (elt->replacement)
     {
       /* If we have a replacement, then updating the reference is as
 	 simple as modifying the existing statement in place.  */
       if (is_output)
-	mark_all_v_defs (stmt);
-      *expr_p = elt->replacement;
+	{
+	  if (TREE_CODE (elt->replacement) == BIT_FIELD_REF
+	      && is_gimple_reg (TREE_OPERAND (elt->replacement, 0))
+	      && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	      && &GIMPLE_STMT_OPERAND (stmt, 0) == expr_p)
+	    {
+	      tree newstmt = sra_build_elt_assignment
+		(elt, GIMPLE_STMT_OPERAND (stmt, 1));
+	      if (TREE_CODE (newstmt) != STATEMENT_LIST)
+		{
+		  tree list = alloc_stmt_list ();
+		  append_to_statement_list (newstmt, &list);
+		  newstmt = list;
+		}
+	      sra_replace (bsi, newstmt);
+	      return;
+	    }
+
+	  mark_all_v_defs (stmt);
+	}
+      *expr_p = REPLDUP (elt->replacement);
       update_stmt (stmt);
     }
   else
     {
-      /* Otherwise we need some copies.  If ELT is being read, then we want
-	 to store all (modified) sub-elements back into the structure before
-	 the reference takes place.  If ELT is being written, then we want to
-	 load the changed values back into our shadow variables.  */
+      tree list = NULL;
+
+      /* Otherwise we need some copies.  If ELT is being read, then we
+	 want to store all (modified) sub-elements back into the
+	 structure before the reference takes place.  If ELT is being
+	 written, then we want to load the changed values back into
+	 our shadow variables.  */
       /* ??? We don't check modified for reads, we just always write all of
 	 the values.  We should be able to record the SSA number of the VOP
 	 for which the values were last read.  If that number matches the
@@ -2101,7 +2571,7 @@ scalarize_copy (struct sra_elt *lhs_elt,
       gcc_assert (TREE_CODE (stmt) == GIMPLE_MODIFY_STMT);
 
       GIMPLE_STMT_OPERAND (stmt, 0) = lhs_elt->replacement;
-      GIMPLE_STMT_OPERAND (stmt, 1) = rhs_elt->replacement;
+      GIMPLE_STMT_OPERAND (stmt, 1) = REPLDUP (rhs_elt->replacement);
       update_stmt (stmt);
     }
   else if (lhs_elt->use_block_copy || rhs_elt->use_block_copy)
@@ -2255,8 +2725,8 @@ scalarize_ldst (struct sra_elt *elt, tre
 
       mark_all_v_defs (stmt);
       generate_copy_inout (elt, is_output, other, &list);
-      mark_all_v_defs (list);
       gcc_assert (list);
+      mark_all_v_defs (list);
 
       /* Preserve EH semantics.  */
       if (stmt_ends_bb_p (stmt))
@@ -2352,6 +2822,10 @@ dump_sra_elt_name (FILE *f, struct sra_e
 	    fputc ('.', f);
 	  print_generic_expr (f, elt->element, dump_flags);
 	}
+      else if (TREE_CODE (elt->element) == BIT_FIELD_REF)
+	fprintf (f, "$B" HOST_WIDE_INT_PRINT_DEC "F" HOST_WIDE_INT_PRINT_DEC,
+		 tree_low_cst (TREE_OPERAND (elt->element, 2), 1),
+		 tree_low_cst (TREE_OPERAND (elt->element, 1), 1));
       else if (TREE_CODE (elt->element) == RANGE_EXPR)
 	fprintf (f, "["HOST_WIDE_INT_PRINT_DEC".."HOST_WIDE_INT_PRINT_DEC"]",
 		 TREE_INT_CST_LOW (TREE_OPERAND (elt->element, 0)),

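For readers following the masking logic in sra_build_elt_assignment above, the clear/shift/or sequence it emits for a register-resident block variable can be modeled standalone as below. This is a hypothetical C sketch with little-endian bit numbering, not GCC code; where the patch narrows src with a BIT_FIELD_REF, the sketch simply masks it.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the statement list built for
   "BIT_FIELD_REF<var, size, bit> = src" when var is a gimple
   register: clear the field, shift the source into place, then OR
   the halves together.  Requires maxshift <= 31 so the shift of 1
   stays defined; minshift/maxshift bound the field as in the patch.  */
static uint32_t
insert_bits (uint32_t var, uint32_t src,
             unsigned minshift, unsigned maxshift)
{
  uint32_t cst  = 1u << maxshift;            /* 1 shifted past the field */
  uint32_t cst2 = 1u << minshift;            /* 1 at the field's low bit */
  uint32_t mask = ~(cst - cst2);             /* zeros over the field bits */

  uint32_t tmp  = var & mask;                /* var with the field cleared */
  uint32_t tmp2 = (src << minshift) & ~mask; /* src positioned in the field */
  return tmp | tmp2;
}
```

For instance, writing 0x5 into bits 4..7 of an all-ones word leaves every other bit intact.
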
[-- Attachment #3: Type: text/plain, Size: 249 bytes --]


-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-05-01 15:38             ` Alexandre Oliva
@ 2007-05-01 15:45               ` Eric Botcazou
  2007-05-04  0:19                 ` Alexandre Oliva
  2007-05-01 15:53               ` Diego Novillo
  2007-05-01 16:24               ` Roman Zippel
  2 siblings, 1 reply; 104+ messages in thread
From: Eric Botcazou @ 2007-05-01 15:45 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: gcc-patches, Bernd Schmidt, Daniel Berlin, Diego Novillo, Andrew Pinski

> Here's a new patch that I hope will correct all the regressions,
> including those on platforms I don't have access to.

Including the Ada problems?  They are quite annoying and break our testers.

-- 
Eric Botcazou


* Re: Reload bug & SRA oddness
  2007-05-01 15:38             ` Alexandre Oliva
  2007-05-01 15:45               ` Eric Botcazou
@ 2007-05-01 15:53               ` Diego Novillo
  2007-05-01 16:03                 ` Eric Botcazou
  2007-05-01 16:24               ` Roman Zippel
  2 siblings, 1 reply; 104+ messages in thread
From: Diego Novillo @ 2007-05-01 15:53 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Bernd Schmidt, Daniel Berlin, GCC Patches, Andrew Pinski

Alexandre Oliva wrote on 05/01/07 11:38:

> 	PR middle-end/22156
> 	2007-04-05  Alexandre Oliva  <aoliva@redhat.com>
> 	* tree-sra.c (struct sra_elt): Add in_bitfld_block.
> 	(sra_hash_tree): Handle BIT_FIELD_REFs.
> 	(sra_elt_hash): Don't hash bitfld blocks.
> 	(sra_elt_eq): Skip them in parent compares as well.  Handle
> 	BIT_FIELD_REFs.
> 	(sra_walk_expr): Use a variable as both input and output if it's
> 	referenced in a BIT_FIELD_REF in a lhs.
> 	(build_element_name_1): Handle BIT_FIELD_REFs.
> 	(instantiate_element): Don't warn for any element whose parent
> 	is used as a whole.
> 	(instantiate_missing_elements_1): Return the sra_elt.
> 	(canon_type_for_field): New.
> 	(try_instantiate_multiple_fields): New.  Initialize align.
> 	(instantiate_missing_elements): Use them.
> 	(generate_one_element_ref): Handle BIT_FIELD_REFs.
> 	(REPLDUP): New.
> 	(sra_build_elt_assignment): New.  Avoid need for needless
> 	initialization of maxalign and minalign.
> 	(generate_copy_inout): Use them.
> 	(generate_element_copy): Likewise.  Handle bitfld differences.
> 	(generate_element_zero): Don't recurse for blocks.  Use
> 	sra_build_elt_assignment.
> 	(generate_one_element_init): Take elt instead of var.  Use
> 	sra_build_elt_assignment.
> 	(generate_element_init_1): Adjust.
> 	(scalarize_use): Re-expand assignment to BIT_FIELD_REF of
> 	gimple_reg.  Use REPLDUP.
> 	(scalarize_copy): Use REPLDUP.
> 	(scalarize_ldst): Move assert before dereference.
> 	(dump_sra_elt_name): Handle BIT_FIELD_REFs.

OK if testing included Ada.


* Re: Reload bug & SRA oddness
  2007-05-01 15:53               ` Diego Novillo
@ 2007-05-01 16:03                 ` Eric Botcazou
  2007-05-01 16:51                   ` Arnaud Charlet
  2007-05-01 17:31                   ` Andreas Schwab
  0 siblings, 2 replies; 104+ messages in thread
From: Eric Botcazou @ 2007-05-01 16:03 UTC (permalink / raw)
  To: Diego Novillo
  Cc: gcc-patches, Alexandre Oliva, Bernd Schmidt, Daniel Berlin,
	Andrew Pinski

> OK if testing included Ada.

Ada bootstrap is broken because of Ian's patch. :-)

-- 
Eric Botcazou


* Re: Reload bug & SRA oddness
  2007-05-01 15:38             ` Alexandre Oliva
  2007-05-01 15:45               ` Eric Botcazou
  2007-05-01 15:53               ` Diego Novillo
@ 2007-05-01 16:24               ` Roman Zippel
  2007-05-04  4:07                 ` Alexandre Oliva
  2 siblings, 1 reply; 104+ messages in thread
From: Roman Zippel @ 2007-05-01 16:24 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Bernd Schmidt, Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

Hi,

On Tue, 1 May 2007, Alexandre Oliva wrote:

> On Apr 30, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> 
> > As a consequence of all these problems, I'd like to see this patch
> > reverted for now until we know how to do it properly.
> 
> Here's a new patch that I hope will correct all the regressions,
> including those on platforms I don't have access to.  Bootstrapped and
> regtested on x86_64-linux-gnu.  Ok to install?

This still breaks this m68k test case:
http://gcc.gnu.org/ml/gcc/2007-04/msg00830.html

BTW, the previous patch broke the m68k bootstrap, and this doesn't make
me very confident that this one will do better.

A specific comment about the patch: it might be better to keep combining 
multiple fields even if they don't match the alignment exactly (e.g. as 
in the above test case).
Even better would be to suppress the per-field copy in the SRA pass 
entirely when the structure is already likely to end up in a register 
anyway.
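
For context, the grouping criterion the patch applies in try_instantiate_multiple_fields can be sketched standalone as follows (a hypothetical model, assuming `bit` is the field's bit offset within its word and `align` is a power of two): fields are only merged while the combined bit range stays inside one alignment word.

```c
#include <assert.h>

/* Hypothetical model of the patch's alignment-word check: a bit
   range [bit, bit + size) lies within a single word of ALIGN bits
   iff its first and last bits agree after masking off the in-word
   offset, i.e. (bit & alchk) == ((bit + size - 1) & alchk).  */
static int
same_alignment_word (unsigned long bit, unsigned long size,
                     unsigned long align)
{
  unsigned long alchk = ~(align - 1);  /* clears the in-word offset */
  return (bit & alchk) == ((bit + size - 1) & alchk);
}
```
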

bye, Roman


* Re: Reload bug & SRA oddness
  2007-05-01 16:03                 ` Eric Botcazou
@ 2007-05-01 16:51                   ` Arnaud Charlet
  2007-05-01 16:54                     ` Arnaud Charlet
  2007-05-01 17:31                   ` Andreas Schwab
  1 sibling, 1 reply; 104+ messages in thread
From: Arnaud Charlet @ 2007-05-01 16:51 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Diego Novillo, gcc-patches, Alexandre Oliva, Bernd Schmidt,
	Daniel Berlin, Andrew Pinski

> > OK if testing included Ada.
> 
> Ada bootstrap is broken because of Ian's patch. :-)

So I'd suggest testing with Ada and reverting Ian's patch.

We've got two or three Ada breakages in a row and things are getting a bit
out of control, so it would indeed be nice to get Ada back in good
shape as it was a few weeks ago.

Arno


* Re: Reload bug & SRA oddness
  2007-05-01 16:51                   ` Arnaud Charlet
@ 2007-05-01 16:54                     ` Arnaud Charlet
  0 siblings, 0 replies; 104+ messages in thread
From: Arnaud Charlet @ 2007-05-01 16:54 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Diego Novillo, gcc-patches, Alexandre Oliva, Bernd Schmidt,
	Daniel Berlin, Andrew Pinski

> So I'd suggest testing with Ada and reverting Ian's patch.

I meant: 'revert Ian's patch locally for testing with Ada'.

I didn't mean (yet, as I'm still hoping for a quick fix): reverting Ian's
patch in the repository and then testing with Ada.

Arno


* Re: Reload bug & SRA oddness
  2007-05-01 16:03                 ` Eric Botcazou
  2007-05-01 16:51                   ` Arnaud Charlet
@ 2007-05-01 17:31                   ` Andreas Schwab
  1 sibling, 0 replies; 104+ messages in thread
From: Andreas Schwab @ 2007-05-01 17:31 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Diego Novillo, gcc-patches, Alexandre Oliva, Bernd Schmidt,
	Daniel Berlin, Andrew Pinski

Eric Botcazou <ebotcazou@adacore.com> writes:

>> OK if testing included Ada.
>
> Ada bootstrap is broken because of Ian's patch. :-)

There is a patch in PR31739.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: Reload bug & SRA oddness
  2007-05-01 15:45               ` Eric Botcazou
@ 2007-05-04  0:19                 ` Alexandre Oliva
  0 siblings, 0 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-04  0:19 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Bernd Schmidt, Daniel Berlin, Diego Novillo, Andrew Pinski

On May  1, 2007, Eric Botcazou <ebotcazou@adacore.com> wrote:

>> Here's a new patch that I hope will correct all the regressions,
>> including those on platforms I don't have access to.

> Including the Ada problems?

What problems?  You mentioned some unexpected transformations in
another e-mail, but they were by design.  Is there anything else?

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


* Re: Reload bug & SRA oddness
  2007-05-01 16:24               ` Roman Zippel
@ 2007-05-04  4:07                 ` Alexandre Oliva
  2007-05-04  5:24                   ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-04  4:07 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Bernd Schmidt, Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

[-- Attachment #1: Type: text/plain, Size: 1712 bytes --]

On May  1, 2007, Roman Zippel <zippel@linux-m68k.org> wrote:

> Hi,
> On Tue, 1 May 2007, Alexandre Oliva wrote:

>> On Apr 30, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
>> 
>> > As a consequence of all these problems, I'd like to see this patch
>> > reverted for now until we know how to do it properly.
>> 
>> Here's a new patch that I hope will correct all the regressions,
>> including those on platforms I don't have access to.  Bootstrapped and
>> regtested on x86_64-linux-gnu.  Ok to install?

> This still breaks this m68k test case:
> http://gcc.gnu.org/ml/gcc/2007-04/msg00830.html

> BTW the previous patch broke the m68k bootstrap and this doesn't make me 
> very confident this one will do better.

Indeed, there was a serious bug in optimizing assignments to
BIT_FIELD_REFs that covered all fields scalarized into a single
variable, which affected only big-endian machines.

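The endianness handling involved can be modeled standalone like this (a hypothetical sketch of the minshift/maxshift computation in sra_build_elt_assignment, not GCC code): on WORDS_BIG_ENDIAN targets the shift counts are measured from the high end of the variable.

```c
#include <assert.h>

/* Given a field at bit offset POS of width SZ inside a VARSZ-bit
   variable, compute the [minshift, maxshift) range used to build
   the field mask.  Big-endian targets count from the high end.  */
static void
field_shifts (int words_big_endian, unsigned varsz,
              unsigned pos, unsigned sz,
              unsigned *minshift, unsigned *maxshift)
{
  unsigned cst = pos;        /* like TREE_OPERAND (dst, 2) */
  unsigned cst2 = pos + sz;  /* offset just past the field */

  if (words_big_endian)
    {
      *maxshift = varsz - cst;
      *minshift = varsz - cst2;
    }
  else
    {
      *maxshift = cst2;
      *minshift = cst;
    }
}
```
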
> A specific comment about the patch, it might be better to keep combining 
> multiple fields, if they don't match the alignment exactly (e.g. as in the 
> above test case).

In the patch below, alignment for access is inferred not from the
field, but from the enclosing object.  This should get us much better
access patterns for small objects that fit in registers, and use the
declaration alignment for other wider objects (that are less likely to
be scalarized anyway).

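As an illustration of what the grouping buys (a hypothetical sketch, not actual output of the pass): once adjacent bitfields sharing a 32-bit alignment word are scalarized into one block variable, each field read becomes a shift and mask on a single register value.

```c
#include <assert.h>
#include <stdint.h>

/* Three adjacent bitfields that share one 32-bit alignment word;
   the pass would scalarize them as a single uint32_t "block".  */
struct s { unsigned a : 8; unsigned b : 8; unsigned c : 16; };

/* A field access on the block is a BIT_FIELD_REF, i.e. a shift and
   mask (little-endian bit layout assumed, SIZE < 32).  */
static unsigned
extract (uint32_t block, unsigned bit, unsigned size)
{
  return (block >> bit) & ((1u << size) - 1u);
}
```
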
After this change, we get the optimal code that you quoted, even
without -malign-int.


Would you please give it a spin on m68k?  Bootstrap on
x86_64-linux-gnu hasn't completed yet, but it's already built Ada in
stage2, so I'm somewhat confident that this might work ;-)

If bootstrap and regression-testing completes, ok to install?


[-- Attachment #2: gcc-sra-bit-field-ref-take2.patch --]
[-- Type: text/x-patch, Size: 26349 bytes --]

2007-04-30  Alexandre Oliva  <aoliva@redhat.com>

	PR middle-end/22156
	2007-04-05  Alexandre Oliva  <aoliva@redhat.com>
	* tree-sra.c (struct sra_elt): Add in_bitfld_block.
	(sra_hash_tree): Handle BIT_FIELD_REFs.
	(sra_elt_hash): Don't hash bitfld blocks.
	(sra_elt_eq): Skip them in parent compares as well.  Handle
	BIT_FIELD_REFs.
	(sra_walk_expr): Use a variable as both input and output if it's
	referenced in a BIT_FIELD_REF in a lhs.
	(build_element_name_1): Handle BIT_FIELD_REFs.
	(instantiate_element): Propagate nowarn from parents.
	Zero-initialize bit-field variables.
	(instantiate_missing_elements_1): Return the sra_elt.
	(canon_type_for_field): New.
	(try_instantiate_multiple_fields): New.  Infer widest mode
	from base decl.
	(instantiate_missing_elements): Use them.
	(generate_one_element_ref): Handle BIT_FIELD_REFs.
	(REPLDUP): New.
	(sra_build_elt_assignment): New.  Avoid need for needless
	initialization of maxalign and minalign.  Fix assignment to
	whole bitfield variables for big-endian machines.
	(generate_copy_inout): Use them.
	(generate_element_copy): Likewise.  Handle bitfld differences.
	(generate_element_zero): Don't recurse for blocks.  Use
	sra_build_elt_assignment.
	(generate_one_element_init): Take elt instead of var.  Use
	sra_build_elt_assignment.
	(generate_element_init_1): Adjust.
	(scalarize_use): Re-expand assignment to BIT_FIELD_REF of
	gimple_reg.  Use REPLDUP.
	(scalarize_copy): Use REPLDUP.
	(scalarize_ldst): Move assert before dereference.
	(dump_sra_elt_name): Handle BIT_FIELD_REFs.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-05-03 13:24:48.000000000 -0300
+++ gcc/tree-sra.c	2007-05-04 00:46:22.000000000 -0300
@@ -147,6 +147,10 @@ struct sra_elt
 
   /* True if there is BIT_FIELD_REF on the lhs with a vector. */
   bool is_vector_lhs;
+
+  /* 1 if the element is a field that is part of a block, 2 if the field
+     is the block itself, 0 if it's neither.  */
+  char in_bitfld_block;
 };
 
 #define IS_ELEMENT_FOR_GROUP(ELEMENT) (TREE_CODE (ELEMENT) == RANGE_EXPR)
@@ -461,6 +465,12 @@ sra_hash_tree (tree t)
       h = iterative_hash_expr (DECL_FIELD_BIT_OFFSET (t), h);
       break;
 
+    case BIT_FIELD_REF:
+      /* Don't take operand 0 into account, that's our parent.  */
+      h = iterative_hash_expr (TREE_OPERAND (t, 1), 0);
+      h = iterative_hash_expr (TREE_OPERAND (t, 2), h);
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -479,12 +489,14 @@ sra_elt_hash (const void *x)
 
   h = sra_hash_tree (e->element);
 
-  /* Take into account everything back up the chain.  Given that chain
-     lengths are rarely very long, this should be acceptable.  If we
-     truly identify this as a performance problem, it should work to
-     hash the pointer value "e->parent".  */
+  /* Take into account everything except bitfield blocks back up the
+     chain.  Given that chain lengths are rarely very long, this
+     should be acceptable.  If we truly identify this as a performance
+     problem, it should work to hash the pointer value
+     "e->parent".  */
   for (p = e->parent; p ; p = p->parent)
-    h = (h * 65521) ^ sra_hash_tree (p->element);
+    if (!p->in_bitfld_block)
+      h = (h * 65521) ^ sra_hash_tree (p->element);
 
   return h;
 }
@@ -497,8 +509,17 @@ sra_elt_eq (const void *x, const void *y
   const struct sra_elt *a = x;
   const struct sra_elt *b = y;
   tree ae, be;
+  const struct sra_elt *ap = a->parent;
+  const struct sra_elt *bp = b->parent;
 
-  if (a->parent != b->parent)
+  if (ap)
+    while (ap->in_bitfld_block)
+      ap = ap->parent;
+  if (bp)
+    while (bp->in_bitfld_block)
+      bp = bp->parent;
+
+  if (ap != bp)
     return false;
 
   ae = a->element;
@@ -533,6 +554,11 @@ sra_elt_eq (const void *x, const void *y
 	return false;
       return fields_compatible_p (ae, be);
 
+    case BIT_FIELD_REF:
+      return
+	tree_int_cst_equal (TREE_OPERAND (ae, 1), TREE_OPERAND (be, 1))
+	&& tree_int_cst_equal (TREE_OPERAND (ae, 2), TREE_OPERAND (be, 2));
+
     default:
       gcc_unreachable ();
     }
@@ -729,6 +755,7 @@ sra_walk_expr (tree *expr_p, block_stmt_
   tree inner = expr;
   bool disable_scalarization = false;
   bool use_all_p = false;
+  bool use_both_p = false;
 
   /* We're looking to collect a reference expression between EXPR and INNER,
      such that INNER is a scalarizable decl and all other nodes through EXPR
@@ -749,7 +776,11 @@ sra_walk_expr (tree *expr_p, block_stmt_
 	    if (disable_scalarization)
 	      elt->cannot_scalarize = true;
 	    else
-	      fns->use (elt, expr_p, bsi, is_output, use_all_p);
+	      {
+		if (use_both_p)
+		  fns->use (elt, expr_p, bsi, false, use_all_p);
+		fns->use (elt, expr_p, bsi, is_output, use_all_p);
+	      }
 	  }
 	return;
 
@@ -816,6 +847,11 @@ sra_walk_expr (tree *expr_p, block_stmt_
 	    if (elt)
 	      elt->is_vector_lhs = true;
 	  }
+	else if (is_output)
+	  /* If we're writing to a BIT-FIELD, we have to make sure the
+	     inner variable is back in place before we start modifying
+	     it.  */
+	  use_both_p = true;
 	/* A bit field reference (access to *multiple* fields simultaneously)
 	   is not currently scalarized.  Consider this an access to the
 	   complete outer element, to which walk_tree will bring us next.  */
@@ -884,11 +920,14 @@ sra_walk_asm_expr (tree expr, block_stmt
   sra_walk_tree_list (ASM_OUTPUTS (expr), bsi, true, fns);
 }
 
+static void sra_replace (block_stmt_iterator *bsi, tree list);
+static tree sra_build_elt_assignment (struct sra_elt *elt, tree src);
+
 /* Walk a GIMPLE_MODIFY_STMT and categorize the assignment appropriately.  */
 
 static void
 sra_walk_gimple_modify_stmt (tree expr, block_stmt_iterator *bsi,
-		      const struct sra_walk_fns *fns)
+			     const struct sra_walk_fns *fns)
 {
   struct sra_elt *lhs_elt, *rhs_elt;
   tree lhs, rhs;
@@ -1177,6 +1216,15 @@ build_element_name_1 (struct sra_elt *el
       sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (t));
       obstack_grow (&sra_obstack, buffer, strlen (buffer));
     }
+  else if (TREE_CODE (t) == BIT_FIELD_REF)
+    {
+      sprintf (buffer, "B" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 2), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+      sprintf (buffer, "F" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 1), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+    }
   else
     {
       tree name = DECL_NAME (t);
@@ -1209,9 +1257,11 @@ instantiate_element (struct sra_elt *elt
 {
   struct sra_elt *base_elt;
   tree var, base;
+  bool nowarn = TREE_NO_WARNING (elt->element);
 
   for (base_elt = elt; base_elt->parent; base_elt = base_elt->parent)
-    continue;
+    if (!nowarn)
+      nowarn = TREE_NO_WARNING (base_elt->parent->element);
   base = base_elt->element;
 
   elt->replacement = var = make_rename_temp (elt->type, "SR");
@@ -1240,9 +1290,10 @@ instantiate_element (struct sra_elt *elt
       DECL_DEBUG_EXPR_IS_FROM (var) = 1;
       
       DECL_IGNORED_P (var) = 0;
-      TREE_NO_WARNING (var) = TREE_NO_WARNING (base);
-      if (elt->element && TREE_NO_WARNING (elt->element))
-	TREE_NO_WARNING (var) = 1;
+      TREE_NO_WARNING (var) = nowarn;
+
+      if (TREE_CODE (elt->element) == BIT_FIELD_REF)
+	DECL_INITIAL (var) = integer_zero_node;
     }
   else
     {
@@ -1337,7 +1388,7 @@ sum_instantiated_sizes (struct sra_elt *
 
 static void instantiate_missing_elements (struct sra_elt *elt);
 
-static void
+static struct sra_elt *
 instantiate_missing_elements_1 (struct sra_elt *elt, tree child, tree type)
 {
   struct sra_elt *sub = lookup_element (elt, child, type, INSERT);
@@ -1348,6 +1399,263 @@ instantiate_missing_elements_1 (struct s
     }
   else
     instantiate_missing_elements (sub);
+  return sub;
+}
+
+/* Obtain the canonical type for field F of ELEMENT.  */
+
+static tree
+canon_type_for_field (tree f, tree element)
+{
+  tree field_type = TREE_TYPE (f);
+
+  /* canonicalize_component_ref() unwidens some bit-field types (not
+     marked as DECL_BIT_FIELD in C++), so we must do the same, lest we
+     introduce type mismatches.  */
+  if (INTEGRAL_TYPE_P (field_type)
+      && DECL_MODE (f) != TYPE_MODE (field_type))
+    field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
+						   field_type,
+						   element,
+						   f, NULL_TREE),
+					   NULL_TREE));
+
+  return field_type;
+}
+
+/* Look for adjacent fields of ELT starting at F that we'd like to
+   scalarize as a single variable.  Return the last field of the
+   group.  */
+
+static tree
+try_instantiate_multiple_fields (struct sra_elt *elt, tree f)
+{
+  unsigned HOST_WIDE_INT align, bit, size, alchk;
+  enum machine_mode mode;
+  tree first = f, prev;
+  tree type, var;
+  struct sra_elt *block;
+
+  if (!is_sra_scalar_type (TREE_TYPE (f))
+      || !host_integerp (DECL_FIELD_OFFSET (f), 1)
+      || !host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+      || !host_integerp (DECL_SIZE (f), 1)
+      || lookup_element (elt, f, NULL, NO_INSERT))
+    return f;
+
+  /* Taking the alignment of elt->element is not enough, since it
+     might be just an array index or some such.  We shouldn't need to
+     initialize align here, but our optimizers don't always realize
+     that, if we leave the loop without initializing align, we'll fail
+     the assertion right after the loop.  */
+  align = (unsigned HOST_WIDE_INT)-1;
+  for (block = elt; block; block = block->parent)
+    if (DECL_P (block->element))
+      {
+	unsigned HOST_WIDE_INT dalign = DECL_ALIGN (block->element);
+
+	align = GET_MODE_BITSIZE (DECL_MODE (block->element));
+	if (align < dalign)
+	  align = dalign;
+	break;
+      }
+  gcc_assert (block);
+
+  bit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+  size = tree_low_cst (DECL_SIZE (f), 1);
+
+  alchk = align - 1;
+  alchk = ~alchk;
+
+  if ((bit & alchk) != ((bit + size - 1) & alchk))
+    return f;
+
+  /* Find adjacent fields in the same alignment word.  */
+
+  for (prev = f, f = TREE_CHAIN (f);
+       f && TREE_CODE (f) == FIELD_DECL
+	 && is_sra_scalar_type (TREE_TYPE (f))
+	 && host_integerp (DECL_FIELD_OFFSET (f), 1)
+	 && host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+	 && host_integerp (DECL_SIZE (f), 1)
+	 && !lookup_element (elt, f, NULL, NO_INSERT);
+       prev = f, f = TREE_CHAIN (f))
+    {
+      unsigned HOST_WIDE_INT nbit, nsize;
+
+      nbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	+ tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+      nsize = tree_low_cst (DECL_SIZE (f), 1);
+
+      if (bit + size == nbit)
+	{
+	  if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+	    break;
+	  size += nsize;
+	}
+      else if (nbit + nsize == bit)
+	{
+	  if ((nbit & alchk) != ((bit + size - 1) & alchk))
+	    break;
+	  bit = nbit;
+	  size += nsize;
+	}
+      else
+	break;
+    }
+
+  f = prev;
+
+  if (f == first)
+    return f;
+
+  gcc_assert ((bit & alchk) == ((bit + size - 1) & alchk));
+
+  /* Try to widen the bit range so as to cover padding bits as well.  */
+
+  if ((bit & ~alchk) || size != align)
+    {
+      unsigned HOST_WIDE_INT mbit = bit;
+      unsigned HOST_WIDE_INT msize = align;
+
+      for (f = TYPE_FIELDS (elt->type);
+	   f; f = TREE_CHAIN (f))
+	{
+	  unsigned HOST_WIDE_INT fbit, fsize;
+
+	  /* Skip the fields from first to prev.  */
+	  if (f == first)
+	    {
+	      f = prev;
+	      continue;
+	    }
+
+	  if (!(TREE_CODE (f) == FIELD_DECL
+		&& host_integerp (DECL_FIELD_OFFSET (f), 1)
+		&& host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)))
+	    continue;
+
+	  fbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+
+	  /* If we're past the selected word, we're fine.  */
+	  if ((bit & alchk) < (fbit & alchk))
+	    continue;
+
+	  if (host_integerp (DECL_SIZE (f), 1))
+	    fsize = tree_low_cst (DECL_SIZE (f), 1);
+	  else
+	    /* Assume a variable-sized field takes up all space till
+	       the end of the word.  ??? Endianness issues?  */
+	    fsize = align - (fbit & alchk);
+
+	  if ((fbit & alchk) < (bit & alchk))
+	    {
+	      /* A large field might start at a previous word and
+		 extend into the selected word.  Exclude those
+		 bits.  ??? Endianness issues? */
+	      HOST_WIDE_INT diff = fbit + fsize - mbit;
+
+	      if (diff <= 0)
+		continue;
+
+	      mbit += diff;
+	      msize -= diff;
+	    }
+	  else
+	    {
+	      /* Non-overlapping, great.  */
+	      if (fbit + fsize <= mbit
+		  || mbit + msize <= fbit)
+		continue;
+
+	      if (fbit <= mbit)
+		{
+		  unsigned HOST_WIDE_INT diff = fbit + fsize - mbit;
+		  mbit += diff;
+		  msize -= diff;
+		}
+	      else if (fbit > mbit)
+		msize -= (mbit + msize - fbit);
+	      else
+		gcc_unreachable ();
+	    }
+	}
+
+      bit = mbit;
+      size = msize;
+    }
+
+  /* Now we know the bit range we're interested in.  Find the smallest
+     machine mode we can use to access it.  */
+
+  for (mode = smallest_mode_for_size (size, MODE_INT);
+       ;
+       mode = GET_MODE_WIDER_MODE (mode))
+    {
+      gcc_assert (mode != VOIDmode);
+
+      alchk = GET_MODE_PRECISION (mode) - 1;
+      alchk = ~alchk;
+
+      if ((bit & alchk) == ((bit + size - 1) & alchk))
+	break;
+    }
+
+  gcc_assert (~alchk < align);
+
+  /* Create the field group as a single variable.  */
+
+  type = lang_hooks.types.type_for_mode (mode, 1);
+  gcc_assert (type);
+  var = build3 (BIT_FIELD_REF, type, NULL_TREE,
+		bitsize_int (size),
+		bitsize_int (bit));
+  BIT_FIELD_REF_UNSIGNED (var) = 1;
+
+  block = instantiate_missing_elements_1 (elt, var, type);
+  gcc_assert (block && block->is_scalar);
+
+  var = block->replacement;
+
+  if ((bit & ~alchk)
+      || (HOST_WIDE_INT)size != tree_low_cst (DECL_SIZE (var), 1))
+    {
+      block->replacement = build3 (BIT_FIELD_REF,
+				   TREE_TYPE (block->element), var,
+				   bitsize_int (size),
+				   bitsize_int (bit & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (block->replacement) = 1;
+      TREE_NO_WARNING (block->replacement) = 1;
+    }
+
+  block->in_bitfld_block = 2;
+
+  /* Add the member fields to the group, such that they access
+     portions of the group variable.  */
+
+  for (f = first; f != TREE_CHAIN (prev); f = TREE_CHAIN (f))
+    {
+      tree field_type = canon_type_for_field (f, elt->element);
+      struct sra_elt *fld = lookup_element (block, f, field_type, INSERT);
+
+      gcc_assert (fld && fld->is_scalar && !fld->replacement);
+
+      fld->replacement = build3 (BIT_FIELD_REF, field_type, var,
+				 DECL_SIZE (f),
+				 bitsize_int
+				 ((TREE_INT_CST_LOW (DECL_FIELD_OFFSET (f))
+				   * BITS_PER_UNIT
+				   + (TREE_INT_CST_LOW
+				      (DECL_FIELD_BIT_OFFSET (f))))
+				  & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (fld->replacement) = TYPE_UNSIGNED (field_type);
+      TREE_NO_WARNING (block->replacement) = 1;
+      fld->in_bitfld_block = 1;
+    }
+
+  return prev;
 }
 
 static void
@@ -1363,21 +1671,17 @@ instantiate_missing_elements (struct sra
 	for (f = TYPE_FIELDS (type); f ; f = TREE_CHAIN (f))
 	  if (TREE_CODE (f) == FIELD_DECL)
 	    {
-	      tree field_type = TREE_TYPE (f);
+	      tree last = try_instantiate_multiple_fields (elt, f);
 
-	      /* canonicalize_component_ref() unwidens some bit-field
-		 types (not marked as DECL_BIT_FIELD in C++), so we
-		 must do the same, lest we may introduce type
-		 mismatches.  */
-	      if (INTEGRAL_TYPE_P (field_type)
-		  && DECL_MODE (f) != TYPE_MODE (field_type))
-		field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
-							       field_type,
-							       elt->element,
-							       f, NULL_TREE),
-						       NULL_TREE));
+	      if (last != f)
+		{
+		  f = last;
+		  continue;
+		}
 
-	      instantiate_missing_elements_1 (elt, f, field_type);
+	      instantiate_missing_elements_1 (elt, f,
+					      canon_type_for_field
+					      (f, elt->element));
 	    }
 	break;
       }
@@ -1689,6 +1993,16 @@ generate_one_element_ref (struct sra_elt
       {
 	tree field = elt->element;
 
+	/* We can't test elt->in_bitfld_blk here because, when this is
+	   called from instantiate_element, we haven't set this field
+	   yet.  */
+	if (TREE_CODE (field) == BIT_FIELD_REF)
+	  {
+	    tree ret = copy_node (field);
+	    TREE_OPERAND (ret, 0) = base;
+	    return ret;
+	  }
+
 	/* Watch out for compatible records with differing field lists.  */
 	if (DECL_FIELD_CONTEXT (field) != TYPE_MAIN_VARIANT (TREE_TYPE (base)))
 	  field = find_compatible_field (TREE_TYPE (base), field);
@@ -1741,6 +2055,120 @@ sra_build_assignment (tree dst, tree src
   return build_gimple_modify_stmt (dst, src);
 }
 
+/* BIT_FIELD_REFs must not be shared.  sra_build_elt_assignment()
+   takes care of assignments, but we must create copies for uses.  */
+#define REPLDUP(t) (TREE_CODE (t) != BIT_FIELD_REF ? (t) : copy_node (t))
+
+static tree
+sra_build_elt_assignment (struct sra_elt *elt, tree src)
+{
+  tree dst = elt->replacement;
+  tree var, type, tmp, tmp2, tmp3;
+  tree list, stmt;
+  tree cst, cst2, mask;
+  tree minshift, maxshift;
+
+  if (TREE_CODE (dst) != BIT_FIELD_REF
+      || !elt->in_bitfld_block)
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  var = TREE_OPERAND (dst, 0);
+
+  /* Try to widen the assignment to the entire variable.
+     We need the source to be a BIT_FIELD_REF as well, such that, for
+     BIT_FIELD_REF<d,sz,dp> = BIT_FIELD_REF<s,sz,sp>,
+     by design, conditions are met such that we can turn it into
+     d = BIT_FIELD_REF<s,dw,sp-dp>.  */
+  if (elt->in_bitfld_block == 2
+      && TREE_CODE (src) == BIT_FIELD_REF)
+    {
+      src = fold_build3 (BIT_FIELD_REF, TREE_TYPE (var),
+			 TREE_OPERAND (src, 0),
+			 DECL_SIZE (TREE_OPERAND (dst, 0)),
+			 size_binop (MINUS_EXPR, TREE_OPERAND (src, 2),
+				     TREE_OPERAND (dst, 2)));
+      BIT_FIELD_REF_UNSIGNED (src) = 1;
+
+      return sra_build_assignment (var, src);
+    }
+
+  if (!is_gimple_reg (var))
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  list = alloc_stmt_list ();
+
+  cst = TREE_OPERAND (dst, 2);
+  cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (dst, 1),
+		     TREE_OPERAND (dst, 2));
+
+  if (WORDS_BIG_ENDIAN)
+    {
+      maxshift = size_binop (MINUS_EXPR, DECL_SIZE (var), cst);
+      minshift = size_binop (MINUS_EXPR, DECL_SIZE (var), cst2);
+    }
+  else
+    {
+      maxshift = cst2;
+      minshift = cst;
+    }
+
+  type = TREE_TYPE (var);
+
+  mask = build_int_cst_wide (type, 1, 0);
+  cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, 1);
+  cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, 1);
+  mask = int_const_binop (MINUS_EXPR, cst, cst2, 1);
+  mask = fold_build1 (BIT_NOT_EXPR, type, mask);
+
+  if (!WORDS_BIG_ENDIAN)
+    cst2 = TREE_OPERAND (dst, 2);
+
+  tmp = make_rename_temp (type, "SR");
+  stmt = build_gimple_modify_stmt (tmp,
+				   fold_build2 (BIT_AND_EXPR, type,
+						var, mask));
+  append_to_statement_list (stmt, &list);
+
+  if (is_gimple_reg (src))
+    tmp2 = src;
+  else
+    {
+      tmp2 = make_rename_temp (TREE_TYPE (src), "SR");
+      stmt = sra_build_assignment (tmp2, src);
+      append_to_statement_list (stmt, &list);
+    }
+
+  if (!TYPE_UNSIGNED (TREE_TYPE (tmp2))
+      || TYPE_MAIN_VARIANT (TREE_TYPE (tmp2)) != TYPE_MAIN_VARIANT (type))
+    {
+      tmp3 = make_rename_temp (type, "SR");
+      tmp2 = fold_build3 (BIT_FIELD_REF, type, tmp2, TREE_OPERAND (dst, 1),
+			  bitsize_int (0));
+      if (TREE_CODE (tmp2) == BIT_FIELD_REF)
+	BIT_FIELD_REF_UNSIGNED (tmp2) = 1;
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (!integer_zerop (minshift))
+    {
+      tmp3 = make_rename_temp (type, "SR");
+      stmt = build_gimple_modify_stmt (tmp3,
+				       fold_build2 (LSHIFT_EXPR, type,
+						    tmp2, minshift));
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  stmt = build_gimple_modify_stmt (var,
+				   fold_build2 (BIT_IOR_EXPR, type,
+						tmp, tmp2));
+  append_to_statement_list (stmt, &list);
+
+  return list;
+}
+
 /* Generate a set of assignment statements in *LIST_P to copy all
    instantiated elements under ELT to or from the equivalent structure
    rooted at EXPR.  COPY_OUT controls the direction of the copy, with
@@ -1771,9 +2199,9 @@ generate_copy_inout (struct sra_elt *elt
   else if (elt->replacement)
     {
       if (copy_out)
-	t = sra_build_assignment (elt->replacement, expr);
+	t = sra_build_elt_assignment (elt, expr);
       else
-	t = sra_build_assignment (expr, elt->replacement);
+	t = sra_build_assignment (expr, REPLDUP (elt->replacement));
       append_to_statement_list (t, list_p);
     }
   else
@@ -1798,6 +2226,19 @@ generate_element_copy (struct sra_elt *d
   FOR_EACH_ACTUAL_CHILD (dc, dst)
     {
       sc = lookup_element (src, dc->element, NULL, NO_INSERT);
+      if (!sc && dc->in_bitfld_block == 2)
+	{
+	  struct sra_elt *dcs;
+
+	  FOR_EACH_ACTUAL_CHILD (dcs, dc)
+	    {
+	      sc = lookup_element (src, dcs->element, NULL, NO_INSERT);
+	      gcc_assert (sc);
+	      generate_element_copy (dcs, sc, list_p);
+	    }
+
+	  continue;
+	}
       gcc_assert (sc);
       generate_element_copy (dc, sc, list_p);
     }
@@ -1808,7 +2249,7 @@ generate_element_copy (struct sra_elt *d
 
       gcc_assert (src->replacement);
 
-      t = sra_build_assignment (dst->replacement, src->replacement);
+      t = sra_build_elt_assignment (dst, REPLDUP (src->replacement));
       append_to_statement_list (t, list_p);
     }
 }
@@ -1829,8 +2270,9 @@ generate_element_zero (struct sra_elt *e
       return;
     }
 
-  FOR_EACH_ACTUAL_CHILD (c, elt)
-    generate_element_zero (c, list_p);
+  if (!elt->in_bitfld_block)
+    FOR_EACH_ACTUAL_CHILD (c, elt)
+      generate_element_zero (c, list_p);
 
   if (elt->replacement)
     {
@@ -1839,7 +2281,7 @@ generate_element_zero (struct sra_elt *e
       gcc_assert (elt->is_scalar);
       t = fold_convert (elt->type, integer_zero_node);
 
-      t = sra_build_assignment (elt->replacement, t);
+      t = sra_build_elt_assignment (elt, t);
       append_to_statement_list (t, list_p);
     }
 }
@@ -1848,10 +2290,10 @@ generate_element_zero (struct sra_elt *e
    Add the result to *LIST_P.  */
 
 static void
-generate_one_element_init (tree var, tree init, tree *list_p)
+generate_one_element_init (struct sra_elt *elt, tree init, tree *list_p)
 {
   /* The replacement can be almost arbitrarily complex.  Gimplify.  */
-  tree stmt = sra_build_assignment (var, init);
+  tree stmt = sra_build_elt_assignment (elt, init);
   gimplify_and_add (stmt, list_p);
 }
 
@@ -1880,7 +2322,7 @@ generate_element_init_1 (struct sra_elt 
     {
       if (elt->replacement)
 	{
-	  generate_one_element_init (elt->replacement, init, list_p);
+	  generate_one_element_init (elt, init, list_p);
 	  elt->visited = true;
 	}
       return result;
@@ -2041,23 +2483,45 @@ static void
 scalarize_use (struct sra_elt *elt, tree *expr_p, block_stmt_iterator *bsi,
 	       bool is_output, bool use_all)
 {
-  tree list = NULL, stmt = bsi_stmt (*bsi);
+  tree stmt = bsi_stmt (*bsi);
 
   if (elt->replacement)
     {
       /* If we have a replacement, then updating the reference is as
 	 simple as modifying the existing statement in place.  */
       if (is_output)
-	mark_all_v_defs (stmt);
-      *expr_p = elt->replacement;
+	{
+	  if (TREE_CODE (elt->replacement) == BIT_FIELD_REF
+	      && is_gimple_reg (TREE_OPERAND (elt->replacement, 0))
+	      && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	      && &GIMPLE_STMT_OPERAND (stmt, 0) == expr_p)
+	    {
+	      tree newstmt = sra_build_elt_assignment
+		(elt, GIMPLE_STMT_OPERAND (stmt, 1));
+	      if (TREE_CODE (newstmt) != STATEMENT_LIST)
+		{
+		  tree list = alloc_stmt_list ();
+		  append_to_statement_list (newstmt, &list);
+		  newstmt = list;
+		}
+	      sra_replace (bsi, newstmt);
+	      return;
+	    }
+
+	  mark_all_v_defs (stmt);
+	}
+      *expr_p = REPLDUP (elt->replacement);
       update_stmt (stmt);
     }
   else
     {
-      /* Otherwise we need some copies.  If ELT is being read, then we want
-	 to store all (modified) sub-elements back into the structure before
-	 the reference takes place.  If ELT is being written, then we want to
-	 load the changed values back into our shadow variables.  */
+      tree list = NULL;
+
+      /* Otherwise we need some copies.  If ELT is being read, then we
+	 want to store all (modified) sub-elements back into the
+	 structure before the reference takes place.  If ELT is being
+	 written, then we want to load the changed values back into
+	 our shadow variables.  */
       /* ??? We don't check modified for reads, we just always write all of
 	 the values.  We should be able to record the SSA number of the VOP
 	 for which the values were last read.  If that number matches the
@@ -2101,7 +2565,7 @@ scalarize_copy (struct sra_elt *lhs_elt,
       gcc_assert (TREE_CODE (stmt) == GIMPLE_MODIFY_STMT);
 
       GIMPLE_STMT_OPERAND (stmt, 0) = lhs_elt->replacement;
-      GIMPLE_STMT_OPERAND (stmt, 1) = rhs_elt->replacement;
+      GIMPLE_STMT_OPERAND (stmt, 1) = REPLDUP (rhs_elt->replacement);
       update_stmt (stmt);
     }
   else if (lhs_elt->use_block_copy || rhs_elt->use_block_copy)
@@ -2255,8 +2719,8 @@ scalarize_ldst (struct sra_elt *elt, tre
 
       mark_all_v_defs (stmt);
       generate_copy_inout (elt, is_output, other, &list);
-      mark_all_v_defs (list);
       gcc_assert (list);
+      mark_all_v_defs (list);
 
       /* Preserve EH semantics.  */
       if (stmt_ends_bb_p (stmt))
@@ -2352,6 +2816,10 @@ dump_sra_elt_name (FILE *f, struct sra_e
 	    fputc ('.', f);
 	  print_generic_expr (f, elt->element, dump_flags);
 	}
+      else if (TREE_CODE (elt->element) == BIT_FIELD_REF)
+	fprintf (f, "$B" HOST_WIDE_INT_PRINT_DEC "F" HOST_WIDE_INT_PRINT_DEC,
+		 tree_low_cst (TREE_OPERAND (elt->element, 2), 1),
+		 tree_low_cst (TREE_OPERAND (elt->element, 1), 1));
       else if (TREE_CODE (elt->element) == RANGE_EXPR)
 	fprintf (f, "["HOST_WIDE_INT_PRINT_DEC".."HOST_WIDE_INT_PRINT_DEC"]",
 		 TREE_INT_CST_LOW (TREE_OPERAND (elt->element, 0)),

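As an aside for readers of the patch above: the grouping test in try_instantiate_multiple_fields boils down to checking whether the first and last bit of a candidate range land in the same alignment word, using the `alchk` mask. A minimal standalone sketch of that test follows; the helper name is ours, not GCC's, and `align` is assumed to be a power of two:

```c
#include <stdint.h>

/* Sketch of the "same alignment word" test used by
   try_instantiate_multiple_fields: a bit range may join a field group
   only if it lies entirely inside one ALIGN-bit word.  ALCHK keeps
   only the bits that select the word, discarding the within-word
   offset, exactly as the patch computes alchk = ~(align - 1).  */
static int
same_alignment_word (uint64_t bit, uint64_t size, uint64_t align)
{
  uint64_t alchk = ~(align - 1);  /* mask off the within-word bits */
  return (bit & alchk) == ((bit + size - 1) & alchk);
}
```

For example, a 16-bit field starting at bit 24 crosses a 32-bit word boundary (bits 24..39), so it would not be grouped.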
[-- Attachment #3: Type: text/plain, Size: 249 bytes --]


-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-05-04  4:07                 ` Alexandre Oliva
@ 2007-05-04  5:24                   ` Alexandre Oliva
  2007-05-05 18:19                     ` Roman Zippel
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-04  5:24 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Bernd Schmidt, Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

[-- Attachment #1: Type: text/plain, Size: 328 bytes --]

On May  4, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:

> Would you please give it a spin on m68k?  Bootstrap on
> x86_64-linux-gnu hasn't completed yet, but it's already built Ada in
> stage2, so I'm somewhat confident that this might work ;-)

Err...  Sorry, I failed to refresh the patch.  This one should fare
better.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: gcc-sra-bit-field-ref-take2.patch --]
[-- Type: text/x-patch, Size: 25944 bytes --]

2007-04-30  Alexandre Oliva  <aoliva@redhat.com>

	PR middle-end/22156
	2007-04-05  Alexandre Oliva  <aoliva@redhat.com>
	* tree-sra.c (struct sra_elt): Add in_bitfld_block.
	(sra_hash_tree): Handle BIT_FIELD_REFs.
	(sra_elt_hash): Don't hash bitfld blocks.
	(sra_elt_eq): Skip them in parent compares as well.  Handle
	BIT_FIELD_REFs.
	(sra_walk_expr): Use a variable as both input and output if it's
	referenced in a BIT_FIELD_REF in a lhs.
	(build_element_name_1): Handle BIT_FIELD_REFs.
	(instantiate_element): Propagate nowarn from parents.
	Zero-initialize bit-field variables.
	(instantiate_missing_elements_1): Return the sra_elt.
	(canon_type_for_field): New.
	(try_instantiate_multiple_fields): New.  Infer widest mode
	from base decl.
	(instantiate_missing_elements): Use them.
	(generate_one_element_ref): Handle BIT_FIELD_REFs.
	(REPLDUP): New.
	(sra_build_elt_assignment): New.  Avoid needless
	initialization of maxalign and minalign.  Fix assignment to
	whole bitfield variables for big-endian machines.
	(generate_copy_inout): Use them.
	(generate_element_copy): Likewise.  Handle bitfld differences.
	(generate_element_zero): Don't recurse for blocks.  Use
	sra_build_elt_assignment.
	(generate_one_element_init): Take elt instead of var.  Use
	sra_build_elt_assignment.
	(generate_element_init_1): Adjust.
	(scalarize_use): Re-expand assignment to BIT_FIELD_REF of
	gimple_reg.  Use REPLDUP.
	(scalarize_copy): Use REPLDUP.
	(scalarize_ldst): Move assert before dereference.
	(dump_sra_elt_name): Handle BIT_FIELD_REFs.
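
For the little-endian case, the statement sequence sra_build_elt_assignment emits for a write to one bit-field of a scalarized group variable amounts to the classic mask/shift/merge pattern. A simplified C sketch follows; the function name and the fixed 32-bit group width are illustrative assumptions, not GCC code:

```c
#include <stdint.h>

/* Illustrative sketch: store SRC into the WIDTH-bit field starting at
   bit MINSHIFT of the 32-bit group variable VAR, mirroring the
     tmp = var & ~mask;  tmp2 = src << minshift;  var = tmp | tmp2
   sequence the patch builds (little-endian case).  */
static uint32_t
store_bitfield (uint32_t var, uint32_t src, unsigned minshift, unsigned width)
{
  /* The mask selects the destination bits, matching the folded
     (1 << maxshift) - (1 << minshift) constants in the patch.  */
  uint32_t mask = ((width < 32 ? (1u << width) : 0u) - 1u) << minshift;
  uint32_t tmp = var & ~mask;                /* clear the field (BIT_AND_EXPR) */
  uint32_t tmp2 = (src << minshift) & mask;  /* position the new value */
  return tmp | tmp2;                         /* merge (BIT_IOR_EXPR) */
}
```

(In the patch the source is narrowed with a BIT_FIELD_REF rather than an explicit AND, but the effect on the stored bits is the same.)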

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-05-03 13:24:48.000000000 -0300
+++ gcc/tree-sra.c	2007-05-04 02:16:53.000000000 -0300
@@ -147,6 +147,10 @@ struct sra_elt
 
   /* True if there is BIT_FIELD_REF on the lhs with a vector. */
   bool is_vector_lhs;
+
+  /* 1 if the element is a field that is part of a block, 2 if the field
+     is the block itself, 0 if it's neither.  */
+  char in_bitfld_block;
 };
 
 #define IS_ELEMENT_FOR_GROUP(ELEMENT) (TREE_CODE (ELEMENT) == RANGE_EXPR)
@@ -461,6 +465,12 @@ sra_hash_tree (tree t)
       h = iterative_hash_expr (DECL_FIELD_BIT_OFFSET (t), h);
       break;
 
+    case BIT_FIELD_REF:
+      /* Don't take operand 0 into account, that's our parent.  */
+      h = iterative_hash_expr (TREE_OPERAND (t, 1), 0);
+      h = iterative_hash_expr (TREE_OPERAND (t, 2), h);
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -479,12 +489,14 @@ sra_elt_hash (const void *x)
 
   h = sra_hash_tree (e->element);
 
-  /* Take into account everything back up the chain.  Given that chain
-     lengths are rarely very long, this should be acceptable.  If we
-     truly identify this as a performance problem, it should work to
-     hash the pointer value "e->parent".  */
+  /* Take into account everything except bitfield blocks back up the
+     chain.  Given that chain lengths are rarely very long, this
+     should be acceptable.  If we truly identify this as a performance
+     problem, it should work to hash the pointer value
+     "e->parent".  */
   for (p = e->parent; p ; p = p->parent)
-    h = (h * 65521) ^ sra_hash_tree (p->element);
+    if (!p->in_bitfld_block)
+      h = (h * 65521) ^ sra_hash_tree (p->element);
 
   return h;
 }
@@ -497,8 +509,17 @@ sra_elt_eq (const void *x, const void *y
   const struct sra_elt *a = x;
   const struct sra_elt *b = y;
   tree ae, be;
+  const struct sra_elt *ap = a->parent;
+  const struct sra_elt *bp = b->parent;
 
-  if (a->parent != b->parent)
+  if (ap)
+    while (ap->in_bitfld_block)
+      ap = ap->parent;
+  if (bp)
+    while (bp->in_bitfld_block)
+      bp = bp->parent;
+
+  if (ap != bp)
     return false;
 
   ae = a->element;
@@ -533,6 +554,11 @@ sra_elt_eq (const void *x, const void *y
 	return false;
       return fields_compatible_p (ae, be);
 
+    case BIT_FIELD_REF:
+      return
+	tree_int_cst_equal (TREE_OPERAND (ae, 1), TREE_OPERAND (be, 1))
+	&& tree_int_cst_equal (TREE_OPERAND (ae, 2), TREE_OPERAND (be, 2));
+
     default:
       gcc_unreachable ();
     }
@@ -729,6 +755,7 @@ sra_walk_expr (tree *expr_p, block_stmt_
   tree inner = expr;
   bool disable_scalarization = false;
   bool use_all_p = false;
+  bool use_both_p = false;
 
   /* We're looking to collect a reference expression between EXPR and INNER,
      such that INNER is a scalarizable decl and all other nodes through EXPR
@@ -749,7 +776,11 @@ sra_walk_expr (tree *expr_p, block_stmt_
 	    if (disable_scalarization)
 	      elt->cannot_scalarize = true;
 	    else
-	      fns->use (elt, expr_p, bsi, is_output, use_all_p);
+	      {
+		if (use_both_p)
+		  fns->use (elt, expr_p, bsi, false, use_all_p);
+		fns->use (elt, expr_p, bsi, is_output, use_all_p);
+	      }
 	  }
 	return;
 
@@ -816,6 +847,11 @@ sra_walk_expr (tree *expr_p, block_stmt_
 	    if (elt)
 	      elt->is_vector_lhs = true;
 	  }
+	else if (is_output)
+	  /* If we're writing to a BIT-FIELD, we have to make sure the
+	     inner variable is back in place before we start modifying
+	     it.  */
+	  use_both_p = true;
 	/* A bit field reference (access to *multiple* fields simultaneously)
 	   is not currently scalarized.  Consider this an access to the
 	   complete outer element, to which walk_tree will bring us next.  */
@@ -884,11 +920,14 @@ sra_walk_asm_expr (tree expr, block_stmt
   sra_walk_tree_list (ASM_OUTPUTS (expr), bsi, true, fns);
 }
 
+static void sra_replace (block_stmt_iterator *bsi, tree list);
+static tree sra_build_elt_assignment (struct sra_elt *elt, tree src);
+
 /* Walk a GIMPLE_MODIFY_STMT and categorize the assignment appropriately.  */
 
 static void
 sra_walk_gimple_modify_stmt (tree expr, block_stmt_iterator *bsi,
-		      const struct sra_walk_fns *fns)
+			     const struct sra_walk_fns *fns)
 {
   struct sra_elt *lhs_elt, *rhs_elt;
   tree lhs, rhs;
@@ -1177,6 +1216,15 @@ build_element_name_1 (struct sra_elt *el
       sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (t));
       obstack_grow (&sra_obstack, buffer, strlen (buffer));
     }
+  else if (TREE_CODE (t) == BIT_FIELD_REF)
+    {
+      sprintf (buffer, "B" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 2), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+      sprintf (buffer, "F" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 1), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+    }
   else
     {
       tree name = DECL_NAME (t);
@@ -1209,9 +1257,11 @@ instantiate_element (struct sra_elt *elt
 {
   struct sra_elt *base_elt;
   tree var, base;
+  bool nowarn = TREE_NO_WARNING (elt->element);
 
   for (base_elt = elt; base_elt->parent; base_elt = base_elt->parent)
-    continue;
+    if (!nowarn)
+      nowarn = TREE_NO_WARNING (base_elt->parent->element);
   base = base_elt->element;
 
   elt->replacement = var = make_rename_temp (elt->type, "SR");
@@ -1240,9 +1290,10 @@ instantiate_element (struct sra_elt *elt
       DECL_DEBUG_EXPR_IS_FROM (var) = 1;
       
       DECL_IGNORED_P (var) = 0;
-      TREE_NO_WARNING (var) = TREE_NO_WARNING (base);
-      if (elt->element && TREE_NO_WARNING (elt->element))
-	TREE_NO_WARNING (var) = 1;
+      TREE_NO_WARNING (var) = nowarn;
+
+      if (TREE_CODE (elt->element) == BIT_FIELD_REF)
+	DECL_INITIAL (var) = integer_zero_node;
     }
   else
     {
@@ -1337,7 +1388,7 @@ sum_instantiated_sizes (struct sra_elt *
 
 static void instantiate_missing_elements (struct sra_elt *elt);
 
-static void
+static struct sra_elt *
 instantiate_missing_elements_1 (struct sra_elt *elt, tree child, tree type)
 {
   struct sra_elt *sub = lookup_element (elt, child, type, INSERT);
@@ -1348,6 +1399,254 @@ instantiate_missing_elements_1 (struct s
     }
   else
     instantiate_missing_elements (sub);
+  return sub;
+}
+
+/* Obtain the canonical type for field F of ELEMENT.  */
+
+static tree
+canon_type_for_field (tree f, tree element)
+{
+  tree field_type = TREE_TYPE (f);
+
+  /* canonicalize_component_ref() unwidens some bit-field types (not
+     marked as DECL_BIT_FIELD in C++), so we must do the same, lest we
+     introduce type mismatches.  */
+  if (INTEGRAL_TYPE_P (field_type)
+      && DECL_MODE (f) != TYPE_MODE (field_type))
+    field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
+						   field_type,
+						   element,
+						   f, NULL_TREE),
+					   NULL_TREE));
+
+  return field_type;
+}
+
+/* Look for adjacent fields of ELT starting at F that we'd like to
+   scalarize as a single variable.  Return the last field of the
+   group.  */
+
+static tree
+try_instantiate_multiple_fields (struct sra_elt *elt, tree f)
+{
+  unsigned HOST_WIDE_INT align, bit, size, alchk;
+  enum machine_mode mode;
+  tree first = f, prev;
+  tree type, var;
+  struct sra_elt *block;
+
+  if (!is_sra_scalar_type (TREE_TYPE (f))
+      || !host_integerp (DECL_FIELD_OFFSET (f), 1)
+      || !host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+      || !host_integerp (DECL_SIZE (f), 1)
+      || lookup_element (elt, f, NULL, NO_INSERT))
+    return f;
+
+  block = elt;
+  while (block->parent)
+    block = block->parent;
+  gcc_assert (DECL_P (block->element));
+  align = DECL_ALIGN (block->element);
+  alchk = GET_MODE_BITSIZE (DECL_MODE (block->element));
+  if (align < alchk)
+    align = alchk;
+
+  bit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+  size = tree_low_cst (DECL_SIZE (f), 1);
+
+  alchk = align - 1;
+  alchk = ~alchk;
+
+  if ((bit & alchk) != ((bit + size - 1) & alchk))
+    return f;
+
+  /* Find adjacent fields in the same alignment word.  */
+
+  for (prev = f, f = TREE_CHAIN (f);
+       f && TREE_CODE (f) == FIELD_DECL
+	 && is_sra_scalar_type (TREE_TYPE (f))
+	 && host_integerp (DECL_FIELD_OFFSET (f), 1)
+	 && host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+	 && host_integerp (DECL_SIZE (f), 1)
+	 && !lookup_element (elt, f, NULL, NO_INSERT);
+       prev = f, f = TREE_CHAIN (f))
+    {
+      unsigned HOST_WIDE_INT nbit, nsize;
+
+      nbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	+ tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+      nsize = tree_low_cst (DECL_SIZE (f), 1);
+
+      if (bit + size == nbit)
+	{
+	  if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+	    break;
+	  size += nsize;
+	}
+      else if (nbit + nsize == bit)
+	{
+	  if ((nbit & alchk) != ((bit + size - 1) & alchk))
+	    break;
+	  bit = nbit;
+	  size += nsize;
+	}
+      else
+	break;
+    }
+
+  f = prev;
+
+  if (f == first)
+    return f;
+
+  gcc_assert ((bit & alchk) == ((bit + size - 1) & alchk));
+
+  /* Try to widen the bit range so as to cover padding bits as well.  */
+
+  if ((bit & ~alchk) || size != align)
+    {
+      unsigned HOST_WIDE_INT mbit = bit & alchk;
+      unsigned HOST_WIDE_INT msize = align;
+
+      for (f = TYPE_FIELDS (elt->type);
+	   f; f = TREE_CHAIN (f))
+	{
+	  unsigned HOST_WIDE_INT fbit, fsize;
+
+	  /* Skip the fields from first to prev.  */
+	  if (f == first)
+	    {
+	      f = prev;
+	      continue;
+	    }
+
+	  if (!(TREE_CODE (f) == FIELD_DECL
+		&& host_integerp (DECL_FIELD_OFFSET (f), 1)
+		&& host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)))
+	    continue;
+
+	  fbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+
+	  /* If we're past the selected word, we're fine.  */
+	  if ((bit & alchk) < (fbit & alchk))
+	    continue;
+
+	  if (host_integerp (DECL_SIZE (f), 1))
+	    fsize = tree_low_cst (DECL_SIZE (f), 1);
+	  else
+	    /* Assume a variable-sized field takes up all space till
+	       the end of the word.  ??? Endianness issues?  */
+	    fsize = align - (fbit & alchk);
+
+	  if ((fbit & alchk) < (bit & alchk))
+	    {
+	      /* A large field might start at a previous word and
+		 extend into the selected word.  Exclude those
+		 bits.  ??? Endianness issues? */
+	      HOST_WIDE_INT diff = fbit + fsize - mbit;
+
+	      if (diff <= 0)
+		continue;
+
+	      mbit += diff;
+	      msize -= diff;
+	    }
+	  else
+	    {
+	      /* Non-overlapping, great.  */
+	      if (fbit + fsize <= mbit
+		  || mbit + msize <= fbit)
+		continue;
+
+	      if (fbit <= mbit)
+		{
+		  unsigned HOST_WIDE_INT diff = fbit + fsize - mbit;
+		  mbit += diff;
+		  msize -= diff;
+		}
+	      else if (fbit > mbit)
+		msize -= (mbit + msize - fbit);
+	      else
+		gcc_unreachable ();
+	    }
+	}
+
+      bit = mbit;
+      size = msize;
+    }
+
+  /* Now we know the bit range we're interested in.  Find the smallest
+     machine mode we can use to access it.  */
+
+  for (mode = smallest_mode_for_size (size, MODE_INT);
+       ;
+       mode = GET_MODE_WIDER_MODE (mode))
+    {
+      gcc_assert (mode != VOIDmode);
+
+      alchk = GET_MODE_PRECISION (mode) - 1;
+      alchk = ~alchk;
+
+      if ((bit & alchk) == ((bit + size - 1) & alchk))
+	break;
+    }
+
+  gcc_assert (~alchk < align);
+
+  /* Create the field group as a single variable.  */
+
+  type = lang_hooks.types.type_for_mode (mode, 1);
+  gcc_assert (type);
+  var = build3 (BIT_FIELD_REF, type, NULL_TREE,
+		bitsize_int (size),
+		bitsize_int (bit));
+  BIT_FIELD_REF_UNSIGNED (var) = 1;
+
+  block = instantiate_missing_elements_1 (elt, var, type);
+  gcc_assert (block && block->is_scalar);
+
+  var = block->replacement;
+
+  if ((bit & ~alchk)
+      || (HOST_WIDE_INT)size != tree_low_cst (DECL_SIZE (var), 1))
+    {
+      block->replacement = build3 (BIT_FIELD_REF,
+				   TREE_TYPE (block->element), var,
+				   bitsize_int (size),
+				   bitsize_int (bit & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (block->replacement) = 1;
+      TREE_NO_WARNING (block->replacement) = 1;
+    }
+
+  block->in_bitfld_block = 2;
+
+  /* Add the member fields to the group, such that they access
+     portions of the group variable.  */
+
+  for (f = first; f != TREE_CHAIN (prev); f = TREE_CHAIN (f))
+    {
+      tree field_type = canon_type_for_field (f, elt->element);
+      struct sra_elt *fld = lookup_element (block, f, field_type, INSERT);
+
+      gcc_assert (fld && fld->is_scalar && !fld->replacement);
+
+      fld->replacement = build3 (BIT_FIELD_REF, field_type, var,
+				 DECL_SIZE (f),
+				 bitsize_int
+				 ((TREE_INT_CST_LOW (DECL_FIELD_OFFSET (f))
+				   * BITS_PER_UNIT
+				   + (TREE_INT_CST_LOW
+				      (DECL_FIELD_BIT_OFFSET (f))))
+				  & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (fld->replacement) = TYPE_UNSIGNED (field_type);
+      TREE_NO_WARNING (block->replacement) = 1;
+      fld->in_bitfld_block = 1;
+    }
+
+  return prev;
 }
 
 static void
@@ -1363,21 +1662,17 @@ instantiate_missing_elements (struct sra
 	for (f = TYPE_FIELDS (type); f ; f = TREE_CHAIN (f))
 	  if (TREE_CODE (f) == FIELD_DECL)
 	    {
-	      tree field_type = TREE_TYPE (f);
+	      tree last = try_instantiate_multiple_fields (elt, f);
 
-	      /* canonicalize_component_ref() unwidens some bit-field
-		 types (not marked as DECL_BIT_FIELD in C++), so we
-		 must do the same, lest we may introduce type
-		 mismatches.  */
-	      if (INTEGRAL_TYPE_P (field_type)
-		  && DECL_MODE (f) != TYPE_MODE (field_type))
-		field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
-							       field_type,
-							       elt->element,
-							       f, NULL_TREE),
-						       NULL_TREE));
+	      if (last != f)
+		{
+		  f = last;
+		  continue;
+		}
 
-	      instantiate_missing_elements_1 (elt, f, field_type);
+	      instantiate_missing_elements_1 (elt, f,
+					      canon_type_for_field
+					      (f, elt->element));
 	    }
 	break;
       }
@@ -1689,6 +1984,16 @@ generate_one_element_ref (struct sra_elt
       {
 	tree field = elt->element;
 
+	/* We can't test elt->in_bitfld_blk here because, when this is
+	   called from instantiate_element, we haven't set this field
+	   yet.  */
+	if (TREE_CODE (field) == BIT_FIELD_REF)
+	  {
+	    tree ret = copy_node (field);
+	    TREE_OPERAND (ret, 0) = base;
+	    return ret;
+	  }
+
 	/* Watch out for compatible records with differing field lists.  */
 	if (DECL_FIELD_CONTEXT (field) != TYPE_MAIN_VARIANT (TREE_TYPE (base)))
 	  field = find_compatible_field (TREE_TYPE (base), field);
@@ -1741,6 +2046,120 @@ sra_build_assignment (tree dst, tree src
   return build_gimple_modify_stmt (dst, src);
 }
 
+/* BIT_FIELD_REFs must not be shared.  sra_build_elt_assignment()
+   takes care of assignments, but we must create copies for uses.  */
+#define REPLDUP(t) (TREE_CODE (t) != BIT_FIELD_REF ? (t) : copy_node (t))
+
+static tree
+sra_build_elt_assignment (struct sra_elt *elt, tree src)
+{
+  tree dst = elt->replacement;
+  tree var, type, tmp, tmp2, tmp3;
+  tree list, stmt;
+  tree cst, cst2, mask;
+  tree minshift, maxshift;
+
+  if (TREE_CODE (dst) != BIT_FIELD_REF
+      || !elt->in_bitfld_block)
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  var = TREE_OPERAND (dst, 0);
+
+  /* Try to widen the assignment to the entire variable.
+     We need the source to be a BIT_FIELD_REF as well, such that, for
+     BIT_FIELD_REF<d,sz,dp> = BIT_FIELD_REF<s,sz,sp>,
+     by design, conditions are met such that we can turn it into
+     d = BIT_FIELD_REF<s,dw,sp-dp>.  */
+  if (elt->in_bitfld_block == 2
+      && TREE_CODE (src) == BIT_FIELD_REF)
+    {
+      src = fold_build3 (BIT_FIELD_REF, TREE_TYPE (var),
+			 TREE_OPERAND (src, 0),
+			 DECL_SIZE (TREE_OPERAND (dst, 0)),
+			 size_binop (MINUS_EXPR, TREE_OPERAND (src, 2),
+				     TREE_OPERAND (dst, 2)));
+      BIT_FIELD_REF_UNSIGNED (src) = 1;
+
+      return sra_build_assignment (var, src);
+    }
+
+  if (!is_gimple_reg (var))
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  list = alloc_stmt_list ();
+
+  cst = TREE_OPERAND (dst, 2);
+  cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (dst, 1),
+		     TREE_OPERAND (dst, 2));
+
+  if (WORDS_BIG_ENDIAN)
+    {
+      maxshift = size_binop (MINUS_EXPR, DECL_SIZE (var), cst);
+      minshift = size_binop (MINUS_EXPR, DECL_SIZE (var), cst2);
+    }
+  else
+    {
+      maxshift = cst2;
+      minshift = cst;
+    }
+
+  type = TREE_TYPE (var);
+
+  mask = build_int_cst_wide (type, 1, 0);
+  cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, 1);
+  cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, 1);
+  mask = int_const_binop (MINUS_EXPR, cst, cst2, 1);
+  mask = fold_build1 (BIT_NOT_EXPR, type, mask);
+
+  if (!WORDS_BIG_ENDIAN)
+    cst2 = TREE_OPERAND (dst, 2);
+
+  tmp = make_rename_temp (type, "SR");
+  stmt = build_gimple_modify_stmt (tmp,
+				   fold_build2 (BIT_AND_EXPR, type,
+						var, mask));
+  append_to_statement_list (stmt, &list);
+
+  if (is_gimple_reg (src))
+    tmp2 = src;
+  else
+    {
+      tmp2 = make_rename_temp (TREE_TYPE (src), "SR");
+      stmt = sra_build_assignment (tmp2, src);
+      append_to_statement_list (stmt, &list);
+    }
+
+  if (!TYPE_UNSIGNED (TREE_TYPE (tmp2))
+      || TYPE_MAIN_VARIANT (TREE_TYPE (tmp2)) != TYPE_MAIN_VARIANT (type))
+    {
+      tmp3 = make_rename_temp (type, "SR");
+      tmp2 = fold_build3 (BIT_FIELD_REF, type, tmp2, TREE_OPERAND (dst, 1),
+			  bitsize_int (0));
+      if (TREE_CODE (tmp2) == BIT_FIELD_REF)
+	BIT_FIELD_REF_UNSIGNED (tmp2) = 1;
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (!integer_zerop (minshift))
+    {
+      tmp3 = make_rename_temp (type, "SR");
+      stmt = build_gimple_modify_stmt (tmp3,
+				       fold_build2 (LSHIFT_EXPR, type,
+						    tmp2, minshift));
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  stmt = build_gimple_modify_stmt (var,
+				   fold_build2 (BIT_IOR_EXPR, type,
+						tmp, tmp2));
+  append_to_statement_list (stmt, &list);
+
+  return list;
+}
+
 /* Generate a set of assignment statements in *LIST_P to copy all
    instantiated elements under ELT to or from the equivalent structure
    rooted at EXPR.  COPY_OUT controls the direction of the copy, with
@@ -1771,9 +2190,9 @@ generate_copy_inout (struct sra_elt *elt
   else if (elt->replacement)
     {
       if (copy_out)
-	t = sra_build_assignment (elt->replacement, expr);
+	t = sra_build_elt_assignment (elt, expr);
       else
-	t = sra_build_assignment (expr, elt->replacement);
+	t = sra_build_assignment (expr, REPLDUP (elt->replacement));
       append_to_statement_list (t, list_p);
     }
   else
@@ -1798,6 +2217,19 @@ generate_element_copy (struct sra_elt *d
   FOR_EACH_ACTUAL_CHILD (dc, dst)
     {
       sc = lookup_element (src, dc->element, NULL, NO_INSERT);
+      if (!sc && dc->in_bitfld_block == 2)
+	{
+	  struct sra_elt *dcs;
+
+	  FOR_EACH_ACTUAL_CHILD (dcs, dc)
+	    {
+	      sc = lookup_element (src, dcs->element, NULL, NO_INSERT);
+	      gcc_assert (sc);
+	      generate_element_copy (dcs, sc, list_p);
+	    }
+
+	  continue;
+	}
       gcc_assert (sc);
       generate_element_copy (dc, sc, list_p);
     }
@@ -1808,7 +2240,7 @@ generate_element_copy (struct sra_elt *d
 
       gcc_assert (src->replacement);
 
-      t = sra_build_assignment (dst->replacement, src->replacement);
+      t = sra_build_elt_assignment (dst, REPLDUP (src->replacement));
       append_to_statement_list (t, list_p);
     }
 }
@@ -1829,8 +2261,9 @@ generate_element_zero (struct sra_elt *e
       return;
     }
 
-  FOR_EACH_ACTUAL_CHILD (c, elt)
-    generate_element_zero (c, list_p);
+  if (!elt->in_bitfld_block)
+    FOR_EACH_ACTUAL_CHILD (c, elt)
+      generate_element_zero (c, list_p);
 
   if (elt->replacement)
     {
@@ -1839,7 +2272,7 @@ generate_element_zero (struct sra_elt *e
       gcc_assert (elt->is_scalar);
       t = fold_convert (elt->type, integer_zero_node);
 
-      t = sra_build_assignment (elt->replacement, t);
+      t = sra_build_elt_assignment (elt, t);
       append_to_statement_list (t, list_p);
     }
 }
@@ -1848,10 +2281,10 @@ generate_element_zero (struct sra_elt *e
    Add the result to *LIST_P.  */
 
 static void
-generate_one_element_init (tree var, tree init, tree *list_p)
+generate_one_element_init (struct sra_elt *elt, tree init, tree *list_p)
 {
   /* The replacement can be almost arbitrarily complex.  Gimplify.  */
-  tree stmt = sra_build_assignment (var, init);
+  tree stmt = sra_build_elt_assignment (elt, init);
   gimplify_and_add (stmt, list_p);
 }
 
@@ -1880,7 +2313,7 @@ generate_element_init_1 (struct sra_elt 
     {
       if (elt->replacement)
 	{
-	  generate_one_element_init (elt->replacement, init, list_p);
+	  generate_one_element_init (elt, init, list_p);
 	  elt->visited = true;
 	}
       return result;
@@ -2041,23 +2474,45 @@ static void
 scalarize_use (struct sra_elt *elt, tree *expr_p, block_stmt_iterator *bsi,
 	       bool is_output, bool use_all)
 {
-  tree list = NULL, stmt = bsi_stmt (*bsi);
+  tree stmt = bsi_stmt (*bsi);
 
   if (elt->replacement)
     {
       /* If we have a replacement, then updating the reference is as
 	 simple as modifying the existing statement in place.  */
       if (is_output)
-	mark_all_v_defs (stmt);
-      *expr_p = elt->replacement;
+	{
+	  if (TREE_CODE (elt->replacement) == BIT_FIELD_REF
+	      && is_gimple_reg (TREE_OPERAND (elt->replacement, 0))
+	      && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	      && &GIMPLE_STMT_OPERAND (stmt, 0) == expr_p)
+	    {
+	      tree newstmt = sra_build_elt_assignment
+		(elt, GIMPLE_STMT_OPERAND (stmt, 1));
+	      if (TREE_CODE (newstmt) != STATEMENT_LIST)
+		{
+		  tree list = alloc_stmt_list ();
+		  append_to_statement_list (newstmt, &list);
+		  newstmt = list;
+		}
+	      sra_replace (bsi, newstmt);
+	      return;
+	    }
+
+	  mark_all_v_defs (stmt);
+	}
+      *expr_p = REPLDUP (elt->replacement);
       update_stmt (stmt);
     }
   else
     {
-      /* Otherwise we need some copies.  If ELT is being read, then we want
-	 to store all (modified) sub-elements back into the structure before
-	 the reference takes place.  If ELT is being written, then we want to
-	 load the changed values back into our shadow variables.  */
+      tree list = NULL;
+
+      /* Otherwise we need some copies.  If ELT is being read, then we
+	 want to store all (modified) sub-elements back into the
+	 structure before the reference takes place.  If ELT is being
+	 written, then we want to load the changed values back into
+	 our shadow variables.  */
       /* ??? We don't check modified for reads, we just always write all of
 	 the values.  We should be able to record the SSA number of the VOP
 	 for which the values were last read.  If that number matches the
@@ -2101,7 +2556,7 @@ scalarize_copy (struct sra_elt *lhs_elt,
       gcc_assert (TREE_CODE (stmt) == GIMPLE_MODIFY_STMT);
 
       GIMPLE_STMT_OPERAND (stmt, 0) = lhs_elt->replacement;
-      GIMPLE_STMT_OPERAND (stmt, 1) = rhs_elt->replacement;
+      GIMPLE_STMT_OPERAND (stmt, 1) = REPLDUP (rhs_elt->replacement);
       update_stmt (stmt);
     }
   else if (lhs_elt->use_block_copy || rhs_elt->use_block_copy)
@@ -2255,8 +2710,8 @@ scalarize_ldst (struct sra_elt *elt, tre
 
       mark_all_v_defs (stmt);
       generate_copy_inout (elt, is_output, other, &list);
-      mark_all_v_defs (list);
       gcc_assert (list);
+      mark_all_v_defs (list);
 
       /* Preserve EH semantics.  */
       if (stmt_ends_bb_p (stmt))
@@ -2352,6 +2807,10 @@ dump_sra_elt_name (FILE *f, struct sra_e
 	    fputc ('.', f);
 	  print_generic_expr (f, elt->element, dump_flags);
 	}
+      else if (TREE_CODE (elt->element) == BIT_FIELD_REF)
+	fprintf (f, "$B" HOST_WIDE_INT_PRINT_DEC "F" HOST_WIDE_INT_PRINT_DEC,
+		 tree_low_cst (TREE_OPERAND (elt->element, 2), 1),
+		 tree_low_cst (TREE_OPERAND (elt->element, 1), 1));
       else if (TREE_CODE (elt->element) == RANGE_EXPR)
 	fprintf (f, "["HOST_WIDE_INT_PRINT_DEC".."HOST_WIDE_INT_PRINT_DEC"]",
 		 TREE_INT_CST_LOW (TREE_OPERAND (elt->element, 0)),

[-- Attachment #3: Type: text/plain, Size: 249 bytes --]


-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-05-04  5:24                   ` Alexandre Oliva
@ 2007-05-05 18:19                     ` Roman Zippel
  2007-05-06  5:13                       ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Roman Zippel @ 2007-05-05 18:19 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Bernd Schmidt, Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

Hi,

On Fri, 4 May 2007, Alexandre Oliva wrote:

> > Would you please give it a spin on m68k?  Bootstrap on
> > x86_64-linux-gnu hasn't completed yet, but it's already built Ada in
> > stage2, so I'm somewhat confident that this might work ;-)
> 
> Err...  Sorry, I failed to refresh the patch.  This one should fare
> better.

This now doesn't even survive stage2.  I've put an example of a
miscompiled function at www.xs4all.nl/~zippel/tree-cfg.i
It crashes at the memory access after 'sub.l %a0,%a0'.
In the initial RTL output I also see some rather complex reg:DI
operations instead of simple 32-bit operations.

bye, Roman


* Re: Reload bug & SRA oddness
  2007-05-05 18:19                     ` Roman Zippel
@ 2007-05-06  5:13                       ` Alexandre Oliva
  2007-05-06 12:13                         ` Bernd Schmidt
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-06  5:13 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Bernd Schmidt, Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

[-- Attachment #1: Type: text/plain, Size: 880 bytes --]

On May  5, 2007, Roman Zippel <zippel@linux-m68k.org> wrote:

> This now doesn't even survive stage2.

This one should fare better.  It's completed bootstrap and regression
testing on x86_64-linux-gnu (--enable-languages=all,ada, as usual).

Ok to install?

:ADDPATCH sra:

> In the initial RTL output I also see some rather complex reg:DI 
> operations instead of simple 32bit operations.

I don't see any simple way to avoid this without taking some
machine-specific "preferred access mode" into account.

It might make sense to test whether, instead of a DImode variable, a
pair of SImode variables would do (i.e., no bit-fields crossing the
SImode boundary), as long as registers are known to hold SImode but
not DImode.  But is there a generic way to test for such "efficiency
boundaries"?  And, if not, does it make sense to introduce yet another
machine hook to this end?
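
The boundary test suggested above can be sketched quickly.  This is a
hypothetical illustration, not GCC code: fields are simplified to
(bit_offset, bit_size) pairs within the block, and the function merely
checks that no member's bit range crosses a word boundary, which is the
condition under which a DImode block could instead become a pair of
SImode variables.

```python
def splittable_into_words(fields, word_bits=32):
    """Return True if every (bit_offset, bit_size) field fits entirely
    within a single word_bits-wide word, i.e. a wide block could be
    split into per-word scalar variables without splitting any field."""
    for bit, size in fields:
        first_word = bit // word_bits
        last_word = (bit + size - 1) // word_bits
        if first_word != last_word:
            return False
    return True
```

For example, fields at bits 0-15, 16-31 and 32-63 split cleanly, while a
field spanning bits 24-39 straddles the 32-bit boundary and forces the
wide mode.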



[-- Attachment #2: gcc-sra-bit-field-ref-take2.patch --]
[-- Type: text/x-patch, Size: 27530 bytes --]

2007-04-30  Alexandre Oliva  <aoliva@redhat.com>

	PR middle-end/22156
	2007-04-05  Alexandre Oliva  <aoliva@redhat.com>
	* tree-sra.c (struct sra_elt): Add in_bitfld_block.
	(sra_hash_tree): Handle BIT_FIELD_REFs.
	(sra_elt_hash): Don't hash bitfld blocks.
	(sra_elt_eq): Skip them in parent compares as well.  Handle
	BIT_FIELD_REFs.
	(sra_walk_expr): Use a variable as both input and output if it's
	referenced in a BIT_FIELD_REF in a lhs.
	(build_element_name_1): Handle BIT_FIELD_REFs.
	(instantiate_element): Propagate nowarn from parents.
	Gimple-zero-initialize all bit-field variables that are not
	part of parameters that are going to be scalarized on entry.
	(instantiate_missing_elements_1): Return the sra_elt.
	(canon_type_for_field): New.
	(try_instantiate_multiple_fields): New.  Infer widest possible
	access mode from decl or member type.
	(instantiate_missing_elements): Use them.
	(generate_one_element_ref): Handle BIT_FIELD_REFs.
	(REPLDUP): New.
	(sra_build_elt_assignment): New.  Avoid needless
	initialization of maxalign and minalign.  Fix assignment to
	whole bitfield variables for big-endian machines.
	(generate_copy_inout): Use them.
	(generate_element_copy): Likewise.  Handle bitfld differences.
	(generate_element_zero): Don't recurse for blocks.  Use
	sra_build_elt_assignment.
	(generate_one_element_init): Take elt instead of var.  Use
	sra_build_elt_assignment.
	(generate_element_init_1): Adjust.
	(scalarize_use): Re-expand assignment to BIT_FIELD_REF of
	gimple_reg.  Use REPLDUP.
	(scalarize_copy): Use REPLDUP.
	(scalarize_ldst): Move assert before dereference.
	(dump_sra_elt_name): Handle BIT_FIELD_REFs.
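
The grouping rule implemented by try_instantiate_multiple_fields can be
sketched as follows.  This is an illustration only, not GCC code: field
layout is simplified to sorted (bit_offset, bit_size) pairs, and the
alignment-word mask mirrors the alchk computation in the patch.

```python
def group_fields(fields, align):
    """Merge adjacent scalar fields into groups, as long as each merged
    bit range stays within one alignment word.
    fields: list of (bit_offset, bit_size), sorted by offset.
    align:  alignment in bits (a power of two)."""
    word = ~(align - 1)          # mask selecting the alignment word
    groups = []
    for bit, size in fields:
        if groups:
            gbit, gsize = groups[-1]
            # Extend the current group only if this field is contiguous
            # with it and the widened range ends in the same word.
            if gbit + gsize == bit and (gbit & word) == ((bit + size - 1) & word):
                groups[-1] = (gbit, gsize + size)
                continue
        groups.append((bit, size))
    return groups
```

With 32-bit alignment, four fields of 8+8+16+32 bits collapse into one
32-bit group followed by a second 32-bit group; a gap between fields
starts a new group.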

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-05-03 13:24:48.000000000 -0300
+++ gcc/tree-sra.c	2007-05-05 18:51:58.000000000 -0300
@@ -147,6 +147,10 @@ struct sra_elt
 
   /* True if there is BIT_FIELD_REF on the lhs with a vector. */
   bool is_vector_lhs;
+
+  /* 1 if the element is a field that is part of a block, 2 if the field
+     is the block itself, 0 if it's neither.  */
+  char in_bitfld_block;
 };
 
 #define IS_ELEMENT_FOR_GROUP(ELEMENT) (TREE_CODE (ELEMENT) == RANGE_EXPR)
@@ -205,6 +209,9 @@ extern void debug_sra_elt_name (struct s
 
 /* Forward declarations.  */
 static tree generate_element_ref (struct sra_elt *);
+static tree sra_build_assignment (tree dst, tree src);
+static void mark_all_v_defs (tree list);
+
 \f
 /* Return true if DECL is an SRA candidate.  */
 
@@ -461,6 +468,12 @@ sra_hash_tree (tree t)
       h = iterative_hash_expr (DECL_FIELD_BIT_OFFSET (t), h);
       break;
 
+    case BIT_FIELD_REF:
+      /* Don't take operand 0 into account, that's our parent.  */
+      h = iterative_hash_expr (TREE_OPERAND (t, 1), 0);
+      h = iterative_hash_expr (TREE_OPERAND (t, 2), h);
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -479,12 +492,14 @@ sra_elt_hash (const void *x)
 
   h = sra_hash_tree (e->element);
 
-  /* Take into account everything back up the chain.  Given that chain
-     lengths are rarely very long, this should be acceptable.  If we
-     truly identify this as a performance problem, it should work to
-     hash the pointer value "e->parent".  */
+  /* Take into account everything except bitfield blocks back up the
+     chain.  Given that chain lengths are rarely very long, this
+     should be acceptable.  If we truly identify this as a performance
+     problem, it should work to hash the pointer value
+     "e->parent".  */
   for (p = e->parent; p ; p = p->parent)
-    h = (h * 65521) ^ sra_hash_tree (p->element);
+    if (!p->in_bitfld_block)
+      h = (h * 65521) ^ sra_hash_tree (p->element);
 
   return h;
 }
@@ -497,8 +512,17 @@ sra_elt_eq (const void *x, const void *y
   const struct sra_elt *a = x;
   const struct sra_elt *b = y;
   tree ae, be;
+  const struct sra_elt *ap = a->parent;
+  const struct sra_elt *bp = b->parent;
+
+  if (ap)
+    while (ap->in_bitfld_block)
+      ap = ap->parent;
+  if (bp)
+    while (bp->in_bitfld_block)
+      bp = bp->parent;
 
-  if (a->parent != b->parent)
+  if (ap != bp)
     return false;
 
   ae = a->element;
@@ -533,6 +557,11 @@ sra_elt_eq (const void *x, const void *y
 	return false;
       return fields_compatible_p (ae, be);
 
+    case BIT_FIELD_REF:
+      return
+	tree_int_cst_equal (TREE_OPERAND (ae, 1), TREE_OPERAND (be, 1))
+	&& tree_int_cst_equal (TREE_OPERAND (ae, 2), TREE_OPERAND (be, 2));
+
     default:
       gcc_unreachable ();
     }
@@ -729,6 +758,7 @@ sra_walk_expr (tree *expr_p, block_stmt_
   tree inner = expr;
   bool disable_scalarization = false;
   bool use_all_p = false;
+  bool use_both_p = false;
 
   /* We're looking to collect a reference expression between EXPR and INNER,
      such that INNER is a scalarizable decl and all other nodes through EXPR
@@ -749,7 +779,11 @@ sra_walk_expr (tree *expr_p, block_stmt_
 	    if (disable_scalarization)
 	      elt->cannot_scalarize = true;
 	    else
-	      fns->use (elt, expr_p, bsi, is_output, use_all_p);
+	      {
+		if (use_both_p)
+		  fns->use (elt, expr_p, bsi, false, use_all_p);
+		fns->use (elt, expr_p, bsi, is_output, use_all_p);
+	      }
 	  }
 	return;
 
@@ -816,6 +850,11 @@ sra_walk_expr (tree *expr_p, block_stmt_
 	    if (elt)
 	      elt->is_vector_lhs = true;
 	  }
+	else if (is_output)
+	  /* If we're writing to a BIT-FIELD, we have to make sure the
+	     inner variable is back in place before we start modifying
+	     it.  */
+	  use_both_p = true;
 	/* A bit field reference (access to *multiple* fields simultaneously)
 	   is not currently scalarized.  Consider this an access to the
 	   complete outer element, to which walk_tree will bring us next.  */
@@ -884,11 +923,14 @@ sra_walk_asm_expr (tree expr, block_stmt
   sra_walk_tree_list (ASM_OUTPUTS (expr), bsi, true, fns);
 }
 
+static void sra_replace (block_stmt_iterator *bsi, tree list);
+static tree sra_build_elt_assignment (struct sra_elt *elt, tree src);
+
 /* Walk a GIMPLE_MODIFY_STMT and categorize the assignment appropriately.  */
 
 static void
 sra_walk_gimple_modify_stmt (tree expr, block_stmt_iterator *bsi,
-		      const struct sra_walk_fns *fns)
+			     const struct sra_walk_fns *fns)
 {
   struct sra_elt *lhs_elt, *rhs_elt;
   tree lhs, rhs;
@@ -1177,6 +1219,15 @@ build_element_name_1 (struct sra_elt *el
       sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (t));
       obstack_grow (&sra_obstack, buffer, strlen (buffer));
     }
+  else if (TREE_CODE (t) == BIT_FIELD_REF)
+    {
+      sprintf (buffer, "B" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 2), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+      sprintf (buffer, "F" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 1), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+    }
   else
     {
       tree name = DECL_NAME (t);
@@ -1209,9 +1260,11 @@ instantiate_element (struct sra_elt *elt
 {
   struct sra_elt *base_elt;
   tree var, base;
+  bool nowarn = TREE_NO_WARNING (elt->element);
 
   for (base_elt = elt; base_elt->parent; base_elt = base_elt->parent)
-    continue;
+    if (!nowarn)
+      nowarn = TREE_NO_WARNING (base_elt->parent->element);
   base = base_elt->element;
 
   elt->replacement = var = make_rename_temp (elt->type, "SR");
@@ -1240,10 +1293,8 @@ instantiate_element (struct sra_elt *elt
       DECL_DEBUG_EXPR_IS_FROM (var) = 1;
       
       DECL_IGNORED_P (var) = 0;
-      TREE_NO_WARNING (var) = TREE_NO_WARNING (base);
-      if (elt->element && TREE_NO_WARNING (elt->element))
-	TREE_NO_WARNING (var) = 1;
-    }
+      TREE_NO_WARNING (var) = nowarn;
+     }
   else
     {
       DECL_IGNORED_P (var) = 1;
@@ -1251,6 +1302,19 @@ instantiate_element (struct sra_elt *elt
       TREE_NO_WARNING (var) = 1;
     }
 
+  /* Zero-initialize bit-field scalarization variables, to avoid
+     triggering undefined behavior, unless the base variable is an
+     argument that is going to get scalarized out as the first thing,
+     providing for the initializers we need.  */
+  if (TREE_CODE (elt->element) == BIT_FIELD_REF
+      && (TREE_CODE (base) != PARM_DECL
+	  || !bitmap_bit_p (needs_copy_in, DECL_UID (base))))
+    {
+      tree init = sra_build_assignment (var, integer_zero_node);
+      insert_edge_copies (init, ENTRY_BLOCK_PTR);
+      mark_all_v_defs (init);
+    }
+
   if (dump_file)
     {
       fputs ("  ", dump_file);
@@ -1337,7 +1401,7 @@ sum_instantiated_sizes (struct sra_elt *
 
 static void instantiate_missing_elements (struct sra_elt *elt);
 
-static void
+static struct sra_elt *
 instantiate_missing_elements_1 (struct sra_elt *elt, tree child, tree type)
 {
   struct sra_elt *sub = lookup_element (elt, child, type, INSERT);
@@ -1348,6 +1412,273 @@ instantiate_missing_elements_1 (struct s
     }
   else
     instantiate_missing_elements (sub);
+  return sub;
+}
+
+/* Obtain the canonical type for field F of ELEMENT.  */
+
+static tree
+canon_type_for_field (tree f, tree element)
+{
+  tree field_type = TREE_TYPE (f);
+
+  /* canonicalize_component_ref() unwidens some bit-field types (not
+     marked as DECL_BIT_FIELD in C++), so we must do the same, lest we
+     may introduce type mismatches.  */
+  if (INTEGRAL_TYPE_P (field_type)
+      && DECL_MODE (f) != TYPE_MODE (field_type))
+    field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
+						   field_type,
+						   element,
+						   f, NULL_TREE),
+					   NULL_TREE));
+
+  return field_type;
+}
+
+/* Look for adjacent fields of ELT starting at F that we'd like to
+   scalarize as a single variable.  Return the last field of the
+   group.  */
+
+static tree
+try_instantiate_multiple_fields (struct sra_elt *elt, tree f)
+{
+  int count;
+  unsigned HOST_WIDE_INT align, bit, size, alchk;
+  enum machine_mode mode;
+  tree first = f, prev;
+  tree type, var;
+  struct sra_elt *block;
+
+  if (!is_sra_scalar_type (TREE_TYPE (f))
+      || !host_integerp (DECL_FIELD_OFFSET (f), 1)
+      || !host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+      || !host_integerp (DECL_SIZE (f), 1)
+      || lookup_element (elt, f, NULL, NO_INSERT))
+    return f;
+
+  block = elt;
+
+  /* For complex and array objects, there are going to be integer
+     literals as child elements.  In this case, we can't just take the
+     alignment and mode of the decl, so we instead rely on the element
+     type.
+
+     ??? We could try to infer additional alignment from the full
+     object declaration and the location of the sub-elements we're
+     accessing.  */
+  for (count = 0; !DECL_P (block->element); count++)
+    block = block->parent;
+
+  align = DECL_ALIGN (block->element);
+  alchk = GET_MODE_BITSIZE (DECL_MODE (block->element));
+
+  if (count)
+    {
+      type = TREE_TYPE (block->element);
+      while (count--)
+	type = TREE_TYPE (type);
+
+      align = TYPE_ALIGN (type);
+      alchk = GET_MODE_BITSIZE (TYPE_MODE (type));
+    }
+
+  if (align < alchk)
+    align = alchk;
+
+  bit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+  size = tree_low_cst (DECL_SIZE (f), 1);
+
+  alchk = align - 1;
+  alchk = ~alchk;
+
+  if ((bit & alchk) != ((bit + size - 1) & alchk))
+    return f;
+
+  /* Find adjacent fields in the same alignment word.  */
+
+  for (prev = f, f = TREE_CHAIN (f);
+       f && TREE_CODE (f) == FIELD_DECL
+	 && is_sra_scalar_type (TREE_TYPE (f))
+	 && host_integerp (DECL_FIELD_OFFSET (f), 1)
+	 && host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+	 && host_integerp (DECL_SIZE (f), 1)
+	 && !lookup_element (elt, f, NULL, NO_INSERT);
+       prev = f, f = TREE_CHAIN (f))
+    {
+      unsigned HOST_WIDE_INT nbit, nsize;
+
+      nbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	+ tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+      nsize = tree_low_cst (DECL_SIZE (f), 1);
+
+      if (bit + size == nbit)
+	{
+	  if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+	    break;
+	  size += nsize;
+	}
+      else if (nbit + nsize == bit)
+	{
+	  if ((nbit & alchk) != ((bit + size - 1) & alchk))
+	    break;
+	  bit = nbit;
+	  size += nsize;
+	}
+      else
+	break;
+    }
+
+  f = prev;
+
+  if (f == first)
+    return f;
+
+  gcc_assert ((bit & alchk) == ((bit + size - 1) & alchk));
+
+  /* Try to widen the bit range so as to cover padding bits as well.  */
+
+  if ((bit & ~alchk) || size != align)
+    {
+      unsigned HOST_WIDE_INT mbit = bit & alchk;
+      unsigned HOST_WIDE_INT msize = align;
+
+      for (f = TYPE_FIELDS (elt->type);
+	   f; f = TREE_CHAIN (f))
+	{
+	  unsigned HOST_WIDE_INT fbit, fsize;
+
+	  /* Skip the fields from first to prev.  */
+	  if (f == first)
+	    {
+	      f = prev;
+	      continue;
+	    }
+
+	  if (!(TREE_CODE (f) == FIELD_DECL
+		&& host_integerp (DECL_FIELD_OFFSET (f), 1)
+		&& host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)))
+	    continue;
+
+	  fbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+
+	  /* If we're past the selected word, we're fine.  */
+	  if ((bit & alchk) < (fbit & alchk))
+	    continue;
+
+	  if (host_integerp (DECL_SIZE (f), 1))
+	    fsize = tree_low_cst (DECL_SIZE (f), 1);
+	  else
+	    /* Assume a variable-sized field takes up all space till
+	       the end of the word.  ??? Endianness issues?  */
+	    fsize = align - (fbit & alchk);
+
+	  if ((fbit & alchk) < (bit & alchk))
+	    {
+	      /* A large field might start at a previous word and
+		 extend into the selected word.  Exclude those
+		 bits.  ??? Endianness issues? */
+	      HOST_WIDE_INT diff = fbit + fsize - mbit;
+
+	      if (diff <= 0)
+		continue;
+
+	      mbit += diff;
+	      msize -= diff;
+	    }
+	  else
+	    {
+	      /* Non-overlapping, great.  */
+	      if (fbit + fsize <= mbit
+		  || mbit + msize <= fbit)
+		continue;
+
+	      if (fbit <= mbit)
+		{
+		  unsigned HOST_WIDE_INT diff = fbit + fsize - mbit;
+		  mbit += diff;
+		  msize -= diff;
+		}
+	      else if (fbit > mbit)
+		msize -= (mbit + msize - fbit);
+	      else
+		gcc_unreachable ();
+	    }
+	}
+
+      bit = mbit;
+      size = msize;
+    }
+
+  /* Now we know the bit range we're interested in.  Find the smallest
+     machine mode we can use to access it.  */
+
+  for (mode = smallest_mode_for_size (size, MODE_INT);
+       ;
+       mode = GET_MODE_WIDER_MODE (mode))
+    {
+      gcc_assert (mode != VOIDmode);
+
+      alchk = GET_MODE_PRECISION (mode) - 1;
+      alchk = ~alchk;
+
+      if ((bit & alchk) == ((bit + size - 1) & alchk))
+	break;
+    }
+
+  gcc_assert (~alchk < align);
+
+  /* Create the field group as a single variable.  */
+
+  type = lang_hooks.types.type_for_mode (mode, 1);
+  gcc_assert (type);
+  var = build3 (BIT_FIELD_REF, type, NULL_TREE,
+		bitsize_int (size),
+		bitsize_int (bit));
+  BIT_FIELD_REF_UNSIGNED (var) = 1;
+
+  block = instantiate_missing_elements_1 (elt, var, type);
+  gcc_assert (block && block->is_scalar);
+
+  var = block->replacement;
+
+  if ((bit & ~alchk)
+      || (HOST_WIDE_INT)size != tree_low_cst (DECL_SIZE (var), 1))
+    {
+      block->replacement = build3 (BIT_FIELD_REF,
+				   TREE_TYPE (block->element), var,
+				   bitsize_int (size),
+				   bitsize_int (bit & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (block->replacement) = 1;
+    }
+
+  block->in_bitfld_block = 2;
+
+  /* Add the member fields to the group, such that they access
+     portions of the group variable.  */
+
+  for (f = first; f != TREE_CHAIN (prev); f = TREE_CHAIN (f))
+    {
+      tree field_type = canon_type_for_field (f, elt->element);
+      struct sra_elt *fld = lookup_element (block, f, field_type, INSERT);
+
+      gcc_assert (fld && fld->is_scalar && !fld->replacement);
+
+      fld->replacement = build3 (BIT_FIELD_REF, field_type, var,
+				 DECL_SIZE (f),
+				 bitsize_int
+				 ((TREE_INT_CST_LOW (DECL_FIELD_OFFSET (f))
+				   * BITS_PER_UNIT
+				   + (TREE_INT_CST_LOW
+				      (DECL_FIELD_BIT_OFFSET (f))))
+				  & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (fld->replacement) = TYPE_UNSIGNED (field_type);
+      fld->in_bitfld_block = 1;
+    }
+
+  return prev;
 }
 
 static void
@@ -1363,21 +1694,17 @@ instantiate_missing_elements (struct sra
 	for (f = TYPE_FIELDS (type); f ; f = TREE_CHAIN (f))
 	  if (TREE_CODE (f) == FIELD_DECL)
 	    {
-	      tree field_type = TREE_TYPE (f);
+	      tree last = try_instantiate_multiple_fields (elt, f);
 
-	      /* canonicalize_component_ref() unwidens some bit-field
-		 types (not marked as DECL_BIT_FIELD in C++), so we
-		 must do the same, lest we may introduce type
-		 mismatches.  */
-	      if (INTEGRAL_TYPE_P (field_type)
-		  && DECL_MODE (f) != TYPE_MODE (field_type))
-		field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
-							       field_type,
-							       elt->element,
-							       f, NULL_TREE),
-						       NULL_TREE));
+	      if (last != f)
+		{
+		  f = last;
+		  continue;
+		}
 
-	      instantiate_missing_elements_1 (elt, f, field_type);
+	      instantiate_missing_elements_1 (elt, f,
+					      canon_type_for_field
+					      (f, elt->element));
 	    }
 	break;
       }
@@ -1689,6 +2016,16 @@ generate_one_element_ref (struct sra_elt
       {
 	tree field = elt->element;
 
+	/* We can't test elt->in_bitfld_blk here because, when this is
+	   called from instantiate_element, we haven't set this field
+	   yet.  */
+	if (TREE_CODE (field) == BIT_FIELD_REF)
+	  {
+	    tree ret = copy_node (field);
+	    TREE_OPERAND (ret, 0) = base;
+	    return ret;
+	  }
+
 	/* Watch out for compatible records with differing field lists.  */
 	if (DECL_FIELD_CONTEXT (field) != TYPE_MAIN_VARIANT (TREE_TYPE (base)))
 	  field = find_compatible_field (TREE_TYPE (base), field);
@@ -1741,6 +2078,120 @@ sra_build_assignment (tree dst, tree src
   return build_gimple_modify_stmt (dst, src);
 }
 
+/* BIT_FIELD_REFs must not be shared.  sra_build_elt_assignment()
+   takes care of assignments, but we must create copies for uses.  */
+#define REPLDUP(t) (TREE_CODE (t) != BIT_FIELD_REF ? (t) : copy_node (t))
+
+static tree
+sra_build_elt_assignment (struct sra_elt *elt, tree src)
+{
+  tree dst = elt->replacement;
+  tree var, type, tmp, tmp2, tmp3;
+  tree list, stmt;
+  tree cst, cst2, mask;
+  tree minshift, maxshift;
+
+  if (TREE_CODE (dst) != BIT_FIELD_REF
+      || !elt->in_bitfld_block)
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  var = TREE_OPERAND (dst, 0);
+
+  /* Try to widen the assignment to the entire variable.
+     We need the source to be a BIT_FIELD_REF as well, such that, for
+     BIT_FIELD_REF<d,sz,dp> = BIT_FIELD_REF<s,sz,sp>,
+     by design, conditions are met such that we can turn it into
+     d = BIT_FIELD_REF<s,dw,sp-dp>.  */
+  if (elt->in_bitfld_block == 2
+      && TREE_CODE (src) == BIT_FIELD_REF)
+    {
+      src = fold_build3 (BIT_FIELD_REF, TREE_TYPE (var),
+			 TREE_OPERAND (src, 0),
+			 DECL_SIZE (TREE_OPERAND (dst, 0)),
+			 size_binop (MINUS_EXPR, TREE_OPERAND (src, 2),
+				     TREE_OPERAND (dst, 2)));
+      BIT_FIELD_REF_UNSIGNED (src) = 1;
+
+      return sra_build_assignment (var, src);
+    }
+
+  if (!is_gimple_reg (var))
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  list = alloc_stmt_list ();
+
+  cst = TREE_OPERAND (dst, 2);
+  cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (dst, 1),
+		     TREE_OPERAND (dst, 2));
+
+  if (WORDS_BIG_ENDIAN)
+    {
+      maxshift = size_binop (MINUS_EXPR, DECL_SIZE (var), cst);
+      minshift = size_binop (MINUS_EXPR, DECL_SIZE (var), cst2);
+    }
+  else
+    {
+      maxshift = cst2;
+      minshift = cst;
+    }
+
+  type = TREE_TYPE (var);
+
+  mask = build_int_cst_wide (type, 1, 0);
+  cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, 1);
+  cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, 1);
+  mask = int_const_binop (MINUS_EXPR, cst, cst2, 1);
+  mask = fold_build1 (BIT_NOT_EXPR, type, mask);
+
+  if (!WORDS_BIG_ENDIAN)
+    cst2 = TREE_OPERAND (dst, 2);
+
+  tmp = make_rename_temp (type, "SR");
+  stmt = build_gimple_modify_stmt (tmp,
+				   fold_build2 (BIT_AND_EXPR, type,
+						var, mask));
+  append_to_statement_list (stmt, &list);
+
+  if (is_gimple_reg (src))
+    tmp2 = src;
+  else
+    {
+      tmp2 = make_rename_temp (TREE_TYPE (src), "SR");
+      stmt = sra_build_assignment (tmp2, src);
+      append_to_statement_list (stmt, &list);
+    }
+
+  if (!TYPE_UNSIGNED (TREE_TYPE (tmp2))
+      || TYPE_MAIN_VARIANT (TREE_TYPE (tmp2)) != TYPE_MAIN_VARIANT (type))
+    {
+      tmp3 = make_rename_temp (type, "SR");
+      tmp2 = fold_build3 (BIT_FIELD_REF, type, tmp2, TREE_OPERAND (dst, 1),
+			  bitsize_int (0));
+      if (TREE_CODE (tmp2) == BIT_FIELD_REF)
+	BIT_FIELD_REF_UNSIGNED (tmp2) = 1;
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (!integer_zerop (minshift))
+    {
+      tmp3 = make_rename_temp (type, "SR");
+      stmt = build_gimple_modify_stmt (tmp3,
+				       fold_build2 (LSHIFT_EXPR, type,
+						    tmp2, minshift));
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  stmt = build_gimple_modify_stmt (var,
+				   fold_build2 (BIT_IOR_EXPR, type,
+						tmp, tmp2));
+  append_to_statement_list (stmt, &list);
+
+  return list;
+}
+
 /* Generate a set of assignment statements in *LIST_P to copy all
    instantiated elements under ELT to or from the equivalent structure
    rooted at EXPR.  COPY_OUT controls the direction of the copy, with
@@ -1771,9 +2222,9 @@ generate_copy_inout (struct sra_elt *elt
   else if (elt->replacement)
     {
       if (copy_out)
-	t = sra_build_assignment (elt->replacement, expr);
+	t = sra_build_elt_assignment (elt, expr);
       else
-	t = sra_build_assignment (expr, elt->replacement);
+	t = sra_build_assignment (expr, REPLDUP (elt->replacement));
       append_to_statement_list (t, list_p);
     }
   else
@@ -1798,6 +2249,19 @@ generate_element_copy (struct sra_elt *d
   FOR_EACH_ACTUAL_CHILD (dc, dst)
     {
       sc = lookup_element (src, dc->element, NULL, NO_INSERT);
+      if (!sc && dc->in_bitfld_block == 2)
+	{
+	  struct sra_elt *dcs;
+
+	  FOR_EACH_ACTUAL_CHILD (dcs, dc)
+	    {
+	      sc = lookup_element (src, dcs->element, NULL, NO_INSERT);
+	      gcc_assert (sc);
+	      generate_element_copy (dcs, sc, list_p);
+	    }
+
+	  continue;
+	}
       gcc_assert (sc);
       generate_element_copy (dc, sc, list_p);
     }
@@ -1808,7 +2272,7 @@ generate_element_copy (struct sra_elt *d
 
       gcc_assert (src->replacement);
 
-      t = sra_build_assignment (dst->replacement, src->replacement);
+      t = sra_build_elt_assignment (dst, REPLDUP (src->replacement));
       append_to_statement_list (t, list_p);
     }
 }
@@ -1829,8 +2293,9 @@ generate_element_zero (struct sra_elt *e
       return;
     }
 
-  FOR_EACH_ACTUAL_CHILD (c, elt)
-    generate_element_zero (c, list_p);
+  if (!elt->in_bitfld_block)
+    FOR_EACH_ACTUAL_CHILD (c, elt)
+      generate_element_zero (c, list_p);
 
   if (elt->replacement)
     {
@@ -1839,7 +2304,7 @@ generate_element_zero (struct sra_elt *e
       gcc_assert (elt->is_scalar);
       t = fold_convert (elt->type, integer_zero_node);
 
-      t = sra_build_assignment (elt->replacement, t);
+      t = sra_build_elt_assignment (elt, t);
       append_to_statement_list (t, list_p);
     }
 }
@@ -1848,10 +2313,10 @@ generate_element_zero (struct sra_elt *e
    Add the result to *LIST_P.  */
 
 static void
-generate_one_element_init (tree var, tree init, tree *list_p)
+generate_one_element_init (struct sra_elt *elt, tree init, tree *list_p)
 {
   /* The replacement can be almost arbitrarily complex.  Gimplify.  */
-  tree stmt = sra_build_assignment (var, init);
+  tree stmt = sra_build_elt_assignment (elt, init);
   gimplify_and_add (stmt, list_p);
 }
 
@@ -1880,7 +2345,7 @@ generate_element_init_1 (struct sra_elt 
     {
       if (elt->replacement)
 	{
-	  generate_one_element_init (elt->replacement, init, list_p);
+	  generate_one_element_init (elt, init, list_p);
 	  elt->visited = true;
 	}
       return result;
@@ -2041,23 +2506,45 @@ static void
 scalarize_use (struct sra_elt *elt, tree *expr_p, block_stmt_iterator *bsi,
 	       bool is_output, bool use_all)
 {
-  tree list = NULL, stmt = bsi_stmt (*bsi);
+  tree stmt = bsi_stmt (*bsi);
 
   if (elt->replacement)
     {
       /* If we have a replacement, then updating the reference is as
 	 simple as modifying the existing statement in place.  */
       if (is_output)
-	mark_all_v_defs (stmt);
-      *expr_p = elt->replacement;
+	{
+	  if (TREE_CODE (elt->replacement) == BIT_FIELD_REF
+	      && is_gimple_reg (TREE_OPERAND (elt->replacement, 0))
+	      && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	      && &GIMPLE_STMT_OPERAND (stmt, 0) == expr_p)
+	    {
+	      tree newstmt = sra_build_elt_assignment
+		(elt, GIMPLE_STMT_OPERAND (stmt, 1));
+	      if (TREE_CODE (newstmt) != STATEMENT_LIST)
+		{
+		  tree list = alloc_stmt_list ();
+		  append_to_statement_list (newstmt, &list);
+		  newstmt = list;
+		}
+	      sra_replace (bsi, newstmt);
+	      return;
+	    }
+
+	  mark_all_v_defs (stmt);
+	}
+      *expr_p = REPLDUP (elt->replacement);
       update_stmt (stmt);
     }
   else
     {
-      /* Otherwise we need some copies.  If ELT is being read, then we want
-	 to store all (modified) sub-elements back into the structure before
-	 the reference takes place.  If ELT is being written, then we want to
-	 load the changed values back into our shadow variables.  */
+      tree list = NULL;
+
+      /* Otherwise we need some copies.  If ELT is being read, then we
+	 want to store all (modified) sub-elements back into the
+	 structure before the reference takes place.  If ELT is being
+	 written, then we want to load the changed values back into
+	 our shadow variables.  */
       /* ??? We don't check modified for reads, we just always write all of
 	 the values.  We should be able to record the SSA number of the VOP
 	 for which the values were last read.  If that number matches the
@@ -2101,7 +2588,7 @@ scalarize_copy (struct sra_elt *lhs_elt,
       gcc_assert (TREE_CODE (stmt) == GIMPLE_MODIFY_STMT);
 
       GIMPLE_STMT_OPERAND (stmt, 0) = lhs_elt->replacement;
-      GIMPLE_STMT_OPERAND (stmt, 1) = rhs_elt->replacement;
+      GIMPLE_STMT_OPERAND (stmt, 1) = REPLDUP (rhs_elt->replacement);
       update_stmt (stmt);
     }
   else if (lhs_elt->use_block_copy || rhs_elt->use_block_copy)
@@ -2255,8 +2742,8 @@ scalarize_ldst (struct sra_elt *elt, tre
 
       mark_all_v_defs (stmt);
       generate_copy_inout (elt, is_output, other, &list);
-      mark_all_v_defs (list);
       gcc_assert (list);
+      mark_all_v_defs (list);
 
       /* Preserve EH semantics.  */
       if (stmt_ends_bb_p (stmt))
@@ -2352,6 +2839,10 @@ dump_sra_elt_name (FILE *f, struct sra_e
 	    fputc ('.', f);
 	  print_generic_expr (f, elt->element, dump_flags);
 	}
+      else if (TREE_CODE (elt->element) == BIT_FIELD_REF)
+	fprintf (f, "$B" HOST_WIDE_INT_PRINT_DEC "F" HOST_WIDE_INT_PRINT_DEC,
+		 tree_low_cst (TREE_OPERAND (elt->element, 2), 1),
+		 tree_low_cst (TREE_OPERAND (elt->element, 1), 1));
       else if (TREE_CODE (elt->element) == RANGE_EXPR)
 	fprintf (f, "["HOST_WIDE_INT_PRINT_DEC".."HOST_WIDE_INT_PRINT_DEC"]",
 		 TREE_INT_CST_LOW (TREE_OPERAND (elt->element, 0)),
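
The mask/shift/IOR sequence that sra_build_elt_assignment builds above (for a register-held group variable, little-endian case) can be sketched in plain C.  store_bits and the fixed 32-bit word below are illustrative stand-ins, not GCC code; src is assumed to already fit in width bits, which the patch ensures via the BIT_FIELD_REF conversion:

```c
#include <assert.h>
#include <stdint.h>

/* Hand-written sketch of the masked update emitted by
   sra_build_elt_assignment on a little-endian target: clear the
   destination bit range, shift the (already width-masked) source
   into place, then OR it in.  Assumes minshift + width <= 32.  */
static uint32_t
store_bits (uint32_t var, uint32_t src, unsigned minshift, unsigned width)
{
  unsigned maxshift = minshift + width;   /* cst2 in the patch */
  uint32_t hi = maxshift < 32 ? (uint32_t) 1 << maxshift : 0;
  uint32_t lo = (uint32_t) 1 << minshift;
  uint32_t mask = ~(hi - lo);             /* bits to preserve */

  return (var & mask) | (src << minshift);
}
```

The hi - lo subtraction deliberately wraps when the range reaches the top bit, matching the int_const_binop (MINUS_EXPR, ...) step above.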

[-- Attachment #3: Type: text/plain, Size: 249 bytes --]



-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


* Re: Reload bug & SRA oddness
  2007-05-06  5:13                       ` Alexandre Oliva
@ 2007-05-06 12:13                         ` Bernd Schmidt
  2007-05-06 14:27                           ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Bernd Schmidt @ 2007-05-06 12:13 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Roman Zippel, Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

Alexandre Oliva wrote:
> It might make sense to test whether, instead of a DImode variable, a
> pair of SImode variables would do (i.e., no bit-fields crossing the
> SImode boundary), as long as registers are known to hold SImode but
> not DImode.  But is there a generic way to test for such "efficiency
> boundaries"?

I'd start with

  GET_MODE_SIZE (foo) <= UNITS_PER_WORD

as a first order approximation.
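
A minimal sketch of that first-order test, with a hypothetical fits_in_word helper and a hard-coded stand-in for GCC's UNITS_PER_WORD target macro:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for GCC's per-target UNITS_PER_WORD macro;
   4 models a 32-bit target.  */
#define UNITS_PER_WORD 4

/* First-order "efficiency boundary" test: a mode is cheap to hold in
   registers if it is no wider than a machine word.  */
static bool
fits_in_word (unsigned mode_size_in_bytes)
{
  return mode_size_in_bytes <= UNITS_PER_WORD;
}
```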


Bernd
-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif


* Re: Reload bug & SRA oddness
  2007-05-06 12:13                         ` Bernd Schmidt
@ 2007-05-06 14:27                           ` Alexandre Oliva
  2007-05-06 15:01                             ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-06 14:27 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Roman Zippel, Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

[-- Attachment #1: Type: text/plain, Size: 559 bytes --]

On May  6, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:

> Alexandre Oliva wrote:
>> It might make sense to test whether, instead of a DImode variable, a
>> pair of SImode variables would do (i.e., no bit-fields crossing the
>> SImode boundary), as long as registers are known to hold SImode but
>> not DImode.  But is there a generic way to test for such "efficiency
>> boundaries"?

> I'd start with

>   GET_MODE_SIZE (foo) <= UNITS_PER_WORD

> as a first order approximation.

This is sort of equivalent.  Ok to install on top of the other patch?


[-- Attachment #2: gcc-sra-bit-field-ref-word.patch --]
[-- Type: text/x-patch, Size: 754 bytes --]

for gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	* tree-sra.c (try_instantiate_multiple_fields): Clip alignment
	at word size.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-05-06 02:31:43.000000000 -0300
+++ gcc/tree-sra.c	2007-05-06 02:31:44.000000000 -0300
@@ -1486,6 +1486,11 @@ try_instantiate_multiple_fields (struct 
   if (align < alchk)
     align = alchk;
 
+  /* Coalescing wider fields is probably pointless and
+     inefficient.  */
+  if (align > BITS_PER_WORD)
+    align = BITS_PER_WORD;
+
   bit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
     + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
   size = tree_low_cst (DECL_SIZE (f), 1);
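
The clipped alignment feeds the same-word check that recurs throughout the patch; a stand-alone sketch (same_word is a hypothetical name, not in the patch):

```c
#include <assert.h>
#include <stdbool.h>

/* With ALIGN a power of two, alchk = ~(align - 1) rounds a bit
   position down to the start of its aligned word, so a field of SIZE
   bits starting at BIT fits in a single word iff its first and last
   bits round down to the same value.  */
static bool
same_word (unsigned long bit, unsigned long size, unsigned long align)
{
  unsigned long alchk = ~(align - 1);
  return (bit & alchk) == ((bit + size - 1) & alchk);
}
```

Clipping align at BITS_PER_WORD, as the hunk above does, just makes the words in this test machine words rather than the (possibly wider) mode of the containing declaration.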



* Re: Reload bug & SRA oddness
  2007-05-06 14:27                           ` Alexandre Oliva
@ 2007-05-06 15:01                             ` Alexandre Oliva
  2007-05-06 23:44                               ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-06 15:01 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Roman Zippel, Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

[-- Attachment #1: Type: text/plain, Size: 1394 bytes --]

On May  6, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:

> On May  6, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
>> Alexandre Oliva wrote:
>>> It might make sense to test whether, instead of a DImode variable, a
>>> pair of SImode variables would do (i.e., no bit-fields crossing the
>>> SImode boundary), as long as registers are known to hold SImode but
>>> not DImode.  But is there a generic way to test for such "efficiency
>>> boundaries"?

> This is sort of equivalent.  Ok to install on top of the other patch?

> 	* tree-sra.c (try_instantiate_multiple_fields): Clip alignment
> 	at word size.

And this is something I'm testing now; it tries to implement the
idea of extending the width to fit fields that cross the alignment
boundary, without pulling in other fields that don't.  It makes a lot
of difference for a testcase such as this on x86-32:

struct a {
  unsigned long long x : 16, y : 15, z : 33;
};
struct a fa(struct a v) {
  return v;
}

struct b {
  unsigned long long x : 16, y : 16, z : 32;
};
struct b fb(struct b v) {
  return v;
}

Without this patch, the code for fa sucks; with it, it's almost
identical to that of fb, except for ordering of operations.

Ok to install on top of the other two, if it survives bootstrapping
and regtesting on x86-32, where it might have more visible effects
than on x86-64, which doesn't regularly use wider-than-word types?
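
The difference between the two structs can be checked by hand; crosses_word below is an illustrative helper, and the bit offsets assume the usual x86 little-endian layout:

```c
#include <assert.h>
#include <stdbool.h>

/* struct a's fields occupy bits [0,16), [16,31) and [31,64), so z
   straddles the 32-bit word boundary; every field of struct b stays
   inside one 32-bit word, which is why fb's code was already good.  */
static bool
crosses_word (unsigned bit, unsigned size, unsigned word_bits)
{
  return bit / word_bits != (bit + size - 1) / word_bits;
}
```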


[-- Attachment #2: gcc-sra-bit-field-ref-word-grow.patch --]
[-- Type: text/x-patch, Size: 1413 bytes --]

for gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	* tree-sra.c (try_instantiate_multiple_fields): Grow alignment
	to fit wider fields that start within the same word.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-05-06 11:52:33.000000000 -0300
+++ gcc/tree-sra.c	2007-05-06 11:52:38.000000000 -0300
@@ -1521,13 +1521,38 @@ try_instantiate_multiple_fields (struct 
       if (bit + size == nbit)
 	{
 	  if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
-	    break;
+	    {
+	      /* If we're at an alignment boundary, don't bother
+		 growing alignment such that we can include this next
+		 field.  */
+	      if ((nbit & alchk)
+		  || GET_MODE_BIT_SIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BIT_SIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+		break;
+	    }
 	  size += nsize;
 	}
       else if (nbit + nsize == bit)
 	{
 	  if ((nbit & alchk) != ((bit + size - 1) & alchk))
-	    break;
+	    {
+	      if ((bit & alchk)
+		  || GET_MODE_BIT_SIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BIT_SIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((nbit & alchk) != ((bit + size - 1) & alchk))
+		break;
+	    }
 	  bit = nbit;
 	  size += nsize;
 	}



* Re: Reload bug & SRA oddness
  2007-05-06 15:01                             ` Alexandre Oliva
@ 2007-05-06 23:44                               ` Alexandre Oliva
  2007-05-06 23:51                                 ` Diego Novillo
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-06 23:44 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Roman Zippel, Daniel Berlin, GCC Patches, Diego Novillo, Andrew Pinski

[-- Attachment #1: Type: text/plain, Size: 490 bytes --]

On May  6, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:

> struct a { unsigned long long x : 16, y : 15, z : 33; };
> struct a fa(struct a v) { return v; }
> struct b { unsigned long long x : 16, y : 16, z : 32; };
> struct b fb(struct b v) { return v; }

Err...  This one (quilt-refreshed) should even compile ;-)

It has actually passed bootstrap and regression testing on
an i686-linux-gnu native built on an x86_64-linux-gnu OS.  Ok to install
along with the other previous patches?


[-- Attachment #2: gcc-sra-bit-field-ref-word-grow.patch --]
[-- Type: text/x-patch, Size: 1409 bytes --]

for gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	* tree-sra.c (try_instantiate_multiple_fields): Grow alignment
	to fit wider fields that start within the same word.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-05-06 11:53:04.000000000 -0300
+++ gcc/tree-sra.c	2007-05-06 11:53:49.000000000 -0300
@@ -1521,13 +1521,38 @@ try_instantiate_multiple_fields (struct 
       if (bit + size == nbit)
 	{
 	  if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
-	    break;
+	    {
+	      /* If we're at an alignment boundary, don't bother
+		 growing alignment such that we can include this next
+		 field.  */
+	      if ((nbit & alchk)
+		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BITSIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+		break;
+	    }
 	  size += nsize;
 	}
       else if (nbit + nsize == bit)
 	{
 	  if ((nbit & alchk) != ((bit + size - 1) & alchk))
-	    break;
+	    {
+	      if ((bit & alchk)
+		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BITSIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((nbit & alchk) != ((bit + size - 1) & alchk))
+		break;
+	    }
 	  bit = nbit;
 	  size += nsize;
 	}



* Re: Reload bug & SRA oddness
  2007-05-06 23:44                               ` Alexandre Oliva
@ 2007-05-06 23:51                                 ` Diego Novillo
  2007-05-07  0:14                                   ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Diego Novillo @ 2007-05-06 23:51 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Bernd Schmidt, Roman Zippel, Daniel Berlin, GCC Patches,
	Diego Novillo, Andrew Pinski

Alexandre Oliva wrote on 05/06/07 19:44:

> It has actually passed bootstrap and regression testing on
> an i686-linux-gnu native built on an x86_64-linux-gnu OS.  Ok to install
> along with the other previous patches?

Please send a complete patch with a final description.  At this point, I
am totally lost and have no idea what to review.


* Re: Reload bug & SRA oddness
  2007-05-06 23:51                                 ` Diego Novillo
@ 2007-05-07  0:14                                   ` Alexandre Oliva
  2007-05-07  2:21                                     ` Andrew Pinski
                                                       ` (2 more replies)
  0 siblings, 3 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-07  0:14 UTC (permalink / raw)
  To: Diego Novillo
  Cc: Bernd Schmidt, Roman Zippel, Daniel Berlin, GCC Patches,
	Diego Novillo, Andrew Pinski, Eric Botcazou

[-- Attachment #1: Type: text/plain, Size: 1847 bytes --]

On May  6, 2007, Diego Novillo <dnovillo@acm.org> wrote:

> Alexandre Oliva wrote on 05/06/07 19:44:
>> It has actually passed bootstrap and regression testing on
>> a i686-linux-gnu native built on a x86_64-linux-gnu OS.  Ok to install
>> along with the other previous patches?

> Please send a complete patch with a final description.  At this point, I
> am totally lost and have no idea what to review.

Ok, here's a consolidated version.

:ADDPATCH sra:

This patch reintroduces some of the patch that I reverted on
2007-04-30, meant to scalarize into a single variable multiple small
fields that are not accessed directly in the source code, with the
following major changes since the original patch, installed on
2007-04-05:

- only assignments to BIT_FIELD_REFs are regarded as both input and
  output now; the use_all machinery is restored

- variables that are created and potentially accessed as bit-fields
  are zero-initialized, as long as they're not otherwise initialized
  from parameters, to avoid undefined behavior in bit-field
  modifications; the no-warning marker logic is restored.

- rework the code that decides which fields to put in a block such
  that it tries to avoid creating variables that are wider than a
  word, using as guide the alignment and the mode of the base variable
  or of the type of the innermost field thereof that contains the
  fields being analyzed, instead of relying on
  DECL_OFFSET_ALIGN-determined words.

- fix bug in optimizing away BIT_FIELD_REFs in assignments to the
  entire useful range of scalarization variables on big-endian targets
  that caused the bits to be stored in the wrong locations.

- rework the code to avoid the need for artificial initialization
  because of optimizer stupidity


This combination of patches was bootstrapped and regtested on
i686-linux-gnu.  Ok to install?
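
As a rough illustration of what the grouping buys (hand-written C, not the SRA output itself): when all bit-fields of a struct fit in one word, a copy can move a single scalar instead of extracting and reinserting each field:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct s { uint32_t x : 16, y : 15; };  /* fits in one 32-bit word */
_Static_assert (sizeof (struct s) == sizeof (uint32_t), "one word");

/* Per-field copy, roughly what SRA produced before the patch.  */
static struct s
copy_fields (struct s v)
{
  struct s r = { 0, 0 };
  r.x = v.x;
  r.y = v.y;
  return r;
}

/* Whole-word copy, what grouping the fields into one scalarization
   variable allows.  */
static struct s
copy_word (struct s v)
{
  struct s r;
  uint32_t w;
  memcpy (&w, &v, sizeof w);
  memcpy (&r, &w, sizeof r);
  return r;
}
```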


[-- Attachment #2: gcc-sra-bit-field-ref-take2.patch --]
[-- Type: text/x-patch, Size: 28336 bytes --]

for gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR middle-end/22156
	* tree-sra.c (struct sra_elt): Add in_bitfld_block.
	(sra_hash_tree): Handle BIT_FIELD_REFs.
	(sra_elt_hash): Don't hash bitfld blocks.
	(sra_elt_eq): Skip them in parent compares as well.  Handle
	BIT_FIELD_REFs.
	(sra_walk_expr): Use a variable as both input and output if it's
	referenced in a BIT_FIELD_REF in a lhs.
	(build_element_name_1): Handle BIT_FIELD_REFs.
	(instantiate_element): Propagate nowarn from parents.
	Gimple-zero-initialize all bit-field variables that are not
	part of parameters that are going to be scalarized on entry.
	(instantiate_missing_elements_1): Return the sra_elt.
	(canon_type_for_field): New.
	(try_instantiate_multiple_fields): New.  Infer widest possible
	access mode from decl or member type, but clip it at word
	size, and only widen it if a field crosses an alignment
	boundary.
	(instantiate_missing_elements): Use them.
	(generate_one_element_ref): Handle BIT_FIELD_REFs.
	(REPLDUP): New.
	(sra_build_elt_assignment): New.  Avoid needless
	initialization of maxalign and minalign.  Fix assignment to
	whole bitfield variables for big-endian machines.
	(generate_copy_inout): Use them.
	(generate_element_copy): Likewise.  Handle bitfld differences.
	(generate_element_zero): Don't recurse for blocks.  Use
	sra_build_elt_assignment.
	(generate_one_element_init): Take elt instead of var.  Use
	sra_build_elt_assignment.
	(generate_element_init_1): Adjust.
	(scalarize_use): Re-expand assignment to BIT_FIELD_REF of
	gimple_reg.  Use REPLDUP.
	(scalarize_copy): Use REPLDUP.
	(scalarize_ldst): Move assert before dereference.
	(dump_sra_elt_name): Handle BIT_FIELD_REFs.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-05-06 15:15:10.000000000 -0300
+++ gcc/tree-sra.c	2007-05-06 21:04:34.000000000 -0300
@@ -147,6 +147,10 @@ struct sra_elt
 
   /* True if there is BIT_FIELD_REF on the lhs with a vector. */
   bool is_vector_lhs;
+
+  /* 1 if the element is a field that is part of a block, 2 if the field
+     is the block itself, 0 if it's neither.  */
+  char in_bitfld_block;
 };
 
 #define IS_ELEMENT_FOR_GROUP(ELEMENT) (TREE_CODE (ELEMENT) == RANGE_EXPR)
@@ -205,6 +209,9 @@ extern void debug_sra_elt_name (struct s
 
 /* Forward declarations.  */
 static tree generate_element_ref (struct sra_elt *);
+static tree sra_build_assignment (tree dst, tree src);
+static void mark_all_v_defs (tree list);
+
 \f
 /* Return true if DECL is an SRA candidate.  */
 
@@ -461,6 +468,12 @@ sra_hash_tree (tree t)
       h = iterative_hash_expr (DECL_FIELD_BIT_OFFSET (t), h);
       break;
 
+    case BIT_FIELD_REF:
+      /* Don't take operand 0 into account, that's our parent.  */
+      h = iterative_hash_expr (TREE_OPERAND (t, 1), 0);
+      h = iterative_hash_expr (TREE_OPERAND (t, 2), h);
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -479,12 +492,14 @@ sra_elt_hash (const void *x)
 
   h = sra_hash_tree (e->element);
 
-  /* Take into account everything back up the chain.  Given that chain
-     lengths are rarely very long, this should be acceptable.  If we
-     truly identify this as a performance problem, it should work to
-     hash the pointer value "e->parent".  */
+  /* Take into account everything except bitfield blocks back up the
+     chain.  Given that chain lengths are rarely very long, this
+     should be acceptable.  If we truly identify this as a performance
+     problem, it should work to hash the pointer value
+     "e->parent".  */
   for (p = e->parent; p ; p = p->parent)
-    h = (h * 65521) ^ sra_hash_tree (p->element);
+    if (!p->in_bitfld_block)
+      h = (h * 65521) ^ sra_hash_tree (p->element);
 
   return h;
 }
@@ -497,8 +512,17 @@ sra_elt_eq (const void *x, const void *y
   const struct sra_elt *a = x;
   const struct sra_elt *b = y;
   tree ae, be;
+  const struct sra_elt *ap = a->parent;
+  const struct sra_elt *bp = b->parent;
+
+  if (ap)
+    while (ap->in_bitfld_block)
+      ap = ap->parent;
+  if (bp)
+    while (bp->in_bitfld_block)
+      bp = bp->parent;
 
-  if (a->parent != b->parent)
+  if (ap != bp)
     return false;
 
   ae = a->element;
@@ -533,6 +557,11 @@ sra_elt_eq (const void *x, const void *y
 	return false;
       return fields_compatible_p (ae, be);
 
+    case BIT_FIELD_REF:
+      return
+	tree_int_cst_equal (TREE_OPERAND (ae, 1), TREE_OPERAND (be, 1))
+	&& tree_int_cst_equal (TREE_OPERAND (ae, 2), TREE_OPERAND (be, 2));
+
     default:
       gcc_unreachable ();
     }
@@ -729,6 +758,7 @@ sra_walk_expr (tree *expr_p, block_stmt_
   tree inner = expr;
   bool disable_scalarization = false;
   bool use_all_p = false;
+  bool use_both_p = false;
 
   /* We're looking to collect a reference expression between EXPR and INNER,
      such that INNER is a scalarizable decl and all other nodes through EXPR
@@ -749,7 +779,11 @@ sra_walk_expr (tree *expr_p, block_stmt_
 	    if (disable_scalarization)
 	      elt->cannot_scalarize = true;
 	    else
-	      fns->use (elt, expr_p, bsi, is_output, use_all_p);
+	      {
+		if (use_both_p)
+		  fns->use (elt, expr_p, bsi, false, use_all_p);
+		fns->use (elt, expr_p, bsi, is_output, use_all_p);
+	      }
 	  }
 	return;
 
@@ -816,6 +850,11 @@ sra_walk_expr (tree *expr_p, block_stmt_
 	    if (elt)
 	      elt->is_vector_lhs = true;
 	  }
+	else if (is_output)
+	  /* If we're writing to a BIT-FIELD, we have to make sure the
+	     inner variable is back in place before we start modifying
+	     it.  */
+	  use_both_p = true;
 	/* A bit field reference (access to *multiple* fields simultaneously)
 	   is not currently scalarized.  Consider this an access to the
 	   complete outer element, to which walk_tree will bring us next.  */
@@ -884,11 +923,14 @@ sra_walk_asm_expr (tree expr, block_stmt
   sra_walk_tree_list (ASM_OUTPUTS (expr), bsi, true, fns);
 }
 
+static void sra_replace (block_stmt_iterator *bsi, tree list);
+static tree sra_build_elt_assignment (struct sra_elt *elt, tree src);
+
 /* Walk a GIMPLE_MODIFY_STMT and categorize the assignment appropriately.  */
 
 static void
 sra_walk_gimple_modify_stmt (tree expr, block_stmt_iterator *bsi,
-		      const struct sra_walk_fns *fns)
+			     const struct sra_walk_fns *fns)
 {
   struct sra_elt *lhs_elt, *rhs_elt;
   tree lhs, rhs;
@@ -1177,6 +1219,15 @@ build_element_name_1 (struct sra_elt *el
       sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (t));
       obstack_grow (&sra_obstack, buffer, strlen (buffer));
     }
+  else if (TREE_CODE (t) == BIT_FIELD_REF)
+    {
+      sprintf (buffer, "B" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 2), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+      sprintf (buffer, "F" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 1), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+    }
   else
     {
       tree name = DECL_NAME (t);
@@ -1209,9 +1260,11 @@ instantiate_element (struct sra_elt *elt
 {
   struct sra_elt *base_elt;
   tree var, base;
+  bool nowarn = TREE_NO_WARNING (elt->element);
 
   for (base_elt = elt; base_elt->parent; base_elt = base_elt->parent)
-    continue;
+    if (!nowarn)
+      nowarn = TREE_NO_WARNING (base_elt->parent->element);
   base = base_elt->element;
 
   elt->replacement = var = make_rename_temp (elt->type, "SR");
@@ -1240,9 +1293,7 @@ instantiate_element (struct sra_elt *elt
       DECL_DEBUG_EXPR_IS_FROM (var) = 1;
       
       DECL_IGNORED_P (var) = 0;
-      TREE_NO_WARNING (var) = TREE_NO_WARNING (base);
-      if (elt->element && TREE_NO_WARNING (elt->element))
-	TREE_NO_WARNING (var) = 1;
+      TREE_NO_WARNING (var) = nowarn;
     }
   else
     {
@@ -1251,6 +1302,19 @@ instantiate_element (struct sra_elt *elt
       TREE_NO_WARNING (var) = 1;
     }
 
+  /* Zero-initialize bit-field scalarization variables, to avoid
+     triggering undefined behavior, unless the base variable is an
+     argument that is going to get scalarized out as the first thing,
+     providing for the initializers we need.  */
+  if (TREE_CODE (elt->element) == BIT_FIELD_REF
+      && (TREE_CODE (base) != PARM_DECL
+	  || !bitmap_bit_p (needs_copy_in, DECL_UID (base))))
+    {
+      tree init = sra_build_assignment (var, integer_zero_node);
+      insert_edge_copies (init, ENTRY_BLOCK_PTR);
+      mark_all_v_defs (init);
+    }
+
   if (dump_file)
     {
       fputs ("  ", dump_file);
@@ -1337,7 +1401,7 @@ sum_instantiated_sizes (struct sra_elt *
 
 static void instantiate_missing_elements (struct sra_elt *elt);
 
-static void
+static struct sra_elt *
 instantiate_missing_elements_1 (struct sra_elt *elt, tree child, tree type)
 {
   struct sra_elt *sub = lookup_element (elt, child, type, INSERT);
@@ -1348,6 +1412,303 @@ instantiate_missing_elements_1 (struct s
     }
   else
     instantiate_missing_elements (sub);
+  return sub;
+}
+
+/* Obtain the canonical type for field F of ELEMENT.  */
+
+static tree
+canon_type_for_field (tree f, tree element)
+{
+  tree field_type = TREE_TYPE (f);
+
+  /* canonicalize_component_ref() unwidens some bit-field types (not
+     marked as DECL_BIT_FIELD in C++), so we must do the same, lest we
+     may introduce type mismatches.  */
+  if (INTEGRAL_TYPE_P (field_type)
+      && DECL_MODE (f) != TYPE_MODE (field_type))
+    field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
+						   field_type,
+						   element,
+						   f, NULL_TREE),
+					   NULL_TREE));
+
+  return field_type;
+}
+
+/* Look for adjacent fields of ELT starting at F that we'd like to
+   scalarize as a single variable.  Return the last field of the
+   group.  */
+
+static tree
+try_instantiate_multiple_fields (struct sra_elt *elt, tree f)
+{
+  int count;
+  unsigned HOST_WIDE_INT align, bit, size, alchk;
+  enum machine_mode mode;
+  tree first = f, prev;
+  tree type, var;
+  struct sra_elt *block;
+
+  if (!is_sra_scalar_type (TREE_TYPE (f))
+      || !host_integerp (DECL_FIELD_OFFSET (f), 1)
+      || !host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+      || !host_integerp (DECL_SIZE (f), 1)
+      || lookup_element (elt, f, NULL, NO_INSERT))
+    return f;
+
+  block = elt;
+
+  /* For complex and array objects, there are going to be integer
+     literals as child elements.  In this case, we can't just take the
+     alignment and mode of the decl, so we instead rely on the element
+     type.
+
+     ??? We could try to infer additional alignment from the full
+     object declaration and the location of the sub-elements we're
+     accessing.  */
+  for (count = 0; !DECL_P (block->element); count++)
+    block = block->parent;
+
+  align = DECL_ALIGN (block->element);
+  alchk = GET_MODE_BITSIZE (DECL_MODE (block->element));
+
+  if (count)
+    {
+      type = TREE_TYPE (block->element);
+      while (count--)
+	type = TREE_TYPE (type);
+
+      align = TYPE_ALIGN (type);
+      alchk = GET_MODE_BITSIZE (TYPE_MODE (type));
+    }
+
+  if (align < alchk)
+    align = alchk;
+
+  /* Coalescing wider fields is probably pointless and
+     inefficient.  */
+  if (align > BITS_PER_WORD)
+    align = BITS_PER_WORD;
+
+  bit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+  size = tree_low_cst (DECL_SIZE (f), 1);
+
+  alchk = align - 1;
+  alchk = ~alchk;
+
+  if ((bit & alchk) != ((bit + size - 1) & alchk))
+    return f;
+
+  /* Find adjacent fields in the same alignment word.  */
+
+  for (prev = f, f = TREE_CHAIN (f);
+       f && TREE_CODE (f) == FIELD_DECL
+	 && is_sra_scalar_type (TREE_TYPE (f))
+	 && host_integerp (DECL_FIELD_OFFSET (f), 1)
+	 && host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+	 && host_integerp (DECL_SIZE (f), 1)
+	 && !lookup_element (elt, f, NULL, NO_INSERT);
+       prev = f, f = TREE_CHAIN (f))
+    {
+      unsigned HOST_WIDE_INT nbit, nsize;
+
+      nbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	+ tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+      nsize = tree_low_cst (DECL_SIZE (f), 1);
+
+      if (bit + size == nbit)
+	{
+	  if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+	    {
+	      /* If we're at an alignment boundary, don't bother
+		 growing alignment such that we can include this next
+		 field.  */
+	      if ((nbit & alchk)
+		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BITSIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+		break;
+	    }
+	  size += nsize;
+	}
+      else if (nbit + nsize == bit)
+	{
+	  if ((nbit & alchk) != ((bit + size - 1) & alchk))
+	    {
+	      if ((bit & alchk)
+		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BITSIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((nbit & alchk) != ((bit + size - 1) & alchk))
+		break;
+	    }
+	  bit = nbit;
+	  size += nsize;
+	}
+      else
+	break;
+    }
+
+  f = prev;
+
+  if (f == first)
+    return f;
+
+  gcc_assert ((bit & alchk) == ((bit + size - 1) & alchk));
+
+  /* Try to widen the bit range so as to cover padding bits as well.  */
+
+  if ((bit & ~alchk) || size != align)
+    {
+      unsigned HOST_WIDE_INT mbit = bit & alchk;
+      unsigned HOST_WIDE_INT msize = align;
+
+      for (f = TYPE_FIELDS (elt->type);
+	   f; f = TREE_CHAIN (f))
+	{
+	  unsigned HOST_WIDE_INT fbit, fsize;
+
+	  /* Skip the fields from first to prev.  */
+	  if (f == first)
+	    {
+	      f = prev;
+	      continue;
+	    }
+
+	  if (!(TREE_CODE (f) == FIELD_DECL
+		&& host_integerp (DECL_FIELD_OFFSET (f), 1)
+		&& host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)))
+	    continue;
+
+	  fbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+
+	  /* If we're past the selected word, we're fine.  */
+	  if ((bit & alchk) < (fbit & alchk))
+	    continue;
+
+	  if (host_integerp (DECL_SIZE (f), 1))
+	    fsize = tree_low_cst (DECL_SIZE (f), 1);
+	  else
+	    /* Assume a variable-sized field takes up all space till
+	       the end of the word.  ??? Endianness issues?  */
+	    fsize = align - (fbit & alchk);
+
+	  if ((fbit & alchk) < (bit & alchk))
+	    {
+	      /* A large field might start at a previous word and
+		 extend into the selected word.  Exclude those
+		 bits.  ??? Endianness issues? */
+	      HOST_WIDE_INT diff = fbit + fsize - mbit;
+
+	      if (diff <= 0)
+		continue;
+
+	      mbit += diff;
+	      msize -= diff;
+	    }
+	  else
+	    {
+	      /* Non-overlapping, great.  */
+	      if (fbit + fsize <= mbit
+		  || mbit + msize <= fbit)
+		continue;
+
+	      if (fbit <= mbit)
+		{
+		  unsigned HOST_WIDE_INT diff = fbit + fsize - mbit;
+		  mbit += diff;
+		  msize -= diff;
+		}
+	      else if (fbit > mbit)
+		msize -= (mbit + msize - fbit);
+	      else
+		gcc_unreachable ();
+	    }
+	}
+
+      bit = mbit;
+      size = msize;
+    }
+
+  /* Now we know the bit range we're interested in.  Find the smallest
+     machine mode we can use to access it.  */
+
+  for (mode = smallest_mode_for_size (size, MODE_INT);
+       ;
+       mode = GET_MODE_WIDER_MODE (mode))
+    {
+      gcc_assert (mode != VOIDmode);
+
+      alchk = GET_MODE_PRECISION (mode) - 1;
+      alchk = ~alchk;
+
+      if ((bit & alchk) == ((bit + size - 1) & alchk))
+	break;
+    }
+
+  gcc_assert (~alchk < align);
+
+  /* Create the field group as a single variable.  */
+
+  type = lang_hooks.types.type_for_mode (mode, 1);
+  gcc_assert (type);
+  var = build3 (BIT_FIELD_REF, type, NULL_TREE,
+		bitsize_int (size),
+		bitsize_int (bit));
+  BIT_FIELD_REF_UNSIGNED (var) = 1;
+
+  block = instantiate_missing_elements_1 (elt, var, type);
+  gcc_assert (block && block->is_scalar);
+
+  var = block->replacement;
+
+  if ((bit & ~alchk)
+      || (HOST_WIDE_INT)size != tree_low_cst (DECL_SIZE (var), 1))
+    {
+      block->replacement = build3 (BIT_FIELD_REF,
+				   TREE_TYPE (block->element), var,
+				   bitsize_int (size),
+				   bitsize_int (bit & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (block->replacement) = 1;
+    }
+
+  block->in_bitfld_block = 2;
+
+  /* Add the member fields to the group, such that they access
+     portions of the group variable.  */
+
+  for (f = first; f != TREE_CHAIN (prev); f = TREE_CHAIN (f))
+    {
+      tree field_type = canon_type_for_field (f, elt->element);
+      struct sra_elt *fld = lookup_element (block, f, field_type, INSERT);
+
+      gcc_assert (fld && fld->is_scalar && !fld->replacement);
+
+      fld->replacement = build3 (BIT_FIELD_REF, field_type, var,
+				 DECL_SIZE (f),
+				 bitsize_int
+				 ((TREE_INT_CST_LOW (DECL_FIELD_OFFSET (f))
+				   * BITS_PER_UNIT
+				   + (TREE_INT_CST_LOW
+				      (DECL_FIELD_BIT_OFFSET (f))))
+				  & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (fld->replacement) = TYPE_UNSIGNED (field_type);
+      fld->in_bitfld_block = 1;
+    }
+
+  return prev;
 }
 
 static void
@@ -1363,21 +1724,17 @@ instantiate_missing_elements (struct sra
 	for (f = TYPE_FIELDS (type); f ; f = TREE_CHAIN (f))
 	  if (TREE_CODE (f) == FIELD_DECL)
 	    {
-	      tree field_type = TREE_TYPE (f);
+	      tree last = try_instantiate_multiple_fields (elt, f);
 
-	      /* canonicalize_component_ref() unwidens some bit-field
-		 types (not marked as DECL_BIT_FIELD in C++), so we
-		 must do the same, lest we may introduce type
-		 mismatches.  */
-	      if (INTEGRAL_TYPE_P (field_type)
-		  && DECL_MODE (f) != TYPE_MODE (field_type))
-		field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
-							       field_type,
-							       elt->element,
-							       f, NULL_TREE),
-						       NULL_TREE));
+	      if (last != f)
+		{
+		  f = last;
+		  continue;
+		}
 
-	      instantiate_missing_elements_1 (elt, f, field_type);
+	      instantiate_missing_elements_1 (elt, f,
+					      canon_type_for_field
+					      (f, elt->element));
 	    }
 	break;
       }
@@ -1689,6 +2046,16 @@ generate_one_element_ref (struct sra_elt
       {
 	tree field = elt->element;
 
+	/* We can't test elt->in_bitfld_block here because, when this is
+	   called from instantiate_element, we haven't set this field
+	   yet.  */
+	if (TREE_CODE (field) == BIT_FIELD_REF)
+	  {
+	    tree ret = copy_node (field);
+	    TREE_OPERAND (ret, 0) = base;
+	    return ret;
+	  }
+
 	/* Watch out for compatible records with differing field lists.  */
 	if (DECL_FIELD_CONTEXT (field) != TYPE_MAIN_VARIANT (TREE_TYPE (base)))
 	  field = find_compatible_field (TREE_TYPE (base), field);
@@ -1741,6 +2108,120 @@ sra_build_assignment (tree dst, tree src
   return build_gimple_modify_stmt (dst, src);
 }
 
+/* BIT_FIELD_REFs must not be shared.  sra_build_elt_assignment()
+   takes care of assignments, but we must create copies for uses.  */
+#define REPLDUP(t) (TREE_CODE (t) != BIT_FIELD_REF ? (t) : copy_node (t))
+
+static tree
+sra_build_elt_assignment (struct sra_elt *elt, tree src)
+{
+  tree dst = elt->replacement;
+  tree var, type, tmp, tmp2, tmp3;
+  tree list, stmt;
+  tree cst, cst2, mask;
+  tree minshift, maxshift;
+
+  if (TREE_CODE (dst) != BIT_FIELD_REF
+      || !elt->in_bitfld_block)
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  var = TREE_OPERAND (dst, 0);
+
+  /* Try to widen the assignment to the entire variable.
+     We need the source to be a BIT_FIELD_REF as well, such that, for
+     BIT_FIELD_REF<d,sz,dp> = BIT_FIELD_REF<s,sz,sp>,
+     by design, conditions are met such that we can turn it into
+     d = BIT_FIELD_REF<s,dw,sp-dp>.  */
+  if (elt->in_bitfld_block == 2
+      && TREE_CODE (src) == BIT_FIELD_REF)
+    {
+      src = fold_build3 (BIT_FIELD_REF, TREE_TYPE (var),
+			 TREE_OPERAND (src, 0),
+			 DECL_SIZE (TREE_OPERAND (dst, 0)),
+			 size_binop (MINUS_EXPR, TREE_OPERAND (src, 2),
+				     TREE_OPERAND (dst, 2)));
+      BIT_FIELD_REF_UNSIGNED (src) = 1;
+
+      return sra_build_assignment (var, src);
+    }
+
+  if (!is_gimple_reg (var))
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  list = alloc_stmt_list ();
+
+  cst = TREE_OPERAND (dst, 2);
+  cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (dst, 1),
+		     TREE_OPERAND (dst, 2));
+
+  if (WORDS_BIG_ENDIAN)
+    {
+      maxshift = size_binop (MINUS_EXPR, DECL_SIZE (var), cst);
+      minshift = size_binop (MINUS_EXPR, DECL_SIZE (var), cst2);
+    }
+  else
+    {
+      maxshift = cst2;
+      minshift = cst;
+    }
+
+  type = TREE_TYPE (var);
+
+  mask = build_int_cst_wide (type, 1, 0);
+  cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, 1);
+  cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, 1);
+  mask = int_const_binop (MINUS_EXPR, cst, cst2, 1);
+  mask = fold_build1 (BIT_NOT_EXPR, type, mask);
+
+  if (!WORDS_BIG_ENDIAN)
+    cst2 = TREE_OPERAND (dst, 2);
+
+  tmp = make_rename_temp (type, "SR");
+  stmt = build_gimple_modify_stmt (tmp,
+				   fold_build2 (BIT_AND_EXPR, type,
+						var, mask));
+  append_to_statement_list (stmt, &list);
+
+  if (is_gimple_reg (src))
+    tmp2 = src;
+  else
+    {
+      tmp2 = make_rename_temp (TREE_TYPE (src), "SR");
+      stmt = sra_build_assignment (tmp2, src);
+      append_to_statement_list (stmt, &list);
+    }
+
+  if (!TYPE_UNSIGNED (TREE_TYPE (tmp2))
+      || TYPE_MAIN_VARIANT (TREE_TYPE (tmp2)) != TYPE_MAIN_VARIANT (type))
+    {
+      tmp3 = make_rename_temp (type, "SR");
+      tmp2 = fold_build3 (BIT_FIELD_REF, type, tmp2, TREE_OPERAND (dst, 1),
+			  bitsize_int (0));
+      if (TREE_CODE (tmp2) == BIT_FIELD_REF)
+	BIT_FIELD_REF_UNSIGNED (tmp2) = 1;
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (!integer_zerop (minshift))
+    {
+      tmp3 = make_rename_temp (type, "SR");
+      stmt = build_gimple_modify_stmt (tmp3,
+				       fold_build2 (LSHIFT_EXPR, type,
+						    tmp2, minshift));
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  stmt = build_gimple_modify_stmt (var,
+				   fold_build2 (BIT_IOR_EXPR, type,
+						tmp, tmp2));
+  append_to_statement_list (stmt, &list);
+
+  return list;
+}
+
 /* Generate a set of assignment statements in *LIST_P to copy all
    instantiated elements under ELT to or from the equivalent structure
    rooted at EXPR.  COPY_OUT controls the direction of the copy, with
@@ -1771,9 +2252,9 @@ generate_copy_inout (struct sra_elt *elt
   else if (elt->replacement)
     {
       if (copy_out)
-	t = sra_build_assignment (elt->replacement, expr);
+	t = sra_build_elt_assignment (elt, expr);
       else
-	t = sra_build_assignment (expr, elt->replacement);
+	t = sra_build_assignment (expr, REPLDUP (elt->replacement));
       append_to_statement_list (t, list_p);
     }
   else
@@ -1798,6 +2279,19 @@ generate_element_copy (struct sra_elt *d
   FOR_EACH_ACTUAL_CHILD (dc, dst)
     {
       sc = lookup_element (src, dc->element, NULL, NO_INSERT);
+      if (!sc && dc->in_bitfld_block == 2)
+	{
+	  struct sra_elt *dcs;
+
+	  FOR_EACH_ACTUAL_CHILD (dcs, dc)
+	    {
+	      sc = lookup_element (src, dcs->element, NULL, NO_INSERT);
+	      gcc_assert (sc);
+	      generate_element_copy (dcs, sc, list_p);
+	    }
+
+	  continue;
+	}
       gcc_assert (sc);
       generate_element_copy (dc, sc, list_p);
     }
@@ -1808,7 +2302,7 @@ generate_element_copy (struct sra_elt *d
 
       gcc_assert (src->replacement);
 
-      t = sra_build_assignment (dst->replacement, src->replacement);
+      t = sra_build_elt_assignment (dst, REPLDUP (src->replacement));
       append_to_statement_list (t, list_p);
     }
 }
@@ -1829,8 +2323,9 @@ generate_element_zero (struct sra_elt *e
       return;
     }
 
-  FOR_EACH_ACTUAL_CHILD (c, elt)
-    generate_element_zero (c, list_p);
+  if (!elt->in_bitfld_block)
+    FOR_EACH_ACTUAL_CHILD (c, elt)
+      generate_element_zero (c, list_p);
 
   if (elt->replacement)
     {
@@ -1839,7 +2334,7 @@ generate_element_zero (struct sra_elt *e
       gcc_assert (elt->is_scalar);
       t = fold_convert (elt->type, integer_zero_node);
 
-      t = sra_build_assignment (elt->replacement, t);
+      t = sra_build_elt_assignment (elt, t);
       append_to_statement_list (t, list_p);
     }
 }
@@ -1848,10 +2343,10 @@ generate_element_zero (struct sra_elt *e
    Add the result to *LIST_P.  */
 
 static void
-generate_one_element_init (tree var, tree init, tree *list_p)
+generate_one_element_init (struct sra_elt *elt, tree init, tree *list_p)
 {
   /* The replacement can be almost arbitrarily complex.  Gimplify.  */
-  tree stmt = sra_build_assignment (var, init);
+  tree stmt = sra_build_elt_assignment (elt, init);
   gimplify_and_add (stmt, list_p);
 }
 
@@ -1880,7 +2375,7 @@ generate_element_init_1 (struct sra_elt 
     {
       if (elt->replacement)
 	{
-	  generate_one_element_init (elt->replacement, init, list_p);
+	  generate_one_element_init (elt, init, list_p);
 	  elt->visited = true;
 	}
       return result;
@@ -2041,23 +2536,45 @@ static void
 scalarize_use (struct sra_elt *elt, tree *expr_p, block_stmt_iterator *bsi,
 	       bool is_output, bool use_all)
 {
-  tree list = NULL, stmt = bsi_stmt (*bsi);
+  tree stmt = bsi_stmt (*bsi);
 
   if (elt->replacement)
     {
       /* If we have a replacement, then updating the reference is as
 	 simple as modifying the existing statement in place.  */
       if (is_output)
-	mark_all_v_defs (stmt);
-      *expr_p = elt->replacement;
+	{
+	  if (TREE_CODE (elt->replacement) == BIT_FIELD_REF
+	      && is_gimple_reg (TREE_OPERAND (elt->replacement, 0))
+	      && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	      && &GIMPLE_STMT_OPERAND (stmt, 0) == expr_p)
+	    {
+	      tree newstmt = sra_build_elt_assignment
+		(elt, GIMPLE_STMT_OPERAND (stmt, 1));
+	      if (TREE_CODE (newstmt) != STATEMENT_LIST)
+		{
+		  tree list = alloc_stmt_list ();
+		  append_to_statement_list (newstmt, &list);
+		  newstmt = list;
+		}
+	      sra_replace (bsi, newstmt);
+	      return;
+	    }
+
+	  mark_all_v_defs (stmt);
+	}
+      *expr_p = REPLDUP (elt->replacement);
       update_stmt (stmt);
     }
   else
     {
-      /* Otherwise we need some copies.  If ELT is being read, then we want
-	 to store all (modified) sub-elements back into the structure before
-	 the reference takes place.  If ELT is being written, then we want to
-	 load the changed values back into our shadow variables.  */
+      tree list = NULL;
+
+      /* Otherwise we need some copies.  If ELT is being read, then we
+	 want to store all (modified) sub-elements back into the
+	 structure before the reference takes place.  If ELT is being
+	 written, then we want to load the changed values back into
+	 our shadow variables.  */
       /* ??? We don't check modified for reads, we just always write all of
 	 the values.  We should be able to record the SSA number of the VOP
 	 for which the values were last read.  If that number matches the
@@ -2101,7 +2618,7 @@ scalarize_copy (struct sra_elt *lhs_elt,
       gcc_assert (TREE_CODE (stmt) == GIMPLE_MODIFY_STMT);
 
       GIMPLE_STMT_OPERAND (stmt, 0) = lhs_elt->replacement;
-      GIMPLE_STMT_OPERAND (stmt, 1) = rhs_elt->replacement;
+      GIMPLE_STMT_OPERAND (stmt, 1) = REPLDUP (rhs_elt->replacement);
       update_stmt (stmt);
     }
   else if (lhs_elt->use_block_copy || rhs_elt->use_block_copy)
@@ -2255,8 +2772,8 @@ scalarize_ldst (struct sra_elt *elt, tre
 
       mark_all_v_defs (stmt);
       generate_copy_inout (elt, is_output, other, &list);
-      mark_all_v_defs (list);
       gcc_assert (list);
+      mark_all_v_defs (list);
 
       /* Preserve EH semantics.  */
       if (stmt_ends_bb_p (stmt))
@@ -2352,6 +2869,10 @@ dump_sra_elt_name (FILE *f, struct sra_e
 	    fputc ('.', f);
 	  print_generic_expr (f, elt->element, dump_flags);
 	}
+      else if (TREE_CODE (elt->element) == BIT_FIELD_REF)
+	fprintf (f, "$B" HOST_WIDE_INT_PRINT_DEC "F" HOST_WIDE_INT_PRINT_DEC,
+		 tree_low_cst (TREE_OPERAND (elt->element, 2), 1),
+		 tree_low_cst (TREE_OPERAND (elt->element, 1), 1));
       else if (TREE_CODE (elt->element) == RANGE_EXPR)
 	fprintf (f, "["HOST_WIDE_INT_PRINT_DEC".."HOST_WIDE_INT_PRINT_DEC"]",
 		 TREE_INT_CST_LOW (TREE_OPERAND (elt->element, 0)),

[-- Attachment #3: Type: text/plain, Size: 249 bytes --]


-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-05-07  0:14                                   ` Alexandre Oliva
@ 2007-05-07  2:21                                     ` Andrew Pinski
  2007-05-09 12:25                                     ` Bernd Schmidt
  2007-05-09 18:32                                     ` Roman Zippel
  2 siblings, 0 replies; 104+ messages in thread
From: Andrew Pinski @ 2007-05-07  2:21 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Diego Novillo, Bernd Schmidt, Roman Zippel, Daniel Berlin,
	GCC Patches, Diego Novillo, Eric Botcazou

On 5/6/07, Alexandre Oliva <aoliva@redhat.com> wrote:
> Ok, here's a consolidated version.

And the spu-elf regressions are still there.

This makes no sense:
  D.2296_13 = (<unnamed-signed:52>) D.2295_12;
  SR.18_36 = D.2296_13;
  SR.33_38 = BIT_FIELD_REF <SR.18_36, 52, 0>;

SR.33 is long long unsigned int and SR.18 is unnamed-signed:52.

Just use a cast instead of the BIT_FIELD_REF.
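
The mismatch can be modeled in C with two hypothetical helpers (the
52-bit width comes from the dump above; right-shifting a negative
value is implementation-defined in ISO C but arithmetic on the usual
targets):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the <unnamed-signed:52> value held in SR.18:
   sign-extend the low 52 bits of X.  */
static int64_t
sext52 (uint64_t x)
{
  return (int64_t) (x << 12) >> 12;
}

/* Model of BIT_FIELD_REF <x, 52, 0> into an unsigned variable:
   just the low 52 bits, zero-extended.  */
static uint64_t
bfr52 (int64_t x)
{
  return (uint64_t) x & (((uint64_t) 1 << 52) - 1);
}
```

For non-negative values, bfr52 and a plain conversion to the unsigned
type agree, which is why a conversion is the natural way to express
this assignment rather than a BIT_FIELD_REF between mismatched types.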

Thanks,
Andrew Pinski


* Re: Reload bug & SRA oddness
  2007-05-07  0:14                                   ` Alexandre Oliva
  2007-05-07  2:21                                     ` Andrew Pinski
@ 2007-05-09 12:25                                     ` Bernd Schmidt
  2007-05-10  7:47                                       ` Alexandre Oliva
  2007-05-09 18:32                                     ` Roman Zippel
  2 siblings, 1 reply; 104+ messages in thread
From: Bernd Schmidt @ 2007-05-09 12:25 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Diego Novillo, Roman Zippel, Daniel Berlin, GCC Patches,
	Diego Novillo, Andrew Pinski, Eric Botcazou

Alexandre Oliva wrote:

> This patch reintroduces some of the patch that I reverted on
> 2007-04-30, meant to scalarize into a single variable multiple small
> fields that are not accessed directly in the source code, with the
> following major changes since the original patch, installed on
> 2007-04-05:

I've tested this a bit, and on i386 it seems to have a beneficial effect
on the set of sources I tried.  On the Blackfin it does nothing at all.

> - only assignments to BIT_FIELD_REFs are regarded as both input and
>   output now; the use_all machinery is restored

Sounds like there's room for improvement.  Do you think it would be hard
to solve this accurately?

Is the cost logic in decide_block_copy still correct after this patch?
It seems like it could be unnecessarily pessimistic.


Bernd
-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif


* Re: Reload bug & SRA oddness
  2007-05-07  0:14                                   ` Alexandre Oliva
  2007-05-07  2:21                                     ` Andrew Pinski
  2007-05-09 12:25                                     ` Bernd Schmidt
@ 2007-05-09 18:32                                     ` Roman Zippel
  2007-05-10  7:49                                       ` Alexandre Oliva
  2 siblings, 1 reply; 104+ messages in thread
From: Roman Zippel @ 2007-05-09 18:32 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Diego Novillo, Bernd Schmidt, Daniel Berlin, GCC Patches,
	Diego Novillo, Andrew Pinski, Eric Botcazou

Hi,

On Sun, 6 May 2007, Alexandre Oliva wrote:

> This patch reintroduces some of the patch that I reverted on
> 2007-04-30, meant to scalarize into a single variable multiple small
> fields that are not accessed directly in the source code, with the
> following major changes since the original patch, installed on
> 2007-04-05:

I've tested a bit earlier version and it bootstrapped again. :)
The only regression left now is that gcc.c-torture/execute/bf64-1.c fails 
with -O3.  main should be empty but isn't, and the very first test fails.
The initial RTL shows this as the first instructions:

(insn 5 4 6 ../gcc/gcc/testsuite/gcc.c-torture/execute/bf64-1.c:28 (set (reg:HI 32)
        (const_int 291 [0x123])) -1 (nil)
    (nil))

(insn 6 5 7 ../gcc/gcc/testsuite/gcc.c-torture/execute/bf64-1.c:28 (set (subreg:SI (reg:DI 33) 4)
        (zero_extract:SI (subreg:SI (reg:HI 32) 0)
            (const_int 12 [0xc])
            (const_int 16 [0x10]))) -1 (nil)
    (nil))

The offset is wrong here, so the test later fails when it extracts the 
value again.

One other thing I noticed is that patterns like "(zero_extract:SI 
(reg:SI) 32 0)" are now generated, which could simply be replaced with 
(reg:SI).  Apparently back ends never had to deal with this before, so gcc 
now generates e.g.:

	bfins %d1,%d0{#0:#32}

It's easy enough to change in the back end, but maybe it's better to fix 
it somewhere else?
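
The observation can be sketched in C with a hypothetical helper
(LSB-first bit numbering assumed here, whereas m68k's bfins/bfextu
number bits from the MSB): a full-width zero_extract is the identity,
so nothing needs to be emitted for it at all:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of (zero_extract:SI value size pos): when SIZE
   covers the whole 32-bit register and POS is 0, the result is just
   VALUE, so no bit-field instruction is needed.  */
static uint32_t
zero_extract_si (uint32_t value, unsigned size, unsigned pos)
{
  if (size == 32 && pos == 0)
    return value;                       /* full-width: identity */
  return (value >> pos) & (uint32_t) (((uint64_t) 1 << size) - 1);
}
```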

bye, Roman


* Re: Reload bug & SRA oddness
  2007-05-09 12:25                                     ` Bernd Schmidt
@ 2007-05-10  7:47                                       ` Alexandre Oliva
  2007-05-10 11:00                                         ` Bernd Schmidt
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-10  7:47 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Diego Novillo, Roman Zippel, Daniel Berlin, GCC Patches,
	Diego Novillo, Andrew Pinski, Eric Botcazou

On May  9, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:

>> - only assignments to BIT_FIELD_REFs are regarded as both input and
>> output now; the use_all machinery is restored

> Sounds like there's room for improvement.  Do you think it would be hard
> to solve this accurately?

I'm not sure what kind of improvement you have in mind.  Are you
talking about not doing the in/out at all and breaking up the
BIT_FIELD_REF such that it affects the scalarized variables?  Or about
doing the in/out only on the fields covered by the BIT_FIELD_REF?  Or
something else?

> Is the cost logic in decide_block_copy still correct after this patch?
> It seems like it could be unnecessarily pessimistic.

That's possible.  I haven't paid too much attention to that yet.  I'm
still busy trying to get it to generate correct code for all targets
;-/

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


* Re: Reload bug & SRA oddness
  2007-05-09 18:32                                     ` Roman Zippel
@ 2007-05-10  7:49                                       ` Alexandre Oliva
  2007-05-15 17:39                                         ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-10  7:49 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Diego Novillo, Bernd Schmidt, Daniel Berlin, GCC Patches,
	Diego Novillo, Andrew Pinski, Eric Botcazou

On May  9, 2007, Roman Zippel <zippel@linux-m68k.org> wrote:

> (insn 5 4 6 ../gcc/gcc/testsuite/gcc.c-torture/execute/bf64-1.c:28 (set (reg:HI 32)
>         (const_int 291 [0x123])) -1 (nil)
>     (nil))

> (insn 6 5 7 ../gcc/gcc/testsuite/gcc.c-torture/execute/bf64-1.c:28 (set (subreg:SI (reg:DI 33) 4)
>         (zero_extract:SI (subreg:SI (reg:HI 32) 0)
>             (const_int 12 [0xc])
>             (const_int 16 [0x10]))) -1 (nil)
>     (nil))

> The offset is wrong here, and so the test later fails when it extracts the 
> value again.

Thanks, I'll look into that.

> One other thing I noticed is that now patterns like "(zero_extract:SI 
> (reg:SI) 32 0)" are generated which could simply be replaced with 
> (reg:SI).

Ugh.  I didn't bother optimizing these away because I was sure
something else would take care of them.  Now that I know it won't,
I'll avoid emitting them in the first place, thanks.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


* Re: Reload bug & SRA oddness
  2007-05-10  7:47                                       ` Alexandre Oliva
@ 2007-05-10 11:00                                         ` Bernd Schmidt
  2007-05-22  7:09                                           ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Bernd Schmidt @ 2007-05-10 11:00 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Diego Novillo, Roman Zippel, Daniel Berlin, GCC Patches,
	Diego Novillo, Andrew Pinski, Eric Botcazou

Alexandre Oliva wrote:
> On May  9, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> 
>>> - only assignments to BIT_FIELD_REFs are regarded as both input and
>>> output now; the use_all machinery is restored
> 
>> Sounds like there's room for improvement.  Do you think it would be hard
>> to solve this accurately?
> 
> I'm not sure what kind of improvement you have in mind.  Are you
> talking about not doing the in/out at all and breaking up the
> BIT_FIELD_REF such that it affects the scalarized variables?  Or about
> doing the in/out only on the fields covered by the BIT_FIELD_REF?  Or
> something else?

Either would be an improvement, but I was thinking about the second
option, which would have improved the code in a testcase I sent earlier
in the thread.


Bernd

-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif


* Re: Reload bug & SRA oddness
  2007-05-10  7:49                                       ` Alexandre Oliva
@ 2007-05-15 17:39                                         ` Alexandre Oliva
  2007-05-17 21:17                                           ` Andrew Pinski
  2007-05-28 10:49                                           ` Bernd Schmidt
  0 siblings, 2 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-15 17:39 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Diego Novillo, Bernd Schmidt, Daniel Berlin, GCC Patches,
	Diego Novillo, Andrew Pinski, Eric Botcazou

[-- Attachment #1: Type: text/plain, Size: 1298 bytes --]

On May 10, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:

> On May  9, 2007, Roman Zippel <zippel@linux-m68k.org> wrote:

>> The offset is wrong here, and so the test later fails when it extracts the 
>> value again.

> Thanks, I'll look into that.

IIRC this was caused by the lack of explicit type conversions.

>> One other thing I noticed is that now patterns like "(zero_extract:SI 
>> (reg:SI) 32 0)" are generated which could simply be replaced with 
>> (reg:SI).

> Ugh.  I didn't bother optimizing these away because I was sure
> something else would take care of them.  Now that I know it won't,
> I'll avoid emitting them in the first place, thanks.

I've fixed this in the new version of the patch.  I've also arranged
for us to almost never issue BIT_FIELD_REFs where simpler bitwise
operations would do.  GCC can then do a much better optimization job.

I still haven't addressed Bernd's suggestion (if you don't mind, I'd
rather do that in a separate patch, as an additional improvement; this
one is quickly growing out of hand), but I intend to do so next.

Roman, Andrew, would you please confirm that this patch fixes the
regressions on m68k and spu that you reported?

Thanks,


This was bootstrapped and regression-tested on x86_64-linux-gnu.  Ok
to install?

:ADDPATCH sra:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: gcc-sra-bit-field-ref-take2.patch --]
[-- Type: text/x-patch, Size: 34936 bytes --]

for gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR middle-end/22156
	* tree-sra.c (struct sra_elt): Add in_bitfld_block.
	(sra_hash_tree): Handle BIT_FIELD_REFs.
	(sra_elt_hash): Don't hash bitfld blocks.
	(sra_elt_eq): Skip them in parent compares as well.  Handle
	BIT_FIELD_REFs.
	(sra_walk_expr): Use a variable as both input and output if it's
	referenced in a BIT_FIELD_REF in a lhs.
	(build_element_name_1): Handle BIT_FIELD_REFs.
	(instantiate_element): Propagate nowarn from parents.
	Gimple-zero-initialize all bit-field variables that are not
	part of parameters that are going to be scalarized on entry.
	(instantiate_missing_elements_1): Return the sra_elt.
	(canon_type_for_field): New.
	(try_instantiate_multiple_fields): New.  Infer widest possible
	access mode from decl or member type, but clip it at word
	size, and only widen it if a field crosses an alignment
	boundary.
	(instantiate_missing_elements): Use them.
	(generate_one_element_ref): Handle BIT_FIELD_REFs.
	(sra_build_assignment): Turn BIT_FIELD_REFs of integer-typed
	gimple regs into integer operations.
	(REPLDUP): New.
	(sra_build_elt_assignment): New.  Avoid need for needless
	initialization of maxalign and minalign.  Fix assignment to
	whole bitfield variables for big-endian machines, optimizing
	away full-width BIT_FIELD_REFs.  Turn BIT_FIELD_REFs inputs into
	bit and view-convert operations.
	(generate_copy_inout): Use them.
	(generate_element_copy): Likewise.  Handle bitfld differences.
	(generate_element_zero): Don't recurse for blocks.  Use
	sra_build_elt_assignment.
	(generate_one_element_int): Take elt instead of var.  Use
	sra_build_elt_assignment.
	(generate_element_init_1): Adjust.
	(scalarize_use): Re-expand assignment to BIT_FIELD_REF of
	gimple_reg.  Use REPLDUP.
	(scalarize_copy): Use REPLDUP.
	(scalarize_ldst): Move assert before dereference.
	(dump_sra_elt_name): Handle BIT_FIELD_REFs.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-05-14 14:41:06.000000000 -0300
+++ gcc/tree-sra.c	2007-05-14 16:05:54.000000000 -0300
@@ -147,6 +147,10 @@ struct sra_elt
 
   /* True if there is BIT_FIELD_REF on the lhs with a vector. */
   bool is_vector_lhs;
+
+  /* 1 if the element is a field that is part of a block, 2 if the field
+     is the block itself, 0 if it's neither.  */
+  char in_bitfld_block;
 };
 
 #define IS_ELEMENT_FOR_GROUP(ELEMENT) (TREE_CODE (ELEMENT) == RANGE_EXPR)
@@ -205,6 +209,9 @@ extern void debug_sra_elt_name (struct s
 
 /* Forward declarations.  */
 static tree generate_element_ref (struct sra_elt *);
+static tree sra_build_assignment (tree dst, tree src);
+static void mark_all_v_defs (tree list);
+
 \f
 /* Return true if DECL is an SRA candidate.  */
 
@@ -461,6 +468,12 @@ sra_hash_tree (tree t)
       h = iterative_hash_expr (DECL_FIELD_BIT_OFFSET (t), h);
       break;
 
+    case BIT_FIELD_REF:
+      /* Don't take operand 0 into account, that's our parent.  */
+      h = iterative_hash_expr (TREE_OPERAND (t, 1), 0);
+      h = iterative_hash_expr (TREE_OPERAND (t, 2), h);
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -479,12 +492,14 @@ sra_elt_hash (const void *x)
 
   h = sra_hash_tree (e->element);
 
-  /* Take into account everything back up the chain.  Given that chain
-     lengths are rarely very long, this should be acceptable.  If we
-     truly identify this as a performance problem, it should work to
-     hash the pointer value "e->parent".  */
+  /* Take into account everything except bitfield blocks back up the
+     chain.  Given that chain lengths are rarely very long, this
+     should be acceptable.  If we truly identify this as a performance
+     problem, it should work to hash the pointer value
+     "e->parent".  */
   for (p = e->parent; p ; p = p->parent)
-    h = (h * 65521) ^ sra_hash_tree (p->element);
+    if (!p->in_bitfld_block)
+      h = (h * 65521) ^ sra_hash_tree (p->element);
 
   return h;
 }
@@ -497,8 +512,17 @@ sra_elt_eq (const void *x, const void *y
   const struct sra_elt *a = x;
   const struct sra_elt *b = y;
   tree ae, be;
+  const struct sra_elt *ap = a->parent;
+  const struct sra_elt *bp = b->parent;
 
-  if (a->parent != b->parent)
+  if (ap)
+    while (ap->in_bitfld_block)
+      ap = ap->parent;
+  if (bp)
+    while (bp->in_bitfld_block)
+      bp = bp->parent;
+
+  if (ap != bp)
     return false;
 
   ae = a->element;
@@ -533,6 +557,11 @@ sra_elt_eq (const void *x, const void *y
 	return false;
       return fields_compatible_p (ae, be);
 
+    case BIT_FIELD_REF:
+      return
+	tree_int_cst_equal (TREE_OPERAND (ae, 1), TREE_OPERAND (be, 1))
+	&& tree_int_cst_equal (TREE_OPERAND (ae, 2), TREE_OPERAND (be, 2));
+
     default:
       gcc_unreachable ();
     }
@@ -729,6 +758,7 @@ sra_walk_expr (tree *expr_p, block_stmt_
   tree inner = expr;
   bool disable_scalarization = false;
   bool use_all_p = false;
+  bool use_both_p = false;
 
   /* We're looking to collect a reference expression between EXPR and INNER,
      such that INNER is a scalarizable decl and all other nodes through EXPR
@@ -749,7 +779,11 @@ sra_walk_expr (tree *expr_p, block_stmt_
 	    if (disable_scalarization)
 	      elt->cannot_scalarize = true;
 	    else
-	      fns->use (elt, expr_p, bsi, is_output, use_all_p);
+	      {
+		if (use_both_p)
+		  fns->use (elt, expr_p, bsi, false, use_all_p);
+		fns->use (elt, expr_p, bsi, is_output, use_all_p);
+	      }
 	  }
 	return;
 
@@ -816,6 +850,11 @@ sra_walk_expr (tree *expr_p, block_stmt_
 	    if (elt)
 	      elt->is_vector_lhs = true;
 	  }
+	else if (is_output)
+	  /* If we're writing to a BIT-FIELD, we have to make sure the
+	     inner variable is back in place before we start modifying
+	     it.  */
+	  use_both_p = true;
 	/* A bit field reference (access to *multiple* fields simultaneously)
 	   is not currently scalarized.  Consider this an access to the
 	   complete outer element, to which walk_tree will bring us next.  */
@@ -884,11 +923,14 @@ sra_walk_asm_expr (tree expr, block_stmt
   sra_walk_tree_list (ASM_OUTPUTS (expr), bsi, true, fns);
 }
 
+static void sra_replace (block_stmt_iterator *bsi, tree list);
+static tree sra_build_elt_assignment (struct sra_elt *elt, tree src);
+
 /* Walk a GIMPLE_MODIFY_STMT and categorize the assignment appropriately.  */
 
 static void
 sra_walk_gimple_modify_stmt (tree expr, block_stmt_iterator *bsi,
-		      const struct sra_walk_fns *fns)
+			     const struct sra_walk_fns *fns)
 {
   struct sra_elt *lhs_elt, *rhs_elt;
   tree lhs, rhs;
@@ -1177,6 +1219,15 @@ build_element_name_1 (struct sra_elt *el
       sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (t));
       obstack_grow (&sra_obstack, buffer, strlen (buffer));
     }
+  else if (TREE_CODE (t) == BIT_FIELD_REF)
+    {
+      sprintf (buffer, "B" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 2), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+      sprintf (buffer, "F" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 1), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+    }
   else
     {
       tree name = DECL_NAME (t);
@@ -1209,9 +1260,11 @@ instantiate_element (struct sra_elt *elt
 {
   struct sra_elt *base_elt;
   tree var, base;
+  bool nowarn = TREE_NO_WARNING (elt->element);
 
   for (base_elt = elt; base_elt->parent; base_elt = base_elt->parent)
-    continue;
+    if (!nowarn)
+      nowarn = TREE_NO_WARNING (base_elt->parent->element);
   base = base_elt->element;
 
   elt->replacement = var = make_rename_temp (elt->type, "SR");
@@ -1240,9 +1293,7 @@ instantiate_element (struct sra_elt *elt
       DECL_DEBUG_EXPR_IS_FROM (var) = 1;
       
       DECL_IGNORED_P (var) = 0;
-      TREE_NO_WARNING (var) = TREE_NO_WARNING (base);
-      if (elt->element && TREE_NO_WARNING (elt->element))
-	TREE_NO_WARNING (var) = 1;
+      TREE_NO_WARNING (var) = nowarn;
     }
   else
     {
@@ -1251,6 +1302,19 @@ instantiate_element (struct sra_elt *elt
       TREE_NO_WARNING (var) = 1;
     }
 
+  /* Zero-initialize bit-field scalarization variables, to avoid
+     triggering undefined behavior, unless the base variable is an
+     argument that is going to get scalarized out as the first thing,
+     providing for the initializers we need.  */
+  if (TREE_CODE (elt->element) == BIT_FIELD_REF
+      && (TREE_CODE (base) != PARM_DECL
+	  || !bitmap_bit_p (needs_copy_in, DECL_UID (base))))
+    {
+      tree init = sra_build_assignment (var, integer_zero_node);
+      insert_edge_copies (init, ENTRY_BLOCK_PTR);
+      mark_all_v_defs (init);
+    }
+
   if (dump_file)
     {
       fputs ("  ", dump_file);
@@ -1337,7 +1401,7 @@ sum_instantiated_sizes (struct sra_elt *
 
 static void instantiate_missing_elements (struct sra_elt *elt);
 
-static void
+static struct sra_elt *
 instantiate_missing_elements_1 (struct sra_elt *elt, tree child, tree type)
 {
   struct sra_elt *sub = lookup_element (elt, child, type, INSERT);
@@ -1348,6 +1412,303 @@ instantiate_missing_elements_1 (struct s
     }
   else
     instantiate_missing_elements (sub);
+  return sub;
+}
+
+/* Obtain the canonical type for field F of ELEMENT.  */
+
+static tree
+canon_type_for_field (tree f, tree element)
+{
+  tree field_type = TREE_TYPE (f);
+
+  /* canonicalize_component_ref() unwidens some bit-field types (not
+     marked as DECL_BIT_FIELD in C++), so we must do the same, lest we
+     may introduce type mismatches.  */
+  if (INTEGRAL_TYPE_P (field_type)
+      && DECL_MODE (f) != TYPE_MODE (field_type))
+    field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
+						   field_type,
+						   element,
+						   f, NULL_TREE),
+					   NULL_TREE));
+
+  return field_type;
+}
+
+/* Look for adjacent fields of ELT starting at F that we'd like to
+   scalarize as a single variable.  Return the last field of the
+   group.  */
+
+static tree
+try_instantiate_multiple_fields (struct sra_elt *elt, tree f)
+{
+  int count;
+  unsigned HOST_WIDE_INT align, bit, size, alchk;
+  enum machine_mode mode;
+  tree first = f, prev;
+  tree type, var;
+  struct sra_elt *block;
+
+  if (!is_sra_scalar_type (TREE_TYPE (f))
+      || !host_integerp (DECL_FIELD_OFFSET (f), 1)
+      || !host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+      || !host_integerp (DECL_SIZE (f), 1)
+      || lookup_element (elt, f, NULL, NO_INSERT))
+    return f;
+
+  block = elt;
+
+  /* For complex and array objects, there are going to be integer
+     literals as child elements.  In this case, we can't just take the
+     alignment and mode of the decl, so we instead rely on the element
+     type.
+
+     ??? We could try to infer additional alignment from the full
+     object declaration and the location of the sub-elements we're
+     accessing.  */
+  for (count = 0; !DECL_P (block->element); count++)
+    block = block->parent;
+
+  align = DECL_ALIGN (block->element);
+  alchk = GET_MODE_BITSIZE (DECL_MODE (block->element));
+
+  if (count)
+    {
+      type = TREE_TYPE (block->element);
+      while (count--)
+	type = TREE_TYPE (type);
+
+      align = TYPE_ALIGN (type);
+      alchk = GET_MODE_BITSIZE (TYPE_MODE (type));
+    }
+
+  if (align < alchk)
+    align = alchk;
+
+  /* Coalescing wider fields is probably pointless and
+     inefficient.  */
+  if (align > BITS_PER_WORD)
+    align = BITS_PER_WORD;
+
+  bit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+  size = tree_low_cst (DECL_SIZE (f), 1);
+
+  alchk = align - 1;
+  alchk = ~alchk;
+
+  if ((bit & alchk) != ((bit + size - 1) & alchk))
+    return f;
+
+  /* Find adjacent fields in the same alignment word.  */
+
+  for (prev = f, f = TREE_CHAIN (f);
+       f && TREE_CODE (f) == FIELD_DECL
+	 && is_sra_scalar_type (TREE_TYPE (f))
+	 && host_integerp (DECL_FIELD_OFFSET (f), 1)
+	 && host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+	 && host_integerp (DECL_SIZE (f), 1)
+	 && !lookup_element (elt, f, NULL, NO_INSERT);
+       prev = f, f = TREE_CHAIN (f))
+    {
+      unsigned HOST_WIDE_INT nbit, nsize;
+
+      nbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	+ tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+      nsize = tree_low_cst (DECL_SIZE (f), 1);
+
+      if (bit + size == nbit)
+	{
+	  if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+	    {
+	      /* If we're at an alignment boundary, don't bother
+		 growing alignment such that we can include this next
+		 field.  */
+	      if ((nbit & alchk)
+		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BITSIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+		break;
+	    }
+	  size += nsize;
+	}
+      else if (nbit + nsize == bit)
+	{
+	  if ((nbit & alchk) != ((bit + size - 1) & alchk))
+	    {
+	      if ((bit & alchk)
+		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BITSIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((nbit & alchk) != ((bit + size - 1) & alchk))
+		break;
+	    }
+	  bit = nbit;
+	  size += nsize;
+	}
+      else
+	break;
+    }
+
+  f = prev;
+
+  if (f == first)
+    return f;
+
+  gcc_assert ((bit & alchk) == ((bit + size - 1) & alchk));
+
+  /* Try to widen the bit range so as to cover padding bits as well.  */
+
+  if ((bit & ~alchk) || size != align)
+    {
+      unsigned HOST_WIDE_INT mbit = bit & alchk;
+      unsigned HOST_WIDE_INT msize = align;
+
+      for (f = TYPE_FIELDS (elt->type);
+	   f; f = TREE_CHAIN (f))
+	{
+	  unsigned HOST_WIDE_INT fbit, fsize;
+
+	  /* Skip the fields from first to prev.  */
+	  if (f == first)
+	    {
+	      f = prev;
+	      continue;
+	    }
+
+	  if (!(TREE_CODE (f) == FIELD_DECL
+		&& host_integerp (DECL_FIELD_OFFSET (f), 1)
+		&& host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)))
+	    continue;
+
+	  fbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+
+	  /* If we're past the selected word, we're fine.  */
+	  if ((bit & alchk) < (fbit & alchk))
+	    continue;
+
+	  if (host_integerp (DECL_SIZE (f), 1))
+	    fsize = tree_low_cst (DECL_SIZE (f), 1);
+	  else
+	    /* Assume a variable-sized field takes up all space till
+	       the end of the word.  ??? Endianness issues?  */
+	    fsize = align - (fbit & alchk);
+
+	  if ((fbit & alchk) < (bit & alchk))
+	    {
+	      /* A large field might start at a previous word and
+		 extend into the selected word.  Exclude those
+		 bits.  ??? Endianness issues? */
+	      HOST_WIDE_INT diff = fbit + fsize - mbit;
+
+	      if (diff <= 0)
+		continue;
+
+	      mbit += diff;
+	      msize -= diff;
+	    }
+	  else
+	    {
+	      /* Non-overlapping, great.  */
+	      if (fbit + fsize <= mbit
+		  || mbit + msize <= fbit)
+		continue;
+
+	      if (fbit <= mbit)
+		{
+		  unsigned HOST_WIDE_INT diff = fbit + fsize - mbit;
+		  mbit += diff;
+		  msize -= diff;
+		}
+	      else if (fbit > mbit)
+		msize -= (mbit + msize - fbit);
+	      else
+		gcc_unreachable ();
+	    }
+	}
+
+      bit = mbit;
+      size = msize;
+    }
+
+  /* Now we know the bit range we're interested in.  Find the smallest
+     machine mode we can use to access it.  */
+
+  for (mode = smallest_mode_for_size (size, MODE_INT);
+       ;
+       mode = GET_MODE_WIDER_MODE (mode))
+    {
+      gcc_assert (mode != VOIDmode);
+
+      alchk = GET_MODE_PRECISION (mode) - 1;
+      alchk = ~alchk;
+
+      if ((bit & alchk) == ((bit + size - 1) & alchk))
+	break;
+    }
+
+  gcc_assert (~alchk < align);
+
+  /* Create the field group as a single variable.  */
+
+  type = lang_hooks.types.type_for_mode (mode, 1);
+  gcc_assert (type);
+  var = build3 (BIT_FIELD_REF, type, NULL_TREE,
+		bitsize_int (size),
+		bitsize_int (bit));
+  BIT_FIELD_REF_UNSIGNED (var) = 1;
+
+  block = instantiate_missing_elements_1 (elt, var, type);
+  gcc_assert (block && block->is_scalar);
+
+  var = block->replacement;
+
+  if ((bit & ~alchk)
+      || (HOST_WIDE_INT)size != tree_low_cst (DECL_SIZE (var), 1))
+    {
+      block->replacement = build3 (BIT_FIELD_REF,
+				   TREE_TYPE (block->element), var,
+				   bitsize_int (size),
+				   bitsize_int (bit & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (block->replacement) = 1;
+    }
+
+  block->in_bitfld_block = 2;
+
+  /* Add the member fields to the group, such that they access
+     portions of the group variable.  */
+
+  for (f = first; f != TREE_CHAIN (prev); f = TREE_CHAIN (f))
+    {
+      tree field_type = canon_type_for_field (f, elt->element);
+      struct sra_elt *fld = lookup_element (block, f, field_type, INSERT);
+
+      gcc_assert (fld && fld->is_scalar && !fld->replacement);
+
+      fld->replacement = build3 (BIT_FIELD_REF, field_type, var,
+				 DECL_SIZE (f),
+				 bitsize_int
+				 ((TREE_INT_CST_LOW (DECL_FIELD_OFFSET (f))
+				   * BITS_PER_UNIT
+				   + (TREE_INT_CST_LOW
+				      (DECL_FIELD_BIT_OFFSET (f))))
+				  & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (fld->replacement) = TYPE_UNSIGNED (field_type);
+      fld->in_bitfld_block = 1;
+    }
+
+  return prev;
 }
 
 static void
@@ -1363,21 +1724,17 @@ instantiate_missing_elements (struct sra
 	for (f = TYPE_FIELDS (type); f ; f = TREE_CHAIN (f))
 	  if (TREE_CODE (f) == FIELD_DECL)
 	    {
-	      tree field_type = TREE_TYPE (f);
+	      tree last = try_instantiate_multiple_fields (elt, f);
 
-	      /* canonicalize_component_ref() unwidens some bit-field
-		 types (not marked as DECL_BIT_FIELD in C++), so we
-		 must do the same, lest we may introduce type
-		 mismatches.  */
-	      if (INTEGRAL_TYPE_P (field_type)
-		  && DECL_MODE (f) != TYPE_MODE (field_type))
-		field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
-							       field_type,
-							       elt->element,
-							       f, NULL_TREE),
-						       NULL_TREE));
+	      if (last != f)
+		{
+		  f = last;
+		  continue;
+		}
 
-	      instantiate_missing_elements_1 (elt, f, field_type);
+	      instantiate_missing_elements_1 (elt, f,
+					      canon_type_for_field
+					      (f, elt->element));
 	    }
 	break;
       }
@@ -1689,6 +2046,16 @@ generate_one_element_ref (struct sra_elt
       {
 	tree field = elt->element;
 
+	/* We can't test elt->in_bitfld_blk here because, when this is
+	   called from instantiate_element, we haven't set this field
+	   yet.  */
+	if (TREE_CODE (field) == BIT_FIELD_REF)
+	  {
+	    tree ret = copy_node (field);
+	    TREE_OPERAND (ret, 0) = base;
+	    return ret;
+	  }
+
 	/* Watch out for compatible records with differing field lists.  */
 	if (DECL_FIELD_CONTEXT (field) != TYPE_MAIN_VARIANT (TREE_TYPE (base)))
 	  field = find_compatible_field (TREE_TYPE (base), field);
@@ -1730,6 +2097,157 @@ generate_element_ref (struct sra_elt *el
 static tree
 sra_build_assignment (tree dst, tree src)
 {
+  /* Turning BIT_FIELD_REFs of integer-typed gimple regs into bit
+     operations enables other passes to do a much better job at
+     optimizing the code.  */
+  if (TREE_CODE (src) == BIT_FIELD_REF
+      && DECL_P (TREE_OPERAND (src, 0))
+      && TREE_CODE (TREE_TYPE (TREE_OPERAND (src, 0))) == INTEGER_TYPE
+      && is_gimple_reg (TREE_OPERAND (src, 0)))
+    {
+      tree cst, cst2, mask, minshift, maxshift;
+      tree tmp, var, utype, stype;
+      tree list, stmt;
+      bool unsignedp = BIT_FIELD_REF_UNSIGNED (src);
+
+      var = TREE_OPERAND (src, 0);
+      cst = TREE_OPERAND (src, 2);
+      cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (src, 1),
+			 TREE_OPERAND (src, 2));
+
+      if (WORDS_BIG_ENDIAN)
+	{
+	  maxshift = size_binop (MINUS_EXPR, DECL_SIZE (var), cst);
+	  minshift = size_binop (MINUS_EXPR, DECL_SIZE (var), cst2);
+	}
+      else
+	{
+	  maxshift = cst2;
+	  minshift = cst;
+	}
+
+      stype = TREE_TYPE (var);
+      if (!TYPE_UNSIGNED (stype))
+	stype = unsigned_type_for (stype);
+
+      utype = TREE_TYPE (dst);
+      if (TREE_CODE (utype) != INTEGER_TYPE)
+	utype = lang_hooks.types.type_for_size (TREE_INT_CST_LOW
+						(TYPE_SIZE (utype)), 1);
+      else if (!TYPE_UNSIGNED (utype))
+	utype = unsigned_type_for (utype);
+
+      list = alloc_stmt_list ();
+
+      cst2 = size_binop (MINUS_EXPR, maxshift, minshift);
+      if (tree_int_cst_equal (cst2, TYPE_SIZE (utype)))
+	{
+	  unsignedp = true;
+	  mask = NULL_TREE;
+	}
+      else
+	{
+	  mask = build_int_cst_wide (utype, 1, 0);
+	  cst = int_const_binop (LSHIFT_EXPR, mask, cst2, 1);
+	  mask = int_const_binop (MINUS_EXPR, cst, mask, 1);
+	}
+
+      tmp = make_rename_temp (utype, "SR");
+      if (TREE_TYPE (var) != stype)
+	{
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_convert (stype, var));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (!integer_zerop (minshift))
+	{
+	  tmp = make_rename_temp (stype, "SR");
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (RSHIFT_EXPR, stype,
+							var, minshift));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (TYPE_MAIN_VARIANT (utype) != TYPE_MAIN_VARIANT (stype))
+	{
+	  if (!mask && unsignedp
+	      && (TYPE_MAIN_VARIANT (utype)
+		  == TYPE_MAIN_VARIANT (TREE_TYPE (dst))))
+	    tmp = dst;
+	  else
+	    tmp = make_rename_temp (utype, "SR");
+
+	  stmt = build_gimple_modify_stmt (tmp, fold_convert (utype, var));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (mask)
+	{
+	  if (!unsignedp
+	      || (TYPE_MAIN_VARIANT (TREE_TYPE (dst))
+		  != TYPE_MAIN_VARIANT (utype)))
+	    tmp = make_rename_temp (utype, "SR");
+	  else
+	    tmp = dst;
+
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (BIT_AND_EXPR, utype,
+							var, mask));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (!unsignedp)
+	{
+	  tree signbit = int_const_binop (LSHIFT_EXPR,
+					  build_int_cst_wide (utype, 1, 0),
+					  size_binop (MINUS_EXPR, cst2,
+						      bitsize_int (1)),
+					  1);
+
+	  tmp = make_rename_temp (utype, "SR");
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (BIT_XOR_EXPR, utype,
+							var, signbit));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+
+	  if (TYPE_MAIN_VARIANT (TREE_TYPE (dst)) != TYPE_MAIN_VARIANT (utype))
+	    tmp = make_rename_temp (utype, "SR");
+	  else
+	    tmp = dst;
+
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (MINUS_EXPR, utype,
+							var, signbit));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (var != dst)
+	{
+	  if (TREE_CODE (TREE_TYPE (dst)) == INTEGER_TYPE)
+	    var = fold_convert (TREE_TYPE (dst), var);
+	  else
+	    var = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (dst), var);
+
+	  stmt = build_gimple_modify_stmt (dst, var);
+	  append_to_statement_list (stmt, &list);
+	}
+
+      return list;
+    }
+
   /* It was hoped that we could perform some type sanity checking
      here, but since front-ends can emit accesses of fields in types
      different from their nominal types and copy structures containing
@@ -1741,6 +2259,190 @@ sra_build_assignment (tree dst, tree src
   return build_gimple_modify_stmt (dst, src);
 }
 
+/* BIT_FIELD_REFs must not be shared.  sra_build_elt_assignment()
+   takes care of assignments, but we must create copies for uses.  */
+#define REPLDUP(t) (TREE_CODE (t) != BIT_FIELD_REF ? (t) : copy_node (t))
+
+static tree
+sra_build_elt_assignment (struct sra_elt *elt, tree src)
+{
+  tree dst = elt->replacement;
+  tree var, type, utype, tmp, tmp2, tmp3;
+  tree list, stmt;
+  tree cst, cst2, mask;
+  tree minshift, maxshift;
+
+  if (TREE_CODE (dst) != BIT_FIELD_REF
+      || !elt->in_bitfld_block)
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  var = TREE_OPERAND (dst, 0);
+
+  /* Try to widen the assignment to the entire variable.
+     We need the source to be a BIT_FIELD_REF as well, such that, for
+     BIT_FIELD_REF<d,sz,dp> = BIT_FIELD_REF<s,sz,sp>,
+     by design, conditions are met such that we can turn it into
+     d = BIT_FIELD_REF<s,dw,sp-dp>.  */
+  if (elt->in_bitfld_block == 2
+      && TREE_CODE (src) == BIT_FIELD_REF)
+    {
+      cst = DECL_SIZE (var);
+      cst2 = size_binop (MINUS_EXPR, TREE_OPERAND (src, 2),
+			 TREE_OPERAND (dst, 2));
+
+      src = TREE_OPERAND (src, 0);
+
+      /* Avoid full-width bit-fields.  */
+      if (integer_zerop (cst2)
+	  && tree_int_cst_equal (cst, TYPE_SIZE (TREE_TYPE (src))))
+	{
+	  if (TREE_CODE (TREE_TYPE (src)) == INTEGER_TYPE
+	      && !TYPE_UNSIGNED (TREE_TYPE (src)))
+	    src = fold_convert (unsigned_type_for (TREE_TYPE (src)), src);
+
+	  /* If a single conversion won't do, we'll need a statement
+	     list.  */
+	  if (TYPE_MAIN_VARIANT (TREE_TYPE (var))
+	      != TYPE_MAIN_VARIANT (TREE_TYPE (src)))
+	    {
+	      list = alloc_stmt_list ();
+
+	      if (TREE_CODE (TREE_TYPE (src)) != INTEGER_TYPE
+		  || !TYPE_UNSIGNED (TREE_TYPE (src)))
+		src = fold_build1 (VIEW_CONVERT_EXPR,
+				   lang_hooks.types.type_for_size
+				   (TREE_INT_CST_LOW
+				    (TYPE_SIZE (TREE_TYPE (src))),
+				    1), src);
+
+	      tmp = make_rename_temp (TREE_TYPE (src), "SR");
+	      stmt = build_gimple_modify_stmt (tmp, src);
+	      append_to_statement_list (stmt, &list);
+
+	      stmt = sra_build_assignment (var,
+					   fold_convert (TREE_TYPE (var),
+							 tmp));
+	      append_to_statement_list (stmt, &list);
+
+	      return list;
+	    }
+
+	  src = fold_convert (TREE_TYPE (var), src);
+	}
+      else
+	{
+	  src = fold_build3 (BIT_FIELD_REF, TREE_TYPE (var), src, cst, cst2);
+	  BIT_FIELD_REF_UNSIGNED (src) = 1;
+	}
+
+      return sra_build_assignment (var, src);
+    }
+
+  if (!is_gimple_reg (var))
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  list = alloc_stmt_list ();
+
+  cst = TREE_OPERAND (dst, 2);
+  cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (dst, 1),
+		     TREE_OPERAND (dst, 2));
+
+  if (WORDS_BIG_ENDIAN)
+    {
+      maxshift = size_binop (MINUS_EXPR, DECL_SIZE (var), cst);
+      minshift = size_binop (MINUS_EXPR, DECL_SIZE (var), cst2);
+    }
+  else
+    {
+      maxshift = cst2;
+      minshift = cst;
+    }
+
+  type = TREE_TYPE (var);
+  if (TYPE_UNSIGNED (type))
+    utype = type;
+  else
+    utype = unsigned_type_for (type);
+
+  mask = build_int_cst_wide (type, 1, 0);
+  cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, 1);
+  cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, 1);
+  mask = int_const_binop (MINUS_EXPR, cst, cst2, 1);
+  mask = fold_build1 (BIT_NOT_EXPR, type, mask);
+
+  tmp = make_rename_temp (utype, "SR");
+  stmt = build_gimple_modify_stmt (tmp,
+				   fold_build2 (BIT_AND_EXPR, utype,
+						var, mask));
+  append_to_statement_list (stmt, &list);
+
+  if (is_gimple_reg (src) && TREE_CODE (TREE_TYPE (src)) == INTEGER_TYPE)
+    tmp2 = src;
+  else if (TREE_CODE (TREE_TYPE (src)) == INTEGER_TYPE)
+    {
+      tmp2 = make_rename_temp (unsigned_type_for (TREE_TYPE (src)), "SR");
+      stmt = sra_build_assignment (tmp2, src);
+      append_to_statement_list (stmt, &list);
+    }
+  else
+    {
+      tmp2 = make_rename_temp
+	(lang_hooks.types.type_for_size
+	 (TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (src))),
+	  1), "SR");
+      stmt = sra_build_assignment (tmp2, fold_build1 (VIEW_CONVERT_EXPR,
+						      TREE_TYPE (tmp2), src));
+      append_to_statement_list (stmt, &list);
+    }
+
+  if (!TYPE_UNSIGNED (TREE_TYPE (tmp2)))
+    {
+      tree ut = unsigned_type_for (TREE_TYPE (tmp2));
+      tmp3 = make_rename_temp (ut, "SR");
+      tmp2 = fold_convert (ut, tmp2);
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (TYPE_MAIN_VARIANT (TREE_TYPE (tmp2)) != TYPE_MAIN_VARIANT (type))
+    {
+      tmp3 = make_rename_temp (utype, "SR");
+      tmp2 = fold_convert (utype, tmp2);
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (!integer_zerop (minshift))
+    {
+      tmp3 = make_rename_temp (utype, "SR");
+      stmt = build_gimple_modify_stmt (tmp3,
+				       fold_build2 (LSHIFT_EXPR, utype,
+						    tmp2, minshift));
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (utype != type)
+    tmp3 = make_rename_temp (utype, "SR");
+  else
+    tmp3 = var;
+  stmt = build_gimple_modify_stmt (tmp3,
+				   fold_build2 (BIT_IOR_EXPR, utype,
+						tmp, tmp2));
+  append_to_statement_list (stmt, &list);
+
+  if (tmp3 != var)
+    {
+      stmt = build_gimple_modify_stmt (var,
+				       fold_convert (type, tmp3));
+      append_to_statement_list (stmt, &list);
+    }
+
+  return list;
+}
+
 /* Generate a set of assignment statements in *LIST_P to copy all
    instantiated elements under ELT to or from the equivalent structure
    rooted at EXPR.  COPY_OUT controls the direction of the copy, with
@@ -1771,9 +2473,9 @@ generate_copy_inout (struct sra_elt *elt
   else if (elt->replacement)
     {
       if (copy_out)
-	t = sra_build_assignment (elt->replacement, expr);
+	t = sra_build_elt_assignment (elt, expr);
       else
-	t = sra_build_assignment (expr, elt->replacement);
+	t = sra_build_assignment (expr, REPLDUP (elt->replacement));
       append_to_statement_list (t, list_p);
     }
   else
@@ -1798,6 +2500,19 @@ generate_element_copy (struct sra_elt *d
   FOR_EACH_ACTUAL_CHILD (dc, dst)
     {
       sc = lookup_element (src, dc->element, NULL, NO_INSERT);
+      if (!sc && dc->in_bitfld_block == 2)
+	{
+	  struct sra_elt *dcs;
+
+	  FOR_EACH_ACTUAL_CHILD (dcs, dc)
+	    {
+	      sc = lookup_element (src, dcs->element, NULL, NO_INSERT);
+	      gcc_assert (sc);
+	      generate_element_copy (dcs, sc, list_p);
+	    }
+
+	  continue;
+	}
       gcc_assert (sc);
       generate_element_copy (dc, sc, list_p);
     }
@@ -1808,7 +2523,7 @@ generate_element_copy (struct sra_elt *d
 
       gcc_assert (src->replacement);
 
-      t = sra_build_assignment (dst->replacement, src->replacement);
+      t = sra_build_elt_assignment (dst, REPLDUP (src->replacement));
       append_to_statement_list (t, list_p);
     }
 }
@@ -1829,8 +2544,9 @@ generate_element_zero (struct sra_elt *e
       return;
     }
 
-  FOR_EACH_ACTUAL_CHILD (c, elt)
-    generate_element_zero (c, list_p);
+  if (!elt->in_bitfld_block)
+    FOR_EACH_ACTUAL_CHILD (c, elt)
+      generate_element_zero (c, list_p);
 
   if (elt->replacement)
     {
@@ -1839,7 +2555,7 @@ generate_element_zero (struct sra_elt *e
       gcc_assert (elt->is_scalar);
       t = fold_convert (elt->type, integer_zero_node);
 
-      t = sra_build_assignment (elt->replacement, t);
+      t = sra_build_elt_assignment (elt, t);
       append_to_statement_list (t, list_p);
     }
 }
@@ -1848,10 +2564,10 @@ generate_element_zero (struct sra_elt *e
    Add the result to *LIST_P.  */
 
 static void
-generate_one_element_init (tree var, tree init, tree *list_p)
+generate_one_element_init (struct sra_elt *elt, tree init, tree *list_p)
 {
   /* The replacement can be almost arbitrarily complex.  Gimplify.  */
-  tree stmt = sra_build_assignment (var, init);
+  tree stmt = sra_build_elt_assignment (elt, init);
   gimplify_and_add (stmt, list_p);
 }
 
@@ -1880,7 +2596,7 @@ generate_element_init_1 (struct sra_elt 
     {
       if (elt->replacement)
 	{
-	  generate_one_element_init (elt->replacement, init, list_p);
+	  generate_one_element_init (elt, init, list_p);
 	  elt->visited = true;
 	}
       return result;
@@ -2041,23 +2757,45 @@ static void
 scalarize_use (struct sra_elt *elt, tree *expr_p, block_stmt_iterator *bsi,
 	       bool is_output, bool use_all)
 {
-  tree list = NULL, stmt = bsi_stmt (*bsi);
+  tree stmt = bsi_stmt (*bsi);
 
   if (elt->replacement)
     {
       /* If we have a replacement, then updating the reference is as
 	 simple as modifying the existing statement in place.  */
       if (is_output)
-	mark_all_v_defs (stmt);
-      *expr_p = elt->replacement;
+	{
+	  if (TREE_CODE (elt->replacement) == BIT_FIELD_REF
+	      && is_gimple_reg (TREE_OPERAND (elt->replacement, 0))
+	      && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	      && &GIMPLE_STMT_OPERAND (stmt, 0) == expr_p)
+	    {
+	      tree newstmt = sra_build_elt_assignment
+		(elt, GIMPLE_STMT_OPERAND (stmt, 1));
+	      if (TREE_CODE (newstmt) != STATEMENT_LIST)
+		{
+		  tree list = alloc_stmt_list ();
+		  append_to_statement_list (newstmt, &list);
+		  newstmt = list;
+		}
+	      sra_replace (bsi, newstmt);
+	      return;
+	    }
+
+	  mark_all_v_defs (stmt);
+	}
+      *expr_p = REPLDUP (elt->replacement);
       update_stmt (stmt);
     }
   else
     {
-      /* Otherwise we need some copies.  If ELT is being read, then we want
-	 to store all (modified) sub-elements back into the structure before
-	 the reference takes place.  If ELT is being written, then we want to
-	 load the changed values back into our shadow variables.  */
+      tree list = NULL;
+
+      /* Otherwise we need some copies.  If ELT is being read, then we
+	 want to store all (modified) sub-elements back into the
+	 structure before the reference takes place.  If ELT is being
+	 written, then we want to load the changed values back into
+	 our shadow variables.  */
       /* ??? We don't check modified for reads, we just always write all of
 	 the values.  We should be able to record the SSA number of the VOP
 	 for which the values were last read.  If that number matches the
@@ -2101,7 +2839,7 @@ scalarize_copy (struct sra_elt *lhs_elt,
       gcc_assert (TREE_CODE (stmt) == GIMPLE_MODIFY_STMT);
 
       GIMPLE_STMT_OPERAND (stmt, 0) = lhs_elt->replacement;
-      GIMPLE_STMT_OPERAND (stmt, 1) = rhs_elt->replacement;
+      GIMPLE_STMT_OPERAND (stmt, 1) = REPLDUP (rhs_elt->replacement);
       update_stmt (stmt);
     }
   else if (lhs_elt->use_block_copy || rhs_elt->use_block_copy)
@@ -2255,8 +2993,8 @@ scalarize_ldst (struct sra_elt *elt, tre
 
       mark_all_v_defs (stmt);
       generate_copy_inout (elt, is_output, other, &list);
-      mark_all_v_defs (list);
       gcc_assert (list);
+      mark_all_v_defs (list);
 
       /* Preserve EH semantics.  */
       if (stmt_ends_bb_p (stmt))
@@ -2352,6 +3090,10 @@ dump_sra_elt_name (FILE *f, struct sra_e
 	    fputc ('.', f);
 	  print_generic_expr (f, elt->element, dump_flags);
 	}
+      else if (TREE_CODE (elt->element) == BIT_FIELD_REF)
+	fprintf (f, "$B" HOST_WIDE_INT_PRINT_DEC "F" HOST_WIDE_INT_PRINT_DEC,
+		 tree_low_cst (TREE_OPERAND (elt->element, 2), 1),
+		 tree_low_cst (TREE_OPERAND (elt->element, 1), 1));
       else if (TREE_CODE (elt->element) == RANGE_EXPR)
 	fprintf (f, "["HOST_WIDE_INT_PRINT_DEC".."HOST_WIDE_INT_PRINT_DEC"]",
 		 TREE_INT_CST_LOW (TREE_OPERAND (elt->element, 0)),

[-- Attachment #3: Type: text/plain, Size: 250 bytes --]



-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-05-15 17:39                                         ` Alexandre Oliva
@ 2007-05-17 21:17                                           ` Andrew Pinski
  2007-05-28 10:49                                           ` Bernd Schmidt
  1 sibling, 0 replies; 104+ messages in thread
From: Andrew Pinski @ 2007-05-17 21:17 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Roman Zippel, Diego Novillo, Bernd Schmidt, Daniel Berlin,
	GCC Patches, Diego Novillo, Eric Botcazou

On 5/15/07, Alexandre Oliva <aoliva@redhat.com> wrote:
> Roman, Andrew, would you please confirm that this patch fixes the
> regressions on m68k and spu that you reported?

I can confirm this patch fixes the spu-elf regression.

Thanks,
Andrew Pinski

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-05-10 11:00                                         ` Bernd Schmidt
@ 2007-05-22  7:09                                           ` Alexandre Oliva
  2007-05-22 17:34                                             ` Roman Zippel
  2007-05-22 21:23                                             ` Alexandre Oliva
  0 siblings, 2 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-22  7:09 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Diego Novillo, Roman Zippel, Daniel Berlin, GCC Patches,
	Diego Novillo, Andrew Pinski, Eric Botcazou

On May 10, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:

> Alexandre Oliva wrote:
>> On May  9, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
>> 
>>>> - only assignments to BIT_FIELD_REFs are regarded as both input and
>>>> output now; the use_all machinery is restored
>> 
>>> Sounds like there's room for improvement.  Do you think it would be hard
>>> to solve this accurately?
>> 
>> I'm not sure what kind of improvement you have in mind.  Are you
>> talking about not doing the in/out at all and breaking up the
>> BIT_FIELD_REF such that it affects the scalarized variables?  Or about
>> doing the in/out only on the fields covered by the BIT_FIELD_REF?  Or
>> something else?

> Either would be an improvement, but I was thinking about the second
> option, which would have improved the code in a testcase I sent earlier
> in the thread.

As it turns out, this case no longer triggers the in/out bits.  That
is, the code in my last patch still requests an in/out, but GCC seems
not to scalarize variables with use_all accesses, for some reason I
haven't quite understood, so in the end the in/out code has become
completely useless.

All other cases of use_all relied on this property, and there was
something about the use of BIT_FIELD_REFs in earlier versions of this
patch that broke this property.  After all the simplification I made,
the property is restored, and I could simply remove the in/out code,
obviating your optimization request.

I'm testing a revised patch now, and I'll post it as soon as I have
and have verified regression-test results, which could be anywhere
from 8 to 24 hours from now.


Now, does anyone with more understanding of SRA actually understand
how this property emerges?  I've been looking at this code for weeks
but this behavior still seems like magic to me.  TIA,

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-05-22  7:09                                           ` Alexandre Oliva
@ 2007-05-22 17:34                                             ` Roman Zippel
  2007-05-22 20:49                                               ` Alexandre Oliva
  2007-05-22 21:23                                             ` Alexandre Oliva
  1 sibling, 1 reply; 104+ messages in thread
From: Roman Zippel @ 2007-05-22 17:34 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Bernd Schmidt, Diego Novillo, Daniel Berlin, GCC Patches,
	Diego Novillo, Andrew Pinski, Eric Botcazou

Hi,

On Tue, 22 May 2007, Alexandre Oliva wrote:

> I'm testing a revised patch now, and I'll post it as soon as I have
> and have verified regression-test results, which could be anywhere
> from 8 to 24 hours from now.

BTW the patch works fine on m68k.
The only small problem is that I still see full registers accessed as 
bit-fields.  As it's not exactly a regression (the output is already 
better than before), it would IMO be OK to fix this in a separate 
patch, but it would be nice to get it fixed. :)
Thanks.

bye, Roman

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-05-22 17:34                                             ` Roman Zippel
@ 2007-05-22 20:49                                               ` Alexandre Oliva
  2007-05-23 13:07                                                 ` Roman Zippel
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-22 20:49 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Bernd Schmidt, Diego Novillo, Daniel Berlin, GCC Patches,
	Diego Novillo, Andrew Pinski, Eric Botcazou

On May 22, 2007, Roman Zippel <zippel@linux-m68k.org> wrote:

> Hi,
> On Tue, 22 May 2007, Alexandre Oliva wrote:

>> I'm testing a revised patch now, and I'll post it as soon as I have
>> and have verified regression-test results, which could be anywhere
>> from 8 to 24 hours from now.

> BTW the patch works fine on m68k.

Excellent, thanks!

> The only small problem is that I still see full registers accessed 
> as bit-fields.

Ugh.  I thought I'd caught all of these.  Do you have a self-contained
testcase?

Thanks,

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-05-22  7:09                                           ` Alexandre Oliva
  2007-05-22 17:34                                             ` Roman Zippel
@ 2007-05-22 21:23                                             ` Alexandre Oliva
  1 sibling, 0 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-22 21:23 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Diego Novillo, Roman Zippel, Daniel Berlin, GCC Patches,
	Diego Novillo, Andrew Pinski, Eric Botcazou

On May 22, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:

> As it turns out, this case no longer triggers the in/out bits.  As in,
> the code in my last patch still requests an in/out, but GCC seems to
> not scalarize variables with use_all accesses, for some reason I
> haven't quite understood, so in the end the in/out code has become
> completely useless.

I lied.  As it turned out, removing it broke a couple of testcases.
So the previous patch is the good one.  Andrew and Roman both confirm
it.

I'll look into what difference the change I made last night made and
start from that in a separate further-improvement patch, just as
Roman requested.

I don't think it makes sense to hold up this big improvement patch any
further just because some additional improvements can (and will) still
be made, do you?


I guess I'll also introduce code I wrote to try to catch cases in
which the magic property failed to hold.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-05-22 20:49                                               ` Alexandre Oliva
@ 2007-05-23 13:07                                                 ` Roman Zippel
  0 siblings, 0 replies; 104+ messages in thread
From: Roman Zippel @ 2007-05-23 13:07 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Bernd Schmidt, Diego Novillo, Daniel Berlin, GCC Patches,
	Diego Novillo, Andrew Pinski, Eric Botcazou

Hi,

On Tue, 22 May 2007, Alexandre Oliva wrote:

> > The only small problem is that I still see full registers accessed 
> > as bit-fields.
> 
> Ugh.  I thought I'd caught all of these.  Do you have a self-contained
> testcase?

My initial example:

struct K { unsigned int k : 6, l : 1, j : 10, i : 15; };
struct K retmeK (struct K x)
{ 
  return x;
}

bye, Roman

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-05-15 17:39                                         ` Alexandre Oliva
  2007-05-17 21:17                                           ` Andrew Pinski
@ 2007-05-28 10:49                                           ` Bernd Schmidt
  2007-05-31 20:57                                             ` Alexandre Oliva
  1 sibling, 1 reply; 104+ messages in thread
From: Bernd Schmidt @ 2007-05-28 10:49 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Roman Zippel, Diego Novillo, Daniel Berlin, GCC Patches,
	Diego Novillo, Andrew Pinski, Eric Botcazou

Alexandre Oliva wrote:
> This was bootstrapped and regression-tested on x86_64-linux-gnu.  Ok
> to install?

This appears to have addressed all the regressions, so it can go back
in.  Thanks!

> :ADDPATCH sra:

:REVIEWMAIL:


Bernd

-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-05-28 10:49                                           ` Bernd Schmidt
@ 2007-05-31 20:57                                             ` Alexandre Oliva
  2007-06-02 17:41                                               ` Bernd Schmidt
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-05-31 20:57 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Roman Zippel, Diego Novillo, Daniel Berlin, GCC Patches,
	Andrew Pinski, Eric Botcazou

[-- Attachment #1: Type: text/plain, Size: 1814 bytes --]

On May 28, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:

> Alexandre Oliva wrote:
>> This was bootstrapped and regression-tested on x86_64-linux-gnu.  Ok
>> to install?

> This appears to have addressed all the regressions, so it can go back
> in.  Thanks!

Thanks.  I hadn't noticed this approval before today.

It took me a while, but I've managed to address Roman's request to
remove full-width BIT_FIELD_REFs (it required some fiddling with EH
annotations), to verify the absence of regressions in Andrew Pinski's
testcase, and to handle your request for improving the copying of
fields in and out of structs accessed with BIT_FIELD_REFs.

I've not only arranged for GCC to sync only the fields covered by the
BIT_FIELD_REF, but also for it to try to explode the BIT_FIELD_REF
access into accesses to the scalarized fields, should the containing
struct be fully scalarized.  This will then enable the original struct
to be thrown away even when it's accessed through BIT_FIELD_REFs.

While working on these, I realized I'd mistakenly used
WORDS_BIG_ENDIAN rather than BITS_BIG_ENDIAN for shift counts, and
that I'd used 1 rather than true in some boolean arguments that are
generally passed as true; I also made a few other cosmetic changes
that would render a patch on top of the one you reviewed and approved
quite large and hard to follow.

So, instead of applying the previous patch and posting the
differences, I figured it would be cleaner to post the whole thing
again.  I hope you don't mind.  If you prefer, it wouldn't be too hard
to split the cosmetic changes on top of the previous patch from the
behavioral changes made along the way, but I'd rather not risk
messing things up by doing that.

The patch below was bootstrapped and regtested on x86_64-linux-gnu.
Ok to install?

:ADDPATCH sra:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: gcc-sra-bit-field-ref-take2.patch --]
[-- Type: text/x-patch, Size: 44321 bytes --]

for gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR middle-end/22156
	* tree-sra.c (struct sra_elt): Add in_bitfld_block.
	(sra_hash_tree): Handle BIT_FIELD_REFs.
	(sra_elt_hash): Don't hash bitfld blocks.
	(sra_elt_eq): Skip them in parent compares as well.  Handle
	BIT_FIELD_REFs.
	(build_element_name_1): Handle BIT_FIELD_REFs.
	(instantiate_element): Propagate nowarn from parents.
	Gimple-zero-initialize all bit-field variables that are not
	part of parameters that are going to be scalarized on entry.
	(instantiate_missing_elements_1): Return the sra_elt.
	(canon_type_for_field): New.
	(try_instantiate_multiple_fields): New.  Infer widest possible
	access mode from decl or member type, but clip it at word
	size, and only widen it if a field crosses an alignment
	boundary.
	(instantiate_missing_elements): Use them.
	(generate_one_element_ref): Handle BIT_FIELD_REFs.
	(scalar_bitfield_p): New.
	(sra_build_assignment): Optimize assignments from scalarizable
	BIT_FIELD_REFs.  Use BITS_BIG_ENDIAN to determine shift
	counts.
	(REPLDUP): New.
	(sra_build_bf_assignment): New.  Optimize assignments to
	scalarizable BIT_FIELD_REFs.
	(sra_build_elt_assignment): New.  Optimize BIT_FIELD_REF
	assignments to full variables.
	(generate_copy_inout): Use the new macros and functions.
	(generate_element_copy): Likewise.  Handle bitfld differences.
	(generate_element_zero): Don't recurse for blocks.  Use
	sra_build_elt_assignment.
	(generate_one_element_init): Take elt instead of var.  Use
	sra_build_elt_assignment.
	(generate_element_init_1): Adjust.
	(bitfield_overlaps_p): New.
	(sra_explode_bitfield_assignment): New.
	(sra_sync_for_bitfield_assignment): New.
	(scalarize_use): Re-expand assignment to scalarized
	BIT_FIELD_REFs of gimple_regs.  Explode or sync needed members
	for BIT_FIELD_REFs accesses or assignments.  Use REPLDUP.
	(scalarize_copy): Use REPLDUP.
	(scalarize_ldst): Move assert before dereference.  Adjust EH
	handling.
	(dump_sra_elt_name): Handle BIT_FIELD_REFs.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-05-31 04:06:15.000000000 -0300
+++ gcc/tree-sra.c	2007-05-31 05:03:49.000000000 -0300
@@ -147,6 +147,10 @@ struct sra_elt
 
   /* True if there is BIT_FIELD_REF on the lhs with a vector. */
   bool is_vector_lhs;
+
+  /* 1 if the element is a field that is part of a block, 2 if the field
+     is the block itself, 0 if it's neither.  */
+  char in_bitfld_block;
 };
 
 #define IS_ELEMENT_FOR_GROUP(ELEMENT) (TREE_CODE (ELEMENT) == RANGE_EXPR)
@@ -205,6 +209,9 @@ extern void debug_sra_elt_name (struct s
 
 /* Forward declarations.  */
 static tree generate_element_ref (struct sra_elt *);
+static tree sra_build_assignment (tree dst, tree src);
+static void mark_all_v_defs (tree list);
+
 \f
 /* Return true if DECL is an SRA candidate.  */
 
@@ -461,6 +468,12 @@ sra_hash_tree (tree t)
       h = iterative_hash_expr (DECL_FIELD_BIT_OFFSET (t), h);
       break;
 
+    case BIT_FIELD_REF:
+      /* Don't take operand 0 into account, that's our parent.  */
+      h = iterative_hash_expr (TREE_OPERAND (t, 1), 0);
+      h = iterative_hash_expr (TREE_OPERAND (t, 2), h);
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -479,12 +492,14 @@ sra_elt_hash (const void *x)
 
   h = sra_hash_tree (e->element);
 
-  /* Take into account everything back up the chain.  Given that chain
-     lengths are rarely very long, this should be acceptable.  If we
-     truly identify this as a performance problem, it should work to
-     hash the pointer value "e->parent".  */
+  /* Take into account everything except bitfield blocks back up the
+     chain.  Given that chain lengths are rarely very long, this
+     should be acceptable.  If we truly identify this as a performance
+     problem, it should work to hash the pointer value
+     "e->parent".  */
   for (p = e->parent; p ; p = p->parent)
-    h = (h * 65521) ^ sra_hash_tree (p->element);
+    if (!p->in_bitfld_block)
+      h = (h * 65521) ^ sra_hash_tree (p->element);
 
   return h;
 }
@@ -497,8 +512,17 @@ sra_elt_eq (const void *x, const void *y
   const struct sra_elt *a = x;
   const struct sra_elt *b = y;
   tree ae, be;
+  const struct sra_elt *ap = a->parent;
+  const struct sra_elt *bp = b->parent;
+
+  if (ap)
+    while (ap->in_bitfld_block)
+      ap = ap->parent;
+  if (bp)
+    while (bp->in_bitfld_block)
+      bp = bp->parent;
 
-  if (a->parent != b->parent)
+  if (ap != bp)
     return false;
 
   ae = a->element;
@@ -533,6 +557,11 @@ sra_elt_eq (const void *x, const void *y
 	return false;
       return fields_compatible_p (ae, be);
 
+    case BIT_FIELD_REF:
+      return
+	tree_int_cst_equal (TREE_OPERAND (ae, 1), TREE_OPERAND (be, 1))
+	&& tree_int_cst_equal (TREE_OPERAND (ae, 2), TREE_OPERAND (be, 2));
+
     default:
       gcc_unreachable ();
     }
@@ -888,7 +917,7 @@ sra_walk_asm_expr (tree expr, block_stmt
 
 static void
 sra_walk_gimple_modify_stmt (tree expr, block_stmt_iterator *bsi,
-		      const struct sra_walk_fns *fns)
+			     const struct sra_walk_fns *fns)
 {
   struct sra_elt *lhs_elt, *rhs_elt;
   tree lhs, rhs;
@@ -1177,6 +1206,15 @@ build_element_name_1 (struct sra_elt *el
       sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (t));
       obstack_grow (&sra_obstack, buffer, strlen (buffer));
     }
+  else if (TREE_CODE (t) == BIT_FIELD_REF)
+    {
+      sprintf (buffer, "B" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 2), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+      sprintf (buffer, "F" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 1), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+    }
   else
     {
       tree name = DECL_NAME (t);
@@ -1209,9 +1247,11 @@ instantiate_element (struct sra_elt *elt
 {
   struct sra_elt *base_elt;
   tree var, base;
+  bool nowarn = TREE_NO_WARNING (elt->element);
 
   for (base_elt = elt; base_elt->parent; base_elt = base_elt->parent)
-    continue;
+    if (!nowarn)
+      nowarn = TREE_NO_WARNING (base_elt->parent->element);
   base = base_elt->element;
 
   elt->replacement = var = make_rename_temp (elt->type, "SR");
@@ -1240,9 +1280,7 @@ instantiate_element (struct sra_elt *elt
       DECL_DEBUG_EXPR_IS_FROM (var) = 1;
       
       DECL_IGNORED_P (var) = 0;
-      TREE_NO_WARNING (var) = TREE_NO_WARNING (base);
-      if (elt->element && TREE_NO_WARNING (elt->element))
-	TREE_NO_WARNING (var) = 1;
+      TREE_NO_WARNING (var) = nowarn;
     }
   else
     {
@@ -1251,6 +1289,20 @@ instantiate_element (struct sra_elt *elt
       TREE_NO_WARNING (var) = 1;
     }
 
+  /* Zero-initialize bit-field scalarization variables, to avoid
+     triggering undefined behavior, unless the base variable is an
+     argument that is going to get scalarized out as the first thing,
+     providing for the initializers we need.  */
+  if (TREE_CODE (elt->element) == BIT_FIELD_REF
+      && (TREE_CODE (base) != PARM_DECL
+	  || !bitmap_bit_p (needs_copy_in, DECL_UID (base))))
+    {
+      tree init = sra_build_assignment (var, fold_convert (TREE_TYPE (var),
+							   integer_zero_node));
+      insert_edge_copies (init, ENTRY_BLOCK_PTR);
+      mark_all_v_defs (init);
+    }
+
   if (dump_file)
     {
       fputs ("  ", dump_file);
@@ -1337,7 +1389,7 @@ sum_instantiated_sizes (struct sra_elt *
 
 static void instantiate_missing_elements (struct sra_elt *elt);
 
-static void
+static struct sra_elt *
 instantiate_missing_elements_1 (struct sra_elt *elt, tree child, tree type)
 {
   struct sra_elt *sub = lookup_element (elt, child, type, INSERT);
@@ -1348,6 +1400,303 @@ instantiate_missing_elements_1 (struct s
     }
   else
     instantiate_missing_elements (sub);
+  return sub;
+}
+
+/* Obtain the canonical type for field F of ELEMENT.  */
+
+static tree
+canon_type_for_field (tree f, tree element)
+{
+  tree field_type = TREE_TYPE (f);
+
+  /* canonicalize_component_ref() unwidens some bit-field types (not
+     marked as DECL_BIT_FIELD in C++), so we must do the same, lest we
+     may introduce type mismatches.  */
+  if (INTEGRAL_TYPE_P (field_type)
+      && DECL_MODE (f) != TYPE_MODE (field_type))
+    field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
+						   field_type,
+						   element,
+						   f, NULL_TREE),
+					   NULL_TREE));
+
+  return field_type;
+}
+
+/* Look for adjacent fields of ELT starting at F that we'd like to
+   scalarize as a single variable.  Return the last field of the
+   group.  */
+
+static tree
+try_instantiate_multiple_fields (struct sra_elt *elt, tree f)
+{
+  int count;
+  unsigned HOST_WIDE_INT align, bit, size, alchk;
+  enum machine_mode mode;
+  tree first = f, prev;
+  tree type, var;
+  struct sra_elt *block;
+
+  if (!is_sra_scalar_type (TREE_TYPE (f))
+      || !host_integerp (DECL_FIELD_OFFSET (f), 1)
+      || !host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+      || !host_integerp (DECL_SIZE (f), 1)
+      || lookup_element (elt, f, NULL, NO_INSERT))
+    return f;
+
+  block = elt;
+
+  /* For complex and array objects, there are going to be integer
+     literals as child elements.  In this case, we can't just take the
+     alignment and mode of the decl, so we instead rely on the element
+     type.
+
+     ??? We could try to infer additional alignment from the full
+     object declaration and the location of the sub-elements we're
+     accessing.  */
+  for (count = 0; !DECL_P (block->element); count++)
+    block = block->parent;
+
+  align = DECL_ALIGN (block->element);
+  alchk = GET_MODE_BITSIZE (DECL_MODE (block->element));
+
+  if (count)
+    {
+      type = TREE_TYPE (block->element);
+      while (count--)
+	type = TREE_TYPE (type);
+
+      align = TYPE_ALIGN (type);
+      alchk = GET_MODE_BITSIZE (TYPE_MODE (type));
+    }
+
+  if (align < alchk)
+    align = alchk;
+
+  /* Coalescing wider fields is probably pointless and
+     inefficient.  */
+  if (align > BITS_PER_WORD)
+    align = BITS_PER_WORD;
+
+  bit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+  size = tree_low_cst (DECL_SIZE (f), 1);
+
+  alchk = align - 1;
+  alchk = ~alchk;
+
+  if ((bit & alchk) != ((bit + size - 1) & alchk))
+    return f;
+
+  /* Find adjacent fields in the same alignment word.  */
+
+  for (prev = f, f = TREE_CHAIN (f);
+       f && TREE_CODE (f) == FIELD_DECL
+	 && is_sra_scalar_type (TREE_TYPE (f))
+	 && host_integerp (DECL_FIELD_OFFSET (f), 1)
+	 && host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+	 && host_integerp (DECL_SIZE (f), 1)
+	 && !lookup_element (elt, f, NULL, NO_INSERT);
+       prev = f, f = TREE_CHAIN (f))
+    {
+      unsigned HOST_WIDE_INT nbit, nsize;
+
+      nbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	+ tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+      nsize = tree_low_cst (DECL_SIZE (f), 1);
+
+      if (bit + size == nbit)
+	{
+	  if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+	    {
+	      /* If we're at an alignment boundary, don't bother
+		 growing alignment such that we can include this next
+		 field.  */
+	      if ((nbit & alchk)
+		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BITSIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+		break;
+	    }
+	  size += nsize;
+	}
+      else if (nbit + nsize == bit)
+	{
+	  if ((nbit & alchk) != ((bit + size - 1) & alchk))
+	    {
+	      if ((bit & alchk)
+		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BITSIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((nbit & alchk) != ((bit + size - 1) & alchk))
+		break;
+	    }
+	  bit = nbit;
+	  size += nsize;
+	}
+      else
+	break;
+    }
+
+  f = prev;
+
+  if (f == first)
+    return f;
+
+  gcc_assert ((bit & alchk) == ((bit + size - 1) & alchk));
+
+  /* Try to widen the bit range so as to cover padding bits as well.  */
+
+  if ((bit & ~alchk) || size != align)
+    {
+      unsigned HOST_WIDE_INT mbit = bit & alchk;
+      unsigned HOST_WIDE_INT msize = align;
+
+      for (f = TYPE_FIELDS (elt->type);
+	   f; f = TREE_CHAIN (f))
+	{
+	  unsigned HOST_WIDE_INT fbit, fsize;
+
+	  /* Skip the fields from first to prev.  */
+	  if (f == first)
+	    {
+	      f = prev;
+	      continue;
+	    }
+
+	  if (!(TREE_CODE (f) == FIELD_DECL
+		&& host_integerp (DECL_FIELD_OFFSET (f), 1)
+		&& host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)))
+	    continue;
+
+	  fbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+
+	  /* If we're past the selected word, we're fine.  */
+	  if ((bit & alchk) < (fbit & alchk))
+	    continue;
+
+	  if (host_integerp (DECL_SIZE (f), 1))
+	    fsize = tree_low_cst (DECL_SIZE (f), 1);
+	  else
+	    /* Assume a variable-sized field takes up all space till
+	       the end of the word.  ??? Endianness issues?  */
+	    fsize = align - (fbit & alchk);
+
+	  if ((fbit & alchk) < (bit & alchk))
+	    {
+	      /* A large field might start at a previous word and
+		 extend into the selected word.  Exclude those
+		 bits.  ??? Endianness issues? */
+	      HOST_WIDE_INT diff = fbit + fsize - mbit;
+
+	      if (diff <= 0)
+		continue;
+
+	      mbit += diff;
+	      msize -= diff;
+	    }
+	  else
+	    {
+	      /* Non-overlapping, great.  */
+	      if (fbit + fsize <= mbit
+		  || mbit + msize <= fbit)
+		continue;
+
+	      if (fbit <= mbit)
+		{
+		  unsigned HOST_WIDE_INT diff = fbit + fsize - mbit;
+		  mbit += diff;
+		  msize -= diff;
+		}
+	      else if (fbit > mbit)
+		msize -= (mbit + msize - fbit);
+	      else
+		gcc_unreachable ();
+	    }
+	}
+
+      bit = mbit;
+      size = msize;
+    }
+
+  /* Now we know the bit range we're interested in.  Find the smallest
+     machine mode we can use to access it.  */
+
+  for (mode = smallest_mode_for_size (size, MODE_INT);
+       ;
+       mode = GET_MODE_WIDER_MODE (mode))
+    {
+      gcc_assert (mode != VOIDmode);
+
+      alchk = GET_MODE_PRECISION (mode) - 1;
+      alchk = ~alchk;
+
+      if ((bit & alchk) == ((bit + size - 1) & alchk))
+	break;
+    }
+
+  gcc_assert (~alchk < align);
+
+  /* Create the field group as a single variable.  */
+
+  type = lang_hooks.types.type_for_mode (mode, 1);
+  gcc_assert (type);
+  var = build3 (BIT_FIELD_REF, type, NULL_TREE,
+		bitsize_int (size),
+		bitsize_int (bit));
+  BIT_FIELD_REF_UNSIGNED (var) = 1;
+
+  block = instantiate_missing_elements_1 (elt, var, type);
+  gcc_assert (block && block->is_scalar);
+
+  var = block->replacement;
+
+  if ((bit & ~alchk)
+      || (HOST_WIDE_INT)size != tree_low_cst (DECL_SIZE (var), 1))
+    {
+      block->replacement = build3 (BIT_FIELD_REF,
+				   TREE_TYPE (block->element), var,
+				   bitsize_int (size),
+				   bitsize_int (bit & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (block->replacement) = 1;
+    }
+
+  block->in_bitfld_block = 2;
+
+  /* Add the member fields to the group, such that they access
+     portions of the group variable.  */
+
+  for (f = first; f != TREE_CHAIN (prev); f = TREE_CHAIN (f))
+    {
+      tree field_type = canon_type_for_field (f, elt->element);
+      struct sra_elt *fld = lookup_element (block, f, field_type, INSERT);
+
+      gcc_assert (fld && fld->is_scalar && !fld->replacement);
+
+      fld->replacement = build3 (BIT_FIELD_REF, field_type, var,
+				 DECL_SIZE (f),
+				 bitsize_int
+				 ((TREE_INT_CST_LOW (DECL_FIELD_OFFSET (f))
+				   * BITS_PER_UNIT
+				   + (TREE_INT_CST_LOW
+				      (DECL_FIELD_BIT_OFFSET (f))))
+				  & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (fld->replacement) = TYPE_UNSIGNED (field_type);
+      fld->in_bitfld_block = 1;
+    }
+
+  return prev;
 }
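
[The containment test used above, `(bit & alchk) != ((bit + size - 1) & alchk)`, decides whether a run of fields stays inside a single alignment word. A minimal C sketch of that predicate follows; the helper name is mine, and the pass itself works on HOST_WIDE_INT values inline rather than through a function:

```c
#include <assert.h>

/* A field of SIZE bits starting at bit position BIT fits in one
   ALIGN-bit-aligned word iff its first and last bits agree once the
   sub-word bits are masked off.  ALIGN must be a power of two.  */
static int
same_alignment_word (unsigned long bit, unsigned long size,
                     unsigned long align)
{
  unsigned long alchk = ~(align - 1);  /* selects the word number */
  return (bit & alchk) == ((bit + size - 1) & alchk);
}
```

For example, with 32-bit words, a 4-bit field starting at bit 30 spans bits 30..33 and fails the test, so the coalescing loop above would stop before including it.]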
 
 static void
@@ -1363,21 +1712,17 @@ instantiate_missing_elements (struct sra
 	for (f = TYPE_FIELDS (type); f ; f = TREE_CHAIN (f))
 	  if (TREE_CODE (f) == FIELD_DECL)
 	    {
-	      tree field_type = TREE_TYPE (f);
+	      tree last = try_instantiate_multiple_fields (elt, f);
 
-	      /* canonicalize_component_ref() unwidens some bit-field
-		 types (not marked as DECL_BIT_FIELD in C++), so we
-		 must do the same, lest we may introduce type
-		 mismatches.  */
-	      if (INTEGRAL_TYPE_P (field_type)
-		  && DECL_MODE (f) != TYPE_MODE (field_type))
-		field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
-							       field_type,
-							       elt->element,
-							       f, NULL_TREE),
-						       NULL_TREE));
+	      if (last != f)
+		{
+		  f = last;
+		  continue;
+		}
 
-	      instantiate_missing_elements_1 (elt, f, field_type);
+	      instantiate_missing_elements_1 (elt, f,
+					      canon_type_for_field
+					      (f, elt->element));
 	    }
 	break;
       }
@@ -1689,6 +2034,16 @@ generate_one_element_ref (struct sra_elt
       {
 	tree field = elt->element;
 
+	/* We can't test elt->in_bitfld_block here because, when this is
+	   called from instantiate_element, we haven't set this field
+	   yet.  */
+	if (TREE_CODE (field) == BIT_FIELD_REF)
+	  {
+	    tree ret = unshare_expr (field);
+	    TREE_OPERAND (ret, 0) = base;
+	    return ret;
+	  }
+
 	/* Watch out for compatible records with differing field lists.  */
 	if (DECL_FIELD_CONTEXT (field) != TYPE_MAIN_VARIANT (TREE_TYPE (base)))
 	  field = find_compatible_field (TREE_TYPE (base), field);
@@ -1725,11 +2080,180 @@ generate_element_ref (struct sra_elt *el
     return elt->element;
 }
 
+/* Return true if BF is a bit-field that we can handle like a scalar.  */
+
+static bool
+scalar_bitfield_p (tree bf)
+{
+  return (TREE_CODE (bf) == BIT_FIELD_REF
+	  && (is_gimple_reg (TREE_OPERAND (bf, 0))
+	      || (TYPE_MODE (TREE_TYPE (TREE_OPERAND (bf, 0))) != BLKmode
+		  && (!TREE_SIDE_EFFECTS (TREE_OPERAND (bf, 0))
+		      || (GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE
+						       (TREE_OPERAND (bf, 0))))
+			  <= BITS_PER_WORD)))));
+}
+
 /* Create an assignment statement from SRC to DST.  */
 
 static tree
 sra_build_assignment (tree dst, tree src)
 {
+  /* Turning BIT_FIELD_REFs into bit operations enables other passes
+     to do a much better job at optimizing the code.  */
+  if (scalar_bitfield_p (src))
+    {
+      tree cst, cst2, mask, minshift, maxshift;
+      tree tmp, var, utype, stype;
+      tree list, stmt;
+      bool unsignedp = BIT_FIELD_REF_UNSIGNED (src);
+
+      var = TREE_OPERAND (src, 0);
+      cst = TREE_OPERAND (src, 2);
+      cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (src, 1),
+			 TREE_OPERAND (src, 2));
+
+      if (BITS_BIG_ENDIAN)
+	{
+	  maxshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst);
+	  minshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst2);
+	}
+      else
+	{
+	  maxshift = cst2;
+	  minshift = cst;
+	}
+
+      stype = TREE_TYPE (var);
+      if (!INTEGRAL_TYPE_P (stype))
+	stype = lang_hooks.types.type_for_size (TREE_INT_CST_LOW
+						(TYPE_SIZE (stype)), 1);
+      else if (!TYPE_UNSIGNED (stype))
+	stype = unsigned_type_for (stype);
+
+      utype = TREE_TYPE (dst);
+      if (!INTEGRAL_TYPE_P (utype))
+	utype = lang_hooks.types.type_for_size (TREE_INT_CST_LOW
+						(TYPE_SIZE (utype)), 1);
+      else if (!TYPE_UNSIGNED (utype))
+	utype = unsigned_type_for (utype);
+
+      list = NULL;
+
+      cst2 = size_binop (MINUS_EXPR, maxshift, minshift);
+      if (tree_int_cst_equal (cst2, TYPE_SIZE (utype)))
+	{
+	  unsignedp = true;
+	  mask = NULL_TREE;
+	}
+      else
+	{
+	  mask = build_int_cst_wide (utype, 1, 0);
+	  cst = int_const_binop (LSHIFT_EXPR, mask, cst2, true);
+	  mask = int_const_binop (MINUS_EXPR, cst, mask, true);
+	}
+
+      tmp = make_rename_temp (utype, "SR");
+      if (TYPE_MAIN_VARIANT (TREE_TYPE (var)) != TYPE_MAIN_VARIANT (stype))
+	{
+	  if (INTEGRAL_TYPE_P (TREE_TYPE (var)))
+	    stmt = build_gimple_modify_stmt (tmp,
+					     fold_convert (stype, var));
+	  else
+	    stmt = build_gimple_modify_stmt (tmp,
+					     fold_build1 (VIEW_CONVERT_EXPR,
+							  stype, var));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (!integer_zerop (minshift))
+	{
+	  tmp = make_rename_temp (stype, "SR");
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (RSHIFT_EXPR, stype,
+							var, minshift));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (TYPE_MAIN_VARIANT (utype) != TYPE_MAIN_VARIANT (stype))
+	{
+	  if (!mask && unsignedp
+	      && (TYPE_MAIN_VARIANT (utype)
+		  == TYPE_MAIN_VARIANT (TREE_TYPE (dst))))
+	    tmp = dst;
+	  else
+	    tmp = make_rename_temp (utype, "SR");
+
+	  stmt = build_gimple_modify_stmt (tmp, fold_convert (utype, var));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (mask)
+	{
+	  if (!unsignedp
+	      || (TYPE_MAIN_VARIANT (TREE_TYPE (dst))
+		  != TYPE_MAIN_VARIANT (utype)))
+	    tmp = make_rename_temp (utype, "SR");
+	  else
+	    tmp = dst;
+
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (BIT_AND_EXPR, utype,
+							var, mask));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (!unsignedp)
+	{
+	  tree signbit = int_const_binop (LSHIFT_EXPR,
+					  build_int_cst_wide (utype, 1, 0),
+					  size_binop (MINUS_EXPR, cst2,
+						      bitsize_int (1)),
+					  true);
+
+	  tmp = make_rename_temp (utype, "SR");
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (BIT_XOR_EXPR, utype,
+							var, signbit));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+
+	  if (TYPE_MAIN_VARIANT (TREE_TYPE (dst)) != TYPE_MAIN_VARIANT (utype))
+	    tmp = make_rename_temp (utype, "SR");
+	  else
+	    tmp = dst;
+
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (MINUS_EXPR, utype,
+							var, signbit));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (var != dst)
+	{
+	  if (INTEGRAL_TYPE_P (TREE_TYPE (dst)))
+	    var = fold_convert (TREE_TYPE (dst), var);
+	  else
+	    var = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (dst), var);
+
+	  stmt = build_gimple_modify_stmt (dst, var);
+	  append_to_statement_list (stmt, &list);
+	}
+
+      return list;
+    }
+
   /* It was hoped that we could perform some type sanity checking
      here, but since front-ends can emit accesses of fields in types
      different from their nominal types and copy structures containing
@@ -1741,6 +2265,242 @@ sra_build_assignment (tree dst, tree src
   return build_gimple_modify_stmt (dst, src);
 }
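
[The statement sequence built above for a bit-field read — right-shift by `minshift`, AND with the computed mask, then the XOR/subtract pair for the signed case — can be sketched in plain C. This assumes little-endian bit numbering and a 32-bit container; the names are illustrative, since the pass emits GIMPLE statements rather than calling a helper:

```c
#include <assert.h>
#include <stdint.h>

/* Extract WIDTH bits starting at bit POS of WORD, mirroring the
   RSHIFT_EXPR / BIT_AND_EXPR sequence, with the BIT_XOR_EXPR +
   MINUS_EXPR sign-extension trick for the !unsignedp case.  */
static int32_t
extract_bitfield (uint32_t word, unsigned pos, unsigned width,
                  int is_signed)
{
  uint32_t val = word >> pos;              /* shift by minshift */
  if (width < 32)
    val &= (UINT32_C (1) << width) - 1;    /* mask to the field width */
  if (is_signed && width < 32)
    {
      uint32_t signbit = UINT32_C (1) << (width - 1);
      val = (val ^ signbit) - signbit;     /* sign-extend */
    }
  return (int32_t) val;
}
```

E.g. an 8-bit field holding 0xF5 comes out as 245 unsigned but -11 signed, which is what the xor/subtract pair produces without any conditional branch.]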
 
+/* BIT_FIELD_REFs must not be shared.  sra_build_elt_assignment()
+   takes care of assignments, but we must create copies for uses.  */
+#define REPLDUP(t) (TREE_CODE (t) != BIT_FIELD_REF ? (t) : unshare_expr (t))
+
+/* Emit an assignment from SRC to DST, but if DST is a scalarizable
+   BIT_FIELD_REF, turn it into bit operations.  */
+
+static tree
+sra_build_bf_assignment (tree dst, tree src)
+{
+  tree var, type, utype, tmp, tmp2, tmp3;
+  tree list, stmt;
+  tree cst, cst2, mask;
+  tree minshift, maxshift;
+
+  if (TREE_CODE (dst) != BIT_FIELD_REF)
+    return sra_build_assignment (dst, src);
+
+  var = TREE_OPERAND (dst, 0);
+
+  if (!scalar_bitfield_p (dst))
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  list = NULL;
+
+  cst = TREE_OPERAND (dst, 2);
+  cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (dst, 1),
+		     TREE_OPERAND (dst, 2));
+
+  if (BITS_BIG_ENDIAN)
+    {
+      maxshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst);
+      minshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst2);
+    }
+  else
+    {
+      maxshift = cst2;
+      minshift = cst;
+    }
+
+  type = TREE_TYPE (var);
+  if (!INTEGRAL_TYPE_P (type))
+    type = lang_hooks.types.type_for_size
+      (TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (var))), 1);
+  if (TYPE_UNSIGNED (type))
+    utype = type;
+  else
+    utype = unsigned_type_for (type);
+
+  mask = build_int_cst_wide (utype, 1, 0);
+  cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, true);
+  cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, true);
+  mask = int_const_binop (MINUS_EXPR, cst, cst2, true);
+  mask = fold_build1 (BIT_NOT_EXPR, utype, mask);
+
+  if (TYPE_MAIN_VARIANT (type) != TYPE_MAIN_VARIANT (TREE_TYPE (var))
+      && !integer_zerop (mask))
+    {
+      tmp = var;
+      if (!is_gimple_variable (tmp))
+	tmp = unshare_expr (var);
+
+      tmp2 = make_rename_temp (type, "SR");
+
+      if (INTEGRAL_TYPE_P (TREE_TYPE (var)))
+	stmt = build_gimple_modify_stmt (tmp2, fold_convert (type, tmp));
+      else
+	stmt = build_gimple_modify_stmt (tmp2, fold_build1 (VIEW_CONVERT_EXPR,
+							    type, tmp));
+      append_to_statement_list (stmt, &list);
+    }
+  else
+    tmp2 = var;
+
+  if (!integer_zerop (mask))
+    {
+      tmp = make_rename_temp (utype, "SR");
+      stmt = build_gimple_modify_stmt (tmp,
+				       fold_build2 (BIT_AND_EXPR, utype,
+						    tmp2, mask));
+      append_to_statement_list (stmt, &list);
+    }
+  else
+    tmp = mask;
+
+  if (is_gimple_reg (src) && INTEGRAL_TYPE_P (TREE_TYPE (src)))
+    tmp2 = src;
+  else if (INTEGRAL_TYPE_P (TREE_TYPE (src)))
+    {
+      tmp2 = make_rename_temp (unsigned_type_for (TREE_TYPE (src)), "SR");
+      stmt = sra_build_assignment (tmp2, src);
+      append_to_statement_list (stmt, &list);
+    }
+  else
+    {
+      tmp2 = make_rename_temp
+	(lang_hooks.types.type_for_size
+	 (TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (src))),
+	  1), "SR");
+      stmt = sra_build_assignment (tmp2, fold_build1 (VIEW_CONVERT_EXPR,
+						      TREE_TYPE (tmp2), src));
+      append_to_statement_list (stmt, &list);
+    }
+
+  if (!TYPE_UNSIGNED (TREE_TYPE (tmp2)))
+    {
+      tree ut = unsigned_type_for (TREE_TYPE (tmp2));
+      tmp3 = make_rename_temp (ut, "SR");
+      tmp2 = fold_convert (ut, tmp2);
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (TYPE_MAIN_VARIANT (TREE_TYPE (tmp2)) != TYPE_MAIN_VARIANT (type))
+    {
+      tmp3 = make_rename_temp (utype, "SR");
+      tmp2 = fold_convert (utype, tmp2);
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (!integer_zerop (minshift))
+    {
+      tmp3 = make_rename_temp (utype, "SR");
+      stmt = build_gimple_modify_stmt (tmp3,
+				       fold_build2 (LSHIFT_EXPR, utype,
+						    tmp2, minshift));
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (utype != TREE_TYPE (var))
+    tmp3 = make_rename_temp (utype, "SR");
+  else
+    tmp3 = var;
+  stmt = build_gimple_modify_stmt (tmp3,
+				   fold_build2 (BIT_IOR_EXPR, utype,
+						tmp, tmp2));
+  append_to_statement_list (stmt, &list);
+
+  if (tmp3 != var)
+    {
+      if (TREE_TYPE (var) == type)
+	stmt = build_gimple_modify_stmt (var,
+					 fold_convert (type, tmp3));
+      else
+	stmt = build_gimple_modify_stmt (var,
+					 fold_build1 (VIEW_CONVERT_EXPR,
+						      TREE_TYPE (var), tmp3));
+      append_to_statement_list (stmt, &list);
+    }
+
+  return list;
+}
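+
+[The write path above is a read-modify-write: build a mask covering the field, clear those bits in the container, shift the new value into place, and OR it back in.  A plain-C sketch, again with illustrative names and a 32-bit little-endian container assumed:
+
+```c
+#include <assert.h>
+#include <stdint.h>
+
+/* Store VAL into the WIDTH bits at bit POS of WORD, mirroring the
+   mask / BIT_AND_EXPR / LSHIFT_EXPR / BIT_IOR_EXPR sequence.  */
+static uint32_t
+insert_bitfield (uint32_t word, uint32_t val, unsigned pos,
+                 unsigned width)
+{
+  uint32_t lo = UINT32_C (1) << pos;                  /* 1 << minshift */
+  uint32_t hi = (pos + width < 32)
+                ? UINT32_C (1) << (pos + width) : 0;  /* 1 << maxshift */
+  uint32_t mask = ~(hi - lo);                         /* bits to keep */
+  return (word & mask) | ((val << pos) & ~mask);
+}
+```
+
+Note that when the field reaches the top of the word, `hi - lo` still yields the right mask in unsigned arithmetic, which matches the way the patch computes the mask from two shifted constants.]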
+
+/* Expand an assignment of SRC to the scalarized representation of
+   ELT.  If it is a field group, try to widen the assignment to cover
+   the full variable.  */
+
+static tree
+sra_build_elt_assignment (struct sra_elt *elt, tree src)
+{
+  tree dst = elt->replacement;
+  tree var, tmp, cst, cst2, list, stmt;
+
+  if (TREE_CODE (dst) != BIT_FIELD_REF
+      || !elt->in_bitfld_block)
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  var = TREE_OPERAND (dst, 0);
+
+  /* Try to widen the assignment to the entire variable.
+     We need the source to be a BIT_FIELD_REF as well, such that, for
+     BIT_FIELD_REF<d,sz,dp> = BIT_FIELD_REF<s,sz,sp>,
+     by design, conditions are met such that we can turn it into
+     d = BIT_FIELD_REF<s,dw,sp-dp>.  */
+  if (elt->in_bitfld_block == 2
+      && TREE_CODE (src) == BIT_FIELD_REF)
+    {
+      cst = TYPE_SIZE (TREE_TYPE (var));
+      cst2 = size_binop (MINUS_EXPR, TREE_OPERAND (src, 2),
+			 TREE_OPERAND (dst, 2));
+
+      src = TREE_OPERAND (src, 0);
+
+      /* Avoid full-width bit-fields.  */
+      if (integer_zerop (cst2)
+	  && tree_int_cst_equal (cst, TYPE_SIZE (TREE_TYPE (src))))
+	{
+	  if (INTEGRAL_TYPE_P (TREE_TYPE (src))
+	      && !TYPE_UNSIGNED (TREE_TYPE (src)))
+	    src = fold_convert (unsigned_type_for (TREE_TYPE (src)), src);
+
+	  /* If a single conversion won't do, we'll need a statement
+	     list.  */
+	  if (TYPE_MAIN_VARIANT (TREE_TYPE (var))
+	      != TYPE_MAIN_VARIANT (TREE_TYPE (src)))
+	    {
+	      list = NULL;
+
+	      if (!INTEGRAL_TYPE_P (TREE_TYPE (src))
+		  || !TYPE_UNSIGNED (TREE_TYPE (src)))
+		src = fold_build1 (VIEW_CONVERT_EXPR,
+				   lang_hooks.types.type_for_size
+				   (TREE_INT_CST_LOW
+				    (TYPE_SIZE (TREE_TYPE (src))),
+				    1), src);
+
+	      tmp = make_rename_temp (TREE_TYPE (src), "SR");
+	      stmt = build_gimple_modify_stmt (tmp, src);
+	      append_to_statement_list (stmt, &list);
+
+	      stmt = sra_build_assignment (var,
+					   fold_convert (TREE_TYPE (var),
+							 tmp));
+	      append_to_statement_list (stmt, &list);
+
+	      return list;
+	    }
+
+	  src = fold_convert (TREE_TYPE (var), src);
+	}
+      else
+	{
+	  src = fold_build3 (BIT_FIELD_REF, TREE_TYPE (var), src, cst, cst2);
+	  BIT_FIELD_REF_UNSIGNED (src) = 1;
+	}
+
+      return sra_build_assignment (var, src);
+    }
+
+  return sra_build_bf_assignment (dst, src);
+}
+
 /* Generate a set of assignment statements in *LIST_P to copy all
    instantiated elements under ELT to or from the equivalent structure
    rooted at EXPR.  COPY_OUT controls the direction of the copy, with
@@ -1764,16 +2524,16 @@ generate_copy_inout (struct sra_elt *elt
       i = c->replacement;
 
       t = build2 (COMPLEX_EXPR, elt->type, r, i);
-      t = sra_build_assignment (expr, t);
+      t = sra_build_bf_assignment (expr, t);
       SSA_NAME_DEF_STMT (expr) = t;
       append_to_statement_list (t, list_p);
     }
   else if (elt->replacement)
     {
       if (copy_out)
-	t = sra_build_assignment (elt->replacement, expr);
+	t = sra_build_elt_assignment (elt, expr);
       else
-	t = sra_build_assignment (expr, elt->replacement);
+	t = sra_build_bf_assignment (expr, REPLDUP (elt->replacement));
       append_to_statement_list (t, list_p);
     }
   else
@@ -1798,6 +2558,19 @@ generate_element_copy (struct sra_elt *d
   FOR_EACH_ACTUAL_CHILD (dc, dst)
     {
       sc = lookup_element (src, dc->element, NULL, NO_INSERT);
+      if (!sc && dc->in_bitfld_block == 2)
+	{
+	  struct sra_elt *dcs;
+
+	  FOR_EACH_ACTUAL_CHILD (dcs, dc)
+	    {
+	      sc = lookup_element (src, dcs->element, NULL, NO_INSERT);
+	      gcc_assert (sc);
+	      generate_element_copy (dcs, sc, list_p);
+	    }
+
+	  continue;
+	}
       gcc_assert (sc);
       generate_element_copy (dc, sc, list_p);
     }
@@ -1808,7 +2581,7 @@ generate_element_copy (struct sra_elt *d
 
       gcc_assert (src->replacement);
 
-      t = sra_build_assignment (dst->replacement, src->replacement);
+      t = sra_build_elt_assignment (dst, REPLDUP (src->replacement));
       append_to_statement_list (t, list_p);
     }
 }
@@ -1829,8 +2602,9 @@ generate_element_zero (struct sra_elt *e
       return;
     }
 
-  FOR_EACH_ACTUAL_CHILD (c, elt)
-    generate_element_zero (c, list_p);
+  if (!elt->in_bitfld_block)
+    FOR_EACH_ACTUAL_CHILD (c, elt)
+      generate_element_zero (c, list_p);
 
   if (elt->replacement)
     {
@@ -1839,7 +2613,7 @@ generate_element_zero (struct sra_elt *e
       gcc_assert (elt->is_scalar);
       t = fold_convert (elt->type, integer_zero_node);
 
-      t = sra_build_assignment (elt->replacement, t);
+      t = sra_build_elt_assignment (elt, t);
       append_to_statement_list (t, list_p);
     }
 }
@@ -1848,10 +2622,10 @@ generate_element_zero (struct sra_elt *e
    Add the result to *LIST_P.  */
 
 static void
-generate_one_element_init (tree var, tree init, tree *list_p)
+generate_one_element_init (struct sra_elt *elt, tree init, tree *list_p)
 {
   /* The replacement can be almost arbitrarily complex.  Gimplify.  */
-  tree stmt = sra_build_assignment (var, init);
+  tree stmt = sra_build_elt_assignment (elt, init);
   gimplify_and_add (stmt, list_p);
 }
 
@@ -1880,7 +2654,7 @@ generate_element_init_1 (struct sra_elt 
     {
       if (elt->replacement)
 	{
-	  generate_one_element_init (elt->replacement, init, list_p);
+	  generate_one_element_init (elt, init, list_p);
 	  elt->visited = true;
 	}
       return result;
@@ -2033,6 +2807,174 @@ sra_replace (block_stmt_iterator *bsi, t
     bsi_prev (bsi);
 }
 
+/* Return true if a BIT_FIELD_REF<(FLD->parent), BLEN, BPOS>
+   expression (referenced as BF below) accesses any of the bits in FLD,
+   expression (referenced as BF below) accesses any of the bits in FLD,
+   false if it doesn't.  If DATA is non-null, the bit length of FLD
+   will be stored in DATA[0] and the initial bit position of FLD will
+   be stored in DATA[1], the number of bits of FLD that overlap with
+   BF will be stored in DATA[2], and the initial bit of FLD that
+   overlaps with BF will be stored in DATA[3].  */
+
+static bool
+bitfield_overlaps_p (tree blen, tree bpos, struct sra_elt *fld, tree *data)
+{
+  tree flen, fpos;
+  bool ret;
+
+  if (TREE_CODE (fld->element) == FIELD_DECL)
+    {
+      flen = DECL_SIZE (fld->element);
+      fpos = DECL_FIELD_OFFSET (fld->element);
+      fpos = int_const_binop (MULT_EXPR, fpos,
+			      bitsize_int (BITS_PER_UNIT), true);
+      fpos = int_const_binop (PLUS_EXPR, fpos,
+			      DECL_FIELD_BIT_OFFSET (fld->element), true);
+    }
+  else if (TREE_CODE (fld->element) == BIT_FIELD_REF)
+    {
+      flen = TREE_OPERAND (fld->element, 1);
+      fpos = TREE_OPERAND (fld->element, 2);
+    }
+  else
+    gcc_unreachable ();
+
+  gcc_assert (host_integerp (blen, 1)
+	      && host_integerp (bpos, 1)
+	      && host_integerp (flen, 1)
+	      && host_integerp (fpos, 1));
+
+  ret = ((!tree_int_cst_lt (fpos, bpos)
+	  && tree_int_cst_lt (int_const_binop (MINUS_EXPR, fpos, bpos, true),
+			      blen))
+	 || (!tree_int_cst_lt (bpos, fpos)
+	     && tree_int_cst_lt (int_const_binop (MINUS_EXPR, bpos, fpos, true),
+				 flen)));
+
+  if (!ret)
+    return ret;
+
+  if (data)
+    {
+      tree bend, fend;
+
+      data[0] = flen;
+      data[1] = fpos;
+
+      fend = int_const_binop (PLUS_EXPR, fpos, flen, true);
+      bend = int_const_binop (PLUS_EXPR, bpos, blen, true);
+
+      if (tree_int_cst_lt (bend, fend))
+	data[2] = int_const_binop (MINUS_EXPR, bend, fpos, true);
+      else
+	data[2] = NULL;
+
+      if (tree_int_cst_lt (fpos, bpos))
+	{
+	  data[3] = int_const_binop (MINUS_EXPR, bpos, fpos, true);
+	  data[2] = int_const_binop (MINUS_EXPR, data[2] ? data[2] : data[0],
+				     data[3], true);
+	}
+      else
+	data[3] = NULL;
+    }
+
+  return ret;
+}
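+
+[The overlap test above is an interval-intersection check stated as two symmetric difference-vs-length comparisons.  A C sketch of the predicate (helper name is mine), using the same formulation as the patch:
+
+```c
+#include <assert.h>
+
+/* Return nonzero iff the field bits [fpos, fpos+flen) overlap the
+   accessed bits [bpos, bpos+blen), both lengths nonzero.  */
+static int
+bits_overlap_p (unsigned long blen, unsigned long bpos,
+                unsigned long flen, unsigned long fpos)
+{
+  return (fpos >= bpos && fpos - bpos < blen)
+         || (bpos >= fpos && bpos - fpos < flen);
+}
+```
+
+This is the usual "each start lies before the other end" condition, written with subtractions that cannot underflow because each is guarded by the corresponding comparison.]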
+
+static void
+sra_explode_bitfield_assignment (tree var, tree vpos, bool to_var,
+				 tree *listp, tree blen, tree bpos,
+				 struct sra_elt *elt)
+{
+  struct sra_elt *fld;
+  tree flp[4];
+
+  FOR_EACH_ACTUAL_CHILD (fld, elt)
+    {
+      if (!bitfield_overlaps_p (blen, bpos, fld, flp))
+	continue;
+
+      if (fld->replacement)
+	{
+	  tree infld, invar, st;
+	  tree flen = flp[2] ? flp[2] : flp[0];
+	  tree fpos = flp[3] ? flp[3] : bitsize_int (0);
+
+	  infld = fld->replacement;
+
+	  if (TREE_CODE (infld) == BIT_FIELD_REF)
+	    {
+	      fpos = int_const_binop (PLUS_EXPR, fpos, TREE_OPERAND (infld, 2),
+				      true);
+	      infld = TREE_OPERAND (infld, 0);
+	    }
+
+	  infld = fold_build3 (BIT_FIELD_REF,
+			       lang_hooks.types.type_for_size
+			       (TREE_INT_CST_LOW (flen), 1),
+			       infld, flen, fpos);
+	  BIT_FIELD_REF_UNSIGNED (infld) = 1;
+
+	  invar = int_const_binop (MINUS_EXPR, flp[1], bpos, true);
+	  if (flp[3])
+	    invar = int_const_binop (PLUS_EXPR, invar, flp[3], true);
+	  invar = int_const_binop (PLUS_EXPR, invar, vpos, true);
+
+	  invar = fold_build3 (BIT_FIELD_REF, TREE_TYPE (infld),
+			       var, flen, invar);
+	  BIT_FIELD_REF_UNSIGNED (invar) = 1;
+
+	  if (to_var)
+	    st = sra_build_bf_assignment (invar, infld);
+	  else
+	    st = sra_build_bf_assignment (infld, invar);
+
+	  append_to_statement_list (st, listp);
+	}
+      else
+	{
+	  tree sub = int_const_binop (MINUS_EXPR, flp[1], bpos, true);
+	  sub = int_const_binop (PLUS_EXPR, vpos, sub, true);
+	  if (flp[3])
+	    sub = int_const_binop (PLUS_EXPR, sub, flp[3], true);
+
+	  sra_explode_bitfield_assignment (var, sub, to_var, listp,
+					   flp[2] ? flp[2] : flp[0],
+					   flp[3] ? flp[3] : bitsize_int (0),
+					   fld);
+	}
+    }
+}
+
+static void
+sra_sync_for_bitfield_assignment (tree *listbeforep, tree *listafterp,
+				  tree blen, tree bpos,
+				  struct sra_elt *elt)
+{
+  struct sra_elt *fld;
+  tree flp[4];
+
+  FOR_EACH_ACTUAL_CHILD (fld, elt)
+    if (bitfield_overlaps_p (blen, bpos, fld, flp))
+      {
+	if (fld->replacement || (!flp[2] && !flp[3]))
+	  {
+	    generate_copy_inout (fld, false, generate_element_ref (fld),
+				 listbeforep);
+	    mark_no_warning (fld);
+	    if (listafterp)
+	      generate_copy_inout (fld, true, generate_element_ref (fld),
+				   listafterp);
+	  }
+	else
+	  sra_sync_for_bitfield_assignment (listbeforep, listafterp,
+					    flp[2] ? flp[2] : flp[0],
+					    flp[3] ? flp[3]
+					    : bitsize_int (0),
+					    fld);
+      }
+}
+
 /* Scalarize a USE.  To recap, this is either a simple reference to ELT,
    if elt is scalar, or some occurrence of ELT that requires a complete
    aggregate.  IS_OUTPUT is true if ELT is being modified.  */
@@ -2041,23 +2983,144 @@ static void
 scalarize_use (struct sra_elt *elt, tree *expr_p, block_stmt_iterator *bsi,
 	       bool is_output, bool use_all)
 {
-  tree list = NULL, stmt = bsi_stmt (*bsi);
+  tree stmt = bsi_stmt (*bsi);
+  tree bfexpr;
 
   if (elt->replacement)
     {
       /* If we have a replacement, then updating the reference is as
 	 simple as modifying the existing statement in place.  */
       if (is_output)
-	mark_all_v_defs (stmt);
-      *expr_p = elt->replacement;
+	{
+	  if (TREE_CODE (elt->replacement) == BIT_FIELD_REF
+	      && is_gimple_reg (TREE_OPERAND (elt->replacement, 0))
+	      && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	      && &GIMPLE_STMT_OPERAND (stmt, 0) == expr_p)
+	    {
+	      tree newstmt = sra_build_elt_assignment
+		(elt, GIMPLE_STMT_OPERAND (stmt, 1));
+	      if (TREE_CODE (newstmt) != STATEMENT_LIST)
+		{
+		  tree list = NULL;
+		  append_to_statement_list (newstmt, &list);
+		  newstmt = list;
+		}
+	      sra_replace (bsi, newstmt);
+	      return;
+	    }
+
+	  mark_all_v_defs (stmt);
+	}
+      *expr_p = REPLDUP (elt->replacement);
       update_stmt (stmt);
     }
+  else if (use_all && is_output
+	   && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	   && TREE_CODE (bfexpr
+			 = GIMPLE_STMT_OPERAND (stmt, 0)) == BIT_FIELD_REF
+	   && &TREE_OPERAND (bfexpr, 0) == expr_p
+	   && INTEGRAL_TYPE_P (TREE_TYPE (bfexpr))
+	   && TREE_CODE (TREE_TYPE (*expr_p)) == RECORD_TYPE)
+    {
+      tree listbefore = NULL, listafter = NULL;
+      tree blen = TREE_OPERAND (bfexpr, 1);
+      tree bpos = TREE_OPERAND (bfexpr, 2);
+      bool update = false;
+
+      if (!elt->use_block_copy)
+	{
+	  tree type = TREE_TYPE (bfexpr);
+	  tree var = make_rename_temp (type, "SR"), tmp, st;
+
+	  GIMPLE_STMT_OPERAND (stmt, 0) = var;
+	  update = true;
+
+	  if (!TYPE_UNSIGNED (type))
+	    {
+	      type = unsigned_type_for (type);
+	      tmp = make_rename_temp (type, "SR");
+	      st = build_gimple_modify_stmt (tmp,
+					     fold_convert (type, var));
+	      append_to_statement_list (st, &listafter);
+	      var = tmp;
+	    }
+
+	  sra_explode_bitfield_assignment
+	    (var, bitsize_int (0), false, &listafter, blen, bpos, elt);
+	}
+      else
+	sra_sync_for_bitfield_assignment
+	  (&listbefore, &listafter, blen, bpos, elt);
+
+      if (listbefore)
+	{
+	  mark_all_v_defs (listbefore);
+	  sra_insert_before (bsi, listbefore);
+	}
+      if (listafter)
+	{
+	  mark_all_v_defs (listafter);
+	  sra_insert_after (bsi, listafter);
+	}
+
+      if (update)
+	update_stmt (stmt);
+    }
+  else if (use_all && !is_output
+	   && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	   && TREE_CODE (bfexpr
+			 = GIMPLE_STMT_OPERAND (stmt, 1)) == BIT_FIELD_REF
+	   && &TREE_OPERAND (GIMPLE_STMT_OPERAND (stmt, 1), 0) == expr_p
+	   && INTEGRAL_TYPE_P (TREE_TYPE (bfexpr))
+	   && TREE_CODE (TREE_TYPE (*expr_p)) == RECORD_TYPE)
+    {
+      tree list = NULL;
+      tree blen = TREE_OPERAND (bfexpr, 1);
+      tree bpos = TREE_OPERAND (bfexpr, 2);
+      bool update = false;
+
+      if (!elt->use_block_copy)
+	{
+	  tree type = TREE_TYPE (bfexpr);
+	  tree var;
+
+	  if (!TYPE_UNSIGNED (type))
+	    type = unsigned_type_for (type);
+
+	  var = make_rename_temp (type, "SR");
+
+	  append_to_statement_list (build_gimple_modify_stmt
+				    (var, build_int_cst_wide (type, 0, 0)),
+				    &list);
+
+	  sra_explode_bitfield_assignment
+	    (var, bitsize_int (0), true, &list, blen, bpos, elt);
+
+	  GIMPLE_STMT_OPERAND (stmt, 1) = var;
+	  update = true;
+	}
+      else
+	sra_sync_for_bitfield_assignment
+	  (&list, NULL, blen, bpos, elt);
+
+      if (list)
+	{
+	  mark_all_v_defs (list);
+	  sra_insert_before (bsi, list);
+	}
+
+      if (update)
+	update_stmt (stmt);
+    }
   else
     {
-      /* Otherwise we need some copies.  If ELT is being read, then we want
-	 to store all (modified) sub-elements back into the structure before
-	 the reference takes place.  If ELT is being written, then we want to
-	 load the changed values back into our shadow variables.  */
+      tree list = NULL;
+
+      /* Otherwise we need some copies.  If ELT is being read, then we
+	 want to store all (modified) sub-elements back into the
+	 structure before the reference takes place.  If ELT is being
+	 written, then we want to load the changed values back into
+	 our shadow variables.  */
       /* ??? We don't check modified for reads, we just always write all of
 	 the values.  We should be able to record the SSA number of the VOP
 	 for which the values were last read.  If that number matches the
@@ -2101,7 +3164,7 @@ scalarize_copy (struct sra_elt *lhs_elt,
       gcc_assert (TREE_CODE (stmt) == GIMPLE_MODIFY_STMT);
 
       GIMPLE_STMT_OPERAND (stmt, 0) = lhs_elt->replacement;
-      GIMPLE_STMT_OPERAND (stmt, 1) = rhs_elt->replacement;
+      GIMPLE_STMT_OPERAND (stmt, 1) = REPLDUP (rhs_elt->replacement);
       update_stmt (stmt);
     }
   else if (lhs_elt->use_block_copy || rhs_elt->use_block_copy)
@@ -2255,20 +3318,31 @@ scalarize_ldst (struct sra_elt *elt, tre
 
       mark_all_v_defs (stmt);
       generate_copy_inout (elt, is_output, other, &list);
-      mark_all_v_defs (list);
       gcc_assert (list);
+      mark_all_v_defs (list);
 
       /* Preserve EH semantics.  */
       if (stmt_ends_bb_p (stmt))
 	{
 	  tree_stmt_iterator tsi;
-	  tree first;
+	  tree first, blist = NULL;
+	  bool thr = (bsi->bb->flags & EDGE_COMPLEX) != 0;
 
 	  /* Extract the first statement from LIST.  */
 	  tsi = tsi_start (list);
-	  first = tsi_stmt (tsi);
+	  for (first = tsi_stmt (tsi);
+	       thr && !tsi_end_p (tsi) && !tree_could_throw_p (first);
+	       first = tsi_stmt (tsi))
+	    {
+	      tsi_delink (&tsi);
+	      append_to_statement_list (first, &blist);
+	    }
+
 	  tsi_delink (&tsi);
 
+	  if (blist)
+	    sra_insert_before (bsi, blist);
+
 	  /* Replace the old statement with this new representative.  */
 	  bsi_replace (bsi, first, true);
 
@@ -2352,6 +3426,10 @@ dump_sra_elt_name (FILE *f, struct sra_e
 	    fputc ('.', f);
 	  print_generic_expr (f, elt->element, dump_flags);
 	}
+      else if (TREE_CODE (elt->element) == BIT_FIELD_REF)
+	fprintf (f, "$B" HOST_WIDE_INT_PRINT_DEC "F" HOST_WIDE_INT_PRINT_DEC,
+		 tree_low_cst (TREE_OPERAND (elt->element, 2), 1),
+		 tree_low_cst (TREE_OPERAND (elt->element, 1), 1));
       else if (TREE_CODE (elt->element) == RANGE_EXPR)
 	fprintf (f, "["HOST_WIDE_INT_PRINT_DEC".."HOST_WIDE_INT_PRINT_DEC"]",
 		 TREE_INT_CST_LOW (TREE_OPERAND (elt->element, 0)),

[-- Attachment #3: Type: text/plain, Size: 249 bytes --]


-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-05-31 20:57                                             ` Alexandre Oliva
@ 2007-06-02 17:41                                               ` Bernd Schmidt
  2007-06-06  3:18                                                 ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Bernd Schmidt @ 2007-06-02 17:41 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Roman Zippel, Diego Novillo, Daniel Berlin, GCC Patches,
	Andrew Pinski, Eric Botcazou

Alexandre Oliva wrote:

> It took me a while, but I've managed to address Roman's request to
> remove full-width BIT_FIELD_REFs (it required some fiddling with EH
> annotations),

Details?  Comments in the code would be best.

> to verify the absence of regressions in Andrew Pinski's
> testcase, and to handle your request for improving the copying of
> fields in and out of structs accessed with BIT_FIELD_REFs.
> 
> I've not only arranged for GCC to sync only the fields covered by the
> BIT_FIELD_REF, but also for it to try to explode the BIT_FIELD_REF
> access to accesses to the scalarized fields, should the containing
> struct be fully scalarized.  This will then enable the original struct
> to be thrown away even when it's accessed by BIT_FIELD_REFs.

Nice.

> So, instead of applying the previous patch and posting the
> differences, I figured it would be cleaner to post the whole thing
> again.  I hope you don't mind.  If you prefer, it wouldn't be too hard
> to split the cosmetic changes on top of the previous patch from the
> behavioral changes made along the way, but I'd rather not take the
> risk of messing things up by doing this.

It's up to you how you want to proceed.  If you want, you can install
the previous patch with bugfixes - I haven't had enough time yet to
fully understand the new code.  A few preliminary comments:

> +/* Return true if a BIT_FIELD_REF<(FLD->parent), BLEN, BPOS>
> +   expression (refereced as BF below) accesses any of the bits in FLD,
> +   false if it doesn't.  If DATA is non-null, the bit length of FLD
> +   will be stored in DATA[0] and the initial bit position of FLD will
> +   be stored in DATA[1], the number of bits of FLD that overlap with
> +   BF will be stored in DATA[2], and the initial bit of FLD that
> +   overlaps with BF will be stored in DATA[3].  */
> +
> +static bool
> +bitfield_overlaps_p (tree blen, tree bpos, struct sra_elt *fld, tree *data)

I'd rather have four separate, named variables.  This is too opaque for
the reader.

> +static void
> +sra_explode_bitfield_assignment (tree var, tree vpos, bool to_var,
> +				 tree *listp, tree blen, tree bpos,
> +				 struct sra_elt *elt)

> +static void
> +sra_sync_for_bitfield_assignment (tree *listbeforep, tree *listafterp,
> +				  tree blen, tree bpos,
> +				  struct sra_elt *elt)

No comments for these functions.

>        /* Preserve EH semantics.  */
>        if (stmt_ends_bb_p (stmt))
>  	{
>  	  tree_stmt_iterator tsi;
> -	  tree first;
> +	  tree first, blist = NULL;
> +	  bool thr = (bsi->bb->flags & EDGE_COMPLEX) != 0;
>  
>  	  /* Extract the first statement from LIST.  */
>  	  tsi = tsi_start (list);
> -	  first = tsi_stmt (tsi);
> +	  for (first = tsi_stmt (tsi);
> +	       thr && !tsi_end_p (tsi) && !tree_could_throw_p (first);
> +	       first = tsi_stmt (tsi))
> +	    {
> +	      tsi_delink (&tsi);
> +	      append_to_statement_list (first, &blist);
> +	    }
> +
>  	  tsi_delink (&tsi);
>  
> +	  if (blist)
> +	    sra_insert_before (bsi, blist);
> +

This could use more comments as to what's going on.


Bernd
-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-06-02 17:41                                               ` Bernd Schmidt
@ 2007-06-06  3:18                                                 ` Alexandre Oliva
  2007-06-25 18:48                                                   ` Alexandre Oliva
  2007-06-26 23:01                                                   ` Bernd Schmidt
  0 siblings, 2 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-06-06  3:18 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Roman Zippel, Diego Novillo, Daniel Berlin, GCC Patches,
	Andrew Pinski, Eric Botcazou

[-- Attachment #1: Type: text/plain, Size: 2082 bytes --]

On Jun  2, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:

> Alexandre Oliva wrote:
>> It took me a while, but I've managed to address Roman's request to
>> remove full-width BIT_FIELD_REFs (it required some fiddling with EH
>> annotations),

> Details?  Comments in the code would be best.

As for comments, see below.  It's just that we can't assume any longer
that the first instruction in a replacement sequence is the one
containing the memory access that may trap.

>> +/* Return true if a BIT_FIELD_REF<(FLD->parent), BLEN, BPOS>
>> +   expression (refereced as BF below) accesses any of the bits in FLD,
>> +   false if it doesn't.  If DATA is non-null, the bit length of FLD
>> +   will be stored in DATA[0] and the initial bit position of FLD will
>> +   be stored in DATA[1], the number of bits of FLD that overlap with
>> +   BF will be stored in DATA[2], and the initial bit of FLD that
>> +   overlaps with BF will be stored in DATA[3].  */
>> +
>> +static bool
>> +bitfield_overlaps_p (tree blen, tree bpos, struct sra_elt *fld, tree *data)

> I'd rather have four separate, named variables.  This is too opaque for
> the reader.

How about a single struct, as in the patch below?

>> +static void
>> +sra_explode_bitfield_assignment (tree var, tree vpos, bool to_var,
>> +				 tree *listp, tree blen, tree bpos,
>> +				 struct sra_elt *elt)

>> +static void
>> +sra_sync_for_bitfield_assignment (tree *listbeforep, tree *listafterp,
>> +				  tree blen, tree bpos,
>> +				  struct sra_elt *elt)

> No comments for these functions.

Ugh, sorry.  I even remember having started a pass over the patch
adding missing comments for new functions, but I must have got off
track.  Thanks for catching these.

>> /* Preserve EH semantics.  */

> This could use more comments as to what's going on.

Is what I added to this revised patch enough?


Bootstrapped and regtested on x86_64-linux-gnu.  Ok?

Even if it's approved, I probably won't check it in before Sunday,
since I'm going to be on the road until then, with good internet
access but little time for it :-(


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: gcc-sra-bit-field-ref-take2.patch --]
[-- Type: text/x-patch, Size: 46542 bytes --]

for gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR middle-end/22156
	* tree-sra.c (struct sra_elt): Add in_bitfld_block.
	(sra_hash_tree): Handle BIT_FIELD_REFs.
	(sra_elt_hash): Don't hash bitfld blocks.
	(sra_elt_eq): Skip them in parent compares as well.  Handle
	BIT_FIELD_REFs.
	(build_element_name_1): Handle BIT_FIELD_REFs.
	(instantiate_element): Propagate nowarn from parents.
	Gimple-zero-initialize all bit-field variables that are not
	part of parameters that are going to be scalarized on entry.
	(instantiate_missing_elements_1): Return the sra_elt.
	(canon_type_for_field): New.
	(try_instantiate_multiple_fields): New.  Infer widest possible
	access mode from decl or member type, but clip it at word
	size, and only widen it if a field crosses an alignment
	boundary.
	(instantiate_missing_elements): Use them.
	(generate_one_element_ref): Handle BIT_FIELD_REFs.
	(scalar_bitfield_p): New.
	(sra_build_assignment): Optimize assignments from scalarizable
	BIT_FIELD_REFs.  Use BITS_BIG_ENDIAN to determine shift
	counts.
	(REPLDUP): New.
	(sra_build_bf_assignment): New.  Optimize assignments to
	scalarizable BIT_FIELD_REFs.
	(sra_build_elt_assignment): New.  Optimize BIT_FIELD_REF
	assignments to full variables.
	(generate_copy_inout): Use the new macros and functions.
	(generate_element_copy): Likewise.  Handle bitfld differences.
	(generate_element_zero): Don't recurse for blocks.  Use
	sra_build_elt_assignment.
	(generate_one_element_init): Take elt instead of var.  Use
	sra_build_elt_assignment.
	(generate_element_init_1): Adjust.
	(bitfield_overlap_info): New struct.
	(bitfield_overlaps_p): New.
	(sra_explode_bitfield_assignment): New.
	(sra_sync_for_bitfield_assignment): New.
	(scalarize_use): Re-expand assignment to scalarized
	BIT_FIELD_REFs of gimple_regs.  Explode or sync needed members
	for BIT_FIELD_REFs accesses or assignments.  Use REPLDUP.
	(scalarize_copy): Use REPLDUP.
	(scalarize_ldst): Move assert before dereference.  Adjust EH
	handling.
	(dump_sra_elt_name): Handle BIT_FIELD_REFs.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-06-01 04:19:10.000000000 -0300
+++ gcc/tree-sra.c	2007-06-04 20:26:15.000000000 -0300
@@ -147,6 +147,10 @@ struct sra_elt
 
   /* True if there is BIT_FIELD_REF on the lhs with a vector. */
   bool is_vector_lhs;
+
+  /* 1 if the element is a field that is part of a block, 2 if the field
+     is the block itself, 0 if it's neither.  */
+  char in_bitfld_block;
 };
 
 #define IS_ELEMENT_FOR_GROUP(ELEMENT) (TREE_CODE (ELEMENT) == RANGE_EXPR)
@@ -205,6 +209,9 @@ extern void debug_sra_elt_name (struct s
 
 /* Forward declarations.  */
 static tree generate_element_ref (struct sra_elt *);
+static tree sra_build_assignment (tree dst, tree src);
+static void mark_all_v_defs (tree list);
+
 \f
 /* Return true if DECL is an SRA candidate.  */
 
@@ -461,6 +468,12 @@ sra_hash_tree (tree t)
       h = iterative_hash_expr (DECL_FIELD_BIT_OFFSET (t), h);
       break;
 
+    case BIT_FIELD_REF:
+      /* Don't take operand 0 into account, that's our parent.  */
+      h = iterative_hash_expr (TREE_OPERAND (t, 1), 0);
+      h = iterative_hash_expr (TREE_OPERAND (t, 2), h);
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -479,12 +492,14 @@ sra_elt_hash (const void *x)
 
   h = sra_hash_tree (e->element);
 
-  /* Take into account everything back up the chain.  Given that chain
-     lengths are rarely very long, this should be acceptable.  If we
-     truly identify this as a performance problem, it should work to
-     hash the pointer value "e->parent".  */
+  /* Take into account everything except bitfield blocks back up the
+     chain.  Given that chain lengths are rarely very long, this
+     should be acceptable.  If we truly identify this as a performance
+     problem, it should work to hash the pointer value
+     "e->parent".  */
   for (p = e->parent; p ; p = p->parent)
-    h = (h * 65521) ^ sra_hash_tree (p->element);
+    if (!p->in_bitfld_block)
+      h = (h * 65521) ^ sra_hash_tree (p->element);
 
   return h;
 }
@@ -497,8 +512,17 @@ sra_elt_eq (const void *x, const void *y
   const struct sra_elt *a = x;
   const struct sra_elt *b = y;
   tree ae, be;
+  const struct sra_elt *ap = a->parent;
+  const struct sra_elt *bp = b->parent;
+
+  if (ap)
+    while (ap->in_bitfld_block)
+      ap = ap->parent;
+  if (bp)
+    while (bp->in_bitfld_block)
+      bp = bp->parent;
 
-  if (a->parent != b->parent)
+  if (ap != bp)
     return false;
 
   ae = a->element;
@@ -533,6 +557,11 @@ sra_elt_eq (const void *x, const void *y
 	return false;
       return fields_compatible_p (ae, be);
 
+    case BIT_FIELD_REF:
+      return
+	tree_int_cst_equal (TREE_OPERAND (ae, 1), TREE_OPERAND (be, 1))
+	&& tree_int_cst_equal (TREE_OPERAND (ae, 2), TREE_OPERAND (be, 2));
+
     default:
       gcc_unreachable ();
     }
@@ -888,7 +917,7 @@ sra_walk_asm_expr (tree expr, block_stmt
 
 static void
 sra_walk_gimple_modify_stmt (tree expr, block_stmt_iterator *bsi,
-		      const struct sra_walk_fns *fns)
+			     const struct sra_walk_fns *fns)
 {
   struct sra_elt *lhs_elt, *rhs_elt;
   tree lhs, rhs;
@@ -1177,6 +1206,15 @@ build_element_name_1 (struct sra_elt *el
       sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (t));
       obstack_grow (&sra_obstack, buffer, strlen (buffer));
     }
+  else if (TREE_CODE (t) == BIT_FIELD_REF)
+    {
+      sprintf (buffer, "B" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 2), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+      sprintf (buffer, "F" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 1), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+    }
   else
     {
       tree name = DECL_NAME (t);
@@ -1209,9 +1247,11 @@ instantiate_element (struct sra_elt *elt
 {
   struct sra_elt *base_elt;
   tree var, base;
+  bool nowarn = TREE_NO_WARNING (elt->element);
 
   for (base_elt = elt; base_elt->parent; base_elt = base_elt->parent)
-    continue;
+    if (!nowarn)
+      nowarn = TREE_NO_WARNING (base_elt->parent->element);
   base = base_elt->element;
 
   elt->replacement = var = make_rename_temp (elt->type, "SR");
@@ -1240,9 +1280,7 @@ instantiate_element (struct sra_elt *elt
       DECL_DEBUG_EXPR_IS_FROM (var) = 1;
       
       DECL_IGNORED_P (var) = 0;
-      TREE_NO_WARNING (var) = TREE_NO_WARNING (base);
-      if (elt->element && TREE_NO_WARNING (elt->element))
-	TREE_NO_WARNING (var) = 1;
+      TREE_NO_WARNING (var) = nowarn;
     }
   else
     {
@@ -1251,6 +1289,20 @@ instantiate_element (struct sra_elt *elt
       TREE_NO_WARNING (var) = 1;
     }
 
+  /* Zero-initialize bit-field scalarization variables, to avoid
+     triggering undefined behavior, unless the base variable is an
+     argument that is going to get scalarized out as the first thing,
+     providing for the initializers we need.  */
+  if (TREE_CODE (elt->element) == BIT_FIELD_REF
+      && (TREE_CODE (base) != PARM_DECL
+	  || !bitmap_bit_p (needs_copy_in, DECL_UID (base))))
+    {
+      tree init = sra_build_assignment (var, fold_convert (TREE_TYPE (var),
+							   integer_zero_node));
+      insert_edge_copies (init, ENTRY_BLOCK_PTR);
+      mark_all_v_defs (init);
+    }
+
   if (dump_file)
     {
       fputs ("  ", dump_file);
@@ -1337,7 +1389,7 @@ sum_instantiated_sizes (struct sra_elt *
 
 static void instantiate_missing_elements (struct sra_elt *elt);
 
-static void
+static struct sra_elt *
 instantiate_missing_elements_1 (struct sra_elt *elt, tree child, tree type)
 {
   struct sra_elt *sub = lookup_element (elt, child, type, INSERT);
@@ -1348,6 +1400,303 @@ instantiate_missing_elements_1 (struct s
     }
   else
     instantiate_missing_elements (sub);
+  return sub;
+}
+
+/* Obtain the canonical type for field F of ELEMENT.  */
+
+static tree
+canon_type_for_field (tree f, tree element)
+{
+  tree field_type = TREE_TYPE (f);
+
+  /* canonicalize_component_ref() unwidens some bit-field types (not
+     marked as DECL_BIT_FIELD in C++), so we must do the same, lest we
+     may introduce type mismatches.  */
+  if (INTEGRAL_TYPE_P (field_type)
+      && DECL_MODE (f) != TYPE_MODE (field_type))
+    field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
+						   field_type,
+						   element,
+						   f, NULL_TREE),
+					   NULL_TREE));
+
+  return field_type;
+}
+
+/* Look for adjacent fields of ELT starting at F that we'd like to
+   scalarize as a single variable.  Return the last field of the
+   group.  */
+
+static tree
+try_instantiate_multiple_fields (struct sra_elt *elt, tree f)
+{
+  int count;
+  unsigned HOST_WIDE_INT align, bit, size, alchk;
+  enum machine_mode mode;
+  tree first = f, prev;
+  tree type, var;
+  struct sra_elt *block;
+
+  if (!is_sra_scalar_type (TREE_TYPE (f))
+      || !host_integerp (DECL_FIELD_OFFSET (f), 1)
+      || !host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+      || !host_integerp (DECL_SIZE (f), 1)
+      || lookup_element (elt, f, NULL, NO_INSERT))
+    return f;
+
+  block = elt;
+
+  /* For complex and array objects, there are going to be integer
+     literals as child elements.  In this case, we can't just take the
+     alignment and mode of the decl, so we instead rely on the element
+     type.
+
+     ??? We could try to infer additional alignment from the full
+     object declaration and the location of the sub-elements we're
+     accessing.  */
+  for (count = 0; !DECL_P (block->element); count++)
+    block = block->parent;
+
+  align = DECL_ALIGN (block->element);
+  alchk = GET_MODE_BITSIZE (DECL_MODE (block->element));
+
+  if (count)
+    {
+      type = TREE_TYPE (block->element);
+      while (count--)
+	type = TREE_TYPE (type);
+
+      align = TYPE_ALIGN (type);
+      alchk = GET_MODE_BITSIZE (TYPE_MODE (type));
+    }
+
+  if (align < alchk)
+    align = alchk;
+
+  /* Coalescing wider fields is probably pointless and
+     inefficient.  */
+  if (align > BITS_PER_WORD)
+    align = BITS_PER_WORD;
+
+  bit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+  size = tree_low_cst (DECL_SIZE (f), 1);
+
+  alchk = align - 1;
+  alchk = ~alchk;
+
+  if ((bit & alchk) != ((bit + size - 1) & alchk))
+    return f;
+
+  /* Find adjacent fields in the same alignment word.  */
+
+  for (prev = f, f = TREE_CHAIN (f);
+       f && TREE_CODE (f) == FIELD_DECL
+	 && is_sra_scalar_type (TREE_TYPE (f))
+	 && host_integerp (DECL_FIELD_OFFSET (f), 1)
+	 && host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+	 && host_integerp (DECL_SIZE (f), 1)
+	 && !lookup_element (elt, f, NULL, NO_INSERT);
+       prev = f, f = TREE_CHAIN (f))
+    {
+      unsigned HOST_WIDE_INT nbit, nsize;
+
+      nbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	+ tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+      nsize = tree_low_cst (DECL_SIZE (f), 1);
+
+      if (bit + size == nbit)
+	{
+	  if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+	    {
+	      /* If we're at an alignment boundary, don't bother
+		 growing alignment such that we can include this next
+		 field.  */
+	      if ((nbit & alchk)
+		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BITSIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+		break;
+	    }
+	  size += nsize;
+	}
+      else if (nbit + nsize == bit)
+	{
+	  if ((nbit & alchk) != ((bit + size - 1) & alchk))
+	    {
+	      if ((bit & alchk)
+		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BITSIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((nbit & alchk) != ((bit + size - 1) & alchk))
+		break;
+	    }
+	  bit = nbit;
+	  size += nsize;
+	}
+      else
+	break;
+    }
+
+  f = prev;
+
+  if (f == first)
+    return f;
+
+  gcc_assert ((bit & alchk) == ((bit + size - 1) & alchk));
+
+  /* Try to widen the bit range so as to cover padding bits as well.  */
+
+  if ((bit & ~alchk) || size != align)
+    {
+      unsigned HOST_WIDE_INT mbit = bit & alchk;
+      unsigned HOST_WIDE_INT msize = align;
+
+      for (f = TYPE_FIELDS (elt->type);
+	   f; f = TREE_CHAIN (f))
+	{
+	  unsigned HOST_WIDE_INT fbit, fsize;
+
+	  /* Skip the fields from first to prev.  */
+	  if (f == first)
+	    {
+	      f = prev;
+	      continue;
+	    }
+
+	  if (!(TREE_CODE (f) == FIELD_DECL
+		&& host_integerp (DECL_FIELD_OFFSET (f), 1)
+		&& host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)))
+	    continue;
+
+	  fbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+
+	  /* If we're past the selected word, we're fine.  */
+	  if ((bit & alchk) < (fbit & alchk))
+	    continue;
+
+	  if (host_integerp (DECL_SIZE (f), 1))
+	    fsize = tree_low_cst (DECL_SIZE (f), 1);
+	  else
+	    /* Assume a variable-sized field takes up all space till
+	       the end of the word.  ??? Endianness issues?  */
+	    fsize = align - (fbit & alchk);
+
+	  if ((fbit & alchk) < (bit & alchk))
+	    {
+	      /* A large field might start at a previous word and
+		 extend into the selected word.  Exclude those
+		 bits.  ??? Endianness issues? */
+	      HOST_WIDE_INT diff = fbit + fsize - mbit;
+
+	      if (diff <= 0)
+		continue;
+
+	      mbit += diff;
+	      msize -= diff;
+	    }
+	  else
+	    {
+	      /* Non-overlapping, great.  */
+	      if (fbit + fsize <= mbit
+		  || mbit + msize <= fbit)
+		continue;
+
+	      if (fbit <= mbit)
+		{
+		  unsigned HOST_WIDE_INT diff = fbit + fsize - mbit;
+		  mbit += diff;
+		  msize -= diff;
+		}
+	      else if (fbit > mbit)
+		msize -= (mbit + msize - fbit);
+	      else
+		gcc_unreachable ();
+	    }
+	}
+
+      bit = mbit;
+      size = msize;
+    }
+
+  /* Now we know the bit range we're interested in.  Find the smallest
+     machine mode we can use to access it.  */
+
+  for (mode = smallest_mode_for_size (size, MODE_INT);
+       ;
+       mode = GET_MODE_WIDER_MODE (mode))
+    {
+      gcc_assert (mode != VOIDmode);
+
+      alchk = GET_MODE_PRECISION (mode) - 1;
+      alchk = ~alchk;
+
+      if ((bit & alchk) == ((bit + size - 1) & alchk))
+	break;
+    }
+
+  gcc_assert (~alchk < align);
+
+  /* Create the field group as a single variable.  */
+
+  type = lang_hooks.types.type_for_mode (mode, 1);
+  gcc_assert (type);
+  var = build3 (BIT_FIELD_REF, type, NULL_TREE,
+		bitsize_int (size),
+		bitsize_int (bit));
+  BIT_FIELD_REF_UNSIGNED (var) = 1;
+
+  block = instantiate_missing_elements_1 (elt, var, type);
+  gcc_assert (block && block->is_scalar);
+
+  var = block->replacement;
+
+  if ((bit & ~alchk)
+      || (HOST_WIDE_INT)size != tree_low_cst (DECL_SIZE (var), 1))
+    {
+      block->replacement = build3 (BIT_FIELD_REF,
+				   TREE_TYPE (block->element), var,
+				   bitsize_int (size),
+				   bitsize_int (bit & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (block->replacement) = 1;
+    }
+
+  block->in_bitfld_block = 2;
+
+  /* Add the member fields to the group, such that they access
+     portions of the group variable.  */
+
+  for (f = first; f != TREE_CHAIN (prev); f = TREE_CHAIN (f))
+    {
+      tree field_type = canon_type_for_field (f, elt->element);
+      struct sra_elt *fld = lookup_element (block, f, field_type, INSERT);
+
+      gcc_assert (fld && fld->is_scalar && !fld->replacement);
+
+      fld->replacement = build3 (BIT_FIELD_REF, field_type, var,
+				 DECL_SIZE (f),
+				 bitsize_int
+				 ((TREE_INT_CST_LOW (DECL_FIELD_OFFSET (f))
+				   * BITS_PER_UNIT
+				   + (TREE_INT_CST_LOW
+				      (DECL_FIELD_BIT_OFFSET (f))))
+				  & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (fld->replacement) = TYPE_UNSIGNED (field_type);
+      fld->in_bitfld_block = 1;
+    }
+
+  return prev;
 }
 
 static void
@@ -1363,21 +1712,17 @@ instantiate_missing_elements (struct sra
 	for (f = TYPE_FIELDS (type); f ; f = TREE_CHAIN (f))
 	  if (TREE_CODE (f) == FIELD_DECL)
 	    {
-	      tree field_type = TREE_TYPE (f);
+	      tree last = try_instantiate_multiple_fields (elt, f);
 
-	      /* canonicalize_component_ref() unwidens some bit-field
-		 types (not marked as DECL_BIT_FIELD in C++), so we
-		 must do the same, lest we may introduce type
-		 mismatches.  */
-	      if (INTEGRAL_TYPE_P (field_type)
-		  && DECL_MODE (f) != TYPE_MODE (field_type))
-		field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
-							       field_type,
-							       elt->element,
-							       f, NULL_TREE),
-						       NULL_TREE));
+	      if (last != f)
+		{
+		  f = last;
+		  continue;
+		}
 
-	      instantiate_missing_elements_1 (elt, f, field_type);
+	      instantiate_missing_elements_1 (elt, f,
+					      canon_type_for_field
+					      (f, elt->element));
 	    }
 	break;
       }
@@ -1689,6 +2034,16 @@ generate_one_element_ref (struct sra_elt
       {
 	tree field = elt->element;
 
+	/* We can't test elt->in_bitfld_blk here because, when this is
+	   called from instantiate_element, we haven't set this field
+	   yet.  */
+	if (TREE_CODE (field) == BIT_FIELD_REF)
+	  {
+	    tree ret = unshare_expr (field);
+	    TREE_OPERAND (ret, 0) = base;
+	    return ret;
+	  }
+
 	/* Watch out for compatible records with differing field lists.  */
 	if (DECL_FIELD_CONTEXT (field) != TYPE_MAIN_VARIANT (TREE_TYPE (base)))
 	  field = find_compatible_field (TREE_TYPE (base), field);
@@ -1725,11 +2080,180 @@ generate_element_ref (struct sra_elt *el
     return elt->element;
 }
 
+/* Return true if BF is a bit-field that we can handle like a scalar.  */
+
+static bool
+scalar_bitfield_p (tree bf)
+{
+  return (TREE_CODE (bf) == BIT_FIELD_REF
+	  && (is_gimple_reg (TREE_OPERAND (bf, 0))
+	      || (TYPE_MODE (TREE_TYPE (TREE_OPERAND (bf, 0))) != BLKmode
+		  && (!TREE_SIDE_EFFECTS (TREE_OPERAND (bf, 0))
+		      || (GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE
+						       (TREE_OPERAND (bf, 0))))
+			  <= BITS_PER_WORD)))));
+}
+
 /* Create an assignment statement from SRC to DST.  */
 
 static tree
 sra_build_assignment (tree dst, tree src)
 {
+  /* Turning BIT_FIELD_REFs into bit operations enables other passes
+     to do a much better job at optimizing the code.  */
+  if (scalar_bitfield_p (src))
+    {
+      tree cst, cst2, mask, minshift, maxshift;
+      tree tmp, var, utype, stype;
+      tree list, stmt;
+      bool unsignedp = BIT_FIELD_REF_UNSIGNED (src);
+
+      var = TREE_OPERAND (src, 0);
+      cst = TREE_OPERAND (src, 2);
+      cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (src, 1),
+			 TREE_OPERAND (src, 2));
+
+      if (BITS_BIG_ENDIAN)
+	{
+	  maxshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst);
+	  minshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst2);
+	}
+      else
+	{
+	  maxshift = cst2;
+	  minshift = cst;
+	}
+
+      stype = TREE_TYPE (var);
+      if (!INTEGRAL_TYPE_P (stype))
+	stype = lang_hooks.types.type_for_size (TREE_INT_CST_LOW
+						(TYPE_SIZE (stype)), 1);
+      else if (!TYPE_UNSIGNED (stype))
+	stype = unsigned_type_for (stype);
+
+      utype = TREE_TYPE (dst);
+      if (!INTEGRAL_TYPE_P (utype))
+	utype = lang_hooks.types.type_for_size (TREE_INT_CST_LOW
+						(TYPE_SIZE (utype)), 1);
+      else if (!TYPE_UNSIGNED (utype))
+	utype = unsigned_type_for (utype);
+
+      list = NULL;
+
+      cst2 = size_binop (MINUS_EXPR, maxshift, minshift);
+      if (tree_int_cst_equal (cst2, TYPE_SIZE (utype)))
+	{
+	  unsignedp = true;
+	  mask = NULL_TREE;
+	}
+      else
+	{
+	  mask = build_int_cst_wide (utype, 1, 0);
+	  cst = int_const_binop (LSHIFT_EXPR, mask, cst2, true);
+	  mask = int_const_binop (MINUS_EXPR, cst, mask, true);
+	}
+
+      tmp = make_rename_temp (utype, "SR");
+      if (TYPE_MAIN_VARIANT (TREE_TYPE (var)) != TYPE_MAIN_VARIANT (stype))
+	{
+	  if (INTEGRAL_TYPE_P (TREE_TYPE (var)))
+	    stmt = build_gimple_modify_stmt (tmp,
+					     fold_convert (stype, var));
+	  else
+	    stmt = build_gimple_modify_stmt (tmp,
+					     fold_build1 (VIEW_CONVERT_EXPR,
+							  stype, var));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (!integer_zerop (minshift))
+	{
+	  tmp = make_rename_temp (stype, "SR");
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (RSHIFT_EXPR, stype,
+							var, minshift));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (TYPE_MAIN_VARIANT (utype) != TYPE_MAIN_VARIANT (stype))
+	{
+	  if (!mask && unsignedp
+	      && (TYPE_MAIN_VARIANT (utype)
+		  == TYPE_MAIN_VARIANT (TREE_TYPE (dst))))
+	    tmp = dst;
+	  else
+	    tmp = make_rename_temp (utype, "SR");
+
+	  stmt = build_gimple_modify_stmt (tmp, fold_convert (utype, var));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (mask)
+	{
+	  if (!unsignedp
+	      || (TYPE_MAIN_VARIANT (TREE_TYPE (dst))
+		  != TYPE_MAIN_VARIANT (utype)))
+	    tmp = make_rename_temp (utype, "SR");
+	  else
+	    tmp = dst;
+
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (BIT_AND_EXPR, utype,
+							var, mask));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (!unsignedp)
+	{
+	  tree signbit = int_const_binop (LSHIFT_EXPR,
+					  build_int_cst_wide (utype, 1, 0),
+					  size_binop (MINUS_EXPR, cst2,
+						      bitsize_int (1)),
+					  true);
+
+	  tmp = make_rename_temp (utype, "SR");
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (BIT_XOR_EXPR, utype,
+							var, signbit));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+
+	  if (TYPE_MAIN_VARIANT (TREE_TYPE (dst)) != TYPE_MAIN_VARIANT (utype))
+	    tmp = make_rename_temp (utype, "SR");
+	  else
+	    tmp = dst;
+
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (MINUS_EXPR, utype,
+							var, signbit));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (var != dst)
+	{
+	  if (INTEGRAL_TYPE_P (TREE_TYPE (dst)))
+	    var = fold_convert (TREE_TYPE (dst), var);
+	  else
+	    var = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (dst), var);
+
+	  stmt = build_gimple_modify_stmt (dst, var);
+	  append_to_statement_list (stmt, &list);
+	}
+
+      return list;
+    }
+
   /* It was hoped that we could perform some type sanity checking
      here, but since front-ends can emit accesses of fields in types
      different from their nominal types and copy structures containing
@@ -1741,6 +2265,242 @@ sra_build_assignment (tree dst, tree src
   return build_gimple_modify_stmt (dst, src);
 }
 
+/* BIT_FIELD_REFs must not be shared.  sra_build_elt_assignment()
+   takes care of assignments, but we must create copies for uses.  */
+#define REPLDUP(t) (TREE_CODE (t) != BIT_FIELD_REF ? (t) : unshare_expr (t))
+
+/* Emit an assignment from SRC to DST, but if DST is a scalarizable
+   BIT_FIELD_REF, turn it into bit operations.  */
+
+static tree
+sra_build_bf_assignment (tree dst, tree src)
+{
+  tree var, type, utype, tmp, tmp2, tmp3;
+  tree list, stmt;
+  tree cst, cst2, mask;
+  tree minshift, maxshift;
+
+  if (TREE_CODE (dst) != BIT_FIELD_REF)
+    return sra_build_assignment (dst, src);
+
+  var = TREE_OPERAND (dst, 0);
+
+  if (!scalar_bitfield_p (dst))
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  list = NULL;
+
+  cst = TREE_OPERAND (dst, 2);
+  cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (dst, 1),
+		     TREE_OPERAND (dst, 2));
+
+  if (BITS_BIG_ENDIAN)
+    {
+      maxshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst);
+      minshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst2);
+    }
+  else
+    {
+      maxshift = cst2;
+      minshift = cst;
+    }
+
+  type = TREE_TYPE (var);
+  if (!INTEGRAL_TYPE_P (type))
+    type = lang_hooks.types.type_for_size
+      (TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (var))), 1);
+  if (TYPE_UNSIGNED (type))
+    utype = type;
+  else
+    utype = unsigned_type_for (type);
+
+  mask = build_int_cst_wide (utype, 1, 0);
+  cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, true);
+  cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, true);
+  mask = int_const_binop (MINUS_EXPR, cst, cst2, true);
+  mask = fold_build1 (BIT_NOT_EXPR, utype, mask);
+
+  if (TYPE_MAIN_VARIANT (type) != TYPE_MAIN_VARIANT (TREE_TYPE (var))
+      && !integer_zerop (mask))
+    {
+      tmp = var;
+      if (!is_gimple_variable (tmp))
+	tmp = unshare_expr (var);
+
+      tmp2 = make_rename_temp (type, "SR");
+
+      if (INTEGRAL_TYPE_P (TREE_TYPE (var)))
+	stmt = build_gimple_modify_stmt (tmp2, fold_convert (type, tmp));
+      else
+	stmt = build_gimple_modify_stmt (tmp2, fold_build1 (VIEW_CONVERT_EXPR,
+							    type, tmp));
+      append_to_statement_list (stmt, &list);
+    }
+  else
+    tmp2 = var;
+
+  if (!integer_zerop (mask))
+    {
+      tmp = make_rename_temp (utype, "SR");
+      stmt = build_gimple_modify_stmt (tmp,
+				       fold_build2 (BIT_AND_EXPR, utype,
+						    tmp2, mask));
+      append_to_statement_list (stmt, &list);
+    }
+  else
+    tmp = mask;
+
+  if (is_gimple_reg (src) && INTEGRAL_TYPE_P (TREE_TYPE (src)))
+    tmp2 = src;
+  else if (INTEGRAL_TYPE_P (TREE_TYPE (src)))
+    {
+      tmp2 = make_rename_temp (unsigned_type_for (TREE_TYPE (src)), "SR");
+      stmt = sra_build_assignment (tmp2, src);
+      append_to_statement_list (stmt, &list);
+    }
+  else
+    {
+      tmp2 = make_rename_temp
+	(lang_hooks.types.type_for_size
+	 (TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (src))),
+	  1), "SR");
+      stmt = sra_build_assignment (tmp2, fold_build1 (VIEW_CONVERT_EXPR,
+						      TREE_TYPE (tmp2), src));
+      append_to_statement_list (stmt, &list);
+    }
+
+  if (!TYPE_UNSIGNED (TREE_TYPE (tmp2)))
+    {
+      tree ut = unsigned_type_for (TREE_TYPE (tmp2));
+      tmp3 = make_rename_temp (ut, "SR");
+      tmp2 = fold_convert (ut, tmp2);
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (TYPE_MAIN_VARIANT (TREE_TYPE (tmp2)) != TYPE_MAIN_VARIANT (type))
+    {
+      tmp3 = make_rename_temp (utype, "SR");
+      tmp2 = fold_convert (utype, tmp2);
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (!integer_zerop (minshift))
+    {
+      tmp3 = make_rename_temp (utype, "SR");
+      stmt = build_gimple_modify_stmt (tmp3,
+				       fold_build2 (LSHIFT_EXPR, utype,
+						    tmp2, minshift));
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (utype != TREE_TYPE (var))
+    tmp3 = make_rename_temp (utype, "SR");
+  else
+    tmp3 = var;
+  stmt = build_gimple_modify_stmt (tmp3,
+				   fold_build2 (BIT_IOR_EXPR, utype,
+						tmp, tmp2));
+  append_to_statement_list (stmt, &list);
+
+  if (tmp3 != var)
+    {
+      if (TREE_TYPE (var) == type)
+	stmt = build_gimple_modify_stmt (var,
+					 fold_convert (type, tmp3));
+      else
+	stmt = build_gimple_modify_stmt (var,
+					 fold_build1 (VIEW_CONVERT_EXPR,
+						      TREE_TYPE (var), tmp3));
+      append_to_statement_list (stmt, &list);
+    }
+
+  return list;
+}
+
+/* Expand an assignment of SRC to the scalarized representation of
+   ELT.  If it is a field group, try to widen the assignment to cover
+   the full variable.  */
+
+static tree
+sra_build_elt_assignment (struct sra_elt *elt, tree src)
+{
+  tree dst = elt->replacement;
+  tree var, tmp, cst, cst2, list, stmt;
+
+  if (TREE_CODE (dst) != BIT_FIELD_REF
+      || !elt->in_bitfld_block)
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  var = TREE_OPERAND (dst, 0);
+
+  /* Try to widen the assignment to the entire variable.
+     We need the source to be a BIT_FIELD_REF as well, such that, for
+     BIT_FIELD_REF<d,sz,dp> = BIT_FIELD_REF<s,sz,sp>,
+     by design, conditions are met such that we can turn it into
+     d = BIT_FIELD_REF<s,dw,sp-dp>.  */
+  if (elt->in_bitfld_block == 2
+      && TREE_CODE (src) == BIT_FIELD_REF)
+    {
+      cst = TYPE_SIZE (TREE_TYPE (var));
+      cst2 = size_binop (MINUS_EXPR, TREE_OPERAND (src, 2),
+			 TREE_OPERAND (dst, 2));
+
+      src = TREE_OPERAND (src, 0);
+
+      /* Avoid full-width bit-fields.  */
+      if (integer_zerop (cst2)
+	  && tree_int_cst_equal (cst, TYPE_SIZE (TREE_TYPE (src))))
+	{
+	  if (INTEGRAL_TYPE_P (TREE_TYPE (src))
+	      && !TYPE_UNSIGNED (TREE_TYPE (src)))
+	    src = fold_convert (unsigned_type_for (TREE_TYPE (src)), src);
+
+	  /* If a single conversion won't do, we'll need a statement
+	     list.  */
+	  if (TYPE_MAIN_VARIANT (TREE_TYPE (var))
+	      != TYPE_MAIN_VARIANT (TREE_TYPE (src)))
+	    {
+	      list = NULL;
+
+	      if (!INTEGRAL_TYPE_P (TREE_TYPE (src))
+		  || !TYPE_UNSIGNED (TREE_TYPE (src)))
+		src = fold_build1 (VIEW_CONVERT_EXPR,
+				   lang_hooks.types.type_for_size
+				   (TREE_INT_CST_LOW
+				    (TYPE_SIZE (TREE_TYPE (src))),
+				    1), src);
+
+	      tmp = make_rename_temp (TREE_TYPE (src), "SR");
+	      stmt = build_gimple_modify_stmt (tmp, src);
+	      append_to_statement_list (stmt, &list);
+
+	      stmt = sra_build_assignment (var,
+					   fold_convert (TREE_TYPE (var),
+							 tmp));
+	      append_to_statement_list (stmt, &list);
+
+	      return list;
+	    }
+
+	  src = fold_convert (TREE_TYPE (var), src);
+	}
+      else
+	{
+	  src = fold_build3 (BIT_FIELD_REF, TREE_TYPE (var), src, cst, cst2);
+	  BIT_FIELD_REF_UNSIGNED (src) = 1;
+	}
+
+      return sra_build_assignment (var, src);
+    }
+
+  return sra_build_bf_assignment (dst, src);
+}
+
 /* Generate a set of assignment statements in *LIST_P to copy all
    instantiated elements under ELT to or from the equivalent structure
    rooted at EXPR.  COPY_OUT controls the direction of the copy, with
@@ -1764,16 +2524,16 @@ generate_copy_inout (struct sra_elt *elt
       i = c->replacement;
 
       t = build2 (COMPLEX_EXPR, elt->type, r, i);
-      t = sra_build_assignment (expr, t);
+      t = sra_build_bf_assignment (expr, t);
       SSA_NAME_DEF_STMT (expr) = t;
       append_to_statement_list (t, list_p);
     }
   else if (elt->replacement)
     {
       if (copy_out)
-	t = sra_build_assignment (elt->replacement, expr);
+	t = sra_build_elt_assignment (elt, expr);
       else
-	t = sra_build_assignment (expr, elt->replacement);
+	t = sra_build_bf_assignment (expr, REPLDUP (elt->replacement));
       append_to_statement_list (t, list_p);
     }
   else
@@ -1798,6 +2558,19 @@ generate_element_copy (struct sra_elt *d
   FOR_EACH_ACTUAL_CHILD (dc, dst)
     {
       sc = lookup_element (src, dc->element, NULL, NO_INSERT);
+      if (!sc && dc->in_bitfld_block == 2)
+	{
+	  struct sra_elt *dcs;
+
+	  FOR_EACH_ACTUAL_CHILD (dcs, dc)
+	    {
+	      sc = lookup_element (src, dcs->element, NULL, NO_INSERT);
+	      gcc_assert (sc);
+	      generate_element_copy (dcs, sc, list_p);
+	    }
+
+	  continue;
+	}
       gcc_assert (sc);
       generate_element_copy (dc, sc, list_p);
     }
@@ -1808,7 +2581,7 @@ generate_element_copy (struct sra_elt *d
 
       gcc_assert (src->replacement);
 
-      t = sra_build_assignment (dst->replacement, src->replacement);
+      t = sra_build_elt_assignment (dst, REPLDUP (src->replacement));
       append_to_statement_list (t, list_p);
     }
 }
@@ -1829,8 +2602,9 @@ generate_element_zero (struct sra_elt *e
       return;
     }
 
-  FOR_EACH_ACTUAL_CHILD (c, elt)
-    generate_element_zero (c, list_p);
+  if (!elt->in_bitfld_block)
+    FOR_EACH_ACTUAL_CHILD (c, elt)
+      generate_element_zero (c, list_p);
 
   if (elt->replacement)
     {
@@ -1839,7 +2613,7 @@ generate_element_zero (struct sra_elt *e
       gcc_assert (elt->is_scalar);
       t = fold_convert (elt->type, integer_zero_node);
 
-      t = sra_build_assignment (elt->replacement, t);
+      t = sra_build_elt_assignment (elt, t);
       append_to_statement_list (t, list_p);
     }
 }
@@ -1848,10 +2622,10 @@ generate_element_zero (struct sra_elt *e
    Add the result to *LIST_P.  */
 
 static void
-generate_one_element_init (tree var, tree init, tree *list_p)
+generate_one_element_init (struct sra_elt *elt, tree init, tree *list_p)
 {
   /* The replacement can be almost arbitrarily complex.  Gimplify.  */
-  tree stmt = sra_build_assignment (var, init);
+  tree stmt = sra_build_elt_assignment (elt, init);
   gimplify_and_add (stmt, list_p);
 }
 
@@ -1880,7 +2654,7 @@ generate_element_init_1 (struct sra_elt 
     {
       if (elt->replacement)
 	{
-	  generate_one_element_init (elt->replacement, init, list_p);
+	  generate_one_element_init (elt, init, list_p);
 	  elt->visited = true;
 	}
       return result;
@@ -2033,6 +2807,214 @@ sra_replace (block_stmt_iterator *bsi, t
     bsi_prev (bsi);
 }
 
+/* Data structure that bitfield_overlaps_p fills in with information
+   about the element passed in and how much of it overlaps with the
+   bit-range passed to it.  */
+
+struct bitfield_overlap_info
+{
+  /* The bit-length of an element.  */
+  tree field_len;
+
+  /* The bit-position of the element in its parent.  */
+  tree field_pos;
+
+  /* The number of bits of the element that overlap with the incoming
+     bit range.  */
+  tree overlap_len;
+
+  /* The first bit of the element that overlaps with the incoming bit
+     range.  */
+  tree overlap_pos;
+};
+
+/* Return true if a BIT_FIELD_REF<(FLD->parent), BLEN, BPOS>
+   expression (referenced as BF below) accesses any of the bits in FLD,
+   false if it doesn't.  If DATA is non-null, its field_len and
+   field_pos are filled in such that BIT_FIELD_REF<(FLD->parent),
+   field_len, field_pos> (referenced as BFLD below) represents the
+   entire field FLD->element, and BIT_FIELD_REF<BFLD, overlap_len,
+   overlap_pos> represents the portion of the entire field that
+   overlaps with BF.  */
+
+static bool
+bitfield_overlaps_p (tree blen, tree bpos, struct sra_elt *fld,
+		     struct bitfield_overlap_info *data)
+{
+  tree flen, fpos;
+  bool ret;
+
+  if (TREE_CODE (fld->element) == FIELD_DECL)
+    {
+      flen = DECL_SIZE (fld->element);
+      fpos = DECL_FIELD_OFFSET (fld->element);
+      fpos = int_const_binop (MULT_EXPR, fpos,
+			      bitsize_int (BITS_PER_UNIT), true);
+      fpos = int_const_binop (PLUS_EXPR, fpos,
+			      DECL_FIELD_BIT_OFFSET (fld->element), true);
+    }
+  else if (TREE_CODE (fld->element) == BIT_FIELD_REF)
+    {
+      flen = TREE_OPERAND (fld->element, 1);
+      fpos = TREE_OPERAND (fld->element, 2);
+    }
+  else
+    gcc_unreachable ();
+
+  gcc_assert (host_integerp (blen, 1)
+	      && host_integerp (bpos, 1)
+	      && host_integerp (flen, 1)
+	      && host_integerp (fpos, 1));
+
+  ret = ((!tree_int_cst_lt (fpos, bpos)
+	  && tree_int_cst_lt (int_const_binop (MINUS_EXPR, fpos, bpos, true),
+			      blen))
+	 || (!tree_int_cst_lt (bpos, fpos)
+	     && tree_int_cst_lt (int_const_binop (MINUS_EXPR, bpos, fpos, true),
+				 flen)));
+
+  if (!ret)
+    return ret;
+
+  if (data)
+    {
+      tree bend, fend;
+
+      data->field_len = flen;
+      data->field_pos = fpos;
+
+      fend = int_const_binop (PLUS_EXPR, fpos, flen, true);
+      bend = int_const_binop (PLUS_EXPR, bpos, blen, true);
+
+      if (tree_int_cst_lt (bend, fend))
+	data->overlap_len = int_const_binop (MINUS_EXPR, bend, fpos, true);
+      else
+	data->overlap_len = NULL;
+
+      if (tree_int_cst_lt (fpos, bpos))
+	{
+	  data->overlap_pos = int_const_binop (MINUS_EXPR, bpos, fpos, true);
+	  data->overlap_len = int_const_binop (MINUS_EXPR,
+					       data->overlap_len
+					       ? data->overlap_len
+					       : data->field_len,
+					       data->overlap_pos, true);
+	}
+      else
+	data->overlap_pos = NULL;
+    }
+
+  return ret;
+}
+
+/* Add to LISTP a sequence of statements that copies BLEN bits between
+   VAR and the scalarized elements of ELT, starting at bit VPOS of VAR
+   and at bit BPOS of ELT.  The direction of the copy is given by
+   TO_VAR.  */
+
+static void
+sra_explode_bitfield_assignment (tree var, tree vpos, bool to_var,
+				 tree *listp, tree blen, tree bpos,
+				 struct sra_elt *elt)
+{
+  struct sra_elt *fld;
+  struct bitfield_overlap_info flp;
+
+  FOR_EACH_ACTUAL_CHILD (fld, elt)
+    {
+      tree flen, fpos;
+
+      if (!bitfield_overlaps_p (blen, bpos, fld, &flp))
+	continue;
+
+      flen = flp.overlap_len ? flp.overlap_len : flp.field_len;
+      fpos = flp.overlap_pos ? flp.overlap_pos : bitsize_int (0);
+
+      if (fld->replacement)
+	{
+	  tree infld, invar, st;
+
+	  infld = fld->replacement;
+
+	  if (TREE_CODE (infld) == BIT_FIELD_REF)
+	    {
+	      fpos = int_const_binop (PLUS_EXPR, fpos, TREE_OPERAND (infld, 2),
+				      true);
+	      infld = TREE_OPERAND (infld, 0);
+	    }
+
+	  infld = fold_build3 (BIT_FIELD_REF,
+			       lang_hooks.types.type_for_size
+			       (TREE_INT_CST_LOW (flen), 1),
+			       infld, flen, fpos);
+	  BIT_FIELD_REF_UNSIGNED (infld) = 1;
+
+	  invar = int_const_binop (MINUS_EXPR, flp.field_pos, bpos, true);
+	  if (flp.overlap_pos)
+	    invar = int_const_binop (PLUS_EXPR, invar, flp.overlap_pos, true);
+	  invar = int_const_binop (PLUS_EXPR, invar, vpos, true);
+
+	  invar = fold_build3 (BIT_FIELD_REF, TREE_TYPE (infld),
+			       var, flen, invar);
+	  BIT_FIELD_REF_UNSIGNED (invar) = 1;
+
+	  if (to_var)
+	    st = sra_build_bf_assignment (invar, infld);
+	  else
+	    st = sra_build_bf_assignment (infld, invar);
+
+	  append_to_statement_list (st, listp);
+	}
+      else
+	{
+	  tree sub = int_const_binop (MINUS_EXPR, flp.field_pos, bpos, true);
+	  sub = int_const_binop (PLUS_EXPR, vpos, sub, true);
+	  if (flp.overlap_pos)
+	    sub = int_const_binop (PLUS_EXPR, sub, flp.overlap_pos, true);
+
+	  sra_explode_bitfield_assignment (var, sub, to_var, listp,
+					   flen, fpos, fld);
+	}
+    }
+}
+
+/* Add to LISTBEFOREP statements that copy scalarized members of ELT
+   that overlap with BIT_FIELD_REF<(ELT->element), BLEN, BPOS> back
+   into the full variable, and to LISTAFTERP, if non-NULL, statements
+   that copy the (presumably modified) overlapping portions of the
+   full variable back to the scalarized variables.  */
+
+static void
+sra_sync_for_bitfield_assignment (tree *listbeforep, tree *listafterp,
+				  tree blen, tree bpos,
+				  struct sra_elt *elt)
+{
+  struct sra_elt *fld;
+  struct bitfield_overlap_info flp;
+
+  FOR_EACH_ACTUAL_CHILD (fld, elt)
+    if (bitfield_overlaps_p (blen, bpos, fld, &flp))
+      {
+	if (fld->replacement || (!flp.overlap_len && !flp.overlap_pos))
+	  {
+	    generate_copy_inout (fld, false, generate_element_ref (fld),
+				 listbeforep);
+	    mark_no_warning (fld);
+	    if (listafterp)
+	      generate_copy_inout (fld, true, generate_element_ref (fld),
+				   listafterp);
+	  }
+	else
+	  {
+	    tree flen = flp.overlap_len ? flp.overlap_len : flp.field_len;
+	    tree fpos = flp.overlap_pos ? flp.overlap_pos : bitsize_int (0);
+
+	    sra_sync_for_bitfield_assignment (listbeforep, listafterp,
+					      flen, fpos, fld);
+	  }
+      }
+}
+
 /* Scalarize a USE.  To recap, this is either a simple reference to ELT,
    if elt is scalar, or some occurrence of ELT that requires a complete
    aggregate.  IS_OUTPUT is true if ELT is being modified.  */
@@ -2041,23 +3023,144 @@ static void
 scalarize_use (struct sra_elt *elt, tree *expr_p, block_stmt_iterator *bsi,
 	       bool is_output, bool use_all)
 {
-  tree list = NULL, stmt = bsi_stmt (*bsi);
+  tree stmt = bsi_stmt (*bsi);
+  tree bfexpr;
 
   if (elt->replacement)
     {
       /* If we have a replacement, then updating the reference is as
 	 simple as modifying the existing statement in place.  */
       if (is_output)
-	mark_all_v_defs (stmt);
-      *expr_p = elt->replacement;
+	{
+	  if (TREE_CODE (elt->replacement) == BIT_FIELD_REF
+	      && is_gimple_reg (TREE_OPERAND (elt->replacement, 0))
+	      && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	      && &GIMPLE_STMT_OPERAND (stmt, 0) == expr_p)
+	    {
+	      tree newstmt = sra_build_elt_assignment
+		(elt, GIMPLE_STMT_OPERAND (stmt, 1));
+	      if (TREE_CODE (newstmt) != STATEMENT_LIST)
+		{
+		  tree list = NULL;
+		  append_to_statement_list (newstmt, &list);
+		  newstmt = list;
+		}
+	      sra_replace (bsi, newstmt);
+	      return;
+	    }
+
+	  mark_all_v_defs (stmt);
+	}
+      *expr_p = REPLDUP (elt->replacement);
       update_stmt (stmt);
     }
+  else if (use_all && is_output
+	   && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	   && TREE_CODE (bfexpr
+			 = GIMPLE_STMT_OPERAND (stmt, 0)) == BIT_FIELD_REF
+	   && &TREE_OPERAND (bfexpr, 0) == expr_p
+	   && INTEGRAL_TYPE_P (TREE_TYPE (bfexpr))
+	   && TREE_CODE (TREE_TYPE (*expr_p)) == RECORD_TYPE)
+    {
+      tree listbefore = NULL, listafter = NULL;
+      tree blen = TREE_OPERAND (bfexpr, 1);
+      tree bpos = TREE_OPERAND (bfexpr, 2);
+      bool update = false;
+
+      if (!elt->use_block_copy)
+	{
+	  tree type = TREE_TYPE (bfexpr);
+	  tree var = make_rename_temp (type, "SR"), tmp, st;
+
+	  GIMPLE_STMT_OPERAND (stmt, 0) = var;
+	  update = true;
+
+	  if (!TYPE_UNSIGNED (type))
+	    {
+	      type = unsigned_type_for (type);
+	      tmp = make_rename_temp (type, "SR");
+	      st = build_gimple_modify_stmt (tmp,
+					     fold_convert (type, var));
+	      append_to_statement_list (st, &listafter);
+	      var = tmp;
+	    }
+
+	  sra_explode_bitfield_assignment
+	    (var, bitsize_int (0), false, &listafter, blen, bpos, elt);
+	}
+      else
+	sra_sync_for_bitfield_assignment
+	  (&listbefore, &listafter, blen, bpos, elt);
+
+      if (listbefore)
+	{
+	  mark_all_v_defs (listbefore);
+	  sra_insert_before (bsi, listbefore);
+	}
+      if (listafter)
+	{
+	  mark_all_v_defs (listafter);
+	  sra_insert_after (bsi, listafter);
+	}
+
+      if (update)
+	update_stmt (stmt);
+    }
+  else if (use_all && !is_output
+	   && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	   && TREE_CODE (bfexpr
+			 = GIMPLE_STMT_OPERAND (stmt, 1)) == BIT_FIELD_REF
+	   && &TREE_OPERAND (GIMPLE_STMT_OPERAND (stmt, 1), 0) == expr_p
+	   && INTEGRAL_TYPE_P (TREE_TYPE (bfexpr))
+	   && TREE_CODE (TREE_TYPE (*expr_p)) == RECORD_TYPE)
+    {
+      tree list = NULL;
+      tree blen = TREE_OPERAND (bfexpr, 1);
+      tree bpos = TREE_OPERAND (bfexpr, 2);
+      bool update = false;
+
+      if (!elt->use_block_copy)
+	{
+	  tree type = TREE_TYPE (bfexpr);
+	  tree var;
+
+	  if (!TYPE_UNSIGNED (type))
+	    type = unsigned_type_for (type);
+
+	  var = make_rename_temp (type, "SR");
+
+	  append_to_statement_list (build_gimple_modify_stmt
+				    (var, build_int_cst_wide (type, 0, 0)),
+				    &list);
+
+	  sra_explode_bitfield_assignment
+	    (var, bitsize_int (0), true, &list, blen, bpos, elt);
+
+	  GIMPLE_STMT_OPERAND (stmt, 1) = var;
+	  update = true;
+	}
+      else
+	sra_sync_for_bitfield_assignment
+	  (&list, NULL, blen, bpos, elt);
+
+      if (list)
+	{
+	  mark_all_v_defs (list);
+	  sra_insert_before (bsi, list);
+	}
+
+      if (update)
+	update_stmt (stmt);
+    }
   else
     {
-      /* Otherwise we need some copies.  If ELT is being read, then we want
-	 to store all (modified) sub-elements back into the structure before
-	 the reference takes place.  If ELT is being written, then we want to
-	 load the changed values back into our shadow variables.  */
+      tree list = NULL;
+
+      /* Otherwise we need some copies.  If ELT is being read, then we
+	 want to store all (modified) sub-elements back into the
+	 structure before the reference takes place.  If ELT is being
+	 written, then we want to load the changed values back into
+	 our shadow variables.  */
       /* ??? We don't check modified for reads, we just always write all of
 	 the values.  We should be able to record the SSA number of the VOP
 	 for which the values were last read.  If that number matches the
@@ -2101,7 +3204,7 @@ scalarize_copy (struct sra_elt *lhs_elt,
       gcc_assert (TREE_CODE (stmt) == GIMPLE_MODIFY_STMT);
 
       GIMPLE_STMT_OPERAND (stmt, 0) = lhs_elt->replacement;
-      GIMPLE_STMT_OPERAND (stmt, 1) = rhs_elt->replacement;
+      GIMPLE_STMT_OPERAND (stmt, 1) = REPLDUP (rhs_elt->replacement);
       update_stmt (stmt);
     }
   else if (lhs_elt->use_block_copy || rhs_elt->use_block_copy)
@@ -2255,20 +3358,41 @@ scalarize_ldst (struct sra_elt *elt, tre
 
       mark_all_v_defs (stmt);
       generate_copy_inout (elt, is_output, other, &list);
-      mark_all_v_defs (list);
       gcc_assert (list);
+      mark_all_v_defs (list);
 
       /* Preserve EH semantics.  */
       if (stmt_ends_bb_p (stmt))
 	{
 	  tree_stmt_iterator tsi;
-	  tree first;
+	  tree first, blist = NULL;
+	  bool thr = (bsi->bb->flags & EDGE_COMPLEX) != 0;
 
-	  /* Extract the first statement from LIST.  */
+	  /* If the last statement of this BB created an EH edge
+	     before scalarization, we have to locate the first
+	     statement that can throw in the new statement list and
+	     use that as the last statement of this BB, such that EH
+	     semantics is preserved.  All statements up to this one
+	     are added to the same BB.  All other statements in the
+	     list will be added to normal outgoing edges of the same
+	     BB.  If they access any memory, it's the same memory, so
+	     we can assume they won't throw.  */
 	  tsi = tsi_start (list);
-	  first = tsi_stmt (tsi);
+	  for (first = tsi_stmt (tsi);
+	       thr && !tsi_end_p (tsi) && !tree_could_throw_p (first);
+	       first = tsi_stmt (tsi))
+	    {
+	      tsi_delink (&tsi);
+	      append_to_statement_list (first, &blist);
+	    }
+
+	  /* Extract the first remaining statement from LIST, this is
+	     the EH statement if there is one.  */
 	  tsi_delink (&tsi);
 
+	  if (blist)
+	    sra_insert_before (bsi, blist);
+
 	  /* Replace the old statement with this new representative.  */
 	  bsi_replace (bsi, first, true);
 
@@ -2352,6 +3476,10 @@ dump_sra_elt_name (FILE *f, struct sra_e
 	    fputc ('.', f);
 	  print_generic_expr (f, elt->element, dump_flags);
 	}
+      else if (TREE_CODE (elt->element) == BIT_FIELD_REF)
+	fprintf (f, "$B" HOST_WIDE_INT_PRINT_DEC "F" HOST_WIDE_INT_PRINT_DEC,
+		 tree_low_cst (TREE_OPERAND (elt->element, 2), 1),
+		 tree_low_cst (TREE_OPERAND (elt->element, 1), 1));
       else if (TREE_CODE (elt->element) == RANGE_EXPR)
 	fprintf (f, "["HOST_WIDE_INT_PRINT_DEC".."HOST_WIDE_INT_PRINT_DEC"]",
 		 TREE_INT_CST_LOW (TREE_OPERAND (elt->element, 0)),

[-- Attachment #3: Type: text/plain, Size: 249 bytes --]


-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-06-06  3:18                                                 ` Alexandre Oliva
@ 2007-06-25 18:48                                                   ` Alexandre Oliva
  2007-06-26 23:01                                                   ` Bernd Schmidt
  1 sibling, 0 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-06-25 18:48 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Roman Zippel, Diego Novillo, Daniel Berlin, GCC Patches,
	Andrew Pinski, Eric Botcazou

On Jun  6, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:

> Bootstrapped and regtested on x86_64-linux-gnu.  Ok?

> Even if it's approved, I probably won't check it in before Sunday,
> since I'm going to be on the road until then, with good internet
> access but little time for it :-(

> for gcc/ChangeLog
> from  Alexandre Oliva  <aoliva@redhat.com>

> 	PR middle-end/22156
> 	* tree-sra.c (struct sra_elt): Add in_bitfld_block.
> 	(sra_hash_tree): Handle BIT_FIELD_REFs.
> 	(sra_elt_hash): Don't hash bitfld blocks.
> 	(sra_elt_eq): Skip them in parent compares as well.  Handle
> 	BIT_FIELD_REFs.
> 	(build_element_name_1): Handle BIT_FIELD_REFs.
> 	(instantiate_element): Propagate nowarn from parents.
> 	Gimple-zero-initialize all bit-field variables that are not
> 	part of parameters that are going to be scalarized on entry.
> 	(instantiate_missing_elements_1): Return the sra_elt.
> 	(canon_type_for_field): New.
> 	(try_instantiate_multiple_fields): New.  Infer widest possible
> 	access mode from decl or member type, but clip it at word
> 	size, and only widen it if a field crosses an alignment
> 	boundary.
> 	(instantiate_missing_elements): Use them.
> 	(generate_one_element_ref): Handle BIT_FIELD_REFs.
> 	(scalar_bitfield_p): New.
> 	(sra_build_assignment): Optimize assignments from scalarizable
> 	BIT_FIELD_REFs.  Use BITS_BIG_ENDIAN to determine shift
> 	counts.
> 	(REPLDUP): New.
> 	(sra_build_bf_assignment): New.  Optimize assignments to
> 	scalarizable BIT_FIELD_REFs.
> 	(sra_build_elt_assignment): New.  Optimize BIT_FIELD_REF
> 	assignments to full variables.
> 	(generate_copy_inout): Use the new macros and functions.
> 	(generate_element_copy): Likewise.  Handle bitfld differences.
> 	(generate_element_zero): Don't recurse for blocks.  Use
> 	sra_build_elt_assignment.
> 	(generate_one_element_init): Take elt instead of var.  Use
> 	sra_build_elt_assignment.
> 	(generate_element_init_1): Adjust.
> 	(bitfield_overlap_info): New struct.
> 	(bitfield_overlaps_p): New.
> 	(sra_explode_bitfield_assignment): New.
> 	(sra_sync_for_bitfield_assignment): New.
> 	(scalarize_use): Re-expand assignment to scalarized
> 	BIT_FIELD_REFs of gimple_regs.  Explode or sync needed members
> 	for BIT_FIELD_REFs accesses or assignments.  Use REPLDUP.
> 	(scalarize_copy): Use REPLDUP.
> 	(scalarize_ldst): Move assert before dereference.  Adjust EH
> 	handling.
> 	(dump_sra_elt_name): Handle BIT_FIELD_REFs.

:ADDPATCH tree-sra:

Ping?

http://gcc.gnu.org/ml/gcc-patches/2007-06/msg00300.html

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-06-06  3:18                                                 ` Alexandre Oliva
  2007-06-25 18:48                                                   ` Alexandre Oliva
@ 2007-06-26 23:01                                                   ` Bernd Schmidt
  2007-06-28  4:50                                                     ` Alexandre Oliva
  1 sibling, 1 reply; 104+ messages in thread
From: Bernd Schmidt @ 2007-06-26 23:01 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Roman Zippel, Diego Novillo, Daniel Berlin, GCC Patches,
	Andrew Pinski, Eric Botcazou

Alexandre Oliva wrote:
> Bootstrapped and regtested on x86_64-linux-gnu.  Ok?

This fails as below.  At this point I think I'd prefer if we could apply 
the earlier stable patch first and then do enhancements on top of it.


Bernd

  Native configuration is i686-pc-linux-gnu

                 === gcc tests ===
@@ -31,6 +31,25 @@
  Running /local/src/egcs/trunk/gcc/testsuite/gcc.c-torture/compile/compile.exp ...
  Running /local/src/egcs/trunk/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp ...
  Running /local/src/egcs/trunk/gcc/testsuite/gcc.c-torture/execute/execute.exp ...
+FAIL: gcc.c-torture/execute/20031211-1.c compilation,  -O1  (internal compiler error)
+FAIL: gcc.c-torture/execute/20031211-1.c compilation,  -O2  (internal compiler error)
+FAIL: gcc.c-torture/execute/20031211-1.c compilation,  -O3 -fomit-frame-pointer  (internal compiler error)
+FAIL: gcc.c-torture/execute/20031211-1.c compilation,  -O3 -g  (internal compiler error)
+FAIL: gcc.c-torture/execute/20031211-1.c compilation,  -Os  (internal compiler error)
+FAIL: gcc.c-torture/execute/20040709-1.c compilation,  -O3 -fomit-frame-pointer  (internal compiler error)
+FAIL: gcc.c-torture/execute/20040709-1.c compilation,  -O3 -fomit-frame-pointer -funroll-loops  (internal compiler error)
+FAIL: gcc.c-torture/execute/20040709-1.c compilation,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  (internal compiler error)
+FAIL: gcc.c-torture/execute/20040709-1.c compilation,  -O3 -g  (internal compiler error)
+FAIL: gcc.c-torture/execute/20040709-2.c compilation,  -O1  (internal compiler error)
+FAIL: gcc.c-torture/execute/20040709-2.c compilation,  -O2  (internal compiler error)
+FAIL: gcc.c-torture/execute/20040709-2.c compilation,  -O3 -fomit-frame-pointer  (internal compiler error)
+FAIL: gcc.c-torture/execute/20040709-2.c compilation,  -O3 -fomit-frame-pointer -funroll-loops  (internal compiler error)
+FAIL: gcc.c-torture/execute/20040709-2.c compilation,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  (internal compiler error)
+FAIL: gcc.c-torture/execute/20040709-2.c compilation,  -O3 -g  (internal compiler error)
+FAIL: gcc.c-torture/execute/20040709-2.c compilation,  -Os  (internal compiler error)
+FAIL: gcc.c-torture/execute/bf64-1.c compilation,  -O3 -fomit-frame-pointer  (internal compiler error)
+FAIL: gcc.c-torture/execute/bf64-1.c compilation,  -O3 -g  (internal compiler error)
+FAIL: gcc.c-torture/execute/bf64-1.c compilation,  -Os  (internal compiler error)

-- 
This footer brought to you by insane German lawmakers.
Analog Devices GmbH      Wilhelm-Wagenfeld-Str. 6      80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-06-26 23:01                                                   ` Bernd Schmidt
@ 2007-06-28  4:50                                                     ` Alexandre Oliva
  2007-07-03  0:52                                                       ` Roman Zippel
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-06-28  4:50 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Roman Zippel, Diego Novillo, Daniel Berlin, GCC Patches,
	Andrew Pinski, Eric Botcazou

[-- Attachment #1: Type: text/plain, Size: 1980 bytes --]

On Jun 26, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:

> This fails as below.

Thanks.  I've fixed these and then some, then re-tested on both
i686-linux-gnu and x86_64-linux-gnu, no regressions.

> At this point I think I'd prefer if we could apply the earlier
> stable patch first and then do enhancements on top of it.

After some of the problems I caught in this last round, I'm no longer
comfortable installing the earlier "stable" patch.  There were some
type compatibility and sign extension problems that hadn't been
exposed by testing but that were there, and I fear they might break
stuff.

Please consider this revised patch instead.  The changes from the last
patch are:

- convert bit-field operands to bitsizetype before doing size_binop
  arithmetic with them.  The missing conversion is what caused the
  ICEs you found on x86: some other parts of the compiler create
  BIT_FIELD_REFs with non-sizetype operands.

- use size_binop all over instead of int_const_binop, for stronger
  type safety, except in the few places where we're building integral
  constants rather than sizes

- in sra_build_assignment, use the correct type for the temp variable
  that should hold the result of converting var to the same-sized
  unsigned integral type (this was wrong in the "stable" patch)

- in sra_build_bf_assignment, use the correct type to decide whether
  the incoming variable needs conversion to an unsigned integral type,
  and convert it to the unsigned type instead of doing bit arithmetic
  between operands with different signedness

- in sra_build_bf_assignment, filter out sign-extension bits from
  the input operand being replaced in a wider variable

- in sra_build_bf_assignment, make sure we convert the input field to
  the correct unsigned type instead of relying on implicit type
  conversion by the left shift.

I went through the type conversions, expressions and assignments in
the patch one more time, and I couldn't find any other mismatches.

Ok?


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: gcc-sra-bit-field-ref-take2.patch --]
[-- Type: text/x-patch, Size: 46961 bytes --]

for gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR middle-end/22156
	* tree-sra.c (struct sra_elt): Add in_bitfld_block.
	(sra_hash_tree): Handle BIT_FIELD_REFs.
	(sra_elt_hash): Don't hash bitfld blocks.
	(sra_elt_eq): Skip them in parent compares as well.  Handle
	BIT_FIELD_REFs.
	(build_element_name_1): Handle BIT_FIELD_REFs.
	(instantiate_element): Propagate nowarn from parents.
	Gimple-zero-initialize all bit-field variables that are not
	part of parameters that are going to be scalarized on entry.
	(instantiate_missing_elements_1): Return the sra_elt.
	(canon_type_for_field): New.
	(try_instantiate_multiple_fields): New.  Infer widest possible
	access mode from decl or member type, but clip it at word
	size, and only widen it if a field crosses an alignment
	boundary.
	(instantiate_missing_elements): Use them.
	(generate_one_element_ref): Handle BIT_FIELD_REFs.
	(scalar_bitfield_p): New.
	(sra_build_assignment): Optimize assignments from scalarizable
	BIT_FIELD_REFs.  Use BITS_BIG_ENDIAN to determine shift
	counts.
	(REPLDUP): New.
	(sra_build_bf_assignment): New.  Optimize assignments to
	scalarizable BIT_FIELD_REFs.
	(sra_build_elt_assignment): New.  Optimize BIT_FIELD_REF
	assignments to full variables.
	(generate_copy_inout): Use the new macros and functions.
	(generate_element_copy): Likewise.  Handle bitfld differences.
	(generate_element_zero): Don't recurse for blocks.  Use
	sra_build_elt_assignment.
	(generate_one_element_init): Take elt instead of var.  Use
	sra_build_elt_assignment.
	(generate_element_init_1): Adjust.
	(bitfield_overlap_info): New struct.
	(bitfield_overlaps_p): New.
	(sra_explode_bitfield_assignment): New.
	(sra_sync_for_bitfield_assignment): New.
	(scalarize_use): Re-expand assignment to scalarized
	BIT_FIELD_REFs of gimple_regs.  Explode or sync needed members
	for BIT_FIELD_REFs accesses or assignments.  Use REPLDUP.
	(scalarize_copy): Use REPLDUP.
	(scalarize_ldst): Move assert before dereference.  Adjust EH
	handling.
	(dump_sra_elt_name): Handle BIT_FIELD_REFs.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-06-26 23:11:13.000000000 -0300
+++ gcc/tree-sra.c	2007-06-27 16:49:19.000000000 -0300
@@ -147,6 +147,10 @@ struct sra_elt
 
   /* True if there is BIT_FIELD_REF on the lhs with a vector. */
   bool is_vector_lhs;
+
+  /* 1 if the element is a field that is part of a block, 2 if the field
+     is the block itself, 0 if it's neither.  */
+  char in_bitfld_block;
 };
 
 #define IS_ELEMENT_FOR_GROUP(ELEMENT) (TREE_CODE (ELEMENT) == RANGE_EXPR)
@@ -205,6 +209,9 @@ extern void debug_sra_elt_name (struct s
 
 /* Forward declarations.  */
 static tree generate_element_ref (struct sra_elt *);
+static tree sra_build_assignment (tree dst, tree src);
+static void mark_all_v_defs (tree list);
+
 \f
 /* Return true if DECL is an SRA candidate.  */
 
@@ -461,6 +468,12 @@ sra_hash_tree (tree t)
       h = iterative_hash_expr (DECL_FIELD_BIT_OFFSET (t), h);
       break;
 
+    case BIT_FIELD_REF:
+      /* Don't take operand 0 into account, that's our parent.  */
+      h = iterative_hash_expr (TREE_OPERAND (t, 1), 0);
+      h = iterative_hash_expr (TREE_OPERAND (t, 2), h);
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -479,12 +492,14 @@ sra_elt_hash (const void *x)
 
   h = sra_hash_tree (e->element);
 
-  /* Take into account everything back up the chain.  Given that chain
-     lengths are rarely very long, this should be acceptable.  If we
-     truly identify this as a performance problem, it should work to
-     hash the pointer value "e->parent".  */
+  /* Take into account everything except bitfield blocks back up the
+     chain.  Given that chain lengths are rarely very long, this
+     should be acceptable.  If we truly identify this as a performance
+     problem, it should work to hash the pointer value
+     "e->parent".  */
   for (p = e->parent; p ; p = p->parent)
-    h = (h * 65521) ^ sra_hash_tree (p->element);
+    if (!p->in_bitfld_block)
+      h = (h * 65521) ^ sra_hash_tree (p->element);
 
   return h;
 }
@@ -497,8 +512,17 @@ sra_elt_eq (const void *x, const void *y
   const struct sra_elt *a = x;
   const struct sra_elt *b = y;
   tree ae, be;
+  const struct sra_elt *ap = a->parent;
+  const struct sra_elt *bp = b->parent;
+
+  if (ap)
+    while (ap->in_bitfld_block)
+      ap = ap->parent;
+  if (bp)
+    while (bp->in_bitfld_block)
+      bp = bp->parent;
 
-  if (a->parent != b->parent)
+  if (ap != bp)
     return false;
 
   ae = a->element;
@@ -533,6 +557,11 @@ sra_elt_eq (const void *x, const void *y
 	return false;
       return fields_compatible_p (ae, be);
 
+    case BIT_FIELD_REF:
+      return
+	tree_int_cst_equal (TREE_OPERAND (ae, 1), TREE_OPERAND (be, 1))
+	&& tree_int_cst_equal (TREE_OPERAND (ae, 2), TREE_OPERAND (be, 2));
+
     default:
       gcc_unreachable ();
     }
@@ -888,7 +917,7 @@ sra_walk_asm_expr (tree expr, block_stmt
 
 static void
 sra_walk_gimple_modify_stmt (tree expr, block_stmt_iterator *bsi,
-		      const struct sra_walk_fns *fns)
+			     const struct sra_walk_fns *fns)
 {
   struct sra_elt *lhs_elt, *rhs_elt;
   tree lhs, rhs;
@@ -1177,6 +1206,15 @@ build_element_name_1 (struct sra_elt *el
       sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (t));
       obstack_grow (&sra_obstack, buffer, strlen (buffer));
     }
+  else if (TREE_CODE (t) == BIT_FIELD_REF)
+    {
+      sprintf (buffer, "B" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 2), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+      sprintf (buffer, "F" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 1), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+    }
   else
     {
       tree name = DECL_NAME (t);
@@ -1209,9 +1247,11 @@ instantiate_element (struct sra_elt *elt
 {
   struct sra_elt *base_elt;
   tree var, base;
+  bool nowarn = TREE_NO_WARNING (elt->element);
 
   for (base_elt = elt; base_elt->parent; base_elt = base_elt->parent)
-    continue;
+    if (!nowarn)
+      nowarn = TREE_NO_WARNING (base_elt->parent->element);
   base = base_elt->element;
 
   elt->replacement = var = make_rename_temp (elt->type, "SR");
@@ -1240,9 +1280,7 @@ instantiate_element (struct sra_elt *elt
       DECL_DEBUG_EXPR_IS_FROM (var) = 1;
       
       DECL_IGNORED_P (var) = 0;
-      TREE_NO_WARNING (var) = TREE_NO_WARNING (base);
-      if (elt->element && TREE_NO_WARNING (elt->element))
-	TREE_NO_WARNING (var) = 1;
+      TREE_NO_WARNING (var) = nowarn;
     }
   else
     {
@@ -1251,6 +1289,20 @@ instantiate_element (struct sra_elt *elt
       TREE_NO_WARNING (var) = 1;
     }
 
+  /* Zero-initialize bit-field scalarization variables, to avoid
+     triggering undefined behavior, unless the base variable is an
+     argument that is going to get scalarized out as the first thing,
+     providing for the initializers we need.  */
+  if (TREE_CODE (elt->element) == BIT_FIELD_REF
+      && (TREE_CODE (base) != PARM_DECL
+	  || !bitmap_bit_p (needs_copy_in, DECL_UID (base))))
+    {
+      tree init = sra_build_assignment (var, fold_convert (TREE_TYPE (var),
+							   integer_zero_node));
+      insert_edge_copies (init, ENTRY_BLOCK_PTR);
+      mark_all_v_defs (init);
+    }
+
   if (dump_file)
     {
       fputs ("  ", dump_file);
@@ -1337,7 +1389,7 @@ sum_instantiated_sizes (struct sra_elt *
 
 static void instantiate_missing_elements (struct sra_elt *elt);
 
-static void
+static struct sra_elt *
 instantiate_missing_elements_1 (struct sra_elt *elt, tree child, tree type)
 {
   struct sra_elt *sub = lookup_element (elt, child, type, INSERT);
@@ -1348,6 +1400,303 @@ instantiate_missing_elements_1 (struct s
     }
   else
     instantiate_missing_elements (sub);
+  return sub;
+}
+
+/* Obtain the canonical type for field F of ELEMENT.  */
+
+static tree
+canon_type_for_field (tree f, tree element)
+{
+  tree field_type = TREE_TYPE (f);
+
+  /* canonicalize_component_ref() unwidens some bit-field types (not
+     marked as DECL_BIT_FIELD in C++), so we must do the same, lest we
+     may introduce type mismatches.  */
+  if (INTEGRAL_TYPE_P (field_type)
+      && DECL_MODE (f) != TYPE_MODE (field_type))
+    field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
+						   field_type,
+						   element,
+						   f, NULL_TREE),
+					   NULL_TREE));
+
+  return field_type;
+}
+
+/* Look for adjacent fields of ELT starting at F that we'd like to
+   scalarize as a single variable.  Return the last field of the
+   group.  */
+
+static tree
+try_instantiate_multiple_fields (struct sra_elt *elt, tree f)
+{
+  int count;
+  unsigned HOST_WIDE_INT align, bit, size, alchk;
+  enum machine_mode mode;
+  tree first = f, prev;
+  tree type, var;
+  struct sra_elt *block;
+
+  if (!is_sra_scalar_type (TREE_TYPE (f))
+      || !host_integerp (DECL_FIELD_OFFSET (f), 1)
+      || !host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+      || !host_integerp (DECL_SIZE (f), 1)
+      || lookup_element (elt, f, NULL, NO_INSERT))
+    return f;
+
+  block = elt;
+
+  /* For complex and array objects, there are going to be integer
+     literals as child elements.  In this case, we can't just take the
+     alignment and mode of the decl, so we instead rely on the element
+     type.
+
+     ??? We could try to infer additional alignment from the full
+     object declaration and the location of the sub-elements we're
+     accessing.  */
+  for (count = 0; !DECL_P (block->element); count++)
+    block = block->parent;
+
+  align = DECL_ALIGN (block->element);
+  alchk = GET_MODE_BITSIZE (DECL_MODE (block->element));
+
+  if (count)
+    {
+      type = TREE_TYPE (block->element);
+      while (count--)
+	type = TREE_TYPE (type);
+
+      align = TYPE_ALIGN (type);
+      alchk = GET_MODE_BITSIZE (TYPE_MODE (type));
+    }
+
+  if (align < alchk)
+    align = alchk;
+
+  /* Coalescing wider fields is probably pointless and
+     inefficient.  */
+  if (align > BITS_PER_WORD)
+    align = BITS_PER_WORD;
+
+  bit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+  size = tree_low_cst (DECL_SIZE (f), 1);
+
+  alchk = align - 1;
+  alchk = ~alchk;
+
+  if ((bit & alchk) != ((bit + size - 1) & alchk))
+    return f;
+
+  /* Find adjacent fields in the same alignment word.  */
+
+  for (prev = f, f = TREE_CHAIN (f);
+       f && TREE_CODE (f) == FIELD_DECL
+	 && is_sra_scalar_type (TREE_TYPE (f))
+	 && host_integerp (DECL_FIELD_OFFSET (f), 1)
+	 && host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+	 && host_integerp (DECL_SIZE (f), 1)
+	 && !lookup_element (elt, f, NULL, NO_INSERT);
+       prev = f, f = TREE_CHAIN (f))
+    {
+      unsigned HOST_WIDE_INT nbit, nsize;
+
+      nbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	+ tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+      nsize = tree_low_cst (DECL_SIZE (f), 1);
+
+      if (bit + size == nbit)
+	{
+	  if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+	    {
+	      /* If we're at an alignment boundary, don't bother
+		 growing alignment such that we can include this next
+		 field.  */
+	      if ((nbit & alchk)
+		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BITSIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+		break;
+	    }
+	  size += nsize;
+	}
+      else if (nbit + nsize == bit)
+	{
+	  if ((nbit & alchk) != ((bit + size - 1) & alchk))
+	    {
+	      if ((bit & alchk)
+		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BITSIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((nbit & alchk) != ((bit + size - 1) & alchk))
+		break;
+	    }
+	  bit = nbit;
+	  size += nsize;
+	}
+      else
+	break;
+    }
+
+  f = prev;
+
+  if (f == first)
+    return f;
+
+  gcc_assert ((bit & alchk) == ((bit + size - 1) & alchk));
+
+  /* Try to widen the bit range so as to cover padding bits as well.  */
+
+  if ((bit & ~alchk) || size != align)
+    {
+      unsigned HOST_WIDE_INT mbit = bit & alchk;
+      unsigned HOST_WIDE_INT msize = align;
+
+      for (f = TYPE_FIELDS (elt->type);
+	   f; f = TREE_CHAIN (f))
+	{
+	  unsigned HOST_WIDE_INT fbit, fsize;
+
+	  /* Skip the fields from first to prev.  */
+	  if (f == first)
+	    {
+	      f = prev;
+	      continue;
+	    }
+
+	  if (!(TREE_CODE (f) == FIELD_DECL
+		&& host_integerp (DECL_FIELD_OFFSET (f), 1)
+		&& host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)))
+	    continue;
+
+	  fbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+
+	  /* If we're past the selected word, we're fine.  */
+	  if ((bit & alchk) < (fbit & alchk))
+	    continue;
+
+	  if (host_integerp (DECL_SIZE (f), 1))
+	    fsize = tree_low_cst (DECL_SIZE (f), 1);
+	  else
+	    /* Assume a variable-sized field takes up all space till
+	       the end of the word.  ??? Endianness issues?  */
+	    fsize = align - (fbit & alchk);
+
+	  if ((fbit & alchk) < (bit & alchk))
+	    {
+	      /* A large field might start at a previous word and
+		 extend into the selected word.  Exclude those
+		 bits.  ??? Endianness issues? */
+	      HOST_WIDE_INT diff = fbit + fsize - mbit;
+
+	      if (diff <= 0)
+		continue;
+
+	      mbit += diff;
+	      msize -= diff;
+	    }
+	  else
+	    {
+	      /* Non-overlapping, great.  */
+	      if (fbit + fsize <= mbit
+		  || mbit + msize <= fbit)
+		continue;
+
+	      if (fbit <= mbit)
+		{
+		  unsigned HOST_WIDE_INT diff = fbit + fsize - mbit;
+		  mbit += diff;
+		  msize -= diff;
+		}
+	      else if (fbit > mbit)
+		msize -= (mbit + msize - fbit);
+	      else
+		gcc_unreachable ();
+	    }
+	}
+
+      bit = mbit;
+      size = msize;
+    }
+
+  /* Now we know the bit range we're interested in.  Find the smallest
+     machine mode we can use to access it.  */
+
+  for (mode = smallest_mode_for_size (size, MODE_INT);
+       ;
+       mode = GET_MODE_WIDER_MODE (mode))
+    {
+      gcc_assert (mode != VOIDmode);
+
+      alchk = GET_MODE_PRECISION (mode) - 1;
+      alchk = ~alchk;
+
+      if ((bit & alchk) == ((bit + size - 1) & alchk))
+	break;
+    }
+
+  gcc_assert (~alchk < align);
+
+  /* Create the field group as a single variable.  */
+
+  type = lang_hooks.types.type_for_mode (mode, 1);
+  gcc_assert (type);
+  var = build3 (BIT_FIELD_REF, type, NULL_TREE,
+		bitsize_int (size),
+		bitsize_int (bit));
+  BIT_FIELD_REF_UNSIGNED (var) = 1;
+
+  block = instantiate_missing_elements_1 (elt, var, type);
+  gcc_assert (block && block->is_scalar);
+
+  var = block->replacement;
+
+  if ((bit & ~alchk)
+      || (HOST_WIDE_INT)size != tree_low_cst (DECL_SIZE (var), 1))
+    {
+      block->replacement = build3 (BIT_FIELD_REF,
+				   TREE_TYPE (block->element), var,
+				   bitsize_int (size),
+				   bitsize_int (bit & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (block->replacement) = 1;
+    }
+
+  block->in_bitfld_block = 2;
+
+  /* Add the member fields to the group, such that they access
+     portions of the group variable.  */
+
+  for (f = first; f != TREE_CHAIN (prev); f = TREE_CHAIN (f))
+    {
+      tree field_type = canon_type_for_field (f, elt->element);
+      struct sra_elt *fld = lookup_element (block, f, field_type, INSERT);
+
+      gcc_assert (fld && fld->is_scalar && !fld->replacement);
+
+      fld->replacement = build3 (BIT_FIELD_REF, field_type, var,
+				 DECL_SIZE (f),
+				 bitsize_int
+				 ((TREE_INT_CST_LOW (DECL_FIELD_OFFSET (f))
+				   * BITS_PER_UNIT
+				   + (TREE_INT_CST_LOW
+				      (DECL_FIELD_BIT_OFFSET (f))))
+				  & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (fld->replacement) = TYPE_UNSIGNED (field_type);
+      fld->in_bitfld_block = 1;
+    }
+
+  return prev;
 }
 
 static void
@@ -1363,21 +1712,17 @@ instantiate_missing_elements (struct sra
 	for (f = TYPE_FIELDS (type); f ; f = TREE_CHAIN (f))
 	  if (TREE_CODE (f) == FIELD_DECL)
 	    {
-	      tree field_type = TREE_TYPE (f);
+	      tree last = try_instantiate_multiple_fields (elt, f);
 
-	      /* canonicalize_component_ref() unwidens some bit-field
-		 types (not marked as DECL_BIT_FIELD in C++), so we
-		 must do the same, lest we may introduce type
-		 mismatches.  */
-	      if (INTEGRAL_TYPE_P (field_type)
-		  && DECL_MODE (f) != TYPE_MODE (field_type))
-		field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
-							       field_type,
-							       elt->element,
-							       f, NULL_TREE),
-						       NULL_TREE));
+	      if (last != f)
+		{
+		  f = last;
+		  continue;
+		}
 
-	      instantiate_missing_elements_1 (elt, f, field_type);
+	      instantiate_missing_elements_1 (elt, f,
+					      canon_type_for_field
+					      (f, elt->element));
 	    }
 	break;
       }
@@ -1689,6 +2034,16 @@ generate_one_element_ref (struct sra_elt
       {
 	tree field = elt->element;
 
+	/* We can't test elt->in_bitfld_blk here because, when this is
+	   called from instantiate_element, we haven't set this field
+	   yet.  */
+	if (TREE_CODE (field) == BIT_FIELD_REF)
+	  {
+	    tree ret = unshare_expr (field);
+	    TREE_OPERAND (ret, 0) = base;
+	    return ret;
+	  }
+
 	/* Watch out for compatible records with differing field lists.  */
 	if (DECL_FIELD_CONTEXT (field) != TYPE_MAIN_VARIANT (TREE_TYPE (base)))
 	  field = find_compatible_field (TREE_TYPE (base), field);
@@ -1725,11 +2080,180 @@ generate_element_ref (struct sra_elt *el
     return elt->element;
 }
 
+/* Return true if BF is a bit-field that we can handle like a scalar.  */
+
+static bool
+scalar_bitfield_p (tree bf)
+{
+  return (TREE_CODE (bf) == BIT_FIELD_REF
+	  && (is_gimple_reg (TREE_OPERAND (bf, 0))
+	      || (TYPE_MODE (TREE_TYPE (TREE_OPERAND (bf, 0))) != BLKmode
+		  && (!TREE_SIDE_EFFECTS (TREE_OPERAND (bf, 0))
+		      || (GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE
+						       (TREE_OPERAND (bf, 0))))
+			  <= BITS_PER_WORD)))));
+}
+
 /* Create an assignment statement from SRC to DST.  */
 
 static tree
 sra_build_assignment (tree dst, tree src)
 {
+  /* Turning BIT_FIELD_REFs into bit operations enables other passes
+     to do a much better job at optimizing the code.  */
+  if (scalar_bitfield_p (src))
+    {
+      tree cst, cst2, mask, minshift, maxshift;
+      tree tmp, var, utype, stype;
+      tree list, stmt;
+      bool unsignedp = BIT_FIELD_REF_UNSIGNED (src);
+
+      var = TREE_OPERAND (src, 0);
+      cst = TREE_OPERAND (src, 2);
+      cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (src, 1),
+			 TREE_OPERAND (src, 2));
+
+      if (BITS_BIG_ENDIAN)
+	{
+	  maxshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst);
+	  minshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst2);
+	}
+      else
+	{
+	  maxshift = cst2;
+	  minshift = cst;
+	}
+
+      stype = TREE_TYPE (var);
+      if (!INTEGRAL_TYPE_P (stype))
+	stype = lang_hooks.types.type_for_size (TREE_INT_CST_LOW
+						(TYPE_SIZE (stype)), 1);
+      else if (!TYPE_UNSIGNED (stype))
+	stype = unsigned_type_for (stype);
+
+      utype = TREE_TYPE (dst);
+      if (!INTEGRAL_TYPE_P (utype))
+	utype = lang_hooks.types.type_for_size (TREE_INT_CST_LOW
+						(TYPE_SIZE (utype)), 1);
+      else if (!TYPE_UNSIGNED (utype))
+	utype = unsigned_type_for (utype);
+
+      list = NULL;
+
+      cst2 = size_binop (MINUS_EXPR, maxshift, minshift);
+      if (tree_int_cst_equal (cst2, TYPE_SIZE (utype)))
+	{
+	  unsignedp = true;
+	  mask = NULL_TREE;
+	}
+      else
+	{
+	  mask = build_int_cst_wide (utype, 1, 0);
+	  cst = int_const_binop (LSHIFT_EXPR, mask, cst2, true);
+	  mask = int_const_binop (MINUS_EXPR, cst, mask, true);
+	}
+
+      tmp = make_rename_temp (stype, "SR");
+      if (TYPE_MAIN_VARIANT (TREE_TYPE (var)) != TYPE_MAIN_VARIANT (stype))
+	{
+	  if (INTEGRAL_TYPE_P (TREE_TYPE (var)))
+	    stmt = build_gimple_modify_stmt (tmp,
+					     fold_convert (stype, var));
+	  else
+	    stmt = build_gimple_modify_stmt (tmp,
+					     fold_build1 (VIEW_CONVERT_EXPR,
+							  stype, var));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (!integer_zerop (minshift))
+	{
+	  tmp = make_rename_temp (stype, "SR");
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (RSHIFT_EXPR, stype,
+							var, minshift));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (TYPE_MAIN_VARIANT (utype) != TYPE_MAIN_VARIANT (stype))
+	{
+	  if (!mask && unsignedp
+	      && (TYPE_MAIN_VARIANT (utype)
+		  == TYPE_MAIN_VARIANT (TREE_TYPE (dst))))
+	    tmp = dst;
+	  else
+	    tmp = make_rename_temp (utype, "SR");
+
+	  stmt = build_gimple_modify_stmt (tmp, fold_convert (utype, var));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (mask)
+	{
+	  if (!unsignedp
+	      || (TYPE_MAIN_VARIANT (TREE_TYPE (dst))
+		  != TYPE_MAIN_VARIANT (utype)))
+	    tmp = make_rename_temp (utype, "SR");
+	  else
+	    tmp = dst;
+
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (BIT_AND_EXPR, utype,
+							var, mask));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (!unsignedp)
+	{
+	  tree signbit = int_const_binop (LSHIFT_EXPR,
+					  build_int_cst_wide (utype, 1, 0),
+					  size_binop (MINUS_EXPR, cst2,
+						      bitsize_int (1)),
+					  true);
+
+	  tmp = make_rename_temp (utype, "SR");
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (BIT_XOR_EXPR, utype,
+							var, signbit));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+
+	  if (TYPE_MAIN_VARIANT (TREE_TYPE (dst)) != TYPE_MAIN_VARIANT (utype))
+	    tmp = make_rename_temp (utype, "SR");
+	  else
+	    tmp = dst;
+
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (MINUS_EXPR, utype,
+							var, signbit));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (var != dst)
+	{
+	  if (INTEGRAL_TYPE_P (TREE_TYPE (dst)))
+	    var = fold_convert (TREE_TYPE (dst), var);
+	  else
+	    var = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (dst), var);
+
+	  stmt = build_gimple_modify_stmt (dst, var);
+	  append_to_statement_list (stmt, &list);
+	}
+
+      return list;
+    }
+
   /* It was hoped that we could perform some type sanity checking
      here, but since front-ends can emit accesses of fields in types
      different from their nominal types and copy structures containing
@@ -1741,6 +2265,256 @@ sra_build_assignment (tree dst, tree src
   return build_gimple_modify_stmt (dst, src);
 }
 
+/* BIT_FIELD_REFs must not be shared.  sra_build_elt_assignment()
+   takes care of assignments, but we must create copies for uses.  */
+#define REPLDUP(t) (TREE_CODE (t) != BIT_FIELD_REF ? (t) : unshare_expr (t))
+
+/* Emit an assignment from SRC to DST, but if DST is a scalarizable
+   BIT_FIELD_REF, turn it into bit operations.  */
+
+static tree
+sra_build_bf_assignment (tree dst, tree src)
+{
+  tree var, type, utype, tmp, tmp2, tmp3;
+  tree list, stmt;
+  tree cst, cst2, mask;
+  tree minshift, maxshift;
+
+  if (TREE_CODE (dst) != BIT_FIELD_REF)
+    return sra_build_assignment (dst, src);
+
+  var = TREE_OPERAND (dst, 0);
+
+  if (!scalar_bitfield_p (dst))
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  list = NULL;
+
+  cst = fold_convert (bitsizetype, TREE_OPERAND (dst, 2));
+  cst2 = size_binop (PLUS_EXPR,
+		     fold_convert (bitsizetype, TREE_OPERAND (dst, 1)),
+		     cst);
+
+  if (BITS_BIG_ENDIAN)
+    {
+      maxshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst);
+      minshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst2);
+    }
+  else
+    {
+      maxshift = cst2;
+      minshift = cst;
+    }
+
+  type = TREE_TYPE (var);
+  if (!INTEGRAL_TYPE_P (type))
+    type = lang_hooks.types.type_for_size
+      (TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (var))), 1);
+  if (TYPE_UNSIGNED (type))
+    utype = type;
+  else
+    utype = unsigned_type_for (type);
+
+  mask = build_int_cst_wide (utype, 1, 0);
+  cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, true);
+  cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, true);
+  mask = int_const_binop (MINUS_EXPR, cst, cst2, true);
+  mask = fold_build1 (BIT_NOT_EXPR, utype, mask);
+
+  if (TYPE_MAIN_VARIANT (utype) != TYPE_MAIN_VARIANT (TREE_TYPE (var))
+      && !integer_zerop (mask))
+    {
+      tmp = var;
+      if (!is_gimple_variable (tmp))
+	tmp = unshare_expr (var);
+
+      tmp2 = make_rename_temp (utype, "SR");
+
+      if (INTEGRAL_TYPE_P (TREE_TYPE (var)))
+	stmt = build_gimple_modify_stmt (tmp2, fold_convert (utype, tmp));
+      else
+	stmt = build_gimple_modify_stmt (tmp2, fold_build1 (VIEW_CONVERT_EXPR,
+							    utype, tmp));
+      append_to_statement_list (stmt, &list);
+    }
+  else
+    tmp2 = var;
+
+  if (!integer_zerop (mask))
+    {
+      tmp = make_rename_temp (utype, "SR");
+      stmt = build_gimple_modify_stmt (tmp,
+				       fold_build2 (BIT_AND_EXPR, utype,
+						    tmp2, mask));
+      append_to_statement_list (stmt, &list);
+    }
+  else
+    tmp = mask;
+
+  if (is_gimple_reg (src) && INTEGRAL_TYPE_P (TREE_TYPE (src)))
+    tmp2 = src;
+  else if (INTEGRAL_TYPE_P (TREE_TYPE (src)))
+    {
+      tmp2 = make_rename_temp (TREE_TYPE (src), "SR");
+      stmt = sra_build_assignment (tmp2, src);
+      append_to_statement_list (stmt, &list);
+    }
+  else
+    {
+      tmp2 = make_rename_temp
+	(lang_hooks.types.type_for_size
+	 (TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (src))),
+	  1), "SR");
+      stmt = sra_build_assignment (tmp2, fold_build1 (VIEW_CONVERT_EXPR,
+						      TREE_TYPE (tmp2), src));
+      append_to_statement_list (stmt, &list);
+    }
+
+  if (!TYPE_UNSIGNED (TREE_TYPE (tmp2)))
+    {
+      tree ut = unsigned_type_for (TREE_TYPE (tmp2));
+      tmp3 = make_rename_temp (ut, "SR");
+      tmp2 = fold_convert (ut, tmp2);
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+
+      tmp2 = fold_build1 (BIT_NOT_EXPR, utype, mask);
+      tmp2 = int_const_binop (RSHIFT_EXPR, tmp2, minshift, true);
+      tmp2 = fold_convert (ut, tmp2);
+      tmp2 = fold_build2 (BIT_AND_EXPR, utype, tmp3, tmp2);
+
+      if (tmp3 != tmp2)
+	{
+	  tmp3 = make_rename_temp (ut, "SR");
+	  stmt = sra_build_assignment (tmp3, tmp2);
+	  append_to_statement_list (stmt, &list);
+	}
+
+      tmp2 = tmp3;
+    }
+
+  if (TYPE_MAIN_VARIANT (TREE_TYPE (tmp2)) != TYPE_MAIN_VARIANT (utype))
+    {
+      tmp3 = make_rename_temp (utype, "SR");
+      tmp2 = fold_convert (utype, tmp2);
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (!integer_zerop (minshift))
+    {
+      tmp3 = make_rename_temp (utype, "SR");
+      stmt = build_gimple_modify_stmt (tmp3,
+				       fold_build2 (LSHIFT_EXPR, utype,
+						    tmp2, minshift));
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (utype != TREE_TYPE (var))
+    tmp3 = make_rename_temp (utype, "SR");
+  else
+    tmp3 = var;
+  stmt = build_gimple_modify_stmt (tmp3,
+				   fold_build2 (BIT_IOR_EXPR, utype,
+						tmp, tmp2));
+  append_to_statement_list (stmt, &list);
+
+  if (tmp3 != var)
+    {
+      if (TREE_TYPE (var) == type)
+	stmt = build_gimple_modify_stmt (var,
+					 fold_convert (type, tmp3));
+      else
+	stmt = build_gimple_modify_stmt (var,
+					 fold_build1 (VIEW_CONVERT_EXPR,
+						      TREE_TYPE (var), tmp3));
+      append_to_statement_list (stmt, &list);
+    }
+
+  return list;
+}
+
+/* Expand an assignment of SRC to the scalarized representation of
+   ELT.  If it is a field group, try to widen the assignment to cover
+   the full variable.  */
+
+static tree
+sra_build_elt_assignment (struct sra_elt *elt, tree src)
+{
+  tree dst = elt->replacement;
+  tree var, tmp, cst, cst2, list, stmt;
+
+  if (TREE_CODE (dst) != BIT_FIELD_REF
+      || !elt->in_bitfld_block)
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  var = TREE_OPERAND (dst, 0);
+
+  /* Try to widen the assignment to the entire variable.
+     We need the source to be a BIT_FIELD_REF as well, such that, for
+     BIT_FIELD_REF<d,sz,dp> = BIT_FIELD_REF<s,sz,sp>,
+     by design, conditions are met such that we can turn it into
+     d = BIT_FIELD_REF<s,dw,sp-dp>.  */
+  if (elt->in_bitfld_block == 2
+      && TREE_CODE (src) == BIT_FIELD_REF)
+    {
+      cst = TYPE_SIZE (TREE_TYPE (var));
+      cst2 = size_binop (MINUS_EXPR, TREE_OPERAND (src, 2),
+			 TREE_OPERAND (dst, 2));
+
+      src = TREE_OPERAND (src, 0);
+
+      /* Avoid full-width bit-fields.  */
+      if (integer_zerop (cst2)
+	  && tree_int_cst_equal (cst, TYPE_SIZE (TREE_TYPE (src))))
+	{
+	  if (INTEGRAL_TYPE_P (TREE_TYPE (src))
+	      && !TYPE_UNSIGNED (TREE_TYPE (src)))
+	    src = fold_convert (unsigned_type_for (TREE_TYPE (src)), src);
+
+	  /* If a single conversion won't do, we'll need a statement
+	     list.  */
+	  if (TYPE_MAIN_VARIANT (TREE_TYPE (var))
+	      != TYPE_MAIN_VARIANT (TREE_TYPE (src)))
+	    {
+	      list = NULL;
+
+	      if (!INTEGRAL_TYPE_P (TREE_TYPE (src))
+		  || !TYPE_UNSIGNED (TREE_TYPE (src)))
+		src = fold_build1 (VIEW_CONVERT_EXPR,
+				   lang_hooks.types.type_for_size
+				   (TREE_INT_CST_LOW
+				    (TYPE_SIZE (TREE_TYPE (src))),
+				    1), src);
+
+	      tmp = make_rename_temp (TREE_TYPE (src), "SR");
+	      stmt = build_gimple_modify_stmt (tmp, src);
+	      append_to_statement_list (stmt, &list);
+
+	      stmt = sra_build_assignment (var,
+					   fold_convert (TREE_TYPE (var),
+							 tmp));
+	      append_to_statement_list (stmt, &list);
+
+	      return list;
+	    }
+
+	  src = fold_convert (TREE_TYPE (var), src);
+	}
+      else
+	{
+	  src = fold_build3 (BIT_FIELD_REF, TREE_TYPE (var), src, cst, cst2);
+	  BIT_FIELD_REF_UNSIGNED (src) = 1;
+	}
+
+      return sra_build_assignment (var, src);
+    }
+
+  return sra_build_bf_assignment (dst, src);
+}
+
 /* Generate a set of assignment statements in *LIST_P to copy all
    instantiated elements under ELT to or from the equivalent structure
    rooted at EXPR.  COPY_OUT controls the direction of the copy, with
@@ -1764,16 +2538,16 @@ generate_copy_inout (struct sra_elt *elt
       i = c->replacement;
 
       t = build2 (COMPLEX_EXPR, elt->type, r, i);
-      t = sra_build_assignment (expr, t);
+      t = sra_build_bf_assignment (expr, t);
       SSA_NAME_DEF_STMT (expr) = t;
       append_to_statement_list (t, list_p);
     }
   else if (elt->replacement)
     {
       if (copy_out)
-	t = sra_build_assignment (elt->replacement, expr);
+	t = sra_build_elt_assignment (elt, expr);
       else
-	t = sra_build_assignment (expr, elt->replacement);
+	t = sra_build_bf_assignment (expr, REPLDUP (elt->replacement));
       append_to_statement_list (t, list_p);
     }
   else
@@ -1798,6 +2572,19 @@ generate_element_copy (struct sra_elt *d
   FOR_EACH_ACTUAL_CHILD (dc, dst)
     {
       sc = lookup_element (src, dc->element, NULL, NO_INSERT);
+      if (!sc && dc->in_bitfld_block == 2)
+	{
+	  struct sra_elt *dcs;
+
+	  FOR_EACH_ACTUAL_CHILD (dcs, dc)
+	    {
+	      sc = lookup_element (src, dcs->element, NULL, NO_INSERT);
+	      gcc_assert (sc);
+	      generate_element_copy (dcs, sc, list_p);
+	    }
+
+	  continue;
+	}
       gcc_assert (sc);
       generate_element_copy (dc, sc, list_p);
     }
@@ -1808,7 +2595,7 @@ generate_element_copy (struct sra_elt *d
 
       gcc_assert (src->replacement);
 
-      t = sra_build_assignment (dst->replacement, src->replacement);
+      t = sra_build_elt_assignment (dst, REPLDUP (src->replacement));
       append_to_statement_list (t, list_p);
     }
 }
@@ -1829,8 +2616,9 @@ generate_element_zero (struct sra_elt *e
       return;
     }
 
-  FOR_EACH_ACTUAL_CHILD (c, elt)
-    generate_element_zero (c, list_p);
+  if (!elt->in_bitfld_block)
+    FOR_EACH_ACTUAL_CHILD (c, elt)
+      generate_element_zero (c, list_p);
 
   if (elt->replacement)
     {
@@ -1839,7 +2627,7 @@ generate_element_zero (struct sra_elt *e
       gcc_assert (elt->is_scalar);
       t = fold_convert (elt->type, integer_zero_node);
 
-      t = sra_build_assignment (elt->replacement, t);
+      t = sra_build_elt_assignment (elt, t);
       append_to_statement_list (t, list_p);
     }
 }
@@ -1848,10 +2636,10 @@ generate_element_zero (struct sra_elt *e
    Add the result to *LIST_P.  */
 
 static void
-generate_one_element_init (tree var, tree init, tree *list_p)
+generate_one_element_init (struct sra_elt *elt, tree init, tree *list_p)
 {
   /* The replacement can be almost arbitrarily complex.  Gimplify.  */
-  tree stmt = sra_build_assignment (var, init);
+  tree stmt = sra_build_elt_assignment (elt, init);
   gimplify_and_add (stmt, list_p);
 }
 
@@ -1880,7 +2668,7 @@ generate_element_init_1 (struct sra_elt 
     {
       if (elt->replacement)
 	{
-	  generate_one_element_init (elt->replacement, init, list_p);
+	  generate_one_element_init (elt, init, list_p);
 	  elt->visited = true;
 	}
       return result;
@@ -2033,6 +2821,211 @@ sra_replace (block_stmt_iterator *bsi, t
     bsi_prev (bsi);
 }
 
+/* Data structure that bitfield_overlaps_p fills in with information
+   about the element passed in and how much of it overlaps with the
+   bit-range passed to it.  */
+
+struct bitfield_overlap_info
+{
+  /* The bit-length of an element.  */
+  tree field_len;
+
+  /* The bit-position of the element in its parent.  */
+  tree field_pos;
+
+  /* The number of bits of the element that overlap with the incoming
+     bit range.  */
+  tree overlap_len;
+
+  /* The first bit of the element that overlaps with the incoming bit
+     range.  */
+  tree overlap_pos;
+};
+
+/* Return true if a BIT_FIELD_REF<(FLD->parent), BLEN, BPOS>
+   expression (referenced as BF below) accesses any of the bits in FLD,
+   false if it doesn't.  If DATA is non-null, its field_len and
+   field_pos are filled in such that BIT_FIELD_REF<(FLD->parent),
+   field_len, field_pos> (referenced as BFLD below) represents the
+   entire field FLD->element, and BIT_FIELD_REF<BFLD, overlap_len,
+   overlap_pos> represents the portion of the entire field that
+   overlaps with BF.  */
+
+static bool
+bitfield_overlaps_p (tree blen, tree bpos, struct sra_elt *fld,
+		     struct bitfield_overlap_info *data)
+{
+  tree flen, fpos;
+  bool ret;
+
+  if (TREE_CODE (fld->element) == FIELD_DECL)
+    {
+      flen = fold_convert (bitsizetype, DECL_SIZE (fld->element));
+      fpos = fold_convert (bitsizetype, DECL_FIELD_OFFSET (fld->element));
+      fpos = size_binop (MULT_EXPR, fpos, bitsize_int (BITS_PER_UNIT));
+      fpos = size_binop (PLUS_EXPR, fpos, DECL_FIELD_BIT_OFFSET (fld->element));
+    }
+  else if (TREE_CODE (fld->element) == BIT_FIELD_REF)
+    {
+      flen = fold_convert (bitsizetype, TREE_OPERAND (fld->element, 1));
+      fpos = fold_convert (bitsizetype, TREE_OPERAND (fld->element, 2));
+    }
+  else
+    gcc_unreachable ();
+
+  gcc_assert (host_integerp (blen, 1)
+	      && host_integerp (bpos, 1)
+	      && host_integerp (flen, 1)
+	      && host_integerp (fpos, 1));
+
+  ret = ((!tree_int_cst_lt (fpos, bpos)
+	  && tree_int_cst_lt (size_binop (MINUS_EXPR, fpos, bpos),
+			      blen))
+	 || (!tree_int_cst_lt (bpos, fpos)
+	     && tree_int_cst_lt (size_binop (MINUS_EXPR, bpos, fpos),
+				 flen)));
+
+  if (!ret)
+    return ret;
+
+  if (data)
+    {
+      tree bend, fend;
+
+      data->field_len = flen;
+      data->field_pos = fpos;
+
+      fend = size_binop (PLUS_EXPR, fpos, flen);
+      bend = size_binop (PLUS_EXPR, bpos, blen);
+
+      if (tree_int_cst_lt (bend, fend))
+	data->overlap_len = size_binop (MINUS_EXPR, bend, fpos);
+      else
+	data->overlap_len = NULL;
+
+      if (tree_int_cst_lt (fpos, bpos))
+	{
+	  data->overlap_pos = size_binop (MINUS_EXPR, bpos, fpos);
+	  data->overlap_len = size_binop (MINUS_EXPR,
+					  data->overlap_len
+					  ? data->overlap_len
+					  : data->field_len,
+					  data->overlap_pos);
+	}
+      else
+	data->overlap_pos = NULL;
+    }
+
+  return ret;
+}
+
+/* Add to LISTP a sequence of statements that copies BLEN bits between
+   VAR and the scalarized elements of ELT, starting at bit VPOS of VAR
+   and at bit BPOS of ELT.  The direction of the copy is given by
+   TO_VAR.  */
+
+static void
+sra_explode_bitfield_assignment (tree var, tree vpos, bool to_var,
+				 tree *listp, tree blen, tree bpos,
+				 struct sra_elt *elt)
+{
+  struct sra_elt *fld;
+  struct bitfield_overlap_info flp;
+
+  FOR_EACH_ACTUAL_CHILD (fld, elt)
+    {
+      tree flen, fpos;
+
+      if (!bitfield_overlaps_p (blen, bpos, fld, &flp))
+	continue;
+
+      flen = flp.overlap_len ? flp.overlap_len : flp.field_len;
+      fpos = flp.overlap_pos ? flp.overlap_pos : bitsize_int (0);
+
+      if (fld->replacement)
+	{
+	  tree infld, invar, st;
+
+	  infld = fld->replacement;
+
+	  if (TREE_CODE (infld) == BIT_FIELD_REF)
+	    {
+	      fpos = size_binop (PLUS_EXPR, fpos, TREE_OPERAND (infld, 2));
+	      infld = TREE_OPERAND (infld, 0);
+	    }
+
+	  infld = fold_build3 (BIT_FIELD_REF,
+			       lang_hooks.types.type_for_size
+			       (TREE_INT_CST_LOW (flen), 1),
+			       infld, flen, fpos);
+	  BIT_FIELD_REF_UNSIGNED (infld) = 1;
+
+	  invar = size_binop (MINUS_EXPR, flp.field_pos, bpos);
+	  if (flp.overlap_pos)
+	    invar = size_binop (PLUS_EXPR, invar, flp.overlap_pos);
+	  invar = size_binop (PLUS_EXPR, invar, vpos);
+
+	  invar = fold_build3 (BIT_FIELD_REF, TREE_TYPE (infld),
+			       var, flen, invar);
+	  BIT_FIELD_REF_UNSIGNED (invar) = 1;
+
+	  if (to_var)
+	    st = sra_build_bf_assignment (invar, infld);
+	  else
+	    st = sra_build_bf_assignment (infld, invar);
+
+	  append_to_statement_list (st, listp);
+	}
+      else
+	{
+	  tree sub = size_binop (MINUS_EXPR, flp.field_pos, bpos);
+	  sub = size_binop (PLUS_EXPR, vpos, sub);
+	  if (flp.overlap_pos)
+	    sub = size_binop (PLUS_EXPR, sub, flp.overlap_pos);
+
+	  sra_explode_bitfield_assignment (var, sub, to_var, listp,
+					   flen, fpos, fld);
+	}
+    }
+}
+
+/* Add to LISTBEFOREP statements that copy scalarized members of ELT
+   that overlap with BIT_FIELD_REF<(ELT->element), BLEN, BPOS> back
+   into the full variable, and to LISTAFTERP, if non-NULL, statements
+   that copy the (presumably modified) overlapping portions of the
+   full variable back to the scalarized variables.  */
+
+static void
+sra_sync_for_bitfield_assignment (tree *listbeforep, tree *listafterp,
+				  tree blen, tree bpos,
+				  struct sra_elt *elt)
+{
+  struct sra_elt *fld;
+  struct bitfield_overlap_info flp;
+
+  FOR_EACH_ACTUAL_CHILD (fld, elt)
+    if (bitfield_overlaps_p (blen, bpos, fld, &flp))
+      {
+	if (fld->replacement || (!flp.overlap_len && !flp.overlap_pos))
+	  {
+	    generate_copy_inout (fld, false, generate_element_ref (fld),
+				 listbeforep);
+	    mark_no_warning (fld);
+	    if (listafterp)
+	      generate_copy_inout (fld, true, generate_element_ref (fld),
+				   listafterp);
+	  }
+	else
+	  {
+	    tree flen = flp.overlap_len ? flp.overlap_len : flp.field_len;
+	    tree fpos = flp.overlap_pos ? flp.overlap_pos : bitsize_int (0);
+
+	    sra_sync_for_bitfield_assignment (listbeforep, listafterp,
+					      flen, fpos, fld);
+	  }
+      }
+}
+
 /* Scalarize a USE.  To recap, this is either a simple reference to ELT,
    if elt is scalar, or some occurrence of ELT that requires a complete
    aggregate.  IS_OUTPUT is true if ELT is being modified.  */
@@ -2041,23 +3034,144 @@ static void
 scalarize_use (struct sra_elt *elt, tree *expr_p, block_stmt_iterator *bsi,
 	       bool is_output, bool use_all)
 {
-  tree list = NULL, stmt = bsi_stmt (*bsi);
+  tree stmt = bsi_stmt (*bsi);
+  tree bfexpr;
 
   if (elt->replacement)
     {
       /* If we have a replacement, then updating the reference is as
 	 simple as modifying the existing statement in place.  */
       if (is_output)
-	mark_all_v_defs (stmt);
-      *expr_p = elt->replacement;
+	{
+	  if (TREE_CODE (elt->replacement) == BIT_FIELD_REF
+	      && is_gimple_reg (TREE_OPERAND (elt->replacement, 0))
+	      && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	      && &GIMPLE_STMT_OPERAND (stmt, 0) == expr_p)
+	    {
+	      tree newstmt = sra_build_elt_assignment
+		(elt, GIMPLE_STMT_OPERAND (stmt, 1));
+	      if (TREE_CODE (newstmt) != STATEMENT_LIST)
+		{
+		  tree list = NULL;
+		  append_to_statement_list (newstmt, &list);
+		  newstmt = list;
+		}
+	      sra_replace (bsi, newstmt);
+	      return;
+	    }
+
+	  mark_all_v_defs (stmt);
+	}
+      *expr_p = REPLDUP (elt->replacement);
       update_stmt (stmt);
     }
+  else if (use_all && is_output
+	   && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	   && TREE_CODE (bfexpr
+			 = GIMPLE_STMT_OPERAND (stmt, 0)) == BIT_FIELD_REF
+	   && &TREE_OPERAND (bfexpr, 0) == expr_p
+	   && INTEGRAL_TYPE_P (TREE_TYPE (bfexpr))
+	   && TREE_CODE (TREE_TYPE (*expr_p)) == RECORD_TYPE)
+    {
+      tree listbefore = NULL, listafter = NULL;
+      tree blen = fold_convert (bitsizetype, TREE_OPERAND (bfexpr, 1));
+      tree bpos = fold_convert (bitsizetype, TREE_OPERAND (bfexpr, 2));
+      bool update = false;
+
+      if (!elt->use_block_copy)
+	{
+	  tree type = TREE_TYPE (bfexpr);
+	  tree var = make_rename_temp (type, "SR"), tmp, st;
+
+	  GIMPLE_STMT_OPERAND (stmt, 0) = var;
+	  update = true;
+
+	  if (!TYPE_UNSIGNED (type))
+	    {
+	      type = unsigned_type_for (type);
+	      tmp = make_rename_temp (type, "SR");
+	      st = build_gimple_modify_stmt (tmp,
+					     fold_convert (type, var));
+	      append_to_statement_list (st, &listafter);
+	      var = tmp;
+	    }
+
+	  sra_explode_bitfield_assignment
+	    (var, bitsize_int (0), false, &listafter, blen, bpos, elt);
+	}
+      else
+	sra_sync_for_bitfield_assignment
+	  (&listbefore, &listafter, blen, bpos, elt);
+
+      if (listbefore)
+	{
+	  mark_all_v_defs (listbefore);
+	  sra_insert_before (bsi, listbefore);
+	}
+      if (listafter)
+	{
+	  mark_all_v_defs (listafter);
+	  sra_insert_after (bsi, listafter);
+	}
+
+      if (update)
+	update_stmt (stmt);
+    }
+  else if (use_all && !is_output
+	   && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	   && TREE_CODE (bfexpr
+			 = GIMPLE_STMT_OPERAND (stmt, 1)) == BIT_FIELD_REF
+	   && &TREE_OPERAND (GIMPLE_STMT_OPERAND (stmt, 1), 0) == expr_p
+	   && INTEGRAL_TYPE_P (TREE_TYPE (bfexpr))
+	   && TREE_CODE (TREE_TYPE (*expr_p)) == RECORD_TYPE)
+    {
+      tree list = NULL;
+      tree blen = fold_convert (bitsizetype, TREE_OPERAND (bfexpr, 1));
+      tree bpos = fold_convert (bitsizetype, TREE_OPERAND (bfexpr, 2));
+      bool update = false;
+
+      if (!elt->use_block_copy)
+	{
+	  tree type = TREE_TYPE (bfexpr);
+	  tree var;
+
+	  if (!TYPE_UNSIGNED (type))
+	    type = unsigned_type_for (type);
+
+	  var = make_rename_temp (type, "SR");
+
+	  append_to_statement_list (build_gimple_modify_stmt
+				    (var, build_int_cst_wide (type, 0, 0)),
+				    &list);
+
+	  sra_explode_bitfield_assignment
+	    (var, bitsize_int (0), true, &list, blen, bpos, elt);
+
+	  GIMPLE_STMT_OPERAND (stmt, 1) = var;
+	  update = true;
+	}
+      else
+	sra_sync_for_bitfield_assignment
+	  (&list, NULL, blen, bpos, elt);
+
+      if (list)
+	{
+	  mark_all_v_defs (list);
+	  sra_insert_before (bsi, list);
+	}
+
+      if (update)
+	update_stmt (stmt);
+    }
   else
     {
-      /* Otherwise we need some copies.  If ELT is being read, then we want
-	 to store all (modified) sub-elements back into the structure before
-	 the reference takes place.  If ELT is being written, then we want to
-	 load the changed values back into our shadow variables.  */
+      tree list = NULL;
+
+      /* Otherwise we need some copies.  If ELT is being read, then we
+	 want to store all (modified) sub-elements back into the
+	 structure before the reference takes place.  If ELT is being
+	 written, then we want to load the changed values back into
+	 our shadow variables.  */
       /* ??? We don't check modified for reads, we just always write all of
 	 the values.  We should be able to record the SSA number of the VOP
 	 for which the values were last read.  If that number matches the
@@ -2101,7 +3215,7 @@ scalarize_copy (struct sra_elt *lhs_elt,
       gcc_assert (TREE_CODE (stmt) == GIMPLE_MODIFY_STMT);
 
       GIMPLE_STMT_OPERAND (stmt, 0) = lhs_elt->replacement;
-      GIMPLE_STMT_OPERAND (stmt, 1) = rhs_elt->replacement;
+      GIMPLE_STMT_OPERAND (stmt, 1) = REPLDUP (rhs_elt->replacement);
       update_stmt (stmt);
     }
   else if (lhs_elt->use_block_copy || rhs_elt->use_block_copy)
@@ -2255,20 +3369,41 @@ scalarize_ldst (struct sra_elt *elt, tre
 
       mark_all_v_defs (stmt);
       generate_copy_inout (elt, is_output, other, &list);
-      mark_all_v_defs (list);
       gcc_assert (list);
+      mark_all_v_defs (list);
 
       /* Preserve EH semantics.  */
       if (stmt_ends_bb_p (stmt))
 	{
 	  tree_stmt_iterator tsi;
-	  tree first;
+	  tree first, blist = NULL;
+	  bool thr = (bsi->bb->flags & EDGE_COMPLEX) != 0;
 
-	  /* Extract the first statement from LIST.  */
+	  /* If the last statement of this BB created an EH edge
+	     before scalarization, we have to locate the first
+	     statement that can throw in the new statement list and
+	     use that as the last statement of this BB, such that EH
+	     semantics is preserved.  All statements up to this one
+	     are added to the same BB.  All other statements in the
+	     list will be added to normal outgoing edges of the same
+	     BB.  If they access any memory, it's the same memory, so
+	     we can assume they won't throw.  */
 	  tsi = tsi_start (list);
-	  first = tsi_stmt (tsi);
+	  for (first = tsi_stmt (tsi);
+	       thr && !tsi_end_p (tsi) && !tree_could_throw_p (first);
+	       first = tsi_stmt (tsi))
+	    {
+	      tsi_delink (&tsi);
+	      append_to_statement_list (first, &blist);
+	    }
+
+	  /* Extract the first remaining statement from LIST, this is
+	     the EH statement if there is one.  */
 	  tsi_delink (&tsi);
 
+	  if (blist)
+	    sra_insert_before (bsi, blist);
+
 	  /* Replace the old statement with this new representative.  */
 	  bsi_replace (bsi, first, true);
 
@@ -2352,6 +3487,10 @@ dump_sra_elt_name (FILE *f, struct sra_e
 	    fputc ('.', f);
 	  print_generic_expr (f, elt->element, dump_flags);
 	}
+      else if (TREE_CODE (elt->element) == BIT_FIELD_REF)
+	fprintf (f, "$B" HOST_WIDE_INT_PRINT_DEC "F" HOST_WIDE_INT_PRINT_DEC,
+		 tree_low_cst (TREE_OPERAND (elt->element, 2), 1),
+		 tree_low_cst (TREE_OPERAND (elt->element, 1), 1));
       else if (TREE_CODE (elt->element) == RANGE_EXPR)
 	fprintf (f, "["HOST_WIDE_INT_PRINT_DEC".."HOST_WIDE_INT_PRINT_DEC"]",
 		 TREE_INT_CST_LOW (TREE_OPERAND (elt->element, 0)),

[-- Attachment #3: Type: text/plain, Size: 249 bytes --]


-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-06-28  4:50                                                     ` Alexandre Oliva
@ 2007-07-03  0:52                                                       ` Roman Zippel
  2007-07-06  9:21                                                         ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Roman Zippel @ 2007-07-03  0:52 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Bernd Schmidt, Diego Novillo, Daniel Berlin, GCC Patches,
	Andrew Pinski, Eric Botcazou

Hi,

On Thu, 28 Jun 2007, Alexandre Oliva wrote:

> On Jun 26, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> 
> > This fails as below.
> 
> Thanks.  I've fixed these and then some, then re-tested on both
> i686-linux-gnu and x86_64-linux-gnu, no regressions.

I added this patch to my test run and it bootstraps on m68k, but there are 
a few test failures, e.g. gcc.c-torture/execute/20031211-1.c is a simple one.

bye, Roman

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Reload bug & SRA oddness
  2007-07-03  0:52                                                       ` Roman Zippel
@ 2007-07-06  9:21                                                         ` Alexandre Oliva
  2007-08-24  7:28                                                           ` SRA bit-field optimization (was: Re: Reload bug & SRA oddness) Alexandre Oliva
  2007-09-29 17:52                                                           ` Reload bug & SRA oddness Diego Novillo
  0 siblings, 2 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-07-06  9:21 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Bernd Schmidt, Diego Novillo, Daniel Berlin, GCC Patches,
	Andrew Pinski, Eric Botcazou

[-- Attachment #1: Type: text/plain, Size: 1369 bytes --]

On Jul  2, 2007, Roman Zippel <zippel@linux-m68k.org> wrote:

> Hi,
> On Thu, 28 Jun 2007, Alexandre Oliva wrote:

>> On Jun 26, 2007, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
>> 
>> > This fails as below.
>> 
>> Thanks.  I've fixed these and then some, then re-tested on both
>> i686-linux-gnu and x86_64-linux-gnu, no regressions.

> I added this patch to my test run and it bootstraps on m68k, but there are 
> a few test failures, e.g. gcc.c-torture/execute/20031211-1.c is a simple one.

Thanks for the report, and for having tested the patch below and
confirming it fixed the problem.  The patch lost sight of the actual
width of a bit-field and used its underlying type's width instead for
bit-field operations, so, in order to extract the single-bit bit-field
from the scalarized variable, it did a whole lot of right-shifting on
big-endian boxes.  Oops.

I realized the lack of truncation could also be a problem for other
scalarized bit-field variables, so I arranged for accesses to these
variables to be handled as bit-fields as well.  I'm not entirely happy
about this, but I don't see that it's safe to do otherwise.

Here's the patch, bootstrapped and regression-tested on
x86_64-linux-gnu and i686-pc-linux-gnu (with an oldish, pre-sccvn
tree, to be able to test ada as well).  Tests on ppc-linux-gnu and
ppc64-linux-gnu underway.

Ok to install?


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: gcc-sra-bit-field-ref-take2.patch --]
[-- Type: text/x-patch, Size: 48682 bytes --]

for gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR middle-end/22156
	* tree-sra.c (struct sra_elt): Add in_bitfld_block.
	(sra_hash_tree): Handle BIT_FIELD_REFs.
	(sra_elt_hash): Don't hash bitfld blocks.
	(sra_elt_eq): Skip them in parent compares as well.  Handle
	BIT_FIELD_REFs.
	(build_element_name_1): Handle BIT_FIELD_REFs.
	(instantiate_element): Propagate nowarn from parents.  Create
	BIT_FIELD_REF for variables that are widened by scalarization.
	Gimple-zero-initialize all bit-field variables that are not
	part of parameters that are going to be scalarized on entry.
	(instantiate_missing_elements_1): Return the sra_elt.
	(canon_type_for_field): New.
	(try_instantiate_multiple_fields): New.  Infer widest possible
	access mode from decl or member type, but clip it at word
	size, and only widen it if a field crosses an alignment
	boundary.
	(instantiate_missing_elements): Use them.
	(generate_one_element_ref): Handle BIT_FIELD_REFs.
	(scalar_bitfield_p): New.
	(sra_build_assignment): Optimize assignments from scalarizable
	BIT_FIELD_REFs.  Use BITS_BIG_ENDIAN to determine shift
	counts.
	(REPLDUP): New.
	(sra_build_bf_assignment): New.  Optimize assignments to
	scalarizable BIT_FIELD_REFs.
	(sra_build_elt_assignment): New.  Optimize BIT_FIELD_REF
	assignments to full variables.
	(generate_copy_inout): Use the new macros and functions.
	(generate_element_copy): Likewise.  Handle bitfld differences.
	(generate_element_zero): Don't recurse for blocks.  Use
	sra_build_elt_assignment.
	(generate_one_element_init): Take elt instead of var.  Use
	sra_build_elt_assignment.
	(generate_element_init_1): Adjust.
	(bitfield_overlap_info): New struct.
	(bitfield_overlaps_p): New.
	(sra_explode_bitfield_assignment): New.  Adjust widened
	variables to account for endianness.
	(sra_sync_for_bitfield_assignment): New.
	(scalarize_use): Re-expand assignment to/from scalarized
	BIT_FIELD_REFs.  Explode or sync needed members for
	BIT_FIELD_REFs accesses or assignments.  Use REPLDUP.
	(scalarize_copy): Use REPLDUP.
	(scalarize_ldst): Move assert before dereference.  Adjust EH
	handling.
	(dump_sra_elt_name): Handle BIT_FIELD_REFs.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-07-03 20:14:50.000000000 -0300
+++ gcc/tree-sra.c	2007-07-04 23:55:34.000000000 -0300
@@ -147,6 +147,10 @@ struct sra_elt
 
   /* True if there is BIT_FIELD_REF on the lhs with a vector. */
   bool is_vector_lhs;
+
+  /* 1 if the element is a field that is part of a block, 2 if the field
+     is the block itself, 0 if it's neither.  */
+  char in_bitfld_block;
 };
 
 #define IS_ELEMENT_FOR_GROUP(ELEMENT) (TREE_CODE (ELEMENT) == RANGE_EXPR)
@@ -205,6 +209,9 @@ extern void debug_sra_elt_name (struct s
 
 /* Forward declarations.  */
 static tree generate_element_ref (struct sra_elt *);
+static tree sra_build_assignment (tree dst, tree src);
+static void mark_all_v_defs (tree list);
+
 \f
 /* Return true if DECL is an SRA candidate.  */
 
@@ -461,6 +468,12 @@ sra_hash_tree (tree t)
       h = iterative_hash_expr (DECL_FIELD_BIT_OFFSET (t), h);
       break;
 
+    case BIT_FIELD_REF:
+      /* Don't take operand 0 into account, that's our parent.  */
+      h = iterative_hash_expr (TREE_OPERAND (t, 1), 0);
+      h = iterative_hash_expr (TREE_OPERAND (t, 2), h);
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -479,12 +492,14 @@ sra_elt_hash (const void *x)
 
   h = sra_hash_tree (e->element);
 
-  /* Take into account everything back up the chain.  Given that chain
-     lengths are rarely very long, this should be acceptable.  If we
-     truly identify this as a performance problem, it should work to
-     hash the pointer value "e->parent".  */
+  /* Take into account everything except bitfield blocks back up the
+     chain.  Given that chain lengths are rarely very long, this
+     should be acceptable.  If we truly identify this as a performance
+     problem, it should work to hash the pointer value
+     "e->parent".  */
   for (p = e->parent; p ; p = p->parent)
-    h = (h * 65521) ^ sra_hash_tree (p->element);
+    if (!p->in_bitfld_block)
+      h = (h * 65521) ^ sra_hash_tree (p->element);
 
   return h;
 }
@@ -497,8 +512,17 @@ sra_elt_eq (const void *x, const void *y
   const struct sra_elt *a = x;
   const struct sra_elt *b = y;
   tree ae, be;
+  const struct sra_elt *ap = a->parent;
+  const struct sra_elt *bp = b->parent;
+
+  if (ap)
+    while (ap->in_bitfld_block)
+      ap = ap->parent;
+  if (bp)
+    while (bp->in_bitfld_block)
+      bp = bp->parent;
 
-  if (a->parent != b->parent)
+  if (ap != bp)
     return false;
 
   ae = a->element;
@@ -533,6 +557,11 @@ sra_elt_eq (const void *x, const void *y
 	return false;
       return fields_compatible_p (ae, be);
 
+    case BIT_FIELD_REF:
+      return
+	tree_int_cst_equal (TREE_OPERAND (ae, 1), TREE_OPERAND (be, 1))
+	&& tree_int_cst_equal (TREE_OPERAND (ae, 2), TREE_OPERAND (be, 2));
+
     default:
       gcc_unreachable ();
     }
@@ -888,7 +917,7 @@ sra_walk_asm_expr (tree expr, block_stmt
 
 static void
 sra_walk_gimple_modify_stmt (tree expr, block_stmt_iterator *bsi,
-		      const struct sra_walk_fns *fns)
+			     const struct sra_walk_fns *fns)
 {
   struct sra_elt *lhs_elt, *rhs_elt;
   tree lhs, rhs;
@@ -1177,6 +1206,15 @@ build_element_name_1 (struct sra_elt *el
       sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (t));
       obstack_grow (&sra_obstack, buffer, strlen (buffer));
     }
+  else if (TREE_CODE (t) == BIT_FIELD_REF)
+    {
+      sprintf (buffer, "B" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 2), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+      sprintf (buffer, "F" HOST_WIDE_INT_PRINT_DEC,
+	       tree_low_cst (TREE_OPERAND (t, 1), 1));
+      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+    }
   else
     {
       tree name = DECL_NAME (t);
@@ -1209,13 +1247,34 @@ instantiate_element (struct sra_elt *elt
 {
   struct sra_elt *base_elt;
   tree var, base;
+  bool nowarn = TREE_NO_WARNING (elt->element);
 
   for (base_elt = elt; base_elt->parent; base_elt = base_elt->parent)
-    continue;
+    if (!nowarn)
+      nowarn = TREE_NO_WARNING (base_elt->parent->element);
   base = base_elt->element;
 
   elt->replacement = var = make_rename_temp (elt->type, "SR");
 
+  if (DECL_P (elt->element)
+      && !tree_int_cst_equal (DECL_SIZE (var), DECL_SIZE (elt->element)))
+    {
+      DECL_SIZE (var) = DECL_SIZE (elt->element);
+      DECL_SIZE_UNIT (var) = DECL_SIZE_UNIT (elt->element);
+
+      elt->in_bitfld_block = 1;
+      elt->replacement = build3 (BIT_FIELD_REF, elt->type, var,
+				 DECL_SIZE (var),
+				 BITS_BIG_ENDIAN
+				 ? size_binop (MINUS_EXPR,
+					       TYPE_SIZE (elt->type),
+					       DECL_SIZE (var))
+				 : bitsize_int (0));
+      if (!INTEGRAL_TYPE_P (elt->type)
+	  || TYPE_UNSIGNED (elt->type))
+	BIT_FIELD_REF_UNSIGNED (elt->replacement) = 1;
+    }
+
   /* For vectors, if used on the left hand side with BIT_FIELD_REF,
      they are not a gimple register.  */
   if (TREE_CODE (TREE_TYPE (var)) == VECTOR_TYPE && elt->is_vector_lhs)
@@ -1240,9 +1299,7 @@ instantiate_element (struct sra_elt *elt
       DECL_DEBUG_EXPR_IS_FROM (var) = 1;
       
       DECL_IGNORED_P (var) = 0;
-      TREE_NO_WARNING (var) = TREE_NO_WARNING (base);
-      if (elt->element && TREE_NO_WARNING (elt->element))
-	TREE_NO_WARNING (var) = 1;
+      TREE_NO_WARNING (var) = nowarn;
     }
   else
     {
@@ -1251,6 +1308,18 @@ instantiate_element (struct sra_elt *elt
       TREE_NO_WARNING (var) = 1;
     }
 
+  /* Zero-initialize bit-field scalarization variables, to avoid
+     triggering undefined behavior.  */
+  if (TREE_CODE (elt->element) == BIT_FIELD_REF
+      || (var != elt->replacement
+	  && TREE_CODE (elt->replacement) == BIT_FIELD_REF))
+    {
+      tree init = sra_build_assignment (var, fold_convert (TREE_TYPE (var),
+							   integer_zero_node));
+      insert_edge_copies (init, ENTRY_BLOCK_PTR);
+      mark_all_v_defs (init);
+    }
+
   if (dump_file)
     {
       fputs ("  ", dump_file);
@@ -1337,7 +1406,7 @@ sum_instantiated_sizes (struct sra_elt *
 
 static void instantiate_missing_elements (struct sra_elt *elt);
 
-static void
+static struct sra_elt *
 instantiate_missing_elements_1 (struct sra_elt *elt, tree child, tree type)
 {
   struct sra_elt *sub = lookup_element (elt, child, type, INSERT);
@@ -1348,6 +1417,303 @@ instantiate_missing_elements_1 (struct s
     }
   else
     instantiate_missing_elements (sub);
+  return sub;
+}
+
+/* Obtain the canonical type for field F of ELEMENT.  */
+
+static tree
+canon_type_for_field (tree f, tree element)
+{
+  tree field_type = TREE_TYPE (f);
+
+  /* canonicalize_component_ref() unwidens some bit-field types (not
+     marked as DECL_BIT_FIELD in C++), so we must do the same, lest we
+     introduce type mismatches.  */
+  if (INTEGRAL_TYPE_P (field_type)
+      && DECL_MODE (f) != TYPE_MODE (field_type))
+    field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
+						   field_type,
+						   element,
+						   f, NULL_TREE),
+					   NULL_TREE));
+
+  return field_type;
+}
+
+/* Look for adjacent fields of ELT starting at F that we'd like to
+   scalarize as a single variable.  Return the last field of the
+   group.  */
+
+static tree
+try_instantiate_multiple_fields (struct sra_elt *elt, tree f)
+{
+  int count;
+  unsigned HOST_WIDE_INT align, bit, size, alchk;
+  enum machine_mode mode;
+  tree first = f, prev;
+  tree type, var;
+  struct sra_elt *block;
+
+  if (!is_sra_scalar_type (TREE_TYPE (f))
+      || !host_integerp (DECL_FIELD_OFFSET (f), 1)
+      || !host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+      || !host_integerp (DECL_SIZE (f), 1)
+      || lookup_element (elt, f, NULL, NO_INSERT))
+    return f;
+
+  block = elt;
+
+  /* For complex and array objects, there are going to be integer
+     literals as child elements.  In this case, we can't just take the
+     alignment and mode of the decl, so we instead rely on the element
+     type.
+
+     ??? We could try to infer additional alignment from the full
+     object declaration and the location of the sub-elements we're
+     accessing.  */
+  for (count = 0; !DECL_P (block->element); count++)
+    block = block->parent;
+
+  align = DECL_ALIGN (block->element);
+  alchk = GET_MODE_BITSIZE (DECL_MODE (block->element));
+
+  if (count)
+    {
+      type = TREE_TYPE (block->element);
+      while (count--)
+	type = TREE_TYPE (type);
+
+      align = TYPE_ALIGN (type);
+      alchk = GET_MODE_BITSIZE (TYPE_MODE (type));
+    }
+
+  if (align < alchk)
+    align = alchk;
+
+  /* Coalescing wider fields is probably pointless and
+     inefficient.  */
+  if (align > BITS_PER_WORD)
+    align = BITS_PER_WORD;
+
+  bit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+  size = tree_low_cst (DECL_SIZE (f), 1);
+
+  alchk = align - 1;
+  alchk = ~alchk;
+
+  if ((bit & alchk) != ((bit + size - 1) & alchk))
+    return f;
+
+  /* Find adjacent fields in the same alignment word.  */
+
+  for (prev = f, f = TREE_CHAIN (f);
+       f && TREE_CODE (f) == FIELD_DECL
+	 && is_sra_scalar_type (TREE_TYPE (f))
+	 && host_integerp (DECL_FIELD_OFFSET (f), 1)
+	 && host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
+	 && host_integerp (DECL_SIZE (f), 1)
+	 && !lookup_element (elt, f, NULL, NO_INSERT);
+       prev = f, f = TREE_CHAIN (f))
+    {
+      unsigned HOST_WIDE_INT nbit, nsize;
+
+      nbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	+ tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+      nsize = tree_low_cst (DECL_SIZE (f), 1);
+
+      if (bit + size == nbit)
+	{
+	  if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+	    {
+	      /* If we're at an alignment boundary, don't bother
+		 growing alignment such that we can include this next
+		 field.  */
+	      if ((nbit & alchk)
+		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BITSIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
+		break;
+	    }
+	  size += nsize;
+	}
+      else if (nbit + nsize == bit)
+	{
+	  if ((nbit & alchk) != ((bit + size - 1) & alchk))
+	    {
+	      if ((bit & alchk)
+		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
+		break;
+
+	      align = GET_MODE_BITSIZE (DECL_MODE (f));
+	      alchk = align - 1;
+	      alchk = ~alchk;
+
+	      if ((nbit & alchk) != ((bit + size - 1) & alchk))
+		break;
+	    }
+	  bit = nbit;
+	  size += nsize;
+	}
+      else
+	break;
+    }
+
+  f = prev;
+
+  if (f == first)
+    return f;
+
+  gcc_assert ((bit & alchk) == ((bit + size - 1) & alchk));
+
+  /* Try to widen the bit range so as to cover padding bits as well.  */
+
+  if ((bit & ~alchk) || size != align)
+    {
+      unsigned HOST_WIDE_INT mbit = bit & alchk;
+      unsigned HOST_WIDE_INT msize = align;
+
+      for (f = TYPE_FIELDS (elt->type);
+	   f; f = TREE_CHAIN (f))
+	{
+	  unsigned HOST_WIDE_INT fbit, fsize;
+
+	  /* Skip the fields from first to prev.  */
+	  if (f == first)
+	    {
+	      f = prev;
+	      continue;
+	    }
+
+	  if (!(TREE_CODE (f) == FIELD_DECL
+		&& host_integerp (DECL_FIELD_OFFSET (f), 1)
+		&& host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)))
+	    continue;
+
+	  fbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
+	    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+
+	  /* If we're past the selected word, we're fine.  */
+	  if ((bit & alchk) < (fbit & alchk))
+	    continue;
+
+	  if (host_integerp (DECL_SIZE (f), 1))
+	    fsize = tree_low_cst (DECL_SIZE (f), 1);
+	  else
+	    /* Assume a variable-sized field takes up all space till
+	       the end of the word.  ??? Endianness issues?  */
+	    fsize = align - (fbit & alchk);
+
+	  if ((fbit & alchk) < (bit & alchk))
+	    {
+	      /* A large field might start at a previous word and
+		 extend into the selected word.  Exclude those
+		 bits.  ??? Endianness issues? */
+	      HOST_WIDE_INT diff = fbit + fsize - mbit;
+
+	      if (diff <= 0)
+		continue;
+
+	      mbit += diff;
+	      msize -= diff;
+	    }
+	  else
+	    {
+	      /* Non-overlapping, great.  */
+	      if (fbit + fsize <= mbit
+		  || mbit + msize <= fbit)
+		continue;
+
+	      if (fbit <= mbit)
+		{
+		  unsigned HOST_WIDE_INT diff = fbit + fsize - mbit;
+		  mbit += diff;
+		  msize -= diff;
+		}
+	      else if (fbit > mbit)
+		msize -= (mbit + msize - fbit);
+	      else
+		gcc_unreachable ();
+	    }
+	}
+
+      bit = mbit;
+      size = msize;
+    }
+
+  /* Now we know the bit range we're interested in.  Find the smallest
+     machine mode we can use to access it.  */
+
+  for (mode = smallest_mode_for_size (size, MODE_INT);
+       ;
+       mode = GET_MODE_WIDER_MODE (mode))
+    {
+      gcc_assert (mode != VOIDmode);
+
+      alchk = GET_MODE_PRECISION (mode) - 1;
+      alchk = ~alchk;
+
+      if ((bit & alchk) == ((bit + size - 1) & alchk))
+	break;
+    }
+
+  gcc_assert (~alchk < align);
+
+  /* Create the field group as a single variable.  */
+
+  type = lang_hooks.types.type_for_mode (mode, 1);
+  gcc_assert (type);
+  var = build3 (BIT_FIELD_REF, type, NULL_TREE,
+		bitsize_int (size),
+		bitsize_int (bit));
+  BIT_FIELD_REF_UNSIGNED (var) = 1;
+
+  block = instantiate_missing_elements_1 (elt, var, type);
+  gcc_assert (block && block->is_scalar);
+
+  var = block->replacement;
+
+  if ((bit & ~alchk)
+      || (HOST_WIDE_INT)size != tree_low_cst (DECL_SIZE (var), 1))
+    {
+      block->replacement = build3 (BIT_FIELD_REF,
+				   TREE_TYPE (block->element), var,
+				   bitsize_int (size),
+				   bitsize_int (bit & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (block->replacement) = 1;
+    }
+
+  block->in_bitfld_block = 2;
+
+  /* Add the member fields to the group, such that they access
+     portions of the group variable.  */
+
+  for (f = first; f != TREE_CHAIN (prev); f = TREE_CHAIN (f))
+    {
+      tree field_type = canon_type_for_field (f, elt->element);
+      struct sra_elt *fld = lookup_element (block, f, field_type, INSERT);
+
+      gcc_assert (fld && fld->is_scalar && !fld->replacement);
+
+      fld->replacement = build3 (BIT_FIELD_REF, field_type, var,
+				 DECL_SIZE (f),
+				 bitsize_int
+				 ((TREE_INT_CST_LOW (DECL_FIELD_OFFSET (f))
+				   * BITS_PER_UNIT
+				   + (TREE_INT_CST_LOW
+				      (DECL_FIELD_BIT_OFFSET (f))))
+				  & ~alchk));
+      BIT_FIELD_REF_UNSIGNED (fld->replacement) = TYPE_UNSIGNED (field_type);
+      fld->in_bitfld_block = 1;
+    }
+
+  return prev;
 }
 
 static void
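(Editorial aside, not part of the patch: the coalescing loop above repeatedly applies one test, namely whether a bit range stays inside a single aligned word, via the `alchk = ~(align - 1)` mask. A minimal C sketch of that test — the function name is ours, and `align` is assumed to be a power of two, as in the pass:)

```c
#include <assert.h>

/* A bit range [bit, bit + size) fits in a single ALIGN-bit aligned
   word iff its first and its last bit land in the same aligned word.
   ALCHK masks away the within-word offset, leaving the word number.  */
int
fits_in_one_word (unsigned long bit, unsigned long size, unsigned long align)
{
  unsigned long alchk = ~(align - 1);
  return (bit & alchk) == ((bit + size - 1) & alchk);
}
```

(For a 32-bit word, adjacent 16-bit fields at bits 0 and 16 pass the test together, while a 16-bit field at bit 24 crosses the boundary and fails it.)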
@@ -1363,21 +1729,17 @@ instantiate_missing_elements (struct sra
 	for (f = TYPE_FIELDS (type); f ; f = TREE_CHAIN (f))
 	  if (TREE_CODE (f) == FIELD_DECL)
 	    {
-	      tree field_type = TREE_TYPE (f);
+	      tree last = try_instantiate_multiple_fields (elt, f);
 
-	      /* canonicalize_component_ref() unwidens some bit-field
-		 types (not marked as DECL_BIT_FIELD in C++), so we
-		 must do the same, lest we may introduce type
-		 mismatches.  */
-	      if (INTEGRAL_TYPE_P (field_type)
-		  && DECL_MODE (f) != TYPE_MODE (field_type))
-		field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
-							       field_type,
-							       elt->element,
-							       f, NULL_TREE),
-						       NULL_TREE));
+	      if (last != f)
+		{
+		  f = last;
+		  continue;
+		}
 
-	      instantiate_missing_elements_1 (elt, f, field_type);
+	      instantiate_missing_elements_1 (elt, f,
+					      canon_type_for_field
+					      (f, elt->element));
 	    }
 	break;
       }
@@ -1689,6 +2051,16 @@ generate_one_element_ref (struct sra_elt
       {
 	tree field = elt->element;
 
+	/* We can't test elt->in_bitfld_block here because, when this is
+	   called from instantiate_element, we haven't set this field
+	   yet.  */
+	if (TREE_CODE (field) == BIT_FIELD_REF)
+	  {
+	    tree ret = unshare_expr (field);
+	    TREE_OPERAND (ret, 0) = base;
+	    return ret;
+	  }
+
 	/* Watch out for compatible records with differing field lists.  */
 	if (DECL_FIELD_CONTEXT (field) != TYPE_MAIN_VARIANT (TREE_TYPE (base)))
 	  field = find_compatible_field (TREE_TYPE (base), field);
@@ -1725,11 +2097,180 @@ generate_element_ref (struct sra_elt *el
     return elt->element;
 }
 
+/* Return true if BF is a bit-field that we can handle like a scalar.  */
+
+static bool
+scalar_bitfield_p (tree bf)
+{
+  return (TREE_CODE (bf) == BIT_FIELD_REF
+	  && (is_gimple_reg (TREE_OPERAND (bf, 0))
+	      || (TYPE_MODE (TREE_TYPE (TREE_OPERAND (bf, 0))) != BLKmode
+		  && (!TREE_SIDE_EFFECTS (TREE_OPERAND (bf, 0))
+		      || (GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE
+						       (TREE_OPERAND (bf, 0))))
+			  <= BITS_PER_WORD)))));
+}
+
 /* Create an assignment statement from SRC to DST.  */
 
 static tree
 sra_build_assignment (tree dst, tree src)
 {
+  /* Turning BIT_FIELD_REFs into bit operations enables other passes
+     to do a much better job at optimizing the code.  */
+  if (scalar_bitfield_p (src))
+    {
+      tree cst, cst2, mask, minshift, maxshift;
+      tree tmp, var, utype, stype;
+      tree list, stmt;
+      bool unsignedp = BIT_FIELD_REF_UNSIGNED (src);
+
+      var = TREE_OPERAND (src, 0);
+      cst = TREE_OPERAND (src, 2);
+      cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (src, 1),
+			 TREE_OPERAND (src, 2));
+
+      if (BITS_BIG_ENDIAN)
+	{
+	  maxshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst);
+	  minshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst2);
+	}
+      else
+	{
+	  maxshift = cst2;
+	  minshift = cst;
+	}
+
+      stype = TREE_TYPE (var);
+      if (!INTEGRAL_TYPE_P (stype))
+	stype = lang_hooks.types.type_for_size (TREE_INT_CST_LOW
+						(TYPE_SIZE (stype)), 1);
+      else if (!TYPE_UNSIGNED (stype))
+	stype = unsigned_type_for (stype);
+
+      utype = TREE_TYPE (dst);
+      if (!INTEGRAL_TYPE_P (utype))
+	utype = lang_hooks.types.type_for_size (TREE_INT_CST_LOW
+						(TYPE_SIZE (utype)), 1);
+      else if (!TYPE_UNSIGNED (utype))
+	utype = unsigned_type_for (utype);
+
+      list = NULL;
+
+      cst2 = size_binop (MINUS_EXPR, maxshift, minshift);
+      if (tree_int_cst_equal (cst2, TYPE_SIZE (utype)))
+	{
+	  unsignedp = true;
+	  mask = NULL_TREE;
+	}
+      else
+	{
+	  mask = build_int_cst_wide (utype, 1, 0);
+	  cst = int_const_binop (LSHIFT_EXPR, mask, cst2, true);
+	  mask = int_const_binop (MINUS_EXPR, cst, mask, true);
+	}
+
+      tmp = make_rename_temp (stype, "SR");
+      if (TYPE_MAIN_VARIANT (TREE_TYPE (var)) != TYPE_MAIN_VARIANT (stype))
+	{
+	  if (INTEGRAL_TYPE_P (TREE_TYPE (var)))
+	    stmt = build_gimple_modify_stmt (tmp,
+					     fold_convert (stype, var));
+	  else
+	    stmt = build_gimple_modify_stmt (tmp,
+					     fold_build1 (VIEW_CONVERT_EXPR,
+							  stype, var));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (!integer_zerop (minshift))
+	{
+	  tmp = make_rename_temp (stype, "SR");
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (RSHIFT_EXPR, stype,
+							var, minshift));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (TYPE_MAIN_VARIANT (utype) != TYPE_MAIN_VARIANT (stype))
+	{
+	  if (!mask && unsignedp
+	      && (TYPE_MAIN_VARIANT (utype)
+		  == TYPE_MAIN_VARIANT (TREE_TYPE (dst))))
+	    tmp = dst;
+	  else
+	    tmp = make_rename_temp (utype, "SR");
+
+	  stmt = build_gimple_modify_stmt (tmp, fold_convert (utype, var));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (mask)
+	{
+	  if (!unsignedp
+	      || (TYPE_MAIN_VARIANT (TREE_TYPE (dst))
+		  != TYPE_MAIN_VARIANT (utype)))
+	    tmp = make_rename_temp (utype, "SR");
+	  else
+	    tmp = dst;
+
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (BIT_AND_EXPR, utype,
+							var, mask));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (!unsignedp)
+	{
+	  tree signbit = int_const_binop (LSHIFT_EXPR,
+					  build_int_cst_wide (utype, 1, 0),
+					  size_binop (MINUS_EXPR, cst2,
+						      bitsize_int (1)),
+					  true);
+
+	  tmp = make_rename_temp (utype, "SR");
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (BIT_XOR_EXPR, utype,
+							var, signbit));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+
+	  if (TYPE_MAIN_VARIANT (TREE_TYPE (dst)) != TYPE_MAIN_VARIANT (utype))
+	    tmp = make_rename_temp (utype, "SR");
+	  else
+	    tmp = dst;
+
+	  stmt = build_gimple_modify_stmt (tmp,
+					   fold_build2 (MINUS_EXPR, utype,
+							var, signbit));
+	  append_to_statement_list (stmt, &list);
+
+	  var = tmp;
+	}
+
+      if (var != dst)
+	{
+	  if (INTEGRAL_TYPE_P (TREE_TYPE (dst)))
+	    var = fold_convert (TREE_TYPE (dst), var);
+	  else
+	    var = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (dst), var);
+
+	  stmt = build_gimple_modify_stmt (dst, var);
+	  append_to_statement_list (stmt, &list);
+	}
+
+      return list;
+    }
+
   /* It was hoped that we could perform some type sanity checking
      here, but since front-ends can emit accesses of fields in types
      different from their nominal types and copy structures containing
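(Editorial aside, not part of the patch: the statement sequence sra_build_assignment builds above — shift right, mask, then XOR with the sign bit and subtract it — is the standard branch-free sign extension. A hypothetical C equivalent, assuming little-endian bit numbering and 0 < len < 64:)

```c
#include <assert.h>

/* Extract the LEN-bit field at bit POS of WORD and sign-extend it,
   mirroring the RSHIFT/BIT_AND followed by BIT_XOR/MINUS statements
   the pass emits for a signed destination.  */
long
extract_signed (unsigned long word, unsigned pos, unsigned len)
{
  unsigned long mask = (1UL << len) - 1;
  unsigned long x = (word >> pos) & mask;     /* RSHIFT_EXPR, BIT_AND_EXPR */
  unsigned long signbit = 1UL << (len - 1);
  /* XORing with the sign bit and subtracting it maps field values
     with the sign bit set to their negative two's-complement value.  */
  return (long) (x ^ signbit) - (long) signbit;
}
```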
@@ -1741,6 +2282,256 @@ sra_build_assignment (tree dst, tree src
   return build_gimple_modify_stmt (dst, src);
 }
 
+/* BIT_FIELD_REFs must not be shared.  sra_build_elt_assignment()
+   takes care of assignments, but we must create copies for uses.  */
+#define REPLDUP(t) (TREE_CODE (t) != BIT_FIELD_REF ? (t) : unshare_expr (t))
+
+/* Emit an assignment from SRC to DST, but if DST is a scalarizable
+   BIT_FIELD_REF, turn it into bit operations.  */
+
+static tree
+sra_build_bf_assignment (tree dst, tree src)
+{
+  tree var, type, utype, tmp, tmp2, tmp3;
+  tree list, stmt;
+  tree cst, cst2, mask;
+  tree minshift, maxshift;
+
+  if (TREE_CODE (dst) != BIT_FIELD_REF)
+    return sra_build_assignment (dst, src);
+
+  var = TREE_OPERAND (dst, 0);
+
+  if (!scalar_bitfield_p (dst))
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  list = NULL;
+
+  cst = fold_convert (bitsizetype, TREE_OPERAND (dst, 2));
+  cst2 = size_binop (PLUS_EXPR,
+		     fold_convert (bitsizetype, TREE_OPERAND (dst, 1)),
+		     cst);
+
+  if (BITS_BIG_ENDIAN)
+    {
+      maxshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst);
+      minshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst2);
+    }
+  else
+    {
+      maxshift = cst2;
+      minshift = cst;
+    }
+
+  type = TREE_TYPE (var);
+  if (!INTEGRAL_TYPE_P (type))
+    type = lang_hooks.types.type_for_size
+      (TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (var))), 1);
+  if (TYPE_UNSIGNED (type))
+    utype = type;
+  else
+    utype = unsigned_type_for (type);
+
+  mask = build_int_cst_wide (utype, 1, 0);
+  cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, true);
+  cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, true);
+  mask = int_const_binop (MINUS_EXPR, cst, cst2, true);
+  mask = fold_build1 (BIT_NOT_EXPR, utype, mask);
+
+  if (TYPE_MAIN_VARIANT (utype) != TYPE_MAIN_VARIANT (TREE_TYPE (var))
+      && !integer_zerop (mask))
+    {
+      tmp = var;
+      if (!is_gimple_variable (tmp))
+	tmp = unshare_expr (var);
+
+      tmp2 = make_rename_temp (utype, "SR");
+
+      if (INTEGRAL_TYPE_P (TREE_TYPE (var)))
+	stmt = build_gimple_modify_stmt (tmp2, fold_convert (utype, tmp));
+      else
+	stmt = build_gimple_modify_stmt (tmp2, fold_build1 (VIEW_CONVERT_EXPR,
+							    utype, tmp));
+      append_to_statement_list (stmt, &list);
+    }
+  else
+    tmp2 = var;
+
+  if (!integer_zerop (mask))
+    {
+      tmp = make_rename_temp (utype, "SR");
+      stmt = build_gimple_modify_stmt (tmp,
+				       fold_build2 (BIT_AND_EXPR, utype,
+						    tmp2, mask));
+      append_to_statement_list (stmt, &list);
+    }
+  else
+    tmp = mask;
+
+  if (is_gimple_reg (src) && INTEGRAL_TYPE_P (TREE_TYPE (src)))
+    tmp2 = src;
+  else if (INTEGRAL_TYPE_P (TREE_TYPE (src)))
+    {
+      tmp2 = make_rename_temp (TREE_TYPE (src), "SR");
+      stmt = sra_build_assignment (tmp2, src);
+      append_to_statement_list (stmt, &list);
+    }
+  else
+    {
+      tmp2 = make_rename_temp
+	(lang_hooks.types.type_for_size
+	 (TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (src))),
+	  1), "SR");
+      stmt = sra_build_assignment (tmp2, fold_build1 (VIEW_CONVERT_EXPR,
+						      TREE_TYPE (tmp2), src));
+      append_to_statement_list (stmt, &list);
+    }
+
+  if (!TYPE_UNSIGNED (TREE_TYPE (tmp2)))
+    {
+      tree ut = unsigned_type_for (TREE_TYPE (tmp2));
+      tmp3 = make_rename_temp (ut, "SR");
+      tmp2 = fold_convert (ut, tmp2);
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+
+      tmp2 = fold_build1 (BIT_NOT_EXPR, utype, mask);
+      tmp2 = int_const_binop (RSHIFT_EXPR, tmp2, minshift, true);
+      tmp2 = fold_convert (ut, tmp2);
+      tmp2 = fold_build2 (BIT_AND_EXPR, utype, tmp3, tmp2);
+
+      if (tmp3 != tmp2)
+	{
+	  tmp3 = make_rename_temp (ut, "SR");
+	  stmt = sra_build_assignment (tmp3, tmp2);
+	  append_to_statement_list (stmt, &list);
+	}
+
+      tmp2 = tmp3;
+    }
+
+  if (TYPE_MAIN_VARIANT (TREE_TYPE (tmp2)) != TYPE_MAIN_VARIANT (utype))
+    {
+      tmp3 = make_rename_temp (utype, "SR");
+      tmp2 = fold_convert (utype, tmp2);
+      stmt = sra_build_assignment (tmp3, tmp2);
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (!integer_zerop (minshift))
+    {
+      tmp3 = make_rename_temp (utype, "SR");
+      stmt = build_gimple_modify_stmt (tmp3,
+				       fold_build2 (LSHIFT_EXPR, utype,
+						    tmp2, minshift));
+      append_to_statement_list (stmt, &list);
+      tmp2 = tmp3;
+    }
+
+  if (utype != TREE_TYPE (var))
+    tmp3 = make_rename_temp (utype, "SR");
+  else
+    tmp3 = var;
+  stmt = build_gimple_modify_stmt (tmp3,
+				   fold_build2 (BIT_IOR_EXPR, utype,
+						tmp, tmp2));
+  append_to_statement_list (stmt, &list);
+
+  if (tmp3 != var)
+    {
+      if (TREE_TYPE (var) == type)
+	stmt = build_gimple_modify_stmt (var,
+					 fold_convert (type, tmp3));
+      else
+	stmt = build_gimple_modify_stmt (var,
+					 fold_build1 (VIEW_CONVERT_EXPR,
+						      TREE_TYPE (var), tmp3));
+      append_to_statement_list (stmt, &list);
+    }
+
+  return list;
+}
+
+/* Expand an assignment of SRC to the scalarized representation of
+   ELT.  If it is a field group, try to widen the assignment to cover
+   the full variable.  */
+
+static tree
+sra_build_elt_assignment (struct sra_elt *elt, tree src)
+{
+  tree dst = elt->replacement;
+  tree var, tmp, cst, cst2, list, stmt;
+
+  if (TREE_CODE (dst) != BIT_FIELD_REF
+      || !elt->in_bitfld_block)
+    return sra_build_assignment (REPLDUP (dst), src);
+
+  var = TREE_OPERAND (dst, 0);
+
+  /* Try to widen the assignment to the entire variable.
+     We need the source to be a BIT_FIELD_REF as well, such that, for
+     BIT_FIELD_REF<d,sz,dp> = BIT_FIELD_REF<s,sz,sp>, the conditions
+     are, by design, met for us to turn it into
+     d = BIT_FIELD_REF<s,dw,sp-dp>.  */
+  if (elt->in_bitfld_block == 2
+      && TREE_CODE (src) == BIT_FIELD_REF)
+    {
+      cst = TYPE_SIZE (TREE_TYPE (var));
+      cst2 = size_binop (MINUS_EXPR, TREE_OPERAND (src, 2),
+			 TREE_OPERAND (dst, 2));
+
+      src = TREE_OPERAND (src, 0);
+
+      /* Avoid full-width bit-fields.  */
+      if (integer_zerop (cst2)
+	  && tree_int_cst_equal (cst, TYPE_SIZE (TREE_TYPE (src))))
+	{
+	  if (INTEGRAL_TYPE_P (TREE_TYPE (src))
+	      && !TYPE_UNSIGNED (TREE_TYPE (src)))
+	    src = fold_convert (unsigned_type_for (TREE_TYPE (src)), src);
+
+	  /* If a single conversion won't do, we'll need a statement
+	     list.  */
+	  if (TYPE_MAIN_VARIANT (TREE_TYPE (var))
+	      != TYPE_MAIN_VARIANT (TREE_TYPE (src)))
+	    {
+	      list = NULL;
+
+	      if (!INTEGRAL_TYPE_P (TREE_TYPE (src))
+		  || !TYPE_UNSIGNED (TREE_TYPE (src)))
+		src = fold_build1 (VIEW_CONVERT_EXPR,
+				   lang_hooks.types.type_for_size
+				   (TREE_INT_CST_LOW
+				    (TYPE_SIZE (TREE_TYPE (src))),
+				    1), src);
+
+	      tmp = make_rename_temp (TREE_TYPE (src), "SR");
+	      stmt = build_gimple_modify_stmt (tmp, src);
+	      append_to_statement_list (stmt, &list);
+
+	      stmt = sra_build_assignment (var,
+					   fold_convert (TREE_TYPE (var),
+							 tmp));
+	      append_to_statement_list (stmt, &list);
+
+	      return list;
+	    }
+
+	  src = fold_convert (TREE_TYPE (var), src);
+	}
+      else
+	{
+	  src = fold_build3 (BIT_FIELD_REF, TREE_TYPE (var), src, cst, cst2);
+	  BIT_FIELD_REF_UNSIGNED (src) = 1;
+	}
+
+      return sra_build_assignment (var, src);
+    }
+
+  return sra_build_bf_assignment (dst, src);
+}
+
 /* Generate a set of assignment statements in *LIST_P to copy all
    instantiated elements under ELT to or from the equivalent structure
    rooted at EXPR.  COPY_OUT controls the direction of the copy, with
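(Editorial aside, not part of the patch: sra_build_bf_assignment above is a read-modify-write — clear the destination bits with the inverted mask, shift the masked source into position, and OR the two together. A compressed C sketch with our names; the patch builds the same BIT_AND/LSHIFT/BIT_IOR operations as GIMPLE statements, assuming 0 < len and pos + len < 64 here:)

```c
#include <assert.h>

/* Store the low LEN bits of SRC into WORD at bit POS.  FIELDMASK
   selects the destination bits; everything outside it is preserved.  */
unsigned long
insert_bits (unsigned long word, unsigned long src, unsigned pos, unsigned len)
{
  unsigned long fieldmask = ((1UL << len) - 1) << pos;
  return (word & ~fieldmask) | ((src << pos) & fieldmask);
}
```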
@@ -1764,16 +2555,16 @@ generate_copy_inout (struct sra_elt *elt
       i = c->replacement;
 
       t = build2 (COMPLEX_EXPR, elt->type, r, i);
-      t = sra_build_assignment (expr, t);
+      t = sra_build_bf_assignment (expr, t);
       SSA_NAME_DEF_STMT (expr) = t;
       append_to_statement_list (t, list_p);
     }
   else if (elt->replacement)
     {
       if (copy_out)
-	t = sra_build_assignment (elt->replacement, expr);
+	t = sra_build_elt_assignment (elt, expr);
       else
-	t = sra_build_assignment (expr, elt->replacement);
+	t = sra_build_bf_assignment (expr, REPLDUP (elt->replacement));
       append_to_statement_list (t, list_p);
     }
   else
@@ -1798,6 +2589,19 @@ generate_element_copy (struct sra_elt *d
   FOR_EACH_ACTUAL_CHILD (dc, dst)
     {
       sc = lookup_element (src, dc->element, NULL, NO_INSERT);
+      if (!sc && dc->in_bitfld_block == 2)
+	{
+	  struct sra_elt *dcs;
+
+	  FOR_EACH_ACTUAL_CHILD (dcs, dc)
+	    {
+	      sc = lookup_element (src, dcs->element, NULL, NO_INSERT);
+	      gcc_assert (sc);
+	      generate_element_copy (dcs, sc, list_p);
+	    }
+
+	  continue;
+	}
       gcc_assert (sc);
       generate_element_copy (dc, sc, list_p);
     }
@@ -1808,7 +2612,7 @@ generate_element_copy (struct sra_elt *d
 
       gcc_assert (src->replacement);
 
-      t = sra_build_assignment (dst->replacement, src->replacement);
+      t = sra_build_elt_assignment (dst, REPLDUP (src->replacement));
       append_to_statement_list (t, list_p);
     }
 }
@@ -1829,8 +2633,9 @@ generate_element_zero (struct sra_elt *e
       return;
     }
 
-  FOR_EACH_ACTUAL_CHILD (c, elt)
-    generate_element_zero (c, list_p);
+  if (!elt->in_bitfld_block)
+    FOR_EACH_ACTUAL_CHILD (c, elt)
+      generate_element_zero (c, list_p);
 
   if (elt->replacement)
     {
@@ -1839,7 +2644,7 @@ generate_element_zero (struct sra_elt *e
       gcc_assert (elt->is_scalar);
       t = fold_convert (elt->type, integer_zero_node);
 
-      t = sra_build_assignment (elt->replacement, t);
+      t = sra_build_elt_assignment (elt, t);
       append_to_statement_list (t, list_p);
     }
 }
@@ -1848,10 +2653,10 @@ generate_element_zero (struct sra_elt *e
    Add the result to *LIST_P.  */
 
 static void
-generate_one_element_init (tree var, tree init, tree *list_p)
+generate_one_element_init (struct sra_elt *elt, tree init, tree *list_p)
 {
   /* The replacement can be almost arbitrarily complex.  Gimplify.  */
-  tree stmt = sra_build_assignment (var, init);
+  tree stmt = sra_build_elt_assignment (elt, init);
   gimplify_and_add (stmt, list_p);
 }
 
@@ -1880,7 +2685,7 @@ generate_element_init_1 (struct sra_elt 
     {
       if (elt->replacement)
 	{
-	  generate_one_element_init (elt->replacement, init, list_p);
+	  generate_one_element_init (elt, init, list_p);
 	  elt->visited = true;
 	}
       return result;
@@ -2033,6 +2838,220 @@ sra_replace (block_stmt_iterator *bsi, t
     bsi_prev (bsi);
 }
 
+/* Data structure that bitfield_overlaps_p fills in with information
+   about the element passed in and how much of it overlaps with the
+   bit-range passed to it.  */
+
+struct bitfield_overlap_info
+{
+  /* The bit-length of an element.  */
+  tree field_len;
+
+  /* The bit-position of the element in its parent.  */
+  tree field_pos;
+
+  /* The number of bits of the element that overlap with the incoming
+     bit range.  */
+  tree overlap_len;
+
+  /* The first bit of the element that overlaps with the incoming bit
+     range.  */
+  tree overlap_pos;
+};
+
+/* Return true if a BIT_FIELD_REF<(FLD->parent), BLEN, BPOS>
+   expression (referenced as BF below) accesses any of the bits in FLD,
+   false if it doesn't.  If DATA is non-null, its field_len and
+   field_pos are filled in such that BIT_FIELD_REF<(FLD->parent),
+   field_len, field_pos> (referenced as BFLD below) represents the
+   entire field FLD->element, and BIT_FIELD_REF<BFLD, overlap_len,
+   overlap_pos> represents the portion of the entire field that
+   overlaps with BF.  */
+
+static bool
+bitfield_overlaps_p (tree blen, tree bpos, struct sra_elt *fld,
+		     struct bitfield_overlap_info *data)
+{
+  tree flen, fpos;
+  bool ret;
+
+  if (TREE_CODE (fld->element) == FIELD_DECL)
+    {
+      flen = fold_convert (bitsizetype, DECL_SIZE (fld->element));
+      fpos = fold_convert (bitsizetype, DECL_FIELD_OFFSET (fld->element));
+      fpos = size_binop (MULT_EXPR, fpos, bitsize_int (BITS_PER_UNIT));
+      fpos = size_binop (PLUS_EXPR, fpos, DECL_FIELD_BIT_OFFSET (fld->element));
+    }
+  else if (TREE_CODE (fld->element) == BIT_FIELD_REF)
+    {
+      flen = fold_convert (bitsizetype, TREE_OPERAND (fld->element, 1));
+      fpos = fold_convert (bitsizetype, TREE_OPERAND (fld->element, 2));
+    }
+  else
+    gcc_unreachable ();
+
+  gcc_assert (host_integerp (blen, 1)
+	      && host_integerp (bpos, 1)
+	      && host_integerp (flen, 1)
+	      && host_integerp (fpos, 1));
+
+  ret = ((!tree_int_cst_lt (fpos, bpos)
+	  && tree_int_cst_lt (size_binop (MINUS_EXPR, fpos, bpos),
+			      blen))
+	 || (!tree_int_cst_lt (bpos, fpos)
+	     && tree_int_cst_lt (size_binop (MINUS_EXPR, bpos, fpos),
+				 flen)));
+
+  if (!ret)
+    return ret;
+
+  if (data)
+    {
+      tree bend, fend;
+
+      data->field_len = flen;
+      data->field_pos = fpos;
+
+      fend = size_binop (PLUS_EXPR, fpos, flen);
+      bend = size_binop (PLUS_EXPR, bpos, blen);
+
+      if (tree_int_cst_lt (bend, fend))
+	data->overlap_len = size_binop (MINUS_EXPR, bend, fpos);
+      else
+	data->overlap_len = NULL;
+
+      if (tree_int_cst_lt (fpos, bpos))
+	{
+	  data->overlap_pos = size_binop (MINUS_EXPR, bpos, fpos);
+	  data->overlap_len = size_binop (MINUS_EXPR,
+					  data->overlap_len
+					  ? data->overlap_len
+					  : data->field_len,
+					  data->overlap_pos);
+	}
+      else
+	data->overlap_pos = NULL;
+    }
+
+  return ret;
+}
+
+/* Add to LISTP a sequence of statements that copies BLEN bits between
+   VAR and the scalarized elements of ELT, starting at bit VPOS of VAR
+   and at bit BPOS of ELT.  The direction of the copy is given by
+   TO_VAR.  */
+
+static void
+sra_explode_bitfield_assignment (tree var, tree vpos, bool to_var,
+				 tree *listp, tree blen, tree bpos,
+				 struct sra_elt *elt)
+{
+  struct sra_elt *fld;
+  struct bitfield_overlap_info flp;
+
+  FOR_EACH_ACTUAL_CHILD (fld, elt)
+    {
+      tree flen, fpos;
+
+      if (!bitfield_overlaps_p (blen, bpos, fld, &flp))
+	continue;
+
+      flen = flp.overlap_len ? flp.overlap_len : flp.field_len;
+      fpos = flp.overlap_pos ? flp.overlap_pos : bitsize_int (0);
+
+      if (fld->replacement)
+	{
+	  tree infld, invar, st;
+
+	  infld = fld->replacement;
+
+	  if (TREE_CODE (infld) == BIT_FIELD_REF)
+	    {
+	      fpos = size_binop (PLUS_EXPR, fpos, TREE_OPERAND (infld, 2));
+	      infld = TREE_OPERAND (infld, 0);
+	    }
+	  else if (BITS_BIG_ENDIAN && DECL_P (fld->element)
+		   && !tree_int_cst_equal (TYPE_SIZE (TREE_TYPE (infld)),
+					   DECL_SIZE (fld->element)))
+	    {
+	      fpos = size_binop (PLUS_EXPR, fpos,
+				 TYPE_SIZE (TREE_TYPE (infld)));
+	      fpos = size_binop (MINUS_EXPR, fpos,
+				 DECL_SIZE (fld->element));
+	    }
+
+	  infld = fold_build3 (BIT_FIELD_REF,
+			       lang_hooks.types.type_for_size
+			       (TREE_INT_CST_LOW (flen), 1),
+			       infld, flen, fpos);
+	  BIT_FIELD_REF_UNSIGNED (infld) = 1;
+
+	  invar = size_binop (MINUS_EXPR, flp.field_pos, bpos);
+	  if (flp.overlap_pos)
+	    invar = size_binop (PLUS_EXPR, invar, flp.overlap_pos);
+	  invar = size_binop (PLUS_EXPR, invar, vpos);
+
+	  invar = fold_build3 (BIT_FIELD_REF, TREE_TYPE (infld),
+			       var, flen, invar);
+	  BIT_FIELD_REF_UNSIGNED (invar) = 1;
+
+	  if (to_var)
+	    st = sra_build_bf_assignment (invar, infld);
+	  else
+	    st = sra_build_bf_assignment (infld, invar);
+
+	  append_to_statement_list (st, listp);
+	}
+      else
+	{
+	  tree sub = size_binop (MINUS_EXPR, flp.field_pos, bpos);
+	  sub = size_binop (PLUS_EXPR, vpos, sub);
+	  if (flp.overlap_pos)
+	    sub = size_binop (PLUS_EXPR, sub, flp.overlap_pos);
+
+	  sra_explode_bitfield_assignment (var, sub, to_var, listp,
+					   flen, fpos, fld);
+	}
+    }
+}
+
+/* Add to LISTBEFOREP statements that copy scalarized members of ELT
+   that overlap with BIT_FIELD_REF<(ELT->element), BLEN, BPOS> back
+   into the full variable, and to LISTAFTERP, if non-NULL, statements
+   that copy the (presumably modified) overlapping portions of the
+   full variable back to the scalarized variables.  */
+
+static void
+sra_sync_for_bitfield_assignment (tree *listbeforep, tree *listafterp,
+				  tree blen, tree bpos,
+				  struct sra_elt *elt)
+{
+  struct sra_elt *fld;
+  struct bitfield_overlap_info flp;
+
+  FOR_EACH_ACTUAL_CHILD (fld, elt)
+    if (bitfield_overlaps_p (blen, bpos, fld, &flp))
+      {
+	if (fld->replacement || (!flp.overlap_len && !flp.overlap_pos))
+	  {
+	    generate_copy_inout (fld, false, generate_element_ref (fld),
+				 listbeforep);
+	    mark_no_warning (fld);
+	    if (listafterp)
+	      generate_copy_inout (fld, true, generate_element_ref (fld),
+				   listafterp);
+	  }
+	else
+	  {
+	    tree flen = flp.overlap_len ? flp.overlap_len : flp.field_len;
+	    tree fpos = flp.overlap_pos ? flp.overlap_pos : bitsize_int (0);
+
+	    sra_sync_for_bitfield_assignment (listbeforep, listafterp,
+					      flen, fpos, fld);
+	  }
+      }
+}
+
 /* Scalarize a USE.  To recap, this is either a simple reference to ELT,
    if elt is scalar, or some occurrence of ELT that requires a complete
    aggregate.  IS_OUTPUT is true if ELT is being modified.  */
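(Editorial aside, not part of the patch: bitfield_overlaps_p above is interval intersection on bit ranges, with the overlap reported relative to the field's own start. A small C model — our function; an `overlap_pos`/`overlap_len` of 0 and the full field length here stand in for the NULL trees the patch uses to mean "from the field start" and "to the field end":)

```c
#include <assert.h>

/* Does [bpos, bpos + blen) overlap the field [fpos, fpos + flen)?
   On overlap, store the overlapping subrange's start (relative to
   the field) in *OPOS and its length in *OLEN.  */
int
field_overlap (unsigned long bpos, unsigned long blen,
               unsigned long fpos, unsigned long flen,
               unsigned long *opos, unsigned long *olen)
{
  unsigned long bend = bpos + blen, fend = fpos + flen;
  if (bend <= fpos || fend <= bpos)
    return 0;
  *opos = bpos > fpos ? bpos - fpos : 0;
  *olen = (bend < fend ? bend : fend) - (fpos + *opos);
  return 1;
}
```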
@@ -2041,23 +3060,162 @@ static void
 scalarize_use (struct sra_elt *elt, tree *expr_p, block_stmt_iterator *bsi,
 	       bool is_output, bool use_all)
 {
-  tree list = NULL, stmt = bsi_stmt (*bsi);
+  tree stmt = bsi_stmt (*bsi);
+  tree bfexpr;
 
   if (elt->replacement)
     {
+      tree replacement = elt->replacement;
+
       /* If we have a replacement, then updating the reference is as
 	 simple as modifying the existing statement in place.  */
+      if (is_output
+	  && TREE_CODE (elt->replacement) == BIT_FIELD_REF
+	  && is_gimple_reg (TREE_OPERAND (elt->replacement, 0))
+	  && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	  && &GIMPLE_STMT_OPERAND (stmt, 0) == expr_p)
+	{
+	  tree newstmt = sra_build_elt_assignment
+	    (elt, GIMPLE_STMT_OPERAND (stmt, 1));
+	  if (TREE_CODE (newstmt) != STATEMENT_LIST)
+	    {
+	      tree list = NULL;
+	      append_to_statement_list (newstmt, &list);
+	      newstmt = list;
+	    }
+	  sra_replace (bsi, newstmt);
+	  return;
+	}
+      else if (!is_output
+	       && TREE_CODE (elt->replacement) == BIT_FIELD_REF
+	       && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	       && &GIMPLE_STMT_OPERAND (stmt, 1) == expr_p)
+	{
+	  tree tmp = make_rename_temp
+	    (TREE_TYPE (GIMPLE_STMT_OPERAND (stmt, 0)), "SR");
+	  tree newstmt = sra_build_assignment (tmp, REPLDUP (elt->replacement));
+
+	  if (TREE_CODE (newstmt) != STATEMENT_LIST)
+	    {
+	      tree list = NULL;
+	      append_to_statement_list (newstmt, &list);
+	      newstmt = list;
+	    }
+	  sra_insert_before (bsi, newstmt);
+	  replacement = tmp;
+	}
       if (is_output)
-	mark_all_v_defs (stmt);
-      *expr_p = elt->replacement;
+	mark_all_v_defs (stmt);
+      *expr_p = REPLDUP (replacement);
       update_stmt (stmt);
     }
+  else if (use_all && is_output
+	   && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	   && TREE_CODE (bfexpr
+			 = GIMPLE_STMT_OPERAND (stmt, 0)) == BIT_FIELD_REF
+	   && &TREE_OPERAND (bfexpr, 0) == expr_p
+	   && INTEGRAL_TYPE_P (TREE_TYPE (bfexpr))
+	   && TREE_CODE (TREE_TYPE (*expr_p)) == RECORD_TYPE)
+    {
+      tree listbefore = NULL, listafter = NULL;
+      tree blen = fold_convert (bitsizetype, TREE_OPERAND (bfexpr, 1));
+      tree bpos = fold_convert (bitsizetype, TREE_OPERAND (bfexpr, 2));
+      bool update = false;
+
+      if (!elt->use_block_copy)
+	{
+	  tree type = TREE_TYPE (bfexpr);
+	  tree var = make_rename_temp (type, "SR"), tmp, st;
+
+	  GIMPLE_STMT_OPERAND (stmt, 0) = var;
+	  update = true;
+
+	  if (!TYPE_UNSIGNED (type))
+	    {
+	      type = unsigned_type_for (type);
+	      tmp = make_rename_temp (type, "SR");
+	      st = build_gimple_modify_stmt (tmp,
+					     fold_convert (type, var));
+	      append_to_statement_list (st, &listafter);
+	      var = tmp;
+	    }
+
+	  sra_explode_bitfield_assignment
+	    (var, bitsize_int (0), false, &listafter, blen, bpos, elt);
+	}
+      else
+	sra_sync_for_bitfield_assignment
+	  (&listbefore, &listafter, blen, bpos, elt);
+
+      if (listbefore)
+	{
+	  mark_all_v_defs (listbefore);
+	  sra_insert_before (bsi, listbefore);
+	}
+      if (listafter)
+	{
+	  mark_all_v_defs (listafter);
+	  sra_insert_after (bsi, listafter);
+	}
+
+      if (update)
+	update_stmt (stmt);
+    }
+  else if (use_all && !is_output
+	   && TREE_CODE (stmt) == GIMPLE_MODIFY_STMT
+	   && TREE_CODE (bfexpr
+			 = GIMPLE_STMT_OPERAND (stmt, 1)) == BIT_FIELD_REF
+	   && &TREE_OPERAND (GIMPLE_STMT_OPERAND (stmt, 1), 0) == expr_p
+	   && INTEGRAL_TYPE_P (TREE_TYPE (bfexpr))
+	   && TREE_CODE (TREE_TYPE (*expr_p)) == RECORD_TYPE)
+    {
+      tree list = NULL;
+      tree blen = fold_convert (bitsizetype, TREE_OPERAND (bfexpr, 1));
+      tree bpos = fold_convert (bitsizetype, TREE_OPERAND (bfexpr, 2));
+      bool update = false;
+
+      if (!elt->use_block_copy)
+	{
+	  tree type = TREE_TYPE (bfexpr);
+	  tree var;
+
+	  if (!TYPE_UNSIGNED (type))
+	    type = unsigned_type_for (type);
+
+	  var = make_rename_temp (type, "SR");
+
+	  append_to_statement_list (build_gimple_modify_stmt
+				    (var, build_int_cst_wide (type, 0, 0)),
+				    &list);
+
+	  sra_explode_bitfield_assignment
+	    (var, bitsize_int (0), true, &list, blen, bpos, elt);
+
+	  GIMPLE_STMT_OPERAND (stmt, 1) = var;
+	  update = true;
+	}
+      else
+	sra_sync_for_bitfield_assignment
+	  (&list, NULL, blen, bpos, elt);
+
+      if (list)
+	{
+	  mark_all_v_defs (list);
+	  sra_insert_before (bsi, list);
+	}
+
+      if (update)
+	update_stmt (stmt);
+    }
   else
     {
-      /* Otherwise we need some copies.  If ELT is being read, then we want
-	 to store all (modified) sub-elements back into the structure before
-	 the reference takes place.  If ELT is being written, then we want to
-	 load the changed values back into our shadow variables.  */
+      tree list = NULL;
+
+      /* Otherwise we need some copies.  If ELT is being read, then we
+	 want to store all (modified) sub-elements back into the
+	 structure before the reference takes place.  If ELT is being
+	 written, then we want to load the changed values back into
+	 our shadow variables.  */
       /* ??? We don't check modified for reads, we just always write all of
 	 the values.  We should be able to record the SSA number of the VOP
 	 for which the values were last read.  If that number matches the
@@ -2101,7 +3259,7 @@ scalarize_copy (struct sra_elt *lhs_elt,
       gcc_assert (TREE_CODE (stmt) == GIMPLE_MODIFY_STMT);
 
       GIMPLE_STMT_OPERAND (stmt, 0) = lhs_elt->replacement;
-      GIMPLE_STMT_OPERAND (stmt, 1) = rhs_elt->replacement;
+      GIMPLE_STMT_OPERAND (stmt, 1) = REPLDUP (rhs_elt->replacement);
       update_stmt (stmt);
     }
   else if (lhs_elt->use_block_copy || rhs_elt->use_block_copy)
@@ -2255,20 +3413,41 @@ scalarize_ldst (struct sra_elt *elt, tre
 
       mark_all_v_defs (stmt);
       generate_copy_inout (elt, is_output, other, &list);
-      mark_all_v_defs (list);
       gcc_assert (list);
+      mark_all_v_defs (list);
 
       /* Preserve EH semantics.  */
       if (stmt_ends_bb_p (stmt))
 	{
 	  tree_stmt_iterator tsi;
-	  tree first;
+	  tree first, blist = NULL;
+	  bool thr = (bsi->bb->flags & EDGE_COMPLEX) != 0;
 
-	  /* Extract the first statement from LIST.  */
+	  /* If the last statement of this BB created an EH edge
+	     before scalarization, we have to locate the first
+	     statement that can throw in the new statement list and
+	     use that as the last statement of this BB, such that EH
+	     semantics is preserved.  All statements up to this one
+	     are added to the same BB.  All other statements in the
+	     list will be added to normal outgoing edges of the same
+	     BB.  If they access any memory, it's the same memory, so
+	     we can assume they won't throw.  */
 	  tsi = tsi_start (list);
-	  first = tsi_stmt (tsi);
+	  for (first = tsi_stmt (tsi);
+	       thr && !tsi_end_p (tsi) && !tree_could_throw_p (first);
+	       first = tsi_stmt (tsi))
+	    {
+	      tsi_delink (&tsi);
+	      append_to_statement_list (first, &blist);
+	    }
+
+	  /* Extract the first remaining statement from LIST, this is
+	     the EH statement if there is one.  */
 	  tsi_delink (&tsi);
 
+	  if (blist)
+	    sra_insert_before (bsi, blist);
+
 	  /* Replace the old statement with this new representative.  */
 	  bsi_replace (bsi, first, true);
 
@@ -2352,6 +3531,10 @@ dump_sra_elt_name (FILE *f, struct sra_e
 	    fputc ('.', f);
 	  print_generic_expr (f, elt->element, dump_flags);
 	}
+      else if (TREE_CODE (elt->element) == BIT_FIELD_REF)
+	fprintf (f, "$B" HOST_WIDE_INT_PRINT_DEC "F" HOST_WIDE_INT_PRINT_DEC,
+		 tree_low_cst (TREE_OPERAND (elt->element, 2), 1),
+		 tree_low_cst (TREE_OPERAND (elt->element, 1), 1));
       else if (TREE_CODE (elt->element) == RANGE_EXPR)
 	fprintf (f, "["HOST_WIDE_INT_PRINT_DEC".."HOST_WIDE_INT_PRINT_DEC"]",
 		 TREE_INT_CST_LOW (TREE_OPERAND (elt->element, 0)),
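For readers skimming the scalarize_ldst hunk above: the EH-preservation loop walks the replacement list, moves leading statements that cannot throw in front of the original statement, and keeps the first potentially-throwing statement as the new block terminator. The same shape can be sketched outside GIMPLE; the stmt struct and its can_throw flag below are invented stand-ins for illustration, not GCC types:

```c
#include <assert.h>
#include <stddef.h>

/* Invented stand-in for a GIMPLE statement: ID identifies it and
   CAN_THROW plays the role of tree_could_throw_p.  LIST must be
   non-empty, as the gcc_assert in scalarize_ldst guarantees.  */
struct stmt
{
  int id;
  int can_throw;
  struct stmt *next;
};

/* Mimic the new loop in scalarize_ldst: delink leading statements
   that cannot throw, chaining them onto *BEFORE (they will be
   inserted ahead of the original statement), and return the first
   potentially-throwing statement -- or the last statement, if none
   throws -- to serve as the new end of the basic block.  */
static struct stmt *
split_at_first_throw (struct stmt *list, struct stmt **before)
{
  struct stmt **tail = before;

  *before = NULL;
  while (list->next && !list->can_throw)
    {
      struct stmt *s = list;

      list = list->next;
      s->next = NULL;
      *tail = s;
      tail = &s->next;
    }
  return list;
}
```

A caller would then emit everything accumulated in *before ahead of the original statement and make the returned statement the block terminator, mirroring the sra_insert_before/bsi_replace pair in the patch.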

[-- Attachment #3: Type: text/plain, Size: 249 bytes --]


-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* SRA bit-field optimization (was: Re: Reload bug & SRA oddness)
  2007-07-06  9:21                                                         ` Alexandre Oliva
@ 2007-08-24  7:28                                                           ` Alexandre Oliva
  2007-09-28  9:17                                                             ` SRA bit-field optimization Alexandre Oliva
  2007-09-29 17:52                                                           ` Reload bug & SRA oddness Diego Novillo
  1 sibling, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-08-24  7:28 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Bernd Schmidt, Diego Novillo, Daniel Berlin, GCC Patches,
	Andrew Pinski, Eric Botcazou

On Jul  6, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:

> Here's the patch, bootstrapped and regression-tested on
> x86_64-linux-gnu and i686-pc-linux-gnu (with an oldish, pre-sccvn
> tree, to be able to test ada as well).  Tests on ppc-linux-gnu and
> ppc64-linux-gnu underway.

> Ok to install?

Ping?

http://gcc.gnu.org/ml/gcc-patches/2007-07/msg00556.html

> for gcc/ChangeLog
> from  Alexandre Oliva  <aoliva@redhat.com>

> 	PR middle-end/22156
> 	* tree-sra.c (struct sra_elt): Add in_bitfld_block.
> 	(sra_hash_tree): Handle BIT_FIELD_REFs.
> 	(sra_elt_hash): Don't hash bitfld blocks.
> 	(sra_elt_eq): Skip them in parent compares as well.  Handle
> 	BIT_FIELD_REFs.
> 	(build_element_name_1): Handle BIT_FIELD_REFs.
> 	(instantiate_element): Propagate nowarn from parents.  Create
> 	BIT_FIELD_REF for variables that are widened by scalarization.
> 	Gimple-zero-initialize all bit-field variables that are not
> 	part of parameters that are going to be scalarized on entry.
> 	(instantiate_missing_elements_1): Return the sra_elt.
> 	(canon_type_for_field): New.
> 	(try_instantiate_multiple_fields): New.  Infer widest possible
> 	access mode from decl or member type, but clip it at word
> 	size, and only widen it if a field crosses an alignment
> 	boundary.
> 	(instantiate_missing_elements): Use them.
> 	(generate_one_element_ref): Handle BIT_FIELD_REFs.
> 	(scalar_bitfield_p): New.
> 	(sra_build_assignment): Optimize assignments from scalarizable
> 	BIT_FIELD_REFs.  Use BITS_BIG_ENDIAN to determine shift
> 	counts.
> 	(REPLDUP): New.
> 	(sra_build_bf_assignment): New.  Optimize assignments to
> 	scalarizable BIT_FIELD_REFs.
> 	(sra_build_elt_assignment): New.  Optimize BIT_FIELD_REF
> 	assignments to full variables.
> 	(generate_copy_inout): Use the new macros and functions.
> 	(generate_element_copy): Likewise.  Handle bitfld differences.
> 	(generate_element_zero): Don't recurse for blocks.  Use
> 	sra_build_elt_assignment.
> 	(generate_one_element_init): Take elt instead of var.  Use
> 	sra_build_elt_assignment.
> 	(generate_element_init_1): Adjust.
> 	(bitfield_overlap_info): New struct.
> 	(bitfield_overlaps_p): New.
> 	(sra_explode_bitfield_assignment): New.  Adjust widened
> 	variables to account for endianness.
> 	(sra_sync_for_bitfield_assignment): New.
> 	(scalarize_use): Re-expand assignment to/from scalarized
> 	BIT_FIELD_REFs.  Explode or sync needed members for
> 	BIT_FIELD_REFs accesses or assignments.  Use REPLDUP.
> 	(scalarize_copy): Use REPLDUP.
> 	(scalarize_ldst): Move assert before dereference.  Adjust EH
> 	handling.
> 	(dump_sra_elt_name): Handle BIT_FIELD_REFs.
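To make the ChangeLog entries above concrete: the pass targets runs of adjacent bit-fields that fit in a single word, replacing them with one widened scalar whose field accesses become shifts and masks. A hand-written sketch of the resulting accesses follows; the struct and its layout are illustrative only, and little-endian bit numbering is assumed (the patch itself consults BITS_BIG_ENDIAN to pick the shift counts):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative source-level struct: three adjacent bit-fields that
   try_instantiate_multiple_fields could widen into one scalar.  */
struct s
{
  unsigned a : 3;
  unsigned b : 5;
  unsigned c : 8;
};

/* What the widened replacement looks like by hand: the whole struct
   lives in one 16-bit word, and field B (bits 3..7) is read and
   written with shifts and masks.  */
static unsigned
get_b (uint16_t word)
{
  return (word >> 3) & 0x1f;
}

static uint16_t
set_b (uint16_t word, unsigned val)
{
  return (uint16_t) ((word & ~(0x1f << 3)) | ((val & 0x1f) << 3));
}
```

This is the kind of access sequence sra_explode_bitfield_assignment emits when a block copy is broken up into its scalarized members.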

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


* Re: SRA bit-field optimization
  2007-08-24  7:28                                                           ` SRA bit-field optimization (was: Re: Reload bug & SRA oddness) Alexandre Oliva
@ 2007-09-28  9:17                                                             ` Alexandre Oliva
  2007-10-02 17:15                                                               ` Richard Guenther
  2007-10-03  7:45                                                               ` Eric Botcazou
  0 siblings, 2 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-09-28  9:17 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Bernd Schmidt, Diego Novillo, Daniel Berlin, GCC Patches,
	Andrew Pinski, Eric Botcazou

On Aug 24, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:

> On Jul  6, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:
>> Here's the patch, bootstrapped and regression-tested on
>> x86_64-linux-gnu and i686-pc-linux-gnu (with an oldish, pre-sccvn
>> tree, to be able to test ada as well).  Tests on ppc-linux-gnu and
>> ppc64-linux-gnu underway.

>> Ok to install?

> Ping?

Ping^2?

Tests on ppc* passed back then, BTW.

> http://gcc.gnu.org/ml/gcc-patches/2007-07/msg00556.html

>> for gcc/ChangeLog
>> from  Alexandre Oliva  <aoliva@redhat.com>

>> PR middle-end/22156
>> * tree-sra.c (struct sra_elt): Add in_bitfld_block.
>> (sra_hash_tree): Handle BIT_FIELD_REFs.
>> (sra_elt_hash): Don't hash bitfld blocks.
>> (sra_elt_eq): Skip them in parent compares as well.  Handle
>> BIT_FIELD_REFs.
>> (build_element_name_1): Handle BIT_FIELD_REFs.
>> (instantiate_element): Propagate nowarn from parents.  Create
>> BIT_FIELD_REF for variables that are widened by scalarization.
>> Gimple-zero-initialize all bit-field variables that are not
>> part of parameters that are going to be scalarized on entry.
>> (instantiate_missing_elements_1): Return the sra_elt.
>> (canon_type_for_field): New.
>> (try_instantiate_multiple_fields): New.  Infer widest possible
>> access mode from decl or member type, but clip it at word
>> size, and only widen it if a field crosses an alignment
>> boundary.
>> (instantiate_missing_elements): Use them.
>> (generate_one_element_ref): Handle BIT_FIELD_REFs.
>> (scalar_bitfield_p): New.
>> (sra_build_assignment): Optimize assignments from scalarizable
>> BIT_FIELD_REFs.  Use BITS_BIG_ENDIAN to determine shift
>> counts.
>> (REPLDUP): New.
>> (sra_build_bf_assignment): New.  Optimize assignments to
>> scalarizable BIT_FIELD_REFs.
>> (sra_build_elt_assignment): New.  Optimize BIT_FIELD_REF
>> assignments to full variables.
>> (generate_copy_inout): Use the new macros and functions.
>> (generate_element_copy): Likewise.  Handle bitfld differences.
>> (generate_element_zero): Don't recurse for blocks.  Use
>> sra_build_elt_assignment.
>> (generate_one_element_init): Take elt instead of var.  Use
>> sra_build_elt_assignment.
>> (generate_element_init_1): Adjust.
>> (bitfield_overlap_info): New struct.
>> (bitfield_overlaps_p): New.
>> (sra_explode_bitfield_assignment): New.  Adjust widened
>> variables to account for endianness.
>> (sra_sync_for_bitfield_assignment): New.
>> (scalarize_use): Re-expand assignment to/from scalarized
>> BIT_FIELD_REFs.  Explode or sync needed members for
>> BIT_FIELD_REFs accesses or assignments.  Use REPLDUP.
>> (scalarize_copy): Use REPLDUP.
>> (scalarize_ldst): Move assert before dereference.  Adjust EH
>> handling.
>> (dump_sra_elt_name): Handle BIT_FIELD_REFs.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


* Re: Reload bug & SRA oddness
  2007-07-06  9:21                                                         ` Alexandre Oliva
  2007-08-24  7:28                                                           ` SRA bit-field optimization (was: Re: Reload bug & SRA oddness) Alexandre Oliva
@ 2007-09-29 17:52                                                           ` Diego Novillo
  1 sibling, 0 replies; 104+ messages in thread
From: Diego Novillo @ 2007-09-29 17:52 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Roman Zippel, Bernd Schmidt, Daniel Berlin, GCC Patches,
	Andrew Pinski, Eric Botcazou

On 7/6/07, Alexandre Oliva <aoliva@redhat.com> wrote:

> Here's the patch, bootstrapped and regression-tested on
> x86_64-linux-gnu and i686-pc-linux-gnu (with an oldish, pre-sccvn
> tree, to be able to test ada as well).  Tests on ppc-linux-gnu and
> ppc64-linux-gnu underway.
>
> Ok to install?

Yes.  Thanks.


* Re: SRA bit-field optimization
  2007-09-28  9:17                                                             ` SRA bit-field optimization Alexandre Oliva
@ 2007-10-02 17:15                                                               ` Richard Guenther
  2007-10-03  8:38                                                                 ` Richard Sandiford
                                                                                   ` (2 more replies)
  2007-10-03  7:45                                                               ` Eric Botcazou
  1 sibling, 3 replies; 104+ messages in thread
From: Richard Guenther @ 2007-10-02 17:15 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Roman Zippel, Bernd Schmidt, Diego Novillo, Daniel Berlin,
	GCC Patches, Andrew Pinski, Eric Botcazou

On 9/28/07, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Aug 24, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:
>
> > On Jul  6, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:
> >> Here's the patch, bootstrapped and regression-tested on
> >> x86_64-linux-gnu and i686-pc-linux-gnu (with an oldish, pre-sccvn
> >> tree, to be able to test ada as well).  Tests on ppc-linux-gnu and
> >> ppc64-linux-gnu underway.
>
> >> Ok to install?
>
> > Ping?
>
> Ping^2?
>
> Tests on ppc* passed back then, BTW.

It seems this has now been committed, and it causes the ncurses build to fail
on x86_64:

../ncurses/./base/lib_addch.c: In function 'render_char':
../ncurses/./base/lib_addch.c:541: internal compiler error: in bitfield_overlaps_p, at tree-sra.c:2901
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugs.opensuse.org/> for instructions.
make[1]: *** [../obj_s/lib_addch.o] Error 1

I'll file a bugreport once I get hands on preprocessed source.

Richard.


* Re: SRA bit-field optimization
  2007-09-28  9:17                                                             ` SRA bit-field optimization Alexandre Oliva
  2007-10-02 17:15                                                               ` Richard Guenther
@ 2007-10-03  7:45                                                               ` Eric Botcazou
  2007-10-03 21:36                                                                 ` Eric Botcazou
  2007-10-08 20:28                                                                 ` Alexandre Oliva
  1 sibling, 2 replies; 104+ messages in thread
From: Eric Botcazou @ 2007-10-03  7:45 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: gcc-patches, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, Andrew Pinski

[-- Attachment #1: Type: text/plain, Size: 2868 bytes --]

> Ping^2?
>
> Tests on ppc* passed back then, BTW.

It apparently still misbehaves in some cases for Ada and has introduced an 
ACATS regression at -O2 on x86-64 (c37213b):

<bb 4>:
  # D.857_26 = PHI <D.857_11(3), D.857_27(7)>
  D.860_25 = (<unnamed-signed:64>) D.857_26;
  _init = VIEW_CONVERT_EXPR<struct p__cons[D.859:D.861]>(*x.1_3)[D.860_25]{lb: D.859_24 sz: 12};
  SR.57_139 = VIEW_CONVERT_EXPR<UNSIGNED_64>(_init.c1);
  _init.c1 = VIEW_CONVERT_EXPR<struct p__cons__T2b>(SR.57_139);
  D.976 = VIEW_CONVERT_EXPR<struct p__rec>(4294967297);
  SR.45_155 = VIEW_CONVERT_EXPR<UNSIGNED_64>(D.976);
  D.978.c1 = VIEW_CONVERT_EXPR<struct p__cons__T2b>(SR.45_155);
  D.978.d3 = 1;
  D.960 = D.978;
  VIEW_CONVERT_EXPR<struct p__cons[D.859:D.861]>(*x.1_3)[D.860_25]{lb: D.859_24 sz: 12} = D.960;

<bb 5>:
  if (D.858_23 == D.857_26)
    goto <bb 6>;
  else
    goto <bb 7>;

is now turned into

<bb 4>:
  # SR.128_80 = PHI <SR.128_164(3), SR.128_197(7)>
  # D.857_26 = PHI <D.857_11(3), D.857_27(7)>
  D.860_25 = (<unnamed-signed:64>) D.857_26;
  _init$d3_69 = VIEW_CONVERT_EXPR<struct p__cons[D.859:D.861]>(*x.1_3)[D.860_25]{lb: D.859_24 sz: 12}.d3;
  SR.131_67 = VIEW_CONVERT_EXPR<UNSIGNED_64>(VIEW_CONVERT_EXPR<struct p__cons[D.859:D.861]>(*x.1_3)[D.860_25]{lb: D.859_24 sz: 12}.c1);
  _init$c1$B0F64_65 = SR.131_67;
  SR.132_63 = _init$c1$B0F64_65;
  _init.c1 = VIEW_CONVERT_EXPR<struct p__cons__T2b>(SR.132_63);
  SR.57_139 = VIEW_CONVERT_EXPR<UNSIGNED_64>(_init.c1);
  SR.133_6 = SR.57_139;
  _init$c1$B0F64_154 = SR.133_6;
  D.976 = VIEW_CONVERT_EXPR<struct p__rec>(4294967297);
  SR.45_155 = VIEW_CONVERT_EXPR<UNSIGNED_64>(D.976);
  SR.134_142 = SR.45_155;
  SR.130_124 = SR.134_142;
  SR.129_148 = 1;
  SR.135_147 = SR.128_80 & 4294967295;
  SR.138_83 = SR.130_124 >> 32;
  SR.136_141 = (p__sm___XDLU_1__10) SR.138_83;
  SR.139_140 = (UNSIGNED_64) SR.136_141;
  SR.140_138 = SR.139_140 << 32;
  SR.128_92 = SR.135_147 | SR.140_138;
  SR.141_201 = SR.128_92 & 0x0ffffffff00000000;
  SR.142_200 = (p__sm___XDLU_1__10) SR.130_124;
  SR.144_199 = (UNSIGNED_64) SR.142_200;
  SR.128_137 = SR.141_201 | SR.144_199;
  SR.128_197 = SR.130_124;
  SR.127_136 = SR.129_148;
  SR.145_135 = SR.128_197;

<bb 5>:
  VIEW_CONVERT_EXPR<struct p__cons[D.859:D.861]>(*x.1_3)[D.860_25]{lb: D.859_24 sz: 12}.c1 = VIEW_CONVERT_EXPR<struct p__cons__T2b>(SR.145_135);
  VIEW_CONVERT_EXPR<struct p__cons[D.859:D.861]>(*x.1_3)[D.860_25]{lb: D.859_24 sz: 12}.d3 = SR.127_136;
  if (D.858_23 == D.857_26)
    goto <bb 6>;
  else
    goto <bb 7>;


The correctness problem is that the last statement of BB 4 no longer satisfies
tree_could_throw_p, yielding:

p.adb: In function 'P':
p.adb:1: error: statement marked for throw, but doesn't
SR.145_135 = SR.128_197;

The optimization problem is the junk now present at the end of BB 4.

Testcase attached, compile at -O2.

-- 
Eric Botcazou

[-- Attachment #2: p.adb --]
[-- Type: text/x-adasrc, Size: 582 bytes --]

PROCEDURE P IS

   SUBTYPE SM IS INTEGER RANGE 1..10;

   TYPE REC (D1, D2 : SM) IS RECORD NULL; END RECORD;

   F1_CONS : INTEGER := 2;

   FUNCTION F1 RETURN INTEGER IS
   BEGIN
      F1_CONS := F1_CONS - 1;
      RETURN F1_CONS;
   END F1;

   TYPE CONS (D3 : INTEGER := 1) IS
      RECORD
         C1 : REC(D3, F1);
      END RECORD;

   Y : CONS;

   TYPE ARR IS ARRAY (1..5) OF CONS;

BEGIN

   DECLARE
      X : ARR;
   BEGIN
      IF X /= (1..5 => (1, (1, 1))) THEN
         raise Program_Error;
      END IF;
   END;
   EXCEPTION
      WHEN CONSTRAINT_ERROR => NULL;

END;


* Re: SRA bit-field optimization
  2007-10-02 17:15                                                               ` Richard Guenther
@ 2007-10-03  8:38                                                                 ` Richard Sandiford
  2007-10-03 16:50                                                                   ` Alexandre Oliva
  2007-10-03 14:37                                                                 ` Daniel Berlin
  2007-10-05 15:03                                                                 ` Richard Guenther
  2 siblings, 1 reply; 104+ messages in thread
From: Richard Sandiford @ 2007-10-03  8:38 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Alexandre Oliva, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, GCC Patches, Andrew Pinski, Eric Botcazou

"Richard Guenther" <richard.guenther@gmail.com> writes:
> On 9/28/07, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Aug 24, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:
>>
>> > On Jul  6, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:
>> >> Here's the patch, bootstrapped and regression-tested on
>> >> x86_64-linux-gnu and i686-pc-linux-gnu (with an oldish, pre-sccvn
>> >> tree, to be able to test ada as well).  Tests on ppc-linux-gnu and
>> >> ppc64-linux-gnu underway.
>>
>> >> Ok to install?
>>
>> > Ping?
>>
>> Ping^2?
>>
>> Tests on ppc* passed back then, BTW.
>
> It seems this was committed now, and causes ncurses build to fail on
> x86_64:
>
> ../ncurses/./base/lib_addch.c: In function 'render_char':
> ../ncurses/./base/lib_addch.c:541: internal compiler error: in bitfield_overlaps_p, at tree-sra.c:2901
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <http://bugs.opensuse.org/> for instructions.
> make[1]: *** [../obj_s/lib_addch.o] Error 1
>
> I'll file a bugreport once I get hands on preprocessed source.

For the record, it also causes many failures on mips-linux-gnu,
such as the following execute.exp ones:

FAIL: gcc.c-torture/execute/20000113-1.c execution,  -O1 
FAIL: gcc.c-torture/execute/20000113-1.c execution,  -O2 
FAIL: gcc.c-torture/execute/20000113-1.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/20000113-1.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/20000113-1.c execution,  -Os 
FAIL: gcc.c-torture/execute/20031211-1.c execution,  -O1 
FAIL: gcc.c-torture/execute/20031211-1.c execution,  -O2 
FAIL: gcc.c-torture/execute/20031211-1.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/20031211-1.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/20031211-1.c execution,  -Os 
FAIL: gcc.c-torture/execute/20031211-2.c execution,  -O1 
FAIL: gcc.c-torture/execute/20031211-2.c execution,  -O2 
FAIL: gcc.c-torture/execute/20031211-2.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/20031211-2.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/20031211-2.c execution,  -Os 
FAIL: gcc.c-torture/execute/20040307-1.c execution,  -O1 
FAIL: gcc.c-torture/execute/20040307-1.c execution,  -O2 
FAIL: gcc.c-torture/execute/20040307-1.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/20040307-1.c execution,  -O3 -fomit-frame-pointer -funroll-loops 
FAIL: gcc.c-torture/execute/20040307-1.c execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions 
FAIL: gcc.c-torture/execute/20040307-1.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/20040307-1.c execution,  -Os 
FAIL: gcc.c-torture/execute/20040709-1.c execution,  -O1 
FAIL: gcc.c-torture/execute/20040709-1.c execution,  -O2 
FAIL: gcc.c-torture/execute/20040709-1.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/20040709-1.c execution,  -O3 -fomit-frame-pointer -funroll-loops 
FAIL: gcc.c-torture/execute/20040709-1.c execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions 
FAIL: gcc.c-torture/execute/20040709-1.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/20040709-1.c execution,  -Os 
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -O1 
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -O2 
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -O3 -fomit-frame-pointer -funroll-loops 
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions 
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -Os 
FAIL: gcc.c-torture/execute/920908-2.c execution,  -O1 
FAIL: gcc.c-torture/execute/920908-2.c execution,  -O2 
FAIL: gcc.c-torture/execute/920908-2.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/920908-2.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/920908-2.c execution,  -Os 
FAIL: gcc.c-torture/execute/bf64-1.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/bf64-1.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/bitfld-1.c execution,  -O1 
FAIL: gcc.c-torture/execute/bitfld-1.c execution,  -O2 
FAIL: gcc.c-torture/execute/bitfld-1.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/bitfld-1.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/bitfld-1.c execution,  -Os 
FAIL: gcc.c-torture/execute/compndlit-1.c execution,  -O1 
FAIL: gcc.c-torture/execute/compndlit-1.c execution,  -O2 
FAIL: gcc.c-torture/execute/compndlit-1.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/compndlit-1.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/compndlit-1.c execution,  -Os 

FWIW, the gcc.c-torture/execute/20000113-1.c failures can easily be
reproduced with a cross-compiler, because after the patch, foobar is
optimised to a call to abort.

Richard


* Re: SRA bit-field optimization
  2007-10-02 17:15                                                               ` Richard Guenther
  2007-10-03  8:38                                                                 ` Richard Sandiford
@ 2007-10-03 14:37                                                                 ` Daniel Berlin
  2007-10-03 14:44                                                                   ` Diego Novillo
  2007-10-05 15:03                                                                 ` Richard Guenther
  2 siblings, 1 reply; 104+ messages in thread
From: Daniel Berlin @ 2007-10-03 14:37 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Richard Guenther, Roman Zippel, Bernd Schmidt, Diego Novillo,
	GCC Patches, Andrew Pinski, Eric Botcazou

On 10/2/07, Richard Guenther <richard.guenther@gmail.com> wrote:
> On 9/28/07, Alexandre Oliva <aoliva@redhat.com> wrote:
> > On Aug 24, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:
> >
> > > On Jul  6, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:
> > >> Here's the patch, bootstrapped and regression-tested on
> > >> x86_64-linux-gnu and i686-pc-linux-gnu (with an oldish, pre-sccvn
> > >> tree, to be able to test ada as well).  Tests on ppc-linux-gnu and
> > >> ppc64-linux-gnu underway.
> >
> > >> Ok to install?
> >
> > > Ping?
> >
> > Ping^2?
> >
> > Tests on ppc* passed back then, BTW.
>
> It seems this was committed now, and causes ncurses build to fail on
> x86_64:
>
Was it actually approved anywhere?
If not, why was it committed?


* Re: SRA bit-field optimization
  2007-10-03 14:37                                                                 ` Daniel Berlin
@ 2007-10-03 14:44                                                                   ` Diego Novillo
  0 siblings, 0 replies; 104+ messages in thread
From: Diego Novillo @ 2007-10-03 14:44 UTC (permalink / raw)
  To: Daniel Berlin
  Cc: Alexandre Oliva, Richard Guenther, Roman Zippel, Bernd Schmidt,
	GCC Patches, Andrew Pinski, Eric Botcazou

On 10/3/07, Daniel Berlin <dberlin@dberlin.org> wrote:

> Was it actually approved anywhere?
> If not, why was it committed?
>

[ Sending from my work account.  gcc.gnu.org is rejecting mail from my
personal account. ]

Yes, I approved it earlier this week.  Alex, if it's causing
widespread breakage again, please back it out.  Perhaps this needs
more work, or the patch needs to be redesigned.  It is quite
intrusive, after all.


* Re: SRA bit-field optimization
  2007-10-03  8:38                                                                 ` Richard Sandiford
@ 2007-10-03 16:50                                                                   ` Alexandre Oliva
  2007-10-03 18:23                                                                     ` Richard Sandiford
  2007-10-04 19:56                                                                     ` Richard Sandiford
  0 siblings, 2 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-10-03 16:50 UTC (permalink / raw)
  To: rsandifo, Richard Guenther
  Cc: Roman Zippel, Bernd Schmidt, Diego Novillo, Daniel Berlin,
	GCC Patches, Andrew Pinski, Eric Botcazou

[-- Attachment #1: Type: text/plain, Size: 457 bytes --]

On Oct  3, 2007, Richard Sandiford <rsandifo@nildram.co.uk> wrote:

> For the record, it also causes many failures on mips-linux-gnu,
> such as the following execute.exp ones:

> FWIW, the gcc.c-torture/execute/20000113-1.c failures can easily be
> reproduced with a cross-compiler, because after the patch, foobar is
> optimised to a call to abort.

This patch appears to fix at least the first one.  I haven't given it
a full round of testing yet, FWIW.


[-- Attachment #2: gcc-sra-bit-field-ref-fallout.patch --]
[-- Type: text/x-patch, Size: 2665 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	* tree-sra.c (sra_build_assignment): Set up mask depending on
	precision, not type.
	(sra_build_elt_assignment): Don't view-convert from signed to
	unsigned.
	(sra_explode_bitfield_assignment): Use bit-field type if
	possible.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-10-01 17:38:16.000000000 -0300
+++ gcc/tree-sra.c	2007-10-03 13:36:08.000000000 -0300
@@ -2168,7 +2168,7 @@ sra_build_assignment (tree dst, tree src
       list = NULL;
 
       cst2 = size_binop (MINUS_EXPR, maxshift, minshift);
-      if (tree_int_cst_equal (cst2, TYPE_SIZE (utype)))
+      if (TREE_INT_CST_LOW (cst2) == TYPE_PRECISION (utype))
 	{
 	  unsignedp = true;
 	  mask = NULL_TREE;
@@ -2508,13 +2508,13 @@ sra_build_elt_assignment (struct sra_elt
 	    {
 	      list = NULL;
 
-	      if (!INTEGRAL_TYPE_P (TREE_TYPE (src))
-		  || !TYPE_UNSIGNED (TREE_TYPE (src)))
+	      if (!INTEGRAL_TYPE_P (TREE_TYPE (src)))
 		src = fold_build1 (VIEW_CONVERT_EXPR,
 				   lang_hooks.types.type_for_size
 				   (TREE_INT_CST_LOW
 				    (TYPE_SIZE (TREE_TYPE (src))),
 				    1), src);
+	      gcc_assert (TYPE_UNSIGNED (TREE_TYPE (src)));
 
 	      tmp = make_rename_temp (TREE_TYPE (src), "SR");
 	      stmt = build_gimple_modify_stmt (tmp, src);
@@ -2971,10 +2971,14 @@ sra_explode_bitfield_assignment (tree va
 
       if (fld->replacement)
 	{
-	  tree infld, invar, st;
+	  tree infld, invar, st, type;
 
 	  infld = fld->replacement;
 
+	  type = TREE_TYPE (infld);
+	  if (TYPE_PRECISION (type) != TREE_INT_CST_LOW (flen))
+	    type = lang_hooks.types.type_for_size (TREE_INT_CST_LOW (flen), 1);
+
 	  if (TREE_CODE (infld) == BIT_FIELD_REF)
 	    {
 	      fpos = size_binop (PLUS_EXPR, fpos, TREE_OPERAND (infld, 2));
@@ -2990,10 +2994,7 @@ sra_explode_bitfield_assignment (tree va
 				 DECL_SIZE (fld->element));
 	    }
 
-	  infld = fold_build3 (BIT_FIELD_REF,
-			       lang_hooks.types.type_for_size
-			       (TREE_INT_CST_LOW (flen), 1),
-			       infld, flen, fpos);
+	  infld = fold_build3 (BIT_FIELD_REF, type, infld, flen, fpos);
 	  BIT_FIELD_REF_UNSIGNED (infld) = 1;
 
 	  invar = size_binop (MINUS_EXPR, flp.field_pos, bpos);
@@ -3001,8 +3002,7 @@ sra_explode_bitfield_assignment (tree va
 	    invar = size_binop (PLUS_EXPR, invar, flp.overlap_pos);
 	  invar = size_binop (PLUS_EXPR, invar, vpos);
 
-	  invar = fold_build3 (BIT_FIELD_REF, TREE_TYPE (infld),
-			       var, flen, invar);
+	  invar = fold_build3 (BIT_FIELD_REF, type, var, flen, invar);
 	  BIT_FIELD_REF_UNSIGNED (invar) = 1;
 
 	  if (to_var)
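The first hunk above swaps a comparison against TYPE_SIZE for one against TYPE_PRECISION: a copy that covers every meaningful bit of the type can skip the masking step even when the type's storage is wider than its precision (as with an Ada subtype like RANGE 1..10 packed into a byte). A stand-alone sketch of that distinction, with invented stand-ins for the GCC macros:

```c
#include <assert.h>
#include <stdint.h>

/* Invented stand-in for a scalar type's TYPE_SIZE and
   TYPE_PRECISION: e.g. a 4-bit-precision subtype stored in 8 bits. */
struct scalar_type
{
  unsigned size;       /* storage size in bits */
  unsigned precision;  /* meaningful bits */
};

/* Mask covering the low WIDTH bits, as sra_build_assignment applies
   when a copied bit-field does not cover every meaningful bit.  */
static uint64_t
low_mask (unsigned width)
{
  return width >= 64 ? ~UINT64_C (0) : (UINT64_C (1) << width) - 1;
}

/* The fixed test: masking is unnecessary exactly when the copied
   width equals the type's precision, regardless of its size.  */
static int
needs_mask (struct scalar_type t, unsigned copied_width)
{
  return copied_width != t.precision;
}
```

This is a sketch of the distinction the hunk relies on, not an account of the exact failure mode in the Ada dump above.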

[-- Attachment #3: Type: text/plain, Size: 249 bytes --]


-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-03 16:50                                                                   ` Alexandre Oliva
@ 2007-10-03 18:23                                                                     ` Richard Sandiford
  2007-10-04 19:56                                                                     ` Richard Sandiford
  1 sibling, 0 replies; 104+ messages in thread
From: Richard Sandiford @ 2007-10-03 18:23 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Richard Guenther, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, GCC Patches, Andrew Pinski, Eric Botcazou

Alexandre Oliva <aoliva@redhat.com> writes:
> On Oct  3, 2007, Richard Sandiford <rsandifo@nildram.co.uk> wrote:
>> For the record, it also causes many failures on mips-linux-gnu,
>> such as the following execute.exp ones:
>
>> FWIW, the gcc.c-torture/execute/20000113-1.c failures can easily be
>> reproduced with a cross-compiler, because after the patch, foobar is
>> optimised to a call to abort.
>
> This patch appears to fix at least the first one.  I haven't given it
> a full round of testing yet, FWIW.

Thanks for the quick fix.  I'll give it an overnight spin on
mipsisa32-elf (which also shows the problem) and let you know
how it goes.

Richard

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-03  7:45                                                               ` Eric Botcazou
@ 2007-10-03 21:36                                                                 ` Eric Botcazou
  2007-10-08 20:28                                                                 ` Alexandre Oliva
  1 sibling, 0 replies; 104+ messages in thread
From: Eric Botcazou @ 2007-10-03 21:36 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: gcc-patches, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, Andrew Pinski

> It apparently still misbehaves in some cases for Ada 

It miscompiles the Ada compiler on PA/Linux (PR ada/33634).

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-03 16:50                                                                   ` Alexandre Oliva
  2007-10-03 18:23                                                                     ` Richard Sandiford
@ 2007-10-04 19:56                                                                     ` Richard Sandiford
  2007-10-05 17:43                                                                       ` Alexandre Oliva
  1 sibling, 1 reply; 104+ messages in thread
From: Richard Sandiford @ 2007-10-04 19:56 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Richard Guenther, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, GCC Patches, Andrew Pinski, Eric Botcazou

Alexandre Oliva <aoliva@redhat.com> writes:
> On Oct  3, 2007, Richard Sandiford <rsandifo@nildram.co.uk> wrote:
>> For the record, it also causes many failures on mips-linux-gnu,
>> such as the following execute.exp ones:
>
>> FWIW, the gcc.c-torture/execute/20000113-1.c failures can easily be
>> reproduced with a cross-compiler, because after the patch, foobar is
>> optimised to a call to abort.
>
> This patch appears to fix at least the first one.  I haven't given it
> a full round of testing yet, FWIW.

Thanks, it stops the testcase being optimised to abort, but it does
still abort.  The following line of final_cleanup dump looks suspicious:

  b.x3 = (<unnamed-unsigned:3>) ((unsigned char) ((D.1242 - (int) D.1234) * D.1242) + (unsigned char) D.1238) | D.1238 & 7;

Why the "|"?  (The bit on the left hand side of the "|" looks fine.)

FWIW, if you need an executable testcase, the same problem occurs on
mipsisa32-elf.  I'm happy to test any revised patches though.

Richard

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-08 20:28                                                                 ` Alexandre Oliva
@ 2007-10-05  6:24                                                                   ` Eric Botcazou
  2007-10-05 16:03                                                                     ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Eric Botcazou @ 2007-10-05  6:24 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: gcc-patches, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, Andrew Pinski

> Ugh, there was a thinko in the test for whether the original BB/stmt
> could throw.  Thanks for the testcase, this patch fixes it.

Partially, I presume it doesn't do anything for the pessimization.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-02 17:15                                                               ` Richard Guenther
  2007-10-03  8:38                                                                 ` Richard Sandiford
  2007-10-03 14:37                                                                 ` Daniel Berlin
@ 2007-10-05 15:03                                                                 ` Richard Guenther
  2007-10-05 16:20                                                                   ` Alexandre Oliva
  2 siblings, 1 reply; 104+ messages in thread
From: Richard Guenther @ 2007-10-05 15:03 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Roman Zippel, Bernd Schmidt, Diego Novillo, Daniel Berlin,
	GCC Patches, Andrew Pinski, Eric Botcazou

On 10/2/07, Richard Guenther <richard.guenther@gmail.com> wrote:
> On 9/28/07, Alexandre Oliva <aoliva@redhat.com> wrote:
> > On Aug 24, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:
> >
> > > On Jul  6, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:
> > >> Here's the patch, bootstrapped and regression-tested on
> > >> x86_64-linux-gnu and i686-pc-linux-gnu (with an oldish, pre-sccvn
> > >> tree, to be able to test ada as well).  Tests on ppc-linux-gnu and
> > >> ppc64-linux-gnu underway.
> >
> > >> Ok to install?
> >
> > > Ping?
> >
> > Ping^2?
> >
> > Tests on ppc* passed back then, BTW.
>
> It seems this was committed now, and causes ncurses build to fail on
> x86_64:
>
> ../ncurses/./base/lib_addch.c: In function 'render_char':
> ../ncurses/./base/lib_addch.c:541: internal compiler error: in
> bitfield_overlaps_p, at tree-sra.c:2901
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <http://bugs.opensuse.org/> for instructions.
> make[1]: *** [../obj_s/lib_addch.o] Error 1
>
> I'll file a bugreport once I get hands on preprocessed source.

Btw, this is PR33655.

Richard.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-05  6:24                                                                   ` Eric Botcazou
@ 2007-10-05 16:03                                                                     ` Alexandre Oliva
  2007-10-07  9:01                                                                       ` Eric Botcazou
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-10-05 16:03 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, Andrew Pinski

On Oct  5, 2007, Eric Botcazou <ebotcazou@adacore.com> wrote:

>> Ugh, there was a thinko in the test for whether the original BB/stmt
>> could throw.  Thanks for the testcase, this patch fixes it.

> Partially, I presume it doesn't do anything for the pessimization.

What pessimization?


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-05 15:03                                                                 ` Richard Guenther
@ 2007-10-05 16:20                                                                   ` Alexandre Oliva
  2007-10-05 16:24                                                                     ` Richard Guenther
                                                                                       ` (2 more replies)
  0 siblings, 3 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-10-05 16:20 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Roman Zippel, Bernd Schmidt, Diego Novillo, Daniel Berlin,
	GCC Patches, Andrew Pinski, Eric Botcazou

[-- Attachment #1: Type: text/plain, Size: 399 bytes --]

On Oct  5, 2007, "Richard Guenther" <richard.guenther@gmail.com> wrote:

> On 10/2/07, Richard Guenther <richard.guenther@gmail.com> wrote:
>> It seems this was committed now, and causes ncurses build to fail on
>> x86_64:
> Btw, this is PR33655.

Thanks.  Here's a patch that fixes the problem; I've just started
bootstrapping and regtesting it on x86_64-linux-gnu.  Ok to install if
it passes?
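The new INTEGER_CST case in bitfield_overlaps_p below derives the
field's bit position from the constant array index (fpos = index *
element size).  As a minimal C sketch of the overlap test involved
(my helper names, not GCC code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch, not GCC code: does a field occupying bits [fpos, fpos + flen)
   overlap the accessed bits [bpos, bpos + blen)?  For an array element
   with constant index i, the patch computes fpos = i * element_size,
   both measured in bits.  */
static bool bits_overlap(uint64_t bpos, uint64_t blen,
                         uint64_t fpos, uint64_t flen)
{
    return fpos < bpos + blen && bpos < fpos + flen;
}
```

For instance, element chars[1] of a 32-bit int array would get
fpos = 1 * 32 relative to the start of the array.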


[-- Attachment #2: gcc-sra-bit-field-ref-fallout-array.patch --]
[-- Type: text/x-patch, Size: 1510 bytes --]

for gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR tree-optimization/33655
	* tree-sra.c (bitfield_overlaps_p): Handle array and complex
	elements.

for gcc/testsuite/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR tree-optimization/33655
	* gcc.dg/torture/pr33655.c: New.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-10-05 12:53:06.000000000 -0300
+++ gcc/tree-sra.c	2007-10-05 12:54:55.000000000 -0300
@@ -2897,6 +2897,11 @@ bitfield_overlaps_p (tree blen, tree bpo
       flen = fold_convert (bitsizetype, TREE_OPERAND (fld->element, 1));
       fpos = fold_convert (bitsizetype, TREE_OPERAND (fld->element, 2));
     }
+  else if (TREE_CODE (fld->element) == INTEGER_CST)
+    {
+      flen = fold_convert (bitsizetype, TYPE_SIZE (fld->type));
+      fpos = size_binop (MULT_EXPR, flen, fld->element);
+    }
   else
     gcc_unreachable ();
 
Index: gcc/testsuite/gcc.dg/torture/pr33655.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/testsuite/gcc.dg/torture/pr33655.c	2007-10-05 12:54:55.000000000 -0300
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+typedef struct {
+    unsigned long attr;
+    int chars[2];
+} cchar_t;
+typedef struct _win_st {
+    cchar_t _bkgrnd;
+} WINDOW;
+void render_char(WINDOW *win, cchar_t ch)
+{
+    if ((ch).chars[0] == L' '
+        && (ch).chars[1] == L'\0')
+        win->_bkgrnd = ch;
+}


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-05 16:20                                                                   ` Alexandre Oliva
@ 2007-10-05 16:24                                                                     ` Richard Guenther
  2007-10-05 16:26                                                                     ` Diego Novillo
  2007-10-09  4:55                                                                     ` Alexandre Oliva
  2 siblings, 0 replies; 104+ messages in thread
From: Richard Guenther @ 2007-10-05 16:24 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Roman Zippel, Bernd Schmidt, Diego Novillo, Daniel Berlin,
	GCC Patches, Andrew Pinski, Eric Botcazou

On 10/5/07, Alexandre Oliva <aoliva@redhat.com> wrote:
> > On 10/2/07, Richard Guenther <richard.guenther@gmail.com> wrote:
> >> It seems this was committed now, and causes ncurses build to fail on
> >> x86_64:
> > Btw, this is PR33655.
>
> Thanks.  Here's a patch that fixes the problem; I've just started
> bootstrapping and regtesting it on x86_64-linux-gnu.  Ok to install if
> it passes?

Yes, thanks.

Richard.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-05 16:20                                                                   ` Alexandre Oliva
  2007-10-05 16:24                                                                     ` Richard Guenther
@ 2007-10-05 16:26                                                                     ` Diego Novillo
  2007-10-05 20:08                                                                       ` Alexandre Oliva
  2007-10-09  4:55                                                                     ` Alexandre Oliva
  2 siblings, 1 reply; 104+ messages in thread
From: Diego Novillo @ 2007-10-05 16:26 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Richard Guenther, Roman Zippel, Bernd Schmidt, Daniel Berlin,
	GCC Patches, Andrew Pinski, Eric Botcazou

On 10/5/07, Alexandre Oliva <aoliva@redhat.com> wrote:

> Ok to install if it passes?

Yes, with a ChangeLog entry and the test from the PR.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-04 19:56                                                                     ` Richard Sandiford
@ 2007-10-05 17:43                                                                       ` Alexandre Oliva
  2007-10-06  8:02                                                                         ` Richard Sandiford
  2007-10-06 16:01                                                                         ` David Daney
  0 siblings, 2 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-10-05 17:43 UTC (permalink / raw)
  To: rsandifo
  Cc: Richard Guenther, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, GCC Patches, Andrew Pinski, Eric Botcazou,
	John David Anglin

[-- Attachment #1: Type: text/plain, Size: 2191 bytes --]

On Oct  4, 2007, Richard Sandiford <rsandifo@nildram.co.uk> wrote:

> Alexandre Oliva <aoliva@redhat.com> writes:
>> On Oct  3, 2007, Richard Sandiford <rsandifo@nildram.co.uk> wrote:
>>> For the record, it also causes many failures on mips-linux-gnu,
>>> such as the following execute.exp ones:
>> 
>>> FWIW, the gcc.c-torture/execute/20000113-1.c failures can easily be
>>> reproduced with a cross-compiler, because after the patch, foobar is
>>> optimised to a call to abort.
>> 
>> This patch appears to fix at least the first one.  I haven't given it
>> a full round of testing yet, FWIW.

> Thanks, it stops the testcase being optimised to abort, but it does
> still abort.  The following line of final_cleanup dump looks suspicious:

>   b.x3 = (<unnamed-unsigned:3>) ((unsigned char) ((D.1242 - (int) D.1234) * D.1242) + (unsigned char) D.1238) | D.1238 & 7;

> Why the "|"?

Another case of overflow and sign-extension when computing bit masks.
I haven't tried to figure out why it doesn't hit on other
architectures.
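The sra_build_bf_assignment hunk in the attached patch special-cases a
shift by the full type precision.  A minimal C illustration of why the
naive mask computation goes wrong at that boundary (32-bit container
assumed; the helper name is mine, not GCC code):

```c
#include <stdint.h>

/* Sketch, not GCC code.  A mask that clears bits [minshift, maxshift)
   of a 32-bit value is ~((1 << maxshift) - (1 << minshift)).  When
   maxshift == 32, "1 << 32" is undefined in C (and overflowed in the
   constant folder); the guarded version substitutes 0, i.e. the value
   of 2^32 reduced mod 2^32, which yields the intended mask.  */
static uint32_t clear_mask(unsigned minshift, unsigned maxshift)
{
    uint32_t hi = (maxshift == 32) ? 0u : (uint32_t)1 << maxshift;
    uint32_t lo = (uint32_t)1 << minshift;      /* shift by 0 is fine */
    return ~(hi - lo);
}
```

With maxshift == 32 and minshift == 0 this correctly returns 0
(clear everything), where the unguarded shift misbehaved.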

Anyhow, this patch, that supersedes the previous, fixes it, and with
it the testcase appears to yield the correct result.

> FWIW, if you need an executable testcase, the same problem occurs on
> mipsisa32-elf.  I'm happy to test any revised patches though.

I'm bootstrapping on x86_64-linux-gnu right now.  I don't think there
will be time for me to start a cross toolchain build after that,
before I leave on another short trip during the weekend, so I'd
appreciate if you could give it a spin.

If this isn't enough, on Monday, or perhaps even Sunday evening, I'll
get myself a cross mipsisa32-elf toolchain and test environment and
look into all the remaining failures you reported.

Hopefully this patch will fix the hppa-linux-gnu Ada compiler
miscompilation too.  Dave, Eric, would you please give it a try and
let me know?  I don't have access to this platform, so it would be
difficult for me to diagnose the compiler miscompilation :-(

Perhaps running a bootstrap-disabled build, then letting me know about
regressions it displays?  Then I can try to duplicate them locally and
hopefully they will lead to a fix for the bootstrap problem.

Thanks,



[-- Attachment #2: gcc-sra-bit-field-ref-fallout-mips.patch --]
[-- Type: text/x-patch, Size: 3416 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	* tree-sra.c (sra_build_assignment): Set up mask depending on
	precision, not type.
	(sra_build_bf_assignment): Don't overflow computing bit masks.
	(sra_build_elt_assignment): Don't view-convert from signed to
	unsigned.
	(sra_explode_bitfield_assignment): Use bit-field type if
	possible.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-10-05 13:58:18.000000000 -0300
+++ gcc/tree-sra.c	2007-10-05 14:22:08.000000000 -0300
@@ -2168,7 +2168,7 @@ sra_build_assignment (tree dst, tree src
       list = NULL;
 
       cst2 = size_binop (MINUS_EXPR, maxshift, minshift);
-      if (tree_int_cst_equal (cst2, TYPE_SIZE (utype)))
+      if (TREE_INT_CST_LOW (cst2) == TYPE_PRECISION (utype))
 	{
 	  unsignedp = true;
 	  mask = NULL_TREE;
@@ -2343,8 +2343,14 @@ sra_build_bf_assignment (tree dst, tree 
     utype = unsigned_type_for (type);
 
   mask = build_int_cst_wide (utype, 1, 0);
-  cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, true);
-  cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, true);
+  if (TREE_INT_CST_LOW (maxshift) == TYPE_PRECISION (utype))
+    cst = build_int_cst_wide (utype, 0, 0);
+  else
+    cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, true);
+  if (integer_zerop (minshift))
+    cst2 = mask;
+  else
+    cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, true);
   mask = int_const_binop (MINUS_EXPR, cst, cst2, true);
   mask = fold_build1 (BIT_NOT_EXPR, utype, mask);
 
@@ -2508,13 +2514,13 @@ sra_build_elt_assignment (struct sra_elt
 	    {
 	      list = NULL;
 
-	      if (!INTEGRAL_TYPE_P (TREE_TYPE (src))
-		  || !TYPE_UNSIGNED (TREE_TYPE (src)))
+	      if (!INTEGRAL_TYPE_P (TREE_TYPE (src)))
 		src = fold_build1 (VIEW_CONVERT_EXPR,
 				   lang_hooks.types.type_for_size
 				   (TREE_INT_CST_LOW
 				    (TYPE_SIZE (TREE_TYPE (src))),
 				    1), src);
+	      gcc_assert (TYPE_UNSIGNED (TREE_TYPE (src)));
 
 	      tmp = make_rename_temp (TREE_TYPE (src), "SR");
 	      stmt = build_gimple_modify_stmt (tmp, src);
@@ -2971,10 +2977,14 @@ sra_explode_bitfield_assignment (tree va
 
       if (fld->replacement)
 	{
-	  tree infld, invar, st;
+	  tree infld, invar, st, type;
 
 	  infld = fld->replacement;
 
+	  type = TREE_TYPE (infld);
+	  if (TYPE_PRECISION (type) != TREE_INT_CST_LOW (flen))
+	    type = lang_hooks.types.type_for_size (TREE_INT_CST_LOW (flen), 1);
+
 	  if (TREE_CODE (infld) == BIT_FIELD_REF)
 	    {
 	      fpos = size_binop (PLUS_EXPR, fpos, TREE_OPERAND (infld, 2));
@@ -2990,10 +3000,7 @@ sra_explode_bitfield_assignment (tree va
 				 DECL_SIZE (fld->element));
 	    }
 
-	  infld = fold_build3 (BIT_FIELD_REF,
-			       lang_hooks.types.type_for_size
-			       (TREE_INT_CST_LOW (flen), 1),
-			       infld, flen, fpos);
+	  infld = fold_build3 (BIT_FIELD_REF, type, infld, flen, fpos);
 	  BIT_FIELD_REF_UNSIGNED (infld) = 1;
 
 	  invar = size_binop (MINUS_EXPR, flp.field_pos, bpos);
@@ -3001,8 +3008,7 @@ sra_explode_bitfield_assignment (tree va
 	    invar = size_binop (PLUS_EXPR, invar, flp.overlap_pos);
 	  invar = size_binop (PLUS_EXPR, invar, vpos);
 
-	  invar = fold_build3 (BIT_FIELD_REF, TREE_TYPE (infld),
-			       var, flen, invar);
+	  invar = fold_build3 (BIT_FIELD_REF, type, var, flen, invar);
 	  BIT_FIELD_REF_UNSIGNED (invar) = 1;
 
 	  if (to_var)


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-05 16:26                                                                     ` Diego Novillo
@ 2007-10-05 20:08                                                                       ` Alexandre Oliva
  0 siblings, 0 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-10-05 20:08 UTC (permalink / raw)
  To: Diego Novillo
  Cc: Richard Guenther, Roman Zippel, Bernd Schmidt, Daniel Berlin,
	GCC Patches, Andrew Pinski, Eric Botcazou

On Oct  5, 2007, "Diego Novillo" <dnovillo@google.com> wrote:

> On 10/5/07, Alexandre Oliva <aoliva@redhat.com> wrote:
>> Ok to install if it passes?

> Yes, with a ChangeLog entry and the test from the PR.

Err, I'm confused.  AFAICT the patch I posted had both.  Did you find
something else was missing?


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-05 17:43                                                                       ` Alexandre Oliva
@ 2007-10-06  8:02                                                                         ` Richard Sandiford
  2007-10-06 21:06                                                                           ` John David Anglin
  2007-10-06 16:01                                                                         ` David Daney
  1 sibling, 1 reply; 104+ messages in thread
From: Richard Sandiford @ 2007-10-06  8:02 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Richard Guenther, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, GCC Patches, Andrew Pinski, Eric Botcazou,
	John David Anglin

Alexandre Oliva <aoliva@redhat.com> writes:
> Anyhow, this patch, that supersedes the previous, fixes it, and with
> it the testcase appears to yield the correct result.

Thanks, the results after this patch are much better.  As you say,
20000113-1.c is well and truly fixed.  The only remaining C regressions are:

FAIL: gcc.c-torture/execute/20031211-1.c execution,  -O1 
FAIL: gcc.c-torture/execute/20031211-1.c execution,  -O2 
FAIL: gcc.c-torture/execute/20031211-1.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/20031211-1.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/20031211-1.c execution,  -Os 
FAIL: gcc.c-torture/execute/20040709-1.c execution,  -Os 
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -O1 
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -O2 
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -O3 -fomit-frame-pointer -funroll-loops 
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions 
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/20040709-2.c execution,  -Os 
FAIL: gcc.c-torture/execute/920908-2.c execution,  -O1 
FAIL: gcc.c-torture/execute/920908-2.c execution,  -O2 
FAIL: gcc.c-torture/execute/920908-2.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/920908-2.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/920908-2.c execution,  -Os 
FAIL: gcc.c-torture/execute/950628-1.c execution,  -Os 
FAIL: gcc.c-torture/execute/991118-1.c compilation,  -O3 -fomit-frame-pointer  (internal compiler error)
FAIL: gcc.c-torture/execute/991118-1.c compilation,  -O3 -g  (internal compiler error)
FAIL: gcc.c-torture/execute/bf64-1.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/bf64-1.c execution,  -O3 -g 

20031211-1.c is a pleasingly simple testcase, and as with the original
20000113-1.c failure, it's optimised to abort() by final_cleanup.

Don't worry about the 991118-1.c failure.  It's an RTL ICE, and probably
a config/mips bug.  I'll have a look at it.

Richard

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-05 17:43                                                                       ` Alexandre Oliva
  2007-10-06  8:02                                                                         ` Richard Sandiford
@ 2007-10-06 16:01                                                                         ` David Daney
  1 sibling, 0 replies; 104+ messages in thread
From: David Daney @ 2007-10-06 16:01 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: rsandifo, Richard Guenther, Roman Zippel, Bernd Schmidt,
	Diego Novillo, Daniel Berlin, GCC Patches, Andrew Pinski,
	Eric Botcazou, John David Anglin

Alexandre Oliva wrote:
> I'm bootstrapping on x86_64-linux-gnu right now.  I don't think there
> will be time for me to start a cross toolchain build after that,
> before I leave on another short trip during the weekend, so I'd
> appreciate if you could give it a spin.
>
> If this isn't enough, on Monday, or perhaps even Sunday evening, I'll
> get myself a cross mipsisa32-elf toolchain and test environment and
> look into all the remaining failures you reported.
>
> Hopefully this patch will fix the hppa-linux-gnu Ada compiler
> miscompilation too.  Dave, Eric, would you please give it a try and
> let me know?  I don't have access to this platform, so it would be
> difficult for me to diagnose the compiler miscompilation :-(
>
> Perhaps running a bootstrap-disabled build, then letting me know about
> regressions it displays?  Then I can try to duplicate them locally and
> hopefully they will lead to a fix for the bootstrap problem.
>
> Thanks,
>
>   

I bootstrapped r129029 with --enable-languages=c on mipsel-linux with
your two patches (gcc-sra-bit-field-ref-fallout-array.patch,
gcc-sra-bit-field-ref-fallout-mips.patch) applied.  This fixed all the
C regressions.

David Daney
 


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-06  8:02                                                                         ` Richard Sandiford
@ 2007-10-06 21:06                                                                           ` John David Anglin
  2007-10-06 22:12                                                                             ` Richard Sandiford
  0 siblings, 1 reply; 104+ messages in thread
From: John David Anglin @ 2007-10-06 21:06 UTC (permalink / raw)
  To: Richard Sandiford
  Cc: aoliva, richard.guenther, zippel, bernds_cb1, dnovillo, dberlin,
	gcc-patches, pinskia, ebotcazou

> Alexandre Oliva <aoliva@redhat.com> writes:
> > Anyhow, this patch, that supersedes the previous, fixes it, and with
> > it the testcase appears to yield the correct result.
> 
> Thanks, the results after this patch are much better.  As you say,
> 20000113-1.c is well and truly fixed.  The only remaining C regressions are:

> FAIL: gcc.c-torture/execute/991118-1.c compilation,  -O3 -fomit-frame-pointer  (internal compiler error)
> FAIL: gcc.c-torture/execute/991118-1.c compilation,  -O3 -g  (internal compiler error)

> Don't worry about the 991118-1.c failure.  It's an RTL ICE, and probably
> a config/mips bug.  I'll have a look at it.

I'm also still seeing the failure of 991118-1.c.  These are the remaining
new fails on hppa-unknown-linux-gnu:

FAIL: gcc.c-torture/execute/991118-1.c compilation,  -O3 -fomit-frame-pointer  (internal compiler error)
FAIL: gcc.c-torture/execute/991118-1.c compilation,  -O3 -g  (internal compiler error)
FAIL: gcc.c-torture/execute/longlong.c execution,  -O3 -fomit-frame-pointer
FAIL: gcc.c-torture/execute/longlong.c execution,  -O3 -g

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-06 21:06                                                                           ` John David Anglin
@ 2007-10-06 22:12                                                                             ` Richard Sandiford
  2007-10-07 23:44                                                                               ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Richard Sandiford @ 2007-10-06 22:12 UTC (permalink / raw)
  To: John David Anglin
  Cc: aoliva, richard.guenther, zippel, bernds_cb1, dnovillo, dberlin,
	gcc-patches, pinskia, ebotcazou

"John David Anglin" <dave@hiauly1.hia.nrc.ca> writes:
>> Alexandre Oliva <aoliva@redhat.com> writes:
>> > Anyhow, this patch, that supersedes the previous, fixes it, and with
>> > it the testcase appears to yield the correct result.
>> 
>> Thanks, the results after this patch are much better.  As you say,
>> 20000113-1.c is well and truly fixed.  The only remaining C regressions are:
>
>> FAIL: gcc.c-torture/execute/991118-1.c compilation,  -O3 -fomit-frame-pointer  (internal compiler error)
>> FAIL: gcc.c-torture/execute/991118-1.c compilation,  -O3 -g  (internal compiler error)
>
>> Don't worry about the 991118-1.c failure.  It's an RTL ICE, and probably
>> a config/mips bug.  I'll have a look at it.
>
> I'm also still seeing the failure of 991118-1.c.

Ah, thanks.  I did have a quick look at it, and it seemed to be caused
by a missing conversion at the tree level.  Expand was being given
a BIT_AND_EXPR with a DImode-typed result and HImode-typed operand.
I didn't try to track down where it was coming from.  So it does seem
to be a tree-level bug after all.
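In C terms, the invariant expand relies on is that both operands of the
bitwise AND already have the same (wider) type; the missing piece was
an explicit widening conversion.  A sketch with illustrative types (not
the actual GCC code):

```c
#include <stdint.h>

/* Sketch: a DImode-typed BIT_AND_EXPR must not receive an HImode-typed
   operand directly; a NOP_EXPR/convert has to widen it first.  The C
   cast below stands in for that conversion.  */
static uint64_t masked(uint16_t field, uint64_t mask)
{
    uint64_t wide = (uint64_t) field;   /* explicit widening step */
    return wide & mask;
}
```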

Richard

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-05 16:03                                                                     ` Alexandre Oliva
@ 2007-10-07  9:01                                                                       ` Eric Botcazou
  2007-10-07 23:58                                                                         ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Eric Botcazou @ 2007-10-07  9:01 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: gcc-patches, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, Andrew Pinski

> What pessimization?

The complex calculation now introduced by SRA:

  SR.135_147 = SR.128_80 & 4294967295;
  SR.138_83 = SR.130_124 >> 32;
  SR.136_141 = (p__sm___XDLU_1__10) SR.138_83;
  SR.139_140 = (UNSIGNED_64) SR.136_141;
  SR.140_138 = SR.139_140 << 32;
  SR.128_92 = SR.135_147 | SR.140_138;
  SR.141_201 = SR.128_92 & 0x0ffffffff00000000;
  SR.142_200 = (p__sm___XDLU_1__10) SR.130_124;
  SR.144_199 = (UNSIGNED_64) SR.142_200;
  SR.128_137 = SR.141_201 | SR.144_199;

It turns out that SR.128_137 is dead, so the whole sequence is
eliminated by the first subsequent DCE pass, but I'm worried that it
might not always be.
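Stripped of the Ada subtype conversions (and of the fact that it mixes
two SSA names), the emitted sequence has roughly this shape: each
32-bit half is extracted, bounced through a narrower type, shifted back
into place, and OR-ed in, reassembling the value it started from.  A
hedged C rendering (my names, not the generated code):

```c
#include <stdint.h>

/* Sketch of the pessimized pattern: split a 64-bit value into halves
   and recombine them.  The result equals the input, so when it is
   unused the whole chain is dead and DCE can delete it.  */
static uint64_t repack(uint64_t x)
{
    uint64_t hi = (uint64_t)(uint32_t)(x >> 32) << 32;  /* high half */
    uint64_t lo = (uint64_t)(uint32_t) x;               /* low half  */
    return hi | lo;                                     /* == x */
}
```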

I confirm that the verification failure of c37213b is fixed by the patch you 
posted in your previous message, thanks!

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: SRA bit-field optimization
  2007-10-06 22:12                                                                             ` Richard Sandiford
@ 2007-10-07 23:44                                                                               ` Alexandre Oliva
  2007-10-08 21:14                                                                                 ` John David Anglin
  2007-10-09  4:45                                                                                 ` Alexandre Oliva
  0 siblings, 2 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-10-07 23:44 UTC (permalink / raw)
  To: John David Anglin
  Cc: richard.guenther, zippel, bernds_cb1, dnovillo, dberlin,
	gcc-patches, pinskia, ebotcazou, rsandifo

[-- Attachment #1: Type: text/plain, Size: 1313 bytes --]

On Oct  6, 2007, Richard Sandiford <rsandifo@nildram.co.uk> wrote:

> Ah, thanks.  I did have a quick look at it, and it seemed to be caused
> by a missing conversion at the tree level.  Expand was being given
> a BIT_AND_EXPR with a DImode-typed result and HImode-typed operand.
> I didn't try to track down where it was coming from.  So it does seem
> to be a tree-level bug after all.

Yep, fixed in the revised patch below.

It also fixes the endianness problem that was hitting machines that
had BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN, that caused the 20031211-1.c
and other regressions.  I remember having noticed at some point in the
development of the bit-field SRA patch that I was using the wrong
endianness macro and changing it, but BITS_BIG_ENDIAN wasn't what I
was looking for.  Doh.

I haven't given this patch much testing yet, but the two new changes
are obviously correct and do make progress.  I'm starting a test run
on a just-built mipsisa32-elf uberbaum tree, after having finally
figured out why dejagnu wasn't finding newlib (my uberbaum source tree
contained links to gcc and srcware sub-dirs, but site.exp set srcdir
with the pathname of the gcc source tree, rather than that of the
uberbaum tree; doesn't this bite anyone else?)

Ok to install if it passes bootstrap on x86_64-linux-gnu?


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: gcc-sra-bit-field-ref-fallout-mips.patch --]
[-- Type: text/x-patch, Size: 4893 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	* tree-sra.c (instantiate_element): Use BYTES_BIG_ENDIAN for
	bit-field layout.
	(sra_build_assignment): Likewise.  Set up mask depending on
	precision, not type.
	(sra_build_bf_assignment): Use BYTES_BIG_ENDIAN.  Don't overflow
	computing bit masks.  Fix type mismatch when applying negated
	mask.
	(sra_build_elt_assignment): Don't view-convert from signed to
	unsigned.
	(sra_explode_bitfield_assignment): Use bit-field type if
	possible.  Use BYTES_BIG_ENDIAN.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-10-06 08:46:59.000000000 -0300
+++ gcc/tree-sra.c	2007-10-07 20:20:11.000000000 -0300
@@ -1275,7 +1275,7 @@ instantiate_element (struct sra_elt *elt
       elt->in_bitfld_block = 1;
       elt->replacement = build3 (BIT_FIELD_REF, elt->type, var,
 				 DECL_SIZE (var),
-				 BITS_BIG_ENDIAN
+				 BYTES_BIG_ENDIAN
 				 ? size_binop (MINUS_EXPR,
 					       TYPE_SIZE (elt->type),
 					       DECL_SIZE (var))
@@ -2140,7 +2140,7 @@ sra_build_assignment (tree dst, tree src
       cst2 = size_binop (PLUS_EXPR, TREE_OPERAND (src, 1),
 			 TREE_OPERAND (src, 2));
 
-      if (BITS_BIG_ENDIAN)
+      if (BYTES_BIG_ENDIAN)
 	{
 	  maxshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst);
 	  minshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst2);
@@ -2168,7 +2168,7 @@ sra_build_assignment (tree dst, tree src
       list = NULL;
 
       cst2 = size_binop (MINUS_EXPR, maxshift, minshift);
-      if (tree_int_cst_equal (cst2, TYPE_SIZE (utype)))
+      if (TREE_INT_CST_LOW (cst2) == TYPE_PRECISION (utype))
 	{
 	  unsignedp = true;
 	  mask = NULL_TREE;
@@ -2322,7 +2322,7 @@ sra_build_bf_assignment (tree dst, tree 
 		     fold_convert (bitsizetype, TREE_OPERAND (dst, 1)),
 		     cst);
 
-  if (BITS_BIG_ENDIAN)
+  if (BYTES_BIG_ENDIAN)
     {
       maxshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst);
       minshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst2);
@@ -2343,8 +2343,14 @@ sra_build_bf_assignment (tree dst, tree 
     utype = unsigned_type_for (type);
 
   mask = build_int_cst_wide (utype, 1, 0);
-  cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, true);
-  cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, true);
+  if (TREE_INT_CST_LOW (maxshift) == TYPE_PRECISION (utype))
+    cst = build_int_cst_wide (utype, 0, 0);
+  else
+    cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, true);
+  if (integer_zerop (minshift))
+    cst2 = mask;
+  else
+    cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, true);
   mask = int_const_binop (MINUS_EXPR, cst, cst2, true);
   mask = fold_build1 (BIT_NOT_EXPR, utype, mask);
 
@@ -2508,13 +2514,13 @@ sra_build_elt_assignment (struct sra_elt
 	    {
 	      list = NULL;
 
-	      if (!INTEGRAL_TYPE_P (TREE_TYPE (src))
-		  || !TYPE_UNSIGNED (TREE_TYPE (src)))
+	      if (!INTEGRAL_TYPE_P (TREE_TYPE (src)))
 		src = fold_build1 (VIEW_CONVERT_EXPR,
 				   lang_hooks.types.type_for_size
 				   (TREE_INT_CST_LOW
 				    (TYPE_SIZE (TREE_TYPE (src))),
 				    1), src);
+	      gcc_assert (TYPE_UNSIGNED (TREE_TYPE (src)));
 
 	      tmp = make_rename_temp (TREE_TYPE (src), "SR");
 	      stmt = build_gimple_modify_stmt (tmp, src);
@@ -2976,16 +2982,20 @@ sra_explode_bitfield_assignment (tree va
 
       if (fld->replacement)
 	{
-	  tree infld, invar, st;
+	  tree infld, invar, st, type;
 
 	  infld = fld->replacement;
 
+	  type = TREE_TYPE (infld);
+	  if (TYPE_PRECISION (type) != TREE_INT_CST_LOW (flen))
+	    type = lang_hooks.types.type_for_size (TREE_INT_CST_LOW (flen), 1);
+
 	  if (TREE_CODE (infld) == BIT_FIELD_REF)
 	    {
 	      fpos = size_binop (PLUS_EXPR, fpos, TREE_OPERAND (infld, 2));
 	      infld = TREE_OPERAND (infld, 0);
 	    }
-	  else if (BITS_BIG_ENDIAN && DECL_P (fld->element)
+	  else if (BYTES_BIG_ENDIAN && DECL_P (fld->element)
 		   && !tree_int_cst_equal (TYPE_SIZE (TREE_TYPE (infld)),
 					   DECL_SIZE (fld->element)))
 	    {
@@ -2995,10 +3005,7 @@ sra_explode_bitfield_assignment (tree va
 				 DECL_SIZE (fld->element));
 	    }
 
-	  infld = fold_build3 (BIT_FIELD_REF,
-			       lang_hooks.types.type_for_size
-			       (TREE_INT_CST_LOW (flen), 1),
-			       infld, flen, fpos);
+	  infld = fold_build3 (BIT_FIELD_REF, type, infld, flen, fpos);
 	  BIT_FIELD_REF_UNSIGNED (infld) = 1;
 
 	  invar = size_binop (MINUS_EXPR, flp.field_pos, bpos);
@@ -3006,8 +3013,7 @@ sra_explode_bitfield_assignment (tree va
 	    invar = size_binop (PLUS_EXPR, invar, flp.overlap_pos);
 	  invar = size_binop (PLUS_EXPR, invar, vpos);
 
-	  invar = fold_build3 (BIT_FIELD_REF, TREE_TYPE (infld),
-			       var, flen, invar);
+	  invar = fold_build3 (BIT_FIELD_REF, type, var, flen, invar);
 	  BIT_FIELD_REF_UNSIGNED (invar) = 1;
 
 	  if (to_var)

[-- Attachment #3: Type: text/plain, Size: 249 bytes --]


-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

* Re: SRA bit-field optimization
  2007-10-07  9:01                                                                       ` Eric Botcazou
@ 2007-10-07 23:58                                                                         ` Alexandre Oliva
  2007-10-08  5:13                                                                           ` Eric Botcazou
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-10-07 23:58 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, Andrew Pinski

On Oct  7, 2007, Eric Botcazou <ebotcazou@adacore.com> wrote:

>> What pessimization?
> The complex calculation now introduced by SRA:

It's not really introduced, it's just made explicit earlier, in the
tree level, such that we can better optimize it.  I'll give you that
this isn't the simplest way to model operations on 32-bit words in
64-bit variables, but in the end we'd get the same code either way, at
least for variables that are not pushed to memory.

> I confirm that the verification failure of c37213b is fixed by the patch you 
> posted in your previous message, thanks!

Thanks,

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

* Re: SRA bit-field optimization
  2007-10-07 23:58                                                                         ` Alexandre Oliva
@ 2007-10-08  5:13                                                                           ` Eric Botcazou
  2007-10-08 20:29                                                                             ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Eric Botcazou @ 2007-10-08  5:13 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: gcc-patches, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, Andrew Pinski

> It's not really introduced, it's just made explicit earlier, in the
> tree level, such that we can better optimize it.  I'll give you that
> this isn't the simplest way to model operations on 32-bit words in
> 64-bit variables, but in the end we'd get the same code either way, at
> least for variables that are not pushed to memory.

OK, thanks for the clarification.

Could you install the aforementioned patchlet?  FWIW it has successfully 
passed a complete bootstrap + Ada regression testing on x86-64.

-- 
Eric Botcazou

* Re: SRA bit-field optimization
  2007-10-03  7:45                                                               ` Eric Botcazou
  2007-10-03 21:36                                                                 ` Eric Botcazou
@ 2007-10-08 20:28                                                                 ` Alexandre Oliva
  2007-10-05  6:24                                                                   ` Eric Botcazou
  1 sibling, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-10-08 20:28 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, Andrew Pinski

[-- Attachment #1: Type: text/plain, Size: 576 bytes --]

On Oct  3, 2007, Eric Botcazou <ebotcazou@adacore.com> wrote:

> It apparently still misbehaves in some cases for Ada and has introduced an 
> ACATS regression at -O2 on x86-64 (c37213b):

> The correctness problem is that the last statement of BB 4 is not 
> tree_could_throw_p anymore, yielding:

Ugh, there was a thinko in the test for whether the original BB/stmt
could throw.  Thanks for the testcase, this patch fixes it.  I haven't
bootstrapped it yet; I'm on the road with little computing
horsepower and a limited network connection.  I should be back home
tonight.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: gcc-sra-bit-field-ref-fallout-ada.patch --]
[-- Type: text/x-patch, Size: 699 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	* tree-sra.c (scalarize_ldst): Fix thinko in testing whether
	the original stmt can throw.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-10-03 13:52:42.000000000 -0300
+++ gcc/tree-sra.c	2007-10-04 15:20:11.000000000 -0300
@@ -3431,7 +3431,7 @@ scalarize_ldst (struct sra_elt *elt, tre
 	{
 	  tree_stmt_iterator tsi;
 	  tree first, blist = NULL;
-	  bool thr = (bsi->bb->flags & EDGE_COMPLEX) != 0;
+	  bool thr = tree_could_throw_p (stmt);
 
 	  /* If the last statement of this BB created an EH edge
 	     before scalarization, we have to locate the first

[-- Attachment #3: Type: text/plain, Size: 249 bytes --]


-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

* Re: SRA bit-field optimization
  2007-10-08  5:13                                                                           ` Eric Botcazou
@ 2007-10-08 20:29                                                                             ` Alexandre Oliva
  2007-10-08 21:00                                                                               ` Eric Botcazou
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-10-08 20:29 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, Andrew Pinski

On Oct  8, 2007, Eric Botcazou <ebotcazou@adacore.com> wrote:

>> It's not really introduced, it's just made explicit earlier, in the
>> tree level, such that we can better optimize it.  I'll give you that
>> this isn't the simplest way to model operations on 32-bit words in
>> 64-bit variables, but in the end we'd get the same code either way, at
>> least for variables that are not pushed to memory.

> OK, thanks for the clarification.

> Could you install the aforementioned patchlet?  FWIW it has successfully 
> passed a complete bootstrap + Ada regression testing on x86-64.

I'm not sure which of these two patches you're talking about, and I'm
not entitled to install any of them without approval, unless you think
they fall in the obviously-correct rule (which the first one might, I
guess, but AFAICT it somehow failed to make it to the list)

http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00440.html
http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00395.html

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

* Re: SRA bit-field optimization
  2007-10-08 20:29                                                                             ` Alexandre Oliva
@ 2007-10-08 21:00                                                                               ` Eric Botcazou
  2007-10-08 23:56                                                                                 ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: Eric Botcazou @ 2007-10-08 21:00 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: gcc-patches, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, Andrew Pinski

> I'm not sure which of these two patches you're talking about, and I'm
> not entitled to install any of them without approval, unless you think
> they fall in the obviously-correct rule (which the first one might, I
> guess, but AFAICT it somehow failed to make it to the list)

Yes, it's the first one and yes, I'd think that it falls into the 
obviously-correct-for-the-sake-of-consistency rule.

-- 
Eric Botcazou

* Re: SRA bit-field optimization
  2007-10-07 23:44                                                                               ` Alexandre Oliva
@ 2007-10-08 21:14                                                                                 ` John David Anglin
  2007-10-08 23:51                                                                                   ` Alexandre Oliva
  2007-10-09  4:45                                                                                 ` Alexandre Oliva
  1 sibling, 1 reply; 104+ messages in thread
From: John David Anglin @ 2007-10-08 21:14 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: richard.guenther, zippel, bernds_cb1, dnovillo, dberlin,
	gcc-patches, pinskia, ebotcazou, rsandifo

On Sun, 07 Oct 2007, Alexandre Oliva wrote:

> On Oct  6, 2007, Richard Sandiford <rsandifo@nildram.co.uk> wrote:
> 
> > Ah, thanks.  I did have a quick look at it, and it seemed to be caused
> > by a missing conversion at the tree level.  Expand was being given
> > a BIT_AND_EXPR with a DImode-typed result and HImode-typed operand.
> > I didn't try to track down where it was coming from.  So it does seem
> > to be a tree-level bug after all.
> 
> Yep, fixed in the revised patch below.

The 991118-1.c and longlong.c fails are still present on hppa-unknown-linux-gnu
with the revised patch.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

* Re: SRA bit-field optimization
  2007-10-08 21:14                                                                                 ` John David Anglin
@ 2007-10-08 23:51                                                                                   ` Alexandre Oliva
  2007-10-09  0:31                                                                                     ` John David Anglin
  0 siblings, 1 reply; 104+ messages in thread
From: Alexandre Oliva @ 2007-10-08 23:51 UTC (permalink / raw)
  To: John David Anglin
  Cc: richard.guenther, zippel, bernds_cb1, dnovillo, dberlin,
	gcc-patches, pinskia, ebotcazou, rsandifo

On Oct  8, 2007, John David Anglin <dave@hiauly1.hia.nrc.ca> wrote:

> On Sun, 07 Oct 2007, Alexandre Oliva wrote:
>> On Oct  6, 2007, Richard Sandiford <rsandifo@nildram.co.uk> wrote:
>> 
>> > Ah, thanks.  I did have a quick look at it, and it seemed to be caused
>> > by a missing conversion at the tree level.  Expand was being given
>> > a BIT_AND_EXPR with a DImode-typed result and HImode-typed operand.
>> > I didn't try to track down where it was coming from.  So it does seem
>> > to be a tree-level bug after all.
>> 
>> Yep, fixed in the revised patch below.

> The 991118-1.c and longlong.c fails are still present on
> hppa-unknown-linux-gnu with the revised patch.

Since there isn't AFAIK a simulator for hppa-elf and the failure isn't
present on mipsisa32-elf, I'm going to need some help tracking this
down.  I've stared at the final tree code, the final RTL and the
generated assembly for 991118-1.c on hppa-linux-gnu for a couple of
hours already, and I haven't been able to locate any error :-(

Could you please let me know which comparison fails, what we're
comparing with and what the computed value is, such that I can stand a
better chance of fixing the problem?


Thanks,

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

* Re: SRA bit-field optimization
  2007-10-08 21:00                                                                               ` Eric Botcazou
@ 2007-10-08 23:56                                                                                 ` Alexandre Oliva
  0 siblings, 0 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-10-08 23:56 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Roman Zippel, Bernd Schmidt, Diego Novillo,
	Daniel Berlin, Andrew Pinski

On Oct  8, 2007, Eric Botcazou <ebotcazou@adacore.com> wrote:

>> I'm not sure which of these two patches you're talking about, and I'm
>> not entitled to install any of them without approval, unless you think
>> they fall in the obviously-correct rule (which the first one might, I
>> guess, but AFAICT it somehow failed to make to the list)

> Yes, it's the first one and yes, I'd think that it falls into the 
> obviously-correct-for-the-sake-of-consistency rule.

Ok, then, it's in.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

* Re: SRA bit-field optimization
  2007-10-08 23:51                                                                                   ` Alexandre Oliva
@ 2007-10-09  0:31                                                                                     ` John David Anglin
  2007-10-09  4:41                                                                                       ` Alexandre Oliva
  0 siblings, 1 reply; 104+ messages in thread
From: John David Anglin @ 2007-10-09  0:31 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: dave.anglin, richard.guenther, zippel, bernds_cb1, dnovillo,
	dberlin, gcc-patches, pinskia, ebotcazou, rsandifo

> Could you please let me know which comparison fails, what we're
> comparing with and what the computed value is, such that I can stand a
> better chance of fixing the problem?

/home/dave/gnu/gcc-4.3/gcc/gcc/testsuite/gcc.c-torture/execute/991118-1.c: In function 'main':
/home/dave/gnu/gcc-4.3/gcc/gcc/testsuite/gcc.c-torture/execute/991118-1.c:64: internal compiler error: in simplify_subreg, at simplify-rtx.c:4921

Looks like:
  tmp = sub (tmp);

Breakpoint 2, simplify_subreg (outermode=SImode, op=0x402e6fd0,
    innermode=DImode, byte=0) at ../../gcc/gcc/simplify-rtx.c:4920
4920      gcc_assert (GET_MODE (op) == innermode
(gdb) p debug_rtx (op)
(subreg:HI (reg:SI 291) 2)
(gdb) bt
#0  simplify_subreg (outermode=SImode, op=0x402e6fd0, innermode=DImode, byte=0)
    at ../../gcc/gcc/simplify-rtx.c:4920
#1  0x00305414 in simplify_gen_subreg (outermode=SImode, op=0x402e6fd0,
    innermode=DImode, byte=0) at ../../gcc/gcc/simplify-rtx.c:5218
#2  0x00167ff4 in operand_subword_force (op=0x7, offset=1076785104,
    mode=DImode) at ../../gcc/gcc/emit-rtl.c:1438
#3  0x0027a1cc in expand_binop (mode=DImode, binoptab=0x75db40,
    op0=0x402e6fd0, op1=0x401563d8, target=0x402e80f0, unsignedp=1,
    methods=OPTAB_LIB_WIDEN) at ../../gcc/gcc/optabs.c:1721
#4  0x00189718 in expand_expr_real_1 (exp=0x402cbe38, target=0x0,
    tmode=VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0)
    at ../../gcc/gcc/expr.c:9377
#5  0x001997e0 in expand_expr_real (exp=0x402cbe38, target=0x0,
    tmode=VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0)
    at ../../gcc/gcc/expr.c:7090
#6  0x001999b0 in expand_expr (exp=0x7, target=0x402e6fd0, mode=DImode,
    modifier=EXPAND_NORMAL) at ../../gcc/gcc/expr.h:514
#7  0x0018de08 in expand_expr_real_1 (exp=0x402df460, target=0x0,
    tmode=VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0)
    at ../../gcc/gcc/expr.c:8109
#8  0x001997e0 in expand_expr_real (exp=0x402df460, target=0x0,
    tmode=VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0)
    at ../../gcc/gcc/expr.c:7090
#9  0x001999b0 in expand_expr (exp=0x7, target=0x402e6fd0, mode=DImode,
    modifier=EXPAND_NORMAL) at ../../gcc/gcc/expr.h:514
#10 0x0018e530 in expand_expr_real_1 (exp=0x402cbe60, target=0x0,
    tmode=VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0)
    at ../../gcc/gcc/expr.c:8914
#11 0x001997e0 in expand_expr_real (exp=0x402cbe60, target=0x0,
    tmode=VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0)
    at ../../gcc/gcc/expr.c:7090
#12 0x00199c7c in expand_operands (exp0=0x402cbd20, exp1=0x402cbe60,
    target=0x0, op0=0xfb472288, op1=0xfb47228c, modifier=EXPAND_NORMAL)
    at ../../gcc/gcc/expr.h:514
#13 0x001896d4 in expand_expr_real_1 (exp=0x402cbe88, target=0x0,
    tmode=DImode, modifier=EXPAND_NORMAL, alt_rtl=0x0)
    at ../../gcc/gcc/expr.c:9370
#14 0x001997e0 in expand_expr_real (exp=0x402cbe88, target=0x0, tmode=DImode,
    modifier=EXPAND_NORMAL, alt_rtl=0x0) at ../../gcc/gcc/expr.c:7090
#15 0x001999b0 in expand_expr (exp=0x7, target=0x402e6fd0, mode=DImode,
    modifier=EXPAND_NORMAL) at ../../gcc/gcc/expr.h:514
#16 0x0018a1fc in expand_expr_real_1 (exp=0x402d18c0, target=0x402e3080,
    tmode=DImode, modifier=EXPAND_NORMAL, alt_rtl=0xfb471fc8)
    at ../../gcc/gcc/expr.c:8159
#17 0x001997e0 in expand_expr_real (exp=0x402d18c0, target=0x402e3080,
    tmode=DImode, modifier=EXPAND_NORMAL, alt_rtl=0xfb471fc8)
    at ../../gcc/gcc/expr.c:7090
#18 0x0019ee20 in store_expr (exp=0x7, target=0x402e3080, call_param_p=0,
    nontemporal=<value optimized out>) at ../../gcc/gcc/expr.c:4574
#19 0x001a1590 in expand_assignment (to=0x40152a80, from=0x402d18c0,
    nontemporal=0 '\0') at ../../gcc/gcc/expr.c:4357
#20 0x00187c28 in expand_expr_real_1 (exp=0x402d18e0, target=0x0,
    tmode=VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0)
    at ../../gcc/gcc/expr.c:9130
#21 0x00199950 in expand_expr_real (exp=0x402d18e0, target=0x400d9200,
    tmode=VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0)
    at ../../gcc/gcc/expr.c:7084
#22 0x003069d4 in expand_expr_stmt (exp=0x7) at ../../gcc/gcc/stmt.c:1357
#23 0x00549bfc in expand_gimple_basic_block (bb=0x400d0800)
    at ../../gcc/gcc/cfgexpand.c:1606
#24 0x0054adb4 in tree_expand_cfg () at ../../gcc/gcc/cfgexpand.c:1913
#25 0x00288e3c in execute_one_pass (pass=0x6e9388)
    at ../../gcc/gcc/passes.c:1117
#26 0x00289050 in execute_pass_list (pass=0x6e9388)
    at ../../gcc/gcc/passes.c:1170
#27 0x003798b8 in tree_rest_of_compilation (fndecl=0x40148e80)
    at ../../gcc/gcc/tree-optimize.c:404
#28 0x004e0cc4 in cgraph_expand_function (node=0x40148f00)
    at ../../gcc/gcc/cgraphunit.c:1060
#29 0x004e3534 in cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1123
#30 0x00031b14 in c_write_global_declarations () at ../../gcc/gcc/c-decl.c:8077
#31 0x0031ae3c in toplev_main (argc=<value optimized out>,
    argv=<value optimized out>) at ../../gcc/gcc/toplev.c:1052
#32 0x407db750 in __libc_start_main () from /lib/libc.so.6
#33 0x0001ea58 in _start ()


-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

* Re: SRA bit-field optimization
  2007-10-09  0:31                                                                                     ` John David Anglin
@ 2007-10-09  4:41                                                                                       ` Alexandre Oliva
  0 siblings, 0 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-10-09  4:41 UTC (permalink / raw)
  To: John David Anglin
  Cc: dave.anglin, richard.guenther, zippel, bernds_cb1, dnovillo,
	dberlin, gcc-patches, pinskia, ebotcazou, rsandifo

[-- Attachment #1: Type: text/plain, Size: 475 bytes --]

On Oct  8, 2007, "John David Anglin" <dave@hiauly1.hia.nrc.ca> wrote:

> Breakpoint 2, simplify_subreg (outermode=SImode, op=0x402e6fd0,
>     innermode=DImode, byte=0) at ../../gcc/gcc/simplify-rtx.c:4920

Ugh.  I had fixed this, but as it turned out, the hunk that fixed it
was included in one patch while the ChangeLog entry was in the other.

The fix for this is already in, in revision 129143.

Here's the patch I actually checked in, with an adjusted ChangeLog
entry.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: gcc-sra-bit-field-ref-fallout-ada.patch --]
[-- Type: text/x-patch, Size: 1169 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR middle-end/22156
	* tree-sra.c (scalarize_ldst): Fix thinko in testing whether
	the original stmt can throw.
	(sra_build_bf_assignment): Fix type mismatch when applying negated
	mask.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-10-08 20:54:16.000000000 -0300
+++ gcc/tree-sra.c	2007-10-08 20:54:27.000000000 -0300
@@ -2408,7 +2408,7 @@ sra_build_bf_assignment (tree dst, tree 
       tmp2 = fold_build1 (BIT_NOT_EXPR, utype, mask);
       tmp2 = int_const_binop (RSHIFT_EXPR, tmp2, minshift, true);
       tmp2 = fold_convert (ut, tmp2);
-      tmp2 = fold_build2 (BIT_AND_EXPR, utype, tmp3, tmp2);
+      tmp2 = fold_build2 (BIT_AND_EXPR, ut, tmp3, tmp2);
 
       if (tmp3 != tmp2)
 	{
@@ -3436,7 +3436,7 @@ scalarize_ldst (struct sra_elt *elt, tre
 	{
 	  tree_stmt_iterator tsi;
 	  tree first, blist = NULL;
-	  bool thr = (bsi->bb->flags & EDGE_COMPLEX) != 0;
+	  bool thr = tree_could_throw_p (stmt);
 
 	  /* If the last statement of this BB created an EH edge
 	     before scalarization, we have to locate the first

[-- Attachment #3: Type: text/plain, Size: 249 bytes --]


-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

* Re: SRA bit-field optimization
  2007-10-07 23:44                                                                               ` Alexandre Oliva
  2007-10-08 21:14                                                                                 ` John David Anglin
@ 2007-10-09  4:45                                                                                 ` Alexandre Oliva
  1 sibling, 0 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-10-09  4:45 UTC (permalink / raw)
  To: John David Anglin
  Cc: richard.guenther, zippel, bernds_cb1, dnovillo, dberlin,
	gcc-patches, pinskia, ebotcazou, rsandifo

Having run it through testing and revised the patch one more time, I
decided it is obviously correct and I'm checking it in with the
following ChangeLog entry:

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR middle-end/22156
	* tree-sra.c (instantiate_element): Use BYTES_BIG_ENDIAN for
	bit-field layout.
	(sra_build_assignment): Likewise.  Set up mask depending on
	precision, not type.
	(sra_build_bf_assignment): Use BYTES_BIG_ENDIAN.  Don't overflow
	computing bit masks.
	(sra_build_elt_assignment): Don't view-convert from signed to
	unsigned.
	(sra_explode_bitfield_assignment): Use bit-field type if
	possible.  Use BYTES_BIG_ENDIAN.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

* Re: SRA bit-field optimization
  2007-10-05 16:20                                                                   ` Alexandre Oliva
  2007-10-05 16:24                                                                     ` Richard Guenther
  2007-10-05 16:26                                                                     ` Diego Novillo
@ 2007-10-09  4:55                                                                     ` Alexandre Oliva
  2 siblings, 0 replies; 104+ messages in thread
From: Alexandre Oliva @ 2007-10-09  4:55 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Roman Zippel, Bernd Schmidt, Diego Novillo, Daniel Berlin,
	GCC Patches, Andrew Pinski, Eric Botcazou, hjl, wilson

[-- Attachment #1: Type: text/plain, Size: 675 bytes --]

On Oct  5, 2007, Alexandre Oliva <aoliva@redhat.com> wrote:

> On Oct  5, 2007, "Richard Guenther" <richard.guenther@gmail.com> wrote:

>> On 10/2/07, Richard Guenther <richard.guenther@gmail.com> wrote:
>>> It seems this was committed now, and causes ncurses build to fail on
>>> x86_64:
>> Btw, this is PR33655.

> Thanks.  Here's a patch that fixes the problem, that I've just started
> bootstrapping and regtesting on x86_64-linux-gnu.  Ok to install if it
> passes?

H.J.Lu ran into a problem on ia64-linux-gnu, and Jim Wilson provided
the following patch.  I'm checking it in as it is an obvious
improvement.  Please let me know whether it causes any further
problem.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: gcc-sra-bit-field-ref-fallout-array-more.patch --]
[-- Type: text/x-patch, Size: 833 bytes --]

for gcc/ChangeLog
from  James E. Wilson  <wilson@specifix.com>

	PR tree-optimization/33655
	PR middle-end/22156
	* tree-sra.c (bitfield_overlaps_p): When fld->element is INTEGER_CST,
	convert it to bitsizetype before size_binop call.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2007-10-09 01:45:08.000000000 -0300
+++ gcc/tree-sra.c	2007-10-09 01:50:52.000000000 -0300
@@ -2906,7 +2906,8 @@ bitfield_overlaps_p (tree blen, tree bpo
   else if (TREE_CODE (fld->element) == INTEGER_CST)
     {
       flen = fold_convert (bitsizetype, TYPE_SIZE (fld->type));
-      fpos = size_binop (MULT_EXPR, flen, fld->element);
+      fpos = fold_convert (bitsizetype, fld->element);
+      fpos = size_binop (MULT_EXPR, flen, fpos);
     }
   else
     gcc_unreachable ();

[-- Attachment #3: Type: text/plain, Size: 249 bytes --]


-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}


end of thread, other threads:[~2007-10-09  4:55 UTC | newest]

Thread overview: 104+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-04-20  0:29 Reload bug & SRA oddness Bernd Schmidt
2007-04-20  3:19 ` Andrew Pinski
2007-04-20  3:53 ` Daniel Berlin
2007-04-20  4:30   ` Bernd Schmidt
2007-04-20  4:48     ` Andrew Pinski
2007-04-20  7:32     ` Alexandre Oliva
2007-04-20 12:57       ` Bernd Schmidt
2007-04-20 18:28         ` Alexandre Oliva
2007-04-20 18:40         ` Alexandre Oliva
2007-04-30 10:09           ` Bernd Schmidt
2007-04-30 19:04             ` Alexandre Oliva
2007-04-30 19:08               ` Alexandre Oliva
2007-05-01 15:38             ` Alexandre Oliva
2007-05-01 15:45               ` Eric Botcazou
2007-05-04  0:19                 ` Alexandre Oliva
2007-05-01 15:53               ` Diego Novillo
2007-05-01 16:03                 ` Eric Botcazou
2007-05-01 16:51                   ` Arnaud Charlet
2007-05-01 16:54                     ` Arnaud Charlet
2007-05-01 17:31                   ` Andreas Schwab
2007-05-01 16:24               ` Roman Zippel
2007-05-04  4:07                 ` Alexandre Oliva
2007-05-04  5:24                   ` Alexandre Oliva
2007-05-05 18:19                     ` Roman Zippel
2007-05-06  5:13                       ` Alexandre Oliva
2007-05-06 12:13                         ` Bernd Schmidt
2007-05-06 14:27                           ` Alexandre Oliva
2007-05-06 15:01                             ` Alexandre Oliva
2007-05-06 23:44                               ` Alexandre Oliva
2007-05-06 23:51                                 ` Diego Novillo
2007-05-07  0:14                                   ` Alexandre Oliva
2007-05-07  2:21                                     ` Andrew Pinski
2007-05-09 12:25                                     ` Bernd Schmidt
2007-05-10  7:47                                       ` Alexandre Oliva
2007-05-10 11:00                                         ` Bernd Schmidt
2007-05-22  7:09                                           ` Alexandre Oliva
2007-05-22 17:34                                             ` Roman Zippel
2007-05-22 20:49                                               ` Alexandre Oliva
2007-05-23 13:07                                                 ` Roman Zippel
2007-05-22 21:23                                             ` Alexandre Oliva
2007-05-09 18:32                                     ` Roman Zippel
2007-05-10  7:49                                       ` Alexandre Oliva
2007-05-15 17:39                                         ` Alexandre Oliva
2007-05-17 21:17                                           ` Andrew Pinski
2007-05-28 10:49                                           ` Bernd Schmidt
2007-05-31 20:57                                             ` Alexandre Oliva
2007-06-02 17:41                                               ` Bernd Schmidt
2007-06-06  3:18                                                 ` Alexandre Oliva
2007-06-25 18:48                                                   ` Alexandre Oliva
2007-06-26 23:01                                                   ` Bernd Schmidt
2007-06-28  4:50                                                     ` Alexandre Oliva
2007-07-03  0:52                                                       ` Roman Zippel
2007-07-06  9:21                                                         ` Alexandre Oliva
2007-08-24  7:28                                                           ` SRA bit-field optimization (was: Re: Reload bug & SRA oddness) Alexandre Oliva
2007-09-28  9:17                                                             ` SRA bit-field optimization Alexandre Oliva
2007-10-02 17:15                                                               ` Richard Guenther
2007-10-03  8:38                                                                 ` Richard Sandiford
2007-10-03 16:50                                                                   ` Alexandre Oliva
2007-10-03 18:23                                                                     ` Richard Sandiford
2007-10-04 19:56                                                                     ` Richard Sandiford
2007-10-05 17:43                                                                       ` Alexandre Oliva
2007-10-06  8:02                                                                         ` Richard Sandiford
2007-10-06 21:06                                                                           ` John David Anglin
2007-10-06 22:12                                                                             ` Richard Sandiford
2007-10-07 23:44                                                                               ` Alexandre Oliva
2007-10-08 21:14                                                                                 ` John David Anglin
2007-10-08 23:51                                                                                   ` Alexandre Oliva
2007-10-09  0:31                                                                                     ` John David Anglin
2007-10-09  4:41                                                                                       ` Alexandre Oliva
2007-10-09  4:45                                                                                 ` Alexandre Oliva
2007-10-06 16:01                                                                         ` David Daney
2007-10-03 14:37                                                                 ` Daniel Berlin
2007-10-03 14:44                                                                   ` Diego Novillo
2007-10-05 15:03                                                                 ` Richard Guenther
2007-10-05 16:20                                                                   ` Alexandre Oliva
2007-10-05 16:24                                                                     ` Richard Guenther
2007-10-05 16:26                                                                     ` Diego Novillo
2007-10-05 20:08                                                                       ` Alexandre Oliva
2007-10-09  4:55                                                                     ` Alexandre Oliva
2007-10-03  7:45                                                               ` Eric Botcazou
2007-10-03 21:36                                                                 ` Eric Botcazou
2007-10-08 20:28                                                                 ` Alexandre Oliva
2007-10-05  6:24                                                                   ` Eric Botcazou
2007-10-05 16:03                                                                     ` Alexandre Oliva
2007-10-07  9:01                                                                       ` Eric Botcazou
2007-10-07 23:58                                                                         ` Alexandre Oliva
2007-10-08  5:13                                                                           ` Eric Botcazou
2007-10-08 20:29                                                                             ` Alexandre Oliva
2007-10-08 21:00                                                                               ` Eric Botcazou
2007-10-08 23:56                                                                                 ` Alexandre Oliva
2007-09-29 17:52                                                           ` Reload bug & SRA oddness Diego Novillo
2007-04-20 17:07     ` Daniel Berlin
2007-04-28 20:48   ` Bernd Schmidt
2007-04-28 21:26     ` Richard Guenther
2007-04-28 21:49       ` Daniel Berlin
2007-04-29 10:04         ` Richard Guenther
2007-04-29 10:27           ` Richard Guenther
2007-04-29 10:31             ` Richard Guenther
2007-04-29 11:16               ` Richard Guenther
2007-04-20 14:01 ` Bernd Schmidt
2007-04-20 22:00   ` Eric Botcazou
2007-04-28 16:25     ` Bernd Schmidt
2007-04-28 17:46       ` Eric Botcazou
2007-04-29  6:42   ` Bernd Schmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).