public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144)
@ 2010-07-30 16:21 Jie Zhang
  2010-07-30 16:37 ` Richard Guenther
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Jie Zhang @ 2010-07-30 16:21 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 957 bytes --]

PR tree-optimization/45144 shows an issue that SRA causes. I used 
arm-none-eabi target as an example in PR tree-optimization/45144. But 
the same issue can also been seen on x86_64-linux-gnu target using the 
same test case in the PR.

SRA completely scalarizes a small record. But when the record is used 
later as a whole, GCC has to make the record out of the scalar parts. 
When the record contains bit-fields, GCC generates ugly code to assemble 
the scalar parts into a record.

Until the aggregates copy propagation is implemented, I think it would 
better to disable full scalarization for such records. The patch is 
attached. It's bootstrapped on x86_64-linux-gnu and regression tested.

Is it OK for now? We can remove it after aggregates copy propagation is 
implemented.

Will it be better to add bit-field check in type_consists_of_records_p 
instead of using a new function "type_contains_bit_field_p"?


Regards,
-- 
Jie Zhang
CodeSourcery

[-- Attachment #2: gcc-not-full-scalarize-bitfield.diff --]
[-- Type: text/x-patch, Size: 2450 bytes --]


	PR tree-optimization/45144
	* tree-sra.c (type_contains_bit_field_p): New.
	(build_accesses_from_assign): Don't completely scalarize
	a record if it contains bit-field.

	testsuite/
	PR tree-optimization/45144
	* gcc.dg/tree-ssa/pr45144.c: New test.

Index: testsuite/gcc.dg/tree-ssa/pr45144.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/pr45144.c	(revision 0)
+++ testsuite/gcc.dg/tree-ssa/pr45144.c	(revision 0)
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void baz (unsigned);
+
+extern unsigned buf[];
+
+struct A
+{
+  unsigned a1:10;
+  unsigned a2:3;
+  unsigned:19;
+};
+
+union TMP
+{
+  struct A a;
+  unsigned int b;
+};
+
+static unsigned
+foo (struct A *p)
+{
+  union TMP t;
+  struct A x;
+  
+  x = *p;
+  t.a = x;
+  return t.b;
+}
+
+void
+bar (unsigned orig, unsigned *new)
+{
+  struct A a;
+  union TMP s;
+
+  s.b = orig;
+  a = s.a;
+  if (a.a1)
+    baz (a.a2);
+  *new = foo (&a);
+}
+
+/* { dg-final { scan-tree-dump "x = a;" "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: tree-sra.c
===================================================================
--- tree-sra.c	(revision 162704)
+++ tree-sra.c	(working copy)
@@ -998,6 +998,30 @@ disqualify_ops_if_throwing_stmt (gimple 
   return false;
 }
 
+static bool
+type_contains_bit_field_p (const_tree type)
+{
+  tree fld;
+
+  if (TREE_CODE (type) != RECORD_TYPE)
+    return false;
+
+  for (fld = TYPE_FIELDS (type); fld; fld = DECL_CHAIN (fld))
+    if (TREE_CODE (fld) == FIELD_DECL)
+      {
+	tree ft = TREE_TYPE (fld);
+
+	if (DECL_BIT_FIELD (fld))
+	  return true;
+
+	if (TREE_CODE (ft) == RECORD_TYPE
+	    && type_contains_bit_field_p (ft))
+	  return true;
+      }
+
+  return false;
+}
+
 /* Scan expressions occuring in STMT, create access structures for all accesses
    to candidates for scalarization and remove those candidates which occur in
    statements or expressions that prevent them from being split apart.  Return
@@ -1025,7 +1049,8 @@ build_accesses_from_assign (gimple stmt)
     {
       racc->grp_assignment_read = 1;
       if (should_scalarize_away_bitmap && !gimple_has_volatile_ops (stmt)
-	  && !is_gimple_reg_type (racc->type))
+	  && !is_gimple_reg_type (racc->type)
+	  && !type_contains_bit_field_p (racc->type))
 	bitmap_set_bit (should_scalarize_away_bitmap, DECL_UID (racc->base));
     }
 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains  bit-field (PR tree-optimization/45144)
  2010-07-30 16:21 [RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144) Jie Zhang
@ 2010-07-30 16:37 ` Richard Guenther
  2010-07-30 17:27   ` Mark Mitchell
  2010-07-31 10:01 ` Richard Guenther
  2010-08-02 13:01 ` Martin Jambor
  2 siblings, 1 reply; 24+ messages in thread
From: Richard Guenther @ 2010-07-30 16:37 UTC (permalink / raw)
  To: Jie Zhang; +Cc: GCC Patches

On Fri, Jul 30, 2010 at 6:09 PM, Jie Zhang <jie@codesourcery.com> wrote:
> PR tree-optimization/45144 shows an issue that SRA causes. I used
> arm-none-eabi target as an example in PR tree-optimization/45144. But the
> same issue can also been seen on x86_64-linux-gnu target using the same test
> case in the PR.
>
> SRA completely scalarizes a small record. But when the record is used later
> as a whole, GCC has to make the record out of the scalar parts. When the
> record contains bit-fields, GCC generates ugly code to assemble the scalar
> parts into a record.
>
> Until the aggregates copy propagation is implemented, I think it would
> better to disable full scalarization for such records. The patch is
> attached. It's bootstrapped on x86_64-linux-gnu and regression tested.
>
> Is it OK for now? We can remove it after aggregates copy propagation is
> implemented.
>
> Will it be better to add bit-field check in type_consists_of_records_p
> instead of using a new function "type_contains_bit_field_p"?

Without looking at the patch I have two comments.

First, I heard talks about an aggregate copy propagation pass.
Instead of a new pass I would suggest to rewrite aggregate
variables (partly) into SSA form, extending the DECL_GIMPLE_REG_P
facility.  Thus, for each full definition aggr = ... you get
aggr_2 = ..., partial reads of that SSA name should be fine as well
as aggregate uses.

This works for non-aliased variables only of course and requires
some thinking as for DECL_GIMPLE_REG_P as that applies to
the underlying decl which would prohibit creating from

 a.a = x;
 a = b;

 a.a = x;
 a_2 = b;

which means that we need to replace the full defs with a new
temporary (with DECL_DEBUG_EXPR_FROM set appropriately),
thus

 a.a = x;
 tmp_2 = b; // was a

this rewriting should happen in update_address_taken.

Second comment.

SRA should be the place to lower bitfield accesses to loads of
the underlying scalar and the bit extraction via BIT_FIELD_REF.
Stores should be handled via a read-modify-write cycle,
borrowing BIT_FIELD_EXPR as done on the old mem-ref branch.

Richard.

> Regards,
> --
> Jie Zhang
> CodeSourcery
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains  bit-field (PR tree-optimization/45144)
  2010-07-30 16:37 ` Richard Guenther
@ 2010-07-30 17:27   ` Mark Mitchell
  2010-07-30 17:53     ` Jakub Jelinek
  2010-08-02  3:28     ` Jie Zhang
  0 siblings, 2 replies; 24+ messages in thread
From: Mark Mitchell @ 2010-07-30 17:27 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Jie Zhang, GCC Patches

Richard Guenther wrote:

>> Until the aggregates copy propagation is implemented, I think it would
>> better to disable full scalarization for such records. The patch is
>> attached. It's bootstrapped on x86_64-linux-gnu and regression tested.

> Without looking at the patch I have two comments.

From reading your comments, I'm not sure if you're saying that you don't
like Jie's idea, or if you think it's an OK idea, but there are some
other things that you would like to see done as well or instead.  It
would be helpful if you could make that clear.

To me, Jie's change seems like a plausible heuristic within the current
infrastructure.  I'm all for building new infrastructure when
possible/necessary, but I don't think we should prevent these kinds of
tweaks to heuristics just because we can think of another way of doing
things.  To me, the question ought to be "does this make the compiler
better for users?"

To answer that question, what I guess I'd like to know is what the
impact is on some benchmark that matters.  Here, I don't think SPEC,
EEMBC, etc. are probably the right places to look; they probably don't
have much of this kind of code.  Perhaps it would be interesting to know
how many SRA opportunities we lose in the Linux kernel because of this
change -- and then spot-check some of them to see whether those are
cases where we really lose by not doing SRA, or really win because we're
not doing the kind of ugly stuff that inspired this change.

Thanks,

-- 
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144)
  2010-07-30 17:27   ` Mark Mitchell
@ 2010-07-30 17:53     ` Jakub Jelinek
  2010-07-30 18:53       ` Mark Mitchell
                         ` (2 more replies)
  2010-08-02  3:28     ` Jie Zhang
  1 sibling, 3 replies; 24+ messages in thread
From: Jakub Jelinek @ 2010-07-30 17:53 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Richard Guenther, Jie Zhang, GCC Patches

On Fri, Jul 30, 2010 at 10:11:42AM -0700, Mark Mitchell wrote:
> To me, Jie's change seems like a plausible heuristic within the current
> infrastructure.  I'm all for building new infrastructure when
> possible/necessary, but I don't think we should prevent these kinds of
> tweaks to heuristics just because we can think of another way of doing
> things.  To me, the question ought to be "does this make the compiler
> better for users?"

I wouldn't call it tweak to heuristics, it looks to me like an ugly hack.
In many cases the SRA of bitfields is very desirable, especially for apps
that use bitfields heavily like Linux kernel.  I remember Alex spending
quite some time improving SRA to handle bitfields more efficiently,
and this hack just disables it altogether.

What we IMHO need is a pass late in the gimple pipeline which will
optimize adjacent bitfield operations and lower to BIT_FIELD_REF ops
or something similar, because bitfield ops are too hard to be handled
efficiently after the expansion.  Combiner helps sometimes, but for many
bit field ops the 3 insn limit is too limiting.
Such pass could help with many more things than just what has been created
for bitfields by SRA in some cases, consider say:
struct A { unsigned short h; int i : 4, j : 4, k : 4, l : 4; } a, b, c;

void
foo (void)
{
  a.i |= 5; a.j = 3; a.k &= 2; a.l = 8;
  b.i = c.i; b.j = c.j; b.k = c.k; b.l = c.l;
}

which on say i?86/x86_64 can be optimized as roughly:
  ((unsigned short *)&a)[1] = (((unsigned short *)&a)[1] & 0x20a) | 0x8035;
  ((unsigned short *)&b)[1] = ((unsigned short *)&c)[1];
(of course in some alias friendly way).
See e.g. PR37135/PR22121/PR42172.

	Jakub

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144)
  2010-07-30 17:53     ` Jakub Jelinek
@ 2010-07-30 18:53       ` Mark Mitchell
  2010-07-30 18:59         ` Richard Kenner
  2010-07-31  9:48       ` Richard Guenther
  2010-08-02  4:01       ` Jie Zhang
  2 siblings, 1 reply; 24+ messages in thread
From: Mark Mitchell @ 2010-07-30 18:53 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Guenther, Jie Zhang, GCC Patches

Jakub Jelinek wrote:
> On Fri, Jul 30, 2010 at 10:11:42AM -0700, Mark Mitchell wrote:
>> To me, Jie's change seems like a plausible heuristic within the current
>> infrastructure.  I'm all for building new infrastructure when
>> possible/necessary, but I don't think we should prevent these kinds of
>> tweaks to heuristics just because we can think of another way of doing
>> things.  To me, the question ought to be "does this make the compiler
>> better for users?"
> 
> I wouldn't call it tweak to heuristics, it looks to me like an ugly hack.

Well, call it what you like.  It's making a guess about when SRA is or
isn't profitable.  It might be a good guess or a bad guess, but it's a
guess, which is why I used the term "heuristic".

> In many cases the SRA of bitfields is very desirable, especially for apps
> that use bitfields heavily like Linux kernel.  I remember Alex spending
> quite some time improving SRA to handle bitfields more efficiently,
> and this hack just disables it altogether.

That's precisely why I wanted some data about that.  Maybe we already
know enough to know that this isn't a good heuristic because we already
know that SRA on bitfields is a big win in lots of cases.  That's fine;
in that case, the patch is not a good thing.

> What we IMHO need is a pass late in the gimple pipeline which will
> optimize adjacent bitfield operations and lower to BIT_FIELD_REF ops
> or something similar, because bitfield ops are too hard to be handled
> efficiently after the expansion.

That's an interesting idea.  Does anyone else have an idea as to the
best plan here?

-- 
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144)
  2010-07-30 18:53       ` Mark Mitchell
@ 2010-07-30 18:59         ` Richard Kenner
  2010-07-30 19:39           ` Andrew Pinski
  2010-07-31  9:48           ` Richard Guenther
  0 siblings, 2 replies; 24+ messages in thread
From: Richard Kenner @ 2010-07-30 18:59 UTC (permalink / raw)
  To: mark; +Cc: gcc-patches, jakub, jie, richard.guenther

> > What we IMHO need is a pass late in the gimple pipeline which will
> > optimize adjacent bitfield operations and lower to BIT_FIELD_REF ops
> > or something similar, because bitfield ops are too hard to be handled
> > efficiently after the expansion.
> 
> That's an interesting idea.  Does anyone else have an idea as to the
> best plan here?

Note that fold-const.c has done this in some cases for a very long time.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144)
  2010-07-30 18:59         ` Richard Kenner
@ 2010-07-30 19:39           ` Andrew Pinski
  2010-07-30 19:48             ` Mark Mitchell
  2010-07-31  9:48           ` Richard Guenther
  1 sibling, 1 reply; 24+ messages in thread
From: Andrew Pinski @ 2010-07-30 19:39 UTC (permalink / raw)
  To: Richard Kenner; +Cc: mark, gcc-patches, jakub, jie, richard.guenther



On Jul 30, 2010, at 11:54 AM, kenner@vlsi1.ultra.nyu.edu (Richard  
Kenner) wrote:

>>> What we IMHO need is a pass late in the gimple pipeline which will
>>> optimize adjacent bitfield operations and lower to BIT_FIELD_REF ops
>>> or something similar, because bitfield ops are too hard to be  
>>> handled
>>> efficiently after the expansion.
>>
>> That's an interesting idea.  Does anyone else have an idea as to the
>> best plan here?
>
> Note that fold-const.c has done this in some cases for a very long  
> time.

Interesting because we have this extact pass here at cavium. I can  
post the patch set that adds it to 4.3.3 if anyone wants to look at  
them. 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144)
  2010-07-30 19:39           ` Andrew Pinski
@ 2010-07-30 19:48             ` Mark Mitchell
  2010-08-02  4:10               ` Jie Zhang
  0 siblings, 1 reply; 24+ messages in thread
From: Mark Mitchell @ 2010-07-30 19:48 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: Richard Kenner, gcc-patches, jakub, jie, richard.guenther

Andrew Pinski wrote:

> Interesting because we have this extact pass here at cavium. I can post
> the patch set that adds it to 4.3.3 if anyone wants to look at them.

Certainly seems like it would be helpful, assuming Cavium will assign
copyright.

-- 
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains  bit-field (PR tree-optimization/45144)
  2010-07-30 18:59         ` Richard Kenner
  2010-07-30 19:39           ` Andrew Pinski
@ 2010-07-31  9:48           ` Richard Guenther
  1 sibling, 0 replies; 24+ messages in thread
From: Richard Guenther @ 2010-07-31  9:48 UTC (permalink / raw)
  To: Richard Kenner; +Cc: mark, gcc-patches, jakub, jie

On Fri, Jul 30, 2010 at 8:54 PM, Richard Kenner
<kenner@vlsi1.ultra.nyu.edu> wrote:
>> > What we IMHO need is a pass late in the gimple pipeline which will
>> > optimize adjacent bitfield operations and lower to BIT_FIELD_REF ops
>> > or something similar, because bitfield ops are too hard to be handled
>> > efficiently after the expansion.
>>
>> That's an interesting idea.  Does anyone else have an idea as to the
>> best plan here?
>
> Note that fold-const.c has done this in some cases for a very long time.

Indeed - also in a very ugly way.

Richard.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains  bit-field (PR tree-optimization/45144)
  2010-07-30 17:53     ` Jakub Jelinek
  2010-07-30 18:53       ` Mark Mitchell
@ 2010-07-31  9:48       ` Richard Guenther
  2010-08-02 13:25         ` Martin Jambor
  2010-08-02  4:01       ` Jie Zhang
  2 siblings, 1 reply; 24+ messages in thread
From: Richard Guenther @ 2010-07-31  9:48 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Mark Mitchell, Jie Zhang, GCC Patches

On Fri, Jul 30, 2010 at 7:53 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, Jul 30, 2010 at 10:11:42AM -0700, Mark Mitchell wrote:
>> To me, Jie's change seems like a plausible heuristic within the current
>> infrastructure.  I'm all for building new infrastructure when
>> possible/necessary, but I don't think we should prevent these kinds of
>> tweaks to heuristics just because we can think of another way of doing
>> things.  To me, the question ought to be "does this make the compiler
>> better for users?"
>
> I wouldn't call it tweak to heuristics, it looks to me like an ugly hack.
> In many cases the SRA of bitfields is very desirable, especially for apps
> that use bitfields heavily like Linux kernel.  I remember Alex spending
> quite some time improving SRA to handle bitfields more efficiently,
> and this hack just disables it altogether.
>
> What we IMHO need is a pass late in the gimple pipeline which will
> optimize adjacent bitfield operations and lower to BIT_FIELD_REF ops
> or something similar, because bitfield ops are too hard to be handled
> efficiently after the expansion.  Combiner helps sometimes, but for many
> bit field ops the 3 insn limit is too limiting.
> Such pass could help with many more things than just what has been created
> for bitfields by SRA in some cases, consider say:
> struct A { unsigned short h; int i : 4, j : 4, k : 4, l : 4; } a, b, c;
>
> void
> foo (void)
> {
>  a.i |= 5; a.j = 3; a.k &= 2; a.l = 8;
>  b.i = c.i; b.j = c.j; b.k = c.k; b.l = c.l;
> }
>
> which on say i?86/x86_64 can be optimized as roughly:
>  ((unsigned short *)&a)[1] = (((unsigned short *)&a)[1] & 0x20a) | 0x8035;
>  ((unsigned short *)&b)[1] = ((unsigned short *)&c)[1];
> (of course in some alias friendly way).
> See e.g. PR37135/PR22121/PR42172.

Note that on the old mem-ref branch I lowered bit-field references very
early, but people complained that we eventually loose the targets
ability to directly perform bitfield stores.

Basically I lowered each bitfield access (with bitfield access I name
those using a COMPONENT_REF with a FIELD_DECL that has
DECL_BIT_FIELD set) to either a load of the underlying type
plus a BIT_FIELD_REF on the loaded scalar, or a load of the underlying
type plus a BIT_FIELD_EXPR (new code, which inserts bits into
a scalar) plus a store of the underlying type.  CSE and DSE handled
the redundant loads and stores very nicely and BIT_FIELD_EXPRs
and BIT_FIELD_EXPRs were easily combined.

Now instead of lowering very early or very late as Jakub suggests we
can drive (and perform) that lowering in SRA.  For example by
simply walking through its list of accesses and combine them.

Richard.

>        Jakub
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains  bit-field (PR tree-optimization/45144)
  2010-07-30 16:21 [RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144) Jie Zhang
  2010-07-30 16:37 ` Richard Guenther
@ 2010-07-31 10:01 ` Richard Guenther
  2010-08-02  4:29   ` Jie Zhang
  2010-08-02 13:01 ` Martin Jambor
  2 siblings, 1 reply; 24+ messages in thread
From: Richard Guenther @ 2010-07-31 10:01 UTC (permalink / raw)
  To: Jie Zhang; +Cc: GCC Patches, Martin Jambor

On Fri, Jul 30, 2010 at 6:09 PM, Jie Zhang <jie@codesourcery.com> wrote:
> PR tree-optimization/45144 shows an issue that SRA causes. I used
> arm-none-eabi target as an example in PR tree-optimization/45144. But the
> same issue can also been seen on x86_64-linux-gnu target using the same test
> case in the PR.
>
> SRA completely scalarizes a small record. But when the record is used later
> as a whole, GCC has to make the record out of the scalar parts. When the
> record contains bit-fields, GCC generates ugly code to assemble the scalar
> parts into a record.
>
> Until the aggregates copy propagation is implemented, I think it would
> better to disable full scalarization for such records. The patch is
> attached. It's bootstrapped on x86_64-linux-gnu and regression tested.
>
> Is it OK for now? We can remove it after aggregates copy propagation is
> implemented.
>
> Will it be better to add bit-field check in type_consists_of_records_p
> instead of using a new function "type_contains_bit_field_p"?

The patch looks like a hack.  Can you instead make SRA treat the
underlying type of bit-fields as the object for scalarization?
I'm not 100% familiar with the internals, but IIRC SRA builds an
access tree, so for each bitfield load/store the analysis phase should
record an access of the underlying field covering all bits and
a sub-access for the respective member.

Maybe Martin can weight in here.

Richard.

>
> Regards,
> --
> Jie Zhang
> CodeSourcery
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains  bit-field (PR tree-optimization/45144)
  2010-07-30 17:27   ` Mark Mitchell
  2010-07-30 17:53     ` Jakub Jelinek
@ 2010-08-02  3:28     ` Jie Zhang
  2010-08-02 16:52       ` Mark Mitchell
  1 sibling, 1 reply; 24+ messages in thread
From: Jie Zhang @ 2010-08-02  3:28 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Richard Guenther, GCC Patches

On 07/31/2010 01:11 AM, Mark Mitchell wrote:
> Richard Guenther wrote:
>
>>> Until the aggregates copy propagation is implemented, I think it would
>>> better to disable full scalarization for such records. The patch is
>>> attached. It's bootstrapped on x86_64-linux-gnu and regression tested.
>
>> Without looking at the patch I have two comments.
>
>  From reading your comments, I'm not sure if you're saying that you don't
> like Jie's idea, or if you think it's an OK idea, but there are some
> other things that you would like to see done as well or instead.  It
> would be helpful if you could make that clear.
>
> To me, Jie's change seems like a plausible heuristic within the current
> infrastructure.  I'm all for building new infrastructure when
> possible/necessary, but I don't think we should prevent these kinds of
> tweaks to heuristics just because we can think of another way of doing
> things.  To me, the question ought to be "does this make the compiler
> better for users?"
>
> To answer that question, what I guess I'd like to know is what the
> impact is on some benchmark that matters.  Here, I don't think SPEC,
> EEMBC, etc. are probably the right places to look; they probably don't
> have much of this kind of code.  Perhaps it would be interesting to know
> how many SRA opportunities we lose in the Linux kernel because of this
> change -- and then spot-check some of them to see whether those are
> cases where we really lose by not doing SRA, or really win because we're
> not doing the kind of ugly stuff that inspired this change.
>
My patch causes no changes on EEMBC for arm-none-eabi target.

My patch prevents several full scalarizations of records with bit-field 
when compiling Linux kernel for x86_64, but none of these causes 
differences in final assemblies. I use 2.6.34.1 and the default config 
for x86_64. I checked -O2 and -Os.

I also checked the effect of my patch on GCC itself. My patch prevents 
one full scalarization of records with bit-field when compiling files 
under gcc/ with -O2. But there is no difference in the final assemblies.


Regards,
-- 
Jie Zhang
CodeSourcery

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144)
  2010-07-30 17:53     ` Jakub Jelinek
  2010-07-30 18:53       ` Mark Mitchell
  2010-07-31  9:48       ` Richard Guenther
@ 2010-08-02  4:01       ` Jie Zhang
  2 siblings, 0 replies; 24+ messages in thread
From: Jie Zhang @ 2010-08-02  4:01 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Mark Mitchell, Richard Guenther, GCC Patches

On 07/31/2010 01:53 AM, Jakub Jelinek wrote:
> On Fri, Jul 30, 2010 at 10:11:42AM -0700, Mark Mitchell wrote:
>> To me, Jie's change seems like a plausible heuristic within the current
>> infrastructure.  I'm all for building new infrastructure when
>> possible/necessary, but I don't think we should prevent these kinds of
>> tweaks to heuristics just because we can think of another way of doing
>> things.  To me, the question ought to be "does this make the compiler
>> better for users?"
>
> I wouldn't call it tweak to heuristics, it looks to me like an ugly hack.
> In many cases the SRA of bitfields is very desirable, especially for apps
> that use bitfields heavily like Linux kernel.  I remember Alex spending
> quite some time improving SRA to handle bitfields more efficiently,
> and this hack just disables it altogether.
>
My patch does not disable SRA of bit-fields. It's an adjust of the tweak 
for PR 42585. It's only disable total scalarization of small records 
containing bit-fields. As you said below, it's difficult for RTL 
combiner to combine the scalarized bit-fields again. So we'd better to 
avoid total scalarizing it. If the bit-fields are accessed explicitly in 
the IL, SRA still works as in usual way.

As my reply to Mark's email, my patch does not change the final assembly 
of linux kernel, at least for 2.6.34.1 with the default configuration of 
x86_64.


Regards,
-- 
Jie Zhang
CodeSourcery

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144)
  2010-07-30 19:48             ` Mark Mitchell
@ 2010-08-02  4:10               ` Jie Zhang
  0 siblings, 0 replies; 24+ messages in thread
From: Jie Zhang @ 2010-08-02  4:10 UTC (permalink / raw)
  To: Mark Mitchell
  Cc: Andrew Pinski, Richard Kenner, gcc-patches, jakub, richard.guenther

On 07/31/2010 02:59 AM, Mark Mitchell wrote:
> Andrew Pinski wrote:
>
>> Interesting because we have this extact pass here at cavium. I can post
>> the patch set that adds it to 4.3.3 if anyone wants to look at them.
>
> Certainly seems like it would be helpful, assuming Cavium will assign
> copyright.
>
This is very interesting! If Cavium will assign copyright and the patch 
set is posted, I can help to bring it to trunk.


Regards,
-- 
Jie Zhang
CodeSourcery

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains  bit-field (PR tree-optimization/45144)
  2010-07-31 10:01 ` Richard Guenther
@ 2010-08-02  4:29   ` Jie Zhang
  0 siblings, 0 replies; 24+ messages in thread
From: Jie Zhang @ 2010-08-02  4:29 UTC (permalink / raw)
  To: Richard Guenther; +Cc: GCC Patches, Martin Jambor

On 07/31/2010 05:48 PM, Richard Guenther wrote:
> On Fri, Jul 30, 2010 at 6:09 PM, Jie Zhang<jie@codesourcery.com>  wrote:
>> PR tree-optimization/45144 shows an issue that SRA causes. I used
>> arm-none-eabi target as an example in PR tree-optimization/45144. But the
>> same issue can also been seen on x86_64-linux-gnu target using the same test
>> case in the PR.
>>
>> SRA completely scalarizes a small record. But when the record is used later
>> as a whole, GCC has to make the record out of the scalar parts. When the
>> record contains bit-fields, GCC generates ugly code to assemble the scalar
>> parts into a record.
>>
>> Until the aggregates copy propagation is implemented, I think it would
>> better to disable full scalarization for such records. The patch is
>> attached. It's bootstrapped on x86_64-linux-gnu and regression tested.
>>
>> Is it OK for now? We can remove it after aggregates copy propagation is
>> implemented.
>>
>> Will it be better to add bit-field check in type_consists_of_records_p
>> instead of using a new function "type_contains_bit_field_p"?
>
> The patch looks like a hack.  Can you instead make SRA treat the
> underlying type of bit-fields as the object for scalarization?
> I'm not 100% familiar with the internals, but IIRC SRA builds an
> access tree, so for each bitfield load/store the analysis phase should
> record an access of the underlying field covering all bits and
> a sub-access for the respective member.
>
Yes. It's a hack, but it's a hack to the hack for PR 42585. ;-) The fix 
for PR 42585 scalarize small records unconditionally, in hope that later 
passes can fix it up if the scalars are not used eventually. But for 
bit-fields, RTL combiner failed to do so. :-(

> Maybe Martin can weight in here.
>
Martin?


Regards,
-- 
Jie Zhang
CodeSourcery

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144)
  2010-07-30 16:21 [RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144) Jie Zhang
  2010-07-30 16:37 ` Richard Guenther
  2010-07-31 10:01 ` Richard Guenther
@ 2010-08-02 13:01 ` Martin Jambor
  2010-08-04 11:53   ` Jie Zhang
  2 siblings, 1 reply; 24+ messages in thread
From: Martin Jambor @ 2010-08-02 13:01 UTC (permalink / raw)
  To: Jie Zhang; +Cc: GCC Patches

Hi,

On Sat, Jul 31, 2010 at 12:09:42AM +0800, Jie Zhang wrote:
> PR tree-optimization/45144 shows an issue that SRA causes. I used
> arm-none-eabi target as an example in PR tree-optimization/45144.
> But the same issue can also been seen on x86_64-linux-gnu target
> using the same test case in the PR.
> 
> SRA completely scalarizes a small record. But when the record is
> used later as a whole, GCC has to make the record out of the scalar
> parts. When the record contains bit-fields, GCC generates ugly code
> to assemble the scalar parts into a record.
> 
> Until the aggregates copy propagation is implemented, I think it
> would better to disable full scalarization for such records. The
> patch is attached. It's bootstrapped on x86_64-linux-gnu and
> regression tested.
> 
> Is it OK for now? We can remove it after aggregates copy propagation
> is implemented.
> 
> Will it be better to add bit-field check in
> type_consists_of_records_p instead of using a new function
> "type_contains_bit_field_p"?
> 

When I was implementing the total scalarization bit of SRA I thought
of disabling it for structures with bit-fields too.  I did not really
examine the effects in any way but I never expected this to result in
nice code at places where we use SRA to do poor-man's copy
propagation.  However, eventually I decided to keep the total
scalarization for these structures because doing so can save stack
space and it would be shame if adding one such field to a structure
would make us use the space again (in fact, total scalarization was
introduced as a fix to unnecessary stack-frame setup bugs like PR
42585).  But given your results with kernel and gcc, I don't object to
disabling it... people will scream if something slows down for them.

On the other hand, if we decide to go this way, we need to do the
check at a different place, going over the whole type whenever looking
at an assignment is not necessary and is wasteful.  The check would be
most appropriate as a part of type_consists_of_records_p where it
would be performed only once for each variable in question.

Thanks,

Martin

> 
> Regards,
> -- 
> Jie Zhang
> CodeSourcery

> 
> 	PR tree-optimization/45144
> 	* tree-sra.c (type_contains_bit_field_p): New.
> 	(build_accesses_from_assign): Don't completely scalarize
> 	a record if it contains bit-field.
> 
> 	testsuite/
> 	PR tree-optimization/45144
> 	* gcc.dg/tree-ssa/pr45144.c: New test.
> 
> Index: testsuite/gcc.dg/tree-ssa/pr45144.c
> ===================================================================
> --- testsuite/gcc.dg/tree-ssa/pr45144.c	(revision 0)
> +++ testsuite/gcc.dg/tree-ssa/pr45144.c	(revision 0)
> @@ -0,0 +1,46 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> +void baz (unsigned);
> +
> +extern unsigned buf[];
> +
> +struct A
> +{
> +  unsigned a1:10;
> +  unsigned a2:3;
> +  unsigned:19;
> +};
> +
> +union TMP
> +{
> +  struct A a;
> +  unsigned int b;
> +};
> +
> +static unsigned
> +foo (struct A *p)
> +{
> +  union TMP t;
> +  struct A x;
> +  
> +  x = *p;
> +  t.a = x;
> +  return t.b;
> +}
> +
> +void
> +bar (unsigned orig, unsigned *new)
> +{
> +  struct A a;
> +  union TMP s;
> +
> +  s.b = orig;
> +  a = s.a;
> +  if (a.a1)
> +    baz (a.a2);
> +  *new = foo (&a);
> +}
> +
> +/* { dg-final { scan-tree-dump "x = a;" "optimized"} } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */
> Index: tree-sra.c
> ===================================================================
> --- tree-sra.c	(revision 162704)
> +++ tree-sra.c	(working copy)
> @@ -998,6 +998,30 @@ disqualify_ops_if_throwing_stmt (gimple 
>    return false;
>  }
>  
> +static bool
> +type_contains_bit_field_p (const_tree type)
> +{
> +  tree fld;
> +
> +  if (TREE_CODE (type) != RECORD_TYPE)
> +    return false;
> +
> +  for (fld = TYPE_FIELDS (type); fld; fld = DECL_CHAIN (fld))
> +    if (TREE_CODE (fld) == FIELD_DECL)
> +      {
> +	tree ft = TREE_TYPE (fld);
> +
> +	if (DECL_BIT_FIELD (fld))
> +	  return true;
> +
> +	if (TREE_CODE (ft) == RECORD_TYPE
> +	    && type_contains_bit_field_p (ft))
> +	  return true;
> +      }
> +
> +  return false;
> +}
> +
>  /* Scan expressions occuring in STMT, create access structures for all accesses
>     to candidates for scalarization and remove those candidates which occur in
>     statements or expressions that prevent them from being split apart.  Return
> @@ -1025,7 +1049,8 @@ build_accesses_from_assign (gimple stmt)
>      {
>        racc->grp_assignment_read = 1;
>        if (should_scalarize_away_bitmap && !gimple_has_volatile_ops (stmt)
> -	  && !is_gimple_reg_type (racc->type))
> +	  && !is_gimple_reg_type (racc->type)
> +	  && !type_contains_bit_field_p (racc->type))
>  	bitmap_set_bit (should_scalarize_away_bitmap, DECL_UID (racc->base));
>      }
>  

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains  bit-field (PR tree-optimization/45144)
  2010-07-31  9:48       ` Richard Guenther
@ 2010-08-02 13:25         ` Martin Jambor
  0 siblings, 0 replies; 24+ messages in thread
From: Martin Jambor @ 2010-08-02 13:25 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Jakub Jelinek, Mark Mitchell, Jie Zhang, GCC Patches

Hi,

On Sat, Jul 31, 2010 at 11:44:24AM +0200, Richard Guenther wrote:
> On Fri, Jul 30, 2010 at 7:53 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> > On Fri, Jul 30, 2010 at 10:11:42AM -0700, Mark Mitchell wrote:
> >> To me, Jie's change seems like a plausible heuristic within the current
> >> infrastructure.  I'm all for building new infrastructure when
> >> possible/necessary, but I don't think we should prevent these kinds of
> >> tweaks to heuristics just because we can think of another way of doing
> >> things.  To me, the question ought to be "does this make the compiler
> >> better for users?"
> >
> > I wouldn't call it tweak to heuristics, it looks to me like an ugly hack.
> > In many cases the SRA of bitfields is very desirable, especially for apps
> > that use bitfields heavily like Linux kernel.  I remember Alex spending
> > quite some time improving SRA to handle bitfields more efficiently,
> > and this hack just disables it altogether.
> >
> > What we IMHO need is a pass late in the gimple pipeline which will
> > optimize adjacent bitfield operations and lower to BIT_FIELD_REF ops
> > or something similar, because bitfield ops are too hard to be handled
> > efficiently after the expansion.  Combiner helps sometimes, but for many
> > bit field ops the 3 insn limit is too limiting.
> > Such pass could help with many more things than just what has been created
> > for bitfields by SRA in some cases, consider say:
> > struct A { unsigned short h; int i : 4, j : 4, k : 4, l : 4; } a, b, c;
> >
> > void
> > foo (void)
> > {
> >  a.i |= 5; a.j = 3; a.k &= 2; a.l = 8;
> >  b.i = c.i; b.j = c.j; b.k = c.k; b.l = c.l;
> > }
> >
> > which on say i?86/x86_64 can be optimized as roughly:
> >  ((unsigned short *)&a)[1] = (((unsigned short *)&a)[1] & 0x20a) | 0x8035;
> >  ((unsigned short *)&b)[1] = ((unsigned short *)&c)[1];
> > (of course in some alias friendly way).
> > See e.g. PR37135/PR22121/PR42172.
> 
> Note that on the old mem-ref branch I lowered bit-field references very
> early, but people complained that we eventually loose the targets
> ability to directly perform bitfield stores.
> 
> Basically I lowered each bitfield access (with bitfield access I name
> those using a COMPONENT_REF with a FIELD_DECL that has
> DECL_BIT_FIELD set) to either a load of the underlying type
> plus a BIT_FIELD_REF on the loaded scalar, or a load of the underlying
> type plus a BIT_FIELD_EXPR (new code, which inserts bits into
> a scalar) plus a store of the underlying type.  CSE and DSE handled
> the redundant loads and stores very nicely and BIT_FIELD_EXPRs
> and BIT_FIELD_EXPRs were easily combined.
> 
> Now instead of lowering very early or very late as Jakub suggests we
> can drive (and perform) that lowering in SRA.  For example by
> simply walking through its list of accesses and combine them.
> 

With somehing like BIT_FIELD_EXPR... yes, this is probably the way to
go, also for other reasons like replacing the implementation of
buid_ref_for_offset with building of a MEM_REF.  But as you well know,
this is not particuarly easy.  But I hope I'll get there eventually :-)

Martin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains  bit-field (PR tree-optimization/45144)
  2010-08-02  3:28     ` Jie Zhang
@ 2010-08-02 16:52       ` Mark Mitchell
  2010-08-03  9:00         ` Richard Guenther
  0 siblings, 1 reply; 24+ messages in thread
From: Mark Mitchell @ 2010-08-02 16:52 UTC (permalink / raw)
  To: Jie Zhang; +Cc: Richard Guenther, GCC Patches

Jie Zhang wrote:

> My patch prevents several full scalarizations of records with bit-field
> when compiling Linux kernel for x86_64, but none of these causes
> differences in final assemblies. I use 2.6.34.1 and the default config
> for x86_64. I checked -O2 and -Os.

That seems at odds with the statement made previously in this thread
that this optimization was essential for Linux kernel performance.

If Jie's statement is accurate, then, whether or not this is a "hack",
it seems like a win.  I don't see anything wrong with accepting a small,
local improvement that has no user-observable negative impact; we can
always rip it out and replace it with something better when something
better exists.

-- 
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains  bit-field (PR tree-optimization/45144)
  2010-08-02 16:52       ` Mark Mitchell
@ 2010-08-03  9:00         ` Richard Guenther
  2010-08-03  9:59           ` Jie Zhang
  0 siblings, 1 reply; 24+ messages in thread
From: Richard Guenther @ 2010-08-03  9:00 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Jie Zhang, GCC Patches

On Mon, Aug 2, 2010 at 6:52 PM, Mark Mitchell <mark@codesourcery.com> wrote:
> Jie Zhang wrote:
>
>> My patch prevents several full scalarizations of records with bit-field
>> when compiling Linux kernel for x86_64, but none of these causes
>> differences in final assemblies. I use 2.6.34.1 and the default config
>> for x86_64. I checked -O2 and -Os.
>
> That seems at odds with the statement made previously in this thread
> that this optimization was essential for Linux kernel performance.
>
> If Jie's statement is accurate, then, whether or not this is a "hack",
> it seems like a win.  I don't see anything wrong with accepting a small,
> local improvement that has no user-observable negative impact; we can
> always rip it out and replace it with something better when something
> better exists.

OTOH no changes in code generation are also not in favor of this
patch.  Why didn't it improve anything?  Or was that expected?

Can you adjust the patch according to Martins suggestion?

Richard.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains  bit-field (PR tree-optimization/45144)
  2010-08-03  9:00         ` Richard Guenther
@ 2010-08-03  9:59           ` Jie Zhang
  0 siblings, 0 replies; 24+ messages in thread
From: Jie Zhang @ 2010-08-03  9:59 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Mark Mitchell, GCC Patches

On 08/03/2010 05:00 PM, Richard Guenther wrote:
> On Mon, Aug 2, 2010 at 6:52 PM, Mark Mitchell<mark@codesourcery.com>  wrote:
>> Jie Zhang wrote:
>>
>>> My patch prevents several full scalarizations of records with bit-field
>>> when compiling Linux kernel for x86_64, but none of these causes
>>> differences in final assemblies. I use 2.6.34.1 and the default config
>>> for x86_64. I checked -O2 and -Os.
>>
>> That seems at odds with the statement made previously in this thread
>> that this optimization was essential for Linux kernel performance.
>>
>> If Jie's statement is accurate, then, whether or not this is a "hack",
>> it seems like a win.  I don't see anything wrong with accepting a small,
>> local improvement that has no user-observable negative impact; we can
>> always rip it out and replace it with something better when something
>> better exists.
>
> OTOH no changes in code generation are also not in favor of this
> patch.  Why didn't it improve anything?  Or was that expected?
>
Only two total scalarizations are prevented by my patch when building 
the linux kernel for the default x86_64 config, which compiles 1646 C files.

One is in io_apic.c:ioapic_write_entry (), "struct IO_APIC_route_entry e".

The other is in fs-writeback.c:bdi_sync_writeback (), "struct 
wb_writeback_args args".

In both cases, those structs are used simply as a whole. GCC can already 
optimize away the parts generated by total scalarization. So there is no 
difference when it's disabled.

> Can you adjust the patch according to Martins suggestion?
>
OK.


Thanks,
-- 
Jie Zhang
CodeSourcery

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144)
  2010-08-02 13:01 ` Martin Jambor
@ 2010-08-04 11:53   ` Jie Zhang
  2010-08-04 12:23     ` Richard Guenther
  0 siblings, 1 reply; 24+ messages in thread
From: Jie Zhang @ 2010-08-04 11:53 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 2463 bytes --]

On 08/02/2010 09:01 PM, Martin Jambor wrote:
> Hi,
>
> On Sat, Jul 31, 2010 at 12:09:42AM +0800, Jie Zhang wrote:
>> PR tree-optimization/45144 shows an issue that SRA causes. I used
>> arm-none-eabi target as an example in PR tree-optimization/45144.
>> But the same issue can also been seen on x86_64-linux-gnu target
>> using the same test case in the PR.
>>
>> SRA completely scalarizes a small record. But when the record is
>> used later as a whole, GCC has to make the record out of the scalar
>> parts. When the record contains bit-fields, GCC generates ugly code
>> to assemble the scalar parts into a record.
>>
>> Until the aggregates copy propagation is implemented, I think it
>> would better to disable full scalarization for such records. The
>> patch is attached. It's bootstrapped on x86_64-linux-gnu and
>> regression tested.
>>
>> Is it OK for now? We can remove it after aggregates copy propagation
>> is implemented.
>>
>> Will it be better to add bit-field check in
>> type_consists_of_records_p instead of using a new function
>> "type_contains_bit_field_p"?
>>
>
> When I was implementing the total scalarization bit of SRA I thought
> of disabling it for structures with bit-fields too.  I did not really
> examine the effects in any way but I never expected this to result in
> nice code at places where we use SRA to do poor-man's copy
> propagation.  However, eventually I decided to keep the total
> scalarization for these structures because doing so can save stack
> space and it would be shame if adding one such field to a structure
> would make us use the space again (in fact, total scalarization was
> introduced as a fix to unnecessary stack-frame setup bugs like PR
> 42585).  But given your results with kernel and gcc, I don't object to
> disabling it... people will scream if something slows down for them.
>
> On the other hand, if we decide to go this way, we need to do the
> check at a different place, going over the whole type whenever looking
> at an assignment is not necessary and is wasteful.  The check would be
> most appropriate as a part of type_consists_of_records_p where it
> would be performed only once for each variable in question.
>

Thanks for your comment! How about this version? I moved the check into 
type_consists_of_records_p. Bootstrapped and regression tested on 
x86_64. Also checked by building linux kernel and made sure there were 
no regressions.


Regards,
-- 
Jie Zhang
CodeSourcery

[-- Attachment #2: gcc-not-full-scalarize-bitfield-2.diff --]
[-- Type: text/x-patch, Size: 1883 bytes --]


	PR tree-optimization/45144
	* tree-sra.c (type_consists_of_records_p): Return false
	if the record contains bit-field.

	testsuite/
	PR tree-optimization/45144
	* gcc.dg/tree-ssa/pr45144.c: New test.

Index: testsuite/gcc.dg/tree-ssa/pr45144.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/pr45144.c	(revision 0)
+++ testsuite/gcc.dg/tree-ssa/pr45144.c	(revision 0)
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void baz (unsigned);
+
+extern unsigned buf[];
+
+struct A
+{
+  unsigned a1:10;
+  unsigned a2:3;
+  unsigned:19;
+};
+
+union TMP
+{
+  struct A a;
+  unsigned int b;
+};
+
+static unsigned
+foo (struct A *p)
+{
+  union TMP t;
+  struct A x;
+  
+  x = *p;
+  t.a = x;
+  return t.b;
+}
+
+void
+bar (unsigned orig, unsigned *new)
+{
+  struct A a;
+  union TMP s;
+
+  s.b = orig;
+  a = s.a;
+  if (a.a1)
+    baz (a.a2);
+  *new = foo (&a);
+}
+
+/* { dg-final { scan-tree-dump "x = a;" "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: tree-sra.c
===================================================================
--- tree-sra.c	(revision 162725)
+++ tree-sra.c	(working copy)
@@ -811,7 +811,7 @@ create_access (tree expr, gimple stmt, b
 /* Return true iff TYPE is a RECORD_TYPE with fields that are either of gimple
    register types or (recursively) records with only these two kinds of fields.
    It also returns false if any of these records has a zero-size field as its
-   last field.  */
+   last field or has a bit-field.  */
 
 static bool
 type_consists_of_records_p (tree type)
@@ -827,6 +827,9 @@ type_consists_of_records_p (tree type)
       {
 	tree ft = TREE_TYPE (fld);
 
+	if (DECL_BIT_FIELD (fld))
+	  return false;
+
 	if (!is_gimple_reg_type (ft)
 	    && !type_consists_of_records_p (ft))
 	  return false;

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains  bit-field (PR tree-optimization/45144)
  2010-08-04 11:53   ` Jie Zhang
@ 2010-08-04 12:23     ` Richard Guenther
  2010-08-04 19:41       ` Martin Jambor
  0 siblings, 1 reply; 24+ messages in thread
From: Richard Guenther @ 2010-08-04 12:23 UTC (permalink / raw)
  To: Jie Zhang; +Cc: GCC Patches, Martin Jambor

On Wed, Aug 4, 2010 at 1:53 PM, Jie Zhang <jie@codesourcery.com> wrote:
> On 08/02/2010 09:01 PM, Martin Jambor wrote:
>>
>> Hi,
>>
>> On Sat, Jul 31, 2010 at 12:09:42AM +0800, Jie Zhang wrote:
>>>
>>> PR tree-optimization/45144 shows an issue that SRA causes. I used
>>> arm-none-eabi target as an example in PR tree-optimization/45144.
>>> But the same issue can also been seen on x86_64-linux-gnu target
>>> using the same test case in the PR.
>>>
>>> SRA completely scalarizes a small record. But when the record is
>>> used later as a whole, GCC has to make the record out of the scalar
>>> parts. When the record contains bit-fields, GCC generates ugly code
>>> to assemble the scalar parts into a record.
>>>
>>> Until the aggregates copy propagation is implemented, I think it
>>> would better to disable full scalarization for such records. The
>>> patch is attached. It's bootstrapped on x86_64-linux-gnu and
>>> regression tested.
>>>
>>> Is it OK for now? We can remove it after aggregates copy propagation
>>> is implemented.
>>>
>>> Will it be better to add bit-field check in
>>> type_consists_of_records_p instead of using a new function
>>> "type_contains_bit_field_p"?
>>>
>>
>> When I was implementing the total scalarization bit of SRA I thought
>> of disabling it for structures with bit-fields too.  I did not really
>> examine the effects in any way but I never expected this to result in
>> nice code at places where we use SRA to do poor-man's copy
>> propagation.  However, eventually I decided to keep the total
>> scalarization for these structures because doing so can save stack
>> space and it would be shame if adding one such field to a structure
>> would make us use the space again (in fact, total scalarization was
>> introduced as a fix to unnecessary stack-frame setup bugs like PR
>> 42585).  But given your results with kernel and gcc, I don't object to
>> disabling it... people will scream if something slows down for them.
>>
>> On the other hand, if we decide to go this way, we need to do the
>> check at a different place, going over the whole type whenever looking
>> at an assignment is not necessary and is wasteful.  The check would be
>> most appropriate as a part of type_consists_of_records_p where it
>> would be performed only once for each variable in question.
>>
>
> Thanks for your comment! How about this version? I moved the check into
> type_consists_of_records_p. Bootstrapped and regression tested on x86_64.
> Also checked by building linux kernel and made sure there were no
> regressions.

This is ok if Martin is ok with it.

Thanks,
Richard.

>
> Regards,
> --
> Jie Zhang
> CodeSourcery
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains  bit-field (PR tree-optimization/45144)
  2010-08-04 12:23     ` Richard Guenther
@ 2010-08-04 19:41       ` Martin Jambor
  2010-08-05  3:12         ` Jie Zhang
  0 siblings, 1 reply; 24+ messages in thread
From: Martin Jambor @ 2010-08-04 19:41 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Jie Zhang, GCC Patches

Hi,

On Wed, Aug 04, 2010 at 02:23:23PM +0200, Richard Guenther wrote:
> On Wed, Aug 4, 2010 at 1:53 PM, Jie Zhang <jie@codesourcery.com> wrote:
> > Thanks for your comment! How about this version? I moved the check into
> > type_consists_of_records_p. Bootstrapped and regression tested on x86_64.
> > Also checked by building linux kernel and made sure there were no
> > regressions.
> 
> This is ok if Martin is ok with it.
> 
> Thanks,
> Richard.


Yes, it's fine, I can't really think of any other quick way of dealing
with this.

Thanks,

Martin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC] Don't completely scalarize a record if it contains  bit-field (PR tree-optimization/45144)
  2010-08-04 19:41       ` Martin Jambor
@ 2010-08-05  3:12         ` Jie Zhang
  0 siblings, 0 replies; 24+ messages in thread
From: Jie Zhang @ 2010-08-05  3:12 UTC (permalink / raw)
  To: Richard Guenther, GCC Patches

On 08/05/2010 03:40 AM, Martin Jambor wrote:
> Hi,
>
> On Wed, Aug 04, 2010 at 02:23:23PM +0200, Richard Guenther wrote:
>> On Wed, Aug 4, 2010 at 1:53 PM, Jie Zhang<jie@codesourcery.com>  wrote:
>>> Thanks for your comment! How about this version? I moved the check into
>>> type_consists_of_records_p. Bootstrapped and regression tested on x86_64.
>>> Also checked by building linux kernel and made sure there were no
>>> regressions.
>>
>> This is ok if Martin is ok with it.
>>
> Yes, it's fine, I can't really think of any other quick way of dealing
> with this.
>
Thanks Richard and Martin. I have committed it on trunk.

-- 
Jie Zhang
CodeSourcery

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2010-08-05  3:12 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-07-30 16:21 [RFC] Don't completely scalarize a record if it contains bit-field (PR tree-optimization/45144) Jie Zhang
2010-07-30 16:37 ` Richard Guenther
2010-07-30 17:27   ` Mark Mitchell
2010-07-30 17:53     ` Jakub Jelinek
2010-07-30 18:53       ` Mark Mitchell
2010-07-30 18:59         ` Richard Kenner
2010-07-30 19:39           ` Andrew Pinski
2010-07-30 19:48             ` Mark Mitchell
2010-08-02  4:10               ` Jie Zhang
2010-07-31  9:48           ` Richard Guenther
2010-07-31  9:48       ` Richard Guenther
2010-08-02 13:25         ` Martin Jambor
2010-08-02  4:01       ` Jie Zhang
2010-08-02  3:28     ` Jie Zhang
2010-08-02 16:52       ` Mark Mitchell
2010-08-03  9:00         ` Richard Guenther
2010-08-03  9:59           ` Jie Zhang
2010-07-31 10:01 ` Richard Guenther
2010-08-02  4:29   ` Jie Zhang
2010-08-02 13:01 ` Martin Jambor
2010-08-04 11:53   ` Jie Zhang
2010-08-04 12:23     ` Richard Guenther
2010-08-04 19:41       ` Martin Jambor
2010-08-05  3:12         ` Jie Zhang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).