From: Richard Biener <rguenther@suse.de>
To: Eric Botcazou <ebotcazou@adacore.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH][GCC7] Remove scaling of COMPONENT_REF/ARRAY_REF ops 2/3
Date: Mon, 02 May 2016 09:42:00 -0000 [thread overview]
Message-ID: <alpine.LSU.2.11.1605021121510.13384@t29.fhfr.qr> (raw)
In-Reply-To: <3564626.L9kKSTKFH1@polaris>
On Mon, 2 May 2016, Eric Botcazou wrote:
> > The following experiment resulted from looking at making
> > array_ref_low_bound and array_ref_element_size non-mutating. Again
> > I wondered why we do this strange scaling by offset/element alignment.
>
> The idea is to expose the alignment factor to the RTL expander:
>
> tree tem
> = get_inner_reference (exp, &bitsize, &bitpos, &offset, &mode1,
> &unsignedp, &reversep, &volatilep, true);
>
> [...]
>
> rtx offset_rtx = expand_expr (offset, NULL_RTX, VOIDmode,
> EXPAND_SUM);
>
> [...]
>
> op0 = offset_address (op0, offset_rtx,
> highest_pow2_factor (offset));
>
> With the scaling, offset is something like _69 * 4 so highest_pow2_factor can
> see the factor and passes it down to offset_address:
>
> (gdb) p debug_rtx(op0)
> (mem/c:SI (plus:SI (reg/f:SI 193)
> (reg:SI 194)) [3 *s.16_63 S4 A32])
>
> With your patch in the same situation:
>
> (gdb) p debug_rtx(op0)
> (mem/c:SI (plus:SI (reg/f:SI 139)
> (reg:SI 116 [ _33 ])) [3 *s.16_63 S4 A8])
>
> On strict-alignment targets, this makes a big difference, e.g. SPARC:
>
> ld [%i4+%i5], %i0
>
> vs
>
> ldub [%i5+%i4], %g1
> sll %g1, 24, %g1
> add %i5, %i4, %i5
> ldub [%i5+1], %i0
> sll %i0, 16, %i0
> or %i0, %g1, %i0
> ldub [%i5+2], %g1
> sll %g1, 8, %g1
> or %g1, %i0, %g1
> ldub [%i5+3], %i0
> or %i0, %g1, %i0
>
>
> Now this is mitigated by a couple of things:
>
> 1. the above pessimization only happens on the RHS; on the LHS, the expander
> calls highest_pow2_factor_for_target instead of highest_pow2_factor and the
> former takes into account the type's alignment thanks to the MAX:
>
> /* Similar, except that the alignment requirements of TARGET are
> taken into account. Assume it is at least as aligned as its
> type, unless it is a COMPONENT_REF in which case the layout of
> the structure gives the alignment. */
>
> static unsigned HOST_WIDE_INT
> highest_pow2_factor_for_target (const_tree target, const_tree exp)
> {
> unsigned HOST_WIDE_INT talign = target_align (target) / BITS_PER_UNIT;
> unsigned HOST_WIDE_INT factor = highest_pow2_factor (exp);
>
> return MAX (factor, talign);
> }
>
> 2. highest_pow2_factor can be rescued by the set_nonzero_bits machinery of
> the SSA CCP pass because it calls tree_ctz. The above example was compiled
> with -O -fno-tree-ccp on SPARC; at -O, the code isn't pessimized.
>
> > So - the following patch gets rid of that scaling. For a "simple"
> > C testcase
> >
> > void bar (void *);
> > void foo (int n)
> > {
> > struct S { struct R { int b[n]; } a[2]; int k; } s;
> > s.k = 1;
> > s.a[1].b[7] = 3;
> > bar (&s);
> > }
>
> This only exposes the LHS case, here's a more complete testcase:
>
> void bar (void *);
>
> int foo (int n)
> {
> struct S { struct R { char b[n]; } a[2]; int k; } s;
> s.k = 1;
> s.a[1].b[7] = 3;
> bar (&s);
> return s.k;
> }
Ok. Note that on x86_64 at least SLSR wrecks the component-ref case:
@@ -30,10 +34,14 @@
_9 = _8 + 4;
s.1_11 = __builtin_alloca_with_align (_9, 32);
_12 = _7 >> 2;
- s.1_11->k{off: _12 * 4} = 1;
+ _18 = _12 * 4;
+ _19 = s.1_11 + _18;
+ MEM[(struct S *)_19] = 1;
s.1_11->a[1]{lb: 0 sz: _5}.b[7] = 3;
bar (s.1_11);
- _16 = s.1_11->k{off: _12 * 4};
+ _20 = _12 * 4;
+ _21 = s.1_11 + _20;
+ _16 = MEM[(struct S *)_21];
__builtin_stack_restore (saved_stack.3_3);
return _16;
It seems to me that the issue in the end is that where we compute
alignment from is the pieces gathered by get_inner_reference
instead of computing it alongside of that information in
get_inner_reference, taking advantage of DECL_OFFSET_ALIGN
and the array element type alignment there. This would be
another opportunity to merge get_inner_reference and
get_object_alignment_2 (or have a common worker).
That we expose the exact division in the IL has unfortunate side-effects
like above.
Do I understand you correctly that without using -fno-tree-ccp there
are currently no regressions? What about -O0 then? The code
generated by -O0 on x86_64 currently is quite horrible of course,
so maybe we don't care too much... I think -Og disables CCPs
bit-tracking though.
I see that the constant offset parts are only sometimes folded
into op0 before calling offset_address with the variable offset parts.
Now with get_object_alignment_2 computing alignment and misalignment
one could split out the misaligning offset from bitpos in some way
to have op0 always be aligned.
Sth like
Index: gcc/expr.c
===================================================================
--- gcc/expr.c (revision 235706)
+++ gcc/expr.c (working copy)
@@ -10334,6 +10334,9 @@ expand_expr_real_1 (tree exp, rtx target
offset_rtx = convert_to_mode (address_mode, offset_rtx,
0);
}
+ get_object_alignment_2 (exp, &align, &misalign, false);
+ bitpos -= misalign;
+
/* See the comment in expand_assignment for the rationale. */
if (mode1 != VOIDmode
&& bitpos != 0
@@ -10348,6 +10351,7 @@ expand_expr_real_1 (tree exp, rtx target
op0 = offset_address (op0, offset_rtx,
highest_pow2_factor (offset));
+ set_mem_align (op0, align);
}
/* If OFFSET is making OP0 more aligned than BIGGEST_ALIGNMENT,
which of course still would need adjustments to get_object_alignment_2
and get_innner_reference to recover alignment info by looking at
DECL_OFFSET_ALIGN and friends.
For the moment (as my original motivation was to get rid of the
requirement to call component_ref_field_offset and friends
from GIMPLE parts of the compiler to make them non-tree-mutating
but that has shown to be impossible or rather difficult)
I'm leaving things as-is.
Thanks,
Richard.
next prev parent reply other threads:[~2016-05-02 9:42 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-02-18 14:30 Richard Biener
2016-02-19 8:33 ` Eric Botcazou
2016-04-19 12:15 ` Richard Biener
2016-04-25 11:08 ` Eric Botcazou
2016-05-02 9:13 ` Eric Botcazou
2016-05-02 9:42 ` Richard Biener [this message]
2016-05-03 8:39 ` Eric Botcazou
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LSU.2.11.1605021121510.13384@t29.fhfr.qr \
--to=rguenther@suse.de \
--cc=ebotcazou@adacore.com \
--cc=gcc-patches@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).