public inbox for gcc-patches@gcc.gnu.org
From: Richard Biener <rguenther@suse.de>
To: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	    Richard Earnshaw <richard.earnshaw@arm.com>,
	    Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>,
	    Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>,
	    Eric Botcazou <ebotcazou@adacore.com>,
	Jeff Law <law@redhat.com>,     Jakub Jelinek <jakub@redhat.com>
Subject: Re: [PATCHv4] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
Date: Thu, 15 Aug 2019 08:58:00 -0000
Message-ID: <alpine.LSU.2.20.1908151032040.32458@zhemvz.fhfr.qr>
In-Reply-To: <AM6PR10MB256602B22A8F0DA91782D76DE4AD0@AM6PR10MB2566.EURPRD10.PROD.OUTLOOK.COM>

On Wed, 14 Aug 2019, Bernd Edlinger wrote:

> On 8/14/19 2:00 PM, Richard Biener wrote:
> > On Thu, 8 Aug 2019, Bernd Edlinger wrote:
> > 
> >> On 8/2/19 9:01 PM, Bernd Edlinger wrote:
> >>> On 8/2/19 3:11 PM, Richard Biener wrote:
> >>>> On Tue, 30 Jul 2019, Bernd Edlinger wrote:
> >>>>
> >>>>>
> >>>>> I have no test coverage for the movmisalign optab though, so I
> >>>>> rely on your code review for that part.
> >>>>
> >>>> It looks OK.  I tried to make it trigger on the following on
> >>>> i?86 with -msse2:
> >>>>
> >>>> typedef int v4si __attribute__((vector_size (16)));
> >>>>
> >>>> struct S { v4si v; } __attribute__((packed));
> >>>>
> >>>> v4si foo (struct S s)
> >>>> {
> >>>>   return s.v;
> >>>> }
> >>>>
> >>>
> >>> Hmm, the entry_parm needs to be a MEM_P, and an unaligned one.
> >>> So the test case could be made to trigger it this way:
> >>>
> >>> typedef int v4si __attribute__((vector_size (16)));
> >>>
> >>> struct S { v4si v; } __attribute__((packed));
> >>>
> >>> int t;
> >>> v4si foo (struct S a, struct S b, struct S c, struct S d,
> >>>           struct S e, struct S f, struct S g, struct S h,
> >>>           int i, int j, int k, int l, int m, int n,
> >>>           int o, struct S s)
> >>> {
> >>>   t = o;
> >>>   return s.v;
> >>> }
> >>>
> >>
> >> Ah, I realized that there are already a couple of very similar
> >> test cases: gcc.target/i386/pr35767-1.c, gcc.target/i386/pr35767-1d.c,
> >> gcc.target/i386/pr35767-1i.c and gcc.target/i386/pr39445.c,
> >> which also manage to execute the movmisalign code with the latest patch
> >> version.  So I thought that it is not necessary to add another one.
> >>
> >>> However the code path is still not reached, since targetm.slow_unaligned_access
> >>> is always FALSE, which is probably a flaw in my patch.
> >>>
> >>> So I think,
> >>>
> >>> +  else if (MEM_P (data->entry_parm)
> >>> +          && GET_MODE_ALIGNMENT (promoted_nominal_mode)
> >>> +             > MEM_ALIGN (data->entry_parm)
> >>> +          && targetm.slow_unaligned_access (promoted_nominal_mode,
> >>> +                                            MEM_ALIGN (data->entry_parm)))
> >>>
> >>> should probably better be
> >>>
> >>> +  else if (MEM_P (data->entry_parm)
> >>> +          && GET_MODE_ALIGNMENT (promoted_nominal_mode)
> >>> +             > MEM_ALIGN (data->entry_parm)
> >>> +        && (((icode = optab_handler (movmisalign_optab, promoted_nominal_mode))
> >>> +             != CODE_FOR_nothing)
> >>> +            || targetm.slow_unaligned_access (promoted_nominal_mode,
> >>> +                                              MEM_ALIGN (data->entry_parm))))
> >>>
> >>> Right?
> >>>
> >>> Then the modified test case would use the movmisalign optab.
> >>> However, nothing changes in the end, since the i386 back-end is used to working
> >>> around the middle end not using the movmisalign optab when it should do so.
> >>>
> >>
> >> I prefer the second form of the check, as it offers more test coverage,
> >> and is probably more correct than the former.
> >>
> >> Note there are more variations of this misalign check in expr.c,
> >> some are somehow odd, like expansion of MEM_REF and VIEW_CONVERT_EXPR:
> >>
> >>             && mode != BLKmode
> >>             && align < GET_MODE_ALIGNMENT (mode))
> >>           {
> >>             if ((icode = optab_handler (movmisalign_optab, mode))
> >>                 != CODE_FOR_nothing)
> >>               [...]
> >>             else if (targetm.slow_unaligned_access (mode, align))
> >>               temp = extract_bit_field (temp, GET_MODE_BITSIZE (mode),
> >>                                         0, TYPE_UNSIGNED (TREE_TYPE (exp)),
> >>                                         (modifier == EXPAND_STACK_PARM
> >>                                          ? NULL_RTX : target),
> >>                                         mode, mode, false, alt_rtl);
> >>
> >> I wonder if they are correct this way: why shouldn't we use the movmisalign
> >> optab if it exists, regardless of TARGET_SLOW_UNALIGNED_ACCESS?
> > 
> > Doesn't the code do exactly this?  Prefer movmisalign over 
> > extract_bit_field?
> > 
> 
> Ah, yes.  How could I miss that.
> 
> >>
> >>> I wonder if I should try to add a gcc_checking_assert to the mov<mode> expand
> >>> patterns that the memory is properly aligned ?
> >>>
> >>
> >> Wow, that was a really exciting bug-hunt with those assertions around...
> > 
> > :)
> > 
> >>>> @@ -3292,6 +3306,23 @@ assign_parm_setup_reg (struct assign_parm_data_all
> >>>>
> >>>>        did_conversion = true;
> >>>>      }
> >>>> +  else if (MEM_P (data->entry_parm)
> >>>> +          && GET_MODE_ALIGNMENT (promoted_nominal_mode)
> >>>> +             > MEM_ALIGN (data->entry_parm)
> >>>>
> >>>> we arrive here by-passing
> >>>>
> >>>>   else if (need_conversion)
> >>>>     {
> >>>>       /* We did not have an insn to convert directly, or the sequence
> >>>>          generated appeared unsafe.  We must first copy the parm to a
> >>>>          pseudo reg, and save the conversion until after all
> >>>>          parameters have been moved.  */
> >>>>
> >>>>       int save_tree_used;
> >>>>       rtx tempreg = gen_reg_rtx (GET_MODE (data->entry_parm));
> >>>>
> >>>>       emit_move_insn (tempreg, validated_mem);
> >>>>
> >>>> but this move instruction is invalid in the same way as the case
> >>>> you fix, no?  So wouldn't it be better to do
> >>>>
> >>>
> >>> We could do that, but I supposed that there must be a reason why
> >>> assign_parm_setup_stack gets away with that same:
> >>>
> >>>   if (data->promoted_mode != data->nominal_mode)
> >>>     {
> >>>       /* Conversion is required.  */
> >>>       rtx tempreg = gen_reg_rtx (GET_MODE (data->entry_parm));
> >>>
> >>>       emit_move_insn (tempreg, validize_mem (copy_rtx (data->entry_parm)));
> >>>
> >>>
> >>> So either some back-ends are too permissive with us,
> >>> or there is a reason why promoted_mode != nominal_mode
> >>> does not happen together with unaligned entry_parm.
> >>> In a way that would be a rather unusual ABI.
> >>>
> >>
> >> To find out if that ever happens I added a couple of checking
> >> assertions in the arm mov<mode> expand patterns.
> >>
> >> So far the assertions did (almost) always hold, so it is likely not
> >> necessary to fiddle with all those naive move instructions here.
> >>
> >> So my gut feeling is, leave those places alone until there is a reason
> >> for changing them.
> > 
> > Works for me - I wonder if we should add those asserts to generic
> > code (guarded with flag_checking) though.
> > 
> 
> Well, yes, but I was scared away by the complexity of emit_move_insn_1.
> 
> It could be done, but for the moment I would be happy to have these
> checks on one major strict-alignment target.  ARM is a good candidate,
> since most instructions work even if they accidentally use unaligned
> arguments, so middle-end errors are not always visible in ordinary
> tests.  Nevertheless it is a blatant violation of the contract between
> middle-end and back-end, which should be avoided.

Fair enough.

> >> However the assertion in movsi triggered a couple times in the
> >> ada testsuite due to expand_builtin_init_descriptor using a
> >> BLKmode MEM rtx, which is only 8-bit aligned.  So, I set the
> >> ptr_mode alignment there explicitly.
> > 
> > Looks good given we emit ptr_mode moves into it.  Thus OK independently.
> > 
> 
> Thanks, committed as r274487.
> 
> >> Several struct-layout-1.dg testcase tripped over misaligned
> >> complex_cst constants, fixed by varasm.c (align_variable).
> >> This is likely a wrong-code bug, because misaligned complex
> >> constants are expanded to a misaligned MEM_REF, but the
> >> expansion cannot handle misaligned constants, only packed
> >> structure fields.
> > 
> > Hmm.  So your patch overrides user-alignment here.  Wouldn't it
> > be better to do that more consciously by
> > 
> >   if (! DECL_USER_ALIGN (decl)
> >       || (align < GET_MODE_ALIGNMENT (DECL_MODE (decl))
> >           && targetm.slow_unaligned_access (DECL_MODE (decl), align)))
> > 
> > ?  And why is the movmisalign optab support missing here?
> > 
> 
> Yes, I wanted to replicate what we have in assign_parm_adjust_stack_rtl:
> 
>   /* If we can't trust the parm stack slot to be aligned enough for its
>      ultimate type, don't use that slot after entry.  We'll make another
>      stack slot, if we need one.  */
>   if (stack_parm
>       && ((GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm)
>            && targetm.slow_unaligned_access (data->nominal_mode,
>                                              MEM_ALIGN (stack_parm)))
> 
> which also makes a variable more aligned than it is declared to be.
> But maybe both should also check the movmisalign optab in
> addition to slow_unaligned_access ?

Quite possible.

> > IMHO whatever code later fails to properly use unaligned loads
> > should be fixed instead rather than ignoring user requested alignment.
> > 
> > Can you quote a short testcase that explains what exactly goes wrong?
> > The struct-layout ones are awkward to look at...
> > 
> 
> Sure,
> 
> $ cat test.c
> _Complex float __attribute__((aligned(1))) cf;
> 
> void foo (void)
> {
>   cf = 1.0i;
> }
> 
> $ arm-linux-gnueabihf-gcc -S test.c 
> during RTL pass: expand
> test.c: In function 'foo':
> test.c:5:6: internal compiler error: in gen_movsf, at config/arm/arm.md:7003
>     5 |   cf = 1.0i;
>       |   ~~~^~~~~~
> 0x7ba475 gen_movsf(rtx_def*, rtx_def*)
> 	../../gcc-trunk/gcc/config/arm/arm.md:7003
> 0xa49587 insn_gen_fn::operator()(rtx_def*, rtx_def*) const
> 	../../gcc-trunk/gcc/recog.h:318
> 0xa49587 emit_move_insn_1(rtx_def*, rtx_def*)
> 	../../gcc-trunk/gcc/expr.c:3695
> 0xa49914 emit_move_insn(rtx_def*, rtx_def*)
> 	../../gcc-trunk/gcc/expr.c:3791
> 0xa494f7 emit_move_complex_parts(rtx_def*, rtx_def*)
> 	../../gcc-trunk/gcc/expr.c:3490
> 0xa49914 emit_move_insn(rtx_def*, rtx_def*)
> 	../../gcc-trunk/gcc/expr.c:3791
> 0xa5106f store_expr(tree_node*, rtx_def*, int, bool, bool)
> 	../../gcc-trunk/gcc/expr.c:5855
> 0xa51cc0 expand_assignment(tree_node*, tree_node*, bool)
> 	../../gcc-trunk/gcc/expr.c:5441

Huh, so why didn't it trigger

  /* Handle misaligned stores.  */
  mode = TYPE_MODE (TREE_TYPE (to));
  if ((TREE_CODE (to) == MEM_REF
       || TREE_CODE (to) == TARGET_MEM_REF)
      && mode != BLKmode
      && !mem_ref_refers_to_non_mem_p (to)
      && ((align = get_object_alignment (to))
          < GET_MODE_ALIGNMENT (mode))
      && (((icode = optab_handler (movmisalign_optab, mode))
           != CODE_FOR_nothing)
          || targetm.slow_unaligned_access (mode, align)))
    {

?  (_Complex float is 32-bit aligned it seems; the DECL_RTL for the
var is (mem/c:SC (symbol_ref:SI ("cf") [flags 0x2] <var_decl 
0x2aaaaaad1240 cf>) [1 cf+0 S8 A8]), and SCmode is 32-bit aligned.)

Ah, 'to' is a plain DECL here so the above handling is incomplete.
IIRC component refs like __real cf = 0.f should be handled fine
again(?).  So, does adding || DECL_P (to) fix the case as well?

> 0xa51cc0 expand_assignment(tree_node*, tree_node*, bool)
> 	../../gcc-trunk/gcc/expr.c:4983
> 0x93396f expand_gimple_stmt_1
> 	../../gcc-trunk/gcc/cfgexpand.c:3777
> 0x93396f expand_gimple_stmt
> 	../../gcc-trunk/gcc/cfgexpand.c:3875
> 0x9392e1 expand_gimple_basic_block
> 	../../gcc-trunk/gcc/cfgexpand.c:5915
> 0x93b046 execute
> 	../../gcc-trunk/gcc/cfgexpand.c:6538
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See <https://gcc.gnu.org/bugs/> for instructions.
> 
> Without the hunk in varasm.c of course.
> 
> What happens is that expand_expr_real_2 returns an unaligned mem_ref here:
> 
>     case COMPLEX_CST:
>       /* Handle evaluating a complex constant in a CONCAT target.  */
>       if (original_target && GET_CODE (original_target) == CONCAT)
>         {
>           [... this path not taken ...]
>         }
> 
>       /* fall through */
> 
>     case STRING_CST:
>       temp = expand_expr_constant (exp, 1, modifier);
> 
>       /* temp contains a constant address.
>          On RISC machines where a constant address isn't valid,
>          make some insns to get that address into a register.  */
>       if (modifier != EXPAND_CONST_ADDRESS
>           && modifier != EXPAND_INITIALIZER
>           && modifier != EXPAND_SUM
>           && ! memory_address_addr_space_p (mode, XEXP (temp, 0),
>                                             MEM_ADDR_SPACE (temp)))
>         return replace_equiv_address (temp,
>                                       copy_rtx (XEXP (temp, 0)));
>       return temp;
> 
> The result of expand_expr_real (..., EXPAND_NORMAL) ought to be usable
> by emit_move_insn; that is expected just *everywhere* and can't be changed.
> 
> This could probably be fixed in an ugly way in the COMPLEX_CST handler,
> but OTOH I don't see any reason why this constant has to be misaligned
> when it can easily be aligned, which avoids the need for a misaligned access.

If the COMPLEX_CST happens to end up in unaligned memory then that's
of course a bug (unless the target requests that for all COMPLEX_CSTs).
That is, if the unalignment is triggered because the store is to an
unaligned decl.

But I think the issue is the above one?

> >> Furthermore gcc.dg/Warray-bounds-33.c was fixed by the
> >> change in expr.c (expand_expr_real_1).  Certainly it is invalid
> >> to read memory at a function address, but it should not ICE.
> >> The problem here is that the MEM_REF has no valid MEM_ALIGN: it looks
> >> like A32, so the misaligned code path is not taken; but the alignment
> >> is set to A8 below, and then we hit an ICE if the result is used:
> > 
> > So the user accessed it as A32.
> > 
> >>         /* Don't set memory attributes if the base expression is
> >>            SSA_NAME that got expanded as a MEM.  In that case, we should
> >>            just honor its original memory attributes.  */
> >>         if (TREE_CODE (tem) != SSA_NAME || !MEM_P (orig_op0))
> >>           set_mem_attributes (op0, exp, 0);
> > 
> > Huh, I don't understand this.  'tem' should never be SSA_NAME.
> 
> tem is the result of get_inner_reference, why can't that be a SSA_NAME ?

We can't subset an SSA_NAME.  I have really no idea what this was
intended to do...

> > But set_mem_attributes_minus_bitpos uses get_object_alignment_1
> > and that has special treatment for FUNCTION_DECLs that is not
> > covered by
> > 
> >       /* When EXP is an actual memory reference then we can use
> >          TYPE_ALIGN of a pointer indirection to derive alignment.
> >          Do so only if get_pointer_alignment_1 did not reveal absolute
> >          alignment knowledge and if using that alignment would
> >          improve the situation.  */
> >       unsigned int talign;
> >       if (!addr_p && !known_alignment
> >           && (talign = min_align_of_type (TREE_TYPE (exp)) * 
> > BITS_PER_UNIT)
> >           && talign > align)
> >         align = talign;
> > 
> > which could be moved out of the if-cascade.
> > 
> > That said, setting A8 should eventually result into appropriate
> > unaligned expansion, so it seems odd this triggers the assert...
> > 
> 
> The function pointer is really 32-bit aligned in ARM mode to start
> with...
> 
> The problem is that the code that handles this misaligned access
> is skipped because the mem_rtx initially has no MEM_ATTRS, and therefore
> MEM_ALIGN == 32, so the code that handles the unaligned
> access is not taken.  BUT before the mem_rtx is returned, its MEM_ALIGN
> is set to 8 by set_mem_attributes, and we hit an assertion,
> because the result of expand_expr_real (..., EXPAND_NORMAL) ought to be
> usable with emit_move_insn.

Yes, as said, the _access_ determines that the address should be aligned,
so we shouldn't end up setting MEM_ALIGN to 8 but to 32, according
to the access type/mode.  But we can't trust DECL_ALIGN of
FUNCTION_DECLs, though we _can_ trust users writing *(int *)fn
(maybe for actual accesses we _can_ trust DECL_ALIGN; it's just that
we may not compute nonzero bits for the actual address because
of function pointer mangling).
(For accessing function code I'd say this would be premature
optimization, but ...)

> >>
> >> Finally gcc.dg/torture/pr48493.c required the change
> >> in assign_parm_setup_stack.  This is just not using the
> >> correct MEM_ALIGN attribute value, while the memory is
> >> actually aligned.
> > 
> > But doesn't
> > 
> >           int align = STACK_SLOT_ALIGNMENT (data->passed_type,
> >                                             GET_MODE (data->entry_parm),
> >                                             TYPE_ALIGN 
> > (data->passed_type));
> > +         if (align < (int)GET_MODE_ALIGNMENT (GET_MODE 
> > (data->entry_parm))
> > +             && targetm.slow_unaligned_access (GET_MODE 
> > (data->entry_parm),
> > +                                               align))
> > +           align = GET_MODE_ALIGNMENT (GET_MODE (data->entry_parm));
> > 
> > hint at that STACK_SLOT_ALIGNMENT is simply bogus for the target?
> > That is, the target says, for natural alignment 64 the stack slot
> > alignment can only be guaranteed 32.  You can't then simply up it
> > but have to use unaligned accesses (or the target/middle-end needs
> > to do dynamic stack alignment).
> > 
> Yes, maybe, but STACK_SLOT_ALIGNMENT is used in a few other places as well,
> and none of them have a problem, probably because they use expand_expr,
> but here we use emit_move_insn:
> 
>       if (MEM_P (src))
>         {
>           [...]
>         }
>       else
>         {
>           if (!REG_P (src))
>             src = force_reg (GET_MODE (src), src);
>           emit_move_insn (dest, src);
>         }
> 
> So I could restrict that to
> 
>           if (!MEM_P (data->entry_parm)
>               && align < (int)GET_MODE_ALIGNMENT (GET_MODE (data->entry_parm))
>               && ((optab_handler (movmisalign_optab,
> 				  GET_MODE (data->entry_parm))
>                    != CODE_FOR_nothing)
>                   || targetm.slow_unaligned_access (GET_MODE (data->entry_parm),
>                                                     align)))
>             align = GET_MODE_ALIGNMENT (GET_MODE (data->entry_parm));
> 
> But OTOH even for arguments arriving in unaligned stack slots where
> emit_block_move could handle it, that would just work against the
> intention of assign_parm_adjust_stack_rtl.
> 
> Of course there are limits how much alignment assign_stack_local
> can handle, and that would result in an assertion in the emit_move_insn.
> But in the end if that happens it is just an impossible target
> configuration.

Still I think you can't simply override STACK_SLOT_ALIGNMENT just because
of the mode of an entry param, can you?  If you can assume a bigger
alignment then STACK_SLOT_ALIGNMENT should return it.

> > 
> >>  Note that set_mem_attributes does not
> >> always preserve the MEM_ALIGN of the ref, since:
> > 
> > set_mem_attributes sets _all_ attributes from an expression or type.
> > 
> 
> Not really:
> 
>   refattrs = MEM_ATTRS (ref);
>   if (refattrs)
>     {
>       /* ??? Can this ever happen?  Calling this routine on a MEM that
>          already carries memory attributes should probably be invalid.  */
>       [...]
>       attrs.align = refattrs->align;
>     }
>   else
>     [...]
> 
>   if (objectp || TREE_CODE (t) == INDIRECT_REF)
>     attrs.align = MAX (attrs.align, TYPE_ALIGN (type));
> 
> >>   /* Default values from pre-existing memory attributes if present.  */
> >>   refattrs = MEM_ATTRS (ref);
> >>   if (refattrs)
> >>     {
> >>       /* ??? Can this ever happen?  Calling this routine on a MEM that
> >>          already carries memory attributes should probably be invalid.  */
> >>       attrs.expr = refattrs->expr;
> >>       attrs.offset_known_p = refattrs->offset_known_p;
> >>       attrs.offset = refattrs->offset;
> >>       attrs.size_known_p = refattrs->size_known_p;
> >>       attrs.size = refattrs->size;
> >>       attrs.align = refattrs->align;
> >>     }
> >>
> >> but if we happen to call set_mem_align with _exactly_ the MODE_ALIGNMENT,
> >> the MEM_ATTRS are zero, and a smaller alignment may result.
> > 
> > Not sure what you are saying here.  That
> > 
> > set_mem_align (MEM:SI A32, 32)
> > 
> > produces a NULL MEM_ATTRS and thus set_mem_attributes not inheriting
> > the A32 but eventually computing something lower?  Yeah, that's probably
> > an interesting "hole" here.  I'm quite sure that if we'd do
> > 
> > refattrs = MEM_ATTRS (ref) ? MEM_ATTRS (ref) : mem_mode_attrs[(int) GET_MODE (ref)];
> > 
> > we run into issues exactly on strict-align targets ...
> > 
> 
> Yeah, that's scary...
> 
> >> Well, with those checks in place it should now be a lot harder to generate
> >> invalid code on STRICT_ALIGNMENT targets without running into an ICE.
> >>
> >> Attached is the latest version of my arm alignment patch.
> >>
> >>
> >> Boot-strapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf.
> >> Is it OK for trunk?
> > 
> > @@ -3291,6 +3306,23 @@ assign_parm_setup_reg (struct assign_parm_data_all
> > 
> >        did_conversion = true;
> >      }
> > +  else if (MEM_P (data->entry_parm)
> > +          && GET_MODE_ALIGNMENT (promoted_nominal_mode)
> > +             > MEM_ALIGN (data->entry_parm)
> > +          && (((icode = optab_handler (movmisalign_optab,
> > +                                       promoted_nominal_mode))
> > +               != CODE_FOR_nothing)
> > +              || targetm.slow_unaligned_access (promoted_nominal_mode,
> > +                                                MEM_ALIGN 
> > (data->entry_parm))))
> > +    {
> > +      if (icode != CODE_FOR_nothing)
> > +       emit_insn (GEN_FCN (icode) (parmreg, validated_mem));
> > +      else
> > +       rtl = parmreg = extract_bit_field (validated_mem,
> > +                       GET_MODE_BITSIZE (promoted_nominal_mode), 0,
> > +                       unsignedp, parmreg,
> > +                       promoted_nominal_mode, VOIDmode, false, NULL);
> > +    }
> >    else
> >      emit_move_insn (parmreg, validated_mem);
> > 
> > This hunk would be obvious to me if we'd use MEM_ALIGN (validated_mem) /
> > GET_MODE (validated_mem) instead of MEM_ALIGN (data->entry_parm)
> > and promoted_nominal_mode.
> > 
> 
> Yes, the idea is just to save some cycles.  Since we have
> 
> parmreg = gen_reg_rtx (promoted_nominal_mode);
> 
> we know that parmreg will also have that mode; in addition,
> emit_move_insn (parmreg, validated_mem), which would be called here,
> asserts that:
> 
>   gcc_assert (mode != BLKmode
>               && (GET_MODE (y) == mode || GET_MODE (y) == VOIDmode));
> 
> so GET_MODE (validated_mem) == GET_MODE (parmreg) == promoted_nominal_mode.
> 
> I still like the current version with promoted_nominal_mode slightly
> better, both because of performance and because of the 80-column restriction. :)

So if you say they are 1:1 equivalent then go for it (for this hunk,
approved as "obvious").

Richard.

Thread overview: 50+ messages
2019-03-10 12:51 [PATCHv2] " Bernd Edlinger
2019-03-19 14:01 ` [PING] " Bernd Edlinger
2019-03-21 11:26 ` Richard Biener
2019-03-22 17:47   ` Bernd Edlinger
2019-03-25  9:28     ` Richard Biener
2019-07-30 22:13       ` [PATCHv3] " Bernd Edlinger
2019-07-31 13:17         ` Richard Earnshaw (lists)
2019-08-01 11:19           ` Bernd Edlinger
2019-08-02  9:10             ` Richard Earnshaw (lists)
2019-08-02 13:11         ` Richard Biener
2019-08-02 19:01           ` Bernd Edlinger
2019-08-08 14:20             ` [PATCHv4] " Bernd Edlinger
2019-08-14 10:54               ` [PING] " Bernd Edlinger
2019-08-14 12:27               ` Richard Biener
2019-08-14 22:26                 ` Bernd Edlinger
2019-08-15  8:58                   ` Richard Biener [this message]
2019-08-15 12:38                     ` Bernd Edlinger
2019-08-15 13:03                       ` Richard Biener
2019-08-15 14:33                         ` Richard Biener
2019-08-15 15:28                         ` Bernd Edlinger
2019-08-15 17:42                           ` Richard Biener
2019-08-15 21:19                             ` [PATCHv5] " Bernd Edlinger
2019-08-20  5:38                               ` Jeff Law
2019-08-20 15:04                               ` John David Anglin
     [not found]                                 ` <0d39b64f-67d9-7857-cf4e-36f09c0dc15e@bell.net>
2019-08-20 16:03                                   ` Fwd: " Bernd Edlinger
2019-09-04 12:53                               ` Richard Earnshaw (lists)
2019-09-04 13:29                                 ` Bernd Edlinger
2019-09-04 14:14                                   ` Richard Earnshaw (lists)
2019-09-04 15:00                                     ` Bernd Edlinger
2019-09-04 15:48                                       ` Richard Earnshaw (lists)
2019-09-05  9:21                                         ` Richard Earnshaw (lists)
2019-09-05  9:35                                           ` Bernd Edlinger
2019-09-06 10:15                                 ` Bernd Edlinger
2019-09-06 10:18                                   ` Richard Earnshaw (lists)
2019-08-15 21:27                             ` [PATCH] Sanitizing the middle-end interface to the back-end for strict alignment Bernd Edlinger
2019-08-17 10:11                               ` Bernd Edlinger
2019-08-23  0:01                                 ` Jeff Law
2019-08-23  0:05                               ` Jeff Law
2019-08-23 15:15                                 ` [PING] " Bernd Edlinger
2019-08-27 10:07                               ` Kyrill Tkachov
2019-08-28 11:50                                 ` Bernd Edlinger
2019-08-28 12:01                                   ` Kyrill Tkachov
2019-08-28 13:54                                     ` Christophe Lyon
2019-08-28 21:48                                       ` Bernd Edlinger
2019-08-29  9:09                                         ` Kyrill Tkachov
2019-08-29 10:00                                           ` Christophe Lyon
2019-08-29 22:57                                             ` Bernd Edlinger
2019-08-30 10:07                                               ` Kyrill Tkachov
2019-08-30 15:22                                               ` Christophe Lyon
2019-08-14 11:56             ` [PATCHv3] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544) Richard Biener
