Hi Richard,

it has been a while, but I had not found time to continue
with this patch until now.

I think I now have a better solution, which properly addresses your
comments below.

On 3/25/19 9:41 AM, Richard Biener wrote:
> On Fri, 22 Mar 2019, Bernd Edlinger wrote:
>
>> On 3/21/19 12:15 PM, Richard Biener wrote:
>>> On Sun, 10 Mar 2019, Bernd Edlinger wrote:
>>> Finally...
>>>
>>> Index: gcc/function.c
>>> ===================================================================
>>> --- gcc/function.c (revision 269264)
>>> +++ gcc/function.c (working copy)
>>> @@ -2210,6 +2210,12 @@ use_register_for_decl (const_tree decl)
>>>    if (DECL_MODE (decl) == BLKmode)
>>>      return false;
>>>
>>> +  if (STRICT_ALIGNMENT && TREE_CODE (decl) == PARM_DECL
>>> +      && DECL_INCOMING_RTL (decl) && MEM_P (DECL_INCOMING_RTL (decl))
>>> +      && GET_MODE_ALIGNMENT (DECL_MODE (decl))
>>> +         > MEM_ALIGN (DECL_INCOMING_RTL (decl)))
>>> +    return false;
>>> +
>>>    /* If -ffloat-store specified, don't put explicit float variables
>>>       into registers. */
>>>    /* ??? This should be checked after DECL_ARTIFICIAL, but tree-ssa
>>>
>>> I wonder if it is necessary to look at DECL_INCOMING_RTL here
>>> and why such RTL may not exist? That is, iff DECL_INCOMING_RTL
>>> doesn't exist then shouldn't we return false for safety reasons?
>>>

You are right, it is not possible to return different results from
use_register_for_decl before vs. after incoming RTL is assigned.
That hits an assertion in set_rtl.

This hunk is gone now; instead I changed assign_parm_setup_reg
to use the movmisalign optab and/or extract_bit_field if a misaligned
entry_parm is to be assigned to a register.

I have no test coverage for the movmisalign optab though, so I
rely on your code review for that part.

>>> Similarly the very same issue should exist on x86_64 which is
>>> !STRICT_ALIGNMENT, it's just the ABI seems to provide the appropriate
>>> alignment on the caller side. So the STRICT_ALIGNMENT check is
>>> a wrong one.
>>>
>>
>> I may be plain wrong here, but I thought that !STRICT_ALIGNMENT targets
>> just use MEM_ALIGN to select the right instructions. MEM_ALIGN
>> is always 32-bit align on the DImode memory. The x86_64 vector instructions
>> would look at MEM_ALIGN and do the right thing, yes?
>
> No, they need to use the movmisalign optab and end up with UNSPECs
> for example.

Ah, thanks, now I see.

>> It seems to be the definition of STRICT_ALIGNMENT targets that all RTL
>> instructions need to have MEM_ALIGN >= GET_MODE_ALIGNMENT, so the target
>> does not even have to look at MEM_ALIGN except in the mov_misalign_optab,
>> right?
>
> Yes, I think we never loosened that. Note that RTL expansion has to
> fix this up for them. Note that strictly speaking SLOW_UNALIGNED_ACCESS
> specifies that x86 is strict-align wrt vector modes.
>

Yes, I agree, the code would be incorrect for x86 as well when the
movmisalign_optab is not used. So I invoke the movmisalign optab if
available and, if not, fall back to extract_bit_field. As in
assign_parm_setup_stack, assign_parm_setup_reg assumes that
data->promoted_mode != data->nominal_mode does not happen with
misaligned stack slots.


Attached is the v3 of my patch.

Bootstrapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf.

Is it OK for trunk?


Thanks
Bernd.
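
For readers following the thread, the fallback order described above
(ordinary move when the slot is aligned enough, otherwise movmisalign
if the target has it, otherwise a piecewise extraction) can be modelled
with a small standalone C sketch. This is not GCC code and not taken
from the patch: the enum, choose_load_strategy and load_u64_bytewise
are hypothetical names invented for illustration, and the memcpy-based
load only stands in for what a piecewise extraction achieves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the decision discussed in this thread.
   None of these names exist in GCC.  */
enum load_strategy
{
  LOAD_PLAIN,        /* slot is aligned enough: ordinary move */
  LOAD_MOVMISALIGN,  /* target offers a misaligned-move pattern */
  LOAD_EXTRACT_BITS  /* generic fallback: assemble the value piecewise */
};

/* Pick a strategy from the alignment the mode wants and the alignment
   the memory slot is known to have, both in bits.  Note that the
   choice does not depend on whether the target is strict-align.  */
static enum load_strategy
choose_load_strategy (unsigned mode_align_bits, unsigned mem_align_bits,
                      bool have_movmisalign)
{
  assert (mode_align_bits > 0 && mem_align_bits > 0);
  if (mem_align_bits >= mode_align_bits)
    return LOAD_PLAIN;
  return have_movmisalign ? LOAD_MOVMISALIGN : LOAD_EXTRACT_BITS;
}

/* Read a 64-bit value from an address with no alignment guarantee.
   memcpy makes no alignment assumption, so this is safe on
   strict-align targets as well.  */
static uint64_t
load_u64_bytewise (const unsigned char *p)
{
  uint64_t value;
  assert (p != NULL);
  memcpy (&value, p, sizeof value);
  return value;
}
```

In this model a DImode value (64-bit alignment) in a 32-bit aligned
slot selects LOAD_MOVMISALIGN when the pattern is available and
LOAD_EXTRACT_BITS otherwise, regardless of STRICT_ALIGNMENT.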