On 24 May 2016 at 17:42, Richard Biener wrote: > On Tue, 24 May 2016, Prathamesh Kulkarni wrote: > >> On 23 May 2016 at 17:35, Richard Biener wrote: >> > On Mon, May 23, 2016 at 10:58 AM, Prathamesh Kulkarni >> > wrote: >> >> Hi, >> >> I have updated my patch for divmod (attached), which was originally >> >> based on Kugan's patch. >> >> The patch transforms stmts with code TRUNC_DIV_EXPR and TRUNC_MOD_EXPR >> >> having same operands to divmod representation, so we can cse computation of mod. >> >> >> >> t1 = a TRUNC_DIV_EXPR b; >> >> t2 = a TRUNC_MOD_EXPR b >> >> is transformed to: >> >> complex_tmp = DIVMOD (a, b); >> >> t1 = REALPART_EXPR (complex_tmp); >> >> t2 = IMAGPART_EXPR (complex_tmp); >> >> >> >> * New hook divmod_expand_libfunc >> >> The rationale for introducing the hook is that different targets have >> >> incompatible calling conventions for divmod libfunc. >> >> Currently three ports define divmod libfunc: c6x, spu and arm. >> >> c6x and spu follow the convention of libgcc2.c:__udivmoddi4: >> >> return quotient and store remainder in argument passed as pointer, >> >> while the arm version takes two arguments and returns both >> >> quotient and remainder having mode double the size of the operand mode. >> >> The port should hence override the hook expand_divmod_libfunc >> >> to generate call to target-specific divmod. >> >> Ports should define this hook if: >> >> a) The port does not have divmod or div insn for the given mode. >> >> b) The port defines divmod libfunc for the given mode. >> >> The default hook default_expand_divmod_libfunc() generates call >> >> to libgcc2.c:__udivmoddi4 provided the operands are unsigned and >> >> are of DImode. >> >> >> >> Patch passes bootstrap+test on x86_64-unknown-linux-gnu and >> >> cross-tested on arm*-*-*. >> >> Bootstrap+test in progress on arm-linux-gnueabihf. >> >> Does this patch look OK ? >> > >> > diff --git a/gcc/targhooks.c b/gcc/targhooks.c >> > index 6b4601b..e4a021a 100644 >> > --- a/gcc/targhooks.c >> > +++ b/gcc/targhooks.c >> > @@ -1965,4 +1965,31 @@ default_optab_supported_p (int, machine_mode, >> > machine_mode, optimization_type) >> > return true; >> > } >> > >> > +void >> > +default_expand_divmod_libfunc (bool unsignedp, machine_mode mode, >> > + rtx op0, rtx op1, >> > + rtx *quot_p, rtx *rem_p) >> > >> > functions need a comment. >> > >> > ISTR it was suggested that ARM change to libgcc2.c__udivmoddi4 style? In that >> > case we could avoid the target hook. >> Well I would prefer adding the hook because that's more easier -;) >> Would it be ok for now to go with the hook ? >> > >> > + /* If target overrides expand_divmod_libfunc hook >> > + then perform divmod by generating call to the target-specifc divmod >> > libfunc. */ >> > + if (targetm.expand_divmod_libfunc != default_expand_divmod_libfunc) >> > + return true; >> > + >> > + /* Fall back to using libgcc2.c:__udivmoddi4. */ >> > + return (mode == DImode && unsignedp); >> > >> > I don't understand this - we know optab_libfunc returns non-NULL for 'mode' >> > but still restrict this to DImode && unsigned? Also if >> > targetm.expand_divmod_libfunc >> > is not the default we expect the target to handle all modes? >> Ah indeed, the check for DImode is unnecessary. >> However I suppose the check for unsignedp should be there, >> since we want to generate call to __udivmoddi4 only if operand is unsigned ? > > The optab libfunc for sdivmod should be NULL in that case. Ah indeed, thanks. > >> > >> > That said - I expected the above piece to be simply a 'return true;' ;) >> > >> > Usually we use some can_expand_XXX helper in optabs.c to query if the target >> > supports a specific operation (for example SImode divmod would use DImode >> > divmod by means of widening operands - for the unsigned case of course). >> Thanks for pointing out. So if a target does not support divmod >> libfunc for a mode >> but for a wider mode, then we could zero-extend operands to the wider-mode, >> perform divmod on the wider-mode, and then cast result back to the >> original mode. >> I haven't done that in this patch, would it be OK to do that as a follow up ? > > I think that you should conservatively handle the div_optab query, thus if > the target has a HW division in a wider mode don't use the divmod IFN. > You'd simply iterate over GET_MODE_WIDER_MODE and repeat the > if (optab_handler (div_optab, mode) != CODE_FOR_nothing) check, bailing > out if that is available. Done. > >> > + /* Disable the transform if either is a constant, since >> > division-by-constant >> > + may have specialized expansion. */ >> > + if (TREE_CONSTANT (op1) || TREE_CONSTANT (op2)) >> > + return false; >> > >> > please use CONSTANT_CLASS_P (op1) || CONSTANT_CLASS_P (op2) >> > >> > + if (TYPE_OVERFLOW_TRAPS (type)) >> > + return false; >> > >> > why's that? Generally please first test cheap things (trapping, constant-ness) >> > before checking expensive stuff (target_supports_divmod_p). >> I added TYPE_OVERFLOW_TRAPS (type) based on your suggestion in: >> https://www.mail-archive.com/gcc@gcc.gnu.org/msg78534.html >> "When looking at TRUNC_DIV_EXPR you should also exclude >> the case where TYPE_OVERFLOW_TRAPS (type) as that should >> expand using the [su]divv optabs (no trapping overflow >> divmod optab exists)." > > Ok, didn't remember that. > >> > >> > +static bool >> > +convert_to_divmod (gassign *stmt) >> > +{ >> > + if (!divmod_candidate_p (stmt)) >> > + return false; >> > + >> > + tree op1 = gimple_assign_rhs1 (stmt); >> > + tree op2 = gimple_assign_rhs2 (stmt); >> > + >> > + vec stmts = vNULL; >> > >> > use an auto_vec - you currently leak it in at least one place. >> > >> > + if (maybe_clean_or_replace_eh_stmt (use_stmt, use_stmt)) >> > + cfg_changed = true; >> > >> > note that this suggests you should check whether any of the stmts may throw >> > internally as you don't update / transfer EH info correctly. So for 'stmt' and >> > all 'use_stmt' check stmt_can_throw_internal and if so do not add it to >> > the list of stmts to modify. >> > >> > Btw, I think you should not add 'stmt' immediately but when iterating over >> > all uses also gather uses in TRUNC_MOD_EXPR. >> > >> > Otherwise looks ok. >> Done changes in this version. I am gathering mod uses same time as div uses, >> so this imposes a constraint that mod dominates mod. I am not sure if >> that's desirable. > > I think you also need a mod_seen variable now that you don't necessarily > end up with 'stmt' in the vector of stmts. I don't see how there is a > constraint that mod dominates mod - it's just that the top_stmt needs > to dominate all other uses that can be replaced with replacing top_stmt > with a divmod. It's just that the actual stmt set we choose may now > depend on immediate uses order which on a second thought is bad > as immediate uses order could be affected by debug stmts ... hmm. > > To avoid this please re-add the code adding 'stmt' to stmts immediately > and add a use_stmt != stmt check to the immediate use processing loop > so that we don't end up adding it twice. Well I wonder what will happen for the following case: t1 = x / y; if (cond) t2 = x % y; else t3 = x % y; Assuming stmt == top_stmt is "t2 = x % y" and use_stmt is "t3 = x % y", use_stmt will not get added to stmts vector, since top_stmt and use_stmt are not in same bb, and bb's containing top_stmt and use_stmt don't dominate each other. Not sure if this is practical case (I assume fre will hoist mod outside if-else?) Now that we immediately add stmt to stmts vector, I suppose mod_seen shall not be required ? Thanks, Prathamesh > > Richard.