Richard Sandiford wrote: > Thanks for all the updates. > > David Daney writes: > >> I kept the call to mips_cache_flush_func pretty much unchanged because I >> didn't want to break compatibility with the command line options that >> allow this to be changed. Also I added CLEAR_INSN_CACHE that expands >> in libgcc that is used if __builtin___clear_cache() cannot be expanded >> in-line. This is slightly inefficient as it adds an extra function call >> layer through libgcc and has to do the arithmetic to convert the >> function parameters, but I am thinking I shouldn't be concerned about a >> few extra cycles doing this when we are about to do a system call. The >> alternative is to have __builtin___clear_cache() directly expand to >> mips_cache_flush_func and leave libgcc out of the picture for MIPS. >> > > I'm happy either way. I suppose the latter is appealing in some ways -- > to mirror what you said above, it means that -mflush-func will be > honoured -- but I can see why the consistency of calling a libgcc > function is good too. > > Perhaps if we aren't sure, the libgcc-less version is better. > Once we add the function, we'll have to keep it for ABI compatibility. > Right. That is what I did in this new version of the patch. > >> Index: config/mips/mips.h >> =================================================================== >> --- config/mips/mips.h (revision 125997) >> +++ config/mips/mips.h (working copy) >> @@ -136,6 +136,10 @@ extern const struct mips_cpu_info mips_c >> extern const struct mips_cpu_info *mips_arch_info; >> extern const struct mips_cpu_info *mips_tune_info; >> extern const struct mips_rtx_cost_data *mips_cost; >> + >> +extern bool mips_builtin_clear_cache_inline_p (void); >> +#undef TARGET_BUILTIN_CLEAR_CACHE_INLINE_P >> +#define TARGET_BUILTIN_CLEAR_CACHE_INLINE_P mips_builtin_clear_cache_inline_p >> > > I think the last two lines belong in mips.c, and that > mips_builtin_clear_cache_inline_p can be static. See how > similar hooks are handled. > OK, this is moot now. I removed the target hook from the architecture independent part of the patch. > >> +;; Flush the instruction cache starting as operands[0] with length >> +;; operands[1]. >> +(define_expand "flush_icache" >> + [(match_operand:SI 0 "pmode_register_operand") >> + (match_operand:SI 1 "register_operand")] >> > > I think this should be: > > [(match_operand 0 "pmode_register_operand") > (match_operand:SI 1 "const_arith_operand")] > This insn is gone now. >> +;; Expand in-line code to clear the instruction cache between operand[0] and >> +;; operand[1]. >> +(define_expand "clear_cache" >> + [(match_operand:SI 0 "general_operand") >> + (match_operand:SI 1 "general_operand")] >> > > I still think these should be: > > [(match_operand 0 "pmode_register_operand") > (match_operand 1 "pmode_register_operand")] > Fixed. >> +(define_insn "sync" >> + [(unspec_volatile [(const_int 0)] UNSPEC_SYNC)] >> + "" >> + "sync" >> + [(set_attr "length" "4")]) >> > > Best make this conditional on ISA_HAS_SYNC, just to be on the safe side. > Very minor nit, but please drop the length attribute, since it's just > restating the default. Likewise for the others. > > Fixed. >> +(define_insn "synci" >> + [(unspec_volatile [(match_operand:SI 0 "general_operand" "d")] UNSPEC_SYNCI)] >> + "" >> + "synci\t0(%0)" >> + [(set_attr "length" "4")]) >> > > ISA_HAS_SYNC again. Please also make the predicate "pmode_register_operand" > (without a mode). > > Fixed. >> +(define_insn "rdhwr" >> + [(set (match_operand:SI 0 "general_operand" "=d") >> + (unspec_volatile [(match_operand:SI 1 "immediate_operand" "n")] >> + UNSPEC_RDHWR))] >> + "" >> + "rdhwr\t%0,$%1" >> + [(set_attr "length" "4")]) >> > > Getting even more minor here, but const_int_operand is a little > more specific than immediate_operand, and matches "n" better. > > Fixed. >> Index: config/mips/mips.c >> =================================================================== >> --- config/mips/mips.c (revision 125997) >> +++ config/mips/mips.c (working copy) >> @@ -3759,6 +3759,47 @@ mips_block_move_loop (rtx dest, rtx src, >> mips_block_move_straight (dest, src, leftover); >> } >> >> + >> +/* Return true if we will expand __builtin_clear_cache() in-line. */ >> +bool >> > > Silly formatting stuff: two spaces after ".". Usual (and local) style > is to have a blank line between comment and function. > > I tend to prefer comments like: > > Implement TARGET_BUILTIN_CLEAR_CACHE_INLINE_P. > > instead, so that you know instantly which tm.texi macro you need to > look at in order to get the full interface. > > Agreed, but moot. This part was removed. >> +/* Expand a loop of synci insns */ >> +void >> +mips_expand_synci_loop (rtx begin, rtx end) >> > > Blank line, and ". */". Pedantically, you should mention every > argument in the comment, so how about: > > /* Expand a loop of synci insns for the address range [BEGIN, END). */ > > Fixed. That is what I used. >> +{ >> + rtx end_addr, inc, label, label_ref, cmp, cmp_result; >> + >> + begin = force_reg(Pmode, begin); >> + end = force_reg(Pmode, end); >> > > This wouldn't be needed after the pmode_register_operand change, > although I agree it is needed as things stand. > > Removed. >> + /* Load INC with the cache line size (rdhwr INC,$1). */ >> + inc = gen_reg_rtx (SImode); >> + emit_insn (gen_rdhwr (inc ,gen_rtx_CONST_INT(SImode, 1))); >> > > CONST_INT objects don't have modes, and formatting. Should be: > > emit_insn (gen_rdhwr (inc, const1_rtx)); > > Fixed. >> + /* Loop back to here. */ >> + label = gen_label_rtx (); >> + emit_label (label); >> + >> + emit_insn (gen_synci (begin)); >> + >> + cmp = gen_reg_rtx (SImode); >> > > Pmode rather than SImode. > > Fixed. >> + emit_insn (gen_rtx_SET (VOIDmode, cmp, gen_rtx_GTU (Pmode, begin, end))); >> > > mips_emit_binary (GTU, cmp, begin, end); > > Fixed. >> + >> + emit_insn (gen_rtx_SET (VOIDmode, begin, gen_rtx_PLUS (Pmode, begin, inc))); >> > > Either: > > emit_insn (gen_add3_insn (begin, begin, inc)); > > or: > > mips_emit_binary (PLUS, begin, begin, inc); > > Probably the latter, givne the GTU thing above. > > Fixed. I chose the latter option as suggested. >> + >> + label_ref = gen_rtx_LABEL_REF (VOIDmode, label); >> + cmp_result = gen_rtx_EQ (VOIDmode, cmp, gen_rtx_CONST_INT(SImode, 0)); >> + emit_jump_insn (gen_rtx_SET (VOIDmode, pc_rtx, >> + gen_rtx_IF_THEN_ELSE (VOIDmode, cmp_result, >> + label_ref, >> + pc_rtx))); >> > > cmp_result = gen_rtx_EQ (VOIDmode, cmp, const0_rtx); > emit_jump_insn (gen_condjump (cmp_result, label)); > Fixed. > All in all, very minor comments. Thanks again for your work on this. > OK, how about this new version. INITIALIZE_TRAMPOLINE now uses the new clear_cache insn. This means that it has to add the trampoline size to the ADDR argument before expanding clear_cache. If clear_cache expands to a library call, it has to subtract BEGIN from END to get the length. Looking at the generated code it appears that the RTL optimizers see that this addition/subtraction pair cancel each other out, and somewhat optimal code is generated. Other than that and removing the __builtin_clear_cache_inline_p target hook as suggested by Mark Mitchell, this version of the patch only contains some of the cleanups you suggested. Bootstrapped and regression tested on: x86_64-unknown-linux-gnu all default languages. i686-unknown-linux-gnu to mipsel-linux (--with-arch=mips32 and --with-arch-mips32r2) cross tested with --enable-languages-c,c++,java with no regressions. OK to commit? 2007-07-08 David Daney * config/mips/mips.h (ISA_HAS_SYNCI): New target capability predicate. (INITIALIZE_TRAMPOLINE): Emit clear_cache insn instead of library call. * config/mips/mips.c (mips_expand_synci_loop): New function. * config/mips/mips.md (UNSPEC_CLEAR_HAZARD): New constant. (UNSPEC_RDHWR): Same. (UNSPEC_SYNCI): Same. (UNSPEC_SYNC): Same. (clear_cache): New expand. (sync): New insn. (synci): Same. (rdhwr): Same. (clear_hazard): Same. * config/mips/mips-protos.h (mips_expand_synci_loop): Declare function. * testsuite/gcc.target/mips/clear-cache-1.c: New test. * testsuite/gcc.target/mips/clear-cache-2.c: New test.