From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 31524 invoked by alias); 5 Jul 2007 18:53:05 -0000 Received: (qmail 31487 invoked by uid 22791); 5 Jul 2007 18:53:01 -0000 X-Spam-Check-By: sourceware.org Received: from smtp.nildram.co.uk (HELO smtp.nildram.co.uk) (195.149.33.74) by sourceware.org (qpsmtpd/0.31) with ESMTP; Thu, 05 Jul 2007 18:52:52 +0000 Received: from firetop.home (unknown [84.12.162.236]) by smtp.nildram.co.uk (Postfix) with ESMTP id F0AF154624; Thu, 5 Jul 2007 19:52:46 +0100 (BST) Received: from richard by firetop.home with local (Exim 4.63) (envelope-from ) id 1I6WRc-00071g-Ra; Thu, 05 Jul 2007 19:52:48 +0100 From: Richard Sandiford To: David Daney Mail-Followup-To: David Daney ,gcc-patches@gcc.gnu.org, rsandifo@nildram.co.uk Cc: gcc-patches@gcc.gnu.org Subject: Re: [Patch] 2/3 MIPS support for builtin __builtin_flush_icache(). References: <468734D3.3020908@avtrex.com> <46873597.30607@avtrex.com> <877ipk8kab.fsf@firetop.home> <468C9FCA.4060406@avtrex.com> Date: Thu, 05 Jul 2007 19:05:00 -0000 In-Reply-To: <468C9FCA.4060406@avtrex.com> (David Daney's message of "Thu, 05 Jul 2007 00:37:46 -0700") Message-ID: <87odiq3cov.fsf@firetop.home> User-Agent: Gnus/5.110006 (No Gnus v0.6) Emacs/21.4 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2007-07/txt/msg00488.txt.bz2 Thanks for all the updates. David Daney writes: > Richard Sandiford wrote: >> David Daney writes: >>> + /* Flush both caches. We need to flush the data cache in case >>> + the system has a write-back cache. */ >>> + /* ??? Should check the return value for errors. */ >>> + if (mips_cache_flush_func && mips_cache_flush_func[0]) >>> + emit_library_call (gen_rtx_SYMBOL_REF (Pmode, mips_cache_flush_func), >>> + 0, VOIDmode, 3, copy_rtx (operands[0]), Pmode, >>> + copy_rtx (operands[1]), TYPE_MODE (integer_type_node), >>> + GEN_INT (3), TYPE_MODE (integer_type_node)); >>> + DONE; >>> >> >> Same copy_rtx comment here. TBH, I'm not sure what the "???" comment >> refers to. >> >> > I just pased this part in from its old home in mips.h. The ??? comment > came with it. It is gone now. Sorry, I missed that. And thanks for deleting it. >> I wonder if would be better >> to simply expand this as rtl, with special patterns for the "rdhwr", >> "synci" and "sync" instructions. This is similar to what we do for >> other largish patterns. That's certainly not a requirement though; >> I'm not even sure it makes sense. It's just an idea-cum-question. >> > That is what I did. Looks good, thanks. >> Following on from your comment about delay slots, it also strikes me >> that the "bal; nop; addiu" sequence is almost always longer than the >> sequence to load a local address into a register. I think the only >> exception is static n64 with -mno-sym32. I wonder if it would be >> worth using a sequence like: >> >> rtx label, target, insn; >> >> label = gen_label_rtx (); >> target = force_reg (Pmode, gen_rtx_LABEL_REF (Pmode, label)); >> insn = emit_jump_insn (gen_indirect_jump_hb (target)); >> REG_NOTES (insn) = gen_rtx_INSN_LIST (REG_LABEL, label, REG_NOTES (insn)); >> emit_label (label); >> >> where indirect_jump_hb would just be something like: >> >> (define_insn "indirect_jump_hb" >> [(set (pc) (match_operand "pmode_register_operand" "d")) >> (unspec_volatile [(const_int 0)] UNSPEC_JR_HB)] >> "ISA_HAS_SYNCI" >> "jr.hb\t%0" >> [(set_attr "type" "jump")]) >> >> (all untested). I think we might then be able to fill the delay slot >> from the target of the jump, which should be OK in this context. >> Again, this is just an idea-cum-question. >> >> > I spent way too much time trying to get that to work. For now I left it > mostly unchanged from the original patch. > > The problem is that the CFG things just don't seem to be able to handle: > 1) Jump insns where the body is a (parallel ...) instead of a (set (pc) > ...). > 2) Unconditional jumps where the target and fall through edges goto the > same place. Thanks for trying. In that case, I agree your patch is the way to go. > I kept the call to mips_cache_flush_func pretty much unchanged because I > didn't want to break compatibility with the command line options that > allow this to be changed. Also I added CLEAR_INSN_CACHE that expands > in libgcc that is used if __builtin___clear_cache() cannot be expanded > in-line. This is slightly inefficient as it adds an extra function call > layer through libgcc and has to do the arithmetic to convert the > function parameters, but I am thinking I shouldn't be concerned about a > few extra cycles doing this when we are about to do a system call. The > alternative is to have __builtin___clear_cache() directly expand to > mips_cache_flush_func and leave libgcc out of the picture for MIPS. I'm happy either way. I suppose the latter is appealing in some ways -- to mirror what you said above, it means that -mflush-func will be honoured -- but I can see why the consistency of calling a libgcc function is good too. Perhaps if we aren't sure, the libgcc-less version is better. Once we add the function, we'll have to keep it for ABI compatibility. > Index: config/mips/mips.h > =================================================================== > --- config/mips/mips.h (revision 125997) > +++ config/mips/mips.h (working copy) > @@ -136,6 +136,10 @@ extern const struct mips_cpu_info mips_c > extern const struct mips_cpu_info *mips_arch_info; > extern const struct mips_cpu_info *mips_tune_info; > extern const struct mips_rtx_cost_data *mips_cost; > + > +extern bool mips_builtin_clear_cache_inline_p (void); > +#undef TARGET_BUILTIN_CLEAR_CACHE_INLINE_P > +#define TARGET_BUILTIN_CLEAR_CACHE_INLINE_P mips_builtin_clear_cache_inline_p I think the last two lines belong in mips.c, and that mips_builtin_clear_cache_inline_p can be static. See how similar hooks are handled. (There's a bit of confusing difference in the way target macros are handled. Some are intended to be defined by target .h files, and target-def.h only defines them if the target .h files don't. Others are defined by target-def.h and then overridden in the target .c file. My understanding is that the former are intended for things like OS-specific and assembler-specific macros, whereas the latter are intended for _target_-specific macros. TARGET_BUILTIN_CLEAR_CACHE_INLINE_P is one of the latter for MIPS.) > +;; Flush the instruction cache starting as operands[0] with length > +;; operands[1]. > +(define_expand "flush_icache" > + [(match_operand:SI 0 "pmode_register_operand") > + (match_operand:SI 1 "register_operand")] I think this should be: [(match_operand 0 "pmode_register_operand") (match_operand:SI 1 "const_arith_operand")] There should be no mode on a pmode_register_operand; the predicate exists to check the run-time Pmode. And I think the only user -- INITIALIZE_TRAMPOLINE -- does pass a const_int here. The use of const_arith_operand (as opposed to const_int_operand) emphasises that the constant must be a valid operand to an addition isntruction. > + if (ISA_HAS_SYNCI) > + { > + rtx end_addr = gen_reg_rtx (Pmode); > + emit_insn (gen_rtx_SET (VOIDmode, end_addr, > + gen_rtx_PLUS (Pmode, operands[0], operands[1]))); emit_insn (gen_add3_insn (end_addr, operands[0], operands[1])); > +;; Expand in-line code to clear the instruction cache between operand[0] and > +;; operand[1]. > +(define_expand "clear_cache" > + [(match_operand:SI 0 "general_operand") > + (match_operand:SI 1 "general_operand")] I still think these should be: [(match_operand 0 "pmode_register_operand") (match_operand 1 "pmode_register_operand")] > +(define_insn "sync" > + [(unspec_volatile [(const_int 0)] UNSPEC_SYNC)] > + "" > + "sync" > + [(set_attr "length" "4")]) Best make this conditional on ISA_HAS_SYNC, just to be on the safe side. Very minor nit, but please drop the length attribute, since it's just restating the default. Likewise for the others. > +(define_insn "synci" > + [(unspec_volatile [(match_operand:SI 0 "general_operand" "d")] UNSPEC_SYNCI)] > + "" > + "synci\t0(%0)" > + [(set_attr "length" "4")]) ISA_HAS_SYNC again. Please also make the predicate "pmode_register_operand" (without a mode). > +(define_insn "rdhwr" > + [(set (match_operand:SI 0 "general_operand" "=d") > + (unspec_volatile [(match_operand:SI 1 "immediate_operand" "n")] > + UNSPEC_RDHWR))] > + "" > + "rdhwr\t%0,$%1" > + [(set_attr "length" "4")]) Getting even more minor here, but const_int_operand is a little more specific than immediate_operand, and matches "n" better. > Index: config/mips/mips.c > =================================================================== > --- config/mips/mips.c (revision 125997) > +++ config/mips/mips.c (working copy) > @@ -3759,6 +3759,47 @@ mips_block_move_loop (rtx dest, rtx src, > mips_block_move_straight (dest, src, leftover); > } > > + > +/* Return true if we will expand __builtin_clear_cache() in-line. */ > +bool Silly formatting stuff: two spaces after ".". Usual (and local) style is to have a blank line between comment and function. I tend to prefer comments like: Implement TARGET_BUILTIN_CLEAR_CACHE_INLINE_P. instead, so that you know instantly which tm.texi macro you need to look at in order to get the full interface. > +/* Expand a loop of synci insns */ > +void > +mips_expand_synci_loop (rtx begin, rtx end) Blank line, and ". */". Pedantically, you should mention every argument in the comment, so how about: /* Expand a loop of synci insns for the address range [BEGIN, END). */ > +{ > + rtx end_addr, inc, label, label_ref, cmp, cmp_result; > + > + begin = force_reg(Pmode, begin); > + end = force_reg(Pmode, end); This wouldn't be needed after the pmode_register_operand change, although I agree it is needed as things stand. > + /* Load INC with the cache line size (rdhwr INC,$1). */ > + inc = gen_reg_rtx (SImode); > + emit_insn (gen_rdhwr (inc ,gen_rtx_CONST_INT(SImode, 1))); CONST_INT objects don't have modes, and formatting. Should be: emit_insn (gen_rdhwr (inc, const1_rtx)); > + /* Loop back to here. */ > + label = gen_label_rtx (); > + emit_label (label); > + > + emit_insn (gen_synci (begin)); > + > + cmp = gen_reg_rtx (SImode); Pmode rather than SImode. > + emit_insn (gen_rtx_SET (VOIDmode, cmp, gen_rtx_GTU (Pmode, begin, end))); mips_emit_binary (GTU, cmp, begin, end); > + > + emit_insn (gen_rtx_SET (VOIDmode, begin, gen_rtx_PLUS (Pmode, begin, inc))); Either: emit_insn (gen_add3_insn (begin, begin, inc)); or: mips_emit_binary (PLUS, begin, begin, inc); Probably the latter, givne the GTU thing above. > + > + label_ref = gen_rtx_LABEL_REF (VOIDmode, label); > + cmp_result = gen_rtx_EQ (VOIDmode, cmp, gen_rtx_CONST_INT(SImode, 0)); > + emit_jump_insn (gen_rtx_SET (VOIDmode, pc_rtx, > + gen_rtx_IF_THEN_ELSE (VOIDmode, cmp_result, > + label_ref, > + pc_rtx))); cmp_result = gen_rtx_EQ (VOIDmode, cmp, const0_rtx); emit_jump_insn (gen_condjump (cmp_result, label)); All in all, very minor comments. Thanks again for your work on this. Richard