From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 24527 invoked by alias); 1 Jul 2007 10:56:19 -0000 Received: (qmail 24517 invoked by uid 22791); 1 Jul 2007 10:56:18 -0000 X-Spam-Check-By: sourceware.org Received: from smtp.nildram.co.uk (HELO smtp.nildram.co.uk) (195.112.4.54) by sourceware.org (qpsmtpd/0.31) with ESMTP; Sun, 01 Jul 2007 10:56:15 +0000 Received: from firetop.home (85-211-140-22.dyn.gotadsl.co.uk [85.211.140.22]) by smtp.nildram.co.uk (Postfix) with ESMTP id C73CB2B71DB; Sun, 1 Jul 2007 11:56:10 +0100 (BST) Received: from richard by firetop.home with local (Exim 4.63) (envelope-from ) id 1I4x6C-0001VD-I6; Sun, 01 Jul 2007 11:56:12 +0100 From: Richard Sandiford To: David Daney Mail-Followup-To: David Daney , gcc-patches@gcc.gnu.org, rsandifo@nildram.co.uk Cc: gcc-patches@gcc.gnu.org Subject: Re: [Patch] 2/3 MIPS support for builtin __builtin_flush_icache(). References: <468734D3.3020908@avtrex.com> <46873597.30607@avtrex.com> Date: Sun, 01 Jul 2007 10:56:00 -0000 In-Reply-To: <46873597.30607@avtrex.com> (David Daney's message of "Sat, 30 Jun 2007 22:03:19 -0700") Message-ID: <877ipk8kab.fsf@firetop.home> User-Agent: Gnus/5.110006 (No Gnus v0.6) Emacs/21.4 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2007-07/txt/msg00017.txt.bz2 David Daney writes: > This is the second part of the __builtin_flush_icache patch. It adds > support for MIPS. Thanks for doing this. > @@ -4171,6 +4173,72 @@ (define_insn "cprestore" > } > [(set_attr "type" "store") > (set_attr "length" "4,12")]) > + > +(define_expand "flush_icache" > + [(match_operand:SI 0 "general_operand" "r") > + (match_operand:SI 1 "general_operand" "r")] No constraints in a define_expand; just remove the "r" strings altogether. The expander ought to be able to require something more specific than general_operand here. I suppose that'll need some changes to the target-independent part of the patch. It will also mean changing the above to: [(match_operand 0 "pmode_register_operand") (match_operand 1 "pmode_register_operand")] since this pattern is used in situations where Pmode != SImode. > + if (ISA_HAS_SYNCI) > + { > + emit_insn (gen_synci_loop (copy_rtx (operands[0]), > + copy_rtx (operands[1]))); > + emit_insn (gen_clear_hazard ()); Odd indentation. No need to call copy_rtx here; the operands are only used this once. > + /* Flush both caches. We need to flush the data cache in case > + the system has a write-back cache. */ > + /* ??? Should check the return value for errors. */ > + if (mips_cache_flush_func && mips_cache_flush_func[0]) > + emit_library_call (gen_rtx_SYMBOL_REF (Pmode, mips_cache_flush_func), > + 0, VOIDmode, 3, copy_rtx (operands[0]), Pmode, > + copy_rtx (operands[1]), TYPE_MODE (integer_type_node), > + GEN_INT (3), TYPE_MODE (integer_type_node)); > + DONE; Same copy_rtx comment here. TBH, I'm not sure what the "???" comment refers to. > +(define_insn "synci_loop" > + [(unspec_volatile[(match_operand:SI 0 "general_operand" "r") > + (match_operand:SI 1 "general_operand" "r") > + (clobber (match_scratch:SI 2 "=0")) > + (clobber (match_scratch:SI 3 "=1")) > + (clobber (match_scratch:SI 4 "=r")) > + (clobber (match_scratch:SI 5 "=r"))] > + UNSPEC_SYNCI_LOOP)] Is there any particular need to force operand 2 to be the same as operand 0, and operand 3 to be the same as operand 1? Since these clobbers aren't earlyclobbers, I would have expected the register allocators to be able to reuse operands 0 and 1 as scratch registers if it thought that doing so was useful. _Forcing_ it to reuse them seems unnecessarily restrictive. Minor nit, but please use "d" rather than "r", for consistency with other mips.md patterns. I realise this pattern will never be used for MIPS16, but even so. > + "ISA_HAS_SYNCI" > +{ > + return ".set\tpush\n" > + "\t.set\tnoreorder\n" > + "\t.set\tnomacro\n" > + "\taddu\t%3,%0,%1\n" > + "\trdhwr\t%4,$1\n" > + "1:\tsynci\t0(%2)\n" > + "\tsltu\t%5,%2,%3\n" > + "\tbne\t%5,$0,1b\n" > + "\taddu\t%2,%2,%4\n" > + "\tsync\n" > + "\t.set\tpop"; > + } > + [(set_attr "length" "28")]) Since it's a fixed string, you could replace the "{ ... }" C code with: ".set\tpush\; .set\tnoreorder\; ..." I'm not sure if that's easier to read or not. However, we're eventually going to want this for MIPS64 too, and while we could use mode macros to parameterise it, I wonder if would be better to simply expand this as rtl, with special patterns for the "rdhwr", "synci" and "sync" instructions. This is similar to what we do for other largish patterns. That's certainly not a requirement though; I'm not even sure it makes sense. It's just an idea-cum-question. > +(define_insn "clear_hazard" > + [(unspec_volatile [(clobber (match_scratch:SI 0 "=r"))] UNSPEC_CLEAR_HAZARD) > + (clobber (reg:SI 31))] Clobbers aren't expected inside unspec_volatiles. I think this should be: [(unspec_volatile [(const_int 0)] UNSPEC_CLEAR_HAZARD) (clobber (match_scratch:SI 0 "=d")) (clobber (reg:SI 31))] (where "(const_int 0)" is the traditional "I don't take any arguments" hack) > + "ISA_HAS_SYNCI" > +{ > + return ".set\tpush\n" > + "\t.set\tnoreorder\n" > + "\t.set\tnomacro\n" > + "\tbal\t1f\n" > + "\tnop\n" > + "1:\taddiu\t%0,$31,12\n" > + "\tjr.hb\t%0\n" > + "\tnop\n" > + "\t.set\tpop"; > +} Following on from your comment about delay slots, it also strikes me that the "bal; nop; addiu" sequence is almost always longer than the sequence to load a local address into a register. I think the only exception is static n64 with -mno-sym32. I wonder if it would be worth using a sequence like: rtx label, target, insn; label = gen_label_rtx (); target = force_reg (Pmode, gen_rtx_LABEL_REF (Pmode, label)); insn = emit_jump_insn (gen_indirect_jump_hb (target)); REG_NOTES (insn) = gen_rtx_INSN_LIST (REG_LABEL, label, REG_NOTES (insn)); emit_label (label); where indirect_jump_hb would just be something like: (define_insn "indirect_jump_hb" [(set (pc) (match_operand "pmode_register_operand" "d")) (unspec_volatile [(const_int 0)] UNSPEC_JR_HB)] "ISA_HAS_SYNCI" "jr.hb\t%0" [(set_attr "type" "jump")]) (all untested). I think we might then be able to fill the delay slot from the target of the jump, which should be OK in this context. Again, this is just an idea-cum-question. The tests looked good, thanks. Richard