From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 1259 invoked by alias); 5 Jul 2007 07:38:43 -0000 Received: (qmail 1246 invoked by uid 22791); 5 Jul 2007 07:38:41 -0000 X-Spam-Check-By: sourceware.org Received: from smtp1.dnsmadeeasy.com (HELO smtp1.dnsmadeeasy.com) (205.234.170.134) by sourceware.org (qpsmtpd/0.31) with ESMTP; Thu, 05 Jul 2007 07:38:34 +0000 Received: from smtp1.dnsmadeeasy.com (localhost [127.0.0.1]) by smtp1.dnsmadeeasy.com (Postfix) with ESMTP id D5F222FADEE; Thu, 5 Jul 2007 07:38:30 +0000 (UTC) X-Authenticated-Name: js.dnsmadeeasy X-Transit-System: In case of SPAM please contact abuse@dnsmadeeasy.com Received: from avtrex.com (unknown [67.116.42.147]) by smtp1.dnsmadeeasy.com (Postfix) with ESMTP; Thu, 5 Jul 2007 07:38:30 +0000 (UTC) Received: from jennifer.localdomain ([192.168.7.221]) by avtrex.com with Microsoft SMTPSVC(6.0.3790.1830); Thu, 5 Jul 2007 00:38:27 -0700 Message-ID: <468C9FCA.4060406@avtrex.com> Date: Thu, 05 Jul 2007 07:50:00 -0000 From: David Daney User-Agent: Thunderbird 2.0.0.4 (X11/20070615) MIME-Version: 1.0 To: David Daney , gcc-patches@gcc.gnu.org, rsandifo@nildram.co.uk Subject: Re: [Patch] 2/3 MIPS support for builtin __builtin_flush_icache(). References: <468734D3.3020908@avtrex.com> <46873597.30607@avtrex.com> <877ipk8kab.fsf@firetop.home> In-Reply-To: <877ipk8kab.fsf@firetop.home> Content-Type: multipart/mixed; boundary="------------080607000509070301010102" X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2007-07/txt/msg00417.txt.bz2 This is a multi-part message in MIME format. --------------080607000509070301010102 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-length: 7305 Richard Sandiford wrote: > David Daney writes: > >> @@ -4171,6 +4173,72 @@ (define_insn "cprestore" >> } >> [(set_attr "type" "store") >> (set_attr "length" "4,12")]) >> + >> +(define_expand "flush_icache" >> + [(match_operand:SI 0 "general_operand" "r") >> + (match_operand:SI 1 "general_operand" "r")] >> > > No constraints in a define_expand; just remove the "r" strings altogether. > > Fixed. >> + if (ISA_HAS_SYNCI) >> + { >> + emit_insn (gen_synci_loop (copy_rtx (operands[0]), >> + copy_rtx (operands[1]))); >> + emit_insn (gen_clear_hazard ()); >> > > Odd indentation. No need to call copy_rtx here; the operands are only > used this once. > > Right. I guess I am a little unclear on when copy_rtx is required. >> + /* Flush both caches. We need to flush the data cache in case >> + the system has a write-back cache. */ >> + /* ??? Should check the return value for errors. */ >> + if (mips_cache_flush_func && mips_cache_flush_func[0]) >> + emit_library_call (gen_rtx_SYMBOL_REF (Pmode, mips_cache_flush_func), >> + 0, VOIDmode, 3, copy_rtx (operands[0]), Pmode, >> + copy_rtx (operands[1]), TYPE_MODE (integer_type_node), >> + GEN_INT (3), TYPE_MODE (integer_type_node)); >> + DONE; >> > > Same copy_rtx comment here. TBH, I'm not sure what the "???" comment > refers to. > > I just pased this part in from its old home in mips.h. The ??? comment came with it. It is gone now. >> +(define_insn "synci_loop" >> + [(unspec_volatile[(match_operand:SI 0 "general_operand" "r") >> + (match_operand:SI 1 "general_operand" "r") >> + (clobber (match_scratch:SI 2 "=0")) >> + (clobber (match_scratch:SI 3 "=1")) >> + (clobber (match_scratch:SI 4 "=r")) >> + (clobber (match_scratch:SI 5 "=r"))] >> + UNSPEC_SYNCI_LOOP)] >> > > Is there any particular need to force operand 2 to be the same as > operand 0, and operand 3 to be the same as operand 1? Since these > clobbers aren't earlyclobbers, I would have expected the register > allocators to be able to reuse operands 0 and 1 as scratch registers > if it thought that doing so was useful. _Forcing_ it to reuse them > seems unnecessarily restrictive. > > Minor nit, but please use "d" rather than "r", for consistency with > other mips.md patterns. I realise this pattern will never be used for > MIPS16, but even so. > The first one was to force the register to be reused so I didn't have to waste an instruction making a copy of it. The second was gratuitous. It is moot though as this part is gone now. > I wonder if would be better > to simply expand this as rtl, with special patterns for the "rdhwr", > "synci" and "sync" instructions. This is similar to what we do for > other largish patterns. That's certainly not a requirement though; > I'm not even sure it makes sense. It's just an idea-cum-question. > > That is what I did. >> +(define_insn "clear_hazard" >> + [(unspec_volatile [(clobber (match_scratch:SI 0 "=r"))] UNSPEC_CLEAR_HAZARD) >> + (clobber (reg:SI 31))] >> > > Clobbers aren't expected inside unspec_volatiles. I think this should be: > > [(unspec_volatile [(const_int 0)] UNSPEC_CLEAR_HAZARD) > (clobber (match_scratch:SI 0 "=d")) > (clobber (reg:SI 31))] > > (where "(const_int 0)" is the traditional "I don't take any arguments" hack) > > Fixed. >> + "ISA_HAS_SYNCI" >> +{ >> + return ".set\tpush\n" >> + "\t.set\tnoreorder\n" >> + "\t.set\tnomacro\n" >> + "\tbal\t1f\n" >> + "\tnop\n" >> + "1:\taddiu\t%0,$31,12\n" >> + "\tjr.hb\t%0\n" >> + "\tnop\n" >> + "\t.set\tpop"; >> +} >> > > Following on from your comment about delay slots, it also strikes me > that the "bal; nop; addiu" sequence is almost always longer than the > sequence to load a local address into a register. I think the only > exception is static n64 with -mno-sym32. I wonder if it would be > worth using a sequence like: > > rtx label, target, insn; > > label = gen_label_rtx (); > target = force_reg (Pmode, gen_rtx_LABEL_REF (Pmode, label)); > insn = emit_jump_insn (gen_indirect_jump_hb (target)); > REG_NOTES (insn) = gen_rtx_INSN_LIST (REG_LABEL, label, REG_NOTES (insn)); > emit_label (label); > > where indirect_jump_hb would just be something like: > > (define_insn "indirect_jump_hb" > [(set (pc) (match_operand "pmode_register_operand" "d")) > (unspec_volatile [(const_int 0)] UNSPEC_JR_HB)] > "ISA_HAS_SYNCI" > "jr.hb\t%0" > [(set_attr "type" "jump")]) > > (all untested). I think we might then be able to fill the delay slot > from the target of the jump, which should be OK in this context. > Again, this is just an idea-cum-question. > > I spent way too much time trying to get that to work. For now I left it mostly unchanged from the original patch. The problem is that the CFG things just don't seem to be able to handle: 1) Jump insns where the body is a (parallel ...) instead of a (set (pc) ...). 2) Unconditional jumps where the target and fall through edges goto the same place. I kept the call to mips_cache_flush_func pretty much unchanged because I didn't want to break compatibility with the command line options that allow this to be changed. Also I added CLEAR_INSN_CACHE that expands in libgcc that is used if __builtin___clear_cache() cannot be expanded in-line. This is slightly inefficient as it adds an extra function call layer through libgcc and has to do the arithmetic to convert the function parameters, but I am thinking I shouldn't be concerned about a few extra cycles doing this when we are about to do a system call. The alternative is to have __builtin___clear_cache() directly expand to mips_cache_flush_func and leave libgcc out of the picture for MIPS. Attached is the new version of the patch. Currently bootstrapping/testing on x86_64-unknown-linux-gnu, mipsel-unknown-linux-gnu (--with-arch=[mips32|mips32r2]). OK to commit if the target independent portion is approved? 2007-07-04 David Daney * config/mips/mips.h (mips_builtin_clear_cache_inline_p): Declare function. (TARGET_BUILTIN_CLEAR_CACHE_INLINE_P): Define target hook. (ISA_HAS_SYNCI): New target capability predicate. (CLEAR_INSN_CACHE): Define libgcc2 hook. (INITIALIZE_TRAMPOLINE): Emit flush_icache instead library call. * config/mips/netbsd.h (CLEAR_INSN_CACHE): Define libgcc2 hook. * config/mips/mips-protos.h (mips_expand_synci_loop): Declare function. * config/mips/mips.md (UNSPEC_CLEAR_HAZARD): New constant. (UNSPEC_RDHWR): Same. (UNSPEC_SYNCI): Same. (UNSPEC_SYNC): Same. (flush_icache): New expand. (clear_cache): Same. (sync): New insn. (synci): Same. (rdhwr): Same. (clear_hazard): Same. * config/mips/mips.c (mips_builtin_clear_cache_inline_p): New function. (mips_expand_synci_loop): Same. * testsuite/gcc.target/mips/clear-cache-1.c: New test. * testsuite/gcc.target/mips/clear-cache-2.c: New test. --------------080607000509070301010102 Content-Type: text/x-patch; name="flush-cache2.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="flush-cache2.diff" Content-length: 9970 Index: config/mips/mips.h =================================================================== --- config/mips/mips.h (revision 125997) +++ config/mips/mips.h (working copy) @@ -136,6 +136,10 @@ extern const struct mips_cpu_info mips_c extern const struct mips_cpu_info *mips_arch_info; extern const struct mips_cpu_info *mips_tune_info; extern const struct mips_rtx_cost_data *mips_cost; + +extern bool mips_builtin_clear_cache_inline_p (void); +#undef TARGET_BUILTIN_CLEAR_CACHE_INLINE_P +#define TARGET_BUILTIN_CLEAR_CACHE_INLINE_P mips_builtin_clear_cache_inline_p #endif /* Macros to silence warnings about numbers being signed in traditional @@ -770,6 +774,10 @@ extern const struct mips_rtx_cost_data * || ISA_MIPS32R2 \ || ISA_MIPS64 \ || TARGET_MIPS5500) + +/* ISA includes synci, jr.hb and jalr.hb */ +#define ISA_HAS_SYNCI ISA_MIPS32R2 + /* Add -G xx support. */ @@ -2108,6 +2116,17 @@ typedef struct mips_args { #define CACHE_FLUSH_FUNC "_flush_cache" #endif +/* Clear the instruction cache from `beg' to `end'. */ +#undef CLEAR_INSN_CACHE +#define CLEAR_INSN_CACHE(BEG, END) \ +{ \ + extern void _flush_cache (char *b, int l, int f); \ + if (__builtin_clear_cache_inline_p()) \ + __builtin___clear_cache ((BEG), (END)); \ + else \ + _flush_cache ((BEG), ((char *)(END) - (char *)(BEG)), 3); \ +} + /* A C statement to initialize the variable parts of a trampoline. ADDR is an RTX for the address of the trampoline; FNADDR is an RTX for the address of the nested function; STATIC_CHAIN is an @@ -2122,15 +2141,7 @@ typedef struct mips_args { chain_addr = plus_constant (func_addr, GET_MODE_SIZE (ptr_mode)); \ emit_move_insn (gen_rtx_MEM (ptr_mode, func_addr), FUNC); \ emit_move_insn (gen_rtx_MEM (ptr_mode, chain_addr), CHAIN); \ - \ - /* Flush both caches. We need to flush the data cache in case \ - the system has a write-back cache. */ \ - /* ??? Should check the return value for errors. */ \ - if (mips_cache_flush_func && mips_cache_flush_func[0]) \ - emit_library_call (gen_rtx_SYMBOL_REF (Pmode, mips_cache_flush_func), \ - 0, VOIDmode, 3, ADDR, Pmode, \ - GEN_INT (TRAMPOLINE_SIZE), TYPE_MODE (integer_type_node),\ - GEN_INT (3), TYPE_MODE (integer_type_node)); \ + emit_insn (gen_flush_icache (copy_rtx (ADDR), GEN_INT (TRAMPOLINE_SIZE))); \ } /* Addressing modes, and classification of registers for them. */ Index: config/mips/netbsd.h =================================================================== --- config/mips/netbsd.h (revision 125997) +++ config/mips/netbsd.h (working copy) @@ -190,6 +190,16 @@ Boston, MA 02110-1301, USA. */ #undef CACHE_FLUSH_FUNC #define CACHE_FLUSH_FUNC "_cacheflush" +/* Clear the instruction cache from `beg' to `end'. */ +#undef CLEAR_INSN_CACHE +#define CLEAR_INSN_CACHE(BEG, END) \ +{ \ + extern void _cacheflush (char *b, int l, int f); \ + if (__builtin_clear_cache_inline_p()) \ + __builtin___clear_cache ((BEG), (END)); \ + else \ + _cacheflush ((BEG), ((char *)(END) - (char *)(BEG)), 3); \ +} /* Make gcc agree with */ Index: config/mips/mips-protos.h =================================================================== --- config/mips/mips-protos.h (revision 125997) +++ config/mips/mips-protos.h (working copy) @@ -1,6 +1,6 @@ /* Prototypes of target machine for GNU compiler. MIPS version. Copyright (C) 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, - 1999, 2001, 2002, 2003, 2004, 2005 Free Software Foundation, Inc. + 1999, 2001, 2002, 2003, 2004, 2005, 2007 Free Software Foundation, Inc. Contributed by A. Lichnewsky (lich@inria.inria.fr). Changed by Michael Meissner (meissner@osf.org). 64-bit r4000 support by Ian Lance Taylor (ian@cygnus.com) and @@ -185,6 +185,7 @@ extern void mips_expand_call (rtx, rtx, extern void mips_emit_fcc_reload (rtx, rtx, rtx); extern void mips_set_return_address (rtx, rtx); extern bool mips_expand_block_move (rtx, rtx, rtx); +extern void mips_expand_synci_loop (rtx, rtx); extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx); extern void function_arg_advance (CUMULATIVE_ARGS *, enum machine_mode, Index: config/mips/mips.md =================================================================== --- config/mips/mips.md (revision 125997) +++ config/mips/mips.md (working copy) @@ -50,6 +50,10 @@ (define_constants (UNSPEC_TLS_GET_TP 28) (UNSPEC_MFHC1 31) (UNSPEC_MTHC1 32) + (UNSPEC_CLEAR_HAZARD 33) + (UNSPEC_RDHWR 34) + (UNSPEC_SYNCI 35) + (UNSPEC_SYNC 36) (UNSPEC_ADDRESS_FIRST 100) @@ -4171,6 +4175,82 @@ (define_insn "cprestore" } [(set_attr "type" "store") (set_attr "length" "4,12")]) + +;; Flush the instruction cache starting as operands[0] with length +;; operands[1]. +(define_expand "flush_icache" + [(match_operand:SI 0 "pmode_register_operand") + (match_operand:SI 1 "register_operand")] + "" + " +{ + if (ISA_HAS_SYNCI) + { + rtx end_addr = gen_reg_rtx (Pmode); + emit_insn (gen_rtx_SET (VOIDmode, end_addr, + gen_rtx_PLUS (Pmode, operands[0], operands[1]))); + emit_insn (gen_clear_cache (operands[0], end_addr)); + } + else + /* Flush both caches. We need to flush the data cache in case + the system has a write-back cache. */ + if (mips_cache_flush_func && mips_cache_flush_func[0]) + emit_library_call (gen_rtx_SYMBOL_REF (Pmode, mips_cache_flush_func), + 0, VOIDmode, 3, operands[0], Pmode, + operands[1], TYPE_MODE (integer_type_node), + GEN_INT (3), TYPE_MODE (integer_type_node)); + DONE; +}") + +;; Expand in-line code to clear the instruction cache between operand[0] and +;; operand[1]. +(define_expand "clear_cache" + [(match_operand:SI 0 "general_operand") + (match_operand:SI 1 "general_operand")] + "ISA_HAS_SYNCI" + " +{ + mips_expand_synci_loop (operands[0], operands[1]); + emit_insn (gen_clear_hazard ()); + DONE; +}") + +(define_insn "sync" + [(unspec_volatile [(const_int 0)] UNSPEC_SYNC)] + "" + "sync" + [(set_attr "length" "4")]) + +(define_insn "synci" + [(unspec_volatile [(match_operand:SI 0 "general_operand" "d")] UNSPEC_SYNCI)] + "" + "synci\t0(%0)" + [(set_attr "length" "4")]) + +(define_insn "rdhwr" + [(set (match_operand:SI 0 "general_operand" "=d") + (unspec_volatile [(match_operand:SI 1 "immediate_operand" "n")] + UNSPEC_RDHWR))] + "" + "rdhwr\t%0,$%1" + [(set_attr "length" "4")]) + +(define_insn "clear_hazard" + [(unspec_volatile [(const_int 0)] UNSPEC_CLEAR_HAZARD) + (clobber (reg:SI 31))] + "ISA_HAS_SYNCI" +{ + return ".set\tpush\n" + "\t.set\tnoreorder\n" + "\t.set\tnomacro\n" + "\tbal\t1f\n" + "\tnop\n" + "1:\taddiu\t$31,$31,12\n" + "\tjr.hb\t$31\n" + "\tnop\n" + "\t.set\tpop"; +} + [(set_attr "length" "20")]) ;; Block moves, see mips.c for more details. ;; Argument 0 is the destination Index: config/mips/mips.c =================================================================== --- config/mips/mips.c (revision 125997) +++ config/mips/mips.c (working copy) @@ -3759,6 +3759,47 @@ mips_block_move_loop (rtx dest, rtx src, mips_block_move_straight (dest, src, leftover); } + +/* Return true if we will expand __builtin_clear_cache() in-line. */ +bool +mips_builtin_clear_cache_inline_p (void) +{ + return ISA_HAS_SYNCI; +} + +/* Expand a loop of synci insns */ +void +mips_expand_synci_loop (rtx begin, rtx end) +{ + rtx end_addr, inc, label, label_ref, cmp, cmp_result; + + begin = force_reg(Pmode, begin); + end = force_reg(Pmode, end); + + /* Load INC with the cache line size (rdhwr INC,$1). */ + inc = gen_reg_rtx (SImode); + emit_insn (gen_rdhwr (inc ,gen_rtx_CONST_INT(SImode, 1))); + + /* Loop back to here. */ + label = gen_label_rtx (); + emit_label (label); + + emit_insn (gen_synci (begin)); + + cmp = gen_reg_rtx (SImode); + emit_insn (gen_rtx_SET (VOIDmode, cmp, gen_rtx_GTU (Pmode, begin, end))); + + emit_insn (gen_rtx_SET (VOIDmode, begin, gen_rtx_PLUS (Pmode, begin, inc))); + + label_ref = gen_rtx_LABEL_REF (VOIDmode, label); + cmp_result = gen_rtx_EQ (VOIDmode, cmp, gen_rtx_CONST_INT(SImode, 0)); + emit_jump_insn (gen_rtx_SET (VOIDmode, pc_rtx, + gen_rtx_IF_THEN_ELSE (VOIDmode, cmp_result, + label_ref, + pc_rtx))); + emit_insn (gen_sync ()); +} + /* Expand a movmemsi instruction. */ bool Index: testsuite/gcc.target/mips/clear-cache-1.c =================================================================== --- testsuite/gcc.target/mips/clear-cache-1.c (revision 0) +++ testsuite/gcc.target/mips/clear-cache-1.c (revision 0) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-mips-options "-O2 -mips32r2" } */ +/* { dg-final { scan-assembler "synci" } } */ +/* { dg-final { scan-assembler "jr.hb" } } */ +/* { dg-final { scan-assembler-not "__clear_cache" } } */ + +void f() +{ + int size = 40; + char *memory = __builtin_alloca(size); + __builtin___clear_cache(memory, memory + size); +} + Index: testsuite/gcc.target/mips/clear-cache-2.c =================================================================== --- testsuite/gcc.target/mips/clear-cache-2.c (revision 0) +++ testsuite/gcc.target/mips/clear-cache-2.c (revision 0) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-mips-options "-O2 -mips32" } */ +/* { dg-final { scan-assembler-not "synci" } } */ +/* { dg-final { scan-assembler-not "jr.hb" } } */ +/* { dg-final { scan-assembler "__clear_cache" } } */ + +void f() +{ + int size = 40; + char *memory = __builtin_alloca(size); + __builtin___clear_cache(memory, memory + size); +} + --------------080607000509070301010102--