From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-325014-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 23874 invoked by alias); 21 Aug 2012 20:24:16 -0000
Received: (qmail 23855 invoked by uid 22791); 21 Aug 2012 20:24:12 -0000
X-SWARE-Spam-Status: No, hits=-4.3 required=5.0	tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,KHOP_RCVD_TRUST,RCVD_IN_DNSWL_LOW,RCVD_IN_HOSTKARMA_YE
X-Spam-Check-By: sourceware.org
Received: from mail-wg0-f51.google.com (HELO mail-wg0-f51.google.com) (74.125.82.51)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Tue, 21 Aug 2012 20:23:58 +0000
Received: by wgbed3 with SMTP id ed3so148764wgb.8        for <gcc-patches@gcc.gnu.org>; Tue, 21 Aug 2012 13:23:57 -0700 (PDT)
Received: by 10.180.95.193 with SMTP id dm1mr39145692wib.10.1345580636886;        Tue, 21 Aug 2012 13:23:56 -0700 (PDT)
Received: from localhost ([2.26.188.227])        by mx.google.com with ESMTPS id j6sm6688777wiy.4.2012.08.21.13.23.49        (version=TLSv1/SSLv3 cipher=OTHER);        Tue, 21 Aug 2012 13:23:53 -0700 (PDT)
From: Richard Sandiford <rdsandiford@googlemail.com>
To: Sandra Loosemore <sandra@codesourcery.com>
Mail-Followup-To: Sandra Loosemore <sandra@codesourcery.com>,<gcc-patches@gcc.gnu.org>, rdsandiford@googlemail.com
Cc: <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH, MIPS] fix MIPS16 jump table overflow
References: <5032EAAA.2070703@codesourcery.com>
Date: Tue, 21 Aug 2012 20:24:00 -0000
In-Reply-To: <5032EAAA.2070703@codesourcery.com> (Sandra Loosemore's message	of "Mon, 20 Aug 2012 19:55:54 -0600")
Message-ID: <87boi456zv.fsf@talisman.home>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2012-08/txt/msg01450.txt.bz2

Sandra Loosemore <sandra@codesourcery.com> writes:
> In config/mips/mips.h, there is presently this comment:
>
> /* ??? 16-bit offsets can overflow in large functions.  */
> #define TARGET_MIPS16_SHORT_JUMP_TABLES TARGET_MIPS16_TEXT_LOADS
>
> A while ago we had a bug report where a big switch statement did, in 
> fact, overflow the range of 16-bit offsets, causing a runtime error.
>
> GCC already has generic machinery to shorten offset tables for switch 
> statements that does the necessary range checking, but it only works 
> with "casesi", not the lower-level "tablejump" expansion.  So, this 
> patch provides a "casesi" expander to handle this situation.

Nice.

> This patch has been in use on our local 4.5 and 4.6 branches for about a 
> year now.  When testing it on mainline, I found it tripped over the 
> recent change to add MIPS16 branch overflow checking in other 
> situations, causing it to get into an infinite loop.  I think telling it 
> to ignore these new jump insns it doesn't know how to process is the 
> right thing to do, but I'm not sure if there's a better way to restrict 
> the condition or make mips16_split_long_branches more robust.  Richard,
> since that's your code I assume you'll suggest an alternative if this 
> doesn't meet with your approval.

Changing it to:

	if (JUMP_P (insn)
	    && USEFUL_INSN_P (insn)
	    && get_attr_length (insn) > 8
	    && (any_condjump_p (insn) || any_uncond_jump_p (insn)))

should be OK.

> @@ -5937,6 +5933,91 @@
>    [(set_attr "type" "jump")
>     (set_attr "mode" "none")])
>  
> +;; For MIPS16, we don't know whether a given jump table will use short or
> +;; word-sized offsets until late in compilation, when we are able to determine
> +;; the sizes of the insns which comprise the containing function.  This
> +;; necessitates the use of the casesi rather than the tablejump pattern, since
> +;; the latter tries to calculate the index of the offset to jump through early
> +;; in compilation, i.e. at expand time, when nothing is known about the
> +;; eventual function layout.
> +
> +(define_expand "casesi"
> +  [(match_operand:SI 0 "register_operand" "")	; index to jump on
> +   (match_operand:SI 1 "const_int_operand" "")	; lower bound
> +   (match_operand:SI 2 "const_int_operand" "")	; total range
> +   (match_operand:SI 3 "" "")			; table label
> +   (match_operand:SI 4 "" "")]			; out of range label

The last two are Pmode rather than SImode.  Since there aren't different
case* patterns for different Pmodes, we can't use :P instead, so let's just
drop the modes on 4 and 5.

Would be nice to add a compile test for -mabi=64 just to make sure
that Pmode == DImode works.  A copy of an existing test like
code-readable-1.c would be fine.

> +(define_insn "casesi_internal_mips16"
> +  [(set (pc)
> +     (if_then_else
> +       (leu (match_operand:SI 0 "register_operand" "d")
> +	    (match_operand:SI 1 "arith_operand" "dI"))
> +       (mem:SI (plus:SI (mult:SI (match_dup 0) (const_int 4))
> +			(label_ref (match_operand 2 "" ""))))
> +       (label_ref (match_operand 3 "" ""))))
> +   (clobber (match_scratch:SI 4 "=&d"))
> +   (clobber (match_scratch:SI 5 "=d"))
> +   (clobber (reg:SI MIPS16_T_REGNUM))
> +   (use (label_ref (match_dup 2)))]

Although this is descriptive, the MEM is probably more trouble
than it's worth.  A hard-coded MEM like this will alias a lot
of things, whereas we're only reading from the function itself.
I think an unspec would be better.

This pattern should have :P for operands 4 and 5, with the pattern
name becoming:

"casesi_internal_mips16_<mode>"

PMODE_INSN should make it easy to wrap up the difference.

There shouldn't be any need for the final USE.  Let me know
if you found otherwise, because that sounds like a bug.

> +  "TARGET_MIPS16_SHORT_JUMP_TABLES"
> +{
> +  rtx diff_vec = PATTERN (next_real_insn (operands[2]));
> +
> +  gcc_assert (GET_CODE (diff_vec) == ADDR_DIFF_VEC);
> +  
> +  output_asm_insn ("sltu\t%0, %1", operands);
> +  output_asm_insn ("bteqz\t%3", operands);
> +  output_asm_insn ("la\t%4, %2", operands);
> +  
> +  switch (GET_MODE (diff_vec))
> +    {
> +    case HImode:
> +      output_asm_insn ("sll\t%5, %0, 1", operands);
> +      output_asm_insn ("addu\t%5, %4, %5", operands);
> +      output_asm_insn ("lh\t%5, 0(%5)", operands);
> +      break;
> +    
> +    case SImode:
> +      output_asm_insn ("sll\t%5, %0, 2", operands);
> +      output_asm_insn ("addu\t%5, %4, %5", operands);
> +      output_asm_insn ("lw\t%5, 0(%5)", operands);
> +      break;
> +
> +    default:
> +      gcc_unreachable ();
> +    }
> +  
> +  output_asm_insn ("addu\t%4, %4, %5", operands);
> +  
> +  return "j\t%4";
> +}
> +  [(set_attr "length" "32")])

The "addu"s here ought to be "<d>addu"s after the :P change.

I think we can avoid the earlyclobber on operand 4 by moving the LA
after the SLL.

> +#define CASE_VECTOR_MODE ptr_mode
> +
> +/* Only use short offsets if their range will not overflow.  */
> +#define CASE_VECTOR_SHORTEN_MODE(MIN, MAX, BODY) \
> +  (TARGET_MIPS16_SHORT_JUMP_TABLES && ((MIN) >= -32768 && (MAX) < 32768) \
> +   ? HImode : ptr_mode)

ptr_mode should be SImode here, to match the switch above.
We don't even begin to support functions bigger than 2GB. :-)

> Index: gcc/testsuite/gcc.target/mips/code-readable-1.c
> ===================================================================
> --- gcc/testsuite/gcc.target/mips/code-readable-1.c	(revision 190463)
> +++ gcc/testsuite/gcc.target/mips/code-readable-1.c	(working copy)
> @@ -25,7 +25,7 @@ bar (void)
>  }
>  
>  /* { dg-final { scan-assembler "\tla\t" } } */
> -/* { dg-final { scan-assembler "\t\\.half\t" } } */
> +/* { dg-final { scan-assembler "\t\\.\(half\|word\)\t\\.L\[0-9\]*-\\.L\[0-9\]\*" } } */
>  /* { dg-final { scan-assembler-not "%hi\\(\[^)\]*L" } } */
>  /* { dg-final { scan-assembler-not "%lo\\(\[^)\]*L" } } */
>  

I'd prefer to add -O and stick to requiring .half.

We call shorten_branches several times, so I was worried that we
might be shortening the cases prematurely.  It looks like later
calls are already able to undo the shortening though, so hopefully
that isn't a problem.

Richard