Need advice on bounds checking approaches

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* Need advice on bounds checking approaches
@ 2000-03-24 11:18 Greg McGary
  2000-03-24 12:08 ` Joern Rennecke
  0 siblings, 1 reply; 38+ messages in thread
From: Greg McGary @ 2000-03-24 11:18 UTC (permalink / raw)
  To: gcc; +Cc: gkm

The final implementation phase for bounded pointers is to generate the
code that does the checks.  There are some choices to make, and I'd
appreciate hearing from experienced maintainers how best to do it.

The primary goal is to do this in a way that supports optimizing away
redundant checks.

Checks need to be generated at the time of pointer dereference or
array reference.  Let's focus on pointer dereference.

Consider this code:

	void
	foo (char *p)
	{
	  *p = 1;
	}

Recall that the type char * has been transformed internally to a
bounded pointer type like this:

	struct charp { char *value, *base, *extent; };

So, our function is internally equivalent to this:

	void
	foo (struct charp p)
	{
	  *p.value = 1;
	}

One place to generate checks is in the IR, transforming the function
into something like this:

	void
	foo (struct charp p)
	{
	  *({ if (p.value < p.base || p.value >= p.extent)
		 abort ();
	      p.value; }) = 1;
	}

This is easy to implement, but seems to have drawbacks for
optimization: The if statement expands to a sequence of jumps whose
presence creates new basic-block boundaries.  Also, the resulting RTL
isn't readily identifiable as a bounds check

An alternate approach is to transform the function like so:

	void
	foo (struct charp p)
	{
	  *__builtin_check_bounds (p) = 1;
	}

Where __builtin_check_bounds accepts a bounded pointer argument and
returns a simple pointer value, and as a side effect, injects
bounds-checking RTL nodes into the insn stream

	DEF_RTL_EXPR(CHECK_BOUNDS, "check_bounds", "eee", 'x')

The three args are pointer value, base & extent.

Now, the optimization passes (CSE, most likely), can easily identify
redundant checks and eliminate them.  Moreover, if a machine has a
bounds-checking instruction, or a better than normal insn sequence for
bounds checking, it can define an insn to recognize the "check_bounds"
pattern.

E.g., the most efficient way to check bounds on i960 is with this
sequence (ptr & bas stand for registers holding those BP components):

	cmpo	ptr, bas	; cc=100 on failure
	concmp	ext, ptr	; cc=010 on failure
	faultle.f

If a machine can't do anything better then it will need to default
to something like this (in pseudo asm):

	cmp	ptr, bas
	blt	0f
	cmp	ptr, ext
	blt	1f
    0:	call	abort
    1:

Question: with the above plan, is there a way to provide a default
expansion of the "check_bounds" pattern into primitive RTL
(comparisons, conditional branches and call to abort) for those
targets that don't define an insn for "check_bounds"?
The i960 will define an insn for this, so that it can do better,
but most other targets won't be able to do better and it would be nice
to avoid having to hack every MD file.

Is there some other, better way to go?

Thanks,
Greg

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-24 11:18 Need advice on bounds checking approaches Greg McGary
@ 2000-03-24 12:08 ` Joern Rennecke
  2000-03-24 12:28   ` Greg McGary
  0 siblings, 1 reply; 38+ messages in thread
From: Joern Rennecke @ 2000-03-24 12:08 UTC (permalink / raw)
  To: Greg McGary; +Cc: gcc

> Question: with the above plan, is there a way to provide a default
> expansion of the "check_bounds" pattern into primitive RTL
> (comparisons, conditional branches and call to abort) for those
> targets that don't define an insn for "check_bounds"?

You can use HAVE_check_bounds to test if a "check_bounds" pattern has been
defined in the md file.  If it is defined, you can use gen_check_bounds
to generate this pattern.  The expander might fail (e.g. because the
target can implement the bounds checking insn only for a particular
cpu_, in which case you get a zero return value from gen_check_bounds.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-24 12:08 ` Joern Rennecke
@ 2000-03-24 12:28   ` Greg McGary
  2000-03-24 16:18     ` Jeffrey A Law
  0 siblings, 1 reply; 38+ messages in thread
From: Greg McGary @ 2000-03-24 12:28 UTC (permalink / raw)
  To: Joern Rennecke; +Cc: gcc

Joern Rennecke <amylaar@cygnus.co.uk> writes:

> > Question: with the above plan, is there a way to provide a default
> > expansion of the "check_bounds" pattern into primitive RTL
> > (comparisons, conditional branches and call to abort) for those
> > targets that don't define an insn for "check_bounds"?
> 
> You can use HAVE_check_bounds to test if a "check_bounds" pattern has been
> defined in the md file.  If it is defined, you can use gen_check_bounds
> to generate this pattern.  The expander might fail (e.g. because the
> target can implement the bounds checking insn only for a particular
> cpu_, in which case you get a zero return value from gen_check_bounds.

I am aware of HAVE_check_bounds.  Unfortunately, it doesn't do what I
want.  If HAVE_check_bounds, I could generate "check_bounds", and if
! HAVE_check_bounds, I could generate the primitive RTL (compares +
conditional branches + abort).  If ! HAVE_check_bounds, then I lose
the ability to conveniently optimize away redundant checks, and my
basic blocks get sliced up by the presence of extra branches.

What I want is to always generate check_bounds insns, always optimize
away redundant checks the same way, and then provide a default
*expansion* if the MD doesn't define one.

Is that feasible, or am I dreaming?

Greg

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-24 12:28   ` Greg McGary
@ 2000-03-24 16:18     ` Jeffrey A Law
  2000-03-24 16:50       ` Greg McGary
  0 siblings, 1 reply; 38+ messages in thread
From: Jeffrey A Law @ 2000-03-24 16:18 UTC (permalink / raw)
  To: Greg McGary; +Cc: Joern Rennecke, gcc

  In message < msitycyuyg.fsf@gkm-dsl-194.ascend.com >you write:
  > I am aware of HAVE_check_bounds.  Unfortunately, it doesn't do what I
  > want.  If HAVE_check_bounds, I could generate "check_bounds", and if
  > ! HAVE_check_bounds, I could generate the primitive RTL (compares +
  > conditional branches + abort).  If ! HAVE_check_bounds, then I lose
  > the ability to conveniently optimize away redundant checks, and my
  > basic blocks get sliced up by the presence of extra branches.
  > 
  > What I want is to always generate check_bounds insns, always optimize
  > away redundant checks the same way, and then provide a default
  > *expansion* if the MD doesn't define one.
  > 
  > Is that feasible, or am I dreaming?
Right now, you're dreaming.

We have kicked around the idea of two levels of RTL where we do some
optimizations on the higher level RTL, then drop down to a lower level
RTL.

Maybe it would make sense to do your optimizations at the tree level?  We're
working on functions-as-trees for the C front-end.  Just a thought.

jeff
  > 
  > Greg
  > 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-24 16:18     ` Jeffrey A Law
@ 2000-03-24 16:50       ` Greg McGary
  2000-03-24 17:27         ` Jamie Lokier
  2000-03-27 12:30         ` Jeffrey A Law
  0 siblings, 2 replies; 38+ messages in thread
From: Greg McGary @ 2000-03-24 16:50 UTC (permalink / raw)
  To: law; +Cc: Joern Rennecke, gcc

Jeffrey A Law <law@cygnus.com> writes:

> We have kicked around the idea of two levels of RTL where we do some
> optimizations on the higher level RTL, then drop down to a lower level
> RTL.

Perhaps I could do that in a limited way, by adding a BP-specific
optimization pass after loop optimization that eliminates redundant
checks and expands "check_bounds" into primitive RTL for targets that
don't HAVE_check_bounds.  Could that pass muster?

> Maybe it would make sense to do your optimizations at the tree level?  We're
> working on functions-as-trees for the C front-end.  Just a thought.

It's a fine thought, but I can't do that today, can I?  I only see
whole function mode being used for C++ inlines.

Greg

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-24 16:50       ` Greg McGary
@ 2000-03-24 17:27         ` Jamie Lokier
  2000-03-27 12:30         ` Jeffrey A Law
  1 sibling, 0 replies; 38+ messages in thread
From: Jamie Lokier @ 2000-03-24 17:27 UTC (permalink / raw)
  To: Greg McGary; +Cc: law, Joern Rennecke, gcc

Greg McGary wrote:
> > We have kicked around the idea of two levels of RTL where we do some
> > optimizations on the higher level RTL, then drop down to a lower level
> > RTL.
> 
> Perhaps I could do that in a limited way, by adding a BP-specific
> optimization pass after loop optimization that eliminates redundant
> checks and expands "check_bounds" into primitive RTL for targets that
> don't HAVE_check_bounds.  Could that pass muster?

The general idea works ok for constant_p.  But that is removed
relatively early.

-- Jamie

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-24 16:50       ` Greg McGary
  2000-03-24 17:27         ` Jamie Lokier
@ 2000-03-27 12:30         ` Jeffrey A Law
  2000-03-27 12:45           ` Mark Mitchell
  2000-03-27 13:05           ` Greg McGary
  1 sibling, 2 replies; 38+ messages in thread
From: Jeffrey A Law @ 2000-03-27 12:30 UTC (permalink / raw)
  To: Greg McGary; +Cc: Joern Rennecke, gcc

  In message < msvh2byiu2.fsf@gkm-dsl-194.ascend.com >you write:
  > Jeffrey A Law <law@cygnus.com> writes:
  > 
  > > We have kicked around the idea of two levels of RTL where we do some
  > > optimizations on the higher level RTL, then drop down to a lower level
  > > RTL.
  > 
  > Perhaps I could do that in a limited way, by adding a BP-specific
  > optimization pass after loop optimization that eliminates redundant
  > checks and expands "check_bounds" into primitive RTL for targets that
  > don't HAVE_check_bounds.  Could that pass muster?
Possibly.  However, I'm not a big fan of having feature specific
optimization passes, even if we've had some in the past (addressof & 
constant_p).

I'd be much happier if we could find a clean way to get the optimizations
you need using our existing optimizers.

  > It's a fine thought, but I can't do that today, can I?  I only see
  > whole function mode being used for C++ inlines.
That's not my understanding -- you'd probably need to talk to Mark -- I
was under the impression that we always did function at a time stuff
for C++.
Jeff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-27 12:30         ` Jeffrey A Law
@ 2000-03-27 12:45           ` Mark Mitchell
  2000-03-27 13:05           ` Greg McGary
  1 sibling, 0 replies; 38+ messages in thread
From: Mark Mitchell @ 2000-03-27 12:45 UTC (permalink / raw)
  To: law; +Cc: gkm, amylaar, gcc

>>>>> "Jeffrey" == Jeffrey A Law <law@cygnus.com> writes:

    >> It's a fine thought, but I can't do that today, can I?  I only
    >> see whole function mode being used for C++ inlines.

    Jeffrey> That's not my understanding -- you'd probably need to
    Jeffrey> talk to Mark -- I was under the impression that we always
    Jeffrey> did function at a time stuff for C++. 

Jeff's understanding is correct.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-27 12:30         ` Jeffrey A Law
  2000-03-27 12:45           ` Mark Mitchell
@ 2000-03-27 13:05           ` Greg McGary
  2000-03-27 13:54             ` Geoff Keating
  2000-03-29 12:22             ` Jeffrey A Law
  1 sibling, 2 replies; 38+ messages in thread
From: Greg McGary @ 2000-03-27 13:05 UTC (permalink / raw)
  To: law; +Cc: Joern Rennecke, gcc

Jeffrey A Law <law@cygnus.com> writes:

>   > Perhaps I could do that in a limited way, by adding a BP-specific
>   > optimization pass after loop optimization that eliminates redundant
>   > checks and expands "check_bounds" into primitive RTL for targets that
>   > don't HAVE_check_bounds.  Could that pass muster?
> 
> Possibly.  However, I'm not a big fan of having feature specific
> optimization passes, even if we've had some in the past (addressof & 
> constant_p).
> 
> I'd be much happier if we could find a clean way to get the optimizations
> you need using our existing optimizers.

We can do without a special pass.  CSE seems the best place to
eliminate redundant checks, so we'd just need a little bit of code to
handle the new check_bounds_rtx.  We can just forget about doing
default translation of check_bounds_rtx into primitive RTL, and
require that targets supporting bounds checking handle it in the
machine description.  So far, all of the targets I want to support
initially (i960, PowerPC, i386, MIPS) will benefit from special
handling in the MD file anyway.

Does that sound reasonable?

>   > It's a fine thought, but I can't do that today, can I?  I only see
>   > whole function mode being used for C++ inlines.
> 
> That's not my understanding -- you'd probably need to talk to Mark -- I
> was under the impression that we always did function at a time stuff
> for C++.

You're right.  I was careless in my initial reading of the condition
surrounding the setting of cfun->x_whole_function_mode_p.  Even so,
this doesn't help me for C or ObjC, unless those will be converted
soon.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-27 13:05           ` Greg McGary
@ 2000-03-27 13:54             ` Geoff Keating
  2000-03-27 14:21               ` Greg McGary
  2000-03-29 12:22             ` Jeffrey A Law
  1 sibling, 1 reply; 38+ messages in thread
From: Geoff Keating @ 2000-03-27 13:54 UTC (permalink / raw)
  To: Greg McGary; +Cc: gcc

Greg McGary <gkm@eng.ascend.com> writes:

> We can do without a special pass.  CSE seems the best place to
> eliminate redundant checks, so we'd just need a little bit of code to
> handle the new check_bounds_rtx.  We can just forget about doing
> default translation of check_bounds_rtx into primitive RTL, and
> require that targets supporting bounds checking handle it in the
> machine description.  So far, all of the targets I want to support
> initially (i960, PowerPC, i386, MIPS) will benefit from special
> handling in the MD file anyway.
> 
> Does that sound reasonable?

Why would these targets need special handling in the MD file?

I can't think of any support on ppc for this that isn't already
represented in the machine description through the TRAP RTXs.

-- 
- Geoffrey Keating <geoffk@cygnus.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-27 13:54             ` Geoff Keating
@ 2000-03-27 14:21               ` Greg McGary
  2000-03-27 14:30                 ` Jeffrey A Law
                                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Greg McGary @ 2000-03-27 14:21 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc

Geoff Keating <geoffk@cygnus.com> writes:

> Greg McGary <gkm@eng.ascend.com> writes:
> 
> > We can do without a special pass.  CSE seems the best place to
> > eliminate redundant checks, so we'd just need a little bit of code to
> > handle the new check_bounds_rtx.  We can just forget about doing
> > default translation of check_bounds_rtx into primitive RTL, and
> > require that targets supporting bounds checking handle it in the
> > machine description.  So far, all of the targets I want to support
> > initially (i960, PowerPC, i386, MIPS) will benefit from special
> > handling in the MD file anyway.
> > 
> > Does that sound reasonable?
> 
> Why would these targets need special handling in the MD file?

I didn't say "need", I said "benefit from".

For i960, this sequence is optimal:

	cmpo	ptr, bas
	concmp	ext, ptr
	faultle.f

For i386, it makes sense the use the `bound' instruction.

For MIPS, I'll want to generate a `break' instruction.

For PowerPC, I'll want to use conditional trap instructions.

> I can't think of any support on ppc for this that isn't already
> represented in the machine description through the TRAP RTXs.

Conditional traps aren't widely available (only i960, m68k, powerpc,
sparc), so that approach doesn't help all that much for portability.
Is there something more that I'm missing?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-27 14:21               ` Greg McGary
@ 2000-03-27 14:30                 ` Jeffrey A Law
  2000-03-27 19:23                   ` Michael Hayes
  2000-03-27 14:34                 ` Geoff Keating
  2000-03-27 22:07                 ` Greg McGary
  2 siblings, 1 reply; 38+ messages in thread
From: Jeffrey A Law @ 2000-03-27 14:30 UTC (permalink / raw)
  To: Greg McGary; +Cc: Geoff Keating, gcc

  In message < ms8zz4rr51.fsf@gkm-dsl-194.ascend.com >you write:
  > > I can't think of any support on ppc for this that isn't already
  > > represented in the machine description through the TRAP RTXs.
  > 
  > Conditional traps aren't widely available (only i960, m68k, powerpc,
  > sparc), so that approach doesn't help all that much for portability.
  > Is there something more that I'm missing?
The PA has conditional traps.  We've just never bothered adding them
the MD file.

jeff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-27 14:30                 ` Jeffrey A Law
@ 2000-03-27 19:23                   ` Michael Hayes
  0 siblings, 0 replies; 38+ messages in thread
From: Michael Hayes @ 2000-03-27 19:23 UTC (permalink / raw)
  To: law; +Cc: Greg McGary, Geoff Keating, gcc

Jeffrey A Law writes:
 > 
 >   In message < ms8zz4rr51.fsf@gkm-dsl-194.ascend.com >you write:
 >   > > I can't think of any support on ppc for this that isn't already
 >   > > represented in the machine description through the TRAP RTXs.
 >   > 
 >   > Conditional traps aren't widely available (only i960, m68k, powerpc,
 >   > sparc), so that approach doesn't help all that much for portability.
 >   > Is there something more that I'm missing?
 > The PA has conditional traps.  We've just never bothered adding them
 > the MD file.

Ditto C3x/C4x.

Michael.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-27 14:21               ` Greg McGary
  2000-03-27 14:30                 ` Jeffrey A Law
@ 2000-03-27 14:34                 ` Geoff Keating
  2000-03-27 22:07                 ` Greg McGary
  2 siblings, 0 replies; 38+ messages in thread
From: Geoff Keating @ 2000-03-27 14:34 UTC (permalink / raw)
  To: gkm; +Cc: gcc

> Cc: gcc@gcc.gnu.org
> From: Greg McGary <gkm@eng.ascend.com>
> Date: 27 Mar 2000 15:21:14 -0700

> For i960, this sequence is optimal:
> 
> 	cmpo	ptr, bas
> 	concmp	ext, ptr
> 	faultle.f
> 
> For i386, it makes sense the use the `bound' instruction.
> 
> For MIPS, I'll want to generate a `break' instruction.
> 
> For PowerPC, I'll want to use conditional trap instructions.
> 
> > I can't think of any support on ppc for this that isn't already
> > represented in the machine description through the TRAP RTXs.
> 
> Conditional traps aren't widely available (only i960, m68k, powerpc,
> sparc), so that approach doesn't help all that much for portability.
> Is there something more that I'm missing?

They're not widely available because no-one has bothered to implement
them in the MD files for other targets.  The hardware support is
available on many platforms, including i386.

For those platforms that don't have conditional traps, then presumably
there is some way to generate an unconditional trap---if there is no
guaranteed illegal instruction, and reading from location 0 won't do
it, you can always branch to abort().  The 'break' instruction is, I
believe, an unconditional trap for MIPS.  If only unconditional traps
are available, conditional traps are handled with branches, as you'd
expect.

More importantly, the trap infrastructure is used in a number of other
features in the compiler, so by implementing it on an architecture,
you get a bunch of stuff for free.

-- 
- Geoffrey Keating <geoffk@cygnus.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-27 14:21               ` Greg McGary
  2000-03-27 14:30                 ` Jeffrey A Law
  2000-03-27 14:34                 ` Geoff Keating
@ 2000-03-27 22:07                 ` Greg McGary
  2000-03-28  1:55                   ` Richard Henderson
  2000-03-28  7:05                   ` Jeffrey A Law
  2 siblings, 2 replies; 38+ messages in thread
From: Greg McGary @ 2000-03-27 22:07 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc

Greg McGary <gkm@eng.ascend.com> writes:

> For i386, it makes sense the use the `bound' instruction.

I should qualify that: `bound' is optimal for space, but
it might run faster to decompose into register compares.
I haven't analyzed it yet.

> For MIPS, I'll want to generate a `break' instruction.

I now see that since ISA II, mips has had conditional trap insns, so
we should use those.

> > I can't think of any support on ppc for this that isn't already
> > represented in the machine description through the TRAP RTXs.
> 
> Conditional traps aren't widely available (only i960, m68k, powerpc,
> sparc), so that approach doesn't help all that much for portability.
> Is there something more that I'm missing?

Add PA, C[34]x and MIPS to that list.  Others can be simulated with
conditional branches around unconditional traps.  We still need CSE to
optimize away redundant conditional traps.  An advantage to emitting
separate conditional traps for low and high bounds checks is that the
separate lower-bound check can be more easily be moved out of a loop
for monotonically increasing pointer livs.

Greg

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-27 22:07                 ` Greg McGary
@ 2000-03-28  1:55                   ` Richard Henderson
  2000-03-28  7:05                   ` Jeffrey A Law
  1 sibling, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2000-03-28  1:55 UTC (permalink / raw)
  To: Greg McGary; +Cc: Geoff Keating, gcc

On Mon, Mar 27, 2000 at 11:07:11PM -0700, Greg McGary wrote:
> > For i386, it makes sense the use the `bound' instruction.
> 
> I should qualify that: `bound' is optimal for space, but
> it might run faster to decompose into register compares.

`bound' is a horribly slow instruction on all processors.

> Add PA, C[34]x and MIPS to that list.  Others can be simulated with
> conditional branches around unconditional traps.

Alpha has a particular sequence that's supposed to be
used to specifically report out of bounds conditions:

	ldi $16, GEN_SUBRNG
	call_pal gentrap


r~

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-27 22:07                 ` Greg McGary
  2000-03-28  1:55                   ` Richard Henderson
@ 2000-03-28  7:05                   ` Jeffrey A Law
  2000-03-28  9:28                     ` Greg McGary
  2000-03-28  9:57                     ` Joern Rennecke
  1 sibling, 2 replies; 38+ messages in thread
From: Jeffrey A Law @ 2000-03-28  7:05 UTC (permalink / raw)
  To: Greg McGary; +Cc: Geoff Keating, gcc

  In message < ms3dpbr5kg.fsf@gkm-dsl-194.ascend.com >you write:
  > optimize away redundant conditional traps.  An advantage to emitting
  > separate conditional traps for low and high bounds checks is that the
  > separate lower-bound check can be more easily be moved out of a loop
  > for monotonically increasing pointer livs.
I hadn't thought of that.  It assumes you can't overflow, but that's
probably a reasonable simplification for pointer operations.

Is there any advantage to not always emitting the high/low bounds checks
separately?

jeff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-28  7:05                   ` Jeffrey A Law
@ 2000-03-28  9:28                     ` Greg McGary
  2000-03-28  9:48                       ` Jeffrey A Law
  2000-03-28  9:57                     ` Joern Rennecke
  1 sibling, 1 reply; 38+ messages in thread
From: Greg McGary @ 2000-03-28  9:28 UTC (permalink / raw)
  To: law; +Cc: Geoff Keating, gcc

Jeffrey A Law <law@cygnus.com> writes:

> Is there any advantage to not always emitting the high/low bounds checks
> separately?

For i960 there's a small advantage that if you emit checks together,
you can use concmp and save one instruction ("cmpo; concmpo; fault"
vs. "cmpo; fault; cmpo; fault").  Big deal...

For i386, if you check both at once, you can use the `bound'
instruction, which is slower, but very compact.

In summary, there will be some small space advantage for any target
that has special instructions that support bounds checking.  For
multiple-issue CPUs, the space advantage will come at the expense of
runtime, because the checks would likely run faster if they were
broken apart to improve scheduling.  It should be easy enough to
implement each and benchmark to quantify the difference, or even to
implement both, and choose at compile time based on the value of
`optimize_size'.

Greg

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-28  9:28                     ` Greg McGary
@ 2000-03-28  9:48                       ` Jeffrey A Law
  2000-03-28 11:30                         ` Geoff Keating
  0 siblings, 1 reply; 38+ messages in thread
From: Jeffrey A Law @ 2000-03-28  9:48 UTC (permalink / raw)
  To: Greg McGary; +Cc: Geoff Keating, gcc

  In message < msem8vovhj.fsf@gkm-dsl-194.ascend.com >you write:
  > Jeffrey A Law <law@cygnus.com> writes:
  > 
  > > Is there any advantage to not always emitting the high/low bounds checks
  > > separately?
  > 
  > For i960 there's a small advantage that if you emit checks together,
  > you can use concmp and save one instruction ("cmpo; concmpo; fault"
  > vs. "cmpo; fault; cmpo; fault").  Big deal...
That indicates to me that we have 3 primitives.

  1. check high bounds
  2. check low bounds
  3. check high and low bounds

When we generate code we should first check for the availability of #3, then
fall back to #1 & #2, then fall back to whatever scheme we can devise using
unconditional break/trap instructions or abort calls.


  > For i386, if you check both at once, you can use the `bound'
  > instruction, which is slower, but very compact.
That's easily handled by making the check high & low bounds instruction
be conditional on optimize_size and provide additional patterns to theck
the high bound and another pattern to check the low bound.


  > In summary, there will be some small space advantage for any target
  > that has special instructions that support bounds checking.  For
  > multiple-issue CPUs, the space advantage will come at the expense of
  > runtime, because the checks would likely run faster if they were
  > broken apart to improve scheduling.  It should be easy enough to
  > implement each and benchmark to quantify the difference, or even to
  > implement both, and choose at compile time based on the value of
  > `optimize_size'.
Well, I'm not 100% you'll see any significant optimization advantage
due to better scheduling -- it largely depends on how the compiler 
treats the bounds checking instructions.

For example, if the compiler treats them as ending the current basic block,
then that's going to inhibit a lot of optimizations.  However, that might
be necessary to prevent the compiler from over-optimizing in the presence
of bounds checking instructions.  I don't really know.



jeff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-28  9:48                       ` Jeffrey A Law
@ 2000-03-28 11:30                         ` Geoff Keating
  2000-03-28 12:26                           ` Greg McGary
  2000-03-28 14:21                           ` Greg McGary
  0 siblings, 2 replies; 38+ messages in thread
From: Geoff Keating @ 2000-03-28 11:30 UTC (permalink / raw)
  To: law; +Cc: gkm, gcc

> cc: Geoff Keating <geoffk@cygnus.com>, gcc@gcc.gnu.org
> Reply-To: law@cygnus.com
> Date: Tue, 28 Mar 2000 10:40:52 -0700
> From: Jeffrey A Law <law@cygnus.com>
> 
> 
>   In message < msem8vovhj.fsf@gkm-dsl-194.ascend.com >you write:
>   > Jeffrey A Law <law@cygnus.com> writes:
>   > 
>   > > Is there any advantage to not always emitting the high/low bounds checks
>   > > separately?
>   > 
>   > For i960 there's a small advantage that if you emit checks together,
>   > you can use concmp and save one instruction ("cmpo; concmpo; fault"
>   > vs. "cmpo; fault; cmpo; fault").  Big deal...
> That indicates to me that we have 3 primitives.
> 
>   1. check high bounds
>   2. check low bounds
>   3. check high and low bounds
> 
> When we generate code we should first check for the availability of #3, then
> fall back to #1 & #2, then fall back to whatever scheme we can devise using
> unconditional break/trap instructions or abort calls.

Note that you can do (3) with a subtraction and a single conditional
trap instruction---eg. if you want to check whether 'r3' is between 4
and 10, you can do

    tmp = r3 - 4
    if ((unsigned)r3 > 6)  trap;

or

    tmp = r3 - 4 + MIN_INT;
    if ((int)r3 > MIN_INT + 6)  trap;

depending on what's available.  The machine-independent code should be
able to work this out.

Of course, this only makes sense if a conditional trap is slower than
a subtraction, which is not the case on at least ppc (they are both
one cycle each).  If there are no conditional traps, probably
gcc should already be trying to do this for conditional branches.

>   > For i386, if you check both at once, you can use the `bound'
>   > instruction, which is slower, but very compact.
> That's easily handled by making the check high & low bounds instruction
> be conditional on optimize_size and provide additional patterns to theck
> the high bound and another pattern to check the low bound.

It certainly sounds like a good idea on x86 to have the use of the
'bound' insn conditional on optimize_size.

> For example, if the compiler treats them as ending the current basic block,
> then that's going to inhibit a lot of optimizations.  However, that might
> be necessary to prevent the compiler from over-optimizing in the presence
> of bounds checking instructions.  I don't really know.

They shouldn't end a basic block.  So long as the compiler knows that
'trap' RTXs are memory barriers, I think it can optimise them just
like any other insn; and I think that's what it now does.

-- 
- Geoffrey Keating <geoffk@cygnus.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-28 11:30                         ` Geoff Keating
@ 2000-03-28 12:26                           ` Greg McGary
  2000-03-28 12:30                             ` Geoff Keating
  2000-03-28 14:21                           ` Greg McGary
  1 sibling, 1 reply; 38+ messages in thread
From: Greg McGary @ 2000-03-28 12:26 UTC (permalink / raw)
  To: Geoff Keating; +Cc: law, gcc

Geoff Keating <geoffk@cygnus.com> writes:

> They shouldn't end a basic block.  So long as the compiler knows that
> 'trap' RTXs are memory barriers, I think it can optimise them just
> like any other insn; and I think that's what it now does.

Please expand on this.  I'm not sure what you mean here.  My only
knowledge of barriers is what's in the gcc manual, namely that they
mark places where control cannot flow past.  However, this is not the
case with bounds checks.  Most often, control flows past the checks
because the trap condition is false.  Even when the trap condition is
true, the OS should have the option of logging the failure then
resuming execution rather than simply terminating.

Greg

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-28 12:26                           ` Greg McGary
@ 2000-03-28 12:30                             ` Geoff Keating
  2000-03-28 12:59                               ` Greg McGary
  0 siblings, 1 reply; 38+ messages in thread
From: Geoff Keating @ 2000-03-28 12:30 UTC (permalink / raw)
  To: gkm; +Cc: law, gcc

> Cc: law@cygnus.com, gcc@gcc.gnu.org
> From: Greg McGary <gkm@eng.ascend.com>
> Date: 28 Mar 2000 13:25:51 -0700
> 
> Geoff Keating <geoffk@cygnus.com> writes:
> 
> > They shouldn't end a basic block.  So long as the compiler knows that
> > 'trap' RTXs are memory barriers, I think it can optimise them just
> > like any other insn; and I think that's what it now does.
> 
> Please expand on this.  I'm not sure what you mean here.  My only
> knowledge of barriers is what's in the gcc manual, namely that they
> mark places where control cannot flow past.  However, this is not the
> case with bounds checks.  Most often, control flows past the checks
> because the trap condition is false.  Even when the trap condition is
> true, the OS should have the option of logging the failure then
> resuming execution rather than simply terminating.

By 'memory barrier', I mean an instruction over which it is not safe
to move memory accesses.  It would be annoying if in the code sequence

   if (i > 10) trap;
   a[i] = 3;

the store to the array was moved before the trap.

-- 
- Geoffrey Keating <geoffk@cygnus.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-28 12:30                             ` Geoff Keating
@ 2000-03-28 12:59                               ` Greg McGary
  2000-03-28 13:12                                 ` Greg McGary
                                                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Greg McGary @ 2000-03-28 12:59 UTC (permalink / raw)
  To: Geoff Keating; +Cc: law, gcc

Geoff Keating <geoffk@cygnus.com> writes:

> By 'memory barrier', I mean an instruction over which it is not safe
> to move memory accesses.  It would be annoying if in the code sequence
> 
>    if (i > 10) trap;
>    a[i] = 3;
> 
> the store to the array was moved before the trap.

I'm not sure I agree that it's annoying to reorder bounds checks &
memory accesses.  What matters is that we get the trap for the bounds
violation, and that the bounds-violating code is identifiable from the
trap.  The order in which the check & associated memory reference
occurs is only a concern if you're going to somehow try and prevent or
alter the memory access on the basis of having gotten the trap.
That's way beyond what we're trying to do here.  If the memory access
gives us a SEGV, it doesn't matter whether or not the trap came
first--we still found the bug.  IF the memory access doesn't SEGV, the
trap will uncover the bug regardless of whether it comes before or
after the access.  Therefore, if the memory-barrier behavior for
bounds checks inhibits optimization in any meaningful then I think we
should drop the barrier semantics--it's not worth the cost.

Is there some other advantage to memory-barrier semantics wrt. bounds
checks that I'm overlooking?

Greg

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-28 12:59                               ` Greg McGary
@ 2000-03-28 13:12                                 ` Greg McGary
  2000-03-29 10:17                                   ` Joe Buck
  2000-03-28 13:41                                 ` Alan Lehotsky
  2001-09-04 23:52                                 ` Tom Tromey
  2 siblings, 1 reply; 38+ messages in thread
From: Greg McGary @ 2000-03-28 13:12 UTC (permalink / raw)
  To: Geoff Keating; +Cc: law, gcc

Greg McGary <gkm@eng.ascend.com> writes:

> Is there some other advantage to memory-barrier semantics wrt. bounds
> checks that I'm overlooking?

I just thought of one: in a flat-memory-model system (say an embedded
system), it might useful to have a policy whereby a task is terminated
immediately upon detection of a bounds violation, and it's desirable
to terminate it before it has a chance to scribble on some other
task's memory.  OTOH, this is tricky business: you need some means of
releasing resources locked by the newly terminated task and rolling
back any partially completed work.  It sounds like a ton of work to
implement correctly, so the benefit of memory-barrier semantics is not
easily realized.

Does anyone know what optimizations barrier semantics will inhibit?
Instruction scheduling and code motion in the loop optimizer are the
two that spring to mind.

Greg

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-28 13:12                                 ` Greg McGary
@ 2000-03-29 10:17                                   ` Joe Buck
  0 siblings, 0 replies; 38+ messages in thread
From: Joe Buck @ 2000-03-29 10:17 UTC (permalink / raw)
  To: Greg McGary; +Cc: Geoff Keating, law, gcc

Greg McGary <gkm@eng.ascend.com> writes:
> 
> > Is there some other advantage to memory-barrier semantics wrt. bounds
> > checks that I'm overlooking?
> 
> I just thought of one: in a flat-memory-model system (say an embedded
> system), it might useful to have a policy whereby a task is terminated
> immediately upon detection of a bounds violation, and it's desirable
> to terminate it before it has a chance to scribble on some other
> task's memory.  OTOH, this is tricky business: you need some means of
> releasing resources locked by the newly terminated task and rolling
> back any partially completed work.  It sounds like a ton of work to
> implement correctly, so the benefit of memory-barrier semantics is not
> easily realized.

If the function that terminates the task is supplied by the RTOS (or can
be overriden), then the RTOS can take care of releasing resources owned by
the task that were provided by the RTOS (e.g. semaphores, allocated memory
segments, reserved ports) when the task is killed.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-28 12:59                               ` Greg McGary
  2000-03-28 13:12                                 ` Greg McGary
@ 2000-03-28 13:41                                 ` Alan Lehotsky
  2000-03-28 14:25                                   ` Greg McGary
  2001-09-04 23:52                                 ` Tom Tromey
  2 siblings, 1 reply; 38+ messages in thread
From: Alan Lehotsky @ 2000-03-28 13:41 UTC (permalink / raw)
  To: gkm; +Cc: geoffk, law, gcc

One advantage of checking before stomping is that if stomping
memory destroys your stack, it is darn hard to debug the problem...

-- AL

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-28 13:41                                 ` Alan Lehotsky
@ 2000-03-28 14:25                                   ` Greg McGary
  0 siblings, 0 replies; 38+ messages in thread
From: Greg McGary @ 2000-03-28 14:25 UTC (permalink / raw)
  To: Alan Lehotsky; +Cc: geoffk, law, gcc

Alan Lehotsky <lehotsky@sunspot.tiac.net> writes:

> One advantage of checking before stomping is that if stomping
> memory destroys your stack, it is darn hard to debug the problem...

Excellent point.  I think the thing to do is implement both
alternatives, benchmark and see if there's any compelling reason to
relax barrier semantics.  If the performance difference is negligible,
then always use barrier semantics.  If the performance difference is
significant, then make it optional.  Some might wish to pay the
runtime penalty for a guarantee that bounds violations won't stomp the
stack.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-28 12:59                               ` Greg McGary
  2000-03-28 13:12                                 ` Greg McGary
  2000-03-28 13:41                                 ` Alan Lehotsky
@ 2001-09-04 23:52                                 ` Tom Tromey
  2 siblings, 0 replies; 38+ messages in thread
From: Tom Tromey @ 2001-09-04 23:52 UTC (permalink / raw)
  To: gcc

>>>>> "Greg" == Greg McGary <gkm@eng.ascend.com> writes:

Greg> Is there some other advantage to memory-barrier semantics
Greg> wrt. bounds checks that I'm overlooking?

You once mentioned the idea of changing gcj to use your bounds
checking code.  Java has strict rules about when the exception must be
thrown.  (Whether it is feasible to change the gcj front end to use
this stuff, I don't know.)

Tom

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-28 11:30                         ` Geoff Keating
  2000-03-28 12:26                           ` Greg McGary
@ 2000-03-28 14:21                           ` Greg McGary
  1 sibling, 0 replies; 38+ messages in thread
From: Greg McGary @ 2000-03-28 14:21 UTC (permalink / raw)
  To: Geoff Keating; +Cc: law, gcc

Geoff Keating <geoffk@cygnus.com> writes:

> Note that you can do (3) with a subtraction and a single conditional
> trap instruction---eg. if you want to check whether 'r3' is between 4
> and 10, you can do
> 
>     tmp = r3 - 4
>     if ((unsigned)r3 > 6)  trap;

you meant:
      if ((unsigned)tmp > 6)  trap;

> or
> 
>     tmp = r3 - 4 + MIN_INT;
>     if ((int)r3 > MIN_INT + 6)  trap;

you meant:
      if ((int)tmp > MIN_INT + 6)  trap;

> depending on what's available.  The machine-independent code should be
> able to work this out.

In practice, you can't do the 10-4 subtraction at compile time.  You
need two runtime subtractions (rV = value reg, rB = base reg, rX =
extent reg):

	rB = rV - rB
	rX = rX - rB
	trap_if rB > rX

> Of course, this only makes sense if a conditional trap is slower than
> a subtraction,

For an untaken conditional trap that is 2x slower than a subtraction,
the times are even for 2 traps vs. 2 subtractions + 1 trap, with the
advantage to the 2 traps because it doesn't clobber rB.  Untaken
conditional traps would need to be at least 3x slower to justify doing
the subtraction trick.  I'd bet dollars to donuts that there are is no
RISC that has untaken conditional traps that take 3+ clocks.

> ... If there are no conditional traps, probably
> gcc should already be trying to do this for conditional branches.

I don't think the subtractions win for ix86, which is the only CISC
having no conditional traps that most people care about.  In order to
do the extent-base subtraction, you have to load the extent into a
register which you would not need to do otherwise.  Also, you need to
do the value-base subtraction in a temporary, in order to preserve rV.
With the subtraction approach you need to do this (in pseudo asm;
rV, rX and rT are registers holding value, extent and temp; `base' and
`extent' are references to those quantities in memory):

	load extent to rX
	subtract base from rX
	copy rV to rT
	subtract base from rT
	compare rT and rX
	branch to 0f if <=
	trap
    0:

With integrated simple compares & conditional branches:

	compare rV with base
	branch to 0f if >=
	compare rV with extent
	branch to 1f if <
    0:  trap
    1:

With independent compares & conditional branches:

	compare rV with base
	branch to 0f if >=
	trap
    0:
	...
	compare rV with extent
	branch to 1f if <
        trap
    1:

Both of the simpler compare/branch sequences are as good as or better
than the subtractions for time & space, and they consume fewer
registers.

Greg

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-28  7:05                   ` Jeffrey A Law
  2000-03-28  9:28                     ` Greg McGary
@ 2000-03-28  9:57                     ` Joern Rennecke
  1 sibling, 0 replies; 38+ messages in thread
From: Joern Rennecke @ 2000-03-28  9:57 UTC (permalink / raw)
  To: law; +Cc: Greg McGary, Geoff Keating, gcc

> I hadn't thought of that.  It assumes you can't overflow, but that's
> probably a reasonable simplification for pointer operations.

If bounds checking is wanted, I don't think it is reasonable to assume
that there can't be an overflow - after all, the point of bounds checking
is to detect bugs.

However, in most cases we will be able to deduct that there can't be an
overflow.  That is:
- if a bounds check is done in every iteration, and
- if sum of the upper bound (rounded down for alignment if a STRICT_ALIGNMENT
  memory access is done at every iteration), plus the increment (or the
  upper bound of the value range of the increment if variable, but in a
  known range) doesn't overflow.
> 
> Is there any advantage to not always emitting the high/low bounds checks
> separately?

Yes, a combined check is simpler.  Even if you don't have a specific
instruction for a bounds check, you can subtract the lower bound, then
compare the upper bound minus the lower bound unsigned to the subtraction
result.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-27 13:05           ` Greg McGary
  2000-03-27 13:54             ` Geoff Keating
@ 2000-03-29 12:22             ` Jeffrey A Law
  2000-03-29 13:35               ` Geoff Keating
  2000-04-07  9:57               ` Greg McGary
  1 sibling, 2 replies; 38+ messages in thread
From: Jeffrey A Law @ 2000-03-29 12:22 UTC (permalink / raw)
  To: Greg McGary; +Cc: Joern Rennecke, gcc

  In message < msaejkt998.fsf@gkm-dsl-194.ascend.com >you write:
  > > 
  > > I'd be much happier if we could find a clean way to get the optimizations
  > > you need using our existing optimizers.
  > 
  > We can do without a special pass.  CSE seems the best place to
  > eliminate redundant checks, so we'd just need a little bit of code to
  > handle the new check_bounds_rtx.  We can just forget about doing
  > default translation of check_bounds_rtx into primitive RTL, and
  > require that targets supporting bounds checking handle it in the
  > machine description.  So far, all of the targets I want to support
  > initially (i960, PowerPC, i386, MIPS) will benefit from special
  > handling in the MD file anyway.
  > 
  > Does that sound reasonable?
Seems reasonable.  We should probably have some way to automatically warn the
user if they try to use bounded pointers on a target which does not have 
conditional traps, etc.

We could probably note if the target supports conditional traps during one
of the RTL initializers and warn when we encounter a [un]bounded keyword or
attribute during parsing.

jeff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-29 12:22             ` Jeffrey A Law
@ 2000-03-29 13:35               ` Geoff Keating
  2000-04-07  9:57               ` Greg McGary
  1 sibling, 0 replies; 38+ messages in thread
From: Geoff Keating @ 2000-03-29 13:35 UTC (permalink / raw)
  To: gcc

Jeffrey A Law <law@cygnus.com> writes:

>   In message < msaejkt998.fsf@gkm-dsl-194.ascend.com >you write:
>   > > 
>   > > I'd be much happier if we could find a clean way to get the optimizations
>   > > you need using our existing optimizers.
>   > 
>   > We can do without a special pass.  CSE seems the best place to
>   > eliminate redundant checks, so we'd just need a little bit of code to
>   > handle the new check_bounds_rtx.  We can just forget about doing
>   > default translation of check_bounds_rtx into primitive RTL, and
>   > require that targets supporting bounds checking handle it in the
>   > machine description.  So far, all of the targets I want to support
>   > initially (i960, PowerPC, i386, MIPS) will benefit from special
>   > handling in the MD file anyway.
>   > 
>   > Does that sound reasonable?
> Seems reasonable.  We should probably have some way to automatically warn the
> user if they try to use bounded pointers on a target which does not have 
> conditional traps, etc.

like this code:

    case BUILT_IN_TRAP:
#ifdef HAVE_trap
      if (HAVE_trap)
	emit_insn (gen_trap ());
      else
#endif
	error ("__builtin_trap not supported by this target");
      emit_barrier ();
      return const0_rtx;

-- 
- Geoffrey Keating <geoffk@cygnus.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-03-29 12:22             ` Jeffrey A Law
  2000-03-29 13:35               ` Geoff Keating
@ 2000-04-07  9:57               ` Greg McGary
  2000-04-09 11:01                 ` Jeffrey A Law
  1 sibling, 1 reply; 38+ messages in thread
From: Greg McGary @ 2000-04-07  9:57 UTC (permalink / raw)
  To: law; +Cc: Joern Rennecke, gcc

I wrote:

> We can do without a special pass.  CSE seems the best place to
> eliminate redundant checks, so we'd just need a little bit of code to
> handle the new check_bounds_rtx.  We can just forget about doing
> default translation of check_bounds_rtx into primitive RTL, and
> require that targets supporting bounds checking handle it in the
> machine description.  So far, all of the targets I want to support
> initially (i960, PowerPC, i386, MIPS) will benefit from special
> handling in the MD file anyway.

Happy news: this turned out to be less intrusive than I feared it
might become.  I emit checks as trees like so:

------------------------------------------------------------------------------
  value = build_bounded_ptr_value_ref (bp);
  base = build_bounded_ptr_base_ref (bp);
  extent = build_bounded_ptr_extent_ref (bp);

  /* GKM FIXME: fold needs to be enhanced to evaluate expressions of
     the form (a >= (a + i)) ==> 0, where `a' is an address and `i' is
     a positive integer.  */
  compare = fold (build_binary_op (TRUTH_ORIF_EXPR,
				   fold (build_binary_op (LT_EXPR, value, base, 1)),
				   fold (build_binary_op (GE_EXPR, value, extent, 1)), 1));
  if (integer_zerop (compare))
    return value;
  else
    {
      tree crash = build_compound_expr (tree_cons (NULL_TREE,
						   build_function_call (trap_fndecl, NULL_TREE),
						   tree_cons (NULL_TREE, integer_zero_node,
							      NULL_TREE)));
      tree maybe_crash = fold (build_binary_op (TRUTH_ANDIF_EXPR, compare, crash, 1));
      tree check = fold (build_compound_expr (tree_cons (NULL_TREE, maybe_crash,
							 tree_cons (NULL_TREE, value,
								    NULL_TREE))));
      if (integer_onep (compare))
	warning ("bounds violation");
      return check;
    }
------------------------------------------------------------------------------
(The integer_zerop check isn't really necessary since fold will do the
same thing...  Should I bother with such micro-efficiencies?)

The jump optimizer already automagically converts conditional jumps
around traps into conditional traps, if they're defined for the
machine, otherwise they remain as unconditional traps.

Conditional traps are much easier to deal with for optimization since
they hide branches that would otherwise chop up basic blocks and they
are readily identifiable as checks, so I defined pseudo conditional
trap insns for ix86, like so:

------------------------------------------------------------------------------
(define_insn "trap"
  [(trap_if (const_int 1) (const_int 5))]
  ""
  "int\\t%0")

;;; ix86 doesn't have conditional traps, but we fake them for the sake
;;; of bounds checking.  By emitting bounds checks as conditional
;;; traps rather than as conditional jumps around unconditional traps
;;; we avoid introducing spurious basic-block boundaries and facilitate
;;; elimination of redundant checks.

(define_expand "conditional_trap"
  [(trap_if (match_operator 0 "comparison_operator"
	     [(match_dup 2) (const_int 0)])
	    (match_operand 1 "const_int_operand" ""))]
  ""
  "
{
  enum rtx_code code = GET_CODE (operands[0]);
  int unordered = (code == EQ || code == NE);
  emit_insn (gen_rtx_TRAP_IF (VOIDmode,
			      ix86_expand_compare (code, unordered),
			      operands[1]));
  DONE;
}")

(define_insn ""
  [(trap_if (match_operator 0 "comparison_operator"
	     [(reg 17) (const_int 0)])
	    (match_operand 1 "const_int_operand" ""))]
  ""
  "j%c0\\t1f\;int\\t%1\\n1:")
------------------------------------------------------------------------------
(Should I somehow mark conditional traps as being generated
automatically by bounds checking, and only optimize those away, or
should I just document the as yet undocumented __builtin_trap as
subject to this optimization?)

It is very easy to eliminate redundant checks at the end of the main
loop in combine_instructions:

	  if (GET_CODE (insn) == INSN
	      && GET_CODE (PATTERN (insn)) == TRAP_IF)
	    {
	      for (links = trap_insns; links; links = XEXP (links, 1))
		{
		  rtx earlier = XEXP (links, 0);
		  if (rtx_and_links_equal_p (insn, earlier))
		    {
		      /* This conditional trap is redundant, so delete
			 it after giving its notes to the earlier insn.  */
		      distribute_notes (REG_NOTES (insn), insn, earlier, earlier,
					NULL_RTX, NULL_RTX);
		      PUT_CODE (insn, NOTE);
		      NOTE_LINE_NUMBER (insn) = NOTE_INSN_DELETED;
		      NOTE_SOURCE_FILE (insn) = 0;
		      break;
		    }
		}
	      if (GET_CODE (insn) == INSN)
		/* This conditional trap is unique, so remember it.  */
		trap_insns = alloc_INSN_LIST (insn, trap_insns);


...

/* Compare two insns and all the REG assignments they depend upon for
   equality.  We use this in order to determine if two conditional traps
   are identical, so that we may eliminate redundancies.  */

static int
rtx_and_links_equal_p (insn1, insn2)
     rtx insn1;
     rtx insn2;
{
  if (rtx_equal_p (PATTERN (insn1), PATTERN (insn2)))
    {
      rtx links1 = LOG_LINKS (insn1);
      rtx links2 = LOG_LINKS (insn2);
      while (links1 && links2)
	{
	  if (! rtx_and_links_equal_p (XEXP (links1, 0), XEXP (links2, 0)))
	    return 0;
	  links1 = XEXP (links1, 1);
	  links2 = XEXP (links2, 1);
	}
      return 1;
    }
  return 0;
}
------------------------------------------------------------------------------

I can now build runnable GNU libc-2.1 + most of GNU fileutils &
textutils with -fbounded-pointers.  I even found a bounds-violation
bug in GNU pr today.

Greg

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-04-07  9:57               ` Greg McGary
@ 2000-04-09 11:01                 ` Jeffrey A Law
  2000-04-09 11:38                   ` Greg McGary
  2000-04-09 16:26                   ` Greg McGary
  0 siblings, 2 replies; 38+ messages in thread
From: Jeffrey A Law @ 2000-04-09 11:01 UTC (permalink / raw)
  To: Greg McGary; +Cc: Joern Rennecke, gcc

  In message < ms7le93l40.fsf@tucson-net-82.eng.ascend.com >you write:
   > > We can do without a special pass.  CSE seems the best place to
  > > eliminate redundant checks, so we'd just need a little bit of code to
  > > handle the new check_bounds_rtx.  We can just forget about doing
  > > default translation of check_bounds_rtx into primitive RTL, and
  > > require that targets supporting bounds checking handle it in the
  > > machine description.  So far, all of the targets I want to support
  > > initially (i960, PowerPC, i386, MIPS) will benefit from special
  > > handling in the MD file anyway.
  > 
  > Happy news: this turned out to be less intrusive than I feared it
  > might become.  
Cool.

  > ---------------------------------------------------------------------------
  > ---
  >   value = build_bounded_ptr_value_ref (bp);
  >   base = build_bounded_ptr_base_ref (bp);
  >   extent = build_bounded_ptr_extent_ref (bp);
  > 
  >   /* GKM FIXME: fold needs to be enhanced to evaluate expressions of
  >      the form (a >= (a + i)) ==> 0, where `a' is an address and `i' is
  >      a positive integer.  */
I'm surprised fold doesn't already do that kind of transformation.  There
may be obscure issues with signed vs unsigned pointers and such here.

Anyway, it would seem relatively straightforward to me to add that class of
optimizations to fold.  We'd probably need to restrict them to cases where
'a' is a pointer type though.

  >   compare = fold (build_binary_op (TRUTH_ORIF_EXPR,
  > 				   fold (build_binary_op (LT_EXPR, value, base,
  >  1)),
  > 				   fold (build_binary_op (GE_EXPR, value, exten
  > t, 1)), 1));
  >   if (integer_zerop (compare))
I wonder if this can be done via a call into the range optimizers.  Their
job is to optimize range checking.

  >       tree crash = build_compound_expr (tree_cons (NULL_TREE,
  > 						   build_function_call (trap_fn
  > decl, NULL_TREE),
  > 						   tree_cons (NULL_TREE, intege
  > r_zero_node,
  > 							      NULL_TREE)));
  >       tree maybe_crash = fold (build_binary_op (TRUTH_ANDIF_EXPR, compare, 
  > crash, 1));
I don't understand this -- it looks like you're expecting the trap function
to return a value.  Is that correct?  Presumably it will be a runtime value,
not a compile-time value?

  >       tree check = fold (build_compound_expr (tree_cons (NULL_TREE, maybe_c
  > rash,
  > 							 tree_cons (NULL_TREE, 
  > value,
  > 								    NULL_TREE))
  > ));
  >       if (integer_onep (compare))
  > 	warning ("bounds violation");
  >       return check;
Seems to me check will never be integer_onep unless we know trap_fn always
returns a nonzero value.

  > (The integer_zerop check isn't really necessary since fold will do the
  > same thing...  Should I bother with such micro-efficiencies?)
I don't follow exactly, possibly because I don't know where you propose
to insert this code.

  > The jump optimizer already automagically converts conditional jumps
  > around traps into conditional traps, if they're defined for the
  > machine, otherwise they remain as unconditional traps.
Right.


  > ;;; ix86 doesn't have conditional traps, but we fake them for the sake
  > ;;; of bounds checking.  By emitting bounds checks as conditional
  > ;;; traps rather than as conditional jumps around unconditional traps
  > ;;; we avoid introducing spurious basic-block boundaries and facilitate
  > ;;; elimination of redundant checks.
This style may end up being the recommended approach; not really sure.  Maybe
we just leave it up to the backend folks for each port with a note that
this kind of approach might be more efficient.


  > (Should I somehow mark conditional traps as being generated
  > automatically by bounds checking, and only optimize those away, or
  > should I just document the as yet undocumented __builtin_trap as
  > subject to this optimization?)
I don't think so.


  > It is very easy to eliminate redundant checks at the end of the main
  > loop in combine_instructions:
This looks very compile-time expensive since you have to walk the LOG_LINKs
chain all the way back to the top of the dependency chain.

Presumably you're trying to delete one conditional trap that is subsumed by
an earlier conditional trap?  If so, that might be best modeled after an
identical optimization we do on jumps.   See jump.c

jeff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-04-09 11:01                 ` Jeffrey A Law
@ 2000-04-09 11:38                   ` Greg McGary
  2000-04-10 10:13                     ` Jeffrey A Law
  2000-04-09 16:26                   ` Greg McGary
  1 sibling, 1 reply; 38+ messages in thread
From: Greg McGary @ 2000-04-09 11:38 UTC (permalink / raw)
  To: law; +Cc: Joern Rennecke, gcc

Jeffrey A Law <law@cygnus.com> writes:

>   >       tree crash = build_compound_expr (tree_cons (NULL_TREE,
>   > 						   build_function_call (trap_fn
>   > decl, NULL_TREE),
>   > 						   tree_cons (NULL_TREE, intege
>   > r_zero_node,
>   > 							      NULL_TREE)));
>   >       tree maybe_crash = fold (build_binary_op (TRUTH_ANDIF_EXPR, compare, 
>   > crash, 1));
> I don't understand this -- it looks like you're expecting the trap function
> to return a value.  Is that correct?  Presumably it will be a runtime value,
> not a compile-time value?

The thing is that __builtin_trap doesn't return a value, but the
expression I'm building wants a value, so I wrap it in `(__builtin_trap (), 0)'.

The whole expansion is this:

(((p.value < p.base || p.value >= p.extent)
  && (__builtin_trap (), 0)),
 p.value)

Without the compound expr around __builtin_trap, it's this:

(((p.value < p.base || p.value >= p.extent)
  && __builtin_trap),
 p.value)

but here gcc complains about the truth_andif_expr not returning a
value, even though we ignore the value anyway, since I'm just using
the short-circuit behavior of `&&' to evaluate __builtin_trap trap if
the predicate is true.  The value of the entire check expression
wants to be p.value.

>   >       tree check = fold (build_compound_expr (tree_cons (NULL_TREE, maybe_c
>   > rash,
>   > 							 tree_cons (NULL_TREE, 
>   > value,
>   > 								    NULL_TREE))
>   > ));
>   >       if (integer_onep (compare))
>   > 	warning ("bounds violation");
>   >       return check;
> Seems to me check will never be integer_onep unless we know trap_fn always
> returns a nonzero value.

The variable `compare' represents the bounds tests:

	(p.value < p.base || p.value >= p.extent)

The variable `maybe_crash' represents bounds tests & the conditional trap:

	((p.value < p.base || p.value >= p.extent)
	 && (__builtin_trap (), 0))

The variable `check' represents the whole package, which returns the pointer value:

	(((p.value < p.base || p.value >= p.extent)
	  && __builtin_trap),
	 p.value)

Obviously, I should comment better, or choose better variable names.

>   > (Should I somehow mark conditional traps as being generated
>   > automatically by bounds checking, and only optimize those away, or
>   > should I just document the as yet undocumented __builtin_trap as
>   > subject to this optimization?)
> I don't think so.

There are two questions here.  Which one are you answering?
(a) only optimize away redundant trap_if associate with BPs?
(b) optimize away all redundant trap_if, and document that behavior
    for __builtin_trap?  (incidentally, __builtin_trap is undocumented)

>   > It is very easy to eliminate redundant checks at the end of the main
>   > loop in combine_instructions:
> This looks very compile-time expensive since you have to walk the LOG_LINKs
> chain all the way back to the top of the dependency chain.

I'll instrument it, but from limited observation, I have see that we
only walk back one or two level, since pointers are usually not
subjected to complicated arithmetic with many subexpressions.

> Presumably you're trying to delete one conditional trap that is subsumed by
> an earlier conditional trap?  If so, that might be best modeled after an
> identical optimization we do on jumps.   See jump.c

OK.  Thanks for the tips.

Greg

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-04-09 11:38                   ` Greg McGary
@ 2000-04-10 10:13                     ` Jeffrey A Law
  0 siblings, 0 replies; 38+ messages in thread
From: Jeffrey A Law @ 2000-04-10 10:13 UTC (permalink / raw)
  To: Greg McGary; +Cc: Joern Rennecke, gcc

  In message < mszor3unld.fsf@tucson-net-82.eng.ascend.com >you write:
  > >   > (Should I somehow mark conditional traps as being generated
  > >   > automatically by bounds checking, and only optimize those away, or
  > >   > should I just document the as yet undocumented __builtin_trap as
  > >   > subject to this optimization?)
  > > I don't think so.
  > 
  > There are two questions here.  Which one are you answering?
  > (a) only optimize away redundant trap_if associate with BPs?
  > (b) optimize away all redundant trap_if, and document that behavior
  >     for __builtin_trap?  (incidentally, __builtin_trap is undocumented)
I think we should always remove redundant trap_ifs that we find, regardless
of whether they're created by bounded-pointers or some other mechanism.
Documenting this behavior (along with __builtin_trap) would be wise.

jeff


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-04-09 11:01                 ` Jeffrey A Law
  2000-04-09 11:38                   ` Greg McGary
@ 2000-04-09 16:26                   ` Greg McGary
  2000-04-10 10:20                     ` Jeffrey A Law
  1 sibling, 1 reply; 38+ messages in thread
From: Greg McGary @ 2000-04-09 16:26 UTC (permalink / raw)
  To: law; +Cc: Joern Rennecke, gcc

Jeffrey A Law <law@cygnus.com> writes:

>   > It is very easy to eliminate redundant checks at the end of the main
>   > loop in combine_instructions:
> This looks very compile-time expensive since you have to walk the LOG_LINKs
> chain all the way back to the top of the dependency chain.

I'll look for a cheaper way to do this.

> Presumably you're trying to delete one conditional trap that is subsumed by
> an earlier conditional trap?

I'm not sure what you mean by "subsumed" in this context.  What I'm
trying to optimize is this case:

	if (x)
	  p->a = 1;
	p->b = 2;
	p->c = 3;
	p->d = 4;
	p++;
	p->e = 5;
	p->f = 6;
	p->g = 7;
	p->h = 8;

Without optimization, the bounds of `p' are checked eight times, once
per `->' operator.  The necessary checks are at `p->a = 1', `p->b =
2', `p->e = 5'; the rest are redundant and should be deleted.  p->a is
separated from the others by a BB boundary, so there's nothing special
here.  p->b .. p->h are in the same BB, but p->e .. p->h have LOG_LINKS
that point to the increment of p.value, and so differ from p->a .. p->d.

> If so, that might be best modeled after an
> identical optimization we do on jumps.   See jump.c

Would you kindly give a more specific clue about where to look in
jump.c?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Need advice on bounds checking approaches
  2000-04-09 16:26                   ` Greg McGary
@ 2000-04-10 10:20                     ` Jeffrey A Law
  0 siblings, 0 replies; 38+ messages in thread
From: Jeffrey A Law @ 2000-04-10 10:20 UTC (permalink / raw)
  To: Greg McGary; +Cc: Joern Rennecke, gcc

  In message < msr9cevotm.fsf@tucson-net-82.eng.ascend.com >you write:
  > > Presumably you're trying to delete one conditional trap that is subsumed 
  > by
  > > an earlier conditional trap?
  > 
  > I'm not sure what you mean by "subsumed" in this context.  What I'm
  > trying to optimize is this case:
  > 
  > 	if (x)
  > 	  p->a = 1;
  > 	p->b = 2;
  > 	p->c = 3;
  > 	p->d = 4;
  > 	p++;
  > 	p->e = 5;
  > 	p->f = 6;
  > 	p->g = 7;
  > 	p->h = 8;
  > 
  > Without optimization, the bounds of `p' are checked eight times, once
  > per `->' operator.  The necessary checks are at `p->a = 1', `p->b =
  > 2', `p->e = 5'; the rest are redundant and should be deleted.  p->a is
  > separated from the others by a BB boundary, so there's nothing special
  > here.  p->b .. p->h are in the same BB, but p->e .. p->h have LOG_LINKS
  > that point to the increment of p.value, and so differ from p->a .. p->d.
By subsumed I mean a check that is redundant due to an earlier check.


  > > If so, that might be best modeled after an
  > > identical optimization we do on jumps.   See jump.c
  > 
  > Would you kindly give a more specific clue about where to look in
  > jump.c?
thread_jumps I believe does something similar to what you need.  I also
believe CSE does similar things as part of it's follow-jumps/skip-blocks
optimizations.


jeff


^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2001-09-04 23:52 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2000-03-24 11:18 Need advice on bounds checking approaches Greg McGary
2000-03-24 12:08 ` Joern Rennecke
2000-03-24 12:28   ` Greg McGary
2000-03-24 16:18     ` Jeffrey A Law
2000-03-24 16:50       ` Greg McGary
2000-03-24 17:27         ` Jamie Lokier
2000-03-27 12:30         ` Jeffrey A Law
2000-03-27 12:45           ` Mark Mitchell
2000-03-27 13:05           ` Greg McGary
2000-03-27 13:54             ` Geoff Keating
2000-03-27 14:21               ` Greg McGary
2000-03-27 14:30                 ` Jeffrey A Law
2000-03-27 19:23                   ` Michael Hayes
2000-03-27 14:34                 ` Geoff Keating
2000-03-27 22:07                 ` Greg McGary
2000-03-28  1:55                   ` Richard Henderson
2000-03-28  7:05                   ` Jeffrey A Law
2000-03-28  9:28                     ` Greg McGary
2000-03-28  9:48                       ` Jeffrey A Law
2000-03-28 11:30                         ` Geoff Keating
2000-03-28 12:26                           ` Greg McGary
2000-03-28 12:30                             ` Geoff Keating
2000-03-28 12:59                               ` Greg McGary
2000-03-28 13:12                                 ` Greg McGary
2000-03-29 10:17                                   ` Joe Buck
2000-03-28 13:41                                 ` Alan Lehotsky
2000-03-28 14:25                                   ` Greg McGary
2001-09-04 23:52                                 ` Tom Tromey
2000-03-28 14:21                           ` Greg McGary
2000-03-28  9:57                     ` Joern Rennecke
2000-03-29 12:22             ` Jeffrey A Law
2000-03-29 13:35               ` Geoff Keating
2000-04-07  9:57               ` Greg McGary
2000-04-09 11:01                 ` Jeffrey A Law
2000-04-09 11:38                   ` Greg McGary
2000-04-10 10:13                     ` Jeffrey A Law
2000-04-09 16:26                   ` Greg McGary
2000-04-10 10:20                     ` Jeffrey A Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).