* Need advice on bounds checking approaches @ 2000-03-24 11:18 Greg McGary 2000-03-24 12:08 ` Joern Rennecke 0 siblings, 1 reply; 38+ messages in thread From: Greg McGary @ 2000-03-24 11:18 UTC (permalink / raw) To: gcc; +Cc: gkm The final implementation phase for bounded pointers is to generate the code that does the checks. There are some choices to make, and I'd appreciate hearing from experienced maintainers how best to do it. The primary goal is to do this in a way that supports optimizing away redundant checks. Checks need to be generated at the time of pointer dereference or array reference. Let's focus on pointer dereference. Consider this code: void foo (char *p) { *p = 1; } Recall that the type char * has been transformed internally to a bounded pointer type like this: struct charp { char *value, *base, *extent; }; So, our function is internally equivalent to this: void foo (struct charp p) { *p.value = 1; } One place to generate checks is in the IR, transforming the function into something like this: void foo (struct charp p) { *({ if (p.value < p.base || p.value >= p.extent) abort (); p.value; }) = 1; } This is easy to implement, but seems to have drawbacks for optimization: The if statement expands to a sequence of jumps whose presence creates new basic-block boundaries. Also, the resulting RTL isn't readily identifiable as a bounds check An alternate approach is to transform the function like so: void foo (struct charp p) { *__builtin_check_bounds (p) = 1; } Where __builtin_check_bounds accepts a bounded pointer argument and returns a simple pointer value, and as a side effect, injects bounds-checking RTL nodes into the insn stream DEF_RTL_EXPR(CHECK_BOUNDS, "check_bounds", "eee", 'x') The three args are pointer value, base & extent. Now, the optimization passes (CSE, most likely), can easily identify redundant checks and eliminate them. Moreover, if a machine has a bounds-checking instruction, or a better than normal insn sequence for bounds checking, it can define an insn to recognize the "check_bounds" pattern. E.g., the most efficient way to check bounds on i960 is with this sequence (ptr & bas stand for registers holding those BP components): cmpo ptr, bas ; cc=100 on failure concmp ext, ptr ; cc=010 on failure faultle.f If a machine can't do anything better then it will need to default to something like this (in pseudo asm): cmp ptr, bas blt 0f cmp ptr, ext blt 1f 0: call abort 1: Question: with the above plan, is there a way to provide a default expansion of the "check_bounds" pattern into primitive RTL (comparisons, conditional branches and call to abort) for those targets that don't define an insn for "check_bounds"? The i960 will define an insn for this, so that it can do better, but most other targets won't be able to do better and it would be nice to avoid having to hack every MD file. Is there some other, better way to go? Thanks, Greg ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-24 11:18 Need advice on bounds checking approaches Greg McGary @ 2000-03-24 12:08 ` Joern Rennecke 2000-03-24 12:28 ` Greg McGary 0 siblings, 1 reply; 38+ messages in thread From: Joern Rennecke @ 2000-03-24 12:08 UTC (permalink / raw) To: Greg McGary; +Cc: gcc > Question: with the above plan, is there a way to provide a default > expansion of the "check_bounds" pattern into primitive RTL > (comparisons, conditional branches and call to abort) for those > targets that don't define an insn for "check_bounds"? You can use HAVE_check_bounds to test if a "check_bounds" pattern has been defined in the md file. If it is defined, you can use gen_check_bounds to generate this pattern. The expander might fail (e.g. because the target can implement the bounds checking insn only for a particular cpu_, in which case you get a zero return value from gen_check_bounds. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-24 12:08 ` Joern Rennecke @ 2000-03-24 12:28 ` Greg McGary 2000-03-24 16:18 ` Jeffrey A Law 0 siblings, 1 reply; 38+ messages in thread From: Greg McGary @ 2000-03-24 12:28 UTC (permalink / raw) To: Joern Rennecke; +Cc: gcc Joern Rennecke <amylaar@cygnus.co.uk> writes: > > Question: with the above plan, is there a way to provide a default > > expansion of the "check_bounds" pattern into primitive RTL > > (comparisons, conditional branches and call to abort) for those > > targets that don't define an insn for "check_bounds"? > > You can use HAVE_check_bounds to test if a "check_bounds" pattern has been > defined in the md file. If it is defined, you can use gen_check_bounds > to generate this pattern. The expander might fail (e.g. because the > target can implement the bounds checking insn only for a particular > cpu_, in which case you get a zero return value from gen_check_bounds. I am aware of HAVE_check_bounds. Unfortunately, it doesn't do what I want. If HAVE_check_bounds, I could generate "check_bounds", and if ! HAVE_check_bounds, I could generate the primitive RTL (compares + conditional branches + abort). If ! HAVE_check_bounds, then I lose the ability to conveniently optimize away redundant checks, and my basic blocks get sliced up by the presence of extra branches. What I want is to always generate check_bounds insns, always optimize away redundant checks the same way, and then provide a default *expansion* if the MD doesn't define one. Is that feasible, or am I dreaming? Greg ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-24 12:28 ` Greg McGary @ 2000-03-24 16:18 ` Jeffrey A Law 2000-03-24 16:50 ` Greg McGary 0 siblings, 1 reply; 38+ messages in thread From: Jeffrey A Law @ 2000-03-24 16:18 UTC (permalink / raw) To: Greg McGary; +Cc: Joern Rennecke, gcc In message < msitycyuyg.fsf@gkm-dsl-194.ascend.com >you write: > I am aware of HAVE_check_bounds. Unfortunately, it doesn't do what I > want. If HAVE_check_bounds, I could generate "check_bounds", and if > ! HAVE_check_bounds, I could generate the primitive RTL (compares + > conditional branches + abort). If ! HAVE_check_bounds, then I lose > the ability to conveniently optimize away redundant checks, and my > basic blocks get sliced up by the presence of extra branches. > > What I want is to always generate check_bounds insns, always optimize > away redundant checks the same way, and then provide a default > *expansion* if the MD doesn't define one. > > Is that feasible, or am I dreaming? Right now, you're dreaming. We have kicked around the idea of two levels of RTL where we do some optimizations on the higher level RTL, then drop down to a lower level RTL. Maybe it would make sense to do your optimizations at the tree level? We're working on functions-as-trees for the C front-end. Just a thought. jeff > > Greg > ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-24 16:18 ` Jeffrey A Law @ 2000-03-24 16:50 ` Greg McGary 2000-03-24 17:27 ` Jamie Lokier 2000-03-27 12:30 ` Jeffrey A Law 0 siblings, 2 replies; 38+ messages in thread From: Greg McGary @ 2000-03-24 16:50 UTC (permalink / raw) To: law; +Cc: Joern Rennecke, gcc Jeffrey A Law <law@cygnus.com> writes: > We have kicked around the idea of two levels of RTL where we do some > optimizations on the higher level RTL, then drop down to a lower level > RTL. Perhaps I could do that in a limited way, by adding a BP-specific optimization pass after loop optimization that eliminates redundant checks and expands "check_bounds" into primitive RTL for targets that don't HAVE_check_bounds. Could that pass muster? > Maybe it would make sense to do your optimizations at the tree level? We're > working on functions-as-trees for the C front-end. Just a thought. It's a fine thought, but I can't do that today, can I? I only see whole function mode being used for C++ inlines. Greg ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-24 16:50 ` Greg McGary @ 2000-03-24 17:27 ` Jamie Lokier 2000-03-27 12:30 ` Jeffrey A Law 1 sibling, 0 replies; 38+ messages in thread From: Jamie Lokier @ 2000-03-24 17:27 UTC (permalink / raw) To: Greg McGary; +Cc: law, Joern Rennecke, gcc Greg McGary wrote: > > We have kicked around the idea of two levels of RTL where we do some > > optimizations on the higher level RTL, then drop down to a lower level > > RTL. > > Perhaps I could do that in a limited way, by adding a BP-specific > optimization pass after loop optimization that eliminates redundant > checks and expands "check_bounds" into primitive RTL for targets that > don't HAVE_check_bounds. Could that pass muster? The general idea works ok for constant_p. But that is removed relatively early. -- Jamie ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-24 16:50 ` Greg McGary 2000-03-24 17:27 ` Jamie Lokier @ 2000-03-27 12:30 ` Jeffrey A Law 2000-03-27 12:45 ` Mark Mitchell 2000-03-27 13:05 ` Greg McGary 1 sibling, 2 replies; 38+ messages in thread From: Jeffrey A Law @ 2000-03-27 12:30 UTC (permalink / raw) To: Greg McGary; +Cc: Joern Rennecke, gcc In message < msvh2byiu2.fsf@gkm-dsl-194.ascend.com >you write: > Jeffrey A Law <law@cygnus.com> writes: > > > We have kicked around the idea of two levels of RTL where we do some > > optimizations on the higher level RTL, then drop down to a lower level > > RTL. > > Perhaps I could do that in a limited way, by adding a BP-specific > optimization pass after loop optimization that eliminates redundant > checks and expands "check_bounds" into primitive RTL for targets that > don't HAVE_check_bounds. Could that pass muster? Possibly. However, I'm not a big fan of having feature specific optimization passes, even if we've had some in the past (addressof & constant_p). I'd be much happier if we could find a clean way to get the optimizations you need using our existing optimizers. > It's a fine thought, but I can't do that today, can I? I only see > whole function mode being used for C++ inlines. That's not my understanding -- you'd probably need to talk to Mark -- I was under the impression that we always did function at a time stuff for C++. Jeff ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-27 12:30 ` Jeffrey A Law @ 2000-03-27 12:45 ` Mark Mitchell 2000-03-27 13:05 ` Greg McGary 1 sibling, 0 replies; 38+ messages in thread From: Mark Mitchell @ 2000-03-27 12:45 UTC (permalink / raw) To: law; +Cc: gkm, amylaar, gcc >>>>> "Jeffrey" == Jeffrey A Law <law@cygnus.com> writes: >> It's a fine thought, but I can't do that today, can I? I only >> see whole function mode being used for C++ inlines. Jeffrey> That's not my understanding -- you'd probably need to Jeffrey> talk to Mark -- I was under the impression that we always Jeffrey> did function at a time stuff for C++. Jeff's understanding is correct. -- Mark Mitchell mark@codesourcery.com CodeSourcery, LLC http://www.codesourcery.com ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-27 12:30 ` Jeffrey A Law 2000-03-27 12:45 ` Mark Mitchell @ 2000-03-27 13:05 ` Greg McGary 2000-03-27 13:54 ` Geoff Keating 2000-03-29 12:22 ` Jeffrey A Law 1 sibling, 2 replies; 38+ messages in thread From: Greg McGary @ 2000-03-27 13:05 UTC (permalink / raw) To: law; +Cc: Joern Rennecke, gcc Jeffrey A Law <law@cygnus.com> writes: > > Perhaps I could do that in a limited way, by adding a BP-specific > > optimization pass after loop optimization that eliminates redundant > > checks and expands "check_bounds" into primitive RTL for targets that > > don't HAVE_check_bounds. Could that pass muster? > > Possibly. However, I'm not a big fan of having feature specific > optimization passes, even if we've had some in the past (addressof & > constant_p). > > I'd be much happier if we could find a clean way to get the optimizations > you need using our existing optimizers. We can do without a special pass. CSE seems the best place to eliminate redundant checks, so we'd just need a little bit of code to handle the new check_bounds_rtx. We can just forget about doing default translation of check_bounds_rtx into primitive RTL, and require that targets supporting bounds checking handle it in the machine description. So far, all of the targets I want to support initially (i960, PowerPC, i386, MIPS) will benefit from special handling in the MD file anyway. Does that sound reasonable? > > It's a fine thought, but I can't do that today, can I? I only see > > whole function mode being used for C++ inlines. > > That's not my understanding -- you'd probably need to talk to Mark -- I > was under the impression that we always did function at a time stuff > for C++. You're right. I was careless in my initial reading of the condition surrounding the setting of cfun->x_whole_function_mode_p. Even so, this doesn't help me for C or ObjC, unless those will be converted soon. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-27 13:05 ` Greg McGary @ 2000-03-27 13:54 ` Geoff Keating 2000-03-27 14:21 ` Greg McGary 2000-03-29 12:22 ` Jeffrey A Law 1 sibling, 1 reply; 38+ messages in thread From: Geoff Keating @ 2000-03-27 13:54 UTC (permalink / raw) To: Greg McGary; +Cc: gcc Greg McGary <gkm@eng.ascend.com> writes: > We can do without a special pass. CSE seems the best place to > eliminate redundant checks, so we'd just need a little bit of code to > handle the new check_bounds_rtx. We can just forget about doing > default translation of check_bounds_rtx into primitive RTL, and > require that targets supporting bounds checking handle it in the > machine description. So far, all of the targets I want to support > initially (i960, PowerPC, i386, MIPS) will benefit from special > handling in the MD file anyway. > > Does that sound reasonable? Why would these targets need special handling in the MD file? I can't think of any support on ppc for this that isn't already represented in the machine description through the TRAP RTXs. -- - Geoffrey Keating <geoffk@cygnus.com> ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-27 13:54 ` Geoff Keating @ 2000-03-27 14:21 ` Greg McGary 2000-03-27 14:30 ` Jeffrey A Law ` (2 more replies) 0 siblings, 3 replies; 38+ messages in thread From: Greg McGary @ 2000-03-27 14:21 UTC (permalink / raw) To: Geoff Keating; +Cc: gcc Geoff Keating <geoffk@cygnus.com> writes: > Greg McGary <gkm@eng.ascend.com> writes: > > > We can do without a special pass. CSE seems the best place to > > eliminate redundant checks, so we'd just need a little bit of code to > > handle the new check_bounds_rtx. We can just forget about doing > > default translation of check_bounds_rtx into primitive RTL, and > > require that targets supporting bounds checking handle it in the > > machine description. So far, all of the targets I want to support > > initially (i960, PowerPC, i386, MIPS) will benefit from special > > handling in the MD file anyway. > > > > Does that sound reasonable? > > Why would these targets need special handling in the MD file? I didn't say "need", I said "benefit from". For i960, this sequence is optimal: cmpo ptr, bas concmp ext, ptr faultle.f For i386, it makes sense the use the `bound' instruction. For MIPS, I'll want to generate a `break' instruction. For PowerPC, I'll want to use conditional trap instructions. > I can't think of any support on ppc for this that isn't already > represented in the machine description through the TRAP RTXs. Conditional traps aren't widely available (only i960, m68k, powerpc, sparc), so that approach doesn't help all that much for portability. Is there something more that I'm missing? ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-27 14:21 ` Greg McGary @ 2000-03-27 14:30 ` Jeffrey A Law 2000-03-27 19:23 ` Michael Hayes 2000-03-27 14:34 ` Geoff Keating 2000-03-27 22:07 ` Greg McGary 2 siblings, 1 reply; 38+ messages in thread From: Jeffrey A Law @ 2000-03-27 14:30 UTC (permalink / raw) To: Greg McGary; +Cc: Geoff Keating, gcc In message < ms8zz4rr51.fsf@gkm-dsl-194.ascend.com >you write: > > I can't think of any support on ppc for this that isn't already > > represented in the machine description through the TRAP RTXs. > > Conditional traps aren't widely available (only i960, m68k, powerpc, > sparc), so that approach doesn't help all that much for portability. > Is there something more that I'm missing? The PA has conditional traps. We've just never bothered adding them the MD file. jeff ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-27 14:30 ` Jeffrey A Law @ 2000-03-27 19:23 ` Michael Hayes 0 siblings, 0 replies; 38+ messages in thread From: Michael Hayes @ 2000-03-27 19:23 UTC (permalink / raw) To: law; +Cc: Greg McGary, Geoff Keating, gcc Jeffrey A Law writes: > > In message < ms8zz4rr51.fsf@gkm-dsl-194.ascend.com >you write: > > > I can't think of any support on ppc for this that isn't already > > > represented in the machine description through the TRAP RTXs. > > > > Conditional traps aren't widely available (only i960, m68k, powerpc, > > sparc), so that approach doesn't help all that much for portability. > > Is there something more that I'm missing? > The PA has conditional traps. We've just never bothered adding them > the MD file. Ditto C3x/C4x. Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-27 14:21 ` Greg McGary 2000-03-27 14:30 ` Jeffrey A Law @ 2000-03-27 14:34 ` Geoff Keating 2000-03-27 22:07 ` Greg McGary 2 siblings, 0 replies; 38+ messages in thread From: Geoff Keating @ 2000-03-27 14:34 UTC (permalink / raw) To: gkm; +Cc: gcc > Cc: gcc@gcc.gnu.org > From: Greg McGary <gkm@eng.ascend.com> > Date: 27 Mar 2000 15:21:14 -0700 > For i960, this sequence is optimal: > > cmpo ptr, bas > concmp ext, ptr > faultle.f > > For i386, it makes sense the use the `bound' instruction. > > For MIPS, I'll want to generate a `break' instruction. > > For PowerPC, I'll want to use conditional trap instructions. > > > I can't think of any support on ppc for this that isn't already > > represented in the machine description through the TRAP RTXs. > > Conditional traps aren't widely available (only i960, m68k, powerpc, > sparc), so that approach doesn't help all that much for portability. > Is there something more that I'm missing? They're not widely available because no-one has bothered to implement them in the MD files for other targets. The hardware support is available on many platforms, including i386. For those platforms that don't have conditional traps, then presumably there is some way to generate an unconditional trap---if there is no guaranteed illegal instruction, and reading from location 0 won't do it, you can always branch to abort(). The 'break' instruction is, I believe, an unconditional trap for MIPS. If only unconditional traps are available, conditional traps are handled with branches, as you'd expect. More importantly, the trap infrastructure is used in a number of other features in the compiler, so by implementing it on an architecture, you get a bunch of stuff for free. -- - Geoffrey Keating <geoffk@cygnus.com> ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-27 14:21 ` Greg McGary 2000-03-27 14:30 ` Jeffrey A Law 2000-03-27 14:34 ` Geoff Keating @ 2000-03-27 22:07 ` Greg McGary 2000-03-28 1:55 ` Richard Henderson 2000-03-28 7:05 ` Jeffrey A Law 2 siblings, 2 replies; 38+ messages in thread From: Greg McGary @ 2000-03-27 22:07 UTC (permalink / raw) To: Geoff Keating; +Cc: gcc Greg McGary <gkm@eng.ascend.com> writes: > For i386, it makes sense the use the `bound' instruction. I should qualify that: `bound' is optimal for space, but it might run faster to decompose into register compares. I haven't analyzed it yet. > For MIPS, I'll want to generate a `break' instruction. I now see that since ISA II, mips has had conditional trap insns, so we should use those. > > I can't think of any support on ppc for this that isn't already > > represented in the machine description through the TRAP RTXs. > > Conditional traps aren't widely available (only i960, m68k, powerpc, > sparc), so that approach doesn't help all that much for portability. > Is there something more that I'm missing? Add PA, C[34]x and MIPS to that list. Others can be simulated with conditional branches around unconditional traps. We still need CSE to optimize away redundant conditional traps. An advantage to emitting separate conditional traps for low and high bounds checks is that the separate lower-bound check can be more easily be moved out of a loop for monotonically increasing pointer livs. Greg ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-27 22:07 ` Greg McGary @ 2000-03-28 1:55 ` Richard Henderson 2000-03-28 7:05 ` Jeffrey A Law 1 sibling, 0 replies; 38+ messages in thread From: Richard Henderson @ 2000-03-28 1:55 UTC (permalink / raw) To: Greg McGary; +Cc: Geoff Keating, gcc On Mon, Mar 27, 2000 at 11:07:11PM -0700, Greg McGary wrote: > > For i386, it makes sense the use the `bound' instruction. > > I should qualify that: `bound' is optimal for space, but > it might run faster to decompose into register compares. `bound' is a horribly slow instruction on all processors. > Add PA, C[34]x and MIPS to that list. Others can be simulated with > conditional branches around unconditional traps. Alpha has a particular sequence that's supposed to be used to specifically report out of bounds conditions: ldi $16, GEN_SUBRNG call_pal gentrap r~ ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-27 22:07 ` Greg McGary 2000-03-28 1:55 ` Richard Henderson @ 2000-03-28 7:05 ` Jeffrey A Law 2000-03-28 9:28 ` Greg McGary 2000-03-28 9:57 ` Joern Rennecke 1 sibling, 2 replies; 38+ messages in thread From: Jeffrey A Law @ 2000-03-28 7:05 UTC (permalink / raw) To: Greg McGary; +Cc: Geoff Keating, gcc In message < ms3dpbr5kg.fsf@gkm-dsl-194.ascend.com >you write: > optimize away redundant conditional traps. An advantage to emitting > separate conditional traps for low and high bounds checks is that the > separate lower-bound check can be more easily be moved out of a loop > for monotonically increasing pointer livs. I hadn't thought of that. It assumes you can't overflow, but that's probably a reasonable simplification for pointer operations. Is there any advantage to not always emitting the high/low bounds checks separately? jeff ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-28 7:05 ` Jeffrey A Law @ 2000-03-28 9:28 ` Greg McGary 2000-03-28 9:48 ` Jeffrey A Law 2000-03-28 9:57 ` Joern Rennecke 1 sibling, 1 reply; 38+ messages in thread From: Greg McGary @ 2000-03-28 9:28 UTC (permalink / raw) To: law; +Cc: Geoff Keating, gcc Jeffrey A Law <law@cygnus.com> writes: > Is there any advantage to not always emitting the high/low bounds checks > separately? For i960 there's a small advantage that if you emit checks together, you can use concmp and save one instruction ("cmpo; concmpo; fault" vs. "cmpo; fault; cmpo; fault"). Big deal... For i386, if you check both at once, you can use the `bound' instruction, which is slower, but very compact. In summary, there will be some small space advantage for any target that has special instructions that support bounds checking. For multiple-issue CPUs, the space advantage will come at the expense of runtime, because the checks would likely run faster if they were broken apart to improve scheduling. It should be easy enough to implement each and benchmark to quantify the difference, or even to implement both, and choose at compile time based on the value of `optimize_size'. Greg ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-28 9:28 ` Greg McGary @ 2000-03-28 9:48 ` Jeffrey A Law 2000-03-28 11:30 ` Geoff Keating 0 siblings, 1 reply; 38+ messages in thread From: Jeffrey A Law @ 2000-03-28 9:48 UTC (permalink / raw) To: Greg McGary; +Cc: Geoff Keating, gcc In message < msem8vovhj.fsf@gkm-dsl-194.ascend.com >you write: > Jeffrey A Law <law@cygnus.com> writes: > > > Is there any advantage to not always emitting the high/low bounds checks > > separately? > > For i960 there's a small advantage that if you emit checks together, > you can use concmp and save one instruction ("cmpo; concmpo; fault" > vs. "cmpo; fault; cmpo; fault"). Big deal... That indicates to me that we have 3 primitives. 1. check high bounds 2. check low bounds 3. check high and low bounds When we generate code we should first check for the availability of #3, then fall back to #1 & #2, then fall back to whatever scheme we can devise using unconditional break/trap instructions or abort calls. > For i386, if you check both at once, you can use the `bound' > instruction, which is slower, but very compact. That's easily handled by making the check high & low bounds instruction be conditional on optimize_size and provide additional patterns to theck the high bound and another pattern to check the low bound. > In summary, there will be some small space advantage for any target > that has special instructions that support bounds checking. For > multiple-issue CPUs, the space advantage will come at the expense of > runtime, because the checks would likely run faster if they were > broken apart to improve scheduling. It should be easy enough to > implement each and benchmark to quantify the difference, or even to > implement both, and choose at compile time based on the value of > `optimize_size'. Well, I'm not 100% you'll see any significant optimization advantage due to better scheduling -- it largely depends on how the compiler treats the bounds checking instructions. For example, if the compiler treats them as ending the current basic block, then that's going to inhibit a lot of optimizations. However, that might be necessary to prevent the compiler from over-optimizing in the presence of bounds checking instructions. I don't really know. jeff ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-28 9:48 ` Jeffrey A Law @ 2000-03-28 11:30 ` Geoff Keating 2000-03-28 12:26 ` Greg McGary 2000-03-28 14:21 ` Greg McGary 0 siblings, 2 replies; 38+ messages in thread From: Geoff Keating @ 2000-03-28 11:30 UTC (permalink / raw) To: law; +Cc: gkm, gcc > cc: Geoff Keating <geoffk@cygnus.com>, gcc@gcc.gnu.org > Reply-To: law@cygnus.com > Date: Tue, 28 Mar 2000 10:40:52 -0700 > From: Jeffrey A Law <law@cygnus.com> > > > In message < msem8vovhj.fsf@gkm-dsl-194.ascend.com >you write: > > Jeffrey A Law <law@cygnus.com> writes: > > > > > Is there any advantage to not always emitting the high/low bounds checks > > > separately? > > > > For i960 there's a small advantage that if you emit checks together, > > you can use concmp and save one instruction ("cmpo; concmpo; fault" > > vs. "cmpo; fault; cmpo; fault"). Big deal... > That indicates to me that we have 3 primitives. > > 1. check high bounds > 2. check low bounds > 3. check high and low bounds > > When we generate code we should first check for the availability of #3, then > fall back to #1 & #2, then fall back to whatever scheme we can devise using > unconditional break/trap instructions or abort calls. Note that you can do (3) with a subtraction and a single conditional trap instruction---eg. if you want to check whether 'r3' is between 4 and 10, you can do tmp = r3 - 4 if ((unsigned)r3 > 6) trap; or tmp = r3 - 4 + MIN_INT; if ((int)r3 > MIN_INT + 6) trap; depending on what's available. The machine-independent code should be able to work this out. Of course, this only makes sense if a conditional trap is slower than a subtraction, which is not the case on at least ppc (they are both one cycle each). If there are no conditional traps, probably gcc should already be trying to do this for conditional branches. > > For i386, if you check both at once, you can use the `bound' > > instruction, which is slower, but very compact. > That's easily handled by making the check high & low bounds instruction > be conditional on optimize_size and provide additional patterns to theck > the high bound and another pattern to check the low bound. It certainly sounds like a good idea on x86 to have the use of the 'bound' insn conditional on optimize_size. > For example, if the compiler treats them as ending the current basic block, > then that's going to inhibit a lot of optimizations. However, that might > be necessary to prevent the compiler from over-optimizing in the presence > of bounds checking instructions. I don't really know. They shouldn't end a basic block. So long as the compiler knows that 'trap' RTXs are memory barriers, I think it can optimise them just like any other insn; and I think that's what it now does. -- - Geoffrey Keating <geoffk@cygnus.com> ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-28 11:30 ` Geoff Keating @ 2000-03-28 12:26 ` Greg McGary 2000-03-28 12:30 ` Geoff Keating 2000-03-28 14:21 ` Greg McGary 1 sibling, 1 reply; 38+ messages in thread From: Greg McGary @ 2000-03-28 12:26 UTC (permalink / raw) To: Geoff Keating; +Cc: law, gcc Geoff Keating <geoffk@cygnus.com> writes: > They shouldn't end a basic block. So long as the compiler knows that > 'trap' RTXs are memory barriers, I think it can optimise them just > like any other insn; and I think that's what it now does. Please expand on this. I'm not sure what you mean here. My only knowledge of barriers is what's in the gcc manual, namely that they mark places where control cannot flow past. However, this is not the case with bounds checks. Most often, control flows past the checks because the trap condition is false. Even when the trap condition is true, the OS should have the option of logging the failure then resuming execution rather than simply terminating. Greg ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-28 12:26 ` Greg McGary @ 2000-03-28 12:30 ` Geoff Keating 2000-03-28 12:59 ` Greg McGary 0 siblings, 1 reply; 38+ messages in thread From: Geoff Keating @ 2000-03-28 12:30 UTC (permalink / raw) To: gkm; +Cc: law, gcc > Cc: law@cygnus.com, gcc@gcc.gnu.org > From: Greg McGary <gkm@eng.ascend.com> > Date: 28 Mar 2000 13:25:51 -0700 > > Geoff Keating <geoffk@cygnus.com> writes: > > > They shouldn't end a basic block. So long as the compiler knows that > > 'trap' RTXs are memory barriers, I think it can optimise them just > > like any other insn; and I think that's what it now does. > > Please expand on this. I'm not sure what you mean here. My only > knowledge of barriers is what's in the gcc manual, namely that they > mark places where control cannot flow past. However, this is not the > case with bounds checks. Most often, control flows past the checks > because the trap condition is false. Even when the trap condition is > true, the OS should have the option of logging the failure then > resuming execution rather than simply terminating. By 'memory barrier', I mean an instruction over which it is not safe to move memory accesses. It would be annoying if in the code sequence if (i > 10) trap; a[i] = 3; the store to the array was moved before the trap. -- - Geoffrey Keating <geoffk@cygnus.com> ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-28 12:30 ` Geoff Keating @ 2000-03-28 12:59 ` Greg McGary 2000-03-28 13:12 ` Greg McGary ` (2 more replies) 0 siblings, 3 replies; 38+ messages in thread From: Greg McGary @ 2000-03-28 12:59 UTC (permalink / raw) To: Geoff Keating; +Cc: law, gcc Geoff Keating <geoffk@cygnus.com> writes: > By 'memory barrier', I mean an instruction over which it is not safe > to move memory accesses. It would be annoying if in the code sequence > > if (i > 10) trap; > a[i] = 3; > > the store to the array was moved before the trap. I'm not sure I agree that it's annoying to reorder bounds checks & memory accesses. What matters is that we get the trap for the bounds violation, and that the bounds-violating code is identifiable from the trap. The order in which the check & associated memory reference occurs is only a concern if you're going to somehow try and prevent or alter the memory access on the basis of having gotten the trap. That's way beyond what we're trying to do here. If the memory access gives us a SEGV, it doesn't matter whether or not the trap came first--we still found the bug. IF the memory access doesn't SEGV, the trap will uncover the bug regardless of whether it comes before or after the access. Therefore, if the memory-barrier behavior for bounds checks inhibits optimization in any meaningful then I think we should drop the barrier semantics--it's not worth the cost. Is there some other advantage to memory-barrier semantics wrt. bounds checks that I'm overlooking? Greg ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-28 12:59 ` Greg McGary @ 2000-03-28 13:12 ` Greg McGary 2000-03-29 10:17 ` Joe Buck 2000-03-28 13:41 ` Alan Lehotsky 2001-09-04 23:52 ` Tom Tromey 2 siblings, 1 reply; 38+ messages in thread From: Greg McGary @ 2000-03-28 13:12 UTC (permalink / raw) To: Geoff Keating; +Cc: law, gcc Greg McGary <gkm@eng.ascend.com> writes: > Is there some other advantage to memory-barrier semantics wrt. bounds > checks that I'm overlooking? I just thought of one: in a flat-memory-model system (say an embedded system), it might useful to have a policy whereby a task is terminated immediately upon detection of a bounds violation, and it's desirable to terminate it before it has a chance to scribble on some other task's memory. OTOH, this is tricky business: you need some means of releasing resources locked by the newly terminated task and rolling back any partially completed work. It sounds like a ton of work to implement correctly, so the benefit of memory-barrier semantics is not easily realized. Does anyone know what optimizations barrier semantics will inhibit? Instruction scheduling and code motion in the loop optimizer are the two that spring to mind. Greg ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-28 13:12 ` Greg McGary @ 2000-03-29 10:17 ` Joe Buck 0 siblings, 0 replies; 38+ messages in thread From: Joe Buck @ 2000-03-29 10:17 UTC (permalink / raw) To: Greg McGary; +Cc: Geoff Keating, law, gcc Greg McGary <gkm@eng.ascend.com> writes: > > > Is there some other advantage to memory-barrier semantics wrt. bounds > > checks that I'm overlooking? > > I just thought of one: in a flat-memory-model system (say an embedded > system), it might useful to have a policy whereby a task is terminated > immediately upon detection of a bounds violation, and it's desirable > to terminate it before it has a chance to scribble on some other > task's memory. OTOH, this is tricky business: you need some means of > releasing resources locked by the newly terminated task and rolling > back any partially completed work. It sounds like a ton of work to > implement correctly, so the benefit of memory-barrier semantics is not > easily realized. If the function that terminates the task is supplied by the RTOS (or can be overriden), then the RTOS can take care of releasing resources owned by the task that were provided by the RTOS (e.g. semaphores, allocated memory segments, reserved ports) when the task is killed. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-28 12:59 ` Greg McGary 2000-03-28 13:12 ` Greg McGary @ 2000-03-28 13:41 ` Alan Lehotsky 2000-03-28 14:25 ` Greg McGary 2001-09-04 23:52 ` Tom Tromey 2 siblings, 1 reply; 38+ messages in thread From: Alan Lehotsky @ 2000-03-28 13:41 UTC (permalink / raw) To: gkm; +Cc: geoffk, law, gcc One advantage of checking before stomping is that if stomping memory destroys your stack, it is darn hard to debug the problem... -- AL ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-28 13:41 ` Alan Lehotsky @ 2000-03-28 14:25 ` Greg McGary 0 siblings, 0 replies; 38+ messages in thread From: Greg McGary @ 2000-03-28 14:25 UTC (permalink / raw) To: Alan Lehotsky; +Cc: geoffk, law, gcc Alan Lehotsky <lehotsky@sunspot.tiac.net> writes: > One advantage of checking before stomping is that if stomping > memory destroys your stack, it is darn hard to debug the problem... Excellent point. I think the thing to do is implement both alternatives, benchmark and see if there's any compelling reason to relax barrier semantics. If the performance difference is negligible, then always use barrier semantics. If the performance difference is significant, then make it optional. Some might wish to pay the runtime penalty for a guarantee that bounds violations won't stomp the stack. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-28 12:59 ` Greg McGary 2000-03-28 13:12 ` Greg McGary 2000-03-28 13:41 ` Alan Lehotsky @ 2001-09-04 23:52 ` Tom Tromey 2 siblings, 0 replies; 38+ messages in thread From: Tom Tromey @ 2001-09-04 23:52 UTC (permalink / raw) To: gcc >>>>> "Greg" == Greg McGary <gkm@eng.ascend.com> writes: Greg> Is there some other advantage to memory-barrier semantics Greg> wrt. bounds checks that I'm overlooking? You once mentioned the idea of changing gcj to use your bounds checking code. Java has strict rules about when the exception must be thrown. (Whether it is feasible to change the gcj front end to use this stuff, I don't know.) Tom ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-28 11:30 ` Geoff Keating 2000-03-28 12:26 ` Greg McGary @ 2000-03-28 14:21 ` Greg McGary 1 sibling, 0 replies; 38+ messages in thread From: Greg McGary @ 2000-03-28 14:21 UTC (permalink / raw) To: Geoff Keating; +Cc: law, gcc Geoff Keating <geoffk@cygnus.com> writes: > Note that you can do (3) with a subtraction and a single conditional > trap instruction---eg. if you want to check whether 'r3' is between 4 > and 10, you can do > > tmp = r3 - 4 > if ((unsigned)r3 > 6) trap; you meant: if ((unsigned)tmp > 6) trap; > or > > tmp = r3 - 4 + MIN_INT; > if ((int)r3 > MIN_INT + 6) trap; you meant: if ((int)tmp > MIN_INT + 6) trap; > depending on what's available. The machine-independent code should be > able to work this out. In practice, you can't do the 10-4 subtraction at compile time. You need two runtime subtractions (rV = value reg, rB = base reg, rX = extent reg): rB = rV - rB rX = rX - rB trap_if rB > rX > Of course, this only makes sense if a conditional trap is slower than > a subtraction, For an untaken conditional trap that is 2x slower than a subtraction, the times are even for 2 traps vs. 2 subtractions + 1 trap, with the advantage to the 2 traps because it doesn't clobber rB. Untaken conditional traps would need to be at least 3x slower to justify doing the subtraction trick. I'd bet dollars to donuts that there are is no RISC that has untaken conditional traps that take 3+ clocks. > ... If there are no conditional traps, probably > gcc should already be trying to do this for conditional branches. I don't think the subtractions win for ix86, which is the only CISC having no conditional traps that most people care about. In order to do the extent-base subtraction, you have to load the extent into a register which you would not need to do otherwise. Also, you need to do the value-base subtraction in a temporary, in order to preserve rV. With the subtraction approach you need to do this (in pseudo asm; rV, rX and rT are registers holding value, extent and temp; `base' and `extent' are references to those quantities in memory): load extent to rX subtract base from rX copy rV to rT subtract base from rT compare rT and rX branch to 0f if <= trap 0: With integrated simple compares & conditional branches: compare rV with base branch to 0f if >= compare rV with extent branch to 1f if < 0: trap 1: With independent compares & conditional branches: compare rV with base branch to 0f if >= trap 0: ... compare rV with extent branch to 1f if < trap 1: Both of the simpler compare/branch sequences are as good as or better than the subtractions for time & space, and they consume fewer registers. Greg ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-28 7:05 ` Jeffrey A Law 2000-03-28 9:28 ` Greg McGary @ 2000-03-28 9:57 ` Joern Rennecke 1 sibling, 0 replies; 38+ messages in thread From: Joern Rennecke @ 2000-03-28 9:57 UTC (permalink / raw) To: law; +Cc: Greg McGary, Geoff Keating, gcc > I hadn't thought of that. It assumes you can't overflow, but that's > probably a reasonable simplification for pointer operations. If bounds checking is wanted, I don't think it is reasonable to assume that there can't be an overflow - after all, the point of bounds checking is to detect bugs. However, in most cases we will be able to deduct that there can't be an overflow. That is: - if a bounds check is done in every iteration, and - if sum of the upper bound (rounded down for alignment if a STRICT_ALIGNMENT memory access is done at every iteration), plus the increment (or the upper bound of the value range of the increment if variable, but in a known range) doesn't overflow. > > Is there any advantage to not always emitting the high/low bounds checks > separately? Yes, a combined check is simpler. Even if you don't have a specific instruction for a bounds check, you can subtract the lower bound, then compare the upper bound minus the lower bound unsigned to the subtraction result. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-27 13:05 ` Greg McGary 2000-03-27 13:54 ` Geoff Keating @ 2000-03-29 12:22 ` Jeffrey A Law 2000-03-29 13:35 ` Geoff Keating 2000-04-07 9:57 ` Greg McGary 1 sibling, 2 replies; 38+ messages in thread From: Jeffrey A Law @ 2000-03-29 12:22 UTC (permalink / raw) To: Greg McGary; +Cc: Joern Rennecke, gcc In message < msaejkt998.fsf@gkm-dsl-194.ascend.com >you write: > > > > I'd be much happier if we could find a clean way to get the optimizations > > you need using our existing optimizers. > > We can do without a special pass. CSE seems the best place to > eliminate redundant checks, so we'd just need a little bit of code to > handle the new check_bounds_rtx. We can just forget about doing > default translation of check_bounds_rtx into primitive RTL, and > require that targets supporting bounds checking handle it in the > machine description. So far, all of the targets I want to support > initially (i960, PowerPC, i386, MIPS) will benefit from special > handling in the MD file anyway. > > Does that sound reasonable? Seems reasonable. We should probably have some way to automatically warn the user if they try to use bounded pointers on a target which does not have conditional traps, etc. We could probably note if the target supports conditional traps during one of the RTL initializers and warn when we encounter a [un]bounded keyword or attribute during parsing. jeff ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-29 12:22 ` Jeffrey A Law @ 2000-03-29 13:35 ` Geoff Keating 2000-04-07 9:57 ` Greg McGary 1 sibling, 0 replies; 38+ messages in thread From: Geoff Keating @ 2000-03-29 13:35 UTC (permalink / raw) To: gcc Jeffrey A Law <law@cygnus.com> writes: > In message < msaejkt998.fsf@gkm-dsl-194.ascend.com >you write: > > > > > > I'd be much happier if we could find a clean way to get the optimizations > > > you need using our existing optimizers. > > > > We can do without a special pass. CSE seems the best place to > > eliminate redundant checks, so we'd just need a little bit of code to > > handle the new check_bounds_rtx. We can just forget about doing > > default translation of check_bounds_rtx into primitive RTL, and > > require that targets supporting bounds checking handle it in the > > machine description. So far, all of the targets I want to support > > initially (i960, PowerPC, i386, MIPS) will benefit from special > > handling in the MD file anyway. > > > > Does that sound reasonable? > Seems reasonable. We should probably have some way to automatically warn the > user if they try to use bounded pointers on a target which does not have > conditional traps, etc. like this code: case BUILT_IN_TRAP: #ifdef HAVE_trap if (HAVE_trap) emit_insn (gen_trap ()); else #endif error ("__builtin_trap not supported by this target"); emit_barrier (); return const0_rtx; -- - Geoffrey Keating <geoffk@cygnus.com> ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-03-29 12:22 ` Jeffrey A Law 2000-03-29 13:35 ` Geoff Keating @ 2000-04-07 9:57 ` Greg McGary 2000-04-09 11:01 ` Jeffrey A Law 1 sibling, 1 reply; 38+ messages in thread From: Greg McGary @ 2000-04-07 9:57 UTC (permalink / raw) To: law; +Cc: Joern Rennecke, gcc I wrote: > We can do without a special pass. CSE seems the best place to > eliminate redundant checks, so we'd just need a little bit of code to > handle the new check_bounds_rtx. We can just forget about doing > default translation of check_bounds_rtx into primitive RTL, and > require that targets supporting bounds checking handle it in the > machine description. So far, all of the targets I want to support > initially (i960, PowerPC, i386, MIPS) will benefit from special > handling in the MD file anyway. Happy news: this turned out to be less intrusive than I feared it might become. I emit checks as trees like so: ------------------------------------------------------------------------------ value = build_bounded_ptr_value_ref (bp); base = build_bounded_ptr_base_ref (bp); extent = build_bounded_ptr_extent_ref (bp); /* GKM FIXME: fold needs to be enhanced to evaluate expressions of the form (a >= (a + i)) ==> 0, where `a' is an address and `i' is a positive integer. */ compare = fold (build_binary_op (TRUTH_ORIF_EXPR, fold (build_binary_op (LT_EXPR, value, base, 1)), fold (build_binary_op (GE_EXPR, value, extent, 1)), 1)); if (integer_zerop (compare)) return value; else { tree crash = build_compound_expr (tree_cons (NULL_TREE, build_function_call (trap_fndecl, NULL_TREE), tree_cons (NULL_TREE, integer_zero_node, NULL_TREE))); tree maybe_crash = fold (build_binary_op (TRUTH_ANDIF_EXPR, compare, crash, 1)); tree check = fold (build_compound_expr (tree_cons (NULL_TREE, maybe_crash, tree_cons (NULL_TREE, value, NULL_TREE)))); if (integer_onep (compare)) warning ("bounds violation"); return check; } ------------------------------------------------------------------------------ (The integer_zerop check isn't really necessary since fold will do the same thing... Should I bother with such micro-efficiencies?) The jump optimizer already automagically converts conditional jumps around traps into conditional traps, if they're defined for the machine, otherwise they remain as unconditional traps. Conditional traps are much easier to deal with for optimization since they hide branches that would otherwise chop up basic blocks and they are readily identifiable as checks, so I defined pseudo conditional trap insns for ix86, like so: ------------------------------------------------------------------------------ (define_insn "trap" [(trap_if (const_int 1) (const_int 5))] "" "int\\t%0") ;;; ix86 doesn't have conditional traps, but we fake them for the sake ;;; of bounds checking. By emitting bounds checks as conditional ;;; traps rather than as conditional jumps around unconditional traps ;;; we avoid introducing spurious basic-block boundaries and facilitate ;;; elimination of redundant checks. (define_expand "conditional_trap" [(trap_if (match_operator 0 "comparison_operator" [(match_dup 2) (const_int 0)]) (match_operand 1 "const_int_operand" ""))] "" " { enum rtx_code code = GET_CODE (operands[0]); int unordered = (code == EQ || code == NE); emit_insn (gen_rtx_TRAP_IF (VOIDmode, ix86_expand_compare (code, unordered), operands[1])); DONE; }") (define_insn "" [(trap_if (match_operator 0 "comparison_operator" [(reg 17) (const_int 0)]) (match_operand 1 "const_int_operand" ""))] "" "j%c0\\t1f\;int\\t%1\\n1:") ------------------------------------------------------------------------------ (Should I somehow mark conditional traps as being generated automatically by bounds checking, and only optimize those away, or should I just document the as yet undocumented __builtin_trap as subject to this optimization?) It is very easy to eliminate redundant checks at the end of the main loop in combine_instructions: if (GET_CODE (insn) == INSN && GET_CODE (PATTERN (insn)) == TRAP_IF) { for (links = trap_insns; links; links = XEXP (links, 1)) { rtx earlier = XEXP (links, 0); if (rtx_and_links_equal_p (insn, earlier)) { /* This conditional trap is redundant, so delete it after giving its notes to the earlier insn. */ distribute_notes (REG_NOTES (insn), insn, earlier, earlier, NULL_RTX, NULL_RTX); PUT_CODE (insn, NOTE); NOTE_LINE_NUMBER (insn) = NOTE_INSN_DELETED; NOTE_SOURCE_FILE (insn) = 0; break; } } if (GET_CODE (insn) == INSN) /* This conditional trap is unique, so remember it. */ trap_insns = alloc_INSN_LIST (insn, trap_insns); ... /* Compare two insns and all the REG assignments they depend upon for equality. We use this in order to determine if two conditional traps are identical, so that we may eliminate redundancies. */ static int rtx_and_links_equal_p (insn1, insn2) rtx insn1; rtx insn2; { if (rtx_equal_p (PATTERN (insn1), PATTERN (insn2))) { rtx links1 = LOG_LINKS (insn1); rtx links2 = LOG_LINKS (insn2); while (links1 && links2) { if (! rtx_and_links_equal_p (XEXP (links1, 0), XEXP (links2, 0))) return 0; links1 = XEXP (links1, 1); links2 = XEXP (links2, 1); } return 1; } return 0; } ------------------------------------------------------------------------------ I can now build runnable GNU libc-2.1 + most of GNU fileutils & textutils with -fbounded-pointers. I even found a bounds-violation bug in GNU pr today. Greg ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-04-07 9:57 ` Greg McGary @ 2000-04-09 11:01 ` Jeffrey A Law 2000-04-09 11:38 ` Greg McGary 2000-04-09 16:26 ` Greg McGary 0 siblings, 2 replies; 38+ messages in thread From: Jeffrey A Law @ 2000-04-09 11:01 UTC (permalink / raw) To: Greg McGary; +Cc: Joern Rennecke, gcc In message < ms7le93l40.fsf@tucson-net-82.eng.ascend.com >you write: > > We can do without a special pass. CSE seems the best place to > > eliminate redundant checks, so we'd just need a little bit of code to > > handle the new check_bounds_rtx. We can just forget about doing > > default translation of check_bounds_rtx into primitive RTL, and > > require that targets supporting bounds checking handle it in the > > machine description. So far, all of the targets I want to support > > initially (i960, PowerPC, i386, MIPS) will benefit from special > > handling in the MD file anyway. > > Happy news: this turned out to be less intrusive than I feared it > might become. Cool. > --------------------------------------------------------------------------- > --- > value = build_bounded_ptr_value_ref (bp); > base = build_bounded_ptr_base_ref (bp); > extent = build_bounded_ptr_extent_ref (bp); > > /* GKM FIXME: fold needs to be enhanced to evaluate expressions of > the form (a >= (a + i)) ==> 0, where `a' is an address and `i' is > a positive integer. */ I'm surprised fold doesn't already do that kind of transformation. There may be obscure issues with signed vs unsigned pointers and such here. Anyway, it would seem relatively straightforward to me to add that class of optimizations to fold. We'd probably need to restrict them to cases where 'a' is a pointer type though. > compare = fold (build_binary_op (TRUTH_ORIF_EXPR, > fold (build_binary_op (LT_EXPR, value, base, > 1)), > fold (build_binary_op (GE_EXPR, value, exten > t, 1)), 1)); > if (integer_zerop (compare)) I wonder if this can be done via a call into the range optimizers. Their job is to optimize range checking. > tree crash = build_compound_expr (tree_cons (NULL_TREE, > build_function_call (trap_fn > decl, NULL_TREE), > tree_cons (NULL_TREE, intege > r_zero_node, > NULL_TREE))); > tree maybe_crash = fold (build_binary_op (TRUTH_ANDIF_EXPR, compare, > crash, 1)); I don't understand this -- it looks like you're expecting the trap function to return a value. Is that correct? Presumably it will be a runtime value, not a compile-time value? > tree check = fold (build_compound_expr (tree_cons (NULL_TREE, maybe_c > rash, > tree_cons (NULL_TREE, > value, > NULL_TREE)) > )); > if (integer_onep (compare)) > warning ("bounds violation"); > return check; Seems to me check will never be integer_onep unless we know trap_fn always returns a nonzero value. > (The integer_zerop check isn't really necessary since fold will do the > same thing... Should I bother with such micro-efficiencies?) I don't follow exactly, possibly because I don't know where you propose to insert this code. > The jump optimizer already automagically converts conditional jumps > around traps into conditional traps, if they're defined for the > machine, otherwise they remain as unconditional traps. Right. > ;;; ix86 doesn't have conditional traps, but we fake them for the sake > ;;; of bounds checking. By emitting bounds checks as conditional > ;;; traps rather than as conditional jumps around unconditional traps > ;;; we avoid introducing spurious basic-block boundaries and facilitate > ;;; elimination of redundant checks. This style may end up being the recommended approach; not really sure. Maybe we just leave it up to the backend folks for each port with a note that this kind of approach might be more efficient. > (Should I somehow mark conditional traps as being generated > automatically by bounds checking, and only optimize those away, or > should I just document the as yet undocumented __builtin_trap as > subject to this optimization?) I don't think so. > It is very easy to eliminate redundant checks at the end of the main > loop in combine_instructions: This looks very compile-time expensive since you have to walk the LOG_LINKs chain all the way back to the top of the dependency chain. Presumably you're trying to delete one conditional trap that is subsumed by an earlier conditional trap? If so, that might be best modeled after an identical optimization we do on jumps. See jump.c jeff ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-04-09 11:01 ` Jeffrey A Law @ 2000-04-09 11:38 ` Greg McGary 2000-04-10 10:13 ` Jeffrey A Law 2000-04-09 16:26 ` Greg McGary 1 sibling, 1 reply; 38+ messages in thread From: Greg McGary @ 2000-04-09 11:38 UTC (permalink / raw) To: law; +Cc: Joern Rennecke, gcc Jeffrey A Law <law@cygnus.com> writes: > > tree crash = build_compound_expr (tree_cons (NULL_TREE, > > build_function_call (trap_fn > > decl, NULL_TREE), > > tree_cons (NULL_TREE, intege > > r_zero_node, > > NULL_TREE))); > > tree maybe_crash = fold (build_binary_op (TRUTH_ANDIF_EXPR, compare, > > crash, 1)); > I don't understand this -- it looks like you're expecting the trap function > to return a value. Is that correct? Presumably it will be a runtime value, > not a compile-time value? The thing is that __builtin_trap doesn't return a value, but the expression I'm building wants a value, so I wrap it in `(__builtin_trap (), 0)'. The whole expansion is this: (((p.value < p.base || p.value >= p.extent) && (__builtin_trap (), 0)), p.value) Without the compound expr around __builtin_trap, it's this: (((p.value < p.base || p.value >= p.extent) && __builtin_trap), p.value) but here gcc complains about the truth_andif_expr not returning a value, even though we ignore the value anyway, since I'm just using the short-circuit behavior of `&&' to evaluate __builtin_trap trap if the predicate is true. The value of the entire check expression wants to be p.value. > > tree check = fold (build_compound_expr (tree_cons (NULL_TREE, maybe_c > > rash, > > tree_cons (NULL_TREE, > > value, > > NULL_TREE)) > > )); > > if (integer_onep (compare)) > > warning ("bounds violation"); > > return check; > Seems to me check will never be integer_onep unless we know trap_fn always > returns a nonzero value. The variable `compare' represents the bounds tests: (p.value < p.base || p.value >= p.extent) The variable `maybe_crash' represents bounds tests & the conditional trap: ((p.value < p.base || p.value >= p.extent) && (__builtin_trap (), 0)) The variable `check' represents the whole package, which returns the pointer value: (((p.value < p.base || p.value >= p.extent) && __builtin_trap), p.value) Obviously, I should comment better, or choose better variable names. > > (Should I somehow mark conditional traps as being generated > > automatically by bounds checking, and only optimize those away, or > > should I just document the as yet undocumented __builtin_trap as > > subject to this optimization?) > I don't think so. There are two questions here. Which one are you answering? (a) only optimize away redundant trap_if associate with BPs? (b) optimize away all redundant trap_if, and document that behavior for __builtin_trap? (incidentally, __builtin_trap is undocumented) > > It is very easy to eliminate redundant checks at the end of the main > > loop in combine_instructions: > This looks very compile-time expensive since you have to walk the LOG_LINKs > chain all the way back to the top of the dependency chain. I'll instrument it, but from limited observation, I have see that we only walk back one or two level, since pointers are usually not subjected to complicated arithmetic with many subexpressions. > Presumably you're trying to delete one conditional trap that is subsumed by > an earlier conditional trap? If so, that might be best modeled after an > identical optimization we do on jumps. See jump.c OK. Thanks for the tips. Greg ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-04-09 11:38 ` Greg McGary @ 2000-04-10 10:13 ` Jeffrey A Law 0 siblings, 0 replies; 38+ messages in thread From: Jeffrey A Law @ 2000-04-10 10:13 UTC (permalink / raw) To: Greg McGary; +Cc: Joern Rennecke, gcc In message < mszor3unld.fsf@tucson-net-82.eng.ascend.com >you write: > > > (Should I somehow mark conditional traps as being generated > > > automatically by bounds checking, and only optimize those away, or > > > should I just document the as yet undocumented __builtin_trap as > > > subject to this optimization?) > > I don't think so. > > There are two questions here. Which one are you answering? > (a) only optimize away redundant trap_if associate with BPs? > (b) optimize away all redundant trap_if, and document that behavior > for __builtin_trap? (incidentally, __builtin_trap is undocumented) I think we should always remove redundant trap_ifs that we find, regardless of whether they're created by bounded-pointers or some other mechanism. Documenting this behavior (along with __builtin_trap) would be wise. jeff ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-04-09 11:01 ` Jeffrey A Law 2000-04-09 11:38 ` Greg McGary @ 2000-04-09 16:26 ` Greg McGary 2000-04-10 10:20 ` Jeffrey A Law 1 sibling, 1 reply; 38+ messages in thread From: Greg McGary @ 2000-04-09 16:26 UTC (permalink / raw) To: law; +Cc: Joern Rennecke, gcc Jeffrey A Law <law@cygnus.com> writes: > > It is very easy to eliminate redundant checks at the end of the main > > loop in combine_instructions: > This looks very compile-time expensive since you have to walk the LOG_LINKs > chain all the way back to the top of the dependency chain. I'll look for a cheaper way to do this. > Presumably you're trying to delete one conditional trap that is subsumed by > an earlier conditional trap? I'm not sure what you mean by "subsumed" in this context. What I'm trying to optimize is this case: if (x) p->a = 1; p->b = 2; p->c = 3; p->d = 4; p++; p->e = 5; p->f = 6; p->g = 7; p->h = 8; Without optimization, the bounds of `p' are checked eight times, once per `->' operator. The necessary checks are at `p->a = 1', `p->b = 2', `p->e = 5'; the rest are redundant and should be deleted. p->a is separated from the others by a BB boundary, so there's nothing special here. p->b .. p->h are in the same BB, but p->e .. p->h have LOG_LINKS that point to the increment of p.value, and so differ from p->a .. p->d. > If so, that might be best modeled after an > identical optimization we do on jumps. See jump.c Would you kindly give a more specific clue about where to look in jump.c? ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Need advice on bounds checking approaches 2000-04-09 16:26 ` Greg McGary @ 2000-04-10 10:20 ` Jeffrey A Law 0 siblings, 0 replies; 38+ messages in thread From: Jeffrey A Law @ 2000-04-10 10:20 UTC (permalink / raw) To: Greg McGary; +Cc: Joern Rennecke, gcc In message < msr9cevotm.fsf@tucson-net-82.eng.ascend.com >you write: > > Presumably you're trying to delete one conditional trap that is subsumed > by > > an earlier conditional trap? > > I'm not sure what you mean by "subsumed" in this context. What I'm > trying to optimize is this case: > > if (x) > p->a = 1; > p->b = 2; > p->c = 3; > p->d = 4; > p++; > p->e = 5; > p->f = 6; > p->g = 7; > p->h = 8; > > Without optimization, the bounds of `p' are checked eight times, once > per `->' operator. The necessary checks are at `p->a = 1', `p->b = > 2', `p->e = 5'; the rest are redundant and should be deleted. p->a is > separated from the others by a BB boundary, so there's nothing special > here. p->b .. p->h are in the same BB, but p->e .. p->h have LOG_LINKS > that point to the increment of p.value, and so differ from p->a .. p->d. By subsumed I mean a check that is redundant due to an earlier check. > > If so, that might be best modeled after an > > identical optimization we do on jumps. See jump.c > > Would you kindly give a more specific clue about where to look in > jump.c? thread_jumps I believe does something similar to what you need. I also believe CSE does similar things as part of it's follow-jumps/skip-blocks optimizations. jeff ^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2001-09-04 23:52 UTC | newest] Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2000-03-24 11:18 Need advice on bounds checking approaches Greg McGary 2000-03-24 12:08 ` Joern Rennecke 2000-03-24 12:28 ` Greg McGary 2000-03-24 16:18 ` Jeffrey A Law 2000-03-24 16:50 ` Greg McGary 2000-03-24 17:27 ` Jamie Lokier 2000-03-27 12:30 ` Jeffrey A Law 2000-03-27 12:45 ` Mark Mitchell 2000-03-27 13:05 ` Greg McGary 2000-03-27 13:54 ` Geoff Keating 2000-03-27 14:21 ` Greg McGary 2000-03-27 14:30 ` Jeffrey A Law 2000-03-27 19:23 ` Michael Hayes 2000-03-27 14:34 ` Geoff Keating 2000-03-27 22:07 ` Greg McGary 2000-03-28 1:55 ` Richard Henderson 2000-03-28 7:05 ` Jeffrey A Law 2000-03-28 9:28 ` Greg McGary 2000-03-28 9:48 ` Jeffrey A Law 2000-03-28 11:30 ` Geoff Keating 2000-03-28 12:26 ` Greg McGary 2000-03-28 12:30 ` Geoff Keating 2000-03-28 12:59 ` Greg McGary 2000-03-28 13:12 ` Greg McGary 2000-03-29 10:17 ` Joe Buck 2000-03-28 13:41 ` Alan Lehotsky 2000-03-28 14:25 ` Greg McGary 2001-09-04 23:52 ` Tom Tromey 2000-03-28 14:21 ` Greg McGary 2000-03-28 9:57 ` Joern Rennecke 2000-03-29 12:22 ` Jeffrey A Law 2000-03-29 13:35 ` Geoff Keating 2000-04-07 9:57 ` Greg McGary 2000-04-09 11:01 ` Jeffrey A Law 2000-04-09 11:38 ` Greg McGary 2000-04-10 10:13 ` Jeffrey A Law 2000-04-09 16:26 ` Greg McGary 2000-04-10 10:20 ` Jeffrey A Law
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).