On 06/16/2011 09:24 AM, Zdenek Dvorak wrote: > Hi, > >> Consider the following example. >> >> extern unsigned int foo (int*) __attribute__((pure)); >> unsigned int >> tr (int array[], int n) >> { >> unsigned int i; >> unsigned int sum = 0; >> for (i = 0; i < n; i++) >> sum += foo (&array[i]); >> return sum; >> } >> >> For 32-bit pointers, the analysis in infer_loop_bounds_from_pointer_arith >> currently concludes that the range of valid &array[i] is &array[0x0] to >> &array[0x3fffffff], meaning 0x40000000 distinct values. >> This implies that i < n is executed at most 0x40000001 times, and i < n >> cannot be eliminated by an 32-bit iterator with step 4, since that one has >> only 0x40000000 distinct values. > > this reasoning seems slightly inaccurate to me: the loop is typically translated > as > > if (n > 0) > { > i = 0; > start: > sum += foo (&array[i]); > i++; > if (i < n) > goto start; > } > > This means that if the array access is performed <= x times, then the exit condition > of the loop is tested <= x times (not counting the copied header). This could be > exploited to fix the problem as well. Indeed, when the header is copied, the elimination is done. But when optimizing for size (which is the case for this PR), the header is not copied and the control flow structure at ivopts is the same as at source level. >> The patch reasons that NULL cannot be used or produced by pointer >> arithmetic, and that we can exclude the possibility of the NULL pointer in the >> range. So the range of valid &array[i] is &array[0] to &array[0x3ffffffe], >> meaning 0x3fffffff distinct values. >> This implies that i < n is executed at most 0x40000000 times and i < n can be >> eliminated. > > Sounds reasonable to me. > >> The patch implements this new limitation by changing the (low, high, step) >> triplet in infer_loop_bounds_from_pointer_arith from (0x0, 0xffffffff, 0x4) >> to (0x4, 0xffffffff, 0x4). > > At least in principle, the variable could have initial value 0x1, so this is a bit > incorrect. (alignment of the memory reference, 0xffffffff, 0x4) should work, though. > That's better indeed, thanks. >> I'm not too happy about the test for C-like language: ptrdiff_type_node != >> NULL_TREE, but I'm not sure how else to test for this. > > The middle-end operations in general follow the C semantics, unless explicitly > specified otherwise (e.g. language-specific alias analysis rules, ...). So, I think > you can drop this test. > Ok, dropped it. Bootstrapped and reg-tested on x86_64. Ok for trunk? Thanks, - Tom