public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Marc Glisse <marc.glisse@inria.fr>
To: Richard Biener <richard.guenther@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
Date: Sat, 28 Oct 2017 13:04:00 -0000	[thread overview]
Message-ID: <alpine.DEB.2.20.1710192143090.2973@stedding.saclay.inria.fr> (raw)
In-Reply-To: <CAFiYyc0m9RsAeWGA=uPMwqE8CvKpjO8yHykg-Ng+Eq-XC=6LRg@mail.gmail.com>


I am sending the new version of the patch in a separate email, to make it 
more visible, and only replying to a few points here.

On Thu, 19 Oct 2017, Richard Biener wrote:

> On Mon, Oct 9, 2017 at 12:55 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> On Mon, 3 Jul 2017, Richard Biener wrote:
>>
>>> On Sat, 1 Jul 2017, Marc Glisse wrote:
>>>
>>>> I wrote a quick prototype to see what the fallout would look like.
>>>> Surprisingly, it actually passes bootstrap+testsuite on ppc64el with all
>>>> languages with no regression. Sure, it is probably not a complete
>>>> migration, there are likely a few places still converting to ptrdiff_t
>>>> to perform a regular subtraction, but this seems to indicate that the
>>>> work isn't as bad as using a signed type in pointer_plus_expr for
>>>> instance.
>>>
>>>
>>> The fold_binary_loc hunk looks dangerous (it'll generate MINUS_EXPR
>>> from POINTER_MINUS_EXPR in some cases I guess).
>>>
>>> The tree code needs documenting in tree.def and generic.texi.
>>>
>>> Otherwise ok(*).
>>>
>>> Thanks,
>>> Richard.
>>>
>>> (*) ok, just kidding -- or maybe not
>>
>>
>> I updated the prototype a bit. Some things I noticed:
>>
>> - the C front-end has support for address spaces that seems to imply that
>> pointer subtraction (and division by the size) may be done in a type larger
>> than ptrdiff_t. Currently, it generates
>> (ptrdiff_t)(((inttype)q-(inttype)p)/size) for q-p where inttype is some type
>> potentially larger than ptrdiff_t.
>
> It uses a ptrdiff_t corresponding to the respective address space, yes.
> That we use sizetype elsewhere unconditionally is a bug :/
>
>> I am thus generating a POINTER_DIFF_EXPR
>> with that type, while I was originally hoping its type would always be
>> ptrdiff_t. It complicates code and I am not sure I didn't break address
>> spaces anyway... (expansion likely needs to do the inttype dance)
>
> I think that's fine.  Ideally targets would provide a type to use for each
> respective address space given we have targets that have sizetype smaller
> than ptr_mode.
>
>> Are ptrdiff_t (what POINTER_DIFF_EXPR should return) and size_t (what
>> POINTER_PLUS_EXPR takes as second argument) always the same type
>> signed/unsigned?
>
> POINTER_PLUS_EXPR takes 'sizetype', not size_t.

Ah, yes, that's the one I meant...

> So the answer is clearly no.  And yes, that's ugly and broken and should be fixed.
>
>> Counter-examples: m32c (when !TARGET_A16), visium, darwin,
>> powerpcspe, s390, vms... and it isn't even always the same that is bigger
>> than the other. That's quite messy.
>
> Eh.  Yeah, targets are free to choose 'sizetype' and they do so for
> efficiency.  IMHO we should get rid of this "feature".
>
>> - I had to use @@ in match.pd, not for constants, but because in GENERIC we
>> sometimes ended up with pointers where operand_equal_p said yes but
>> types_match disagreed.
>>
>> - It currently regresses 2 go tests: net/http runtime/debug
>
> Those are flaky for me and fail sometimes and sometimes not.

Yes, I noticed that (there are 1 or 2 others in the go testsuite).

> +@item POINTER_DIFF_EXPR
> +This node represents pointer subtraction.  The two operands always
> +have pointer/reference type.  The second operand is always an unsigned
> +integer type compatible with sizetype.  It returns a signed integer.
>
> the 2nd sentence looks bogusly copied.

Oups, removed.

>
> +      /* FIXME.  */
> +      if (code == POINTER_DIFF_EXPR)
> +       return int_const_binop (MINUS_EXPR,
> +                               fold_convert (ptrdiff_type_node, arg1),
> +                               fold_convert (ptrdiff_type_node, arg2));
>
>  wide_int_to_tree (type, wi::to_widest (arg1) - wi::to_widest (arg2));

Before your reply, I wrote something similar using offset_int instead of 
widest_int (and handling overflow, mostly for documentation purposes). I 
wasn't sure which one to pick, I can change to widest_int if you think it 
is better...

> +    case POINTER_DIFF_EXPR:
> +      {
> +       if (!POINTER_TYPE_P (rhs1_type)
> +           || !POINTER_TYPE_P (rhs2_type)
> +           // || !useless_type_conversion_p (rhs2_type, rhs1_type)
>
> types_compatible_p (rhs1_type, rhs2_type)?

Makes sense.

> +           // || !useless_type_conversion_p (ptrdiff_type_node, lhs_type))
> +           || TREE_CODE (lhs_type) != INTEGER_TYPE
> +           || TYPE_UNSIGNED (lhs_type))
> +         {
> +           error ("type mismatch in pointer diff expression");
> +           debug_generic_stmt (lhs_type);
> +           debug_generic_stmt (rhs1_type);
> +           debug_generic_stmt (rhs2_type);
> +           return true;
>
> there's also verify_expr which could want adjustment for newly created
> tree kinds.

ok.

> So if the precision of the result is larger than that of the pointer 
> operands we sign-extend the result, right?  So the subtraction is 
> performed in precision of the pointer operands and then 
> sign-extended/truncated to the result type? Which means it is _not_ a 
> widening subtraction to get around the half-address-space issue.  The 
> tree.def documentation should reflect this semantic detail.  Not sure if 
> the RTL expansion follows it.

I actually have no idea what expansion does if the size is different (that 
was one of my comments). Crashing is not unlikely.

I have changed things a bit. Now POINTER_DIFF_EXPR always has the same 
precision as input and output (so expansion should be fine). And there is 
code in the C/C++ front-ends that uses casts and MINUS_EXPR if we want a 
result wider than pointers, although it might very well be dead code...

My goal was not to help with objects larger than half the address space, 
but to fix things for small objects unlucky enough to straddle the middle 
of the address space. By properly modeling what overflow means for this 
operation, I expect we can add more simplifications in match.pd.

> I think that we'd ideally have a single legal INTEGER_TYPE precision
> per pointer type precision and that those precisions should match...
> we don't have to follow the mistakes of POINTER_PLUS_EXPR.
>
> So ... above verify TYPE_PRECISION (rhs1_type) == TYPE_PRECISION (lhs_type)?
> Some targets have 24bit ptr_mode but no 24bit integer type which means the
> FE likely chooses 32bit int for the difference computation.  But the middle-end
> should be free to create a 24bit precision type with SImode.
>
> Otherwise as said before - go ahead, I think this would be great to
> have for GCC 8.  I'd say
> ask the maintainers of the above list of targets to do some testing.
>
> "Fixing" POINTER_PLUS_EXPR would be very nice as well.  Again, matching
> precision - I'm not sure if we need to force a signed operand, having either
> might be nice given all sizes are usually unsigned.
>
> Thanks and sorry for the delay,
> Richard.

-- 
Marc Glisse

  reply	other threads:[~2017-10-28 12:54 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-19 18:25 [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) Jakub Jelinek
2017-06-20  7:41 ` Richard Biener
2017-06-20  8:14   ` Jakub Jelinek
2017-06-20  8:18     ` Richard Biener
2017-06-21  7:58       ` Jakub Jelinek
2017-06-21  8:04         ` Richard Biener
2017-06-21 14:40       ` [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998)) Jakub Jelinek
2017-06-21 15:17         ` Jakub Jelinek
2017-06-21 16:27         ` Marc Glisse
2017-06-22  8:31           ` Richard Biener
2017-06-22  9:29             ` Marc Glisse
2017-06-22  9:46               ` Richard Biener
2017-07-01 16:41                 ` Marc Glisse
2017-07-03 12:37                   ` Richard Biener
2017-10-09 11:01                     ` Marc Glisse
2017-10-19 15:11                       ` Richard Biener
2017-10-28 13:04                         ` Marc Glisse [this message]
2017-10-28 17:13                           ` Richard Biener
2017-07-04  8:53       ` [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) Jakub Jelinek
2017-06-21  8:00   ` Jakub Jelinek
2017-06-21  8:05     ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.20.1710192143090.2973@stedding.saclay.inria.fr \
    --to=marc.glisse@inria.fr \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).