From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-467280-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 19996 invoked by alias); 18 Nov 2017 10:20:48 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 19985 invoked by uid 89); 18 Nov 2017 10:20:47 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY,KB_WAM_FROM_NAME_SINGLEWORD,RP_MATCHES_RCVD autolearn=no version=3.3.2 spammy=citation
X-HELO: mail2-relais-roc.national.inria.fr
Received: from mail2-relais-roc.national.inria.fr (HELO mail2-relais-roc.national.inria.fr) (192.134.164.83) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sat, 18 Nov 2017 10:20:45 +0000
Received: from afontenayssb-151-1-50-212.w82-121.abo.wanadoo.fr (HELO stedding) ([82.121.91.212])  by mail2-relais-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 18 Nov 2017 11:20:43 +0100
Date: Sat, 18 Nov 2017 10:26:00 -0000
From: Marc Glisse <marc.glisse@inria.fr>
To: Richard Biener <richard.guenther@gmail.com>
cc: Jakub Jelinek <jakub@redhat.com>, GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [RFTesting] New POINTER_DIFF_EXPR
In-Reply-To: <CAFiYyc1Tv-fFYXkFKttDzAGz+aiP3hpEPtjHp46PfZm_w1D+nw@mail.gmail.com>
Message-ID: <alpine.DEB.2.20.1711181055470.4721@stedding.saclay.inria.fr>
References: <alpine.DEB.2.20.1710281133200.6751@stedding.saclay.inria.fr> <alpine.DEB.2.20.1711102046090.6907@stedding.saclay.inria.fr> <CAFiYyc1Tv-fFYXkFKttDzAGz+aiP3hpEPtjHp46PfZm_w1D+nw@mail.gmail.com>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII; format=flowed
X-SW-Source: 2017-11/txt/msg01623.txt.bz2

On Fri, 17 Nov 2017, Richard Biener wrote:

> On Sat, Nov 11, 2017 at 12:44 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>
>> The exact undefined-behavior status should probably be clarified more.
>> First, I'd like to say that POINTER_DIFF_EXPR may only take 2 pointers into
>> the same "object" (like in C/C++), so they differ by at most half the size
>> of the address space, and a-b is not allowed to be the minimum negative
>> value (so I can safely use b-a), and if we compute a-b and b-c, then a and c
>> are in the same object so I can safely compute a-c, etc. Second, we probably
>> need modes without undefined behavior, wrapping with either -fwrapv or a new
>> -fwrapp, or sanitized. For the sanitized version, we could keep using
>> POINTER_DIFF_EXPR and check TYPE_OVERFLOW_SANITIZED on the pointer type as
>> we currently do for integers. For wrapping, either we say that
>> POINTER_DIFF_EXPR has wrapping semantics when that option is in effect, or
>> we do not use POINTER_DIFF_EXPR and instead cast to unsigned before applying
>> a MINUS_EXPR (my preference).
>
> CCing C/C++ FE folks as well.
>
> +      /* It is better if the caller provides the return type.  */
> +      if (code == POINTER_DIFF_EXPR)
> +       {
> +         offset_int res = wi::sub (wi::to_offset (arg1),
> +                                   wi::to_offset (arg2));
> +         return force_fit_type (signed_type_for (TREE_TYPE (arg1)), res, 1,
> +                                TREE_OVERFLOW (arg1) | TREE_OVERFLOW (arg2));
> +       }
> +
>
> there's an overload that provides the type... (we should probably go
> over all callers
> and use that).  There is already a folding that is only done when the type is
> available.  So I'd simply remove the above from the non-type overload handling.
> What breaks then?

I haven't checked yet. On the other hand, now that I require the same 
precision for pointers and their difference, using signed_type_for should 
be safe enough (only issue is if a front-end is unhappy that we changed 
the exact type for another 'equivalent' one). But I'll try removing the 
version with signed_type_for.

> With the patch it's of course somewhat ugly in that we need to deal with both
> MINUS_EXPR on pointers and POINTER_DIFF_EXPR - at least that's what
>
> +    case POINTER_DIFF_EXPR:
>     case MINUS_EXPR:
> +      /* Fold &a[i] - &a[j] to i-j.  */
> +      if (TREE_CODE (arg0) == ADDR_EXPR
> +         && TREE_CODE (TREE_OPERAND (arg0, 0)) == ARRAY_REF
> +         && TREE_CODE (arg1) == ADDR_EXPR
> +         && TREE_CODE (TREE_OPERAND (arg1, 0)) == ARRAY_REF)
> +        {
> +         tree tem = fold_addr_of_array_ref_difference (loc, type,
> +                                                       TREE_OPERAND (arg0, 0),
> +                                                       TREE_OPERAND (arg1, 0),
> +                                                       code
> +                                                       == POINTER_DIFF_EXPR);
>
> suggests.  But is that really so?  The FEs today should never build
> a MINUS_EXPR with pointer operands and your patch also doesn't.
> I suppose we only arrive above because STRIP_NOPS strips the
> pointer to integer conversion?

I think so. And since I only patched 2 front-ends, others probably still 
generate casts to integers and MINUS_EXPR and can benefit from this 
transformation. Also, if we implement -fwrapp by casting to unsigned and 
doing MINUS_EXPR, that will remain useful.

> +    case POINTER_DIFF_EXPR:
> +      if (!POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (t, 0)))
> +         || !POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (t, 1))))
> +       {
> +         error ("invalid operand to pointer diff, operand is not a pointer");
> +         return t;
> +       }
> +      CHECK_OP (0, "invalid operand to pointer diff");
> +      CHECK_OP (1, "invalid operand to pointer diff");
> +      break;
>
> can you add
>  || TREE_CODE (TREE_TYPE (t)) != INTEGER_TYPE
>  || TYPE_UNSIGNED (TREE_TYPE (t))

I wasn't sure how much to duplicate between here and below. Added.

> ?  Note that if the precision of the pointer type arguments are not equal to the
> precision of the return value then foldings like
>
>   (simplify
> +   (plus:c (pointer_diff @0 @1) (pointer_diff @2 @0))
> +   (if (TYPE_OVERFLOW_UNDEFINED (type)
> +       && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@0)))
> +    (pointer_diff @2 @1)))
>
> might not be correct as we drop a possible truncation here.  Shall we

I simplified the transformations after deciding to require equal 
precision.

> require (and document) that the result of the pointer difference in
> a POINTER_DIFF_EXPR has to be representable in the type of the
> expression, otherwise
> the behavior is undefined?  Shall we allow an unsigned return type for
> differences known to be representable?  Probably not...  Does the behavior
> depend on TYPE_OVERFLOW_UNDEFINED?  It would be nice to amend
> the generic.texi docs somewhat here.

I was more precise in tree.def, copying details over to generic.texi now.
But the overflow behavior is something I was asking about above (see
citation at the beginning of this email). I am tempted to say that overflow
is always undefined for POINTER_DIFF_EXPR and we should use MINUS_EXPR on
unsigned integers when we want wrapping, but if people prefer to use
POINTER_DIFF_EXPR in all cases and change its behavior depending on
TYPE_OVERFLOW_UNDEFINED that would be ok with me. I probably wasn't
consistent in match.pd and will need to tweak a few details once this is
decided.

> Ah, the gimple checking adds some more bits here that clarify things:
>
> +    case POINTER_DIFF_EXPR:
> +      {
> +       if (!POINTER_TYPE_P (rhs1_type)
> +           || !POINTER_TYPE_P (rhs2_type)
> +           || !types_compatible_p (rhs1_type, rhs2_type)
> +           || TREE_CODE (lhs_type) != INTEGER_TYPE
> +           || TYPE_UNSIGNED (lhs_type)
> +           || TYPE_PRECISION (lhs_type) != TYPE_PRECISION (rhs1_type))
> +         {
> +           error ("type mismatch in pointer diff expression");
> +           debug_generic_stmt (lhs_type);
> +           debug_generic_stmt (rhs1_type);
> +           debug_generic_stmt (rhs2_type);
> +           return true;
> +         }
>
>
> I think the patch is ok for trunk, the FE bits should get approval
> from their maintainers.
> Joseph may have an idea about the address-space issue.
>
> With this in place we should be able to fix some of the issues Jakub ran into
> with pointer sanitizing?

Some of them should be fixed without doing anything: pointer differences
are not using MINUS_EXPR anymore in C/C++, so they are not sanitized and
don't produce spurious errors when crossing the middle of the address
space. On the other hand, we may want to sanitize POINTER_DIFF_EXPR
specifically.

> Also with this in place it would be nice to do the corresponding adjustment to
> POINTER_PLUS_EXPR, that is, require a _signed_ offset type that matches
> the precision of the other argument.

Maybe next stage1... Too late for this year.

-- 
Marc Glisse