From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-456817-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 125148 invoked by alias); 26 Jun 2017 11:14:41 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 124968 invoked by uid 89); 26 Jun 2017 11:14:32 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.8 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=
X-HELO: mail-wr0-f178.google.com
Received: from mail-wr0-f178.google.com (HELO mail-wr0-f178.google.com) (209.85.128.178) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 26 Jun 2017 11:14:28 +0000
Received: by mail-wr0-f178.google.com with SMTP id c11so142856878wrc.3        for <gcc-patches@gcc.gnu.org>; Mon, 26 Jun 2017 04:14:28 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;        d=1e100.net; s=20161025;        h=x-gm-message-state:from:to:mail-followup-to:cc:subject:references         :date:in-reply-to:message-id:user-agent:mime-version;        bh=yX7BU0T8YfXjA4TspFL4v5IrXBg+TO4GqLlLCu59RZ8=;        b=q+FzZOStqeT1UvPu57wJHjmsKb4otBVw4hw7WoUN48wGsyCi8ovAF8V508o3p4x/m3         VQyv9fjtu9C48wrBcHfScG/5LUd0Pv09uVQRGj2VDXDPwy/FNMV/atW/3jrulFLHpo3h         Xm2LtEbflCESrs9Jr9tMjEJyUSDwDewmrktHT5mO24L+zRV/nGwFUsyw4zpBl9xY+XMW         /wbGIkZiJUyuhgW//qx9bMzGsM2zd9UaX6hHv4q5UBX2UITxFcwD85t6Zn7Rx+48xIP3         PrSvCIp9fYWpKIApMjyS+oF5NjFKcfSITOJeKZToSg0pZdzfJJi2DnhsrLXV7UoIvgaV         eVsw==
X-Gm-Message-State: AKS2vOwzuytWvYEuV8hH05i7f62UeG21K4+QVa6YRn+Gav2rVVA7KYXr	NSoLuCxFYjaYuLX46HLA5A==
X-Received: by 10.223.160.111 with SMTP id l44mr14106371wrl.31.1498475666422;        Mon, 26 Jun 2017 04:14:26 -0700 (PDT)
Received: from localhost ([2.26.27.176])        by smtp.gmail.com with ESMTPSA id 32sm13926554wry.31.2017.06.26.04.14.24        (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256);        Mon, 26 Jun 2017 04:14:25 -0700 (PDT)
From: Richard Sandiford <richard.sandiford@linaro.org>
To: Richard Biener <richard.guenther@gmail.com>
Mail-Followup-To: Richard Biener <richard.guenther@gmail.com>,GCC Patches <gcc-patches@gcc.gnu.org>, richard.sandiford@linaro.org
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: PR81136: ICE from inconsistent DR_MISALIGNMENTs
References: <87k244z2c5.fsf@linaro.org>	<CAFiYyc015ZVKVk0qegFFBhgzkOjSuoC6=iznQzju41uyiW9XtQ@mail.gmail.com>	<8760fmgb90.fsf@linaro.org>	<CAFiYyc0hAx8hFTiMRjM7f=e=B2g2o4=nqM790RThuj8yxngzpQ@mail.gmail.com>
Date: Mon, 26 Jun 2017 11:14:00 -0000
In-Reply-To: <CAFiYyc0hAx8hFTiMRjM7f=e=B2g2o4=nqM790RThuj8yxngzpQ@mail.gmail.com>	(Richard Biener's message of "Mon, 26 Jun 2017 12:25:54 +0200")
Message-ID: <874lv36lww.fsf@linaro.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-SW-Source: 2017-06/txt/msg01912.txt.bz2

Richard Biener <richard.guenther@gmail.com> writes:
> On Fri, Jun 23, 2017 at 2:05 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Thu, Jun 22, 2017 at 1:30 PM, Richard Sandiford
>>> <richard.sandiford@linaro.org> wrote:
>>>> The test case triggered this assert in vect_update_misalignment_for_peel:
>>>>
>>>>           gcc_assert (DR_MISALIGNMENT (dr) / dr_size ==
>>>>                       DR_MISALIGNMENT (dr_peel) / dr_peel_size);
>>>>
>>>> We knew that the two DRs had the same misalignment at runtime, but when
>>>> considered in isolation, one data reference guaranteed a higher compile-time
>>>> base alignment than the other.
>>>>
>>>> In the test case this looks like a missed opportunity.  Both references
>>>> are unconditional, so it should be possible to use the highest of the
>>>> available base alignment guarantees when analyzing each reference.
>>>> The patch does this.
>>>>
>>>> However, as the comment in the patch says, the base alignment guarantees
>>>> provided by a conditional reference only apply if the reference occurs
>>>> at least once.  In this case it would be legitimate for two references
>>>> to have the same runtime misalignment and for one reference to provide a
>>>> stronger compile-time guarantee than the other about what the misalignment
>>>> actually is.  The patch therefore relaxes the assert to handle that case.
>>>
>>> Hmm, but you don't actually check whether a reference occurs only
> conditional,
>>> do you?  You just seem to say that for masked loads/stores the reference
>>> is conditional (I believe that's not true).  But for a loop like
>>>
>>>  for (;;)
>>>    if (a[i])
>>>      sum += b[j];
>>>
>>> you still assume b[j] executes unconditionally?
>>
>> Maybe the documentation isn't clear enough, but DR_IS_CONDITIONAL
>> was supposed to mean "even if the containing statement executes
>> and runs to completion, the reference might not actually occur".
>> The example above isn't conditional in that sense because the
>> reference to b[j] does occur if the store is reached and completes.
>>
>> Masked loads and stores are conditional in that sense though.
>> The reference only occurs if the mask is nonzero; the memory
>> isn't touched otherwise.  The functions are used to if-convert
>> things like:
>>
>>    for (...)
>>      a[i] = b[i] ? c[i] : d[i];
>>
>> where there's no guarantee that it's safe to access c[i] when !b[i]
>> (or d[i] when b[i]).  No reference occurs for an all-false mask.
>
> But as you touch generic data-ref code here you should apply more
> sensible semantics to DR_IS_CONDITIONAL than just marking
> masked loads/stores but not DRs occuring inside BBs only executed
> conditionally ...

I don't see why that's more sensible though.  If a statement is only
conditionally executed in a loop, it's up to the consumer to decide
what to do about that.  The conditions under which the statement
is reached are a control-flow issue and tree-data-ref.c doesn't
have any special information about it.

Masked loads and stores are special because the DR_REFs created by
tree-data-ref.c are artificial: they didn't exist as MEM_REFs in the
original DR_STMT.  And AIUI they didn't exist as MEM_REFs precisely
because they're not guaranteed to happen, even if the load or store
statement itself is executed.  So in this case the DR_IS_CONDITIONAL
is reflecting something that tree-data-ref.c itself has done.

How about calling it DR_IS_CONDITIONAL_IN_STMT to avoid the
general-sounding name?

>>> The vectorizer of course only sees unconditionally executed stmts.
>>>
>>> So - I'd simply not add this DR_IS_CONDITIONAL.  Did you run into
>>> any real-world (testsuite) issues without this?
>>
>> Dropping DR_IS_CONDITIONAL would cause us to make invalid alignment
>> assumptions in silly corner cases.  I could add a scan test for it,
>> for targets with masked loads and stores.  It wouldn't trigger
>> an execution failure though because we assume that targets with
>> masked loads and stores allow unaligned accesses:
>>
>>   /* For now assume all conditional loads/stores support unaligned
>>      access without any special code.  */
>>   if (is_gimple_call (stmt)
>>       && gimple_call_internal_p (stmt)
>>       && (gimple_call_internal_fn (stmt) == IFN_MASK_LOAD
>>           || gimple_call_internal_fn (stmt) == IFN_MASK_STORE))
>>     return dr_unaligned_supported;
>>
>> So the worst that would happen is that we'd supposedly peel for
>> alignment, but actually misalign everything instead, and so make
>> things slower rather than quicker.
>>
>>> Note that the assert is to prevent bogus information.  Iff we aligned
>>> DR with base alignment 8 and misalign 3 then if another same-align
>>> DR has base alignment 16 we can't simply zero its DR_MISALIGNMENT
>>> as it still can be 8 after aligning DR.
>>>
>>> So I think it's wrong to put DRs with differing base-alignment into
>>> the same-align-refs chain, those should get their DR_MISALIGNMENT
>>> updated independenlty after peeling.
>>
>> DR_MISALIGNMENT is relative to the vector alignment rather than
>> the base alignment though.  So:
>
> We seem to use it that way, yes (looking at set_ptr_info_alignment
> uses).  So why not fix the assert then by capping the alignment/misalignment
> we compute at this value as well?  (and document this in the header
> around DR_MISALIGNMENT)
>
> Ideally we'd do alignment analysis independent of the vector size
> though (for those stupid targets with multiple vector sizes to consider...).
>
>> a) when looking for references *A1 and *A2 with the same alignment,
>>    we simply have to prove that A1 % vecalign == A2 % vecalign.
>>    This doesn't require any knowledge about the base alignment.
>>    If we break the addresses down as:
>>
>>       A1 = BASE1 + REST1,  REST1 = INIT1 + OFFSET1 + X * STEP1
>>       A2 = BASE2 + REST2,  REST2 = INIT2 + OFFSET2 + X * STEP2
>>
>>    and can prove that BASE1 == BASE2, the alignment of that base
>>    isn't important.  We simply need to prove that REST1 % vecalign
>>    == REST2 % vecalign for all X.
>>
>> b) In the assert, we've peeled the loop so that DR_PEEL is guaranteed
>>    to be vector-aligned.  If DR_PEEL is A1 in the example above, we have
>>    A1 % vecalign == 0, so A2 % vecalign must be 0 too.  This again doesn't
>>    rely on the base alignment being known.
>>
>> What a high base alignment for DR_PEEL gives us is the ability to know
>> at compile how many iterations need to be peeled to make DR_PEEL aligned.
>> But the points above apply regardless of whether we know that value at
>> compile time or not.
>>
>> In examples like the test case, we would have known at compile time that
>> VF-1 iterations would need to be peeled if we'd picked the store as the
>> DR_PEEL, but would have treated the number of peels as variable if we'd
>> picked the load.  The value calculated at runtime would still have been
>> VF-1, it's just that the code wouldn't have been as efficient.
>>
>> One of the benefits of pooling the alignments for unconditional references
>> is that it no longer matters which DR we pick: the number of peels will
>> be a compile-time constant both ways.
>>
>> Thanks,
>> Richard
>>
>>> I'd rather not mix fixing this with the improvement to eventuall use a
>>> larger align for the other DR if possible.
>
> ^^^
>
> So can you fix the ICE with capping base alignment / DR_MISALIGNMENT?

I don't think the problem is the lack of a cap.  In the test case we
see that:

1. B is known at compile time to be X * vecsize + Y when considered in
   isolation, because the base alignment derived from its DR_REF >= vecsize.
   So DR_MISALIGNMENT (B) == Y.

2. A's misalignment wrt vecsize is not known at compile time when
   considered in isolation, because no useful base alignment can be
   derived from its DR_REF.  (The DR_REF is to a plain int rather than
   to a structure with a high alignment.)  So DR_MISALIGNMENT (A) == -1.

3. A and B when considered as a pair trivially have the same misalignment
   wrt vecsize, for the reasons above.

Each of these results is individually correct.  The problem is that the
assert is conflating two things: it's saying that if we know two datarefs
have the same misaligment, we must either be able to calculate a
compile-time misalignment for both datarefs in isolation, or we must
fail to calculate a compile-time misalignment for both datarefs in
isolation.  That isn't true: it's valid to have situations in which the
compile-time misalignment is known for one dataref in isolation but not
for the other.

Thanks,
Richard