No TBAA before ptr_derefs_may_alias_p?

public inbox for gcc@gcc.gnu.org

* No TBAA before ptr_derefs_may_alias_p?
@ 2014-01-31 15:24 Bingfeng Mei
  2014-01-31 15:27 ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Bingfeng Mei @ 2014-01-31 15:24 UTC (permalink / raw)
  To: gcc, Richard Biener

Hi,
I have this simple example that I want to vectorize. GCC (4.8) generates a versioned loop because
it cannot determine whether the acc[i] write and the x[i].real read alias. It seems obvious to me that they cannot alias, based on TBAA information.

typedef struct
{
   short real;
   short imag;
} complex16_t;

void
libvector_AccSquareNorm_ref (unsigned long long  *acc,
                             const complex16_t *x, unsigned len)
{
    for (unsigned i = 0; i < len; i++)
    {
        acc[i] +=
            ((unsigned long long)((int)x[i].real * x[i].real)) +
            ((unsigned long long)((int)x[i].imag * x[i].imag));
    }
}

Tracing how the alias information is computed, I found that it reaches the following code,
which calls ptr_derefs_may_alias_p and returns true. ptr_derefs_may_alias_p doesn't contain
any TBAA disambiguation code. Should we add a TBAA check before that call?

  /* If we had an evolution in a MEM_REF BASE_OBJECT we do not know
     the size of the base-object.  So we cannot do any offset/overlap
     based analysis but have to rely on points-to information only.  */
  if (TREE_CODE (addr_a) == MEM_REF
      && DR_UNCONSTRAINED_BASE (a))
    {
      if (TREE_CODE (addr_b) == MEM_REF
	  && DR_UNCONSTRAINED_BASE (b))
	return ptr_derefs_may_alias_p (TREE_OPERAND (addr_a, 0),
				       TREE_OPERAND (addr_b, 0));
      else
	return ptr_derefs_may_alias_p (TREE_OPERAND (addr_a, 0),
				       build_fold_addr_expr (addr_b));
    }
  else if (TREE_CODE (addr_b) == MEM_REF
	   && DR_UNCONSTRAINED_BASE (b))
    return ptr_derefs_may_alias_p (build_fold_addr_expr (addr_a),
				   TREE_OPERAND (addr_b, 0));

  /* Otherwise DR_BASE_OBJECT is an access that covers the whole object
     that is being subsetted in the loop nest.  */
  if (DR_IS_WRITE (a) && DR_IS_WRITE (b))
    return refs_output_dependent_p (addr_a, addr_b);
  else if (DR_IS_READ (a) && DR_IS_WRITE (b))
    return refs_anti_dependent_p (addr_a, addr_b);
  return refs_may_alias_p (addr_a, addr_b);

This issue can be reproduced on trunk x86-64 gcc. 

Cheers,
Bingfeng Mei
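
[Editorial sketch] For readers unfamiliar with the term, "loop versioning for alias" roughly means that the compiler guards the loop with a run-time overlap check and keeps two copies of it. The C below is a hand-written illustration of that idea, not GCC's actual output; ranges_overlap and acc_square_norm_versioned are hypothetical names introduced only for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct
{
   short real;
   short imag;
} complex16_t;

/* Hypothetical helper: nonzero if [a, a+na) and [b, b+nb) share a byte.  */
static int
ranges_overlap (const void *a, size_t na, const void *b, size_t nb)
{
  uintptr_t pa = (uintptr_t) a;
  uintptr_t pb = (uintptr_t) b;
  return pa < pb + nb && pb < pa + na;
}

void
acc_square_norm_versioned (unsigned long long *acc,
                           const complex16_t *x, unsigned len)
{
  if (!ranges_overlap (acc, len * sizeof *acc, x, len * sizeof *x))
    {
      /* Version 1: the check proved the arrays are disjoint, so this
         copy is the one a vectorizer would be free to transform.  */
      for (unsigned i = 0; i < len; i++)
        acc[i] += (unsigned long long) ((int) x[i].real * x[i].real)
                  + (unsigned long long) ((int) x[i].imag * x[i].imag);
    }
  else
    {
      /* Version 2: possible overlap, keep the original scalar order.  */
      for (unsigned i = 0; i < len; i++)
        acc[i] += (unsigned long long) ((int) x[i].real * x[i].real)
                  + (unsigned long long) ((int) x[i].imag * x[i].imag);
    }
}
```

If TBAA could be used to disambiguate the two references at compile time, neither the check nor the second copy would be needed.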


* Re: No TBAA before ptr_derefs_may_alias_p?
  2014-01-31 15:24 No TBAA before ptr_derefs_may_alias_p? Bingfeng Mei
@ 2014-01-31 15:27 ` Richard Biener
  2014-01-31 17:01   ` Bingfeng Mei
  2014-01-31 17:18   ` Bingfeng Mei
  0 siblings, 2 replies; 16+ messages in thread
From: Richard Biener @ 2014-01-31 15:27 UTC (permalink / raw)
  To: Bingfeng Mei, gcc

On 1/31/14 4:02 PM, Bingfeng Mei wrote:
> Hi,
> I got this simple example to vectorize. Somehow, GCC (4.8) generates loop version because
> it cannot determine alias between acc[i] write and x[i].real read. It is pretty obvious to me that they are not aliased based on TBAA information.
> 
> typedef struct
> {
>    short real;
>    short imag;
> } complex16_t;
> 
> void
> libvector_AccSquareNorm_ref (unsigned long long  *acc,
>                              const complex16_t *x, unsigned len)
> {
>     for (unsigned i = 0; i < len; i++)
>     {
>         acc[i] +=
>             ((unsigned long long)((int)x[i].real * x[i].real)) +
>             ((unsigned long long)((int)x[i].imag * x[i].imag));
>     }
> }
> 
> Tracing into how the alias information is calculated, I found it hits the following code
> by calling ptr_derefs_may_alias_p and return true. ptr_derefs_may_alias_p doesn't contain
> TBAA disambiguation code. Should we add check before that? 
> 
>   /* If we had an evolution in a MEM_REF BASE_OBJECT we do not know
>      the size of the base-object.  So we cannot do any offset/overlap
>      based analysis but have to rely on points-to information only.  */
>   if (TREE_CODE (addr_a) == MEM_REF
>       && DR_UNCONSTRAINED_BASE (a))
>     {
>       if (TREE_CODE (addr_b) == MEM_REF
> 	  && DR_UNCONSTRAINED_BASE (b))
> 	return ptr_derefs_may_alias_p (TREE_OPERAND (addr_a, 0),
> 				       TREE_OPERAND (addr_b, 0));
>       else
> 	return ptr_derefs_may_alias_p (TREE_OPERAND (addr_a, 0),
> 				       build_fold_addr_expr (addr_b));
>     }
>   else if (TREE_CODE (addr_b) == MEM_REF
> 	   && DR_UNCONSTRAINED_BASE (b))
>     return ptr_derefs_may_alias_p (build_fold_addr_expr (addr_a),
> 				   TREE_OPERAND (addr_b, 0));
> 
>   /* Otherwise DR_BASE_OBJECT is an access that covers the whole object
>      that is being subsetted in the loop nest.  */
>   if (DR_IS_WRITE (a) && DR_IS_WRITE (b))
>     return refs_output_dependent_p (addr_a, addr_b);
>   else if (DR_IS_READ (a) && DR_IS_WRITE (b))
>     return refs_anti_dependent_p (addr_a, addr_b);
>   return refs_may_alias_p (addr_a, addr_b);
> 
> This issue can be reproduced on trunk x86-64 gcc. 

True, you can add a

 if (flag_strict_aliasing
     && DR_IS_WRITE (a) && DR_IS_READ (b)
     && !alias_sets_conflict_p (get_alias_set (DR_REF (a)),
                                get_alias_set (DR_REF (b))))
   return false;

before the ptr_derefs_may_alias_p calls.  TBAA is only valid for
true dependences.

Richard.

> Cheers,
> Bingfeng Mei
> 


* RE: No TBAA before ptr_derefs_may_alias_p?
  2014-01-31 15:27 ` Richard Biener
@ 2014-01-31 17:01   ` Bingfeng Mei
  2014-01-31 17:18   ` Bingfeng Mei
  1 sibling, 0 replies; 16+ messages in thread
From: Bingfeng Mei @ 2014-01-31 17:01 UTC (permalink / raw)
  To: Richard Biener, gcc

Thanks, Richard,
I will prepare a patch with a test and also file a bug.

Bingfeng


* RE: No TBAA before ptr_derefs_may_alias_p?
  2014-01-31 15:27 ` Richard Biener
  2014-01-31 17:01   ` Bingfeng Mei
@ 2014-01-31 17:18   ` Bingfeng Mei
  2014-01-31 21:32     ` Richard Biener
  1 sibling, 1 reply; 16+ messages in thread
From: Bingfeng Mei @ 2014-01-31 17:18 UTC (permalink / raw)
  To: Richard Biener, gcc

Unfortunately this patch doesn't work, because the memory dependence is an
anti-dependence in this case.

Why can't TBAA handle anti- and output-dependences? I checked the GCC bug
database and found PR38503 and PR38964. I don't fully understand them, but they
seem to be related to handling the C++ new operator. This example is pretty
clear, though, and has nothing to do with C++ or new. Isn't it too conservative
to disable TBAA for anti- and output-dependences here?


Bingfeng


* RE: No TBAA before ptr_derefs_may_alias_p?
  2014-01-31 17:18   ` Bingfeng Mei
@ 2014-01-31 21:32     ` Richard Biener
  2014-02-03  9:51       ` Bingfeng Mei
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2014-01-31 21:32 UTC (permalink / raw)
  To: Bingfeng Mei, gcc

On January 31, 2014 6:01:36 PM GMT+01:00, Bingfeng Mei <bmei@broadcom.com> wrote:
>Unfortunately this patch doesn't work because the memory dependency is
>Anti in this
>case. 
>
>Why TBAA cannot handle anti- & output- dependencies? I check GCC bug
>database, and 
>found pr38503 & pr38964.  I don't fully understand it, but seems to me
>is related
>in handling C++ new operator. But this example is pretty clear and has
>nothing to
>do with C++ and new statement. Isn't it too conservative to disable
>TBAA for anti-
>& output- dependency here? 

Because the gcc memory model allows the dynamic type of a memory location to change by a store.

That in turn is the only sensible way of supporting c++ placement new.

Richard.
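
[Editorial sketch] A minimal C illustration of the point above, assuming anonymous (malloc'ed) storage whose dynamic type is changed by a store. read_then_store and demo_dynamic_type_change are hypothetical names introduced for this sketch. The load and the later store form an anti-dependence on the same bytes even though their types differ, so using TBAA to declare them independent and hoisting the store above the load would change the value read.

```c
#include <stdlib.h>

/* Hypothetical helper: loads through a short lvalue, then stores through
   a long long lvalue.  The caller may pass the same storage for both.  */
static int
read_then_store (short *s, long long *ll)
{
  int v = *s;   /* read: the storage currently holds a short */
  *ll = 42;     /* write: this store gives the storage a new dynamic type */
  return v;
}

/* Returns the value read before the retyping store,
   or -1 if the allocation fails.  */
int
demo_dynamic_type_change (void)
{
  void *buf = malloc (sizeof (long long));
  if (buf == NULL)
    return -1;
  *(short *) buf = 7;                  /* storage first holds a short */
  int v = read_then_store (buf, buf);  /* same bytes, two different types */
  free (buf);
  return v;
}
```

Under the memory model described above, the load has to observe the short stored before the call, which is why the oracle only applies TBAA when the write comes first (a true dependence).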


* RE: No TBAA before ptr_derefs_may_alias_p?
  2014-01-31 21:32     ` Richard Biener
@ 2014-02-03  9:51       ` Bingfeng Mei
  2014-02-03  9:59         ` Jakub Jelinek
  0 siblings, 1 reply; 16+ messages in thread
From: Bingfeng Mei @ 2014-02-03  9:51 UTC (permalink / raw)
  To: Richard Biener, gcc

If it is just for C++ placement new, why not implement it as a lang_hook?
As it stands, other languages such as C are forced to be conservative and
produce worse code.

Bingfeng


* Re: No TBAA before ptr_derefs_may_alias_p?
  2014-02-03  9:51       ` Bingfeng Mei
@ 2014-02-03  9:59         ` Jakub Jelinek
  2014-02-03 10:14           ` Florian Weimer
  2014-02-03 10:15           ` Richard Biener
  0 siblings, 2 replies; 16+ messages in thread
From: Jakub Jelinek @ 2014-02-03  9:59 UTC (permalink / raw)
  To: Bingfeng Mei; +Cc: Richard Biener, gcc

On Mon, Feb 03, 2014 at 09:51:01AM +0000, Bingfeng Mei wrote:
> If it is just for C++ placement new, why don't implement it as a lang_hook.
> Now other languages such as C have to be made conservative and produce worse
> code.

Even in C++ code you don't use placement new that often, so e.g. by having
the placement new explicit through some special GIMPLE statement in the IL,
you could e.g. just look if a particular function or loop contains any
placement new stmts (cached in struct function and loop?) and use TBAA if
it isn't there.

	Jakub


* Re: No TBAA before ptr_derefs_may_alias_p?
  2014-02-03  9:59         ` Jakub Jelinek
@ 2014-02-03 10:14           ` Florian Weimer
  2014-02-03 10:19             ` Richard Biener
  2014-02-03 10:15           ` Richard Biener
  1 sibling, 1 reply; 16+ messages in thread
From: Florian Weimer @ 2014-02-03 10:14 UTC (permalink / raw)
  To: Jakub Jelinek, Bingfeng Mei; +Cc: Richard Biener, gcc

On 02/03/2014 10:59 AM, Jakub Jelinek wrote:
> On Mon, Feb 03, 2014 at 09:51:01AM +0000, Bingfeng Mei wrote:
>> If it is just for C++ placement new, why don't implement it as a lang_hook.
>> Now other languages such as C have to be made conservative and produce worse
>> code.
>
> Even in C++ code you don't use placement new that often, so e.g. by having
> the placement new explicit through some special GIMPLE statement in the IL,
> you could e.g. just look if a particular function or loop contains any
> placement new stmts (cached in struct function and loop?) and use TBAA if
> it isn't there.

I believe the convenience of TBAA lies in the fact that you don't have 
to prove anything about actual program behavior if the types are 
sufficiently distinct.  If you allow local violations of that principle, 
the global property inevitably breaks down as well.

In any case, C code can call C++ code and vice versa, so it's difficult 
to consider each language in isolation.

-- 
Florian Weimer / Red Hat Product Security Team


* Re: No TBAA before ptr_derefs_may_alias_p?
  2014-02-03  9:59         ` Jakub Jelinek
  2014-02-03 10:14           ` Florian Weimer
@ 2014-02-03 10:15           ` Richard Biener
  2014-02-03 10:36             ` Richard Biener
  1 sibling, 1 reply; 16+ messages in thread
From: Richard Biener @ 2014-02-03 10:15 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bingfeng Mei, gcc

On Mon, 3 Feb 2014, Jakub Jelinek wrote:

> On Mon, Feb 03, 2014 at 09:51:01AM +0000, Bingfeng Mei wrote:
> > If it is just for C++ placement new, why don't implement it as a lang_hook.
> > Now other languages such as C have to be made conservative and produce worse
> > code.

But if you combine a C++ and a C unit with LTO then what do you do?
Aliasing is a thing that needs to be fully defined from within the
middle-end.

> Even in C++ code you don't use placement new that often, so e.g. by having
> the placement new explicit through some special GIMPLE statement in the IL,
> you could e.g. just look if a particular function or loop contains any
> placement new stmts (cached in struct function and loop?) and use TBAA if
> it isn't there.

I'd say that's a hack.  We've been there before and we've failed 
miserably.

Note that the current memory model allows unions to no longer have
alias-set zero thus it _improves_ TBAA (ok, unions still _do_ have
alias-set zero because the RTL oracle doesn't correctly handle
offset-based must-aliases).  Changing that would probably help
GCC itself a lot.  [this isn't about punning through unions but
about changing the active member by storing into it, something
code usually doesn't do but it still gets pessimized because it
could]

Note that for the case in question the patch I proposed still
can help some code - also the vectorizer, as opposed to the
generic data-dependence code, can impose additional constraints
because it knows the vectorized loop body contains at least two
scalar iterations.  And note that for the case in question we
can derive non-aliasing because with

  p[i] += q[i];

p[i] is both read _and_ written in the same iteration thus
it cannot have the dynamic type of q[i] before it's stored
into.  Of course data-dependence doesn't do this kind of
analysis currently, but it certainly could.

Arguing to change how we handle placement new (same issues
arise with anonymous storage and memcpy in C!) is throwing
out the baby with the bathwater.

Richard.
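
[Editorial sketch] To make the "at least two scalar iterations" remark concrete, here is a hand-written C picture of the original loop processed two iterations at a time, the way a vectorized body with a vectorization factor of 2 would behave. It is an illustration only, not what GCC emits; acc_square_norm_vf2 and square_norm are hypothetical names.

```c
typedef struct
{
   short real;
   short imag;
} complex16_t;

/* Hypothetical helper: real^2 + imag^2, widened as in the original.  */
static unsigned long long
square_norm (complex16_t c)
{
  return (unsigned long long) ((int) c.real * c.real)
         + (unsigned long long) ((int) c.imag * c.imag);
}

void
acc_square_norm_vf2 (unsigned long long *acc,
                     const complex16_t *x, unsigned len)
{
  unsigned i = 0;

  for (; i + 1 < len; i += 2)
    {
      /* All loads of the two-iteration block are done first ...  */
      unsigned long long a0 = acc[i], a1 = acc[i + 1];
      complex16_t x0 = x[i], x1 = x[i + 1];

      /* ... then all stores.  Compared with the scalar loop, the load of
         x[i + 1] has moved ahead of the store to acc[i].  In scalar order
         that pair is write-then-read, i.e. a true dependence.  */
      acc[i] = a0 + square_norm (x0);
      acc[i + 1] = a1 + square_norm (x1);
    }

  /* Scalar epilogue for an odd trailing element.  */
  for (; i < len; i++)
    acc[i] += square_norm (x[i]);
}
```

The only cross-iteration reordering introduced here is of the write-then-read kind, which is the case where TBAA is allowed to disambiguate.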


* Re: No TBAA before ptr_derefs_may_alias_p?
  2014-02-03 10:14           ` Florian Weimer
@ 2014-02-03 10:19             ` Richard Biener
  2014-02-03 11:49               ` Bingfeng Mei
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2014-02-03 10:19 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Jakub Jelinek, Bingfeng Mei, gcc

On Mon, 3 Feb 2014, Florian Weimer wrote:

> On 02/03/2014 10:59 AM, Jakub Jelinek wrote:
> > On Mon, Feb 03, 2014 at 09:51:01AM +0000, Bingfeng Mei wrote:
> > > If it is just for C++ placement new, why don't implement it as a
> > > lang_hook.
> > > Now other languages such as C have to be made conservative and produce
> > > worse
> > > code.
> > 
> > Even in C++ code you don't use placement new that often, so e.g. by having
> > the placement new explicit through some special GIMPLE statement in the IL,
> > you could e.g. just look if a particular function or loop contains any
> > placement new stmts (cached in struct function and loop?) and use TBAA if
> > it isn't there.
> 
> I believe the convenience of TBAA lies in the fact that you don't have to
> prove anything about actual program behavior if the types are sufficiently
> distinct.  If you allow local violations of that principle, the global
> property inevitably breaks down as well.
> 
> In any case, C code can call C++ code and vice versa, so it's difficult to
> consider each language in isolation.

As I said in the other mail, even C code can change the dynamic type of
a storage location (via memcpy).  And as soon as you need to look at
the stmts in between the two refs that you ask the oracle to
disambiguate, you are doing something wrong.

Richard.


* Re: No TBAA before ptr_derefs_may_alias_p?
  2014-02-03 10:15           ` Richard Biener
@ 2014-02-03 10:36             ` Richard Biener
  2014-02-03 11:58               ` Bingfeng Mei
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2014-02-03 10:36 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bingfeng Mei, gcc

On Mon, 3 Feb 2014, Richard Biener wrote:

> And note that for the case in question we
> can derive non-aliasing because with
> 
>   p[i] += q[i];
> 
> p[i] is both read _and_ written in the same iteration thus
> it cannot have the dynamic type of q[i] before it's stored
> into.  Of course data-dependence doesn't do this kind of
> analysis currently, but it certainly could.

The vectorizer already has code to analyze data-refs for groups,
though not for a read and write of the same location as needed here,
so it should be reasonably easy to extend its analysis to detect this
case and mark the write DR with a flag so that in
vect_analyze_data_ref_dependence the _vectorizer_ could apply
TBAA to disambiguate the two DRs.

Richard.


* RE: No TBAA before ptr_derefs_may_alias_p?
  2014-02-03 10:19             ` Richard Biener
@ 2014-02-03 11:49               ` Bingfeng Mei
  2014-02-03 13:17                 ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Bingfeng Mei @ 2014-02-03 11:49 UTC (permalink / raw)
  To: Richard Biener, Florian Weimer; +Cc: Jakub Jelinek, gcc

For the following code, why can the load be moved before the store instruction? TBAA still seems to be applied even though it is an anti-dependency. Somehow alias analysis is implemented differently in vectorization.

int foo (long long *a, short *b, int n)
{
   *a = (long long)(n * 100);
  
   return (*b) + 1000;
}
The generated x86-64 code:
	imull	$100, %edx, %edx
	movswl	(%rsi), %eax
	movslq	%edx, %rdx
	movq	%rdx, (%rdi)
	addl	$1000, %eax
	ret


Bingfeng

* RE: No TBAA before ptr_derefs_may_alias_p?
  2014-02-03 10:36             ` Richard Biener
@ 2014-02-03 11:58               ` Bingfeng Mei
  0 siblings, 0 replies; 16+ messages in thread
From: Bingfeng Mei @ 2014-02-03 11:58 UTC (permalink / raw)
  To: Richard Biener, Jakub Jelinek; +Cc: gcc

Even if I change "acc[i] +=" to "acc[i] =" in the original example,
GCC still versions the loop for aliasing. So it clearly behaves
differently from the alias analysis used for scalar code.

tst3.c:12: note: versioning for alias required: can't determine dependence between _10->real and *_7
tst3.c:12: note: mark for run-time aliasing test between _10->real and *_7
tst3.c:12: note: versioning for alias required: can't determine dependence

Bingfeng

-----Original Message-----
From: Richard Biener [mailto:rguenther@suse.de] 
Sent: 03 February 2014 10:35
To: Jakub Jelinek
Cc: Bingfeng Mei; gcc@gcc.gnu.org
Subject: Re: No TBAA before ptr_derefs_may_alias_p?

On Mon, 3 Feb 2014, Richard Biener wrote:

> And note that for the case in question we
> can derive non-aliasing because with
> 
>   p[i] += q[i];
> 
> p[i] is both read _and_ written in the same iteration thus
> it cannot have the dynamic type of q[i] before it's stored
> into.  Of course data-dependence doesn't do this kind of
> analysis currently, but it certainly could.

The vectorizer already has code to analyzes data-refs for groups,
not for read-write of the same loc as needed here, so it could
be reasonably easy to extend its analysis to detect this case
and mark the write DR with a flag so that in
vect_analyze_data_ref_dependence the _vectorizer_ could apply
TBAA to disambiguate the two DRs.

Richard.

* RE: No TBAA before ptr_derefs_may_alias_p?
  2014-02-03 11:49               ` Bingfeng Mei
@ 2014-02-03 13:17                 ` Richard Biener
  2014-02-03 14:43                   ` Bingfeng Mei
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2014-02-03 13:17 UTC (permalink / raw)
  To: Bingfeng Mei; +Cc: Florian Weimer, Jakub Jelinek, gcc

On Mon, 3 Feb 2014, Bingfeng Mei wrote:

> For the following code, why can the load be moved before the store
> instruction? TBAA still applies even if it is an anti-dependency.
> Somehow alias analysis is implemented differently in vectorization.
> 
> for 
> int foo (long long *a, short *b, int n)
> {
>    *a = (long long)(n * 100);
>   
>    return (*b) + 1000;
> }
> x86-64 code
> 	imull	$100, %edx, %edx
> 	movswl	(%rsi), %eax
> 	movslq	%edx, %rdx
> 	movq	%rdx, (%rdi)
> 	addl	$1000, %eax
> 	ret

That's a bug.  Probably a wrong predicate used in the scheduler
(we've fixed many I think).  -fno-schedule-insns2 fixes it.

But after some local discussion I think we can do

Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c   (revision 207417)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -235,6 +235,18 @@ vect_analyze_data_ref_dependence (struct
       || (DR_IS_READ (dra) && DR_IS_READ (drb)))
     return false;
 
+  /* Even if we have an anti-dependence then, as the vectorized loop covers at
+     least two scalar iterations, there is always also a true dependence.
+     As the vectorizer does not re-order loads and stores we can ignore
+     the anti-dependence if TBAA can disambiguate both DRs similar to the
+     case with known negative distance anti-dependences (positive
+     distance anti-dependences would violate TBAA constraints).  */
+  if (((DR_IS_READ (dra) && DR_IS_WRITE (drb))
+       || (DR_IS_WRITE (dra) && DR_IS_READ (drb)))
+      && !alias_sets_conflict_p (get_alias_set (DR_REF (dra)),
+                                get_alias_set (DR_REF (drb))))
+    return false;
+
   /* Unknown data dependence.  */
   if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know)
     {

We agreed that dependence analysis isn't really the suitable place
to apply TBAA.  The reasoning behind the above goes like this:
we need to avoid the case where loading DRA after storing DRB
would load a different value.  But if DRA were to load from a place
that DRB stored to, this would be a true dependence, and thus
we can apply TBAA to that "re-load" and argue that it cannot happen.
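
The shape of that check can be modelled outside of GCC.  The following
is only a toy sketch: toy_alias_sets_conflict and toy_may_depend are
hypothetical names, alias sets are plain integers, and the conflict test
ignores the subset relations that GCC's alias_sets_conflict_p also
handles.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy conflict test: two alias sets conflict when they are equal or
   when either one is 0, the set that aliases everything.  */
bool toy_alias_sets_conflict(int set1, int set2)
{
    assert(set1 >= 0 && set2 >= 0);
    return set1 == 0 || set2 == 0 || set1 == set2;
}

/* Mirrors the order of tests in the patch: read/read pairs never
   depend, and a read/write pair is treated as independent when the
   alias sets do not conflict.  Everything else stays "may depend".  */
bool toy_may_depend(bool a_is_read, int a_set, bool b_is_read, int b_set)
{
    if (a_is_read && b_is_read)
        return false;
    if (a_is_read != b_is_read && !toy_alias_sets_conflict(a_set, b_set))
        return false;
    return true;
}
```

With, say, set 2 standing for short and set 3 for unsigned long long,
toy_may_depend (true, 2, false, 3) is false, while a write/write pair
is never disambiguated by this check.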

The same reasoning applies to LIM and PRE performing invariant motion
and disambiguating the load they want to hoist against a store over
the back-edge - if there were any aliasing then it wouldn't be valid.

Note that both transforms, vectorization and LIM, are careful not to
move the loads after the stores.  The vectorizer still can re-order
loads and stores by means of effectively unrolling, thus

   b[i] = a[i]

becomes

   tem1 = a[i]
   tem2 = a[i+1]
...
   b[i] = tem1
   b[i+1] = tem2
...

instead of

   b[i] = a[i]
   b[i+1] = a[i+1]
...

so the interesting case to construct is one with differently sized a[]
and b[] (to allow one set of DRs to catch up with the other) and to try
to prove that you can't construct one that causes a[] to read from a
location that b[] stored to but where the vectorizer would introduce
such a false dependence.  I think that's not possible (fingers crossed ;)).
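
The effect of the re-ordering itself can be seen with a small sketch.
It uses same-typed, deliberately overlapping arrays, so it shows only
why grouping the loads matters, not the differently sized case that
TBAA would have to rule out.  All function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Scalar loop: each load happens after all earlier stores.  */
void copy_scalar(int *b, const int *a, int n)
{
    for (int i = 0; i < n; i++)
        b[i] = a[i];
}

/* Loads of two iterations are grouped before the two stores, which
   mimics a vectorization factor of 2.  n must be even.  */
void copy_grouped(int *b, const int *a, int n)
{
    assert(n % 2 == 0);
    for (int i = 0; i < n; i += 2) {
        int tem1 = a[i];
        int tem2 = a[i + 1];
        b[i] = tem1;
        b[i + 1] = tem2;
    }
}

/* Returns 1 if the two versions give different results when b is a
   shifted by `shift` elements, and 0 otherwise.  */
int grouping_changes_result(int shift)
{
    int x[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    int y[8] = {1, 2, 3, 4, 5, 6, 7, 8};

    assert(shift >= 0 && shift <= 4);
    copy_scalar(x + shift, x, 4);
    copy_grouped(y + shift, y, 4);
    return memcmp(x, y, sizeof x) != 0;
}
```

Only a shift of 1 makes the results differ here: the store of one
iteration feeds the load of the next, and grouping the loads breaks
that.  Shifts of 0, 2 and 4 give identical results.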

Richard.


-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

* RE: No TBAA before ptr_derefs_may_alias_p?
  2014-02-03 13:17                 ` Richard Biener
@ 2014-02-03 14:43                   ` Bingfeng Mei
  2014-02-03 14:45                     ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Bingfeng Mei @ 2014-02-03 14:43 UTC (permalink / raw)
  To: Richard Biener; +Cc: Florian Weimer, Jakub Jelinek, gcc

Thanks, Richard,
I think I can follow your logic. That patch works for my example. BTW, I have
filed a bug report (PR60012), in case you check in the patch.

Should I also report the scalar example as a bug? It looks innocuous per se :-).

Bingfeng

* RE: No TBAA before ptr_derefs_may_alias_p?
  2014-02-03 14:43                   ` Bingfeng Mei
@ 2014-02-03 14:45                     ` Richard Biener
  0 siblings, 0 replies; 16+ messages in thread
From: Richard Biener @ 2014-02-03 14:45 UTC (permalink / raw)
  To: Bingfeng Mei; +Cc: Florian Weimer, Jakub Jelinek, gcc

On Mon, 3 Feb 2014, Bingfeng Mei wrote:

> Thanks, Richard,
> I think I can follow your logic. That patch works for my example. BTW, I have
> a bug report (pr60012), if you are to check in the patch.

Thanks.

> Should I also report the scalar example as a bug? It looks innocuous per 
> se :-).

I have already done that (PR60043).

Richard.

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

Thread overview: 16+ messages
2014-01-31 15:24 No TBAA before ptr_derefs_may_alias_p? Bingfeng Mei
2014-01-31 15:27 ` Richard Biener
2014-01-31 17:01   ` Bingfeng Mei
2014-01-31 17:18   ` Bingfeng Mei
2014-01-31 21:32     ` Richard Biener
2014-02-03  9:51       ` Bingfeng Mei
2014-02-03  9:59         ` Jakub Jelinek
2014-02-03 10:14           ` Florian Weimer
2014-02-03 10:19             ` Richard Biener
2014-02-03 11:49               ` Bingfeng Mei
2014-02-03 13:17                 ` Richard Biener
2014-02-03 14:43                   ` Bingfeng Mei
2014-02-03 14:45                     ` Richard Biener
2014-02-03 10:15           ` Richard Biener
2014-02-03 10:36             ` Richard Biener
2014-02-03 11:58               ` Bingfeng Mei
