public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements
@ 2023-05-30  8:35 ptk.prasertsuk at gmail dot com
  2023-05-30 11:55 ` [Bug tree-optimization/110035] " rguenth at gcc dot gnu.org
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: ptk.prasertsuk at gmail dot com @ 2023-05-30  8:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

            Bug ID: 110035
           Summary: Missed optimization for dependent assignment
                    statements
           Product: gcc
           Version: 12.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ptk.prasertsuk at gmail dot com
  Target Milestone: ---

Created attachment 55212
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55212&action=edit
Test case, compiled with -std=c++20 -O2

The test case, when compiled, produces additional move instructions:

movdqu  (%rdi), %xmm2
movdqu  16(%rdi), %xmm1
movdqu  32(%rdi), %xmm0
movl    $48, %edi
movaps  %xmm2, 32(%rsp)
movaps  %xmm1, 16(%rsp)
movaps  %xmm0, (%rsp)
call    _Znwm@PLT
movdqa  32(%rsp), %xmm2
movdqa  16(%rsp), %xmm1
movdqa  (%rsp), %xmm0
movq    %rax, %rdi
movups  %xmm2, (%rax)
movups  %xmm1, 16(%rax)
movups  %xmm0, 32(%rax)

compared to the more optimized result from clang++ 14.0.0 with the same flags:

callq   _Znwm@PLT
movups  (%rbx), %xmm0
movups  16(%rbx), %xmm1
movups  32(%rbx), %xmm2
movups  %xmm0, (%rax)
movups  %xmm1, 16(%rax)
movups  %xmm2, 32(%rax)
movq    %rax, %rdi

Clang has MemCpyOptPass, which detects and removes the memory dependency of the
second set of move instructions; this allows the Dead Store Elimination pass to
remove the first set of move instructions.
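
As a source-level sketch of the effect described above (the actual passes
operate on LLVM IR, and the struct layout and function names here are
hypothetical): forwarding the copy makes the temporary dead, so both functions
below leave the same bytes in `*b` as long as nothing modifies `a` between the
load and the store.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: 48 bytes, like the object allocated in the report.
struct MyClass {
    std::uint64_t arr[6];
};

// Shape of the original code: copy into a local temporary, then store it.
void copy_via_temporary(const MyClass& a, MyClass* b) {
    assert(b != nullptr);
    MyClass c = a;   // first set of moves: load a, fill c
    *b = c;          // second set of moves: load c, store to *b
}

// Shape after forwarding: the store reads from a directly, so c is dead.
void copy_direct(const MyClass& a, MyClass* b) {
    assert(b != nullptr);
    *b = a;
}
```

In the real test case an allocation call sits between the load and the store,
so the two forms are interchangeable only if that call cannot change `a`.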

g++-12 -v
Using built-in specs.
COLLECT_GCC=g++-12
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/12/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
12.1.0-2ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-12
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-12-sZcx2y/gcc-12-12.1.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-sZcx2y/gcc-12-12.1.0/debian/tmp-gcn/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.1.0 (Ubuntu 12.1.0-2ubuntu1~22.04)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
@ 2023-05-30 11:55 ` rguenth at gcc dot gnu.org
  2023-05-30 16:45 ` pinskia at gcc dot gnu.org
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-05-30 11:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ick - convoluted C++.  We end up with

void ff (struct MyClass & obj)
{
  vector(2) long unsigned int vect_SR.16;
  vector(2) long unsigned int vect_SR.15;
  vector(2) long unsigned int vect_SR.14;
  void * _6;

  <bb 2> [local count: 1073741824]:
  vect_SR.14_5 = MEM <vector(2) long unsigned int> [(struct MyClass
&)obj_2(D)];
  vect_SR.15_28 = MEM <vector(2) long unsigned int> [(struct MyClass &)obj_2(D)
+ 16];
  vect_SR.16_30 = MEM <vector(2) long unsigned int> [(struct MyClass &)obj_2(D)
+ 32];
  _6 = operator new (48);
  MEM <vector(2) long unsigned int> [(struct MyClass2 *)_6] = vect_SR.14_5;
  MEM <vector(2) long unsigned int> [(struct MyClass2 *)_6 + 16B] =
vect_SR.15_28;
  MEM <vector(2) long unsigned int> [(struct MyClass2 *)_6 + 32B] =
vect_SR.16_30;
  HandleMyClass2 (_6); [tail call]

and the issue is that 'operator new (48)' can alter what 'obj' points to,
so we cannot move the loads across the call and we get spilling.

There is no inter-procedural analysis in GCC that would tell us that
'obj_2(D)' (the MyClass & obj argument of ff) points to an object that
did not escape.  In fact 'ff' has global visibility
and it might have other callers.

If you add -fwhole-program then you get the function inlined to main and

main:
.LFB652:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        movl    $48, %edi
        call    _Znwm
        movq    $0, (%rax)
        movq    %rax, %rdi
        movq    $0, 8(%rax)
        movq    $0, 16(%rax)
        movq    $0, 24(%rax)
        movq    $0, 32(%rax)
        movq    $0, 40(%rax)
        call    _Z14HandleMyClass2Pv
        xorl    %eax, %eax
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret

(not using vectors because 'main' is considered cold).  Do you cite an
inline copy of ff() for clang?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
  2023-05-30 11:55 ` [Bug tree-optimization/110035] " rguenth at gcc dot gnu.org
@ 2023-05-30 16:45 ` pinskia at gcc dot gnu.org
  2023-05-30 17:22 ` pinskia at gcc dot gnu.org
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-30 16:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement
   Last reconfirmed|                            |2023-05-30
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
In the case of x86_64, it is just moving the loads across the operator new, I
think:
  vect_SR.14_5 = MEM <vector(2) long unsigned int> [(struct MyClass
&)obj_2(D)];
  vect_SR.15_28 = MEM <vector(2) long unsigned int> [(struct MyClass &)obj_2(D)
+ 16];
  vect_SR.16_30 = MEM <vector(2) long unsigned int> [(struct MyClass &)obj_2(D)
+ 32];
  _6 = operator new (48);
  MEM <vector(2) long unsigned int> [(struct MyClass2 *)_6] = vect_SR.14_5;
  MEM <vector(2) long unsigned int> [(struct MyClass2 *)_6 + 16B] =
vect_SR.15_28;
  MEM <vector(2) long unsigned int> [(struct MyClass2 *)_6 + 32B] =
vect_SR.16_30;
  HandleMyClass2 (_6); [tail call]

Other targets are moving across the operator new too:

  D.14580.__obj = *obj_2(D);
  _6 = operator new (48);
  MEM[(struct MyClass2 *)_6].f = D.14580;


A more obvious reduced testcase:
```
struct MyClass
{
    unsigned long long arr[128];
};

[[gnu::noipa]]
void sink(void *m){}
void gg(MyClass &a)
{
  MyClass c = a;
  MyClass *b = new MyClass;
  *b = c;
  sink(b);
}
```

There might be a dup of this issue too.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
  2023-05-30 11:55 ` [Bug tree-optimization/110035] " rguenth at gcc dot gnu.org
  2023-05-30 16:45 ` pinskia at gcc dot gnu.org
@ 2023-05-30 17:22 ` pinskia at gcc dot gnu.org
  2023-05-31  1:20 ` ptk.prasertsuk at gmail dot com
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-30 17:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
We don't even optimize:
```
struct MyClass
{
    unsigned long long arr[128];
};

[[gnu::noipa]]
void sink(void *m);
void gg(MyClass &a, MyClass *b)
{
  MyClass c = a;
  *b = c;
  sink(b);
}
```

As I mentioned there are dups of the above testcase.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
                   ` (2 preceding siblings ...)
  2023-05-30 17:22 ` pinskia at gcc dot gnu.org
@ 2023-05-31  1:20 ` ptk.prasertsuk at gmail dot com
  2023-05-31  1:46 ` ptk.prasertsuk at gmail dot com
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: ptk.prasertsuk at gmail dot com @ 2023-05-31  1:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #4 from Pontakorn Prasertsuk <ptk.prasertsuk at gmail dot com> ---
(In reply to Richard Biener from comment #1)
> Ick - convoluted C++.  We end up with
> 
> void ff (struct MyClass & obj)
> {
>   vector(2) long unsigned int vect_SR.16;
>   vector(2) long unsigned int vect_SR.15;
>   vector(2) long unsigned int vect_SR.14;
>   void * _6;
> 
>   <bb 2> [local count: 1073741824]:
>   vect_SR.14_5 = MEM <vector(2) long unsigned int> [(struct MyClass
> &)obj_2(D)];
>   vect_SR.15_28 = MEM <vector(2) long unsigned int> [(struct MyClass
> &)obj_2(D) + 16];
>   vect_SR.16_30 = MEM <vector(2) long unsigned int> [(struct MyClass
> &)obj_2(D) + 32];
>   _6 = operator new (48);
>   MEM <vector(2) long unsigned int> [(struct MyClass2 *)_6] = vect_SR.14_5;
>   MEM <vector(2) long unsigned int> [(struct MyClass2 *)_6 + 16B] =
> vect_SR.15_28;
>   MEM <vector(2) long unsigned int> [(struct MyClass2 *)_6 + 32B] =
> vect_SR.16_30;
>   HandleMyClass2 (_6); [tail call]
> 
> and the issue is that 'operator new (48)' can alter what 'obj' points to,
> so we cannot move the loads across the call and we get spilling.
> 
> There is no inter-procedural analysis in GCC that would tell us that
> 'obj_2(D)' (the MyClass & obj argument of ff) does not point to an
> object that did not escape.  In fact 'ff' has global visibility
> and it might have other callers.
> 
> If you add -fwhole-program then you get the function inlined to main and
> 
> main:
> .LFB652:
>         .cfi_startproc
>         subq    $8, %rsp
>         .cfi_def_cfa_offset 16
>         movl    $48, %edi
>         call    _Znwm
>         movq    $0, (%rax)
>         movq    %rax, %rdi
>         movq    $0, 8(%rax)
>         movq    $0, 16(%rax)
>         movq    $0, 24(%rax)
>         movq    $0, 32(%rax)
>         movq    $0, 40(%rax)
>         call    _Z14HandleMyClass2Pv
>         xorl    %eax, %eax
>         addq    $8, %rsp
>         .cfi_def_cfa_offset 8
>         ret
> 
> (not using vectors because 'main' is considered cold).  Do you cite an
> inline copy of ff() for clang?

Hi Richard,

The clang snippet I provided is not inlined into 'main' function.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
                   ` (3 preceding siblings ...)
  2023-05-31  1:20 ` ptk.prasertsuk at gmail dot com
@ 2023-05-31  1:46 ` ptk.prasertsuk at gmail dot com
  2023-05-31  6:34 ` rguenther at suse dot de
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: ptk.prasertsuk at gmail dot com @ 2023-05-31  1:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #5 from Pontakorn Prasertsuk <ptk.prasertsuk at gmail dot com> ---
(In reply to Andrew Pinski from comment #3)
> We don't even optimize:
> ```
> struct MyClass
> {
>     unsigned long long arr[128];
> };
> 
> [[gnu::noipa]]
> void sink(void *m);
> void gg(MyClass &a, MyClass *b)
> {
>   MyClass c = a;
>   *b = c;
>   sink(b);
> }
> ```
> 
> As I mentioned there are dups of the above testcase.

Would you mind pointing me to the original issue?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
                   ` (4 preceding siblings ...)
  2023-05-31  1:46 ` ptk.prasertsuk at gmail dot com
@ 2023-05-31  6:34 ` rguenther at suse dot de
  2023-06-03  0:23 ` ptk.prasertsuk at gmail dot com
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: rguenther at suse dot de @ 2023-05-31  6:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 30 May 2023, pinskia at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035
> 
> Andrew Pinski <pinskia at gcc dot gnu.org> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>            Keywords|                            |missed-optimization
>      Ever confirmed|0                           |1
>            Severity|normal                      |enhancement
>    Last reconfirmed|                            |2023-05-30
>              Status|UNCONFIRMED                 |NEW
> 
> --- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> More obvious Reduced testcase:
> ```
> struct MyClass
> {
>     unsigned long long arr[128];
> };
> 
> [[gnu::noipa]]
> void sink(void *m){}
> void gg(MyClass &a)
> {
>   MyClass c = a;
>   MyClass *b = new MyClass;
>   *b = c;
>   sink(b);
> }
> ```
> 
> There might be a dup of this issue too.

But we cannot move the load of 'a' across the call to operator new
since that can possibly clobber 'a' (you can replace 'new' with
something having observable side effects).

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
                   ` (5 preceding siblings ...)
  2023-05-31  6:34 ` rguenther at suse dot de
@ 2023-06-03  0:23 ` ptk.prasertsuk at gmail dot com
  2023-06-05  7:16 ` rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: ptk.prasertsuk at gmail dot com @ 2023-06-03  0:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #7 from Pontakorn Prasertsuk <ptk.prasertsuk at gmail dot com> ---
For the LLVM IR of the snippet I provided, Clang's alias analysis can prove
that the `new` call has no side effects on other memory locations. This is
indicated by the `noalias` keyword on the return value of the `new` call (_Znwm).

According to the LLVM Language Reference Manual:
"On function return values, the noalias attribute indicates that the function
acts like a system memory allocation function, returning a pointer to allocated
storage disjoint from the storage for any other object accessible to the
caller."

Is this possible for GCC's alias analysis pass?
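
For comparison, GCC has a function attribute with a similar meaning for return
values, `__attribute__((malloc))`. The sketch below is only an illustration
with hypothetical names (`my_alloc`, `store_through_both`); it shows what a
no-alias promise on the returned pointer covers.

```cpp
#include <cstddef>
#include <cstdlib>

// GNU extension: promises that the returned pointer does not alias any other
// pointer that is valid when the function returns. It speaks only about the
// returned pointer, not about other side effects of the call.
__attribute__((malloc)) void* my_alloc(std::size_t n) {
    return std::malloc(n);
}

// Hypothetical helper: writes through an existing pointer and a fresh one.
int store_through_both(int* existing) {
    int* fresh = static_cast<int*>(my_alloc(sizeof(int)));
    if (fresh == nullptr) return -1;   // allocation failed
    *existing = 1;
    *fresh = 2;                        // per the attribute, cannot overwrite *existing
    int result = *existing;            // so the compiler may assume this is still 1
    std::free(fresh);
    return result;
}
```

Note that this kind of promise is about which memory the returned pointer can
refer to, which is a separate question from whether the allocation call itself
may modify other objects.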

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
                   ` (6 preceding siblings ...)
  2023-06-03  0:23 ` ptk.prasertsuk at gmail dot com
@ 2023-06-05  7:16 ` rguenth at gcc dot gnu.org
  2023-06-05  7:58 ` ptk.prasertsuk at gmail dot com
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-05  7:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Pontakorn Prasertsuk from comment #7)
> For the LLVM IR code of the snippet I provided, Clang's alias analysis can
> prove that `new` call has no side effect to other memory location. This is
> indicated by `noalias` keyword at the return value of the `new` call (_Znwm).
> 
> According to Clang's Language Reference:
> "On function return values, the noalias attribute indicates that the
> function acts like a system memory allocation function, returning a pointer
> to allocated storage disjoint from the storage for any other object
> accessible to the caller."
> 
> Is this possible for GCC alias analysis pass?

>   MyClass c = a;
>   MyClass *b = new MyClass;
>   *b = c;

the point is that 'new' can alter the value of 'a', GCC already knows that
'b' is distinct from c and a but that's not the relevant thing.  It looks
like LLVM creates wrong-code here.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
                   ` (7 preceding siblings ...)
  2023-06-05  7:16 ` rguenth at gcc dot gnu.org
@ 2023-06-05  7:58 ` ptk.prasertsuk at gmail dot com
  2023-06-05  8:11 ` rguenther at suse dot de
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: ptk.prasertsuk at gmail dot com @ 2023-06-05  7:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #9 from Pontakorn Prasertsuk <ptk.prasertsuk at gmail dot com> ---
(In reply to Richard Biener from comment #8)
> (In reply to Pontakorn Prasertsuk from comment #7)
> > For the LLVM IR code of the snippet I provided, Clang's alias analysis can
> > prove that `new` call has no side effect to other memory location. This is
> > indicated by `noalias` keyword at the return value of the `new` call (_Znwm).
> > 
> > According to Clang's Language Reference:
> > "On function return values, the noalias attribute indicates that the
> > function acts like a system memory allocation function, returning a pointer
> > to allocated storage disjoint from the storage for any other object
> > accessible to the caller."
> > 
> > Is this possible for GCC alias analysis pass?
> 
> >   MyClass c = a;
> >   MyClass *b = new MyClass;
> >   *b = c;
> 
> the point is that 'new' can alter the value of 'a', GCC already knows that
> 'b' is distinct from c and a but that's not the relevant thing.  It looks
> like LLVM creates wrong-code here.

In what case can 'new' alter 'a'? I thought memory allocation functions such as
'malloc', 'calloc' and 'new' cannot alias memory locations other than their
return value.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
                   ` (8 preceding siblings ...)
  2023-06-05  7:58 ` ptk.prasertsuk at gmail dot com
@ 2023-06-05  8:11 ` rguenther at suse dot de
  2023-06-06  5:46 ` ptk.prasertsuk at gmail dot com
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: rguenther at suse dot de @ 2023-06-05  8:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #10 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 5 Jun 2023, ptk.prasertsuk at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035
> 
> --- Comment #9 from Pontakorn Prasertsuk <ptk.prasertsuk at gmail dot com> ---
> (In reply to Richard Biener from comment #8)
> > (In reply to Pontakorn Prasertsuk from comment #7)
> > > For the LLVM IR code of the snippet I provided, Clang's alias analysis can
> > > prove that `new` call has no side effect to other memory location. This is
> > > indicated by `noalias` keyword at the return value of the `new` call (_Znwm).
> > > 
> > > According to Clang's Language Reference:
> > > "On function return values, the noalias attribute indicates that the
> > > function acts like a system memory allocation function, returning a pointer
> > > to allocated storage disjoint from the storage for any other object
> > > accessible to the caller."
> > > 
> > > Is this possible for GCC alias analysis pass?
> > 
> > >   MyClass c = a;
> > >   MyClass *b = new MyClass;
> > >   *b = c;
> > 
> > the point is that 'new' can alter the value of 'a', GCC already knows that
> > 'b' is distinct from c and a but that's not the relevant thing.  It looks
> > like LLVM creates wrong-code here.
> 
> In what case can 'new' alter 'a'? I thought memory allocation functions such as
> 'malloc, 'calloc' and 'new' cannot alias other memory locations than its return
> value.

'new' can be overridden by the user, you can declare your own 
implementation that does fancy stuff behind the scenes, including
in the above case altering 'a'.  Welcome to C++ ...
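
A minimal sketch of such a replacement, for illustration only: `globalA`, the
value 42, and `copy_then_allocate` are made up here and are not taken from the
attached test case. The replaced `operator new` writes to an object the caller
can also see, so whether the load of `a` happens before or after the
allocation becomes observable.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

struct MyClass {
    unsigned long long arr[6];
};

MyClass globalA{};   // hypothetical object that the caller passes by reference

// User-provided replacement of the global allocation function. Besides
// allocating, it modifies globalA behind the caller's back.
void* operator new(std::size_t size) {
    globalA.arr[0] = 42;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}

// Matching replacements so that memory from malloc is released with free.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

unsigned long long copy_then_allocate(MyClass& a) {
    MyClass c = a;              // snapshot of a taken before the allocation
    MyClass* b = new MyClass;   // may run the replacement above
    *b = c;
    unsigned long long v = b->arr[0];
    delete b;
    return v;
}
```

With the loads kept before the call, `copy_then_allocate(globalA)` returns the
value `globalA.arr[0]` had on entry; a compiler that assumes `operator new` has
no such side effects may return 42 instead.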

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
                   ` (9 preceding siblings ...)
  2023-06-05  8:11 ` rguenther at suse dot de
@ 2023-06-06  5:46 ` ptk.prasertsuk at gmail dot com
  2023-06-06  5:49 ` ptk.prasertsuk at gmail dot com
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: ptk.prasertsuk at gmail dot com @ 2023-06-06  5:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #11 from Pontakorn Prasertsuk <ptk.prasertsuk at gmail dot com> ---
(In reply to rguenther@suse.de from comment #10)
> On Mon, 5 Jun 2023, ptk.prasertsuk at gmail dot com wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035
> > 
> > --- Comment #9 from Pontakorn Prasertsuk <ptk.prasertsuk at gmail dot com> ---
> > (In reply to Richard Biener from comment #8)
> > > (In reply to Pontakorn Prasertsuk from comment #7)
> > > > For the LLVM IR code of the snippet I provided, Clang's alias analysis can
> > > > prove that `new` call has no side effect to other memory location. This is
> > > > indicated by `noalias` keyword at the return value of the `new` call (_Znwm).
> > > > 
> > > > According to Clang's Language Reference:
> > > > "On function return values, the noalias attribute indicates that the
> > > > function acts like a system memory allocation function, returning a pointer
> > > > to allocated storage disjoint from the storage for any other object
> > > > accessible to the caller."
> > > > 
> > > > Is this possible for GCC alias analysis pass?
> > > 
> > > >   MyClass c = a;
> > > >   MyClass *b = new MyClass;
> > > >   *b = c;
> > > 
> > > the point is that 'new' can alter the value of 'a', GCC already knows that
> > > 'b' is distinct from c and a but that's not the relevant thing.  It looks
> > > like LLVM creates wrong-code here.
> > 
> > In what case can 'new' alter 'a'? I thought memory allocation functions such as
> > 'malloc, 'calloc' and 'new' cannot alias other memory locations than its return
> > value.
> 
> 'new' can be overridden by the user, you can declare your own 
> implementation that does fancy stuff behind the scenes, including
> in the above case altering 'a'.  Welcome to C++ ...

I assume you are referring to this case: https://godbolt.org/z/z4Y7YdxWE

Clang indeed assumes that 'new' is non-alias, and this assumption can be turned
off by using -fno-assume-sane-operator-new.

However, can we safely assume that 'malloc' and 'calloc' are non-alias as well?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
                   ` (10 preceding siblings ...)
  2023-06-06  5:46 ` ptk.prasertsuk at gmail dot com
@ 2023-06-06  5:49 ` ptk.prasertsuk at gmail dot com
  2023-06-06  8:17 ` rguenther at suse dot de
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: ptk.prasertsuk at gmail dot com @ 2023-06-06  5:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #12 from Pontakorn Prasertsuk <ptk.prasertsuk at gmail dot com> ---
I notice that GCC also does not optimize this case:
https://godbolt.org/z/7oGqjqqz4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
                   ` (11 preceding siblings ...)
  2023-06-06  5:49 ` ptk.prasertsuk at gmail dot com
@ 2023-06-06  8:17 ` rguenther at suse dot de
  2023-06-06  8:29 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: rguenther at suse dot de @ 2023-06-06  8:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #13 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 6 Jun 2023, ptk.prasertsuk at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035
> 
> --- Comment #11 from Pontakorn Prasertsuk <ptk.prasertsuk at gmail dot com> ---
> (In reply to rguenther@suse.de from comment #10)
> > On Mon, 5 Jun 2023, ptk.prasertsuk at gmail dot com wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035
> > > 
> > > --- Comment #9 from Pontakorn Prasertsuk <ptk.prasertsuk at gmail dot com> ---
> > > (In reply to Richard Biener from comment #8)
> > > > (In reply to Pontakorn Prasertsuk from comment #7)
> > > > > For the LLVM IR code of the snippet I provided, Clang's alias analysis can
> > > > > prove that `new` call has no side effect to other memory location. This is
> > > > > indicated by `noalias` keyword at the return value of the `new` call (_Znwm).
> > > > > 
> > > > > According to Clang's Language Reference:
> > > > > "On function return values, the noalias attribute indicates that the
> > > > > function acts like a system memory allocation function, returning a pointer
> > > > > to allocated storage disjoint from the storage for any other object
> > > > > accessible to the caller."
> > > > > 
> > > > > Is this possible for GCC alias analysis pass?
> > > > 
> > > > >   MyClass c = a;
> > > > >   MyClass *b = new MyClass;
> > > > >   *b = c;
> > > > 
> > > > the point is that 'new' can alter the value of 'a', GCC already knows that
> > > > 'b' is distinct from c and a but that's not the relevant thing.  It looks
> > > > like LLVM creates wrong-code here.
> > > 
> > > In what case can 'new' alter 'a'? I thought memory allocation functions such as
> > > 'malloc, 'calloc' and 'new' cannot alias other memory locations than its return
> > > value.
> > 
> > 'new' can be overridden by the user, you can declare your own 
> > implementation that does fancy stuff behind the scenes, including
> > in the above case altering 'a'.  Welcome to C++ ...
> 
> I assume you are referring to this case: https://godbolt.org/z/z4Y7YdxWE
> 
> Clang indeed assumes that 'new' is non-alias and this feature can be turned off
> by using -fno-assume-sane-operator-new
> 
> However, can we safely assume that 'malloc' and 'calloc' are non-alias as well?

Well, we do.  For the C++ new case we did and it did break real world
programs.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
                   ` (12 preceding siblings ...)
  2023-06-06  8:17 ` rguenther at suse dot de
@ 2023-06-06  8:29 ` rguenth at gcc dot gnu.org
  2023-06-06  9:12 ` amonakov at gcc dot gnu.org
  2023-06-06 11:54 ` rguenther at suse dot de
  15 siblings, 0 replies; 17+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-06  8:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aagarwa at gcc dot gnu.org,
                   |                            |amonakov at gcc dot gnu.org

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Pontakorn Prasertsuk from comment #12)
> I notice that GCC also does not optimize this case:
> https://godbolt.org/z/7oGqjqqz4

Yes.  To quote:

#include <array>
#include <cstdint>
#include <cstdlib>
#include <iostream>

struct MyClass {
    std::array<uint64_t, 6> arr;
};

MyClass globalA;

// Prevent optimization
void sink(MyClass *m) { std::cout << m->arr[0] << std::endl; }

void __attribute__((noinline)) gg(MyClass &a) {
    MyClass c = a;
    MyClass *b = (MyClass *)malloc(sizeof(MyClass));
    *b = c;
    sink(b);
}

and we do RTL expansion from

  <bb 2> [local count: 1073741824]:
  vect_c_arr__M_elems_0_6.31_25 = MEM <vector(2) long unsigned int> [(long
unsigned int *)a_2(D)];
  vect_c_arr__M_elems_0_6.32_27 = MEM <vector(2) long unsigned int> [(long
unsigned int *)a_2(D) + 16B];
  vect_c_arr__M_elems_0_6.33_29 = MEM <vector(2) long unsigned int> [(long
unsigned int *)a_2(D) + 32B];
  b_4 = malloc (48);
  MEM <vector(2) long unsigned int> [(long unsigned int *)b_4] =
vect_c_arr__M_elems_0_6.31_25;
  MEM <vector(2) long unsigned int> [(long unsigned int *)b_4 + 16B] =
vect_c_arr__M_elems_0_6.32_27;
  MEM <vector(2) long unsigned int> [(long unsigned int *)b_4 + 32B] =
vect_c_arr__M_elems_0_6.33_29;
  sink (b_4); [tail call]

Note that the temporary was elided, but we specifically prevent TER
(temporary expression replacement, in effect some magic scheduling of
stmts within a basic block) from crossing function calls, and there's no
optimization phase that would try to optimize register pressure over
calls.  In this case we want to sink the loads across the call; in other
cases we want to avoid doing so.  In the end this would be a job for a
late-running pass that factors in things like register pressure and the
set of call-clobbered registers.

I'll note that -fschedule-insns doesn't seem to have any effect here,
but I also remember that scheduling around calls was recently fiddled with,
specifically in r13-5154-g733a1b777f16cd which restricts motion even
with -fsched-pressure (not sure how that honors call clobbered regs).

In the above case the GPR for a_2(D) would be needed after the call
(but there are GPRs that are not call clobbered), while the three data
vectors in xmm would no longer be live across the call (and all vector
registers are call clobbered on x86).

Of course I'm not sure at all whether RTL scheduling can disambiguate
against a 'malloc' call.
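
For illustration only, the effect of sinking the loads across the call can
be written out by hand at the source level. The sketch below is not taken
from the bug report: gg_sunk is a hypothetical name, the null check is an
addition the original test case does not have, and whether GCC then emits
the shorter sequence has not been verified here. It only shows the
statement order under discussion, where nothing loaded from 'a' has to
stay live across malloc.

```cpp
#include <array>
#include <cstdint>
#include <cstdlib>

struct MyClass {
    std::array<uint64_t, 6> arr;
};

// Hypothetical variant of gg() from the test case: allocate first, then
// copy straight from 'a' into the new block, so no vector value has to
// be kept in a register (or spilled) across the malloc call.
MyClass *gg_sunk(const MyClass &a) {
    MyClass *b = static_cast<MyClass *>(std::malloc(sizeof(MyClass)));
    if (b == nullptr)
        return nullptr;  // assumption: the original test case has no such check
    *b = a;              // loads and stores both happen after the call
    return b;            // caller releases the block with std::free
}
```

The compiler may only do this reordering itself if it can prove that the
malloc call does not write to the memory 'a' refers to, which is the
disambiguation question raised above.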

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
                   ` (13 preceding siblings ...)
  2023-06-06  8:29 ` rguenth at gcc dot gnu.org
@ 2023-06-06  9:12 ` amonakov at gcc dot gnu.org
  2023-06-06 11:54 ` rguenther at suse dot de
  15 siblings, 0 replies; 17+ messages in thread
From: amonakov at gcc dot gnu.org @ 2023-06-06  9:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #15 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
malloc and friends modify 'errno' on failure, so they would have to be
special-cased for alias analysis.
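
A minimal sketch of that side effect, assuming a POSIX-like libc (POSIX
requires a failing malloc to set errno to ENOMEM; ISO C alone does not
promise that errno is touched). AllocProbe and probe_malloc_errno are
hypothetical names used only for this illustration, and the request may
behave differently under tools that intercept the allocator.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

struct AllocProbe {
    bool failed;  // true if malloc returned a null pointer
    int err;      // value of errno observed right after the call
};

// Hypothetical helper: ask for an impossibly large block and record what
// the call did to errno.
AllocProbe probe_malloc_errno() {
    volatile std::size_t huge = SIZE_MAX;  // volatile: size is not a compile-time constant
    errno = 0;
    void *p = std::malloc(huge);
    AllocProbe r{p == nullptr, errno};     // read errno before any other library call
    std::free(p);                          // free(nullptr) is a no-op
    return r;
}
```

Because the call can store to errno, a load that is moved from before the
call to after it must be known not to alias errno.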

* [Bug tree-optimization/110035] Missed optimization for dependent assignment statements
  2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
                   ` (14 preceding siblings ...)
  2023-06-06  9:12 ` amonakov at gcc dot gnu.org
@ 2023-06-06 11:54 ` rguenther at suse dot de
  15 siblings, 0 replies; 17+ messages in thread
From: rguenther at suse dot de @ 2023-06-06 11:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #16 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 6 Jun 2023, amonakov at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035
> 
> --- Comment #15 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
> malloc and friends modify 'errno' on failure, so they would have to be
> special-cased for alias analysis.

That's already handled, but it's conditional on -fmath-errno (there's a PR
about that).

Thread overview: 17+ messages
2023-05-30  8:35 [Bug tree-optimization/110035] New: Missed optimization for dependent assignment statements ptk.prasertsuk at gmail dot com
2023-05-30 11:55 ` [Bug tree-optimization/110035] " rguenth at gcc dot gnu.org
2023-05-30 16:45 ` pinskia at gcc dot gnu.org
2023-05-30 17:22 ` pinskia at gcc dot gnu.org
2023-05-31  1:20 ` ptk.prasertsuk at gmail dot com
2023-05-31  1:46 ` ptk.prasertsuk at gmail dot com
2023-05-31  6:34 ` rguenther at suse dot de
2023-06-03  0:23 ` ptk.prasertsuk at gmail dot com
2023-06-05  7:16 ` rguenth at gcc dot gnu.org
2023-06-05  7:58 ` ptk.prasertsuk at gmail dot com
2023-06-05  8:11 ` rguenther at suse dot de
2023-06-06  5:46 ` ptk.prasertsuk at gmail dot com
2023-06-06  5:49 ` ptk.prasertsuk at gmail dot com
2023-06-06  8:17 ` rguenther at suse dot de
2023-06-06  8:29 ` rguenth at gcc dot gnu.org
2023-06-06  9:12 ` amonakov at gcc dot gnu.org
2023-06-06 11:54 ` rguenther at suse dot de
