public inbox for gcc-bugs@sourceware.org
* [Bug c++/114342] New: suboptimal codegen of vector::vector(range)
@ 2024-03-14 23:25 hiraditya at msn dot com
  2024-03-14 23:32 ` [Bug middle-end/114342] " pinskia at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: hiraditya at msn dot com @ 2024-03-14 23:25 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114342

            Bug ID: 114342
           Summary: suboptimal codegen of vector::vector(range)
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hiraditya at msn dot com
  Target Milestone: ---

#include <vector>
#include <ranges>

std::vector<int> td() {
  int arr[]{-5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10,
15, -5, 10, 15,-5, 10, 15 -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5,
10, 15, -5, 10, 15, -5, 10,-5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5,
10, 15, -5, 10,-5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5,
10,-5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10,-5, 10,
15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10,};
  auto b = std::ranges::begin(arr);
  auto e = std::ranges::end(arr);
  std::vector<int> dd(b, e);
  return dd;
}

Why does GCC emit `rep movsq` twice here?

$ gcc -O3 -std=c++23
```
td():
        push    rbp
        mov     esi, OFFSET FLAT:.LC0
        mov     ecx, 55
        pxor    xmm0, xmm0
        push    rbx
        mov     rbx, rdi
        sub     rsp, 456
        mov     QWORD PTR [rbx+16], 0
        mov     rbp, rsp
        movups  XMMWORD PTR [rbx], xmm0
        mov     rdi, rbp
        rep movsq
        mov     eax, DWORD PTR [rsi]
        mov     DWORD PTR [rdi], eax
        mov     edi, 444
        call    operator new(unsigned long)
        lea     rdx, [rax+444]
        mov     QWORD PTR [rbx], rax
        lea     rdi, [rax+8]
        mov     rsi, rbp
        mov     QWORD PTR [rbx+16], rdx
        mov     rcx, QWORD PTR [rsp]
        and     rdi, -8
        mov     QWORD PTR [rax], rcx
        mov     rcx, QWORD PTR [rsp+436]
        mov     QWORD PTR [rax+436], rcx
        sub     rax, rdi
        sub     rsi, rax
        add     eax, 444
        shr     eax, 3
        mov     ecx, eax
        mov     rax, rbx
        rep movsq
        mov     QWORD PTR [rbx+8], rdx
        add     rsp, 456
        pop     rbx
        pop     rbp
        ret
        mov     rbp, rax
        jmp     .L2
td() [clone .cold]:
.L2:
        mov     rdi, QWORD PTR [rbx]
        mov     rsi, QWORD PTR [rbx+16]
        sub     rsi, rdi
        test    rdi, rdi
        je      .L3
        call    operator delete(void*, unsigned long)
.L3:
        mov     rdi, rbp
        call    _Unwind_Resume
```

https://godbolt.org/z/5333db8Px


* [Bug middle-end/114342] suboptimal codegen of vector::vector(range)
From: pinskia at gcc dot gnu.org @ 2024-03-14 23:32 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114342

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The first memcpy (`rep movsq`) initializes the stack array `arr` from its
constant initializer data:
```
  int arr[]{-5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10,
15, -5, 10, 15,-5, 10, 15 -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5,
10, 15, -5, 10, 15, -5, 10,-5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5,
10, 15, -5, 10,-5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5,
10,-5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10,-5, 10,
15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10, 15, -5, 10,};

```
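
As a sanity check on the sizes, here is a small sketch. It assumes a typical x86-64 target where `int` is 4 bytes. The element count of 111 is a hand count of the initializer as written (where `15 -5` with no comma is a single element), and the constant names are hypothetical. The `mov ecx, 55` / `rep movsq` pair copies 55 quadwords and the following dword move covers the last 4 bytes, which together match the 444 bytes passed to `operator new`.

```cpp
#include <cstddef>

// Hypothetical names, for illustration only.
// Hand-counted number of elements in the reporter's initializer list.
constexpr std::size_t kElems = 111;
// Size of arr in bytes, assuming a 4-byte int.
constexpr std::size_t kBytes = kElems * sizeof(int);

static_assert(sizeof(int) == 4, "sketch assumes a 4-byte int");
// Matches the argument to operator new and __builtin_memcpy.
static_assert(kBytes == 444, "arr should occupy 444 bytes");
// 55 quadwords from rep movsq plus one trailing 4-byte move.
static_assert(55 * 8 + 4 == kBytes, "rep movsq count plus dword tail");
```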


The second memcpy copies `arr` into the vector's newly allocated storage.
```
  _25 = operator new (444);

  <bb 3> [local count: 1073741824]:
  dd_2(D)->D.81462._M_impl.D.80768._M_start = _25;
  _16 = _25 + 444;
  dd_2(D)->D.81462._M_impl.D.80768._M_end_of_storage = _16;
  __builtin_memcpy (_25, &arr, 444);
```
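
Written out by hand, the two copies look roughly like the sketch below. This is only an illustration, not what GCC generates internally: `lc0` is a hypothetical, shortened stand-in for the `.LC0` constant data, and `td_two_copies` is a made-up name.

```cpp
#include <cstring>
#include <iterator>
#include <vector>

// Hypothetical, shortened stand-in for the read-only data GCC emits as .LC0.
constexpr int lc0[]{-5, 10, 15, -5, 10, 15};

std::vector<int> td_two_copies() {
  int arr[std::size(lc0)];
  // Copy 1: initialize the stack array from the read-only data.
  std::memcpy(arr, lc0, sizeof arr);
  // Copy 2: the iterator-pair constructor allocates heap storage
  // and copies the contents of arr into it.
  std::vector<int> dd(std::begin(arr), std::end(arr));
  return dd;
}
```

The stack array only exists to be copied from, which is why the first copy looks redundant.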


* [Bug middle-end/114342] suboptimal codegen of vector::vector(range)
From: pinskia at gcc dot gnu.org @ 2024-03-14 23:37 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114342

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
           Severity|normal                      |enhancement
   Last reconfirmed|                            |2024-03-14

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
If you change `arr` to be `static const int arr[]`, then GCC can do it with
only one memcpy.
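
A minimal sketch of that change, using a shortened, hypothetical initializer and a made-up function name (`td_static`) rather than the reporter's full 111-element array:

```cpp
#include <iterator>
#include <vector>

std::vector<int> td_static() {
  // Static storage duration: arr is not re-initialized on the stack per call.
  static const int arr[]{-5, 10, 15, -5, 10, 15, -5, 10};
  // The remaining copy is from arr into the vector's heap storage.
  std::vector<int> dd(std::begin(arr), std::end(arr));
  return dd;
}
```

Whether the copy is actually emitted as a single memcpy depends on the GCC version, the flags, and the array size, so it is worth checking the generated assembly for the real test case.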

So basically GCC does not know that it can remove `arr` as a stack variable.

clang/LLVM is able to figure that out but it definitely requires inlining to do
that.

```
  arr = *.LC0;
...
  __builtin_memcpy (_21, &arr, 444);
```

Basically GCC does not realize it can "remove" the local variable arr here.

Note there are duplicates of this bug report already too.


* [Bug middle-end/114342] suboptimal codegen of vector::vector(range)
From: hiraditya at msn dot com @ 2024-03-19 17:26 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114342

AK <hiraditya at msn dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
            Version|unknown                     |14.0
             Status|NEW                         |RESOLVED

--- Comment #3 from AK <hiraditya at msn dot com> ---
I see. Marking as duplicate. Thanks for clarifying!

*** This bug has been marked as a duplicate of bug 59863 ***
