* [Bug c++/109985] New: __builtin_prefetch ignored by GCC 12/13
@ 2023-05-26 11:38 pdimov at gmail dot com
From: pdimov at gmail dot com @ 2023-05-26 11:38 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109985

            Bug ID: 109985
           Summary: __builtin_prefetch ignored by GCC 12/13
           Product: gcc
           Version: 13.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pdimov at gmail dot com
  Target Milestone: ---

We are investigating a Boost.Unordered performance regression with GCC 12,
on the following benchmark:

https://github.com/boostorg/boost_unordered_benchmarks/blob/4c717baac1bff8d3e51cb8485b72bbb63d533265/scattered_lookup.cpp

and it looks like the reason is that GCC 12 (and 13) ignore a call to
`__builtin_prefetch`.

While GCC 11 generates this:

```
.L108:
        mov     r8, r12
        movdqa  xmm0, xmm1
        sal     r8, 4
        lea     r14, [r10+r8]
        pcmpeqb xmm0, XMMWORD PTR [r14]
        pmovmskb        edx, xmm0
        and     edx, 32767
        je      .L104
        sub     r8, r12
        sal     r8, 4
        add     r8, QWORD PTR [rbx+32]
        prefetcht0      [r8]
.L106:
        xor     r15d, r15d
        rep bsf r15d, edx
        movsx   r15, r15d
        sal     r15, 4
        add     r15, r8
        cmp     rsi, QWORD PTR [r15]
        jne     .L144
        add     r9, QWORD PTR [r15+8]
        mov     rax, rdi
        cmp     r11, rdi
        jne     .L145
```
(https://godbolt.org/z/d663fdM16 - note the `prefetcht0 [r8]` right before `.L106`)

GCC 12 generates this in the same function:
```
.L108:
        mov     r8, r10
        movdqa  xmm0, xmm1
        sal     r8, 4
        lea     r9, [rbp+0+r8]
        pcmpeqb xmm0, XMMWORD PTR [r9]
        pmovmskb        edx, xmm0
        and     edx, 32767
        je      .L104
        mov     rdi, QWORD PTR [rsp+16]
        sub     r8, r10
        mov     QWORD PTR [rsp+24], rax
        sal     r8, 4
        mov     rdi, QWORD PTR [rdi+32]
        mov     QWORD PTR [rsp+8], rdi
        mov     rax, rdi
.L106:
        xor     edi, edi
        rep bsf edi, edx
        movsx   rdi, edi
        sal     rdi, 4
        add     rdi, r8
        add     rdi, rax
        cmp     r11, QWORD PTR [rdi]
        jne     .L143
        add     rsi, 8
        add     rbx, QWORD PTR [rdi+8]
        cmp     r12, rsi
        jne     .L109
```
(https://godbolt.org/z/T7csq7TPz - there is no `prefetcht0` instruction before `.L106`)

Unfortunately, simplifying this code causes the prefetcht0 to be generated
again, so the problem does not reproduce in a reduced form.
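
For readers who want the shape of the affected code without opening the
benchmark, here is a rough C++ sketch of the pattern visible in the two
listings: build a 15-bit match mask, bail out if it is empty, prefetch the
element slots, then walk the set bits. It is not the Boost.Unordered source;
`Slot`, `Group`, `match` and `find_in_group` are made-up names, the 16-byte
slot and 15-slot group layout are only inferred from the address arithmetic in
the assembly, and the scalar mask loop stands in for the `pcmpeqb` /
`pmovmskb` / `and edx, 32767` sequence. As noted above, simplified code tends
to get its prefetcht0 back, so this is an illustration and not a reproducer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout, for illustration only (inferred from the listings,
// not taken from Boost.Unordered): 16 metadata bytes per group, of which
// the low 15 are compared, and 16-byte key/value slots.
struct Slot {
    std::uint64_t key;
    std::uint64_t value;
};

struct Group {
    std::uint8_t meta[16];
};

// Bit i of the result is set when meta[i] == tag, for i in [0, 15).
// Scalar stand-in for the pcmpeqb / pmovmskb / and 32767 sequence.
inline unsigned match(const Group& g, std::uint8_t tag) {
    unsigned mask = 0;
    for (unsigned i = 0; i < 15; ++i) {
        mask |= static_cast<unsigned>(g.meta[i] == tag) << i;
    }
    return mask;
}

// Looks up `key` among the slots of one group.
// Returns a pointer to the stored value, or nullptr if it is not present.
inline const std::uint64_t* find_in_group(const Group& g, const Slot* slots,
                                          std::uint64_t key, std::uint8_t tag) {
    unsigned mask = match(g, tag);
    if (mask == 0) {
        return nullptr;  // no candidate in this group
    }

    // Pure hint: the result is the same whether or not the compiler
    // emits a prefetch instruction for this call.
    __builtin_prefetch(slots);

    while (mask != 0) {
        // Index of the lowest set bit (the listings use `rep bsf` here).
        unsigned i = static_cast<unsigned>(__builtin_ctz(mask));
        if (slots[i].key == key) {
            return &slots[i].value;
        }
        mask &= mask - 1;  // clear the lowest set bit and try the next one
    }
    return nullptr;
}
```

Because the prefetch is only a hint, dropping it does not change what the
lookup returns, which is why the difference shows up as a performance
regression and not as a wrong result.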

Thread overview: 10+ messages
2023-05-26 11:38 [Bug c++/109985] New: __builtin_prefetch ignored by GCC 12/13 pdimov at gmail dot com
2023-05-26 16:36 ` [Bug tree-optimization/109985] " pinskia at gcc dot gnu.org
2023-05-26 17:15 ` christian.mazakas at gmail dot com
2023-05-26 17:17 ` pinskia at gcc dot gnu.org
2023-05-26 17:38 ` jakub at gcc dot gnu.org
2023-05-26 22:28 ` pinskia at gcc dot gnu.org
2023-05-28 20:33 ` hubicka at gcc dot gnu.org
2023-05-28 20:40 ` hubicka at gcc dot gnu.org
2023-05-30  7:31 ` rguenth at gcc dot gnu.org
2023-05-30  7:32 ` [Bug tree-optimization/109985] [12/13/14 Regression] " rguenth at gcc dot gnu.org
