* [Bug rtl-optimization/114215] New: GCC makes wrong decision for inline with -Os or -Oz to deal with trivial functions
@ 2024-03-03  2:53 unlvsur at live dot com
  2024-03-03  3:06 ` [Bug ipa/114215] " pinskia at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: unlvsur at live dot com @ 2024-03-03  2:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215

            Bug ID: 114215
           Summary: GCC makes wrong decision for inline with -Os or -Oz to
                    deal with trivial functions
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: unlvsur at live dot com
  Target Milestone: ---

This is a very common example of implementing a bounds-checked vector index.
GCC makes the wrong inlining decision: with -Os or -Oz the code size increases
rather than decreases compared to -O3.

GCC with -Oz
https://godbolt.org/z/sa9YYqnYY
GCC with -Ofast
https://godbolt.org/z/b6jahvh6s

clang with -Oz
https://godbolt.org/z/GxPaxP66b
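
The testcase itself sits behind the godbolt links, so the following is only a
minimal sketch of the kind of bounds-checked accessor under discussion, not the
reporter's code. The names checkedvector, value_type, size_type and
__builtin_trap appear later in this thread; the two-pointer layout, the
constructor, size() and the 8-byte element type are assumptions inferred from
the assembly posted in comment #3.

```cpp
#include <cstddef>

// Hypothetical reconstruction for illustration; member names and layout
// are assumptions, not taken from the real testcase.
class checkedvector
{
public:
  using value_type = std::size_t;  // assumed 8-byte element
  using size_type = std::size_t;

  checkedvector(value_type* first, value_type* last) noexcept
    : begin_ptr(first), end_ptr(last) {}

  size_type size() const noexcept
  {
    return static_cast<size_type>(end_ptr - begin_ptr);
  }

  // Bounds-checked index: crash via __builtin_trap() instead of
  // touching memory outside [begin_ptr, end_ptr).
  value_type& operator[](size_type idx) noexcept
  {
    if (size() <= idx)
      __builtin_trap();
    return begin_ptr[idx];
  }

private:
  value_type* begin_ptr;
  value_type* end_ptr;
};
```

The accessor body is a subtraction, a compare, a branch and a trap, which is
why the report calls it trivial.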


* [Bug ipa/114215] GCC makes wrong decision for inline with -Os or -Oz to deal with trivial functions
From: pinskia at gcc dot gnu.org @ 2024-03-03  3:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 57597
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57597&action=edit
Testcase

Next time, please attach the testcase rather than just linking to godbolt.


* [Bug ipa/114215] GCC makes wrong decision for inline with -Os or -Oz to deal with trivial functions
From: pinskia at gcc dot gnu.org @ 2024-03-03  3:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2024-03-03

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
    checkedvector::value_type& checkedvector::operator[](size_type)/167 call is
unlikely and code size would grow
      freq:1.00 loop depth: 0 size: 4 time: 13 callee size: 6 stack: 0
       op1 is compile time invariant



If I had:
```
void test_demovector(checkedvector& vec, int x) noexcept
{
  for(int i = 0; i < x; i++)
    vec[i]=5;
}
```

Then GCC will inline operator[]:
    checkedvector::value_type& checkedvector::operator[](size_type)/167 inlined
      freq:8.09
      Stack frame offset 0, callee self size 0
      void __builtin_trap()/205 function body not available
        freq:0.00 loop depth: 1 size: 1 time:  1

I suspect this is the right heuristic. Do you have real code where the
inlining does not happen at -Os/-Oz, or are you just looking at the code
generation for code that might be benchmarking things?


* [Bug ipa/114215] GCC makes wrong decision for inline with -Os or -Oz to deal with trivial functions
From: unlvsur at live dot com @ 2024-03-03  3:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215

--- Comment #3 from cqwrteur <unlvsur at live dot com> ---
test_demovector(checkedvector&):
        pushq   %rbx
        movq    %rdi, %rbx
        pushq   $4
        popq    %rsi
        call    checkedvector::operator[](unsigned long)
        movq    %rbx, %rdi
        movq    $5, (%rax)
        pushq   $6
        popq    %rsi
        call    checkedvector::operator[](unsigned long)
        movq    %rbx, %rdi
        movq    $5, (%rax)
        pushq   $10
        popq    %rsi
        call    checkedvector::operator[](unsigned long)
        movq    %rbx, %rdi
        movq    $5, (%rax)
        pushq   $5
        popq    %rsi
        call    checkedvector::operator[](unsigned long)
        movq    $5, (%rax)
        popq    %rbx
        ret
test_demovector_forceinline(checkedvector&):
        movq    (%rdi), %rax
        movq    8(%rdi), %rdx
        subq    %rax, %rdx
        cmpq    $32, %rdx
        ja      .L7
.L8:
        ud2
.L7:
        movq    $5, 32(%rax)
        cmpq    $48, %rdx
        jbe     .L8
        movq    $5, 48(%rax)
        cmpq    $80, %rdx
        jbe     .L8
        movq    $5, 80(%rax)
        movq    $5, 40(%rax)
        ret

See? The first one has more instructions than the second one (23 versus 15).
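
For readers without the godbolt link open, the source behind the two listings
above presumably looks something like the sketch below. The indices 4, 6, 10
and 5 are read off the `pushq $4` / `$6` / `$10` / `$5` instructions, and
index_forceinline is the name used in comment #4. Everything else (the struct
layout, the element type, the use of `[[gnu::always_inline]]` to force
inlining) is an assumption made for illustration.

```cpp
#include <cstddef>

// Hypothetical stand-in for the real testcase; layout and element type
// are assumptions inferred from the assembly.
struct checkedvector
{
  using value_type = std::size_t;
  using size_type = std::size_t;

  value_type* begin_ptr;
  value_type* end_ptr;

  // Ordinary accessor: the inliner is free to keep this out of line.
  value_type& operator[](size_type idx) noexcept
  {
    if (static_cast<size_type>(end_ptr - begin_ptr) <= idx)
      __builtin_trap();
    return begin_ptr[idx];
  }

  // Same body, but inlining is forced with the GNU always_inline attribute.
  [[gnu::always_inline]] inline value_type& index_forceinline(size_type idx) noexcept
  {
    if (static_cast<size_type>(end_ptr - begin_ptr) <= idx)
      __builtin_trap();
    return begin_ptr[idx];
  }
};

// Four constant-index stores through the ordinary accessor.
void test_demovector(checkedvector& vec) noexcept
{
  vec[4] = 5;
  vec[6] = 5;
  vec[10] = 5;
  vec[5] = 5;
}

// The same four stores through the force-inlined accessor.
void test_demovector_forceinline(checkedvector& vec) noexcept
{
  vec.index_forceinline(4) = 5;
  vec.index_forceinline(6) = 5;
  vec.index_forceinline(10) = 5;
  vec.index_forceinline(5) = 5;
}
```

Note that in the force-inlined listing the check for index 5 has disappeared:
once the check for index 10 has passed, the compiler can prove that index 5 is
in range too. That kind of simplification is only available after inlining.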


* [Bug ipa/114215] GCC makes wrong decision for inline with -Os or -Oz to deal with trivial functions
From: unlvsur at live dot com @ 2024-03-03  3:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215

--- Comment #4 from cqwrteur <unlvsur at live dot com> ---
void test_demovector(checkedvector& vec, __SIZE_TYPE__ x) noexcept
{
  for(__SIZE_TYPE__ i = 0; i < x; i++)
    vec[i]=5;
}

void test_demovector_forceinline(checkedvector& vec, __SIZE_TYPE__ x) noexcept
{
  for(__SIZE_TYPE__ i = 0; i < x; i++)
    vec.index_forceinline(i)=5;
}

Still more instructions for the first one, even with a loop (18 versus 14).

test_demovector(checkedvector&, unsigned long):
        pushq   %r12
        movq    %rdi, %r12
        pushq   %rbp
        movq    %rsi, %rbp
        pushq   %rbx
        xorl    %ebx, %ebx
.L10:
        cmpq    %rbp, %rbx
        je      .L13
        movq    %rbx, %rsi
        movq    %r12, %rdi
        incq    %rbx
        call    checkedvector::operator[](unsigned long)
        movq    $5, (%rax)
        jmp     .L10
.L13:
        popq    %rbx
        popq    %rbp
        popq    %r12
        ret



test_demovector_forceinline(checkedvector&, unsigned long):
        xorl    %eax, %eax
.L15:
        cmpq    %rsi, %rax
        je      .L18
        movq    (%rdi), %rcx
        movq    8(%rdi), %rdx
        subq    %rcx, %rdx
        sarq    $3, %rdx
        cmpq    %rdx, %rax
        jb      .L16
        ud2
.L16:
        movq    $5, (%rcx,%rax,8)
        incq    %rax
        jmp     .L15
.L18:
        ret


* [Bug ipa/114215] GCC makes wrong decision for inline with -Os or -Oz to deal with trivial functions
From: pinskia at gcc dot gnu.org @ 2024-03-03  3:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Still waiting on a full application rather than small benchmark-type sources.
The heuristic here is that if you call operator[] multiple times, it might be
better not to inline it for size reasons.
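
As a rough illustration of that reasoning, here is a toy model that uses the
numbers from the dump in comment #2. It reads `size: 4` as the estimated cost
of the call and `callee size: 6` as the estimated size of the callee body.
That reading, the function names and the formula are all assumptions made for
illustration. This is not GCC's actual cost function.

```cpp
// Toy model only: NOT GCC's real inliner cost function.
// Sizes are in the inliner's abstract units, not bytes or instructions.

// Estimated growth from inlining one call site: the callee body
// replaces the call sequence.
constexpr int growth_per_call_site(int call_size, int callee_size)
{
  return callee_size - call_size;
}

// Estimated growth over several call sites, assuming the out-of-line
// copy still has to be kept.
constexpr int total_growth(int call_sites, int call_size, int callee_size)
{
  return call_sites * growth_per_call_site(call_size, callee_size);
}

// With the dump's numbers: each inlined call is estimated to add 2 units,
// so the four calls in the testcase would add 8.
static_assert(growth_per_call_site(4, 6) == 2, "per-site estimate");
static_assert(total_growth(4, 4, 6) == 8, "four call sites");
```

Under this model every extra call site makes inlining look worse, which
matches "call is unlikely and code size would grow" in the dump. The assembly
in comment #3 shows the estimate going the other way for this particular
function once argument setup and the eliminated checks are counted.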


* [Bug ipa/114215] -Os or -Oz inlining seems wrong for vague linkage functions
From: unlvsur at live dot com @ 2024-03-03  3:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215

--- Comment #6 from cqwrteur <unlvsur at live dot com> ---
(In reply to Andrew Pinski from comment #5)
> Still waiting on a full application rather than small benchmark-type
> sources. The heuristic here is that if you call operator[] multiple times, it
> might be better not to inline it for size reasons.

I know, but here the function is so small that it is always better to inline
it, because all it does is an efficient bounds check. You do not want bounds
checking to be a function call.


* [Bug ipa/114215] -Os or -Oz inlining seems wrong for vague linkage functions
From: unlvsur at live dot com @ 2024-03-03  3:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215

--- Comment #7 from cqwrteur <unlvsur at live dot com> ---
__builtin_trap() is only there to crash the program.


* [Bug ipa/114215] -Os or -Oz inlining seems wrong for vague linkage functions
From: unlvsur at live dot com @ 2024-03-06  2:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114215

--- Comment #8 from cqwrteur <unlvsur at live dot com> ---
(In reply to Andrew Pinski from comment #5)
> Still waiting on a full application rather than small benchmark-type
> sources. The heuristic here is that if you call operator[] multiple times, it
> might be better not to inline it for size reasons.

LLVM actually makes a different decision here. GCC probably thinks
__builtin_trap is expensive even though it is not.
