public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/104547] New: std::vector::resize(v.size() - n) produces poor code
@ 2022-02-15 11:28 redi at gcc dot gnu.org
  2022-02-15 11:32 ` [Bug tree-optimization/104547] " redi at gcc dot gnu.org
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: redi at gcc dot gnu.org @ 2022-02-15 11:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104547

            Bug ID: 104547
           Summary: std::vector::resize(v.size() - n) produces poor code
           Product: gcc
           Version: 11.2.1
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

The codegen for this is pretty bad:

#include <vector>

void shrink(std::vector<int>& v, unsigned n) {
    v.resize(v.size() - n);
}

_Z6shrinkRSt6vectorIiSaIiEEj:
.LFB865:
        .cfi_startproc
        movq    8(%rdi), %rdx
        movq    (%rdi), %rcx
        movl    %esi, %esi
        movq    %rdx, %rax
        subq    %rcx, %rax
        sarq    $2, %rax
        movq    %rax, %r8
        subq    %rsi, %r8
        jb      .L3
        cmpq    %rax, %r8
        jnb     .L1
        leaq    (%rcx,%r8,4), %rax
        cmpq    %rax, %rdx
        je      .L1
        movq    %rax, 8(%rdi)
.L1:
        ret
.L3:
        pushq   %rax
        .cfi_def_cfa_offset 16
        movl    $.LC0, %edi
        call    _ZSt20__throw_length_errorPKc
        .cfi_endproc



Telling the compiler that v.size() - n doesn't wrap doesn't help at all:

#include <vector>

void shrink(std::vector<int>& v, unsigned n) {
    if (v.size() < n)
      __builtin_unreachable();
    v.resize(v.size() - n);
}

Why do we still have the call to __throw_length_error() there? It's
unreachable, because we're only shrinking, not growing.


This is better:

void shrink_pop(std::vector<int>& v, unsigned n) {
    while (n--)
      v.pop_back();
}

_Z10shrink_popRSt6vectorIiSaIiEEj:
.LFB866:
        .cfi_startproc
        testl   %esi, %esi
        je      .L10
        movq    8(%rdi), %rax
        movl    %esi, %esi
        negq    %rsi
        leaq    (%rax,%rsi,4), %rdx
        .p2align 4,,10
        .p2align 3
.L12:
        subq    $4, %rax
        movq    %rax, 8(%rdi)
        cmpq    %rax, %rdx
        jne     .L12
.L10:
        ret
        .cfi_endproc


And this:

void shrink_min(std::vector<int>& v, unsigned n) {
    v.resize(std::min<std::size_t>(v.size(), n));
}

_Z10shrink_minRSt6vectorIiSaIiEEj:
.LFB867:
        .cfi_startproc
        movq    8(%rdi), %rdx
        movq    (%rdi), %rcx
        movl    %esi, %esi
        movq    %rdx, %rax
        subq    %rcx, %rax
        sarq    $2, %rax
        cmpq    %rax, %rsi
        jb      .L54
.L52:
        ret
        .p2align 4,,10
        .p2align 3
.L54:
        leaq    (%rcx,%rsi,4), %rax
        cmpq    %rax, %rdx
        je      .L52
        movq    %rax, 8(%rdi)
        ret
        .cfi_endproc

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/104547] std::vector::resize(v.size() - n) produces poor code
  2022-02-15 11:28 [Bug tree-optimization/104547] New: std::vector::resize(v.size() - n) produces poor code redi at gcc dot gnu.org
@ 2022-02-15 11:32 ` redi at gcc dot gnu.org
  2022-02-15 11:34 ` redi at gcc dot gnu.org
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: redi at gcc dot gnu.org @ 2022-02-15 11:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104547

--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
If we have two versions of the function in the same TU:

#include <vector>

void shrink(std::vector<int>& v, unsigned n) {
    v.resize(v.size() - n);
}

void shrink_min(std::vector<int>& v, unsigned n) {
    v.resize(std::min<std::size_t>(v.size(), n));
}

Then the .s file contains an instantiation of vector::_M_default_append even
though neither of the functions calls it. Why isn't that eliminated as dead
code? We shouldn't need to keep an implicit instantiation of a non-inline
function template if it's not called.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/104547] std::vector::resize(v.size() - n) produces poor code
  2022-02-15 11:28 [Bug tree-optimization/104547] New: std::vector::resize(v.size() - n) produces poor code redi at gcc dot gnu.org
  2022-02-15 11:32 ` [Bug tree-optimization/104547] " redi at gcc dot gnu.org
@ 2022-02-15 11:34 ` redi at gcc dot gnu.org
  2022-02-15 11:39 ` redi at gcc dot gnu.org
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: redi at gcc dot gnu.org @ 2022-02-15 11:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104547

--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
If I change the type of n to match the type of v.size() the code size explodes:

#include <vector>

void shrink_assume(std::vector<int>& v, std::size_t n) {
    if (v.size() < n)
      __builtin_unreachable();
    v.resize(v.size() - n);
}


This adds 141 lines to the .s compared to the same function with unsigned n.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/104547] std::vector::resize(v.size() - n) produces poor code
  2022-02-15 11:28 [Bug tree-optimization/104547] New: std::vector::resize(v.size() - n) produces poor code redi at gcc dot gnu.org
  2022-02-15 11:32 ` [Bug tree-optimization/104547] " redi at gcc dot gnu.org
  2022-02-15 11:34 ` redi at gcc dot gnu.org
@ 2022-02-15 11:39 ` redi at gcc dot gnu.org
  2022-02-15 11:41 ` redi at gcc dot gnu.org
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: redi at gcc dot gnu.org @ 2022-02-15 11:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104547

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
For this TU with three functions, clang generates good code:


#include <vector>

void shrink_pop(std::vector<int>& v, std::size_t n) {
    while (n--)
      v.pop_back();
}

void shrink_assume(std::vector<int>& v, std::size_t n) {
    if (v.size() < n)
      __builtin_unreachable();
    v.resize(v.size() - n);
}

void shrink_min(std::vector<int>& v, std::size_t n) {
    v.resize(std::min<std::size_t>(v.size(), n));
}


Clang removes the unreachable __throw_length_error, doesn't explode if the type
is size_t, and doesn't keep the unused _M_default_append instantiation.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/104547] std::vector::resize(v.size() - n) produces poor code
  2022-02-15 11:28 [Bug tree-optimization/104547] New: std::vector::resize(v.size() - n) produces poor code redi at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2022-02-15 11:39 ` redi at gcc dot gnu.org
@ 2022-02-15 11:41 ` redi at gcc dot gnu.org
  2022-02-15 12:08 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: redi at gcc dot gnu.org @ 2022-02-15 11:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104547

--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Jonathan Wakely from comment #0)
> This is better:
> 
> void shrink_pop(std::vector<int>& v, unsigned n) {
>     while (n--)
>       v.pop_back();
> }
> 
> _Z10shrink_popRSt6vectorIiSaIiEEj:
> .LFB866:
>         .cfi_startproc
>         testl   %esi, %esi
>         je      .L10
>         movq    8(%rdi), %rax
>         movl    %esi, %esi
>         negq    %rsi
>         leaq    (%rax,%rsi,4), %rdx
>         .p2align 4,,10
>         .p2align 3
> .L12:
>         subq    $4, %rax
>         movq    %rax, 8(%rdi)
>         cmpq    %rax, %rdx
>         jne     .L12
> .L10:
>         ret
>         .cfi_endproc

This should be A LOT better, but see PR 96717 comment 6 and 7. It's still
better than the resize(v.size() - n) code even with that regression.

Clang's code for shrink_pop is:

_Z10shrink_popRSt6vectorIiSaIiEEm:      # @_Z10shrink_popRSt6vectorIiSaIiEEm
        .cfi_startproc
# %bb.0:
        testq   %rsi, %rsi
        je      .LBB0_2
# %bb.1:
        shlq    $2, %rsi
        subq    %rsi, 8(%rdi)
.LBB0_2:
        retq

I was expecting something like that for v.resize(v.size() - n) too.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/104547] std::vector::resize(v.size() - n) produces poor code
  2022-02-15 11:28 [Bug tree-optimization/104547] New: std::vector::resize(v.size() - n) produces poor code redi at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2022-02-15 11:41 ` redi at gcc dot gnu.org
@ 2022-02-15 12:08 ` jakub at gcc dot gnu.org
  2022-02-15 12:30 ` redi at gcc dot gnu.org
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-02-15 12:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104547

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-02-15
                 CC|                            |aldyh at gcc dot gnu.org,
                   |                            |amacleod at redhat dot com,
                   |                            |jakub at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
If you mean with -O3 on the
#include <vector>

void shrink(std::vector<int>& v, unsigned n) {
    if (v.size() < n)
      __builtin_unreachable();
    v.resize(v.size() - n);
}
testcase (with -O2 _M_default_append isn't nlined), then I think it is
something that ideally ranger's symbolic handling should figure out but
doesn't.
We have:
  long unsigned int _1;
...
  long unsigned int _11;
...
  if (_1 > _11)
    goto <bb 3>; [0.00%]
  else
    goto <bb 4>; [100.00%]

  <bb 3> [count: 0]:
  __builtin_unreachable ();

  <bb 4> [local count: 1073741824]:
  # RANGE ~[2305843009213693952, 16140901060200890368]
  _2 = _11 - _1;
  if (_2 > _11)
and we don't figure out that _2 > _11 is always false, because we earlier
assert that _1 <= _11 and so _11 - _1 will not wrap around and so the result _2
<= _11.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/104547] std::vector::resize(v.size() - n) produces poor code
  2022-02-15 11:28 [Bug tree-optimization/104547] New: std::vector::resize(v.size() - n) produces poor code redi at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2022-02-15 12:08 ` jakub at gcc dot gnu.org
@ 2022-02-15 12:30 ` redi at gcc dot gnu.org
  2022-02-15 15:27 ` amacleod at redhat dot com
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: redi at gcc dot gnu.org @ 2022-02-15 12:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104547

--- Comment #6 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #5)
> If you mean with -O3 

Yes, sorry for not saying so: all examples were compiled with -O3.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/104547] std::vector::resize(v.size() - n) produces poor code
  2022-02-15 11:28 [Bug tree-optimization/104547] New: std::vector::resize(v.size() - n) produces poor code redi at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2022-02-15 12:30 ` redi at gcc dot gnu.org
@ 2022-02-15 15:27 ` amacleod at redhat dot com
  2022-02-15 15:37 ` redi at gcc dot gnu.org
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: amacleod at redhat dot com @ 2022-02-15 15:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104547

--- Comment #7 from Andrew Macleod <amacleod at redhat dot com> ---
Created attachment 52447
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52447&action=edit
proposed patch

(In reply to Jakub Jelinek from comment #5)
> If you mean with -O3 on the
> #include <vector>
> 
> void shrink(std::vector<int>& v, unsigned n) {
>     if (v.size() < n)
>       __builtin_unreachable();
>     v.resize(v.size() - n);
> }
> testcase (with -O2 _M_default_append isn't nlined), then I think it is
> something that ideally ranger's symbolic handling should figure out but
> doesn't.
> We have:
>   long unsigned int _1;
> ...
>   long unsigned int _11;
> ...
>   if (_1 > _11)
>     goto <bb 3>; [0.00%]
>   else
>     goto <bb 4>; [100.00%]
>   
>   <bb 3> [count: 0]:
>   __builtin_unreachable ();
>   
>   <bb 4> [local count: 1073741824]:
>   # RANGE ~[2305843009213693952, 16140901060200890368]
>   _2 = _11 - _1;
>   if (_2 > _11)
> and we don't figure out that _2 > _11 is always false, because we earlier
> assert that _1 <= _11 and so _11 - _1 will not wrap around and so the result
> _2 <= _11.

Relational : (_1 <= _11)
    <bb 4> [local count: 1073741824]:
    _2 = _11 - _1;
    if (_2 > _11)
      goto <bb 5>; [33.00%]
    else
      goto <bb 6>; [67.00%]

Looks like we know _1 is <= _11, but what seems to be missing is the extra
relation that _2 > _11 as a result.

normally we'd handle this by providing operator_minus::lhs_op1_relation()
routine, but unfortunately, it does not currently take a relation between op1
and op2 like the corresponding fold does.. it should be trivial to add it
however. 

The attached patch does the basics...  adds the parameter and provides a very
simple operator_minus::lhs_op1_relation () which then achieves the goal of the
PR. It hasn't been bootstrapped or anything.

I didn't try to flush out anything else, or make any enhancements because I
don't trust myself at this late stage to get it right :-) Feel free to work
with it if you want.

ie, I don't even look at the ranges.. if op1 >= op2, we return LHS <= op1.  We
could also check if the range of op2 does not contain 0, then we could return
LSH < OP1.  There is probably some sign checking that could be done too. 

There is also  the routine fold uses (minus_op1_op2_relation_effect) which
could possibly be leveraged as well, but I don't want to delve into that

Note that this will affect all minus operations everywhere...  for instance,
this patch ends up enabling a much earlier threading pass to eliminate the
branch.. so it never makes it to VRP2.

I also didn't implement lhs_op2 relation.. again, I don't trust myself to get
it right.
So I can either package this up and submit it, or someone more trustworthy can
take it an run with it and add whatever tweaks you feel are safe.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/104547] std::vector::resize(v.size() - n) produces poor code
  2022-02-15 11:28 [Bug tree-optimization/104547] New: std::vector::resize(v.size() - n) produces poor code redi at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2022-02-15 15:27 ` amacleod at redhat dot com
@ 2022-02-15 15:37 ` redi at gcc dot gnu.org
  2022-05-13 14:21 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: redi at gcc dot gnu.org @ 2022-02-15 15:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104547

--- Comment #8 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Thanks for looking at it! There's no urgency here from my point of view, I just
wanted to get it into bugzilla so I could concentrate on other things. Parking
this until stage1 makes sense.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/104547] std::vector::resize(v.size() - n) produces poor code
  2022-02-15 11:28 [Bug tree-optimization/104547] New: std::vector::resize(v.size() - n) produces poor code redi at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2022-02-15 15:37 ` redi at gcc dot gnu.org
@ 2022-05-13 14:21 ` cvs-commit at gcc dot gnu.org
  2022-05-13 14:22 ` amacleod at redhat dot com
  2022-11-28 23:01 ` pinskia at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-13 14:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104547

--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>:

https://gcc.gnu.org/g:cf2141a0c640fc9b1c497db3f4d5b270f4b8252a

commit r13-438-gcf2141a0c640fc9b1c497db3f4d5b270f4b8252a
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Tue Feb 15 10:17:26 2022 -0500

    Add relation between op1 & op2 to lhs_opN_relation API.

    We use the relation between op1 and op2 to help fold a statement, but
    it was not provided to the lhs_op1_relation and lhs_op2_relation routines
    to determine if is also creates a relation between the LHS and either
operand.

            gcc/
            PR tree-optimization/104547
            * gimple-range-fold.cc (fold_using_range::range_of_range_op): Add
            the op1/op2 relation to the relation call.
            * range-op.cc (*::lhs_op1_relation): Add param.
            (*::lhs_op2_relation): Ditto.
            (operator_minus::lhs_op1_relation): New.
            (range_relational_tests): Add relation param.
            * range-op.h (lhs_op1_relation, lhs_op2_relation): Adjust
prototype.

            gcc/testsuite/
            * g++.dg/pr104547.C: New.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/104547] std::vector::resize(v.size() - n) produces poor code
  2022-02-15 11:28 [Bug tree-optimization/104547] New: std::vector::resize(v.size() - n) produces poor code redi at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2022-05-13 14:21 ` cvs-commit at gcc dot gnu.org
@ 2022-05-13 14:22 ` amacleod at redhat dot com
  2022-11-28 23:01 ` pinskia at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: amacleod at redhat dot com @ 2022-05-13 14:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104547

Andrew Macleod <amacleod at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #10 from Andrew Macleod <amacleod at redhat dot com> ---
fixed.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/104547] std::vector::resize(v.size() - n) produces poor code
  2022-02-15 11:28 [Bug tree-optimization/104547] New: std::vector::resize(v.size() - n) produces poor code redi at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2022-05-13 14:22 ` amacleod at redhat dot com
@ 2022-11-28 23:01 ` pinskia at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-11-28 23:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104547

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |13.0

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-11-28 23:01 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-15 11:28 [Bug tree-optimization/104547] New: std::vector::resize(v.size() - n) produces poor code redi at gcc dot gnu.org
2022-02-15 11:32 ` [Bug tree-optimization/104547] " redi at gcc dot gnu.org
2022-02-15 11:34 ` redi at gcc dot gnu.org
2022-02-15 11:39 ` redi at gcc dot gnu.org
2022-02-15 11:41 ` redi at gcc dot gnu.org
2022-02-15 12:08 ` jakub at gcc dot gnu.org
2022-02-15 12:30 ` redi at gcc dot gnu.org
2022-02-15 15:27 ` amacleod at redhat dot com
2022-02-15 15:37 ` redi at gcc dot gnu.org
2022-05-13 14:21 ` cvs-commit at gcc dot gnu.org
2022-05-13 14:22 ` amacleod at redhat dot com
2022-11-28 23:01 ` pinskia at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).