public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c++/110070] New: Code quality regression with for (int i: {1,2,4,6})
@ 2023-06-01  7:26 roger at nextmovesoftware dot com
  2023-06-01  8:28 ` [Bug c++/110070] " redi at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: roger at nextmovesoftware dot com @ 2023-06-01  7:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110070

            Bug ID: 110070
           Summary: Code quality regression with for (int i: {1,2,4,6})
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: roger at nextmovesoftware dot com
  Target Milestone: ---

The fix for PR c++/70167 (in GCC 11.3) inadvertently introduced a code quality
regression for simple range-for using initializer lists.  The motivating
example is an idiom from the stockfish benchmark [update_continuation_histories
in src/search.cpp]:

#include <initializer_list>
extern void ext(int);
void foo()
{
  for (int i: {1,2,4,6})
    ext(i);
}

which currently generates inefficient code by copying the array (to the stack)
before use:
foo():
        pushq   %rbp
        pushq   %rbx
        subq    $24, %rsp
        movdqa  .LC0(%rip), %xmm0
        movq    %rsp, %rbx
        leaq    16(%rsp), %rbp
        movaps  %xmm0, (%rsp)
.L2:
        movl    (%rbx), %edi
        addq    $4, %rbx
        call    ext(int)
        cmpq    %rbp, %rbx
        jne     .L2
        addq    $24, %rsp
        popq    %rbx
        popq    %rbp
        ret
.LC0:
        .long   1
        .long   2
        .long   4
        .long   6

In GCC 11.2 and earlier, the initializing array is efficiently used without
copying:

foo():
        pushq   %rbx
        movl    $C.0.0, %ebx
.L2:
        movl    (%rbx), %edi
        addq    $4, %rbx
        call    ext(int)
        cmpq    $C.0.0+16, %rbx
        jne     .L2
        popq    %rbx
        ret
C.0.0:
        .long   1
        .long   2
        .long   4
        .long   6

The underlying cause of the code difference stems from whether the initializer
is marked "static" in the middle-end, as shown by the differences between:

  const int init[4] = {1,2,4,6};
  for (int i: init) ... // generates a copy

and 

  static const int init[4] = {1,2,4,6};
  for (int i: init) ... // doesn't generate a copy


Fortunately, there's already code in the depth of the C++ front=end for marking
such initializer lists/constructors as static, so I initially tried fixing this
myself, at first trying:

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 05df628..a91693d 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -3314,7 +3314,6 @@ finish_compound_literal (tree type, tree
compound_literal,
   /* FIXME all C99 compound literals should be variables rather than C++
      temporaries, unless they are used as an aggregate initializer.  */
   if ((!at_function_scope_p () || CP_TYPE_CONST_P (type))
-      && fcl_context == fcl_c99
       && TREE_CODE (type) == ARRAY_TYPE
       && !TYPE_HAS_NONTRIVIAL_DESTRUCTOR (type)
       && initializer_constant_valid_p (compound_literal, type))

and then trying:

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 2736f55..53220da 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -8557,7 +8557,10 @@ convert_like_internal (conversion *convs, tree expr,
tree
 fn, int argnum,
            elttype = cp_build_qualified_type
              (elttype, cp_type_quals (elttype) | TYPE_QUAL_CONST);
            array = build_array_of_n_type (elttype, len);
-           array = finish_compound_literal (array, new_ctor, complain);
+           /* Indicate that a non-lvalue static const array is acceptable
+              by specifying fcl_c99.  */
+           array = finish_compound_literal (array, new_ctor, complain,
+                                            fcl_c99);
            /* Take the address explicitly rather than via decay_conversion
               to avoid the error about taking the address of a temporary.  */
            array = cp_build_addr_expr (array, complain);


Both of which fix/improve code generation for this case, but break the
initializer tests in the g++.dg testsuite in interesting ways.  At this point I
thought I'd give up and leave the fix to the experts.  The range_expr passed to
cp_convert_range_for is:

 <constructor 0x7ffff6dc9348
    type <lang_type 0x7ffff6dadc78 init list VOID
        align:1 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7ffff6dadc78>
    constant length:4
    val <non_lvalue_expr 0x7ffff6dcd540
        type <integer_type 0x7ffff6c415e8 int public type_6 SI
            size <integer_cst 0x7ffff6c43228 constant 32>
            unit-size <integer_cst 0x7ffff6c43240 constant 4>
            align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7ffff6c415e8 precision:32 min <integer_cst 0x7ffff6c431e0 -2147483648> max
<integer_cst 0x7ffff6c431f8 2147483647>
            pointer_to_this <pointer_type 0x7ffff6c49b28>>
        constant public
        arg:0 <integer_cst 0x7ffff6c43390 constant 1>
        iter.cc:7:16 start: iter.cc:7:16 finish: iter.cc:7:16>
    val <non_lvalue_expr 0x7ffff6dcd560 type <integer_type 0x7ffff6c415e8 int>
        constant public
        arg:0 <integer_cst 0x7ffff6c43768 constant 2>
        iter.cc:7:18 start: iter.cc:7:18 finish: iter.cc:7:18>
    val <non_lvalue_expr 0x7ffff6dcd580 type <integer_type 0x7ffff6c415e8 int>
        constant public
        arg:0 <integer_cst 0x7ffff6c43780 constant 4>
        iter.cc:7:20 start: iter.cc:7:20 finish: iter.cc:7:20>
    val <non_lvalue_expr 0x7ffff6dcd5a0 type <integer_type 0x7ffff6c415e8 int>
        constant public
        arg:0 <integer_cst 0x7ffff6c437b0 constant 6>
        iter.cc:7:22 start: iter.cc:7:22 finish: iter.cc:7:22>>

which contains a lot of non_lvalue_expr, so it's surprising (to me) that we try
to turn this into an lvalue, when it should/could be read-only.

Thanks in advance.  My apologies if this is a duplicate/known issue.
Ideally, GCC should be able to unroll this loop, but that's a different issue.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-08-18 17:52 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-01  7:26 [Bug c++/110070] New: Code quality regression with for (int i: {1,2,4,6}) roger at nextmovesoftware dot com
2023-06-01  8:28 ` [Bug c++/110070] " redi at gcc dot gnu.org
2023-06-02 15:01 ` cvs-commit at gcc dot gnu.org
2023-06-02 15:34 ` jason at gcc dot gnu.org
2023-08-18 17:52 ` fxcoudert at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).