From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 424D63858C30; Thu, 1 Jun 2023 07:26:11 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 424D63858C30 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1685604371; bh=akkYRlTT0TrKSkvFhXBAWsTanYz5ihlH8c1cD9PReuc=; h=From:To:Subject:Date:From; b=UBn097YzeBeor7NwJ1O3ec/Cg1IaPyhBENHKWAdlK8MaKenzqpHzDCpcDJrpIjY3Y kCQj3oabVOVHedUU2TaT4RIy220oEOrx5AiWmOVDViONQo8e+AsFCF73gaQQPYa1mi eGMrRocitCKo15rfUlz79hCODA3k0lEo2S0EUiCw= From: "roger at nextmovesoftware dot com" To: gcc-bugs@gcc.gnu.org Subject: [Bug c++/110070] New: Code quality regression with for (int i: {1,2,4,6}) Date: Thu, 01 Jun 2023 07:26:09 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: new X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: c++ X-Bugzilla-Version: unknown X-Bugzilla-Keywords: X-Bugzilla-Severity: normal X-Bugzilla-Who: roger at nextmovesoftware dot com X-Bugzilla-Status: UNCONFIRMED X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone Message-ID: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 List-Id: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D110070 Bug ID: 110070 Summary: Code quality regression with for (int i: {1,2,4,6}) Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: roger at nextmovesoftware dot com Target Milestone: --- The fix for PR c++/70167 (in GCC 11.3) inadvertently introduced a code qual= ity regression for simple range-for using initializer lists. The motivating example is an idiom from the stockfish benchmark [update_continuation_histo= ries in src/search.cpp]: #include extern void ext(int); void foo() { for (int i: {1,2,4,6}) ext(i); } which currently generates inefficient code by copying the array (to the sta= ck) before use: foo(): pushq %rbp pushq %rbx subq $24, %rsp movdqa .LC0(%rip), %xmm0 movq %rsp, %rbx leaq 16(%rsp), %rbp movaps %xmm0, (%rsp) .L2: movl (%rbx), %edi addq $4, %rbx call ext(int) cmpq %rbp, %rbx jne .L2 addq $24, %rsp popq %rbx popq %rbp ret .LC0: .long 1 .long 2 .long 4 .long 6 In GCC 11.2 and earlier, the initializing array is efficiently used without copying: foo(): pushq %rbx movl $C.0.0, %ebx .L2: movl (%rbx), %edi addq $4, %rbx call ext(int) cmpq $C.0.0+16, %rbx jne .L2 popq %rbx ret C.0.0: .long 1 .long 2 .long 4 .long 6 The underlying cause of the code difference stems from whether the initiali= zer is marked "static" in the middle-end, as shown by the differences between: const int init[4] =3D {1,2,4,6}; for (int i: init) ... // generates a copy and=20 static const int init[4] =3D {1,2,4,6}; for (int i: init) ... // doesn't generate a copy Fortunately, there's already code in the depth of the C++ front=3Dend for m= arking such initializer lists/constructors as static, so I initially tried fixing = this myself, at first trying: diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index 05df628..a91693d 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -3314,7 +3314,6 @@ finish_compound_literal (tree type, tree compound_literal, /* FIXME all C99 compound literals should be variables rather than C++ temporaries, unless they are used as an aggregate initializer. */ if ((!at_function_scope_p () || CP_TYPE_CONST_P (type)) - && fcl_context =3D=3D fcl_c99 && TREE_CODE (type) =3D=3D ARRAY_TYPE && !TYPE_HAS_NONTRIVIAL_DESTRUCTOR (type) && initializer_constant_valid_p (compound_literal, type)) and then trying: diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc index 2736f55..53220da 100644 --- a/gcc/cp/call.cc +++ b/gcc/cp/call.cc @@ -8557,7 +8557,10 @@ convert_like_internal (conversion *convs, tree expr, tree fn, int argnum, elttype =3D cp_build_qualified_type (elttype, cp_type_quals (elttype) | TYPE_QUAL_CONST); array =3D build_array_of_n_type (elttype, len); - array =3D finish_compound_literal (array, new_ctor, complain); + /* Indicate that a non-lvalue static const array is acceptable + by specifying fcl_c99. */ + array =3D finish_compound_literal (array, new_ctor, complain, + fcl_c99); /* Take the address explicitly rather than via decay_conversion to avoid the error about taking the address of a temporary. = */ array =3D cp_build_addr_expr (array, complain); Both of which fix/improve code generation for this case, but break the initializer tests in the g++.dg testsuite in interesting ways. At this poi= nt I thought I'd give up and leave the fix to the experts. The range_expr passe= d to cp_convert_range_for is: constant length:4 val unit-size align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-ty= pe 0x7ffff6c415e8 precision:32 min max pointer_to_this > constant public arg:0 iter.cc:7:16 start: iter.cc:7:16 finish: iter.cc:7:16> val constant public arg:0 iter.cc:7:18 start: iter.cc:7:18 finish: iter.cc:7:18> val constant public arg:0 iter.cc:7:20 start: iter.cc:7:20 finish: iter.cc:7:20> val constant public arg:0 iter.cc:7:22 start: iter.cc:7:22 finish: iter.cc:7:22>> which contains a lot of non_lvalue_expr, so it's surprising (to me) that we= try to turn this into an lvalue, when it should/could be read-only. Thanks in advance. My apologies if this is a duplicate/known issue. Ideally, GCC should be able to unroll this loop, but that's a different iss= ue.=