From: "hiraditya at msn dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/109441] New: missed optimization when all elements of vector are known
Date: Thu, 06 Apr 2023 18:31:37 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109441

            Bug ID: 109441
           Summary: missed optimization when all elements of vector are known
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hiraditya at msn dot com
  Target Milestone: ---

Reference: https://godbolt.org/z/af4x6zhz9

When all elements of a vector are 0, the compiler should
be able to remove the loop and just return 0.

Testcase:

#include <vector>
using namespace std;

using T = int;

T v() {
    T s;
    std::vector<T> v;
    v.resize(1000, 0);
    for (auto i = 0; i < v.size(); ++i) {
        s += v[i];
    }
    return s;
}

$ g++ -O3 -std=c++17

.LC0:
        .string "vector::_M_fill_insert"
v():
        push    rbx
        pxor    xmm0, xmm0
        mov     edx, 1000
        xor     esi, esi
        sub     rsp, 48
        lea     rcx, [rsp+12]
        lea     rdi, [rsp+16]
        mov     QWORD PTR [rsp+32], 0
        mov     DWORD PTR [rsp+12], 0
        movaps  XMMWORD PTR [rsp+16], xmm0
        call    std::vector<int, std::allocator<int> >::_M_fill_insert(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, unsigned long, int const&)
        mov     rdx, QWORD PTR [rsp+24]
        mov     rdi, QWORD PTR [rsp+16]
        mov     rax, rdx
        sub     rax, rdi
        mov     rsi, rax
        sar     rsi, 2
        cmp     rdx, rdi
        je      .L99
        test    rax, rax
        mov     ecx, 1
        cmovne  rcx, rsi
        cmp     rax, 12
        jbe     .L107
        mov     rdx, rcx
        pxor    xmm0, xmm0
        mov     rax, rdi
        shr     rdx, 2
        sal     rdx, 4
        add     rdx, rdi
.L101:
        movdqu  xmm2, XMMWORD PTR [rax]
        add     rax, 16
        paddd   xmm0, xmm2
        cmp     rdx, rax
        jne     .L101
        movdqa  xmm1, xmm0
        psrldq  xmm1, 8
        paddd   xmm0, xmm1
        movdqa  xmm1, xmm0
        psrldq  xmm1, 4
        paddd   xmm0, xmm1
        movd    ebx, xmm0
        test    cl, 3
        je      .L99
        and     rcx, -4
        mov     eax, ecx
.L100:
        lea     edx, [rax+1]
        add     ebx, DWORD PTR [rdi+rcx*4]
        movsx   rdx, edx
        cmp     rdx, rsi
        jnb     .L99
        add     eax, 2
        lea     rcx, [0+rdx*4]
        add     ebx, DWORD PTR [rdi+rdx*4]
        cdqe
        cmp     rax, rsi
        jnb     .L99
        add     ebx, DWORD PTR [rdi+4+rcx]
.L99:
        test    rdi, rdi
        je      .L98
        mov     rsi, QWORD PTR [rsp+32]
        sub     rsi, rdi
        call    operator delete(void*, unsigned long)
.L98:
        add     rsp, 48
        mov     eax, ebx
        pop     rbx
        ret
.L107:
        xor     eax, eax
        xor     ecx, ecx
        jmp     .L100
        mov     rbx, rax
        jmp     .L105
v() [clone .cold]: