From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 4B7823858D20; Tue, 3 Oct 2023 18:27:17 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 4B7823858D20
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1696357637;
	bh=MJzTrt9DMSLvdH2sC7oOkq0LumYn+zSKZZS9jHchZL8=;
	h=From:To:Subject:Date:From;
	b=L2+B7yGqLx4tTqCQlhCnsakg/F3WZEAOi8d07F9gfSln1c3G7x0jFn4EbJT46Lx1Y
	 x6Uf1tuvTQlSNwWBKZOI3RDHazEHtQY0Pc7TZC8DDmlcKRYW0oO4lbEwzaCsI0TQ++
	 RCKOwOGxLyz6nPbDdaGIGKN1s7/1yj56lWsxdirs=
From: "deodharvinit99 at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/111683] New: Incorrect answer when using SSE2 intrinsics with -O3
Date: Tue, 03 Oct 2023 18:27:16 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c++
X-Bugzilla-Version: 10.2.1
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: deodharvinit99 at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone attachments.created
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111683

            Bug ID: 111683
           Summary: Incorrect answer when using SSE2 intrinsics with -O3
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: deodharvinit99 at gmail dot com
  Target Milestone: ---

Created attachment 56042
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56042&action=edit
convn_script

g++
produces incorrect answers when SSE2 intrinsics are used with -O3.
With -O2, the output matches that of equivalent code written without SSE2.

A standalone repro shell script:

g++ -O2 convn_script.cpp
./a.out
g++ -O3 convn_script.cpp
./a.out

Target: x86_64-linux-gnu
gcc version 10.2.1 20210110 (Debian 10.2.1-6)

convn_script.cpp:

#include <emmintrin.h>
#include <cstring>
#include <iostream>

// Function Definitions
//
// Check properties of input in9
//
// Arguments    : const double in9[10]
//                const double in10[7]
//                double out5[16]
// Return Type  : void
//
void convn_script(const double in9[10], const double in10[7], double out5[16])
{
  int iB;
  int iC;
  // Check properties of input in10
  std::memset(&out5[0], 0, 16U * sizeof(double));
  iC = 0;
  iB = 0;
  for (int i{0}; i < 7; i++) {
    int b_i;
    int vectorUB;
    if (i + 10 <= 15) {
      b_i = 9;
    } else {
      b_i = 15 - i;
    }
    vectorUB = (((b_i + 1) / 2) << 1) - 2;
    for (int r{0}; r <= vectorUB; r += 2) {
      __m128d b_r;
      b_i = iC + r;
      b_r = _mm_loadu_pd(&out5[b_i]);
      _mm_storeu_pd(&out5[b_i],
                    _mm_add_pd(b_r, _mm_mul_pd(_mm_set1_pd(in10[iB]),
                                               _mm_loadu_pd(&in9[r]))));
    }
    iC = iB + 1;
    iB++;
  }
}

int main()
{
  double in9[10] = {0.8147, 0.9058, 0.1270, 0.9134, 0.6324,
                    0.0975, 0.2785, 0.5469, 0.9575, 0.9649};
  double in10[7] = {0.1576, 0.9706, 0.9572, 0.4854, 0.8003, 0.1419, 0.4218};
  double out5[16];
  convn_script(in9, in10, out5);
  for (int i = 0; i < 16; i++) {
    std::cout << "Out[" << i << "] = " << out5[i] << "\n";
  }
}
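
The report compares against an equivalent version written without SSE2 but
does not include it. A minimal sketch of what such a scalar reference could
look like is below. The names convn_scalar and outputs_match are hypothetical
and are not taken from the attachment; the sketch assumes the SSE2 loop is
meant to accumulate out5[i + r] += in10[i] * in9[r] for i in 0..6 and r in
0..9, which is what the repro computes two lanes at a time.

```cpp
#include <cmath>
#include <cstring>

// Hypothetical scalar reference (name and signature are assumptions, not
// from the report). It accumulates out5[i + r] += in10[i] * in9[r], the same
// sums the SSE2 loop in convn_script builds two elements per iteration.
void convn_scalar(const double in9[10], const double in10[7], double out5[16])
{
  std::memset(&out5[0], 0, 16U * sizeof(double));
  for (int i = 0; i < 7; i++) {
    for (int r = 0; r < 10; r++) {
      out5[i + r] += in10[i] * in9[r];
    }
  }
}

// Hypothetical helper: returns true when the first n elements of a and b
// agree to within tol.
bool outputs_match(const double *a, const double *b, int n, double tol)
{
  for (int k = 0; k < n; k++) {
    if (std::fabs(a[k] - b[k]) > tol) {
      return false;
    }
  }
  return true;
}
```

Running convn_script and convn_scalar on the same inputs and passing both
output arrays to outputs_match, once in a -O2 build and once in a -O3 build,
would be one way to see at which optimization level the two results diverge.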