From: "pinskia at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/113678] New: SLP misses up vec_concat
Date: Wed, 31 Jan 2024 02:33:15 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113678

            Bug ID: 113678
           Summary: SLP misses up vec_concat
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64

Take:
```
void f(char *a, char *b)
{
  int b0 = b[0];
  int b1 = b[1];
  int b2 = b[2];
  int b3 = b[3];
  int b4 = 0;
  int b5 = 0;
  int b6 = 0;
  int b7 = 0;
  a[0] = b0;
  a[1] = b1;
  a[2] = b2;
  a[3] = b3;
#if 0
  asm("":::"memory");
#endif
  a[4] = b0;
  a[5] = b1;
  a[6] = b2;
  a[7] = b3;
}
```

On x86_64 we get messy code because SLP decides to build the stored vector
element by element from four scalar loads:
```
  _1 = *b_6(D);
  _2 = MEM[(char *)b_6(D) + 1B];
  _3 = MEM[(char *)b_6(D) + 2B];
  _4 = MEM[(char *)b_6(D) + 3B];
  _16 = {_1, _2, _3, _4, _1, _2, _3, _4};
```

But this could be done as one vector load and 2 stores (changing the `#if 0`
to `#if 1` gives the better code):
```
  vect__1.5_18 = MEM [(char *)b_6(D)];
  MEM [(char *)a_7(D)] = vect__1.5_18;
  MEM [(char *)a_7(D) + 4B] = vect__1.5_18;
```

Or we could even get a single store, as LLVM does:
```
        movd    xmm0, dword ptr [rsi]           # xmm0 = mem[0],zero,zero,zero
        pshufd  xmm0, xmm0, 0                   # xmm0 = xmm0[0,0,0,0]
        movq    qword ptr [rdi], xmm0
        ret
```
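
For illustration only, here is a source-level sketch of what the two-store form computes. The name `f_two_stores` is hypothetical, and the 4-byte `memcpy` calls are just a stand-in for the vector load and stores in the GIMPLE above; this is not what the vectorizer emits.
```
#include <stdint.h>
#include <string.h>

/* Hypothetical hand-written equivalent of f(): one 4-byte load from b,
   then the same 4 bytes stored at a and at a + 4.  The load completes
   before either store, matching f(), where b0..b3 are all read before
   the first write to a.  */
void f_two_stores(char *a, char *b)
{
  uint32_t v;
  memcpy(&v, b, sizeof v);      /* single 4-byte load */
  memcpy(a, &v, sizeof v);      /* store to a[0..3] */
  memcpy(a + 4, &v, sizeof v);  /* store to a[4..7] */
}
```
Since every byte of `b` is read before `a` is written, this should match `f` even when `a` and `b` overlap.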