From: "linkw at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/99398] New: Miss to optimize vector permutation fed by CTOR and CTOR/CST
Date: Fri, 05 Mar 2021 06:53:01 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99398

            Bug ID: 99398
           Summary: Miss to optimize vector permutation fed by CTOR and
                    CTOR/CST
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

#include "altivec.h"

vector long long foo(long long a, long long b) {
  vector long long v1 = {a, 0};
  vector long long v2 = {b, 0};
  vector unsigned char vc = {0,1,2,3,4,5,6,7, 16,17,18,19,20,21,22,23};
  vector long long vres = (vector long long)vec_perm ((vector unsigned char)v1,
                                                     (vector unsigned char)v2, vc);
  return vres;
}

With gcc -Ofast -mcpu=power9, it generates (asm on BE, btw):

        mtvsrdd 32,3,9
        mtvsrdd 33,4,9
        lxv 34,0(10)
        vperm 2,0,1,2
        blr

But it can be optimized into:

        mtvsrdd 34,3,4
        blr

The GIMPLE in the optimized dump looks like:

__vector long foo (long long int a, long long int b)
{
  __vector long vres;
  __vector long v2;
  __vector long v1;
  __vector unsigned char _5;
  __vector unsigned char _6;
  __vector unsigned char _7;

  <bb 2> [local count: 1073741824]:
  v1_2 = {a_1(D), 0};
  v2_4 = {b_3(D), 0};
  _5 = VIEW_CONVERT_EXPR<__vector unsigned char>(v1_2);
  _6 = VIEW_CONVERT_EXPR<__vector unsigned char>(v2_4);
  _7 = VEC_PERM_EXPR <_5, _6, { 0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23 }>;
  vres_8 = VIEW_CONVERT_EXPR<__vector long>(_7);
  return vres_8;
}

But it can look like:

__vector long foo (long long int a, long long int b)
{
  vector(2) long long int _10;

  <bb 2> [local count: 1073741824]:
  _10 = {a_1(D), b_3(D)};
  return _10;
}
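
To illustrate why the fold is valid, here is a rough scalar model of the byte
permutation in portable C. It is only a sketch under stated assumptions, not
GCC code: the names v2di, perm_bytes, foo_model and foo_folded are
hypothetical, long long is assumed to be 8 bytes, and the selector indexes the
32 concatenated input bytes in memory order, as VEC_PERM_EXPR does. Selector
bytes 0-7 pick element 0 of v1 and bytes 16-23 pick element 0 of v2, so the
permute should give the same result as constructing {a, b} directly.

```c
#include <string.h>

/* Hypothetical stand-in for "vector long long": two 8-byte elements.
   Assumes sizeof(long long) == 8, so the struct is 16 bytes. */
typedef struct { long long e[2]; } v2di;

/* Scalar model of a 16-byte permute: result byte i is byte sel[i]
   of the 32-byte concatenation v1:v2, in memory order. */
static v2di perm_bytes(v2di v1, v2di v2, const unsigned char sel[16])
{
    unsigned char src[32], dst[16];
    v2di res;

    memcpy(src, &v1, 16);
    memcpy(src + 16, &v2, 16);
    for (int i = 0; i < 16; i++)
        dst[i] = src[sel[i] & 31];   /* only the low 5 bits select a byte */
    memcpy(&res, dst, 16);
    return res;
}

/* Mirrors foo() from the report, using the model above. */
static v2di foo_model(long long a, long long b)
{
    static const unsigned char vc[16] =
        {0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23};
    v2di v1 = {{a, 0}};
    v2di v2 = {{b, 0}};
    return perm_bytes(v1, v2, vc);
}

/* The form the report suggests the optimizer could produce: {a, b}. */
static v2di foo_folded(long long a, long long b)
{
    v2di r = {{a, b}};
    return r;
}
```

Since the model works on bytes in memory order, foo_model and foo_folded
should agree for any a and b regardless of host endianness.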