From: "rguenther at suse dot de"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/102494] Failure to optimize vector reduction properly especially when using OpenMP
Date: Tue, 28 Sep 2021 07:09:44 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102494

--- Comment #8 from rguenther at suse dot de ---
On Tue, 28 Sep 2021, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102494
>
> --- Comment #7 from Hongtao.liu ---
> After supporting v4hi reduce, gimple seems not optimal to convert v8qi to v8hi.
>
>   vector(4) short int vect__21.36;
>   vector(4) unsigned short vect__2.31;
>   int16_t stmp_r_17.17;
>   vector(8) short int vect__16.15;
>   int16_t D.2229[8];
>   vector(8) short int _50;
>   vector(8) short int _51;
>   vector(8) short int _52;
>   vector(8) short int _53;
>   vector(8) short int _54;
>   vector(8) short int _55;
>
>   [local count: 189214783]:
>   vect__2.31_97 = [vec_unpack_lo_expr] a_90(D);
>   vect__2.31_98 = [vec_unpack_hi_expr] a_90(D);
>   vect__21.36_105 = VIEW_CONVERT_EXPR<vector(4) short int>(vect__2.31_97);
>   vect__21.36_106 = VIEW_CONVERT_EXPR<vector(4) short int>(vect__2.31_98);
>   MEM <vector(4) short int> [(short int *)&D.2229] = vect__21.36_105;
>   MEM <vector(4) short int> [(short int *)&D.2229 + 8B] = vect__21.36_106;

So the above could possibly use a V8QI -> V8HI conversion; the loop
vectorizer isn't good at producing those, though. And of course the
appropriate conversion optab has to exist.

>   vect__16.15_47 = MEM <vector(8) short int> [(short int *)&D.2229];

Here there's a lack of "CSE". I do have patches somewhere to turn this into

  vect__16.15_47 = { vect__21.36_105, vect__21.36_106 };

but I'm not sure that's going to be profitable (well, the code as-is
will get a store-to-load forwarding (STLF) hit). There's also store-merging,
which could instead merge the two stores similarly (but then there's no CSE
after store-merging, so the load would remain).
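
For reference, the two shapes discussed above can be sketched in C with GCC's generic vector extensions. This is only an illustration, not the testcase from the bug: the type names, the function names, and the choice of unsigned char elements for `a` (inferred from the `unsigned short` unpack result in the dump) are all assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type names; sizes match V8QI, V4HI and V8HI. */
typedef uint8_t v8qi __attribute__((vector_size(8)));
typedef int16_t v4hi __attribute__((vector_size(8)));
typedef int16_t v8hi __attribute__((vector_size(16)));

/* Mirrors the dump: widen the low and high halves separately, store
   both halves into an int16_t[8] temporary (D.2229 in the dump), then
   reload the temporary as one 8-element vector. */
v8hi widen_via_memory(v8qi a)
{
    int16_t tmp[8];
    v4hi lo = {0}, hi = {0};
    for (int i = 0; i < 4; i++) {
        lo[i] = (int16_t)a[i];      /* vec_unpack_lo + view-convert */
        hi[i] = (int16_t)a[i + 4];  /* vec_unpack_hi + view-convert */
    }
    memcpy(tmp, &lo, sizeof lo);      /* first 8-byte store */
    memcpy(tmp + 4, &hi, sizeof hi);  /* second 8-byte store, at + 8B */
    v8hi r;
    memcpy(&r, tmp, sizeof r);        /* 16-byte load of both halves */
    return r;
}

/* The single V8QI -> V8HI conversion suggested above, written with
   __builtin_convertvector (element-wise conversion, zero-extending
   here because the source elements are unsigned). */
v8hi widen_direct(v8qi a)
{
    return __builtin_convertvector(a, v8hi);
}

/* Both paths must produce the same eight values. */
void check_widen(v8qi a)
{
    v8hi m = widen_via_memory(a);
    v8hi d = widen_direct(a);
    assert(memcmp(&m, &d, sizeof m) == 0);
}
```

Comparing the assembly for the two functions at `-O2` is one way to see whether the round trip through the temporary survives on a given target and compiler version.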