From: "pinskia at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/110473] New: vec_convert for aarch64 seems to lower to something which should be improved
Date: Thu, 29 Jun 2023 04:15:40 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110473

            Bug ID: 110473
           Summary: vec_convert for aarch64 seems to lower to something
                    which should be improved
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64

Take:
```
typedef unsigned int v4si __attribute__ ((vector_size (4*sizeof(int))));
typedef unsigned short v4hi __attribute__ ((vector_size (4*sizeof(short))));

v4si f(v4si a, v4si b)
{
  v4hi t = __builtin_convertvector (a, v4hi);
  v4si t1 = __builtin_convertvector (t, v4si);
  return t1;
}
```

This gets lowered in veclower21 to:
```
  _6 = BIT_FIELD_REF <_5, 64, 0>;
  t_2 = _6;
  _7 = BIT_FIELD_REF <t_2, 16, 0>;
  _8 = (unsigned int) _7;
  _9 = BIT_FIELD_REF <t_2, 16, 16>;
  _10 = (unsigned int) _9;
  _11 = BIT_FIELD_REF <t_2, 16, 32>;
  _12 = (unsigned int) _11;
  _13 = BIT_FIELD_REF <t_2, 16, 48>;
  _14 = (unsigned int) _13;
  t1_3 = {_8, _10, _12, _14};
```

And then forwprop optimizes this to:
```
  _6 = BIT_FIELD_REF <_5, 64, 0>;
  t1_3 = (v4si) _6;
```

And then combine comes along and optimizes that to:

(insn 9 8 11 2 (set (reg:V8HI 98)
        (vec_concat:V8HI (truncate:V4HI (reg:V4SI 102))
            (const_vector:V4HI [
                    (const_int 0 [0]) repeated x4
                ]))) "/app/example.cpp":7:8 5467 {truncv4siv4hi2_vec_concatz_le}
     (expr_list:REG_DEAD (reg:V4SI 102)
        (nil)))
(note 11 9 16 2 NOTE_INSN_DELETED)
(insn 16 11 17 2 (set (reg/i:V4SI 32 v0)
        (sign_extend:V4SI (subreg:V4HI (reg:V8HI 98) 0))) "/app/example.cpp":10:1 5459 {extendv4hiv4si2}
     (expr_list:REG_DEAD (reg:V8HI 98)
        (nil)))

But the first one is basically just (truncate:V4HI (reg:V4SI 102)): because of the way the instruction works, the top part is also zeros.

So we get in the end:

        xtn     v0.4h, v0.4s
        sxtl    v0.4s, v0.4h

Why couldn't veclower just do:
```
  _6 = (v4hi)a_1(D);
  t1_3 = (v4si) _6;
```
in the first place, instead of depending on later optimizations (or at least handle the second part)?

Note that with the above lowering we might hit an issue in match.pd, where it tries to turn that into t1_3 = {0xFFFF,0xFFFF,0xFFFF,0xFFFF} & _6; (both because of TYPE_PRECISION and because it just uses wide_int_to_tree ...
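
For reference, the masked form mentioned in the match.pd note can be checked against the original round trip on the source level. The following is only a minimal sketch, assuming GCC 9 or later (or Clang) for `__builtin_convertvector` and the unsigned element types from the test case; the helper names `roundtrip`, `mask_low16`, `roundtrip_equals_mask` and `check_examples` are made up for illustration and do not come from GCC.

```c
#include <assert.h>
#include <string.h>

typedef unsigned int v4si __attribute__ ((vector_size (4 * sizeof (int))));
typedef unsigned short v4hi __attribute__ ((vector_size (4 * sizeof (short))));

/* The round trip from the report: truncate each lane to 16 bits,
   then widen it back to 32 bits.  */
static v4si roundtrip (v4si a)
{
  v4hi t = __builtin_convertvector (a, v4hi);
  return __builtin_convertvector (t, v4si);
}

/* Hypothetical hand-written equivalent for unsigned lanes:
   keep only the low 16 bits of each lane.  */
static v4si mask_low16 (v4si a)
{
  const v4si mask = { 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF };
  return a & mask;
}

/* Returns 1 when both forms agree on all four lanes, 0 otherwise.  */
int roundtrip_equals_mask (unsigned a0, unsigned a1, unsigned a2, unsigned a3)
{
  v4si a = { a0, a1, a2, a3 };
  v4si x = roundtrip (a);
  v4si y = mask_low16 (a);
  return memcmp (&x, &y, sizeof x) == 0;
}

/* Small self-check over a few lane values, including ones with
   bits set above bit 15.  */
void check_examples (void)
{
  assert (roundtrip_equals_mask (0x12345678u, 0xFFFFFFFFu, 0x0000FFFFu, 0x80008000u));
  assert (roundtrip_equals_mask (0u, 1u, 0x10000u, 0xFFFF0000u));
}
```

The equivalence holds here only because both vector types are unsigned, so the widening step zero-extends; with signed elements the widening would sign-extend and a plain AND would not match.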