From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id 23F563858414; Fri, 26 Nov 2021 16:00:38 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 23F563858414
From: "ubizjak at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/102811] vcvtph2ps and vcvtps2ph should be used to
 convert _Float16 to SFmode with -mf16c
Date: Fri, 26 Nov 2021 16:00:38 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: ubizjak at gmail dot com
X-Bugzilla-Status: RESOLVED
X-Bugzilla-Resolution: FIXED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-102811-4-gMO5AbDRAv@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-102811-4@http.gcc.gnu.org/bugzilla/>
References: <bug-102811-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811
--- Comment #15 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Hongtao.liu from comment #14)
> (In reply to Uroš Bizjak from comment #13)
> > (In reply to Hongtao.liu from comment #12)
> > > >
> > > > Just noticed that for some reason two VPXORs are emitted. One should be
> > > > enough for both VPINSRW insns.
> > >
> > > With the new alternative in your attached patch (the vpblendw one), RA
> > > could reuse the zero register; w/o that, the upper bits of xmm0/xmm1 need
> > > to be explicitly cleared.
> > > vpblendw        $1, %xmm1, %xmm2, %xmm1 # 14    [c=4 l=6]  *vec_setv8hf_0/8
> >
> > True, but I'd expect some post-reload(?) pass to propagate zeros and remove
> > redundant initializations.
>
> On the other hand, if we do not use expand_vector_set (which treats the zero
> register as both input and output), but instead
> emit_insn(gen_sse4_1_pinsrph(...)) with a new pseudo register as dest, the
> redundant initialization could be optimized away by fwprop1.
>
>         pextrw  $0, %xmm1, %eax
>         pextrw  $0, %xmm0, %edx
>         vpxor   %xmm1, %xmm1, %xmm1
>         vpinsrw $0, %edx, %xmm1, %xmm0
>         vpinsrw $0, %eax, %xmm1, %xmm1
>         vcvtph2ps       %xmm1, %xmm1
>         vcvtph2ps       %xmm0, %xmm0
>         vaddss  %xmm1, %xmm0, %xmm0
>         vinsertps       $0xe, %xmm0, %xmm0, %xmm0
>         vcvtps2ph       $4, %xmm0, %xmm0

Then we would lose the optimization in expand_vector_set:

    case E_V8HFmode:
      if (TARGET_AVX2)
        {
          mmode = SImode;
          gen_blendm = gen_sse4_1_pblendph;
          blendm_const = true;
        }
      else
        use_vec_merge = true;
      break;

Maybe we should simply copy "target" to a new pseudo here:

do_vec_merge:
      tmp = gen_rtx_VEC_DUPLICATE (mode, val);
      tmp = gen_rtx_VEC_MERGE (mode, tmp, target,
                               GEN_INT (HOST_WIDE_INT_1U << elt));
      emit_insn (gen_rtx_SET (target, tmp));
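
For readers less familiar with the RTL, here is a rough plain-C model of what
the vec_merge of a vec_duplicate above computes for V8HF. It is only a sketch:
model_vec_merge and model_vector_set are hypothetical names for illustration,
not GCC functions, and the lanes are modelled as raw 16-bit values.

```c
#include <assert.h>
#include <stdint.h>

#define NELT 8 /* V8HF: eight 16-bit lanes */

/* Hypothetical helper, not GCC API.  Models
   (vec_merge (vec_duplicate VAL) TARGET MASK):
   lane i is taken from VAL when bit i of MASK is set,
   otherwise from TARGET.  */
void
model_vec_merge (uint16_t dest[NELT], uint16_t val,
                 const uint16_t target[NELT], unsigned mask)
{
  for (int i = 0; i < NELT; i++)
    dest[i] = ((mask >> i) & 1) ? val : target[i];
}

/* In-place form, mirroring do_vec_merge, where "target" is both
   the merge source and the destination of the SET.  */
void
model_vector_set (uint16_t target[NELT], uint16_t val, unsigned elt)
{
  assert (elt < NELT);
  model_vec_merge (target, val, target, 1u << elt);
}
```

Calling model_vector_set on an all-zero vector with elt 0 corresponds to the
"vpxor + vpinsrw $0" pairs in the quoted assembly: because the destination is
also the merge source, each insertion needs its own zeroed register. Passing a
separate dest to model_vec_merge corresponds to using a new pseudo as the
destination, which lets both insertions read one shared zero vector.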

OTOH, if recycling "target" inhibits fwprop, we should perhaps copy "target" to
a new pseudo at the beginning of expand_vector_set?