public inbox for gcc-bugs@sourceware.org
* [Bug target/102812] New: Unoptimal (and wrong) code for _Float16 insert
@ 2021-10-18 11:51 ubizjak at gmail dot com
  2021-10-19  1:44 ` [Bug target/102812] " crazylht at gmail dot com
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2021-10-18 11:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102812

            Bug ID: 102812
           Summary: Unoptimal (and wrong) code for _Float16 insert
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---

Following code:

--cut here--
typedef _Float16 v8hf __attribute__((__vector_size__ (16)));

v8hf t (_Float16 a)
{
  return (v8hf){a, 0, 0, 0, 0, 0, 0, 0};
}
--cut here--

compiles with -msse4 to:

        pxor    %xmm15, %xmm15
        movaps  %xmm15, -56(%rsp)
        pextrw  $0, %xmm0, -56(%rsp)
        vmovdqa64       -56(%rsp), %xmm0

PBLENDW with a cleared %xmm15 would be much better, and wouldn't use
memory.

Also, VMOVDQA64 is an AVX512F/AVX512VL instruction, not an SSE4 (or even AVX) instruction.
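
For readers unfamiliar with the instruction, the suggested sequence can be
sketched as a scalar model in plain C. This is only an illustration of the
PBLENDW word-select semantics, not compiler output; the type vec8x16 and the
function names are hypothetical, and the _Float16 value is represented by its
raw 16-bit pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8 x 16-bit vector standing in for an XMM register. */
typedef struct { uint16_t w[8]; } vec8x16;

/* Scalar model of PBLENDW: word i of the result comes from src when
   bit i of imm is set, and from dst otherwise. */
static vec8x16 blend_words(vec8x16 dst, vec8x16 src, unsigned imm)
{
    assert(imm <= 0xffu);            /* the immediate is 8 bits wide */
    vec8x16 r;
    for (int i = 0; i < 8; i++)
        r.w[i] = ((imm >> i) & 1u) ? src.w[i] : dst.w[i];
    return r;
}

/* Models the suggested sequence: clear a register, then blend in
   word 0 of the register that holds the _Float16 argument. */
static vec8x16 insert_low_word_into_zero(vec8x16 arg)
{
    vec8x16 zero = {{0}};
    return blend_words(zero, arg, 0x01);
}
```

With an input whose word 0 is 0x3c00 (the bit pattern of 1.0 in half
precision), the model keeps that word and zeroes the other seven, without
touching memory.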

* [Bug target/102812] Unoptimal (and wrong) code for _Float16 insert
  2021-10-18 11:51 [Bug target/102812] New: Unoptimal (and wrong) code for _Float16 insert ubizjak at gmail dot com
@ 2021-10-19  1:44 ` crazylht at gmail dot com
  2021-10-20  8:17 ` ubizjak at gmail dot com
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: crazylht at gmail dot com @ 2021-10-19  1:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102812

--- Comment #1 from Hongtao.liu <crazylht at gmail dot com> ---
ix86_get_ssemov needs to be updated for V8HF/V16HF, since they could exist
under TARGET_SSE2/TARGET_AVX.

* [Bug target/102812] Unoptimal (and wrong) code for _Float16 insert
  2021-10-18 11:51 [Bug target/102812] New: Unoptimal (and wrong) code for _Float16 insert ubizjak at gmail dot com
  2021-10-19  1:44 ` [Bug target/102812] " crazylht at gmail dot com
@ 2021-10-20  8:17 ` ubizjak at gmail dot com
  2021-10-20  9:08 ` wwwhhhyyy333 at gmail dot com
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2021-10-20  8:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102812

--- Comment #2 from Uroš Bizjak <ubizjak at gmail dot com> ---
Please note that the code above should compile via ix86_expand_vector_set,
similar to:

--cut here--
typedef short v8hi __attribute__((__vector_size__(16)));

v8hi foo (short a)
{
  return (v8hi) {a, 0, 0, 0, 0, 0, 0, 0 };
}
--cut here--

that results in:

        vpxor   %xmm0, %xmm0, %xmm0
        vpinsrw $0, %edi, %xmm0, %xmm0
        ret
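
The two instructions above can be read as "zero the vector, then insert one
16-bit word at lane 0". A scalar sketch of that behaviour in plain C follows;
the names vec8x16, insert_word and foo_model are hypothetical and only model
the instruction semantics.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8 x 16-bit vector standing in for an XMM register. */
typedef struct { uint16_t w[8]; } vec8x16;

/* Scalar model of (V)PINSRW: copy v and replace one word with the
   low 16 bits of a general-purpose register value. */
static vec8x16 insert_word(vec8x16 v, uint32_t gpr, unsigned lane)
{
    assert(lane < 8);
    v.w[lane] = (uint16_t)gpr;
    return v;
}

/* Models foo(): vpxor clears the vector, vpinsrw $0 inserts the argument. */
static vec8x16 foo_model(short a)
{
    vec8x16 zero = {{0}};
    return insert_word(zero, (uint32_t)(int32_t)a, 0);
}
```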

* [Bug target/102812] Unoptimal (and wrong) code for _Float16 insert
  2021-10-18 11:51 [Bug target/102812] New: Unoptimal (and wrong) code for _Float16 insert ubizjak at gmail dot com
  2021-10-19  1:44 ` [Bug target/102812] " crazylht at gmail dot com
  2021-10-20  8:17 ` ubizjak at gmail dot com
@ 2021-10-20  9:08 ` wwwhhhyyy333 at gmail dot com
  2021-10-21  1:15 ` crazylht at gmail dot com
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: wwwhhhyyy333 at gmail dot com @ 2021-10-20  9:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102812

--- Comment #3 from Hongyu Wang <wwwhhhyyy333 at gmail dot com> ---
(In reply to Uroš Bizjak from comment #2)
> Please note that the code above should compile via ix86_expand_vector_set,
> similar to:
> 
> --cut here--
> typedef short v8hi __attribute__((__vector_size__(16)));
> 
> v8hi foo (short a)
> {
>   return (v8hi) {a, 0, 0, 0, 0, 0, 0, 0 };
> }
> --cut here--
> 
> that results in:
> 
>         vpxor   %xmm0, %xmm0, %xmm0
>         vpinsrw $0, %edi, %xmm0, %xmm0
>         ret

Currently we have

if (TARGET_AVX512FP16 && VALID_AVX512FP16_REG_MODE (mode))
  return true;

in ix86_vector_mode_supported_p, so for an SSE2 target a V8HFmode value would be
returned in BLKmode.

After I add V8HFmode to VALID_SSE2_REG_MODE, the generated code looks like

vmovss  %xmm0, %xmm0, %xmm1        
vpxor   %xmm0, %xmm0, %xmm0        
pextrw  $0, %xmm1, -10(%rsp)       
vpinsrw $0, -10(%rsp), %xmm0, %xmm0

It seems IRA spills the HF register to memory.

I wonder whether we should move vector mode support to SSE2 for now, as we
don't have sufficient HF vector arithmetic emulation for non-AVX512FP16 targets.
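
To make the effect of that check concrete, here is a toy model in plain C. It
is not GCC source: the ISA_* flags, the toy_mode enum and both functions are
hypothetical stand-ins, and only the AVX512FP16 test mirrors the snippet
quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's target flags and machine modes. */
enum { ISA_SSE2 = 1u << 0, ISA_AVX512FP16 = 1u << 1 };
typedef enum { MODE_V8HI, MODE_V8HF, MODE_COUNT } toy_mode;

/* Toy check in which V8HF is accepted only through the AVX512FP16 test. */
static bool supported_fp16_only(unsigned isa, toy_mode mode)
{
    assert(mode < MODE_COUNT);
    if ((isa & ISA_SSE2) && mode == MODE_V8HI)
        return true;
    if ((isa & ISA_AVX512FP16) && mode == MODE_V8HF)
        return true;
    return false;
}

/* Same check after V8HF is also listed among the SSE2 modes. */
static bool supported_with_sse2_v8hf(unsigned isa, toy_mode mode)
{
    assert(mode < MODE_COUNT);
    if ((isa & ISA_SSE2) && (mode == MODE_V8HI || mode == MODE_V8HF))
        return true;
    if ((isa & ISA_AVX512FP16) && mode == MODE_V8HF)
        return true;
    return false;
}
```

In this model an SSE2-only target rejects V8HF in the first function and
accepts it in the second, which corresponds to the change tried above.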

* [Bug target/102812] Unoptimal (and wrong) code for _Float16 insert
  2021-10-18 11:51 [Bug target/102812] New: Unoptimal (and wrong) code for _Float16 insert ubizjak at gmail dot com
                   ` (2 preceding siblings ...)
  2021-10-20  9:08 ` wwwhhhyyy333 at gmail dot com
@ 2021-10-21  1:15 ` crazylht at gmail dot com
  2021-10-21  8:59 ` cvs-commit at gcc dot gnu.org
  2021-12-16 19:45 ` ubizjak at gmail dot com
  5 siblings, 0 replies; 7+ messages in thread
From: crazylht at gmail dot com @ 2021-10-21  1:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102812

--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongyu Wang from comment #3)
> (In reply to Uroš Bizjak from comment #2)
> > Please note that the code above should compile via ix86_expand_vector_set,
> > similar to:
> > 
> > --cut here--
> > typedef short v8hi __attribute__((__vector_size__(16)));
> > 
> > v8hi foo (short a)
> > {
> >   return (v8hi) {a, 0, 0, 0, 0, 0, 0, 0 };
> > }
> > --cut here--
> > 
> > that results in:
> > 
> >         vpxor   %xmm0, %xmm0, %xmm0
> >         vpinsrw $0, %edi, %xmm0, %xmm0
> >         ret
> 
> Currently we have
> 
> if (TARGET_AVX512FP16 && VALID_AVX512FP16_REG_MODE (mode))
>   return true;
> 
> in ix86_vector_mode_supported_p, so for SSE2 target V8HFmode would be
> returned in BLKmode.
> 
> After I put V8HFmode to VALID_SSE2_REG_MODE the code would be like
> 
> vmovss  %xmm0, %xmm0, %xmm1        
> vpxor   %xmm0, %xmm0, %xmm0        
> pextrw  $0, %xmm1, -10(%rsp)       
> vpinsrw $0, -10(%rsp), %xmm0, %xmm0
> 
> Seems IRA spills the HF reg to memory..
> 
> I wonder whether we should move vector mode support to sse2 for now, as we
> don't have sufficient HF vector arithmetic emulation for non-avx512fp16
> target.
According to the documentation, maybe we can:

@deftypefn {Target Hook} bool TARGET_VECTOR_MODE_SUPPORTED_P (machine_mode @var{mode})
Define this to return nonzero if the port is prepared to handle
insns involving vector mode @var{mode}.  At the very least, it
must have move patterns for this mode.
@end deftypefn

* [Bug target/102812] Unoptimal (and wrong) code for _Float16 insert
  2021-10-18 11:51 [Bug target/102812] New: Unoptimal (and wrong) code for _Float16 insert ubizjak at gmail dot com
                   ` (3 preceding siblings ...)
  2021-10-21  1:15 ` crazylht at gmail dot com
@ 2021-10-21  8:59 ` cvs-commit at gcc dot gnu.org
  2021-12-16 19:45 ` ubizjak at gmail dot com
  5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-10-21  8:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102812

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Hongyu Wang <hongyuw@gcc.gnu.org>:

https://gcc.gnu.org/g:c8a889fc0e115d40a2d02f32842655f3eadc8fa1

commit r12-4601-gc8a889fc0e115d40a2d02f32842655f3eadc8fa1
Author: Hongyu Wang <hongyu.wang@intel.com>
Date:   Wed Oct 20 13:13:39 2021 +0800

    i386: Fix wrong codegen for V8HF move without TARGET_AVX512F

    Since _Float16 type is enabled under sse2 target, returning
    V8HFmode vector without AVX512F target would generate wrong
    vmovdqa64 instruction. Adjust ix86_get_ssemov to avoid this.

    gcc/ChangeLog:
            PR target/102812
            * config/i386/i386.c (ix86_get_ssemov): Adjust HFmode vector
            move to use the same logic as HImode.

    gcc/testsuite/ChangeLog:
            PR target/102812
            * gcc.target/i386/pr102812.c: New test.

* [Bug target/102812] Unoptimal (and wrong) code for _Float16 insert
  2021-10-18 11:51 [Bug target/102812] New: Unoptimal (and wrong) code for _Float16 insert ubizjak at gmail dot com
                   ` (4 preceding siblings ...)
  2021-10-21  8:59 ` cvs-commit at gcc dot gnu.org
@ 2021-12-16 19:45 ` ubizjak at gmail dot com
  5 siblings, 0 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2021-12-16 19:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102812

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |12.0

--- Comment #6 from Uroš Bizjak <ubizjak at gmail dot com> ---
Fixed.
