public inbox for gcc-bugs@sourceware.org
* [Bug c/112432] New: Internal-fn: The [i|l|ll]rint family don't support FLOATN
@ 2023-11-08  0:21 pan2.li at intel dot com
  2023-11-08  7:05 ` [Bug c/112432] " rguenth at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: pan2.li at intel dot com @ 2023-11-08  0:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112432

            Bug ID: 112432
           Summary: Internal-fn: The [i|l|ll]rint family don't support
                    FLOATN
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pan2.li at intel dot com
  Target Milestone: ---

The [i|l|ll]rint family is defined with DEF_INTERNAL_FLT_FN instead of
DEF_INTERNAL_FLT_FLOATN_FN in internal-fn.def. Thus, a standard name like
lrint<m><n> cannot be expanded when a _Float16 operand is given.

Is there any reason or background why [i|l|ll]rint does not honor FLOATN?
All the related fn definitions are listed below.

DEF_INTERNAL_FLT_FN (ICEIL, ECF_CONST, lceil, unary_convert)
DEF_INTERNAL_FLT_FN (IFLOOR, ECF_CONST, lfloor, unary_convert)
DEF_INTERNAL_FLT_FN (IRINT, ECF_CONST, lrint, unary_convert)
DEF_INTERNAL_FLT_FN (IROUND, ECF_CONST, lround, unary_convert)
DEF_INTERNAL_FLT_FN (LCEIL, ECF_CONST, lceil, unary_convert)
DEF_INTERNAL_FLT_FN (LFLOOR, ECF_CONST, lfloor, unary_convert)
DEF_INTERNAL_FLT_FN (LRINT, ECF_CONST, lrint, unary_convert)
DEF_INTERNAL_FLT_FN (LROUND, ECF_CONST, lround, unary_convert)
DEF_INTERNAL_FLT_FN (LLCEIL, ECF_CONST, lceil, unary_convert)
DEF_INTERNAL_FLT_FN (LLFLOOR, ECF_CONST, lfloor, unary_convert)
DEF_INTERNAL_FLT_FN (LLRINT, ECF_CONST, lrint, unary_convert)
DEF_INTERNAL_FLT_FN (LLROUND, ECF_CONST, lround, unary_convert)
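
To illustrate the difference being asked about, here is a minimal sketch in plain C. It is not GCC code: the `model_*` names are hypothetical, and the suffix lists are an assumption based on the usual float/double/long double suffixes and the `_FloatN`/`_FloatNx` suffixes. It only models which builtin names each kind of definition is expected to cover.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model, not GCC code: builtin suffixes assumed to be
   covered by a FLT_FN-style definition (float, double, long double).  */
static const char *const model_flt_suffixes[] = { "f", "", "l" };

/* Extra suffixes assumed to be covered by a FLT_FLOATN_FN-style one.  */
static const char *const model_floatn_suffixes[] = {
  "f16", "f32", "f64", "f128", "f32x", "f64x", "f128x"
};

/* Return 1 if BUILTIN is BASE followed by one of the N suffixes in LIST.  */
static int
model_match (const char *builtin, const char *base,
             const char *const *list, size_t n)
{
  size_t len = strlen (base);
  if (strncmp (builtin, base, len) != 0)
    return 0;
  for (size_t i = 0; i < n; i++)
    if (strcmp (builtin + len, list[i]) == 0)
      return 1;
  return 0;
}

/* WITH_FLOATN is 0 for the FLT_FN-style model and 1 for the
   FLT_FLOATN_FN-style model.  */
static int
model_covers (int with_floatn, const char *builtin, const char *base)
{
  assert (builtin != NULL && base != NULL);
  if (model_match (builtin, base, model_flt_suffixes, 3))
    return 1;
  return with_floatn
         && model_match (builtin, base, model_floatn_suffixes, 7);
}
```

In this model, `model_covers (0, "lrintf16", "lrint")` is 0 while `model_covers (1, "lrintf16", "lrint")` is 1, which mirrors the missing _Float16 expansion described above.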


* [Bug c/112432] Internal-fn: The [i|l|ll]rint family don't support FLOATN
  2023-11-08  0:21 [Bug c/112432] New: Internal-fn: The [i|l|ll]rint family don't support FLOATN pan2.li at intel dot com
@ 2023-11-08  7:05 ` rguenth at gcc dot gnu.org
  2023-11-08  7:17 ` pan2.li at intel dot com
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-11-08  7:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112432

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2023-11-08
                 CC|                            |jsm28 at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Is there a corresponding C API?  We don't have "generic" versions in
builtins.def either (with _VAR).

That said, what's the testcase here?


* [Bug c/112432] Internal-fn: The [i|l|ll]rint family don't support FLOATN
  2023-11-08  0:21 [Bug c/112432] New: Internal-fn: The [i|l|ll]rint family don't support FLOATN pan2.li at intel dot com
  2023-11-08  7:05 ` [Bug c/112432] " rguenth at gcc dot gnu.org
@ 2023-11-08  7:17 ` pan2.li at intel dot com
  2023-11-08  8:34 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pan2.li at intel dot com @ 2023-11-08  7:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112432

--- Comment #2 from Li Pan <pan2.li at intel dot com> ---
(In reply to Richard Biener from comment #1)
> Is there a corresponding C API?  We don't have "generic" versions in
> builtins.def either (with _VAR).
> 
> That said, what's the testcase here?

I found some FLOATN-style APIs in the glibc documentation, e.g. with N = 16:

long int lrintfN (_FloatN x);
long int lroundfN (_FloatN x);

https://www.gnu.org/software/libc/manual/2.38/html_mono/libc.html
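
For readers unfamiliar with the two families declared above: lrint rounds according to the current rounding mode, while lround rounds halfway cases away from zero. The sketch below models that difference in plain C without calling libm. The `model_*` helpers are hypothetical, take double rather than _FloatN, assume the default round-to-nearest-even mode for the lrint model, and are only valid for values well inside the range of long.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical model of lround: halfway cases round away from zero.  */
static long
model_lround (double x)
{
  assert (x > -1e15 && x < 1e15);  /* keep well inside the range of long */
  long t = (long) x;               /* truncates toward zero */
  double frac = x - (double) t;    /* in (-1, 1), same sign as x */
  if (frac >= 0.5)
    return t + 1;
  if (frac <= -0.5)
    return t - 1;
  return t;
}

/* Hypothetical model of lrint, ASSUMING the rounding mode is
   round-to-nearest-even: halfway cases go to the even neighbour.  */
static long
model_lrint_nearest_even (double x)
{
  assert (x > -1e15 && x < 1e15);
  long t = (long) x;
  double frac = x - (double) t;
  int t_is_odd = (t % 2) != 0;
  if (frac > 0.5 || (frac == 0.5 && t_is_odd))
    return t + 1;
  if (frac < -0.5 || (frac == -0.5 && t_is_odd))
    return t - 1;
  return t;
}
```

With these models, 2.5 gives 3 for the lround model and 2 for the lrint model, while 3.5 gives 4 for both.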

The context is the auto-vectorization of lrintf and lrintf16, for example as
below.

void
test_lrintf16 (long *out, _Float16 *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrintf16 (in[i]);
}

void
test_lrintf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrintf (in[i]);
}

We get dumps like the following when compiling with "-march=rv64gcv_zvfh_zfh
-mabi=lp64d -O3 -ftree-vectorize -ffast-math": the _Float16 loop keeps the
scalar call, while the float loop is vectorized.
void
test_lrintf16 (long *out, _Float16 *in, unsigned count)
{
  # ivtmp.8_28 = PHI <ivtmp.8_27(4), ivtmp.8_26(3)>
  # ivtmp.9_25 = PHI <ivtmp.9_24(4), ivtmp.9_23(3)>
  _22 = (void *) ivtmp.8_28;
  _4 = MEM[(_Float16 *)_22];
  _7 = __builtin_lrintf16 (_4);
  _21 = (void *) ivtmp.9_25;
  MEM[(long int *)_21] = _7;
  ivtmp.8_27 = ivtmp.8_28 + 2;
  ivtmp.9_24 = ivtmp.9_25 + 8;
}

void
test_lrintf (long *out, float *in, unsigned count)
{
  # ivtmp.37_32 = PHI <ivtmp.37_48(5), ivtmp.37_28(4)>
  # ivtmp.40_26 = PHI <ivtmp.40_25(5), ivtmp.40_24(4)>
  _23 = (void *) ivtmp.37_32;
  vect__4.21_40 = MEM <vector(16) float> [(float *)_23];
  vect__7.22_41 = .LRINT (vect__4.21_40);                 // Expand lrint<m><n>
  _22 = (void *) ivtmp.40_26;
  MEM <vector(16) long int> [(long int *)_22] = vect__7.22_41;
  ivtmp.37_48 = ivtmp.37_32 + 64;
  ivtmp.40_25 = ivtmp.40_26 + 128;
}


* [Bug c/112432] Internal-fn: The [i|l|ll]rint family don't support FLOATN
  2023-11-08  0:21 [Bug c/112432] New: Internal-fn: The [i|l|ll]rint family don't support FLOATN pan2.li at intel dot com
  2023-11-08  7:05 ` [Bug c/112432] " rguenth at gcc dot gnu.org
  2023-11-08  7:17 ` pan2.li at intel dot com
@ 2023-11-08  8:34 ` rguenth at gcc dot gnu.org
  2023-11-08 10:29 ` pan2.li at intel dot com
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-11-08  8:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112432

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ah, yes, for lrint we have the builtins - I just looked for lceil here.  So
yeah, where there are DEF_EXT_LIB_FLOATN_NX_BUILTINS we should have
DEF_INTERNAL_FLT_FLOATN_FN.


* [Bug c/112432] Internal-fn: The [i|l|ll]rint family don't support FLOATN
  2023-11-08  0:21 [Bug c/112432] New: Internal-fn: The [i|l|ll]rint family don't support FLOATN pan2.li at intel dot com
                   ` (2 preceding siblings ...)
  2023-11-08  8:34 ` rguenth at gcc dot gnu.org
@ 2023-11-08 10:29 ` pan2.li at intel dot com
  2023-11-09  6:36 ` pan2.li at intel dot com
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pan2.li at intel dot com @ 2023-11-08 10:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112432

--- Comment #4 from Li Pan <pan2.li at intel dot com> ---
(In reply to Richard Biener from comment #3)
> Ah, yes, for lrint we have the builtins - I just looked for lceil here.  So
> yeah, where there are DEF_EXT_LIB_FLOATN_NX_BUILTINS we should have
> DEF_INTERNAL_FLT_FLOATN_FN.

Thanks Richard, I will have a try for this change.


* [Bug c/112432] Internal-fn: The [i|l|ll]rint family don't support FLOATN
  2023-11-08  0:21 [Bug c/112432] New: Internal-fn: The [i|l|ll]rint family don't support FLOATN pan2.li at intel dot com
                   ` (3 preceding siblings ...)
  2023-11-08 10:29 ` pan2.li at intel dot com
@ 2023-11-09  6:36 ` pan2.li at intel dot com
  2023-11-10  0:56 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pan2.li at intel dot com @ 2023-11-09  6:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112432

--- Comment #5 from Li Pan <pan2.li at intel dot com> ---
(In reply to Li Pan from comment #4)
> (In reply to Richard Biener from comment #3)
> > Ah, yes, for lrint we have the builtins - I just looked for lceil here.  So
> > yeah, where there are DEF_EXT_LIB_FLOATN_NX_BUILTINS we should have
> > DEF_INTERNAL_FLT_FLOATN_FN.
> 
> Thanks Richard, I will have a try for this change.

After double-checking, the related definitions are listed below.

         glibc  GCC-FLOATN_NX_BUILTINS
iceil    N      N
ifloor   N      N
irint    N      N
iround   N      N

lceil    N      N
lfloor   N      N
lrint    Y      Y
lround   Y      Y

llceil   N      N
llfloor  N      N
llrint   Y      Y
llround  Y      Y

We only need to support lrint/lround/llrint/llround for FLOATN for now.
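
The table above can be read as a simple lookup: a function gets FLOATN handling only where the GCC FLOATN_NX builtins exist. A minimal sketch of that lookup follows; the struct, table and function names are hypothetical, and only the Y/N values are taken from the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of the table above (hypothetical representation).  */
struct floatn_row
{
  const char *name;
  int in_glibc;          /* glibc column: 1 for Y, 0 for N */
  int in_gcc_builtins;   /* GCC-FLOATN_NX_BUILTINS column */
};

static const struct floatn_row floatn_table[] = {
  { "iceil", 0, 0 },  { "ifloor", 0, 0 },  { "irint", 0, 0 },
  { "iround", 0, 0 }, { "lceil", 0, 0 },   { "lfloor", 0, 0 },
  { "lrint", 1, 1 },  { "lround", 1, 1 },  { "llceil", 0, 0 },
  { "llfloor", 0, 0 }, { "llrint", 1, 1 }, { "llround", 1, 1 },
};

/* Return 1 if NAME should get FLOATN support according to the table,
   0 otherwise (including names that are not in the table).  */
static int
wants_floatn (const char *name)
{
  assert (name != NULL);
  size_t n = sizeof floatn_table / sizeof floatn_table[0];
  for (size_t i = 0; i < n; i++)
    if (strcmp (floatn_table[i].name, name) == 0)
      return floatn_table[i].in_gcc_builtins;
  return 0;
}
```

Only lrint, lround, llrint and llround return 1 here, matching the conclusion above.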


* [Bug c/112432] Internal-fn: The [i|l|ll]rint family don't support FLOATN
  2023-11-08  0:21 [Bug c/112432] New: Internal-fn: The [i|l|ll]rint family don't support FLOATN pan2.li at intel dot com
                   ` (4 preceding siblings ...)
  2023-11-09  6:36 ` pan2.li at intel dot com
@ 2023-11-10  0:56 ` cvs-commit at gcc dot gnu.org
  2023-11-10  0:59 ` pan2.li at intel dot com
  2023-12-18 13:19 ` cvs-commit at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-10  0:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112432

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Pan Li <panli@gcc.gnu.org>:

https://gcc.gnu.org/g:907603d4b117e82dbbde2d58a04e33f3021908e7

commit r14-5307-g907603d4b117e82dbbde2d58a04e33f3021908e7
Author: Pan Li <pan2.li@intel.com>
Date:   Thu Nov 9 22:04:39 2023 +0800

    Internal-fn: Add FLOATN support for l/ll round and rint [PR/112432]

    The defined DEF_EXT_LIB_FLOATN_NX_BUILTINS functions should also
    have DEF_INTERNAL_FLT_FLOATN_FN instead of DEF_INTERNAL_FLT_FN for
    the FLOATN support. According to the glibc API and the gcc builtins,
    the table below shows whether FLOATN is supported or not.

    +---------+-------+-------------------------------------+
    |         | glibc | gcc: DEF_EXT_LIB_FLOATN_NX_BUILTINS |
    +---------+-------+-------------------------------------+
    | iceil   | N     | N                                   |
    | ifloor  | N     | N                                   |
    | irint   | N     | N                                   |
    | iround  | N     | N                                   |
    | lceil   | N     | N                                   |
    | lfloor  | N     | N                                   |
    | lrint   | Y     | Y                                   |
    | lround  | Y     | Y                                   |
    | llceil  | N     | N                                   |
    | llfloor | N     | N                                   |
    | llrint  | Y     | Y                                   |
    | llround | Y     | Y                                   |
    +---------+-------+-------------------------------------+

    This patch would like to support FLOATN for:
    1. lrint
    2. lround
    3. llrint
    4. llround

    The tests below pass with this patch:
    1. x86 bootstrap and regression test.
    2. aarch64 regression test.
    3. riscv regression tests.

            PR target/112432

    gcc/ChangeLog:

            * internal-fn.def (LRINT): Add FLOATN support.
            (LROUND): Ditto.
            (LLRINT): Ditto.
            (LLROUND): Ditto.

    Signed-off-by: Pan Li <pan2.li@intel.com>


* [Bug c/112432] Internal-fn: The [i|l|ll]rint family don't support FLOATN
  2023-11-08  0:21 [Bug c/112432] New: Internal-fn: The [i|l|ll]rint family don't support FLOATN pan2.li at intel dot com
                   ` (5 preceding siblings ...)
  2023-11-10  0:56 ` cvs-commit at gcc dot gnu.org
@ 2023-11-10  0:59 ` pan2.li at intel dot com
  2023-12-18 13:19 ` cvs-commit at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: pan2.li at intel dot com @ 2023-11-10  0:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112432

Li Pan <pan2.li at intel dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Li Pan <pan2.li at intel dot com> ---
The FLOATN support patch has been merged to trunk; the builtins below have
FLOATN support now.

1. lrint
2. lround
3. llrint
4. llround


* [Bug c/112432] Internal-fn: The [i|l|ll]rint family don't support FLOATN
  2023-11-08  0:21 [Bug c/112432] New: Internal-fn: The [i|l|ll]rint family don't support FLOATN pan2.li at intel dot com
                   ` (6 preceding siblings ...)
  2023-11-10  0:59 ` pan2.li at intel dot com
@ 2023-12-18 13:19 ` cvs-commit at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-12-18 13:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112432

--- Comment #8 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Pan Li <panli@gcc.gnu.org>:

https://gcc.gnu.org/g:b3b2799b872bc4c1944629af9dfc8472c8ca5fe6

commit r14-6659-gb3b2799b872bc4c1944629af9dfc8472c8ca5fe6
Author: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Date:   Mon Dec 18 19:35:21 2023 +0800

    RISC-V: Support one more overlap for wv instructions

    For 'wv' instructions, e.g. vwadd.wv vd,vs2,vs1:

    vs2 has the same EEW as vd.
    vs1 has a smaller EEW than vd.

    So vs2 can overlap with vd, but vs1 can only overlap the highest-numbered
    part of vd when the LMUL of vs1 is greater than 1.

    We already supported this overlap for vs1 LMUL >= 1, but I forgot the
    vs1 LMUL < 1 case: vs2 can still overlap vd even though vs1 cannot
    overlap vd at all.

    Consider the reduction auto-vectorization:

    int64_t
    reduc_plus_int (int *__restrict a, int n)
    {
      int64_t r = 0;
      for (int i = 0; i < n; ++i)
        r += a[i];
      return r;
    }

    When we use --param=riscv-autovec-lmul=m2, the codegen is good
    because we already support overlap for source EEW32 LMUL1 -> dest
    EEW64 LMUL2.

    --param=riscv-autovec-lmul=m2:

    reduc_plus_int:
            ble     a1,zero,.L4
            vsetvli a5,zero,e64,m2,ta,ma
            vmv.v.i v2,0
    .L3:
            vsetvli a5,a1,e32,m1,tu,ma
            slli    a4,a5,2
            sub     a1,a1,a5
            vle32.v v1,0(a0)
            add     a0,a0,a4
            vwadd.wv        v2,v2,v1
            bne     a1,zero,.L3
            li      a5,0
            vsetivli        zero,1,e64,m1,ta,ma
            vmv.s.x v1,a5
            vsetvli a5,zero,e64,m2,ta,ma
            vredsum.vs      v2,v2,v1
            vmv.x.s a0,v2
            ret
    .L4:
            li      a0,0
            ret

    However, the default LMUL (--param=riscv-autovec-lmul=m1) generates a
    redundant vmv1r, since it is EEW32 LMUL=MF2 -> EEW64 LMUL=1.

    Before this patch:

    reduc_plus_int:
            ble     a1,zero,.L4
            vsetvli a5,zero,e64,m1,ta,ma
            vmv.v.i v1,0
    .L3:
            vsetvli a5,a1,e32,mf2,tu,ma
            slli    a4,a5,2
            sub     a1,a1,a5
            vle32.v v2,0(a0)
            vmv1r.v v3,v1                  ---->  This should be removed.
            add     a0,a0,a4
            vwadd.wv        v1,v3,v2       ---->  vs2 should be v1
            bne     a1,zero,.L3
            li      a5,0
            vsetivli        zero,1,e64,m1,ta,ma
            vmv.s.x v2,a5
            vsetvli a5,zero,e64,m1,ta,ma
            vredsum.vs      v1,v1,v2
            vmv.x.s a0,v1
            ret
    .L4:
            li      a0,0
            ret

    After this patch:

    reduc_plus_int:
            ble     a1,zero,.L4
            vsetvli a5,zero,e64,m1,ta,ma
            vmv.v.i v1,0
    .L3:
            vsetvli a5,a1,e32,mf2,tu,ma
            slli    a4,a5,2
            sub     a1,a1,a5
            vle32.v v2,0(a0)
            add     a0,a0,a4
            vwadd.wv        v1,v1,v2
            bne     a1,zero,.L3
            li      a5,0
            vsetivli        zero,1,e64,m1,ta,ma
            vmv.s.x v2,a5
            vsetvli a5,zero,e64,m1,ta,ma
            vredsum.vs      v1,v1,v2
            vmv.x.s a0,v1
            ret
    .L4:
            li      a0,0
            ret

            PR target/112432

    gcc/ChangeLog:

            * config/riscv/riscv.md (none,W21,W42,W84,W43,W86,W87): Add W0.
            (none,W21,W42,W84,W43,W86,W87,W0): Ditto.
            * config/riscv/vector.md: Ditto.

    gcc/testsuite/ChangeLog:

            * gcc.target/riscv/rvv/base/pr112432-42.c: New test.
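
The overlap rule described in the commit message above can be sketched as a small predicate. This is an illustrative model, not the GCC constraint implementation: the struct and function names are hypothetical, and it assumes the rule is that a source with the same EEW as the destination may overlap it freely, while a narrower source may overlap only the highest-numbered part of the destination group and only when its own EMUL is at least 1.

```c
#include <assert.h>

/* Hypothetical description of one vector operand.  */
struct vreg_group
{
  int base;     /* first register number, e.g. 2 for v2 */
  int eew;      /* effective element width in bits */
  int emul_x8;  /* EMUL times 8: mf2 = 4, m1 = 8, m2 = 16 */
};

/* Number of registers in the group; fractional EMUL still uses one.  */
static int
group_regs (const struct vreg_group *g)
{
  return g->emul_x8 <= 8 ? 1 : g->emul_x8 / 8;
}

/* Return 1 if the two groups share at least one register.  */
static int
groups_intersect (const struct vreg_group *a, const struct vreg_group *b)
{
  return a->base < b->base + group_regs (b)
         && b->base < a->base + group_regs (a);
}

/* Return 1 if source VS may be allocated as given relative to the
   destination VD of a widening instruction (assumed rule, see above).  */
static int
widening_source_allowed (const struct vreg_group *vd,
                         const struct vreg_group *vs)
{
  assert (vd->eew >= vs->eew);
  if (!groups_intersect (vd, vs))
    return 1;
  if (vs->eew == vd->eew)
    return 1;  /* e.g. vs2 of a .wv instruction */
  return vs->emul_x8 >= 8
         && vs->base + group_regs (vs) == vd->base + group_regs (vd);
}

/* Convenience wrapper taking plain integers (hypothetical helper).  */
static int
model_allowed (int vd_base, int vd_eew, int vd_emul_x8,
               int vs_base, int vs_eew, int vs_emul_x8)
{
  struct vreg_group vd = { vd_base, vd_eew, vd_emul_x8 };
  struct vreg_group vs = { vs_base, vs_eew, vs_emul_x8 };
  return widening_source_allowed (&vd, &vs);
}
```

For the m1 case in the commit message, `model_allowed (1, 64, 8, 1, 64, 8)` is 1 (vs2 may share v1 with vd), while `model_allowed (1, 64, 8, 1, 32, 4)` is 0 (an mf2 vs1 may not overlap vd).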
