public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/101809] New: emulated gather capability doesn't support 32-bit target
@ 2021-08-06 23:04 hjl.tools at gmail dot com
  2021-08-07 14:11 ` [Bug middle-end/101809] " hjl.tools at gmail dot com
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: hjl.tools at gmail dot com @ 2021-08-06 23:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809

            Bug ID: 101809
           Summary: emulated gather capability doesn't support 32-bit
                    target
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: crazylht at gmail dot com, rguenth at gcc dot gnu.org
  Target Milestone: ---

On Linux/x86-64, I get

[hjl@gnu-cfl-2 xxx]$ cat x.c
#include <stdint.h>

#define loop_t uint32_t
#define idx_t uint32_t

void loop(double * const __restrict__ dst,
          double const * const __restrict__ src,
          idx_t const * const __restrict__ idx,
          loop_t const begin,
          loop_t const end)
{
  for (loop_t i = begin; i < end; ++i)
    dst[i] = 42.0 * src[idx[i]];
}
[hjl@gnu-cfl-2 xxx]$ make x.s
/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/ -O3
-m32 -march=x86-64 -mfpmath=sse -S x.c
[hjl@gnu-cfl-2 xxx]$ cat x.s
        .file   "x.c"
        .text
        .p2align 4
        .globl  loop
        .type   loop, @function
loop:
.LFB0:
        .cfi_startproc
        pushl   %edi
        .cfi_def_cfa_offset 8
        .cfi_offset 7, -8
        pushl   %esi
        .cfi_def_cfa_offset 12
        .cfi_offset 6, -12
        pushl   %ebx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        movl    28(%esp), %eax
        movl    32(%esp), %ecx
        movl    16(%esp), %ebx
        movl    20(%esp), %esi
        movl    24(%esp), %edi
        cmpl    %ecx, %eax
        jnb     .L1
        movsd   .LC0, %xmm1
        .p2align 4,,10
        .p2align 3
.L3:
        movl    (%edi,%eax,4), %edx
        movsd   (%esi,%edx,8), %xmm0
        mulsd   %xmm1, %xmm0
        movsd   %xmm0, (%ebx,%eax,8)
        addl    $1, %eax
        cmpl    %eax, %ecx
        jne     .L3
.L1:
        popl    %ebx
        .cfi_restore 3
        .cfi_def_cfa_offset 12
        popl    %esi
        .cfi_restore 6
        .cfi_def_cfa_offset 8
        popl    %edi
        .cfi_restore 7
        .cfi_def_cfa_offset 4
        ret
        .cfi_endproc
.LFE0:
        .size   loop, .-loop
        .section        .rodata.cst8,"aM",@progbits,8
        .align 8
.LC0:
        .long   0
        .long   1078263808
        .ident  "GCC: (GNU) 12.0.0 20210806 (experimental)"
        .section        .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-2 xxx]$ 

The loop stays scalar (movsd/mulsd): the emulated gather capability isn't
enabled with -m32.

* [Bug middle-end/101809] emulated gather capability doesn't support 32-bit target
From: hjl.tools at gmail dot com @ 2021-08-07 14:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809

--- Comment #1 from H.J. Lu <hjl.tools at gmail dot com> ---
It fails in get_load_store_type:

          else if (!TYPE_VECTOR_SUBPARTS (vectype).is_constant ()
                   || !known_eq (TYPE_VECTOR_SUBPARTS (vectype),
                                 TYPE_VECTOR_SUBPARTS
                                   (gs_info->offset_vectype)))
            {
              if (dump_enabled_p ())
                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                                 "unsupported vector types for emulated "
                                 "gather.\n");
              return false;
            }

A V2DF gather needs a V2DI index vector, but with -m32 the index vector is V4SI.
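
The mismatch comes down to lane counts in a 128-bit vector. As a minimal
sketch (the helper name lanes_128 is hypothetical and not part of GCC), the
element sizes alone give two lanes for V2DF and V2DI but four for V4SI, so
the known_eq check on TYPE_VECTOR_SUBPARTS above rejects the V2DF/V4SI pair:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of elements of the given size that fit in
   one 128-bit (16-byte) vector.  */
static size_t lanes_128(size_t elem_size)
{
  assert(elem_size != 0 && 16 % elem_size == 0);
  return 16 / elem_size;
}

/* Lane counts for the modes named above:
     V2DF: lanes_128(sizeof(double))   -> data vector
     V2DI: lanes_128(sizeof(uint64_t)) -> index vector with matching lanes
     V4SI: lanes_128(sizeof(uint32_t)) -> index vector seen with -m32  */
```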

* [Bug tree-optimization/101809] emulated gather capability doesn't support 32-bit target
From: rguenth at gcc dot gnu.org @ 2021-08-09  7:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
          Component|middle-end                  |tree-optimization
             Target|                            |i?86-*-*
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2021-08-09
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Yes, I was lazy here.  The complication is in the shared gather setup code,
which will eventually end up with a mismatching number of index and data
vectors.

I will eventually improve this.

* [Bug tree-optimization/101809] emulated gather capability doesn't support 32-bit target
From: rguenth at gcc dot gnu.org @ 2021-08-09  8:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
This is also the FAIL of

FAIL: gcc.target/i386/vect-gather-1.c scan-tree-dump vect "loop vectorized"

with -m32 testing.  Note that the intent was for the testcase to work
independently of the presence of HW gather (for you folks testing with
-march=cascadelake).

The XFAIL condition is going to be tricky for this, so I'll leave this test
FAILing at least until I can decide whether fixing the missing feature is
possible for GCC 12.

* [Bug tree-optimization/101809] emulated gather capability doesn't support 32-bit target
From: rguenth at gcc dot gnu.org @ 2021-08-10  9:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
I am testing a patch.

* [Bug tree-optimization/101809] emulated gather capability doesn't support 32-bit target
From: cvs-commit at gcc dot gnu.org @ 2021-08-10 10:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:08aa0e3d4f781fd6a6e293bb06d280365a0bdc1d

commit r12-2836-g08aa0e3d4f781fd6a6e293bb06d280365a0bdc1d
Author: Richard Biener <rguenther@suse.de>
Date:   Tue Aug 10 10:54:58 2021 +0200

    tree-optimization/101809 - support emulated gather for double[int]

    This adds emulated gather support for index vectors with more
    elements than the data vector.  The internal function gather
    vectorization code doesn't currently handle this (but the builtin
    decl code does).  This allows vectorization of double data gather
    with int indexes on 32bit platforms where there isn't an implicit
    widening to 64bit present.

    2021-08-10  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/101809
            * tree-vect-stmts.c (get_load_store_type): Allow emulated
            gathers with offset vector nunits being a constant multiple
            of the data vector nunits.
            (vect_get_gather_scatter_ops): Use the appropriate nunits
            for the offset vector defs.
            (vectorizable_store): Adjust call to
            vect_get_gather_scatter_ops.
            (vectorizable_load): Likewise.  Handle the case of less
            offset vectors than data vectors.
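
As a rough illustration of the case this commit describes, the scalar C
model below treats four 32-bit indexes as one index vector that feeds two
2-element double vectors.  It is only a sketch of the shape of the
computation for the loop in the original report, not the code the
vectorizer emits; the function name gather_scale_model and the requirement
that n be a multiple of 4 are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of an emulated gather for double[int]:
   one 4-lane index vector supplies the offsets for two 2-lane data
   vectors.  n must be a multiple of 4 in this sketch.  */
static void gather_scale_model(double *dst, const double *src,
                               const uint32_t *idx, size_t n)
{
  assert(n % 4 == 0);
  for (size_t i = 0; i < n; i += 4)
    {
      /* Stand-in for one V4SI index vector.  */
      uint32_t off[4] = { idx[i], idx[i + 1], idx[i + 2], idx[i + 3] };

      /* Each half of the index vector fills one V2DF-sized data vector
         using two scalar loads, then the multiply and store follow.  */
      for (size_t half = 0; half < 2; ++half)
        {
          double v[2];
          v[0] = src[off[2 * half]];
          v[1] = src[off[2 * half + 1]];
          dst[i + 2 * half] = 42.0 * v[0];
          dst[i + 2 * half + 1] = 42.0 * v[1];
        }
    }
}
```

Calling it with n = 4 processes exactly one index vector and two data
vectors, which is the 2:1 ratio the ChangeLog entry refers to as the offset
vector nunits being a constant multiple of the data vector nunits.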

* [Bug tree-optimization/101809] emulated gather capability doesn't support 32-bit target
From: rguenth at gcc dot gnu.org @ 2021-08-10 10:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.  The reverse case remains, so we currently cannot vectorize int[long] or
char[int] with emulated gather (but the more elements we get, the less efficient
it will be).
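
For reference, loops of the following shape are what the int[long] and
char[int] notation presumably refers to (data type first, index type in
brackets, as in the double[int] commit title).  The function names are
hypothetical, and the lane counts in the comments assume 128-bit vectors
and an LP64 target where long is 64 bits wide.

```c
#include <assert.h>
#include <stddef.h>

/* int[long]: a data vector holds 4 ints, an index vector only 2 longs,
   so the index vector has fewer elements than the data vector.  */
static void gather_int_long(int *restrict dst, const int *restrict src,
                            const long *restrict idx, size_t n)
{
  assert(dst && src && idx);
  for (size_t i = 0; i < n; ++i)
    dst[i] = src[idx[i]];
}

/* char[int]: a data vector holds 16 chars, an index vector 4 ints.  */
static void gather_char_int(char *restrict dst, const char *restrict src,
                            const int *restrict idx, size_t n)
{
  assert(dst && src && idx);
  for (size_t i = 0; i < n; ++i)
    dst[i] = src[idx[i]];
}
```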

It would be nice to transition x86 over to internal function gathers
(thus the [mask_]gather_load and [mask_]scatter_store optabs, away from the
target hook returning a builtin decl).  If somebody can cover the .md
part (writing the define_expands), I'll take over the vectorizer part, which
unfortunately isn't a no-op.

* [Bug tree-optimization/101809] emulated gather capability doesn't support 32-bit target
From: cvs-commit at gcc dot gnu.org @ 2021-08-10 12:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:

https://gcc.gnu.org/g:557d06f8b3ddb54bca134695e117c40c6e2267ab

commit r12-2838-g557d06f8b3ddb54bca134695e117c40c6e2267ab
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 10 05:30:44 2021 -0700

    Enable gcc.target/i386/pr88531-1a.c for all targets

            PR tree-optimization/101809
            * gcc.target/i386/pr88531-1a.c: Enable for all targets.

