public inbox for gcc-bugs@sourceware.org
* [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
@ 2021-04-02  3:49 crazylht at gmail dot com
  2021-04-02 14:29 ` [Bug target/99881] " hjl.tools at gmail dot com
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: crazylht at gmail dot com @ 2021-04-02  3:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

            Bug ID: 99881
           Summary: Regression compare -O2 -ftree-vectorize with -O2 on
                    SKX/CLX
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
  Target Milestone: ---

testcase is extracted from 557.xz_r

void
foo (int* __restrict a, int n, int c)
{
    a[0] = n;
    a[1] = c;
}

gcc -O2 -ftree-vectorize -fvect-cost-model=very-cheap

foo(int*, int, int):
        movd    xmm0, esi
        movd    xmm1, edx
        punpckldq       xmm0, xmm1
        movq    QWORD PTR [rdi], xmm0
        ret

without vectorization

foo(int*, int, int):
        mov     DWORD PTR [rdi], esi
        mov     DWORD PTR [rdi+4], edx
        ret

cost model:
scalar: 2 times scalar_store costs 24,
vector: 1 times unaligned_store costs 12, vec_construct 8

I know that the current strategy of the cost model is to enable vectorization
as much as possible, but in the case above it hurts performance: the
throughput of punpckldq is 1 on SKX/CLX, which becomes a bottleneck (znver2 is
OK). With -march=SKX, the second vmovd and the unpack are replaced by vpinsr,
and it regresses even more, since vpinsr has a throughput of 2 on CLX/SKX.

So I'm thinking of adding extra cost for a 2-element vec_construct to prevent
the above vectorization while trying not to affect other vectorization
situations.
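
The arithmetic behind that decision can be sketched in a few lines of C. This
is only an illustration built from the three numbers quoted in the cost model
dump above; the helper names and the `extra` penalty parameter are hypothetical,
and the "strictly cheaper" rule is an assumption about how the comparison is
made, not a copy of the vectorizer's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Costs quoted in the report for the two-store testcase.  */
enum
{
  SCALAR_STORE_COST = 12,    /* 2 scalar stores cost 24 in total.  */
  UNALIGNED_STORE_COST = 12, /* One 64-bit vector store.  */
  VEC_CONSTRUCT_COST = 8     /* Building {n, c} in an XMM register.  */
};

/* Cost of the scalar version: one store per element.  */
static int
scalar_cost (int nelts)
{
  assert (nelts > 0);
  return nelts * SCALAR_STORE_COST;
}

/* Cost of the vector version.  EXTRA is a hypothetical penalty added
   on top of the quoted vec_construct cost.  */
static int
vector_cost (int extra)
{
  return UNALIGNED_STORE_COST + VEC_CONSTRUCT_COST + extra;
}

/* Assumption: the vector form is kept only when it is strictly cheaper.  */
static bool
vectorize_p (int nelts, int extra)
{
  return vector_cost (extra) < scalar_cost (nelts);
}
```

With no penalty the vector side costs 20 against 24, so the stores are
vectorized. Under these assumptions, any extra cost of 4 or more on the
2-element vec_construct makes `vectorize_p (2, extra)` return false.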


* [Bug target/99881] Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
  2021-04-02  3:49 [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX crazylht at gmail dot com
@ 2021-04-02 14:29 ` hjl.tools at gmail dot com
  2021-04-02 19:34 ` hjl.tools at gmail dot com
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: hjl.tools at gmail dot com @ 2021-04-02 14:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-04-02
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Hongtao.liu from comment #0)
> testcase is extracted from 557.xz_r
> 
> void
> foo (int* __restrict a, int n, int c)
> {
>     a[0] = n;
>     a[1] = c;
> }
> 
> gcc -O2 -ftree-vectorize -fvect-cost-model=very-cheap
> 
> foo(int*, int, int):
>         movd    xmm0, esi
>         movd    xmm1, edx
>         punpckldq       xmm0, xmm1
>         movq    QWORD PTR [rdi], xmm0
>         ret
> 
> without vectorization
> 
> foo(int*, int, int):
>         mov     DWORD PTR [rdi], esi
>         mov     DWORD PTR [rdi+4], edx
>         ret
> 
> cost model:
> scalar: 2 times scalar_store costs 24,
> vector: 1 times unaligned_store costs 12, vec_construct 8

How is the vec_construct cost computed today?


* [Bug target/99881] Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
  2021-04-02  3:49 [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX crazylht at gmail dot com
  2021-04-02 14:29 ` [Bug target/99881] " hjl.tools at gmail dot com
@ 2021-04-02 19:34 ` hjl.tools at gmail dot com
  2021-04-06  7:48 ` rguenth at gcc dot gnu.org
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: hjl.tools at gmail dot com @ 2021-04-02 19:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 50501
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50501&action=edit
A patch to add vec_contruct cost

ix86_builtin_vectorization_cost has

      case vec_construct:
        {
          /* N element inserts into SSE vectors.  */
          int cost = TYPE_VECTOR_SUBPARTS (vectype) * ix86_cost->sse_op;
          /* One vinserti128 for combining two SSE vectors for AVX256.  */
          if (GET_MODE_BITSIZE (mode) == 256)
            cost += ix86_vec_cost (mode, ix86_cost->addss);
          /* One vinserti64x4 and two vinserti128 for combining SSE
             and AVX256 vectors to AVX512.  */
          else if (GET_MODE_BITSIZE (mode) == 512)
            cost += 3 * ix86_vec_cost (mode, ix86_cost->addss);
          return cost; 
        }

Add vec_contruct cost for vec_construct operation.
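
For readers without the GCC sources at hand, the formula in the snippet above
can be reproduced as a standalone sketch. `struct toy_costs`, `toy_vec_cost`
and `toy_vec_construct_cost` are hypothetical stand-ins, and `toy_vec_cost` is
modelled as the identity, whereas the real ix86_vec_cost may scale the value
depending on the mode and tuning.

```c
#include <assert.h>

/* Hypothetical stand-ins for ix86_cost->sse_op and ix86_cost->addss.  */
struct toy_costs
{
  int sse_op;
  int addss;
};

/* Simplification: return the cost unchanged.  */
static int
toy_vec_cost (int cost)
{
  return cost;
}

/* Mirror of the vec_construct case: NELTS element inserts, plus the
   combining steps for 256-bit and 512-bit vectors.  */
static int
toy_vec_construct_cost (const struct toy_costs *c, int nelts, int mode_bits)
{
  assert (nelts > 0);
  int cost = nelts * c->sse_op;
  if (mode_bits == 256)
    cost += toy_vec_cost (c->addss);
  else if (mode_bits == 512)
    cost += 3 * toy_vec_cost (c->addss);
  return cost;
}
```

With an illustrative sse_op of 4, a 2-element 64-bit vector costs 8, which
matches the figure in the report. Note that nothing in the formula depends on
where the elements come from.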


* [Bug target/99881] Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
  2021-04-02  3:49 [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX crazylht at gmail dot com
  2021-04-02 14:29 ` [Bug target/99881] " hjl.tools at gmail dot com
  2021-04-02 19:34 ` hjl.tools at gmail dot com
@ 2021-04-06  7:48 ` rguenth at gcc dot gnu.org
  2021-04-06 10:06 ` crazylht at gmail dot com
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-04-06  7:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
But 2 element construction _should_ be cheap.  What is missing is the move
cost from GPR to XMM regs (but we do not have a good idea whether the sources
are memory, so it's not as clear-cut here either).

IMHO a better approach might be to up unaligned vector store/load costs?

For the testcase at hand why does a throughput of 1 pose a problem?  There's
only one punpckldq instruction around?

Note that for the case of non-loop vectorization of 'double' the two element
vector CTORs are common and important to handle cheaply.  See also all the
discussion in PR98856


* [Bug target/99881] Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
  2021-04-02  3:49 [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX crazylht at gmail dot com
                   ` (2 preceding siblings ...)
  2021-04-06  7:48 ` rguenth at gcc dot gnu.org
@ 2021-04-06 10:06 ` crazylht at gmail dot com
  2021-04-06 11:44 ` rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: crazylht at gmail dot com @ 2021-04-06 10:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #3)
> But 2 element construction _should_ be cheap.  What is missing is the move
> cost from GPR to XMM regs (but we do not have a good idea whether the sources
> are memory, so it's not as clear-cut here either).
> 
> IMHO a better approach might be to up unaligned vector store/load costs?
> 
> For the testcase at hand why does a throughput of 1 pose a problem?  There's
> only one punpckldq instruction around?
> 

There are several lea/add instructions (which may also use port 5) around
punpckldq. Considering that fast LEA and integer ALU operations are common in
address computation, a throughput of 1 for punpckldq will be a bottleneck.

Refer to https://godbolt.org/z/hK9r5vTzd for the original case.

> Note that for the case of non-loop vectorization of 'double' the two element
> vector CTORs are common and important to handle cheaply.  See also all the
> discussion in PR98856


* [Bug target/99881] Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
  2021-04-02  3:49 [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX crazylht at gmail dot com
                   ` (3 preceding siblings ...)
  2021-04-06 10:06 ` crazylht at gmail dot com
@ 2021-04-06 11:44 ` rguenth at gcc dot gnu.org
  2021-07-28  2:48 ` cvs-commit at gcc dot gnu.org
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-04-06 11:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Hongtao.liu from comment #4)
> (In reply to Richard Biener from comment #3)
> > But 2 element construction _should_ be cheap.  What is missing is the move
> > cost from GPR to XMM regs (but we do not have a good idea whether the sources
> > are memory, so it's not as clear-cut here either).
> > 
> > IMHO a better approach might be to up unaligned vector store/load costs?
> > 
> > For the testcase at hand why does a throughput of 1 pose a problem?  There's
> > only one punpckldq instruction around?
> > 
> 
> There're several lea/add(which also may use port 5) instructions around
> punpckldq, considering that FAST LEA and Int ALU will be common in address
> computation, throughput of 1 for punpckldq will be a bottleneck.
> 
> refer to https://godbolt.org/z/hK9r5vTzd for original case

Too bad.  But this is starting to model resource constraints which are not
at all handled by the generic part of the vectorizer cost model.  We kind-of
have the ability to do this in the target (see how rs6000 models some of this
in its finish_cost hook via rs6000_density_test).  But then the cost model
suffers from quite some GIGO already and I fear adding complexity will only
produce more 'G'.

As you have seen you need quite some offset to make up for the saved store,
I think trying to get integer_to_sse costed for the movd/pinsrq would be a
better way than parametrizing 'vec_construct' (because there's no vec_construct
instruction - there's multiple pieces to it).

> > Note that for the case of non-loop vectorization of 'double' the two element
> > vector CTORs are common and important to handle cheaply.  See also all the
> > discussion in PR98856


* [Bug target/99881] Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
  2021-04-02  3:49 [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX crazylht at gmail dot com
                   ` (4 preceding siblings ...)
  2021-04-06 11:44 ` rguenth at gcc dot gnu.org
@ 2021-07-28  2:48 ` cvs-commit at gcc dot gnu.org
  2021-07-28  2:49 ` crazylht at gmail dot com
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-07-28  2:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:872da9a6f664a06d73c987aa0cb2e5b830158a10

commit r12-2549-g872da9a6f664a06d73c987aa0cb2e5b830158a10
Author: liuhongt <hongtao.liu@intel.com>
Date:   Fri Mar 26 10:56:47 2021 +0800

    Add the member integer_to_sse to processor_cost as a cost simulation for
movd/pinsrd. It will be used to calculate the cost of vec_construct.

    gcc/ChangeLog:

            PR target/99881
            * config/i386/i386.h (processor_costs): Add new member
            integer_to_sse.
            * config/i386/x86-tune-costs.h (ix86_size_cost, i386_cost,
            i486_cost, pentium_cost, lakemont_cost, pentiumpro_cost,
            geode_cost, k6_cost, athlon_cost, k8_cost, amdfam10_cost,
            bdver_cost, znver1_cost, znver2_cost, znver3_cost,
            btver1_cost, btver2_cost, btver3_cost, pentium4_cost,
            nocona_cost, atom_cost, atom_cost, slm_cost, intel_cost,
            generic_cost, core_cost): Initialize integer_to_sse same value
            as sse_op.
            (skylake_cost): Initialize integer_to_sse twice as much as sse_op.
            * config/i386/i386.c (ix86_builtin_vectorization_cost):
            Use integer_to_sse instead of sse_op to calculate the cost of
            vec_construct.

    gcc/testsuite/ChangeLog:

            PR target/99881
            * gcc.target/i386/pr99881.c: New test.
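
The effect of this change on the testcase can be sketched as follows. The
struct and function names are hypothetical, the 256-bit and 512-bit terms are
left out for brevity, and the sse_op value of 4 is inferred from the reported
vec_construct cost of 8 for two elements rather than read from the tuning
tables.

```c
#include <assert.h>

/* Hypothetical per-tuning entry holding the two costs the ChangeLog
   talks about.  */
struct toy_tune
{
  const char *name;
  int sse_op;
  int integer_to_sse;
};

/* Before the commit: each element insert is costed as sse_op.  */
static int
construct_cost_old (const struct toy_tune *t, int nelts)
{
  assert (nelts > 0);
  return nelts * t->sse_op;
}

/* After the commit: each element insert is costed as integer_to_sse.  */
static int
construct_cost_new (const struct toy_tune *t, int nelts)
{
  assert (nelts > 0);
  return nelts * t->integer_to_sse;
}
```

For a skylake-like entry `{ "skylake", 4, 8 }` the 2-element cost goes from 8
to 16, so the vector side becomes 12 + 16 = 28 against a scalar cost of 24.
Entries where integer_to_sse equals sse_op keep their old cost.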


* [Bug target/99881] Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
  2021-04-02  3:49 [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX crazylht at gmail dot com
                   ` (5 preceding siblings ...)
  2021-07-28  2:48 ` cvs-commit at gcc dot gnu.org
@ 2021-07-28  2:49 ` crazylht at gmail dot com
  2021-07-28 22:47 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: crazylht at gmail dot com @ 2021-07-28  2:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed in GCC12.


* [Bug target/99881] Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
  2021-04-02  3:49 [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX crazylht at gmail dot com
                   ` (6 preceding siblings ...)
  2021-07-28  2:49 ` crazylht at gmail dot com
@ 2021-07-28 22:47 ` jakub at gcc dot gnu.org
  2021-07-29  1:09 ` crazylht at gmail dot com
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-07-28 22:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The testcase is buggy:
Executing on host: /home/jakub/src/gcc/obj19/gcc/xgcc
-B/home/jakub/src/gcc/obj19/gcc/
/home/jakub/src/gcc/gcc/testsuite/gcc.target/i386/pr99881.c
-fdiagnostics-plain-output -Ofast -march=skylake -ffat-lto-objects
-fno-ident -S -o pr99881.s    (timeout = 300)
spawn -ignore SIGHUP /home/jakub/src/gcc/obj19/gcc/xgcc
-B/home/jakub/src/gcc/obj19/gcc/
/home/jakub/src/gcc/gcc/testsuite/gcc.target/i386/pr99881.c
-fdiagnostics-plain-output -Ofast
 -march=skylake -ffat-lto-objects -fno-ident -S -o pr99881.s
PASS: gcc.target/i386/pr99881.c (test for excess errors)
ERROR: (DejaGnu) proc "0-9" does not exist.
The error code is TCL LOOKUP COMMAND 0-9
The info on the error is:
invalid command name "0-9"
    while executing
"::tcl_unknown 0-9"
    ("uplevel" body line 1)
    invoked from within
"uplevel 1 ::tcl_unknown $args"


xmm[0-9] should be xmm\[0-9]
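
A minimal sketch of what a correctly escaped directive looks like is shown
below. Inside a double-quoted Tcl string an unescaped `[0-9]` is treated as
command substitution, which is why DejaGnu looks for a proc named "0-9". The
options are taken from the log above; the dg-final pattern itself is
hypothetical and is not the pattern used in the committed pr99881.c.

```c
/* { dg-do compile } */
/* { dg-options "-Ofast -march=skylake" } */
/* Hypothetical pattern: the backslash stops Tcl from treating 0-9 as a
   command to run.  */
/* { dg-final { scan-assembler-not "xmm\[0-9]" } } */

void
foo (int *__restrict a, int n, int c)
{
  a[0] = n;
  a[1] = c;
}
```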


* [Bug target/99881] Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
  2021-04-02  3:49 [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX crazylht at gmail dot com
                   ` (7 preceding siblings ...)
  2021-07-28 22:47 ` jakub at gcc dot gnu.org
@ 2021-07-29  1:09 ` crazylht at gmail dot com
  2021-07-29  2:18 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: crazylht at gmail dot com @ 2021-07-29  1:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

--- Comment #9 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Jakub Jelinek from comment #8)
> The testcase is buggy:
> Executing on host: /home/jakub/src/gcc/obj19/gcc/xgcc
> -B/home/jakub/src/gcc/obj19/gcc/
> /home/jakub/src/gcc/gcc/testsuite/gcc.target/i386/pr99881.c
> -fdiagnostics-plain-output -Ofast -march=skylake -ffat-lto-objects
> -fno-ident -S -o pr99881.s    (timeout = 300)
> spawn -ignore SIGHUP /home/jakub/src/gcc/obj19/gcc/xgcc
> -B/home/jakub/src/gcc/obj19/gcc/
> /home/jakub/src/gcc/gcc/testsuite/gcc.target/i386/pr99881.c
> -fdiagnostics-plain-output -Ofast
>  -march=skylake -ffat-lto-objects -fno-ident -S -o pr99881.s
> PASS: gcc.target/i386/pr99881.c (test for excess errors)
> ERROR: (DejaGnu) proc "0-9" does not exist.
> The error code is TCL LOOKUP COMMAND 0-9
> The info on the error is:
> invalid command name "0-9"
>     while executing
> "::tcl_unknown 0-9"
>     ("uplevel" body line 1)
>     invoked from within
> "uplevel 1 ::tcl_unknown $args"
> 
> 
> xmm[0-9] should be xmm\[0-9]

Yes, sorry about the typo.


* [Bug target/99881] Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
  2021-04-02  3:49 [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX crazylht at gmail dot com
                   ` (8 preceding siblings ...)
  2021-07-29  1:09 ` crazylht at gmail dot com
@ 2021-07-29  2:18 ` cvs-commit at gcc dot gnu.org
  2021-08-19  2:32 ` crazylht at gmail dot com
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-07-29  2:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:7d11da87a1e3c7e0d274788ca43519513dae4bfe

commit r12-2587-g7d11da87a1e3c7e0d274788ca43519513dae4bfe
Author: liuhongt <hongtao.liu@intel.com>
Date:   Thu Jul 29 09:33:15 2021 +0800

    Adjust/Refine testcases.

    gcc/testsuite/ChangeLog:

            PR target/99881
            * gcc.target/i386/pr91446.c:
            * gcc.target/i386/pr92658-avx512bw-2.c:
            * gcc.target/i386/pr92658-sse4-2.c:
            * gcc.target/i386/pr92658-sse4.c:
            * gcc.target/i386/pr99881.c:


* [Bug target/99881] Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
  2021-04-02  3:49 [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX crazylht at gmail dot com
                   ` (9 preceding siblings ...)
  2021-07-29  2:18 ` cvs-commit at gcc dot gnu.org
@ 2021-08-19  2:32 ` crazylht at gmail dot com
  2022-02-22  7:59 ` cvs-commit at gcc dot gnu.org
  2022-02-22  8:00 ` rguenth at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: crazylht at gmail dot com @ 2021-08-19  2:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |---
             Status|RESOLVED                    |REOPENED

--- Comment #11 from Hongtao.liu <crazylht at gmail dot com> ---
r12-2549 has been reverted because of PR101929 and PR101936.


* [Bug target/99881] Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
  2021-04-02  3:49 [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX crazylht at gmail dot com
                   ` (10 preceding siblings ...)
  2021-08-19  2:32 ` crazylht at gmail dot com
@ 2022-02-22  7:59 ` cvs-commit at gcc dot gnu.org
  2022-02-22  8:00 ` rguenth at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-02-22  7:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:90d693bdc9d71841f51d68826ffa5bd685d7f0bc

commit r12-7319-g90d693bdc9d71841f51d68826ffa5bd685d7f0bc
Author: Richard Biener <rguenther@suse.de>
Date:   Fri Feb 18 14:32:14 2022 +0100

    target/99881 - x86 vector cost of CTOR from integer regs

    This uses the now passed SLP node to the vectorizer costing hook
    to adjust vector construction costs for the cost of moving an
    integer component from a GPR to a vector register when that's
    required for building a vector from components.  A crucial difference
    here is whether the component is loaded from memory or extracted
    from a vector register as in those cases no intermediate GPR is involved.

    The pr99881.c testcase can be Un-XFAILed with this patch, the
    pr91446.c testcase now produces scalar code which looks superior
    to me so I've adjusted it as well.

    2022-02-18  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/104582
            PR target/99881
            * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
            Cost GPR to vector register moves for integer vector construction.

            * gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-1.c: New.
            * gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-2.c: Likewise.
            * gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-3.c: Likewise.
            * gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-4.c: Likewise.
            * gcc.target/i386/pr99881.c: Un-XFAIL.
            * gcc.target/i386/pr91446.c: Adjust to not expect vectorization.
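
The idea described in this commit message can be illustrated with a small
sketch. It is not the code in ix86_vector_costs::add_stmt_cost: the
`elt_source` enum, the `ctor_cost` helper and the `gpr_to_vec` parameter are
hypothetical, and the numbers in the note below are only illustrative.

```c
#include <assert.h>

/* Hypothetical classification of where each vector element comes from.  */
enum elt_source
{
  ELT_FROM_GPR,        /* Needs a GPR to vector register move.  */
  ELT_FROM_MEMORY,     /* Loaded directly, no intermediate GPR.  */
  ELT_FROM_VECTOR_LANE /* Extracted from a vector register.  */
};

/* Base construction cost plus one GPR to vector move for every element
   that actually lives in a general purpose register.  */
static int
ctor_cost (const enum elt_source *elts, int nelts, int sse_op, int gpr_to_vec)
{
  assert (nelts > 0);
  int cost = nelts * sse_op;
  for (int i = 0; i < nelts; i++)
    if (elts[i] == ELT_FROM_GPR)
      cost += gpr_to_vec;
  return cost;
}
```

With sse_op 4 and a move cost of 6, two GPR-sourced elements cost 20 instead
of 8, which together with the 12-cost store exceeds the scalar cost of 24.
Two elements loaded from memory still cost 8, so constructions that never
touch a GPR are not penalized.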


* [Bug target/99881] Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
  2021-04-02  3:49 [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX crazylht at gmail dot com
                   ` (11 preceding siblings ...)
  2022-02-22  7:59 ` cvs-commit at gcc dot gnu.org
@ 2022-02-22  8:00 ` rguenth at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-22  8:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|REOPENED                    |RESOLVED

--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
This is now fixed again for GCC 12 which enables vectorization at -O2.

