public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/94864] New: Failure to combine vunpckhpd+movsd into single vunpckhpd
@ 2020-04-29 21:59 gabravier at gmail dot com
  2020-04-29 22:03 ` [Bug tree-optimization/94864] " gabravier at gmail dot com
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: gabravier at gmail dot com @ 2020-04-29 21:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94864

            Bug ID: 94864
           Summary: Failure to combine vunpckhpd+movsd into single
                    vunpckhpd
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

typedef double v2df __attribute__((vector_size(16)));

v2df move_sd(v2df a, v2df b)
{
    v2df result = a;
    result[0] = b[1];
    return result;
}

LLVM outputs this:

move_sd(double __vector(2), double __vector(2)): # @move_sd(double __vector(2), double __vector(2))
  vunpckhpd xmm0, xmm1, xmm0 # xmm0 = xmm1[1],xmm0[1]
  ret

GCC outputs this:

move_sd(double __vector(2), double __vector(2)):
  vunpckhpd xmm1, xmm1, xmm1
  vmovsd xmm0, xmm0, xmm1
  ret
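
To make the expected result concrete, here is a minimal sketch that checks the source function against a lane-wise model of `vunpckhpd`. The helper names `unpckhpd_model`, `same_result` and `self_check` are hypothetical and only for illustration; the model follows the lane comment in the LLVM output above (`xmm0 = xmm1[1],xmm0[1]`) and assumes a compiler with GNU vector extensions.

```c
#include <assert.h>

typedef double v2df __attribute__((vector_size(16)));

/* The function from the report: copy a, then overwrite lane 0 with b[1]. */
static v2df move_sd(v2df a, v2df b)
{
    v2df result = a;
    result[0] = b[1];
    return result;
}

/* Hypothetical lane-wise model of `vunpckhpd dst, src1, src2`:
   dst = { src1[1], src2[1] }. */
static v2df unpckhpd_model(v2df src1, v2df src2)
{
    v2df dst = { src1[1], src2[1] };
    return dst;
}

/* Returns 1 when both forms produce the same lanes for the given inputs. */
static int same_result(v2df a, v2df b)
{
    v2df x = move_sd(a, b);
    v2df y = unpckhpd_model(b, a);  /* operand order as in the LLVM output */
    return x[0] == y[0] && x[1] == y[1];
}

/* Small self-check on one pair of inputs. */
static void self_check(void)
{
    v2df a = { 1.0, 2.0 };
    v2df b = { 3.0, 4.0 };
    assert(same_result(a, b));
}
```

With `a = {1.0, 2.0}` and `b = {3.0, 4.0}`, both forms yield `{4.0, 2.0}`, which is why a single high-unpack is sufficient here.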


* [Bug tree-optimization/94864] Failure to combine vunpckhpd+movsd into single vunpckhpd
  2020-04-29 21:59 [Bug tree-optimization/94864] New: Failure to combine vunpckhpd+movsd into single vunpckhpd gabravier at gmail dot com
@ 2020-04-29 22:03 ` gabravier at gmail dot com
  2020-04-30  7:03 ` [Bug rtl-optimization/94864] " rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: gabravier at gmail dot com @ 2020-04-29 22:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94864

--- Comment #1 from Gabriel Ravier <gabravier at gmail dot com> ---
Note: the compilation options were `-O3 -mavx`.


* [Bug rtl-optimization/94864] Failure to combine vunpckhpd+movsd into single vunpckhpd
  2020-04-29 21:59 [Bug tree-optimization/94864] New: Failure to combine vunpckhpd+movsd into single vunpckhpd gabravier at gmail dot com
  2020-04-29 22:03 ` [Bug tree-optimization/94864] " gabravier at gmail dot com
@ 2020-04-30  7:03 ` rguenth at gcc dot gnu.org
  2020-05-04 15:17 ` segher at gcc dot gnu.org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-04-30  7:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94864

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-04-30
          Component|target                      |rtl-optimization
                 CC|                            |rguenth at gcc dot gnu.org,
                   |                            |segher at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
We're feeding combine with

(insn 7 4 9 2 (set (reg:DF 87)
        (vec_select:DF (reg:V2DF 90)
            (parallel [
                    (const_int 1 [0x1])
                ]))) "y.c":6:26 3195 {sse2_storehpd}
     (expr_list:REG_DEAD (reg:V2DF 90)
        (nil)))
(insn 9 7 14 2 (set (reg:V2DF 88 [ result ])
        (vec_merge:V2DF (vec_duplicate:V2DF (reg:DF 87))
            (reg:V2DF 89)
            (const_int 1 [0x1]))) "y.c":6:21 2918 {vec_setv2df_0}
     (expr_list:REG_DEAD (reg:V2DF 89)
        (expr_list:REG_DEAD (reg:DF 87)
            (nil))))

which makes

(set (reg:V2DF 88 [ result ])
    (vec_merge:V2DF (vec_duplicate:V2DF (vec_select:DF (reg:V2DF 90)
                (parallel [
                        (const_int 1 [0x1])
                    ])))
        (reg:V2DF 89)
        (const_int 1 [0x1])))

out of this, which does not match anything because x86 chooses to use
vec_merge in some patterns and vec_select/vec_concat in others:

(define_insn "*vec_interleave_highv2df"
  [(set (match_operand:V2DF 0 "nonimmediate_operand"     "=x,v,v,x,v,m")
        (vec_select:V2DF
          (vec_concat:V4DF
            (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,o,o,o,v")
            (match_operand:V2DF 2 "nonimmediate_operand" " x,v,1,0,v,0"))
          (parallel [(const_int 1)
                     (const_int 3)])))]
  "TARGET_SSE2 && ix86_vec_interleave_v2df_operator_ok (operands, 1)"
  "@
   unpckhpd\t{%2, %0|%0, %2}

Not sure if combine should try to exchange vec_merge for vec_select/vec_concat,
or if this is simply the backend's fault (or GCC's, for even having two ways
to express the same thing...)
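
The "two ways to express the same thing" can be illustrated with a small C model of the RTL operations involved. This is only a sketch: the struct types and function names below are hypothetical stand-ins, not GCC internals, and the semantics follow the documented meaning of `vec_merge` (bit i of the mask set means lane i comes from the first operand) and of `vec_select` over `vec_concat`.

```c
#include <assert.h>

typedef struct { double e[2]; } V2DF;  /* stand-in for a V2DF register */
typedef struct { double e[4]; } V4DF;  /* stand-in for the V4DF concat */

/* (vec_duplicate:V2DF x): every lane holds x. */
static V2DF vec_duplicate(double x)
{
    V2DF r = {{ x, x }};
    return r;
}

/* (vec_merge:V2DF v1 v2 mask): lane i comes from v1 if bit i of mask
   is set, otherwise from v2. */
static V2DF vec_merge(V2DF v1, V2DF v2, unsigned mask)
{
    V2DF r;
    for (int i = 0; i < 2; i++)
        r.e[i] = ((mask >> i) & 1) ? v1.e[i] : v2.e[i];
    return r;
}

/* (vec_concat:V4DF v1 v2): lanes of v1 followed by lanes of v2. */
static V4DF vec_concat(V2DF v1, V2DF v2)
{
    V4DF r = {{ v1.e[0], v1.e[1], v2.e[0], v2.e[1] }};
    return r;
}

/* (vec_select:V2DF v (parallel [i0 i1])): pick two lanes of v. */
static V2DF vec_select2(V4DF v, int i0, int i1)
{
    V2DF r = {{ v.e[i0], v.e[i1] }};
    return r;
}

/* The form combine produced: merge dup(r90[1]) into r89 under mask 1. */
static V2DF combined_form(V2DF r89, V2DF r90)
{
    return vec_merge(vec_duplicate(r90.e[1]), r89, 1);
}

/* The shape of *vec_interleave_highv2df, with operand 1 = r90 and
   operand 2 = r89 (an assumed operand assignment). */
static V2DF interleave_high_form(V2DF r89, V2DF r90)
{
    return vec_select2(vec_concat(r90, r89), 1, 3);
}

/* Checks that both forms agree on one pair of inputs. */
static void self_check(void)
{
    V2DF r89 = {{ 1.0, 2.0 }};
    V2DF r90 = {{ 3.0, 4.0 }};
    V2DF m = combined_form(r89, r90);
    V2DF s = interleave_high_form(r89, r90);
    assert(m.e[0] == s.e[0] && m.e[1] == s.e[1]);
}
```

Both forms compute `{ r90[1], r89[1] }`, so the mismatch is purely one of representation, not of semantics.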


* [Bug rtl-optimization/94864] Failure to combine vunpckhpd+movsd into single vunpckhpd
  2020-04-29 21:59 [Bug tree-optimization/94864] New: Failure to combine vunpckhpd+movsd into single vunpckhpd gabravier at gmail dot com
  2020-04-29 22:03 ` [Bug tree-optimization/94864] " gabravier at gmail dot com
  2020-04-30  7:03 ` [Bug rtl-optimization/94864] " rguenth at gcc dot gnu.org
@ 2020-05-04 15:17 ` segher at gcc dot gnu.org
  2020-05-07 11:04 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: segher at gcc dot gnu.org @ 2020-05-04 15:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94864

--- Comment #3 from Segher Boessenkool <segher at gcc dot gnu.org> ---
vec_duplicate of vec_select is just a vec_select.  Any vec_merge is a
vec_select as well, as you say.

Canonicalisation should always produce vec_select.

We probably should have canonicalisation rules for this, so that we do
not get all those rtxes in the instruction stream at all.
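
As a rough sketch of what such a canonicalisation rule would compute (not how GCC implements or would implement it), a constant-mask `vec_merge` can be turned into the index list of a `vec_select` over the `vec_concat` of its two operands. The function names below are hypothetical.

```c
#include <assert.h>

/* For (vec_merge v1 v2 mask) with n lanes, compute sel such that the
   result equals (vec_select (vec_concat v1 v2) sel).
   Indices 0..n-1 name lanes of v1, n..2n-1 name lanes of v2. */
static void merge_mask_to_select(unsigned mask, int n, int *sel)
{
    for (int i = 0; i < n; i++)
        sel[i] = ((mask >> i) & 1) ? i : n + i;
}

/* Same, but for the case where v1 is a vec_duplicate of lane j of some
   vector src: the select is then taken over (vec_concat src v2), and
   every lane that came from v1 maps to index j. */
static void merge_of_dup_select_to_select(unsigned mask, int n, int j,
                                          int *sel)
{
    for (int i = 0; i < n; i++)
        sel[i] = ((mask >> i) & 1) ? j : n + i;
}

/* Self-check on the case from comment #2: mask 1, two lanes, j = 1. */
static void self_check(void)
{
    int sel[2];
    merge_of_dup_select_to_select(1u, 2, 1, sel);
    assert(sel[0] == 1 && sel[1] == 3);
}
```

For the RTL in comment #2 (mask 1, two lanes, duplicated lane 1), the second helper yields the indices 1 and 3, which are the same `(parallel [1 3])` that `*vec_interleave_highv2df` uses.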


* [Bug rtl-optimization/94864] Failure to combine vunpckhpd+movsd into single vunpckhpd
  2020-04-29 21:59 [Bug tree-optimization/94864] New: Failure to combine vunpckhpd+movsd into single vunpckhpd gabravier at gmail dot com
                   ` (2 preceding siblings ...)
  2020-05-04 15:17 ` segher at gcc dot gnu.org
@ 2020-05-07 11:04 ` rguenth at gcc dot gnu.org
  2023-07-31 11:31 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-05-07 11:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94864

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Addressed by the patch for PR94865.


* [Bug rtl-optimization/94864] Failure to combine vunpckhpd+movsd into single vunpckhpd
  2020-04-29 21:59 [Bug tree-optimization/94864] New: Failure to combine vunpckhpd+movsd into single vunpckhpd gabravier at gmail dot com
                   ` (3 preceding siblings ...)
  2020-05-07 11:04 ` rguenth at gcc dot gnu.org
@ 2023-07-31 11:31 ` rguenth at gcc dot gnu.org
  2023-08-22  9:34 ` cvs-commit at gcc dot gnu.org
  2023-08-22  9:35 ` rguenth at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-07-31 11:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94864
Bug 94864 depends on bug 88540, which changed state.

Bug 88540 Summary: Issues with vectorization of min/max operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88540

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED


* [Bug rtl-optimization/94864] Failure to combine vunpckhpd+movsd into single vunpckhpd
  2020-04-29 21:59 [Bug tree-optimization/94864] New: Failure to combine vunpckhpd+movsd into single vunpckhpd gabravier at gmail dot com
                   ` (4 preceding siblings ...)
  2023-07-31 11:31 ` rguenth at gcc dot gnu.org
@ 2023-08-22  9:34 ` cvs-commit at gcc dot gnu.org
  2023-08-22  9:35 ` rguenth at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-08-22  9:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94864

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:27de9aa152141e7f3ee66372647d0f2cd94c4b90

commit r14-3381-g27de9aa152141e7f3ee66372647d0f2cd94c4b90
Author: Richard Biener <rguenther@suse.de>
Date:   Wed Jul 12 15:01:47 2023 +0200

    tree-optimization/94864 - vector insert of vector extract simplification

    The PRs ask for optimizing of

      _1 = BIT_FIELD_REF <b_3(D), 64, 64>;
      result_4 = BIT_INSERT_EXPR <a_2(D), _1, 64>;

    to a vector permutation.  The following implements this as
    match.pd pattern, improving code generation on x86_64.

    On the RTL level we face the issue that backend patterns inconsistently
    use vec_merge and vec_select of vec_concat to represent permutes.

    I think using a (supported) permute is almost always better
    than an extract plus insert, maybe excluding the case we extract
    element zero and that's aliased to a register that can be used
    directly for insertion (not sure how to query that).

    The patch FAILs one case in gcc.target/i386/avx512fp16-vmovsh-1a.c
    where we now expand from

     __A_28 = VEC_PERM_EXPR <x2.8_9, x1.9_10, { 0, 9, 10, 11, 12, 13, 14, 15 }>;

    instead of

     _28 = BIT_FIELD_REF <x2.8_9, 16, 0>;
     __A_29 = BIT_INSERT_EXPR <x1.9_10, _28, 0>;

    producing a vpblendw instruction instead of the expected vmovsh.  That's
    either a missed vec_perm_const expansion optimization or even better,
    an improvement - Zen4 for example has 4 ports to execute vpblendw
    but only 3 for executing vmovsh and both instructions have the same size.

    The patch XFAILs the sub-testcase.

            PR tree-optimization/94864
            PR tree-optimization/94865
            PR tree-optimization/93080
            * match.pd (bit_insert @0 (BIT_FIELD_REF @1 ..) ..): New pattern
            for vector insertion from vector extraction.

            * gcc.target/i386/pr94864.c: New testcase.
            * gcc.target/i386/pr94865.c: Likewise.
            * gcc.target/i386/avx512fp16-vmovsh-1a.c: XFAIL.
            * gcc.dg/tree-ssa/forwprop-40.c: Likewise.
            * gcc.dg/tree-ssa/forwprop-41.c: Likewise.
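
The rewrite described in the commit message can be sketched as computing a permutation selector from the extract and insert lane numbers. The code below is an illustrative model only, with hypothetical function names; the actual match.pd pattern may order operands or handle sub-vectors differently. The selector convention follows the `VEC_PERM_EXPR` shown above: indices 0..n-1 name lanes of the first operand, n..2n-1 name lanes of the second.

```c
#include <assert.h>

/* Build sel so that VEC_PERM_EXPR <src, dst, sel> equals inserting
   src[ext] into dst at lane ins (lane numbers, not bit offsets). */
static void insert_of_extract_to_perm(int n, int ext, int ins, int *sel)
{
    for (int k = 0; k < n; k++)
        sel[k] = (k == ins) ? ext : n + k;
}

/* Reference two-input permute: out[k] = concat(v0, v1)[sel[k]]. */
static void vec_perm(const double *v0, const double *v1,
                     const int *sel, int n, double *out)
{
    for (int k = 0; k < n; k++)
        out[k] = (sel[k] < n) ? v0[sel[k]] : v1[sel[k] - n];
}

/* Self-check on the report's case: result = a; result[0] = b[1]. */
static void self_check(void)
{
    double a[2] = { 1.0, 2.0 };
    double b[2] = { 3.0, 4.0 };
    double out[2];
    int sel[2];
    insert_of_extract_to_perm(2, 1, 0, sel);
    vec_perm(b, a, sel, 2, out);
    assert(out[0] == 4.0 && out[1] == 2.0);
}
```

With eight lanes, extract lane 0 and insert lane 0, the helper produces the selector `{ 0, 9, 10, 11, 12, 13, 14, 15 }` quoted in the commit message; with two lanes, extract lane 1 and insert lane 0, it produces `{ 1, 3 }`, the high-unpack from the original report.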


* [Bug rtl-optimization/94864] Failure to combine vunpckhpd+movsd into single vunpckhpd
  2020-04-29 21:59 [Bug tree-optimization/94864] New: Failure to combine vunpckhpd+movsd into single vunpckhpd gabravier at gmail dot com
                   ` (5 preceding siblings ...)
  2023-08-22  9:34 ` cvs-commit at gcc dot gnu.org
@ 2023-08-22  9:35 ` rguenth at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-08-22  9:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94864

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED
   Target Milestone|---                         |14.0

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.

