public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/109764] New: V2SI multiply high is not vectorized on x86_64
From: ubizjak at gmail dot com @ 2023-05-07 16:21 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109764

            Bug ID: 109764
           Summary: V2SI multiply high is not vectorized on x86_64
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---

The following testcase:

--cut here--
#define N 2

unsigned int ur[N], ua[N], ub[N];

void mulh (void)
{
  int i;

  for (i = 0; i < N; i++)
    ur[i] = ((unsigned long) ua[i] * ub[i]) >> 32;
}

void mulh_slp (void)
{
  ur[0] = ((unsigned long) ua[0] * ub[0]) >> 32;
  ur[1] = ((unsigned long) ua[1] * ub[1]) >> 32;
}
--cut here--

should vectorize on x86_64 with the patch I'm going to attach and with
-fno-vect-cost-model. However, the compiler does not even consider the
"<s>mulv2si3_highpart" pattern.
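
For reference, a minimal sketch of the scalar semantics that the vectorized
code has to reproduce. The names mulhi_u32, mulhi_v2 and mulhi_u32_self_check
are hypothetical and are not part of the testcase or the patch; uint64_t
stands in for unsigned long so that the widening does not rely on an LP64
target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: high 32 bits of the full 64-bit product of two
   unsigned 32-bit values.  */
static uint32_t
mulhi_u32 (uint32_t a, uint32_t b)
{
  return (uint32_t) (((uint64_t) a * b) >> 32);
}

/* Hypothetical two-lane reference with the same shape as a V2SI
   multiply-high: each lane is computed independently.  */
static void
mulhi_v2 (uint32_t r[2], const uint32_t a[2], const uint32_t b[2])
{
  for (int i = 0; i < 2; i++)
    r[i] = mulhi_u32 (a[i], b[i]);
}

/* Small self-check with values chosen to exercise the upper half.  */
static void
mulhi_u32_self_check (void)
{
  const uint32_t a[2] = { 0xFFFFFFFFu, 0x80000000u };
  const uint32_t b[2] = { 0xFFFFFFFFu, 2u };
  uint32_t r[2];

  mulhi_v2 (r, a, b);
  assert (r[0] == 0xFFFFFFFEu);  /* (2^32 - 1)^2 = 0xFFFFFFFE00000001 */
  assert (r[1] == 1u);           /* 2^31 * 2 = 2^32 */
}
```

A vectorized mulh or mulh_slp would be expected to produce the same two
results per call as mulhi_v2.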


* [Bug tree-optimization/109764] V2SI multiply high is not vectorized on x86_64
From: ubizjak at gmail dot com @ 2023-05-07 16:23 UTC
  To: gcc-bugs

--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
Created attachment 55017
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55017&action=edit
Patch that adds <s>mulv2si3_highpart expander

The compiler should vectorize the testcase using the "<s>mulv2si3_highpart"
expander.


* [Bug tree-optimization/109764] V2SI multiply high is not vectorized on x86_64
From: rguenth at gcc dot gnu.org @ 2023-05-08  7:19 UTC
  To: gcc-bugs

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-05-08
             Blocks|                            |53947
             Target|                            |x86_64-*-* i?86-*-*
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  Pattern recog recognizes the widening multiplication but not a
highpart multiplication.  That's currently missing.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations


* [Bug tree-optimization/109764] V2SI multiply high is not vectorized on x86_64
From: ubizjak at gmail dot com @ 2023-05-08  8:09 UTC
  To: gcc-bugs

--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Richard Biener from comment #2)
> Confirmed.  Pattern recog recognizes the widening multiplication but not a
> highpart multiplication.  That's currently missing.

Please note that the following testcase, which multiplies short -> int:

--cut here--
#define N 2

unsigned short ur[N], ua[N], ub[N];

void mulh (void)
{
  int i;

  for (i = 0; i < N; i++)
    ur[i] = ((unsigned int) ua[i] * ub[i]) >> 16;
}

void mulh_slp (void)
{
  ur[0] = ((unsigned int) ua[0] * ub[0]) >> 16;
  ur[1] = ((unsigned int) ua[1] * ub[1]) >> 16;
}
--cut here--

vectorizes with -O2 -fno-vect-cost-model via .MULH:

  vect__15.6_1 = MEM <vector(2) short unsigned int> [(short unsigned int *)&ua];
  vect__17.9_3 = MEM <vector(2) short unsigned int> [(short unsigned int *)&ub];
  vect_patt_34.10_5 = .MULH (vect__15.6_1, vect__17.9_3);
  MEM <vector(2) short unsigned int> [(short unsigned int *)&ur] = vect_patt_34.10_5;

and generates the expected code:

        movd    ua(%rip), %xmm0
        movd    ub(%rip), %xmm1
        pmulhuw %xmm1, %xmm0
        movd    %xmm0, ur(%rip)

in both cases.
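
As an illustration only, the four-instruction sequence above corresponds
roughly to the following SSE2 intrinsics. The helper names mulhi_u16x2 and
mulhi_u16x2_self_check are hypothetical, the scalar fallback is an assumption
added so that the sketch also builds without SSE2, and none of this is taken
from the compiler or from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: two-lane unsigned 16-bit multiply-high.  */
static void
mulhi_u16x2 (uint16_t r[2], const uint16_t a[2], const uint16_t b[2])
{
#if defined(__SSE2__)
  int32_t va, vb, vr;

  /* Pack each pair of 16-bit lanes into one 32-bit value.  */
  memcpy (&va, a, sizeof va);
  memcpy (&vb, b, sizeof vb);

  __m128i x = _mm_cvtsi32_si128 (va);      /* movd */
  __m128i y = _mm_cvtsi32_si128 (vb);      /* movd */
  __m128i hi = _mm_mulhi_epu16 (x, y);     /* pmulhuw */

  vr = _mm_cvtsi128_si32 (hi);             /* movd */
  memcpy (r, &vr, sizeof vr);
#else
  /* Scalar fallback with the same per-lane result.  */
  for (int i = 0; i < 2; i++)
    r[i] = (uint16_t) (((uint32_t) a[i] * b[i]) >> 16);
#endif
}

/* Small self-check with values chosen to exercise the upper half.  */
static void
mulhi_u16x2_self_check (void)
{
  const uint16_t a[2] = { 0xFFFF, 0x8000 };
  const uint16_t b[2] = { 0xFFFF, 0x0002 };
  uint16_t r[2];

  mulhi_u16x2 (r, a, b);
  assert (r[0] == 0xFFFE);  /* 0xFFFF * 0xFFFF = 0xFFFE0001 */
  assert (r[1] == 0x0001);  /* 0x8000 * 2 = 0x00010000 */
}
```

Only the low 32 bits of the vector registers carry data here, which matches
the movd loads and store in the generated code.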
