[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not

From: "hubicka at kam dot mff.cuni.cz" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not
Date: Wed, 22 Dec 2021 11:08:18 +0000
Message-ID: <bug-103797-4-meOOHTpwLg@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-103797-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797

--- Comment #4 from hubicka at kam dot mff.cuni.cz ---
> -E and remove not needed code.
> 
> > The
> > declarations are quite convoluted, but the function is well isolated and
> > easy to inspect from the full one...
> 
> Do we speak about:
> https://github.com/mozilla/gecko-dev/blob/bd25b1ca76dd5d323ffc69557f6cf759ba76ba23/gfx/2d/FilterNodeSoftware.cpp#L3670-L3691
> ?
Yes.
> 
> It should be possible to create a synthetic test that does the same (and lives
> in a loop, right?).

Well, I tried that for a while and got a bit lost (the code either got
vectorized by both gcc and clang or by neither).  There are more cases
in the gfx code where we have over 50% regression relative to the clang
build, so I think I will first try to reproduce those locally and perf
them to see if there is a broader pattern here.

The relevant code is:

uint32_t mozilla::gfx::{anonymous}::SpecularLightingSoftware::LightPixel
(struct SpecularLightingSoftware * const this, const struct Point3D & aNormal,
const struct Point3D & aVectorToLight, uint32_t aColor)
{

  <bb 2> [local count: 118111600]:
  _48 = MEM[(const struct BasePoint3D *)aVectorToLight_25(D)].D.75826.D.75829.z;
  _49 = _48 + 1.0e+0;
  _50 = MEM[(const struct BasePoint3D *)aVectorToLight_25(D)].D.75826.D.75829.y;
  _51 = _50 + 0.0;
  _52 = MEM[(const struct BasePoint3D *)aVectorToLight_25(D)].D.75826.D.75829.x;
  _53 = _52 + 0.0;
  _80 = _53 * _53;
  _82 = _51 * _51;
  _83 = _80 + _82;
  _85 = _49 * _49;
  _86 = _83 + _85;
  if (_86 u>= 0.0)
    goto <bb 3>; [99.95%]
  else
    goto <bb 4>; [0.05%]

  <bb 3> [local count: 118052545]:
  _87 = .SQRT (_86);
  goto <bb 5>; [100.00%]

  <bb 4> [local count: 59055]:
  _29 = __builtin_sqrtf (_86);

  <bb 5> [local count: 118111600]:
  # _30 = PHI <_29(4), _87(3)>
  _88 = _53 / _30;
  _89 = _51 / _30;
  _90 = _49 / _30;
  _41 = MEM[(const struct BasePoint3D *)aNormal_26(D)].D.75826.D.75829.x;
  _39 = _41 * _88;
  _37 = MEM[(const struct BasePoint3D *)aNormal_26(D)].D.75826.D.75829.y;
  _33 = _37 * _89;
  _27 = _33 + _39;
  _45 = MEM[(const struct BasePoint3D *)aNormal_26(D)].D.75826.D.75829.z;
  _46 = _45 * _90;
  _47 = _27 + _46;
  if (_47 >= 0.0)
    goto <bb 12>; [59.00%]
  else
    goto <bb 6>; [41.00%]


With -Ofast it gets a bit more streamlined:


  <bb 2> [local count: 118111600]:
  _48 = MEM[(const struct BasePoint3D *)aVectorToLight_25(D)].D.75826.D.75829.z;
  _49 = _48 + 1.0e+0;
  _50 = MEM[(const struct BasePoint3D *)aVectorToLight_25(D)].D.75826.D.75829.y;
  _51 = MEM[(const struct BasePoint3D *)aVectorToLight_25(D)].D.75826.D.75829.x;
  powmult_78 = _51 * _51;
  powmult_80 = _50 * _50;
  _81 = powmult_78 + powmult_80;
  powmult_83 = _49 * _49;
  _84 = _81 + powmult_83;
  _85 = __builtin_sqrtf (_84);
  _86 = _51 / _85;
  _87 = _50 / _85;
  _88 = _49 / _85;
  _41 = MEM[(const struct BasePoint3D *)aNormal_26(D)].D.75826.D.75829.x;
  _39 = _41 * _86;
  _37 = MEM[(const struct BasePoint3D *)aNormal_26(D)].D.75826.D.75829.y;
  _33 = _37 * _87;
  _27 = _33 + _39;
  _45 = MEM[(const struct BasePoint3D *)aNormal_26(D)].D.75826.D.75829.z;
  _46 = _45 * _88;
  _47 = _27 + _46;
  if (_47 >= 0.0)
    goto <bb 3>; [59.00%]
  else
    goto <bb 9>; [41.00%]

But I do not quite see from the SLP dump why this is not considered for
vectorization.
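
For readers who do not want to decode the GIMPLE, the start of the
function in the dumps above corresponds roughly to the C++ below.  This
is only a sketch read off the dumps, not the Mozilla source: Vec3 and
halfwayDot are made-up names, and Vec3 merely stands in for Point3D.
The three multiplications and three divisions by a common length are
the statement groups one would expect SLP to look at.  The "+ 0.0"
additions remain in the first dump because they are not no-ops for
-0.0 under strict IEEE rules; the -Ofast dump no longer has them.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for mozilla::gfx::Point3D; only x, y, z are assumed.
struct Vec3 {
  float x, y, z;
};

// Sketch of the first basic blocks of LightPixel as read off the dump:
// add (0, 0, 1) to the light vector, normalize, dot with the normal.
inline float halfwayDot(const Vec3& normal, const Vec3& toLight) {
  // _53 = x + 0.0, _51 = y + 0.0, _49 = z + 1.0
  Vec3 h{toLight.x + 0.0f, toLight.y + 0.0f, toLight.z + 1.0f};
  // _86 = x*x + y*y + z*z, then the square root (_30 in the first dump)
  float len = std::sqrt(h.x * h.x + h.y * h.y + h.z * h.z);
  // _88, _89, _90: three scalar divisions by the same length
  h.x /= len;
  h.y /= len;
  h.z /= len;
  // _47: dot product, summed in the order the dump uses (y, x, then z)
  return normal.y * h.y + normal.x * h.x + normal.z * h.z;
}

// Usage: light and normal both along +z give exactly 1.
inline void example() {
  Vec3 n{0.0f, 0.0f, 1.0f};
  Vec3 l{0.0f, 0.0f, 1.0f};
  assert(halfwayDot(n, l) == 1.0f);
}
```

The rest of the function (the branch on _47 and the colour computation)
is not shown in the dumps above, so it is left out here as well.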

I attach the dump.
Honza
