[Bug tree-optimization/114120] New: add reduction with promotion and then truncation poorly vectorized


* [Bug tree-optimization/114120] New: add reduction with promotion and then truncation poorly vectorized
@ 2024-02-26 18:20 pinskia at gcc dot gnu.org
  2024-02-27  8:26 ` [Bug tree-optimization/114120] " rguenth at gcc dot gnu.org
  0 siblings, 1 reply; 2+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-02-26 18:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114120

            Bug ID: 114120
           Summary: add reduction with promotion and then truncation
                    poorly vectorized
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
            Blocks: 53947
  Target Milestone: ---
            Target: x86_64

Take:
```
unsigned char f(unsigned char *src)
{
        unsigned  sum = 0;
        for(int y = 0; y < 8; y++)
        {
                sum += src[y];
        }
        return sum;
}
```

On x86_64 this should vectorize to the same code as is generated for:
```
unsigned char f0(unsigned char *src)
{
        unsigned char sum = 0;
        for(int y = 0; y < 8; y++)
        {
                sum += src[y];
        }
        return sum;
}
```

But GCC does not: it keeps `sum` in `unsigned int`, so the reduction is done in
`unsigned int` rather than in `unsigned char`.

Note that LLVM is able to vectorize this decently.
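
For reference, a minimal sketch of why the two forms are interchangeable: unsigned addition wraps modulo a power of two, so truncating the wide sum to `unsigned char` gives the same low 8 bits as accumulating in `unsigned char` from the start. The names `sum_wide`, `sum_narrow` and `check_uniform_inputs` are made up for this illustration and are not part of the report; the sketch assumes 8-bit `unsigned char`, as on x86_64.

```c
#include <assert.h>

/* Accumulates in unsigned int and truncates on return, like f above. */
unsigned char sum_wide(const unsigned char *src)
{
    unsigned sum = 0;
    for (int y = 0; y < 8; y++)
        sum += src[y];
    return (unsigned char)sum;
}

/* Accumulates in unsigned char throughout, like f0 above. */
unsigned char sum_narrow(const unsigned char *src)
{
    unsigned char sum = 0;
    for (int y = 0; y < 8; y++)
        sum = (unsigned char)(sum + src[y]);
    return sum;
}

/* Hypothetical self-check: both variants agree for every uniform input. */
void check_uniform_inputs(void)
{
    for (int v = 0; v < 256; v++) {
        unsigned char buf[8];
        for (int i = 0; i < 8; i++)
            buf[i] = (unsigned char)v;
        assert(sum_wide(buf) == sum_narrow(buf));
    }
}
```

This only checks a small set of inputs; the general argument is the modular-arithmetic one stated above.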


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations


* [Bug tree-optimization/114120] add reduction with promotion and then truncation poorly vectorized
  2024-02-26 18:20 [Bug tree-optimization/114120] New: add reduction with promotion and then truncation poorly vectorized pinskia at gcc dot gnu.org
@ 2024-02-27  8:26 ` rguenth at gcc dot gnu.org
  0 siblings, 0 replies; 2+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-02-27  8:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114120

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-02-27
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think I've seen a duplicate of this.  We lack a pass that replaces an IV
(a PHI) based on how it is used outside of the loop.  Basically, we fail
to treat PHIs transparently when folding conversions.  This _might_ be
something for IVCANON, since I think it doesn't really fit any other pass.
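
To illustrate why such a pass would have to look at the uses outside the loop, here is a rough sketch (not GCC code) of a case where narrowing the accumulator is not valid: the result depends on the high bits of the sum, so an 8-bit accumulator changes the answer. All function names are hypothetical and the sketch assumes 8-bit `unsigned char`.

```c
#include <assert.h>

/* The use after the loop is a division, which needs the high bits of sum. */
unsigned char average_wide(const unsigned char *src)
{
    unsigned sum = 0;
    for (int y = 0; y < 8; y++)
        sum += src[y];
    return (unsigned char)(sum / 8);
}

/* Deliberately wrong: the accumulator was narrowed although the use
   after the loop is not a plain truncation. */
unsigned char average_narrowed_wrong(const unsigned char *src)
{
    unsigned char sum = 0;
    for (int y = 0; y < 8; y++)
        sum = (unsigned char)(sum + src[y]);
    return (unsigned char)(sum / 8);
}

/* Hypothetical self-check showing that the two variants can differ. */
void check_narrowing_changes_result(void)
{
    unsigned char buf[8] = {200, 200, 200, 200, 200, 200, 200, 200};
    assert(average_wide(buf) != average_narrowed_wrong(buf));
}
```

With eight elements of 200 the wide sum is 1600, so the wide variant returns 200, while the narrowed accumulator wraps to 64 and yields 8. When the only use is a truncation, as in the original testcase, that difference disappears.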

It also came up in the context of:

int f (unsigned *src)
{
  int sum = 0;
  for (int y = 0; y < 8; y++)
    {
      sum += src[y];
    }
  return sum;
}

which we handle fine in vectorization, but the reduction could still be
done in 'unsigned' all the way through (and the conversion handling in
the vectorizer reduction code is somewhat ugly).
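
As a source-level sketch of what "unsigned all the way through" would mean here (hypothetical names, not the vectorizer's actual output): the accumulator is kept in `unsigned` and converted to `int` once at the end. Converting an out-of-range `unsigned` value to `int` is implementation-defined in C; GCC documents it as reduction modulo 2^N, which the comparison below relies on.

```c
#include <assert.h>

/* As in the testcase above: int accumulator, unsigned elements.
   Each += is computed in unsigned and converted back to int. */
int sum_via_int(const unsigned *src)
{
    int sum = 0;
    for (int y = 0; y < 8; y++)
        sum += src[y];
    return sum;
}

/* Accumulator kept in unsigned, with a single conversion at the end.
   The final conversion is implementation-defined when out of range. */
int sum_via_unsigned(const unsigned *src)
{
    unsigned sum = 0;
    for (int y = 0; y < 8; y++)
        sum += src[y];
    return (int)sum;
}

/* Hypothetical self-check on an input that stays in range for int. */
void check_small_inputs(void)
{
    unsigned buf[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    assert(sum_via_int(buf) == 36);
    assert(sum_via_unsigned(buf) == 36);
}
```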
