public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/102435] New: gcc 9: aarch64 -ftree-loop-vectorize results in wrong code
@ 2021-09-21 19:40 dimitry@unified-streaming.com
  2024-02-28  7:56 ` [Bug tree-optimization/102435] " pinskia at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: dimitry@unified-streaming.com @ 2021-09-21 19:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102435

            Bug ID: 102435
           Summary: gcc 9: aarch64 -ftree-loop-vectorize results in wrong
                    code
           Product: gcc
           Version: 9.4.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dimitry@unified-streaming.com
  Target Milestone: ---

We noticed a problem with a loop optimization enabled by -O3 on a program
targeting AArch64. It turns out that this problem is specifically caused by
-ftree-loop-vectorize, and has actually been fixed by (or as a side-effect of)
commit https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=c89366b12ff4f362
("[AArch64] Support vectorising with multiple vector sizes") by Richard
Sandiford.

However, this commit landed on master during gcc 10 development, so while the
problem does not occur with gcc 10.x and 11.x, it *does* occur with 9.x. In our
particular case, the affected compiler is the default on Ubuntu 20.04 for
arm64, i.e. gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04).

Reduced test case:

// g++ -std=c++17 -O2 -ftree-loop-vectorize testcase.cpp
// or
// g++ -std=c++17 -O3 testcase.cpp

#include <cassert>
#include <cstdint>
#include <iostream>
#include <utility>
#include <vector>

struct sample_t
{
  sample_t(uint64_t dts, uint32_t duration)
  : dts_(dts)
  , duration_(duration)
  , cto_(0)
  , sample_description_index_(0)
  , pos_(0)
  , size_(0)
  , flags_(0)
  , aux_pos_(0)
  , aux_size_(0)
  {
  }

  uint64_t dts_;
  uint32_t duration_;
  int32_t cto_;
  uint32_t sample_description_index_;
  uint64_t pos_;
  uint32_t size_;
  uint32_t flags_;
  uint64_t aux_pos_;
  uint32_t aux_size_;
};

typedef std::vector<sample_t> samples_t;

__attribute__((__noinline__))
samples_t get_result(samples_t&& samples)
{
  uint64_t base_media_decode_time = ~0;

  auto first = samples.begin();
  auto last = samples.end();
  if(first != last)
  {
    base_media_decode_time = first->dts_;

    uint32_t duration = 0;
    for(--last; first != last; ++first)
    {
      duration = static_cast<uint32_t>(first[1].dts_ - first->dts_);

      first->duration_ = duration;
    }

    first->duration_ = duration;
  }

  return samples;
}

int main(void)
{
  samples_t samples_in = { {0, 3}, {3, 3}, {6, 3}, {9, 1}, {10, 2} };
  samples_t samples_out = get_result(std::move(samples_in));

  for(sample_t sample : samples_out)
  {
    std::cout << sample.dts_ << ", " << sample.duration_ << '\n';
  }

  // Expected output:
  // 0, 3
  // 3, 3
  // 6, 3
  // 9, 1
  // 10, 1
  //
  // Bad output:
  // 0, 3
  // 3, 0
  // 6, 0
  // 9, 0
  // 10, 0

  return 0;
}

Note that it appears vital that the struct sample_t is fairly large: removing
all of the members after the first two makes the output correct, even with
gcc 9 and -ftree-loop-vectorize. I have not determined precisely what the
cutoff size is.
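
To make the size remark concrete, here is a minimal sketch that exposes the
element size and the offset of duration_ for a struct with the same members as
sample_t. The names sample_layout_t, dts_stride and duration_offset are
hypothetical and not part of the test case. On a typical LP64 ABI such as
AArch64 Linux the size should come to 56 bytes, so consecutive dts_ loads are
56 bytes apart while only 12 bytes of each element are touched. Whether that
stride is what matters for the miscompile is a guess, not something
established in this report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in with the same data members as sample_t from the
// test case; the constructor is left out because it does not affect layout.
struct sample_layout_t
{
  std::uint64_t dts_;
  std::uint32_t duration_;
  std::int32_t cto_;
  std::uint32_t sample_description_index_;
  std::uint64_t pos_;
  std::uint32_t size_;
  std::uint32_t flags_;
  std::uint64_t aux_pos_;
  std::uint32_t aux_size_;
};

// Distance in bytes between the dts_ fields of two consecutive vector
// elements, i.e. the stride of the loads in the loop.
constexpr std::size_t dts_stride = sizeof(sample_layout_t);

// Byte offset of the field the loop stores to.
constexpr std::size_t duration_offset = offsetof(sample_layout_t, duration_);

static_assert(offsetof(sample_layout_t, dts_) == 0,
              "dts_ is expected to be the first member");
```

Printing dts_stride on the affected target, and again after removing members,
would be one way to start narrowing down the cutoff size.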


* [Bug tree-optimization/102435] gcc 9: aarch64 -ftree-loop-vectorize results in wrong code
  2021-09-21 19:40 [Bug tree-optimization/102435] New: gcc 9: aarch64 -ftree-loop-vectorize results in wrong code dimitry@unified-streaming.com
@ 2024-02-28  7:56 ` pinskia at gcc dot gnu.org
  2024-02-28  8:09 ` pinskia at gcc dot gnu.org
  2024-02-29 15:19 ` dimitry@unified-streaming.com
  2 siblings, 0 replies; 4+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-02-28  7:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102435

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|9.4.1                       |9.3.0
      Known to work|                            |8.5.0, 9.4.0, 9.5.0

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So I can reproduce it with GCC 9.3.0 but not with GCC 9.4.0. It also works with
GCC 9.5.0.
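
For projects that want to flag a possibly affected compiler at build time, a
rough sketch of a version guard follows. It assumes that every 9.x release
before 9.4 on AArch64 is suspect; only 9.3.0 was actually confirmed as failing
here, so that range is an assumption. The names is_suspect_gcc and
build_is_suspect are hypothetical. The macros __GNUC__, __GNUC_MINOR__,
__clang__ and __aarch64__ are the usual predefined ones.

```cpp
#include <cassert>

// Returns true for the compiler range assumed to be affected:
// GCC 9.0 through 9.3 when targeting AArch64. (Assumption: only 9.3.0
// was confirmed as failing; 9.4.0 and 9.5.0 were confirmed as working.)
constexpr bool is_suspect_gcc(int major, int minor, bool targets_aarch64)
{
  return targets_aarch64 && major == 9 && minor < 4;
}

// Evaluate the check for the compiler building this translation unit.
// Clang also defines __GNUC__, so it is excluded explicitly.
#if defined(__GNUC__) && !defined(__clang__) && defined(__aarch64__)
constexpr bool build_is_suspect =
    is_suspect_gcc(__GNUC__, __GNUC_MINOR__, true);
#else
constexpr bool build_is_suspect = false;
#endif
```

A build could then emit a warning, or add -fno-tree-loop-vectorize, when
build_is_suspect is true.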


* [Bug tree-optimization/102435] gcc 9: aarch64 -ftree-loop-vectorize results in wrong code
  2021-09-21 19:40 [Bug tree-optimization/102435] New: gcc 9: aarch64 -ftree-loop-vectorize results in wrong code dimitry@unified-streaming.com
  2024-02-28  7:56 ` [Bug tree-optimization/102435] " pinskia at gcc dot gnu.org
@ 2024-02-28  8:09 ` pinskia at gcc dot gnu.org
  2024-02-29 15:19 ` dimitry@unified-streaming.com
  2 siblings, 0 replies; 4+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-02-28  8:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102435

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|9.4.1                       |9.3.0
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97236
             Status|UNCONFIRMED                 |RESOLVED
   Target Milestone|---                         |9.4
         Resolution|---                         |FIXED

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So this looks like another testcase for PR 97236.


      duration = static_cast<uint32_t>(first[1].dts_ - first->dts_);

      first->duration_ = duration;

is getting incorrectly vectorized, even though only dts_ is read here and not
the rest of the struct's members.

So closing as fixed.
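
For anyone re-checking a compiler build, the result the quoted loop is supposed
to produce can be verified mechanically instead of by comparing printed output.
The sketch below is a hypothetical helper, not part of the original test case:
timed_sample_t is a two-field stand-in and durations_consistent is an invented
name. Since the report notes that a struct with only two members did not
reproduce the problem, the same comparison would need to be applied to the full
sample_t to exercise the bug.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical two-field stand-in for sample_t.
struct timed_sample_t
{
  std::uint64_t dts_;
  std::uint32_t duration_;
};

// Checks the postcondition of the loop in get_result(): every element but
// the last has duration_ equal to the gap to the next dts_, and the last
// element repeats the duration of the one before it. A single element ends
// up with duration 0, because the loop body never runs in that case.
inline bool durations_consistent(const std::vector<timed_sample_t>& samples)
{
  if (samples.empty())
    return true;
  if (samples.size() == 1)
    return samples.front().duration_ == 0;

  for (std::size_t i = 0; i + 1 < samples.size(); ++i)
  {
    const std::uint32_t expected =
        static_cast<std::uint32_t>(samples[i + 1].dts_ - samples[i].dts_);
    if (samples[i].duration_ != expected)
      return false;
  }
  return samples.back().duration_ == samples[samples.size() - 2].duration_;
}
```

Applied to the two listings in the test case, the expected output satisfies
this check and the bad output does not.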


* [Bug tree-optimization/102435] gcc 9: aarch64 -ftree-loop-vectorize results in wrong code
  2021-09-21 19:40 [Bug tree-optimization/102435] New: gcc 9: aarch64 -ftree-loop-vectorize results in wrong code dimitry@unified-streaming.com
  2024-02-28  7:56 ` [Bug tree-optimization/102435] " pinskia at gcc dot gnu.org
  2024-02-28  8:09 ` pinskia at gcc dot gnu.org
@ 2024-02-29 15:19 ` dimitry@unified-streaming.com
  2 siblings, 0 replies; 4+ messages in thread
From: dimitry@unified-streaming.com @ 2024-02-29 15:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102435

--- Comment #3 from Dimitry Andric <dimitry@unified-streaming.com> ---
Note, in the meantime Ubuntu updated their default gcc version for Ubuntu
20.04 to 9.4.0:

https://packages.ubuntu.com/focal-updates/devel/gcc-9

so this issue won't be encountered there anymore. Thanks.

