public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/102058] New: 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%)
@ 2021-08-25  8:41 jamborm at gcc dot gnu.org
  2021-08-25  9:56 ` [Bug tree-optimization/102058] " marxin at gcc dot gnu.org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: jamborm at gcc dot gnu.org @ 2021-08-25  8:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102058

            Bug ID: 102058
           Summary: 450.soplex regressed on x86_64 with -Ofast
                    -march=generic (by 8-15%)
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamborm at gcc dot gnu.org
                CC: rguenth at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---

All three LNT x86_64 testers have experienced a regression when
running SPECFP 2006 benchmark 450.soplex compiled with -Ofast
-march=generic (as opposed to -march=native builds which seem not to
be affected).

A znver2 machine regressed by 15%:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=414.150.0&plot.1=300.150.0&

A znver1 machine regressed by 8%:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=451.150.0&plot.1=27.150.0&

An Intel Kabylake machine regressed by 9%:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=429.150.0&plot.1=25.150.0&

I have bisected the regression (on another znver2 machine) to revision
r12-2733-g31855ba6b16:

31855ba6b16cd138d7484076a08cd40d609654b8 is the first bad commit
commit 31855ba6b16cd138d7484076a08cd40d609654b8
Author: Richard Biener <rguenther@suse.de>
Date:   Thu Jul 29 14:14:48 2021 +0200

    Add emulated gather capability to the vectorizer

    This adds a gather vectorization capability to the vectorizer
    without target support by decomposing the offset vector, doing
    scalar loads and then building a vector from the result.  This
    is aimed mainly at cases where vectorizing the rest of the loop
    offsets the cost of vectorizing the gather.
    [...]
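
To illustrate the decomposition the commit message describes, here is a
minimal sketch of an emulated gather written at source level.  The two-lane
VecD/VecI types and the name emulated_gather are assumptions made for this
example only; they are not GCC internals, and the vectorizer applies the
transformation to its internal representation rather than to source code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 2-lane "vector" types, used only for illustration.
using VecD = std::array<double, 2>;
using VecI = std::array<int, 2>;

// Emulated gather: take the offset vector apart lane by lane, do one scalar
// load per lane, then build the result vector from the loaded values.
inline VecD emulated_gather(const double* base, const VecI& offsets)
{
    VecD result{};
    for (std::size_t lane = 0; lane < offsets.size(); ++lane) {
        assert(offsets[lane] >= 0);          // illustration assumes valid offsets
        result[lane] = base[offsets[lane]];  // scalar load for this lane
    }
    return result;
}
```

The point of the sketch is that the loads themselves stay scalar; any gain
has to come from vectorizing the arithmetic around them.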


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)


* [Bug tree-optimization/102058] 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%)
  2021-08-25  8:41 [Bug tree-optimization/102058] New: 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%) jamborm at gcc dot gnu.org
@ 2021-08-25  9:56 ` marxin at gcc dot gnu.org
  2021-08-25 10:00 ` rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-08-25  9:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102058

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-08-25
             Status|UNCONFIRMED                 |NEW
                 CC|                            |marxin at gcc dot gnu.org


* [Bug tree-optimization/102058] 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%)
  2021-08-25  8:41 [Bug tree-optimization/102058] New: 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%) jamborm at gcc dot gnu.org
  2021-08-25  9:56 ` [Bug tree-optimization/102058] " marxin at gcc dot gnu.org
@ 2021-08-25 10:00 ` rguenth at gcc dot gnu.org
  2021-08-26  9:59 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-08-25 10:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102058

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 51355
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51355&action=edit
patch to add --param vect-emulated-gather for debugging

With the attached patch, setting the parameter to 0 vs. 1 on zen2 does indeed
reproduce the regression:

                                  Estimated                       Estimated
                Base     Base       Base        Peak     Peak       Peak
Benchmarks      Ref.   Run Time     Ratio       Ref.   Run Time     Ratio
-------------- ------  ---------  ---------    ------  ---------  ---------
450.soplex       8340        120       69.5 *    8340        136       61.4 *  

trying to nail it down now.


* [Bug tree-optimization/102058] 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%)
  2021-08-25  8:41 [Bug tree-optimization/102058] New: 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%) jamborm at gcc dot gnu.org
  2021-08-25  9:56 ` [Bug tree-optimization/102058] " marxin at gcc dot gnu.org
  2021-08-25 10:00 ` rguenth at gcc dot gnu.org
@ 2021-08-26  9:59 ` rguenth at gcc dot gnu.org
  2021-10-29 13:05 ` [Bug tree-optimization/102058] [12 regression] " hubicka at gcc dot gnu.org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-08-26  9:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102058

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
perf isn't particularly helpful, pointing at differences where no differences
in assembly occur.  But we do now vectorize soplex::SPxSteepPR::entered4, in
particular soplex::Vector::operator*, which is

   /// inner product.
   Real operator*(const Vector& w) const
   {
      Real x = 0;
      int n = size();
      Element* e = m_elem;

      while (n--)
      {
         x += e->val * w[e->idx];
         e++;
      }
      return x;
   }

and the w[e->idx] load in e->val * w[e->idx] is the gather we now handle.
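
For readers who want to experiment outside of SPEC, the following is a
self-contained sketch of the same loop shape, next to a hand-written two-lane
variant that mimics what vectorizing with an emulated gather amounts to.  The
Element struct and both function names are stand-ins invented for this
example, not the SoPlex definitions, and the code GCC actually generates may
differ in vector width and unrolling.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the SoPlex element type; only the fields the
// loop uses are modelled.
struct Element { double val; int idx; };

// Scalar form, mirroring the loop quoted above.
double dot_scalar(const std::vector<Element>& e, const std::vector<double>& w)
{
    double x = 0;
    for (const Element& el : e) {
        assert(el.idx >= 0 && static_cast<std::size_t>(el.idx) < w.size());
        x += el.val * w[el.idx];             // w[el.idx] is the indexed load
    }
    return x;
}

// Two-lane form: each w[idx] is still a scalar load, but the multiply and
// the accumulation happen per lane, and the lanes are summed at the end.
double dot_two_lanes(const std::vector<Element>& e, const std::vector<double>& w)
{
    double acc[2] = {0, 0};
    std::size_t i = 0;
    for (; i + 2 <= e.size(); i += 2) {
        const double g0 = w[e[i].idx];       // emulated gather, lane 0
        const double g1 = w[e[i + 1].idx];   // emulated gather, lane 1
        acc[0] += e[i].val * g0;
        acc[1] += e[i + 1].val * g1;
    }
    double x = acc[0] + acc[1];              // final reduction across lanes
    for (; i < e.size(); ++i)                // scalar epilogue for the remainder
        x += e[i].val * w[e[i].idx];
    return x;
}
```

Note that the two-lane form adds the products in a different order than the
scalar loop, which is only permitted under flags such as -Ofast.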

Other parts perf points out are once again not vectorized :/


* [Bug tree-optimization/102058] [12 regression] 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%)
  2021-08-25  8:41 [Bug tree-optimization/102058] New: 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%) jamborm at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-08-26  9:59 ` rguenth at gcc dot gnu.org
@ 2021-10-29 13:05 ` hubicka at gcc dot gnu.org
  2021-11-02  6:52 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: hubicka at gcc dot gnu.org @ 2021-10-29 13:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102058

Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org
            Summary|450.soplex regressed on     |[12 regression] 450.soplex
                   |x86_64 with -Ofast          |regressed on x86_64 with
                   |-march=generic (by 8-15%)   |-Ofast -march=generic (by
                   |                            |8-15%)

--- Comment #3 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
This still shows up when comparing trunk to GCC 11 on LNT, so marking as a
regression.


* [Bug tree-optimization/102058] [12 regression] 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%)
  2021-08-25  8:41 [Bug tree-optimization/102058] New: 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%) jamborm at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2021-10-29 13:05 ` [Bug tree-optimization/102058] [12 regression] " hubicka at gcc dot gnu.org
@ 2021-11-02  6:52 ` rguenth at gcc dot gnu.org
  2022-01-20 10:33 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-11-02  6:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102058

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0


* [Bug tree-optimization/102058] [12 regression] 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%)
  2021-08-25  8:41 [Bug tree-optimization/102058] New: 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%) jamborm at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2021-11-02  6:52 ` rguenth at gcc dot gnu.org
@ 2022-01-20 10:33 ` rguenth at gcc dot gnu.org
  2022-02-10 10:18 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-01-20 10:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102058

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
This only shows with generic arch/tune combined with -Ofast, which is not the
most interesting combination.  Still, we should extract a testcase for the
reduction and gather runtime data on the distribution of size().  When
vectorized, the loop might also grow from nicely small to slightly too big
for efficient cross-iteration out-of-order (OOO) scheduling.
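
One possible way to collect the size() distribution mentioned above is to
record the trip count on every entry to the loop and inspect the histogram
afterwards.  The class below is a sketch under that assumption;
TripCountHistogram and its members are hypothetical names and exist in
neither SoPlex nor GCC.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical instrumentation helper: counts how often each trip count
// (the value of size() on entry to the loop) occurs.
class TripCountHistogram {
public:
    void record(int n)
    {
        assert(n >= 0);
        ++counts_[n];
        ++total_;
    }

    // Fraction of recorded calls whose trip count is below `threshold`.
    double fraction_below(int threshold) const
    {
        if (total_ == 0) return 0.0;
        std::size_t below = 0;
        for (const auto& entry : counts_) {
            if (entry.first >= threshold) break;  // std::map iterates in key order
            below += entry.second;
        }
        return static_cast<double>(below) / static_cast<double>(total_);
    }

    std::size_t total() const { return total_; }

private:
    std::map<int, std::size_t> counts_;
    std::size_t total_ = 0;
};
```

Calling record(n) right after n is initialized in the loop, then querying
fraction_below with the vectorization factor, would indicate how often the
vector body is never entered.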


* [Bug tree-optimization/102058] [12 regression] 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%)
  2021-08-25  8:41 [Bug tree-optimization/102058] New: 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%) jamborm at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2022-01-20 10:33 ` rguenth at gcc dot gnu.org
@ 2022-02-10 10:18 ` rguenth at gcc dot gnu.org
  2022-05-06  8:30 ` [Bug tree-optimization/102058] [12/13 " jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-10 10:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102058

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
It was observed that 450.soplex needs more iterations to converge.  We now
vectorize a reduction that we didn't vectorize before, and that changes the
order of the FP additions, which can definitely impact precision.  But this
is expected with -Ofast.
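
The precision effect can be shown in isolation with a small sketch that sums
the same values once in source order and once with two partial sums, as a
two-lane vectorized reduction would.  The function names and the input values
are chosen for this example and are not taken from 450.soplex; the behaviour
described assumes IEEE 754 double arithmetic without excess precision.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum in source order, as the scalar loop does.
double sum_in_order(const std::vector<double>& v)
{
    double x = 0;
    for (double d : v) x += d;
    return x;
}

// Sum with two interleaved partial sums, as a 2-lane vectorized reduction
// does; this reassociates the additions.
double sum_two_lanes(const std::vector<double>& v)
{
    double acc[2] = {0, 0};
    std::size_t i = 0;
    for (; i + 2 <= v.size(); i += 2) {
        acc[0] += v[i];
        acc[1] += v[i + 1];
    }
    double x = acc[0] + acc[1];
    for (; i < v.size(); ++i) x += v[i];  // scalar epilogue
    return x;
}
```

For the input {1e16, 1.0, -1e16, 1.0}, the in-order sum loses the first 1.0
when it is added to 1e16, while the two-lane sum cancels the large values
within one lane and keeps both small ones, so the two functions return
different results for identical data.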


* [Bug tree-optimization/102058] [12/13 regression] 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%)
  2021-08-25  8:41 [Bug tree-optimization/102058] New: 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%) jamborm at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2022-02-10 10:18 ` rguenth at gcc dot gnu.org
@ 2022-05-06  8:30 ` jakub at gcc dot gnu.org
  2022-07-26 13:30 ` rguenth at gcc dot gnu.org
  2023-05-08 12:22 ` [Bug tree-optimization/102058] [12/13/14 " rguenth at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-05-06  8:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102058

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|12.0                        |12.2

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 12.1 is being released, retargeting bugs to GCC 12.2.


* [Bug tree-optimization/102058] [12/13 regression] 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%)
  2021-08-25  8:41 [Bug tree-optimization/102058] New: 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%) jamborm at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2022-05-06  8:30 ` [Bug tree-optimization/102058] [12/13 " jakub at gcc dot gnu.org
@ 2022-07-26 13:30 ` rguenth at gcc dot gnu.org
  2023-05-08 12:22 ` [Bug tree-optimization/102058] [12/13/14 " rguenth at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-07-26 13:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102058

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |SUSPENDED

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
See comment #5.


* [Bug tree-optimization/102058] [12/13/14 regression] 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%)
  2021-08-25  8:41 [Bug tree-optimization/102058] New: 450.soplex regressed on x86_64 with -Ofast -march=generic (by 8-15%) jamborm at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2022-07-26 13:30 ` rguenth at gcc dot gnu.org
@ 2023-05-08 12:22 ` rguenth at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-05-08 12:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102058

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|12.3                        |12.4

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 12.3 is being released, retargeting bugs to GCC 12.4.
