public inbox for gcc-bugs@sourceware.org
* [Bug target/93372] cris performance regressions due to de-cc0 work
From: cvs-commit at gcc dot gnu.org @ 2020-05-09  2:31 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93372

--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Hans-Peter Nilsson <hp@gcc.gnu.org>:

https://gcc.gnu.org/g:27228024598c3515389cdb378346433fb2c48551

commit r11-222-g27228024598c3515389cdb378346433fb2c48551
Author: Hans-Peter Nilsson <hp@axis.com>
Date:   Thu Jan 23 02:30:49 2020 +0100

    cris: Emit trivial btstq expected by gcc.target/cris/sync-2i.c, sync-2c.c

    As the added FIXME says, the new insn_and_split generates only a
    small subset of the bit-tests that can be matched by "*btst" and
    that were emitted by the undecc0rated cris.md at combine-time.
    It is nonetheless naturally separable from a general variant:
    it is just what's needed for the test-cases that were previously
    xfailed, and it requires no additional CCmodes.

    gcc:
            PR target/93372
            * config/cris/cris.md (zcond): New code_iterator.
            ("*cbranch<mode>4_btstq<CC>"): New insn_and_split.

* [Bug target/93372] cris performance regressions due to de-cc0 work
From: hp at gcc dot gnu.org @ 2020-05-09  3:46 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93372

--- Comment #3 from Hans-Peter Nilsson <hp at gcc dot gnu.org> ---
In https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545452.html I mentioned a
performance-regression with coremark, from 5227456 cycles (with cc0) to 5238564
(CC_REG), which is about 0.21%.
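
The percentage follows directly from the two cycle counts. A minimal sketch of that arithmetic, where the function name `slowdown_percent` is hypothetical and not part of any GCC or CoreMark tooling:

```c
#include <assert.h>

/* Relative slowdown, in percent, going from `before` to `after` cycles.
   Hypothetical helper for illustration only. */
double slowdown_percent(unsigned long before, unsigned long after)
{
    assert(before > 0);
    return 100.0 * ((double)after - (double)before) / (double)before;
}
```

Calling `slowdown_percent(5227456UL, 5238564UL)` corresponds to a difference of 11108 cycles over the cc0 baseline of 5227456.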

* [Bug target/93372] cris performance regressions due to de-cc0 work
From: cvs-commit at gcc dot gnu.org @ 2020-07-13  8:15 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93372

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Hans-Peter Nilsson <hp@gcc.gnu.org>:

https://gcc.gnu.org/g:ef07c7a5884c130b48e653993bfaaf1ae9e6dedd

commit r11-2048-gef07c7a5884c130b48e653993bfaaf1ae9e6dedd
Author: Hans-Peter Nilsson <hp@axis.com>
Date:   Wed Jul 8 23:59:12 2020 +0200

    cris: Use addi.b for additions where flags aren't inspected

    Comparing against the cc0 version of the CRIS port, I ran a few
    microbenchmarks, for example gcc.c-torture/execute/arith-rand.c,
    where there's sometimes an addition between an operation of
    interest and the test on the result.

    Unfortunately this patch doesn't remedy all of the performance
    regression for that program.  But this patch by itself helps
    and makes sense to commit separately: it puts lots of addi.b in
    previously empty delay-slots, with functions shortened by one or
    a few insns, in libgcc.  I ran into the reload-related caveat
    of % on constraints, which has long been "fixed"
    documentation-wise (almost 15 years ago;
    be3914df4cc8/r105517).  I removed an even older related FIXME.

    gcc:
            PR target/93372
            * config/cris/cris.md ("*add<mode>3_addi"): New splitter.
            ("*addi_b_<mode>"): New pattern.
            ("*addsi3<setnz>"): Remove stale %-related comment.

    gcc/testsuite:
            PR target/93372
            * gcc.target/cris/pr93372-45.c: New test.
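
The source-level shape described above, an addition sitting between an operation and the test on that operation's result, might look like the following. This is a hypothetical illustration, not code from arith-rand.c or pr93372-45.c, and whether the CRIS backend actually picks addi.b for the increment here is an assumption.

```c
/* Hypothetical example: return the index of the first position where
   a[i] == b[i], or -1 if there is none. */
int first_equal(const int *a, const int *b, int n)
{
    int i = 0;
    while (i < n) {
        int diff = a[i] - b[i];  /* operation of interest */
        i = i + 1;               /* addition whose flags nobody inspects */
        if (diff == 0)           /* test on the earlier result */
            return i - 1;
    }
    return -1;
}
```

If the increment clobbers the condition codes, the compiler needs an explicit compare of `diff` before the branch; an addition that leaves the flags alone lets the subtraction's flags survive until the test.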

* [Bug target/93372] cris performance regressions due to de-cc0 work
From: cvs-commit at gcc dot gnu.org @ 2020-07-13  8:15 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93372

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Hans-Peter Nilsson <hp@gcc.gnu.org>:

https://gcc.gnu.org/g:9a2ae08b02d185a11e3e525e100ba637ce81c7ff

commit r11-2050-g9a2ae08b02d185a11e3e525e100ba637ce81c7ff
Author: Hans-Peter Nilsson <hp@axis.com>
Date:   Sun Jul 12 18:41:25 2020 +0200

    cris: Add new pass eliminating compares after delay-slot-filling

    Delayed-branch-slot-filling, a.k.a. reorg or dbr, often creates
    opportunities for more compare-elimination than were visible to
    the cmpelim pass.  With cc0, these were caught by the
    elimination pass run in "final", thus the missed opportunities
    are a regression.  A simple reorg-aware pass run just after reorg
    handles most of them, if not all.  I chose to keep the "mach2"
    pass identifier string I copy-pasted from the SPARC port instead
    of inventing one like "postdbr_cmpelim".  Note the gap in numbers
    in the test-case file names.

    gcc:
            PR target/93372
            * config/cris/cris-passes.def: New file.
            * config/cris/t-cris (PASSES_EXTRA): Add cris-passes.def.
            * config/cris/cris.c: Add infrastructure bits and pass execute
            function cris_postdbr_cmpelim.
            * config/cris/cris-protos.h (make_pass_cris_postdbr_cmpelim):
            Declare.

    gcc/testsuite:
            * gcc.target/cris/pr93372-44.c, gcc.target/cris/pr93372-46.c: New.
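
The general idea of compare elimination can be sketched on a toy, straight-line instruction list. Everything below (the `toy_insn` type, the opcodes, `toy_cmpelim`) is hypothetical and is not the algorithm in cris_postdbr_cmpelim; it also ignores control flow and the fact that real arithmetic flags can differ from compare flags in carry and overflow.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_op {
    TOY_ARITH,     /* writes reg and sets flags from the result */
    TOY_MOVE,      /* writes reg, leaves flags untouched */
    TOY_CMP_ZERO,  /* sets flags from comparing reg with zero */
    TOY_BRANCH     /* reads flags, writes nothing */
};

struct toy_insn {
    enum toy_op op;
    int reg;        /* register operand; ignored for TOY_BRANCH */
    bool deleted;   /* set by the pass for redundant compares */
};

/* Mark compares whose result the flags already hold; return how many. */
size_t toy_cmpelim(struct toy_insn *insns, size_t n)
{
    int flags_reg = -1;  /* register the flags describe, -1 if unknown */
    size_t removed = 0;

    for (size_t i = 0; i < n; i++) {
        struct toy_insn *insn = &insns[i];
        switch (insn->op) {
        case TOY_ARITH:
            flags_reg = insn->reg;
            break;
        case TOY_MOVE:
            if (insn->reg == flags_reg)
                flags_reg = -1;      /* register changed, flags are stale */
            break;
        case TOY_CMP_ZERO:
            if (insn->reg == flags_reg) {
                insn->deleted = true;  /* flags already reflect this reg */
                removed++;
            } else {
                flags_reg = insn->reg;
            }
            break;
        case TOY_BRANCH:
            break;
        }
    }
    return removed;
}
```

In this model, a compare becomes removable only once a flag-setting instruction on the same register ends up before it with no intervening write, which is the kind of situation that instruction reordering can create after an earlier elimination pass has already run.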

* [Bug target/93372] cris performance regressions due to de-cc0 work
From: cvs-commit at gcc dot gnu.org @ 2020-07-16 23:53 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93372

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Segher Boessenkool <segher@gcc.gnu.org>:

https://gcc.gnu.org/g:84c5396d4bdbf9f1d628c77db4421808f9a9dcb6

commit r11-2185-g84c5396d4bdbf9f1d628c77db4421808f9a9dcb6
Author: Segher Boessenkool <segher@kernel.crashing.org>
Date:   Thu Jul 16 23:42:46 2020 +0000

    combine: Use single_set for is_just_move

    Since we now only call is_just_move on the original instructions, we
    always have an rtx_insn* (not just a pattern), so we can use single_set
    on it.  This makes no detectable difference at all on all thirty Linux
    targets I test, but it does help cris, and it is simpler, cleaner code
    anyway.

    2020-07-16  Hans-Peter Nilsson  <hp@axis.com>
                Segher Boessenkool  <segher@kernel.crashing.org>

            PR target/93372
            * combine.c (is_just_move): Take an rtx_insn* as argument.  Use
            single_set on it.
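
A toy model may help show why looking for a single SET in the whole insn can matter on a target like CRIS. The types and functions below are hypothetical stand-ins, not GCC's rtx API, and two things are assumptions on my part: that a check on the bare pattern only accepts a lone SET, and that a post-cc0 move on CRIS is a SET paired with a clobber of the flags register. GCC's real single_set also considers sets whose results are unused, which this sketch omits.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_SET, TOY_CLOBBER };

struct toy_elem {
    enum toy_kind kind;
    bool src_is_general_operand;  /* only meaningful for TOY_SET */
};

/* n == 1 models a bare pattern, n > 1 models a PARALLEL. */
struct toy_insn {
    const struct toy_elem *elems;
    size_t n;
};

/* Pattern-only check: succeeds only for a lone SET. */
bool is_just_move_pattern(const struct toy_insn *insn)
{
    return insn->n == 1 && insn->elems[0].kind == TOY_SET
           && insn->elems[0].src_is_general_operand;
}

/* Toy single_set: the only SET, ignoring clobbers; NULL otherwise. */
const struct toy_elem *toy_single_set(const struct toy_insn *insn)
{
    const struct toy_elem *found = NULL;
    for (size_t i = 0; i < insn->n; i++) {
        if (insn->elems[i].kind != TOY_SET)
            continue;
        if (found != NULL)
            return NULL;  /* more than one SET */
        found = &insn->elems[i];
    }
    return found;
}

/* Insn-based check built on the toy single_set. */
bool is_just_move_insn(const struct toy_insn *insn)
{
    const struct toy_elem *set = toy_single_set(insn);
    return set != NULL && set->src_is_general_operand;
}
```

Under these assumptions, a move that also clobbers the flags is rejected by the pattern-only check but accepted by the insn-based one, while a plain lone SET is accepted by both.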

* [Bug target/93372] cris performance regressions due to de-cc0 work
From: cvs-commit at gcc dot gnu.org @ 2020-08-24  1:15 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93372

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Hans-Peter Nilsson <hp@gcc.gnu.org>:

https://gcc.gnu.org/g:0e6c51de8ec47bf5f0dfaabfd1898c722d0485b4

commit r11-2814-g0e6c51de8ec47bf5f0dfaabfd1898c722d0485b4
Author: Hans-Peter Nilsson <hp@axis.com>
Date:   Mon Aug 24 03:15:21 2020 +0200

    reorg.c (fill_slots_from_thread): Improve for TARGET_FLAGS_REGNUM

    This handles TARGET_FLAGS_REGNUM clobbering insns as delay-slot
    fillers using a method similar to that in commit 33c2207d3fda,
    where care was taken for fill_simple_delay_slots to allow such
    insns when scanning for delay-slot fillers *backwards* (before
    the insn).

    A TARGET_FLAGS_REGNUM target is typically a former cc0 target.
    For cc0 targets, insns don't mention clobbering cc0, so the
    clobbers are mentioned in the "resources" only as a special
    entity and only for compare-insns and branches, where the cc0
    value matters.

    In contrast, with TARGET_FLAGS_REGNUM, most insns clobber it and
    the register liveness detection in reorg.c / resource.c treats
    that as a blocker (for other insns mentioning it, i.e. most)
    when looking for delay-slot-filling candidates.  This means that
    when comparing code and performance for a delay-slot cc0 target
    before and after the de-cc0 conversion, the inability to fill a
    delay slot after conversion manifests as a regression.  This was
    one such case, for CRIS, with random_bitstring in
    gcc.c-torture/execute/arith-rand-ll.c as well as the target
    libgcc division function.

    After this, all known performance regressions compared to cc0
    are fixed.

    gcc:
            PR target/93372
            * reorg.c (fill_slots_from_thread): Allow trial insns that clobber
            TARGET_FLAGS_REGNUM as delay-slot fillers.

    gcc/testsuite:
            PR target/93372
            * gcc.target/cris/pr93372-47.c: New test.
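
The blocker described above can be modelled with register bitmasks. This is a hedged sketch only: `toy_cand`, `TOY_FLAGS`, `blocks_strict` and `blocks_relaxed` are hypothetical names, and the relaxed rule shown is my simplification of "allow trial insns that clobber TARGET_FLAGS_REGNUM", not the logic actually added to fill_slots_from_thread.

```c
#include <stdbool.h>

#define TOY_REG(n)  (1u << (n))
#define TOY_FLAGS   (1u << 31)   /* hypothetical bit for the flags register */

struct toy_cand {
    unsigned sets;               /* registers the candidate writes */
    unsigned uses;               /* registers the candidate reads */
    bool flags_result_unused;    /* its flags write is a dead clobber */
};

/* Strict rule: any overlap with what other insns set or need blocks. */
bool blocks_strict(const struct toy_cand *c, unsigned set, unsigned needed)
{
    return (c->sets & (set | needed)) != 0 || (c->uses & set) != 0;
}

/* Relaxed rule: a dead flags clobber is ignored when nobody needs flags. */
bool blocks_relaxed(const struct toy_cand *c, unsigned set, unsigned needed)
{
    unsigned sets = c->sets;
    if (c->flags_result_unused && (needed & TOY_FLAGS) == 0)
        sets &= ~TOY_FLAGS;
    return (sets & (set | needed)) != 0 || (c->uses & set) != 0;
}
```

With a candidate that writes one ordinary register plus the flags, and neighbouring insns that also clobber the flags, the strict rule reports a conflict on the flags bit alone; the relaxed rule lets the candidate through unless the flags value is actually needed.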

* [Bug target/93372] cris performance regressions due to de-cc0 work
From: jakub at gcc dot gnu.org @ 2021-04-27 11:38 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93372

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|11.0                        |11.2

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 11.1 has been released, retargeting bugs to GCC 11.2.

* [Bug target/93372] cris performance regressions due to de-cc0 work
From: hp at gcc dot gnu.org @ 2021-04-27 14:54 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93372

Hans-Peter Nilsson <hp at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #9 from Hans-Peter Nilsson <hp at gcc dot gnu.org> ---
Whoops, this should be closed.  All observed related regressions were fixed for
gcc-11.
