[Bug target/110024] New: [Bug] 5% performance drop on important benchmark after r260951.

* [Bug target/110024] New: [Bug] 5% performance drop on important benchmark after r260951.
From: d_vampile at 163 dot com @ 2023-05-29 14:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110024

            Bug ID: 110024
           Summary: [Bug] 5% performance drop on important benchmark after
                    r260951.
           Product: gcc
           Version: 10.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: d_vampile at 163 dot com
  Target Milestone: ---

After r260951 was committed, the performance of the Copy kernel in the STREAM
benchmark drops by 3% on AArch64.

The benchmark source can be obtained from
https://github.com/jeffhammond/stream/archive/master.zip.

Compiling & Running:
gcc -fopenmp -O -DSTREAM_ARRAY_SIZE=100000000 stream.c  -o stream
./stream

Before the modification (Copy kernel):
ldr d0, [x2, x0, lsl #3]
str d0, [x3, x0, lsl #3]
add x0, x0, #0x1
cmp x1, x0
b.ne 400a00 <main._omp_fn.4+0x54>
ldr x19, [sp, #16]
ldp x29, x30, [sp], #32
ret

After the modification:
ldr x2, [x3, x0, lsl #3]
str x2, [x4, x0, lsl #3]
add x0, x0, #0x1
cmp x1, x0
b.ne 400a00 <main._omp_fn.4+0x54>
ldr x19, [sp, #16]
ldp x29, x30, [sp], #32
ret
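
For reference, the loop in both listings does one 64-bit load and one 64-bit
store per iteration, which is what a plain element-wise copy of doubles looks
like. A minimal sketch of such a kernel follows; the names stream_copy and
stream_copy_check and their signatures are illustrative assumptions, not taken
from stream.c. Because the element is only moved and never used in arithmetic,
the compiler is free to carry it in either an FP register (d0) or a
general-purpose register (x2).

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of a STREAM-style Copy kernel: c[j] = a[j] for every element.
   Function name and signature are hypothetical, for illustration only. */
void stream_copy(double *restrict c, const double *restrict a, size_t n)
{
    /* With -fopenmp GCC outlines this loop into a separate *._omp_fn.N
       function; without -fopenmp the pragma is simply ignored. */
    #pragma omp parallel for
    for (size_t j = 0; j < n; j++)
        c[j] = a[j];   /* one 64-bit load, one 64-bit store */
}

/* Small self-check: aborts via assert if the copy is not exact. */
void stream_copy_check(void)
{
    enum { N = 16 };
    double a[N], c[N];

    for (size_t j = 0; j < N; j++) {
        a[j] = 1.0 + (double)j;
        c[j] = 0.0;
    }
    stream_copy(c, a, N);
    for (size_t j = 0; j < N; j++)
        assert(c[j] == a[j]);
}
```

Compiling a file like this with the same options as above plus -S, once with
each compiler revision, should make it possible to compare the register choice
in the loop without building the whole benchmark.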

* [Bug target/110024] [Bug] 5% performance drop on important benchmark after r260951.
From: d_vampile at 163 dot com @ 2023-05-29 14:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110024

--- Comment #1 from d_vampile <d_vampile at 163 dot com> ---
As the listings show, an FP/SIMD register (D0) is used before the modification,
and a general-purpose register (X2) is used after the modification.

* [Bug target/110024] [Bug] 5% performance drop on important benchmark after r260951.
From: pinskia at gcc dot gnu.org @ 2023-05-29 14:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110024

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2023-05-29
     Ever confirmed|0                           |1

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Which core is showing the difference here?
Because some cores I know of, loading/storing using the FP registers is
actually one cycle slower than using GPRs.

* [Bug target/110024] [Bug] 5% performance drop on important benchmark after r260951.
From: d_vampile at 163 dot com @ 2023-05-29 15:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110024

d_vampile <d_vampile at 163 dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |INVALID

* [Bug target/110024] [Bug] 5% performance drop on important benchmark after r260951.
From: d_vampile at 163 dot com @ 2023-05-29 15:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110024

--- Comment #3 from d_vampile <d_vampile at 163 dot com> ---
(In reply to Andrew Pinski from comment #2)
> Which core is showing the difference here?
> Because some cores I know of, loading/storing using the FP registers is
> actually one cycle slower than using GPRs.
Yes, you're right. This report is invalid because I carelessly posted the
assembly listings in the wrong places: performance is better with the code
generated before the modification, which uses the general-purpose register.
The question, however, is why the modification makes the compiler select D0,
which degrades performance. I will follow up in a new report and look forward
to your reply.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110026
