public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed
* [gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [committed] [RISC-V] Fix detection of store pair fusion cases
@ 2024-05-01 17:32 Jeff Law
0 siblings, 0 replies; 4+ messages in thread
From: Jeff Law @ 2024-05-01 17:32 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:b00f7226022a259211796781c9efb61645ad30a2
commit b00f7226022a259211796781c9efb61645ad30a2
Author: Jeff Law <jlaw@ventanamicro.com>
Date: Wed May 1 11:28:41 2024 -0600
[committed] [RISC-V] Fix detection of store pair fusion cases
We've got the ability to count the number of store pair fusions happening in
the front-end of the pipeline. When comparing some code from last year vs the
current trunk we saw a fairly dramatic drop.
The problem is the store pair fusion detection code was actively harmful due to
a minor bug in checking offsets. So instead of pairing up 8 byte stores such
as sp+0 with sp+8, it tried to pair up sp+8 and sp+16.
Given uarch sensitivity I didn't try to pull together a testcase. But we could
certainly see the undesirable behavior in benchmarks as simplistic as dhrystone
up through spec2017.
Anyway, bootstrapped a while back. Also verified through our performance
counters that store pair fusion rates are back up. Regression tested with
crosses a few minutes ago.
gcc/
* config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Break out
tests for easier debugging in store pair fusion case. Fix offset
check in same.
(cherry picked from commit fad93e7617ce1aafb006983a71b6edc9ae1eb2d1)
Diff:
---
gcc/config/riscv/riscv.cc | 57 ++++++++++++++++++++++++++++++-----------------
1 file changed, 37 insertions(+), 20 deletions(-)
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 0f62b295b96..24d1ead3902 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8874,26 +8874,43 @@ riscv_macro_fusion_pair_p (rtx_insn *prev, rtx_insn *curr)
extract_base_offset_in_addr (SET_DEST (prev_set), &base_prev, &offset_prev);
extract_base_offset_in_addr (SET_DEST (curr_set), &base_curr, &offset_curr);
- /* The two stores must be contained within opposite halves of the same
- 16 byte aligned block of memory. We know that the stack pointer and
- the frame pointer have suitable alignment. So we just need to check
- the offsets of the two stores for suitable alignment.
-
- Originally the thought was to check MEM_ALIGN, but that was reporting
- incorrect alignments, even for SP/FP accesses, so we gave up on that
- approach. */
- if (base_prev != NULL_RTX
- && base_curr != NULL_RTX
- && REG_P (base_prev)
- && REG_P (base_curr)
- && REGNO (base_prev) == REGNO (base_curr)
- && (REGNO (base_prev) == STACK_POINTER_REGNUM
- || REGNO (base_prev) == HARD_FRAME_POINTER_REGNUM)
- && ((INTVAL (offset_prev) == INTVAL (offset_curr) + 8
- && (INTVAL (offset_prev) % 16) == 0)
- || ((INTVAL (offset_curr) == INTVAL (offset_prev) + 8)
- && (INTVAL (offset_curr) % 16) == 0)))
- return true;
+ /* Fail if we did not find both bases. */
+ if (base_prev == NULL_RTX || base_curr == NULL_RTX)
+ return false;
+
+ /* Fail if either base is not a register. */
+ if (!REG_P (base_prev) || !REG_P (base_curr))
+ return false;
+
+ /* Fail if the bases are not the same register. */
+ if (REGNO (base_prev) != REGNO (base_curr))
+ return false;
+
+ /* Originally the thought was to check MEM_ALIGN, but that was
+ reporting incorrect alignments, even for SP/FP accesses, so we
+ gave up on that approach. Instead just check for stack/hfp
+ which we know are aligned. */
+ if (REGNO (base_prev) != STACK_POINTER_REGNUM
+ && REGNO (base_prev) != HARD_FRAME_POINTER_REGNUM)
+ return false;
+
+ /* The two stores must be contained within opposite halves of the
+ same 16 byte aligned block of memory. We know that the stack
+ pointer and the frame pointer have suitable alignment. So we
+ just need to check the offsets of the two stores for suitable
+ alignment. */
+ /* Get the smaller offset into OFFSET_PREV. */
+ if (INTVAL (offset_prev) > INTVAL (offset_curr))
+ std::swap (offset_prev, offset_curr);
+
+ /* If the smaller offset (OFFSET_PREV) is not 16 byte aligned,
+ then fail. */
+ if ((INTVAL (offset_prev) % 16) != 0)
+ return false;
+
+ /* The higher offset must be 8 bytes more than the lower
+ offset. */
+ return (INTVAL (offset_prev) + 8 == INTVAL (offset_curr));
}
}
^ permalink raw reply [flat|nested] 4+ messages in thread
* [gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [committed] [RISC-V] Fix detection of store pair fusion cases
@ 2024-08-16 20:15 Jeff Law
0 siblings, 0 replies; 4+ messages in thread
From: Jeff Law @ 2024-08-16 20:15 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:20d0403c22b1de01a98171ba219086d6461c45e2
commit 20d0403c22b1de01a98171ba219086d6461c45e2
Author: Jeff Law <jlaw@ventanamicro.com>
Date: Wed May 1 11:28:41 2024 -0600
[committed] [RISC-V] Fix detection of store pair fusion cases
We've got the ability to count the number of store pair fusions happening in
the front-end of the pipeline. When comparing some code from last year vs the
current trunk we saw a fairly dramatic drop.
The problem is the store pair fusion detection code was actively harmful due to
a minor bug in checking offsets. So instead of pairing up 8 byte stores such
as sp+0 with sp+8, it tried to pair up sp+8 and sp+16.
Given uarch sensitivity I didn't try to pull together a testcase. But we could
certainly see the undesirable behavior in benchmarks as simplistic as dhrystone
up through spec2017.
Anyway, bootstrapped a while back. Also verified through our performance
counters that store pair fusion rates are back up. Regression tested with
crosses a few minutes ago.
gcc/
* config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Break out
tests for easier debugging in store pair fusion case. Fix offset
check in same.
(cherry picked from commit fad93e7617ce1aafb006983a71b6edc9ae1eb2d1)
Diff:
---
gcc/config/riscv/riscv.cc | 57 ++++++++++++++++++++++++++++++-----------------
1 file changed, 37 insertions(+), 20 deletions(-)
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 2841787d93e..00c0ae56da9 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8833,26 +8833,43 @@ riscv_macro_fusion_pair_p (rtx_insn *prev, rtx_insn *curr)
extract_base_offset_in_addr (SET_DEST (prev_set), &base_prev, &offset_prev);
extract_base_offset_in_addr (SET_DEST (curr_set), &base_curr, &offset_curr);
- /* The two stores must be contained within opposite halves of the same
- 16 byte aligned block of memory. We know that the stack pointer and
- the frame pointer have suitable alignment. So we just need to check
- the offsets of the two stores for suitable alignment.
-
- Originally the thought was to check MEM_ALIGN, but that was reporting
- incorrect alignments, even for SP/FP accesses, so we gave up on that
- approach. */
- if (base_prev != NULL_RTX
- && base_curr != NULL_RTX
- && REG_P (base_prev)
- && REG_P (base_curr)
- && REGNO (base_prev) == REGNO (base_curr)
- && (REGNO (base_prev) == STACK_POINTER_REGNUM
- || REGNO (base_prev) == HARD_FRAME_POINTER_REGNUM)
- && ((INTVAL (offset_prev) == INTVAL (offset_curr) + 8
- && (INTVAL (offset_prev) % 16) == 0)
- || ((INTVAL (offset_curr) == INTVAL (offset_prev) + 8)
- && (INTVAL (offset_curr) % 16) == 0)))
- return true;
+ /* Fail if we did not find both bases. */
+ if (base_prev == NULL_RTX || base_curr == NULL_RTX)
+ return false;
+
+ /* Fail if either base is not a register. */
+ if (!REG_P (base_prev) || !REG_P (base_curr))
+ return false;
+
+ /* Fail if the bases are not the same register. */
+ if (REGNO (base_prev) != REGNO (base_curr))
+ return false;
+
+ /* Originally the thought was to check MEM_ALIGN, but that was
+ reporting incorrect alignments, even for SP/FP accesses, so we
+ gave up on that approach. Instead just check for stack/hfp
+ which we know are aligned. */
+ if (REGNO (base_prev) != STACK_POINTER_REGNUM
+ && REGNO (base_prev) != HARD_FRAME_POINTER_REGNUM)
+ return false;
+
+ /* The two stores must be contained within opposite halves of the
+ same 16 byte aligned block of memory. We know that the stack
+ pointer and the frame pointer have suitable alignment. So we
+ just need to check the offsets of the two stores for suitable
+ alignment. */
+ /* Get the smaller offset into OFFSET_PREV. */
+ if (INTVAL (offset_prev) > INTVAL (offset_curr))
+ std::swap (offset_prev, offset_curr);
+
+ /* If the smaller offset (OFFSET_PREV) is not 16 byte aligned,
+ then fail. */
+ if ((INTVAL (offset_prev) % 16) != 0)
+ return false;
+
+ /* The higher offset must be 8 bytes more than the lower
+ offset. */
+ return (INTVAL (offset_prev) + 8 == INTVAL (offset_curr));
}
}
^ permalink raw reply [flat|nested] 4+ messages in thread
* [gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [committed] [RISC-V] Fix detection of store pair fusion cases
@ 2024-05-27 0:04 Jeff Law
0 siblings, 0 replies; 4+ messages in thread
From: Jeff Law @ 2024-05-27 0:04 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:241fbf90eb65e11a028aa72e0b48529fecb3f3c9
commit 241fbf90eb65e11a028aa72e0b48529fecb3f3c9
Author: Jeff Law <jlaw@ventanamicro.com>
Date: Wed May 1 11:28:41 2024 -0600
[committed] [RISC-V] Fix detection of store pair fusion cases
We've got the ability to count the number of store pair fusions happening in
the front-end of the pipeline. When comparing some code from last year vs the
current trunk we saw a fairly dramatic drop.
The problem is the store pair fusion detection code was actively harmful due to
a minor bug in checking offsets. So instead of pairing up 8 byte stores such
as sp+0 with sp+8, it tried to pair up sp+8 and sp+16.
Given uarch sensitivity I didn't try to pull together a testcase. But we could
certainly see the undesirable behavior in benchmarks as simplistic as dhrystone
up through spec2017.
Anyway, bootstrapped a while back. Also verified through our performance
counters that store pair fusion rates are back up. Regression tested with
crosses a few minutes ago.
gcc/
* config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Break out
tests for easier debugging in store pair fusion case. Fix offset
check in same.
(cherry picked from commit fad93e7617ce1aafb006983a71b6edc9ae1eb2d1)
Diff:
---
gcc/config/riscv/riscv.cc | 57 ++++++++++++++++++++++++++++++-----------------
1 file changed, 37 insertions(+), 20 deletions(-)
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 0f62b295b96..24d1ead3902 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8874,26 +8874,43 @@ riscv_macro_fusion_pair_p (rtx_insn *prev, rtx_insn *curr)
extract_base_offset_in_addr (SET_DEST (prev_set), &base_prev, &offset_prev);
extract_base_offset_in_addr (SET_DEST (curr_set), &base_curr, &offset_curr);
- /* The two stores must be contained within opposite halves of the same
- 16 byte aligned block of memory. We know that the stack pointer and
- the frame pointer have suitable alignment. So we just need to check
- the offsets of the two stores for suitable alignment.
-
- Originally the thought was to check MEM_ALIGN, but that was reporting
- incorrect alignments, even for SP/FP accesses, so we gave up on that
- approach. */
- if (base_prev != NULL_RTX
- && base_curr != NULL_RTX
- && REG_P (base_prev)
- && REG_P (base_curr)
- && REGNO (base_prev) == REGNO (base_curr)
- && (REGNO (base_prev) == STACK_POINTER_REGNUM
- || REGNO (base_prev) == HARD_FRAME_POINTER_REGNUM)
- && ((INTVAL (offset_prev) == INTVAL (offset_curr) + 8
- && (INTVAL (offset_prev) % 16) == 0)
- || ((INTVAL (offset_curr) == INTVAL (offset_prev) + 8)
- && (INTVAL (offset_curr) % 16) == 0)))
- return true;
+ /* Fail if we did not find both bases. */
+ if (base_prev == NULL_RTX || base_curr == NULL_RTX)
+ return false;
+
+ /* Fail if either base is not a register. */
+ if (!REG_P (base_prev) || !REG_P (base_curr))
+ return false;
+
+ /* Fail if the bases are not the same register. */
+ if (REGNO (base_prev) != REGNO (base_curr))
+ return false;
+
+ /* Originally the thought was to check MEM_ALIGN, but that was
+ reporting incorrect alignments, even for SP/FP accesses, so we
+ gave up on that approach. Instead just check for stack/hfp
+ which we know are aligned. */
+ if (REGNO (base_prev) != STACK_POINTER_REGNUM
+ && REGNO (base_prev) != HARD_FRAME_POINTER_REGNUM)
+ return false;
+
+ /* The two stores must be contained within opposite halves of the
+ same 16 byte aligned block of memory. We know that the stack
+ pointer and the frame pointer have suitable alignment. So we
+ just need to check the offsets of the two stores for suitable
+ alignment. */
+ /* Get the smaller offset into OFFSET_PREV. */
+ if (INTVAL (offset_prev) > INTVAL (offset_curr))
+ std::swap (offset_prev, offset_curr);
+
+ /* If the smaller offset (OFFSET_PREV) is not 16 byte aligned,
+ then fail. */
+ if ((INTVAL (offset_prev) % 16) != 0)
+ return false;
+
+ /* The higher offset must be 8 bytes more than the lower
+ offset. */
+ return (INTVAL (offset_prev) + 8 == INTVAL (offset_curr));
}
}
^ permalink raw reply [flat|nested] 4+ messages in thread
* [gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [committed] [RISC-V] Fix detection of store pair fusion cases
@ 2024-05-07 17:48 Jeff Law
0 siblings, 0 replies; 4+ messages in thread
From: Jeff Law @ 2024-05-07 17:48 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:305a5d2d9a9394579d132364c0fd1ca9e95c9f4c
commit 305a5d2d9a9394579d132364c0fd1ca9e95c9f4c
Author: Jeff Law <jlaw@ventanamicro.com>
Date: Wed May 1 11:28:41 2024 -0600
[committed] [RISC-V] Fix detection of store pair fusion cases
We've got the ability to count the number of store pair fusions happening in
the front-end of the pipeline. When comparing some code from last year vs the
current trunk we saw a fairly dramatic drop.
The problem is the store pair fusion detection code was actively harmful due to
a minor bug in checking offsets. So instead of pairing up 8 byte stores such
as sp+0 with sp+8, it tried to pair up sp+8 and sp+16.
Given uarch sensitivity I didn't try to pull together a testcase. But we could
certainly see the undesirable behavior in benchmarks as simplistic as dhrystone
up through spec2017.
Anyway, bootstrapped a while back. Also verified through our performance
counters that store pair fusion rates are back up. Regression tested with
crosses a few minutes ago.
gcc/
* config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Break out
tests for easier debugging in store pair fusion case. Fix offset
check in same.
(cherry picked from commit fad93e7617ce1aafb006983a71b6edc9ae1eb2d1)
Diff:
---
gcc/config/riscv/riscv.cc | 57 ++++++++++++++++++++++++++++++-----------------
1 file changed, 37 insertions(+), 20 deletions(-)
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 0f62b295b969..24d1ead3902c 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8874,26 +8874,43 @@ riscv_macro_fusion_pair_p (rtx_insn *prev, rtx_insn *curr)
extract_base_offset_in_addr (SET_DEST (prev_set), &base_prev, &offset_prev);
extract_base_offset_in_addr (SET_DEST (curr_set), &base_curr, &offset_curr);
- /* The two stores must be contained within opposite halves of the same
- 16 byte aligned block of memory. We know that the stack pointer and
- the frame pointer have suitable alignment. So we just need to check
- the offsets of the two stores for suitable alignment.
-
- Originally the thought was to check MEM_ALIGN, but that was reporting
- incorrect alignments, even for SP/FP accesses, so we gave up on that
- approach. */
- if (base_prev != NULL_RTX
- && base_curr != NULL_RTX
- && REG_P (base_prev)
- && REG_P (base_curr)
- && REGNO (base_prev) == REGNO (base_curr)
- && (REGNO (base_prev) == STACK_POINTER_REGNUM
- || REGNO (base_prev) == HARD_FRAME_POINTER_REGNUM)
- && ((INTVAL (offset_prev) == INTVAL (offset_curr) + 8
- && (INTVAL (offset_prev) % 16) == 0)
- || ((INTVAL (offset_curr) == INTVAL (offset_prev) + 8)
- && (INTVAL (offset_curr) % 16) == 0)))
- return true;
+ /* Fail if we did not find both bases. */
+ if (base_prev == NULL_RTX || base_curr == NULL_RTX)
+ return false;
+
+ /* Fail if either base is not a register. */
+ if (!REG_P (base_prev) || !REG_P (base_curr))
+ return false;
+
+ /* Fail if the bases are not the same register. */
+ if (REGNO (base_prev) != REGNO (base_curr))
+ return false;
+
+ /* Originally the thought was to check MEM_ALIGN, but that was
+ reporting incorrect alignments, even for SP/FP accesses, so we
+ gave up on that approach. Instead just check for stack/hfp
+ which we know are aligned. */
+ if (REGNO (base_prev) != STACK_POINTER_REGNUM
+ && REGNO (base_prev) != HARD_FRAME_POINTER_REGNUM)
+ return false;
+
+ /* The two stores must be contained within opposite halves of the
+ same 16 byte aligned block of memory. We know that the stack
+ pointer and the frame pointer have suitable alignment. So we
+ just need to check the offsets of the two stores for suitable
+ alignment. */
+ /* Get the smaller offset into OFFSET_PREV. */
+ if (INTVAL (offset_prev) > INTVAL (offset_curr))
+ std::swap (offset_prev, offset_curr);
+
+ /* If the smaller offset (OFFSET_PREV) is not 16 byte aligned,
+ then fail. */
+ if ((INTVAL (offset_prev) % 16) != 0)
+ return false;
+
+ /* The higher offset must be 8 bytes more than the lower
+ offset. */
+ return (INTVAL (offset_prev) + 8 == INTVAL (offset_curr));
}
}
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-08-16 20:15 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-01 17:32 [gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [committed] [RISC-V] Fix detection of store pair fusion cases Jeff Law
2024-05-07 17:48 Jeff Law
2024-05-27 0:04 Jeff Law
2024-08-16 20:15 Jeff Law
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).