* [Bug target/106635] AARCH64 STUR instruction causes bus error
2022-08-16 6:51 [Bug c/106635] New: AARCH64 STUR instruction causes bus error xgchenshy at 126 dot com
@ 2022-08-16 8:04 ` rguenth at gcc dot gnu.org
2022-08-16 8:34 ` xgchenshy at 126 dot com
` (8 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-08-16 8:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106635
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2022-08-16
Ever confirmed|0 |1
Status|UNCONFIRMED |WAITING
Keywords| |wrong-code
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Can you provide preprocessed source of the whole translation unit so the
testcase is compilable?
From: xgchenshy at 126 dot com @ 2022-08-16 8:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106635
--- Comment #2 from Xiaoguang <xgchenshy at 126 dot com> ---
(In reply to Richard Biener from comment #1)
> Can you provide preprocessed source of the whole translation unit so the
> testcase is compilable?
Sure, please see the complete code below.
void CWLCollectReadRegData(u32* dst, u16 reg_start, u32 reg_length,
                           u32* total_length, addr_t status_data_base_addr)
{
    u32 data_length = 0;
    {
        //opcode
        *dst++ = (OPCODE_RREG<<27)|(reg_length<<16)|(reg_start*4);
        data_length++;
        //data
        // fix compiler optimization -O2 bug: stur x4, [x0, #4]
        volatile u32 temp_32 = (u32)status_data_base_addr;
        *dst++ = temp_32;
        data_length++;
        if(sizeof(addr_t) == 8) {
            *dst++ = (u32)(((u64)status_data_base_addr)>>32);
            data_length++;
        } else {
            *dst++ = 0;
            data_length++;
        }
        //alignment
        *dst = 0;
        data_length++;
        *total_length = data_length;
    }
}
From: xgchenshy at 126 dot com @ 2022-08-16 8:35 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106635
--- Comment #3 from Xiaoguang <xgchenshy at 126 dot com> ---
(In reply to Richard Biener from comment #1)
> Can you provide preprocessed source of the whole translation unit so the
> testcase is compilable?
Please see the code below:
void CWLCollectReadRegData(u32* dst, u16 reg_start, u32 reg_length,
                           u32* total_length, addr_t status_data_base_addr)
{
    u32 data_length = 0;
    {
        //opcode
        *dst++ = (OPCODE_RREG<<27)|(reg_length<<16)|(reg_start*4);
        data_length++;
        //data
        *dst++ = (u32)status_data_base_addr;
        data_length++;
        if(sizeof(addr_t) == 8) {
            *dst++ = (u32)(((u64)status_data_base_addr)>>32);
            data_length++;
        } else {
            *dst++ = 0;
            data_length++;
        }
        //alignment
        *dst = 0;
        data_length++;
        *total_length = data_length;
    }
}
From: xgchenshy at 126 dot com @ 2022-08-16 8:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106635
--- Comment #4 from Xiaoguang <xgchenshy at 126 dot com> ---
(In reply to Xiaoguang from comment #2)
Please ignore this version; we added the volatile there to work around the issue.
From: rearnsha at gcc dot gnu.org @ 2022-08-16 11:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106635
Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |INVALID
Status|WAITING |RESOLVED
--- Comment #5 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Your original code contains (after stripping out the volatile):
u32 temp_32 = (u32)status_data_base_addr;
*dst++ = temp_32;
data_length++;
if(sizeof(addr_t) == 8) {
*dst++ = (u32)(((u64)status_data_base_addr)>>32);
data_length++;
}
Which of course on a 64-bit machine simplifies to
u32 temp_32 = (u32)status_data_base_addr;
*dst++ = temp_32;
data_length++;
*dst++ = (u32)(((u64)status_data_base_addr)>>32);
data_length++;
And which the compiler then further simplifies to
*([unaligned]u64*)dst = status_data_base_addr;
data_length += 2;
dst += 2;
If the location that dst points to is in normal, cachable, memory, then this
will be fine. But if you're writing to non-cachable memory, then you might get
a trap.
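The merge described above can be pictured with a minimal sketch. The helper names `store_split` and `store_merged` are hypothetical and do not appear in the report; `memcpy` stands in for the single 64-bit store so the example stays valid C, and the byte-for-byte equivalence assumes a little-endian target (the usual AArch64 configuration).

```c
#include <stdint.h>
#include <string.h>

/* Source-level form: two separate 32-bit stores, to dst[1] and dst[2]. */
void store_split(uint32_t *dst, uint64_t addr)
{
    dst[1] = (uint32_t)addr;          /* low half  */
    dst[2] = (uint32_t)(addr >> 32);  /* high half */
}

/* Merged form: one 8-byte store starting at byte offset 4 from dst.
   A compiler may lower this to a single, possibly unaligned, 64-bit store. */
void store_merged(uint32_t *dst, uint64_t addr)
{
    memcpy(&dst[1], &addr, sizeof addr);
}
```

On a little-endian machine both functions leave the same bytes in memory, which is why the compiler is free to replace the first form with the second when dst is not volatile.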
The correct fix is to mark dst as volatile in this case:
void CWLCollectReadRegData(volatile u32* dst, u16 reg_start, u32 reg_length,
                           u32* total_length, addr_t status_data_base_addr)
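A compilable sketch of the function with the suggested volatile qualifier might look like the following. The typedefs and the value of `OPCODE_RREG` are assumptions made only so the example builds; the report does not show them.

```c
#include <stdint.h>

/* Assumed typedefs; the original translation unit was not provided. */
typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;
typedef u64 addr_t;           /* assumption: 64-bit address type */

#define OPCODE_RREG 0x16u     /* hypothetical value, not from the report */

void CWLCollectReadRegData(volatile u32 *dst, u16 reg_start, u32 reg_length,
                           u32 *total_length, addr_t status_data_base_addr)
{
    u32 data_length = 0;

    /* opcode word */
    *dst++ = (OPCODE_RREG << 27) | (reg_length << 16) | ((u32)reg_start * 4u);
    data_length++;

    /* low half of the address: its own 32-bit volatile store */
    *dst++ = (u32)status_data_base_addr;
    data_length++;

    /* high half (or zero when addr_t is 32 bits wide) */
    if (sizeof(addr_t) == 8) {
        *dst++ = (u32)((u64)status_data_base_addr >> 32);
    } else {
        *dst++ = 0;
    }
    data_length++;

    /* alignment padding word */
    *dst = 0;
    data_length++;

    *total_length = data_length;
}
```

Because every store goes through a volatile-qualified lvalue, the compiler is expected to emit each 32-bit store separately rather than combining two of them into one 64-bit store.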
From: xgchenshy at 126 dot com @ 2022-08-17 2:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106635
--- Comment #6 from Xiaoguang <xgchenshy at 126 dot com> ---
(In reply to Richard Earnshaw from comment #5)
Thanks very much for the explanation. Can you tell me why unaligned access
only works in normal cachable memory? Where does this constraint come from?
From: pinskia at gcc dot gnu.org @ 2022-08-17 2:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106635
--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Xiaoguang from comment #6)
> Thanks very much for the explanation. Can you tell me why unaligned access
> only works in normal cachable memory? Where does this constraint come from?
The architecture (Armv8) explains this. Basically, the hardware does not know
what to do on an unaligned access, because it has to do two reads and two
writes to get the data correct.
It is in the Arm Armv8 document.
From: pinskia at gcc dot gnu.org @ 2022-08-17 2:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106635
--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
In ARM Armv8, for A-profile architecture (ARM DDI 0487G.b (ID072021)):
From section B2.5.2 Alignment of data accesses:
An unaligned access to any type of Device memory causes an Alignment fault.
Unaligned accesses to Normal memory
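To relate this to the reported instruction: `stur x4, [x0, #4]` stores 8 bytes at address x0 + 4. If x0 is 8-byte aligned, that address is 4-byte aligned but not 8-byte aligned, so the 64-bit store is an unaligned access. A minimal sketch of the check, with a hypothetical helper `is_aligned` and made-up example addresses:

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if addr is a multiple of n, 0 otherwise (n must be non-zero). */
int is_aligned(uintptr_t addr, size_t n)
{
    return (addr % n) == 0;
}
```

For example, with a base address of 0x1000, `is_aligned(0x1000 + 4, 4)` holds while `is_aligned(0x1000 + 4, 8)` does not.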
From: xgchenshy at 126 dot com @ 2022-08-17 3:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106635
--- Comment #9 from Xiaoguang <xgchenshy at 126 dot com> ---
(In reply to Andrew Pinski from comment #8)
> In ARM Armv8, for A-profile architecture (ARM DDI 0487G.b (ID072021)):
>
> From section B2.5.2 Alignment of data accesses:
>
> An unaligned access to any type of Device memory causes an Alignment fault.
>
> Unaligned accesses to Normal memory
Yes, I also found that description. My memory type is uncachable Normal
memory, not Device memory.
I use mmap to get the virtual address, with O_SYNC set on the fd.
From: xgchenshy at 126 dot com @ 2022-08-17 6:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106635
--- Comment #10 from Xiaoguang <xgchenshy at 126 dot com> ---
(In reply to Xiaoguang from comment #9)
> (In reply to Andrew Pinski from comment #8)
> > In ARM Armv8, for A-profile architecture (ARM DDI 0487G.b (ID072021)):
> >
> > From section B2.5.2 Alignment of data accesses:
> >
> > An unaligned access to any type of Device memory causes an Alignment fault.
> >
> > Unaligned accesses to Normal memory
>
> Yeah, I also find such description, my memory type is uncachable normal
> memory, but not device memory
> I use mmap to get the virtual address with an O_SYNC in fd
Also, I didn't see anything saying that whether Normal memory is cacheable
affects unaligned access. Besides, the STUR instruction takes an unscaled
immediate offset, so it should support unaligned access to Normal memory,
cached or not, and my X0 points to Normal memory. So I'm still confused about
why it fails. Please correct me if my understanding is wrong. Thanks very much.