public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns
@ 2023-03-10 7:36 liwei at loongson dot cn
2023-03-10 7:38 ` [Bug target/109086] " liwei at loongson dot cn
` (13 more replies)
0 siblings, 14 replies; 15+ messages in thread
From: liwei at loongson dot cn @ 2023-03-10 7:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
Bug ID: 109086
Summary: Bug of builtin_strcmp in the case of using the adddi3
instruction patterns
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: liwei at loongson dot cn
Target Milestone: ---
Created attachment 54631
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54631&action=edit
cause builtin_strcmp bug file
When we try to force add<mode>3 insn patterns only match DI mode for some
reason (this is safe because Loongarch defines the macro PROMOTE_MODE to
promote all modes narrower than BITS_PER_WORD to BITS_PER_WORD), it fails in
the following test case.
extern const char* deal(const char* filename);
const char* foo (filename)
const char *filename;
{
if (filename == NULL || !strcmp (filename, "-"))
filename = "";
return deal(filename);
}
We use glibc's script build-many-glibcs.py to cross build loongarch64 gcc, and
use -O2 command to compile above test case.
Part of the assembly code is as follows:
.LFB0 = .
.cfi_startproc
beqz $r4,.L5
ld.bu $r12,$r4,0
addi.w $r12,$r12,-45
beqz $r12,.L7
slli.w $r12,$r13,0
bnez $r12,.L2
...
The r12 register uses the uninitialized r13 register, which makes subsequent
bnez insn wrong.
Through gdb debugging we found that the problem occurs in
builtins.cc:inline_string_cmp, below is the problem code.
rtx result = target ? target : gen_reg_rtx (mode);
...
for (unsigned HOST_WIDE_INT i = 0; i < length; i++)
{
...
op0 = convert_modes (mode, unit_mode, op0, 1);
op1 = convert_modes (mode, unit_mode, op1, 1);
result = expand_simple_binop (mode, MINUS, op0, op1,
result, 1, OPTAB_WIDEN);
...
}
The result is empty at the beginning, and a new temporary register is used to
save the result of the first strcmp comparison, there is no problem in the
first loop.
From the second loop, the expand_simple_binop function actually calls the
expand_binop function. Since there is no addsi3 insn pattern, gcc will promote
the machine mode of the two operands op0 and op1 to DI, then match it to
adddi3, the result is stored in the temporary register and returned (because
the mode is MODE_INT and loongarch supports TRUNCATION). The previous result is
discarded, which resulting in error.
Attached are the diff file used to reproduce the bugs on loongarch and the
corresponding fix patches given.
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Bug target/109086] Bug of builtin_strcmp in the case of using the adddi3 instruction patterns
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
@ 2023-03-10 7:38 ` liwei at loongson dot cn
2023-03-10 8:18 ` [Bug rtl-optimization/109086] " rguenth at gcc dot gnu.org
` (12 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: liwei at loongson dot cn @ 2023-03-10 7:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
--- Comment #1 from liwei at loongson dot cn ---
Created attachment 54632
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54632&action=edit
try to fix builtin_strcmp bug patch
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Bug rtl-optimization/109086] Bug of builtin_strcmp in the case of using the adddi3 instruction patterns
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
2023-03-10 7:38 ` [Bug target/109086] " liwei at loongson dot cn
@ 2023-03-10 8:18 ` rguenth at gcc dot gnu.org
2023-03-10 9:31 ` liwei at loongson dot cn
` (11 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-10 8:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target| |loongarch
Keywords| |wrong-code
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
But the previous result is irrelevant? Each and every result is independent
and immediately checked with a compare-and-jump? To me it looks like the
issue is in expand_simple_binop not honoring something not using the provided
'target'. But I can't see how, so I can wonder if you can pin-point better
where it goes wrong.
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Bug rtl-optimization/109086] Bug of builtin_strcmp in the case of using the adddi3 instruction patterns
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
2023-03-10 7:38 ` [Bug target/109086] " liwei at loongson dot cn
2023-03-10 8:18 ` [Bug rtl-optimization/109086] " rguenth at gcc dot gnu.org
@ 2023-03-10 9:31 ` liwei at loongson dot cn
2023-03-10 13:12 ` rguenth at gcc dot gnu.org
` (10 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: liwei at loongson dot cn @ 2023-03-10 9:31 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
--- Comment #3 from liwei at loongson dot cn ---
Thank you for your quick reply!
The final call of the expand_simple_binop function returns part of the code as
follows (optabs.cc):
1671 xop0 = widen_operand (xop0, wider_mode, mode, unsignedp, no_extend);
1672
1673 /* The second operand of a shift must always be extended. */
1674 xop1 = widen_operand (xop1, wider_mode, mode, unsignedp,
1675 no_extend && binoptab != ashl_optab);
1676
1677 temp = expand_binop (wider_mode, binoptab, xop0, xop1, NULL_RTX,
1678 unsignedp, OPTAB_DIRECT);
1679 if (temp)
1680 {
1681 if (mclass != MODE_INT
1682 || !TRULY_NOOP_TRUNCATION_MODES_P (mode, wider_mode))
1683 {
1684 if (target == 0)
1685 target = gen_reg_rtx (mode);
1686 convert_move (target, temp, 0);
1687 return target;
1688 }
1689 else
1690 return gen_lowpart (mode, temp);
1691 }
In the above code, target = result, when promote xop0 and xop1 machine mode to
DI, then match adddi3 insn pattern and save the new rtx to temp, then because
the mclass equal MODE_INT (also equal true for TRULY_NOOP_TRUNCATION_MODES_P
function), the code enter else branch and return lowpart of temp which is
irrelevant with target (i.e. result).
Maybe the expand_binop function does not consider the case of dependency with
`target` when generating rtx for the case of promote MODE_INT mode, and maybe
theoretically it does not need to be considered, except that builtin_strcmp
happens to meet such cases, so I think it is enough to modify the processing
logic of builtin_strcmp (my humble opinion).
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Bug rtl-optimization/109086] Bug of builtin_strcmp in the case of using the adddi3 instruction patterns
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
` (2 preceding siblings ...)
2023-03-10 9:31 ` liwei at loongson dot cn
@ 2023-03-10 13:12 ` rguenth at gcc dot gnu.org
2023-03-14 14:59 ` xry111 at gcc dot gnu.org
` (9 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-10 13:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to liwei from comment #3)
> Thank you for your quick reply!
>
> The final call of the expand_simple_binop function returns part of the code
> as follows (optabs.cc):
>
> 1671 xop0 = widen_operand (xop0, wider_mode, mode, unsignedp, no_extend);
> 1672
> 1673 /* The second operand of a shift must always be extended. */
> 1674 xop1 = widen_operand (xop1, wider_mode, mode, unsignedp,
> 1675 no_extend && binoptab != ashl_optab);
> 1676
> 1677 temp = expand_binop (wider_mode, binoptab, xop0, xop1, NULL_RTX,
> 1678 unsignedp, OPTAB_DIRECT);
> 1679 if (temp)
> 1680 {
> 1681 if (mclass != MODE_INT
> 1682 || !TRULY_NOOP_TRUNCATION_MODES_P (mode, wider_mode))
> 1683 {
> 1684 if (target == 0)
> 1685 target = gen_reg_rtx (mode);
> 1686 convert_move (target, temp, 0);
> 1687 return target;
> 1688 }
> 1689 else
> 1690 return gen_lowpart (mode, temp);
> 1691 }
>
> In the above code, target = result, when promote xop0 and xop1 machine mode
> to DI, then match adddi3 insn pattern and save the new rtx to temp, then
> because the mclass equal MODE_INT (also equal true for
> TRULY_NOOP_TRUNCATION_MODES_P function), the code enter else branch and
> return lowpart of temp which is irrelevant with target (i.e. result).
Sure, but expand_simple_binop doesn't need to return 'target'
> Maybe the expand_binop function does not consider the case of dependency
> with `target` when generating rtx for the case of promote MODE_INT mode, and
> maybe theoretically it does not need to be considered, except that
> builtin_strcmp happens to meet such cases, so I think it is enough to modify
> the processing logic of builtin_strcmp (my humble opinion).
I'm still missing the obvious? We assign the result to 'result', so it
should be all fine? I see that emit_cmp_and_jump_insns eventually forces
the lowpart to a register but still it should work.
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Bug rtl-optimization/109086] Bug of builtin_strcmp in the case of using the adddi3 instruction patterns
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
` (3 preceding siblings ...)
2023-03-10 13:12 ` rguenth at gcc dot gnu.org
@ 2023-03-14 14:59 ` xry111 at gcc dot gnu.org
2023-03-15 2:00 ` xry111 at gcc dot gnu.org
` (8 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: xry111 at gcc dot gnu.org @ 2023-03-14 14:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
Xi Ruoyao <xry111 at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |xry111 at gcc dot gnu.org
--- Comment #5 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
Definitely not a __builtin_strlen expansion issue. Things start to go wrong in
318r.bbro pass. In 317r.rtl.dce:
note 42 17 18 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(note 18 42 19 4 NOTE_INSN_DELETED)
(insn 19 18 20 4 (set (reg:DI 13 $r13 [orig:92 MEM <char[1:2]> [(void
*)filename_3(D)]+1 ] [92])
(zero_extend:DI (mem:QI (plus:DI (reg/v/f:DI 4 $r4 [orig:82 filename ]
[82])
(const_int 1 [0x1])) [0 MEM <char[1:2]> [(void
*)filename_3(D)]+1 S1 A8]))) "t.c":6:30 discrim 1 107 {zero_extendqidi2}
(nil))
(code_label 20 19 43 5 3 (nil) [1 uses])
(note 43 20 21 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(insn 21 43 22 5 (set (reg:DI 12 $r12 [orig:80 _1 ] [80])
(sign_extend:DI (reg:SI 13 $r13 [orig:92 MEM <char[1:2]> [(void
*)filename_3(D)]+1 ] [92]))) "t.c":6:30 discrim 1 116 {extendsidi2}
(expr_list:REG_DEAD (reg:SI 13 $r13 [orig:92 MEM <char[1:2]> [(void
*)filename_3(D)]+1 ] [92])
(nil)))
But in 318r.bbro:
Reordered sequence:
2 bb 2
3 bb 3
4 bb 5
5 bb 7
6 bb 8
7 bb 4
8 duplicate of 5
and...
(code_label 20 19 43 4 3 (nil) [1 uses])
(note 43 20 21 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 21 43 22 4 (set (reg:DI 12 $r12 [orig:80 _1 ] [80])
(sign_extend:DI (reg:SI 13 $r13 [orig:92 MEM <char[1:2]> [(void
*)filename_3(D)]+1 ] [92]))) "t.c":6:30 discrim 1 116 {extendsidi2}
(expr_list:REG_DEAD (reg:SI 13 $r13 [orig:92 MEM <char[1:2]> [(void
*)filename_3(D)]+1 ] [92])
(nil)))
Apparently the BB reorder pass believe BB 5 does not depends on BB 4. Any idea
why this could even happen?
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Bug rtl-optimization/109086] Bug of builtin_strcmp in the case of using the adddi3 instruction patterns
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
` (4 preceding siblings ...)
2023-03-14 14:59 ` xry111 at gcc dot gnu.org
@ 2023-03-15 2:00 ` xry111 at gcc dot gnu.org
2023-03-15 2:04 ` xry111 at gcc dot gnu.org
` (7 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: xry111 at gcc dot gnu.org @ 2023-03-15 2:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
--- Comment #6 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
(In reply to Xi Ruoyao from comment #5)
> Definitely not a __builtin_strlen expansion issue. Things start to go wrong
> in 318r.bbro pass. In 317r.rtl.dce:
>
> note 42 17 18 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> (note 18 42 19 4 NOTE_INSN_DELETED)
> (insn 19 18 20 4 (set (reg:DI 13 $r13 [orig:92 MEM <char[1:2]> [(void
> *)filename_3(D)]+1 ] [92])
> (zero_extend:DI (mem:QI (plus:DI (reg/v/f:DI 4 $r4 [orig:82 filename
> ] [82])
> (const_int 1 [0x1])) [0 MEM <char[1:2]> [(void
> *)filename_3(D)]+1 S1 A8]))) "t.c":6:30 discrim 1 107 {zero_extendqidi2}
> (nil))
> (code_label 20 19 43 5 3 (nil) [1 uses])
> (note 43 20 21 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
> (insn 21 43 22 5 (set (reg:DI 12 $r12 [orig:80 _1 ] [80])
> (sign_extend:DI (reg:SI 13 $r13 [orig:92 MEM <char[1:2]> [(void
> *)filename_3(D)]+1 ] [92]))) "t.c":6:30 discrim 1 116 {extendsidi2}
> (expr_list:REG_DEAD (reg:SI 13 $r13 [orig:92 MEM <char[1:2]> [(void
> *)filename_3(D)]+1 ] [92])
> (nil)))
>
> But in 318r.bbro:
>
> Reordered sequence:
> 2 bb 2
> 3 bb 3
> 4 bb 5
> 5 bb 7
> 6 bb 8
> 7 bb 4
> 8 duplicate of 5
>
> and...
>
> (code_label 20 19 43 4 3 (nil) [1 uses])
> (note 43 20 21 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> (insn 21 43 22 4 (set (reg:DI 12 $r12 [orig:80 _1 ] [80])
> (sign_extend:DI (reg:SI 13 $r13 [orig:92 MEM <char[1:2]> [(void
> *)filename_3(D)]+1 ] [92]))) "t.c":6:30 discrim 1 116 {extendsidi2}
> (expr_list:REG_DEAD (reg:SI 13 $r13 [orig:92 MEM <char[1:2]> [(void
> *)filename_3(D)]+1 ] [92])
> (nil)))
>
> Apparently the BB reorder pass believe BB 5 does not depends on BB 4. Any
> idea why this could even happen?
Wrong analysis, looks like I was too sleepy yesterday. I'll try to post a new
one now...
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Bug rtl-optimization/109086] Bug of builtin_strcmp in the case of using the adddi3 instruction patterns
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
` (5 preceding siblings ...)
2023-03-15 2:00 ` xry111 at gcc dot gnu.org
@ 2023-03-15 2:04 ` xry111 at gcc dot gnu.org
2023-03-15 2:17 ` liwei at loongson dot cn
` (6 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: xry111 at gcc dot gnu.org @ 2023-03-15 2:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
--- Comment #7 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
Things are already wrong in 255r:
(jump_insn 17 16 42 4 (set (pc)
(if_then_else (ne (reg:DI 90)
(const_int 0 [0]))
(label_ref 20)
(pc))) "t.c":4:23 discrim 1 -1
(int_list:REG_BR_PROB 536870916 (nil))
-> 20)
(note 42 17 18 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(insn 18 42 19 5 (set (reg:SI 91)
(zero_extend:SI (mem:QI (plus:DI (reg/v/f:DI 82 [ filename ])
(const_int 1 [0x1])) [0 MEM <char[1:2]> [(void
*)filename_3(D)]+1 S1 A8]))) "t.c":4:23 discrim 1 -1
(nil))
(insn 19 18 20 5 (set (reg:DI 92)
(plus:DI (subreg:DI (reg:SI 91) 0)
(const_int 0 [0]))) "t.c":4:23 discrim 1 -1
(nil))
(code_label 20 19 43 6 3 (nil) [1 uses])
(note 43 20 21 6 [bb 6] NOTE_INSN_BASIC_BLOCK)
(insn 21 43 22 6 (set (reg:DI 80 [ _1 ])
(sign_extend:DI (subreg:SI (reg:DI 92) 0))) "t.c":4:23 discrim 1 -1
(nil))
Note that the jump_insn jumps over insn 19 which initializes the vreg 92 (which
would become r13).
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Bug rtl-optimization/109086] Bug of builtin_strcmp in the case of using the adddi3 instruction patterns
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
` (6 preceding siblings ...)
2023-03-15 2:04 ` xry111 at gcc dot gnu.org
@ 2023-03-15 2:17 ` liwei at loongson dot cn
2023-03-15 3:02 ` xry111 at gcc dot gnu.org
` (5 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: liwei at loongson dot cn @ 2023-03-15 2:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
--- Comment #8 from liwei at loongson dot cn ---
(In reply to Xi Ruoyao from comment #7)
> Things are already wrong in 255r:
>
> (jump_insn 17 16 42 4 (set (pc)
> (if_then_else (ne (reg:DI 90)
> (const_int 0 [0]))
> (label_ref 20)
> (pc))) "t.c":4:23 discrim 1 -1
> (int_list:REG_BR_PROB 536870916 (nil))
> -> 20)
> (note 42 17 18 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
> (insn 18 42 19 5 (set (reg:SI 91)
> (zero_extend:SI (mem:QI (plus:DI (reg/v/f:DI 82 [ filename ])
> (const_int 1 [0x1])) [0 MEM <char[1:2]> [(void
> *)filename_3(D)]+1 S1 A8]))) "t.c":4:23 discrim 1 -1
> (nil))
> (insn 19 18 20 5 (set (reg:DI 92)
> (plus:DI (subreg:DI (reg:SI 91) 0)
> (const_int 0 [0]))) "t.c":4:23 discrim 1 -1
> (nil))
> (code_label 20 19 43 6 3 (nil) [1 uses])
> (note 43 20 21 6 [bb 6] NOTE_INSN_BASIC_BLOCK)
> (insn 21 43 22 6 (set (reg:DI 80 [ _1 ])
> (sign_extend:DI (subreg:SI (reg:DI 92) 0))) "t.c":4:23 discrim 1 -1
> (nil))
>
> Note that the jump_insn jumps over insn 19 which initializes the vreg 92
> (which would become r13).
Thanks for the reply, i am trying to debug inline_string_cmp again to explain
the problem in more detail.
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Bug rtl-optimization/109086] Bug of builtin_strcmp in the case of using the adddi3 instruction patterns
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
` (7 preceding siblings ...)
2023-03-15 2:17 ` liwei at loongson dot cn
@ 2023-03-15 3:02 ` xry111 at gcc dot gnu.org
2023-03-15 4:52 ` liwei at loongson dot cn
` (4 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: xry111 at gcc dot gnu.org @ 2023-03-15 3:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
--- Comment #9 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #4)
> > Maybe the expand_binop function does not consider the case of dependency
> > with `target` when generating rtx for the case of promote MODE_INT mode, and
> > maybe theoretically it does not need to be considered, except that
> > builtin_strcmp happens to meet such cases, so I think it is enough to modify
> > the processing logic of builtin_strcmp (my humble opinion).
>
> I'm still missing the obvious? We assign the result to 'result', so it
> should be all fine? I see that emit_cmp_and_jump_insns eventually forces
> the lowpart to a register but still it should work.
The problem here is we are assigning the result (i. e. virtual register number)
to 'result', not *moving* the result (i. e. the difference between s[i] and
t[i]) into 'result'.
When expand_simple_binop is allowed to assign new vreg in the loop, it's
possible that loop iteration 1 uses vreg1, iteration 2 use vreg2, ...,
iteration N use vregN. When we move the result into 'result', the "moving"
will happen only if the execution really reaches the move insn, such a behavior
is correct.
But we we assign the result into 'result', the assignment will "always" happen
(because assignment is a logic in the compiler itself, unlike moving which is a
part of target code). I. e. we always return vregN as the result, which is not
correct if the loop jumps out early.
I'd suggest
diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 305c65c29be..0c14a3b52b3 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -7142,8 +7142,11 @@ inline_string_cmp (rtx target, tree var_str, const char
*const_str,
op0 = convert_modes (mode, unit_mode, op0, 1);
op1 = convert_modes (mode, unit_mode, op1, 1);
- result = expand_simple_binop (mode, MINUS, op0, op1,
- result, 1, OPTAB_WIDEN);
+ rtx diff = expand_simple_binop (mode, MINUS, op0, op1,
+ result, 1, OPTAB_WIDEN);
+ if (diff != result)
+ emit_move_insn (result, diff);
+
if (i < length - 1)
emit_cmp_and_jump_insns (result, CONST0_RTX (mode), NE, NULL_RTX,
mode, true, ne_label);
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Bug rtl-optimization/109086] Bug of builtin_strcmp in the case of using the adddi3 instruction patterns
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
` (8 preceding siblings ...)
2023-03-15 3:02 ` xry111 at gcc dot gnu.org
@ 2023-03-15 4:52 ` liwei at loongson dot cn
2023-03-15 7:33 ` [Bug other/109086] __builtin_strcmp generates wrong code if expand_simple_binop assigns new pseudo-register for result xry111 at gcc dot gnu.org
` (3 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: liwei at loongson dot cn @ 2023-03-15 4:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
--- Comment #10 from liwei at loongson dot cn ---
(In reply to Richard Biener from comment #4)
> (In reply to liwei from comment #3)
> > Thank you for your quick reply!
> >
> > The final call of the expand_simple_binop function returns part of the code
> > as follows (optabs.cc):
> >
> > 1671 xop0 = widen_operand (xop0, wider_mode, mode, unsignedp, no_extend);
> > 1672
> > 1673 /* The second operand of a shift must always be extended. */
> > 1674 xop1 = widen_operand (xop1, wider_mode, mode, unsignedp,
> > 1675 no_extend && binoptab != ashl_optab);
> > 1676
> > 1677 temp = expand_binop (wider_mode, binoptab, xop0, xop1, NULL_RTX,
> > 1678 unsignedp, OPTAB_DIRECT);
> > 1679 if (temp)
> > 1680 {
> > 1681 if (mclass != MODE_INT
> > 1682 || !TRULY_NOOP_TRUNCATION_MODES_P (mode, wider_mode))
> > 1683 {
> > 1684 if (target == 0)
> > 1685 target = gen_reg_rtx (mode);
> > 1686 convert_move (target, temp, 0);
> > 1687 return target;
> > 1688 }
> > 1689 else
> > 1690 return gen_lowpart (mode, temp);
> > 1691 }
> >
> > In the above code, target = result, when promote xop0 and xop1 machine mode
> > to DI, then match adddi3 insn pattern and save the new rtx to temp, then
> > because the mclass equal MODE_INT (also equal true for
> > TRULY_NOOP_TRUNCATION_MODES_P function), the code enter else branch and
> > return lowpart of temp which is irrelevant with target (i.e. result).
>
> Sure, but expand_simple_binop doesn't need to return 'target'
>
> > Maybe the expand_binop function does not consider the case of dependency
> > with `target` when generating rtx for the case of promote MODE_INT mode, and
> > maybe theoretically it does not need to be considered, except that
> > builtin_strcmp happens to meet such cases, so I think it is enough to modify
> > the processing logic of builtin_strcmp (my humble opinion).
>
> I'm still missing the obvious? We assign the result to 'result', so it
> should be all fine? I see that emit_cmp_and_jump_insns eventually forces
> the lowpart to a register but still it should work.
Thanks for Xi Ruoyao's explanation, here I will add a specific diff example.
Following is the expand pass result of add above bug patch:
;; expand_simple_binop in loop where i = 0
(insn 13 12 14 (set (reg:DI 87)
(plus:DI (subreg:DI (reg:SI 86) 0)
(const_int -45 [0xffffffffffffffd3]))))
;; emit_cmp_and_jump_insns in loop where i = 0
(insn 14 13 15 (set (reg:DI 88)
(sign_extend:DI (subreg:SI (reg:DI 87) 0))))
(jump_insn 15 14 16 (set (pc)
(if_then_else (ne (reg:DI 88)
(const_int 0 [0]))
(label_ref 18)
(pc))))
...
;; expand_simple_binop in loop where i = 1
(insn 17 16 18 (set (reg:DI 90)
(plus:DI (subreg:DI (reg:SI 89) 0)
(const_int 0 [0]))))
(code_label 18 17 19 3 (nil))
;; finish and return result -> subreg:SI (reg:DI 90) 0)
;; next insn
(insn 19 18 0 (set (reg:DI 80 [ _1 ])
(sign_extend:DI (subreg:SI (reg:DI 90) 0))))
The problem is in insn 17, because the compare result is set in reg:DI 90,
which
is diff with compare result(loop0) set in reg:DI 87. The reason for this is
because after loop0 the rtx result = subreg:SI (reg:DI 87) 0), then in
expand_simple_binop function, due to no addsi3 insn pattern, gcc promote mode
and match adddi3 and finally return temporary register (reg:DI 90), and lowpart
it to match mode so after loop1 the result = (subreg:SI (reg:DI 90)).
The way to solve this kind of problem is to record the result of the previous
loop comparison in loop1 and later, and when the comparison result in the
current loop is not the same as the previous one, correct it by `move`, so that
no matter which branch it jumps from, the result is correct.
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Bug other/109086] __builtin_strcmp generates wrong code if expand_simple_binop assigns new pseudo-register for result
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
` (9 preceding siblings ...)
2023-03-15 4:52 ` liwei at loongson dot cn
@ 2023-03-15 7:33 ` xry111 at gcc dot gnu.org
2023-03-15 7:36 ` rguenth at gcc dot gnu.org
` (2 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: xry111 at gcc dot gnu.org @ 2023-03-15 7:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
Xi Ruoyao <xry111 at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |xry111 at gcc dot gnu.org
Target|loongarch |
Summary|Bug of builtin_strcmp in |__builtin_strcmp generates
|the case of using the |wrong code if
|adddi3 instruction patterns |expand_simple_binop assigns
| |new pseudo-register for
| |result
Status|UNCONFIRMED |ASSIGNED
Last reconfirmed| |2023-03-15
Ever confirmed|0 |1
Component|rtl-optimization |other
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Bug other/109086] __builtin_strcmp generates wrong code if expand_simple_binop assigns new pseudo-register for result
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
` (10 preceding siblings ...)
2023-03-15 7:33 ` [Bug other/109086] __builtin_strcmp generates wrong code if expand_simple_binop assigns new pseudo-register for result xry111 at gcc dot gnu.org
@ 2023-03-15 7:36 ` rguenth at gcc dot gnu.org
2023-03-15 9:28 ` cvs-commit at gcc dot gnu.org
2023-03-15 9:29 ` xry111 at gcc dot gnu.org
13 siblings, 0 replies; 15+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-15 7:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so the issue is that expand_simple_binop will not necessarily return a
register and emit_cmp_and_jump_insns eventually mangles things.
The proposed fix looks OK, but please add a comment like
/* Force the result into a register. */
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Bug other/109086] __builtin_strcmp generates wrong code if expand_simple_binop assigns new pseudo-register for result
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
` (11 preceding siblings ...)
2023-03-15 7:36 ` rguenth at gcc dot gnu.org
@ 2023-03-15 9:28 ` cvs-commit at gcc dot gnu.org
2023-03-15 9:29 ` xry111 at gcc dot gnu.org
13 siblings, 0 replies; 15+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-03-15 9:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Xi Ruoyao <xry111@gcc.gnu.org>:
https://gcc.gnu.org/g:45641f3a99281bb0a429649741a29c2aace4c63e
commit r13-6690-g45641f3a99281bb0a429649741a29c2aace4c63e
Author: Xi Ruoyao <xry111@xry111.site>
Date: Wed Mar 15 15:34:52 2023 +0800
builtins: Move the character difference into result instead of reassigning
result [PR109086]
expand_simple_binop() is allowed to allocate a new pseudo-register and
return it, instead of forcing the result into the provided
pseudo-register. This can cause a problem when we expand the unrolled
loop for __builtin_strcmp: the compiler always generates code for all n
iterations of the loop, so "result" will be an alias of the
pseudo-register allocated and used in the last iteration; but at runtime
the loop can break early, causing this pseudo-register uninitialized.
Emit a move instruction in the iteration to force the difference into
one register which has been allocated before the loop, to avoid this
issue.
gcc/ChangeLog:
PR other/109086
* builtins.cc (inline_string_cmp): Force the character
difference into "result" pseudo-register, instead of reassign
the pseudo-register.
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Bug other/109086] __builtin_strcmp generates wrong code if expand_simple_binop assigns new pseudo-register for result
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
` (12 preceding siblings ...)
2023-03-15 9:28 ` cvs-commit at gcc dot gnu.org
@ 2023-03-15 9:29 ` xry111 at gcc dot gnu.org
13 siblings, 0 replies; 15+ messages in thread
From: xry111 at gcc dot gnu.org @ 2023-03-15 9:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109086
Xi Ruoyao <xry111 at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
--- Comment #13 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
Pushed the patch with a comment explaining the if (result != diff) block.
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2023-03-15 9:29 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-10 7:36 [Bug target/109086] New: Bug of builtin_strcmp in the case of using the adddi3 instruction patterns liwei at loongson dot cn
2023-03-10 7:38 ` [Bug target/109086] " liwei at loongson dot cn
2023-03-10 8:18 ` [Bug rtl-optimization/109086] " rguenth at gcc dot gnu.org
2023-03-10 9:31 ` liwei at loongson dot cn
2023-03-10 13:12 ` rguenth at gcc dot gnu.org
2023-03-14 14:59 ` xry111 at gcc dot gnu.org
2023-03-15 2:00 ` xry111 at gcc dot gnu.org
2023-03-15 2:04 ` xry111 at gcc dot gnu.org
2023-03-15 2:17 ` liwei at loongson dot cn
2023-03-15 3:02 ` xry111 at gcc dot gnu.org
2023-03-15 4:52 ` liwei at loongson dot cn
2023-03-15 7:33 ` [Bug other/109086] __builtin_strcmp generates wrong code if expand_simple_binop assigns new pseudo-register for result xry111 at gcc dot gnu.org
2023-03-15 7:36 ` rguenth at gcc dot gnu.org
2023-03-15 9:28 ` cvs-commit at gcc dot gnu.org
2023-03-15 9:29 ` xry111 at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).