public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/61810] New: init-regs.c papers over issues elsewhere
@ 2014-07-15 13:07 rguenth at gcc dot gnu.org
2021-08-11 14:21 ` [Bug target/61810] " rguenth at gcc dot gnu.org
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-07-15 13:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61810
Bug ID: 61810
Summary: init-regs.c papers over issues elsewhere
Product: gcc
Version: 4.10.0
Status: UNCONFIRMED
Keywords: missed-optimization, wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target: x86_64-*-*, i?86-*-*
Earlier this year Uli complained about undefined uses generating
code in the context of implementing _mm_undefined x86 intrinsics
in the "GCC way". Example:
extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__,
__artificial__))
_mm_undefined_pd (void)
{
__m128d __Y = __Y;
return __Y;
}
and the culprit was found to be the init-regs pass which initializes
all must-undefined but used pseudos with zero. That has been
introduced with the dataflow-merge with some big handwaving
comment (and no testcase as far as we could see):
/* Check all of the uses of pseudo variables. If any use that is MUST
uninitialized, add a store of 0 immediately before it. For
subregs, this makes combine happy. For full word regs, this makes
other optimizations, like the register allocator and the reg-stack
happy as well as papers over some problems on the arm and other
processors where certain isa constraints cannot be handled by gcc.
These are of the form where two operands to an insn my not be the
same. The ra will only make them the same if they do not
interfere, and this can only happen if one is not initialized.
There is also the unfortunate consequence that this may mask some
buggy programs where people forget to initialize stack variable.
Any programmer with half a brain would look at the uninitialized
variable warnings. */
Of course earlier this year it wasn't the appropriate time to
kill off init-regs (which doesn't run at -O0 btw...). But now
it is.
All of the issues in the comment at the top of init-regs.c
point to issues elsewhere - papering over them by means
of init-regs.c isn't a correct solution - especially as
passes between init-regs and $issue may expose must-undefined
but used pseudos (if-conversion for example).
Disabling init-regs.c by adjusting its gate to always return 0 causes
FAIL: gcc.dg/vect/vect-strided-a-u8-i8-gap7.c execution test
on x86_64 and
FAIL: gcc.dg/vect/vect-strided-a-u16-i4.c execution test
FAIL: gcc.dg/vect/vect-strided-a-u8-i8-gap7.c execution test
FAIL: gcc.dg/vect/vect-strided-u8-i8-gap7-big-array.c execution test
FAIL: gcc.dg/vect/vect-strided-a-u16-i4.c -flto -ffat-lto-objects execution
test
FAIL: gcc.dg/vect/vect-strided-a-u8-i8-gap7.c -flto -ffat-lto-objects execution
test
on x86_64 -m32
no other FAILs (the comment hints at more issues on other targets).
Patch to disable init-regs (apart from the above bootstraps/tests ok
on x86_64)
Index: gcc/init-regs.c
===================================================================
--- gcc/init-regs.c (revision 212520)
+++ gcc/init-regs.c (working copy)
@@ -147,7 +147,7 @@ public:
{}
/* opt_pass methods: */
- virtual bool gate (function *) { return optimize > 0; }
+ virtual bool gate (function *) { return 0; }
virtual unsigned int execute (function *)
{
initialize_uninitialized_regs ();
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/61810] init-regs.c papers over issues elsewhere
2014-07-15 13:07 [Bug target/61810] New: init-regs.c papers over issues elsewhere rguenth at gcc dot gnu.org
@ 2021-08-11 14:21 ` rguenth at gcc dot gnu.org
2021-11-29 3:58 ` pinskia at gcc dot gnu.org
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-08-11 14:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61810
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Re-running the experiment of disabling init-regs on x86_64 on trunk shows
+FAIL: gcc.dg/lto/pr48622 c_lto_pr48622_0.o-c_lto_pr48622_0.o link, -O -flto
-fi
nline-small-functions -fno-early-inlining
FAIL: gcc.dg/tree-prof/20050826-2.c scan-tree-dump-not dom2 "Invalid sum"
+FAIL: gcc.target/i386/extract-insert-combining.c scan-assembler-times
(?:vmovd|
movd)[ \\\\t]+[^{\\n]*%xmm[0-9] 3
+FAIL: gcc.target/i386/extract-insert-combining.c scan-assembler-times
(?:vpinsr
d|pinsrd)[ \\\\t]+[^{\\n]*%xmm[0-9] 1
+FAIL: gnat.dg/sso8.adb execution test
with both -m64 and -m32
The gcc.dg/lto/pr48622 failure is a link-fail:
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld:
/tmp/cc8guozm.ltrans0.ltrans.o: in function `main':^M
<artificial>:(.text+0x18): undefined reference to `ashift_qi_1'^M
collect2: error: ld returned 1 exit status^M
compiler exited with status 1
I think the testcase is broken - with initregs likely the
int
main ()
{
if (ashift_qi_0 (0xff) != (u8) ((u8) 0xff << 0))
abort ();
test directly resolved to abort (), leaving the rest of the code dead.
The gcc.target/i386/extract-insert-combining.c looks like a combine
missed optimization when facing uninitialized regs compared to all-zero.
We get
pinsrd $0, %esi, %xmm0
pinsrd $0, %edi, %xmm1
movl %esi, -12(%rsp)
paddd %xmm0, %xmm1
pinsrd $0, %esi, %xmm0
paddd %xmm1, %xmm0
movd %xmm0, %eax
ret
preserving the "uninitialized" state of %xmm0 high, when initregs
explicitely zeros %xmm0 then this is matched to movd.
I cannot assess what goes wrong with gnat.dg/sso8.adb, but it might be
a testsuite bug as well.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/61810] init-regs.c papers over issues elsewhere
2014-07-15 13:07 [Bug target/61810] New: init-regs.c papers over issues elsewhere rguenth at gcc dot gnu.org
2021-08-11 14:21 ` [Bug target/61810] " rguenth at gcc dot gnu.org
@ 2021-11-29 3:58 ` pinskia at gcc dot gnu.org
2022-05-20 9:58 ` rguenth at gcc dot gnu.org
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-11-29 3:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61810
--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577192.html
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/61810] init-regs.c papers over issues elsewhere
2014-07-15 13:07 [Bug target/61810] New: init-regs.c papers over issues elsewhere rguenth at gcc dot gnu.org
2021-08-11 14:21 ` [Bug target/61810] " rguenth at gcc dot gnu.org
2021-11-29 3:58 ` pinskia at gcc dot gnu.org
@ 2022-05-20 9:58 ` rguenth at gcc dot gnu.org
2022-05-20 12:54 ` amonakov at gcc dot gnu.org
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-05-20 9:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61810
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |hjl.tools at gmail dot com
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #6)
> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577192.html
On current trunk x86_64 that gets
FAIL: gcc.target/i386/extract-insert-combining.c scan-assembler-times
(?:vmovd|movd)[ \\\\t]+[^{\\n]*%xmm[0-9] 3
FAIL: gcc.target/i386/extract-insert-combining.c scan-assembler-times
(?:vpinsrd|pinsrd)[ \\\\t]+[^{\\n]*%xmm[0-9] 1
FAIL: gcc.target/i386/pr104441-1b.c execution test
FAIL: gcc.target/i386/pr98335.c scan-assembler movzbl
FAIL: gcc.target/i386/pr98335.c scan-assembler-not movb
FAIL: gnat.dg/sso8.adb execution test
FAIL: libgomp.c/loop-19.c execution test
FAILs can be reproduced in an unpatched tree with specifying
-fdisable-rtl-init-regs
Assembly difference for gcc.target/i386/pr104441-1b.c is (besides RA):
- vpxor %xmm1, %xmm1, %xmm1
+ vpinsrd $1, (%rax,%r10), %xmm5, %xmm1
+ vpinsrd $1, (%rdx,%r9), %xmm4, %xmm3
vmovd (%rax), %xmm0
- vpxor %xmm2, %xmm2, %xmm2
addl $4, %ecx
- vpinsrd $1, (%rax,%r10), %xmm1, %xmm1
- vpinsrd $1, (%rdx,%r9), %xmm2, %xmm2
adding initialization in compute4x_m_sad_avx2_intrin of reg 109 at in block 4
for insn 33.
adding initialization in compute4x_m_sad_avx2_intrin of reg 99 at in block 4
for insn 48.
where we have for example
-(insn 97 31 98 4 (clobber (reg/v:V2DI 109 [ src23 ]))
"/home/rguenther/obj-gcc4-g/gcc/include/smmintrin.h":408:20 -1
- (nil))
-(insn 98 97 33 4 (set (reg/v:V2DI 109 [ src23 ])
- (const_vector:V2DI [
- (const_int 0 [0]) repeated x2
- ])) "/home/rguenther/obj-gcc4-g/gcc/include/smmintrin.h":408:20 -1
- (nil))
-(insn 33 98 36 4 (set (reg:V4SI 138 [ src23 ])
+(insn 33 31 36 4 (set (reg:V4SI 138 [ src23 ])
(vec_merge:V4SI (vec_duplicate:V4SI (reg:SI 137 [ MEM[(int32_t
*)src_62 + _41 * 1] ]))
(subreg:V4SI (reg/v:V2DI 109 [ src23 ]) 0)
(const_int 2 [0x2])))
"/home/rguenther/obj-gcc4-g/gcc/include/smmintrin.h":408:20 6925
{sse4_1_pinsrd}
where this produces { undef, MEM, undef, undef } without init-regs
But it looks like the testcase is broken:
__attribute__((always_inline, target("avx2")))
static __m256i
load8bit_4x4_avx2(const uint8_t *const src, const uint32_t stride)
{
__m128i src01, src23;
src01 = _mm_cvtsi32_si128(*(int32_t*)(src + 0 * stride));
src23 = _mm_insert_epi32(src23, *(int32_t *)(src + 3 * stride), 1);
return _mm256_setr_m128i(src01, src23);
}
it seems to expect that src23 is zero before inserting the data?
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/61810] init-regs.c papers over issues elsewhere
2014-07-15 13:07 [Bug target/61810] New: init-regs.c papers over issues elsewhere rguenth at gcc dot gnu.org
` (2 preceding siblings ...)
2022-05-20 9:58 ` rguenth at gcc dot gnu.org
@ 2022-05-20 12:54 ` amonakov at gcc dot gnu.org
2022-05-20 18:10 ` hjl.tools at gmail dot com
2022-05-23 6:07 ` rguenth at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: amonakov at gcc dot gnu.org @ 2022-05-20 12:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61810
Alexander Monakov <amonakov at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |amonakov at gcc dot gnu.org
--- Comment #8 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #7)
> But it looks like the testcase is broken:
>
> __attribute__((always_inline, target("avx2")))
> static __m256i
> load8bit_4x4_avx2(const uint8_t *const src, const uint32_t stride)
> {
> __m128i src01, src23;
> src01 = _mm_cvtsi32_si128(*(int32_t*)(src + 0 * stride));
> src23 = _mm_insert_epi32(src23, *(int32_t *)(src + 3 * stride), 1);
> return _mm256_setr_m128i(src01, src23);
> }
>
> it seems to expect that src23 is zero before inserting the data?
If you look in the original PR 104441 testcase, it has sensible code:
static __m256i __attribute__((always_inline))
load8bit_4x4_avx2(const uint8_t *const src, const uint32_t stride)
{
__m128i src01, src23;
src01 = _mm_cvtsi32_si128(*(int32_t*)(src + 0 * stride));
src01 = _mm_insert_epi32(src01, *(int32_t *)(src + 1 * stride), 1);
src23 = _mm_cvtsi32_si128(*(int32_t*)(src + 2 * stride));
src23 = _mm_insert_epi32(src23, *(int32_t *)(src + 3 * stride), 1);
return _mm256_setr_m128i(src01, src23);
}
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/61810] init-regs.c papers over issues elsewhere
2014-07-15 13:07 [Bug target/61810] New: init-regs.c papers over issues elsewhere rguenth at gcc dot gnu.org
` (3 preceding siblings ...)
2022-05-20 12:54 ` amonakov at gcc dot gnu.org
@ 2022-05-20 18:10 ` hjl.tools at gmail dot com
2022-05-23 6:07 ` rguenth at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: hjl.tools at gmail dot com @ 2022-05-20 18:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61810
--- Comment #9 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 53008
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53008&action=edit
A patch for pr104441-1a.c
Does it help?
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug target/61810] init-regs.c papers over issues elsewhere
2014-07-15 13:07 [Bug target/61810] New: init-regs.c papers over issues elsewhere rguenth at gcc dot gnu.org
` (4 preceding siblings ...)
2022-05-20 18:10 ` hjl.tools at gmail dot com
@ 2022-05-23 6:07 ` rguenth at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-05-23 6:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61810
--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #9)
> Created attachment 53008 [details]
> A patch for pr104441-1a.c
>
> Does it help?
Yes, that fixes the issue.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2022-05-23 6:07 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-15 13:07 [Bug target/61810] New: init-regs.c papers over issues elsewhere rguenth at gcc dot gnu.org
2021-08-11 14:21 ` [Bug target/61810] " rguenth at gcc dot gnu.org
2021-11-29 3:58 ` pinskia at gcc dot gnu.org
2022-05-20 9:58 ` rguenth at gcc dot gnu.org
2022-05-20 12:54 ` amonakov at gcc dot gnu.org
2022-05-20 18:10 ` hjl.tools at gmail dot com
2022-05-23 6:07 ` rguenth at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).