* [Bug testsuite/99626] [10/11 regression] gcc.dg/strlenopt-73.c fails for 32 bits
2021-03-17 14:18 [Bug testsuite/99626] New: [10/11 regression] gcc.dg/strlenopt-73.c fails for 32 bits seurer at gcc dot gnu.org
2021-03-17 15:33 ` [Bug testsuite/99626] " rguenth at gcc dot gnu.org
@ 2021-03-17 15:34 ` jakub at gcc dot gnu.org
2021-03-18 15:14 ` cvs-commit at gcc dot gnu.org
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-17 15:34 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99626
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Doesn't FAIL on i686-linux.
I wonder if it is SLOW_UNALIGNED_ACCESS or something similar that for powerpc64
-m32 causes a lot of memcpy calls not to be folded.
grep memcpy strlenopt-73.c.023t.ssa
memcpy (pa_41, iftmp.0_21, 17);
memcpy (pa_49, iftmp.2_22, 17);
memcpy (pa_56, iftmp.4_23, 16);
memcpy (pa_63, iftmp.6_24, 15);
memcpy (pa_78, iftmp.10_26, 32);
memcpy (pa_85, iftmp.12_27, 31);
memcpy (pa_92, iftmp.14_28, 30);
is the same on i686 and powerpc64 -m64, while powerpc64 -m32 shows
grep memcpy strlenopt-73.c.023t.ssa
memcpy (pa_41, iftmp.0_21, 17);
memcpy (pa_49, iftmp.2_22, 17);
memcpy (pa_56, iftmp.4_23, 16);
memcpy (pa_63, iftmp.6_24, 15);
memcpy (pa_78, iftmp.10_26, 32);
memcpy (pa_85, iftmp.12_27, 31);
memcpy (pa_92, iftmp.14_28, 30);
memcpy (pa_25, iftmp.20_13, 8);
memcpy (pa_33, iftmp.22_14, 8);
memcpy (pa_40, iftmp.24_15, 8);
memcpy (pa_47, iftmp.26_16, 8);
memcpy (pa_54, iftmp.28_17, 8);
memcpy (pa_61, iftmp.30_18, 8);
The test_copy_cond_unequal_length_i128 function has the following misleading comment:
#if __i386__ && __SIZEOF_INT128__ == 16
/* The following tests assume GCC transforms the memcpy calls into
int128_t assignments which it does only on targets that define
the MOVE_MAX macro to 16. That's only s390 and i386 with
int128_t support. */
I bet it is never tested, because __int128 isn't supported on 32-bit targets.
But __i386__ is defined only on 32-bit x86, so perhaps it meant to use
__x86_64__ define instead?
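To make the point about the guard concrete, here is a minimal sketch that
evaluates both conditions with the compiler's predefined macros. The function
names are hypothetical and exist only for this illustration; the macro
conditions are the one quoted above and the one suggested as a replacement.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__x86_64__) && defined(__SIZEOF_INT128__) && __SIZEOF_INT128__ == 16
/* Only reached where the suggested guard is active. */
static_assert(sizeof(__int128) == 16, "__int128 is expected to be 16 bytes here");
#endif

/* Hypothetical helper: true when the guard as currently written in the
   test (__i386__ plus a 16-byte __int128) would enable the i128 test.  */
bool original_guard_enabled(void)
{
#if defined(__i386__) && defined(__SIZEOF_INT128__) && __SIZEOF_INT128__ == 16
  return true;
#else
  return false;
#endif
}

/* Hypothetical helper: true when the suggested guard (__x86_64__ plus a
   16-byte __int128) would enable the i128 test.  */
bool suggested_guard_enabled(void)
{
#if defined(__x86_64__) && defined(__SIZEOF_INT128__) && __SIZEOF_INT128__ == 16
  return true;
#else
  return false;
#endif
}
```

With GCC, original_guard_enabled () should return false on every target,
because 32-bit x86 does not provide __int128 and 64-bit x86 does not define
__i386__.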
And test_copy_cond_unequal_length_i64 is essentially the same except with
smaller size, so it again relies on targets transforming the memcpy calls to
long long assignments.
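As a rough illustration of the transformation the test depends on, the sketch
below shows an 8-byte memcpy next to the single load + store it can be folded
into. This is not code from strlenopt-73.c; the function names and the typedef
are hypothetical, and the may_alias/aligned attributes are a GNU extension
used here to keep the hand-written assignment well defined.

```c
#include <assert.h>
#include <string.h>

static_assert(sizeof(long long) == 8, "sketch assumes an 8-byte long long");

/* GNU extension: a long long that may alias any type and needs no
   particular alignment, so it can be read from a char buffer.  */
typedef long long unaligned_ll __attribute__((may_alias, aligned(1)));

/* The form written in the source: a library call copying 8 bytes.
   Whether the compiler folds it early depends on the target.  */
void copy8_memcpy(char *dst, const char *src)
{
  memcpy(dst, src, 8);
}

/* Roughly what the fold produces: one 8-byte load and one 8-byte store.  */
void copy8_assign(char *dst, const char *src)
{
  *(unaligned_ll *) dst = *(const unaligned_ll *) src;
}
```

Both functions copy the same 8 bytes; the difference the test cares about is
only which form the later optimization passes get to see.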
And there are a lot of targets that define MOVE_MAX to 4 or smaller:
config/arc/arc.h:#define MOVE_MAX 4
config/arm/arm.h:#define MOVE_MAX 4
config/c6x/c6x.h:#define MOVE_MAX 4
config/cr16/cr16.h:#define MOVE_MAX 4
config/cris/cris.h:#define MOVE_MAX 4
config/csky/csky.h:#define MOVE_MAX 4
config/ft32/ft32.h:#define MOVE_MAX 4
config/h8300/h8300.h:#define MOVE_MAX 4
config/iq2000/iq2000.h:#define MOVE_MAX 4
config/lm32/lm32.h:#define MOVE_MAX UNITS_PER_WORD
config/m32c/m32c.h:#define MOVE_MAX 4
config/m32r/m32r.h:#define MOVE_MAX 4
config/m68k/m68k.h:#define MOVE_MAX 4
config/mcore/mcore.h:#define MOVE_MAX 4
config/microblaze/microblaze.h:#define MOVE_MAX 4
config/mn10300/mn10300.h:#define MOVE_MAX 4
config/moxie/moxie.h:#define MOVE_MAX 4
config/nds32/nds32.h:#define MOVE_MAX 4
config/nios2/nios2.h:#define MOVE_MAX 4
config/or1k/or1k.h:#define MOVE_MAX 4
config/pdp11/pdp11.h:#define MOVE_MAX 2
config/rl78/rl78.h:#define MOVE_MAX 2
config/rs6000/rs6000.h:#define MOVE_MAX (! TARGET_POWERPC64 ? 4 : 8)
config/rx/rx.h:#define MOVE_MAX 4
config/sh/sh.h:#define MOVE_MAX (4)
config/stormy16/stormy16.h:#define MOVE_MAX 2
config/v850/v850.h:#define MOVE_MAX 4
config/visium/visium.h:#define MOVE_MAX 4
config/xtensa/xtensa.h:#define MOVE_MAX 4
So IMNSHO that function should be compiled only on a couple of targets known to
fold memcpy (, , 8);
to the assignments.
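A guard along those lines could look like the sketch below. The target list is
an assumption taken from the ChangeLog entry of the follow-up commit in this
thread (x86, aarch64, s390 and 64-bit powerpc); the macro and function names
are hypothetical, and the exact condition used in the testcase may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed list of targets where memcpy (, , 8) is folded into a
   load + store, i.e. where MOVE_MAX is at least 8.  */
#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__) \
    || defined(__s390__) || defined(__powerpc64__)
# define ASSUME_MEMCPY8_FOLDED 1
#else
# define ASSUME_MEMCPY8_FOLDED 0
#endif

/* The 8-byte test copies long long sized objects.  */
static_assert(ASSUME_MEMCPY8_FOLDED == 0 || sizeof(long long) == 8,
              "targets on the list are expected to have an 8-byte long long");

/* Hypothetical helper reporting whether the 8-byte test would be built.  */
bool memcpy8_fold_expected(void)
{
  return ASSUME_MEMCPY8_FOLDED != 0;
}
```

In the testcase itself the same condition would simply wrap the definition of
test_copy_cond_unequal_length_i64 in #if ... #endif.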
* [Bug testsuite/99626] [10/11 regression] gcc.dg/strlenopt-73.c fails for 32 bits
2021-03-17 14:18 [Bug testsuite/99626] New: [10/11 regression] gcc.dg/strlenopt-73.c fails for 32 bits seurer at gcc dot gnu.org
2021-03-17 15:33 ` [Bug testsuite/99626] " rguenth at gcc dot gnu.org
2021-03-17 15:34 ` jakub at gcc dot gnu.org
@ 2021-03-18 15:14 ` cvs-commit at gcc dot gnu.org
2021-03-18 15:22 ` [Bug testsuite/99626] [10 Regression] " jakub at gcc dot gnu.org
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-03-18 15:14 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99626
--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:fff9faa79043aa53d361e7f6e31b2680007a97e2
commit r11-7718-gfff9faa79043aa53d361e7f6e31b2680007a97e2
Author: Jakub Jelinek <jakub@redhat.com>
Date: Thu Mar 18 16:11:46 2021 +0100
testsuite: Fix up strlenopt-73.c on powerpc [PR99626]
As mentioned in the testcase as well as in the PR, this testcase relies on
MOVE_MAX being sufficiently large that the memcpy call is folded early into
load + store. Some popular targets define MOVE_MAX to 8 or even 16 (e.g.
x86_64 or some options on s390x), but many other targets define it to just 4
(e.g. powerpc 32-bit), or even 2.
The testcase already has one test routine guarded on one particular target
with MOVE_MAX 16, but does so incorrectly: __i386__ is only defined on
32-bit x86 and __SIZEOF_INT128__ is only defined on 64-bit targets. This
patch fixes that, and also guards another test that relies on memcpy (, , 8)
being folded that way (which therefore needs MOVE_MAX >= 8), so that it is
built only on a couple of common targets known to have such a MOVE_MAX.
2021-03-18 Jakub Jelinek <jakub@redhat.com>
PR testsuite/99626
* gcc.dg/strlenopt-73.c: Ifdef out test_copy_cond_unequal_length_i64
on targets other than x86, aarch64, s390 and 64-bit powerpc. Use
test_copy_cond_unequal_length_i128 for __x86_64__ with int128 support
rather than __i386__.
* [Bug testsuite/99626] [10 Regression] gcc.dg/strlenopt-73.c fails for 32 bits
2021-03-17 14:18 [Bug testsuite/99626] New: [10/11 regression] gcc.dg/strlenopt-73.c fails for 32 bits seurer at gcc dot gnu.org
` (3 preceding siblings ...)
2021-03-18 15:22 ` [Bug testsuite/99626] [10 Regression] " jakub at gcc dot gnu.org
@ 2021-03-19 23:30 ` cvs-commit at gcc dot gnu.org
2021-03-20 8:10 ` jakub at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-03-19 23:30 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99626
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:c9f698dce2ebdd16997a8d41d6698a2180775671
commit r10-9489-gc9f698dce2ebdd16997a8d41d6698a2180775671
Author: Jakub Jelinek <jakub@redhat.com>
Date: Thu Mar 18 16:11:46 2021 +0100
testsuite: Fix up strlenopt-73.c on powerpc [PR99626]
As mentioned in the testcase as well as in the PR, this testcase relies on
MOVE_MAX being sufficiently large that the memcpy call is folded early into
load + store. Some popular targets define MOVE_MAX to 8 or even 16 (e.g.
x86_64 or some options on s390x), but many other targets define it to just 4
(e.g. powerpc 32-bit), or even 2.
The testcase already has one test routine guarded on one particular target
with MOVE_MAX 16, but does so incorrectly: __i386__ is only defined on
32-bit x86 and __SIZEOF_INT128__ is only defined on 64-bit targets. This
patch fixes that, and also guards another test that relies on memcpy (, , 8)
being folded that way (which therefore needs MOVE_MAX >= 8), so that it is
built only on a couple of common targets known to have such a MOVE_MAX.
2021-03-18 Jakub Jelinek <jakub@redhat.com>
PR testsuite/99626
* gcc.dg/strlenopt-73.c: Ifdef out test_copy_cond_unequal_length_i64
on targets other than x86, aarch64, s390 and 64-bit powerpc. Use
test_copy_cond_unequal_length_i128 for __x86_64__ with int128 support
rather than __i386__.
(cherry picked from commit fff9faa79043aa53d361e7f6e31b2680007a97e2)