Here's a new version of the patch.

On 01/12/2022 14:16, Jakub Jelinek wrote:
>> +void __attribute__((noinline))
>
> You should use noipa attribute instead of noinline on callers
> which aren't declare simd (on declare simd it would prevent cloning
> which is essential for the declare simd behavior), so that you don't
> get surprises e.g. from extra ipa cp etc.

Fixed.

>> +/* Ensure the the in-branch simd clones are used on targets that support
>> + them. These counts include all call and definitions. */
>> +
>> +/* { dg-skip-if "" { x86_64-*-* } { "-flto" } { "" } } */
>
> Drop lines line above.

I don't want to drop the comment because I get so frustrated by
testcases that fail when something changes and it's not obvious what
the original author was actually trying to test.

I've tried to fix the -flto thing and I can't figure out how. The
problem seems to be that there are two dump files from the two compiler
invocations and it scans the wrong one. Aarch64 has the same problem.

>> +/* { dg-final { scan-tree-dump-times "simdclone" 18 "optimized" { target x86_64-*-* } } } */
>> +/* { dg-final { scan-tree-dump-times "simdclone" 7 "optimized" { target amdgcn-*-* } } } */
>
> And scan-tree-dump-times " = foo.simdclone" 2 "optimized"; I'd think that
> should be the right number for all of x86_64, amdgcn and aarch64. And
> please don't forget about i?86-*-* too.

I've switched the pattern and changed to using the "vect" dump (instead
of "optimized") so that the later transformations don't mess up the
counts. However there are still other reasons why the count varies. It
might be that those can be turned off by options somehow, but probably
testing those cases is valuable too. The values are 2, 3, or 4, now,
instead of 18, so that's an improvement.

>
>> +/* TODO: aarch64 */
>
> For aarch64, one would need to include it in check_effective_target_vect_simd_clones
> first...

I've done so and tested it, but that's not included in the patch
because there were other testcases that started reporting fails. None
of the new testcases fail for Aarch64.

OK now?

Andrew