From: Andrew Stubbs
To: Jakub Jelinek
CC: Richard Biener
Date: Thu, 23 Feb 2023 10:02:59 +0000
Subject: Re: [PATCH 3/3] vect: inbranch SIMD clones
References: <8022b190-387b-c6a9-a8fe-1d18a9140e93@codesourcery.com> <15d420d5-45da-d4d5-13e2-e6ca7691e096@codesourcery.com>

On 10/02/2023 09:11, Jakub Jelinek wrote:
>> I've
>> tried to fix the -flto thing and I can't figure out how. The problem
>> seems to be that there are two dump files from the two compiler invocations
>> and it scans the wrong one. Aarch64 has the same problem.
>
> Two dumps are because it is in a dg-do run test.
> I think it would be better to separate it, have for all cases one
> test with defaulted dg-do (in vect.exp that is either dg-do run or dg-do
> compile:
> # If the target system supports vector instructions, the default action
> # for a test is 'run', otherwise it's 'compile'.
> ) without the dg-final and then another one with the same TYPE which would
> be forcibly dg-do compile with dg-final and
> dg-additional-options "-ffat-lto-objects", then you get a single dump only.

If I change the testcase to "dg-do compile" then it does indeed only
produce one dump, but it's still the wrong one. The command it runs is
this (I removed some noise):

x86_64-none-linux-gnu-gcc vect-simd-clone-16.c -flto -ffat-lto-objects \
  -msse2 -ftree-vectorize -fno-tree-loop-distribute-patterns \
  -fno-vect-cost-model -fno-common -O2 \
  -fdump-tree-vect-details -fopenmp-simd -mavx

With "-S" (dg-do compile) I get

   vect-simd-clone-16.c.172t.vect

Otherwise (dg-do run) I get

   a-vect-simd-clone-16.c.172t.vect
   a.ltrans0.ltrans.172t.vect

The "ltrans0" dump has the "foo.simdclone" output that we're looking
for, but dejagnu appears to be scanning the other, which does not. The
filenames vary between the two commands, but the contents are identical.

>>>> +/* { dg-final { scan-tree-dump-times "simdclone" 18 "optimized" { target x86_64-*-* } } } */
>>>> +/* { dg-final { scan-tree-dump-times "simdclone" 7 "optimized" { target amdgcn-*-* } } } */
>>>
>>> And scan-tree-dump-times " = foo.simdclone" 2 "optimized"; I'd think that
>>> should be the right number for all of x86_64, amdgcn and aarch64. And
>>> please don't forget about i?86-*-* too.
>>
>> I've switched the pattern and changed to using the "vect" dump (instead of
>> "optimized") so that the later transformations don't mess up the counts.
>> However there are still other reasons why the count varies. It might be that
>> those can be turned off by options somehow, but probably testing those cases
>> is valuable too. The values are 2, 3, or 4, now, instead of 18, so that's an
>> improvement.
>
> But still varies between the architectures, so it is an extra maintenance
> nightmare.
>
>>>> +/* TODO: aarch64 */
>>>
>>> For aarch64, one would need to include it in check_effective_target_vect_simd_clones
>>> first...
>>
>> I've done so and tested it, but that's not included in the patch because
>> there were other testcases that started reporting fails. None of the new
>> testcases fail for Aarch64.
>
> Sure, that would be for a separate patch.
>
> Anyway, if you want, commit the patch as is and tweak the testcases if
> possible incrementally.

I will do so now. It would be nice to fix the testcase oddities, but I
don't know how.

I wrote the above yesterday, but apparently the email didn't send ...
since then some bugs have been reported. I'll try to investigate today,
although I think Richi has a fix already.

Thanks

Andrew