From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=mOkb=AE=arm.com=andre.simoesdiasvieira@sourceware.org>
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
	by sourceware.org (Postfix) with ESMTP id AA4393858D20;
	Thu, 13 Apr 2023 14:25:06 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AA4393858D20
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C0D5ED75;
	Thu, 13 Apr 2023 07:25:50 -0700 (PDT)
Received: from [10.57.68.11] (unknown [10.57.68.11])
	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 722253F73F;
	Thu, 13 Apr 2023 07:25:05 -0700 (PDT)
Message-ID: <c7ddab89-cb41-a373-649d-ae5f0db4a4a6@arm.com>
Date: Thu, 13 Apr 2023 15:25:00 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
 Thunderbird/102.9.0
Subject: Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c
 scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on
 Linux/x86_64
Content-Language: en-US
To: Richard Biener <richard.guenther@gmail.com>
Cc: Andrew Stubbs <ams@codesourcery.com>,
 "gcc-regression@gcc.gnu.org" <gcc-regression@gcc.gnu.org>,
 "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
 "haochen.jiang@intel.com" <haochen.jiang@intel.com>
References: <202304130148.33D1mmns1987590@shliclel4214.sh.intel.com>
 <PAXPR08MB655950805D323967FAE50EC588989@PAXPR08MB6559.eurprd08.prod.outlook.com>
 <e30e9b8f-f6d9-8ded-01a6-8ba42a98611c@codesourcery.com>
 <341dd608-a512-3c74-303d-1942876a3850@arm.com>
 <CAFiYyc1SD7tgatF1R0ARh-ENv8TbqOsdw-Lu8n4AGU7qzChZ+g@mail.gmail.com>
From: "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com>
In-Reply-To: <CAFiYyc1SD7tgatF1R0ARh-ENv8TbqOsdw-Lu8n4AGU7qzChZ+g@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,KAM_NUMSUBJECT,NICE_REPLY_A,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>


On 13/04/2023 15:00, Richard Biener wrote:
> On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>>
>>
>> On 13/04/2023 11:01, Andrew Stubbs wrote:
>>> Hi Andre,
>>>
>>> I don't have a cascadelake device to test on, nor any knowledge about
>>> what makes it different from regular x86_64.
>>
>> Not sure you need one, but yeah I don't know either, it looks like it
>> fails because:
>> in-branch vector clones are not yet supported for integer mask modes.
>>
>> A quick look tells me this is because mask_mode is not VOIDmode.
>> i386.cc's TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN will set
>> mask_mode to either DI or SI mode when TARGET_AVX512F. So I suspect
>> cascadelake is TARGET_AVX512F.
>>
>> This is where I bail out as I really don't want to dive into the target
>> specific simd clone handling of x86 ;)
>>
>>>
>>> If the cascadelake device is supposed to work the same as other x86_64
>>> devices for these vectors then the test has found a bug in the compiler
>>> and you should be looking to fix that, not fudge the testcase.
>>>
>>> Alternatively, if the device's capabilities really are different and the
>>> tests should behave differently, then the actual expectations need to be
>>> encoded in the dejagnu directives. If you can't tell the difference by
>>> looking at the "x86_64*-*-*" target selector alone then the correct
>>> solution is to invent a new "effective-target" selector. There are lots
>>> of examples of using these throughout the testsuite (you could use
>>> dg-require-effective-target to disable the whole testcase, or just use
>>> the name in the scan-tree-dump-times directive to customise the
>>> expectations), and the definitions can be found in the
>>> lib/target-supports.exp and lib/target-supports-dg.exp scripts. Some are
>>> fixed expressions and some run the compiler to probe the configuration,
>>> but in this case you probably want to do something with "check-flags".
>>
>> Even though I agree with you, I'm not the right person to do this
>> digging for such target specific stuff. So for now I'd probably suggest
>> xfailing this for avx512f.
>>>
>>> For the unroll problem, you can probably tweak the optimization options
>>> to disable that, the same as has been done for the epilogues feature
>>> that had the same problem.
>>
>> I mistaken the current behaviour for unrolling, it's actually because of
>> a latent bug. The vectorizer calls `vect_get_smallest_scalar_type` to
>> determine the vectype of a stmt. For a function like foo, that has the
>> same type (long long) everywhere this wouldn't be a problem, however,
>> because you transformed it into a MASK_CALL that has a function pointer
>> (which is 32-bit in -m32) that now becomes the 'smallest' type.
>>
>> This is all a red-herring though, I don't think we should be calling
>> this function for potential simdclone calls as the type on which the
>> veclen is not necessarily the 'smallest' type. And some arguments (like
>> uniform and linear) should be ignored anyway as they won't be mapped to
>> vectors.  So I do think this might have been broken even before your
>> changes, but needs further investigation.
>>> Since these are new tests for a new feature, I don't really understand
>>> why this is classed as a regression?
>>>
>>> Andrew
>>>
>>> P.S. there was a commit to these tests in the last few days, so make
>>> sure you pull that before making changes.
>>
>> The latest commit to these tests was mine, it's the one Haochen is
>> reporting this regression against. My commit was to fix the issue richi
>> had introduced that was preventing the feature you introduced from
>> working. The reason nobody noticed was because the tests you introduced
>> didn't actually test your feature, since you didn't specify 'inbranch'
>> the omp declare simd pragma was allowing the use of not-inbranch simd
>> clones and the vectorizer was being smart enough to circumvent the
>> conditional and was still able to use simdclones (non inbranch ones) so
>> when the inbranch stopped working, the test didn't notice.
>>
>> The other changes to this test were already after the fix for 108888
>> that broke the inbranch feature you added, and so it was fixing a
>> cascadelake testism but for the not-inbranch simdclones. So basically
>> fixing a testism of a testism :/
>>
>>
>> I am working on simdclone's for AArch64 for next Stage 1 so I don't mind
>> looking at the issue with the vectype being chosen wrongly, as for the
>> other x86 specific testisms I'll leave them to someone else.
> 
> Btw, the new testsuite FAILs could be just epiloge vectorizations, so
> maybe try the usual --param vect-epilogues-nomask=0 ...
It already has those, Jakub added them.

But that's not it, I've been looking at it, and there is code in place 
that does what I expected which is defer the choice of vectype for simd 
clones until vectorizable_simd_clone_call, unfortunately it has a 
mistaken assumption that simdclones don't return :/
see vect_get_vector_types_for_stmt:
...
   if (gimple_get_lhs (stmt) == NULL_TREE
       /* MASK_STORE has no lhs, but is ok.  */
       && !gimple_call_internal_p (stmt, IFN_MASK_STORE))
     {
       if (is_a <gcall *> (stmt))
         {
           /* Ignore calls with no lhs.  These must be calls to
              #pragma omp simd functions, and what vectorization factor
              it really needs can't be determined until
              vectorizable_simd_clone_call.  */
           if (dump_enabled_p ())
             dump_printf_loc (MSG_NOTE, vect_location,
                              "defer to SIMD clone analysis.\n");
           return opt_result::success ();
         }

       return opt_result::failure_at (stmt,
                                      "not vectorized: irregular 
stmt.%G", stmt);
     }
...

I'm working on a patch.
> 
>> Kind Regards,
>> Andre