Date: Mon, 9 Oct 2023 13:05:48 +0000 (UTC)
From: Richard Biener
To: Robin Dapp
Cc: Tamar Christina, gcc-patches
Subject: Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

On Mon, 9 Oct 2023, Robin Dapp wrote:

> > Hmm, the function is called at transform time so this shouldn't help
> > avoiding the ICE.  I expected we refuse to vectorize _any_ reduction
> > when sign-dependent rounding is in effect?  OTOH maybe sign-dependent
> > rounding is OK, but only when we use an unconditional fold-left
> > (so a loop mask from full masking is OK, but not an original COND_ADD?).
>
> So we currently only disable the use of partial vectors
>
>   else if (reduction_type == FOLD_LEFT_REDUCTION
>            && reduc_fn == IFN_LAST

aarch64 probably chokes because reduc_fn is not IFN_LAST.

>            && FLOAT_TYPE_P (vectype_in)
>            && HONOR_SIGNED_ZEROS (vectype_in)

so with your change we'd support signed zeros correctly.

>            && HONOR_SIGN_DEPENDENT_ROUNDING (vectype_in))
>       {
>         if (dump_enabled_p ())
>           dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>                            "can't operate on partial vectors because"
>                            " signed zeros cannot be preserved.\n");
>         LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
>
> which is inside a LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P block.
>
> For the fully masked case we continue (and then fail the assertion
> on aarch64 at transform time).
>
> I didn't get why that case is OK, though?  We still merge the initial
> definition with the identity/neutral op (i.e. possibly -0.0) based on
> the loop mask.  Is that different to partial masking?

I think the main point of my earlier change is that without native
support for a fold-left reduction (as on x86) we get

  ops = mask ? ops : neutral;
  acc += ops[0];
  acc += ops[1];
  ...

so we wouldn't use a COND_ADD but instead add neutral elements for the
masked elements.  That's OK for signed zeros after your change (great),
but not OK for sign-dependent rounding, because we can't decide on the
sign of the neutral zero then.

For the case of using an internal function, i.e. direct target support,
it should be OK to have sign-dependent rounding if we can use the
masked fold-left reduction op.  As we do

      /* On the first iteration the input is simply the scalar phi
         result, and for subsequent iterations it is the output of
         the preceding operation.  */
      if (reduc_fn != IFN_LAST || (mask && mask_reduc_fn != IFN_LAST))
        {
          if (mask && len && mask_reduc_fn == IFN_MASK_LEN_FOLD_LEFT_PLUS)
            new_stmt = gimple_build_call_internal (mask_reduc_fn, 5, reduc_var,
                                                   def0, mask, len, bias);
          else if (mask && mask_reduc_fn == IFN_MASK_FOLD_LEFT_PLUS)
            new_stmt = gimple_build_call_internal (mask_reduc_fn, 3, reduc_var,
                                                   def0, mask);
          else
            new_stmt = gimple_build_call_internal (reduc_fn, 2, reduc_var,
                                                   def0);

the last case should be able to assert !HONOR_SIGN_DEPENDENT_ROUNDING
(and likewise the reduc_fn == IFN_LAST case).

The quoted condition above should change to drop the HONOR_SIGNED_ZEROS
check, and the reduc_fn == IFN_LAST check should change, maybe to
internal_fn_mask_index (reduc_fn) == -1?

Richard.
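
As a footnote to the discussion above, here is a minimal, self-contained
C sketch of the emulated fold-left lowering and of why the sign of its
neutral zero cannot be fixed at compile time once sign-dependent rounding
is honored.  It is not taken from the patch or from GCC; the helper name
fold_left_plus and the test values are made up for illustration.  Build
with -frounding-math (and link with -lm for fesetround) so the additions
are evaluated at run time:

  /* Emulated (x86-style) fold-left reduction: masked-out lanes are
     replaced by a "neutral" zero before the strictly in-order adds.  */
  #include <fenv.h>
  #include <stdbool.h>
  #include <stdio.h>

  static double
  fold_left_plus (double acc, const double *ops, const bool *mask,
                  int n, double neutral)
  {
    for (int i = 0; i < n; i++)
      acc += mask[i] ? ops[i] : neutral;
    return acc;
  }

  int
  main (void)
  {
    double ops[2] = { 1.0, 2.0 };
    bool mask[2] = { false, false };   /* all lanes inactive */

    /* Round-to-nearest: -0.0 + +0.0 == +0.0, so a +0.0 neutral loses
       the sign of a -0.0 accumulator.  */
    fesetround (FE_TONEAREST);
    printf ("%g\n", fold_left_plus (-0.0, ops, mask, 2, +0.0));

    /* Round-towards-negative: +0.0 + -0.0 == -0.0, so a -0.0 neutral
       is not safe either; no compile-time choice of the neutral zero
       satisfies both signed zeros and sign-dependent rounding.  */
    fesetround (FE_DOWNWARD);
    printf ("%g\n", fold_left_plus (0.0, ops, mask, 2, -0.0));

    return 0;
  }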
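
By contrast, a target-provided masked fold-left reduction (the
IFN_MASK_FOLD_LEFT_PLUS / IFN_MASK_LEN_FOLD_LEFT_PLUS calls quoted
above) does not feed inactive lanes into the addition chain at all.
The sketch below is one scalar reading of that behaviour, not GCC code,
and the helper name is invented for illustration:

  #include <stdbool.h>

  /* Active lanes are accumulated strictly in order; inactive lanes are
     skipped, so no neutral element is ever added and neither the sign
     of a zero accumulator nor the rounding mode is disturbed by
     masked-out elements.  */
  static double
  mask_fold_left_plus (double acc, const double *ops, const bool *mask,
                       int n)
  {
    for (int i = 0; i < n; i++)
      if (mask[i])
        acc += ops[i];
    return acc;
  }

That difference is why the masked internal-function path can tolerate
sign-dependent rounding while the neutral-element emulation cannot.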