Date: Mon, 9 Oct 2023 13:05:48 +0000 (UTC)
From: Richard Biener
To: Robin Dapp
Cc: Tamar Christina, gcc-patches
Subject: Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

On Mon, 9 Oct 2023, Robin Dapp wrote:

> > Hmm, the function is called at transform time so this shouldn't help
> > avoiding the ICE.  I expected we refuse to vectorize _any_ reduction
> > when sign-dependent rounding is in effect?  OTOH maybe sign-dependent
> > rounding is OK, but only when we use an unconditional fold-left
> > (so a loop mask from full masking is OK, but not an original COND_ADD?).
>
> So we currently only disable the use of partial vectors
>
>   else if (reduction_type == FOLD_LEFT_REDUCTION
>            && reduc_fn == IFN_LAST

aarch64 probably chokes because reduc_fn is not IFN_LAST.

>            && FLOAT_TYPE_P (vectype_in)
>            && HONOR_SIGNED_ZEROS (vectype_in)

so with your change we'd support signed zeros correctly.

>            && HONOR_SIGN_DEPENDENT_ROUNDING (vectype_in))
>       {
>         if (dump_enabled_p ())
>           dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>                            "can't operate on partial vectors because"
>                            " signed zeros cannot be preserved.\n");
>         LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
>
> which is inside a LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P block.
>
> For the fully masked case we continue (and then fail the assertion
> on aarch64 at transform time).
>
> I didn't get why that case is OK, though?  We still merge the initial
> definition with the identity/neutral op (i.e. possibly -0.0) based on
> the loop mask.  Is that different to partial masking?

I think the main point of my earlier change is that without native
support for a fold-left reduction (as on x86) we get

  ops = mask ? ops : neutral;
  acc += ops[0];
  acc += ops[1];
  ...

so we wouldn't use a COND_ADD but instead add neutral elements for the
masked elements.  That's OK for signed zeros after your change (great),
but not OK for sign-dependent rounding, because we can't decide on the
sign of the neutral zero then.

For the case of using an internal function, i.e. direct target support,
it should be OK to have sign-dependent rounding if we can use the
masked fold-left reduction op.  As we do

      /* On the first iteration the input is simply the scalar phi
         result, and for subsequent iterations it is the output of
         the preceding operation.  */
      if (reduc_fn != IFN_LAST || (mask && mask_reduc_fn != IFN_LAST))
        {
          if (mask && len && mask_reduc_fn == IFN_MASK_LEN_FOLD_LEFT_PLUS)
            new_stmt = gimple_build_call_internal (mask_reduc_fn, 5, reduc_var,
                                                   def0, mask, len, bias);
          else if (mask && mask_reduc_fn == IFN_MASK_FOLD_LEFT_PLUS)
            new_stmt = gimple_build_call_internal (mask_reduc_fn, 3, reduc_var,
                                                   def0, mask);
          else
            new_stmt = gimple_build_call_internal (reduc_fn, 2, reduc_var,
                                                   def0);

the last case should be able to assert !HONOR_SIGN_DEPENDENT_ROUNDING
(and likewise the reduc_fn == IFN_LAST case).

The quoted condition above should change to drop the HONOR_SIGNED_ZEROS
check, and the reduc_fn == IFN_LAST check should change, maybe to
internal_fn_mask_index (reduc_fn) == -1?

Richard.
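
As a footnote to the discussion above, here is a minimal, self-contained
C sketch of the emulated fold-left lowering and of why the sign of its
neutral zero cannot be fixed at compile time once sign-dependent rounding
is honored.  It is not taken from the patch or from GCC; the helper name
fold_left_plus and the test values are made up for illustration.  Build
with -frounding-math (and link with -lm for fesetround) so the additions
are evaluated at run time:

  /* Emulated (x86-style) fold-left reduction: masked-out lanes are
     replaced by a "neutral" zero before the strictly in-order adds.  */
  #include <fenv.h>
  #include <stdbool.h>
  #include <stdio.h>

  static double
  fold_left_plus (double acc, const double *ops, const bool *mask,
                  int n, double neutral)
  {
    for (int i = 0; i < n; i++)
      acc += mask[i] ? ops[i] : neutral;
    return acc;
  }

  int
  main (void)
  {
    double ops[2] = { 1.0, 2.0 };
    bool mask[2] = { false, false };   /* all lanes inactive */

    /* Round-to-nearest: -0.0 + +0.0 == +0.0, so a +0.0 neutral loses
       the sign of a -0.0 accumulator.  */
    fesetround (FE_TONEAREST);
    printf ("%g\n", fold_left_plus (-0.0, ops, mask, 2, +0.0));

    /* Round-towards-negative: +0.0 + -0.0 == -0.0, so a -0.0 neutral
       is not safe either; no compile-time choice of the neutral zero
       satisfies both signed zeros and sign-dependent rounding.  */
    fesetround (FE_DOWNWARD);
    printf ("%g\n", fold_left_plus (0.0, ops, mask, 2, -0.0));

    return 0;
  }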
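
By contrast, a target-provided masked fold-left reduction (the
IFN_MASK_FOLD_LEFT_PLUS / IFN_MASK_LEN_FOLD_LEFT_PLUS calls quoted
above) does not feed inactive lanes into the addition chain at all.
The sketch below is one scalar reading of that behaviour, not GCC code,
and the helper name is invented for illustration:

  #include <stdbool.h>

  /* Active lanes are accumulated strictly in order; inactive lanes are
     skipped, so no neutral element is ever added and neither the sign
     of a zero accumulator nor the rounding mode is disturbed by
     masked-out elements.  */
  static double
  mask_fold_left_plus (double acc, const double *ops, const bool *mask,
                       int n)
  {
    for (int i = 0; i < n; i++)
      if (mask[i])
        acc += ops[i];
    return acc;
  }

That difference is why the masked internal-function path can tolerate
sign-dependent rounding while the neutral-element emulation cannot.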