From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id ADE113858D32 for ; Thu, 2 Nov 2023 11:11:40 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org ADE113858D32 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de ARC-Filter: OpenARC Filter v1.0.0 sourceware.org ADE113858D32 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=195.135.220.28 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1698923511; cv=none; b=rcnBqx5LlfEWy5zc20VKz3HA+ZMR1/8xGcyLnT6h2BwoaA5lstpGdwlKoy6+i4nqZmxVaVoa0qf6diBkBYPqz3HGidMX86rNDnmACOfRgQh51J9uTLSk1Q5/YYxjyoOoEaRRumIZgicEIqR2yrjcfsMpxDEWnhIqABifenU/TvI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1698923511; c=relaxed/simple; bh=G+6xZFPzY1oN4uDtIVZruJU6ojAfXFMdwnHHliS5uNs=; h=DKIM-Signature:DKIM-Signature:Date:From:To:Subject:Message-ID: MIME-Version; b=MtvWwCTRwDUqCb7JW8QNk5s1+UoT0kjlAzq2PFCZfR0I9gpH67pY+dhEWBA3RGVj7DAkx3xBQOtJCHK1HMXjie68wHtJp1d7gCBqn7i0tfSdcF8XHVxHrKFfmnKE/rjRHu0btQQ0KILQCw1AhwjTq/xlsCsKMf/hD44sOP8Lpsw= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id 8759F21883; Thu, 2 Nov 2023 11:11:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1698923499; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=S332ZwtJcK3i93fPozyjN21crsBgkuSEwrFysb4i9tM=; b=K5g3y7sAO/Bl5Z6Em8u56owQKQWHRKZUGO7Q8LzViwck1MUuQHR26XuOPt8KfLmWzH49TP rxihcXhD8a+nj6kFxA2teRKoacER4CRTtjUoJCvkS03iGwnBrqCXzgO8DGD+CnNLyQMe1r 1xb8zbDgkEwBuScWrVqUDfW02gIUClc= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1698923499; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=S332ZwtJcK3i93fPozyjN21crsBgkuSEwrFysb4i9tM=; b=yR8A3j9RLtvc8WkrUjcCsTIrqTrgN0ZxOf9gMakdepmUk1ep0X032z0Bj6tGuoIEJaN0qI KcJwPTADuzi4TOBg== Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 1BCF52CC5D; Thu, 2 Nov 2023 11:11:39 +0000 (UTC) Date: Thu, 2 Nov 2023 11:11:39 +0000 (UTC) From: Richard Biener To: Juzhe-Zhong cc: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: Re: [tree-optimization/111721] VECT: Support SLP for MASK_LEN_GATHER_LOAD with dummy mask In-Reply-To: <20231102005737.2418307-1-juzhe.zhong@rivai.ai> Message-ID: References: <20231102005737.2418307-1-juzhe.zhong@rivai.ai> User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, 2 Nov 2023, Juzhe-Zhong wrote: > This patch fixes following FAILs for RVV: > FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects scan-tree-dump vect "Loop contains only SLP stmts" > FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP stmts" > > Bootstrap on X86 and regtest passed. > > Tested on aarch64 passed. > > Ok for trunk ? > > PR tree-optimization/111721 > > gcc/ChangeLog: > > * tree-vect-slp.cc (vect_get_and_check_slp_defs): Support SLP for dummy mask -1. > * tree-vect-stmts.cc (vectorizable_load): Ditto. > > --- > gcc/tree-vect-slp.cc | 14 ++++++++++++-- > gcc/tree-vect-stmts.cc | 8 +++++++- > 2 files changed, 19 insertions(+), 3 deletions(-) > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc > index 43d742e3c92..23ca0318e31 100644 > --- a/gcc/tree-vect-slp.cc > +++ b/gcc/tree-vect-slp.cc > @@ -756,8 +756,7 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char swap, > { > tree type = TREE_TYPE (oprnd); > dt = dts[i]; > - if ((dt == vect_constant_def > - || dt == vect_external_def) > + if (dt == vect_external_def > && !GET_MODE_SIZE (vinfo->vector_mode).is_constant () > && (TREE_CODE (type) == BOOLEAN_TYPE > || !can_duplicate_and_interleave_p (vinfo, stmts.length (), > @@ -769,6 +768,17 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char swap, > "for variable-length SLP %T\n", oprnd); > return -1; > } > + if (dt == vect_constant_def > + && !GET_MODE_SIZE (vinfo->vector_mode).is_constant () > + && !can_duplicate_and_interleave_p (vinfo, stmts.length (), type)) > + { > + if (dump_enabled_p ()) > + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > + "Build SLP failed: invalid type of def " > + "for variable-length SLP %T\n", > + oprnd); > + return -1; > + } I don't think that's quite correct. can_duplicate_and_interleave_p doesn't get enough info here and IIRC even materializing arbitrary constants isn't possible with VLA vectors. The very first thing the function does is tree base_vector_type = get_vectype_for_scalar_type (vinfo, elt_type, count); if (!base_vector_type || !VECTOR_MODE_P (TYPE_MODE (base_vector_type))) return false; but for masks that's not going to get us the correct vector type. While I don't understand why we have that 'BOOLEAN_TYPE' special case (maybe the intent was to identify 'mask' operands that way?), we might want to require that we can materialize both all-zero and all-ones constant 'mask's. But then 'mask' operands should be properly identified here. Maybe we can also simply delay the check to the point we know whether we're facing an uniform constant or not (note for 'first', we cannot really special-case vect_constant_def as the second SLP lane might demote that to vect_external_def). It's always a balance of whether to reject sth at SLP build time (possibly allowing operand swapping to do magic) or to delay checks to stmt analysis time. That might also explain that you do not see fallout of the "wrong" change (the later checking will catch it anyway). There's probably testsuite coverage for SVE here. That said, a "correct" patch might be to simply change && (TREE_CODE (type) == BOOLEAN_TYPE || !can_duplicate_and_interleave_p (vinfo, stmts.length (), type))) to && TREE_CODE (type) != BOOLEAN_TYPE && !can_duplicate_and_interleave_p (vinfo, stmts.length (), type) thus delay 'mask' operand validation here. Note I still think we should improve TREE_CODE (type) == BOOLEAN_TYPE to identify internal function mask operands only. Richard. > > /* For the swapping logic below force vect_reduction_def > for the reduction op in a SLP reduction group. */ > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > index 6ce4868d3e1..6c47121e158 100644 > --- a/gcc/tree-vect-stmts.cc > +++ b/gcc/tree-vect-stmts.cc > @@ -9859,10 +9859,16 @@ vectorizable_load (vec_info *vinfo, > mask_index = internal_fn_mask_index (ifn); > if (mask_index >= 0 && slp_node) > mask_index = vect_slp_child_index_for_operand (call, mask_index); > + slp_tree slp_op = NULL; > if (mask_index >= 0 > && !vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_index, > - &mask, NULL, &mask_dt, &mask_vectype)) > + &mask, &slp_op, &mask_dt, &mask_vectype)) > return false; > + /* MASK_LEN_GATHER_LOAD dummy mask -1 should always match the > + MASK_VECTYPE. */ > + if (mask_index >= 0 && slp_node && mask_dt == vect_constant_def > + && !vect_maybe_update_slp_op_vectype (slp_op, mask_vectype)) > + gcc_unreachable (); > } > > tree vectype = STMT_VINFO_VECTYPE (stmt_info); > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)