From: Robin Dapp
To: Juzhe-Zhong, gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com
Subject: Re: [PATCH V2] RISC-V: Fix unexpected big LMUL choosing in dynamic LMUL model for non-adjacent load/store
Date: Mon, 16 Oct 2023 14:33:54 +0200
References: <20231016113108.877163-1-juzhe.zhong@rivai.ai>
In-Reply-To: <20231016113108.877163-1-juzhe.zhong@rivai.ai>

Hi Juzhe,

> +/* Get STORE value.  */
> +static tree
> +get_store_value (gimple *stmt)
> +{
> +  if (is_gimple_call (stmt) && gimple_call_internal_p (stmt))
> +    {
> +      if (gimple_call_internal_fn (stmt) == IFN_MASK_STORE)
> +        return gimple_call_arg (stmt, 3);
> +      else
> +        gcc_unreachable ();
> +    }
> +  else
> +    return gimple_assign_rhs1 (stmt);
> +}

This was something I was about to mention in the review of v1
that I already started.  This is better now.

> +              basic_block bb = e->src;

Rename to pred or so?  And then keep the original bb.

>                if (!live_range)
> -                continue;
> +                {
> +                  if (single_succ_p (e->src))
> +                    {
> +                      /*
> +                         [local count: 850510900]:
> +                         goto ; [100.00%]
> +
> +                         Handle this case, we should extend live range of bb 3.
> +                      */

/* If the predecessor is an extended basic block extend it and
   look for DEF's definition again.  */

> +                      bb = single_succ (e->src);
> +                      if (!program_points_per_bb.get (bb))
> +                        continue;
> +                      live_ranges = live_ranges_per_bb.get (bb);
> +                      max_point
> +                        = (*program_points_per_bb.get (bb)).length () - 1;
> +                      live_range = live_ranges->get (def);
> +                      if (!live_range)
> +                        continue;
> +                    }
> +                  else
> +                    continue;
> +                }

We're approaching a complexity where reverse postorder would have
helped ;)  Maybe split out the live range into a separate function
get_live_range_for_bb ()?

> +      for (si = gsi_start_bb (bbs[i]); !gsi_end_p (si); gsi_next (&si))
> +        {
> +          if (!(is_gimple_assign (gsi_stmt (si))
> +                || is_gimple_call (gsi_stmt (si))))
> +            continue;
> +          stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi_stmt (si));
> +          enum stmt_vec_info_type type
> +            = STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info));
> +          if ((type == load_vec_info_type || type == store_vec_info_type)
> +              && !adjacent_dr_p (STMT_VINFO_DATA_REF (stmt_info)))
> +            {
> +              /* For non-adjacent load/store STMT, we will potentially
> +                 convert it into:
> +
> +                   1. MASK_LEN_GATHER_LOAD (..., perm indice).
> +                   2. Continguous load/store + VEC_PERM (..., perm indice)
> +
> +                 We will be likely using one more vector variable.  */
> +              unsigned int max_point
> +                = (*program_points_per_bb.get (bbs[i])).length () - 1;
> +              auto *live_ranges = live_ranges_per_bb.get (bbs[i]);
> +              bool existed_p = false;
> +              tree var = type == load_vec_info_type
> +                           ? gimple_get_lhs (gsi_stmt (si))
> +                           : get_store_value (gsi_stmt (si));
> +              tree sel_type = build_nonstandard_integer_type (
> +                TYPE_PRECISION (TREE_TYPE (var)), 1);
> +              tree sel = build_decl (UNKNOWN_LOCATION, VAR_DECL,
> +                                     get_identifier ("vect_perm"), sel_type);
> +              pair &live_range = live_ranges->get_or_insert (sel, &existed_p);
> +              gcc_assert (!existed_p);
> +              live_range = pair (0, max_point);
> +              if (dump_enabled_p ())
> +                dump_printf_loc (MSG_NOTE, vect_location,
> +                                 "Add perm indice %T, start = 0, end = %d\n",
> +                                 sel, max_point);
> +            }
> +        }
>      }
>  }

So we're inserting a dummy vect_perm element (that's live from
the start?).  Would it make sense to instead increase the number of
needed registers for a load/store and handle this similarly
to compute_nregs_for_mode?  Maybe also do it directly in
compute_local_live_ranges and extend live_range by an nregs?

Regards
 Robin
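
P.S. To make the last question a bit more concrete, here is a rough
standalone sketch of the two accounting models.  It is plain C++, not
GCC code: the live_range struct with an nregs member, max_live_regs,
and the two peak_* helpers are all made up for illustration, and the
register counts and program points are arbitrary.  The first helper
models the patch (a dummy permute index live from point 0 to
max_point), the second charges the extra registers to the load/store
value itself.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for one live-range entry: a value live over the
// inclusive program-point interval [start, end] that occupies NREGS
// vector registers.  The real code stores only a (start, end) pair.
struct live_range
{
  unsigned start;
  unsigned end;
  unsigned nregs;
};

// Return the maximum number of registers that are live at the same
// time over program points 0..MAX_POINT.
static unsigned
max_live_regs (const std::vector<live_range> &ranges, unsigned max_point)
{
  unsigned max_regs = 0;
  for (unsigned p = 0; p <= max_point; p++)
    {
      unsigned live = 0;
      for (const live_range &r : ranges)
        {
          assert (r.start <= r.end);
          if (r.start <= p && p <= r.end)
            live += r.nregs;
        }
      max_regs = std::max (max_regs, live);
    }
  return max_regs;
}

// Model A (assumed shape of the patch): add a dummy permute index that
// is live over the whole block.
static unsigned
peak_with_dummy_index (unsigned max_point)
{
  std::vector<live_range> ranges = {
    {0, 4, 4},          // some unrelated value
    {5, 8, 2},          // value of the non-adjacent load/store
    {0, max_point, 2},  // dummy "vect_perm" index
  };
  return max_live_regs (ranges, max_point);
}

// Model B (the alternative asked about): no dummy entry, the load/store
// value simply needs more registers while it is live.
static unsigned
peak_with_extra_nregs (unsigned max_point)
{
  std::vector<live_range> ranges = {
    {0, 4, 4},      // same unrelated value
    {5, 8, 2 + 2},  // load/store value plus its permute index
  };
  return max_live_regs (ranges, max_point);
}
```

With max_point = 9, peak_with_dummy_index returns 6 and
peak_with_extra_nregs returns 4, so the two models can disagree when
the pressure peak lies outside the load/store's own live range.  Which
one is closer to reality depends on whether the index vector really is
live across the whole block, which this sketch does not answer.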